Assessing New Database Capabilities – Multi-Model

Today’s enterprises have an unprecedented variety of data store choices for their varied workloads, because no single data store fits every need. Putting data stores in place to support a modern, data-reliant enterprise can lead to confusion and chaos.

Enterprises need databases for many purposes, including caching, operational processing, data warehousing, master data, ERP, analytics, graph data, data lakes, time series data, and numerous other specific needs.

While vendor offerings have multiplied in recent years, frameworks will in due time integrate components into what is, for practical purposes, a single offering for multiple workloads, perhaps even for the whole enterprise.

A multi-model database is a database that can store, manage, and query data in multiple models, such as relational, document-oriented, key-value, graph (triplestore), and column store.

An enterprise will find reduced overhead and other synergies from choosing a single vendor for these workloads.

This session will explore the multi-model option and some criteria that decision makers should evaluate when choosing a multi-model solution.
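
As a rough illustration of the multi-model definition above, the following Python sketch keeps one in-memory copy of the data and reads it through three models: by key, by a nested document field, and as projected rows. `TinyMultiModelStore`, its method names, and the sample orders are hypothetical and are not taken from the presentation or from any product; a real multi-model database would add indexing, transactions, and a query language.

```python
class TinyMultiModelStore:
    """Toy in-memory store: one copy of each record, three ways to read it."""

    def __init__(self):
        # key -> JSON-like dict; this is the single copy of the data
        self._docs = {}

    # Key-value model: the value is looked up by its key only.
    def put(self, key, doc):
        self._docs[key] = doc

    def get(self, key):
        return self._docs[key]

    # Document model: filter on a (possibly nested) field inside the value.
    def find(self, path, expected):
        def lookup(doc, parts):
            for part in parts:
                if not isinstance(doc, dict) or part not in doc:
                    return None
                doc = doc[part]
            return doc

        parts = path.split(".")
        return [k for k, d in self._docs.items() if lookup(d, parts) == expected]

    # Relational-style model: project chosen top-level fields into rows.
    def select(self, *columns):
        return [tuple(d.get(c) for c in columns) for d in self._docs.values()]


# Hypothetical sample data.
store = TinyMultiModelStore()
store.put("order:1", {"customer": {"name": "Ann"}, "total": 42})
store.put("order:2", {"customer": {"name": "Dan"}, "total": 7})

by_key = store.get("order:1")                 # key-value access
by_field = store.find("customer.name", "Dan")  # document-style query
as_rows = store.select("total")                # row-style projection
```

The point of the sketch is only that all three access paths read the same stored records, which is what distinguishes a multi-model database from running a separate database per model.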


  1. Assessing New Database Capabilities: Multi-Model. Presented by: William McKnight, President, McKnight Consulting Group. williammcknight, www.mcknightcg.com, (214) 514-1444
  2. Enterprise Level Advanced Analytics. Rick Jacobs, Technical Marketing Manager. October 10th, 2022. Confidential and Proprietary. Do not distribute without Couchbase consent. © Couchbase 2021. All rights reserved.
  3. Agenda: (1) Why Couchbase; (2) Couchbase Analytics; (3) Use Cases & Customer Stories
  4. Why Couchbase (agenda item 1)
  5. How is Couchbase Different?
     • Fast: memory-first design; cloud-native scale; geo-replication via XDCR; HA, DR & backup; low latency
     • Familiar: SQL++ query language; dynamic schema; ACID SQL transactions; cost-based optimizer; SDKs for 12+ languages
     • Affordable: elastic scaling, sharding & rebalancing; multidimensional scaling; high-density storage; incredible price/performance
     • Flexible: JSON document; multimodel services; cloud deploy anywhere; mobile & edge ready
     • Diagram labels: Mobile/Edge Apps; Applications and Microservices; Cloud to Edge; SQL; Integrated Cache; JSON Documents; SQL Query; Full Text Search; Operational Analytics; Eventing; Key-Value Access; Geo-Replication & Sync; Mobile Database; Relational Capabilities
  6. Flexible Cloud and Edge Options: Delivering Consistency
     • Database-as-a-Service (Capella): maximize convenience; easy to start, manage, and scale; industry-leading price-performance; highly available and secure
     • Self-Managed Cloud (Server): maximize control & customizability; leverage DBA and Ops team skills; choose management strategy & tools; deploy via Kubernetes if you choose
     • Edge & IoT, Mobile: offline-first design for max uptime; extreme speed and reliability; data integrity: secure, automated sync; broad SQL and device support
     • “We wanted a solution that seamlessly works across server and mobile, without lots of retraining. No other solutions came even close to Couchbase.” Aviram Agmon, Chief Technical Officer, Maccabi
  7. Couchbase Analytics (agenda item 2)
  8. Analytics fundamentals
     • Fast ingestion
     • Near real-time data availability (using DCP)
     • No ETL (simple, no paradigm shift)
     • Same data model and query language
     • MPP processing
     • Uses best-of-breed DW algorithms (join, aggregation, sorting)
     • Memory-conscious operators (DGM)
     • Workload isolation
     • MDS – has its own sub-cluster
     • Each query uses all resources
     • Diagram labels: Operations Data; Real-time Analytics; Analytics Tool; Business Application; Ops Data Node; Analytics Node; Couchbase Data Platform
  9. Requirements for an Agile Analytics Platform
     • Timely: operational data is readily available for analytics when created and is as current as possible
     • Flexible: schema changes on the operational side don’t impact analyses
     • Speedy: analysis queries run quickly without impacting operational performance
     • Scalable: scale to speed up queries and scale up data
  10. Couchbase Analytics Architecture
  11. Customer Stories (agenda item 3)
  12. Key Use Cases
     • eCommerce, real-time marketing campaigns. Need: perform data exploration on operational data in near real time with agile data science modeling. Outcome: new customer attributes enabled data-science-focused consumer segment strategies, cutting time to insight for consumer marketing responses from weeks/months to hours.
     • Finance, investments modeling. Need: perform complex analytical queries, computations, and aggregations on JSON data enriched with 3rd-party data, without data movement. Outcome: the Analytics Service powered regression calculations to compute 2M+ prices and further improved query performance by 100% for 200GB+ of data, with no need for ETL.
     • Healthcare, hospital/clinics customer revenue. Need: scale the data platform to meet increased analytics and reporting needs. Outcome: executives are able to answer key business revenue impact questions, such as “Show detailed effects of COVID-19 on hospitals cancelling elective procedures to identify underpaid or unidentified revenue”.
     • Other labels on the slide: Personalized Ordering; Risk Scoring; BI & Data Scale; eCommerce Food Delivery; Finance; Healthcare
  13. ABOUT: World leader in pizza delivery operating a network of company-owned and franchise-owned stores globally. 3M pizzas a day, 16.5K stores in 85 countries
     • SOLUTION: Customer Data Management
     • APPLICATION: Commerce Data Hub; data science experimentation
     • USE CASE(S): Real-time marketing campaigns and personalized ordering experience
     • Requirements: track average transaction size, annual purchase frequency, and loyalty to determine customer lifetime value (CLV); deliver personalized marketing campaigns and segments, and reduce the time to perform data science experiments; perform data exploration on operational data in near real time
     • Outcomes: cut the time to deliver targeted consumer offers from weeks/months to hours, with data analyzed in near real time; enabled agile data mining models focused on order behaviors and propensity scoring, with flexible attribute creation; removed the need for ETL in data science experiments
  14. ABOUT: Professional baseball franchise valued at $600M+ with 1.8M+ fan base
     • SOLUTION: Customer 360
     • APPLICATION: Ticket scan, VIP loyalty program
     • USE CASE(S): Real-time analytics for fan interactions
     • Requirements: act on near real-time data flow without transformation; enable a better fan experience at concession stands during games, plus IoT functionality for ticket scans; an easy-to-use SQL-like interface, as their resources are lean and skilled in SQL
     • Outcomes: continuous data sync for real-time visitor and customer concessionaire analytics; increased customer engagement via interactive scoreboards, fan kiosks, and more; easy integration with Knowi and Tableau for real-time executive reporting
  15. Delivering Business Outcomes by Solving Technology Problems
     • Business outcomes: improving customer experience & engagement; faster innovation & time to market; reducing infrastructure & operations costs; predictable performance
     • Technology problems: scaling legacy DB; mainframe access; NoSQL sprawl; scaling other NoSQL DB; managing multiple DBs; dedicated DB per use case; slow dev. cycles; mission-critical new features; ever-changing requirements; mobile apps take too long; modern DB tech. required; need to consolidate tech.; personalization + performance; fully featured mobile apps; single view of customer; legacy = more time, $$, effort; integrate disparate data
  16. THANK YOU. Try Couchbase Capella free, no credit card required: https://www.couchbase.com/products/capella/get-started
  17. William McKnight, President, McKnight Consulting Group
     • Frequent keynote speaker and trainer internationally
     • Consulted to Pfizer, Scotiabank, Fidelity, TD Ameritrade, Teva Pharmaceuticals, Verizon, and many other Global 1000 companies
     • Hundreds of articles, blogs and white papers in publication
     • Focused on delivering business value and solving business problems utilizing proven, streamlined approaches to information management
     • Former Database Engineer, Fortune 50 Information Technology executive and Ernst & Young Entrepreneur of the Year Finalist
     • Owner/consultant: Research, Data Strategy and Implementation consulting firm
     • Book cover: William McKnight, Information Management: Strategies for Gaining a Competitive Advantage with Data (The Savvy Manager’s Guide)
  18. McKnight Consulting Group Offerings
     • Strategy: Trusted Advisor; Action Plans; Roadmaps; Tool Selections; Program Management
     • Training: Classes; Workshops
     • Implementation: Data/Data Warehousing/Business Intelligence/Analytics; Big Data; Master Data Management; Governance/Quality
  19. McKnight Consulting Group Client Portfolio
  20. Decisions, Decisions, Decisions
     • Unprecedented variety of data store choices to meet the needs of enterprises’ varied workloads
     • Enterprises have many needs for databases, including cache, operational, data warehouse, master data, ERP, analytical, graph data, data lake, and time series data
     • While vendor offerings have exploded in recent years, in due time frameworks will integrate components into what amounts to a single offering for multiple workloads, perhaps even for the enterprise
     • But what if price-performant offerings for adjacent workloads in an enterprise have materialized?
  21. Many Data Types: Web Crawlers; Open Linked Data; JSON; XML; Documents; Binary; Graph; Log Files
  22. Why NoSQL for Operational Big Data
     • More data model flexibility
       – Web Services as a data model
       – No “schema first” requirement; load first
     • Faster time to insight from data acquisition
     • Relaxed ACID
       – Eventual consistency
       – Willing to trade consistency for availability
       – ACID would crush things like storing clicks on Google
     • Low upfront software and development costs
     • Programmers love the freedoms
     • Fault-tolerant redundancy
     • Linear scaling to “webscale”
  23. DFS Block Placement
     • Placement policy:
       – A copy is written to the node creating the file (write affinity)
       – A second copy is written to a data node within the same rack (to minimize cross-rack network traffic)
       – A third copy is written to a data node in a different rack (to tolerate switch failures)
     • Objectives: load balancing, fast access, fault tolerance
     • Diagram: Blocks 1, 2, and 3 replicated across Nodes 1 to 5
  24. Property Graph Model Components
     • Nodes: the objects in the graph; can have name-value properties; can be labeled
     • Relationships: relate nodes by type and direction; can have name-value properties
     • Example diagram: PERSON nodes (name: “Dan”, born: May 29, 1970, twitter: “@dan”; name: “Ann”, born: Dec 5, 1975) and a CAR node (brand: “Volvo”, model: “V70”), with relationships labeled friends, LIVES WITH, DRIVES, and OWNS; one relationship carries the property since: Jan 10, 2011
  25. Semantic Graph
     • RDF Triple Store
       – Semantic databases only work with RDF
     • Target market is users of third-party data in RDF (all Linked open data)
       – Working across data sets
  26. Databases are Multi-Model when they can be either (for example):
  27. Data Types and NoSQL Data Models (data type: data model)
     • CSV, TSV or web logs: Column, Document
     • Documents: Document
     • JSON: Document
     • Metadata catalog: Column, Document
     • Keyed images and documents: Key-Value
     • RDF, Linked data: Graph
  28. Key-Value Stores: What are they?
     • NoSQL’s OLTP equivalent
     • Extremely simple
     • Key-“blob” pairs, that’s it
     • Associative array data model
     • Retrieve value given a key
       – All access is by a key: (key, value)
  29. Key-Value Stores: Technical Characteristics
     • Horizontally scalable
     • Fast (did I mention fast)
     • Resiliency to cluster failures
     • Simplicity
     • All nodes equal
  30. Key-Value Stores: Good for
     • Any single object of unstructured data
     • Storing BLOBs
     • Fast writes
     • Web app cache
     • Session information
       – Get all session information in a single put/get
     • User profile data
     • Massive multi-player on-line gaming
     • Shopping carts (up until the payment transaction)
     • Geo-localized processing
     • Speed when you can’t be down
  31. A multi-model database is a single, integrated database that can store, manage, and query data in multiple models, such as relational, document, graph, key-value, column-store, and cache. It is the opposite approach to Polyglot Persistence, the use of multiple databases in a workload.
  32. Document-oriented Databases: What are they?
     • Key-Value Stores with added capabilities
       – Ability to nest sub-documents
     • JSON/XML data models
     • With tree-like structure
     • Encapsulated document objects
     • Groups data together more naturally and logically
  33. Document-oriented Databases: Technical Characteristics
     • Store all data together
       – Example: Order document contains all line items
     • Documents are self-describing hierarchical tree structures
     • Unlike Key-Value Stores, the value part of the field can be queried
  34. Document-oriented Databases: Good for
     • Semi-structured data
     • Web pages
     • Web traffic/E-Commerce
     • Web analytics
     • Log files
     • User actions/behaviors
     • Content Management Systems
     • Full text
     • Uncertain data
     • Extending object-oriented approaches
     • Event logging
     • JSON/XML data
  35. Document Example: { "type": "BakingRecipe", "name": "Mama’s Cornbread", "ingredients": [ { "name": "cornmeal", "amount": "1c" }, { "name": "flour", "amount": "3/4c" }, { "name": "baking powder", "amount": "1-1/2t" }, { "name": "eggs", "amount": "2 large" }, { "name": "butter", "amount": "6T" }, { "name": "buttermilk", "amount": "1-1/2c", "brand": "ABC Brand" } ], "ovenTemperature": "425 deg F", "bakeTime": "20 min" }
  36. Multiple NoSQL Solutions Working Together. You could use:
     • Key-Value Store for shopping cart and session data
     • Document or Column Store for consuming completed orders
     • RDBMS for inventory (small, not served real-time) and financials
     • Graph Store for customer relationships for marketing
  37. Column Stores: What are they?
     • Data model:
       – A big table, with column families
       – Map-reduce for querying/processing
     • Schema-lite
     • No single point of failure
     • Operational simplicity
     • Closest NoSQL implementation to RDBMS
  38. Column Stores: Good for
     • Large amounts of data
     • Data that needs compression
     • Event logging
     • Content Management Systems
     • Data model supports semi-structured data
     • Naturally indexed (columns)
     • Good at scaling out horizontally
     • Time series data
       – Weather data
       – Location data
       – Sensor data
  39. Column Stores Example
  40. What to Look for in Multi-Model 1/2
     • Excellent implementation of multiple models
     • Single copy of data
     • Model change propagation
     • Works in microservices world
     • Sub-millisecond response time
  41. What to Look for in Multi-Model 2/2
     • Globally distributed multi-region deployments
     • Cross-model data processing language and optimizer
     • Edge-capable database
     • JSON flattening without data explosion
     • Universal indices
  42. Emerging Technologies
     • Use of artificial intelligence (AI)
     • Integration with data catalog platforms
     • Robust user experience
     • Multi-cloud/native application
  43. Assessing New Database Capabilities: Multi-Model. Presented by: William McKnight, President, McKnight Consulting Group. williammcknight, www.mcknightcg.com, (214) 514-1444
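
The three-copy placement policy on the DFS Block Placement slide (slide 23) can be sketched in a few lines of Python. This is a minimal sketch that follows the policy exactly as the slide states it; the `place_replicas` function, the rack names, and the five-node topology are assumptions made for illustration, and the choice of "first eligible node" stands in for whatever load-balancing rule a real distributed file system would apply.

```python
def place_replicas(writer, racks):
    """Pick three nodes for one block, following the slide's policy.

    writer: name of the node creating the file.
    racks:  hypothetical topology, mapping rack name -> list of node names.
    """
    # Reverse index: node name -> rack name.
    node_rack = {node: rack for rack, nodes in racks.items() for node in nodes}
    home = node_rack[writer]

    # Candidates for copy 2 (same rack) and copy 3 (any other rack).
    same_rack = [n for n in racks[home] if n != writer]
    other_rack = [n for rack, nodes in racks.items() if rack != home for n in nodes]
    if not same_rack or not other_rack:
        raise ValueError("need a second node in the writer's rack and a node in another rack")

    # Copy 1: write affinity. Copy 2: same rack, avoids cross-rack traffic.
    # Copy 3: different rack, survives a switch failure.
    return [writer, same_rack[0], other_rack[0]]


# Hypothetical topology: five nodes across two racks.
racks = {"rack-a": ["node1", "node2", "node3"], "rack-b": ["node4", "node5"]}
replicas = place_replicas("node2", racks)
```

With this topology the block written by `node2` lands on `node2`, `node1`, and `node4`, so losing either rack's switch still leaves at least one reachable copy.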

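
The property graph components listed on slide 24 (labeled nodes with name-value properties, and typed, directed relationships that can also carry properties) can be sketched as plain Python data classes. The node labels and property values below come from the slide, but the flattened transcript does not show which nodes each relationship connects or which relationship carries `since`, so the three edges here are assumptions; the `Node`, `Relationship`, and `neighbors` names are likewise hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                      # e.g. PERSON or CAR
    properties: dict = field(default_factory=dict)  # name-value properties


@dataclass
class Relationship:
    start: Node                                     # direction runs start -> end
    type: str
    end: Node
    properties: dict = field(default_factory=dict)  # relationships can have properties too


dan = Node("PERSON", {"name": "Dan", "born": "May 29, 1970", "twitter": "@dan"})
ann = Node("PERSON", {"name": "Ann", "born": "Dec 5, 1975"})
car = Node("CAR", {"brand": "Volvo", "model": "V70"})

# Assumed wiring: the slide's layout is not recoverable from the transcript.
edges = [
    Relationship(dan, "LIVES WITH", ann),
    Relationship(ann, "OWNS", car),
    Relationship(dan, "DRIVES", car, {"since": "Jan 10, 2011"}),
]


def neighbors(node, rel_type):
    """Follow outgoing relationships of one type from a node."""
    return [e.end for e in edges if e.start is node and e.type == rel_type]


driven = neighbors(dan, "DRIVES")
```

Traversal here is a linear scan over the edge list, which is enough to show type and direction; a graph store would index relationships per node instead.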