Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19

430 views

Join us to learn about the challenges of legacy data warehousing, the goals of modern data warehousing, and the design patterns and frameworks that help to accelerate modernization efforts.

Published in: Technology

  1. 1. 1
  2. © Eckerson Group 2018 www.eckerson.com © Eckerson Group 2019. WELCOME. Modernizing the Legacy Data Warehouse: What, Why, and How. Dave Wells, Advisory Consultant, Eckerson Group. Eva Nahari, Director of Product Management, Cloudera. Sponsored & Hosted By
  3. Modernizing the Legacy Data Warehouse: What, Why, and How. Dave Wells, dwells@eckerson.com
  4. Is the Data Warehouse Dead? "The EDW is dead. Period. Like a dodo. Like Monty Python's parrot." Philip Howard, Bloor Research
  5. Is the Data Warehouse Dead? How many data warehouses does your organization have? None (1.7%); only 1 (8%); 2 – 5 (62%); 6 or more (28.3%).
  6. The Challenges of Legacy Data Warehousing. Scalability: grow with increasing data volume, users, and use cases. Elasticity: dynamically adapt to workload volatility and fluctuation. Data Variety: work well with differently structured (non-relational) data sources. Data Latency: satisfy demand for real-time and near-real-time data. Data Velocity: work well with streaming data sources. Adaptability: quickly adjust to changes without complex data model refactoring. Architectural Fit: complement and interoperate with the data lake.
  7. Data Warehousing in the Cloud: much more than just data storage. Data Storage; Metadata Management. Fast, efficient loading, bulk and incremental, stream processing. Integrate, aggregate, standardize … fast and scalable processing. Query optimizers a must … hybrid relational & non-relational is valuable. Monitoring, tuning, configuration, prioritizing, workload balancing. Business, process, and technical metadata … lineage, quality, etc. Data sensitivity, PII, data privacy … work with existing security infrastructure. Preventing loss and corruption … backups, logging, replication, etc.
  8. Data Warehousing in the Cloud. Should you migrate to cloud? SWOT once for status quo. SWOT again for migration. STRENGTHS: Business impact? Application of technology? Advanced skills? Expertise and specialized knowledge? Most experienced people? Innovative solutions? WEAKNESSES: Business value? User satisfaction? Resources? Tools and technology? Knowledge and skills? Workload vs. capacity? OPPORTUNITIES: Business goals and needs? Advancing BI and analytics? User wants and expectations? Technology utilization? Reach into and across the business? Related projects & initiatives? THREATS: Obstacles and barriers? Stakeholder engagement & participation? Technological uncertainties? Data & process uncertainties? Scheduling & critical business events? Competing projects & initiatives?
  9. Data Warehousing in the Cloud. Step-by-step migration (incremental migration). Planning (Initial Planning, Continuous Planning): Scope, Timing, Resources, Schedule, User Transparency, Testing Plan. Business Case: Drivers, Costs, Benefits, Risk of Migrating, Risk of Not Migrating. Architectural Assessment: Reliability, Availability, Performance, Scalability, Adaptability, Maintainability. Migration Strategy: Lift and Shift or Incremental by Workload, Workload Breakdown and Priorities. Technology Selection: Cloud Data Warehousing Platform, Migration Tools. Migration: Schema, Data, Process, Metadata, Users and Applications. Testing and Operationalization: Function Test, Performance Test, DQ Audit, Scheduling, Monitoring, Support.
  10. Data Warehousing and Modern Data Architecture: Legacy Data Warehousing Architecture. Sources: Legacy Data, OLTP Data, External Data. ETL. Data Management: Data Warehouses, Master Data Repository, Operational Data Store. Publish. Applications: Analysis, Dashboards, Scorecards, OLAP, Reporting.
  11. Data Warehousing and Modern Data Architecture: Modern Data Management Architecture. Sources: Web, Open, Commercial, Social Media, Machine / IoT, Geospatial, Legacy, OLTP, External. Data Management: Data Warehouses, Master Data Repository, Operational Data Store, Data Lake, Analytic Sandboxes. Data Pipelines. Applications: Automation, Prescription, Prediction, Forecasting, Discovery, Exploration, Dashboards, Scorecards, OLAP, Reporting. © David L. Wells
  12. Data Warehousing and Modern Data Architecture: More Components / More Complexity. Sources (more sources & types): Web, Open, Commercial, Social Media, Machine / IoT, Geospatial, Legacy, OLTP, External. Data Management (more ways to organize and store data): Data Warehouses, Master Data Repository, Operational Data Store, Data Lake, Analytic Sandboxes. Applications (more uses for data): Automation, Prescription, Prediction, Forecasting, Discovery, Exploration, Dashboards, Scorecards, OLAP, Reporting. More consumers: Report Consumers, Data Analysts, Business Analysts, Data Scientists, Apps and Algorithms. Data Pipelines: more data flow and processing; more data pipelines and more complex data pipelines; fast and on-demand data delivery. © David L. Wells
  13. Data Warehousing and the Data Lake: Data Warehouse Outside the Data Lake. Sources: Legacy, Transaction, Web, 3rd Party, Social Media, Machine, Geospatial. Data Ingestion: ETL, ELT, Bulk Load, Stream Processing. Data Lake: landing area for incoming data; raw data, refined data & sandboxes; security, sensitivity, and semantic tagging; classified by trust level (gold, silver, bronze). ETL/ELT Data Refinement. Data Warehouse: relational, subject-oriented, historical; bus or hub-and-spoke architecture; integrated, cleansed, aggregated; includes master reference & metrics data. Data Access & Data Preparation with security & governance controls. Applications: Reporting, OLAP, Scorecards, Dashboards, Exploration, Analytics.
  14. Data Warehousing and the Data Lake: Data Warehouse Inside the Data Lake. Sources: Legacy, Transaction, Web, 3rd Party, Social Media, Machine, Geospatial. Data Ingestion: ETL, ELT, Bulk Load, Stream Processing. Data Lake: Raw Data Zone (ingest & lightly tag); Analytic Sandboxes (explore & discover); Refined Data Zone (curate, improve & fully tag); Data Warehouse (integrate & aggregate). Data Access & Data Preparation with security & governance controls. Applications: Reporting, OLAP, Scorecards, Dashboards, Exploration, Analytics.
  15. Data Warehousing and the Data Lake: Data Warehouse In Front of the Data Lake. Sources: Legacy, Transaction, Web, 3rd Party, Social Media, Machine, Geospatial. Data Warehouse(s): subject-oriented, integrated, non-volatile, time-variant. Data Ingestion: ETL, ELT, Bulk Load, Stream Processing. Data Lake: Raw Data Zone (ingest & lightly tag); Analytic Sandboxes (explore & discover); Refined Data Zone (curate, improve & fully tag). Applications: Scorecards, Dashboards, Exploration, Analytics; Reporting, OLAP.
  16. Data Warehousing and the Data Lake: Multi-Warehouse Hybrid. Sources: Web, Open, Commercial, Social Media, Machine / IoT, Geospatial, Legacy, OLTP, External. Data Ingestion: ETL, ELT, Bulk Load, Stream Processing. Data Lake: Raw Data Zone (ingest & lightly tag); Analytic Sandboxes (explore & discover); Refined Data Zone (curate, improve & fully tag); Data Warehouse A (integrate & aggregate). ETL/ELT Data Refinement. Data Warehouse B: relational, subject-oriented, historical; bus or hub-and-spoke architecture; integrated, cleansed, aggregated; includes master reference & metrics data. Data Access & Data Preparation with security & governance controls. Applications: Automation, Prescription, Prediction, Forecasting, Discovery, Exploration, Dashboards, Scorecards, OLAP, Reporting.
  17. Getting Started with Data Warehouse Modernization: Know Your Modernization Goals. ✓ Complete, cohesive analytics ecosystem. ✓ Complete, cohesive data management architecture. ✓ Governed self-service with curated data. ✓ Maximum reusability and reuse. ✓ Maximum flexibility and agility. ✓ Data for all data consumers. ✓ Technological updates. ✓ Architectural updates. ✓ Automation opportunities and process optimization. ✓ Deployment independence.
  18. Getting Started with Data Warehouse Modernization: Plan and Execute. Assess the Current State; Define the Future State; Choose the Patterns; Look to the Future; Expect and Prepare for Change; Execute One Step at a Time.
  19. 19. 19
  20. CLOUDERA DATA WAREHOUSE. Eva Nahari, Director, Product Management, Cloudera Data Warehouse. January 2019
  21. © Cloudera, Inc. All rights reserved. NEW TRENDS IN DATA WAREHOUSING: Deeper Business Insights at Extreme Speed and Scale While Managing Cost. DEEPER business insights; EXTREME speed & scale; CONTROLLED resources & costs.
  22. NEW TRENDS IN DATA WAREHOUSING: Deeper Business Insights. Protect: ● Proactive Fraud Prevention ● Keep up with Regulatory Compliance ● Preempt Cyberthreats. Real-time response on massive data volume and variety. Optimize: ● Improve Operational Efficiency ● Support Internet of Things (IoT). New analytics techniques democratized to all users. Grow: ● Customer Sentiment ● Fault Prevention ● Improve Product Quality ● New Revenue Streams. Experimentation and collaboration at scale.
  23. NEW TRENDS IN DATA WAREHOUSING: Extreme Speed and Scale. More Data: ● Massive amounts handled faster at scale ● More variety from new sources (social media, IoT) ● Insight within minutes of new data arrival. Performance and flexibility at scale. More Workloads: ● 100’s of production grade deployments ● Enterprise grade dependability ● Strict security and governance. On-demand scale out, discovery, collaboration. More People: ● 1,000’s of new users and new user types ● 1,000’s of new use cases ● All skill levels: Analytics, Data Science, and Machine Learning. All workloads with a shared data experience.
  24. NEW TRENDS IN DATA WAREHOUSING: Managing Resources and Costs. Optimize Core Processes: ● Automation to reduce pressure on organizational bottlenecks ● Consistent user experience. Broaden data reach without increasing IT burden or costs. Self-Service Everything: ● Resource provisioning ● Workload development ● Optimizing and troubleshooting. Deliver on increased SLA pressures without runaway cost. Dynamic Consumption: ● Transient Workloads ● Short-lived Workloads ● Permanent Workloads ● Public, Private, Hybrid Cloud. Environmental flexibility and adaptive compute, storage.
  25. Quickly enable business analytics by sharing petabytes of verified data across thousands of users while surpassing demands of SLAs and costs.
  26. TRADITIONAL DATA WAREHOUSE: Many Months. Structured Data Sources (ERP, CRM, SCM); Staging; Transformations; ETL; ODS; Master Schema; EDW; Data Marts; Advanced Analytics, Dashboards, Ad Hoc, Canned Reports. Struggle to handle volume and variety. Limited access.
  27. MODERN DATA WAREHOUSE: Within Days. Variety of data sources/types; Data Store; Data Marts; Advanced Analytics, Dashboards, Ad Hoc, Canned Reports. Ingest & store all data at scale. Self-serve / On-demand.
  28. WHAT CONCEPTS SURVIVE? Data Modeling; Security & Governance; Reports & Dashboards.
  29. WHAT HAS CHANGED? Traditional DW vs. Modern DW: Supporting Role vs. Foundational Role; Primarily Internal vs. Internal & External; Constrained, Structured vs. Freeform, Multi-Structured; Planned ETLs vs. On-Demand Pipelines. Users; Data Exploration; Data Curation; Data & Analytics.
  30. WHAT IS NEW? Experimentation & Collaboration; Dynamic Consumption; Self Service Everything.
  31. CLOUDERA MODERN DATA WAREHOUSE: The modern platform for machine learning and analytics optimized for the cloud. ANALYTICS; DATA SCIENCE; EXTENSIBLE SERVICES; OPERATIONAL DATABASE; DATA ENGINEERING. Core Services: SECURITY, GOVERNANCE, WORKLOAD MANAGEMENT, INGEST & REPLICATION, DATA CATALOG. Storage Services: Object Store (S3, ADLS), Shared File Storage, Time Series Data Store.
  32. CLOUD NATIVE OPTION: MULTI-CLOUD PAAS SOLUTION. ● Quick time to value: no software or clusters to manage ● Bring warehouse to the data with zero copy simplicity ● Use your security policies with your data: no proprietary stacks ● Apply enterprise governance to transient workloads ● Shared data experience with SDX ● Optimized for Azure & AWS. DATA WAREHOUSE; GOVERNANCE; SECURITY; CONTROL PLANE; LIFECYCLE MANAGEMENT; MULTI-CLOUD (Amazon S3, Microsoft ADLS).
  33. TD BANK: Delivering “Legendary Customer Experience”. CHALLENGES: Significantly improve customer experience with sentiment analysis, behavioral patterns, and predictive modeling. Current system couldn’t handle: • Centralizing data from thousands of sources • Demands from increased users and use cases • Data cost and manageability at scale. SOLUTION: Modern Data Warehouse for customer marketing, fraud analytics and cybersecurity. • Ingest data from 100+ corporate systems • Centralized data into “the hands of those that need it much more quickly” • Significantly reduce storage and management costs. RESULTS: • 30% reduction in repeat customer complaints • 90% productivity improvement for analytics projects • 60% decrease in data management costs • 98% decrease in per TB storage costs. https://www.cloudera.com/more/customers/td-bank.html
  34. CLOUDERA DW - PARTING THOUGHTS. Hybrid Optimized; Shared Data Experience; Performance @Scale. Shared Data, Exponential Use Cases, Successful Outcomes.
  35. THANK YOU. https://www.cloudera.com/products/data-warehouse.html
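
Slide 9 names two migration strategies, "Lift and Shift" or "Incremental by Workload", with a workload breakdown and priorities. The sketch below illustrates that choice only; it is not taken from the deck. The `Workload` class, the `plan_waves` function, the workload names, and the convention that priority 1 migrates first are all hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Workload:
    name: str      # hypothetical workload label
    priority: int  # assumed convention: 1 = migrate first


def plan_waves(workloads, strategy="incremental"):
    """Group workloads into ordered migration waves.

    "lift_and_shift" moves everything in a single wave;
    "incremental" creates one wave per priority level, lowest number first.
    """
    if strategy == "lift_and_shift":
        return [sorted(w.name for w in workloads)]
    if strategy != "incremental":
        raise ValueError(f"unknown strategy: {strategy}")
    # groupby needs its input sorted by the grouping key.
    ordered = sorted(workloads, key=lambda w: (w.priority, w.name))
    return [
        [w.name for w in group]
        for _, group in groupby(ordered, key=lambda w: w.priority)
    ]


# Illustrative workload breakdown (names and priorities are made up).
workloads = [
    Workload("finance_reporting", 2),
    Workload("sales_dashboards", 1),
    Workload("marketing_sandbox", 3),
    Workload("inventory_olap", 1),
]
waves = plan_waves(workloads)
single_wave = plan_waves(workloads, strategy="lift_and_shift")
```

Here `waves` holds one list of workload names per priority level, so each wave can go through the function test, performance test, and DQ audit from slide 9 before the next one starts. `single_wave` holds all workloads in one list.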