Webinar - Data Warehouse Augmentation: Cut Costs, Increase Power

489 views

Pradeep Varadan, Verizon's Wireline OSS Data Science Lead, and Scott Gidley, Zaloni's VP of Product Management, discuss the benefits of augmenting your data warehouse with a data lake in this webinar presentation.

Published in: Data & Analytics

  1. Data Warehouse Augmentation: Cut Costs, Increase Power (October 26, 2016)
  2. • Award-winning provider of enterprise data lake management solutions: integrated data lake management platform; self-service catalog and data preparation
     • Data lake design and implementation services: POC, pilot, production, operations, training
     • Data science professional services
  3. About our speakers (Zaloni Proprietary)
     • Pradeep Varadan, Verizon Wireline, OSS Data Science Leader: Varadan is a data scientist and enterprise architect who specializes in data challenges within telecommunications. He is tasked with providing a competitive edge by using data analytics to drive effective decision-making. He is skilled in creating systems that help people understand and make better decisions about rapid technology shifts, customer lifestyle and behavior trends, and relevant changes that impact the Verizon network.
     • Scott Gidley, Zaloni, VP Product Management: Gidley is responsible for the strategy and roadmap of existing and future products within the Zaloni portfolio. He is a nearly 20-year veteran of the data management software and services market. Prior to joining Zaloni, he served as senior director of product management at SAS and was previously CTO and cofounder of DataFlux Corporation.
  4. Current state of a corporate data flow architecture (Zaloni Confidential and Proprietary - Provided under NDA)
     [Diagram. Labels: Data Generators, Machines, Data Channels, Warehouses, Marts, Repositories, Data Stores, BI/Reporting]
  5. Key Challenges
     Business challenges:
     • Increased processing time/reduced response
     • Lack of data lineage/lack of visibility
     • Constant CapEx for hardware upgrades
     • Lack of access to history
     IT challenges:
     • Multiple data transfers
     • Multiple technology platforms with data copies
     • Constant performance tuning for CPU
     • Manual data offload for space management
  6. Resource consumption
     [Diagram. Labels: Sources, ETL, Staging, Warehouse, Report Mart, Data Discovery, Analytics, BI, ELT/Reporting/Mining]
  7. Typical utilization of RDBMS resources: we expend almost all CPU on low-business-value ETL
     [Chart. Axes: Business Value, CPU. Size indicates frequency of use. Items: ETL to Stage, Auditing (landing tables query), Data Mining (staging query), Ad-hoc Analysis (warehouse query), ETL to Warehouse, ETL to Reporting, Reporting (presentation table query)]
  8. ~80% of system capacity used for batch processing (ELT)
  9. Reduce cost of ELT/ETL by offloading to Hadoop
  10. The future of enterprise data flow
      [Diagram. Panel titles: Legacy, Modern, Future. Labels: Transactional Systems, Machines, Machine Logs/IoT, Structured Data, Structured/Unstructured Data, ETL, Data Lake, Sandbox, EDW, EDW + Sandbox, Data Marts, BI/Reporting, BI/Reporting/Analytics, Operational Dashboards/EDA/Mining/Reporting/Analytics]
  11. Data lakes are central to the modern data architecture
      • Increased agility
      • New insights
      • Improved scalability
  12. Data lake challenges (grouped on the slide under Building, Managing and Delivering)
      • Ingestion
      • Visibility and quality
      • Privacy and compliance
      • Timeliness
      • Reliance on IT
      • Reusability
      • Rate of change
      • Skills gap
      • Complexity
  13. Data Lake 360°: A holistic approach to actionable big data
      Steps: 1. Enable the lake; 2. Govern the data; 3. Engage the business
      • Foster a data-driven business through self-service data discovery and preparation
      • Safeguard sensitive data and enable regulatory compliance
      • Improve data visibility, reliability and quality to reduce time-to-insight
      • Leverage the full power of a scale-out architecture with an actionable, scalable data lake
  14. Enable the lake
      • Managed ingestion
        - Ability to ingest vast amounts of data
        - Ability to handle a wide variety of formats (streaming, files, custom) and sources
        - Build in repeatability through automation to pick up incoming data and apply pre-defined processing
      • Metadata management
        - Capture and manage operational, technical and business metadata
        - Provides visibility and reliability, which are key to finding data in the lake
        - Reduced time to insight for analytics
        - File- and record-level watermarking provides data lineage and enables audit and traceability
  15. Govern the data
      • Data lineage
        - See how data moves and how it is consumed in the data lake
        - Safeguard data and reduce risk by always knowing where data has come from, where it is, and how it is being used
      • Data quality
        - Rules-based data validation
        - Integration with the Managed Data Pipeline
        - Stats and metrics for reporting and actions
  16. Govern the data
      • Data security and privacy
        - Differing permissions require enhanced data security
        - Mask or tokenize data before it is published in the lake for consumption
        - Policy-based security
      • Data lifecycle management across tiered storage environments
        - Hot -> Warm -> Cold on an entity level, based on policies/SLAs
        - Across on-premises and cloud environments
        - Provide data management features to automate scheduling and orchestration of data movement between heterogeneous storage environments
  17. Engage the business
      • Data catalog
        - See what data is available across your enterprise
        - Contribute valuable business information to improve search and usage
        - Use a shopping cart experience to create a sandbox for ad-hoc and exploratory analytics
      • Self-service data preparation
        - Blend data in the lake without a costly IT project
        - Perform interactive data-driven transformations
        - Collaborate and share data assets and transformations with peers
  18. Data lake reference architecture
      Sources: sensors (or other time series data); relational data stores (OLTP/ODS/DW); logs (or other unstructured data); social and shared data
      Data lake zones:
      • Transient Landing Zone: temporary store of source data; consumers are IT and data stewards; implemented in highly regulated industries
      • Raw Zone: original source data ready for consumption; consumers are ETL developers, data stewards and some data scientists; single source of truth with history
      • Trusted Zone: standardized on corporate governance/quality policies; consumers are anyone with appropriate role-based access; single version of truth
      • Refined Zone: data required for LOB-specific views, transformed from existing certified data; consumers are anyone with appropriate role-based access
      • Sandbox
  19. Data lake reference architecture with Zaloni
      [Diagram]
      • Source System: file data, DB data, ETL extracts, streaming; sensors (or other time series data); relational data stores (OLTP/ODS/DW); logs (or other unstructured data); social and shared data
      • Data Lake: Transient Landing Zone, Raw Zone, Refined Zone, Trusted Zone, Sandbox
      • Data Lake Management & Governance Platform: APIs, Metadata Management, Data Quality, Data Catalog, Security
      • Consumption Zone: business analysts, researchers, data scientists; EDW; data marts
  20. Four great reasons to augment with a data lake
      • Save millions in storage costs
      • Significantly speed up processing
      • Maximize the data warehouse for BI
      • Extract more value from all of your data
  21. Centralized data, decentralized access
      [Diagram centered on the Data Lake]
      • Business Users: Business Analyst, Business Manager, Data Scientist, Business SME, Operations Manager
      • Questions: What happened? What is happening? What will happen? What can we control? Can I see the data?
      • IT Team: IT Analyst, Programmer, DBA/Modeler, Data Scientist, Data Engineer
      • Activities: Code Analysis, App Implementation, App Prototype, Data Model, Code Development
  22. Questions?
  23. Data Lake Management and Governance Platform; Self-Service Data Preparation
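
Slides 15, 16 and 18 describe rules-based data validation, masking or tokenizing data before it is published, and moving data from the Raw Zone toward the Trusted Zone. The following is a minimal Python sketch of how those ideas might fit together; it is not Zaloni's implementation. The field names, the RULES table, the SENSITIVE_FIELDS set, the salt, and the validate, tokenize and promote helpers are all hypothetical, and the salted hash only stands in for a real tokenization service.

```python
import hashlib
from typing import Callable

# Hypothetical validation rules: field name -> predicate the value must satisfy.
RULES: dict[str, Callable[[object], bool]] = {
    "customer_id": lambda v: isinstance(v, str) and v != "",
    "usage_mb": lambda v: isinstance(v, (int, float)) and v >= 0,
}

# Hypothetical set of fields that must not be published in clear text.
SENSITIVE_FIELDS = {"phone"}


def validate(record: dict) -> list[str]:
    """Return the names of the fields that fail their rule (empty list = valid)."""
    return [field for field, rule in RULES.items() if not rule(record.get(field))]


def tokenize(value: object, salt: str = "demo-salt") -> str:
    """Replace a value with a deterministic token (a stand-in for real tokenization)."""
    digest = hashlib.sha256((salt + str(value)).encode("utf-8")).hexdigest()
    return digest[:16]


def promote(raw_records: list[dict]) -> tuple[list[dict], dict]:
    """Validate raw records, tokenize sensitive fields, and collect quality stats."""
    trusted = []
    stats = {"passed": 0, "failed": 0}
    for record in raw_records:
        if validate(record):
            stats["failed"] += 1  # rejected records stay in the raw zone
            continue
        trusted.append(
            {k: tokenize(v) if k in SENSITIVE_FIELDS else v for k, v in record.items()}
        )
        stats["passed"] += 1
    return trusted, stats
```

Calling promote on a batch of raw records returns the records that passed every rule, with sensitive fields tokenized, plus pass/fail counts that could feed the "stats and metrics for reporting and actions" mentioned on slide 15.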

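
Slide 16 mentions policy-driven data lifecycle management that moves data Hot -> Warm -> Cold at the entity level. One way such a policy might be expressed is sketched below in Python. The POLICIES table, its thresholds, the entity names and the storage_tier function are illustrative assumptions, not part of the presentation or of any Zaloni product.

```python
from datetime import date

# Hypothetical per-entity policies: days since last access before data is demoted.
POLICIES = {
    "call_records": {"warm_after_days": 30, "cold_after_days": 180},
    "default": {"warm_after_days": 90, "cold_after_days": 365},
}


def storage_tier(entity: str, last_accessed: date, today: date) -> str:
    """Pick the storage tier for an entity from the age of its last access."""
    policy = POLICIES.get(entity, POLICIES["default"])
    age_days = (today - last_accessed).days
    if age_days >= policy["cold_after_days"]:
        return "cold"
    if age_days >= policy["warm_after_days"]:
        return "warm"
    return "hot"
```

A scheduler could evaluate storage_tier for each entity on a regular basis and trigger a move whenever the returned tier differs from the current one; the orchestration of the actual data movement is outside the scope of this sketch.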