
Data Lakes - The Key to a Scalable Data Architecture

1,584 views

Ben Sharma, Zaloni's CEO, presents at the Strata Data Conference in London on May 24th, 2017.

Published in: Technology

  1. Data Lakes – The Key to a Scalable Data Architecture
     May 24th, 2017
     Ben Sharma | CEO
     ben@zaloni.com
  2. Enabling the data-powered enterprise
     •  Industry-leading enterprise data lake management, governance and self-service platform
     •  Expert data lake professional services (Design, Implementation, Workshops, Training)
     •  Solution-based packaged offerings to simplify implementation and reduce business risk
  3. Data lakes are central to the modern data architecture
     Zaloni Proprietary
     Increased Agility, New Insights, Improved Scalability
     •  Store all types of data in its raw format
     •  Create Refined, Standardized, Trusted datasets for various use cases
     •  Store data for longer periods of time to enable historical analysis
     •  Query and access data using a variety of methods
     •  Manage streaming and batch data in a converged platform
     •  Provide shorter time-to-insight with proper management and governance
  4. Data architecture modernization
     [Diagram: Traditional vs. Modern. Labels in slide order: Data Lake; Sources; ETL; EDW; Derived (Transformed); Discovery Sandbox; EDW; Streaming; Unstructured Data; Various Sources; Data Discovery; Analytics; BI; Data Science; Data Discovery; Analytics; BI]
  5. Zaloni Big Data Maturity Model
     Zaloni Confidential and Proprietary - Provided under NDA
     Row labels on the slide: Stage, Characteristics, Descriptor, Stage Today, Business Impact. Axis: Value Realized.
     Optimize: Self-Organizing Data Lake (0% of market)
     •  Self-improving data lake via machine learning algorithms
     •  True democratization of big data and analytics
     •  Intelligent data remediation and curation
     •  Recommended Data Security, and Governance policies
     •  Lights out business operations optimized for business success
     Automate: Responsive Data Lake (2% of market)
     •  Self-Service Ingestion & Provisioning
     •  360 View of Customer, Product, etc
     •  Enterprise Data Discovery
     •  Operationalize analytical models into business fabric
     •  Enables immediate data impact on business operations
     Manage: Managed Data Lake (10% of market)
     •  Acquire useful data from across the enterprise
     •  Improved visibility and understanding via managed Ingestion of data and metadata
     •  Ensure security and privacy of sensitive data
     •  Operationalize data at scale
     •  Leverage enterprise governance & security policies
     •  Scalable production data lake for new and improved business insights
     Store: Data Swamp (22% of market)
     •  Hadoop on premises or in the Cloud
     •  Limited visibility and usability of data
     •  Limited corporate oversight & governance
     •  Sandbox or Dev Environments
     •  Ad hoc and incremental growth of big data applications
     •  Ad-hoc and exploratory insights for individual use cases
     Ignore: Data Warehouse (66% of market)
     •  Emphasis on structured data
     •  Limited ability to leverage data at scale
     •  Business emphasis on retrospective reporting and analysis
     •  Strong governance and security policies
     •  Slow to accommodate business changes
  6. Data Lake Reference Architecture
     Sources: Sensors (or other time series data); Relational Data Stores (OLTP/ODS/DW); Logs (or other unstructured data); Social and shared data
     Transient Landing Zone
     •  Temporary store of source data
     •  Consumers are IT, Data Stewards
     •  Implemented in highly regulated industries
     Raw Zone
     •  Original source data ready for consumption
     •  Consumers are ETL developers, data stewards, some data scientists
     •  Single source of truth with history
     Trusted Zone
     •  Standardized on corporate governance/quality policies
     •  Consumers are anyone with appropriate role-based access
     •  Single version of truth
     Refined Zone
     •  Data required for LOB specific views, transformed from existing certified data
     •  Consumers are anyone with appropriate role-based access
     Sandbox
     •  Enables ad-hoc, exploratory analytics, experimentation
     •  Consumers are anyone with appropriate role-based access
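The zone layout on the reference architecture slide can be illustrated with a small access check. This is only a sketch: the zone keys, role names, the `ANY_GRANTED_ROLE` sentinel and the `can_read` helper are hypothetical and are not part of any Zaloni API. The assignment of consumers to zones follows the reading of the slide text given here.

```python
from dataclasses import dataclass

# Hypothetical sentinel: the zone is open to any role that has been granted access.
ANY_GRANTED_ROLE = "any-granted-role"


@dataclass(frozen=True)
class Zone:
    name: str
    purpose: str
    consumers: frozenset


# Zone descriptions paraphrase the slide; the role identifiers are illustrative.
ZONES = {
    "transient": Zone("Transient Landing Zone", "temporary store of source data",
                      frozenset({"it", "data-steward"})),
    "raw": Zone("Raw Zone", "original source data with history",
                frozenset({"etl-developer", "data-steward", "data-scientist"})),
    "trusted": Zone("Trusted Zone", "standardized, single version of truth",
                    frozenset({ANY_GRANTED_ROLE})),
    "refined": Zone("Refined Zone", "LOB-specific views from certified data",
                    frozenset({ANY_GRANTED_ROLE})),
    "sandbox": Zone("Sandbox", "ad-hoc, exploratory analytics",
                    frozenset({ANY_GRANTED_ROLE})),
}


def can_read(role: str, zone_name: str, granted_roles: frozenset = frozenset()) -> bool:
    """Return True if the role may read from the zone under this toy policy."""
    zone = ZONES[zone_name]
    if ANY_GRANTED_ROLE in zone.consumers:
        # Open zones still require an explicit role-based grant.
        return role in granted_roles
    return role in zone.consumers
```

For example, `can_read("data-steward", "raw")` is true, while an analyst can read the trusted zone only after being added to `granted_roles`.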
  7. Data Lake 360°: Zaloni’s integrated platform for data lakes
     1.  Enable the lake
     •  Leverage the full power of a scale-out architecture with an actionable, scalable data lake
     2.  Govern the data
     •  Improve data visibility, reliability and quality to reduce time-to-insight
     •  Safeguard sensitive data and enable regulatory compliance
     3.  Engage the business
     •  Foster a data-driven business through self-service data discovery and preparation
  8. Data Governance
     1.  Based on a foundation of metadata management
     2.  Lightweight and distributed
     3.  Hybrid – top down and bottom up approach
     Governance levels:
     •  Centrally governed, critical data elements
     •  Regionally governed, departmental data sets
     •  Locally governed, data used in specific applications
     Source: Gartner Data and Analytics 2017
  9. Metadata Management
     •  Central to a well-managed data lake – provides visibility, reliability and enables data governance
     •  Capture and manage operational, technical and business metadata
     •  Reduced time to insight for analytics
     •  Types of metadata:
        §  Where it resides, and how was it ingested
        §  What it means, and how it should be interpreted
        §  What governance policies apply to it
        §  What it's worth, and how its value can be expressed
        §  Who it's accessed and consumed by
        §  Which business processes downstream consume it
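As a rough illustration of the metadata types listed on the Metadata Management slide, the record below gives each question one field and reports which ones are still unanswered. The class, field and function names are assumptions made for this sketch; the slide does not define a schema.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class EntityMetadata:
    """One hypothetical metadata record per data lake entity."""
    entity: str
    location: str = ""                 # where it resides
    ingested_via: str = ""             # how it was ingested
    meaning: str = ""                  # what it means, how to interpret it
    governance_policies: List[str] = field(default_factory=list)
    business_value: str = ""           # what it's worth
    consumers: List[str] = field(default_factory=list)
    downstream_processes: List[str] = field(default_factory=list)


def missing_metadata(meta: EntityMetadata) -> List[str]:
    """List the fields that are still empty, in declaration order."""
    return [f.name for f in fields(meta) if not getattr(meta, f.name)]
```

A catalog could use a check like `missing_metadata` to flag entities whose business metadata has not yet been filled in.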
  10. Metadata Exchange Framework
      1.  Metadata sharing is critical for an integrated approach
      2.  Federated approach for metadata collection
      3.  Two way metadata exchange between the Data Lake and other Enterprise repositories
      [Diagram: Data Lake and Enterprise Metadata repository, linked by a two way exchange through the Metadata Exchange Framework]
  11. Managed Ingestion
      •  Ability to ingest vast amounts of data
      •  Ability to handle a wide variety of formats (streaming, files, custom) and sources
      •  Build in repeatability via automation to pick up incoming data and apply pre-defined processing
  12. Data Lineage
      •  See how data moves and how it is consumed in the data lake
      •  Safeguard data and reduce risk, always knowing where data has come from, where it is, and how it is being used
  13. Data Quality
      •  Rules-based data validation
      •  Integration with the managed data pipeline
      •  Stats and metrics for reporting and actions
      •  Automation, Remediation, Notifications
      Future:
      •  ML-based classification
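The Data Quality bullets (rules-based validation plus stats for reporting and actions) could look roughly like the sketch below. The rule names, record fields and the `validate` function are hypothetical examples, not a description of the Zaloni product.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Rule = Callable[[dict], bool]

# Hypothetical rules; each returns True when the record passes.
RULES: Dict[str, Rule] = {
    "customer_id_present": lambda r: bool(r.get("customer_id")),
    "amount_non_negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
}


def validate(
    records: Iterable[dict], rules: Dict[str, Rule]
) -> Tuple[Dict[str, Dict[str, int]], List[Tuple[dict, List[str]]]]:
    """Apply every rule to every record; return per-rule counts and rejected records."""
    stats = {name: {"passed": 0, "failed": 0} for name in rules}
    rejected = []
    for record in records:
        failed = [name for name, rule in rules.items() if not rule(record)]
        for name in rules:
            stats[name]["failed" if name in failed else "passed"] += 1
        if failed:
            # Keep the failing rule names so remediation or notification can act on them.
            rejected.append((record, failed))
    return stats, rejected
```

The `stats` dictionary stands in for the reporting metrics, and the `rejected` list is the hook where remediation or notifications would attach.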
  14. Data Security and Privacy
      •  Secure infrastructure for data in motion and data at rest
      •  Role-based access control for metadata and the data
      •  Mask or tokenize data before it is published in the lake for consumption
      •  Audit, access logs, alerts and notifications
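To make "mask or tokenize data before it is published" concrete, here is a minimal sketch using only the Python standard library. It assumes keyed hashing (HMAC-SHA256) as the tokenization method, which yields a stable but non-reversible token; a vault-based, reversible tokenization service would be a different design. The function names and field lists are hypothetical.

```python
import hashlib
import hmac
from typing import Iterable


def tokenize(value: str, key: bytes) -> str:
    """Replace a value with a deterministic keyed hash (not reversible)."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def mask(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def protect(record: dict, key: bytes,
            tokenize_fields: Iterable[str], mask_fields: Iterable[str]) -> dict:
    """Return a copy of the record that is safe to publish under this toy policy."""
    out = dict(record)
    for name in tokenize_fields:
        if name in out:
            out[name] = tokenize(str(out[name]), key)
    for name in mask_fields:
        if name in out:
            out[name] = mask(str(out[name]))
    return out
```

Because the token is deterministic for a given key, tokenized columns can still be joined on, while masked columns keep only enough characters for a human to recognize the value.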
  15. Data Lifecycle Management
      1.  Hot -> Warm -> Cold on an entity level based on policies/SLAs
      2.  Provide data management features to automate scheduling and orchestration of data movement between heterogeneous storage environments
      3.  Across on-premise and cloud environments
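The Hot -> Warm -> Cold progression can be sketched as a per-entity policy. The slide only says tiering is based on policies/SLAs, so using days since last access as the trigger is an assumption of this example, as are the `TierPolicy` and `tier_for` names.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TierPolicy:
    """Hypothetical per-entity policy: age thresholds in days since last access."""
    warm_after_days: int
    cold_after_days: int


def tier_for(last_accessed: date, today: date, policy: TierPolicy) -> str:
    """Pick the storage tier an entity should live in under the given policy."""
    age_days = (today - last_accessed).days
    if age_days >= policy.cold_after_days:
        return "cold"
    if age_days >= policy.warm_after_days:
        return "warm"
    return "hot"
```

A scheduler could evaluate `tier_for` for each entity and queue a move whenever the result differs from the entity's current tier.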
  16. Data Catalog
      •  See what data is available across your enterprise
      •  Contribute valuable business information to improve search and usage
      •  Use a shopping cart experience to create sandbox for ad-hoc and exploratory analytics
  17. Self-service Data Preparation
      •  Blend data in the lake without a costly IT project
      •  Perform interactive data-driven transformations
  18. Cloud and Hybrid Environments: Considerations for data lake in the cloud
      •  How do you create a cloud agnostic data lake platform?
      •  How do you deploy a cost-effective compute layer?
         §  Elastic compute layer
         §  Batch and near real-time
      •  How do you optimize storage?
         §  Support polyglot persistence
         §  Data Lifecycle Management
      •  How do you optimize network connectivity between Ground to Cloud?
      •  How do you meet enterprise security requirements?
  19. Building your blueprint
      Columns: 1. Questions, 2. Inputs, 3. Outcomes
      •  Questions: Business Drivers AND Business Questions, e.g. Where is fraud occurring? How do I optimize inventory?
      •  Remaining diagram labels, in slide order: Data; Use Cases; Platform; Subject Areas; Source System; Capabilities, Process; Ingest, Organize, Enrich, Explore; Roadmap; Managed Data Lake; Analytics Strategy
  20. New Buyer’s Guide on Data Lake Management and Governance
      •  Zaloni and industry analyst firm Enterprise Strategy Group collaborated on a guide to help you:
         1.  Define evaluation criteria and compare common options
         2.  Set up a successful proof of concept (PoC)
         3.  Develop an implementation that is future-proofed
      Download now at: resources.zaloni.com
  21. Free eBooks to help you future-proof your data lake initiative
      Download now at: resources.zaloni.com
  22. Data Lake Management and Governance Platform; Self-Service Data Platform