Ovum Fireside Chat: Governing the data lake - Understanding what's in there

In Ovum’s upcoming Big Data Trends to Watch 2016 report, Tony Baer forecasts that data lake management will become a front-burner issue as early Hadoop adopters get to the point of production implementation.

During this fireside chat, Tony Baer and Scott Gidley, VP of Product Management at Zaloni, will assess the state of the industry regarding the governance and data management tools, technologies, and practices that should fall into place as part of a data lake strategy.

Watch the webinar here: http://hubs.ly/H03374z0

Published in: Technology

  1. Fireside Chat with Tony Baer, Ovum Research: Developing a Strategy for Data Lake Governance. Wednesday, May 18, 2016, 1:00 pm EST
  2. Meet today’s speakers
     •  Tony Baer, Principal Analyst, Information Management, Ovum. Tony Baer leads Ovum’s Big Data research area. His coverage focuses on how Big Data must become a first-class citizen in the data center, the IT organization, and the business. He has a multidisciplinary background touching the different tiers of enterprise software. He is an author and sought-after speaker.
     •  Scott Gidley, Vice President of Product, Zaloni. Scott is a nearly 20-year veteran of the data management software and services market. Prior to joining Zaloni, Scott served as senior director of product management at SAS and was previously CTO and cofounder of DataFlux Corporation. Scott received his BS in Computer Science from the University of Pittsburgh.
  3. Delivering on the business of big data
     •  Award-winning provider of enterprise data lake management solutions: integrated data lake management platform, self-service data preparation
     •  Data Lake Design and Implementation Services: POC, Pilot, Production, Operations, Training
     •  Data Science Professional Services
     •  Funded by top-tier technology investors
  4. Key Findings
     •  Data lakes must be managed
     •  Data lakes must have the capability to ingest all data & related metadata
     •  Data lakes will only succeed if they become shared resources
     •  Business users must be prepared to take responsibility for curating data
     •  Maturity & readiness of tools, technologies & best practices are works in progress
     •  Management & governance of data lakes should be a phased process
     Source: Ovum Big Data Report, Developing a Strategy for Data Lake Governance
  5. Data lake is a later stage of Hadoop adoption
     •  Group, Multi-department, Enterprise
     •  Log analytics, Sentiment Analysis, DW offload, Data Lake, Exploratory Analytics, Line of business analytic applications, Operational analytics
  6. Data lake use case maturity model
     •  IT, Data Scientists, Business
     •  Bulk storage of raw data, Exploratory Analytics, Line of business analytic applications, Operational analytics
     •  Migrate I/O-intensive operations (e.g., ELT)
     •  “Deep” analytics (e.g., segmentation, predictive, prescriptive modeling)
  7. Ovum’s data lake reference architecture
     •  Availability/Reliability (FT, HA, Backup/DR)
     •  Monitoring & troubleshooting
     •  Perimeter Security
     •  Data platform (Hadoop)
     •  Query/Analytics tools, programs
     •  Cost Optimization & Integration
     •  Data Inventory
     •  Data Curation
     •  Data-level security
     •  Self-service tier
     Legend: Data Lake building block, Hadoop platform management, End user tool
  8. Data lake challenges and complications
     •  Ingestion
     •  Lack of Visibility
     •  Privacy and Compliance
     •  Quality Issues
     •  Reliance on IT
     •  Reusability
     •  Rate of Change
     •  Skills Gap
     •  Complexity
     Building, Managing, Delivering:
     •  Enable the data lake: Ingest, Organize, Catalog
     •  Govern the data in the lake: Cleanse, Secure, Operationalize
     •  Engage the business: Discover, Enrich, Provision
     Zaloni Confidential and Proprietary
  9. Collaboration key to modern data management
     •  Data Curation: build your library of information
     •  Physical Inventory: know/manage what data is in the data lake
     •  Business & Analytics teams; Technology team
     •  Data profiling, data preparation, collaborative data enrichment, catalog, match data, derive master data, record data lineage
     •  Manage data access, track data lineage, tag for security, data retention
     •  Manage data access, tag for security, data retention, lifecycle & workflow, track data lineage
  10. Data lake reference architecture
     •  Source System: File Data, DB Data, ETL Extracts, Streaming
     •  Sources: OLTP or ODS, Enterprise Data Warehouse, Logs (or other unstructured data), Cloud Services
     •  Zones: Transient Loading Zone, Raw Data, Refined Data, Trusted Data, Discovery Sandbox, Consumption Zone
     •  Original unaltered data attributes, Tokenized Data, APIs
     •  Integrate to common format, Data Validation, Data Cleansing, Aggregations
     •  Reference Data, Master Data
     •  Data Wrangling, Data Discovery, Exploratory Analytics
     •  Data Lake: Metadata, Data Quality, Data Catalog, Security
     •  Consumers: Business Analysts, Researchers, Data Scientists
  11. DON’T GO IN THE DATA LAKE WITHOUT US
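
The zone layout on slide 10 (a transient loading zone followed by raw, refined, and trusted data) can be read as a sequence of stages that a dataset moves through while its lineage is recorded. The following is a minimal Python sketch of that reading, not Zaloni's or Ovum's implementation. The strictly linear zone order, the `Dataset` and `promote` names, and the pairing of each processing step with a particular zone transition are all assumptions made for illustration; the flattened slide text does not state them.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Zone(Enum):
    # Zone names come from slide 10; the linear ordering is an assumption.
    TRANSIENT_LOADING = 1
    RAW = 2
    REFINED = 3
    TRUSTED = 4


@dataclass
class Dataset:
    # Hypothetical record for one dataset tracked in the lake.
    name: str
    source: str  # e.g. "File Data", "DB Data", "ETL Extracts", "Streaming"
    zone: Zone = Zone.TRANSIENT_LOADING
    # Each lineage entry is (from_zone, to_zone, step description).
    lineage: List[Tuple[str, str, str]] = field(default_factory=list)


def promote(ds: Dataset, step: str) -> Dataset:
    """Move a dataset one zone forward and record the step as lineage."""
    if ds.zone is Zone.TRUSTED:
        raise ValueError(f"{ds.name} is already in the trusted zone")
    next_zone = Zone(ds.zone.value + 1)
    ds.lineage.append((ds.zone.name, next_zone.name, step))
    ds.zone = next_zone
    return ds


# Usage: the step descriptions reuse labels from the slide, but which step
# belongs to which transition is assumed here.
orders = Dataset(name="orders", source="DB Data")
promote(orders, "land original, unaltered data")
promote(orders, "integrate to common format, validate, cleanse")
promote(orders, "apply reference and master data")

for from_zone, to_zone, step in orders.lineage:
    print(f"{from_zone} -> {to_zone}: {step}")
```

Keeping the lineage on the dataset record itself is only one possible design. A real data lake would more likely store it in a separate metadata catalog, which is where slide 10 places Metadata, Data Quality, Data Catalog, and Security.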
