SlideShare ist ein Scribd-Unternehmen logo
1 von 10
Operationalizing your Data Lake:
Get Ready for Advanced Analytics
October 22nd , 2017
Parth Patel | Big Data Solutions Engineer
ppatel1@zaloni.com
2
Industry-leading enterprise
data lake management,
governance and
self-service platform
Expert data lake
professional services
(Design, Implementation,
Workshops, Training)
Solutions to simplify
implementation
and reduce business risk
Enabling the data-powered enterprise
Zaloni Confidential and Proprietary
3 Zaloni proprietary – do not duplicate without permission
Increased
Agility
New
Insights
Improved
Scalability
Data lakes are central to the modern data architecture
4 Zaloni Proprietary
Data architecture modernizationTraditionalModern
Data Lake
Sources ETL EDW
Derived
(Transformed)
Discovery Sandbox
EDW
Streaming
Unstructured Data
Various
Sources
Data Discovery
Analytics BI
Data Science
Data Discovery
Analytics BI
Zaloni Confidential and Proprietary - Provided under NDA
5 Zaloni Proprietary
0% of market
Optimize
Self-Organizing Data Lake
• Self-improving data
lake via machine
learning algorithms
• True democratization
of big data and
analytics
• Intelligent data
remediation and
curation
• Recommended Data
Security, and
Governance policies
• Lights out business
operations optimized
for business success
2% of market
Automate
Responsive Data Lake
• Self-Service Ingestion
& Provisioning
• 360 View of Customer,
Product, etc
• Enterprise Data
Discovery
• Operationalize
analytical models into
business fabric
• Enables immediate data
impact on business
operations
Manage
10% of market
Managed Data Lake
• Acquire useful data from
across the enterprise
• Improved visibility and
understanding via
managed Ingestion of
data and metadata
• Ensure security and
privacy of sensitive data
• Operationalize
data at scale
• Leverage enterprise
governance &
security policies
• Scalable production data
lake for new and improved
business insights
22% of market
Store
Data Swamp
• Hadoop on premises
or in the Cloud
• Limited visibility and
usability of data
• Limited corporate
oversight & governance
• Sandbox or Dev
Environments
• Ad hoc and incremental
growth of big data
applications
• Ad-hoc and exploratory
insights for individual
use cases
Zaloni Big Data Maturity Model
Stage:
Characteristics:
Descriptor:
Stage Today:
Business
Impact:
Ignore
66% of market
• Emphasis on
structured data
• Limited ability to
leverage data at
scale
• Business emphasis
on retrospective
reporting and
analysis
• Strong governance
and security policies
• Slow to
accommodate
business changes
Data Warehouse
Value Realized
6 Zaloni proprietary – do not duplicate without permission
Managing the Data Supply Chain from Source to Consumer
CONSUMERS
Business Analysts
Researchers
Data Scientists
Applications
• Data Lake Management Platform
• A software solution for data lake management that enables enterprise-wide scalability
• Provides end to end capabilities
Self-Service
Data
Data Lake Management Platform
Enable Govern Engage
Batch ingestion
Streaming
Ingestion
Auto
discovery
Data Quality
Data Privacy and
Security
Data Lifecycle
Management
CatalogMetadata
Management
Operationalize
Transformations
Self-Service
Data Preparation
PRODUCERS
File Data
Streaming
Relational
On-premise
7 Zaloni Proprietary
Data Lake Reference Architecture
• Data required for LOB specific views - transformed
from existing certified data
• Consumers are anyone with appropriate role-based access
• Standardized on corporate governance/ quality policies
• Consumers are anyone with appropriate role-based access
• Single version of truth
Transient
Landing Zone
Raw
Zone
Refined Zone
Trusted Zone
Sandbox
Data Lake
• Temporary store of
source data
• Consumers are IT,
Data Stewards
• Implemented in highly
regulation industries
• Original source data
ready for consumption
• Consumers are ETL
developers, data
stewards, some data
scientists
• Single source of truth
with history
• Data required for LOB specific views - transformed
from existing certified data
• Consumers are anyone with appropriate role-based access
Sensors
(or other time series data)
Relational Data
Stores
(OLTP/ODS/DW)
Logs
(or other unstructured
data)
Social and
shared data
8 Zaloni Proprietary
Machine learning for data lake implementations
Loyalty
Customer
Service
TransactionsMarketing
3rd Party
● Easily integrate data silos
● Probabilistic data matching and record
linkage
● Automatically classify, encrypt/mask
PII/Sensitive data for regulatory
compliance
Integrate Data
Silos
9 Zaloni Proprietary
• Extend data lake beyond Hadoop
• Catalog traditional sources
• Ingest datasets without IT
• Prepare & provision data to your tool of
choice
Increasing data lake adoption through self-service
Self-Service Data
Preparation
10 Zaloni Proprietary
DON’T GO IN THE
DATA LAKE
WITHOUT US
Questions?

Weitere ähnliche Inhalte

Was ist angesagt?

Data driven decision making through analytics and IoT
Data driven decision making through analytics and IoTData driven decision making through analytics and IoT
Data driven decision making through analytics and IoTAachen Data & AI Meetup
 
Constant Contact: An Online Marketing Leader’s Data Lake Journey
Constant Contact: An Online Marketing Leader’s Data Lake JourneyConstant Contact: An Online Marketing Leader’s Data Lake Journey
Constant Contact: An Online Marketing Leader’s Data Lake JourneySeeling Cheung
 
Houd controle over uw data
Houd controle over uw dataHoud controle over uw data
Houd controle over uw dataICT-Partners
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataCloudera, Inc.
 
Modern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationModern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationDenodo
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
 
5 Myths about Spark and Big Data by Nik Rouda
5 Myths about Spark and Big Data by Nik Rouda5 Myths about Spark and Big Data by Nik Rouda
5 Myths about Spark and Big Data by Nik RoudaSpark Summit
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Cloudera, Inc.
 
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...Cloudera, Inc.
 
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...Denodo
 
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike FergusonMapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike FergusonMapR Technologies
 
Performance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and morePerformance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and moreDenodo
 
Big Data Analytics Webinar
Big Data Analytics WebinarBig Data Analytics Webinar
Big Data Analytics WebinarEckerson Group
 
Enable Advanced Analytics with Hadoop and an Enterprise Data Hub
Enable Advanced Analytics with Hadoop and an Enterprise Data HubEnable Advanced Analytics with Hadoop and an Enterprise Data Hub
Enable Advanced Analytics with Hadoop and an Enterprise Data HubCloudera, Inc.
 
Big Data Maturity Scorecard
Big Data Maturity ScorecardBig Data Maturity Scorecard
Big Data Maturity ScorecardDataWorks Summit
 
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Cloudera, Inc.
 
Enterprise 360 - Graphs at the Center of a Data Fabric
Enterprise 360 - Graphs at the Center of a Data FabricEnterprise 360 - Graphs at the Center of a Data Fabric
Enterprise 360 - Graphs at the Center of a Data FabricPrecisely
 
Solution Architecture US healthcare
Solution Architecture US healthcare Solution Architecture US healthcare
Solution Architecture US healthcare sumiteshkr
 

Was ist angesagt? (20)

Data driven decision making through analytics and IoT
Data driven decision making through analytics and IoTData driven decision making through analytics and IoT
Data driven decision making through analytics and IoT
 
Constant Contact: An Online Marketing Leader’s Data Lake Journey
Constant Contact: An Online Marketing Leader’s Data Lake JourneyConstant Contact: An Online Marketing Leader’s Data Lake Journey
Constant Contact: An Online Marketing Leader’s Data Lake Journey
 
Houd controle over uw data
Houd controle over uw dataHoud controle over uw data
Houd controle over uw data
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big Data
 
Modern Data Management for Federal Modernization
Modern Data Management for Federal ModernizationModern Data Management for Federal Modernization
Modern Data Management for Federal Modernization
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data Hub
 
5 Myths about Spark and Big Data by Nik Rouda
5 Myths about Spark and Big Data by Nik Rouda5 Myths about Spark and Big Data by Nik Rouda
5 Myths about Spark and Big Data by Nik Rouda
 
Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8Building a Modern Analytic Database with Cloudera 5.8
Building a Modern Analytic Database with Cloudera 5.8
 
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
Limitless Data, Rapid Discovery, Powerful Insight: How to Connect Cloudera to...
 
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
Delivering Self-Service Analytics using Big Data and Data Virtualization on t...
 
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike FergusonMapR Enterprise Data Hub Webinar w/ Mike Ferguson
MapR Enterprise Data Hub Webinar w/ Mike Ferguson
 
Introduction to Data Engineering
Introduction to Data EngineeringIntroduction to Data Engineering
Introduction to Data Engineering
 
Performance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and morePerformance Acceleration: Summaries, Recommendation, MPP and more
Performance Acceleration: Summaries, Recommendation, MPP and more
 
Big Data Analytics Webinar
Big Data Analytics WebinarBig Data Analytics Webinar
Big Data Analytics Webinar
 
Enable Advanced Analytics with Hadoop and an Enterprise Data Hub
Enable Advanced Analytics with Hadoop and an Enterprise Data HubEnable Advanced Analytics with Hadoop and an Enterprise Data Hub
Enable Advanced Analytics with Hadoop and an Enterprise Data Hub
 
Big Data Maturity Scorecard
Big Data Maturity ScorecardBig Data Maturity Scorecard
Big Data Maturity Scorecard
 
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
 
Enterprise 360 - Graphs at the Center of a Data Fabric
Enterprise 360 - Graphs at the Center of a Data FabricEnterprise 360 - Graphs at the Center of a Data Fabric
Enterprise 360 - Graphs at the Center of a Data Fabric
 
Solution Architecture US healthcare
Solution Architecture US healthcare Solution Architecture US healthcare
Solution Architecture US healthcare
 
Lean Data Lineage
Lean Data LineageLean Data Lineage
Lean Data Lineage
 

Ähnlich wie Operationalizing your Data Lake: Get Ready for Advanced Analytics

Strata San Jose 2017 - Ben Sharma Presentation
Strata San Jose 2017 - Ben Sharma PresentationStrata San Jose 2017 - Ben Sharma Presentation
Strata San Jose 2017 - Ben Sharma PresentationZaloni
 
Creating a Modern Data Architecture
Creating a Modern Data ArchitectureCreating a Modern Data Architecture
Creating a Modern Data ArchitectureZaloni
 
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...Zaloni
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricNathan Bijnens
 
Setting Up the Data Lake
Setting Up the Data LakeSetting Up the Data Lake
Setting Up the Data LakeCaserta
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneySai Paravastu
 
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)Moacyr Passador
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersR+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersRevolution Analytics
 
From Foundation to Mastery – Building a Mature Analytics Roadmap - Manav Misra
From Foundation to Mastery – Building a Mature Analytics Roadmap - Manav MisraFrom Foundation to Mastery – Building a Mature Analytics Roadmap - Manav Misra
From Foundation to Mastery – Building a Mature Analytics Roadmap - Manav MisraMolly Alexander
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationDATAVERSITY
 
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution AnalyticsRevolution Analytics
 
What Data Do You Have and Where is It?
What Data Do You Have and Where is It? What Data Do You Have and Where is It?
What Data Do You Have and Where is It? Caserta
 
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...DataWorks Summit
 
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...DATAVERSITY
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceDATAVERSITY
 
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...MongoDB
 
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)Denodo
 
Data Governance for Data Lakes
Data Governance for Data LakesData Governance for Data Lakes
Data Governance for Data LakesKiran Kamreddy
 
Big Data Fabric: A Necessity For Any Successful Big Data Initiative
Big Data Fabric: A Necessity For Any Successful Big Data InitiativeBig Data Fabric: A Necessity For Any Successful Big Data Initiative
Big Data Fabric: A Necessity For Any Successful Big Data InitiativeDenodo
 

Ähnlich wie Operationalizing your Data Lake: Get Ready for Advanced Analytics (20)

Strata San Jose 2017 - Ben Sharma Presentation
Strata San Jose 2017 - Ben Sharma PresentationStrata San Jose 2017 - Ben Sharma Presentation
Strata San Jose 2017 - Ben Sharma Presentation
 
Creating a Modern Data Architecture
Creating a Modern Data ArchitectureCreating a Modern Data Architecture
Creating a Modern Data Architecture
 
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
Setting Up the Data Lake
Setting Up the Data LakeSetting Up the Data Lake
Setting Up the Data Lake
 
BAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, SydneyBAR360 open data platform presentation at DAMA, Sydney
BAR360 open data platform presentation at DAMA, Sydney
 
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)How to Quickly and Easily Draw Value  from Big Data Sources_Q3 symposia(Moa)
How to Quickly and Easily Draw Value from Big Data Sources_Q3 symposia(Moa)
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster AnswersR+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
R+Hadoop - Ask Bigger (and New) Questions and Get Better, Faster Answers
 
From Foundation to Mastery – Building a Mature Analytics Roadmap - Manav Misra
From Foundation to Mastery – Building a Mature Analytics Roadmap - Manav MisraFrom Foundation to Mastery – Building a Mature Analytics Roadmap - Manav Misra
From Foundation to Mastery – Building a Mature Analytics Roadmap - Manav Misra
 
The Shifting Landscape of Data Integration
The Shifting Landscape of Data IntegrationThe Shifting Landscape of Data Integration
The Shifting Landscape of Data Integration
 
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
12Nov13 Webinar: Big Data Analysis with Teradata and Revolution Analytics
 
What Data Do You Have and Where is It?
What Data Do You Have and Where is It? What Data Do You Have and Where is It?
What Data Do You Have and Where is It?
 
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
 
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
ADV Slides: The Evolution of the Data Platform and What It Means to Enterpris...
 
Five Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data GovernanceFive Things to Consider About Data Mesh and Data Governance
Five Things to Consider About Data Mesh and Data Governance
 
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
 
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
Data Virtualization enabled Data Fabric: Operationalize the Data Lake (APAC)
 
Data Governance for Data Lakes
Data Governance for Data LakesData Governance for Data Lakes
Data Governance for Data Lakes
 
Big Data Fabric: A Necessity For Any Successful Big Data Initiative
Big Data Fabric: A Necessity For Any Successful Big Data InitiativeBig Data Fabric: A Necessity For Any Successful Big Data Initiative
Big Data Fabric: A Necessity For Any Successful Big Data Initiative
 

Mehr von IDEAS - Int'l Data Engineering and Science Association

Mehr von IDEAS - Int'l Data Engineering and Science Association (20)

How to deliver effective data science projects
How to deliver effective data science projectsHow to deliver effective data science projects
How to deliver effective data science projects
 
Digital cracks in banking--Sid Nandi
Digital cracks in banking--Sid NandiDigital cracks in banking--Sid Nandi
Digital cracks in banking--Sid Nandi
 
“Full Stack” Data Science with R for Startups: Production-ready with Open-Sou...
“Full Stack” Data Science with R for Startups: Production-ready with Open-Sou...“Full Stack” Data Science with R for Startups: Production-ready with Open-Sou...
“Full Stack” Data Science with R for Startups: Production-ready with Open-Sou...
 
Battling Skynet: The Role of Humanity in Artificial Intelligence
Battling Skynet: The Role of Humanity in Artificial IntelligenceBattling Skynet: The Role of Humanity in Artificial Intelligence
Battling Skynet: The Role of Humanity in Artificial Intelligence
 
Implementing Artificial Intelligence with Big Data
Implementing Artificial Intelligence with Big DataImplementing Artificial Intelligence with Big Data
Implementing Artificial Intelligence with Big Data
 
Data Architecture (i.e., normalization / relational algebra) and Database Sec...
Data Architecture (i.e., normalization / relational algebra) and Database Sec...Data Architecture (i.e., normalization / relational algebra) and Database Sec...
Data Architecture (i.e., normalization / relational algebra) and Database Sec...
 
Blockchain Application in Real Estate Transactions
Blockchain Application in Real Estate TransactionsBlockchain Application in Real Estate Transactions
Blockchain Application in Real Estate Transactions
 
Learning to learn Model Behavior: How to use "human-in-the-loop" to explain d...
Learning to learn Model Behavior: How to use "human-in-the-loop" to explain d...Learning to learn Model Behavior: How to use "human-in-the-loop" to explain d...
Learning to learn Model Behavior: How to use "human-in-the-loop" to explain d...
 
Practical Machine Learning at Work
Practical Machine Learning at WorkPractical Machine Learning at Work
Practical Machine Learning at Work
 
Artificial Intelligence: Hype, Reality, Vision.
Artificial Intelligence: Hype, Reality, Vision.Artificial Intelligence: Hype, Reality, Vision.
Artificial Intelligence: Hype, Reality, Vision.
 
Introduction to Deep Reinforcement Learning
Introduction to Deep Reinforcement LearningIntroduction to Deep Reinforcement Learning
Introduction to Deep Reinforcement Learning
 
Best Practices in Data Partnerships Between Mayor's Office and Academia
Best Practices in Data Partnerships Between Mayor's Office and AcademiaBest Practices in Data Partnerships Between Mayor's Office and Academia
Best Practices in Data Partnerships Between Mayor's Office and Academia
 
Everything You Wish You Knew About Search
Everything You Wish You Knew About SearchEverything You Wish You Knew About Search
Everything You Wish You Knew About Search
 
AliMe Bot Platform Technical Practice - Alibaba`s Personal Intelligent Assist...
AliMe Bot Platform Technical Practice - Alibaba`s Personal Intelligent Assist...AliMe Bot Platform Technical Practice - Alibaba`s Personal Intelligent Assist...
AliMe Bot Platform Technical Practice - Alibaba`s Personal Intelligent Assist...
 
Data-Driven AI for Entertainment and Healthcare
Data-Driven AI for Entertainment and HealthcareData-Driven AI for Entertainment and Healthcare
Data-Driven AI for Entertainment and Healthcare
 
Generating Creative Works with AI
Generating Creative Works with AIGenerating Creative Works with AI
Generating Creative Works with AI
 
Using AI to Tackle the Future of Health Care Data
Using AI to Tackle the Future of Health Care DataUsing AI to Tackle the Future of Health Care Data
Using AI to Tackle the Future of Health Care Data
 
State of AI/ML in Real Estate
State of AI/ML in Real EstateState of AI/ML in Real Estate
State of AI/ML in Real Estate
 
Hot Dog, Not Hot Dog! Generate new training data without taking more photos.
Hot Dog, Not Hot Dog! Generate new training data without taking more photos.Hot Dog, Not Hot Dog! Generate new training data without taking more photos.
Hot Dog, Not Hot Dog! Generate new training data without taking more photos.
 
Machine Learning in Healthcare and Life Science
Machine Learning in Healthcare and Life ScienceMachine Learning in Healthcare and Life Science
Machine Learning in Healthcare and Life Science
 

Kürzlich hochgeladen

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 

Kürzlich hochgeladen (20)

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

Operationalizing your Data Lake: Get Ready for Advanced Analytics

  • 1. Operationalizing your Data Lake: Get Ready for Advanced Analytics October 22nd , 2017 Parth Patel | Big Data Solutions Engineer ppatel1@zaloni.com
  • 2. 2 Industry-leading enterprise data lake management, governance and self-service platform Expert data lake professional services (Design, Implementation, Workshops, Training) Solutions to simplify implementation and reduce business risk Enabling the data-powered enterprise Zaloni Confidential and Proprietary
  • 3. 3 Zaloni proprietary – do not duplicate without permission Increased Agility New Insights Improved Scalability Data lakes are central to the modern data architecture
  • 4. 4 Zaloni Proprietary Data architecture modernizationTraditionalModern Data Lake Sources ETL EDW Derived (Transformed) Discovery Sandbox EDW Streaming Unstructured Data Various Sources Data Discovery Analytics BI Data Science Data Discovery Analytics BI
  • 5. Zaloni Confidential and Proprietary - Provided under NDA 5 Zaloni Proprietary 0% of market Optimize Self-Organizing Data Lake • Self-improving data lake via machine learning algorithms • True democratization of big data and analytics • Intelligent data remediation and curation • Recommended Data Security, and Governance policies • Lights out business operations optimized for business success 2% of market Automate Responsive Data Lake • Self-Service Ingestion & Provisioning • 360 View of Customer, Product, etc • Enterprise Data Discovery • Operationalize analytical models into business fabric • Enables immediate data impact on business operations Manage 10% of market Managed Data Lake • Acquire useful data from across the enterprise • Improved visibility and understanding via managed Ingestion of data and metadata • Ensure security and privacy of sensitive data • Operationalize data at scale • Leverage enterprise governance & security policies • Scalable production data lake for new and improved business insights 22% of market Store Data Swamp • Hadoop on premises or in the Cloud • Limited visibility and usability of data • Limited corporate oversight & governance • Sandbox or Dev Environments • Ad hoc and incremental growth of big data applications • Ad-hoc and exploratory insights for individual use cases Zaloni Big Data Maturity Model Stage: Characteristics: Descriptor: Stage Today: Business Impact: Ignore 66% of market • Emphasis on structured data • Limited ability to leverage data at scale • Business emphasis on retrospective reporting and analysis • Strong governance and security policies • Slow to accommodate business changes Data Warehouse Value Realized
  • 6. 6 Zaloni proprietary – do not duplicate without permission Managing the Data Supply Chain from Source to Consumer CONSUMERS Business Analysts Researchers Data Scientists Applications • Data Lake Management Platform • A software solution for data lake management that enables enterprise-wide scalability • Provides end to end capabilities Self-Service Data Data Lake Management Platform Enable Govern Engage Batch ingestion Streaming Ingestion Auto discovery Data Quality Data Privacy and Security Data Lifecycle Management CatalogMetadata Management Operationalize Transformations Self-Service Data Preparation PRODUCERS File Data Streaming Relational On-premise
  • 7. 7 Zaloni Proprietary Data Lake Reference Architecture • Data required for LOB specific views - transformed from existing certified data • Consumers are anyone with appropriate role-based access • Standardized on corporate governance/ quality policies • Consumers are anyone with appropriate role-based access • Single version of truth Transient Landing Zone Raw Zone Refined Zone Trusted Zone Sandbox Data Lake • Temporary store of source data • Consumers are IT, Data Stewards • Implemented in highly regulation industries • Original source data ready for consumption • Consumers are ETL developers, data stewards, some data scientists • Single source of truth with history • Data required for LOB specific views - transformed from existing certified data • Consumers are anyone with appropriate role-based access Sensors (or other time series data) Relational Data Stores (OLTP/ODS/DW) Logs (or other unstructured data) Social and shared data
  • 8. 8 Zaloni Proprietary Machine learning for data lake implementations Loyalty Customer Service TransactionsMarketing 3rd Party ● Easily integrate data silos ● Probabilistic data matching and record linkage ● Automatically classify, encrypt/mask PII/Sensitive data for regulatory compliance Integrate Data Silos
  • 9. 9 Zaloni Proprietary • Extend data lake beyond Hadoop • Catalog traditional sources • Ingest datasets without IT • Prepare & provision data to your tool of choice Increasing data lake adoption through self-service Self-Service Data Preparation
  • 10. 10 Zaloni Proprietary DON’T GO IN THE DATA LAKE WITHOUT US Questions?