© 2016 IBM Corporation
Learn more about Data Lakes on ibm.com: https://ibm.biz/Bdswi9
IBM’s Data Lake – A Basic Definition
1st June 2016
Mandy Chessell CBE FREng CEng FBCS
Distinguished Engineer, Master Inventor
Analytics Group CTO Office
Data blues & skills issues
§ A disproportionate share of the time spent on analytics projects goes to data preparation: acquiring, preparing, formatting and normalizing the data.
§ In addition to raw data, augmented data and analytical assets can significantly speed up the analytics process and partially bridge the talent gap.
A growing demand …
Business teams want:
• Open access to more information.
• More powerful analysis and visualization tools.
IT teams are:
• Concerned about cost.
• Concerned about governance and regulatory requirements.
Big Data Lakes or Swamps?
§ As we collect data
• Can we preserve clarity?
• Do we know what we are collecting?
• Can we find the data we need?
§ Are we creating a data swamp?
§ How do we build trust in big data?
• Do we know what the data is being used for?
"The need for increased agility and accessibility for data analysis is the primary
driver for data lakes," said Andrew White, vice president and distinguished
analyst at Gartner. "Nevertheless, while it is certainly true that data lakes can
provide value to various parts of the organization, the proposition of enterprise
wide data management has yet to be realized."
http://www.gartner.com/newsroom/id/2809117
IBM’s Data Lake – designed for data access – with safeguards
IBM’s Data Lake = Efficient Management, Governance, Protection and Access.
[Diagram: Data Lake (System of Insight), comprising the Information Management and Governance Fabric, Data Lake Services and Data Lake Repositories.]
Users supported by IBM’s Data Lake
[Diagram: the Data Lake (System of Insight), comprising the Information Management and Governance Fabric, Data Lake Services and Data Lake Repositories, surrounded by the parties it supports.]
Users: Line of Business Teams, Analytics Teams, Governance, Risk and Compliance Team, Information Curator, Data Lake Operations.
Enterprise IT: Other Data Lakes, Systems of Engagement, Systems of Automation, Systems of Record, New Sources.
The subsystems inside IBM’s Data Lake
[Diagram: the subsystems of the Data Lake (System of Insight) and the parties that connect to them.]
Subsystems: Information Management and Governance Fabric, Catalogue, Self-Service Access, Enterprise IT Data Exchange, Data Lake Repositories, Analytics Engines.
Users: Analytics Teams, Governance, Risk and Compliance Team, Information Curator, Line of Business Teams, Data Lake Operations.
Enterprise IT: Other Data Lakes, Systems of Engagement, Systems of Automation, Systems of Record, New Sources.
View from the user community - fraud
[Diagram: fraud-related goals of the user communities around the data lake.]
• Conform to regulations.
• Investigate fraud cases.
• Develop new fraud models.
• Detect and prevent fraud (shown for three separate user groups).
The role of the catalogue
[Diagram: the Information Governance Catalogue (curation of metadata about stores, models and definitions), shown with the data stores and a sandbox.]
Activities supported by the catalogue:
• Search for, locate and download data and related artifacts.
• Provision sandboxes.
• Add additional insight into data sources through automated analysis.
• Develop data management models and implementations.
• Define governance policies, rules and classifications.
• Monitor compliance.
• View lineage (business and technical) and perform impact analysis.
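The search and impact-analysis activities above can be illustrated with a minimal Python sketch. This is not the API of IBM's Information Governance Catalogue: the `CatalogEntry` and `Catalog` classes, their fields and the sample entries are all hypothetical, chosen only to show how metadata search and lineage traversal might fit together.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical metadata record describing one data store."""
    name: str
    description: str
    classification: str                                     # e.g. "sensitive"
    derived_from: list[str] = field(default_factory=list)   # upstream lineage


class Catalog:
    """Toy in-memory catalogue: search plus impact analysis over lineage."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, term: str) -> list[str]:
        """Return names of entries whose name or description mentions term."""
        term = term.lower()
        return sorted(
            e.name for e in self._entries.values()
            if term in e.name.lower() or term in e.description.lower()
        )

    def impact_of(self, name: str) -> list[str]:
        """Return every entry derived, directly or indirectly, from name."""
        impacted: set[str] = set()
        frontier = [name]
        while frontier:
            current = frontier.pop()
            for e in self._entries.values():
                if current in e.derived_from and e.name not in impacted:
                    impacted.add(e.name)
                    frontier.append(e.name)
        return sorted(impacted)


# Hypothetical sample metadata for a fraud use case.
catalog = Catalog()
catalog.register(CatalogEntry("payments_raw", "Raw payment transactions", "sensitive"))
catalog.register(CatalogEntry("payments_clean", "Validated payment transactions",
                              "sensitive", derived_from=["payments_raw"]))
catalog.register(CatalogEntry("fraud_scores", "Fraud model output per payment",
                              "sensitive", derived_from=["payments_clean"]))

found = catalog.search("payment")
impacted = catalog.impact_of("payments_raw")
```

Here `impacted` lists the downstream stores that would need review if `payments_raw` changed, which is the question impact analysis is meant to answer.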
Governance ensures proper management and use of information
[Diagram: the areas of Information Governance and the question each one addresses.]
• Compliance: are people and systems operating properly?
• Quality: is data quality sufficient for use?
• Lifecycle: is data kept for an appropriate length of time?
• Protection: is data properly protected from loss or inappropriate use?
• Standards: are systems built to appropriate standards?
Policy activities: Policy Administration, Policy Enforcement, Policy Monitoring, Policy Implementation.
Governed aspects of information: Information Values Quality, Information Dependencies, Information Requirements, Information Supply Chain Integrity, Information Identification, Information Retention, Information Usage, Information Privacy, Information Architecture, Information Disposal.
Data lake security
§ The data lake’s repositories are accessed only by authorized processes.
§ People access the data in the data lake through the services:
• Identified through a common authentication mechanism (e.g. LDAP).
• Data classified in the catalogue.
• Access granted by business owners.
• Access controlled by data lake services.
• All activity monitored by probes that store log information in the audit data zone.
IBM’s Data Lake = Efficient Management, Governance, Protection and Access.
[Diagram: Data Lake, with its Information Management and Governance Fabric, Data Lake Services and Data Lake Repositories.]
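The access rules on this slide can be sketched as a small Python service layer. Everything here is an assumption made for illustration: `DataLakeService`, its fields and the user names are hypothetical, and a plain set of user names stands in for a real LDAP lookup. The sketch only shows the shape of the idea: classification drives access, business owners grant it, the service enforces it, and every attempt is audited.

```python
from dataclasses import dataclass, field


@dataclass
class DataLakeService:
    """Hypothetical service that fronts the repositories and enforces access."""
    authenticated_users: set[str]          # stand-in for an LDAP lookup
    classifications: dict[str, str]        # data set -> classification (from the catalogue)
    grants: set[tuple[str, str]] = field(default_factory=set)   # (user, classification)
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)

    def grant(self, owner: str, user: str, classification: str) -> None:
        """A business owner grants a user access to one classification of data."""
        self.grants.add((user, classification))
        self.audit_log.append((owner, f"grant:{user}", classification))

    def read(self, user: str, dataset: str) -> bool:
        """Return True if the read is allowed; every attempt is audited."""
        classification = self.classifications.get(dataset, "unclassified")
        allowed = (
            user in self.authenticated_users
            and (user, classification) in self.grants
        )
        outcome = "allowed" if allowed else "denied"
        self.audit_log.append((user, f"read:{dataset}", outcome))
        return allowed


service = DataLakeService(
    authenticated_users={"alice", "bob"},
    classifications={"payments_raw": "sensitive"},
)
service.grant("owner_carol", "alice", "sensitive")

alice_ok = service.read("alice", "payments_raw")      # authenticated and granted
bob_ok = service.read("bob", "payments_raw")          # authenticated, no grant
mallory_ok = service.read("mallory", "payments_raw")  # not authenticated
```

The sketch denies by default: a data set with no classification, or a user with no grant, is refused, and the refusal still lands in the audit log.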
IBM’s Data Lake – example deployment options
[Diagram: the subsystem view of the Data Lake (System of Insight), with the Information Management and Governance Fabric, Catalogue, Self-Service Access, Enterprise IT Data Exchange, Analytics Engines and the surrounding users and Enterprise IT systems, annotated with example IBM products.]
Products shown: InfoSphere Streams; InfoSphere Information Server; Cognos; Watson Explorer; Cloudant; Pure Data / BLU; InfoSphere BigInsights; InfoSphere Master Data Management; Watson Analytics; InfoSphere Information Server, Optim and Guardium; SPSS.
IBM’s Data Lake
§ As organizations experiment with analytics they discover:
• Creating new analytics requires access to historical data from many systems.
• This data includes valuable and sensitive data that is core to the organization’s operation.
• Hadoop is a flexible platform for storing many types of data but is not necessarily fast enough for the production deployment of some analytics. Data needs to be reformatted and copied onto specialist analytics platforms such as Netezza.
§ A data lake provides:
• Single extraction of data from operational systems and distribution to multiple analytics platforms.
• Cataloguing and governance of the data in the analytics platforms.
• Simple interfaces for the line of business to access the information they need.
IBM’s Data Lake = Efficient Management, Governance, Protection and Access.
[Diagram: Data Lake, with its Information Management and Governance Fabric, Data Lake Services and Data Lake Repositories.]
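The "single extraction, multiple distribution" point can be sketched in a few lines of Python. This is an illustrative assumption rather than IBM's implementation: the function `extract_once_and_distribute`, the loader callables and the sample order records are all hypothetical. The source system is read once, and the same batch is then handed to each analytics platform, which may reformat it for its own needs.

```python
from typing import Any, Callable, Iterable

Record = dict[str, Any]
Loader = Callable[[list[Record]], None]


def extract_once_and_distribute(
    extract: Callable[[], Iterable[Record]],
    loaders: dict[str, Loader],
) -> int:
    """Read the operational system once, then pass the batch to every platform."""
    batch = list(extract())            # the only read against the source
    for load in loaders.values():
        load(batch)
    return len(batch)


extract_calls = 0


def extract_orders() -> Iterable[Record]:
    """Hypothetical operational source; counts how often it is read."""
    global extract_calls
    extract_calls += 1
    yield {"order_id": 1, "amount": 120.0}
    yield {"order_id": 2, "amount": 75.5}


hadoop_copy: list[Record] = []       # stand-in for a general-purpose store
warehouse_copy: list[Record] = []    # stand-in for a specialist platform


def load_warehouse(batch: list[Record]) -> None:
    """Reformat records on the way in (amounts converted to integer cents)."""
    for r in batch:
        warehouse_copy.append(
            {"order_id": r["order_id"], "amount_cents": round(r["amount"] * 100)}
        )


count = extract_once_and_distribute(
    extract_orders,
    {"hadoop": hadoop_copy.extend, "warehouse": load_warehouse},
)
```

Adding another analytics platform means adding another loader, not another extraction job against the operational system.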
Governing and managing Big Data for Analytics and Decision Makers
§ An introduction to IBM’s Data Lake solution
http://www.redbooks.ibm.com/redpieces/abstracts/redp5120.html?Open
Designing and Operating a Data Reservoir
§ Description of the behaviour and processes that make up a data lake from IBM (aka data reservoir)
§ Blog
• 5 things to know about a data reservoir: https://www.ibm.com/developerworks/community/blogs/5things/entry/5_things_to_know_about_data_reservoir?lang=en
§ Redbook
• http://www.redbooks.ibm.com/Redbooks.nsf/RedpieceAbstracts/sg248274.html?Open
Ethics for Big Data and Analytics
✓ Context – For what purpose was the data originally surrendered? For what purpose is the data now being used? How far removed from the original context is its new use?
✓ Consent & Choice – What are the choices given to an affected party? Do they know they are making a choice? Do they really understand what they are agreeing to? Do they really have an opportunity to decline? What alternatives are offered?
✓ Reasonable – Is the depth and breadth of the data used and the relationships derived reasonable for the application it is used for?
✓ Substantiated – Are the sources of data used appropriate, authoritative, complete and timely for the application?
✓ Owned – Who owns the resulting insight? What are their responsibilities towards it in terms of its protection and the obligation to act?
✓ Fair – How equitable are the results of the application to all parties? Is everyone properly compensated?
✓ Considered – What are the consequences of the data collection and analysis?
✓ Access – What access to data is given to the data subject?
✓ Accountable – How are mistakes and unintended consequences detected and repaired? Can the interested parties check the results that affect them?
http://www.ibmbigdatahub.com/whitepaper/ethics-big-data-and-analytics
Common Information Models for an Open, Analytical and Agile World
§ To drive maximum value from complex IT projects, IT professionals need a deep understanding of the information their projects will use. Too often, however, IT treats information as an afterthought: the “poor stepchild” behind applications and infrastructure. That needs to change. This book will help you change it.
§ Using a complete case study, the authors explain what CIMs are, how to build them, and how to maintain them. You learn how to clarify the structure, meaning, and intent of any information you may exchange, and then use your CIM to improve integration, collaboration, and agility.
§ In today’s mobile, cloud, and analytics environments, your information is more valuable than ever. To build systems that make the most of it, start right here.
Data Lake: Taming the Data Dragon (White Paper)
Taming the data dragon leads to significant benefits across the enterprise, from improved productivity to increased effectiveness in sales and marketing. A data lake accepts data flows from any source and brings them into a common platform for use. Data is stored in its raw, unrefined state and located, processed, refined and extracted as required. However, governance needs to be applied to the data lake to ensure it becomes a trusted data source, rather than a formless landing area in which data is stored without consideration of its validity, value or shelf life.
Download Now: https://ibm.biz/Bdswiu

Weitere ähnliche Inhalte

Was ist angesagt?

Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleAdam Doyle
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureDatabricks
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsKhalid Salama
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 
Data Lake Architecture
Data Lake ArchitectureData Lake Architecture
Data Lake ArchitectureDATAVERSITY
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouseJames Serra
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptxAlex Ivy
 
DI&A Slides: Data Lake vs. Data Warehouse
DI&A Slides: Data Lake vs. Data WarehouseDI&A Slides: Data Lake vs. Data Warehouse
DI&A Slides: Data Lake vs. Data WarehouseDATAVERSITY
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks FundamentalsDalibor Wijas
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for DummiesRodney Joyce
 
Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationDenodo
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks DeltaDatabricks
 
Databricks on AWS.pptx
Databricks on AWS.pptxDatabricks on AWS.pptx
Databricks on AWS.pptxWasm1953
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data MeshLibbySchulze
 
Zero to Snowflake Presentation
Zero to Snowflake Presentation Zero to Snowflake Presentation
Zero to Snowflake Presentation Brett VanderPlaats
 
Data Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceData Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceAlation
 

Was ist angesagt? (20)

Snowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at ScaleSnowflake Data Science and AI/ML at Scale
Snowflake Data Science and AI/ML at Scale
 
Modernizing to a Cloud Data Architecture
Modernizing to a Cloud Data ArchitectureModernizing to a Cloud Data Architecture
Modernizing to a Cloud Data Architecture
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake Analytics
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Snowflake Overview
Snowflake OverviewSnowflake Overview
Snowflake Overview
 
Data Lake Architecture
Data Lake ArchitectureData Lake Architecture
Data Lake Architecture
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
 
DI&A Slides: Data Lake vs. Data Warehouse
DI&A Slides: Data Lake vs. Data WarehouseDI&A Slides: Data Lake vs. Data Warehouse
DI&A Slides: Data Lake vs. Data Warehouse
 
Databricks Fundamentals
Databricks FundamentalsDatabricks Fundamentals
Databricks Fundamentals
 
Azure purview
Azure purviewAzure purview
Azure purview
 
Data Sharing with Snowflake
Data Sharing with SnowflakeData Sharing with Snowflake
Data Sharing with Snowflake
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for Dummies
 
Enabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data VirtualizationEnabling a Data Mesh Architecture with Data Virtualization
Enabling a Data Mesh Architecture with Data Virtualization
 
Introducing Databricks Delta
Introducing Databricks DeltaIntroducing Databricks Delta
Introducing Databricks Delta
 
Databricks on AWS.pptx
Databricks on AWS.pptxDatabricks on AWS.pptx
Databricks on AWS.pptx
 
Lakehouse in Azure
Lakehouse in AzureLakehouse in Azure
Lakehouse in Azure
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data Mesh
 
Zero to Snowflake Presentation
Zero to Snowflake Presentation Zero to Snowflake Presentation
Zero to Snowflake Presentation
 
Data Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data IntelligenceData Catalog as the Platform for Data Intelligence
Data Catalog as the Platform for Data Intelligence
 

Andere mochten auch

Make data simple in the cognitive era
Make data simple in the cognitive eraMake data simple in the cognitive era
Make data simple in the cognitive eraIBM Analytics
 
Expert opinion on managing data breaches
Expert opinion on managing data breachesExpert opinion on managing data breaches
Expert opinion on managing data breachesIBM Analytics
 
IBM CDO Fall Summit 2016 Keynote: Driving innovation in the cognitive era
IBM CDO Fall Summit 2016 Keynote: Driving innovation in the cognitive eraIBM CDO Fall Summit 2016 Keynote: Driving innovation in the cognitive era
IBM CDO Fall Summit 2016 Keynote: Driving innovation in the cognitive eraIBM Analytics
 
The science of client insight: Increase revenue through improved engagement
The science of client insight: Increase revenue through improved engagementThe science of client insight: Increase revenue through improved engagement
The science of client insight: Increase revenue through improved engagementIBM Analytics
 
10 WealthTech podcasts every wealth advisor should listen to
10 WealthTech podcasts every wealth advisor should listen to10 WealthTech podcasts every wealth advisor should listen to
10 WealthTech podcasts every wealth advisor should listen toIBM Analytics
 
Top industry use cases for streaming analytics
Top industry use cases for streaming analyticsTop industry use cases for streaming analytics
Top industry use cases for streaming analyticsIBM Analytics
 

Andere mochten auch (6)

Make data simple in the cognitive era
Make data simple in the cognitive eraMake data simple in the cognitive era
Make data simple in the cognitive era
 
Expert opinion on managing data breaches
Expert opinion on managing data breachesExpert opinion on managing data breaches
Expert opinion on managing data breaches
 
IBM CDO Fall Summit 2016 Keynote: Driving innovation in the cognitive era
IBM CDO Fall Summit 2016 Keynote: Driving innovation in the cognitive eraIBM CDO Fall Summit 2016 Keynote: Driving innovation in the cognitive era
IBM CDO Fall Summit 2016 Keynote: Driving innovation in the cognitive era
 
The science of client insight: Increase revenue through improved engagement
The science of client insight: Increase revenue through improved engagementThe science of client insight: Increase revenue through improved engagement
The science of client insight: Increase revenue through improved engagement
 
10 WealthTech podcasts every wealth advisor should listen to
10 WealthTech podcasts every wealth advisor should listen to10 WealthTech podcasts every wealth advisor should listen to
10 WealthTech podcasts every wealth advisor should listen to
 
Top industry use cases for streaming analytics
Top industry use cases for streaming analyticsTop industry use cases for streaming analytics
Top industry use cases for streaming analytics
 

Ähnlich wie Data Lake: A simple introduction

02 a holistic approach to big data
02 a holistic approach to big data02 a holistic approach to big data
02 a holistic approach to big dataRaul Chong
 
Real-Time Data Integration for Modern BI
Real-Time Data Integration for Modern BIReal-Time Data Integration for Modern BI
Real-Time Data Integration for Modern BIibi
 
Watson data platform_sofia_20171017
Watson data platform_sofia_20171017Watson data platform_sofia_20171017
Watson data platform_sofia_20171017Mladen Jovanovski
 
Big Data Use Cases
Big Data Use CasesBig Data Use Cases
Big Data Use Casesaziksa
 
Aziksa hadoop for buisness users2 santosh jha
Aziksa hadoop for buisness users2 santosh jhaAziksa hadoop for buisness users2 santosh jha
Aziksa hadoop for buisness users2 santosh jhaData Con LA
 
Future of Power: Big Data - Søren Ravn
Future of Power: Big Data - Søren RavnFuture of Power: Big Data - Søren Ravn
Future of Power: Big Data - Søren RavnIBM Danmark
 
How Startups can leverage big data?
How Startups can leverage big data?How Startups can leverage big data?
How Startups can leverage big data?Rackspace
 
data analytics lecture2.pptx
data analytics lecture2.pptxdata analytics lecture2.pptx
data analytics lecture2.pptxNamrataBhatt8
 
krithi-talk-impact.ppt
krithi-talk-impact.pptkrithi-talk-impact.ppt
krithi-talk-impact.pptKRISHNARAJ207
 
Data-Ed Webinar: Data Warehouse Strategies
Data-Ed Webinar: Data Warehouse StrategiesData-Ed Webinar: Data Warehouse Strategies
Data-Ed Webinar: Data Warehouse StrategiesDATAVERSITY
 
From information to intelligence
From information to intelligence From information to intelligence
From information to intelligence Srini Koushik
 
Overview - IBM Big Data Platform
Overview - IBM Big Data PlatformOverview - IBM Big Data Platform
Overview - IBM Big Data PlatformVikas Manoria
 
Mis case study , Chapter 5, Chapter 6
Mis case study , Chapter 5, Chapter 6Mis case study , Chapter 5, Chapter 6
Mis case study , Chapter 5, Chapter 6Rakib Hasan
 
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Cambridge Semantics
 
A Practical Approach To Data Mining Presentation
A Practical Approach To Data Mining PresentationA Practical Approach To Data Mining Presentation
A Practical Approach To Data Mining Presentationmillerca2
 
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...Cynthia Saracco
 
Modernizing Architecture for a Complete Data Strategy
Modernizing Architecture for a Complete Data StrategyModernizing Architecture for a Complete Data Strategy
Modernizing Architecture for a Complete Data StrategyCloudera, Inc.
 
IBM Industry Models and Data Lake
IBM Industry Models and Data Lake IBM Industry Models and Data Lake
IBM Industry Models and Data Lake Pat O'Sullivan
 
C21027_Aditya_Big Data Analytics In Baking Sector.pptx
C21027_Aditya_Big Data Analytics In Baking Sector.pptxC21027_Aditya_Big Data Analytics In Baking Sector.pptx
C21027_Aditya_Big Data Analytics In Baking Sector.pptxAdityaDeshpande674450
 

Ähnlich wie Data Lake: A simple introduction (20)

02 a holistic approach to big data
02 a holistic approach to big data02 a holistic approach to big data
02 a holistic approach to big data
 
Real-Time Data Integration for Modern BI
Real-Time Data Integration for Modern BIReal-Time Data Integration for Modern BI
Real-Time Data Integration for Modern BI
 
Watson data platform_sofia_20171017
Watson data platform_sofia_20171017Watson data platform_sofia_20171017
Watson data platform_sofia_20171017
 
Big Data Use Cases
Big Data Use CasesBig Data Use Cases
Big Data Use Cases
 
Aziksa hadoop for buisness users2 santosh jha
Aziksa hadoop for buisness users2 santosh jhaAziksa hadoop for buisness users2 santosh jha
Aziksa hadoop for buisness users2 santosh jha
 
Future of Power: Big Data - Søren Ravn
Future of Power: Big Data - Søren RavnFuture of Power: Big Data - Søren Ravn
Future of Power: Big Data - Søren Ravn
 
How Startups can leverage big data?
How Startups can leverage big data?How Startups can leverage big data?
How Startups can leverage big data?
 
data analytics lecture2.pptx
data analytics lecture2.pptxdata analytics lecture2.pptx
data analytics lecture2.pptx
 
krithi-talk-impact.ppt
krithi-talk-impact.pptkrithi-talk-impact.ppt
krithi-talk-impact.ppt
 
krithi-talk-impact.ppt
krithi-talk-impact.pptkrithi-talk-impact.ppt
krithi-talk-impact.ppt
 
Data-Ed Webinar: Data Warehouse Strategies
Data-Ed Webinar: Data Warehouse StrategiesData-Ed Webinar: Data Warehouse Strategies
Data-Ed Webinar: Data Warehouse Strategies
 
From information to intelligence
From information to intelligence From information to intelligence
From information to intelligence
 
Overview - IBM Big Data Platform
Overview - IBM Big Data PlatformOverview - IBM Big Data Platform
Overview - IBM Big Data Platform
 
Mis case study , Chapter 5, Chapter 6
Mis case study , Chapter 5, Chapter 6Mis case study , Chapter 5, Chapter 6
Mis case study , Chapter 5, Chapter 6
 
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
 
A Practical Approach To Data Mining Presentation
A Practical Approach To Data Mining PresentationA Practical Approach To Data Mining Presentation
A Practical Approach To Data Mining Presentation
 
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...
Big Data: Introducing BigInsights, IBM's Hadoop- and Spark-based analytical p...
 
Modernizing Architecture for a Complete Data Strategy
Modernizing Architecture for a Complete Data StrategyModernizing Architecture for a Complete Data Strategy
Modernizing Architecture for a Complete Data Strategy
 
IBM Industry Models and Data Lake
IBM Industry Models and Data Lake IBM Industry Models and Data Lake
IBM Industry Models and Data Lake
 
C21027_Aditya_Big Data Analytics In Baking Sector.pptx
C21027_Aditya_Big Data Analytics In Baking Sector.pptxC21027_Aditya_Big Data Analytics In Baking Sector.pptx
C21027_Aditya_Big Data Analytics In Baking Sector.pptx
 

Mehr von IBM Analytics

Advantages of an integrated governance, risk and compliance environment
Advantages of an integrated governance, risk and compliance environmentAdvantages of an integrated governance, risk and compliance environment
Advantages of an integrated governance, risk and compliance environmentIBM Analytics
 
Cognitive banking with expert insights
Cognitive banking with expert insightsCognitive banking with expert insights
Cognitive banking with expert insightsIBM Analytics
 
Sales performance management and C-level goals
Sales performance management and C-level goalsSales performance management and C-level goals
Sales performance management and C-level goalsIBM Analytics
 
4 common headaches with sales compensation management
4 common headaches with sales compensation management4 common headaches with sales compensation management
4 common headaches with sales compensation managementIBM Analytics
 
IBM Virtual Finance Forum 2016: Top 10 reasons to attend
IBM Virtual Finance Forum 2016: Top 10 reasons to attendIBM Virtual Finance Forum 2016: Top 10 reasons to attend
IBM Virtual Finance Forum 2016: Top 10 reasons to attendIBM Analytics
 
Data science tips for data engineers
Data science tips for data engineersData science tips for data engineers
Data science tips for data engineersIBM Analytics
 
How secure is your enterprise from threats?
How secure is your enterprise from threats? How secure is your enterprise from threats?
How secure is your enterprise from threats? IBM Analytics
 
10 benefits to thinking inside Box
10 benefits to thinking inside Box10 benefits to thinking inside Box
10 benefits to thinking inside BoxIBM Analytics
 
The digital transformation of the French Open
The digital transformation of the French OpenThe digital transformation of the French Open
The digital transformation of the French OpenIBM Analytics
 
Bridging to a hybrid cloud data services architecture
Bridging to a hybrid cloud data services architectureBridging to a hybrid cloud data services architecture
Bridging to a hybrid cloud data services architectureIBM Analytics
 
What does data tell you about the customer journey?
What does data tell you about the customer journey?What does data tell you about the customer journey?
What does data tell you about the customer journey?IBM Analytics
 
What CEOs want from CDOs and how to deliver on it
What CEOs want from CDOs and how to deliver on itWhat CEOs want from CDOs and how to deliver on it
What CEOs want from CDOs and how to deliver on itIBM Analytics
 
Banking in the age of the empowered consumer
Banking in the age of the empowered consumerBanking in the age of the empowered consumer
Banking in the age of the empowered consumerIBM Analytics
 
Wimbledon fans love real-time analytics
Wimbledon fans love real-time analyticsWimbledon fans love real-time analytics
Wimbledon fans love real-time analyticsIBM Analytics
 
How IoT and weather data are transforming business decisions
How IoT and weather data are transforming business decisionsHow IoT and weather data are transforming business decisions
How IoT and weather data are transforming business decisionsIBM Analytics
 
The current challenges and opportunities of big data and analytics in emergen...
The current challenges and opportunities of big data and analytics in emergen...The current challenges and opportunities of big data and analytics in emergen...
The current challenges and opportunities of big data and analytics in emergen...IBM Analytics
 
Cognitive analytics: What's coming in 2016?
Cognitive analytics: What's coming in 2016?Cognitive analytics: What's coming in 2016?
Cognitive analytics: What's coming in 2016?IBM Analytics
 
IBM Cognos Analytics: Empowering business by infusing intelligence across the...
IBM Cognos Analytics: Empowering business by infusing intelligence across the...IBM Cognos Analytics: Empowering business by infusing intelligence across the...
IBM Cognos Analytics: Empowering business by infusing intelligence across the...IBM Analytics
 
Jen Q. Public: How analytics is impacting government, education and public sa...
Jen Q. Public: How analytics is impacting government, education and public sa...Jen Q. Public: How analytics is impacting government, education and public sa...
Jen Q. Public: How analytics is impacting government, education and public sa...IBM Analytics
 
5 common mistakes with sales incentive systems: Forgetting the management in ...
5 common mistakes with sales incentive systems: Forgetting the management in ...5 common mistakes with sales incentive systems: Forgetting the management in ...
5 common mistakes with sales incentive systems: Forgetting the management in ...IBM Analytics
 


Data Lake: A simple introduction

  • 1. IBM’s Data Lake – A Basic Definition
       1st June 2016
       Mandy Chessell CBE FREng CEng FBCS, Distinguished Engineer, Master Inventor, Analytics Group CTO Office
       © 2016 IBM Corporation. Learn more about Data Lakes on ibm.com: https://ibm.biz/Bdswi9
  • 2. Data blues & skills issues
       § A disproportionate share of the time spent in analytics projects goes to data preparation: acquiring, preparing, formatting and normalizing the data.
       § In addition to raw data, augmented data and analytical assets can significantly speed up the analytics process and partially bridge the talent gap.
  • 3. A growing demand …
       Business teams want:
         • Open access to more information
         • More powerful analysis and visualization tools
       IT teams are:
         • Concerned about cost
         • Concerned about governance and regulatory requirements
  • 4. Big Data Lakes or Swamps?
       § As we collect data:
         • Can we preserve clarity?
         • Do we know what we are collecting?
         • Can we find the data we need?
       § Are we creating a data swamp?
       § How do we build trust in big data?
         • Do we know what data is being used for?
  • 5. "The need for increased agility and accessibility for data analysis is the primary driver for data lakes," said Andrew White, vice president and distinguished analyst at Gartner. "Nevertheless, while it is certainly true that data lakes can provide value to various parts of the organization, the proposition of enterprise wide data management has yet to be realized."
       http://www.gartner.com/newsroom/id/2809117
  • 6. IBM’s Data Lake – designed for data access – with safeguards
       IBM’s Data Lake = Efficient Management, Governance, Protection and Access.
       [Diagram labels: Data Lake (System of Insight); Information Management and Governance Fabric; Data Lake Services; Data Lake Repositories]
  • 7. Users supported by IBM’s Data Lake
       [Diagram labels: Data Lake (System of Insight); Information Management and Governance Fabric; Data Lake Services; Data Lake Repositories; Line of Business Teams; Data Lake Operations; Analytics Teams; Governance, Risk and Compliance Team; Information Curator; Enterprise IT; Other Data Lakes; Systems of Engagement; Systems of Automation; Systems of Record; New Sources]
  • 8. The subsystems inside IBM’s Data Lake
       [Diagram labels: Data Lake (System of Insight); Information Management and Governance Fabric; Catalogue; Self-Service Access; Enterprise IT Data Exchange; Self-Service Access; Analytics Engines; Data Lake Repositories; Analytics Teams; Governance, Risk and Compliance Team; Information Curator; Line of Business Teams; Data Lake Operations; Enterprise IT; Other Data Lakes; Systems of Engagement; Systems of Automation; Systems of Record; New Sources]
  • 9. View from the user community - fraud
       [Diagram labels: Conform to regulations; Investigate Fraud Case; Develop new fraud models; Detect and prevent fraud (shown three times)]
  • 10. The role of the catalogue
       [Diagram labels: Information Governance Catalogue; Data Stores; Sand Box; Curation of Metadata about Stores, Models, Definitions]
       Activities shown around the catalogue:
         • Search for, locate and download data and related artifacts. Provision Sand Boxes.
         • Add additional insight into data sources through automated analysis.
         • Develop data management models and implementations.
         • Define governance policies, rules and classifications. Monitor compliance.
         • View lineage (business and technical) and perform impact analysis.
  • 11. Governance ensures proper management and use of information
       [Diagram labels, as extracted: Information Governance Compliance Policy Administration Policy Enforcement Policy Monitoring Policy Implementation Standards Protection Lifecycle Quality Information Values Quality Information Dependencies Information Requirements Information Supply Chain Integrity Information Identification Information Retention Information Usage Information Privacy Information Architecture Information Disposal]
       Questions posed on the slide:
         • Are people and systems operating properly?
         • Is data quality sufficient for use?
         • Is data kept for an appropriate length of time?
         • Is data properly protected from loss or inappropriate use?
         • Are systems built to appropriate standards?
  • 12. Data lake security
       § The data lake’s repositories are only accessed by authorized processes.
       § People access the data in the data lake through the services:
         • Identified through a common authentication mechanism (e.g. LDAP)
         • Data classified in the catalog
         • Access granted by business owners
         • Access controlled by data lake services
         • All activity monitored by probes that store log information in the audit data zone
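The access rules on slide 12 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `DataLakeService`, its methods and the sample names (`claims_2016`, `fraud_lead`, `analyst_a`) are hypothetical and do not come from the deck or from any IBM product. Authentication is reduced to a boolean flag, where a real deployment would delegate to a mechanism such as LDAP.

```python
from dataclasses import dataclass, field


@dataclass
class DataLakeService:
    """Hypothetical service layer that fronts the data lake repositories."""

    # data set name -> classification recorded in the catalog
    classifications: dict = field(default_factory=dict)
    # data set name -> business owner who may grant access
    owners: dict = field(default_factory=dict)
    # (user, data set) pairs approved by the business owner
    grants: set = field(default_factory=set)
    # stands in for the audit data zone
    audit_log: list = field(default_factory=list)

    def register(self, dataset, owner, classification):
        """Record the owner and classification of a data set in the catalog."""
        self.owners[dataset] = owner
        self.classifications[dataset] = classification

    def grant(self, owner, user, dataset):
        """Grant access; only the registered business owner may do so."""
        if self.owners.get(dataset) != owner:
            raise PermissionError(f"{owner} does not own {dataset}")
        self.grants.add((user, dataset))

    def can_read(self, user, authenticated, dataset):
        """Check a read request and log it, whether or not it succeeds."""
        allowed = bool(
            authenticated
            and dataset in self.classifications
            and (user, dataset) in self.grants
        )
        self.audit_log.append(
            {"user": user, "dataset": dataset, "allowed": allowed}
        )
        return allowed


service = DataLakeService()
service.register("claims_2016", owner="fraud_lead", classification="sensitive")
service.grant("fraud_lead", "analyst_a", "claims_2016")

granted = service.can_read("analyst_a", authenticated=True, dataset="claims_2016")
denied = service.can_read("analyst_b", authenticated=True, dataset="claims_2016")
```

The point of the sketch is that users never touch a repository directly: every request passes through the service, which checks identity, catalog classification and the owner's grant, and writes an audit record in all cases.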
  • 13. IBM’s Data Lake – example deployment options
       [Diagram: the subsystem view from slide 8, annotated with example products. The mapping of products to subsystems is not recoverable from the extracted text.]
       Products named: InfoSphere Streams; InfoSphere Information Server; Cognos; Watson Explorer; Cloudant; Pure Data / BLU; InfoSphere BigInsights; InfoSphere Master Data Management; Watson Analytics; InfoSphere Information Server, Optim and Guardium; SPSS
  • 14. IBM’s Data Lake
       § As organizations experiment with analytics they discover:
         • Creating new analytics requires access to historical data from many systems.
         • This data includes valuable and sensitive data that is core to the organization’s operation.
         • Hadoop is a flexible platform for storing many types of data but is not necessarily fast enough for the production deployment of some analytics. Data needs to be reformatted and copied onto a specialist analytics platform such as Netezza.
       § A data lake provides:
         • Single extraction of data from operational systems and distribution to multiple analytics platforms.
         • Cataloguing and governance of the data in the analytics platforms.
         • Simple interfaces for the line of business to access the information they need.
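The "single extraction, multiple platforms" idea on slide 14 can be illustrated with a minimal Python sketch. All names here (`extract_once_and_distribute`, `extract_orders`, the `hadoop` and `warehouse` loaders) are hypothetical; in-memory lists stand in for the operational system and the analytics platforms, and the reformatting step is an invented example.

```python
def extract_once_and_distribute(extract, platforms):
    """Run one extraction and hand the same records to every platform loader."""
    records = list(extract())  # single read of the operational system
    for load in platforms.values():
        load(records)
    return len(records)


extract_calls = []  # counts how often the operational system is read


def extract_orders():
    """Hypothetical operational source."""
    extract_calls.append(1)
    return [
        {"order_id": 1, "amount": 120.0},
        {"order_id": 2, "amount": 75.5},
    ]


hadoop_store = []     # keeps the records as extracted
warehouse_store = []  # keeps a reformatted copy


def load_warehouse(rows):
    """Reformat each record before loading it into the second platform."""
    for row in rows:
        warehouse_store.append(
            {"order_id": row["order_id"], "amount_cents": round(row["amount"] * 100)}
        )


platforms = {"hadoop": hadoop_store.extend, "warehouse": load_warehouse}
count = extract_once_and_distribute(extract_orders, platforms)
```

Each platform receives its own copy in its own format, while the operational system is queried only once.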
  • 15. Governing and managing Big Data for Analytics and Decision Makers
       § An introduction to IBM’s Data Lake solution
         http://www.redbooks.ibm.com/redpieces/abstracts/redp5120.html?Open
  • 16. Designing and Operating a Data Reservoir
       § Description of the behaviour and processes that make up a data lake from IBM (aka data reservoir)
       § Blog
         • 5 things to know about a data reservoir
           https://www.ibm.com/developerworks/community/blogs/5things/entry/5_things_to_know_about_data_reservoir?lang=en
       § Redbook
         • http://www.redbooks.ibm.com/Redbooks.nsf/RedpieceAbstracts/sg248274.html?Open
  • 17. Ethics for Big Data and Analytics
       ✓ Context – For what purpose was the data originally surrendered? For what purpose is the data now being used? How far removed from the original context is its new use?
       ✓ Consent & Choice – What are the choices given to an affected party? Do they know they are making a choice? Do they really understand what they are agreeing to? Do they really have an opportunity to decline? What alternatives are offered?
       ✓ Reasonable – Is the depth and breadth of the data used and the relationships derived reasonable for the application it is used for?
       ✓ Substantiated – Are the sources of data used appropriate, authoritative, complete and timely for the application?
       ✓ Owned – Who owns the resulting insight? What are their responsibilities towards it in terms of its protection and the obligation to act?
       ✓ Fair – How equitable are the results of the application to all parties? Is everyone properly compensated?
       ✓ Considered – What are the consequences of the data collection and analysis?
       ✓ Access – What access to data is given to the data subject?
       ✓ Accountable – How are mistakes and unintended consequences detected and repaired? Can the interested parties check the results that affect them?
       http://www.ibmbigdatahub.com/whitepaper/ethics-big-data-and-analytics
  • 18. Common Information Models for an Open, Analytical and Agile World
       § To drive maximum value from complex IT projects, IT professionals need a deep understanding of the information their projects will use. Too often, however, IT treats information as an afterthought: the “poor stepchild” behind applications and infrastructure. That needs to change. This book will help you change it.
       § Using a complete case study, the authors explain what CIMs are, how to build them, and how to maintain them. You learn how to clarify the structure, meaning, and intent of any information you may exchange, and then use your CIM to improve integration, collaboration, and agility.
       § In today’s mobile, cloud, and analytics environments, your information is more valuable than ever. To build systems that make the most of it, start right here.
  • 19. Data Lake: Taming the Data Dragon (White Paper)
       Taming the data dragon leads to significant benefits across the enterprise, from improved productivity to increased effectiveness in sales and marketing. A data lake accepts data flows from any source and brings them into a common platform for use. Data is stored in its raw, unrefined state and located, processed, refined and extracted as required. However, governance needs to be applied to the data lake to ensure it becomes a trusted data source, rather than a formless landing area in which data is stored without consideration of its validity, value or shelf life.
       Download Now: https://ibm.biz/Bdswiu