SlideShare a Scribd company logo
1 of 70
Download to read offline
Offload,	Transform,	and	Present	-	the	New	World	of	Data	Integration
Michael	Rainey
• Michael	Rainey	-	Technical	Advisor	
• Spreading	the	good	word	about	gluent	

products	with	the	world		
• Data	Integration	expertise	
• Oracle	ACE	Director	
• @mRainey
2
Introduction
we liberate enterprise data
3
About	Gluent
Globally	distributed	team	with	a	deep	
background	in	building	high	performance	
enterprise	applications	and	systems	…
Now	on	a	mission	to		
liberate	enterprise	data.
Gluent Data Platform
Advisor
Open Data Formats
Security
Orchestration
Access Offload Present
3
About	Gluent
Globally	distributed	team	with	a	deep	
background	in	building	high	performance	
enterprise	applications	and	systems	…
2nd	place	in	Strata+HadoopWorld	

Startup	Showcase	2017.	
(Demo	video	at	gluent.com)
Now	on	a	mission	to		
liberate	enterprise	data.
Recognized	by	Gartner	in		
their	Cool	Vendors	in	Data		
Management,	2017	report!	
Download	(no	registration	needed):	
https://gluent.com/cool-vendor-2017/
Listed	as	one	of	the	

“Top	10	Coolest	Big	Data	
Startups	of	2017”	by	CRN	
gluent.com/gluent-recognized-in-
top-10-coolest-startups-of-2017-on-crn
4
Gluent is running one-hour webinars each Wednesday in July beginning tomorrow, July 12.
See details below and register for your free spot today!



Apache Impala Internals

Speaker: Tanel Poder, Gluent

Wednesday, July 19 @ 12 PM CDT

gluent.com/event/gluent-webinar-apache-impala-internals-with-tanel-poder



Building an Analytics Platform with Oracle & Hadoop

Speakers: Gerry Moore & Suresh Irukulapati, Vistra Energy

Wednesday, July 26 @ 9 AM CDT

gluent.com/event/gluent-webinar-building-an-integrated-analytics-platform-with-oracle-and-hadoop
Gluent	Webinars	-	JULY	2017
5
Extract	Transform	Load	(ETL)
Source	1
Source	2
Source	X
EDW
Transform
Extract
Load
6
ETL	Examples	-	employee	data
EDW
HR
Network
Facilities
Security
Training
6
ETL	Examples	-	employee	data
EDW
HR
Network
Facilities
Security
Training
6
ETL	Examples	-	employee	data
EDW
HR
Network
Facilities
Security
Training
6
ETL	Examples	-	employee	data
EDW
HR
Network
Facilities
Security
Training
Employee
6
ETL	Examples	-	employee	data
EDW
HR
Network
Facilities
Security
Training
Employee
7
ETL	Examples	-	IoT	from	set-top	boxes
IoT
7
ETL	Examples	-	IoT	from	set-top	boxes
IoT
EDW
CRM
CRM
7
ETL	Examples	-	IoT	from	set-top	boxes
IoT
EDW
CRM
CRM
Enhanced
customer
data
7
ETL	Examples	-	IoT	from	set-top	boxes
IoT
EDW
CRM
CRM
Enhanced
customer
data
8
ETL	Examples	-	data	scientist	pipelines
8
ETL	Examples	-	data	scientist	pipelines
8
ETL	Examples	-	data	scientist	pipelines
8
ETL	Examples	-	data	scientist	pipelines
9
Typical	ETL	development	-	too	much	time	spent	on	“E”	and	“L”
reader = new TransformingReader(reader)
.add(new SetCalculatedField("AvailableCredit",
"parseDouble(CreditLimit) - parseDouble(Balance)"))
.add(new ExcludeFields("CreditLimit",
"Balance"));
DataWriter writer = new
JdbcWriter(getJdbcConnection(), "dp_credit_balance")
.setAutoCloseConnection(true);
JobTemplate.DEFAULT.transfer(reader, writer);
9
Typical	ETL	development	-	too	much	time	spent	on	“E”	and	“L”
ETL	Developer	/	Data	Engineer	
should	spend	their	time	on		
Transformations!
reader = new TransformingReader(reader)
.add(new SetCalculatedField("AvailableCredit",
"parseDouble(CreditLimit) - parseDouble(Balance)"))
.add(new ExcludeFields("CreditLimit",
"Balance"));
DataWriter writer = new
JdbcWriter(getJdbcConnection(), "dp_credit_balance")
.setAutoCloseConnection(true);
JobTemplate.DEFAULT.transfer(reader, writer);
10
We	still	need	to	move	data!
• Data	lake	-	enterprise	data	stored	in	raw	form	
• Data	hub	-	centralized	data	store	with	standards,	governance,	and	quality	
11
Data	lake	or	data	hub?
• Sharing	data	without	moving	data
• Metadata	is	stored	about	each	“source”	database
• Queries	are	passed	through	a	metadata	

layer,	which	provides	info	about	where	the	

data	lives	and	how	to	translate	the	query
12
What	about	data	federation?
Source	1
Source	2
Source	X
Source	1
Source	2
Source	X
App	1
App	2
App	1
App	2
• Sharing	data	without	moving	data
• Metadata	is	stored	about	each	“source”	database
• Queries	are	passed	through	a	metadata	

layer,	which	provides	info	about	where	the	

data	lives	and	how	to	translate	the	query
• But…
12
What	about	data	federation?
Source	1
Source	2
Source	X
Source	1
Source	2
Source	X
App	1
App	2
App	1
App	2
• Sharing	data	without	moving	data
• Metadata	is	stored	about	each	“source”	database
• Queries	are	passed	through	a	metadata	

layer,	which	provides	info	about	where	the	

data	lives	and	how	to	translate	the	query
• But…
• We	must	recode	applications!
• Data	remains	in	RDBMS	silos	and	are	

limited	to	CPU/storage	constraints
12
What	about	data	federation?
Source	1
Source	2
Source	X
Source	1
Source	2
Source	X
App	1
App	2
App	1
App	2
13
Three	things	we	need	the	ability	to	do:
1. Offload	data	from	Oracle	to	Hadoop		
2. Load	data	from	Hadoop	to	Oracle	
3. Query	Hadoop	data	in	Oracle
• “Offload”	-	copy	or	move	data	from	the	relational	database	to	Hadoop	
• Why	offload?	
• Store	data	in	a	centralized	location	for	access	and	sharing	
• Use	the	distributed,	parallel	processing	power	of	Hadoop	for	transformations	
• Enable	the	use	of	“new	world”	technologies	(Spark,	Impala,	etc)	
• Why	Hadoop?	
• We	can	now	afford	to	keep	a	copy	of	all	enterprise	data	for	data	sharing	reasons!
14
Offload,	not	Extract
15
How	to	offload?	There	are	many	options
Tool Offload	Data
Sqoop Yes
Oracle	Loader	for	Hadoop
Oracle	SQL	Connector	for	HDFS
ODBC	Gateway
Big	Data	SQL
Gluent	Data	Platform Yes
• Command	line	client	used	to	bulk	copy	data	from	a	relational	database	to	
HDFS	over	JDBC	connection	
• Also	works	in	reverse,	HDFS	to	RDBMS		
• Sqoop	generates	MapReduce	jobs	for	the	work
16
Bulk	data	load	with	Sqoop
sqoop import --connect jdbc:oracle:thin:@myserver:1521/MYDB1
--username myuser
--null-string ''
--null-non-string ''
--target-dir=/user/hive/warehouse/ssh/sales
--append -m1 --fetch-size=5000
--fields-terminated-by '','' --lines-terminated-by ''n''
--optionally-enclosed-by ''"'' --escaped-by ''"''
--split-by TIME_ID
--query ""SELECT * FROM SSH.SALES WHERE TIME_ID < DATE'1998-01-01'""
• Gluent	Advisor	helps	determine	which	tables	
and/or	partitions	are	candidates	for	offload	
• Offload	options:	
• Move	or	copy	100%	of	data	
• Move	“cold”	partitions	only	
• Data	can	be	kept	in-sync	with	an	incremental	
offload		
• Example:	when	a	partition	goes	inactive	(cold),	it	
can	be	offloaded	to	Hadoop	
• Data	can	be	updated	from	RDBMS	to	Hadoop
17
Offload	data	with	Gluent	Data	Platform
• Gluent	Advisor	helps	determine	which	tables	
and/or	partitions	are	candidates	for	offload	
• Offload	options:	
• Move	or	copy	100%	of	data	
• Move	“cold”	partitions	only	
• Data	can	be	kept	in-sync	with	an	incremental	
offload		
• Example:	when	a	partition	goes	inactive	(cold),	it	
can	be	offloaded	to	Hadoop	
• Data	can	be	updated	from	RDBMS	to	Hadoop
17
Offload	data	with	Gluent	Data	Platform
gluent.
80%	-	20%
offload -x -t EDW.BALANCE_DETAIL_MONTHLY --less-than-value=2017-05-01
• Gluent	Advisor	helps	determine	which	tables	
and/or	partitions	are	candidates	for	offload	
• Offload	options:	
• Move	or	copy	100%	of	data	
• Move	“cold”	partitions	only	
• Data	can	be	kept	in-sync	with	an	incremental	
offload		
• Example:	when	a	partition	goes	inactive	(cold),	it	
can	be	offloaded	to	Hadoop	
• Data	can	be	updated	from	RDBMS	to	Hadoop
17
Offload	data	with	Gluent	Data	Platform
gluent.
80%	-	20%
offload -x -t EDW.BALANCE_DETAIL_MONTHLY --less-than-value=2017-05-01
18
Three	things	we	need	the	ability	to	do:
1. Offload	data	from	Oracle	to	Hadoop		
2. Load	data	from	Hadoop	to	Oracle	
3. Query	Hadoop	data	in	Oracle
18
Three	things	we	need	the	ability	to	do:
1. Offload	data	from	Oracle	to	Hadoop		
2. Load	data	from	Hadoop	to	Oracle	
3. Query	Hadoop	data	in	Oracle
Present
• Sqoop	in	reverse	
• Reads	HDFS	files,	converts	to	JDBC	arrays	and	insert	(or	update)	statements
19
The	“L”	of	ETL	-	Loading	the	data
RDBMS
20
Forget	the	“L”oad.	Present	the	data	to	the	RDBMS
RDBMS
21
Load	the	data	from	Hadoop	to	RDBMS?
Present
21
Load	the	data	from	Hadoop	to	RDBMS?
Tool Offload	Data Load	Data
Sqoop Yes Yes
Oracle	Loader	for	Hadoop Yes
Oracle	SQL	Connector	for	HDFS Yes
ODBC	Gateway Yes
Big	Data	SQL Yes
Gluent	Data	Platform Yes
Present
Present
21
Load	the	data	from	Hadoop	to	RDBMS?
Tool Offload	Data Load	Data
Sqoop Yes Yes
Oracle	Loader	for	Hadoop Yes
Oracle	SQL	Connector	for	HDFS Yes
ODBC	Gateway Yes
Big	Data	SQL Yes
Gluent	Data	Platform Yes
Present
Present
• Oracle	SQL	Connector	for	HDFS	
• Uses	Oracle	External	Table	to	access	delimited	file	or	Hive	table	in	Hadoop	
• When	queried,	all	data	is	read	from	Hadoop	to	Oracle,	then	processed	
• Oracle	to	Hadoop	over	database	links	
• Create	a	database	link	from	Oracle	to	Hadoop	using	the	Impala	(or	Hive)	ODBC	
gateway	
• Pushes	filters	to	Hadoop	but	not	grouping	aggregates	
• Big	Data	SQL	
• Access	Hive	tables	via	an	Oracle	external	table	
• Predicate	pushdown	(not	aggregations	&	joins)
22
Present	data	to	Oracle
23
Gluent	Present
gluent.
23
Gluent	Present
gluent.
present -t "SH.SALES" -xv --target-name="SH.SALES_DIVISION_X"
23
Gluent	Present
gluent.
present -t "SH.SALES" -xv --target-name="SH.SALES_DIVISION_X"
Gluent	Present	can	
pushdown	predicates,	
aggregations,	and	joins
24
Three	things	we	need	the	ability	to	do:
1. Offload	data	from	Oracle	to	Hadoop		
2. Load	data	from	Hadoop	to	Oracle	
3. Query	Hadoop	data	in	Oracle
Present
25
How	to	query	the	data?	There	are	many	options
Tool Offload	Data Load	Data Allow	Query	
Data
Offload	
Query
Parallel	
Execution
Sqoop Yes Yes Yes
Oracle	Loader	for	Hadoop Yes Yes
Oracle	SQL	Connector	for	HDFS Yes Yes Yes
ODBC	Gateway Yes Yes Yes
Big	Data	SQL Yes Yes Yes Yes
Gluent	Data	Platform Yes Yes Yes YesPresent
25
How	to	query	the	data?	There	are	many	options
Tool Offload	Data Load	Data Allow	Query	
Data
Offload	
Query
Parallel	
Execution
Sqoop Yes Yes Yes
Oracle	Loader	for	Hadoop Yes Yes
Oracle	SQL	Connector	for	HDFS Yes Yes Yes
ODBC	Gateway Yes Yes Yes
Big	Data	SQL Yes Yes Yes Yes
Gluent	Data	Platform Yes Yes Yes YesPresent
26
Present	Data	From	Anywhere	To	Anywhere
gluent.
26
Present	Data	From	Anywhere	To	Anywhere
gluent.
26
Present	Data	From	Anywhere	To	Anywhere
gluent.
26
Present	Data	From	Anywhere	To	Anywhere
gluent.
• A	new	approach…a	new	acronym?
27
Offload	Transform	Present	(OTP?)
EDW
28
ETL	Examples	-	employee	data
EDW
HR
Network
Facilities
Security
Training
Employee
29
Now	using	OTP	-	employee	data
EDW
Employee
29
Now	using	OTP	-	employee	data
EDW
Employee
29
Now	using	OTP	-	employee	data
EDW
Employee
29
Now	using	OTP	-	employee	data
EDW
Employee
30
ETL	Examples	-	IoT	from	set-top	boxes
IoT
EDW
CRM
CRM
Enhanced
customer
data
EDW
CRM
CRM
IoT
Enhanced
customer
data
30
ETL	Examples	-	IoT	from	set-top	boxes
31
ETL	Examples	-	data	scientist	pipelines
31
ETL	Examples	-	data	scientist	pipelines
31
ETL	Examples	-	data	scientist	pipelines
• No	upfront	massive	project	cost	to	move	data	around.		
• Can	happen	in	minutes	/	hours,	not	weeks	/	months	
• Large	corporations	can	move	as	fast	as	startups	
• Data	Engineers	/	ETL	Developers	can	focus	on	transformations!	
• A	new	approach	to	data	sharing	across	enterprise		
• Offload	all	data	for	simple	access	
• Takes	a	different	mindset
32
OTP	simplifies	the	process	of	data	movement!
33
Data	Consolidation
SALES	
(customer	A)
Customers
Products Preferences
Promotions
Prices
SALES	
(customer	B)
Customers
Products Preferences
Promotions
Prices
SQL
33
Data	Consolidation
SALES	
(customer	A)
Customers
Products Preferences
Promotions
Prices
SALES	
(customer	B)
Customers
Products Preferences
Promotions
Prices
Gluent
Gluent
Offload
Customers	A
Customers	B
Raw	data
SQL
33
Data	Consolidation
SALES	
(customer	A)
Customers
Products Preferences
Promotions
Prices
SALES	
(customer	B)
Customers
Products Preferences
Promotions
Prices
Gluent
Gluent
Offload Transform	(optional)
Customers	A
Customers	B
->
Raw	data
SQL
33
Data	Consolidation
SALES	
(customer	A)
Customers
Products Preferences
Promotions
Prices
SALES	
(customer	B)
Customers
Products Preferences
Promotions
Prices
Gluent
Gluent
Offload Transform	(optional)
Customers	A
Customers	B
Customers	
ALL
->
Raw	data Consolidated	
data
Union	all,	SQL,	
Spark
SQL
33
Data	Consolidation
SALES	
(customer	A)
Customers
Products Preferences
Promotions
Prices
ALL
ALL
SALES	
(customer	B)
Customers
Products Preferences
Promotions
Prices
Gluent
Gluent
Offload Transform	(optional) Present
Customers	A
Customers	B
Products	ALL
Preferences	ALL
Prices	
Gluent
Customers	
ALL
Promotions
SALES	
(ALL)
-> ->
Raw	data Consolidated	
data
Customers	
ALL
Union	all,	SQL,	
Spark
SQLSQL
Virtualized	
tables
34
we liberate enterprise data
thank you!
35
Gluent is running one-hour webinars each Wednesday in July beginning tomorrow, July 12.
See details below and register for your free spot today!



Apache Impala Internals

Speaker: Tanel Poder, Gluent

Wednesday, July 19 @ 12 PM CDT

gluent.com/event/gluent-webinar-apache-impala-internals-with-tanel-poder



Building an Analytics Platform with Oracle & Hadoop

Speakers: Gerry Moore & Suresh Irukulapati, Vistra Energy

Wednesday, July 26 @ 9 AM CDT

gluent.com/event/gluent-webinar-building-an-integrated-analytics-platform-with-oracle-and-hadoop
Gluent	Webinars	-	JULY	2017

More Related Content

What's hot

DataOps - Lean principles and lean practices
DataOps - Lean principles and lean practicesDataOps - Lean principles and lean practices
DataOps - Lean principles and lean practicesLars Albertsson
 
Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
 Migration and Coexistence between Relational and NoSQL Databases by Manuel H... Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
Migration and Coexistence between Relational and NoSQL Databases by Manuel H...Big Data Spain
 
Advanced data science algorithms applied to scalable stream processing by Dav...
Advanced data science algorithms applied to scalable stream processing by Dav...Advanced data science algorithms applied to scalable stream processing by Dav...
Advanced data science algorithms applied to scalable stream processing by Dav...Big Data Spain
 
InfoTrack: Creating a single source of truth with the Elastic Stack
InfoTrack: Creating a single source of truth with the Elastic StackInfoTrack: Creating a single source of truth with the Elastic Stack
InfoTrack: Creating a single source of truth with the Elastic StackElasticsearch
 
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Big Data Spain
 
Deploying Big Data Platforms
Deploying Big Data PlatformsDeploying Big Data Platforms
Deploying Big Data PlatformsChris Kernaghan
 
Getting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming ArchitecturesGetting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming ArchitecturesSingleStore
 
Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...
Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...
Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...DataWorks Summit/Hadoop Summit
 
Iasi code camp 20 april 2013 testing big data-anca sfecla - embarcadero
Iasi code camp 20 april 2013 testing big data-anca sfecla - embarcaderoIasi code camp 20 april 2013 testing big data-anca sfecla - embarcadero
Iasi code camp 20 april 2013 testing big data-anca sfecla - embarcaderoCodecamp Romania
 
Airbyte @ Airflow Summit - The new modern data stack
Airbyte @ Airflow Summit - The new modern data stackAirbyte @ Airflow Summit - The new modern data stack
Airbyte @ Airflow Summit - The new modern data stackMichel Tricot
 
Using Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataUsing Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataDataWorks Summit/Hadoop Summit
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupBlake Irvine
 
How to Operationalise Real-Time Hadoop in the Cloud
How to Operationalise Real-Time Hadoop in the CloudHow to Operationalise Real-Time Hadoop in the Cloud
How to Operationalise Real-Time Hadoop in the CloudAttunity
 
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...Flink Forward
 
Big Data Computing Architecture
Big Data Computing ArchitectureBig Data Computing Architecture
Big Data Computing ArchitectureGang Tao
 
Digital Business Transformation in the Streaming Era
Digital Business Transformation in the Streaming EraDigital Business Transformation in the Streaming Era
Digital Business Transformation in the Streaming EraAttunity
 
Leveraging Spark to Democratize Data for Omni-Commerce with Shafaq Abdullah
Leveraging Spark to Democratize Data for Omni-Commerce with Shafaq AbdullahLeveraging Spark to Democratize Data for Omni-Commerce with Shafaq Abdullah
Leveraging Spark to Democratize Data for Omni-Commerce with Shafaq AbdullahDatabricks
 
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Big Data Spain
 
Modern data integration expert sessions
Modern data integration expert sessionsModern data integration expert sessions
Modern data integration expert sessionsJessicaMurrell3
 

What's hot (20)

DataOps - Lean principles and lean practices
DataOps - Lean principles and lean practicesDataOps - Lean principles and lean practices
DataOps - Lean principles and lean practices
 
Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
 Migration and Coexistence between Relational and NoSQL Databases by Manuel H... Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
Migration and Coexistence between Relational and NoSQL Databases by Manuel H...
 
Advanced data science algorithms applied to scalable stream processing by Dav...
Advanced data science algorithms applied to scalable stream processing by Dav...Advanced data science algorithms applied to scalable stream processing by Dav...
Advanced data science algorithms applied to scalable stream processing by Dav...
 
InfoTrack: Creating a single source of truth with the Elastic Stack
InfoTrack: Creating a single source of truth with the Elastic StackInfoTrack: Creating a single source of truth with the Elastic Stack
InfoTrack: Creating a single source of truth with the Elastic Stack
 
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
Stream Processing as Game Changer for Big Data and Internet of Things by Kai ...
 
Deploying Big Data Platforms
Deploying Big Data PlatformsDeploying Big Data Platforms
Deploying Big Data Platforms
 
Getting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming ArchitecturesGetting It Right Exactly Once: Principles for Streaming Architectures
Getting It Right Exactly Once: Principles for Streaming Architectures
 
Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...
Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...
Building a Graph Database in Neo4j with Spark & Spark SQL to gain new insight...
 
Iasi code camp 20 april 2013 testing big data-anca sfecla - embarcadero
Iasi code camp 20 april 2013 testing big data-anca sfecla - embarcaderoIasi code camp 20 april 2013 testing big data-anca sfecla - embarcadero
Iasi code camp 20 april 2013 testing big data-anca sfecla - embarcadero
 
Airbyte @ Airflow Summit - The new modern data stack
Airbyte @ Airflow Summit - The new modern data stackAirbyte @ Airflow Summit - The new modern data stack
Airbyte @ Airflow Summit - The new modern data stack
 
Using Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch dataUsing Hadoop to build a Data Quality Service for both real-time and batch data
Using Hadoop to build a Data Quality Service for both real-time and batch data
 
Netflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering MeetupNetflix Data Engineering @ Uber Engineering Meetup
Netflix Data Engineering @ Uber Engineering Meetup
 
How to Operationalise Real-Time Hadoop in the Cloud
How to Operationalise Real-Time Hadoop in the CloudHow to Operationalise Real-Time Hadoop in the Cloud
How to Operationalise Real-Time Hadoop in the Cloud
 
Big data pipelines
Big data pipelinesBig data pipelines
Big data pipelines
 
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
Virtual Flink Forward 2020: Netflix Data Mesh: Composable Data Processing - J...
 
Big Data Computing Architecture
Big Data Computing ArchitectureBig Data Computing Architecture
Big Data Computing Architecture
 
Digital Business Transformation in the Streaming Era
Digital Business Transformation in the Streaming EraDigital Business Transformation in the Streaming Era
Digital Business Transformation in the Streaming Era
 
Leveraging Spark to Democratize Data for Omni-Commerce with Shafaq Abdullah
Leveraging Spark to Democratize Data for Omni-Commerce with Shafaq AbdullahLeveraging Spark to Democratize Data for Omni-Commerce with Shafaq Abdullah
Leveraging Spark to Democratize Data for Omni-Commerce with Shafaq Abdullah
 
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...Finding the needle in the haystack: how Nestle is leveraging big data to defe...
Finding the needle in the haystack: how Nestle is leveraging big data to defe...
 
Modern data integration expert sessions
Modern data integration expert sessionsModern data integration expert sessions
Modern data integration expert sessions
 

Similar to Offload, Transform, and Present - the New World of Data Integration

Offload, Transform, and Present - the New World of Data Integration
Offload, Transform, and Present - the New World of Data IntegrationOffload, Transform, and Present - the New World of Data Integration
Offload, Transform, and Present - the New World of Data IntegrationMichael Rainey
 
Oracle Data Integrator 12c - Getting Started
Oracle Data Integrator 12c - Getting StartedOracle Data Integrator 12c - Getting Started
Oracle Data Integrator 12c - Getting StartedMichael Rainey
 
Agents for Agility - The Just-in-Time Enterprise Has Arrived
Agents for Agility - The Just-in-Time Enterprise Has ArrivedAgents for Agility - The Just-in-Time Enterprise Has Arrived
Agents for Agility - The Just-in-Time Enterprise Has ArrivedInside Analysis
 
Where the Warehouse Ends: A New Age of Information Access
Where the Warehouse Ends: A New Age of Information AccessWhere the Warehouse Ends: A New Age of Information Access
Where the Warehouse Ends: A New Age of Information AccessInside Analysis
 
Becoming a data driven organization
Becoming a data driven organization Becoming a data driven organization
Becoming a data driven organization Magnus Backman
 
Optimizing Your IT Strategy: 5 Steps to Successfull Hybrid IT
Optimizing Your IT Strategy: 5 Steps to Successfull Hybrid ITOptimizing Your IT Strategy: 5 Steps to Successfull Hybrid IT
Optimizing Your IT Strategy: 5 Steps to Successfull Hybrid ITSirius
 
Hybrid IT – A Winning Strategy
Hybrid IT – A Winning StrategyHybrid IT – A Winning Strategy
Hybrid IT – A Winning StrategyOneNeck
 
Euromoney's integration journey: Selecting SnapLogic's self-service integrati...
Euromoney's integration journey: Selecting SnapLogic's self-service integrati...Euromoney's integration journey: Selecting SnapLogic's self-service integrati...
Euromoney's integration journey: Selecting SnapLogic's self-service integrati...SnapLogic
 
Capgemini Leap Data Transformation Framework with Cloudera
Capgemini Leap Data Transformation Framework with ClouderaCapgemini Leap Data Transformation Framework with Cloudera
Capgemini Leap Data Transformation Framework with ClouderaCapgemini
 
Connecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud PlatformConnecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud PlatformConnectaDigital
 
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...actualtechmedia
 
Patternbuilders Founder Showcase Deck
Patternbuilders Founder Showcase DeckPatternbuilders Founder Showcase Deck
Patternbuilders Founder Showcase DeckMaryLudloff
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...Hortonworks
 
Hybrid Development Webinar - English
Hybrid Development Webinar - EnglishHybrid Development Webinar - English
Hybrid Development Webinar - EnglishCollabNet
 
DAS Slides: Emerging Trends in Data Architecture — What’s the Next Big Thing?
DAS Slides: Emerging Trends in Data Architecture — What’s the Next Big Thing?DAS Slides: Emerging Trends in Data Architecture — What’s the Next Big Thing?
DAS Slides: Emerging Trends in Data Architecture — What’s the Next Big Thing?DATAVERSITY
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataMatt Stubbs
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataMatt Stubbs
 

Similar to Offload, Transform, and Present - the New World of Data Integration (20)

Offload, Transform, and Present - the New World of Data Integration
Offload, Transform, and Present - the New World of Data IntegrationOffload, Transform, and Present - the New World of Data Integration
Offload, Transform, and Present - the New World of Data Integration
 
Oracle Data Integrator 12c - Getting Started
Oracle Data Integrator 12c - Getting StartedOracle Data Integrator 12c - Getting Started
Oracle Data Integrator 12c - Getting Started
 
Agents for Agility - The Just-in-Time Enterprise Has Arrived
Agents for Agility - The Just-in-Time Enterprise Has ArrivedAgents for Agility - The Just-in-Time Enterprise Has Arrived
Agents for Agility - The Just-in-Time Enterprise Has Arrived
 
Where the Warehouse Ends: A New Age of Information Access
Where the Warehouse Ends: A New Age of Information AccessWhere the Warehouse Ends: A New Age of Information Access
Where the Warehouse Ends: A New Age of Information Access
 
How cloud are you?
How cloud are you?How cloud are you?
How cloud are you?
 
Becoming a data driven organization
Becoming a data driven organization Becoming a data driven organization
Becoming a data driven organization
 
Optimizing Your IT Strategy: 5 Steps to Successfull Hybrid IT
Optimizing Your IT Strategy: 5 Steps to Successfull Hybrid ITOptimizing Your IT Strategy: 5 Steps to Successfull Hybrid IT
Optimizing Your IT Strategy: 5 Steps to Successfull Hybrid IT
 
Agile EcoSystem
Agile EcoSystemAgile EcoSystem
Agile EcoSystem
 
Hybrid IT – A Winning Strategy
Hybrid IT – A Winning StrategyHybrid IT – A Winning Strategy
Hybrid IT – A Winning Strategy
 
Euromoney's integration journey: Selecting SnapLogic's self-service integrati...
Euromoney's integration journey: Selecting SnapLogic's self-service integrati...Euromoney's integration journey: Selecting SnapLogic's self-service integrati...
Euromoney's integration journey: Selecting SnapLogic's self-service integrati...
 
Capgemini Leap Data Transformation Framework with Cloudera
Capgemini Leap Data Transformation Framework with ClouderaCapgemini Leap Data Transformation Framework with Cloudera
Capgemini Leap Data Transformation Framework with Cloudera
 
Connecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud PlatformConnecta Event: Big Query och dataanalys med Google Cloud Platform
Connecta Event: Big Query och dataanalys med Google Cloud Platform
 
Cloud Seeding
Cloud SeedingCloud Seeding
Cloud Seeding
 
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
Conquering Disaster Recovery Challenges and Out-of-Control Data with the Hybr...
 
Patternbuilders Founder Showcase Deck
Patternbuilders Founder Showcase DeckPatternbuilders Founder Showcase Deck
Patternbuilders Founder Showcase Deck
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
Hybrid Development Webinar - English
Hybrid Development Webinar - EnglishHybrid Development Webinar - English
Hybrid Development Webinar - English
 
DAS Slides: Emerging Trends in Data Architecture — What’s the Next Big Thing?
DAS Slides: Emerging Trends in Data Architecture — What’s the Next Big Thing?DAS Slides: Emerging Trends in Data Architecture — What’s the Next Big Thing?
DAS Slides: Emerging Trends in Data Architecture — What’s the Next Big Thing?
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on Data
 
Big Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on DataBig Data LDN 2017: The New Dominant Companies Are Running on Data
Big Data LDN 2017: The New Dominant Companies Are Running on Data
 

Recently uploaded

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 

Recently uploaded (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 

Offload, Transform, and Present - the New World of Data Integration