Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
We do Hadoop together.
Modern Data Architecture for Data Transformation and Acquisition with Oracle® and Apache™ Hadoop®
Quick Housekeeping
Q&A box is available for your questions
Webinar will be recorded for future viewing
Thank you for joining!
Your Presenters
• Jeff Pollock
– Vice President, Product Management, Oracle
– Previously responsible for IBM InfoSphere Information
Integration & Governance products
– Author of “Semantic Web for Dummies” and “Adaptive Information”
• Tim Hall
– Vice President, Product Management, Hortonworks
– Previously responsible for Oracle’s outbound product
management covering the Business Process
Management Suite, SOA Suite
Today’s Topics
• Drivers for the Modern Data Architecture
• New Analytic Applications for New Types of Data
• Hadoop as the solution for the Data Lake
• Hortonworks and Oracle Data Integration teaming up
• Oracle patterns for successful Data Reservoirs
• Oracle Data Integration Strengths in Hadoop
• Oracle Data Governance for Hadoop
• Q&A
Poll: Where are you in your Hadoop journey?
1. Researching our options
2. Currently evaluating some software
3. Deep in a trial
4. In production with a Hadoop cluster
5. What’s Hadoop?
A Data Architecture Under Pressure From New Data
APPLICATIONS
DATA SYSTEM
REPOSITORIES
SOURCES
Existing Sources
(CRM, ERP, Clickstream, Logs)
RDBMS EDW MPP
Business
Analytics
Custom
Applications
Packaged
Applications
Source: IDC
2.8 ZB in 2012
85% from New Data Types
15x Machine Data by 2020
40 ZB by 2020
OLTP, ERP, CRM Systems
Unstructured documents, emails
Clickstream
Server logs
Sentiment, Web Data
Sensor, Machine Data
Geolocation
Hadoop Within An Emerging Modern Data Architecture
OPERATIONS TOOLS
Provision,
Manage &
Monitor
DEV & DATA TOOLS
Build &
Test
DATA SYSTEM
REPOSITORIES
SOURCES
RDBMS EDW MPP
OLTP, ERP,
CRM
Systems
Documents,
Emails
Web Logs,
Click
Streams
Social
Networks
Machine
Generated
Sensor
Data
Geolocation
Data
Governance
& Integration
Security
Operations
Data Access
Data Management
APPLICATIONS
Business
Analytics
Custom
Applications
Packaged
Applications
Clickstream
Capture and analyze
website visitors’ data
trails and optimize
your website
Sensors
Discover patterns in
data streaming
automatically from
remote sensors and
machines
Server Logs
Research logs to
diagnose process
failures and prevent
security breaches
Hadoop Value: New types of data
Sentiment
Understand how
your customers feel
about your brand
and products –
right now
Geographic
Analyze location-
based data to
manage operations
where they occur
Unstructured
Understand patterns
in files across
millions of web
pages, emails, and
documents
New Analytic Applications For New Types Of Data
• Supplier Consolidation
• Supply Chain and Logistics
• Assembly Line Quality Assurance
• Proactive Maintenance
• Crowdsourced Quality Assurance
• New Account Risk Screens
• Fraud Prevention
• Trading Risk
• Maximize Deposit Spread
• Insurance Underwriting
• Accelerate Loan Processing
• Call Detail Records (CDRs)
• Infrastructure Investment
• Next Product to Buy (NPTB)
• Real-time Bandwidth
Allocation
• New Product Development
• 360° View of the Customer
• Analyze Brand Sentiment
• Localized, Personalized
Promotions
• Website Optimization
• Optimal Store Layout
Financial
Services
Retail Telecom Manufacturing
Healthcare
Utilities,
Oil & Gas
Public
Sector
• Genomic data for medical trials
• Monitor patient vitals
• Reduce re-admittance rates
• Store medical research data
• Recruit cohorts for
pharmaceutical trials
• Smart meter stream
analysis
• Slow oil well decline curves
• Optimize lease bidding
• Compliance reporting
• Proactive equipment repair
• Seismic image processing
• Analyze public sentiment
• Protect critical networks
• Prevent fraud and waste
• Crowdsource reporting for
repairs to infrastructure
• Fulfill open records requests
… And Incrementally Delivers A ‘Data Lake’
Data Lake
• An architectural shift in the
data center that uses
Hadoop to deliver deeper
insight across a large,
broad, diverse set of data at
efficient scale
SCALE
SCOPE
A Modern Data Architecture/Data Lake
New Analytic Apps
New types of data
LOB-driven
RDBMS
MPP
EDW
Governance
& Integration
Security
Operations
Data Access
Data Management
The Modern Data Architecture
Oracle Data Integration
• Eliminates need for
separate ETL engine –
and associated H/W,
admin, overhead
• Non-invasive real-time
data staging into Hadoop
• Streamlines development
by providing capability to
separate Logical from
Physical mappings
• Reduces risk and
compliance exposure via
comprehensive data
governance
Oracle & Hortonworks
YARN Ready Partner
Certified on latest release of
Hortonworks Data Platform
Sandbox tutorial
Tutorial for
HWX Sandbox
Coming Soon!
ORCL Sandbox
Here Now!
Copyright © 2014 Oracle and/or its affiliates. All rights reserved. |
Oracle Data Integration & Governance
Dynamic Data Movement
– Low impact capture
– Continuous data staging
Data Transformation
– Bulk data movement
– Pushdown data processing
Data Federation
– Virtualized Data Services
Data Quality & Verification
– Fix quality at the source
– Verify data consistency
Metadata Management
– Lineage and Impact Analysis
– Business Glossary Semantics
Data Governance
Foundation
Oracle Data Integrator
(Transformation)
Enterprise Data Quality
(Profile, Cleanse, Match and De-duplicate)
Fast
Load
Oracle GoldenGate
(Movement)
Enterprise Metadata Management & Business Glossary
(Business Glossary, Data Lineage, Impact Analysis and Data Provenance)
Data Service Integrator
(Federation)
GoldenGate Veridata
(Online Data Verification)
ELT Processing
on Hadoop or SQL
Continuous Availability
Comprehensive capabilities for the end-to-end data integration
and governance of all data – including Hadoop based data
Leverage Wide Range of Modern Analytic Styles
How to Succeed With a Big Data Reservoir
Do:
– Directly link to a Line of Business
initiative
– Iterate on short cycles, plan for
small high-value deliverables along
the way
– Use tools, not only custom coded
programs
Do Not:
– Start with a techie-led research
project without a business objective
– Overpromise business results on
the market hype alone
– Assume MapReduce is the answer
to all your technical challenges
DBMS
(on prem or cloud)
Data First
Analytics
Model First
Analytics
Streaming
Analytics
Maximizing benefits:
1. Schema on Read
2. Cheaper Compute
3. Cheaper Storage
3 Core Patterns of Big Data Reservoir Success
DBMS
(on prem or cloud)
Sandbox
ETL Offload
Staging
Deep Data
Storage
Data Sandbox:
– Leader: Line of Business (LoB)
– Value: Faster access to business data, Faster
time to value on Analytics
– Innovation: Schema-on-read empowers
rapid staging and Data Discovery
ETL Offload:
– Leader: Information Technology (IT)
– Value: Cost avoidance on DW/Marts
– Innovation: YARN/Hadoop empowers lower
cost compute and lower cost storage
Deep Data Storage:
– Leader: Risk / Compliance (LoB)
– Core Value: High fidelity aged data
– Innovation: SQL on Hadoop engines enable
very low cost, queryable data access
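The schema-on-read idea behind the Data Sandbox pattern can be pictured with a minimal Python sketch. It is an illustration only, not Oracle or Hortonworks code: the sample records, the field names, and the `read_with_schema` helper are all hypothetical. Raw records are landed untouched, and each reader decides which fields to project and how to type them.

```python
import json

# Raw events are landed as-is; no schema is enforced at write time.
# (Hypothetical sample data for illustration.)
RAW_LINES = [
    '{"user": "a1", "page": "/home", "ms": "120"}',
    '{"user": "b2", "page": "/cart", "ms": "95", "referrer": "ad"}',
    '{"user": "c3", "page": "/home"}',
]

# The reader chooses the schema: field name -> (cast function, default value).
CLICK_SCHEMA = {"user": (str, None), "page": (str, None), "ms": (int, 0)}


def read_with_schema(lines, schema):
    """Apply a schema while reading: project, cast, and default each field."""
    for line in lines:
        record = json.loads(line)
        yield {
            name: cast(record[name]) if name in record else default
            for name, (cast, default) in schema.items()
        }


rows = list(read_with_schema(RAW_LINES, CLICK_SCHEMA))
total_ms = sum(row["ms"] for row in rows)
```

A second team could read the same raw lines with a different schema (for example, one that keeps `referrer`) without reloading or restructuring the stored data.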
Leverage Wide Range of Modern Analytic Styles
Data First
Analytics
Model First
Analytics
Streaming
Analytics
Oracle Approach to Big Data Integration is Superior
DBMS
(on prem or cloud)
Sandbox
ETL Offload
Staging
Deep Data
Storage
Data Governance
Foundation
Oracle Data Integrator
(Transformation)
Enterprise Data Quality
(Profile, Cleanse, Match and De-duplicate)
Oracle GoldenGate
(Movement)
Enterprise Metadata Management & Business Glossary
(Business Glossary, Data Lineage, Impact Analysis and Data Provenance)
GoldenGate Veridata
(Online Data Verification)
Oracle GoldenGate:
– Non-invasive data capture
– Low-latency data movement
– Full or partial records staging
– Most proven integration tool worldwide
Oracle Data Integrator:
– No ETL engine is required
– Logical design separate from physical
– Deploys in Hadoop or off cluster
– Many options for movement
Metadata & Glossary:
– Search Driven
– Business Friendly
– Huge 3rd Party Support
– Automated Metadata Stitching
Leverage Wide Range of Modern Analytic Styles
Data First
Analytics
Model First
Analytics
Streaming
Analytics
Oracle GoldenGate Capabilities for Big Data
HDFS (Files)
HBase (NoSQL)
Hive / Hive Streaming (SQL)
Flume & Storm (Streaming)
Kafka (MPP Pub/Sub)
Spark Streaming (Machine Learning)
Capture Database Transactions and
Deliver to Big Data in Real-Time
Capture
Trail
Route
Deliver
Pump
GoldenGate
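The capture, trail, pump, and deliver stages named on this slide can be pictured with a small in-memory sketch. This is a conceptual illustration, not GoldenGate's implementation or API: the `ChangeRecord` type, the list-based trails, and the dictionary target are assumptions made for the example, and the routing stage is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ChangeRecord:
    """One committed row change (hypothetical shape, for illustration)."""
    seq: int                    # position in the source commit order
    op: str                     # "INSERT", "UPDATE", or "DELETE"
    key: str                    # primary key of the affected row
    row: Optional[Dict] = None  # new row image (None for deletes)


def capture(tx_log: List[ChangeRecord], trail: List[ChangeRecord]) -> None:
    """Capture: append committed changes to a local trail in commit order."""
    trail.extend(sorted(tx_log, key=lambda change: change.seq))


def pump(local_trail: List[ChangeRecord],
         remote_trail: List[ChangeRecord],
         from_seq: int = 0) -> int:
    """Pump: ship records newer than from_seq and return the new checkpoint."""
    checkpoint = from_seq
    for change in local_trail:
        if change.seq > from_seq:
            remote_trail.append(change)
            checkpoint = max(checkpoint, change.seq)
    return checkpoint


def deliver(remote_trail: List[ChangeRecord], target: Dict[str, Dict]) -> None:
    """Deliver: apply each change to the target store in trail order."""
    for change in remote_trail:
        if change.op == "DELETE":
            target.pop(change.key, None)
        else:
            target[change.key] = change.row


# Changes arrive out of order here to show that capture restores commit order.
tx_log = [
    ChangeRecord(2, "UPDATE", "42", {"id": "42", "status": "shipped"}),
    ChangeRecord(1, "INSERT", "42", {"id": "42", "status": "new"}),
    ChangeRecord(3, "INSERT", "43", {"id": "43", "status": "new"}),
    ChangeRecord(4, "DELETE", "43"),
]
local_trail, remote_trail, target = [], [], {}
capture(tx_log, local_trail)
checkpoint = pump(local_trail, remote_trail)
deliver(remote_trail, target)
```

Because the pump returns a checkpoint, a later call with `from_seq=checkpoint` ships only new changes, which is the property that lets staging stay continuous rather than batch.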
Business Value of the GoldenGate Approach
Continuous Data Staging
– Don’t make the business wait
– CDC is the default, not an add-on
– Least invasive on sources
– Hadoop staging is fresh
Integrated, Native Capture
– Don’t create unnecessary risk
– Keep current with DB patches
– Certainty around licensing
– Proven best performance
Most Widely Proven
– Thousands of customers
– Most demanding high volume
– Used for High Availability (HA)
– Dependable results
vs.
Batch Data Movement
– Typical ETL vendors all default to batch data
movement in their reference architectures
– Changed Data is an immature add-on
– ETL loading into Hadoop is mainly “batch mode”
Clumsy & Risky Data Capture
– Not in sync with Oracle Database versions
– Some can “talk the talk” but their CDC tech can’t
touch Oracle GoldenGate scale/performance
– Patches and Licensing create business risk
Niche, Low-End
– Some vendors only cover a few platforms
– Some vendors are broad, but don’t scale
– Few vendors have the reliability and dependability
to cover HA use cases
vs.
vs.
…the “Other Vendors”
Oracle Data Integrator (ODI) Capabilities for Big Data
[Diagram: ODI coordinating Flume, SQOOP, OGG, Hive on MR, Tez, or Spark, Pig on MR, Tez, or Spark, Spark, and Oozie between logs, OLTP DB, API/File, and NoSQL sources, Hive/HCat, HDFS, and HBase stores, and any DW, with OEDQ for Data Validation & Cleansing and OEMM for Metadata Mgmt & Lineage]
Map once at the logical level, and then choose which Big Data or
Hadoop framework you want to run in!
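The "map once, choose the engine later" idea can be sketched in Python as a single logical mapping with two interchangeable physical executions. This is a hedged illustration, not ODI's design model or generated code: `LogicalMapping`, `run_on_sql`, and `run_in_python` are hypothetical names, and SQLite plus plain Python stand in for the SQL and non-SQL engines (neither is Hive, Pig, or Spark).

```python
import operator
import sqlite3
from dataclasses import dataclass
from typing import Tuple

# Comparison operators the logical model supports in this sketch.
OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


@dataclass(frozen=True)
class LogicalMapping:
    """Engine-neutral description: source, projected columns, one filter."""
    source: str
    columns: Tuple[str, ...]
    filter_column: str
    filter_op: str        # one of the keys in OPS
    filter_value: float


def run_on_sql(m, conn):
    """Physical design 1: render the mapping as SQL and push it to the engine.

    Table and column names are interpolated, so they must come from trusted
    metadata; only the filter value is passed as a bound parameter.
    """
    sql = (
        f"SELECT {', '.join(m.columns)} FROM {m.source} "
        f"WHERE {m.filter_column} {m.filter_op} ?"
    )
    return [tuple(row) for row in conn.execute(sql, (m.filter_value,))]


def run_in_python(m, tables):
    """Physical design 2: interpret the same mapping over in-memory rows."""
    keep = OPS[m.filter_op]
    return [
        tuple(row[col] for col in m.columns)
        for row in tables[m.source]
        if keep(row[m.filter_column], m.filter_value)
    ]


# Hypothetical sample data, loaded into both stand-in engines.
ORDERS = [
    {"id": 1, "amount": 50.0},
    {"id": 2, "amount": 250.0},
    {"id": 3, "amount": 175.0},
]
mapping = LogicalMapping("orders", ("id", "amount"), "amount", ">", 100.0)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (:id, :amount)", ORDERS)

sql_rows = sorted(run_on_sql(mapping, conn))
py_rows = sorted(run_in_python(mapping, {"orders": ORDERS}))
```

Both executions return the same rows from the same mapping object, which is the point of keeping the logical design separate from the physical one: the mapping does not change when the execution engine does.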
Business Value of ODI: Low Cost and High Dev Efficiency
No ETL engine is
required
Separation of
Logical and
Physical design
Physical exec on
SQL, Hive, Pig, or
Spark
Runtime exec in
Oozie or via ODI
Java Agent
Rich set of pre-
built operators
User defined
functions
Eliminate your ETL Engines and improve Developer efficiency –
now, everybody can be a Big Data developer!
Data Flow Approaches to Big Data Integration
[Diagram: four Hadoop cluster data flows, one per approach listed below, showing where ETL, Sqoop, Spark, Hive, Pig, Oozie, manual code, and ODI run relative to HDFS]
1. Traditional ETL Tools
(execute entirely outside of Hadoop)
2. ETL Tools with Native “on” Hadoop
(require proprietary code on Data Nodes)
3. Manual Coding
(ultimate flexibility, but at a very high cost)
4. ODI Native in Hadoop (BEST)
(no ETL Engine & no Data Node footprint*)
*A small ODI Agent may optionally be installed off cluster or on the Name Node, with no dependencies on Data Nodes
Oracle Metadata Management & Glossary for Big Data
Comprehensive Data Lineage
Business Friendly Navigation
Business & IT Collaboration
Easy to Use, Search Driven
Value of Metadata and Business Glossary
My dashboard does not match this report… why?
Where did this data come from?
Where can I find the data I need for analytics?
Which ETL mappings or BI Reports will be affected by my column change?
What systems does the data flow through?
TRUSTED DATA
IT CERTAINTY
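Questions such as "where did this data come from?" and "what will be affected by my column change?" reduce to traversals of a lineage graph. The sketch below shows that idea in Python under stated assumptions: the asset names and the `LINEAGE` edges are invented for illustration and do not reflect how Oracle Enterprise Metadata Management stores or queries metadata.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that read from it.
LINEAGE = {
    "crm.customers.email": ["etl.load_customers"],
    "etl.load_customers": ["dw.dim_customer.email"],
    "dw.dim_customer.email": ["bi.churn_report", "bi.marketing_dashboard"],
    "erp.orders.total": ["etl.load_orders"],
    "etl.load_orders": ["dw.fact_orders.total"],
    "dw.fact_orders.total": ["bi.marketing_dashboard"],
}


def downstream(graph, start):
    """Impact analysis: every asset reachable from start, breadth-first."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def invert(graph):
    """Reverse every edge so consumers point back at their producers."""
    inverted = {}
    for src, targets in graph.items():
        for tgt in targets:
            inverted.setdefault(tgt, []).append(src)
    return inverted


def upstream(graph, start):
    """Lineage: every asset that feeds start, directly or indirectly."""
    return downstream(invert(graph), start)


# "Which ETL mappings or BI reports will be affected by my column change?"
impacted = downstream(LINEAGE, "crm.customers.email")

# "Where did this data come from?"
sources = upstream(LINEAGE, "bi.marketing_dashboard")
```

The same graph answers both directions, so one harvested set of metadata can serve the business user asking about provenance and the developer assessing a change.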
Big Data Governance Lifecycle Tooling
Operational Data Flows
Business Sources
Quality KPIs
Case Management
Governance Cockpit for Data Stewards & Stakeholders
Exception Review
Metadata Management
Business Glossary
Design Time
Support People and Processes with an end-to-end tooling capability!
The Data Governance Opportunity with Big Data
Solving business and IT data challenges
…to manage Risk/Compliance
• Records retention
• Rediscovery
• Litigation support
• Data access management
• Information security and protection
Minimize corporate liability through proper governance of data
…to drive Business Value
• Metadata discovery
• Metadata & glossary cataloging
• Data profiling
• Data cleansing lifecycle
• Data remediation
Maximize opportunity by ensuring trusted data is easily available for data driven business processes
Most Heterogeneous, Deep 3rd Party Coverage
Operational Integration (Movement / Transformation)
• Hadoop HBase
• Hadoop Hive/Flume
• HP Enscribe
• HP NonStop
• HP Neoview
• Hypersonic SQL
• IBM DB2 i Series
• IBM DB2 UDB
• IBM DB2 z Series
• IBM Informix
• IBM Netezza
• JMS / MQ
• Microsoft Access
• Microsoft SQL Server
• MySQL
• Pivotal Greenplum
• PostgreSQL
• Salesforce.com
• SAP BW / BI
• SAP ERP / ECC
• SAS
• SQL/MP
• SQL/MX
• Sybase ASE
• Sybase IQ
• Teradata
• Oracle Database
• Oracle Exadata
• Oracle Big Data Appliance
• Oracle TimesTen
• Oracle OLAP
• Oracle Business Intelligence
• Oracle BI Applications
• Oracle E-Business Suite
• Oracle JD Edwards Enterprise One
• Oracle JD Edwards World
• Oracle Fusion Applications
• Oracle Governance Risk and Compliance
• Oracle Fusion AIA
• Oracle Retail Applications
• Oracle Agile BI / DW
• Oracle Agile PLM for Process
• Oracle iFlex FlexCUBE
• Oracle iFlex Mantas
• Oracle Hyperion Applications
• Oracle PeopleSoft
• Oracle Siebel CRM / OnDemand
• Oracle Communications
• Oracle WebLogic Server
• Oracle Coherence Data Grid
• Oracle SOA Suite
• Oracle Enterprise Service Bus
Metadata Harvesting (Glossary, Lineage & Impact Analysis)
• Adaptive
• Altova
• Apache HCatalog
• Apache Hive/HQL
• Borland
• CA ERwin
• Cloudera Impala
• COBOL Copybook
• DataStax
• Embarcadero
• EMC ProActivity
• GentleWare
• Google BigQuery
• Grandite
• Hadapt Hive
• Hortonworks Hive
• IBM Cognos
• IBM DB2
• IBM DataStage
• IBM Discovery
• IBM Federation Server
• IBM Lotus Notes
• IBM Netezza
• IBM Rational Rose
• IBM Rational Architect
• Informatica Metadata Mgr.
• Informatica PowerCenter
• CoSORT
• ISO SQL Standard (DDL)
• MapR Hadoop Hive
• MicroFocus
• Microsoft Access
• Microsoft Office Excel
• Microsoft Visio
• Microsoft SQL Server
• Microsoft SSIS
• Microsoft Visual Studio
• MicroStrategy
• Magic Draw
• OMG CWM Standard
• OMG UML Standard
• Oracle BI Answers
• Oracle BI Enterprise Edition
• Oracle BI Server
• Oracle DAC
• Oracle Data Integrator
• Oracle Data Modeler
• Oracle Database
• Oracle Designer
• Oracle Hyperion Applications
• Oracle Hyperion Essbase
• Oracle Warehouse Builder
• Pivotal Greenplum
• PostgreSQL
• QlikView
• SAP BO Crystal Reports
• SAP BO Designer
• SAP BO Desktop Intelligence
• SAP BO Repository
• SAP BO Data Integrator
• SAP BO Data Steward
• SAP Master Data Management
• SAP Sybase PowerDesigner
• SAP Sybase ASE Database
• SAS Data Integration Studio
• SAS BI Server
• SAS Information Map
• SAS Metadata Management
• SAS OLAP Server
• Select
• Sparx Architect
• Syncsort
• Tableau
• Talend
• Teradata
• Tigris
• Visible
• W3C DTD & XSD Schema
+ open APIs and standards
based meta-model
No other vendor can compare:
• 50+ systems for Operational Integration
• 70+ systems for Metadata Harvesting
Data Governance
Foundation
Differentiated Technical Approach from Oracle
Dynamic Data Movement
– Real-time by default, not ETL
– Least invasive on sources
– Proven best performance
– Native Oracle integration
No ETL Engines
– Take processing to the data;
don’t move the data
– Leverage the data engines for
workloads (Hadoop or SQL)
Most Heterogeneous
– Leverage open source Hadoop,
not proprietary distributions
– Hadoop is the Hub, not ETL tools
– Open metadata standards
Oracle Data Integrator
(Transformation)
Enterprise Data Quality
(Profile, Cleanse, Match and De-duplicate)
Fast
Load
Oracle GoldenGate
(Movement)
Enterprise Metadata Management & Business Glossary
(Business Glossary, Data Lineage, Impact Analysis and Data Provenance)
Data Service Integrator
(Federation)
GoldenGate Veridata
(Online Data Verification)
ELT Processing
on Hadoop or SQL
Continuous Availability
Comprehensive capabilities for the end-to-end data integration
and governance of all data – including Hadoop based data
Question & Answer session will be conducted electronically,
using the panel to the right of your screen
About Oracle and Hortonworks
hortonworks.com/partner/oracle/
Get started with Hortonworks Sandbox
hortonworks.com/sandbox
Follow us:
@hortonworks @Oracle
Learn more
Oracle.com/goto/dataintegration

Weitere ähnliche Inhalte

Was ist angesagt?

Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success DataWorks Summit/Hadoop Summit
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks
 
Oracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analyticsOracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analyticsjdijcks
 
Hortonworks and Clarity Solution Group
Hortonworks and Clarity Solution Group Hortonworks and Clarity Solution Group
Hortonworks and Clarity Solution Group Hortonworks
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Rittman Analytics
 
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...DataWorks Summit/Hadoop Summit
 
Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Jeffrey T. Pollock
 
Hadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHortonworks
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsHortonworks
 
Spark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark Summit
 
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Hortonworks
 
Oracle Data Integration CON9737 at OpenWorld
Oracle Data Integration CON9737 at OpenWorldOracle Data Integration CON9737 at OpenWorld
Oracle Data Integration CON9737 at OpenWorldJeffrey T. Pollock
 
Oracle Data Integration - Overview
Oracle Data Integration - OverviewOracle Data Integration - Overview
Oracle Data Integration - OverviewJeffrey T. Pollock
 
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDriving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDataWorks Summit
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalHortonworks
 
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...Hortonworks
 
Big Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseBig Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseJeffrey T. Pollock
 
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)Jeffrey T. Pollock
 

Was ist angesagt? (20)

Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success Swimming Across the Data Lake, Lessons learned and keys to success
Swimming Across the Data Lake, Lessons learned and keys to success
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
 
Oracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analyticsOracle Big Data Appliance and Big Data SQL for advanced analytics
Oracle Big Data Appliance and Big Data SQL for advanced analytics
 
Oracle's BigData solutions
Oracle's BigData solutionsOracle's BigData solutions
Oracle's BigData solutions
 
Hortonworks and Clarity Solution Group
Hortonworks and Clarity Solution Group Hortonworks and Clarity Solution Group
Hortonworks and Clarity Solution Group
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
 
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...
Top Three Big Data Governance Issues and How Apache ATLAS resolves it for the...
 
Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)Tapping into the Big Data Reservoir (CON7934)
Tapping into the Big Data Reservoir (CON7934)
 
Hadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data ProcessingHadoop 2.0: YARN to Further Optimize Data Processing
Hadoop 2.0: YARN to Further Optimize Data Processing
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptx
 
The Next Generation of Big Data Analytics
The Next Generation of Big Data AnalyticsThe Next Generation of Big Data Analytics
The Next Generation of Big Data Analytics
 
Spark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun MurthySpark and Hadoop Perfect Togeher by Arun Murthy
Spark and Hadoop Perfect Togeher by Arun Murthy
 
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
Accelerating the Value of Big Data Analytics for P&C Insurers with Hortonwork...
 
Oracle Data Integration CON9737 at OpenWorld
Oracle Data Integration CON9737 at OpenWorldOracle Data Integration CON9737 at OpenWorld
Oracle Data Integration CON9737 at OpenWorld
 
Oracle Data Integration - Overview
Oracle Data Integration - OverviewOracle Data Integration - Overview
Oracle Data Integration - Overview
 
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDriving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
 
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
The Modern Data Architecture for Advanced Business Intelligence with Hortonwo...
 
Big Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseBig Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San Jose
 
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
Unlocking Big Data Silos in the Enterprise or the Cloud (Con7877)
 

Ähnlich wie Hortonworks Oracle Big Data Integration

Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Innovative Management Services
 
Building a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopBuilding a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopSlim Baltagi
 
Apache Hadoop on the Open Cloud
Apache Hadoop on the Open CloudApache Hadoop on the Open Cloud
Apache Hadoop on the Open CloudHortonworks
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...Hortonworks
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataHortonworks
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Hortonworks
 
The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata The Value of the Modern Data Architecture with Apache Hadoop and Teradata
The Value of the Modern Data Architecture with Apache Hadoop and Teradata Hortonworks
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to HadoopPOSSCON
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Eliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopEliminating the Challenges of Big Data Management Inside Hadoop
Eliminating the Challenges of Big Data Management Inside HadoopHortonworks
 
Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Non-Stop Hadoop for Hortonworks
Non-Stop Hadoop for Hortonworks Hortonworks
 
Oracle Unified Information Architeture + Analytics by Example
Oracle Unified Information Architeture + Analytics by ExampleOracle Unified Information Architeture + Analytics by Example
Oracle Unified Information Architeture + Analytics by ExampleHarald Erb
 
Apache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu BariApache Hadoop and its role in Big Data architecture - Himanshu Bari
Apache Hadoop and its role in Big Data architecture - Himanshu Barijaxconf
 
Modern Data Architecture: In-Memory with Hadoop - the new BI
Modern Data Architecture: In-Memory with Hadoop - the new BIModern Data Architecture: In-Memory with Hadoop - the new BI
Modern Data Architecture: In-Memory with Hadoop - the new BIKognitio
 
Hortonworks kognitio webinar 10 dec 2013
Hortonworks kognitio webinar 10 dec 2013Hortonworks kognitio webinar 10 dec 2013
Hortonworks kognitio webinar 10 dec 2013Michael Hiskey
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Hortonworks
 
Tame Big Data with Oracle Data Integration
Tame Big Data with Oracle Data IntegrationTame Big Data with Oracle Data Integration
Tame Big Data with Oracle Data IntegrationMichael Rainey
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Cécile Poyet
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Hortonworks
 

Hortonworks Oracle Big Data Integration

  • 1. Page1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved We do Hadoop together.
  • 2. Modern Data Architecture for Data Transformation and Acquisition with Oracle® and Apache™ Hadoop®
  • 3. Quick Housekeeping
      • Q&A box is available for your questions
      • Webinar will be recorded for future viewing
      • Thank you for joining!
  • 4. Your Presenters
      • Jeff Pollock
        – Vice President, Product Management, Oracle
        – Previously responsible for IBM InfoSphere Information Integration & Governance products
        – Author of “Semantic Web for Dummies” and “Adaptive Information”
      • Tim Hall
        – Vice President, Product Management, Hortonworks
        – Previously responsible for Oracle’s outbound product management covering the Business Process Management Suite and SOA Suite
  • 5. Today’s Topics
      • Drivers for the Modern Data Architecture
      • New Analytic Applications for New Types of Data
      • Hadoop as the solution for Data Lake
      • Hortonworks and Oracle Data Integration teaming up
      • Oracle patterns for successful Data Reservoirs
      • Oracle Data Integration Strengths in Hadoop
      • Oracle Data Governance for Hadoop
      • Q&A
  • 6. Poll: Where are you in your Hadoop journey? 1. Researching our options 2. Currently evaluating some software 3. Deep in a trial 4. In production with a Hadoop cluster 5. What’s Hadoop?
  • 7. A Data Architecture Under Pressure From New Data
      [Diagram] Applications: Business Analytics, Custom Applications, Packaged Applications. Data system repositories: RDBMS, EDW, MPP. Existing sources: CRM, ERP, Clickstream, Logs.
      Sources: OLTP, ERP, CRM systems; unstructured documents, emails; clickstream; server logs; sentiment, web data; sensor, machine data; geolocation.
      Source: IDC. 2.8 ZB in 2012; 85% from new data types; 15x machine data by 2020; 40 ZB by 2020.
  • 8. Hadoop Within An Emerging Modern Data Architecture
      [Diagram] Applications: Business Analytics, Custom Applications, Packaged Applications. Data system: Governance & Integration, Security, Operations, Data Access, Data Management. Repositories: RDBMS, EDW, MPP.
      Sources: OLTP, ERP, CRM systems; documents, emails; web logs, click streams; social networks; machine generated; sensor data; geolocation data.
      Operations tools: provision, manage and monitor. Dev and data tools: build and test.
  • 9. Hadoop Value: New types of data
      • Clickstream: Capture and analyze website visitors’ data trails and optimize your website
      • Sensors: Discover patterns in data streaming automatically from remote sensors and machines
      • Server Logs: Research logs to diagnose process failures and prevent security breaches
      • Sentiment: Understand how your customers feel about your brand and products, right now
      • Geographic: Analyze location-based data to manage operations where they occur
      • Unstructured: Understand patterns in files across millions of web pages, emails, and documents
  • 10. New Analytic Applications For New Types Of Data
      • Financial Services: New Account Risk Screens; Fraud Prevention; Trading Risk; Maximize Deposit Spread; Insurance Underwriting; Accelerate Loan Processing
      • Telecom: Call Detail Records (CDRs); Infrastructure Investment; Next Product to Buy (NPTB); Real-time Bandwidth Allocation; New Product Development
      • Retail: 360° View of the Customer; Analyze Brand Sentiment; Localized, Personalized Promotions; Website Optimization; Optimal Store Layout
      • Manufacturing: Supplier Consolidation; Supply Chain and Logistics; Assembly Line Quality Assurance; Proactive Maintenance; Crowdsourced Quality Assurance
      • Healthcare: Genomic data for medical trials; Monitor patient vitals; Reduce re-admittance rates; Store medical research data; Recruit cohorts for pharmaceutical trials
      • Utilities, Oil & Gas: Smart meter stream analysis; Slow oil well decline curves; Optimize lease bidding; Compliance reporting; Proactive equipment repair; Seismic image processing
      • Public Sector: Analyze public sentiment; Protect critical networks; Prevent fraud and waste; Crowdsource reporting for repairs to infrastructure; Fulfill open records requests
  • 11. … And Incrementally Delivers A ‘Data Lake’
      • Data Lake: An architectural shift in the data center that uses Hadoop to deliver deeper insight across a large, broad, diverse set of data at efficient scale
      [Diagram] Axes: Scale, Scope. A Modern Data Architecture/Data Lake: New Analytic Apps, New types of data, LOB-driven. Repositories: RDBMS, MPP, EDW. Data system: Governance & Integration, Security, Operations, Data Access, Data Management.
  • 12. The Modern Data Architecture: Oracle Data Integration
      • Eliminates the need for a separate ETL engine, and the associated H/W, admin, and overhead
      • Non-invasive real-time data staging into Hadoop
      • Streamlines development by separating Logical from Physical mappings
      • Reduces risk and compliance exposure via comprehensive data governance
  • 13. Oracle & Hortonworks
      • YARN Ready Partner
      • Certified on latest release of Hortonworks Data Platform
      • Sandbox tutorial: Tutorial for HWX Sandbox Coming Soon! ORCL Sandbox Here Now!
  • 14. Copyright © 2014 Oracle and/or its affiliates. All rights reserved. | Oracle Data Integration & Governance
      Comprehensive capabilities for the end-to-end data integration and governance of all data, including Hadoop-based data
      • Dynamic Data Movement – Low impact capture – Continuous data staging
      • Data Transformation – Bulk data movement – Pushdown data processing
      • Data Federation – Virtualized Data Services
      • Data Quality & Verification – Fix quality at the source – Verify data consistency
      • Metadata Management – Lineage and Impact Analysis – Business Glossary Semantics
      [Diagram] Data Governance Foundation: Oracle Data Integrator (Transformation); Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate); Oracle GoldenGate (Movement); Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance); Data Service Integrator (Federation); GoldenGate Veridata (Online Data Verification). Labels: Fast Load; ELT Processing on Hadoop or SQL; Continuous Availability.
  • 15. How to Succeed With a Big Data Reservoir
      • Do:
        – Directly link to a Line of Business initiative
        – Iterate on short cycles, plan for small high-value deliverables along the way
        – Use tools, not only custom coded programs
      • Do Not:
        – Start with a techie-led research project without a business objective
        – Overpromise business results on the market hype alone
        – Assume MapReduce is the answer to all your technical challenges
      [Diagram] Leverage Wide Range of Modern Analytic Styles: Data First Analytics, Model First Analytics, Streaming Analytics. DBMS (on prem or cloud).
  • 16. 3 Core Patterns of Big Data Reservoir Success
      – Maximizing benefits: 1. Schema on Read; 2. Cheaper Compute; 3. Cheaper Storage
      – Data Sandbox:
          – Leader: Line of Business (LoB)
          – Value: Faster access to business data, Faster time to value on Analytics
          – Innovation: Schema-on-read empowers rapid staging and Data Discovery
      – ETL Offload:
          – Leader: Information Technology (IT)
          – Value: Cost avoidance on DW/Marts
          – Innovation: YARN/Hadoop empowers lower cost compute and lower cost storage
      – Deep Data Storage:
          – Leader: Risk / Compliance (LoB)
          – Core Value: High fidelity aged data
          – Innovation: SQL on Hadoop engines enable very low cost, queryable data access
      – Diagram labels: DBMS (on prem or cloud), Sandbox, ETL Offload, Staging, Deep Data Storage; Leverage Wide Range of Modern Analytic Styles (Data First Analytics, Model First Analytics, Streaming Analytics)
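Schema-on-read, the first benefit listed on this slide, can be illustrated with a minimal Python sketch. It is not taken from the deck: the event data, the `ORDER_SCHEMA` dictionary, and the `read_with_schema` helper are all hypothetical names chosen for illustration. The point is that raw records are stored untouched and a schema is applied only when a reader asks for the data.

```python
import json

# Raw events are kept exactly as they arrived; nothing is enforced at write time.
# (Hypothetical sample data, not from the presentation.)
RAW_EVENTS = [
    '{"user": "a1", "amount": "19.99", "ts": "2014-06-01"}',
    '{"user": "b2", "amount": "5", "channel": "web"}',
    'not valid json',
]

# A schema chosen by the reader: field name -> (converter, default value).
ORDER_SCHEMA = {
    "user": (str, None),
    "amount": (float, 0.0),
    "channel": (str, "unknown"),
}


def read_with_schema(lines, schema):
    """Apply a schema while reading; skip lines that cannot be parsed."""
    rows = []
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed input is tolerated at read time
        if not isinstance(record, dict):
            continue  # only JSON objects can be projected onto the schema
        row = {}
        for field, (convert, default) in schema.items():
            value = record.get(field)
            row[field] = convert(value) if value is not None else default
        rows.append(row)
    return rows


orders = read_with_schema(RAW_EVENTS, ORDER_SCHEMA)
```

A different reader could apply a different schema to the same raw lines (for example, one that keeps `ts`) without reloading anything, which is the rapid-staging property the slide attributes to the Data Sandbox pattern.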
  • 17. Oracle Approach to Big Data Integration is Superior
      – Oracle GoldenGate:
          – Non-invasive data capture
          – Low-latency data movement
          – Full or partial records staging
          – Most proven integration tool worldwide
      – Oracle Data Integrator:
          – No ETL engine is required
          – Logical design separate from physical
          – Deploys in Hadoop or off cluster
          – Many options for movement
      – Metadata & Glossary:
          – Search Driven
          – Business Friendly
          – Huge 3rd Party Support
          – Automated Metadata Stitching
      – Data Governance Foundation: Oracle Data Integrator (Transformation); Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate); Oracle GoldenGate (Movement); Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance); GoldenGate Veridata (Online Data Verification)
      – Diagram labels: DBMS (on prem or cloud), Sandbox, ETL Offload, Staging, Deep Data Storage; Leverage Wide Range of Modern Analytic Styles (Data First Analytics, Model First Analytics, Streaming Analytics)
  • 18. Oracle GoldenGate Capabilities for Big Data
      – Capture Database Transactions and Deliver to Big Data in Real-Time
      – Targets:
          – HDFS (Files)
          – HBase (NoSQL)
          – Hive / Hive Streaming (SQL)
          – Flume & Storm (Streaming)
          – Kafka (MPP Pub/Sub)
          – Spark Streaming (Machine Learning)
      – GoldenGate diagram labels: Capture, Trail, Route, Deliver, Pump
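The capture-and-deliver flow named on this slide can be sketched as a toy change-data-capture pipeline in Python. This is a conceptual illustration only and does not use or imitate the GoldenGate API: `Change`, `capture`, `write_trail`, and `deliver` are hypothetical names, the "trail" is an in-memory list, and the pump and route stages are omitted.

```python
from collections import namedtuple

# One captured change: operation type, primary key, and the new row image
# (None for deletes). Hypothetical structure, not a GoldenGate record format.
Change = namedtuple("Change", ["op", "key", "row"])


def capture(transaction_log):
    """Capture: turn committed log entries into change records, in commit order."""
    for op, key, row in transaction_log:
        yield Change(op, key, row)


def write_trail(changes):
    """Trail: persist the ordered changes (here, simply an in-memory list)."""
    return list(changes)


def deliver(trail, target):
    """Deliver: replay each change against the target store."""
    for change in trail:
        if change.op == "delete":
            target.pop(change.key, None)
        else:  # insert or update both write the latest row image
            target[change.key] = change.row
    return target


log = [
    ("insert", 1, {"name": "Ada", "city": "London"}),
    ("insert", 2, {"name": "Bob", "city": "Paris"}),
    ("update", 1, {"name": "Ada", "city": "Turin"}),
    ("delete", 2, None),
]
staged = deliver(write_trail(capture(log)), {})
```

Because only the changes are read from the log, the source tables are never scanned, which is the sense in which log-based capture is described as low impact.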
  • 19. Business Value of the GoldenGate Approach
      – Continuous Data Staging
          – Don’t make the business wait
          – CDC is the default, not an add-on
          – Least invasive on sources
          – Hadoop staging is fresh
      – Integrated, Native Capture
          – Don’t create unnecessary risk
          – Keep current with DB patches
          – Certainty around licensing
          – Proven best performance
      – Most Widely Proven
          – 1000’s of customers
          – Most demanding high volume
          – Used for High Availability (HA)
          – Dependable results
      – vs. the “Other Vendors”:
          – Batch Data Movement
              – Typical ETL vendors all default to batch data movement in their reference architectures
              – Changed Data is an immature add-on
              – ETL loading into Hadoop is mainly “batch mode”
          – Clumsy & Risky Data Capture
              – Not in sync with Oracle Database versions
              – Some can “talk the talk” but their CDC tech can’t touch Oracle GoldenGate scale/performance
              – Patches and Licensing create business risk
          – Niche, Low-End
              – Some vendors only cover a few platforms
              – Some vendors are broad, but don’t scale
              – Few vendors have the reliability and dependability to cover HA use cases
  • 20. Oracle Data Integrator (ODI) Capabilities for Big Data
      – Map once at the logical level, and then choose which Big Data or Hadoop framework you want to run in!
      – Diagram labels: Logs, OLTP DB, API/File, NoSQL, Any DW; Flume, SQOOP, OGG; Hive/HCat, HDFS, HBase; Hive on MR, Tez, Spark; Pig on MR, Tez, Spark; Spark; ODI; Oozie; OEDQ (Data Validation & Cleansing); OEMM (Metadata Mgmt & Lineage)
  • 21. Business Value of ODI: Low Cost and High Dev Efficiency
      – No ETL engine is required
      – Separation of Logical and Physical design
      – Physical exec on SQL, Hive, Pig, or Spark
      – Runtime exec in Oozie or via ODI Java Agent
      – Rich set of pre-built operators
      – User defined functions
      – Eliminate your ETL Engines and improve Developer efficiency. Now, everybody can be a Big Data developer!
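The separation of logical and physical design can be sketched in Python as one mapping definition executed by two interchangeable engines. This is an assumption-laden illustration, not ODI's mapping model or generated code: `MAPPING`, `run_in_sql`, and `run_in_python` are hypothetical, SQLite stands in for a SQL or Hive engine, and a plain loop stands in for any other runtime.

```python
import operator
import sqlite3

# Logical mapping: what to compute, with no reference to an execution engine.
# (Hypothetical structure for illustration only.)
MAPPING = {
    "source": "orders",
    "columns": ["id", "amount"],
    "filter": ("amount", ">", 10),
}

ROWS = [
    {"id": 1, "amount": 5},
    {"id": 2, "amount": 25},
    {"id": 3, "amount": 40},
]

OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


def run_in_sql(mapping, rows):
    """Physical option 1: generate SQL and push the work down to a SQL engine."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE {} (id INTEGER, amount INTEGER)".format(mapping["source"]))
    conn.executemany(
        "INSERT INTO {} VALUES (:id, :amount)".format(mapping["source"]), rows
    )
    column, op, value = mapping["filter"]
    sql = "SELECT {} FROM {} WHERE {} {} ?".format(
        ", ".join(mapping["columns"]), mapping["source"], column, op
    )
    result = conn.execute(sql, (value,)).fetchall()
    conn.close()
    return result


def run_in_python(mapping, rows):
    """Physical option 2: evaluate the same mapping in-process."""
    column, op, value = mapping["filter"]
    keep = OPS[op]
    return [
        tuple(row[c] for c in mapping["columns"])
        for row in rows
        if keep(row[column], value)
    ]


sql_result = sorted(run_in_sql(MAPPING, ROWS))
python_result = sorted(run_in_python(MAPPING, ROWS))
```

Both engines return the same rows from the same mapping, so the choice of runtime becomes a deployment decision rather than a redesign.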
  • 22. Data Flow Approaches to Big Data Integration
      – 1. Traditional ETL Tools (execute entirely outside of Hadoop)
      – 2. ETL Tools with Native “on” Hadoop (require proprietary code on Data Nodes)
      – 3. Manual Coding (ultimate flexibility, but at a very high cost)
      – 4. ODI Native in Hadoop (no ETL Engine & no Data Node footprint)
      – * small ODI Agent may optionally install off cluster or on Name Node, no dependencies on Data Nodes
      – Diagram labels: Hadoop Cluster, HDFS, ETL, Manual Code, Spark, Sqoop, Hive, Pig, Oozie, ODI, GG, BEST
  • 23. Oracle Metadata Management & Glossary for Big Data
      – Comprehensive Data Lineage
      – Business Friendly Navigation
      – Business & IT Collaboration
      – Easy to Use, Search Driven
  • 24. Value of Metadata and Business Glossary
      – My dashboard does not match this report…why?
      – Where did this data come from?
      – Where can I find the data I need for analytics?
      – Which ETL mappings or BI Reports will be affected by my column change?
      – What systems does the data flow through?
      – Diagram labels: TRUSTED DATA, IT CERTAINTY
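Two of the questions on this slide, "Where did this data come from?" and "Which ETL mappings or BI Reports will be affected by my column change?", are graph traversals over lineage metadata. The Python sketch below shows the idea under stated assumptions: the `LINEAGE` edges and asset names are invented for illustration and do not reflect any Oracle metadata model.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that read from it.
LINEAGE = {
    "crm.customers.email": ["etl.load_customers"],
    "etl.load_customers": ["dw.dim_customer.email"],
    "dw.dim_customer.email": ["bi.churn_report", "bi.marketing_dashboard"],
    "erp.orders.total": ["etl.load_orders"],
    "etl.load_orders": ["dw.fact_orders.total"],
    "dw.fact_orders.total": ["bi.marketing_dashboard"],
}


def downstream(asset, lineage):
    """Impact analysis: every asset reachable from `asset`, breadth-first."""
    seen = {asset}
    queue = deque([asset])
    order = []
    while queue:
        for child in lineage.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


def upstream(asset, lineage):
    """Data lineage: reverse the edges, then reuse the same traversal."""
    reverse = {}
    for parent, children in lineage.items():
        for child in children:
            reverse.setdefault(child, []).append(parent)
    return downstream(asset, reverse)


impacted = downstream("crm.customers.email", LINEAGE)
origins = upstream("bi.churn_report", LINEAGE)
```

Here `impacted` lists the mapping, warehouse column, and reports touched by a change to the CRM email column, while `origins` walks back from a report to its sources.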
  • 25. Big Data Governance Lifecycle Tooling
      – Support People and Processes with an end-to-end tooling capability!
      – Diagram labels: Operational Data Flows; Business Sources; Quality KPIs; Case Management; Governance Cockpit for Data Stewards & Stakeholders; Exception Review; Metadata Management; Business Glossary; Design Time
  • 26. The Data Governance Opportunity with Big Data: Solving business and IT data challenges
      – …to manage Risk/Compliance
          – Records retention
          – Rediscovery
          – Litigation support
          – Data access management
          – Information security and protection
          – Minimize corporate liability through proper governance of data
      – …to drive Business Value
          – Metadata discovery
          – Metadata & glossary cataloging
          – Data profiling
          – Data cleansing lifecycle
          – Data remediation
          – Maximize opportunity by ensuring trusted data is easily available for data driven business processes
  • 27. Most Heterogeneous, Deep 3rd Party Coverage
      – Operational Integration (Movement / Transformation): Hadoop HBase, Hadoop Hive/Flume, HP Enscribe, HP NonStop, HP Neoview, Hypersonic SQL, IBM DB2 i Series, IBM DB2 UDB, IBM DB2 z Series, IBM Informix, IBM Netezza, JMS / MQ, Microsoft Access, Microsoft SQL Server, MySQL, Pivotal Greenplum, PostgreSQL, Salesforce.com, SAP BW / BI, SAP ERP / ECC, SAS, SQL/MP, SQL/MX, Sybase ASE, Sybase IQ, Teradata
      – Metadata Harvesting (Glossary, Lineage & Impact Analysis): Adaptive, Altova, Apache HCatalog, Apache Hive/HQL, Borland, CA ERwin, Cloudera Impala, COBOL Copybook, DataStax, Embarcadero, EMC ProActivity, GentleWare, Google BigQuery, Grandite, Hadapt Hive, Hortonworks Hive, IBM Cognos, IBM DB2, IBM DataStage, IBM Discovery, IBM Federation Server, IBM Lotus Notes, IBM Netezza, IBM Rational Rose, IBM Rational Architect, Informatica Metadata Mgr., Informatica PowerCenter, CoSORT, ISO SQL Standard (DDL), MapR Hadoop Hive, MicroFocus, Microsoft Access, Microsoft Office Excel, Microsoft Visio, Microsoft SQL Server, Microsoft SSIS, Microsoft Visual Studio, Microstrategy, Magic Draw, OMG CWM Standard, OMG UML Standard, Oracle BI Answers, Oracle BI Enterprise Edition, Oracle BI Server, Oracle DAC, Oracle Data Integrator, Oracle Data Modeler, Oracle Database, Oracle Designer, Oracle Hyperion Applications, Oracle Hyperion Essbase, Oracle Warehouse Builder, Pivotal Greenplum, PostgreSQL, QlikView, SAP BO Crystal Reports, SAP BO Designer, SAP BO Desktop Intelligence, SAP BO Repository, SAP BO Data Integrator, SAP BO Data Steward, SAP Master Data Management, SAP Sybase PowerDesigner, SAP Sybase ASE Database, SAS Data Integration Studio, SAS BI Server, SAS Information Map, SAS Metadata Management, SAS OLAP Server, Select, Sparx Architect, Syncsort, Tableau, Talend, Teradata, Tigris, Visible, W3C DTD & XSD Schema
      – Oracle systems: Oracle Database, Oracle Exadata, Oracle Big Data Appliance, Oracle TimesTen, Oracle OLAP, Oracle Business Intelligence, Oracle BI Applications, Oracle E-Business Suite, Oracle JD Edwards Enterprise One, Oracle JD Edwards World, Oracle Fusion Applications, Oracle Governance Risk and Compliance, Oracle Fusion AIA, Oracle Retail Applications, Oracle Agile BI / DW, Oracle Agile PLM for Process, Oracle iFlex FlexCUBE, Oracle iFlex Mantas, Oracle Hyperion Applications, Oracle PeopleSoft, Oracle Siebel CRM / OnDemand, Oracle Communications, Oracle WebLogic Server, Oracle Coherence Data Grid, Oracle SOA Suite, Oracle Enterprise Service Bus
      – + open APIs and standards based meta-model
      – No other vendor can compare:
          – 50+ systems for Operational Integration
          – 70+ systems for Metadata Harvesting
  • 28. Differentiated Technical Approach from Oracle
      – Dynamic Data Movement
          – Real-time by default, not ETL
          – Least invasive on sources
          – Proven best performance
          – Native Oracle integration
      – No ETL Engines
          – Take processing to the data; don’t move the data
          – Leverage the data engines for workloads (Hadoop or SQL)
      – Most Heterogeneous
          – Leverage open source Hadoop, not proprietary distributions
          – Hadoop is the Hub, not ETL tools
          – Open metadata standards
      – Data Governance Foundation: Oracle Data Integrator (Transformation); Enterprise Data Quality (Profile, Cleanse, Match and De-duplicate); Oracle GoldenGate (Movement); Enterprise Metadata Management & Business Glossary (Business Glossary, Data Lineage, Impact Analysis and Data Provenance); Data Service Integrator (Federation); GoldenGate Veridata (Online Data Verification)
      – Diagram labels: Fast Load; ELT Processing on Hadoop or SQL; Continuous Availability
      – Comprehensive capabilities for the end-to-end data integration and governance of all data, including Hadoop based data
  • 29. Question & Answer session will be conducted electronically, using the panel to the right of your screen
      – About Oracle and Hortonworks: hortonworks.com/partner/oracle/
      – Get started with Hortonworks Sandbox: hortonworks.com/sandbox
      – Follow us: @hortonworks @Oracle
      – Learn more: Oracle.com/goto/dataintegration