T : +1 (888) 631-1410 E : inquiries@rittmanmead.com W: www.rittmanmead.com
ODI11g, Hadoop and “Big Data”
Mark Rittman, Technical Director, Rittman Mead
Rittman Mead BI Forum 2013, Brighton & Atlanta
T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com
Wednesday, 8 May 13
Big Data, Hadoop and Unstructured Data Sources
•“Big data” is the hot topic in BI, DW and Analytics circles
•The ability to analyze vast datasets at a highly granular level, by harnessing massively-parallel computing
•Crunching loosely-structured and modelled datasets using simple algorithms: Map (project) + Reduce (agg)
•Largely based around open-source projects, non-relational technologies
‣ Apache Hadoop
‣ MapReduce
‣ Hadoop Distributed File System
‣ Apache Hive, Sqoop, HBase etc
•Emerging commercial vendors
‣ Cloudera
‣ Hortonworks etc
•Can be used standalone, or linked to an
enterprise DW/BI architecture
Oracle’s Strategy for Business Analytics
•Connect to all of your data, from all your sources,
•Subject it to the full range of possible inquiry
•Package solutions for known problems and fixed sources, and
•Deploy to PCs and mobile devices, on premise or in the cloud
[Diagram labels: On Premise, On Cloud, On Mobile; Any Data, Any Source; Full Range of Analytics; Integrated Analytic Apps]
Connect to All of Your Data, From All of Your Sources
•As well as traditional application, database and file sources, unstructured sources
and “big data” sources are within scope for business decision-making
‣ Data of great volume, great velocity and great variety
[Diagram labels: Any Data, Any Source; Your Data: decisions based on your data; Big Data: decisions based on all data relevant to you; data types shown: Transactions, Documents & Social Data, Machine-Generated Data]
Oracle’s Big Data Products
•Oracle Big Data Appliance - Engineered System for Big Data Acquisition and Processing
‣Cloudera Distribution of Hadoop
‣Cloudera Manager
‣Open-source R
‣Oracle NoSQL Database Community Edition
‣Oracle Enterprise Linux + Oracle JVM
•Oracle Big Data Connectors
‣Oracle Loader for Hadoop (Hadoop > Oracle RDBMS)
‣Oracle Direct Connector for HDFS (HDFS > Oracle RDBMS)
‣Oracle Data Integration Adapter for Hadoop
‣Oracle R Connector for Hadoop
•Oracle NoSQL Database (column/key-store DB based on BerkeleyDB)
ODI as Part of Oracle’s Big Data Strategy
•ODI is the data integration tool for extracting data from Hadoop/MapReduce, and loading
into Oracle Big Data Appliance, Oracle Exadata and Oracle Exalytics
•Oracle Application Adapter for Hadoop provides required data adapters
‣ Load data into Hadoop from local filesystem,
or HDFS (Hadoop clustered FS)
‣ Read data from Hadoop/MapReduce using
Apache Hive (JDBC) and HiveQL, load
into Oracle RDBMS using
Oracle Loader for Hadoop
•Supported by Oracle’s Engineered Systems
‣ Exadata
‣ Exalytics
‣ Big Data Appliance (w/Cloudera Hadoop Distrib)
How ODI Accesses Hadoop and MapReduce
•ODI accesses data in Hadoop clusters through Apache Hive
‣ Metadata and query layer over MapReduce
‣ Provides SQL-like language (HiveQL) and a
metadata store (data dictionary)
‣ Provides a means to define “tables”, into which file
data is loaded, and then queried via MapReduce
‣ Accessed via Hive JDBC driver
(separate Hadoop install required
on ODI server, for client libs)
•Additional access through
Oracle Direct Connector for HDFS
and Oracle Loader for Hadoop
[Diagram: ODI 11g issues HiveQL to the Hive Server on the Hadoop Cluster, which runs MapReduce; Oracle RDBMS receives direct-path loads using Oracle Loader for Hadoop, with transformation logic in MapReduce]
Oracle Business Analytics and Big Data Sources
• OBIEE 11g, and other Oracle Business Analytics tools, can also make use of big data sources
‣ Oracle Exalytics, through in-memory aggregates and InfiniBand connection to Exadata, can analyze vast (structured)
datasets held in relational and OLAP databases
‣ Endeca Information Discovery can analyze unstructured and semi-structured sources
‣ InfiniBand connector to Big Data Appliance + Hadoop connector in OBIEE supports analysis via Map/Reduce
‣ Oracle R distribution + Oracle R Enterprise supports SAS-style statistical analysis
of large data sets, as part of
Oracle Advanced Analytics Option
‣ OBIEE can access Hadoop
datasource through another
Apache technology called Hive
OBIEE Access to Hadoop/Hive for BI Administration Tool RPD Creation
•HiveODBC driver has to be installed into Windows environment, so that
BI Administration tool can connect to Hive and return table metadata
•Import as ODBC datasource, change physical DB type to Apache Hadoop afterwards
•Note that OBIEE queries cannot span >1 Hive schema (no table prefixes)
Set up ODBC Connection at the OBIEE Server (Linux Only)
•OBIEE 11.1.1.7+ ships with HiveODBC drivers, need to use 7.x versions though
•Configure the ODBC connection in odbc.ini, name needs to match RPD ODBC name
•BI Server should then be able to connect to the Hive server, and Hadoop/MapReduce
[ODBC Data Sources]
AnalyticsWeb=Oracle BI Server
Cluster=Oracle BI Server
SSL_Sample=Oracle BI Server
bigdatalite=Oracle 7.1 Apache Hive Wire Protocol
[bigdatalite]
Driver=/u01/app/Middleware/Oracle_BI1/common/ODBC/Merant/7.0.1/lib/ARhive27.so
Description=Oracle 7.1 Apache Hive Wire Protocol
ArraySize=16384
Database=default
DefaultLongDataBuffLen=1024
EnableLongDataBuffLen=1024
EnableDescribeParam=0
Hostname=bigdatalite
LoginTimeout=30
MaxVarcharSize=2000
PortNumber=10000
RemoveColumnQualifiers=0
StringDescribeType=12
TransactionMode=0
UseCurrentSchema=0
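The requirement that the odbc.ini entry name matches the RPD ODBC name can be checked mechanically. The following is a minimal sketch, not part of the original deck: it parses a trimmed, hypothetical copy of odbc.ini with Python's standard configparser and reports whether a given data source name is both declared under [ODBC Data Sources] and has its own section. The helper names, the `orphan_dsn` entry and the case-sensitive comparison are assumptions made for illustration.

```python
import configparser

# Trimmed, hypothetical odbc.ini content; "orphan_dsn" is invented to show a failure case.
SAMPLE_ODBC_INI = """
[ODBC Data Sources]
bigdatalite=Oracle 7.1 Apache Hive Wire Protocol
orphan_dsn=Oracle 7.1 Apache Hive Wire Protocol

[bigdatalite]
Database=default
Hostname=bigdatalite
PortNumber=10000
"""


def load_odbc_ini(ini_text):
    """Parse odbc.ini text, keeping option names case-sensitive."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # configparser lowercases option names by default
    parser.read_string(ini_text)
    return parser


def dsn_is_usable(parser, rpd_dsn_name):
    """True if the name is declared under [ODBC Data Sources] and has its own section."""
    declared = (
        parser.has_section("ODBC Data Sources")
        and rpd_dsn_name in parser["ODBC Data Sources"]
    )
    return bool(declared and parser.has_section(rpd_dsn_name))


parser = load_odbc_ini(SAMPLE_ODBC_INI)
print(dsn_is_usable(parser, "bigdatalite"))  # True
print(dsn_is_usable(parser, "BigDataLite"))  # False: case differs (assumed to matter)
print(dsn_is_usable(parser, "orphan_dsn"))   # False: declared, but no section
```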
Opportunities for OBIEE and ODI with Big Data Sources and Tools
•Load data from a Hadoop/HDFS/NoSQL environment into a structured DW for analysis
•Provide OBIEE as an alternative to
Java coding or HiveQL for analysts
•Leverage Hadoop & HDFS for
massively-parallel staging-layer
number crunching
•Make use of low-cost, fault-tolerant
hardware for parts of your BI platform
•Provide the reporting and analysis
for customers who have bought
Oracle Big Data Appliance
What is Hadoop?
•Apache Hadoop is one of the most well-known Big Data technologies
•Family of open-source products used to store, and analyze distributed datasets
•Hadoop is the enabling framework, automatically parallelises and co-ordinates jobs
‣“Moves the compute to the data”
•MapReduce is the programming framework
for filtering, sorting and aggregating data
‣Map : filter and interpret input data, create key/value pairs
‣Reduce : summarise and aggregate
•MapReduce jobs can be written in any
language (Java etc), but it is complicated
What is HDFS?
•The filesystem behind Hadoop, used to store data for Hadoop analysis
‣Unix-like, uses commands such as ls, mkdir, chown, chmod
•Fault-tolerant, with rapid fault detection and recovery
•High-throughput, with streaming data access and large block sizes
•Designed for data-locality, placing data close to where it is processed
•Accessed from the command-line, via internet (hdfs://), GUI tools etc
[oracle@bigdatalite mapreduce]$ hadoop fs -mkdir /user/oracle/my_stuff
[oracle@bigdatalite mapreduce]$ hadoop fs -ls /user/oracle
Found 5 items
drwx------ - oracle hadoop 0 2013-04-27 16:48 /user/oracle/.staging
drwxrwxrwx - oracle hadoop 0 2012-09-18 17:02 /user/oracle/moviedemo
drwxrwxrwx - oracle hadoop 0 2012-10-17 15:58 /user/oracle/moviework
drwxr-xr-x - oracle hadoop 0 2013-05-03 17:49 /user/oracle/my_stuff
drwxr-xr-x - oracle hadoop 0 2012-08-10 16:08 /user/oracle/stage
Hive as the Hadoop “Data Warehouse”
•MapReduce jobs are typically written in Java, but Hive can make this simpler
•Hive is a query environment over Hadoop/MapReduce to support SQL-like queries
•Hive server accepts HiveQL queries via HiveODBC or HiveJDBC, automatically
creates MapReduce jobs against data previously loaded into the Hive HDFS tables
•Approach used by ODI and OBIEE
to gain access to Hadoop data
•Allows Hadoop data to be accessed just like
any other data source (sort of...)
Hive Data and Metadata
•Hive uses an RDBMS metastore to hold
table and column definitions in schemas
•Hive tables then map onto HDFS-stored files
‣Managed tables
‣External tables
•Oracle-like query optimizer, compiler,
executor
•JDBC and ODBC drivers,
plus CLI etc
[Diagram: Hive Driver (Compile, Optimize, Execute) and Metastore, over HDFS]
Managed Tables (/user/hive/warehouse/): HDFS or local files loaded into Hive HDFS area, using HiveQL CREATE TABLE command
External Tables (/user/oracle/, /user/movies/data/): HDFS files loaded into HDFS using external process, then mapped into Hive using CREATE EXTERNAL TABLE command
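To make the managed/external distinction concrete, here is a small sketch that builds the two kinds of HiveQL DDL as strings. It is illustrative only: the helper name, the table name `movies_ext`, the column list and the tab delimiter are assumptions, and only the /user/movies/data/ location and the movie_ratings table name appear elsewhere in the deck. The generated statements use standard Hive CREATE TABLE / CREATE EXTERNAL TABLE ... LOCATION syntax.

```python
def create_table_ddl(name, columns, external_location=None):
    """Build a HiveQL CREATE TABLE statement.

    With external_location set, the table is EXTERNAL and maps onto files that
    already sit in HDFS; otherwise Hive manages the files under its warehouse dir.
    """
    cols = ", ".join(f"{col} {typ}" for col, typ in columns)
    keyword = "CREATE EXTERNAL TABLE" if external_location else "CREATE TABLE"
    ddl = f"{keyword} {name} ({cols}) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t'"
    if external_location:
        ddl += f" LOCATION '{external_location}'"
    return ddl


# Hypothetical column layout, for illustration only.
rating_columns = [("user_id", "INT"), ("movie_id", "INT"), ("rating", "INT")]

managed_ddl = create_table_ddl("movie_ratings", rating_columns)
external_ddl = create_table_ddl("movies_ext", rating_columns, "/user/movies/data/")

print(managed_ddl)
print(external_ddl)
```

Dropping a managed table removes its files from the warehouse directory, whereas dropping an external table removes only the metadata, which is why the external form suits files maintained by another process.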
Transforming HiveQL Queries into MapReduce Jobs
•HiveQL queries are automatically translated into Java MapReduce jobs
•Selection and filtering part becomes Map tasks
•Aggregation part becomes the Reduce tasks
SELECT a, sum(b)
FROM myTable
WHERE a<100
GROUP BY a
[Diagram: three Map tasks feed two Reduce tasks, which produce the Result]
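The split of the query above into Map and Reduce work can be imitated in a few lines of plain Python. This is a conceptual sketch only, not the code Hive generates: the function names and sample rows are invented, and the in-memory "shuffle" stands in for the grouping that Hadoop performs between the two phases.

```python
from collections import defaultdict


def map_phase(rows):
    """Map: apply the WHERE a < 100 filter and emit (a, b) key/value pairs."""
    for row in rows:
        if row["a"] < 100:
            yield row["a"], row["b"]


def shuffle(pairs):
    """Group values by key, as Hadoop does between the Map and Reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: compute sum(b) for each value of a (the GROUP BY a part)."""
    return {key: sum(values) for key, values in grouped.items()}


# Invented sample rows standing in for myTable.
my_table = [
    {"a": 1, "b": 10},
    {"a": 1, "b": 5},
    {"a": 2, "b": 7},
    {"a": 150, "b": 99},  # dropped by the filter in the map phase
]

result = reduce_phase(shuffle(map_phase(my_table)))
print(result)  # {1: 15, 2: 7}
```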
An example Hive Query Session: Connect and Display Table List
[oracle@bigdatalite ~]$ hive
Hive history file=/tmp/oracle/hive_job_log_oracle_201304170403_1991392312.txt
hive> show tables;
OK
dwh_customer
dwh_customer_tmp
i_dwh_customer
ratings
src_customer
src_sales_person
weblog
weblog_preprocessed
weblog_sessionized
Time taken: 2.925 seconds
[Callout: Hive Server lists out all “tables” that have been defined within the Hive environment]
An example Hive Query Session: Display Table Row Count
hive> select count(*) from src_customer;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=
In order to set a constant number of reducers:
set mapred.reduce.tasks=
Starting Job = job_201303171815_0003, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201303171815_0003
Kill Command = /usr/lib/hadoop-0.20/bin/hadoop job -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201303171815_0003
2013-04-17 04:06:59,867 Stage-1 map = 0%, reduce = 0%
2013-04-17 04:07:03,926 Stage-1 map = 100%, reduce = 0%
2013-04-17 04:07:14,040 Stage-1 map = 100%, reduce = 33%
2013-04-17 04:07:15,049 Stage-1 map = 100%, reduce = 100%
Ended Job = job_201303171815_0003
OK
25
Time taken: 22.21 seconds
[Callouts: Request count(*) from table; Hive server generates MapReduce job to “map” table key/value pairs, and then reduce the results to table count; MapReduce job automatically run by Hive Server; Results returned to user]
OBIEE and ODI Access to Hive, Leveraging MapReduce with no Java Coding
•Requests in HiveQL arrive via HiveODBC, HiveJDBC
or through the Hive command shell
•JDBC and ODBC access requires Thrift server
‣Provides RPC call interface over Hive for external procs
•All queries then get parsed, optimized and compiled, then
sent to Hadoop NameNode and Job Tracker
•Then Hadoop processes the query, generating MapReduce
jobs and distributing them to run in parallel across all data nodes
•Hadoop access can still be performed procedurally if needed,
typically coded by hand in Java, or through Pig, etc
‣The equivalent of PL/SQL compared to SQL
‣But Hive works well with the OBIEE/ODI paradigm
Complementary Technologies: HDFS, Cloudera Manager, Hue, Beeswax etc
•You can download your own Hive binaries, libraries etc from Apache Hadoop website
•Or use pre-built VMs and distributions from the likes of Cloudera
‣Cloudera CDH3/4 is used on Oracle Big Data Appliance
‣Open-source + proprietary tools (Cloudera Manager)
•Other tools for managing Hive, HDFS etc including
‣Hue (HDFS file browser + management)
‣Beeswax (Hive administration + querying)
•Other complementary/required Hadoop tools
‣Sqoop
‣HDFS
‣Thrift
Demonstration
Simple Data Selection and Querying using Hive on Cloudera CDH3
ODI + Big Data Examples : Providing the Bridge Between Hadoop + OBIEE
•OBIEE now has the ability to report
against Hadoop data, via Hive
‣Assumes that data is already loaded
into the Hive warehouse tables
•ODI therefore can be used to load
the Hive tables, through either:
‣Loading Hive from files
‣Joining and loading from Hive-Hive
‣Loading and transforming via
shell scripts (python, perl etc)
•ODI could also extract the Hive data
and load into Oracle, if more appropriate
Configuring ODI 11.1.1.6+ for Hadoop Connectivity
•Obtain an installation of Hadoop/Hive from somewhere (Cloudera CDH3/4 for example)
•Copy the following files into a temp directory, archive and transfer to ODI environment
$HIVE_HOME/lib/*.jar
$HADOOP_HOME/hadoop-*-core*.jar
$HADOOP_HOME/hadoop-*-tools*.jar
for example...
/usr/lib/hive/lib/*.jar
/usr/lib/hadoop-0.20/hadoop-*-core*.jar
/usr/lib/hadoop-0.20/hadoop-*-tools*.jar
•Copy JAR files into userlib directory (for example c:\Users\Administrator\AppData\Roaming\odi\oracledi\userlib) and (standalone) agent lib directory
•Restart ODI Studio
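The JAR-gathering step lends itself to a small script. The sketch below is an assumption-laden illustration rather than an Oracle-supplied tool: the function name and the staging directory are hypothetical, and the glob patterns simply mirror the ones listed on the slide, assuming lowercase jar file names.

```python
import glob
import os
import shutil


def collect_client_jars(hive_home, hadoop_home, target_dir):
    """Copy the Hive and Hadoop client JARs named on the slide into target_dir.

    Returns the copied file names, in the order the patterns were processed.
    """
    patterns = [
        os.path.join(hive_home, "lib", "*.jar"),
        os.path.join(hadoop_home, "hadoop-*-core*.jar"),
        os.path.join(hadoop_home, "hadoop-*-tools*.jar"),
    ]
    os.makedirs(target_dir, exist_ok=True)
    copied = []
    for pattern in patterns:
        for jar_path in sorted(glob.glob(pattern)):
            shutil.copy2(jar_path, target_dir)
            copied.append(os.path.basename(jar_path))
    return copied


# Example call (paths from the slide; the staging directory is hypothetical):
# collect_client_jars("/usr/lib/hive", "/usr/lib/hadoop-0.20", "/tmp/odi_hadoop_jars")
```

The staging directory can then be archived and unpacked into the ODI userlib and agent lib directories before restarting ODI Studio.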
Registering HDFS and Hive Sources and Targets in the ODI Topology
•For Hive sources and targets, use Hive technology
‣JDBC Driver : Apache Hive JDBC Driver
‣JDBC URL : jdbc:hive://[server_name]:10000/default
‣(Flexfield Name) Hive Metastore URIs : thrift://[server_name]:10000
•For HDFS sources, use File technology
‣JDBC URL :
hdfs://[server_name]:port
‣Special HDFS “trick” to use File tech
(no specific HDFS technology)
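As a quick illustration of how the placeholders in these Topology settings are filled in, the sketch below formats the three URL patterns shown above. The helper names are hypothetical; the host name bigdatalite is borrowed from the deck's other examples, and the HDFS port 8020 is an assumption that should be replaced with the NameNode port of the actual cluster.

```python
def hive_jdbc_url(server_name, port=10000, database="default"):
    """JDBC URL for the Hive technology, following the pattern on the slide."""
    return f"jdbc:hive://{server_name}:{port}/{database}"


def hive_metastore_uri(server_name, port=10000):
    """Value for the Hive Metastore URIs flexfield, as given on the slide."""
    return f"thrift://{server_name}:{port}"


def hdfs_url(server_name, port):
    """URL entered for HDFS sources registered under the File technology."""
    return f"hdfs://{server_name}:{port}"


print(hive_jdbc_url("bigdatalite"))       # jdbc:hive://bigdatalite:10000/default
print(hive_metastore_uri("bigdatalite"))  # thrift://bigdatalite:10000
print(hdfs_url("bigdatalite", 8020))      # hdfs://bigdatalite:8020 (port is an assumption)
```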
Reverse Engineering Hive, HDFS and Local File Datastores + Models
•Hive tables reverse-engineer just like regular tables
•Define model in Designer navigator, uses Hive RKM to retrieve table metadata
•Information on Hive-specific metadata stored in flexfields
‣Hive Buckets
‣Hive Partition Column
‣Hive Cluster Column
‣Hive Sort Column
Demonstration
ODI 11.1.1.6 Configured for Hadoop Access, with Hive/HDFS source and targets registered
ODI Application Adapter for Hadoop KMs
•Application Adapter (pay-extra option) for Hadoop connectivity
•Works for both Windows and Linux installs of ODI Studio
‣Need to source HiveJDBC drivers and JARs from separate Hadoop install
•Provides six new knowledge modules
‣IKM File to Hive (Load Data)
‣IKM Hive Control Append
‣IKM Hive Transform
‣IKM File-Hive to Oracle (OLH)
‣CKM Hive
‣RKM Hive
Oracle Loader for Hadoop
•Oracle technology for accessing Hadoop data, and loading it into an Oracle database
•Pushes data transformation, “heavy lifting” to the Hadoop cluster, using MapReduce
•Direct-path loads into Oracle Database, partitioned and non-partitioned
•Online and offline loads
•Key technology for fast load of
Hadoop results into Oracle DB
IKM File to Hive (Load Data): Loading of Hive Tables from Local File or HDFS
•Uses the Hive Load Data command to load
from local or HDFS files
‣Calls Hadoop FS commands for simple
copy/move into/around HDFS
‣Commands generated by ODI through
IKM File to Hive (Load Data)
hive> load data inpath '/user/oracle/movielens_src/u.data'
> overwrite into table movie_ratings;
Loading data to table default.movie_ratings
Deleted hdfs://localhost.localdomain/user/hive/warehouse/movie_ratings
OK
Time taken: 0.341 seconds
IKM File to Hive (Load Data): Loading of Hive Tables from Local File or HDFS
•IKM File to Hive (Load Data) generates the
required HiveQL commands using a script template
•Executed over HiveJDBC interface
•Success/Failure/Warning returned to ODI
Load Data and Hadoop SerDe (Serializer-Deserializer) Transformations
•Hadoop SerDe transformations can be
accessed, for example to transform weblogs
•Hadoop interface that contains:
‣Deserializer - converts incoming data
into Java objects for Hive manipulation
‣Serializer - takes Hive Java objects &
converts to output for HDFS
•Library of SerDe transformations readily
available for use with Hive
•Use the OVERRIDE_ROW_FORMAT
option in IKM to override regular column
mappings in Mapping tab
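What a SerDe does to a weblog line can be pictured with a regex-based sketch in Python. This is a conceptual stand-in, not a Hive SerDe (those are Java classes): the pattern, the column names and the sample log line are assumptions, loosely modelled on the Apache common log format.

```python
import re

# Assumed layout: host identity user [time] "request" status size
LOG_PATTERN = re.compile(r'(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)')
COLUMNS = ("host", "identity", "user", "time", "request", "status", "size")


def deserialize(line):
    """Deserializer role: turn one raw log line into a row of named columns."""
    match = LOG_PATTERN.match(line)
    return dict(zip(COLUMNS, match.groups())) if match else None


def serialize(row):
    """Serializer role: write a row back out as a tab-delimited record."""
    return "\t".join(row[col] for col in COLUMNS)


sample = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
row = deserialize(sample)
print(row["host"], row["status"])  # 127.0.0.1 200
print(serialize(row))
```

Lines that do not match the pattern come back as None here; a real SerDe would have its own policy for malformed records.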
IKM Hive Control Append: Loading, Joining & Filtering Between Hive Tables
•Hive source and target, transformations according to HiveQL
functionality (aggregations, functions etc)
•Ability to join data sources
•Other data sources can be used,
but will involve staging tables and
additional KMs (as per any multi-source join)
IKM Hive Transform: Use Custom Shell Scripts to Integrate into Hive Table
•Gives developer the ability
to transform data
programmatically using
Python, Perl etc scripts
•Options to map output
of script to columns in
Hive table
•Useful for more
programmatic and complex
data transformations
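A script used this way typically follows Hive's streaming convention: rows arrive on standard input as tab-separated lines, and the script writes tab-separated output rows to standard output. The sketch below shows that shape under stated assumptions: the three input columns (host, path, status) and the derived error flag are invented for illustration and do not come from the deck.

```python
import io
import sys


def transform_line(line):
    """Turn one tab-separated input row (host, path, status) into an output row.

    Output columns: lower-cased host, path, and "1"/"0" flagging HTTP errors.
    """
    host, path, status = line.rstrip("\n").split("\t")
    is_error = "1" if int(status) >= 400 else "0"
    return "\t".join([host.lower(), path, is_error])


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Stream rows from stdin to stdout, skipping blank lines."""
    for line in stdin:
        if line.strip():
            stdout.write(transform_line(line) + "\n")


# Demonstration with in-memory streams; when run as a transform script,
# main() would be called with its defaults instead.
out = io.StringIO()
main(io.StringIO("Host-A\t/index.html\t200\nhost-b\t/missing\t404\n"), out)
print(out.getvalue(), end="")
```

The output columns of the script are then mapped to the target Hive table's columns through the KM options.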
IKM File-Hive to Oracle: Extract from Hive into Oracle Tables
•Uses Oracle Loader for Hadoop (OLH) to process
any filtering, aggregation, transformation in Hadoop,
using MapReduce
•OLH part of Oracle Big Data Connectors (additional cost)
•High-performance loader into Oracle DB
•Optional sort by primary key, pre-partitioning of data
•Can utilise the two OLH loading modes:
‣JDBC or OCI direct load into Oracle
‣Unload to files, Oracle DP into Oracle DB
Demonstration
Data Integration Tasks using ODIAAH Hadoop KMs
NoSQL Data Sources and Targets with ODI 11g
•No specific technology or driver for NoSQL databases, but can use Hive external tables
•Requires a specific “Hive Storage Handler” for key/value store sources
‣Hive feature for accessing data from other DB systems, for example MongoDB, Cassandra
‣For example, https://github.com/vilcek/HiveKVStorageHandler
•Additionally needs Hive collect_set aggregation method to aggregate results
‣Has to be defined in Languages panel in Topology
Pig, Sqoop and other Hadoop Technologies, and Hive
•Future versions of ODI might use other Hadoop technologies
‣Apache Sqoop for bulk transfer between Hadoop and RDBMSs
•Other technologies are not such an obvious fit
‣Apache Pig - the equivalent of PL/SQL for Hive’s SQL
•Commercial vendors may produce “better” versions of Hive, MapReduce etc
‣Cloudera Impala - more “real-time” version of Hive
‣MapR - solves many current issues with MapReduce, 100% Hadoop API compatibility
•Watch this space...!
Wednesday, 8 May 13
T : +1 (888) 631-1410 E : inquiries@rittmanmead.com W: www.rittmanmead.com
ODI11g, Hadoop and “Big Data”
Mark Rittman, Technical Director, Rittman Mead
Rittman Mead BI Forum 2013, Brighton & Atlanta
T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com
Wednesday, 8 May 13

Weitere ähnliche Inhalte

Was ist angesagt?

Column Stores and Google BigQuery
Column Stores and Google BigQueryColumn Stores and Google BigQuery
Column Stores and Google BigQueryCsaba Toth
 
Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012Daniel Abadi
 
Big Data Analytics Projects - Real World with Pentaho
Big Data Analytics Projects - Real World with PentahoBig Data Analytics Projects - Real World with Pentaho
Big Data Analytics Projects - Real World with PentahoMark Kromer
 
DataFrames: The Good, Bad, and Ugly
DataFrames: The Good, Bad, and UglyDataFrames: The Good, Bad, and Ugly
DataFrames: The Good, Bad, and UglyWes McKinney
 
OpenStack Trove Day (19 Aug 2014, Cambridge MA) - Sahara
OpenStack Trove Day (19 Aug 2014, Cambridge MA)  - SaharaOpenStack Trove Day (19 Aug 2014, Cambridge MA)  - Sahara
OpenStack Trove Day (19 Aug 2014, Cambridge MA) - Saharaspinningmatt
 
Serverless data pipelines gcp
Serverless data pipelines gcpServerless data pipelines gcp
Serverless data pipelines gcpCatherine Kimani
 
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India eventBig Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India eventThe Hive
 
Big data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.irBig data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.irdatastack
 
Scaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedInScaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedInDataWorks Summit
 
Data warehousing with Hadoop
Data warehousing with HadoopData warehousing with Hadoop
Data warehousing with Hadoophadooparchbook
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...StampedeCon
 
Big Data Architecture and Deployment
Big Data Architecture and DeploymentBig Data Architecture and Deployment
Big Data Architecture and DeploymentCisco Canada
 
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)Spark Summit
 
Optiq: A dynamic data management framework
Optiq: A dynamic data management frameworkOptiq: A dynamic data management framework
Optiq: A dynamic data management frameworkJulian Hyde
 
Big Data on the Microsoft Platform
Big Data on the Microsoft PlatformBig Data on the Microsoft Platform
Big Data on the Microsoft PlatformAndrew Brust
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureVenu Anuganti
 

Was ist angesagt? (20)

Column Stores and Google BigQuery
Column Stores and Google BigQueryColumn Stores and Google BigQuery
Column Stores and Google BigQuery
 
Hadoop Ecosystem
Hadoop EcosystemHadoop Ecosystem
Hadoop Ecosystem
 
Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012Boston Hadoop Meetup, April 26 2012
Boston Hadoop Meetup, April 26 2012
 
Big Data Analytics Projects - Real World with Pentaho
Big Data Analytics Projects - Real World with PentahoBig Data Analytics Projects - Real World with Pentaho
Big Data Analytics Projects - Real World with Pentaho
 
SQL on Hadoop
SQL on HadoopSQL on Hadoop
SQL on Hadoop
 
DataFrames: The Good, Bad, and Ugly
DataFrames: The Good, Bad, and UglyDataFrames: The Good, Bad, and Ugly
DataFrames: The Good, Bad, and Ugly
 
OpenStack Trove Day (19 Aug 2014, Cambridge MA) - Sahara
OpenStack Trove Day (19 Aug 2014, Cambridge MA)  - SaharaOpenStack Trove Day (19 Aug 2014, Cambridge MA)  - Sahara
OpenStack Trove Day (19 Aug 2014, Cambridge MA) - Sahara
 
Serverless data pipelines gcp
Serverless data pipelines gcpServerless data pipelines gcp
Serverless data pipelines gcp
 
Apache drill
Apache drillApache drill
Apache drill
 
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India eventBig Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
Big Data App servor by Lance Riedel, CTO, The Hive for The Hive India event
 
Big data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.irBig data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.ir
 
Scaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedInScaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedIn
 
Data warehousing with Hadoop
Data warehousing with HadoopData warehousing with Hadoop
Data warehousing with Hadoop
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
 
Big Data Architecture and Deployment
Big Data Architecture and DeploymentBig Data Architecture and Deployment
Big Data Architecture and Deployment
 
NoSQL Needs SomeSQL
NoSQL Needs SomeSQLNoSQL Needs SomeSQL
NoSQL Needs SomeSQL
 
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)
Data Storage Tips for Optimal Spark Performance-(Vida Ha, Databricks)
 
Optiq: A dynamic data management framework
Optiq: A dynamic data management frameworkOptiq: A dynamic data management framework
Optiq: A dynamic data management framework
 
Big Data on the Microsoft Platform
Big Data on the Microsoft PlatformBig Data on the Microsoft Platform
Big Data on the Microsoft Platform
 
SQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data ArchitectureSQL, NoSQL, BigData in Data Architecture
SQL, NoSQL, BigData in Data Architecture
 

Andere mochten auch

Not only SQL - Database Choices
Not only SQL - Database ChoicesNot only SQL - Database Choices
Not only SQL - Database ChoicesLynn Langit
 
XML Parsing with Map Reduce
XML Parsing with Map ReduceXML Parsing with Map Reduce
XML Parsing with Map ReduceEdureka!
 
Application architectures with Hadoop and Sessionization in MR
Application architectures with Hadoop and Sessionization in MRApplication architectures with Hadoop and Sessionization in MR
Application architectures with Hadoop and Sessionization in MRmarkgrover
 
Gobblin: Unifying Data Ingestion for Hadoop
Gobblin: Unifying Data Ingestion for HadoopGobblin: Unifying Data Ingestion for Hadoop
Gobblin: Unifying Data Ingestion for HadoopYinan Li
 
High Speed Continuous & Reliable Data Ingest into Hadoop
High Speed Continuous & Reliable Data Ingest into HadoopHigh Speed Continuous & Reliable Data Ingest into Hadoop
High Speed Continuous & Reliable Data Ingest into HadoopDataWorks Summit
 
Introduction to streaming and messaging flume,kafka,SQS,kinesis
Introduction to streaming and messaging  flume,kafka,SQS,kinesis Introduction to streaming and messaging  flume,kafka,SQS,kinesis
Introduction to streaming and messaging flume,kafka,SQS,kinesis Omid Vahdaty
 
Architectural considerations for Hadoop Applications
Architectural considerations for Hadoop ApplicationsArchitectural considerations for Hadoop Applications
Architectural considerations for Hadoop Applicationshadooparchbook
 
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3Hortonworks
 
Realtime Apache Hadoop at Facebook
Realtime Apache Hadoop at FacebookRealtime Apache Hadoop at Facebook
Realtime Apache Hadoop at Facebookparallellabs
 
Introduction to Big Data & Hadoop Architecture - Module 1
Introduction to Big Data & Hadoop Architecture - Module 1Introduction to Big Data & Hadoop Architecture - Module 1
Introduction to Big Data & Hadoop Architecture - Module 1Rohit Agrawal
 
Data Ingestion, Extraction & Parsing on Hadoop
Data Ingestion, Extraction & Parsing on HadoopData Ingestion, Extraction & Parsing on Hadoop
Data Ingestion, Extraction & Parsing on Hadoopskaluska
 
ApacheCon-Flume-Kafka-2016
ApacheCon-Flume-Kafka-2016ApacheCon-Flume-Kafka-2016
ApacheCon-Flume-Kafka-2016Jayesh Thakrar
 
Real Time Data Processing using Spark Streaming | Data Day Texas 2015
Real Time Data Processing using Spark Streaming | Data Day Texas 2015Real Time Data Processing using Spark Streaming | Data Day Texas 2015
Real Time Data Processing using Spark Streaming | Data Day Texas 2015Cloudera, Inc.
 
introduction to data processing using Hadoop and Pig
introduction to data processing using Hadoop and Pigintroduction to data processing using Hadoop and Pig
introduction to data processing using Hadoop and PigRicardo Varela
 
2011 06-30-hadoop-summit v5
2011 06-30-hadoop-summit v52011 06-30-hadoop-summit v5
2011 06-30-hadoop-summit v5Samuel Rash
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 
Centralized logging with Flume
Centralized logging with FlumeCentralized logging with Flume
Centralized logging with FlumeRatnakar Pawar
 
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...
Modern Data Architecture for a Data Lake with Informatica and Hortonworks Dat...Hortonworks
 

ODI11g, Hadoop and "Big Data" Sources

  • 1. T : +1 (888) 631-1410 E : inquiries@rittmanmead.com W: www.rittmanmead.com ODI11g, Hadoop and “Big Data” Mark Rittman, Technical Director, Rittman Mead Rittman Mead BI Forum 2013, Brighton & Atlanta T : +44 (0) 8446 697 995 E : enquiries@rittmanmead.com W: www.rittmanmead.com Wednesday, 8 May 13
  • 2. Big Data, Hadoop and Unstructured Data Sources
    • “Big data” is the hot topic in BI, DW and Analytics circles
    • The ability to harness vast datasets, at a highly granular level, using massively parallel computing
    • Crunching loosely structured and modelled datasets using simple algorithms: Map (project) + Reduce (aggregate)
    • Largely based around open-source projects and non-relational technologies
      ‣ Apache Hadoop
      ‣ MapReduce
      ‣ Hadoop Distributed File System
      ‣ Apache Hive, Sqoop, HBase etc
    • Emerging commercial vendors
      ‣ Cloudera
      ‣ Hortonworks etc
    • Can be used standalone, or linked to an enterprise DW/BI architecture
  • 3. Oracle’s Strategy for Business Analytics
    • Connect to all of your data, from all your sources,
    • Subject it to the full range of possible inquiry
    • Package solutions for known problems and fixed sources, and
    • Deploy to PCs and mobile devices, on premise or in the cloud
    [Diagram labels: Any Data, Any Source; Full Range of Analytics; Integrated Analytic Apps; On Premise, On Cloud, On Mobile]
  • 4. Connect to All of Your Data, From All of Your Sources
    • As well as traditional application, database and file sources, unstructured and “big data” sources are within scope for business decision-making
      ‣ Data of great volume, great velocity and great variety
    [Diagram labels: Any Data, Any Source; Your Data: decisions based on your data; Big Data: decisions based on all data relevant to you; Transactions; Documents & Social Data; Machine-Generated Data]
  • 5. Oracle’s Big Data Products
    • Oracle Big Data Appliance - Engineered System for Big Data Acquisition and Processing
      ‣ Cloudera Distribution of Hadoop
      ‣ Cloudera Manager
      ‣ Open-source R
      ‣ Oracle NoSQL Database Community Edition
      ‣ Oracle Enterprise Linux + Oracle JVM
    • Oracle Big Data Connectors
      ‣ Oracle Loader for Hadoop (Hadoop > Oracle RDBMS)
      ‣ Oracle Direct Connector for HDFS (HDFS > Oracle RDBMS)
      ‣ Oracle Data Integrator Application Adapter for Hadoop
      ‣ Oracle R Connector for Hadoop
    • Oracle NoSQL Database (column/key-store DB based on BerkeleyDB)
  • 6. ODI as Part of Oracle’s Big Data Strategy
    • ODI is the data integration tool for extracting data from Hadoop/MapReduce, and loading it into Oracle Big Data Appliance, Oracle Exadata and Oracle Exalytics
    • Oracle Data Integrator Application Adapter for Hadoop provides the required data adapters
      ‣ Load data into Hadoop from the local filesystem, or HDFS (Hadoop clustered FS)
      ‣ Read data from Hadoop/MapReduce using Apache Hive (JDBC) and HiveQL, load into Oracle RDBMS using Oracle Loader for Hadoop
    • Supported by Oracle’s Engineered Systems
      ‣ Exadata
      ‣ Exalytics
      ‣ Big Data Appliance (w/Cloudera Hadoop Distrib)
  • 7. How ODI Accesses Hadoop and MapReduce
    • ODI accesses data in Hadoop clusters through Apache Hive
      ‣ Metadata and query layer over MapReduce
      ‣ Provides SQL-like language (HiveQL) and a metadata store (data dictionary)
      ‣ Provides a means to define “tables”, into which file data is loaded, and then queried via MapReduce
      ‣ Accessed via Hive JDBC driver (separate Hadoop install required on ODI server, for client libs)
    • Additional access through Oracle Direct Connector for HDFS and Oracle Loader for Hadoop
    [Diagram labels: Hadoop Cluster; Hive Server; ODI 11g; Oracle RDBMS; HiveQL; MapReduce; direct-path loads using Oracle Loader for Hadoop, transformation logic in MapReduce]
  • 8. Oracle Business Analytics and Big Data Sources
    • OBIEE 11g, and other Oracle Business Analytics tools, can also make use of big data sources
      ‣ Oracle Exalytics, through in-memory aggregates and InfiniBand connection to Exadata, can analyze vast (structured) datasets held in relational and OLAP databases
      ‣ Endeca Information Discovery can analyze unstructured and semi-structured sources
      ‣ InfiniBand connector to Big Data Appliance + Hadoop connector in OBIEE supports analysis via MapReduce
      ‣ Oracle R distribution + Oracle R Enterprise supports SAS-style statistical analysis of large data sets, as part of Oracle Advanced Analytics Option
      ‣ OBIEE can access Hadoop data sources through another Apache technology called Hive
  • 9. OBIEE Access to Hadoop/Hive for BI Administration Tool RPD Creation
    • HiveODBC driver has to be installed into the Windows environment, so that the BI Administration tool can connect to Hive and return table metadata
    • Import as ODBC datasource, change physical DB type to Apache Hadoop afterwards
    • Note that OBIEE queries cannot span >1 Hive schema (no table prefixes)
  • 10. Set up ODBC Connection at the OBIEE Server (Linux Only)
    • OBIEE 11.1.1.7+ ships with HiveODBC drivers, need to use 7.x versions though
    • Configure the ODBC connection in odbc.ini, name needs to match RPD ODBC name
    • BI Server should then be able to connect to the Hive server, and Hadoop/MapReduce
      [ODBC Data Sources]
      AnalyticsWeb=Oracle BI Server
      Cluster=Oracle BI Server
      SSL_Sample=Oracle BI Server
      bigdatalite=Oracle 7.1 Apache Hive Wire Protocol
      [bigdatalite]
      Driver=/u01/app/Middleware/Oracle_BI1/common/ODBC/Merant/7.0.1/lib/ARhive27.so
      Description=Oracle 7.1 Apache Hive Wire Protocol
      ArraySize=16384
      Database=default
      DefaultLongDataBuffLen=1024
      EnableLongDataBuffLen=1024
      EnableDescribeParam=0
      Hostname=bigdatalite
      LoginTimeout=30
      MaxVarcharSize=2000
      PortNumber=10000
      RemoveColumnQualifiers=0
      StringDescribeType=12
      TransactionMode=0
      UseCurrentSchema=0
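The rule that the DSN name in odbc.ini must match the RPD ODBC name can be checked before restarting the BI Server. The following is a minimal sketch, assuming an abbreviated copy of the configuration from slide 10; the `check_dsn` helper and the `ODBC_INI` sample are hypothetical and not part of OBIEE.

```python
import configparser

# Abbreviated sample of the odbc.ini shown on slide 10 (assumption: only
# the keys needed for the check are included here).
ODBC_INI = """\
[ODBC Data Sources]
AnalyticsWeb=Oracle BI Server
bigdatalite=Oracle 7.1 Apache Hive Wire Protocol

[bigdatalite]
Description=Oracle 7.1 Apache Hive Wire Protocol
Database=default
Hostname=bigdatalite
PortNumber=10000
"""


def check_dsn(ini_text, dsn_name):
    """Return (hostname, port) for dsn_name, or raise ValueError.

    The DSN must be listed under [ODBC Data Sources] and must also
    have its own section with the connection details.
    """
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key case as written in the file
    parser.read_string(ini_text)

    if not parser.has_section("ODBC Data Sources"):
        raise ValueError("no [ODBC Data Sources] section")
    if dsn_name not in parser["ODBC Data Sources"]:
        raise ValueError(f"DSN {dsn_name!r} is not declared")
    if not parser.has_section(dsn_name):
        raise ValueError(f"DSN {dsn_name!r} has no configuration section")

    section = parser[dsn_name]
    return section["Hostname"], int(section["PortNumber"])


if __name__ == "__main__":
    print(check_dsn(ODBC_INI, "bigdatalite"))
```

The name passed to `check_dsn` would be the ODBC data source name used in the RPD connection pool; a mismatch raises an error instead of failing later at query time.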
  • 11. Opportunities for OBIEE and ODI with Big Data Sources and Tools
    • Load data from a Hadoop/HDFS/NoSQL environment into a structured DW for analysis
    • Provide OBIEE as an alternative to Java coding or HiveQL for analysts
    • Leverage Hadoop & HDFS for massively-parallel staging-layer number crunching
    • Make use of low-cost, fault-tolerant hardware for parts of your BI platform
    • Provide the reporting and analysis for customers who have bought Oracle Big Data Appliance
  • 12. What is Hadoop?
    • Apache Hadoop is one of the best-known Big Data technologies
    • Family of open-source products used to store and analyze distributed datasets
    • Hadoop is the enabling framework; it automatically parallelises and co-ordinates jobs
      ‣ “Moves the compute to the data”
    • MapReduce is the programming framework for filtering, sorting and aggregating data
      ‣ Map : filter and interpret input data, create key/value pairs
      ‣ Reduce : summarise and aggregate
    • MapReduce jobs can be written in any language (Java etc), but it is complicated
  • 13. What is HDFS?
    • The filesystem behind Hadoop, used to store data for Hadoop analysis
      ‣ Unix-like, uses commands such as ls, mkdir, chown, chmod
    • Fault-tolerant, with rapid fault detection and recovery
    • High-throughput, with streaming data access and large block sizes
    • Designed for data-locality, placing data close to where it is processed
    • Accessed from the command-line, via internet (hdfs://), GUI tools etc
      [oracle@bigdatalite mapreduce]$ hadoop fs -mkdir /user/oracle/my_stuff
      [oracle@bigdatalite mapreduce]$ hadoop fs -ls /user/oracle
      Found 5 items
      drwx------ - oracle hadoop 0 2013-04-27 16:48 /user/oracle/.staging
      drwxrwxrwx - oracle hadoop 0 2012-09-18 17:02 /user/oracle/moviedemo
      drwxrwxrwx - oracle hadoop 0 2012-10-17 15:58 /user/oracle/moviework
      drwxr-xr-x - oracle hadoop 0 2013-05-03 17:49 /user/oracle/my_stuff
      drwxr-xr-x - oracle hadoop 0 2012-08-10 16:08 /user/oracle/stage
  • 14. Hive as the Hadoop “Data Warehouse”
    • MapReduce jobs are typically written in Java, but Hive can make this simpler
    • Hive is a query environment over Hadoop/MapReduce to support SQL-like queries
    • Hive server accepts HiveQL queries via HiveODBC or HiveJDBC, automatically creates MapReduce jobs against data previously loaded into the Hive HDFS tables
    • Approach used by ODI and OBIEE to gain access to Hadoop data
    • Allows Hadoop data to be accessed just like any other data source (sort of...)
  • 15. Hive Data and Metadata
    • Hive uses an RDBMS metastore to hold table and column definitions in schemas
    • Hive tables then map onto HDFS-stored files
      ‣ Managed tables
      ‣ External tables
    • Oracle-like query optimizer, compiler, executor
    • JDBC and ODBC drivers, plus CLI etc
    [Diagram: Hive Driver (Compile, Optimize, Execute) and Metastore over HDFS]
      ‣ Managed Tables (/user/hive/warehouse/): HDFS or local files loaded into Hive HDFS area, using HiveQL CREATE TABLE command
      ‣ External Tables (/user/oracle/, /user/movies/data/): HDFS files loaded into HDFS using external process, then mapped into Hive using CREATE EXTERNAL TABLE command
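The difference between managed and external tables shows up mainly in the DDL: an external table adds the EXTERNAL keyword and a LOCATION clause pointing at files that already sit in HDFS. A minimal sketch that builds both forms as HiveQL strings; the `create_table_ddl` helper, the table names and the column lists are hypothetical, and the sketch does no identifier quoting or escaping.

```python
def create_table_ddl(name, columns, external_location=None, delimiter="\\t"):
    """Build a HiveQL CREATE TABLE statement for delimited text data.

    columns is a list of (column_name, hive_type) pairs. When
    external_location is given, the statement becomes CREATE EXTERNAL
    TABLE and points Hive at the existing HDFS directory.
    """
    cols = ", ".join(f"{col} {typ}" for col, typ in columns)
    keyword = "CREATE EXTERNAL TABLE" if external_location else "CREATE TABLE"
    ddl = (
        f"{keyword} {name} ({cols}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'"
    )
    if external_location:
        ddl += f" LOCATION '{external_location}'"
    return ddl


if __name__ == "__main__":
    cols = [("user_id", "INT"), ("movie_id", "INT"), ("rating", "INT")]
    # Managed table: Hive stores the data under its own warehouse directory.
    print(create_table_ddl("movie_ratings", cols))
    # External table: Hive only records metadata for files at this path.
    print(create_table_ddl("movie_ratings_ext", cols, "/user/movies/data/"))
```

Dropping a managed table removes its data from the warehouse directory, whereas dropping an external table removes only the metastore entry, which is why externally loaded files are usually mapped with the second form.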
  • 16. Transforming HiveQL Queries into MapReduce Jobs
    • HiveQL queries are automatically translated into Java MapReduce jobs
    • Selection and filtering part becomes Map tasks
    • Aggregation part becomes the Reduce tasks
      SELECT a, sum(b) FROM myTable WHERE a<100 GROUP BY a
    [Diagram labels: Map Task (x3); Reduce Task (x2); Result]
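The split of the example query into Map and Reduce work can be imitated in a few lines of plain Python. This is a conceptual sketch only, not what Hive generates: the sample rows, the two-way split and the function names are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical contents of myTable as (a, b) rows.
ROWS = [(1, 10), (1, 5), (2, 7), (150, 99), (2, 3)]


def map_task(split):
    """WHERE a < 100: filter rows and emit (key, value) pairs keyed on a."""
    for a, b in split:
        if a < 100:
            yield a, b


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_task(key, values):
    """GROUP BY a with sum(b): aggregate all values seen for one key."""
    return key, sum(values)


def run_query(rows, n_splits=2):
    """Simulate SELECT a, sum(b) FROM myTable WHERE a<100 GROUP BY a."""
    # Each split stands in for a block of the input handled by one map task.
    splits = [rows[i::n_splits] for i in range(n_splits)]
    mapped = [pair for split in splits for pair in map_task(split)]
    grouped = shuffle(mapped)
    return dict(reduce_task(k, vs) for k, vs in sorted(grouped.items()))


if __name__ == "__main__":
    print(run_query(ROWS))
```

On a real cluster the map tasks run in parallel on the nodes holding each block, and the shuffle moves data across the network; here everything runs in one process to keep the data flow visible.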
  • 17. An example Hive Query Session: Connect and Display Table List
      [oracle@bigdatalite ~]$ hive
      Hive history file=/tmp/oracle/hive_job_log_oracle_201304170403_1991392312.txt
      hive> show tables;
      OK
      dwh_customer
      dwh_customer_tmp
      i_dwh_customer
      ratings
      src_customer
      src_sales_person
      weblog
      weblog_preprocessed
      weblog_sessionized
      Time taken: 2.925 seconds
    ‣ Hive Server lists out all “tables” that have been defined within the Hive environment
  • 18. An example Hive Query Session: Display Table Row Count
      hive> select count(*) from src_customer;
      Total MapReduce jobs = 1
      Launching Job 1 out of 1
      Number of reduce tasks determined at compile time: 1
      In order to change the average load for a reducer (in bytes):
        set hive.exec.reducers.bytes.per.reducer=
      In order to limit the maximum number of reducers:
        set hive.exec.reducers.max=
      In order to set a constant number of reducers:
        set mapred.reduce.tasks=
      Starting Job = job_201303171815_0003, Tracking URL = http://localhost.localdomain:50030/jobdetails.jsp?jobid=job_201303171815_0003
      Kill Command = /usr/lib/hadoop-0.20/bin/hadoop job -Dmapred.job.tracker=localhost.localdomain:8021 -kill job_201303171815_0003
      2013-04-17 04:06:59,867 Stage-1 map = 0%, reduce = 0%
      2013-04-17 04:07:03,926 Stage-1 map = 100%, reduce = 0%
      2013-04-17 04:07:14,040 Stage-1 map = 100%, reduce = 33%
      2013-04-17 04:07:15,049 Stage-1 map = 100%, reduce = 100%
      Ended Job = job_201303171815_0003
      OK
      25
      Time taken: 22.21 seconds
    ‣ Request count(*) from table
    ‣ Hive server generates MapReduce job to “map” table key/value pairs, and then reduce the results to table count
    ‣ MapReduce job automatically run by Hive Server
    ‣ Results returned to user
  • 19. OBIEE and ODI Access to Hive, Leveraging MapReduce with no Java Coding
    • Requests in HiveQL arrive via HiveODBC, HiveJDBC or through the Hive command shell
    • JDBC and ODBC access requires the Thrift server
      ‣ Provides RPC call interface over Hive for external procs
    • All queries then get parsed, optimized and compiled, then sent to the Hadoop NameNode and Job Tracker
    • Then Hadoop processes the query, generating MapReduce jobs and distributing them to run in parallel across all data nodes
    • Hadoop access can still be performed procedurally if needed, typically coded by hand in Java, or through Pig, etc
      ‣ The equivalent of PL/SQL compared to SQL
      ‣ But Hive works well with the OBIEE/ODI paradigm
  • 20. Complementary Technologies: HDFS, Cloudera Manager, Hue, Beeswax etc
    • You can download your own Hive binaries, libraries etc from the Apache Hadoop website
    • Or use pre-built VMs and distributions from the likes of Cloudera
      ‣ Cloudera CDH3/4 is used on Oracle Big Data Appliance
      ‣ Open-source + proprietary tools (Cloudera Manager)
    • Other tools for managing Hive, HDFS etc including
      ‣ Hue (HDFS file browser + management)
      ‣ Beeswax (Hive administration + querying)
    • Other complementary/required Hadoop tools
      ‣ Sqoop
      ‣ HDFS
      ‣ Thrift
  • 21. Demonstration: Simple Data Selection and Querying using Hive on Cloudera CDH3
  • 22. ODI + Big Data Examples : Providing the Bridge Between Hadoop + OBIEE
    • OBIEE now has the ability to report against Hadoop data, via Hive
      ‣ Assumes that data is already loaded into the Hive warehouse tables
    • ODI therefore can be used to load the Hive tables, through either:
      ‣ Loading Hive from files
      ‣ Joining and loading from Hive-Hive
      ‣ Loading and transforming via shell scripts (python, perl etc)
    • ODI could also extract the Hive data and load into Oracle, if more appropriate
  • 23. Configuring ODI 11.1.1.6+ for Hadoop Connectivity
    • Obtain an installation of Hadoop/Hive from somewhere (Cloudera CDH3/4 for example)
    • Copy the following files into a temp directory, archive and transfer to the ODI environment
      $HIVE_HOME/lib/*.jar
      $HADOOP_HOME/hadoop-*-core*.jar, $HADOOP_HOME/Hadoop-*-tools*.jar
      for example...
      /usr/lib/hive/lib/*.jar
      /usr/lib/hadoop-0.20/hadoop-*-core*.jar, /usr/lib/hadoop-0.20/Hadoop-*-tools*.jar
    • Copy JAR files into userlib directory and (standalone) agent lib directory
      c:\Users\Administrator\AppData\Roaming\odi\oracledi\userlib
    • Restart ODI Studio
  • 24. Registering HDFS and Hive Sources and Targets in the ODI Topology
    • For Hive sources and targets, use Hive technology
      ‣ JDBC Driver : Apache Hive JDBC Driver
      ‣ JDBC URL : jdbc:hive://[server_name]:10000/default
      ‣ (Flexfield Name) Hive Metastore URIs : thrift://[server_name]:10000
    • For HDFS sources, use File technology
      ‣ JDBC URL : hdfs://[server_name]:port
      ‣ Special HDFS “trick” to use File tech (no specific HDFS technology)
  • 25. Reverse Engineering Hive, HDFS and Local File Datastores + Models
    • Hive tables reverse-engineer just like regular tables
    • Define model in Designer navigator, uses Hive RKM to retrieve table metadata
    • Information on Hive-specific metadata stored in flexfields
      ‣ Hive Buckets
      ‣ Hive Partition Column
      ‣ Hive Cluster Column
      ‣ Hive Sort Column
  • 26. Demonstration: ODI 11.1.1.6 Configured for Hadoop Access, with Hive/HDFS sources and targets registered
  • 27. ODI Application Adapter for Hadoop KMs
    • Application Adapter (pay-extra option) for Hadoop connectivity
    • Works for both Windows and Linux installs of ODI Studio
      ‣ Need to source HiveJDBC drivers and JARs from separate Hadoop install
    • Provides six new knowledge modules
      ‣ IKM File to Hive (Load Data)
      ‣ IKM Hive Control Append
      ‣ IKM Hive Transform
      ‣ IKM File-Hive to Oracle (OLH)
      ‣ CKM Hive
      ‣ RKM Hive
  • 28. Oracle Loader for Hadoop
    • Oracle technology for accessing Hadoop data, and loading it into an Oracle database
    • Pushes data transformation, “heavy lifting” to the Hadoop cluster, using MapReduce
    • Direct-path loads into Oracle Database, partitioned and non-partitioned
    • Online and offline loads
    • Key technology for fast load of Hadoop results into Oracle DB
  • 29. IKM File to Hive (Load Data): Loading of Hive Tables from Local File or HDFS
    • Uses the Hive Load Data command to load from local or HDFS files
      ‣ Calls Hadoop FS commands for simple copy/move into/around HDFS
      ‣ Commands generated by ODI through IKM File to Hive (Load Data)
      hive> load data inpath '/user/oracle/movielens_src/u.data'
          > overwrite into table movie_ratings;
      Loading data to table default.movie_ratings
      Deleted hdfs://localhost.localdomain/user/hive/warehouse/movie_ratings
      OK
      Time taken: 0.341 seconds
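The Load Data statement has only a few moving parts: an optional LOCAL keyword for files on the local filesystem, the input path, an optional OVERWRITE, and the target table. A minimal sketch of assembling such a statement; the `load_data_statement` helper is hypothetical and is not the template the IKM itself uses.

```python
def load_data_statement(path, table, local=False, overwrite=False):
    """Build a HiveQL LOAD DATA statement.

    local=True reads from the local filesystem instead of HDFS;
    overwrite=True replaces the existing contents of the table.
    """
    parts = ["LOAD DATA"]
    if local:
        parts.append("LOCAL")
    parts.append(f"INPATH '{path}'")
    if overwrite:
        parts.append("OVERWRITE")
    parts.append(f"INTO TABLE {table}")
    return " ".join(parts)


if __name__ == "__main__":
    # Reproduces the statement from the session on slide 29.
    print(load_data_statement(
        "/user/oracle/movielens_src/u.data", "movie_ratings", overwrite=True
    ))
```

For an HDFS source, Hive moves the file into the table's warehouse directory rather than copying it, which matches the "simple copy/move" wording on the slide.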
  • 30. IKM File to Hive (Load Data): Loading of Hive Tables from Local File or HDFS
    • IKM File to Hive (Load Data) generates the required HiveQL commands using a script template
    • Executed over HiveJDBC interface
    • Success/Failure/Warning returned to ODI
  • 31. Load Data and Hadoop SerDe (Serializer-Deserializer) Transformations
    • Hadoop SerDe transformations can be accessed, for example to transform weblogs
    • Hadoop interface that contains:
      ‣ Deserializer - converts incoming data into Java objects for Hive manipulation
      ‣ Serializer - takes Hive Java objects & converts to output for HDFS
    • Library of SerDe transformations readily available for use with Hive
    • Use the OVERRIDE_ROW_FORMAT option in IKM to override regular column mappings in Mapping tab
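The two halves of a SerDe can be pictured as a pair of functions: one turns a raw line into a structured row, the other turns a row back into text. Real SerDes are Java classes plugged into Hive; the following is only a conceptual Python analogy, and the simplified log pattern, field names and sample line are assumptions.

```python
import re

# Simplified pattern for a common-log-style weblog line (assumption):
# host, two ignored fields, [timestamp], "request", status code.
LOG_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3})')

FIELDS = ("host", "ts", "request", "status")


def deserialize(line):
    """Raw text line -> row (dict), or None if the line does not parse."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    host, ts, request, status = match.groups()
    return {"host": host, "ts": ts, "request": request, "status": int(status)}


def serialize(row, delimiter="\t"):
    """Row (dict) -> delimited text line, in a fixed column order."""
    return delimiter.join(str(row[name]) for name in FIELDS)


if __name__ == "__main__":
    sample = '10.0.0.1 - - [17/Apr/2013:04:06:59 +0000] "GET /index.html HTTP/1.1" 200 512'
    row = deserialize(sample)
    print(row)
    print(serialize(row))
```

Lines that do not match the pattern come back as None here; a production SerDe would need an explicit policy for such rows.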
  • 32. IKM Hive Control Append: Loading, Joining & Filtering Between Hive Tables
    • Hive source and target, transformations according to HiveQL functionality (aggregations, functions etc)
    • Ability to join data sources
    • Other data sources can be used, but will involve staging tables and additional KMs (as per any multi-source join)
  • 33. IKM Hive Transform: Use Custom Shell Scripts to Integrate into Hive Table
    • Gives developer the ability to transform data programmatically using Python, Perl etc scripts
    • Options to map output of script to columns in Hive table
    • Useful for more programmatic and complex data transformations
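A transform script of this kind follows Hive's streaming convention: rows arrive on standard input as tab-separated lines, and the script writes tab-separated output rows to standard output. A minimal sketch, assuming a hypothetical three-column input (user_id, movie_id, rating) and a made-up bucketing rule; the column layout is not taken from the slides.

```python
import sys


def transform_line(line):
    """Turn one tab-separated input row into one tab-separated output row.

    Replaces the numeric rating with a coarse bucket label.
    """
    user_id, movie_id, rating = line.rstrip("\n").split("\t")
    bucket = "high" if int(rating) >= 4 else "low"
    return "\t".join([user_id, movie_id, bucket])


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Stream rows from stdin to stdout, skipping blank lines."""
    for line in stdin:
        if line.strip():
            stdout.write(transform_line(line) + "\n")

# To run this as a streaming script, add a call to main() at module level.
```

The output columns (user_id, movie_id, bucket) are what would then be mapped onto the target Hive table's columns, which is the role of the script-output mapping options mentioned on slide 33.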
  • 34. IKM File-Hive to Oracle: Extract from Hive into Oracle Tables
    • Uses Oracle Loader for Hadoop (OLH) to process any filtering, aggregation, transformation in Hadoop, using MapReduce
    • OLH is part of Oracle Big Data Connectors (additional cost)
    • High-performance loader into Oracle DB
    • Optional sort by primary key, pre-partitioning of data
    • Can utilise the two OLH loading modes:
      ‣ JDBC or OCI direct load into Oracle
      ‣ Unload to files, Oracle Data Pump into Oracle DB
  • 35. Demonstration: Data Integration Tasks using ODIAAH Hadoop KMs
  • 36. NoSQL Data Sources and Targets with ODI 11g
    • No specific technology or driver for NoSQL databases, but can use Hive external tables
    • Requires a specific “Hive Storage Handler” for key/value store sources
      ‣ Hive feature for accessing data from other DB systems, for example MongoDB, Cassandra
      ‣ For example, https://github.com/vilcek/HiveKVStorageHandler
    • Additionally needs Hive collect_set aggregation method to aggregate results
      ‣ Has to be defined in Languages panel in Topology
  • 37. Pig, Sqoop and other Hadoop Technologies, and Hive
    • Future versions of ODI might use other Hadoop technologies
      ‣ Apache Sqoop for bulk transfer between Hadoop and RDBMSs
    • Other technologies are not such an obvious fit
      ‣ Apache Pig - the equivalent of PL/SQL for Hive’s SQL
    • Commercial vendors may produce “better” versions of Hive, MapReduce etc
      ‣ Cloudera Impala - more “real-time” version of Hive
      ‣ MapR - solves many current issues with MapReduce, 100% Hadoop API compatibility
    • Watch this space...!
  • 38. ODI11g, Hadoop and “Big Data”, Mark Rittman, Technical Director, Rittman Mead. Rittman Mead BI Forum 2013, Brighton & Atlanta