SlideShare ist ein Scribd-Unternehmen logo
1 von 43
Downloaden Sie, um offline zu lesen
Lesson 4 : Hadoop Data Output and 
Reporting using OBIEE 11g 
Mark Rittman, CTO, Rittman Mead 
SIOUG and HROUG Conferences, Oct 2014 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Moving Data In, Around and Out of Hadoop 
•Three stages to Hadoop data movement, with dedicated Apache / other tools 
‣Load : receive files in batch, or in real-time (logs, events) 
‣Transform : process & transform data to answer questions 
‣Store / Export : store in structured form, or export to RDBMS using Sqoop 
RDBMS 
Imports 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
Loading 
Stage 
!!!! 
Processing 
Stage 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
!!!! 
Store / Export 
Stage 
!!!! 
Real-Time 
Logs / Events 
File / 
Unstructured 
Imports 
File 
Exports 
RDBMS 
Exports
Discovery vs. Exploitation Project Phases 
•Discovery and monetising steps in Big Data projects have different requirements 
•Discovery phase 
‣Unbounded discovery 
‣Self-Service sandbox 
‣Wide toolset 
•Promotion to Exploitation 
‣Commercial exploitation 
‣Narrower toolset 
‣Integration to operations 
‣Non-functional requirements 
‣Code standardisation & governance 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Actionable 
Events 
Event Engine Enterprise 
www.rittmanmead.com inquiries@rittmanmead.com @rittmanmead 
Information Store 
Reporting 
Discovery Lab 
Actionable 
Information 
Actionable 
Insights 
Input 
Events 
Execution 
Innovation 
Discovery 
Output 
Events 
& Data 
Structured 
Enterprise 
Data 
Other 
Data 
Data 
Reservoir 
Data Factory 
Information Solution
Actionable 
Events 
Event Engine Enterprise 
www.rittmanmead.com inquiries@rittmanmead.com @rittmanmead 
Information Store 
Reporting 
Discovery Lab 
Actionable 
Information 
Actionable 
Insights 
Input 
Events 
Execution 
Innovation 
Discovery 
Output 
Events 
& Data 
Data 
Reservoir 
Data Factory 
Information Solution
Design Pattern : Information Solution 
•Specific solution based on Big Data technologies requiring broader 
integration to the wider Information Management estate 
e.g. ETL pre-processor for the DW or affordably store a lower level of 
grain 
•Non-functional requirements more critical in this solution 
•Scalable integration to IM estate 
an important factor for success 
•Analysis may take place in Reservoir 
or Reservoir only used as an aggregator 
www.rittmanmead.com inquiries@rittmanmead.com @rittmanmead www.facebook.com/rittmanmead
Options for Sharing Hadoop Output with Wider Audience 
•During the discovery phase of a Hadoop project, audience are likely technical 
‣Most comfortable with data analyst tools, command-line, low-level access to the data 
•During the exploitation phase, audience will be less technical 
‣Emphasis on graphical tools, and integration with wider reporting toolset + metadata 
•Three main options for visualising and sharing Hadoop data 
1.Coming Soon - Oracle Big Data Discovery (Endeca on Hadoop) 
2.OBIEE reporting against Hadoop 
direct using Hive/Impala, or Oracle Big Data SQL 
3.OBIEE reporting against an export of the 
Hadoop data, on Exalytics / RDBMS 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
Oracle Big Data Discovery 
•Launched at Oracle Openworld 2014 as “The Visual Face of Hadoop” 
•Combined Endeca Server search + analytical technology with Spark data transformation
Interactive Analysis & Exploration of Hadoop Data 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Share and Collaborate on Big Data Discovery Projects 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Oracle Business Analytics and Big Data Sources 
•OBIEE 11g can also make use of big data sources 
‣OBIEE 11.1.1.7+ supports Hive/Hadoop as a data source 
‣Oracle R Enterprise can expose R models through DB functions, columns 
‣Oracle Exalytics has InfiniBand connectivity to Oracle BDA 
•Endeca Information Discovery can analyze unstructured and semi-structured sources 
‣Increasingly tighter-integration between 
OBIEE and Endeca 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
New in OBIEE 11.1.1.7 : Hadoop Connectivity through Hive 
•MapReduce jobs are typically written in Java, but Hive can make this simpler 
•Hive is a query environment over Hadoop/MapReduce to support SQL-like queries 
•Hive server accepts HiveQL queries via HiveODBC or HiveJDBC, automatically 
creates MapReduce jobs against data previously loaded into the Hive HDFS tables 
•Approach used by ODI and OBIEE to gain access to Hadoop data 
•Allows Hadoop data to be accessed just like any other data source 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Importing Hadoop/Hive Metadata into RPD 
•HiveODBC driver has to be installed into Windows environment, so that 
BI Administration tool can connect to Hive and return table metadata 
•Import as ODBC datasource, change physical DB type to Apache Hadoop afterwards 
•Note that OBIEE queries cannot span >1 Hive schema (no table prefixes) 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
1 
2 
3
Set up ODBC Connection at the OBIEE Server 
•OBIEE 11.1.1.7+ ships with HiveODBC drivers, need to use 7.x versions though (only Linux 
supported) 
•Configure the ODBC connection in odbc.ini, name needs to match RPD ODBC name 
•BI Server should then be able to connect to the Hive server, and Hadoop/MapReduce 
[ODBC Data Sources] 
AnalyticsWeb=Oracle BI Server 
Cluster=Oracle BI Server 
SSL_Sample=Oracle BI Server 
bigdatalite=Oracle 7.1 Apache Hive Wire Protocol 
[bigdatalite] 
Driver=/u01/app/Middleware/Oracle_BI1/common/ODBC/ 
Merant/7.0.1/lib/ARhive27.so 
Description=Oracle 7.1 Apache Hive Wire Protocol 
ArraySize=16384 
Database=default 
DefaultLongDataBuffLen=1024 
EnableLongDataBuffLen=1024 
EnableDescribeParam=0 
Hostname=bigdatalite 
LoginTimeout=30 
MaxVarcharSize=2000 
PortNumber=10000 
RemoveColumnQualifiers=0 
StringDescribeType=12 
TransactionMode=0 
UseCurrentSchema=0 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Dealing with Hadoop / Hive Latency Option 1 : Impala 
•Hadoop access through Hive can be slow - due to inherent latency in Hive 
•Hive queries use MapReduce in the background to query Hadoop 
•Spins-up Java VM on each query 
•Generates MapReduce job 
•Runs and collates the answer 
•Great for large, distributed queries ... 
•... but not so good for “speed-of-thought” dashboards 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Dealing with Hadoop / Hive Latency Option 1 : Use Impala 
•Hive is slow - because it’s meant to be used for batch-mode queries 
•Many companies / projects are trying to improve Hive - one of which is Cloudera 
•Cloudera Impala is an open-source but 
commercially-sponsored in-memory MPP platform 
•Replaces Hive and MapReduce in the Hadoop stack 
•Can we use this, instead of Hive, to access Hadoop? 
‣It will need to work with OBIEE 
‣Warning - it won’t be a supported data source (yet…) 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Demo 
Using Hive to Provide Data for OBIEE11g 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
How Impala Works 
•A replacement for Hive, but uses Hive concepts and 
data dictionary (metastore) 
•MPP (Massively Parallel Processing) query engine 
that runs within Hadoop 
‣Uses same file formats, security, 
resource management as Hadoop 
•Processes queries in-memory 
•Accesses standard HDFS file data 
•Option to use Apache AVRO, RCFile, 
LZO or Parquet (column-store) 
•Designed for interactive, real-time 
SQL-like access to Hadoop 
BI Server 
Presentation Svr 
Impala 
Hadoop 
HDFS etc 
Cloudera Impala 
ODBC Driver 
Impala 
Hadoop 
HDFS etc 
Impala 
Hadoop 
HDFS etc 
Impala 
Hadoop 
HDFS etc 
Impala 
Hadoop 
HDFS etc 
Multi-Node 
Hadoop Cluster
Connecting OBIEE 11.1.1.7 to Cloudera Impala 
•Warning - unsupported source - limited testing and no support from MOS 
•Requires Cloudera Impala ODBC drivers - Windows or Linux (RHEL etc/SLES) - 32/64 bit 
•ODBC Driver / DSN connection steps similar to Hive 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
Importing Impala Metadata 
•Import Impala tables (via the Hive metastore) into RPD 
•Set database type to “Apache Hadoop” 
‣Warning - don’t set ODBC type to Hadoop- leave at ODBC 2.0 
‣Create physical layer keys, joins etc as normal
Importing RPD using Impala Metadata 
•Create BMM layer, Presentation layer as normal 
•Use “View Rows” feature to check connectivity back to Impala / Hadoop 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Impala / OBIEE Issue with ORDER BY Clause 
•Although checking rows in the BI Administration tool worked, any query that aggregates 
data in the dashboard will fail 
•Issue is that Impala requires LIMIT with all ORDER BY clauses 
‣OBIEE could use LIMIT, but doesn’t for Impala 
at the moment (because not supported) 
•Workaround - disable ORDER BY in 
Database Features, have the BI Server do sorting 
‣Not ideal - but it works, until Impala supported 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Demo 
Using Impala to Provide Data for OBIEE11g 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
So Does Impala Work, as a Hive Substitute? 
•With ORDER BY disabled in DB features, it appears to 
•But not extensively tested by me, or Oracle 
•But it’s certainly interesting 
•Reduces 30s, 180s queries down to 1s, 10s etc 
•Impala, or one of the competitor projects 
(Drill, Dremel etc) assumed to be the 
real-time query replacement for Hive, in time 
‣Oracle announced planned support for 
Impala at OOW2013 - watch this space 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Dealing with Hadoop / Hive Latency Option 2 : Big Data SQL 
•Preferred solution for customers with Oracle Big Data Appliance is Big Data SQL 
•Oracle SQL Access to both relational, and Hive/NoSQL data sources 
•Exadata-type SmartScan against Hadoop datasets 
•Response-time equivalent to Impala or Hive on Tez 
•No issues around HiveQL limitations 
•Insulates end-users around differences 
between Oracle and Hive datasets 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
Oracle Big Data SQL 
•Part of Oracle Big Data 4.0 (BDA-only) 
‣Also requires Oracle Database 12c, Oracle Exadata Database Machine 
•Extends Oracle Data Dictionary to cover Hive 
•Extends Oracle SQL and SmartScan to Hadoop 
•Extends Oracle Security Model over Hadoop 
‣Fine-grained access control 
‣Data redaction, data masking 
‣Uses fast c-based readers where possible 
(vs. Hive MapReduce generation) 
‣Map Hadoop parallelism to Oracle PQ 
‣Big Data SQL engine works on top of YARN 
Exadata 
Storage Servers 
‣Like Spark, Tez, MR2 
Exadata Database 
Server 
Hadoop 
Cluster 
Oracle Big 
Data SQL 
SQL Queries 
SmartScan SmartScan
Big Data SQL Server Dataflow 
•Read data from HDFS Data Node 
‣Direct-path reads 
‣C-based readers when possible 
‣Use native Hadoop classes otherwise 
•Translate bytes to Oracle 
•Apply SmartScan to Oracle bytes 
‣Apply filters 
‣Project columns 
‣Parse JSON/XML 
‣Score models Disks% 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
Big$Data$SQL$Server$ 
Smart$Scan$ 
External$Table$Services$ 
SerDe% 
RecordReader% 
Data$Node$ 
10110010% 10110010% 10110010% 
3% 
2% 
1% 
1 
2 
3
View Hive Table Metadata in the Oracle Data Dictionary 
•Oracle Database 12c 12.1.0.2.0 with Big Data SQL option can view Hive table metadata 
‣Linked by Exadata configuration steps to one or more BDA clusters 
•DBA_HIVE_TABLES and USER_HIVE_TABLES exposes Hive metadata 
•Oracle SQL*Developer 4.0.3, with Cloudera Hive drivers, can connect to Hive metastore 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
SQL> col database_name for a30 
SQL> col table_name for a30 
SQL> select database_name, table_name 
2 from dba_hive_tables; 
! 
DATABASE_NAME TABLE_NAME 
------------------------------ ------------------------------ 
default access_per_post 
default access_per_post_categories 
default access_per_post_full 
default apachelog 
default categories 
default countries 
default cust 
default hive_raw_apache_access_log
Hive Access through Oracle External Tables + Hive Driver 
•Big Data SQL accesses Hive tables through external table mechanism 
‣ORACLE_HIVE external table type imports Hive metastore metadata 
‣ORACLE_HDFS requires metadata to be specified 
•Access parameters cluster and tablename specify Hive table source and BDA cluster 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
CREATE TABLE access_per_post_categories( 
hostname varchar2(100), 
request_date varchar2(100), 
post_id varchar2(10), 
title varchar2(200), 
author varchar2(100), 
category varchar2(100), 
ip_integer number) 
organization external 
(type oracle_hive 
default directory default_dir 
access parameters(com.oracle.bigdata.tablename=default.access_per_post_categories));
Demo 
Using Big Data SQL to Provide Data for OBIEE11g 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Alternative to Direct Against Hadoop : Export to Data Mart 
•In most cases, for general reporting access, exporting into RDBMS makes sense 
•Export Hive data from Hadoop into Oracle Data Mart or Data Warehouse 
•Use Oracle RDBMS for high-value data analysis, full access to RBDMS optimisations 
•Potentially use Exalytics for in-memory RBDMS access 
RDBMS 
Imports 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
Loading 
Stage 
!!!! 
Processing 
Stage 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
!!!! 
Store / Export 
Stage 
!!!! 
Real-Time 
Logs / Events 
File / 
Unstructured 
Imports 
File 
Exports 
RDBMS 
Exports
Using the Right Server for the Right Job 
•Hadoop for large scale, high-speed data ingestion and processing 
•Oracle RDBMS and Exadata for long-term storage of high-value data 
•Oracle Exalytics for speed-of-though analytics in TimesTen and Oracle Essbase 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Bulk Unload Summary Data to Oracle Database 
•Final requirement is to unload final Hive table contents to Oracle Database 
•Several use-cases for this: 
•Use Hadoop / BDA for ETL offloading 
•Use analysis capabilities of BDA, but then output results to RDBMS data mart or DW 
•Permit use of more advanced SQL query tools 
•Share results with other applications 
•Can use Sqoop for this, or use Oracle Big Data Connectors 
•Fast bulk unload, or transparent Oracle access to Hive 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
5
Oracle Direct Connector for HDFS 
•Enables HDFS as a data-source for Oracle Database external tables 
•Effectively provides Oracle SQL access over HDFS 
•Supports data query, or import into Oracle DB 
•Treat HDFS-stored files in the same way as regular files 
•But with HDFS’s low-cost 
•… and fault-tolerance 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Oracle Loader for Hadoop (OLH) 
•Oracle technology for accessing Hadoop data, and loading it into an Oracle database 
•Pushes data transformation, “heavy lifting” to the Hadoop cluster, using MapReduce 
•Direct-path loads into Oracle Database, partitioned and non-partitioned 
•Online and offline loads 
•Load from HDFS or Hive tables 
•Key technology for fast load of 
Hadoop results into Oracle DB 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
IKM File/Hive to Oracle (OLH/ODCH) 
•KM for accessing HDFS/Hive data from Oracle 
•Either sets up ODCH connectivity, or bulk-unloads via OLH 
•Map from HDFS or Hive source to Oracle tables (via Oracle technology in Topology) 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Environment Variable Requirements 
•Hardest part in setting up OLH / IKM File/Hive to Oracle is getting environment variables 
correct - OLH needs to be able to see correct JARs, configuration files 
•Set in /home/oracle/.bashrc - see example below 
export HIVE_HOME=/usr/lib/hive 
export HADOOP_CLASSPATH=/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/*:/etc/hive/conf:$HIVE_HOME/lib/ 
hive-metastore-0.12.0-cdh5.0.1.jar:$HIVE_HOME/lib/libthrift.jar:$HIVE_HOME/lib/libfb303-0.9.0.jar:$HIVE_HOME/ 
lib/hive-common-0.12.0-cdh5.0.1.jar:$HIVE_HOME/lib/hive-exec-0.12.0-cdh5.0.1.jar 
export OLH_HOME=/home/oracle/oracle/product/oraloader-3.0.0-h2 
export HADOOP_HOME=/usr/lib/hadoop 
export JAVA_HOME=/usr/java/jdk1.7.0_60 
export ODI_HIVE_SESSION_JARS=/usr/lib/hive/lib/hive-contrib.jar 
export ODI_OLH_JARS=/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/ojdbc6.jar,/home/oracle/oracle/ 
product/oraloader-3.0.0-h2/jlib/orai18n.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/orai18n-utility. 
jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/orai18n-mapping.jar,/home/oracle/oracle/ 
product/oraloader-3.0.0-h2/jlib/orai18n-collation.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/ 
oraclepki.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/osdt_cert.jar,/home/oracle/oracle/product/ 
oraloader-3.0.0-h2/jlib/osdt_core.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/commons-math- 
2.2.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/jackson-core-asl-1.8.8.jar,/home/oracle/ 
oracle/product/oraloader-3.0.0-h2/jlib/jackson-mapper-asl-1.8.8.jar,/home/oracle/oracle/product/ 
oraloader-3.0.0-h2/jlib/avro-1.7.3.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/avro-mapred-1.7.3- 
hadoop2.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/oraloader.jar,/usr/lib/hive/lib/hive-metastore. 
jar,/usr/lib/hive/lib/libthrift-0.9.0.cloudera.2.jar,/usr/lib/hive/lib/libfb303-0.9.0.jar,/usr/lib/ 
hive/lib/hive-common-0.12.0-cdh5.0.1.jar,/usr/lib/hive/lib/hive-exec.jar 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Configuring the KM Physical Settings 
•For the access table in Physical view, change LKM to LKM SQL Multi-Connect 
•Delegates the multi-connect capabilities to the downstream node, so you can use a multi-connect 
IKM such as IKM File/Hive to Oracle 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
Configuring the KM Physical Settings 
•For the target table, select IKM File/Hive to Oracle 
•Only becomes available to select once 
LKM SQL Multi-Connect selected for access table 
•Key option values to set are: 
•OLH_OUTPUT_MODE (use JDBC initially, OCI 
if Oracle Client installed on Hadoop client node) 
•MAPRED_OUTPUT_BASE_DIR (set to directory 
on HFDS that OS user running ODI can access) 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
Executing the Mapping 
•Executing the mapping will invoke 
OLH from the OS command line 
•Hive table (or HDFS file) contents 
copied to Oracle table
Demo 
Unloading Hive Data to Oracle using OLH / ODI12c 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com 
Thank You for Attending! 
•Thank you for attending this presentation, and more information can be found at http:// 
www.rittmanmead.com 
•Contact us at info@rittmanmead.com or mark.rittman@rittmanmead.com 
•Look out for our book, “Oracle Business Intelligence Developers Guide” out now! 
•Follow-us on Twitter (@rittmanmead) or Facebook (facebook.com/rittmanmead)
Lesson 4 : Hadoop Data Output 
and Reporting OBIEE 11g 
Mark Rittman, CTO, Rittman Mead 
SIOUG and HROUG Conferences, Oct 2014 
T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or 
+61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) 
E : info@rittmanmead.com 
W : www.rittmanmead.com

Weitere ähnliche Inhalte

Was ist angesagt?

Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
StampedeCon
 

Was ist angesagt? (20)

Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data ConnectorsDeep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
Deep-Dive into Big Data ETL with ODI12c and Oracle Big Data Connectors
 
What is Big Data Discovery, and how it complements traditional business anal...
What is Big Data Discovery, and how it complements  traditional business anal...What is Big Data Discovery, and how it complements  traditional business anal...
What is Big Data Discovery, and how it complements traditional business anal...
 
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
Delivering the Data Factory, Data Reservoir and a Scalable Oracle Big Data Ar...
 
Real-Time Data Replication to Hadoop using GoldenGate 12c Adaptors
Real-Time Data Replication to Hadoop using GoldenGate 12c AdaptorsReal-Time Data Replication to Hadoop using GoldenGate 12c Adaptors
Real-Time Data Replication to Hadoop using GoldenGate 12c Adaptors
 
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
Adding a Data Reservoir to your Oracle Data Warehouse for Customer 360-Degree...
 
2014 sept 26_thug_lambda_part1
2014 sept 26_thug_lambda_part12014 sept 26_thug_lambda_part1
2014 sept 26_thug_lambda_part1
 
Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016Innovation in the Data Warehouse - StampedeCon 2016
Innovation in the Data Warehouse - StampedeCon 2016
 
Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...Unlock the value in your big data reservoir using oracle big data discovery a...
Unlock the value in your big data reservoir using oracle big data discovery a...
 
Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming
Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data StreamingOracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming
Oracle GoldenGate and Apache Kafka: A Deep Dive Into Real-Time Data Streaming
 
Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016
 
2015 nov 27_thug_paytm_rt_ingest_brief_final
2015 nov 27_thug_paytm_rt_ingest_brief_final2015 nov 27_thug_paytm_rt_ingest_brief_final
2015 nov 27_thug_paytm_rt_ingest_brief_final
 
TimesTen - Beyond the Summary Advisor (ODTUG KScope'14)
TimesTen - Beyond the Summary Advisor (ODTUG KScope'14)TimesTen - Beyond the Summary Advisor (ODTUG KScope'14)
TimesTen - Beyond the Summary Advisor (ODTUG KScope'14)
 
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
Building a Next-gen Data Platform and Leveraging the OSS Ecosystem for Easy W...
 
Talend Big Data Capabilities - 2014
Talend Big Data Capabilities - 2014Talend Big Data Capabilities - 2014
Talend Big Data Capabilities - 2014
 
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data InsightSyncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
Syncsort, Tableau, & Cloudera present: Break the Barriers to Big Data Insight
 
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case StudyOracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
Oracle Big Data Spatial & Graph 
Social Media Analysis - Case Study
 
Flink in Zalando's world of Microservices
Flink in Zalando's world of Microservices   Flink in Zalando's world of Microservices
Flink in Zalando's world of Microservices
 
Interactive SQL-on-Hadoop and JethroData
Interactive SQL-on-Hadoop and JethroDataInteractive SQL-on-Hadoop and JethroData
Interactive SQL-on-Hadoop and JethroData
 
Hadoop at the Center: The Next Generation of Hadoop
Hadoop at the Center: The Next Generation of HadoopHadoop at the Center: The Next Generation of Hadoop
Hadoop at the Center: The Next Generation of Hadoop
 
What’s New in Spark 2.0: Structured Streaming and Datasets - StampedeCon 2016
What’s New in Spark 2.0: Structured Streaming and Datasets - StampedeCon 2016What’s New in Spark 2.0: Structured Streaming and Datasets - StampedeCon 2016
What’s New in Spark 2.0: Structured Streaming and Datasets - StampedeCon 2016
 

Andere mochten auch

SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData, the Spark Database. A unified cluster for streaming, transactions...SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData
 

Andere mochten auch (6)

SnappyData overview NikeTechTalk 11/19/15
SnappyData overview NikeTechTalk 11/19/15SnappyData overview NikeTechTalk 11/19/15
SnappyData overview NikeTechTalk 11/19/15
 
Exploratory Analysis in the Data Lab - Team-Sport or for Nerds only?
Exploratory Analysis in the Data Lab - Team-Sport or for Nerds only?Exploratory Analysis in the Data Lab - Team-Sport or for Nerds only?
Exploratory Analysis in the Data Lab - Team-Sport or for Nerds only?
 
SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData, the Spark Database. A unified cluster for streaming, transactions...SnappyData, the Spark Database. A unified cluster for streaming, transactions...
SnappyData, the Spark Database. A unified cluster for streaming, transactions...
 
1° Sessione Oracle CRUI: Analytics Data Lab, the power of Big Data Investiga...
1° Sessione Oracle CRUI: Analytics Data Lab,  the power of Big Data Investiga...1° Sessione Oracle CRUI: Analytics Data Lab,  the power of Big Data Investiga...
1° Sessione Oracle CRUI: Analytics Data Lab, the power of Big Data Investiga...
 
Creating truly personal omni-channel customer experiences by Brian Solis and ...
Creating truly personal omni-channel customer experiences by Brian Solis and ...Creating truly personal omni-channel customer experiences by Brian Solis and ...
Creating truly personal omni-channel customer experiences by Brian Solis and ...
 
Omni-Channel Retailing: The Future Trend in Fashion and Luxury Industry - Par...
Omni-Channel Retailing: The Future Trend in Fashion and Luxury Industry - Par...Omni-Channel Retailing: The Future Trend in Fashion and Luxury Industry - Par...
Omni-Channel Retailing: The Future Trend in Fashion and Luxury Industry - Par...
 

Ähnlich wie Part 4 - Hadoop Data Output and Reporting using OBIEE11g

Ougn2013 high speed, in-memory big data analysis with oracle exalytics
Ougn2013   high speed, in-memory big data analysis with oracle exalyticsOugn2013   high speed, in-memory big data analysis with oracle exalytics
Ougn2013 high speed, in-memory big data analysis with oracle exalytics
Mark Rittman
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Pentaho
 
Big Data & Oracle Technologies
Big Data & Oracle TechnologiesBig Data & Oracle Technologies
Big Data & Oracle Technologies
Oleksii Movchaniuk
 

Ähnlich wie Part 4 - Hadoop Data Output and Reporting using OBIEE11g (20)

Ougn2013 high speed, in-memory big data analysis with oracle exalytics
Ougn2013   high speed, in-memory big data analysis with oracle exalyticsOugn2013   high speed, in-memory big data analysis with oracle exalytics
Ougn2013 high speed, in-memory big data analysis with oracle exalytics
 
Deploying OBIEE in the Cloud - Oracle Openworld 2014
Deploying OBIEE in the Cloud - Oracle Openworld 2014Deploying OBIEE in the Cloud - Oracle Openworld 2014
Deploying OBIEE in the Cloud - Oracle Openworld 2014
 
ODI11g, Hadoop and "Big Data" Sources
ODI11g, Hadoop and "Big Data" SourcesODI11g, Hadoop and "Big Data" Sources
ODI11g, Hadoop and "Big Data" Sources
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big Data
 
ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013ODI 11g in the Enterprise - BIWA 2013
ODI 11g in the Enterprise - BIWA 2013
 
UKOUG Tech 15 - Migration from Oracle Warehouse Builder to Oracle Data Integr...
UKOUG Tech 15 - Migration from Oracle Warehouse Builder to Oracle Data Integr...UKOUG Tech 15 - Migration from Oracle Warehouse Builder to Oracle Data Integr...
UKOUG Tech 15 - Migration from Oracle Warehouse Builder to Oracle Data Integr...
 
Using Endeca with Oracle Exalytics - Oracle France BI Customer Event, October...
Using Endeca with Oracle Exalytics - Oracle France BI Customer Event, October...Using Endeca with Oracle Exalytics - Oracle France BI Customer Event, October...
Using Endeca with Oracle Exalytics - Oracle France BI Customer Event, October...
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)
 
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdfDataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
DataEng Mad - 03.03.2020 - Tibero 30-min Presentation.pdf
 
2-in-1 : RPD Magic and Hyperion Planning "Adapter"
2-in-1 : RPD Magic and Hyperion Planning "Adapter"2-in-1 : RPD Magic and Hyperion Planning "Adapter"
2-in-1 : RPD Magic and Hyperion Planning "Adapter"
 
Big Data & Oracle Technologies
Big Data & Oracle TechnologiesBig Data & Oracle Technologies
Big Data & Oracle Technologies
 
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
OBIEE11g Seminar by Mark Rittman for OU Expert Summit, Dubai 2015
 
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
A New “Sparkitecture” for Modernizing your Data Warehouse: Spark Summit East ...
 
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
Using Oracle Big Data SQL 3.0 to add Hadoop & NoSQL to your Oracle Data Wareh...
 
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
Oracle Openworld Presentation with Paul Kent (SAS) on Big Data Appliance and ...
 
Demystifying Data Warehouse as a Service
Demystifying Data Warehouse as a ServiceDemystifying Data Warehouse as a Service
Demystifying Data Warehouse as a Service
 
Exploring sql server 2016 bi
Exploring sql server 2016 biExploring sql server 2016 bi
Exploring sql server 2016 bi
 
Unlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator OptimizerUnlock Hadoop Success with Cloudera Navigator Optimizer
Unlock Hadoop Success with Cloudera Navigator Optimizer
 
Data Warehouse Offload
Data Warehouse OffloadData Warehouse Offload
Data Warehouse Offload
 
What it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! PerspectivesWhat it takes to run Hadoop at Scale: Yahoo! Perspectives
What it takes to run Hadoop at Scale: Yahoo! Perspectives
 

Mehr von Mark Rittman

Mehr von Mark Rittman (17)

The Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data PlatformsThe Future of Analytics, Data Integration and BI on Big Data Platforms
The Future of Analytics, Data Integration and BI on Big Data Platforms
 
Using Oracle Big Data Discovey as a Data Scientist's Toolkit
Using Oracle Big Data Discovey as a Data Scientist's ToolkitUsing Oracle Big Data Discovey as a Data Scientist's Toolkit
Using Oracle Big Data Discovey as a Data Scientist's Toolkit
 
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
From lots of reports (with some data Analysis) 
to Massive Data Analysis (Wit...
 
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
SQL-on-Hadoop for Analytics + BI: What Are My Options, What's the Future?
 
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
Social Network Analysis using Oracle Big Data Spatial & Graph (incl. why I di...
 
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
 
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
IlOUG Tech Days 2016 - Unlock the Value in your Data Reservoir using Oracle B...
 
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle CloudOTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
OTN EMEA Tour 2016 : Deploying Full BI Platforms to Oracle Cloud
 
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...OTN EMEA TOUR 2016  - OBIEE12c New Features for End-Users, Developers and Sys...
OTN EMEA TOUR 2016 - OBIEE12c New Features for End-Users, Developers and Sys...
 
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop : Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
 
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
Gluent New World #02 - SQL-on-Hadoop : A bit of History, Current State-of-the...
 
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business AnalyticsOracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
Oracle BI Hybrid BI : Mode 1 + Mode 2, Cloud + On-Premise Business Analytics
 
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...Riga dev day 2016   adding a data reservoir and oracle bdd to extend your ora...
Riga dev day 2016 adding a data reservoir and oracle bdd to extend your ora...
 
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive AnalyticsBig Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
Big Data for Oracle Devs - Towards Spark, Real-Time and Predictive Analytics
 
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
OBIEE12c and Embedded Essbase 12c - An Initial Look at Query Acceleration Use...
 
Deploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle CloudDeploying Full BI Platforms to Oracle Cloud
Deploying Full BI Platforms to Oracle Cloud
 
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
Deploying Full Oracle BI Platforms to Oracle Cloud - OOW2015
 

Kürzlich hochgeladen

The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
shinachiaurasa2
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
9953056974 Low Rate Call Girls In Saket, Delhi NCR
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
mohitmore19
 

Kürzlich hochgeladen (20)

Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 18, Noida Call girls :8448380779 Model Escorts | 100% verified
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
The title is not connected to what is inside
The title is not connected to what is insideThe title is not connected to what is inside
The title is not connected to what is inside
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...Chinsurah Escorts ☎️8617697112  Starting From 5K to 15K High Profile Escorts ...
Chinsurah Escorts ☎️8617697112 Starting From 5K to 15K High Profile Escorts ...
 
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
MarTech Trend 2024 Book : Marketing Technology Trends (2024 Edition) How Data...
 

Part 4 - Hadoop Data Output and Reporting using OBIEE11g

  • 1. Lesson 4 : Hadoop Data Output and Reporting using OBIEE 11g Mark Rittman, CTO, Rittman Mead SIOUG and HROUG Conferences, Oct 2014 T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 2. Moving Data In, Around and Out of Hadoop •Three stages to Hadoop data movement, with dedicated Apache / other tools ‣Load : receive files in batch, or in real-time (logs, events) ‣Transform : process & transform data to answer questions ‣Store / Export : store in structured form, or export to RDBMS using Sqoop RDBMS Imports T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) Loading Stage !!!! Processing Stage E : info@rittmanmead.com W : www.rittmanmead.com !!!! Store / Export Stage !!!! Real-Time Logs / Events File / Unstructured Imports File Exports RDBMS Exports
  • 3. Discovery vs. Exploitation Project Phases •Discovery and monetising steps in Big Data projects have different requirements •Discovery phase ‣Unbounded discovery ‣Self-Service sandbox ‣Wide toolset •Promotion to Exploitation ‣Commercial exploitation ‣Narrower toolset ‣Integration to operations ‣Non-functional requirements ‣Code standardisation & governance T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 4. Actionable Events Event Engine Enterprise www.rittmanmead.com inquiries@rittmanmead.com @rittmanmead Information Store Reporting Discovery Lab Actionable Information Actionable Insights Input Events Execution Innovation Discovery Output Events & Data Structured Enterprise Data Other Data Data Reservoir Data Factory Information Solution
  • 5. Actionable Events Event Engine Enterprise www.rittmanmead.com inquiries@rittmanmead.com @rittmanmead Information Store Reporting Discovery Lab Actionable Information Actionable Insights Input Events Execution Innovation Discovery Output Events & Data Data Reservoir Data Factory Information Solution
  • 6. Design Pattern : Information Solution •Specific solution based on Big Data technologies requiring broader integration to the wider Information Management estate e.g. ETL pre-processor for the DW or affordably store a lower level of grain •Non-functional requirements more critical in this solution •Scalable integration to IM estate an important factor for success •Analysis may take place in Reservoir or Reservoir only used as an aggregator www.rittmanmead.com inquiries@rittmanmead.com @rittmanmead www.facebook.com/rittmanmead
  • 7. Options for Sharing Hadoop Output with Wider Audience •During the discovery phase of a Hadoop project, audience are likely technical ‣Most comfortable with data analyst tools, command-line, low-level access to the data •During the exploitation phase, audience will be less technical ‣Emphasis on graphical tools, and integration with wider reporting toolset + metadata •Three main options for visualising and sharing Hadoop data 1.Coming Soon - Oracle Big Data Discovery (Endeca on Hadoop) 2.OBIEE reporting against Hadoop direct using Hive/Impala, or Oracle Big Data SQL 3.OBIEE reporting against an export of the Hadoop data, on Exalytics / RDBMS T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 8. T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com Oracle Big Data Discovery •Launched at Oracle Openworld 2014 as “The Visual Face of Hadoop” •Combined Endeca Server search + analytical technology with Spark data transformation
  • 9. Interactive Analysis & Exploration of Hadoop Data T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 10. Share and Collaborate on Big Data Discovery Projects T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 11. Oracle Business Analytics and Big Data Sources •OBIEE 11g can also make use of big data sources ‣OBIEE 11.1.1.7+ supports Hive/Hadoop as a data source ‣Oracle R Enterprise can expose R models through DB functions, columns ‣Oracle Exalytics has InfiniBand connectivity to Oracle BDA •Endeca Information Discovery can analyze unstructured and semi-structured sources ‣Increasingly tighter-integration between OBIEE and Endeca T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 12. New in OBIEE 11.1.1.7 : Hadoop Connectivity through Hive •MapReduce jobs are typically written in Java, but Hive can make this simpler •Hive is a query environment over Hadoop/MapReduce to support SQL-like queries •Hive server accepts HiveQL queries via HiveODBC or HiveJDBC, automatically creates MapReduce jobs against data previously loaded into the Hive HDFS tables •Approach used by ODI and OBIEE to gain access to Hadoop data •Allows Hadoop data to be accessed just like any other data source T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 13. Importing Hadoop/Hive Metadata into RPD •HiveODBC driver has to be installed into Windows environment, so that BI Administration tool can connect to Hive and return table metadata •Import as ODBC datasource, change physical DB type to Apache Hadoop afterwards •Note that OBIEE queries cannot span >1 Hive schema (no table prefixes) T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com 1 2 3
  • 14. Set up ODBC Connection at the OBIEE Server •OBIEE 11.1.1.7+ ships with HiveODBC drivers, need to use 7.x versions though (only Linux supported) •Configure the ODBC connection in odbc.ini, name needs to match RPD ODBC name •BI Server should then be able to connect to the Hive server, and Hadoop/MapReduce [ODBC Data Sources] AnalyticsWeb=Oracle BI Server Cluster=Oracle BI Server SSL_Sample=Oracle BI Server bigdatalite=Oracle 7.1 Apache Hive Wire Protocol [bigdatalite] Driver=/u01/app/Middleware/Oracle_BI1/common/ODBC/ Merant/7.0.1/lib/ARhive27.so Description=Oracle 7.1 Apache Hive Wire Protocol ArraySize=16384 Database=default DefaultLongDataBuffLen=1024 EnableLongDataBuffLen=1024 EnableDescribeParam=0 Hostname=bigdatalite LoginTimeout=30 MaxVarcharSize=2000 PortNumber=10000 RemoveColumnQualifiers=0 StringDescribeType=12 TransactionMode=0 UseCurrentSchema=0 T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 15. Dealing with Hadoop / Hive Latency Option 1 : Impala •Hadoop access through Hive can be slow - due to inherent latency in Hive •Hive queries use MapReduce in the background to query Hadoop •Spins-up Java VM on each query •Generates MapReduce job •Runs and collates the answer •Great for large, distributed queries ... •... but not so good for “speed-of-thought” dashboards T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 16. Dealing with Hadoop / Hive Latency Option 1 : Use Impala •Hive is slow - because it’s meant to be used for batch-mode queries •Many companies / projects are trying to improve Hive - one of which is Cloudera •Cloudera Impala is an open-source but commercially-sponsored in-memory MPP platform •Replaces Hive and MapReduce in the Hadoop stack •Can we use this, instead of Hive, to access Hadoop? ‣It will need to work with OBIEE ‣Warning - it won’t be a supported data source (yet…) T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 17. Demo Using Hive to Provide Data for OBIEE11g T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 18. T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com How Impala Works •A replacement for Hive, but uses Hive concepts and data dictionary (metastore) •MPP (Massively Parallel Processing) query engine that runs within Hadoop ‣Uses same file formats, security, resource management as Hadoop •Processes queries in-memory •Accesses standard HDFS file data •Option to use Apache AVRO, RCFile, LZO or Parquet (column-store) •Designed for interactive, real-time SQL-like access to Hadoop BI Server Presentation Svr Impala Hadoop HDFS etc Cloudera Impala ODBC Driver Impala Hadoop HDFS etc Impala Hadoop HDFS etc Impala Hadoop HDFS etc Impala Hadoop HDFS etc Multi-Node Hadoop Cluster
  • 19. Connecting OBIEE 11.1.1.7 to Cloudera Impala •Warning - unsupported source - limited testing and no support from MOS •Requires Cloudera Impala ODBC drivers - Windows or Linux (RHEL etc/SLES) - 32/64 bit •ODBC Driver / DSN connection steps similar to Hive T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 20. T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com Importing Impala Metadata •Import Impala tables (via the Hive metastore) into RPD •Set database type to “Apache Hadoop” ‣Warning - don’t set ODBC type to Hadoop- leave at ODBC 2.0 ‣Create physical layer keys, joins etc as normal
  • 21. Importing RPD using Impala Metadata •Create BMM layer, Presentation layer as normal •Use “View Rows” feature to check connectivity back to Impala / Hadoop T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 22. Impala / OBIEE Issue with ORDER BY Clause •Although checking rows in the BI Administration tool worked, any query that aggregates data in the dashboard will fail •Issue is that Impala requires LIMIT with all ORDER BY clauses ‣OBIEE could use LIMIT, but doesn’t for Impala at the moment (because not supported) •Workaround - disable ORDER BY in Database Features, have the BI Server do sorting ‣Not ideal - but it works, until Impala supported T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 23. Demo Using Impala to Provide Data for OBIEE11g T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 24. So Does Impala Work, as a Hive Substitute? •With ORDER BY disabled in DB features, it appears to •But not extensively tested by me, or Oracle •But it’s certainly interesting •Reduces 30s, 180s queries down to 1s, 10s etc •Impala, or one of the competitor projects (Drill, Dremel etc) assumed to be the real-time query replacement for Hive, in time ‣Oracle announced planned support for Impala at OOW2013 - watch this space T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 25. Dealing with Hadoop / Hive Latency Option 2 : Big Data SQL •Preferred solution for customers with Oracle Big Data Appliance is Big Data SQL •Oracle SQL Access to both relational, and Hive/NoSQL data sources •Exadata-type SmartScan against Hadoop datasets •Response-time equivalent to Impala or Hive on Tez •No issues around HiveQL limitations •Insulates end-users around differences between Oracle and Hive datasets T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 26. T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com Oracle Big Data SQL •Part of Oracle Big Data 4.0 (BDA-only) ‣Also requires Oracle Database 12c, Oracle Exadata Database Machine •Extends Oracle Data Dictionary to cover Hive •Extends Oracle SQL and SmartScan to Hadoop •Extends Oracle Security Model over Hadoop ‣Fine-grained access control ‣Data redaction, data masking ‣Uses fast c-based readers where possible (vs. Hive MapReduce generation) ‣Map Hadoop parallelism to Oracle PQ ‣Big Data SQL engine works on top of YARN Exadata Storage Servers ‣Like Spark, Tez, MR2 Exadata Database Server Hadoop Cluster Oracle Big Data SQL SQL Queries SmartScan SmartScan
  • 27. Big Data SQL Server Dataflow •Read data from HDFS Data Node ‣Direct-path reads ‣C-based readers when possible ‣Use native Hadoop classes otherwise •Translate bytes to Oracle •Apply SmartScan to Oracle bytes ‣Apply filters ‣Project columns ‣Parse JSON/XML ‣Score models Disks% T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com Big$Data$SQL$Server$ Smart$Scan$ External$Table$Services$ SerDe% RecordReader% Data$Node$ 10110010% 10110010% 10110010% 3% 2% 1% 1 2 3
  • 28. View Hive Table Metadata in the Oracle Data Dictionary •Oracle Database 12c 12.1.0.2.0 with Big Data SQL option can view Hive table metadata ‣Linked by Exadata configuration steps to one or more BDA clusters •DBA_HIVE_TABLES and USER_HIVE_TABLES exposes Hive metadata •Oracle SQL*Developer 4.0.3, with Cloudera Hive drivers, can connect to Hive metastore T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com SQL> col database_name for a30 SQL> col table_name for a30 SQL> select database_name, table_name 2 from dba_hive_tables; ! DATABASE_NAME TABLE_NAME ------------------------------ ------------------------------ default access_per_post default access_per_post_categories default access_per_post_full default apachelog default categories default countries default cust default hive_raw_apache_access_log
  • 29. Hive Access through Oracle External Tables + Hive Driver •Big Data SQL accesses Hive tables through external table mechanism ‣ORACLE_HIVE external table type imports Hive metastore metadata ‣ORACLE_HDFS requires metadata to be specified •Access parameters cluster and tablename specify Hive table source and BDA cluster T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com CREATE TABLE access_per_post_categories( hostname varchar2(100), request_date varchar2(100), post_id varchar2(10), title varchar2(200), author varchar2(100), category varchar2(100), ip_integer number) organization external (type oracle_hive default directory default_dir access parameters(com.oracle.bigdata.tablename=default.access_per_post_categories));
  • 30. Demo Using Big Data SQL to Provide Data for OBIEE11g T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 31. Alternative to Direct Against Hadoop : Export to Data Mart •In most cases, for general reporting access, exporting into RDBMS makes sense •Export Hive data from Hadoop into Oracle Data Mart or Data Warehouse •Use Oracle RDBMS for high-value data analysis, full access to RBDMS optimisations •Potentially use Exalytics for in-memory RBDMS access RDBMS Imports T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) Loading Stage !!!! Processing Stage E : info@rittmanmead.com W : www.rittmanmead.com !!!! Store / Export Stage !!!! Real-Time Logs / Events File / Unstructured Imports File Exports RDBMS Exports
  • 32. Using the Right Server for the Right Job •Hadoop for large scale, high-speed data ingestion and processing •Oracle RDBMS and Exadata for long-term storage of high-value data •Oracle Exalytics for speed-of-though analytics in TimesTen and Oracle Essbase T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 33. Bulk Unload Summary Data to Oracle Database •Final requirement is to unload final Hive table contents to Oracle Database •Several use-cases for this: •Use Hadoop / BDA for ETL offloading •Use analysis capabilities of BDA, but then output results to RDBMS data mart or DW •Permit use of more advanced SQL query tools •Share results with other applications •Can use Sqoop for this, or use Oracle Big Data Connectors •Fast bulk unload, or transparent Oracle access to Hive T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com 5
  • 34. Oracle Direct Connector for HDFS •Enables HDFS as a data-source for Oracle Database external tables •Effectively provides Oracle SQL access over HDFS •Supports data query, or import into Oracle DB •Treat HDFS-stored files in the same way as regular files •But with HDFS’s low-cost •… and fault-tolerance T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 35. Oracle Loader for Hadoop (OLH) •Oracle technology for accessing Hadoop data, and loading it into an Oracle database •Pushes data transformation, “heavy lifting” to the Hadoop cluster, using MapReduce •Direct-path loads into Oracle Database, partitioned and non-partitioned •Online and offline loads •Load from HDFS or Hive tables •Key technology for fast load of Hadoop results into Oracle DB T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 36. IKM File/Hive to Oracle (OLH/ODCH) •KM for accessing HDFS/Hive data from Oracle •Either sets up ODCH connectivity, or bulk-unloads via OLH •Map from HDFS or Hive source to Oracle tables (via Oracle technology in Topology) T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 37. Environment Variable Requirements •Hardest part in setting up OLH / IKM File/Hive to Oracle is getting environment variables correct - OLH needs to be able to see correct JARs, configuration files •Set in /home/oracle/.bashrc - see example below export HIVE_HOME=/usr/lib/hive export HADOOP_CLASSPATH=/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/*:/etc/hive/conf:$HIVE_HOME/lib/ hive-metastore-0.12.0-cdh5.0.1.jar:$HIVE_HOME/lib/libthrift.jar:$HIVE_HOME/lib/libfb303-0.9.0.jar:$HIVE_HOME/ lib/hive-common-0.12.0-cdh5.0.1.jar:$HIVE_HOME/lib/hive-exec-0.12.0-cdh5.0.1.jar export OLH_HOME=/home/oracle/oracle/product/oraloader-3.0.0-h2 export HADOOP_HOME=/usr/lib/hadoop export JAVA_HOME=/usr/java/jdk1.7.0_60 export ODI_HIVE_SESSION_JARS=/usr/lib/hive/lib/hive-contrib.jar export ODI_OLH_JARS=/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/ojdbc6.jar,/home/oracle/oracle/ product/oraloader-3.0.0-h2/jlib/orai18n.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/orai18n-utility. jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/orai18n-mapping.jar,/home/oracle/oracle/ product/oraloader-3.0.0-h2/jlib/orai18n-collation.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/ oraclepki.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/osdt_cert.jar,/home/oracle/oracle/product/ oraloader-3.0.0-h2/jlib/osdt_core.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/commons-math- 2.2.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/jackson-core-asl-1.8.8.jar,/home/oracle/ oracle/product/oraloader-3.0.0-h2/jlib/jackson-mapper-asl-1.8.8.jar,/home/oracle/oracle/product/ oraloader-3.0.0-h2/jlib/avro-1.7.3.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/avro-mapred-1.7.3- hadoop2.jar,/home/oracle/oracle/product/oraloader-3.0.0-h2/jlib/oraloader.jar,/usr/lib/hive/lib/hive-metastore. jar,/usr/lib/hive/lib/libthrift-0.9.0.cloudera.2.jar,/usr/lib/hive/lib/libfb303-0.9.0.jar,/usr/lib/ hive/lib/hive-common-0.12.0-cdh5.0.1.jar,/usr/lib/hive/lib/hive-exec.jar T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 38. Configuring the KM Physical Settings •For the access table in Physical view, change LKM to LKM SQL Multi-Connect •Delegates the multi-connect capabilities to the downstream node, so you can use a multi-connect IKM such as IKM File/Hive to Oracle T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 39. Configuring the KM Physical Settings •For the target table, select IKM File/Hive to Oracle •Only becomes available to select once LKM SQL Multi-Connect selected for access table •Key option values to set are: •OLH_OUTPUT_MODE (use JDBC initially, OCI if Oracle Client installed on Hadoop client node) •MAPRED_OUTPUT_BASE_DIR (set to directory on HFDS that OS user running ODI can access) T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 40. T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com Executing the Mapping •Executing the mapping will invoke OLH from the OS command line •Hive table (or HDFS file) contents copied to Oracle table
  • 41. Demo Unloading Hive Data to Oracle using OLH / ODI12c T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com
  • 42. T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com Thank You for Attending! •Thank you for attending this presentation, and more information can be found at http:// www.rittmanmead.com •Contact us at info@rittmanmead.com or mark.rittman@rittmanmead.com •Look out for our book, “Oracle Business Intelligence Developers Guide” out now! •Follow-us on Twitter (@rittmanmead) or Facebook (facebook.com/rittmanmead)
  • 43. Lesson 4 : Hadoop Data Output and Reporting OBIEE 11g Mark Rittman, CTO, Rittman Mead SIOUG and HROUG Conferences, Oct 2014 T : +44 (0) 1273 911 268 (UK) or (888) 631-1410 (USA) or +61 3 9596 7186 (Australia & New Zealand) or +91 997 256 7970 (India) E : info@rittmanmead.com W : www.rittmanmead.com