OWB11gR2 - Extending ETL

Extend ETL to Heterogeneous and Unstructured Data Sources
Suraj Bang, BI Consultant

Databases
Non-Databases: XML files, CSV files, HTML files, PDF documents, Web Services

Name of the Platform

JDBC Driver Class

Source data type to OWB generic data type

Variable Action (SQL function to calculate new value)

TO_CHAR(MAX(LAST_UPDATE_DATE),'MM/DD/YYYY HH24:MI:SS.FF3')

Updates Variable with New Value

SQL Statement, Command Line

Simple design with a Code Template (CT) assigned

SQL Statement executes at source

Data types mapped to OWB Generic Data types

Defines which kinds of objects can be imported: tables, views, etc.

An API-based CMI implements the OWB interface oracle.wh.service.sdk.integrator.MetadataImport

Uses the Location Details to create metadata objects

The getColumns routine gets the PDF form field names from the PDF document. These are represented as columns in the metadata.

iText API: reader.getAcroForm().getFields() collects the form-field information from the PDF document

All the form fields are generated as columns in the table

JDBC Stub Driver to treat the PDF location as a JDBC source and register the location

URL field used for reverse engineering the PDF document by CMI

Internal Names of the PDF form fields extracted from the PDF Document

Assigned Code Template for PDF Extraction

Inserts a row for every processed PDF

iText API to extract data from PDF document

The LCT parses the tags to capture metadata and data

FILENAME  HEADER_NAME  HEADER_VALUE  RW_NBR
MSERVLOC  Backup Host  testhost      1
MSERVLOC  Backup IP    1.1.1.1       1
MSERVLOC  Days         Mon,Wed,Fri   1
MSERVLOC  Start Time   10:00         1
MSERVLOC  End Time     11:00         1
MSERVLOC  Location     rosh          1
MSERVLOC  Backup Host  hosttest      2
MSERVLOC  Backup IP    2.2.2.2       2
MSERVLOC  Days         Tue,Thu       2
MSERVLOC  Start Time   11:00         2
MSERVLOC  End Time     12:30         2
MSERVLOC  Location     hugh          2

The UNPIVOT and AGGREGATOR operators bring all the data into tabular format
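To illustrate what reshaping name/value rows into tabular form means, here is a minimal Java sketch. It is not the OWB mapping itself. It assumes work-table rows with the columns FILENAME, HEADER_NAME, HEADER_VALUE and RW_NBR, and the class and method names (HeaderPivot, Row, pivot) are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the deck performs this step with OWB operators.
final class HeaderPivot {

    /** One name/value row of the work table. */
    static final class Row {
        final String fileName;
        final String headerName;
        final String headerValue;
        final int rowNumber;

        Row(String fileName, String headerName, String headerValue, int rowNumber) {
            this.fileName = fileName;
            this.headerName = headerName;
            this.headerValue = headerValue;
            this.rowNumber = rowNumber;
        }
    }

    /**
     * Groups rows by FILENAME and RW_NBR (key "FILENAME:RW_NBR") and turns
     * every HEADER_NAME into a column holding its HEADER_VALUE.
     */
    static Map<String, Map<String, String>> pivot(List<Row> rows) {
        Map<String, Map<String, String>> records = new LinkedHashMap<>();
        for (Row r : rows) {
            String key = r.fileName + ":" + r.rowNumber;
            records.computeIfAbsent(key, k -> new LinkedHashMap<>())
                   .put(r.headerName, r.headerValue);
        }
        return records;
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("MSERVLOC", "Backup Host", "testhost", 1));
        rows.add(new Row("MSERVLOC", "Backup IP", "1.1.1.1", 1));
        rows.add(new Row("MSERVLOC", "Backup Host", "hosttest", 2));
        rows.add(new Row("MSERVLOC", "Backup IP", "2.2.2.2", 2));
        // One output line per logical record, columns in first-seen order.
        pivot(rows).forEach((key, cols) -> System.out.println(key + " -> " + cols));
    }
}
```

Each distinct RW_NBR becomes one record, which is the effect the mapping is after.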

Input to the LCT specifies the number of columns in the HTML document (table tags)

Uses the HTMLLIB library to parse <td> tags

Inserts Data into Work table for every <td> tag
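As a rough sketch of the idea, not the LCT's actual code, the following Java example walks the <td> cells of a document and derives a row number for each cell from a column-count input. A regular expression stands in for the HTMLLIB library, whose API the slides do not show. The names TdCells and Cell are hypothetical, and a real HTML parser would be more robust than this pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; regex parsing is a stand-in for HTMLLIB.
final class TdCells {

    /** One record per <td> tag, as it might be staged in a work table. */
    static final class Cell {
        final int rowNumber;    // 1-based logical row
        final int columnIndex;  // 1-based position within the row
        final String text;

        Cell(int rowNumber, int columnIndex, String text) {
            this.rowNumber = rowNumber;
            this.columnIndex = columnIndex;
            this.text = text;
        }
    }

    private static final Pattern TD = Pattern.compile(
            "<td\\b[^>]*>(.*?)</td>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Emits one Cell per <td>; columnCount decides where a new row starts. */
    static List<Cell> cells(String html, int columnCount) {
        if (columnCount <= 0) {
            throw new IllegalArgumentException("columnCount must be positive");
        }
        List<Cell> out = new ArrayList<>();
        Matcher m = TD.matcher(html);
        int index = 0;
        while (m.find()) {
            // Drop nested markup such as <b>...</b> and surrounding whitespace.
            String text = m.group(1).replaceAll("<[^>]+>", "").trim();
            out.add(new Cell(index / columnCount + 1, index % columnCount + 1, text));
            index++;
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<table><tr><td>Backup Host</td><td>testhost</td></tr>"
                    + "<tr><td>Backup IP</td><td><b>1.1.1.1</b></td></tr></table>";
        for (Cell c : cells(html, 2)) {
            System.out.println(c.rowNumber + "," + c.columnIndex + "," + c.text);
        }
    }
}
```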

Web Service lists.asmx SOAP Actions: Get/Add/Update/Delete list, Get/Add/Update/Delete list items, Add Attachments, etc.

HTTP POST operation to invoke the web service. Placeholders are replaced by values from tables in the database.

XML with CAML
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema"
               xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <UpdateListItems xmlns="http://schemas.microsoft.com/sharepoint/soap/">
      <listName>%s</listName>
      <updates>
        <Batch>
          <Method ID='1' Cmd='New'>
            <Field Name='Title'> %S </Field>
            <Field Name='AssignedTo'> %S </Field>
            <Field Name='Status'> %S </Field>
            … ..
          </Method>
        </Batch>
      </updates>
    </UpdateListItems>
  </soap:Body>
</soap:Envelope>

Input Parameters: Web Service End Point, SOAP Action, SOAP Content Type, Parallel Threads, BASE64 Encoded Field

Input Parameter: SOAP XML Format
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema"
               xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <AddList xmlns="http://schemas.microsoft.com/sharepoint/soap/">
      <listName> %S </listName>
      <description> %S </description>
      <templateID> %S </templateID>
    </AddList>
  </soap:Body>
</soap:Envelope>

Java BeanShell populates the XML for every row
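One way to picture the per-row placeholder substitution is the following plain-Java sketch. It is an illustration only, not the code template's BeanShell script: the SoapTemplate class and its methods are hypothetical, and it assumes positional %s placeholders. Note that in java.util.Formatter an upper-case %S upper-cases the substituted value, so the sketch uses lower-case %s. It also XML-escapes each value, which a production template would need in some form.

```java
// Hypothetical illustration of filling a SOAP template from one source row.
final class SoapTemplate {

    /** Escapes the five characters that are special in XML text and attributes. */
    static String escapeXml(String value) {
        return value.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;")
                    .replace("'", "&apos;");
    }

    /** Replaces each %s in the template, in order, with the escaped row value. */
    static String fill(String template, String... values) {
        Object[] escaped = new Object[values.length];
        for (int i = 0; i < values.length; i++) {
            escaped[i] = escapeXml(values[i]);
        }
        return String.format(template, escaped);
    }

    public static void main(String[] args) {
        String template =
              "<AddList xmlns=\"http://schemas.microsoft.com/sharepoint/soap/\">"
            + "<listName>%s</listName>"
            + "<description>%s</description>"
            + "<templateID>%s</templateID>"
            + "</AddList>";
        // The values would come from one row of the source table (made up here).
        System.out.println(fill(template, "Issues", "Open issues & risks", "100"));
    }
}
```

If a row supplies fewer values than the template has placeholders, String.format throws an exception, so a mismatch fails visibly rather than sending an incomplete request.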

Uses the Input Parameter to setup the web service call

XML returned after invoking the web service
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Body>
    <GetListItemsResponse xmlns="http://schemas.microsoft.com/sharepoint/soap/">
      <GetListItemsResult>
        <listitems xmlns:s='uuid:BDC6E3F0-6DA3-11d1-A2A3-00AA00C14882'
                   xmlns:dt='uuid:C2F41010-65B3-11d1-A29F-00AA00C14882'
                   xmlns:rs='urn:schemas-microsoft-com:rowset'
                   xmlns:z='#RowsetSchema'>
          <rs:data ItemCount="2">
            <z:row ows_Attachments='0' ows_LinkIssueIDNoMenu='1' ows_LinkTitle='ODI-EE'
                   ows_Status='Active' ows_Priority='(2) Normal' ows_MetaInfo='1;#'
                   ows__ModerationStatus='0' ows__Level='1' ows_Title='ODI-EE' ows_ID='1'
                   ows_owshiddenversion='1'
                   ows_UniqueId='1;#{962F968C-6C61-4097-91C3-5A2899C5F8B4}'
                   ows_FSObjType='1;#0'
                   ows_Created_x0020_Date='1;#2010-07-23 15:51:36'
                   ows_Created='2010-08-23 15:51:36'
                   ows_FileLeafRef='1;#1_.000'
                   ows_FileRef='1;#yoursite/yoursubsite/Lists/Test/1_.000' />
          </rs:data>
        </listitems>
      </GetListItemsResult>
    </GetListItemsResponse>
  </soap:Body>
</soap:Envelope>
Data in the returned XML
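The data in a response like this sits in the attributes of each z:row element. As a hedged sketch of how it could be pulled out with the JDK's built-in DOM parser, consider the example below. The slides do not show how the code template does this, the class name ListItemsParser is hypothetical, and the code assumes the rows carry the z: prefix, as in the sample.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration; assumes rows appear as <z:row .../> elements.
final class ListItemsParser {

    /** Returns one map of attribute name to value for every z:row element. */
    static List<Map<String, String>> rows(String xml) {
        try {
            // The default factory is not namespace-aware, so elements are
            // matched by their qualified name "z:row".
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("z:row");
            List<Map<String, String>> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                NamedNodeMap attrs = nodes.item(i).getAttributes();
                Map<String, String> row = new TreeMap<>();
                for (int j = 0; j < attrs.getLength(); j++) {
                    Node attr = attrs.item(j);
                    row.put(attr.getNodeName(), attr.getNodeValue());
                }
                out.add(row);
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse list items XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<listitems xmlns:rs='urn:schemas-microsoft-com:rowset' "
                   + "xmlns:z='#RowsetSchema'><rs:data ItemCount='1'>"
                   + "<z:row ows_ID='1' ows_Title='ODI-EE' ows_Status='Active'/>"
                   + "</rs:data></listitems>";
        for (Map<String, String> row : rows(xml)) {
            System.out.println(row.get("ows_ID") + " " + row.get("ows_Title"));
        }
    }
}
```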

Unbounded View using Inline SQL

LCT Invokes Web Service for every row

Return XML parsed to get Conversion Rate

TO_NUMBER(EXTRACTVALUE( INGRP1.SOAP_XML ,'//ConversionRateResult/text()', 'xmlns="http://www.webserviceX.NET/"'))
