4. Outline
●
A Classic Use Case
●
What’s ETL and How It Is Interpreted In The Modern World?
●
Why ETL?
●
Challenges In Implementing ETL Solutions
●
Why Are Traditional Standalone ETL Products Considered
Dead In The Modern World?
●
What Factors Should Be Considered When Implementing ETL
While Re-Architecting A System?
5. Outline (contd.)
●
Impact Of Tooling
●
Reference Architecture
○
How to build an “efficient, robust, scalable, auditable,
performing and maintainable” ETL solution with WSO2
EMP?
●
Demo - Data Mapping With WSO2 Developer Studio
●
Summary
●
Q&A
6. A Classic Use Case - Financial Sector
[Diagram: data sources (flat files, RDBMS, XML, web services) feed an ETL process that loads an Enterprise Data Warehouse, which in turn serves financial reporting, revenue predictions, and other analytics & BI fronts.]
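The flow in this use case can be sketched in a few lines of Python. This is a minimal illustration only, not the architecture presented in this deck: the sample data, the table name `fact_txn`, and all function names are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample inputs standing in for a flat file and an XML feed.
FLAT_FILE = "account_id,amount\nA-1,100.50\nA-2,20\n"
XML_FEED = "<txns><txn account='a-1' amount='9.50'/><txn account='A-3' amount='5'/></txns>"


def extract_flat_file(text):
    """Extract: yield raw rows (dicts) from CSV text."""
    yield from csv.DictReader(io.StringIO(text))


def extract_xml(text):
    """Extract: yield raw attribute dicts from each <txn> element."""
    for node in ET.fromstring(text).iter("txn"):
        yield dict(node.attrib)


def transform(row):
    """Transform: map either source shape onto one (account, cents) record."""
    account = row.get("account_id") or row["account"]
    cents = round(float(row["amount"]) * 100)
    return account.strip().upper(), cents


def load(conn, records):
    """Load: write the unified records into the (stand-in) warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS fact_txn (account TEXT, cents INTEGER)")
    conn.executemany("INSERT INTO fact_txn VALUES (?, ?)", records)
    conn.commit()


def run_etl(conn):
    """Run extract, transform, and load over both sample sources."""
    raw = list(extract_flat_file(FLAT_FILE)) + list(extract_xml(XML_FEED))
    load(conn, [transform(r) for r in raw])


def totals_by_account(conn):
    """A reporting-style query over the consolidated data."""
    query = "SELECT account, SUM(cents) FROM fact_txn GROUP BY account"
    return dict(conn.execute(query))
```

Calling `run_etl(sqlite3.connect(":memory:"))` and then `totals_by_account` on the same connection gives one consolidated view across both sources, which is the "single version of the truth" idea in miniature.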
9. Why ETL?
●
Generally, to build and maintain data repositories that hold a
“single version of the truth”, drawn from the multiple
heterogeneous data sources scattered across an
organization or a business domain.
●
Business users can then use that data for:
○
Predictive Analysis
○
Revenue predictions and comparisons
○
Monitoring the overall growth of an organization
○
Business Policies
○
Strategic Decisions
10. Challenges
●
Data definition establishment
●
Need for expert knowledge
●
Scalability and Performance
●
Business user acceptance and seamless support for a wide
range of business use cases
●
Maintenance, Data Archival
●
Real-time or Near Real-time data synchronization
11. Why Standalone ETL Products Are Dead?
●
Modern organizations are evolving faster than ever
before.
●
The tendency to adopt architecture patterns such as SOA to
reduce IT costs and have flexible business processes is
rapidly increasing.
●
Organizations are more focused on “Connected
businesses”.
●
Thus, it’s very likely that an organization already has an IT
infrastructure in place.
12. Why Standalone ETL Products Are Dead?
●
Adopting a standalone ETL product? Possible, but is it
worthwhile?
●
Generally less support for open standards. Extension
points? Connectors? More custom code!
●
Usually relies on proprietary data integration
patterns, inducing high maintenance costs.
●
Additional licensing costs and the need for separate
expert/operational assistance, again inducing high
maintenance costs.
13. Why Standalone ETL Products Are Dead?
●
Tendency to use in-house re-usable business components
leveraging the benefits of SOA
●
Lower operational costs
●
Scalability is a main focus nowadays.
●
Having a similar process implemented enables horizontal
scalability at different layers as the need arises.
14. Re-Architecting A System’s DIL?
●
Data integration is always cumbersome
●
Need to ensure policy compliance of data at its target
containers (usually Enterprise Data Warehouses, central
MDM repositories, etc.)
●
Flexibility
●
Ensuring acceptable Performance
●
What about Reliability?
15. Re-Architecting A System’s DIL?
●
How to deal with the freshness of data?
●
When to synchronize?
●
Need for tuning the system to meet various SLAs
17. Impact Of Tooling
●
Numerous ETL solutions fail because of a lack of tooling.
●
Developers and solution composers are left to hand-code
XSLT, custom mappers, etc.
●
Not scalable!
●
A powerful, flexible tooling platform is often required,
particularly as the system grows and matures.
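To picture the difference between hand-written mappers and tool-assisted mapping, it can help to treat the mapping itself as data that a generic engine applies. The sketch below is purely illustrative and does not represent how WSO2 Developer Studio, Smooks, or any other product works internally. The `MAPPING` spec, the field names, and `apply_mapping` are hypothetical.

```python
# Hypothetical mapping spec: target field -> (source field, converter).
# A graphical tool could, in principle, produce a spec like this instead of
# leaving developers to hand-code one mapper per record type.
MAPPING = {
    "customer_id": ("CustID", str.strip),
    "full_name": ("Name", str.title),
    "balance_cents": ("Balance", lambda value: round(float(value) * 100)),
}


def apply_mapping(record, mapping):
    """Build a target record by applying each converter to its source field."""
    return {
        target: convert(record[source])
        for target, (source, convert) in mapping.items()
    }
```

With this shape, supporting a new source format means adding another mapping spec rather than another hand-written mapper, which is the scalability concern raised above.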
27. Summary
●
ETL plays a pivotal role in any business organization.
●
Implementing a proper ETL process within an organization
often requires a lot of effort.
●
Standalone ETL solutions can be costly.
●
Re-architecting data models is made easy with the WSO2
Enterprise Middleware Platform.
28. References
[1] How to use the Smooks Editor shipped with WSO2
Developer Studio
http://wso2.com/library/tutorials/2011/06/perform-data-mapping-smookseditor-wso2-carbon-studio/