Hadoop in a Relational Data Warehouse, Expedia
1. HADOOP IN A RELATIONAL DATA WAREHOUSE
Data and Analytics/Enterprise DW, Expedia
June 2013
Arek Kaczmarek
2. Background
Expedia
Site
Competitors
DW
Legacy
EDW
DNA
Hadoop at Expedia
Original Purpose
Early expectations
3. A case study
Project objective
Datasets
Competitive shopping comparisons
Properties
Bookings
Clickstream demand
Forecast
4. DW architecture – what’s different?
Normalized vs denormalized tables
Does it matter?
Performance
Ingestion speed
Analytical flexibility
5. DEV work – do you need different skills?
Data files: csv, tsv, txt or xml – which work best?
Hive: HQL UDFs for analytic functions – do you need them?
Optimization – reuse your knowledge?
Architecture (temp tables, partitions)
HQL (set parameters)
Load_tags: partitioning, appending, syncing
6. RDBMSes and Hadoop – what’s their relationship?
- Syncing from DB2
- Importing from SQLServer
- Exporting into HBase
- Exporting into SQLServer
- Exporting into DB2
7. Place of Hadoop in a Relational Data Warehouse?
Conflicting
Mutually exclusive
Coexisting
Complementing
8. What’s the new Data Warehouse for data and analytics?
Complementing: Polyglot Persistence