Physical Database Design for
MPP and Columnar Databases
Geoffrey Clark
Principal at Lucidata, Inc.
September 2013
Copyright Lucidata, 2013
Conceptual, Logical, Physical
• Conceptual links to Business Strategy.
– This is now becoming more quantitative
• Logical maps to the Business Semantics.
– Con-way example
• Physical maps to your Data Stores
– These will be more varied and heterogeneous in
the future, due to specialization.
HBR Business Strategy
The New Dynamics of Competition, Michael D. Ryall, Harvard Business Review, June 2013
Michael Porter’s Five Forces
has dominated strategic
and competitive analysis
since 1979. This analysis
has largely been conceptual
in nature.
Quantitative analysis on
structured data in context is
changing the nature of
business culture, and
improving business
decisions.
This drives the demand for
data modeling and
management.
Design and Evolution
• Hierarchies
– 14th Century Europe and the Financial Revolution
– Aggregations & Allocations
• Cards, Tapes – physical analog media
• Computer Science
– Moore’s Law
• Processor Speed Improvements
• Memory Improvements
• Media Improvements – Punch Cards, Tape, Disk, Memory
• Design for Context & the Future
– Character encoding - Internationalization
– Calendars – Gregorian, Fiscal, Lunar, ... Y2K?
• Files and Fields
– Separation of Data and Metadata
– Modern versions -> XML, JSON
• Joins!
– Data Sets – Super types, Sub types
– Associations describe Networks!
Technology’s Improvement Pace
... and Demand Forecast
Separation of Church and State
• Operational uses
– Capture the data, hand-entered <- validation
– A Data Flow, such as Order to Cash cycle
– Con-way example of PRO(-gressive) numbers
• Analytical uses
– Desire for reports; reporting crashes the
Operational cycle, causing a cash flow problem.
– Banished from OLTP, go make an ODS
The Star Schema
The purpose of business computers is to sort data. A graphical
representation of sorted data is called a ‘Star Schema’.
– Michael Silves, Principal at Datamorphosis
• The right design at the right time becomes the default doctrine for DW
– Early RDBMS (Relational Data Base Management Systems)
• Low memory, slow disks, slow CPU
• Big Demand, with questions that spanned the datasets
• Performance issues over large datasets
– Interview Business people to get questions
• Pre-process the data, based on business questions
– Separation into Dimensions and Facts/Metrics
• Link to Business Semantics
• OLAP (On-Line Analytical Processing)
• Educate Users on Aggregation and Allocation
• Conformed Dimensions across Departments to give an Enterprise-wide view of the data.
• But as technology changes, problems emerge
– Ad-hoc questions require redesign & rework
– Business hierarchies where one concept is both a fact & a dimension, e.g. Shipment
– Fact tables become difficult to distribute for MPP ... e.g. Teradata prefers a normalized DW
• Example – transportation networks
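The separation into Dimensions and Facts/Metrics can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not the deck's own model: the table and column names (dim_customer, fact_shipment, revenue, region) and the sample rows are assumptions made up for the example.

```python
import sqlite3


def build_star(conn):
    """Create one dimension table and one fact table (hypothetical names)."""
    conn.executescript(
        """
        CREATE TABLE dim_customer (
            customer_key  INTEGER PRIMARY KEY,
            customer_name TEXT,
            region        TEXT
        );
        CREATE TABLE fact_shipment (
            shipment_id  INTEGER PRIMARY KEY,
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            revenue      REAL
        );
        """
    )
    # Dimension rows describe "who/where"; fact rows carry the metrics.
    conn.executemany(
        "INSERT INTO dim_customer VALUES (?, ?, ?)",
        [(1, "Acme", "West"), (2, "Globex", "East"), (3, "Initech", "West")],
    )
    conn.executemany(
        "INSERT INTO fact_shipment VALUES (?, ?, ?)",
        [(100, 1, 250.0), (101, 1, 100.0), (102, 2, 75.0), (103, 3, 50.0)],
    )


def revenue_by_region(conn):
    """Aggregate the fact metric along one dimension attribute."""
    rows = conn.execute(
        """
        SELECT d.region, SUM(f.revenue)
        FROM fact_shipment AS f
        JOIN dim_customer AS d ON d.customer_key = f.customer_key
        GROUP BY d.region
        ORDER BY d.region
        """
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
build_star(conn)
print(revenue_by_region(conn))
```

The query works because the question (revenue by region) was anticipated in the design. A question that needs Shipment as both a fact and a dimension would require reshaping these tables, which is the rework problem the slide describes.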
Example – Multi-Modal Freight
• Shipments are agreements between a Carrier and a
Shipper to move goods between two places.
• Shipments can be split into “ProFreight” (which is
assigned a cost via activity-based costing).
• Shipments/ProFreight are composed of Freight
handling units.
• Freight can be “re-tendered” to another carrier, in
which case it is linked to both the original and the new
Shipment.
• Freight moves between places on one or many “VFCs”
or Containers.
• Containers are moved between places on Trips.
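The relationships listed above can be sketched as plain Python dataclasses. This is one possible reading of the bullets, not the presenter's actual model: all class and field names are assumptions, ProFreight is reduced to an identifier plus a cost, and the activity-based costing itself is not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Shipment:
    """Agreement between a Carrier and a Shipper to move goods."""
    shipment_id: str
    carrier: str
    shipper: str
    origin: str
    destination: str


@dataclass
class ProFreight:
    """A split of a Shipment that carries an assigned cost (hypothetical fields)."""
    pro_id: str
    shipment: Shipment
    cost: float = 0.0


@dataclass
class Freight:
    """A handling unit; keeps a link to the original Shipment if re-tendered."""
    freight_id: str
    shipment: Shipment
    original_shipment: Optional[Shipment] = None

    def retender(self, new_shipment: Shipment) -> None:
        # Remember the first shipment only once, then point at the new one.
        if self.original_shipment is None:
            self.original_shipment = self.shipment
        self.shipment = new_shipment


@dataclass
class Container:
    container_id: str
    freight: List[Freight] = field(default_factory=list)


@dataclass
class Trip:
    """Moves Containers between two places."""
    trip_id: str
    origin: str
    destination: str
    containers: List[Container] = field(default_factory=list)


def shipments_on_trip(trip: Trip) -> Set[str]:
    """Walk Trip -> Container -> Freight -> Shipment, including re-tender links."""
    ids: Set[str] = set()
    for container in trip.containers:
        for item in container.freight:
            ids.add(item.shipment.shipment_id)
            if item.original_shipment is not None:
                ids.add(item.original_shipment.shipment_id)
    return ids
```

Even this small traversal crosses four entities, which hints at why a single fact table with flat dimensions is an awkward fit for this domain.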
Kimball on Transportation, 3NF
Kimball on Transportation, Star
Table Level DW diagram
Dim Modeling Dogma
• “Our carefully normalized data model cannot
be translated into a star schema...”
– Dimensional modeling is necessary in order to
generate correct queries
– Any (normalized) data model can be transformed
in a dimensional model...
– ... and there exists an algorithm to do it
Dim Modeling Example
Star option considered
Bridge table
(remember, we tried this)
We tried this with
hesmith. When
selecting a main
hierarchy, it has
too much of a
downside, and
you don’t have a
weight factor …
Multi-fact option considered
Oracle’s Algorithmic approach
Basic DW diagram
Build Dimensional Model in BI
Freight moves through Networks
Information Factory & MPP
• Normalized Base
– Integrate data once
• Source -> Normalized -> Denormalized -> OK
• Source -> Denormalized? -> Un-normalized -> ?
– Detect problems and fix them once!
• Does not preclude Data Marts
• Massive Parallel Processing
– Data distribution
• Optimizations – Broadcast, Co-location, Re-distribution
• Scalability, the quest for 1:1
• Normalized data - reduced IO, better match for
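The distribution ideas on this slide (Broadcast, Co-location, Re-distribution) can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the node count, the CRC32-based placement, and every function name are invented for the example, and no real MPP engine is claimed to work exactly this way.

```python
import zlib
from collections import defaultdict

NODES = 4  # hypothetical cluster size


def node_for(key, nodes=NODES):
    """Deterministic hash placement: equal keys always land on the same node."""
    return zlib.crc32(str(key).encode("utf-8")) % nodes


def distribute(rows, key, nodes=NODES):
    """Hash-distribute rows on a distribution key."""
    placement = defaultdict(list)
    for row in rows:
        placement[node_for(row[key], nodes)].append(row)
    return placement


def broadcast(rows, nodes=NODES):
    """Copy a (small) table to every node so joins against it stay local."""
    return {n: list(rows) for n in range(nodes)}


def colocated_join(left, right, key, nodes=NODES):
    """Join two tables distributed on the same key, one node at a time.

    Because both sides use the same placement function, matching rows are
    already on the same node and no re-distribution step is needed.
    """
    left_parts = distribute(left, key, nodes)
    right_parts = distribute(right, key, nodes)
    joined = []
    for n in range(nodes):
        index = defaultdict(list)
        for r in right_parts.get(n, []):
            index[r[key]].append(r)
        for l in left_parts.get(n, []):
            for r in index.get(l[key], []):
                joined.append({**l, **r})
    return joined


shipments = [{"shipment_id": i, "carrier": "A"} for i in range(1, 7)]
freight = [
    {"shipment_id": 1, "freight_id": "F1"},
    {"shipment_id": 1, "freight_id": "F2"},
    {"shipment_id": 4, "freight_id": "F3"},
]
print(colocated_join(shipments, freight, "shipment_id"))
```

If the two tables were distributed on different keys, one side would first have to be re-distributed (re-hashed on the join key) or broadcast, which is the extra data movement a good physical design tries to avoid.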
Bob Conway’s Rapid Methodology
Core Model with many Roles
Transaction
Tables
Reference Tables
Power of Conformed Dimensions
Example Data Model & Hierarchy
Data Flow and Usage
Cubes and In-memory BI
• Multi-Dimensional OLAP (MOLAP)
– Drag-and-Drop OLAP environment, analysts
become capable of self-service.
– Dealt with Ragged Hierarchies, common in
Financial data such as General Ledger (GL)
– Limited by memory size
– Pressure for more dimensionality floods cube size,
build times from relational sources exceed load
windows ...
• Relational OLAP (ROLAP)
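The MOLAP point about dimensionality flooding cube size comes down to simple arithmetic: the number of potential cells is the product of the dimension cardinalities. The cardinalities below are invented for illustration, and the figure is an upper bound for a fully dense cube (real engines typically store cubes sparsely).

```python
from math import prod  # available in Python 3.8+


def cube_cells(cardinalities):
    """Upper bound on cells in a dense cube: product of dimension sizes."""
    return prod(cardinalities)


# Hypothetical dimensions: 365 days x 1,000 customers x 50 terminals
base = cube_cells([365, 1000, 50])
# Adding one more dimension with 200 members multiplies the bound by 200
with_extra = cube_cells([365, 1000, 50, 200])

print(base, with_extra)
```

Each added dimension multiplies rather than adds, which is why both memory limits and build times are hit so quickly.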
But a network this size choked it
Columnar vs Row-wise
• Physically store data by Column vs Row
– Rather like Fifth Normal Form.
– If Semantically Organized, then Rapid Response to
user’s ad-hoc aggregation requests.
– Prefers batch loading, always loads once per
column, even if loading one row.
• Continues to Appear and Operate as a normal
Row-wise cousin.
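The difference between the two physical layouts can be shown with plain Python structures. This is a conceptual sketch only: the sample rows and function names are assumptions, and it assumes every row has the same set of fields.

```python
rows = [
    {"shipment_id": 1, "carrier": "A", "weight": 120},
    {"shipment_id": 2, "carrier": "A", "weight": 80},
    {"shipment_id": 3, "carrier": "B", "weight": 200},
]


def to_columns(rows):
    """Pivot row-wise records into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_weight_row_wise(rows):
    """Row store: every full row is visited to read one field."""
    return sum(row["weight"] for row in rows)


def total_weight_columnar(columns):
    """Column store: only the 'weight' column is read."""
    return sum(columns["weight"])


columns = to_columns(rows)
print(total_weight_row_wise(rows), total_weight_columnar(columns))
```

Both functions return the same answer, so the store can still appear and operate like its row-wise cousin. The to_columns step also suggests why loading a single row is relatively expensive: it appends to every column.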
Columnar IO example
Compression becomes
much more effective
Reading a Column is
like reading a Row
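One reason compression becomes more effective is that a column holds values of a single type, often with long runs of repeats once sorted. Run-length encoding is used here purely as an illustration; the deck does not say which compression scheme any particular product uses, and the sample column is made up.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    out = []
    for value, count in pairs:
        out.extend([value] * count)
    return out


# Hypothetical sorted 'carrier' column: nine values, three distinct
carrier_column = ["A"] * 4 + ["B"] * 3 + ["C"] * 2
encoded = rle_encode(carrier_column)
print(encoded)
```

The same values stored row-wise would be interleaved with other fields, so the runs would not be adjacent and this kind of encoding would gain little.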
Design Pattern for Log Data
Data Stewards for
Master Data
Data Stewards for
Metadata
Architects
integrate data
and metadata
Architects
organize data for
analysis with
physical in mind
Architects identify levels for
analysis, and distribution
Columnar
MPP
Importance of Reference Data
Infobright’s Database Landscape 2011
Analytic Database Comparison
Actian ParAccel
IBM Netezza
HP Vertica
Greenplum
Teradata
Sybase IQ
Gartner’s Magic Quadrant
Hadoop (Cloudera & Hortonworks)
“Although it’s true that Hadoop can be valuable as an analytic silo, most
organizations will prefer to get the most business value out of Hadoop by
integrating it with—or into—their BI, DW, DI, and analytics technology
stacks.” – Philip Russom TDWI
http://tdwi.org/webcasts/2013/04/integrating-hadoop-into-business-intelligence-and-data-warehousing.aspx
Hadoop for Analytics?
Analytics performs
best on Structured
Data, for good
reasons.
Maintain MPP strengths in
the solution through
Architecture.
Message from Hortonworks (Hadoop)
Hadoop as ETL
Data Flow Reference Architecture
Message from Neo4J NoSQL
Message from MongoDB (NoSQL)
http://www.slideshare.net/fullscreen/mongodb/schema-design-by-example/1
Message from Couchbase (NoSQL)
http://www.couchbase.com/why-nosql/nosql-database

Weitere ähnliche Inhalte

Was ist angesagt?

Bi Dw Presentation
Bi Dw PresentationBi Dw Presentation
Bi Dw Presentationvickyc
 
Business Intelligence Architecture
Business Intelligence ArchitectureBusiness Intelligence Architecture
Business Intelligence ArchitecturePhilippe Julio
 
BA Summit 2014 Ontdek de nieuwe mogelijkheden van IBM SPSS Modeler 16.0
BA Summit 2014 Ontdek de nieuwe mogelijkheden van IBM SPSS Modeler 16.0BA Summit 2014 Ontdek de nieuwe mogelijkheden van IBM SPSS Modeler 16.0
BA Summit 2014 Ontdek de nieuwe mogelijkheden van IBM SPSS Modeler 16.0Daniel Westzaan
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecturepcherukumalla
 
Mammothdb - Public VC Pitchdeck!
Mammothdb - Public VC Pitchdeck!Mammothdb - Public VC Pitchdeck!
Mammothdb - Public VC Pitchdeck!Steve Keil
 
Column Oriented Databases
Column Oriented DatabasesColumn Oriented Databases
Column Oriented DatabasesArundhati Kanungo
 
Data warehousing
Data warehousingData warehousing
Data warehousingBhaskar Pathak
 
Project+team+1 slides (2)
Project+team+1 slides (2)Project+team+1 slides (2)
Project+team+1 slides (2)Vijay Pappu, Ph.D.
 
A hadoop map reduce
A hadoop map reduceA hadoop map reduce
A hadoop map reducesrikanthhadoop
 
BI architecture presentation and involved models (short)
BI architecture presentation and involved models (short)BI architecture presentation and involved models (short)
BI architecture presentation and involved models (short)Thierry de Spirlet
 
Optimize Workloads with IBM Solutions and Services
Optimize Workloads with IBM Solutions and ServicesOptimize Workloads with IBM Solutions and Services
Optimize Workloads with IBM Solutions and ServicesIBM India Smarter Computing
 
7 - Enterprise IT in Action
7 - Enterprise IT in Action7 - Enterprise IT in Action
7 - Enterprise IT in ActionRaymond Gao
 
Austin fraser sap hana presentation
Austin fraser sap hana presentationAustin fraser sap hana presentation
Austin fraser sap hana presentationShane Sale
 
What exactly is Business Intelligence?
What exactly is Business Intelligence?What exactly is Business Intelligence?
What exactly is Business Intelligence?James Serra
 
SAP HANA Integrated with Microstrategy
SAP HANA Integrated with MicrostrategySAP HANA Integrated with Microstrategy
SAP HANA Integrated with Microstrategysnehal parikh
 
Datawarehousing and Business Intelligence
Datawarehousing and Business IntelligenceDatawarehousing and Business Intelligence
Datawarehousing and Business IntelligencePrithwis Mukerjee
 
Data Mining and Data Warehousing (MAKAUT)
Data Mining and Data Warehousing (MAKAUT)Data Mining and Data Warehousing (MAKAUT)
Data Mining and Data Warehousing (MAKAUT)Bikramjit Sarkar, Ph.D.
 
Keynote Sap UA Conference March 23 a zeier final
Keynote Sap UA Conference March 23 a zeier  finalKeynote Sap UA Conference March 23 a zeier  final
Keynote Sap UA Conference March 23 a zeier finalProf. Dr. Alexander Zeier
 
Resume Pallavi Mishra as of 2017 Feb
Resume Pallavi Mishra as of 2017 FebResume Pallavi Mishra as of 2017 Feb
Resume Pallavi Mishra as of 2017 FebPallavi Gokhale Mishra
 

Was ist angesagt? (20)

Bi Dw Presentation
Bi Dw PresentationBi Dw Presentation
Bi Dw Presentation
 
Mr bi
Mr biMr bi
Mr bi
 
Business Intelligence Architecture
Business Intelligence ArchitectureBusiness Intelligence Architecture
Business Intelligence Architecture
 
BA Summit 2014 Ontdek de nieuwe mogelijkheden van IBM SPSS Modeler 16.0
BA Summit 2014 Ontdek de nieuwe mogelijkheden van IBM SPSS Modeler 16.0BA Summit 2014 Ontdek de nieuwe mogelijkheden van IBM SPSS Modeler 16.0
BA Summit 2014 Ontdek de nieuwe mogelijkheden van IBM SPSS Modeler 16.0
 
Data warehouse architecture
Data warehouse architectureData warehouse architecture
Data warehouse architecture
 
Mammothdb - Public VC Pitchdeck!
Mammothdb - Public VC Pitchdeck!Mammothdb - Public VC Pitchdeck!
Mammothdb - Public VC Pitchdeck!
 
Column Oriented Databases
Column Oriented DatabasesColumn Oriented Databases
Column Oriented Databases
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Project+team+1 slides (2)
Project+team+1 slides (2)Project+team+1 slides (2)
Project+team+1 slides (2)
 
A hadoop map reduce
A hadoop map reduceA hadoop map reduce
A hadoop map reduce
 
BI architecture presentation and involved models (short)
BI architecture presentation and involved models (short)BI architecture presentation and involved models (short)
BI architecture presentation and involved models (short)
 
Optimize Workloads with IBM Solutions and Services
Optimize Workloads with IBM Solutions and ServicesOptimize Workloads with IBM Solutions and Services
Optimize Workloads with IBM Solutions and Services
 
7 - Enterprise IT in Action
7 - Enterprise IT in Action7 - Enterprise IT in Action
7 - Enterprise IT in Action
 
Austin fraser sap hana presentation
Austin fraser sap hana presentationAustin fraser sap hana presentation
Austin fraser sap hana presentation
 
What exactly is Business Intelligence?
What exactly is Business Intelligence?What exactly is Business Intelligence?
What exactly is Business Intelligence?
 
SAP HANA Integrated with Microstrategy
SAP HANA Integrated with MicrostrategySAP HANA Integrated with Microstrategy
SAP HANA Integrated with Microstrategy
 
Datawarehousing and Business Intelligence
Datawarehousing and Business IntelligenceDatawarehousing and Business Intelligence
Datawarehousing and Business Intelligence
 
Data Mining and Data Warehousing (MAKAUT)
Data Mining and Data Warehousing (MAKAUT)Data Mining and Data Warehousing (MAKAUT)
Data Mining and Data Warehousing (MAKAUT)
 
Keynote Sap UA Conference March 23 a zeier final
Keynote Sap UA Conference March 23 a zeier  finalKeynote Sap UA Conference March 23 a zeier  final
Keynote Sap UA Conference March 23 a zeier final
 
Resume Pallavi Mishra as of 2017 Feb
Resume Pallavi Mishra as of 2017 FebResume Pallavi Mishra as of 2017 Feb
Resume Pallavi Mishra as of 2017 Feb
 

Ähnlich wie Data modelingzone geoffrey-clark-v2

The final frontier v3
The final frontier v3The final frontier v3
The final frontier v3Terry Bunio
 
C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Envir...
C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Envir...C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Envir...
C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Envir...DataStax Academy
 
2009/11 Database Architechs Presentation
2009/11   Database Architechs Presentation2009/11   Database Architechs Presentation
2009/11 Database Architechs PresentationDatabase Architechs
 
Agile & Data Modeling – How Can They Work Together?
Agile & Data Modeling – How Can They Work Together?Agile & Data Modeling – How Can They Work Together?
Agile & Data Modeling – How Can They Work Together?DATAVERSITY
 
Mastering your data with ca e rwin dm 09082010
Mastering your data with ca e rwin dm 09082010Mastering your data with ca e rwin dm 09082010
Mastering your data with ca e rwin dm 09082010ERwin Modeling
 
Integrating Semantic Web with the Real World - A Journey between Two Cities ...
Integrating Semantic Web with the Real World  - A Journey between Two Cities ...Integrating Semantic Web with the Real World  - A Journey between Two Cities ...
Integrating Semantic Web with the Real World - A Journey between Two Cities ...Juan Sequeda
 
OLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingOLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingPrithwis Mukerjee
 
AnzoGraph DB: Driving AI and Machine Insights with Knowledge Graphs in a Conn...
AnzoGraph DB: Driving AI and Machine Insights with Knowledge Graphs in a Conn...AnzoGraph DB: Driving AI and Machine Insights with Knowledge Graphs in a Conn...
AnzoGraph DB: Driving AI and Machine Insights with Knowledge Graphs in a Conn...Cambridge Semantics
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2Fabio Fumarola
 
Information processing architectures
Information processing architecturesInformation processing architectures
Information processing architecturesRaji Gogulapati
 
How to Survive as a Data Architect in a Polyglot Database World
How to Survive as a Data Architect in a Polyglot Database WorldHow to Survive as a Data Architect in a Polyglot Database World
How to Survive as a Data Architect in a Polyglot Database WorldKaren Lopez
 
Bbbt presentation 210415_final_2
Bbbt presentation 210415_final_2Bbbt presentation 210415_final_2
Bbbt presentation 210415_final_2Roland Bullivant
 
86921864 olap-case-study-vj
86921864 olap-case-study-vj86921864 olap-case-study-vj
86921864 olap-case-study-vjhomeworkping4
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2Mohit Garg
 
Transform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big DataTransform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big DataAshnikbiz
 
Big data Intro - Presentation to OCHackerz Meetup Group
Big data Intro - Presentation to OCHackerz Meetup GroupBig data Intro - Presentation to OCHackerz Meetup Group
Big data Intro - Presentation to OCHackerz Meetup GroupSri Kanajan
 
One Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database RevolutionOne Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database Revolutionmark madsen
 
L'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo BrignoliL'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo BrignoliData Driven Innovation
 
Big iron 2 (published)
Big iron 2 (published)Big iron 2 (published)
Big iron 2 (published)Ben Stopford
 

Ähnlich wie Data modelingzone geoffrey-clark-v2 (20)

The final frontier v3
The final frontier v3The final frontier v3
The final frontier v3
 
C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Envir...
C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Envir...C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Envir...
C* Summit 2013: Data Modelers Still Have Jobs - Adjusting For the NoSQL Envir...
 
2009/11 Database Architechs Presentation
2009/11   Database Architechs Presentation2009/11   Database Architechs Presentation
2009/11 Database Architechs Presentation
 
BI Introduction
BI IntroductionBI Introduction
BI Introduction
 
Agile & Data Modeling – How Can They Work Together?
Agile & Data Modeling – How Can They Work Together?Agile & Data Modeling – How Can They Work Together?
Agile & Data Modeling – How Can They Work Together?
 
Mastering your data with ca e rwin dm 09082010
Mastering your data with ca e rwin dm 09082010Mastering your data with ca e rwin dm 09082010
Mastering your data with ca e rwin dm 09082010
 
Integrating Semantic Web with the Real World - A Journey between Two Cities ...
Integrating Semantic Web with the Real World  - A Journey between Two Cities ...Integrating Semantic Web with the Real World  - A Journey between Two Cities ...
Integrating Semantic Web with the Real World - A Journey between Two Cities ...
 
OLAP Cubes in Datawarehousing
OLAP Cubes in DatawarehousingOLAP Cubes in Datawarehousing
OLAP Cubes in Datawarehousing
 
AnzoGraph DB: Driving AI and Machine Insights with Knowledge Graphs in a Conn...
AnzoGraph DB: Driving AI and Machine Insights with Knowledge Graphs in a Conn...AnzoGraph DB: Driving AI and Machine Insights with Knowledge Graphs in a Conn...
AnzoGraph DB: Driving AI and Machine Insights with Knowledge Graphs in a Conn...
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2
 
Information processing architectures
Information processing architecturesInformation processing architectures
Information processing architectures
 
How to Survive as a Data Architect in a Polyglot Database World
How to Survive as a Data Architect in a Polyglot Database WorldHow to Survive as a Data Architect in a Polyglot Database World
How to Survive as a Data Architect in a Polyglot Database World
 
Bbbt presentation 210415_final_2
Bbbt presentation 210415_final_2Bbbt presentation 210415_final_2
Bbbt presentation 210415_final_2
 
86921864 olap-case-study-vj
86921864 olap-case-study-vj86921864 olap-case-study-vj
86921864 olap-case-study-vj
 
Big learning 1.2
Big learning   1.2Big learning   1.2
Big learning 1.2
 
Transform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big DataTransform your DBMS to drive engagement innovation with Big Data
Transform your DBMS to drive engagement innovation with Big Data
 
Big data Intro - Presentation to OCHackerz Meetup Group
Big data Intro - Presentation to OCHackerz Meetup GroupBig data Intro - Presentation to OCHackerz Meetup Group
Big data Intro - Presentation to OCHackerz Meetup Group
 
One Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database RevolutionOne Size Doesn't Fit All: The New Database Revolution
One Size Doesn't Fit All: The New Database Revolution
 
L'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo BrignoliL'architettura di classe enterprise di nuova generazione - Massimo Brignoli
L'architettura di classe enterprise di nuova generazione - Massimo Brignoli
 
Big iron 2 (published)
Big iron 2 (published)Big iron 2 (published)
Big iron 2 (published)
 

KĂźrzlich hochgeladen

Night 7k Call Girls Noida Sector 93 Escorts Call Me: 8448380779
Night 7k Call Girls Noida Sector 93 Escorts Call Me: 8448380779Night 7k Call Girls Noida Sector 93 Escorts Call Me: 8448380779
Night 7k Call Girls Noida Sector 93 Escorts Call Me: 8448380779Delhi Call girls
 
Hire 💕 8617697112 Chamba Call Girls Service Call Girls Agency
Hire 💕 8617697112 Chamba Call Girls Service Call Girls AgencyHire 💕 8617697112 Chamba Call Girls Service Call Girls Agency
Hire 💕 8617697112 Chamba Call Girls Service Call Girls AgencyNitya salvi
 
08448380779 Call Girls In Bhikaji Cama Palace Women Seeking Men
08448380779 Call Girls In Bhikaji Cama Palace Women Seeking Men08448380779 Call Girls In Bhikaji Cama Palace Women Seeking Men
08448380779 Call Girls In Bhikaji Cama Palace Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Chhattarpur Women Seeking Men
08448380779 Call Girls In Chhattarpur Women Seeking Men08448380779 Call Girls In Chhattarpur Women Seeking Men
08448380779 Call Girls In Chhattarpur Women Seeking MenDelhi Call girls
 
Visa Consultant in Lahore || 📞03094429236
Visa Consultant in Lahore || 📞03094429236Visa Consultant in Lahore || 📞03094429236
Visa Consultant in Lahore || 📞03094429236Sherazi Tours
 
CYTOTEC DUBAI ☎️ +966572737505 } Abortion pills in Abu dhabi,get misoprostal ...
CYTOTEC DUBAI ☎️ +966572737505 } Abortion pills in Abu dhabi,get misoprostal ...CYTOTEC DUBAI ☎️ +966572737505 } Abortion pills in Abu dhabi,get misoprostal ...
CYTOTEC DUBAI ☎️ +966572737505 } Abortion pills in Abu dhabi,get misoprostal ...Abortion pills in Riyadh +966572737505 get cytotec
 
BERMUDA Triangle the mystery of life.pptx
BERMUDA Triangle the mystery of life.pptxBERMUDA Triangle the mystery of life.pptx
BERMUDA Triangle the mystery of life.pptxseri bangash
 
08448380779 Call Girls In Chirag Enclave Women Seeking Men
08448380779 Call Girls In Chirag Enclave Women Seeking Men08448380779 Call Girls In Chirag Enclave Women Seeking Men
08448380779 Call Girls In Chirag Enclave Women Seeking MenDelhi Call girls
 
Hire 💕 8617697112 Reckong Peo Call Girls Service Call Girls Agency
Hire 💕 8617697112 Reckong Peo Call Girls Service Call Girls AgencyHire 💕 8617697112 Reckong Peo Call Girls Service Call Girls Agency
Hire 💕 8617697112 Reckong Peo Call Girls Service Call Girls AgencyNitya salvi
 
Top 10 Traditional Indian Handicrafts.pptx
Top 10 Traditional Indian Handicrafts.pptxTop 10 Traditional Indian Handicrafts.pptx
Top 10 Traditional Indian Handicrafts.pptxdishha99
 
9 Days Kenya Ultimate Safari Odyssey with Kibera Holiday Safaris
9 Days Kenya Ultimate Safari Odyssey with Kibera Holiday Safaris9 Days Kenya Ultimate Safari Odyssey with Kibera Holiday Safaris
9 Days Kenya Ultimate Safari Odyssey with Kibera Holiday SafarisKibera Holiday Safaris Safaris
 
Discover Mathura And Vrindavan A Spritual Journey.pdf
Discover Mathura And Vrindavan A Spritual Journey.pdfDiscover Mathura And Vrindavan A Spritual Journey.pdf
Discover Mathura And Vrindavan A Spritual Journey.pdfMathura Vrindavan Tour Packages
 
DARK TRAVEL AGENCY presented by Khuda Bux
DARK TRAVEL AGENCY presented by Khuda BuxDARK TRAVEL AGENCY presented by Khuda Bux
DARK TRAVEL AGENCY presented by Khuda BuxBeEducate
 
"Embark on the Ultimate Adventure: Top 10 Must-Visit Destinations for Thrill-...
"Embark on the Ultimate Adventure: Top 10 Must-Visit Destinations for Thrill-..."Embark on the Ultimate Adventure: Top 10 Must-Visit Destinations for Thrill-...
"Embark on the Ultimate Adventure: Top 10 Must-Visit Destinations for Thrill-...Ishwaholidays
 
💕📲09602870969💓Girl Escort Services Udaipur Call Girls in Chittorgarh Haldighati
💕📲09602870969💓Girl Escort Services Udaipur Call Girls in Chittorgarh Haldighati💕📲09602870969💓Girl Escort Services Udaipur Call Girls in Chittorgarh Haldighati
💕📲09602870969💓Girl Escort Services Udaipur Call Girls in Chittorgarh HaldighatiApsara Of India
 
Book Cheap Flight Tickets - TraveljunctionUK
Book  Cheap Flight Tickets - TraveljunctionUKBook  Cheap Flight Tickets - TraveljunctionUK
Book Cheap Flight Tickets - TraveljunctionUKTravel Juncation
 
08448380779 Call Girls In Shahdara Women Seeking Men
08448380779 Call Girls In Shahdara Women Seeking Men08448380779 Call Girls In Shahdara Women Seeking Men
08448380779 Call Girls In Shahdara Women Seeking MenDelhi Call girls
 
ITALY - Visa Options for expats and digital nomads
ITALY - Visa Options for expats and digital nomadsITALY - Visa Options for expats and digital nomads
ITALY - Visa Options for expats and digital nomadsMarco Mazzeschi
 

KĂźrzlich hochgeladen (20)

Night 7k Call Girls Noida Sector 93 Escorts Call Me: 8448380779
Night 7k Call Girls Noida Sector 93 Escorts Call Me: 8448380779Night 7k Call Girls Noida Sector 93 Escorts Call Me: 8448380779
Night 7k Call Girls Noida Sector 93 Escorts Call Me: 8448380779
 
Hire 💕 8617697112 Chamba Call Girls Service Call Girls Agency
Hire 💕 8617697112 Chamba Call Girls Service Call Girls AgencyHire 💕 8617697112 Chamba Call Girls Service Call Girls Agency
Hire 💕 8617697112 Chamba Call Girls Service Call Girls Agency
 
08448380779 Call Girls In Bhikaji Cama Palace Women Seeking Men
08448380779 Call Girls In Bhikaji Cama Palace Women Seeking Men08448380779 Call Girls In Bhikaji Cama Palace Women Seeking Men
08448380779 Call Girls In Bhikaji Cama Palace Women Seeking Men
 
08448380779 Call Girls In Chhattarpur Women Seeking Men
08448380779 Call Girls In Chhattarpur Women Seeking Men08448380779 Call Girls In Chhattarpur Women Seeking Men
08448380779 Call Girls In Chhattarpur Women Seeking Men
 
Visa Consultant in Lahore || 📞03094429236
Visa Consultant in Lahore || 📞03094429236Visa Consultant in Lahore || 📞03094429236
Visa Consultant in Lahore || 📞03094429236
 
CYTOTEC DUBAI ☎️ +966572737505 } Abortion pills in Abu dhabi,get misoprostal ...
CYTOTEC DUBAI ☎️ +966572737505 } Abortion pills in Abu dhabi,get misoprostal ...CYTOTEC DUBAI ☎️ +966572737505 } Abortion pills in Abu dhabi,get misoprostal ...
CYTOTEC DUBAI ☎️ +966572737505 } Abortion pills in Abu dhabi,get misoprostal ...
 
Rohini Sector 18 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 18 Call Girls Delhi 9999965857 @Sabina Saikh No AdvanceRohini Sector 18 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
Rohini Sector 18 Call Girls Delhi 9999965857 @Sabina Saikh No Advance
 
BERMUDA Triangle the mystery of life.pptx
BERMUDA Triangle the mystery of life.pptxBERMUDA Triangle the mystery of life.pptx
BERMUDA Triangle the mystery of life.pptx
 
08448380779 Call Girls In Chirag Enclave Women Seeking Men
08448380779 Call Girls In Chirag Enclave Women Seeking Men08448380779 Call Girls In Chirag Enclave Women Seeking Men
08448380779 Call Girls In Chirag Enclave Women Seeking Men
 
Hire 💕 8617697112 Reckong Peo Call Girls Service Call Girls Agency
Hire 💕 8617697112 Reckong Peo Call Girls Service Call Girls AgencyHire 💕 8617697112 Reckong Peo Call Girls Service Call Girls Agency
Hire 💕 8617697112 Reckong Peo Call Girls Service Call Girls Agency
 
Top 10 Traditional Indian Handicrafts.pptx
Top 10 Traditional Indian Handicrafts.pptxTop 10 Traditional Indian Handicrafts.pptx
Top 10 Traditional Indian Handicrafts.pptx
 
9 Days Kenya Ultimate Safari Odyssey with Kibera Holiday Safaris
9 Days Kenya Ultimate Safari Odyssey with Kibera Holiday Safaris9 Days Kenya Ultimate Safari Odyssey with Kibera Holiday Safaris
9 Days Kenya Ultimate Safari Odyssey with Kibera Holiday Safaris
 
Call Girls Service !! Indirapuram!! @9999965857 Delhi 🫦 No Advance VVVIP 🍎 S...
Call Girls Service !! Indirapuram!! @9999965857 Delhi 🫦 No Advance  VVVIP 🍎 S...Call Girls Service !! Indirapuram!! @9999965857 Delhi 🫦 No Advance  VVVIP 🍎 S...
Call Girls Service !! Indirapuram!! @9999965857 Delhi 🫦 No Advance VVVIP 🍎 S...
 
Discover Mathura And Vrindavan A Spritual Journey.pdf
Discover Mathura And Vrindavan A Spritual Journey.pdfDiscover Mathura And Vrindavan A Spritual Journey.pdf
Discover Mathura And Vrindavan A Spritual Journey.pdf
 
DARK TRAVEL AGENCY presented by Khuda Bux
DARK TRAVEL AGENCY presented by Khuda BuxDARK TRAVEL AGENCY presented by Khuda Bux
DARK TRAVEL AGENCY presented by Khuda Bux
 
"Embark on the Ultimate Adventure: Top 10 Must-Visit Destinations for Thrill-...
"Embark on the Ultimate Adventure: Top 10 Must-Visit Destinations for Thrill-..."Embark on the Ultimate Adventure: Top 10 Must-Visit Destinations for Thrill-...
"Embark on the Ultimate Adventure: Top 10 Must-Visit Destinations for Thrill-...
 
💕📲09602870969💓Girl Escort Services Udaipur Call Girls in Chittorgarh Haldighati
💕📲09602870969💓Girl Escort Services Udaipur Call Girls in Chittorgarh Haldighati💕📲09602870969💓Girl Escort Services Udaipur Call Girls in Chittorgarh Haldighati
💕📲09602870969💓Girl Escort Services Udaipur Call Girls in Chittorgarh Haldighati
 
Book Cheap Flight Tickets - TraveljunctionUK
Book  Cheap Flight Tickets - TraveljunctionUKBook  Cheap Flight Tickets - TraveljunctionUK
Book Cheap Flight Tickets - TraveljunctionUK
 
08448380779 Call Girls In Shahdara Women Seeking Men
08448380779 Call Girls In Shahdara Women Seeking Men08448380779 Call Girls In Shahdara Women Seeking Men
08448380779 Call Girls In Shahdara Women Seeking Men
 
ITALY - Visa Options for expats and digital nomads
ITALY - Visa Options for expats and digital nomadsITALY - Visa Options for expats and digital nomads
ITALY - Visa Options for expats and digital nomads
 

Data modelingzone geoffrey-clark-v2

  • 1. Physical Database Design for MPP and Columnar Databases Geoffrey Clark Principal at Lucidata, Inc. September 2013 copywrite, Lucidata, 2013
  • 2. Conceptual, Logical, Physical
    • Conceptual links to Business Strategy.
    – This is now becoming more quantitative
    • Logical maps to the Business Semantics.
    – Con-way example
    • Physical maps to your Data Stores
    – These will be more varied and heterogeneous in the future, due to specialization.
  • 3. HBR Business Strategy
    The New Dynamics of Competition, Michael D. Ryall, Harvard Business Review, June 2013
    Michael Porter’s Five Forces has dominated strategic and competitive analysis since 1979. This analysis has largely been conceptual in nature.
    Quantitative analysis on structured data in context is changing the nature of business culture, and improving business decisions.
    This drives the demand for data modeling and management.
  • 4. Design and Evolution
    • Hierarchies
    – 14th Century Europe and the Financial Revolution
    – Aggregations & Allocations
    • Cards, Tapes – physical analog media
    • Computer Science
    – Moore’s Law
    • Processor Speed Improvements
    • Memory Improvements
    • Media Improvements – Punch Cards, Tape, Disk, Memory
    • Design for Context & the Future
    – Character encoding - Internationalization
    – Calendars – Gregorian, Fiscal, Lunar, ... Y2K?
    • Files and Fields
    – Separation of Data and Metadata
    – Modern versions -> XML, JSON
    • Joins!
    – Data Sets – Super types, Sub types
    – Associations describe Networks!
  • 6. ... and Demand Forecast
  • 7. Separation of Church and State
    • Operational uses
    – Capture the data, hand-entered <- validation
    – A Data Flow, such as Order to Cash cycle
    – Con-way example of PRO(-gressive) numbers
    • Analytical uses
    – Desire for reports, Reporting crashes the Operational cycle, Cash flow problem.
    – Banished from OLTP, go make an ODS
  • 8. The Star Schema
    The purpose of business computers is to sort data. A graphical representation of sorted data is called a ‘Star Schema’. – Michael Silves, Principal at Datamorphosis
    • The right design at the right time, becomes default doctrine for DW
    – Early RDBMS (Relational Data Base Management Systems)
    • Low memory, slow disks, slow CPU
    • Big Demand, with questions that spanned the datasets
    • Performance issues over large datasets
    – Interview Business people to get questions
    • Pre-process the data, based on business questions
    – Separation into Dimensions and Facts/Metrics
    • Link to Business Semantics
    • OLAP (On-Line Analytical Processing)
    • Educate Users on Aggregation and Allocation
    • Conformed Dimensions across Departments to give an Enterprise-wide view of the data.
    • But as technology changes, problems emerge
    – Ad-hoc questions require redesign & rework
    – With business hierarchies when one concept is both a fact & dimension, e.g. Shipment
    – Fact tables become difficult to distribute for MPP ... e.g. Teradata prefers a normalized DW
    • Example – transportation networks
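The separation into Dimensions and Facts/Metrics on slide 8 can be illustrated with a small sketch. This is not taken from the deck: the table contents, the column names (`carrier_key`, `revenue`, `mode`), and the helper `revenue_by` are all hypothetical, chosen only to show a fact table being aggregated through a dimension.

```python
from collections import defaultdict

# Hypothetical dimension table: surrogate key -> descriptive attributes.
carrier_dim = {
    1: {"name": "Carrier A", "mode": "LTL"},
    2: {"name": "Carrier B", "mode": "Truckload"},
    3: {"name": "Carrier C", "mode": "LTL"},
}

# Hypothetical fact table: one row per shipment, holding keys and metrics only.
shipment_facts = [
    {"carrier_key": 1, "revenue": 100.0},
    {"carrier_key": 2, "revenue": 250.0},
    {"carrier_key": 3, "revenue": 50.0},
    {"carrier_key": 1, "revenue": 25.0},
]


def revenue_by(attribute):
    """Aggregate the revenue metric by one attribute of the carrier dimension."""
    totals = defaultdict(float)
    for fact in shipment_facts:
        dim_row = carrier_dim[fact["carrier_key"]]  # join fact -> dimension
        totals[dim_row[attribute]] += fact["revenue"]
    return dict(totals)
```

Calling `revenue_by("mode")` or `revenue_by("name")` answers two different business questions from the same fact rows, which is the appeal of the pattern. A question that needs an attribute missing from the dimension is where the redesign and rework mentioned on the slide comes in.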
  • 9. Example – Multi-Modal Freight
    • Shipments are agreements between a Carrier and a Shipper to move goods between two places.
    • Shipments can be split into “ProFreight” (which is assigned a cost via activity-based costing).
    • Shipments/ProFreight are composed of Freight handling units.
    • Freight can be “re-tendered” to another carrier, in which case it is linked to the original and the new Shipment.
    • Freight moves between places on one or many “VFCs” or Containers.
    • Containers are moved between places on Trips.
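The re-tendering rule on slide 9 is the part that makes Shipment awkward to model as a pure fact or a pure dimension. A minimal sketch, assuming hypothetical class and field names (`FreightUnit`, `Shipment`, `retendered_from`, `retender`) that do not come from the deck, and leaving out ProFreight, Containers, and Trips for brevity:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FreightUnit:
    """A freight handling unit (hypothetical minimal fields)."""
    unit_id: str


@dataclass
class Shipment:
    """Agreement between a carrier and a shipper to move goods between two places."""
    shipment_id: str
    carrier: str
    shipper: str
    origin: str
    destination: str
    freight: List[FreightUnit] = field(default_factory=list)
    # Set when this shipment carries freight re-tendered from another carrier.
    retendered_from: Optional["Shipment"] = None


def retender(original: Shipment, new_carrier: str, new_id: str) -> Shipment:
    """Create a shipment for another carrier; the freight stays linked to both."""
    return Shipment(
        shipment_id=new_id,
        carrier=new_carrier,
        shipper=original.shipper,
        origin=original.origin,
        destination=original.destination,
        freight=original.freight,  # same FreightUnit objects, shared by both shipments
        retendered_from=original,
    )
```

Because the same freight units hang off two shipments, the model is a network rather than a strict hierarchy.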
  • 10. Kimball on Transportation, 3NF
  • 11. Kimball on Transportation, Star
  • 12. Table Level DW diagram
  • 13. Dim Modeling Dogma
    • “Our carefully normalized data model can not be translated into a star schema... “
    – Dimensional modeling is necessary in order to generate correct queries
    – Any (normalized) data model can be transformed into a dimensional model...
    – ... and there exists an algorithm to do it
  • 16. Bridge table (remember, we tried this)
    We tried this with hesmith
    When selecting a main hierarchy has too much of a downside, and you don’t have a weight factor …
  • 19. Basic DW diagram
  • 20. Build Dimensional Model in BI
  • 21. Freight moves through Networks
  • 22. Information Factory & MPP
    • Normalized Base
    – Integrate data once
    • Source -> Normalized -> Denormalized -> OK
    • Source -> Denormalized? -> Un-normalized -> ?
    – Detect problems and fix them once!
    • Does not preclude Data Marts
    • Massively Parallel Processing
    – Data distribution
    • Optimizations – Broadcast, Co-location, Re-distribution
    • Scalability, the quest for 1:1
    • Normalized data - reduced IO, better match for
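The data distribution and co-location points on slide 22 can be sketched as follows. This is a toy model under stated assumptions, not how any particular MPP product works: the node count, the CRC32 hash, and the function names (`node_for`, `distribute`, `colocated_join`) are hypothetical.

```python
import zlib
from collections import defaultdict

NUM_NODES = 4  # hypothetical cluster size


def node_for(key: str) -> int:
    """Map a distribution key to a node with a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % NUM_NODES


def distribute(rows, key_column):
    """Place each row on the node chosen by hashing its distribution key."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[key_column])].append(row)
    return nodes


def colocated_join(left_nodes, right_nodes, key_column):
    """Join node by node; matching keys already share a node, so no row moves."""
    joined = []
    for node in range(NUM_NODES):
        right_by_key = defaultdict(list)
        for right_row in right_nodes.get(node, []):
            right_by_key[right_row[key_column]].append(right_row)
        for left_row in left_nodes.get(node, []):
            for right_row in right_by_key.get(left_row[key_column], []):
                joined.append({**left_row, **right_row})
    return joined
```

When two tables are distributed on the same key, each node can join its own slice independently. When they are not, the engine has to re-distribute one table or broadcast the smaller one, which are the other two optimizations the slide names.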
  • 23. Bob Conway’s Rapid Methodology
  • 24. Core Model with many Roles Transaction Tables Reference Tables
  • 25. Power of Conformed Dimensions
  • 26. Example Data Model & Hierarchy
  • 27. Data Flow and Usage
  • 28. Cubes and In-memory BI
    • Multi-Dimensional OLAP (MOLAP)
    – Drag-and-Drop OLAP environment, analysts become capable of self-service.
    – Dealt with Ragged Hierarchies, common in Financial data such as General Ledger (GL)
    – Limited by memory size
    – Pressure for more dimensionality floods cube size, build times from relational sources exceed load windows ...
    • Relational OLAP (ROLAP)
  • 29. But a network this size choked it
  • 30. Columnar vs Row-wise
    • Physically store data by Column vs Row
    – Rather like Fifth Normal Form.
    – If Semantically Organized, then Rapid Response to user’s ad-hoc aggregation requests.
    – Prefers batch loading, always loads once per column, even if loading one row.
    • Continues to Appear and Operate as a normal Row-wise cousin.
  • 31. Columnar IO example
    Compression becomes much more effective
    Reading a Column is like reading a Row
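Slides 30 and 31 can be made concrete with a small sketch of why column-wise storage helps both aggregation and compression. The sample rows, column names, and helpers (`to_columns`, `run_length_encode`) are hypothetical. Run-length encoding is used only as a simple stand-in for compression, since the deck does not say which scheme a given columnar engine uses.

```python
from itertools import groupby

# Hypothetical row-wise records, as an OLTP system might store them.
rows = [
    {"shipment_id": 1, "mode": "LTL", "weight": 500},
    {"shipment_id": 2, "mode": "LTL", "weight": 750},
    {"shipment_id": 3, "mode": "LTL", "weight": 500},
    {"shipment_id": 4, "mode": "Truckload", "weight": 20000},
    {"shipment_id": 5, "mode": "Truckload", "weight": 18000},
]


def to_columns(records):
    """Pivot row-wise records into one list per column."""
    return {name: [record[name] for record in records] for name in records[0]}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)

# An aggregate touches only the one column it needs, not every field of every row.
total_weight = sum(columns["weight"])

# Values of one column sit next to each other, so repeated values compress well.
encoded_mode = run_length_encode(columns["mode"])
```

The same pivot also suggests why the slide says columnar stores prefer batch loading: adding a single row means appending to every column list.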
  • 32. Design Pattern for Log Data
    Data Stewards for Master Data
    Data Stewards for Metadata
    Architects integrate data and metadata
    Architects organize data for analysis with physical in mind
    Architects identify levels for analysis, and distribution
    Columnar MPP
  • 33. Importance of Reference Data
  • 34. Infobright’s Database Landscape 2011
  • 37. Hadoop (Cloudera & Hortonworks)
    “Although it’s true that Hadoop can be valuable as an analytic silo, most organizations will prefer to get the most business value out of Hadoop by integrating it with—or into—their BI, DW, DI, and analytics technology stacks.” – Philip Russom TDWI
    http://tdwi.org/webcasts/2013/04/integrating-hadoop-into-business-intelligence-and-data-warehousing.aspx
  • 38. Hadoop for Analytics?
    Analytics performs best on Structured Data, for good reasons.
    Maintain MPP strengths in the solution through Architecture.
  • 39. Message from Hortonworks (Hadoop)
    “Although it’s true that Hadoop can be valuable as an analytic silo, most organizations will prefer to get the most business value out of Hadoop by integrating it with—or into—their BI, DW, DI, and analytics technology stacks.” – Philip Russom TDWI
    http://tdwi.org/webcasts/2013/04/integrating-hadoop-into-business-intelligence-and-data-warehousing.aspx
  • 40. Hadoop as ETL
  • 41. Data Flow Reference Architecture
  • 42. Message from Neo4J NoSQL
  • 43. Message from MongoDB (NoSQL)
    http://www.slideshare.net/fullscreen/mongodb/schema-design-by-example/1
  • 44. Message from Couchbase (NoSQL)
    http://www.couchbase.com/why-nosql/nosql-database

Editor's Notes

  1. Jeff Kibler @ Infobright