• Data Vault Modeling
• DW2.0 & Unstructured Data
• Big Data
• Ensemble Modeling
• Agile DW

Ensemble Modeling & Data Vault

© 2014 Genesee Academy, LLC
USA +1 303 526 0340
Sweden 072 736 8700
Hans@GeneseeAcademy.com
www.GeneseeAcademy.com

2014
Ensemble Modeling & Data Vault
AGENDA

• Ensemble Modeling & Unified Decomposition
• Data Vault Ensemble
• Colors of Data Vault
• Data Vault Hubs, Links and Satellites
• More Information

About Hans Hultgren:
gohansgo
Author, Advisor, Speaker & Industry Analyst; President Genesee Academy LLC, Principal at
Book available on Amazon.com

A Saga of Data Warehousing
Once upon a time data warehousing was becoming more popular and
everyone was eager to build their own. But whenever they tried they failed.
They called upon their best to fix this but they just couldn’t solve the
problem.
They discovered that meeting the needs of the data warehouse meant that
the tables got too big and too hard to work with. They just could not handle
changes over time. If the smallest thing changed it always meant they had
to change the entire table. When just a single attribute was updated they
had to insert a record for all of the attributes. All seemed lost.
But around the world there were rebels who questioned the conventional
wisdom. And their voices were finally heard: Why not separate the things
that change from the things that don’t change?

Ensemble Modeling™
• The constellation of component parts acts as a whole – an Ensemble.
Ensemble: all the parts of a thing taken together, so that each part is considered only in relation to the whole.

• With Ensemble Modeling the Core Business Concepts that we define and
model are represented as a whole – an ensemble – including all of the
component parts.

Based on Unified Decomposition™
• With the EDW, we break things out into parts for flexibility, agility, and
generally to facilitate the capture of things that are either interpreted in
different ways or changing independently of each other.

• At the same time a core premise of data warehousing is integration and
moving to a common standard view of unified concepts. So we also
want to tie things together – Unify.

THE DATA VAULT ENSEMBLE:
APPLYING THE ENSEMBLE

The Data Vault Ensemble
• The Data Vault Ensemble conforms to a single key – embodied in the Hub
construct.

• The component parts for the Data Vault Ensemble include:
– Hub: The Natural Business Key
– Link: The Natural Business Relationships
– Satellite: All Context, Descriptive Data and History
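
As a rough illustration of an ensemble that conforms to a single key, the sketch below groups one Hub with its Satellite rows. It is not taken from the deck: the class names, the field names and the `current_context` helper are assumptions made for this example, and Links are left out to keep it short.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Hub:
    """One row per natural business key; carries no context."""
    business_key: str
    load_ts: datetime
    record_source: str


@dataclass(frozen=True)
class SatelliteRow:
    """One version of context for a hub key, stamped with its load time."""
    business_key: str
    load_ts: datetime
    record_source: str
    context: tuple  # (attribute, value) pairs; names are hypothetical


@dataclass
class Ensemble:
    """A core business concept: one hub plus all of its satellite history."""
    hub: Hub
    satellite_rows: list = field(default_factory=list)

    def current_context(self) -> dict:
        """Merge satellite rows in load order so the latest value wins."""
        merged = {}
        for row in sorted(self.satellite_rows, key=lambda r: r.load_ts):
            merged.update(dict(row.context))
        return merged


# Hypothetical sample data: two context versions for one customer key.
t1, t2 = datetime(2014, 1, 1), datetime(2014, 2, 1)
customer = Ensemble(Hub("C-100", t1, "CRM"))
customer.satellite_rows.append(
    SatelliteRow("C-100", t2, "CRM", (("city", "Denver"),))
)
customer.satellite_rows.append(
    SatelliteRow("C-100", t1, "CRM", (("city", "Boulder"), ("name", "Acme")))
)
```

Every part hangs off the same business key, so the concept can be read as a whole even though its history is stored in separate pieces.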

The Data Vault modeling approach
[Diagram comparing 3NF (Entity), Dimensional (Dim) and Data Vault (HUB, LINK, SAT) constructs. Legend: Core Concept Business Keys; Associations / Relationships; Details / Context]

Modeling Comparison
Star Schema and Snowflake Models:
[Diagram: Sale Fact with Region, Store, Customer, Product, Employee and Vendor tables. Legend: Associations; Business Keys; Details]
• Facts contain all three types of data…
• Dimensions can also contain all types
*** Requires complex loading routines for key dependencies…

Modeling Comparison
3rd Normal Form has the same issue: each construct – or Entity –
typically contains a business key, one or more associations
and also details (context, descriptive data)…

[Diagram: 3NF entities Region, Customer, Store, Sale, Sale LI, Employee, Product and Vendor]

Colors of the Data Vault
[Diagram: Hubs Region, Employee, Customer, Store, Product, Sale and Vendor, connected through Links, with many Satellites (Sat) attached]

Data Vault means thinking differently
• The minimal construct then for an “entity” such as “Customer” is now a Hub with a set of Satellites
[Diagram: Customer entity shown next to a Customer Hub with its Satellites]

Data Vault Modeling Process
• The Modeling Process for creating a Data Vault model includes three primary steps:
1) Identify and Model your Core Business Concepts
• Business interviews are at the heart of this step: What do you do? What are the main things you work with?
• Also find the best/target Natural Business Key
2) Identify and Model your Natural Business Relationships
• Specific, Unique Relationships
• Be considerate of the Unit of Work and Grain
3) Analyze and Design your Context Satellites
• Consider Rate of Change, Type of Data and also the Sources of your data during the design process

Hubs
– A Hub Construct in Data Vault
• contains Business Key
• only the Business Key
• contains No Context
• is always 1:1 with EWBK (Enterprise Wide Business Key)
– A Hub Table contains only
• Business Key
• Surrogate Key (Data Warehouse)
• Load Date / Time Stamp
• Record Source
[Diagram: H_Customer table with columns H_Customer_SID, Business Key, Date/Time Stamp, Record source]
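
The Hub table described above could be sketched as follows, using Python's built-in `sqlite3` module with an in-memory database. This is only an illustration under stated assumptions: the column types, the `load_hub` helper and the sample keys and source names are invented for the example and do not come from the deck.

```python
import sqlite3

# Hub table with the four columns listed above; the SQL types are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE h_customer (
        h_customer_sid INTEGER PRIMARY KEY,   -- surrogate key (data warehouse)
        business_key   TEXT NOT NULL UNIQUE,  -- natural business key
        load_ts        TEXT NOT NULL,         -- load date / time stamp
        record_source  TEXT NOT NULL          -- record source
    )
    """
)


def load_hub(conn, business_keys, load_ts, record_source):
    """Insert business keys not yet in the hub; return how many rows were added."""
    count_sql = "SELECT COUNT(*) FROM h_customer"
    before = conn.execute(count_sql).fetchone()[0]
    conn.executemany(
        "INSERT OR IGNORE INTO h_customer (business_key, load_ts, record_source) "
        "VALUES (?, ?, ?)",
        [(key, load_ts, record_source) for key in business_keys],
    )
    return conn.execute(count_sql).fetchone()[0] - before


# Hypothetical sample loads from two made-up source systems.
first = load_hub(conn, ["C-100", "C-200"], "2014-01-01T00:00:00", "CRM")
second = load_hub(conn, ["C-200", "C-300"], "2014-01-02T00:00:00", "Billing")
```

Because the Hub holds only the key, a key that arrives again from a second source is simply skipped, and the row keeps the load stamp and record source of its first arrival.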

Links
– A Link Construct in Data Vault
• contains Relationship
• only a Relationship
• contains No Context
• is always 1:1 with Relationship
– A Link Table contains only
• 2-n FKs for the Relationship
• Surrogate Key (Data Warehouse)
• Load Date / Time Stamp
• Record Source
– Unique
– Specific
– Natural Business Relationship
[Diagram: H_Customer table (H_Customer_SID, Business Key, Date/Time Stamp, Record source) joined to L_Cust_Class table (L_Cust_Class_SID, H_Customer_SID, H_Sequence2_SID, Date/Time Stamp, Record source)]
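
A Link table along these lines might look like the sketch below, again using `sqlite3` in memory. The second hub `h_class`, the helper functions and the sample keys are hypothetical; the slide only shows the second foreign key as H_Sequence2_SID, so the hub it points to is an assumption made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE h_customer (
        h_customer_sid INTEGER PRIMARY KEY,
        business_key   TEXT NOT NULL UNIQUE,
        load_ts        TEXT NOT NULL,
        record_source  TEXT NOT NULL
    );
    -- Hypothetical second hub, assumed for this example.
    CREATE TABLE h_class (
        h_class_sid    INTEGER PRIMARY KEY,
        business_key   TEXT NOT NULL UNIQUE,
        load_ts        TEXT NOT NULL,
        record_source  TEXT NOT NULL
    );
    -- The link holds only keys, a load stamp and a record source: no context.
    CREATE TABLE l_cust_class (
        l_cust_class_sid INTEGER PRIMARY KEY,
        h_customer_sid   INTEGER NOT NULL REFERENCES h_customer (h_customer_sid),
        h_class_sid      INTEGER NOT NULL REFERENCES h_class (h_class_sid),
        load_ts          TEXT NOT NULL,
        record_source    TEXT NOT NULL,
        UNIQUE (h_customer_sid, h_class_sid)
    );
    """
)


def get_or_create_sid(conn, hub, business_key, load_ts, record_source):
    """Return the surrogate key for a business key, adding the hub row if needed."""
    # `hub` is one of the table names defined above, never user input.
    conn.execute(
        f"INSERT OR IGNORE INTO {hub} (business_key, load_ts, record_source) "
        "VALUES (?, ?, ?)",
        (business_key, load_ts, record_source),
    )
    row = conn.execute(
        f"SELECT {hub}_sid FROM {hub} WHERE business_key = ?", (business_key,)
    ).fetchone()
    return row[0]


def load_link(conn, customer_key, class_key, load_ts, record_source):
    """Record one customer-to-class relationship, at most once per key pair."""
    customer_sid = get_or_create_sid(conn, "h_customer", customer_key, load_ts, record_source)
    class_sid = get_or_create_sid(conn, "h_class", class_key, load_ts, record_source)
    conn.execute(
        "INSERT OR IGNORE INTO l_cust_class "
        "(h_customer_sid, h_class_sid, load_ts, record_source) VALUES (?, ?, ?, ?)",
        (customer_sid, class_sid, load_ts, record_source),
    )
    return customer_sid, class_sid


# Hypothetical sample loads; the repeated pair is ignored by the link.
load_link(conn, "C-100", "GOLD", "2014-01-01T00:00:00", "CRM")
load_link(conn, "C-100", "GOLD", "2014-01-02T00:00:00", "CRM")
load_link(conn, "C-100", "SILVER", "2014-01-03T00:00:00", "CRM")
```

The uniqueness constraint on the pair of hub keys is one way to express that a Link row stands for a unique, specific relationship.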

Satellites
– A Satellite Construct in Data Vault
• contains Context only
• has no FKs beyond its parent key (no relationships)
• Designed by * Rate of Change * Type of Data * System…
– A Satellite Table contains only
• Business Key FK + Load Date / Time Stamp
• Context Data…
• Record Source
[Diagram: S_Customer table (H_Customer_SID, Date/Time Stamp, Context A, Context B, Context C, Context D, Record source) attached to H_Customer table (H_Customer_SID, Business Key, Date/Time Stamp, Record source)]
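
The Satellite structure above can be sketched in the same way. In this example a new Satellite row is written only when the context differs from the latest stored row, which is one common way to keep history; the deck does not prescribe it. The context columns `name` and `city`, the `load_satellite` helper and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE h_customer (
        h_customer_sid INTEGER PRIMARY KEY,
        business_key   TEXT NOT NULL UNIQUE,
        load_ts        TEXT NOT NULL,
        record_source  TEXT NOT NULL
    );
    -- Satellite key = parent hub key + load date / time stamp.
    CREATE TABLE s_customer (
        h_customer_sid INTEGER NOT NULL REFERENCES h_customer (h_customer_sid),
        load_ts        TEXT NOT NULL,
        name           TEXT,            -- hypothetical context attribute
        city           TEXT,            -- hypothetical context attribute
        record_source  TEXT NOT NULL,
        PRIMARY KEY (h_customer_sid, load_ts)
    );
    INSERT INTO h_customer VALUES (1, 'C-100', '2014-01-01T00:00:00', 'CRM');
    """
)


def load_satellite(conn, sid, load_ts, name, city, record_source):
    """Append a row only if the context differs from the latest stored row."""
    latest = conn.execute(
        "SELECT name, city FROM s_customer WHERE h_customer_sid = ? "
        "ORDER BY load_ts DESC LIMIT 1",
        (sid,),
    ).fetchone()
    if latest == (name, city):
        return False  # nothing changed, so no new history row
    conn.execute(
        "INSERT INTO s_customer VALUES (?, ?, ?, ?, ?)",
        (sid, load_ts, name, city, record_source),
    )
    return True


# Hypothetical sample loads: initial context, an unchanged reload, then a change.
r1 = load_satellite(conn, 1, "2014-01-01T00:00:00", "Acme", "Boulder", "CRM")
r2 = load_satellite(conn, 1, "2014-01-02T00:00:00", "Acme", "Boulder", "CRM")
r3 = load_satellite(conn, 1, "2014-01-03T00:00:00", "Acme", "Denver", "CRM")
```

Only the Satellite grows when an attribute changes; the Hub row for the customer is untouched.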

About Data Vault Ensemble

An estimated 800 Data Vault based Data Warehouses around the world

Links and Information
CDVDM Training & Certification
www.GeneseeAcademy.com
Hans@GeneseeAcademy.com

gohansgo

Book DataVaultBook.blogspot.com
HansHultgren.WordPress.com
HansHultgren
DataVaultAcademy

Online video-lesson training

DataVaultAcademy.com