SlideShare ist ein Scribd-Unternehmen logo
1 von 25
The Application of Data Vault to DW2.0 © Dan Linstedt, 2011-2012 all rights reserved
A bit about me… 2 Author, Inventor, Speaker – and part time photographer… 25+ years in the IT industry Worked in DoD, US Gov’t, Fortune 50, and so on… Find out more about the Data Vault: http://www.youtube.com/LearnDataVault http://LearnDataVault.com Full profile on http://www.LinkedIn.com/dlinstedt
Agenda Defining The Needs for the Data Vault DW2.0 Architecture DW2.0 Drivers for Data Modeling Divergence of Data Models over Time Data Vault in DW2.0 Defining the Data Vault What does one look like? Modeling in DW2.0 Applying Data Vault to Global DW2.0 Applying Data Vault to Time-Value DW2.0 Compliance in DW2.0 Applying Data Vault to System of Record The Paradox of DW2.0 Volume, Latency, Complexity,Normalization andTransformation ability 10/5/2011 Do Not Duplicate Without Written Permission 3
DW2.0 Architecture 10/5/2011 Do Not Duplicate Without Written Permission 4 Enterprise Service Bus ESB Connectivity: ,[object Object]
EII
ETL / ELT
Web ServicesCube  Processing Temporal Indexing Semantic Management Active  Data Mining Transformation Active Cleansing Unstructured Data: ,[object Object]
Plain Text
Word Docs
ImagesM E T A D A T A Interactive Tactical Data Models Must be consistently applied throughout all layers. Integrated Strategic ESB Management: ,[object Object]
Email
Spread Sheets
Transaction
Structured InformationNear-Line Extended Archival Historical Enterprise Data Warehouse
DW2.0 Drivers for Data Modeling 10/5/2011 Do Not Duplicate Without Written Permission 5 Technical Drivers Business Drivers Flexibility Compliance Volume Frequency Data Model Data Model Understandability Granularity Data Models are one of the main integration points between Technical and Business drivers. Business Keys drive understandability, and granularity Normalization drives flexibility, and frequency of load Raw data sets in the EDW/ADW drive compliance and volume
Divergence of Data Models over Time Data models (both logical and physical) have diverged from business drivers and direction over time. The Data Models have driven towards physical improvements instead of towards business improvements. The Data Vault Architecture drives data modeling back to the business sides of the house. 10/5/2011 Do Not Duplicate Without Written Permission 6
Agenda Defining The Needs for the Data Vault DW2.0 Architecture DW2.0 Drivers for Data Modeling Divergence of Data Models over Time Data Vault in DW2.0 Defining the Data Vault What does one look like? Modeling in DW2.0 Applying Data Vault to Global DW2.0 Applying Data Vault to Time-Value DW2.0 Compliance in DW2.0 Applying Data Vault to System of Record The Paradox of DW2.0 Volume, Latency, Complexity,Normalization andTransformation ability 10/5/2011 Do Not Duplicate Without Written Permission 7 Image is from - What The Bleep Do We Know?
Defining the Data Vault 10/5/2011 Do Not Duplicate Without Written Permission 8 The Data Vault is a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business.  It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent and adaptable to the needs of the enterprise. It is a data model that is architected specifically to meet the needs of today’s enterprise data warehouses. Defining the Data Vault TDAN.com Article
What Does One Look Like? 10/5/2011 Do Not Duplicate Without Written Permission 9 Records a history of the interaction Account Information Sat Sat Sat Link Account F(x) F(x) Sat Sat Invoice ID Sat F(x) Sat Invoice / Billing Information Customer Information Sat Elements: ,[object Object]
Link
SatelliteSat Customer F(x) Sat The impact of linking disparate systems together, is inside the shaded area.
Modeling in DW2.0 Bill Says: DW2.0 must be brought down to a very finite level of detail. The starting point for DW2.0 is the modeling process. The data model applies to the integrated sector, the near line sector, and the archival sector. The way that data warehouses are built is in an incremental manner The Data Vault specializes in: Providing finite grain at the lowest level possible, Mapping business process models to data models Existing in all sectors simultaneously without changes. Flexibility and managing change so that impacts are not a mile-wide and 10 miles deep. 10/5/2011 Do Not Duplicate Without Written Permission 10
Elements in a Data Vault Hub Unique List of Business Keys, tracked by the first time the warehouse saw them appear. Link Relationships between business keys, also representing a grain shift, or a hierarchical roll-up. Satellite Data over time, granular, and descriptive about the business key.  Also setup according to type of information, and rate of change. 10/5/2011 Do Not Duplicate Without Written Permission 11
Applying the Data Vault to Global DW2.0 10/5/2011 Do Not Duplicate Without Written Permission 12 Manufacturing EDW  in China Planning in Brazil Hub Hub Link Sat Sat Link Sat Sat Link Hub Link Hub Hub Sat Sat Sat Sat Sat Sat Sat Sat Base EDW Created in Corporate Financials in USA
Applying the Data Vault to Time-Value DW2.0 10/5/2011 Do Not Duplicate Without Written Permission 13 Satellite Data Over Time Row 1 Row 2 Row 3 Row 4 Satellite entities in the Data Vault house data over time.  They are split by type of information and rate of change.  This is an example set of data for a customer name satellite.

Weitere ähnliche Inhalte

Was ist angesagt?

Présentation data vault et bi v20120508
Présentation data vault et bi v20120508Présentation data vault et bi v20120508
Présentation data vault et bi v20120508Empowered Holdings, LLC
 
ETL VS ELT.pdf
ETL VS ELT.pdfETL VS ELT.pdf
ETL VS ELT.pdfBOSupport
 
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
(OTW13) Agile Data Warehousing: Introduction to Data Vault ModelingKent Graziano
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseDatabricks
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture DesignKujambu Murugesan
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureJames Serra
 
Data Vault: Data Warehouse Design Goes Agile
Data Vault: Data Warehouse Design Goes AgileData Vault: Data Warehouse Design Goes Agile
Data Vault: Data Warehouse Design Goes AgileDaniel Upton
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureDatabricks
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 
Owning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseOwning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseData Con LA
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architectureSudheer Kondla
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesIvo Andreev
 
Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookJames Serra
 
Building Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerBuilding Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerDatabricks
 

Was ist angesagt? (20)

Présentation data vault et bi v20120508
Présentation data vault et bi v20120508Présentation data vault et bi v20120508
Présentation data vault et bi v20120508
 
Dimensional Modelling
Dimensional ModellingDimensional Modelling
Dimensional Modelling
 
Data Vault Introduction
Data Vault IntroductionData Vault Introduction
Data Vault Introduction
 
ETL VS ELT.pdf
ETL VS ELT.pdfETL VS ELT.pdf
ETL VS ELT.pdf
 
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
 
Modern Data architecture Design
Modern Data architecture DesignModern Data architecture Design
Modern Data architecture Design
 
Building an Effective Data Warehouse Architecture
Building an Effective Data Warehouse ArchitectureBuilding an Effective Data Warehouse Architecture
Building an Effective Data Warehouse Architecture
 
Data Vault: Data Warehouse Design Goes Agile
Data Vault: Data Warehouse Design Goes AgileData Vault: Data Warehouse Design Goes Agile
Data Vault: Data Warehouse Design Goes Agile
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
Owning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseOwning Your Own (Data) Lake House
Owning Your Own (Data) Lake House
 
Operational Data Vault
Operational Data VaultOperational Data Vault
Operational Data Vault
 
Data platform architecture
Data platform architectureData platform architecture
Data platform architecture
 
Data engineering
Data engineeringData engineering
Data engineering
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best Practices
 
Data Mesh
Data MeshData Mesh
Data Mesh
 
Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future Outlook
 
Building Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics PrimerBuilding Lakehouses on Delta Lake with SQL Analytics Primer
Building Lakehouses on Delta Lake with SQL Analytics Primer
 
What is ETL?
What is ETL?What is ETL?
What is ETL?
 

Andere mochten auch

IRM UK - 2009: DV Modeling And Methodology
IRM UK - 2009: DV Modeling And MethodologyIRM UK - 2009: DV Modeling And Methodology
IRM UK - 2009: DV Modeling And MethodologyEmpowered Holdings, LLC
 
Best Practices: Data Admin & Data Management
Best Practices: Data Admin & Data ManagementBest Practices: Data Admin & Data Management
Best Practices: Data Admin & Data ManagementEmpowered Holdings, LLC
 
Oracle Database Vault
Oracle Database VaultOracle Database Vault
Oracle Database VaultKhalid ALLILI
 
Data vault seminar May 5-6 Dommel - The factory and the workshop
Data vault seminar May 5-6 Dommel - The factory and the workshopData vault seminar May 5-6 Dommel - The factory and the workshop
Data vault seminar May 5-6 Dommel - The factory and the workshopjohannesvdb
 
Atul Randive CV_IKnowSolutions_ENv2
Atul Randive CV_IKnowSolutions_ENv2Atul Randive CV_IKnowSolutions_ENv2
Atul Randive CV_IKnowSolutions_ENv2atul randive
 
Data Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part FourData Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part FourHans Hultgren
 
Lean Data Warehouse via Data Vault
Lean Data Warehouse via Data VaultLean Data Warehouse via Data Vault
Lean Data Warehouse via Data VaultDaniel Upton
 
Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011Hans Hultgren
 
Data Vault ReConnect Speed Presenting AM Part One
Data Vault ReConnect Speed Presenting AM Part OneData Vault ReConnect Speed Presenting AM Part One
Data Vault ReConnect Speed Presenting AM Part OneHans Hultgren
 
Data Vault ReConnect Speed Presenting AM Part Two
Data Vault ReConnect Speed Presenting AM Part TwoData Vault ReConnect Speed Presenting AM Part Two
Data Vault ReConnect Speed Presenting AM Part TwoHans Hultgren
 
Data Vault ReConnect Speed Presenting PM Part Three
Data Vault ReConnect Speed Presenting PM Part ThreeData Vault ReConnect Speed Presenting PM Part Three
Data Vault ReConnect Speed Presenting PM Part ThreeHans Hultgren
 
Guru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesGuru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesCGI
 
Metadaten und Data Vault (Meta Vault)
Metadaten und Data Vault (Meta Vault)Metadaten und Data Vault (Meta Vault)
Metadaten und Data Vault (Meta Vault)Andreas Buckenhofer
 
CDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der AutomobilindustrieCDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der AutomobilindustrieAndreas Buckenhofer
 
Agile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data PresentationAgile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data PresentationVishal Kumar
 

Andere mochten auch (19)

Data vault what's Next: Part 2
Data vault what's Next: Part 2Data vault what's Next: Part 2
Data vault what's Next: Part 2
 
IRM UK - 2009: DV Modeling And Methodology
IRM UK - 2009: DV Modeling And MethodologyIRM UK - 2009: DV Modeling And Methodology
IRM UK - 2009: DV Modeling And Methodology
 
Data vault: What's Next
Data vault: What's NextData vault: What's Next
Data vault: What's Next
 
Best Practices: Data Admin & Data Management
Best Practices: Data Admin & Data ManagementBest Practices: Data Admin & Data Management
Best Practices: Data Admin & Data Management
 
Visual Data Vault
Visual Data VaultVisual Data Vault
Visual Data Vault
 
Oracle Database Vault
Oracle Database VaultOracle Database Vault
Oracle Database Vault
 
Data vault seminar May 5-6 Dommel - The factory and the workshop
Data vault seminar May 5-6 Dommel - The factory and the workshopData vault seminar May 5-6 Dommel - The factory and the workshop
Data vault seminar May 5-6 Dommel - The factory and the workshop
 
Atul Randive CV_IKnowSolutions_ENv2
Atul Randive CV_IKnowSolutions_ENv2Atul Randive CV_IKnowSolutions_ENv2
Atul Randive CV_IKnowSolutions_ENv2
 
Data Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part FourData Vault ReConnect Speed Presenting PM Part Four
Data Vault ReConnect Speed Presenting PM Part Four
 
Lean Data Warehouse via Data Vault
Lean Data Warehouse via Data VaultLean Data Warehouse via Data Vault
Lean Data Warehouse via Data Vault
 
Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011
 
Data Vault ReConnect Speed Presenting AM Part One
Data Vault ReConnect Speed Presenting AM Part OneData Vault ReConnect Speed Presenting AM Part One
Data Vault ReConnect Speed Presenting AM Part One
 
Data Vault ReConnect Speed Presenting AM Part Two
Data Vault ReConnect Speed Presenting AM Part TwoData Vault ReConnect Speed Presenting AM Part Two
Data Vault ReConnect Speed Presenting AM Part Two
 
Data Vault ReConnect Speed Presenting PM Part Three
Data Vault ReConnect Speed Presenting PM Part ThreeData Vault ReConnect Speed Presenting PM Part Three
Data Vault ReConnect Speed Presenting PM Part Three
 
Guru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best PracticesGuru4Pro Data Vault Best Practices
Guru4Pro Data Vault Best Practices
 
Metadaten und Data Vault (Meta Vault)
Metadaten und Data Vault (Meta Vault)Metadaten und Data Vault (Meta Vault)
Metadaten und Data Vault (Meta Vault)
 
CDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der AutomobilindustrieCDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
CDC und Data Vault für den Aufbau eines DWH in der Automobilindustrie
 
Big Data Modeling
Big Data ModelingBig Data Modeling
Big Data Modeling
 
Agile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data PresentationAgile Data Warehouse Design for Big Data Presentation
Agile Data Warehouse Design for Big Data Presentation
 

Ähnlich wie Data Vault and DW2.0

Data Virtualization: From Zero to Hero
Data Virtualization: From Zero to HeroData Virtualization: From Zero to Hero
Data Virtualization: From Zero to HeroDenodo
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An IntroductionDenodo
 
Logical Data Warehouse and Data Lakes
Logical Data Warehouse and Data Lakes Logical Data Warehouse and Data Lakes
Logical Data Warehouse and Data Lakes Denodo
 
Introduction to data vault ilja dmitrijev
Introduction to data vault   ilja dmitrijevIntroduction to data vault   ilja dmitrijev
Introduction to data vault ilja dmitrijevIlja Dmitrijevs
 
Introduction to Modern Data Virtualization 2021 (APAC)
Introduction to Modern Data Virtualization 2021 (APAC)Introduction to Modern Data Virtualization 2021 (APAC)
Introduction to Modern Data Virtualization 2021 (APAC)Denodo
 
Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...
Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...
Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...Denodo
 
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesData Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesDenodo
 
Data API as a Foundation for Systems of Engagement
Data API as a Foundation for Systems of EngagementData API as a Foundation for Systems of Engagement
Data API as a Foundation for Systems of EngagementVictor Olex
 
Data warehouse 2.0 and sql server architecture and vision
Data warehouse 2.0 and sql server architecture and visionData warehouse 2.0 and sql server architecture and vision
Data warehouse 2.0 and sql server architecture and visionKlaudiia Jacome
 
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?Denodo
 
Traditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A ComparisonTraditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A ComparisonCapgemini
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An IntroductionDenodo
 
The technology of the business data lake
The technology of the business data lakeThe technology of the business data lake
The technology of the business data lakeCapgemini
 
Webinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafkaWebinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafkaJeffrey T. Pollock
 
Why Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionWhy Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionDenodo
 
Data Vault 2.0 Demystified: East Coast Tour
Data Vault 2.0 Demystified: East Coast TourData Vault 2.0 Demystified: East Coast Tour
Data Vault 2.0 Demystified: East Coast TourWhereScape
 
Sql server briefing sept
Sql server briefing septSql server briefing sept
Sql server briefing septMark Kromer
 
An Overview of Data Lake
An Overview of Data LakeAn Overview of Data Lake
An Overview of Data LakeIRJET Journal
 
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerLogical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerDataWorks Summit
 

Ähnlich wie Data Vault and DW2.0 (20)

Data vault
Data vaultData vault
Data vault
 
Data Virtualization: From Zero to Hero
Data Virtualization: From Zero to HeroData Virtualization: From Zero to Hero
Data Virtualization: From Zero to Hero
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
Logical Data Warehouse and Data Lakes
Logical Data Warehouse and Data Lakes Logical Data Warehouse and Data Lakes
Logical Data Warehouse and Data Lakes
 
Introduction to data vault ilja dmitrijev
Introduction to data vault   ilja dmitrijevIntroduction to data vault   ilja dmitrijev
Introduction to data vault ilja dmitrijev
 
Introduction to Modern Data Virtualization 2021 (APAC)
Introduction to Modern Data Virtualization 2021 (APAC)Introduction to Modern Data Virtualization 2021 (APAC)
Introduction to Modern Data Virtualization 2021 (APAC)
 
Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...
Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...
Data Ninja Webinar Series: Accelerating Business Value with Data Virtualizati...
 
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data LakesData Ninja Webinar Series: Realizing the Promise of Data Lakes
Data Ninja Webinar Series: Realizing the Promise of Data Lakes
 
Data API as a Foundation for Systems of Engagement
Data API as a Foundation for Systems of EngagementData API as a Foundation for Systems of Engagement
Data API as a Foundation for Systems of Engagement
 
Data warehouse 2.0 and sql server architecture and vision
Data warehouse 2.0 and sql server architecture and visionData warehouse 2.0 and sql server architecture and vision
Data warehouse 2.0 and sql server architecture and vision
 
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
¿Cómo modernizar una arquitectura de TI con la virtualización de datos?
 
Traditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A ComparisonTraditional BI vs. Business Data Lake – A Comparison
Traditional BI vs. Business Data Lake – A Comparison
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An Introduction
 
The technology of the business data lake
The technology of the business data lakeThe technology of the business data lake
The technology of the business data lake
 
Webinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafkaWebinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafka
 
Why Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionWhy Data Virtualization? An Introduction
Why Data Virtualization? An Introduction
 
Data Vault 2.0 Demystified: East Coast Tour
Data Vault 2.0 Demystified: East Coast TourData Vault 2.0 Demystified: East Coast Tour
Data Vault 2.0 Demystified: East Coast Tour
 
Sql server briefing sept
Sql server briefing septSql server briefing sept
Sql server briefing sept
 
An Overview of Data Lake
An Overview of Data LakeAn Overview of Data Lake
An Overview of Data Lake
 
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services LayerLogical Data Warehouse: How to Build a Virtualized Data Services Layer
Logical Data Warehouse: How to Build a Virtualized Data Services Layer
 

Kürzlich hochgeladen

Progress Report - Oracle Database Analyst Summit
Progress  Report - Oracle Database Analyst SummitProgress  Report - Oracle Database Analyst Summit
Progress Report - Oracle Database Analyst SummitHolger Mueller
 
A DAY IN THE LIFE OF A SALESMAN / WOMAN
A DAY IN THE LIFE OF A  SALESMAN / WOMANA DAY IN THE LIFE OF A  SALESMAN / WOMAN
A DAY IN THE LIFE OF A SALESMAN / WOMANIlamathiKannappan
 
BEST ✨ Call Girls In Indirapuram Ghaziabad ✔️ 9871031762 ✔️ Escorts Service...
BEST ✨ Call Girls In  Indirapuram Ghaziabad  ✔️ 9871031762 ✔️ Escorts Service...BEST ✨ Call Girls In  Indirapuram Ghaziabad  ✔️ 9871031762 ✔️ Escorts Service...
BEST ✨ Call Girls In Indirapuram Ghaziabad ✔️ 9871031762 ✔️ Escorts Service...noida100girls
 
The Coffee Bean & Tea Leaf(CBTL), Business strategy case study
The Coffee Bean & Tea Leaf(CBTL), Business strategy case studyThe Coffee Bean & Tea Leaf(CBTL), Business strategy case study
The Coffee Bean & Tea Leaf(CBTL), Business strategy case studyEthan lee
 
Ensure the security of your HCL environment by applying the Zero Trust princi...
Ensure the security of your HCL environment by applying the Zero Trust princi...Ensure the security of your HCL environment by applying the Zero Trust princi...
Ensure the security of your HCL environment by applying the Zero Trust princi...Roland Driesen
 
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service DewasVip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewasmakika9823
 
Sales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessSales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessAggregage
 
Monthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptxMonthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptxAndy Lambert
 
Best VIP Call Girls Noida Sector 40 Call Me: 8448380779
Best VIP Call Girls Noida Sector 40 Call Me: 8448380779Best VIP Call Girls Noida Sector 40 Call Me: 8448380779
Best VIP Call Girls Noida Sector 40 Call Me: 8448380779Delhi Call girls
 
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service JamshedpurVIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service JamshedpurSuhani Kapoor
 
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...Dipal Arora
 
Tech Startup Growth Hacking 101 - Basics on Growth Marketing
Tech Startup Growth Hacking 101  - Basics on Growth MarketingTech Startup Growth Hacking 101  - Basics on Growth Marketing
Tech Startup Growth Hacking 101 - Basics on Growth MarketingShawn Pang
 
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdfRenandantas16
 
Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...
Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...
Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...lizamodels9
 
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature SetCreating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature SetDenis Gagné
 
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesMysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesDipal Arora
 
RE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechRE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechNewman George Leech
 
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...anilsa9823
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation SlidesKeppelCorporation
 

Kürzlich hochgeladen (20)

Progress Report - Oracle Database Analyst Summit
Progress  Report - Oracle Database Analyst SummitProgress  Report - Oracle Database Analyst Summit
Progress Report - Oracle Database Analyst Summit
 
A DAY IN THE LIFE OF A SALESMAN / WOMAN
A DAY IN THE LIFE OF A  SALESMAN / WOMANA DAY IN THE LIFE OF A  SALESMAN / WOMAN
A DAY IN THE LIFE OF A SALESMAN / WOMAN
 
BEST ✨ Call Girls In Indirapuram Ghaziabad ✔️ 9871031762 ✔️ Escorts Service...
BEST ✨ Call Girls In  Indirapuram Ghaziabad  ✔️ 9871031762 ✔️ Escorts Service...BEST ✨ Call Girls In  Indirapuram Ghaziabad  ✔️ 9871031762 ✔️ Escorts Service...
BEST ✨ Call Girls In Indirapuram Ghaziabad ✔️ 9871031762 ✔️ Escorts Service...
 
The Coffee Bean & Tea Leaf(CBTL), Business strategy case study
The Coffee Bean & Tea Leaf(CBTL), Business strategy case studyThe Coffee Bean & Tea Leaf(CBTL), Business strategy case study
The Coffee Bean & Tea Leaf(CBTL), Business strategy case study
 
Ensure the security of your HCL environment by applying the Zero Trust princi...
Ensure the security of your HCL environment by applying the Zero Trust princi...Ensure the security of your HCL environment by applying the Zero Trust princi...
Ensure the security of your HCL environment by applying the Zero Trust princi...
 
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service DewasVip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
Vip Dewas Call Girls #9907093804 Contact Number Escorts Service Dewas
 
Sales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for SuccessSales & Marketing Alignment: How to Synergize for Success
Sales & Marketing Alignment: How to Synergize for Success
 
Monthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptxMonthly Social Media Update April 2024 pptx.pptx
Monthly Social Media Update April 2024 pptx.pptx
 
Best VIP Call Girls Noida Sector 40 Call Me: 8448380779
Best VIP Call Girls Noida Sector 40 Call Me: 8448380779Best VIP Call Girls Noida Sector 40 Call Me: 8448380779
Best VIP Call Girls Noida Sector 40 Call Me: 8448380779
 
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service JamshedpurVIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
VIP Call Girl Jamshedpur Aashi 8250192130 Independent Escort Service Jamshedpur
 
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
Call Girls Navi Mumbai Just Call 9907093804 Top Class Call Girl Service Avail...
 
Tech Startup Growth Hacking 101 - Basics on Growth Marketing
Tech Startup Growth Hacking 101  - Basics on Growth MarketingTech Startup Growth Hacking 101  - Basics on Growth Marketing
Tech Startup Growth Hacking 101 - Basics on Growth Marketing
 
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf0183760ssssssssssssssssssssssssssss00101011 (27).pdf
0183760ssssssssssssssssssssssssssss00101011 (27).pdf
 
Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...
Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...
Call Girls In DLf Gurgaon ➥99902@11544 ( Best price)100% Genuine Escort In 24...
 
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature SetCreating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
Creating Low-Code Loan Applications using the Trisotech Mortgage Feature Set
 
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best ServicesMysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
Mysore Call Girls 8617370543 WhatsApp Number 24x7 Best Services
 
Forklift Operations: Safety through Cartoons
Forklift Operations: Safety through CartoonsForklift Operations: Safety through Cartoons
Forklift Operations: Safety through Cartoons
 
RE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman LeechRE Capital's Visionary Leadership under Newman Leech
RE Capital's Visionary Leadership under Newman Leech
 
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
 

Data Vault and DW2.0

  • 1. The Application of Data Vault to DW2.0 © Dan Linstedt, 2011-2012 all rights reserved
  • 2. A bit about me… 2 Author, Inventor, Speaker – and part time photographer… 25+ years in the IT industry Worked in DoD, US Gov’t, Fortune 50, and so on… Find out more about the Data Vault: http://www.youtube.com/LearnDataVault http://LearnDataVault.com Full profile on http://www.LinkedIn.com/dlinstedt
  • 3. Agenda Defining The Needs for the Data Vault DW2.0 Architecture DW2.0 Drivers for Data Modeling Divergence of Data Models over Time Data Vault in DW2.0 Defining the Data Vault What does one look like? Modeling in DW2.0 Applying Data Vault to Global DW2.0 Applying Data Vault to Time-Value DW2.0 Compliance in DW2.0 Applying Data Vault to System of Record The Paradox of DW2.0 Volume, Latency, Complexity,Normalization andTransformation ability 10/5/2011 Do Not Duplicate Without Written Permission 3
  • 4.
  • 5. EII
  • 7.
  • 10.
  • 11. Email
  • 14. Structured InformationNear-Line Extended Archival Historical Enterprise Data Warehouse
  • 15. DW2.0 Drivers for Data Modeling 10/5/2011 Do Not Duplicate Without Written Permission 5 Technical Drivers Business Drivers Flexibility Compliance Volume Frequency Data Model Data Model Understandability Granularity Data Models are one of the main integration points between Technical and Business drivers. Business Keys drive understandability, and granularity Normalization drives flexibility, and frequency of load Raw data sets in the EDW/ADW drive compliance and volume
  • 16. Divergence of Data Models over Time Data models (both logical and physical) have diverged from business drivers and direction over time. The Data Models have driven towards physical improvements instead of towards business improvements. The Data Vault Architecture drives data modeling back to the business sides of the house. 10/5/2011 Do Not Duplicate Without Written Permission 6
  • 17. Agenda Defining The Needs for the Data Vault DW2.0 Architecture DW2.0 Drivers for Data Modeling Divergence of Data Models over Time Data Vault in DW2.0 Defining the Data Vault What does one look like? Modeling in DW2.0 Applying Data Vault to Global DW2.0 Applying Data Vault to Time-Value DW2.0 Compliance in DW2.0 Applying Data Vault to System of Record The Paradox of DW2.0 Volume, Latency, Complexity,Normalization andTransformation ability 10/5/2011 Do Not Duplicate Without Written Permission 7 Image is from - What The Bleep Do We Know?
  • 18. Defining the Data Vault 10/5/2011 Do Not Duplicate Without Written Permission 8 The Data Vault is a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent and adaptable to the needs of the enterprise. It is a data model that is architected specifically to meet the needs of today’s enterprise data warehouses. Defining the Data Vault TDAN.com Article
  • 19.
  • 20. Link
  • 21. SatelliteSat Customer F(x) Sat The impact of linking disparate systems together, is inside the shaded area.
  • 22. Modeling in DW2.0 Bill Says: DW2.0 must be brought down to a very finite level of detail. The starting point for DW2.0 is the modeling process. The data model applies to the integrated sector, the near line sector, and the archival sector. The way that data warehouses are built is in an incremental manner The Data Vault specializes in: Providing finite grain at the lowest level possible, Mapping business process models to data models Existing in all sectors simultaneously without changes. Flexibility and managing change so that impacts are not a mile-wide and 10 miles deep. 10/5/2011 Do Not Duplicate Without Written Permission 10
  • 23. Elements in a Data Vault Hub Unique List of Business Keys, tracked by the first time the warehouse saw them appear. Link Relationships between business keys, also representing a grain shift, or a hierarchical roll-up. Satellite Data over time, granular, and descriptive about the business key. Also setup according to type of information, and rate of change. 10/5/2011 Do Not Duplicate Without Written Permission 11
  • 24. Applying the Data Vault to Global DW2.0 10/5/2011 Do Not Duplicate Without Written Permission 12 Manufacturing EDW in China Planning in Brazil Hub Hub Link Sat Sat Link Sat Sat Link Hub Link Hub Hub Sat Sat Sat Sat Sat Sat Sat Sat Base EDW Created in Corporate Financials in USA
  • 25. Applying the Data Vault to Time-Value DW2.0 10/5/2011 Do Not Duplicate Without Written Permission 13 Satellite Data Over Time Row 1 Row 2 Row 3 Row 4 Satellite entities in the Data Vault house data over time. They are split by type of information and rate of change. This is an example set of data for a customer name satellite.
  • 26. Batch and Real-Time Data Arrival 10/5/2011 Do Not Duplicate Without Written Permission 14 All Inserts All the time Transaction ID Date Stamp Customer Account # Amount Sat Transaction Type Hub Customer Link Transaction Hub Acct Sat Customer Sat Acct 3, 6 or 12 Hr Load Window Batch Load Customer Info Acct Data
  • 27. Star Schema Real-Time Data Issues 10/5/2011 Do Not Duplicate Without Written Permission 15 Updates are REQUIRED! Transaction ID Date Stamp Customer Account # Amount Type 3, 6 or 12 Hr Load Window Dimension Customer Fact Transaction Dimension Account Batch Load Customer Info Acct Data Cleansing & Quality must occur before the data can reach the target tables, cleansing and quality introduce unwanted latency!
  • 28. Compliance in DW2.0 10/5/2011 Do Not Duplicate Without Written Permission 16 Changes to Source Information Source Systems EDW / ADW Data Vault Data Marts Data Delivery Raw Detail = auditable Loads in Real-Time or in Batch Integrated by Business Key Flexible, allows business changes (with little to no impact) No delay in loading data Data type conformity Semantic Integration True Marts Raw Integration Business Rules User or Auditor Continuous Data Improvement Error Mart Quality Direction of Information Flow Master Data (Operational)
  • 29. Applying the Data Vault to System Of Record 10/5/2011 Do Not Duplicate Without Written Permission 17 Master Data or Conformed Dimensions Normalized EDW Source Systems SOR Definition 2 SOR Definition 3 SOR Definition 1 SOR 1 Data Capture, Data Produced by system algorithms SOR 2 Raw Detailed Integrated Data over time, Integrated by Horizontal (functional) Business Key. Auditable. SOR 3 Current view of the business, merged, quality cleansed, single copy, single source, feeds operational systems.
  • 30. DW2.0 Paradoxes DW2.0 incorporates: Unstructured, Semi-Structured, Real-Time, and Batch Data Global views All of which drive volumes of data. Volume causes latency in transformation. Volume is directly proportional to transformation complexity. Real-Time data arrival is inversely proportional to complexity and volume. Time for “quality, cleansing, and transformation” on the way in to the EDW diminishes as near-real-time is approached, or massive volumes of batch data are found within a shrinking batch window. Transformation can destroy data audit ability and compliance of the EDW / ADW. 10/5/2011 Do Not Duplicate Without Written Permission 18
  • 31. DW2.0 Paradoxes - Imagery 10/5/2011 Do Not Duplicate Without Written Permission 19 Drives DW2.0 Real-Time Transactions Unstructured Data Low-Level Grain Pushes Increases Low Latency Volume Fights Requires Merging, Quality, Cleansing Fights Data Model Denormalization Fights Data Model Normalization & Raw Details Inhibits Requires Inhibits Auditability & Compliance Provides
  • 32. DW2.0 Paradox Hypothesis As we reach near-real time, the ability to transform data and “wait” for parent dependencies directly decreases, the data decay rates increase, and therefore can cause data death if not processed in time. Normalization of the data model increases flexibility, and scalability. The closer we get to near-real-time, the more normalized the data model in the EDW/ADW must become. In order to process high volumes of batch data extremely fast, the “business transformations” must be removed from the load stream of the EDW. 10/5/2011 Do Not Duplicate Without Written Permission 20
  • 33. Data Vault Volumetrics 10/5/2011 Do Not Duplicate Without Written Permission 21 Volumetrics (10% null Data) Upon Initial Investigation, the 12 month growth rate for new customers is 197.4 MB per year…. Now let’s factor in the DELTA’s.
  • 34. Data Vault Growth 10/5/2011 Do Not Duplicate Without Written Permission 22 Volumetrics (10% null Data) – Delta Growth Only Original Dimension: 497.16 MB per Year New Data Vault:317.03 MB Per Year
  • 35. Data Vault VS Dimension Growth 10/5/2011 Do Not Duplicate Without Written Permission 23 How does the extensive growth rate affect queries?
  • 36. Summarization Business: Lack of a single view of a customer, product, service, etc... Lack of visibility into ALL information across the enterprise. Competition does it better, faster, cheaper. Unable to identify and forecast business trends and their impacts. WHERE’S THE KNOWLEDGE? OR IS IT JUST ALL DATA? 10/5/2011 Do Not Duplicate Without Written Permission 24 Technical: Near-Real-Time (Active) Huge Data Volumes Massive Data Dis-Integration Spread-Marts Convergence of Operational and Strategic Questions Duplication of data in the ODS, Warehouse, and Data Marts! Dimension-itis!! ODS Ulcer! Fact Table Granularity JUNK tables, Helper Tables
  • 37. Where To Learn More The Technical Modeling Book: http://LearnDataVault.com The Discussion Forums: & eventshttp://LinkedIn.com – Data Vault Discussions Contact me:http://DanLinstedt.com - web siteDanLinstedt@gmail.com - email World wide User Group (Free)http://dvusergroup.com 25