Agile Data Warehouse Modeling:
Introduction to Data Vault Modeling
Kent Graziano
Data Warrior LLC
Twitter @KentGraziano
Agenda
 Bio
 What do we mean by Agile?
 What is a Data Vault?
 Where does it fit in an Oracle BI architecture?
 How to design a Data Vault model
 Being “agile”
My Bio
 Oracle ACE Director
 Certified Data Vault Master and DV 2.0 Architect
 Blogger: Oracle Data Warrior
 Data Architecture and Data Warehouse Specialist
● 30+ years in IT
● 20+ years of Oracle-related work
● 15+ years of data warehousing experience
 Co-Author of
● The Business of Data Vault Modeling
● The Data Model Resource Book (1st Edition)
 Editor of “The” Data Vault Book
 Past-President of ODTUG and Rocky Mountain Oracle
User Group
Manifesto for Agile Software Development
 “We are uncovering better ways of developing
software by doing it and helping others do it.
 Through this work we have come to value:
 Individuals and interactions over processes and
tools
 Working software over comprehensive
documentation
 Customer collaboration over contract negotiation
 Responding to change over following a plan
 That is, while there is value in the items on the right,
we value the items on the left more.”
 http://agilemanifesto.org/
Applying the Agile Manifesto to DW
 User Stories instead of
requirements documents
 Time-boxed iterations
● Iteration has a standard length
● Choose one or more user stories to fit in that
iteration
 Rework is part of the game
● There are no “missed requirements”... only
those that haven’t been delivered or
discovered yet.
Data Vault Definition
The Data Vault is a detail oriented, historical tracking
and uniquely linked set of normalized tables that
support one or more functional areas of business.
It is a hybrid approach encompassing the best of
breed between 3rd normal form (3NF) and star
schema. The design is flexible, scalable, consistent
and adaptable to the needs of the enterprise.
Dan Linstedt: Defining the Data Vault
TDAN.com Article
Architected specifically to meet the needs
of today’s enterprise data warehouses
What is Data Vault Trying to Solve?
 What are our other Enterprise
Data Warehouse options?
● Third-Normal Form (3NF): Complex
primary keys (PK’s) with cascading
snapshot dates
● Star Schema (Dimensional): Difficult to
reengineer fact tables for granularity
changes
 Difficult to get it right the first
time
 Not adaptable to rapid
business change
 NOT AGILE!
(C) Kent Graziano
Data Vault Time Line
(Timeline figure spanning 1960 to 2000; entries as labeled on the slide)
● E.F. Codd invented relational modeling
● Chris Date and Hugh Darwen maintained and refined modeling
● Mid 60's: Dimension & Fact Modeling presented by General Mills and Dartmouth University
● Early 70's: Bill Inmon began discussing Data Warehousing
● Mid 70's: AC Nielsen popularized Dimension & Fact terms
● 1976: Dr Peter Chen created E-R Diagramming
● Mid 80's: Bill Inmon popularizes Data Warehousing
● Mid to late 80's: Dr Kimball popularizes Star Schema
● Late 80's: Barry Devlin and Dr Kimball release "Business Data Warehouse"
● 1990: Dan Linstedt begins R&D on Data Vault Modeling
● 2000: Dan Linstedt releases first 5 articles on Data Vault Modeling
© LearnDataVault.com
Data Vault Evolution
 The work on the Data Vault approach began in the
early 1990s and was completed around 1999.
 Throughout 1999, 2000, and 2001, the Data Vault
design was tested, refined, and deployed into specific
customer sites.
 In 2002, the industry thought leaders were asked to
review the architecture.
● This is when I attended my first DV seminar in Denver and met
Dan!
 In 2003, Dan began teaching the modeling techniques
to the general public.
Where does a Data Vault Fit?
Oracle Information Management Reference
Architecture
 Staging Layer
● Change tables
● Reject tables for Data Quality
● External tables for file feeds
 Foundation Layer
● Transactional granularity
maintained
● Process neutral: no user or
business requirements
● Just recording what happened
 Access and Performance
Layer
● Dimensional model
● “Star Schemas”
● Process specific: targeting user
and business requirements
Where does Data Vault fit?
Data Vault goes here: in the Foundation Layer
What is a Foundation Layer?
 Basis for long term enterprise scale data
warehouse
 Must be atomic level data
● A historical source of facts
 Not based on any one data source or system
 Single point of integration
 Flexible
 Extensible
 Provides data to the access/reporting layer
How to be Agile using DV and Oracle
 Model iteratively
● Use Data Vault data modeling technique
● Create basic components, then add over time
 Virtualize the Access Layer
● Don’t waste time building facts and dimensions up front
● ETL and testing takes too long
● “Project” objects using pattern-based DV model with OBIEE
BMM or Oracle Views
 Users see real reports with real data
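A minimal sketch of the "virtualize the Access Layer" idea, assuming an in-memory SQLite database as a stand-in for Oracle. The table, column, and view names (`hub_customer`, `sat_customer_name`, `dim_customer_v`) are hypothetical, and the hub and satellite shapes follow the structures the deck defines in its later slides. The point is only that a dimension can be projected as a view rather than built and loaded up front.

```python
import sqlite3

# Assumption: SQLite stands in for the Oracle database named in the deck.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (
    hub_cust_seq_id INTEGER PRIMARY KEY,
    hub_cust_num    TEXT NOT NULL UNIQUE,
    load_dts        TEXT NOT NULL,
    rec_src         TEXT NOT NULL
);
CREATE TABLE sat_customer_name (
    hub_cust_seq_id INTEGER NOT NULL REFERENCES hub_customer (hub_cust_seq_id),
    load_dts        TEXT NOT NULL,
    load_end_dts    TEXT,
    cust_name       TEXT NOT NULL,
    rec_src         TEXT NOT NULL,
    PRIMARY KEY (hub_cust_seq_id, load_dts)
);
-- The "dimension" is only a view: nothing to load, nothing to reload.
CREATE VIEW dim_customer_v AS
SELECT h.hub_cust_seq_id AS customer_key,
       h.hub_cust_num    AS customer_number,
       s.cust_name       AS customer_name
FROM hub_customer h
JOIN sat_customer_name s ON s.hub_cust_seq_id = h.hub_cust_seq_id
WHERE s.load_end_dts IS NULL;   -- current row only
""")

conn.execute("INSERT INTO hub_customer VALUES (1, 'C-100', '2013-01-01', 'CRM')")
conn.executemany(
    "INSERT INTO sat_customer_name VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2013-01-01", "2013-02-01", "Acme Inc", "CRM"),
        (1, "2013-02-01", None, "Acme Corporation", "CRM"),
    ],
)

dim_rows = conn.execute("SELECT * FROM dim_customer_v").fetchall()
print(dim_rows)
```

Because the view reads the vault tables directly, a report built on it shows real data as soon as the hub and satellite are loaded.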
Data Vault: 3 Simple Structures
Data Vault Core Architecture
 Hubs = Unique List of Business Keys
 Links = Unique List of Relationships across
keys
 Satellites = Descriptive Data
 Satellites have one and only one parent table
 Satellites cannot be “Parents” to other tables
 Hubs cannot be child tables
1. Hub = Business Keys
Hubs = Unique Lists of Business Keys
Business Keys are used to
TRACK and IDENTIFY key information
Hub Definition
 What Makes a Hub Key?
● A Hub is based on an identifiable business key.
● An identifiable business key is an attribute that is used in
the source systems to locate data.
● The business key has a very low propensity to change,
and usually is not editable on the source systems.
● The business key has the same semantic meaning, and
the same granularity across the company, but not
necessarily the same format.
 Attributes and Ordering
● All attributes are mandatory.
● Sequence ID 1st, Business Key 2nd, Load Date 3rd, Record
Source last (4th).
● All attributes in the Business Key form a UNIQUE Index.
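The Hub rules above can be sketched as a table plus a load routine. This is an illustration under assumptions, not the deck's own implementation: SQLite stands in for Oracle, and `hub_customer`, its column names, and the `load_hub` helper are hypothetical. The column order follows the slide (sequence ID, business key, load date, record source) and the business key carries a unique index.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE hub_customer (
    hub_cust_seq_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- sequence ID (1st)
    hub_cust_num    TEXT NOT NULL,                      -- business key (2nd)
    load_dts        TEXT NOT NULL,                      -- load date (3rd)
    rec_src         TEXT NOT NULL                       -- record source (last)
)""")
conn.execute("CREATE UNIQUE INDEX hub_customer_bk ON hub_customer (hub_cust_num)")


def load_hub(conn, business_keys, load_dts, rec_src):
    """Insert only the business keys the hub has not seen before."""
    for key in business_keys:
        conn.execute(
            "INSERT INTO hub_customer (hub_cust_num, load_dts, rec_src) "
            "SELECT ?, ?, ? WHERE NOT EXISTS "
            "(SELECT 1 FROM hub_customer WHERE hub_cust_num = ?)",
            (key, load_dts, rec_src, key),
        )


load_hub(conn, ["C-100", "C-200"], "2013-01-01", "CRM")
# C-200 already exists, so only C-300 is added by the second source.
load_hub(conn, ["C-200", "C-300"], "2013-01-02", "BILLING")

hub_rows = conn.execute(
    "SELECT hub_cust_num, rec_src FROM hub_customer ORDER BY hub_cust_seq_id"
).fetchall()
print(hub_rows)
```

The hub keeps the load date and record source of the first arrival of each key, which is why C-200 stays attributed to CRM.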
2: Links = Associations
Links =
Transactions and
Associations
They are used to
hook together
multiple sets of
information
Link Definitions
 What Makes a Link?
● A Link is based on identifiable business element
relationships.
● Otherwise known as a foreign key,
● AKA a business event or transaction between business keys,
● The relationship shouldn’t change over time
● It is established as a fact that occurred at a specific point in time
and will remain that way forever.
● The link table may also represent a hierarchy.
 Attributes
● All attributes are mandatory
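A Link as described above can be sketched as a table that holds only hub keys plus load metadata. The example assumes SQLite in place of Oracle, and the customer/product hubs, the `lnk_customer_product` table, and the `load_link` helper are all hypothetical names chosen for illustration. The hubs are trimmed to their keys to keep the sketch short.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hubs trimmed to their keys for brevity.
CREATE TABLE hub_customer (hub_cust_seq_id INTEGER PRIMARY KEY,
                           hub_cust_num TEXT NOT NULL UNIQUE);
CREATE TABLE hub_product  (hub_prod_seq_id INTEGER PRIMARY KEY,
                           hub_prod_num TEXT NOT NULL UNIQUE);
CREATE TABLE lnk_customer_product (
    lnk_seq_id      INTEGER PRIMARY KEY AUTOINCREMENT,
    hub_cust_seq_id INTEGER NOT NULL REFERENCES hub_customer (hub_cust_seq_id),
    hub_prod_seq_id INTEGER NOT NULL REFERENCES hub_product (hub_prod_seq_id),
    load_dts        TEXT NOT NULL,
    rec_src         TEXT NOT NULL,
    UNIQUE (hub_cust_seq_id, hub_prod_seq_id)   -- one row per relationship
);
INSERT INTO hub_customer VALUES (1, 'C-100');
INSERT INTO hub_product  VALUES (10, 'P-1'), (11, 'P-2');
""")


def load_link(conn, cust_num, prod_num, load_dts, rec_src):
    """Resolve business keys to hub IDs and record the relationship once."""
    conn.execute(
        "INSERT OR IGNORE INTO lnk_customer_product "
        "(hub_cust_seq_id, hub_prod_seq_id, load_dts, rec_src) "
        "SELECT c.hub_cust_seq_id, p.hub_prod_seq_id, ?, ? "
        "FROM hub_customer c, hub_product p "
        "WHERE c.hub_cust_num = ? AND p.hub_prod_num = ?",
        (load_dts, rec_src, cust_num, prod_num),
    )


load_link(conn, "C-100", "P-1", "2013-01-01", "ORDERS")
load_link(conn, "C-100", "P-1", "2013-01-02", "ORDERS")  # already known: ignored
load_link(conn, "C-100", "P-2", "2013-01-02", "ORDERS")

link_rows = conn.execute(
    "SELECT hub_cust_seq_id, hub_prod_seq_id, load_dts "
    "FROM lnk_customer_product ORDER BY lnk_seq_id"
).fetchall()
print(link_rows)
```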
Modeling Links - 1:1 or 1:M?
 Today:
● Relationship is a 1:1 so why model a Link?
 Tomorrow:
● The business rule can change to a 1:M.
● You discover new data later.
 With a Link in the Data Vault:
● No need to change the EDW structure.
● Existing data is fine.
● New data is added.
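To make the 1:1 versus 1:M point concrete, here is a small sketch under stated assumptions: SQLite stands in for Oracle, and the order/invoice scenario and the `lnk_order_invoice` table are invented for illustration. The relationship starts as 1:1, then the business rule changes to 1:M, and the only thing that happens in the vault is that new rows arrive.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; the order/invoice case is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE lnk_order_invoice (
    hub_order_seq_id   INTEGER NOT NULL,
    hub_invoice_seq_id INTEGER NOT NULL,
    load_dts           TEXT NOT NULL,
    rec_src            TEXT NOT NULL,
    PRIMARY KEY (hub_order_seq_id, hub_invoice_seq_id)
)""")
schema_before = conn.execute("PRAGMA table_info(lnk_order_invoice)").fetchall()

# Today: one invoice per order (1:1).
conn.executemany(
    "INSERT INTO lnk_order_invoice VALUES (?, ?, ?, ?)",
    [(1, 101, "2013-01-01", "ERP"), (2, 102, "2013-01-01", "ERP")],
)

# Tomorrow: split invoicing is allowed (1:M). Only new data is added.
conn.execute("INSERT INTO lnk_order_invoice VALUES (1, 103, '2013-06-01', 'ERP')")

schema_after = conn.execute("PRAGMA table_info(lnk_order_invoice)").fetchall()
invoices_per_order = dict(
    conn.execute(
        "SELECT hub_order_seq_id, COUNT(*) FROM lnk_order_invoice "
        "GROUP BY hub_order_seq_id"
    ).fetchall()
)
print(schema_before == schema_after, invoices_per_order)
```

Had the relationship been modeled as a foreign key column on the order, the same business change would have required altering the table and reworking its load.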
3. Satellites = Descriptors
Satellites provide
context for the
Hubs and the
Links
Satellite Definitions
 What Makes a Satellite?
● A Satellite is based on non-identifying business
elements.
● The Satellite data changes, sometimes rapidly,
sometimes slowly.
● The Satellite is dependent on the Hub or Link key as
a parent,
● Satellites are never dependent on more than one parent table.
● The Satellite is never a parent table to any other table (no
snowflaking).
 Attributes and Ordering
● All attributes are mandatory – EXCEPT END DATE.
● Parent ID 1st, Load Date 2nd, Load End Date
3rd, Record Source last.
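The attribute rules above can be sketched as a table definition. This assumes SQLite in place of Oracle, and the `sat_customer_name` table and its columns are hypothetical. Every column is mandatory except the load end date, and the columns follow the slide's ordering with the descriptive attribute placed before the record source.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE sat_customer_name (
    hub_cust_seq_id INTEGER NOT NULL,  -- parent ID (1st), the only foreign key
    load_dts        TEXT NOT NULL,     -- load date (2nd)
    load_end_dts    TEXT,              -- load end date (3rd), the one optional column
    cust_name       TEXT NOT NULL,     -- descriptive, non-identifying attribute
    rec_src         TEXT NOT NULL,     -- record source (last)
    PRIMARY KEY (hub_cust_seq_id, load_dts)
)""")

# An open-ended row (no end date yet) is valid.
conn.execute(
    "INSERT INTO sat_customer_name VALUES (1, '2013-01-01', NULL, 'Acme Inc', 'CRM')"
)

# A row missing a mandatory attribute (here the record source) is rejected.
try:
    conn.execute(
        "INSERT INTO sat_customer_name VALUES (1, '2013-02-01', NULL, 'Acme Inc', NULL)"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM sat_customer_name").fetchone()[0]
print(rejected, row_count)
```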
Satellite Entity- Details
 A Satellite has only 1 foreign key; it is dependent on
the parent table (Hub or Link)
 A Satellite may or may not have an “Item
Numbering” attribute.
 A Satellite’s Load Date represents the date the
EDW saw the data (must be a delta set).
● This is not Effective Date from the Source!
 A Satellite’s Record Source represents the actual
source of the row (unit of work).
 To avoid Outer Joins, you must ensure that every
satellite has at least 1 entry for every Hub Key.
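The last rule (at least one satellite entry per Hub key, so inner joins lose nothing) can be checked mechanically. The sketch below assumes SQLite in place of Oracle; the table names, the helper functions, and in particular the `'UNKNOWN'` placeholder value and `'SYSTEM'` record source are assumptions for illustration, not something the deck prescribes.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; all names and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (hub_cust_seq_id INTEGER PRIMARY KEY,
                           hub_cust_num TEXT NOT NULL UNIQUE);
CREATE TABLE sat_customer_name (
    hub_cust_seq_id INTEGER NOT NULL,
    load_dts        TEXT NOT NULL,
    load_end_dts    TEXT,
    cust_name       TEXT NOT NULL,
    rec_src         TEXT NOT NULL,
    PRIMARY KEY (hub_cust_seq_id, load_dts)
);
INSERT INTO hub_customer VALUES (1, 'C-100'), (2, 'C-200');
INSERT INTO sat_customer_name VALUES (1, '2013-01-01', NULL, 'Acme Inc', 'CRM');
""")


def hubs_without_satellite(conn):
    """Return hub IDs that have no row at all in the satellite."""
    rows = conn.execute(
        "SELECT h.hub_cust_seq_id FROM hub_customer h "
        "WHERE NOT EXISTS (SELECT 1 FROM sat_customer_name s "
        "                  WHERE s.hub_cust_seq_id = h.hub_cust_seq_id) "
        "ORDER BY h.hub_cust_seq_id"
    ).fetchall()
    return [row[0] for row in rows]


def backfill_placeholders(conn, load_dts):
    """Give every uncovered hub key one placeholder satellite row."""
    for hub_id in hubs_without_satellite(conn):
        conn.execute(
            "INSERT INTO sat_customer_name VALUES (?, ?, NULL, 'UNKNOWN', 'SYSTEM')",
            (hub_id, load_dts),
        )


missing_before = hubs_without_satellite(conn)
backfill_placeholders(conn, "2013-01-02")
missing_after = hubs_without_satellite(conn)

# With full coverage, an inner join returns every hub key.
joined = conn.execute(
    "SELECT COUNT(*) FROM hub_customer h "
    "JOIN sat_customer_name s ON s.hub_cust_seq_id = h.hub_cust_seq_id"
).fetchone()[0]
print(missing_before, missing_after, joined)
```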
Data Vault Model Flexibility (Agility)
 Goes beyond standard 3NF
• Hyper normalized
● Hubs and Links only hold keys and meta data
● Satellites split by rate of change and/or source
• Enables Agile data modeling
● Easy to add to model without having to change existing
structures and load routines
• Relationships (links) can be dropped and created on-demand.
● No more reloading history because of a missed requirement
 Based on natural business keys
• Not system surrogate keys
• Allows for integrating data across functions and source
systems more easily
● All data relationships are key driven.
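The claim that the model can grow without changing existing structures can be illustrated with a new satellite. This is a sketch under assumptions: SQLite stands in for Oracle, and the tables (including the new `sat_customer_credit`) and the `table_ddl` helper are hypothetical. A new requirement is met by adding a table next to the existing ones, and the existing definitions are compared before and after.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (
    hub_cust_seq_id INTEGER PRIMARY KEY,
    hub_cust_num    TEXT NOT NULL UNIQUE,
    load_dts        TEXT NOT NULL,
    rec_src         TEXT NOT NULL
);
CREATE TABLE sat_customer_name (
    hub_cust_seq_id INTEGER NOT NULL REFERENCES hub_customer (hub_cust_seq_id),
    load_dts        TEXT NOT NULL,
    load_end_dts    TEXT,
    cust_name       TEXT NOT NULL,
    rec_src         TEXT NOT NULL,
    PRIMARY KEY (hub_cust_seq_id, load_dts)
);
""")


def table_ddl(conn):
    """Return {table name: CREATE statement} from SQLite's catalog."""
    return dict(
        conn.execute(
            "SELECT name, sql FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    )


ddl_before = table_ddl(conn)

# New requirement: credit ratings from another source, changing at their own rate.
# It becomes a new satellite on the same hub; nothing existing is altered.
conn.execute("""
CREATE TABLE sat_customer_credit (
    hub_cust_seq_id INTEGER NOT NULL REFERENCES hub_customer (hub_cust_seq_id),
    load_dts        TEXT NOT NULL,
    load_end_dts    TEXT,
    credit_rating   TEXT NOT NULL,
    rec_src         TEXT NOT NULL,
    PRIMARY KEY (hub_cust_seq_id, load_dts)
)""")

ddl_after = table_ddl(conn)
existing_unchanged = all(ddl_after[name] == sql for name, sql in ddl_before.items())
new_tables = sorted(set(ddl_after) - set(ddl_before))
print(existing_unchanged, new_tables)
```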
Data Vault Extensibility
Adding new components to
the EDW has NEAR ZERO
impact to:
• Existing Loading
Processes
• Existing Data Model
• Existing Reporting & BI
Functions
• Existing Source Systems
• Existing Star Schemas
and Data Marts
Data Vault Productivity
 Standardized modeling rules
• Highly repeatable and learnable modeling technique
• Can standardize load routines
● Delta Driven process
● Re-startable, consistent loading patterns.
• Can standardize extract routines
● Rapid build of new or revised Data Marts
• Can be automated
‣ Can use a BI-meta layer to virtualize the reporting structures
‣ Example: OBIEE Business Model and Mapping tool
‣ Can put views on the DV structures as well
‣ Simulate ODS/3NF or Star Schemas
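A delta-driven, re-startable load routine of the kind this slide mentions might look like the sketch below. It assumes SQLite in place of Oracle, and the `sat_customer_name` table and the `load_satellite_delta` helper are hypothetical. A new satellite row is written only when the incoming value differs from the current one, so running the same batch twice adds nothing.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE sat_customer_name (
    hub_cust_seq_id INTEGER NOT NULL,
    load_dts        TEXT NOT NULL,
    load_end_dts    TEXT,
    cust_name       TEXT NOT NULL,
    rec_src         TEXT NOT NULL,
    PRIMARY KEY (hub_cust_seq_id, load_dts)
)""")


def load_satellite_delta(conn, batch, load_dts, rec_src):
    """Load only changed values; return the number of rows inserted."""
    inserted = 0
    for hub_id, cust_name in batch:
        current = conn.execute(
            "SELECT cust_name FROM sat_customer_name "
            "WHERE hub_cust_seq_id = ? AND load_end_dts IS NULL",
            (hub_id,),
        ).fetchone()
        if current is not None and current[0] == cust_name:
            continue  # unchanged: not part of the delta
        if current is not None:
            # End-date the row that is being superseded.
            conn.execute(
                "UPDATE sat_customer_name SET load_end_dts = ? "
                "WHERE hub_cust_seq_id = ? AND load_end_dts IS NULL",
                (load_dts, hub_id),
            )
        conn.execute(
            "INSERT INTO sat_customer_name VALUES (?, ?, NULL, ?, ?)",
            (hub_id, load_dts, cust_name, rec_src),
        )
        inserted += 1
    return inserted


day1 = [(1, "Acme Inc"), (2, "Globex")]
first_run = load_satellite_delta(conn, day1, "2013-01-01", "CRM")
rerun = load_satellite_delta(conn, day1, "2013-01-01", "CRM")  # restart of same batch
day2 = [(1, "Acme Corporation"), (2, "Globex")]
second_run = load_satellite_delta(conn, day2, "2013-01-02", "CRM")

total_rows = conn.execute("SELECT COUNT(*) FROM sat_customer_name").fetchone()[0]
print(first_run, rerun, second_run, total_rows)
```

Because the same pattern applies to every satellite, a routine like this can be written once and parameterized, which is the sense in which load routines can be standardized.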
Data Vault Adaptability
• The Data Vault holds granular historical relationships.
• Holds all history for all time, allowing any source system feeds to be reconstructed on-demand
• Easy generation of Audit Trails for data lineage and compliance.
• Data Mining can discover new relationships between elements
• Patterns of change emerge from the historical pictures and linkages.
• The Data Vault can be accessed by power-users
Other Benefits of a Data Vault
 Modeling it as a DV forces integration of the Business Keys
upfront.
• Good for organizational alignment.
 An integrated data set with raw data extends its value beyond BI:
• Source for data quality projects
• Source for master data
• Source for data mining
• Source for Data as a Service (DaaS) in an SOA (Service Oriented Architecture).
 Upfront Hub integration simplifies the data integration routines
required to load data marts.
• Helps divide the work a bit.
 It is much easier to implement security on these granular pieces.
 Granular, re-startable processes enable pin-point failure
correction.
 It is designed and optimized for real-time loading in its core
architecture (without any tweaks or mods).
World's Smallest Data Vault
 The Data Vault doesn’t have to be
“BIG”.
 A Data Vault can be built
incrementally.
 Reverse engineering one component
of the existing models is not
uncommon.
 Building one part of the Data Vault,
then changing the marts to feed from
that vault is a best practice.
 The smallest Enterprise Data
Warehouse consists of two tables:
● One Hub,
● One Satellite
Hub Customer: Hub_Cust_Seq_ID, Hub_Cust_Num, Hub_Cust_Load_DTS, Hub_Cust_Rec_Src
Satellite Customer Name: Hub_Cust_Seq_ID, Sat_Cust_Load_DTS, Sat_Cust_Load_End_DTS, Sat_Cust_Name, Sat_Cust_Rec_Src
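The two-table vault on this slide can be built directly. The sketch below uses the column names from the slide; the table identifiers (`Hub_Customer`, `Sat_Customer_Name`), the sample data, the `name_as_of` helper, and the use of SQLite instead of Oracle are assumptions for illustration. The point-in-time lookup shows how even the smallest vault keeps history.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; table names and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Hub_Customer (
    Hub_Cust_Seq_ID   INTEGER PRIMARY KEY,
    Hub_Cust_Num      TEXT NOT NULL UNIQUE,
    Hub_Cust_Load_DTS TEXT NOT NULL,
    Hub_Cust_Rec_Src  TEXT NOT NULL
);
CREATE TABLE Sat_Customer_Name (
    Hub_Cust_Seq_ID       INTEGER NOT NULL REFERENCES Hub_Customer (Hub_Cust_Seq_ID),
    Sat_Cust_Load_DTS     TEXT NOT NULL,
    Sat_Cust_Load_End_DTS TEXT,
    Sat_Cust_Name         TEXT NOT NULL,
    Sat_Cust_Rec_Src      TEXT NOT NULL,
    PRIMARY KEY (Hub_Cust_Seq_ID, Sat_Cust_Load_DTS)
);
INSERT INTO Hub_Customer VALUES (1, 'C-100', '2013-01-01', 'CRM');
INSERT INTO Sat_Customer_Name VALUES
    (1, '2013-01-01', '2013-03-01', 'Acme Inc', 'CRM'),
    (1, '2013-03-01', NULL, 'Acme Corporation', 'CRM');
""")


def name_as_of(conn, cust_num, as_of):
    """Return the customer name the warehouse held on the given ISO date."""
    row = conn.execute(
        "SELECT s.Sat_Cust_Name FROM Hub_Customer h "
        "JOIN Sat_Customer_Name s ON s.Hub_Cust_Seq_ID = h.Hub_Cust_Seq_ID "
        "WHERE h.Hub_Cust_Num = ? AND s.Sat_Cust_Load_DTS <= ? "
        "AND (s.Sat_Cust_Load_End_DTS IS NULL OR s.Sat_Cust_Load_End_DTS > ?)",
        (cust_num, as_of, as_of),
    ).fetchone()
    return row[0] if row else None


name_february = name_as_of(conn, "C-100", "2013-02-15")
name_april = name_as_of(conn, "C-100", "2013-04-01")
name_before_load = name_as_of(conn, "C-100", "2012-12-01")
print(name_february, name_april, name_before_load)
```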
Notably…
 In 2008 Bill Inmon stated that the “Data Vault
is the optimal approach for modeling the EDW
in the DW2.0 framework.” (DW2.0)
 The number of Data Vault users in the US
surpassed 500 in 2010 and grows rapidly
(http://danlinstedt.com/about/dv-customers/)
Organizations using Data Vault
 WebMD Health Services
 Anthem Blue-Cross Blue Shield
 MD Anderson Cancer Center
 Denver Public Schools
 Independent Purchasing Cooperative (IPC, Miami)
• Owner of Subway
 Kaplan
 US Defense Department
 Colorado Springs Utilities
 State Court of Wyoming
 Federal Express
 US Dept. Of Agriculture
What’s New in DV2.0?
 Modeling Structure Includes…
● NoSQL, and Non-Relational DB systems, Hybrid Systems
● Minor Structure Changes to support NoSQL
 New ETL Implementation Standards
● For true real-time support
● For NoSQL support
 New Architecture Standards
● To include support for NoSQL data management systems
 New Methodology Components
● Including CMMI, Six Sigma, and TQM
● Including Project Planning, Tracking, and Oversight
● Agile Delivery Mechanisms
● Standards, and templates for Projects
Conclusion?
Changing the direction of the river takes less
effort than stopping the flow of water
Summary
• Data Vault provides a data
modeling technique that
allows:
‣ Model Agility
‣ Enabling rapid changes and additions
‣ Productivity
‣ Enabling low complexity systems with high
value output at a rapid pace
‣ Easy projections of dimensional models
‣ So? Agile Data Warehousing?
Super Charge Your Data Warehouse
Available on Amazon.com
Soft Cover or Kindle Format
Now also available in PDF at
LearnDataVault.com
Hint: Kent is the Technical
Editor
Data Vault References
www.learndatavault.com
www.danlinstedt.com
On LinkedIn:
http://www.linkedin.com/groups?gid=44926
On YouTube:
www.youtube.com/LearnDataVault
On Facebook:
www.facebook.com/learndatavault
Contact Information
Kent Graziano
The Oracle Data Warrior
Data Warrior LLC
Kent.graziano@att.net
Visit my blog at
http://kentgraziano.com

Weitere ähnliche Inhalte

Was ist angesagt?

Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookJames Serra
 
Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011Hans Hultgren
 
Pentaho Data Integration Introduction
Pentaho Data Integration IntroductionPentaho Data Integration Introduction
Pentaho Data Integration Introductionmattcasters
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureDatabricks
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureDatabricks
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouseJames Serra
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsKhalid Salama
 
Agile Data Engineering - Intro to Data Vault Modeling (2016)
Agile Data Engineering - Intro to Data Vault Modeling (2016)Agile Data Engineering - Intro to Data Vault Modeling (2016)
Agile Data Engineering - Intro to Data Vault Modeling (2016)Kent Graziano
 
3D: DBT using Databricks and Delta
3D: DBT using Databricks and Delta3D: DBT using Databricks and Delta
3D: DBT using Databricks and DeltaDatabricks
 
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Dr. Arif Wider
 
Dimensional modelling-mod-3
Dimensional modelling-mod-3Dimensional modelling-mod-3
Dimensional modelling-mod-3Malik Alig
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?James Serra
 
Glossaries, Dictionaries, and Catalogs Result in Data Governance
Glossaries, Dictionaries, and Catalogs Result in Data GovernanceGlossaries, Dictionaries, and Catalogs Result in Data Governance
Glossaries, Dictionaries, and Catalogs Result in Data GovernanceDATAVERSITY
 
Essential Metadata Strategies
Essential Metadata StrategiesEssential Metadata Strategies
Essential Metadata StrategiesDATAVERSITY
 
The Importance of Metadata
The Importance of MetadataThe Importance of Metadata
The Importance of MetadataDATAVERSITY
 
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...HostedbyConfluent
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseDatabricks
 

Was ist angesagt? (20)

Operational Data Vault
Operational Data VaultOperational Data Vault
Operational Data Vault
 
Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future Outlook
 
Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011Data Warehouse Agility Array Conference2011
Data Warehouse Agility Array Conference2011
 
Pentaho Data Integration Introduction
Pentaho Data Integration IntroductionPentaho Data Integration Introduction
Pentaho Data Integration Introduction
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse ArchitectureIntroduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
 
Architect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh ArchitectureArchitect’s Open-Source Guide for a Data Mesh Architecture
Architect’s Open-Source Guide for a Data Mesh Architecture
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake Analytics
 
Agile Data Engineering - Intro to Data Vault Modeling (2016)
Agile Data Engineering - Intro to Data Vault Modeling (2016)Agile Data Engineering - Intro to Data Vault Modeling (2016)
Agile Data Engineering - Intro to Data Vault Modeling (2016)
 
3D: DBT using Databricks and Delta
3D: DBT using Databricks and Delta3D: DBT using Databricks and Delta
3D: DBT using Databricks and Delta
 
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
Data Mesh in Practice - How Europe's Leading Online Platform for Fashion Goes...
 
Dimensional modelling-mod-3
Dimensional modelling-mod-3Dimensional modelling-mod-3
Dimensional modelling-mod-3
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
 
Glossaries, Dictionaries, and Catalogs Result in Data Governance
Glossaries, Dictionaries, and Catalogs Result in Data GovernanceGlossaries, Dictionaries, and Catalogs Result in Data Governance
Glossaries, Dictionaries, and Catalogs Result in Data Governance
 
Essential Metadata Strategies
Essential Metadata StrategiesEssential Metadata Strategies
Essential Metadata Strategies
 
The Importance of Metadata
The Importance of MetadataThe Importance of Metadata
The Importance of Metadata
 
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
Data Mess to Data Mesh | Jay Kreps, CEO, Confluent | Kafka Summit Americas 20...
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Dimensional Modelling
Dimensional ModellingDimensional Modelling
Dimensional Modelling
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
 

Andere mochten auch

Agile Methods and Data Warehousing
Agile Methods and Data WarehousingAgile Methods and Data Warehousing
Agile Methods and Data WarehousingKent Graziano
 
Extreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
Extreme BI: Creating Virtualized Hybrid Type 1+2 DimensionsExtreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
Extreme BI: Creating Virtualized Hybrid Type 1+2 DimensionsKent Graziano
 
Top Five Cool Features in Oracle SQL Developer Data Modeler
Top Five Cool Features in Oracle SQL Developer Data ModelerTop Five Cool Features in Oracle SQL Developer Data Modeler
Top Five Cool Features in Oracle SQL Developer Data ModelerKent Graziano
 
Worst Practices in Data Warehouse Design
Worst Practices in Data Warehouse DesignWorst Practices in Data Warehouse Design
Worst Practices in Data Warehouse DesignKent Graziano
 
Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)Kent Graziano
 
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODSAgile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODSKent Graziano
 
Data Warehousing 2016
Data Warehousing 2016Data Warehousing 2016
Data Warehousing 2016Kent Graziano
 
Data Vault 2.0: Using MD5 Hashes for Change Data Capture
Data Vault 2.0: Using MD5 Hashes for Change Data CaptureData Vault 2.0: Using MD5 Hashes for Change Data Capture
Data Vault 2.0: Using MD5 Hashes for Change Data CaptureKent Graziano
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Kent Graziano
 
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile ApproachUsing OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile ApproachKent Graziano
 
Introduction to PolyBase
Introduction to PolyBaseIntroduction to PolyBase
Introduction to PolyBaseJames Serra
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 

Andere mochten auch (12)

Agile Methods and Data Warehousing
Agile Methods and Data WarehousingAgile Methods and Data Warehousing
Agile Methods and Data Warehousing
 
Extreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
Extreme BI: Creating Virtualized Hybrid Type 1+2 DimensionsExtreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
Extreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
 
Top Five Cool Features in Oracle SQL Developer Data Modeler
Top Five Cool Features in Oracle SQL Developer Data ModelerTop Five Cool Features in Oracle SQL Developer Data Modeler
Top Five Cool Features in Oracle SQL Developer Data Modeler
 
Worst Practices in Data Warehouse Design
Worst Practices in Data Warehouse DesignWorst Practices in Data Warehouse Design
Worst Practices in Data Warehouse Design
 
Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)Agile Methods and Data Warehousing (2016 update)
Agile Methods and Data Warehousing (2016 update)
 
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODSAgile Data Warehousing: Using SDDM to Build a Virtualized ODS
Agile Data Warehousing: Using SDDM to Build a Virtualized ODS
 
Data Warehousing 2016
Data Warehousing 2016Data Warehousing 2016
Data Warehousing 2016
 
Data Vault 2.0: Using MD5 Hashes for Change Data Capture
Data Vault 2.0: Using MD5 Hashes for Change Data CaptureData Vault 2.0: Using MD5 Hashes for Change Data Capture
Data Vault 2.0: Using MD5 Hashes for Change Data Capture
 
Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)Demystifying Data Warehouse as a Service (DWaaS)
Demystifying Data Warehouse as a Service (DWaaS)
 
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile ApproachUsing OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
 
Introduction to PolyBase
Introduction to PolyBaseIntroduction to PolyBase
Introduction to PolyBase
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 

Ähnlich wie (OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling

Introduction to data vault ilja dmitrijev
Introduction to data vault   ilja dmitrijevIntroduction to data vault   ilja dmitrijev
Introduction to data vault ilja dmitrijevIlja Dmitrijevs
 
Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesIvo Andreev
 
DBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptxDBT ELT approach for Advanced Analytics.pptx
DBT ELT approach for Advanced Analytics.pptxHong Ong
 
Logical Data Fabric and Data Mesh – Driving Business Outcomes
Logical Data Fabric and Data Mesh – Driving Business OutcomesLogical Data Fabric and Data Mesh – Driving Business Outcomes
Logical Data Fabric and Data Mesh – Driving Business OutcomesDenodo
 
Why Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionWhy Data Virtualization? An Introduction
Why Data Virtualization? An IntroductionDenodo
 
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...
Canadian Experts Discuss Modern Data Stacks and Cloud Computing for 5 Years o...Daniel Zivkovic
 
Self-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and DatabricksSelf-serve analytics journey at Celtra: Snowflake, Spark, and Databricks
Self-serve analytics journey at Celtra: Snowflake, Spark, and DatabricksGrega Kespret
 
Data Virtualization: An Introduction
Data Virtualization: An IntroductionData Virtualization: An Introduction
Data Virtualization: An IntroductionDenodo
 
Intro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeIntro to Data Vault 2.0 on Snowflake
Intro to Data Vault 2.0 on SnowflakeKent Graziano
 
Original: Lean Data Model Storming for the Agile Enterprise
Original: Lean Data Model Storming for the Agile EnterpriseOriginal: Lean Data Model Storming for the Agile Enterprise
Original: Lean Data Model Storming for the Agile EnterpriseDaniel Upton
 
Trends in Data Modeling
Trends in Data ModelingTrends in Data Modeling
Trends in Data ModelingDATAVERSITY
 
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...How a Time Series Database Contributes to a Decentralized Cloud Object Storag...
How a Time Series Database Contributes to a Decentralized Cloud Object Storag...InfluxData
 
Exploiting the Data / Code Duality with Dali
Exploiting the Data / Code Duality with DaliExploiting the Data / Code Duality with Dali
Exploiting the Data / Code Duality with DaliCarl Steinbach
 
Building the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InBuilding the Enterprise Data Lake - Important Considerations Before You Jump In
Building the Enterprise Data Lake - Important Considerations Before You Jump InSnapLogic
 
Artifacts, Data Dictionary, Data Modeling, Data Wrangling
Artifacts, Data Dictionary, Data Modeling, Data WranglingArtifacts, Data Dictionary, Data Modeling, Data Wrangling
Artifacts, Data Dictionary, Data Modeling, Data WranglingFaisal Akbar
 
CWIN 17 / sessions data vault modeling - f2-f - nishat gupta
CWIN 17 / sessions data vault modeling -  f2-f - nishat guptaCWIN 17 / sessions data vault modeling -  f2-f - nishat gupta
CWIN 17 / sessions data vault modeling - f2-f - nishat guptaCapgemini
 
Data Science Operationalization: The Journey of Enterprise AI
Data Science Operationalization: The Journey of Enterprise AIData Science Operationalization: The Journey of Enterprise AI
Data Science Operationalization: The Journey of Enterprise AIDenodo
 
A Modern Interface for Data Science on Postgres/Greenplum - Greenplum Summit ...
A Modern Interface for Data Science on Postgres/Greenplum - Greenplum Summit ...A Modern Interface for Data Science on Postgres/Greenplum - Greenplum Summit ...
A Modern Interface for Data Science on Postgres/Greenplum - Greenplum Summit ...VMware Tanzu
 


(OTW13) Agile Data Warehousing: Introduction to Data Vault Modeling

  • 1. Agile Data Warehouse Modeling: Introduction to Data Vault Modeling Kent Graziano Data Warrior LLC Twitter @KentGraziano
  • 2. Agenda  Bio  What do we mean by Agile?  What is a Data Vault?  Where does it fit in an Oracle BI architecture  How to design a Data Vault model  Being “agile”
  • 3. My Bio  Oracle ACE Director  Certified Data Vault Master and DV 2.0 Architect  Blogger: Oracle Data Warrior  Data Architecture and Data Warehouse Specialist ● 30+ years in IT ● 20+ years of Oracle-related work ● 15+ years of data warehousing experience  Co-Author of ● The Business of Data Vault Modeling ● The Data Model Resource Book (1st Edition)  Editor of “The” Data Vault Book  Past-President of ODTUG and Rocky Mountain Oracle User Group
  • 4. Manifesto for Agile Software Development  “We are uncovering better ways of developing software by doing it and helping others do it.  Through this work we have come to value:  Individuals and interactions over processes and tools  Working software over comprehensive documentation  Customer collaboration over contract negotiation  Responding to change over following a plan  That is, while there is value in the items on the right, we value the items on the left more.”  http://agilemanifesto.org/
  • 5. Applying the Agile Manifesto to DW  User Stories instead of requirements documents  Time-boxed iterations ● Iteration has a standard length ● Choose one or more user stories to fit in that iteration  Rework is part of the game ● There are no “missed requirements”... only those that haven’t been delivered or discovered yet.
  • 6. Data Vault Definition The Data Vault is a detail oriented, historical tracking and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach encompassing the best of breed between 3rd normal form (3NF) and star schema. The design is flexible, scalable, consistent and adaptable to the needs of the enterprise. Dan Linstedt: Defining the Data Vault TDAN.com Article Architected specifically to meet the needs of today’s enterprise data warehouses
  • 7. What is Data Vault Trying to Solve?  What are our other Enterprise Data Warehouse options? ● Third-Normal Form (3NF): Complex primary keys (PK’s) with cascading snapshot dates ● Star Schema (Dimensional): Difficult to reengineer fact tables for granularity changes  Difficult to get it right the first time  Not adaptable to rapid business change  NOT AGILE! (C) Kent Graziano
  • 8. Data Vault Time Line (1960 to 2000)  E.F. Codd invented relational modeling  Chris Date and Hugh Darwen maintained and refined modeling  Mid 60’s: Dimension & Fact Modeling presented by General Mills and Dartmouth University  Early 70’s: Bill Inmon began discussing Data Warehousing  Mid 70’s: AC Nielsen popularized Dimension & Fact terms  1976: Dr Peter Chen created E-R Diagramming  Mid 80’s: Bill Inmon popularizes Data Warehousing  Mid to Late 80’s: Dr Kimball popularizes Star Schema  Late 80’s: Barry Devlin and Dr Kimball release “Business Data Warehouse”  1990: Dan Linstedt begins R&D on Data Vault Modeling  2000: Dan Linstedt releases first 5 articles on Data Vault Modeling © LearnDataVault.com
  • 9. Data Vault Evolution  The work on the Data Vault approach began in the early 1990s and was completed around 1999.  Throughout 1999, 2000, and 2001, the Data Vault design was tested, refined, and deployed at specific customer sites.  In 2002, the industry thought leaders were asked to review the architecture. ● This is when I attended my first DV seminar in Denver and met Dan!  In 2003, Dan began teaching the modeling techniques to the general public. (C) Kent Graziano
  • 10. Where does a Data Vault Fit? © LearnDataVault.com
  • 11. Oracle Information Management Reference Architecture  Staging Layer ● Change tables ● Reject tables for Data Quality ● External tables for file feeds  Foundation Layer ● Transactional granularity maintained ● Process neutral: no user or business requirements ● Just recording what happened  Access and Performance Layer ● Dimensional model ● “Star Schemas” ● Process specific: targeting user and business requirements
  • 12. Where does Data Vault fit? Data Vault goes here
  • 13. What is a Foundation Layer?  Basis for long term enterprise scale data warehouse  Must be atomic level data ● A historical source of facts  Not based on any one data source or system  Single point of integration  Flexible  Extensible  Provides data to the access/reporting layer (C) Kent Graziano
  • 14. How to be Agile using DV and Oracle  Model iteratively ● Use Data Vault data modeling technique ● Create basic components, then add over time  Virtualize the Access Layer ● Don’t waste time building facts and dimensions up front ● ETL and testing takes too long ● “Project” objects using pattern-based DV model with OBIEE BMM or Oracle Views  Users see real reports with real data (C) Kent Graziano
  • 15. Data Vault: 3 Simple Structures © LearnDataVault.com
  • 16. Data Vault Core Architecture  Hubs = Unique List of Business Keys  Links = Unique List of Relationships across keys  Satellites = Descriptive Data  Satellites have one and only one parent table  Satellites cannot be “Parents” to other tables  Hubs cannot be child tables © LearnDataVault.com
  • 17. 1. Hub = Business Keys Hubs = Unique Lists of Business Keys Business Keys are used to TRACK and IDENTIFY key information (C) Kent Graziano
  • 18. Hub Definition  What Makes a Hub Key? ● A Hub is based on an identifiable business key. ● An identifiable business key is an attribute that is used in the source systems to locate data. ● The business key has a very low propensity to change, and usually is not editable on the source systems. ● The business key has the same semantic meaning, and the same granularity across the company, but not necessarily the same format.  Attributes and Ordering ● All attributes are mandatory. ● Sequence ID 1st, Business Key 2nd, Load Date 3rd, Record Source last (4th). ● All attributes in the Business Key form a UNIQUE Index. © LearnDataVault.com
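The hub rules on slide 18 can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the column names are borrowed from the Hub Customer example on slide 31, while the table name `hub_customer`, the helper `load_hub`, the sample keys, and the source names `CRM` and `BILLING` are hypothetical and not part of the original deck.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")

# Attribute order follows the slide: sequence ID, business key, load date, record source.
conn.execute("""
    CREATE TABLE hub_customer (
        hub_cust_seq_id   INTEGER PRIMARY KEY,   -- sequence ID (1st)
        hub_cust_num      TEXT NOT NULL UNIQUE,  -- business key (2nd), unique index
        hub_cust_load_dts TEXT NOT NULL,         -- load date (3rd)
        hub_cust_rec_src  TEXT NOT NULL          -- record source (last)
    )
""")


def load_hub(conn, business_keys, record_source):
    """Hypothetical loader: insert only business keys the hub has not seen yet."""
    now = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT OR IGNORE INTO hub_customer "
        "(hub_cust_num, hub_cust_load_dts, hub_cust_rec_src) VALUES (?, ?, ?)",
        [(key, now, record_source) for key in business_keys],
    )


load_hub(conn, ["C-100", "C-200"], "CRM")
load_hub(conn, ["C-200", "C-300"], "BILLING")  # C-200 is already known and is skipped

hub_rows = conn.execute(
    "SELECT hub_cust_num, hub_cust_rec_src FROM hub_customer ORDER BY hub_cust_seq_id"
).fetchall()
print(hub_rows)
```

The unique constraint on the business key is what keeps the hub a "unique list of business keys": a key arriving from a second source is simply ignored, and the record source keeps pointing at the system that delivered it first.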
  • 19. 2: Links = Associations Links = Transactions and Associations They are used to hook together multiple sets of information (C) Kent Graziano
  • 20. Link Definitions  What Makes a Link? ● A Link is based on identifiable business element relationships. ● Otherwise known as a foreign key, ● AKA a business event or transaction between business keys, ● The relationship shouldn’t change over time ● It is established as a fact that occurred at a specific point in time and will remain that way forever. ● The link table may also represent a hierarchy.  Attributes ● All attributes are mandatory (C) LearnDataVault.com
  • 21. Modeling Links - 1:1 or 1:M?  Today: ● Relationship is a 1:1 so why model a Link?  Tomorrow: ● The business rule can change to a 1:M. ● You discover new data later.  With a Link in the Data Vault: ● No need to change the EDW structure. ● Existing data is fine. ● New data is added. (C) Kent Graziano
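The point of slide 21 can be shown with a small `sqlite3` sketch: a relationship that looks 1:1 today becomes 1:M later without any change to the table structures. The account hub, the link table `lnk_cust_acct`, its column names, and the sample rows are assumptions made for illustration; the hubs are trimmed to their keys for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hubs trimmed to sequence ID and business key to keep the sketch short.
    CREATE TABLE hub_customer (
        hub_cust_seq_id INTEGER PRIMARY KEY,
        hub_cust_num    TEXT NOT NULL UNIQUE
    );
    CREATE TABLE hub_account (
        hub_acct_seq_id INTEGER PRIMARY KEY,
        hub_acct_num    TEXT NOT NULL UNIQUE
    );
    -- The link holds only keys and metadata; each key pair is recorded once.
    CREATE TABLE lnk_cust_acct (
        lnk_cust_acct_seq_id INTEGER PRIMARY KEY,
        hub_cust_seq_id INTEGER NOT NULL REFERENCES hub_customer (hub_cust_seq_id),
        hub_acct_seq_id INTEGER NOT NULL REFERENCES hub_account (hub_acct_seq_id),
        lnk_load_dts    TEXT NOT NULL,
        lnk_rec_src     TEXT NOT NULL,
        UNIQUE (hub_cust_seq_id, hub_acct_seq_id)
    );
    INSERT INTO hub_customer VALUES (1, 'C-100');
    INSERT INTO hub_account  VALUES (1, 'A-1'), (2, 'A-2');
""")

insert_link = (
    "INSERT INTO lnk_cust_acct "
    "(hub_cust_seq_id, hub_acct_seq_id, lnk_load_dts, lnk_rec_src) VALUES (?, ?, ?, ?)"
)

# Today: the customer has exactly one account, so the relationship looks 1:1.
conn.execute(insert_link, (1, 1, "2013-01-01", "CRM"))

# Tomorrow: the business rule changes and the same customer gets a second account.
# Only a new row is added; no DDL change and no rework of existing data.
conn.execute(insert_link, (1, 2, "2013-06-01", "CRM"))

accounts = [
    row[0]
    for row in conn.execute(
        "SELECT a.hub_acct_num FROM lnk_cust_acct l "
        "JOIN hub_account a ON a.hub_acct_seq_id = l.hub_acct_seq_id "
        "WHERE l.hub_cust_seq_id = 1 ORDER BY a.hub_acct_num"
    )
]
print(accounts)
```

Had the account been modeled as a foreign key column on the customer table instead, the second account would have forced a structural change.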
  • 22. 3. Satellites = Descriptors Satellites provide context for the Hubs and the Links (C) Kent Graziano
  • 23. Satellite Definitions  What Makes a Satellite? ● A Satellite is based on non-identifying business elements. ● The Satellite data changes, sometimes rapidly, sometimes slowly. ● The Satellite is dependent on the Hub or Link key as a parent. ● Satellites are never dependent on more than one parent table. ● The Satellite is never a parent table to any other table (no snowflaking).  Attributes and Ordering ● All attributes are mandatory – EXCEPT END DATE. ● Parent ID 1st, Load Date 2nd, Load End Date 3rd, Record Source last. (C) LearnDataVault.com
  • 24. Satellite Entity- Details  A Satellite has only 1 foreign key; it is dependent on the parent table (Hub or Link)  A Satellite may or may not have an “Item Numbering” attribute.  A Satellite’s Load Date represents the date the EDW saw the data (must be a delta set). ● This is not Effective Date from the Source!  A Satellite’s Record Source represents the actual source of the row (unit of work).  To avoid Outer Joins, you must ensure that every satellite has at least 1 entry for every Hub Key. (C) LearnDataVault.com
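A satellite load that follows the rules on slides 23 and 24 might look like the sketch below, again using `sqlite3`. Column names come from the Satellite Customer Name example on slide 31. The composite primary key, the helper `load_satellite`, the change-detection logic, and the sample values are assumptions; the deck does not prescribe how the delta is detected or how rows are end-dated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sat_cust_name (
        hub_cust_seq_id       INTEGER NOT NULL,  -- parent ID (1st), the only foreign key
        sat_cust_load_dts     TEXT NOT NULL,     -- load date (2nd): when the EDW saw the row
        sat_cust_load_end_dts TEXT,              -- load end date (3rd): the one optional attribute
        sat_cust_name         TEXT NOT NULL,     -- descriptive data
        sat_cust_rec_src      TEXT NOT NULL,     -- record source (last)
        PRIMARY KEY (hub_cust_seq_id, sat_cust_load_dts)  -- assumed key choice
    )
""")


def load_satellite(conn, hub_id, name, load_dts, record_source):
    """Hypothetical delta load: add a row only when the name changed."""
    current = conn.execute(
        "SELECT sat_cust_name FROM sat_cust_name "
        "WHERE hub_cust_seq_id = ? AND sat_cust_load_end_dts IS NULL",
        (hub_id,),
    ).fetchone()
    if current is not None and current[0] == name:
        return False  # unchanged, so nothing is loaded
    # End-date the open row (if any), then insert the new current row.
    conn.execute(
        "UPDATE sat_cust_name SET sat_cust_load_end_dts = ? "
        "WHERE hub_cust_seq_id = ? AND sat_cust_load_end_dts IS NULL",
        (load_dts, hub_id),
    )
    conn.execute(
        "INSERT INTO sat_cust_name VALUES (?, ?, NULL, ?, ?)",
        (hub_id, load_dts, name, record_source),
    )
    return True


results = [
    load_satellite(conn, 1, "Acme Ltd", "2013-01-01", "CRM"),
    load_satellite(conn, 1, "Acme Ltd", "2013-01-02", "CRM"),  # unchanged
    load_satellite(conn, 1, "Acme Inc", "2013-01-03", "CRM"),  # changed
]
history = conn.execute(
    "SELECT sat_cust_name, sat_cust_load_dts, sat_cust_load_end_dts "
    "FROM sat_cust_name ORDER BY sat_cust_load_dts"
).fetchall()
print(results, history)
```

Note that the load date here is the date the warehouse saw the row, not an effective date from the source, which matches the warning on slide 24.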
  • 25. Data Vault Model Flexibility (Agility)  Goes beyond standard 3NF • Hyper normalized ● Hubs and Links only hold keys and meta data ● Satellites split by rate of change and/or source • Enables Agile data modeling ● Easy to add to model without having to change existing structures and load routines • Relationships (links) can be dropped and created on-demand. ● No more reloading history because of a missed requirement  Based on natural business keys • Not system surrogate keys • Allows for integrating data across functions and source systems more easily ● All data relationships are key driven. © LearnDataVault.com
  • 26. Data Vault Extensibility Adding new components to the EDW has NEAR ZERO impact to: • Existing Loading Processes • Existing Data Model • Existing Reporting & BI Functions • Existing Source Systems • Existing Star Schemas and Data Marts © LearnDataVault.com
  • 27. Data Vault Productivity  Standardized modeling rules • Highly repeatable and learnable modeling technique • Can standardize load routines ● Delta Driven process ● Re-startable, consistent loading patterns. • Can standardize extract routines ● Rapid build of new or revised Data Marts • Can be automated ‣ Can use a BI-meta layer to virtualize the reporting structures ‣ Example: OBIEE Business Model and Mapping tool ‣ Can put views on the DV structures as well ‣ Simulate ODS/3NF or Star Schemas (C) Kent Graziano
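Putting views on the Data Vault structures to simulate a star schema can be sketched as follows. The example uses SQLite through Python rather than Oracle views or the OBIEE BMM layer named in the deck, so treat it as an analogy. The view name `dim_customer`, its column aliases, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hub_customer (
        hub_cust_seq_id   INTEGER PRIMARY KEY,
        hub_cust_num      TEXT NOT NULL UNIQUE,
        hub_cust_load_dts TEXT NOT NULL,
        hub_cust_rec_src  TEXT NOT NULL
    );
    CREATE TABLE sat_cust_name (
        hub_cust_seq_id       INTEGER NOT NULL REFERENCES hub_customer (hub_cust_seq_id),
        sat_cust_load_dts     TEXT NOT NULL,
        sat_cust_load_end_dts TEXT,
        sat_cust_name         TEXT NOT NULL,
        sat_cust_rec_src      TEXT NOT NULL,
        PRIMARY KEY (hub_cust_seq_id, sat_cust_load_dts)
    );
    INSERT INTO hub_customer VALUES
        (1, 'C-100', '2013-01-01', 'CRM'),
        (2, 'C-200', '2013-01-01', 'CRM');
    INSERT INTO sat_cust_name VALUES
        (1, '2013-01-01', '2013-01-03', 'Acme Ltd', 'CRM'),
        (1, '2013-01-03', NULL,         'Acme Inc', 'CRM'),
        (2, '2013-01-01', NULL,         'Globex',   'CRM');

    -- A "virtual" dimension: no ETL, just a projection of the current satellite rows.
    CREATE VIEW dim_customer AS
    SELECT h.hub_cust_seq_id AS customer_key,
           h.hub_cust_num    AS customer_number,
           s.sat_cust_name   AS customer_name
    FROM hub_customer h
    JOIN sat_cust_name s ON s.hub_cust_seq_id = h.hub_cust_seq_id
    WHERE s.sat_cust_load_end_dts IS NULL;
""")

dim_rows = conn.execute(
    "SELECT customer_number, customer_name FROM dim_customer ORDER BY customer_number"
).fetchall()
print(dim_rows)
```

Because the view is only a query definition, it can be dropped and redefined as requirements change without reloading any history.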
  • 28. Data Vault Adaptability • The Data Vault holds granular historical relationships. • Holds all history for all time, allowing any source system feeds to be reconstructed on-demand • Easy generation of Audit Trails for data lineage and compliance. • Data Mining can discover new relationships between elements • Patterns of change emerge from the historical pictures and linkages. • The Data Vault can be accessed by power-users © LearnDataVault.com
  • 29. Other Benefits of a Data Vault  Modeling it as a DV forces integration of the Business Keys upfront. • Good for organizational alignment.  An integrated data set with raw data extends its value beyond BI: • Source for data quality projects • Source for master data • Source for data mining • Source for Data as a Service (DaaS) in an SOA (Service Oriented Architecture).  Upfront Hub integration simplifies the data integration routines required to load data marts. • Helps divide the work a bit.  It is much easier to implement security on these granular pieces.  Granular, re-startable processes enable pin-point failure correction.  It is designed and optimized for real-time loading in its core architecture (without any tweaks or mods). © LearnDataVault.com
  • 30.
  • 31. World’s Smallest Data Vault  The Data Vault doesn’t have to be “BIG”.  A Data Vault can be built incrementally.  Reverse engineering one component of the existing models is not uncommon.  Building one part of the Data Vault, then changing the marts to feed from that vault is a best practice.  The smallest Enterprise Data Warehouse consists of two tables: ● One Hub, ● One Satellite  Hub Customer: Hub_Cust_Seq_ID, Hub_Cust_Num, Hub_Cust_Load_DTS, Hub_Cust_Rec_Src  Satellite Customer Name: Hub_Cust_Seq_ID, Sat_Cust_Load_DTS, Sat_Cust_Load_End_DTS, Sat_Cust_Name, Sat_Cust_Rec_Src © LearnDataVault.com
  • 32. Notably…  In 2008 Bill Inmon stated that the “Data Vault is the optimal approach for modeling the EDW in the DW2.0 framework.” (DW2.0)  The number of Data Vault users in the US surpassed 500 in 2010 and grows rapidly (http://danlinstedt.com/about/dv-customers/)
  • 33. Organizations using Data Vault  WebMD Health Services  Anthem Blue-Cross Blue Shield  MD Anderson Cancer Center  Denver Public Schools  Independent Purchasing Cooperative (IPC, Miami) • Owner of Subway  Kaplan  US Defense Department  Colorado Springs Utilities  State Court of Wyoming  Federal Express  US Dept. Of Agriculture
  • 34. What’s New in DV2.0?  Modeling Structure Includes… ● NoSQL, and Non-Relational DB systems, Hybrid Systems ● Minor Structure Changes to support NoSQL  New ETL Implementation Standards ● For true real-time support ● For NoSQL support  New Architecture Standards ● To include support for NoSQL data management systems  New Methodology Components ● Including CMMI, Six Sigma, and TQM ● Including Project Planning, Tracking, and Oversight ● Agile Delivery Mechanisms ● Standards, and templates for Projects © LearnDataVault.com
  • 35. Conclusion? Changing the direction of the river takes less effort than stopping the flow of water © LearnDataVault.com
  • 36. Summary • Data Vault provides a data modeling technique that allows: ‣ Model Agility ‣ Enabling rapid changes and additions ‣ Productivity ‣ Enabling low complexity systems with high value output at a rapid pace ‣ Easy projections of dimensional models ‣ So? Agile Data Warehousing?
  • 37. Super Charge Your Data Warehouse Available on Amazon.com Soft Cover or Kindle Format Now also available in PDF at LearnDataVault.com Hint: Kent is the Technical Editor
  • 38. Data Vault References www.learndatavault.com www.danlinstedt.com On LinkedIn: http://www.linkedin.com/groups?gid=44926 On YouTube: www.youtube.com/LearnDataVault On Facebook: www.facebook.com/learndatavault
  • 39.
  • 40. Contact Information Kent Graziano The Oracle Data Warrior Data Warrior LLC Kent.graziano@att.net Visit my blog at http://kentgraziano.com