Data Vault
Torsten Glunde
Color Coding: 3NF
Entities: Sale, Customer, Employee, Product, Vendor, Store, Region, Sale LI
Legend: Business Key, Relationship, Context (historized)
Color Coding: Star Schema
Entities: Sale, Customer, Employee, Product, Vendor, Store, Region
Legend: Business Key, Relationship, Context (historized)
Color Coding: Data Vault
Hubs: Sale, Customer, Employee, Product, Vendor, Store, Region
Links: Link, Link, Link
Legend: Business Key, Relationship, Context (historized)
Color Coding: Data Vault
Hubs: Sale, Customer, Employee, Product, Vendor, Store, Region
Links: Link, Link, Link
Satellites: Sat, Sat, Sat, Sat, Sat, Sat, Sat
Legend: Business Key, Relationship, Context (historized)
Introduction: “Push” and “Pull”
Pull logic (pull principle): In, Out, inventory in production
Push logic (push principle): In, Out, inventory in production
AWF Arbeitsgemeinschaft “Pull-Systeme”, Dipl.-Ing. O. Völker and Dipl.-Ing. S. Binner
Supply Chain: Push-Pull Point
Data delivery process
I • Single Version of Facts
II • Multiple Versions of Truth
III • Single Sources
IV • All Data
MPP
DWH automation with Data Vault
Enterprise Information Products
Reports
Predictive Analytics
Ad hoc queries
DWH
Mart
Data Lake
Input
Complicated
Simple
Chaotic
Analytics, Innovations
Data Science
Data Mining
Machine Learning
All data
Complex
Manual ETL
Cleansing
Business rules
Data-model-driven automation
Integration by business key (business-oriented)
Historization
Modern DWH Architecture with Data Vault
I • Facts
II • Context
III • Shadow IT
IV • Analytics, Research, Prototyping
Layers: Source, Stage, Raw Vault (“Single Version of Facts”), Business Vault, Mart (“Multiple Versions of Truth”), Report
Controlled by business-oriented modeling
Load Patterns: Hub
Stage to Raw Vault, shown in four successive variants (BK = business key, SK = surrogate key):
• SELECT DISTINCT BK; lookup: already in target? (yes / no); if not, create SK; INSERT INTO Hub
• SELECT DISTINCT BK WHERE NOT EXISTS IN Hub; create SK; INSERT INTO Hub
• SELECT DISTINCT BK, MD5 WHERE NOT EXISTS IN Hub; INSERT INTO Hub
• INSERT INTO Hub SELECT DISTINCT BK, MD5 WHERE NOT EXISTS IN Hub
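The last hub pattern (a single INSERT INTO Hub ... SELECT DISTINCT BK, MD5 ... WHERE NOT EXISTS) could look roughly like the sketch below. It is only an illustration: the table and column names (stage_customer, hub_customer, customer_hk, load_ts, record_source), the use of SQLite, and the helper md5_key are assumptions that do not come from the slides, and the stage is assumed to deliver already cleaned business keys.

```python
import hashlib
import sqlite3


def md5_key(business_key: str) -> str:
    """Hash key: MD5 of the business key, written as 32 hex digits."""
    return hashlib.md5(business_key.encode("utf-8")).hexdigest()


def open_vault() -> sqlite3.Connection:
    """In-memory database with one stage table and one hub (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("md5_key", 1, md5_key)  # make the hash callable from SQL
    conn.executescript(
        """
        CREATE TABLE stage_customer (customer_bk TEXT NOT NULL);
        CREATE TABLE hub_customer (
            customer_hk   TEXT PRIMARY KEY,      -- MD5 hash key instead of a generated SK
            customer_bk   TEXT NOT NULL UNIQUE,  -- business key
            load_ts       TEXT NOT NULL,
            record_source TEXT NOT NULL
        );
        """
    )
    return conn


def load_hub(conn: sqlite3.Connection, load_ts: str) -> int:
    """Insert every staged business key the hub does not hold yet; return the row count."""
    before = conn.total_changes
    conn.execute(
        """
        INSERT INTO hub_customer (customer_hk, customer_bk, load_ts, record_source)
        SELECT DISTINCT md5_key(s.customer_bk), s.customer_bk, ?, 'stage_customer'
        FROM stage_customer AS s
        WHERE NOT EXISTS (
            SELECT 1 FROM hub_customer AS h WHERE h.customer_bk = s.customer_bk
        )
        """,
        (load_ts,),
    )
    conn.commit()
    return conn.total_changes - before
```

Because the key is derived from the business key, no lookup or key-generation step is needed, and running the load a second time on the same stage inserts nothing.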
Load Patterns: Link
Stage to Raw Vault, shown in three successive variants:
• SELECT DISTINCT list of BKs; look up SK 1, SK 2, ..., SK n; create SK; lookup: already in target? (yes / no); INSERT INTO Link
• SELECT DISTINCT list of BKs; create SK per BK; create SK; lookup: already in target? (yes / no); INSERT INTO Link
• SELECT DISTINCT list of BKs, MD5 WHERE NOT EXISTS IN Link; create SK; INSERT INTO Link
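As a rough illustration of the MD5-based link pattern, the sketch below derives the link key and both parent hub keys directly from the staged business keys, so no SK lookup against the hubs is required. The names stage_sale, link_sale_customer, the ";" delimiter, and the helper md5_key are assumptions made for this example only.

```python
import hashlib
import sqlite3

DELIMITER = ";"  # assumed separator between business keys


def md5_key(*business_keys: str) -> str:
    """Hash key over one or more business keys, joined with a delimiter."""
    joined = DELIMITER.join(business_keys)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def open_vault() -> sqlite3.Connection:
    """In-memory database with one stage table and one link (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("md5_key", -1, md5_key)  # -1: any number of arguments
    conn.executescript(
        """
        CREATE TABLE stage_sale (sale_bk TEXT NOT NULL, customer_bk TEXT NOT NULL);
        CREATE TABLE link_sale_customer (
            link_hk       TEXT PRIMARY KEY,  -- hash over the full list of BKs
            sale_hk       TEXT NOT NULL,     -- hash key of the Sale hub
            customer_hk   TEXT NOT NULL,     -- hash key of the Customer hub
            load_ts       TEXT NOT NULL,
            record_source TEXT NOT NULL
        );
        """
    )
    return conn


def load_link(conn: sqlite3.Connection, load_ts: str) -> int:
    """Insert every staged BK combination the link does not hold yet."""
    before = conn.total_changes
    conn.execute(
        """
        INSERT INTO link_sale_customer
            (link_hk, sale_hk, customer_hk, load_ts, record_source)
        SELECT DISTINCT
            md5_key(s.sale_bk, s.customer_bk),
            md5_key(s.sale_bk),
            md5_key(s.customer_bk),
            ?,
            'stage_sale'
        FROM stage_sale AS s
        WHERE NOT EXISTS (
            SELECT 1 FROM link_sale_customer AS l
            WHERE l.link_hk = md5_key(s.sale_bk, s.customer_bk)
        )
        """,
        (load_ts,),
    )
    conn.commit()
    return conn.total_changes - before
```

The parent keys written here match the hub keys only if the hub load uses exactly the same hashing rule, which is why the rule has to be fixed once for the whole vault.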
Load Patterns: Satellite
Stage to Raw Vault:
• SELECT DISTINCT BK, attributes
• Look up the SK (create the SK)
• Lookup: already in target? (yes / no)
• Changed? (yes / no)
• INSERT INTO Sat
• End-date the open Sat records in the Raw Vault
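The satellite steps (detect a change, insert the new version, end-date the open record) can be sketched as follows. This is a minimal sketch under several assumptions that are not in the slides: the table and column names, a hash_diff column holding an MD5 over the descriptive attributes, NULL mapped to an empty string, and at most one staged row per business key per load.

```python
import hashlib
import sqlite3


def md5_of(*values) -> str:
    """MD5 over the values joined with ';'; NULL is treated as an empty string."""
    joined = ";".join("" if v is None else str(v) for v in values)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def open_vault() -> sqlite3.Connection:
    """In-memory database with one stage table and one satellite (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE stage_customer (customer_bk TEXT NOT NULL, name TEXT, city TEXT);
        CREATE TABLE sat_customer (
            customer_hk TEXT NOT NULL,
            load_ts     TEXT NOT NULL,
            load_end_ts TEXT,            -- NULL marks the open (current) record
            hash_diff   TEXT NOT NULL,   -- MD5 over the descriptive attributes
            name        TEXT,
            city        TEXT,
            PRIMARY KEY (customer_hk, load_ts)
        );
        """
    )
    return conn


def load_satellite(conn: sqlite3.Connection, load_ts: str) -> int:
    """Insert a new satellite version for each new or changed staged row."""
    inserted = 0
    staged = conn.execute(
        "SELECT DISTINCT customer_bk, name, city FROM stage_customer"
    ).fetchall()
    for bk, name, city in staged:
        hk, diff = md5_of(bk), md5_of(name, city)
        open_row = conn.execute(
            "SELECT hash_diff FROM sat_customer "
            "WHERE customer_hk = ? AND load_end_ts IS NULL",
            (hk,),
        ).fetchone()
        if open_row is not None and open_row[0] == diff:
            continue  # unchanged: nothing to do
        if open_row is not None:
            # changed: end-date the currently open record
            conn.execute(
                "UPDATE sat_customer SET load_end_ts = ? "
                "WHERE customer_hk = ? AND load_end_ts IS NULL",
                (load_ts, hk),
            )
        conn.execute(
            "INSERT INTO sat_customer VALUES (?, ?, NULL, ?, ?, ?)",
            (hk, load_ts, diff, name, city),
        )
        inserted += 1
    conn.commit()
    return inserted
```

Comparing one hash_diff value instead of every attribute keeps the change check short; the cost is that the hashing rule for attributes must stay stable over time.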
Load Dependencies
Hubs, then Links, then Satellites
With MD5: everything in parallel
ETL or ELT?
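Why hash keys remove the load order can be shown with a small in-memory sketch: each loader derives its keys from the staged business keys alone, so hub, link, and satellite rows can be prepared concurrently. The row layout, the function names, and the use of a thread pool are assumptions for illustration, not something the slides prescribe.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def hk(*business_keys: str) -> str:
    """MD5 hash key over one or more business keys, joined with ';'."""
    return hashlib.md5(";".join(business_keys).encode("utf-8")).hexdigest()


# Hypothetical staged rows
STAGE = [
    {"sale_bk": "S1", "customer_bk": "C1", "amount": "10.00"},
    {"sale_bk": "S2", "customer_bk": "C1", "amount": "25.50"},
]


def hub_rows(stage):
    """Customer hub: hash key -> business key."""
    return {hk(r["customer_bk"]): r["customer_bk"] for r in stage}


def link_rows(stage):
    """Sale-customer link: link key -> (sale hash key, customer hash key)."""
    return {
        hk(r["sale_bk"], r["customer_bk"]): (hk(r["sale_bk"]), hk(r["customer_bk"]))
        for r in stage
    }


def sat_rows(stage):
    """Sale satellite: sale hash key -> descriptive attribute."""
    return {hk(r["sale_bk"]): r["amount"] for r in stage}


def load_all_parallel(stage):
    """Run the three loaders concurrently; none of them reads another's output."""
    loaders = {"hub": hub_rows, "link": link_rows, "sat": sat_rows}
    with ThreadPoolExecutor(max_workers=3) as pool:
        futures = {name: pool.submit(fn, stage) for name, fn in loaders.items()}
        return {name: future.result() for name, future in futures.items()}
```

With generated surrogate keys, the link loader would first have to look up the keys the hub load produced, which is what forces the hubs, links, satellites sequence.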
MD5
• Message-digest algorithm; 128-bit (16-byte) digest, written as 32 hexadecimal digits
• Designed by Ronald Rivest in 1991
• RFC 1321
• Collisions can be forced by deliberately crafting the input files
• The algorithm used to compute the hashes in the Data Vault must be applied consistently!
– NULL handling
– Formats for numbers and dates
– Delimiters!
• Alternatives: http://en.wikipedia.org/wiki/List_of_hash_functions
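The three points about NULL handling, number and date formats, and delimiters can be made concrete with a small hashing helper. The specific rules chosen here (empty string for NULL, ISO 8601 dates, trimmed and upper-cased text, normalized decimals, ";" as delimiter) are assumptions for this sketch; the point is only that one rule set must be fixed and used everywhere.

```python
import hashlib
from datetime import date, datetime
from decimal import Decimal

DELIMITER = ";"   # assumed separator between fields
NULL_TOKEN = ""   # assumed representation of NULL


def normalize(value) -> str:
    """Turn one field into its canonical text form before hashing."""
    if value is None:
        return NULL_TOKEN
    if isinstance(value, datetime):  # check datetime first: it subclasses date
        return value.strftime("%Y-%m-%dT%H:%M:%S")
    if isinstance(value, date):
        return value.isoformat()
    if isinstance(value, (int, float, Decimal)):
        # 10.50 and 10.5 must hash identically, so strip trailing zeros
        return format(Decimal(str(value)).normalize(), "f")
    return str(value).strip().upper()


def hash_key(*fields) -> str:
    """MD5 over the normalized fields, as 32 hexadecimal digits."""
    text = DELIMITER.join(normalize(f) for f in fields)
    return hashlib.md5(text.encode("utf-8")).hexdigest()
```

Without the delimiter, the field lists ("AB", "C") and ("A", "BC") would both be hashed as "ABC" and collide; with it they yield different keys. In the same way, hash_key(" c1 ") and hash_key("C1") agree only because both sides apply the same trimming and casing rule.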
Thank you for your attention!
Questions?
tglunde
Torsten Glunde
mailto:t.glunde(at)alligator-company.de
Other networks:
https://www.xing.com/profile/Torsten_Glunde
https://www.linkedin.com/pub/torsten-glunde/8/aba/97
I • Facts
II • Context
III • Shadow IT
IV • Analytics, Research, Prototyping
Layers: Source, Stage, Raw Vault, Business Vault, Mart, Report
Models: Conceptual Data Model, LDM, PDM (Sync, Sync)
Data Flow: Stage Tables, Map 1:1, Map, F(x), F(x), Map
Traditional DWH Architecture
Layers: Source, Staging, EDW in 3NF (“Single Version of the Truth”), Mart
Complex business rules
Cleansing, historization, and integration, all in one step
Data Vault: Advantages and Disadvantages
