3. Agenda
• Big Data
• Data Modeling
• Big Data Modeling
4. Session Objectives
• Big Data Fundamentals
– Components of Big Data
– Structure & Schemas
– Tools & Architecture
• Data Modeling
– Integration & History
– Data Warehousing & BI
– Conceptual to Physical
• Big Data Modeling
– Focus on Meaning
• Ensemble Modeling
– The Blended Architecture
6. Big Data
“Huge” Data Volumes
n-Structured & Very Complex
Streaming & Shape-Shifting
[Figure: Typical Data compared with Big Data]
7. Big Data
• Volume
Huge Volumes of Data
• Velocity
Drinking from a Fire Hose
• Variety
n-Structured Data
• Veracity
Quality, Accuracy, Reliability, Trustworthiness
• Value
Business Value and Value Potential
8. Big Data Architecture
• To deal with the features of Big Data, supporting architectural components are based on:
– Data distribution, and
– Late Binding of Schemas
KVP
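A minimal sketch of the data distribution idea, assuming key-value pairs are spread across nodes by hashing the key. The cluster size, key format, and function names are hypothetical and not taken from the slides. Values are stored as opaque strings, so no schema is bound at load time.

```python
import hashlib
from collections import defaultdict

NUM_NODES = 3  # hypothetical cluster size, chosen only for illustration


def node_for(key: str, num_nodes: int = NUM_NODES) -> int:
    """Map a key to a node index with a stable hash (same key, same node)."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_nodes


def distribute(pairs, num_nodes: int = NUM_NODES) -> dict:
    """Spread key-value pairs across nodes by the hash of their key."""
    nodes = defaultdict(list)
    for key, value in pairs:
        nodes[node_for(key, num_nodes)].append((key, value))
    return dict(nodes)


# Values stay opaque: nothing here inspects or validates their structure.
pairs = [
    ("customer:1001", '{"name": "Acme", "city": "Denver"}'),
    ("customer:1002", '{"name": "Globex", "tier": "gold"}'),
    ("sale:9001", '{"amount": 125.5}'),
]
cluster = distribute(pairs)
```

Because placement depends only on the key, any reader can locate a value without consulting a central schema or catalog.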
9. Modeling and Understanding
• Schema on Write
• Schema on Read
• Dismantled Schema on Write
• Schema on Focus
• Schema on Leverage
[Figure: Load, Model, Apply, Explore]
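The first two items on this slide, Schema on Write and Schema on Read, can be contrasted in a short sketch. The schema, field names, and helper functions below are assumptions made for illustration. The remaining three approaches are not covered here.

```python
import json

# Hypothetical schema: field name -> required Python type.
CUSTOMER_SCHEMA = {"customer_id": str, "name": str}


def write_with_schema(store: list, record: dict) -> None:
    """Schema on write: validate before storing and reject bad records."""
    for field, ftype in CUSTOMER_SCHEMA.items():
        if not isinstance(record.get(field), ftype):
            raise ValueError(f"field {field!r} missing or not {ftype.__name__}")
    store.append(record)


def write_raw(store: list, raw: str) -> None:
    """Schema on read, load side: keep the raw text exactly as received."""
    store.append(raw)


def read_with_schema(store: list, fields: list) -> list:
    """Schema on read, query side: parse and project fields at read time."""
    rows = []
    for raw in store:
        doc = json.loads(raw)
        rows.append({f: doc.get(f) for f in fields})  # None if absent
    return rows
```

With schema on write, a record without a `name` never enters the store. With schema on read, the same record loads unchanged and the missing field only surfaces as `None` when a reader asks for it.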
13. Data Modeling
Man's Search for Meaning…
• Conceptual Modeling
• Logical Modeling
• Information Modeling
• Physical Data Modeling
14. Ensemble Modeling™
All the parts of a thing taken together, so that
each part is considered only in relation to the whole.
• The constellation of component parts acts as a whole.
• With Ensemble Modeling the Core Business Concepts that we define and
model are represented as a whole – an ensemble – including all of the
component parts. An Ensemble is typically based on all things defining a
Core Business Concept that can be uniquely and specifically said for one
instance of that Concept.
15. Forms of Modeling & Ensemble
[Figure: forms of modeling. Ensemble forms: Anchor, Focal Point, Data Vault, DV2.0, 2G, Hyper Agility, Temporal, 6NF, Matter, etc. Other labels: 3NF, Dimensional, ERP, Acctg, Sales, EDW, Data Marts.]
16. The Data Vault Ensemble
• The Data Vault Ensemble conforms to a single key – embodied
in the Hub construct.
• The component parts for the Data Vault Ensemble include:
– Hub: The Natural Business Key
– Link: The Natural Business Relationships
– Satellite: All Context, Descriptive Data and History
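The three component parts can be sketched as plain data structures. This is an illustrative model only, not a physical Data Vault design. The class fields, the `latest_context` helper, and the sample keys are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class Hub:
    """One Core Business Concept instance, identified by its business key."""
    concept: str
    business_key: str


@dataclass(frozen=True)
class Link:
    """A relationship between two or more Hubs."""
    name: str
    hubs: tuple


@dataclass
class Satellite:
    """Context for a Hub or Link; one row per load, so history accumulates."""
    parent: object  # a Hub or a Link
    loaded_at: datetime
    attributes: dict = field(default_factory=dict)


def latest_context(rows, parent):
    """Return the most recently loaded attributes for a parent, if any."""
    history = [row for row in rows if row.parent == parent]
    if not history:
        return None
    return max(history, key=lambda row: row.loaded_at).attributes


customer = Hub("Customer", "C-1001")
store = Hub("Store", "S-17")
shops_at = Link("CustomerStore", (customer, store))
satellites = [
    Satellite(customer, datetime(2024, 1, 1), {"city": "Denver"}),
    Satellite(customer, datetime(2024, 6, 1), {"city": "Boulder"}),
]
```

The Hub carries only the key, so a change of city adds a Satellite row and leaves the Hub and Link untouched.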
17. Ensemble means thinking differently
• In Data Vault, the minimal construct for an “entity” such as “Customer” is now a Hub with a set of Satellites
19. Data Vault Ensemble Modeling Process
1) Identify and Model the Core Business Concepts
• Business interviews are at the heart of this step
What do you do? What are the main things you work with?
• Find best/target Natural Business Key
20. Data Vault Ensemble Modeling Process
2) Identify and Model the Natural Business Relationships
• Specific Unique Relationships
• Consider the Unit of Work and Grain
21. Data Vault Ensemble Modeling Process
3) Analyze and Design the Context Satellites
• Consider Rate of Change, Type of Data
and also the Sources
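One possible way to act on these considerations is to group attributes into Satellites by source and rate of change. The grouping rule, the naming pattern, and the sample attributes below are assumptions for illustration. The slide only says to consider these factors, and type of data could be added as a third grouping key.

```python
from collections import defaultdict


def design_satellites(hub_name: str, attributes: list) -> dict:
    """Group attribute names into satellites keyed by (source, rate of change)."""
    groups = defaultdict(list)
    for attr in attributes:
        groups[(attr["source"], attr["rate"])].append(attr["name"])
    return {
        f"sat_{hub_name}_{source}_{rate}": sorted(names)
        for (source, rate), names in groups.items()
    }


# Hypothetical attribute inventory for a Customer hub.
customer_attributes = [
    {"name": "name", "source": "crm", "rate": "slow"},
    {"name": "segment", "source": "crm", "rate": "slow"},
    {"name": "balance", "source": "billing", "rate": "fast"},
    {"name": "last_login", "source": "web", "rate": "fast"},
]
satellite_design = design_satellites("customer", customer_attributes)
```

Separating fast-changing attributes from slow-changing ones keeps a frequent update to `balance` from producing new history rows for `name` and `segment`.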
23. Logical business model
• Leveraged for all logical
model needs including
the data warehouse, big
data lake, master data
management (MDM) and
operational integration
initiatives
• Closely aligned to DV
physical model
Ensemble Logical Form (ELF)
[Figure: model of the entities Customer, Region, Store, Sale, Vendor, Product, Sale LI, Employee]
24. Ensemble Logical Form
ELF Modeling maintained in:
* Metadata
* Logical Data Model
* Data Modeling Tools
* Virtual Schemas
* Other Tools or Artifacts
Map to Context Data stored in:
* JSON Docs
* XML (w/ XSD or Not)
* Blobs (Free Form Text)
* Big Data Platforms
* Hadoop
* In the Cloud
25. Three Paths for Modeling
Structured / Known
• CBC (Core Business Concepts)
• NBR (Natural Business Relationships)
• Attribution
• Columns
Results in a backbone model with attributes in defined columns
N-Structured / NVP
• CBC
• NBR
• Attribution
Results in a backbone model with known/expected attribute names/tags
N-Structured / KVP
• CBC
• NBR
Results in a backbone model with capacity to capture unknown attribution either named/tagged or not
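The three paths can be sketched side by side for one Customer. This assumes NVP means name-value pair and KVP means key-value pair. The class, the expected names, and the helper functions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CustomerColumns:
    """Path 1 (Structured / Known): attributes live in defined columns."""
    customer_key: str
    name: str
    city: str


# Path 2 (N-Structured / NVP): attribute names are known or expected up front.
EXPECTED_NAMES = {"name", "city"}


def nvp_rows(customer_key: str, pairs: dict) -> list:
    """Store each attribute as a (key, name, value) row; reject unknown names."""
    unexpected = set(pairs) - EXPECTED_NAMES
    if unexpected:
        raise ValueError(f"unexpected attribute names: {sorted(unexpected)}")
    return [(customer_key, name, value) for name, value in pairs.items()]


def kvp_row(customer_key: str, payload) -> tuple:
    """Path 3 (N-Structured / KVP): keep the payload as-is, named or not."""
    return (customer_key, payload)
```

In all three paths the business key is the same, so the backbone of concepts and relationships stays stable while the handling of attribution varies.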
28. Summary
Ensemble in the Big Data World
• Conceptual Modeling (+)
• Logical Modeling (+)
• Information Modeling (+)
• Physical Data Modeling (-)
• Integration Platform (+ + +)
29. Links and Information
CDVDM Training & Certification
www.GeneseeAcademy.com
gohansgo
Hans@GeneseeAcademy.com
HansHultgren.WordPress.com
HansHultgren
Online, On-Demand Video Lessons
DataVaultAcademy.com
DataVaultAcademy
e-Book and Book: Modeling the Agile Data Warehouse with Data Vault