AGILE DATA MINING 
WITH DATA VAULT 2.0 
Timo Cirkel, Michael Olschimke 
Dörffler & Partner GmbH
AGENDA
Introduction
Background
Example
Conclusion
12.02.2014 Agile Data Mining with Data Vault 2.0
INTRODUCTION 
TIMO CIRKEL 
BI-Consultant 
Certified Data Vault 2.0 Practitioner 
Analysis of policyholders
Specialized in CRM, software development, DWH automation
Industries: Insurance, Energy
B. Sc. Business Informatics
MICHAEL OLSCHIMKE 
Senior BI-Consultant 
Certified Data Vault 2.0 Practitioner 
Official Data Vault 2.0 Trainer in Europe 
Associate Teacher, University of Hannover
Specializing in Data Vault 2.0, data mining, CRM, project management
Industries: Insurance, Automotive, Retail, Public Sector, Non-Profits
DÖRFFLER & PARTNER GMBH
• Medium-sized consulting firm
• Official partner of Dan Linstedt in Europe
• Consulting, training, implementation
• Industries: Insurance, Automotive, Banks, Trade, Pharmaceuticals, Telecommunications
BACKGROUND 
DATA MINING PROJECT AT VGH
Motor insurance
Customer segmentation
A first data mining pilot, therefore:
No specific requirements
Vision is developed during the project
Agile project methodology
Close cooperation with the business
DATA MINING PROJECTS
• Extracting information and patterns from existing data
• Four (large) categories: Segmentation, Classification, Prediction, Association
• Wide range of available algorithms and methods
"The term Data Mining ... describes the extraction of implicitly existing, non-trivial and useful knowledge from large, dynamic, relatively complex structured data."
Diagram labels: Database, Application, User, Data mining techniques, Statements, rules & information, Data dictionary, Domain knowledge
DATA VAULT 2.0 MODELING
Diagram labels: Surrogate Key, Business Keys, Foreign Keys, Descriptors
Own representation, in accordance with Linstedt, 2014
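As a rough illustration of the four element types named on the modeling slide, the sketch below represents a hub (business key), a link (foreign keys), and a satellite (descriptors) as plain Python records. It assumes the surrogate key is a hash of the normalized business key, as is common in Data Vault 2.0. The class names, field names, sample keys, and the choice of MD5 are illustrative assumptions and do not come from the slides.

```python
import hashlib
from dataclasses import dataclass


def hash_key(*business_keys: str) -> str:
    """Derive a surrogate key by hashing the normalized business key(s)."""
    normalized = "|".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


@dataclass
class Hub:
    surrogate_key: str   # hash of the business key (assumption)
    business_key: str    # key the business uses to identify the object


@dataclass
class Link:
    surrogate_key: str   # hash of all participating business keys
    hub_keys: tuple      # foreign keys pointing at the connected hubs


@dataclass
class Satellite:
    parent_key: str      # foreign key to the hub (or link) it describes
    load_date: str       # when this version of the descriptors was loaded
    descriptors: dict    # descriptive attributes


# Hypothetical sample data, for illustration only.
customer = Hub(hash_key("AW00011000"), "AW00011000")
category = Hub(hash_key("Bikes"), "Bikes")
purchase = Link(hash_key("AW00011000", "Bikes"),
                (customer.surrogate_key, category.surrogate_key))
details = Satellite(customer.surrogate_key, "2014-02-12",
                    {"Gender": "M", "Age": 42})
```

Because the key is derived from the business key alone, the same customer gets the same surrogate key regardless of load order or stray whitespace in the source.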
DATA VAULT 2.0 METHODOLOGY
Diagram labels: Data Vault 2.0 Methodology; Six Sigma, TQM, Scrum, CMMI, PMP, SDLC
DATA VAULT 2.0 METHODOLOGY FOR DATA MINING
Advantages
• Agile project management for DWH projects
• Automation and generation
• Rapid adaptation to changes in the model
• Incremental build-out = incremental cost control
• Targeted delivery = two-week sprints
• Predictable and measurable results
Disadvantages
• Focus on loading raw data and producing information
• Not many data mining references
• Many concepts in the methodology are not applicable to data mining projects
• Team sizes are difficult to scale in data mining projects
CRISP-DM 
Own representation, in accordance with Chapman et al., 2000
PROCESS MODEL
Process model: VGH customer segmentation
ivv, KTC, D & P
Flowchart steps: Start; Store data in the Data Vault model; Extract data; Select algorithm; Run segmentation; Present result; Develop quality function; Create SQL query; Identify relevant policyholder attributes; Research algorithms; End
Flowchart decisions (Yes/No): Result achieved?; Result ok?; Formula ok?; Suitable algorithm found?
RAPIDMINER
• Java-based data mining software
• One of the most widely used data mining tools
• Offers: environment for control flow, large number of algorithms, large choice of data sources
Chart labels: Overall, Corporate, Consultants, Academics, NGO / Gov't
© 2012 Rexer Analytics
EXAMPLE
• AdventureWorks database
• Scenario: advertising campaign for a new bike; identification of the target group
• Solution: decision tree; identify relevant attributes in several iterations
Lachev, 2005, p. 238ff
Simple example
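To make "identify relevant attributes" more concrete, here is a minimal sketch of one common splitting criterion for decision trees, information gain. The slides do not state which criterion the RapidMiner process used, so this is only an illustration of the general idea. The rows, attribute names, and the `buyer` target are hypothetical toy data, not AdventureWorks records.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction achieved by splitting the rows on one attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy data: did the customer buy a bike (1) or not (0)?
customers = [
    {"commute": "0-1 Miles", "cars": 0, "buyer": 1},
    {"commute": "0-1 Miles", "cars": 1, "buyer": 1},
    {"commute": "10+ Miles", "cars": 2, "buyer": 0},
    {"commute": "10+ Miles", "cars": 1, "buyer": 0},
]

gains = {a: information_gain(customers, a, "buyer") for a in ("commute", "cars")}
best = max(gains, key=gains.get)  # attribute a tree would split on first
```

In this toy data the commute attribute separates buyers from non-buyers perfectly, so it scores higher than the number of cars and would be chosen for the first split.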
10066 Records
Attribute
Marital Status
Gender
Yearly Income
Total Children
Education
Number Cars Owned
Commute Distance
Occupation
House Owner Flag
Age
ITERATION 1: DATA VAULT 2.0 MODEL
Hub Customer: Customer Key
Sat Customer: English Education, Number Cars Owned, Gender, Marital Status, Commute Distance, Age, House Owner Flag, English Occupation
Sat Category: Product Category
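A mining tool usually expects one flat row per customer, so the hub and its satellite have to be joined before modeling. The sketch below shows one way this could look, using an in-memory SQLite database. The table names, column names, hash key values, and sample rows are assumptions made for illustration; the slides show only the entity and attribute names.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,   -- surrogate (hash) key
    customer_key  TEXT NOT NULL UNIQUE,  -- business key
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE sat_customer (
    customer_hk    TEXT NOT NULL REFERENCES hub_customer (customer_hk),
    load_date      TEXT NOT NULL,
    gender         TEXT,
    marital_status TEXT,
    age            INTEGER,
    PRIMARY KEY (customer_hk, load_date)
);
""")

# Hypothetical sample data: one customer with two satellite versions.
con.execute(
    "INSERT INTO hub_customer VALUES ('hk1', 'AW0001', '2014-01-01', 'demo')"
)
con.executemany(
    "INSERT INTO sat_customer VALUES (?, ?, ?, ?, ?)",
    [
        ("hk1", "2014-01-01", "F", "S", 40),
        ("hk1", "2014-02-01", "F", "M", 40),
    ],
)

# Flatten: one row per customer, using the latest satellite version.
flat = con.execute("""
SELECT h.customer_key, s.gender, s.marital_status, s.age
FROM hub_customer h
JOIN sat_customer s ON s.customer_hk = h.customer_hk
WHERE s.load_date = (SELECT MAX(load_date)
                     FROM sat_customer
                     WHERE customer_hk = h.customer_hk)
""").fetchall()
```

The satellite keeps every historical version, while the query hands only the current state of each customer to the mining process.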
ITERATION 1: RAPIDMINER PROCESS 
Data Gathering 
Data preparation 
Modeling 
ITERATION 1: DECISION TREE MODEL
ITERATION 1: RESULTS 
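The result figures themselves are not part of the extracted text. The speaker notes mention accuracy and precision/recall as candidate measures, so the following sketch shows how these are derived from the four cells of a confusion matrix. The label lists are made-up toy values, and which measure the iteration results actually report is not stated in the slides.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    return tp, fp, fn, tn


def scores(actual, predicted, positive=1):
    """Return accuracy, precision, and recall as a dictionary."""
    tp, fp, fn, tn = confusion_counts(actual, predicted, positive)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical toy labels: 1 = bike buyer, 0 = non-buyer.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
result = scores(actual, predicted)
```

Comparing such a score between two iterations gives a simple, repeatable check of whether newly added attributes improved the model.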
ITERATION 2: DATA VAULT 2.0 MODEL
Hub Customer: Customer Key
Sat Customer: English Education, Number Cars Owned, Gender, Marital Status, Commute Distance, Age, House Owner Flag, English Occupation
Sat Customer Income: Yearly Income
Sat Customer Children: Total Children
Sat Category: Product Category
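The second iteration adds attributes by attaching new satellites to the existing hub rather than altering the tables from the first iteration. A minimal sketch of that pattern follows, again with SQLite. All table names, column names, and sample values are hypothetical, and only one of the two new satellites is shown.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# State after iteration 1 (reduced to two descriptive columns).
con.executescript("""
CREATE TABLE hub_customer (
    customer_hk  TEXT PRIMARY KEY,
    customer_key TEXT NOT NULL
);
CREATE TABLE sat_customer (
    customer_hk TEXT NOT NULL,
    load_date   TEXT NOT NULL,
    gender      TEXT,
    age         INTEGER,
    PRIMARY KEY (customer_hk, load_date)
);
INSERT INTO hub_customer VALUES ('hk1', 'AW0001'), ('hk2', 'AW0002');
INSERT INTO sat_customer VALUES ('hk1', '2014-01-01', 'F', 40),
                                ('hk2', '2014-01-01', 'M', 35);
""")


def columns(table):
    """Return the column names of a table via SQLite's table_info pragma."""
    return [row[1] for row in con.execute(f"PRAGMA table_info({table})")]


before = columns("sat_customer")

# Iteration 2: a new satellite is added next to the existing one.
con.executescript("""
CREATE TABLE sat_customer_income (
    customer_hk   TEXT NOT NULL,
    load_date     TEXT NOT NULL,
    yearly_income INTEGER,
    PRIMARY KEY (customer_hk, load_date)
);
INSERT INTO sat_customer_income VALUES ('hk1', '2014-01-15', 60000);
""")

after = columns("sat_customer")  # unchanged by the new satellite

# LEFT JOIN keeps customers that have no income record yet.
rows = con.execute("""
SELECT h.customer_key, s.gender, s.age, i.yearly_income
FROM hub_customer h
JOIN sat_customer s ON s.customer_hk = h.customer_hk
LEFT JOIN sat_customer_income i ON i.customer_hk = h.customer_hk
ORDER BY h.customer_key
""").fetchall()
```

Only the extraction query changes between iterations. The existing satellite and its load process stay as they were.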
ITERATION 2: RAPIDMINER PROCESS 
Data Gathering 
Preparation
Modeling
ITERATION 2: RESULTS 
+4.01% 
ITERATION 3: DATA VAULT 2.0 MODEL
Hub Customer: Customer Key
Sat Customer: English Education, Number Cars Owned, Gender, Marital Status, Commute Distance, Age, House Owner Flag, English Occupation
Sat Customer Income: Yearly Income
Sat Customer Children: Total Children
Sat Category: Product Category
CSat Customer Distance: Commute Distance Miles
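The third iteration introduces a derived attribute, Commute Distance Miles, in its own satellite (CSat Customer Distance), leaving the raw Commute Distance untouched. The slides do not show the derivation rule, so the sketch below assumes banded labels such as "0-1 Miles" or "10+ Miles" and takes the lower bound of each band. The rule, the function name, and the sample rows are illustrative assumptions.

```python
import re


def commute_distance_miles(band: str) -> int:
    """Map a banded label to its lower bound in miles (illustrative rule)."""
    match = re.match(r"\s*(\d+)", band)
    if match is None:
        raise ValueError(f"unrecognized commute distance: {band!r}")
    return int(match.group(1))


# Raw satellite rows stay as loaded from the source (hypothetical values).
raw_sat_customer = [
    {"customer_hk": "hk1", "commute_distance": "0-1 Miles"},
    {"customer_hk": "hk2", "commute_distance": "5-10 Miles"},
    {"customer_hk": "hk3", "commute_distance": "10+ Miles"},
]

# The computed satellite holds only the derived value plus the parent key.
csat_customer_distance = [
    {
        "customer_hk": row["customer_hk"],
        "commute_distance_miles": commute_distance_miles(row["commute_distance"]),
    }
    for row in raw_sat_customer
]
```

Keeping the derived value in a separate satellite means the rule can be changed or recomputed later without reloading the raw data.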
ITERATION 3: RAPIDMINER PROCESS 
Data Gathering 
Preparation
Modeling
ITERATION 3: RESULTS 
+0.12% 
CONCLUSIONS
• Data Vault is a flexible data model with good support for agile project methodology
• Data Vault is not an additional hurdle in data mining projects
• Additional attributes can be added incrementally at any time during the project
• Business Vault: transparent data processing
FURTHER INFORMATION 
Appears 
2015 
Available 
www.doerffler.com
www.datavault.de
www.learndatavault.com
Appears 
2015 
Give us feedback 
http://goo.gl/LGO4ze
Source: Vasilijonline.com

Editor's notes

  1. On these slides, only the logos are replaced. We have no time to try out or discuss a new design.
  2. Comment briefly on the data mining project at VGH. Point out the BI Spectrum article, the objectives of the project, and the tools used (CRISP-DM etc.). If necessary, open further slides. Name the insurance company? No specific requirements: attributes evolve over time, and "customer" is not exactly defined at first. Only private clients, or companies too? Policyholders or vehicle owners? What kinds of contracts? What are "good" customers?
  3. Briefly explain hubs, links, and satellites with VDV. Take a look in the Sources folder; there is material you can use.
  4. We cannot present data and findings from VGH, so we fall back on AdventureWorks. The setup was taken from the book.
  5. Comment briefly on AdventureWorks DW: background information, model of the relevant tables, 25 attributes, 500k records.
  6. Comment on the first DV model.
  7. Demo in RapidMiner. Also comment on the measures (accuracy, or precision/recall). Ideally present them graphically in RapidMiner.
  8. Scatter matrix, confusion matrix (performance matrix).
  9. Comment on the changes to the DV model. Show what it looks like afterwards. Make the changes easy to follow (with animations).
  10. Demo in RapidMiner. Also comment on the measures (accuracy, or precision/recall). Ideally present them graphically in RapidMiner.
  11. Comment on the changes to the DV model. Show what it looks like afterwards. Make the changes easy to follow (with animations).
  12. Demo in RapidMiner. Also comment on the measures (accuracy, or precision/recall). Ideally present them graphically in RapidMiner.
  13. What are the benefits of the approach? Refer to the VGH project, but also to the demo.
  14. TBC: revise the link (I will do this).