STUDIEREN UND DURCHSTARTEN. Author I:	Dip.-Inf. (FH) Johannes Hoppe Author II:	M.Sc. Johannes Hofmeister Author III:	Prof. Dr. Dieter Homeister Date:	25.03.2011
Data Mining Theory
01 Applied Data Warehousing
Applied Data Warehousing
A typical data flow looks like this: [Figure: data flow diagram]
Data Warehouse
Practical task
Create groups of 3 people
Grab "dmdw_rooms_fh_heidelberg-2006-04-03.xls"
Explore the data in the "room reservations" spreadsheet
Discuss and create a simple database table / document that matches the data
Find a way to migrate the data from the Excel spreadsheet to the database
Today's system: for today I recommend SQL Server Business Intelligence Development Studio + SQL Server
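The migration step can also be sketched outside of the recommended tooling. The following is a minimal Python sketch using only the standard library, not the solution expected in class. It assumes the spreadsheet was first exported to CSV, and the table name and the columns (room, day, start_time, end_time, booked_by) are hypothetical, since the real layout of the spreadsheet is not shown here.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a CSV export of the spreadsheet.
SAMPLE_CSV = """room,day,start_time,end_time,booked_by
A101,2006-04-03,08:00,09:30,Databases I
A101,2006-04-03,10:00,11:30,Statistics
B202,2006-04-03,08:00,09:30,Data Mining
"""


def migrate(csv_file, conn):
    """Create the target table and copy every CSV row into it."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS room_reservation (
               id INTEGER PRIMARY KEY,
               room TEXT NOT NULL,
               day TEXT NOT NULL,
               start_time TEXT NOT NULL,
               end_time TEXT NOT NULL,
               booked_by TEXT
           )"""
    )
    rows = [
        (r["room"], r["day"], r["start_time"], r["end_time"], r["booked_by"])
        for r in csv.DictReader(csv_file)
    ]
    conn.executemany(
        "INSERT INTO room_reservation "
        "(room, day, start_time, end_time, booked_by) VALUES (?, ?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
migrated = migrate(io.StringIO(SAMPLE_CSV), conn)
per_room = conn.execute(
    "SELECT room, COUNT(*) FROM room_reservation GROUP BY room ORDER BY room"
).fetchall()
print(migrated, per_room)
```

The same pattern (read rows, map them to a table definition, bulk insert) applies to any target database. Only the connection and the SQL dialect change.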
Data Warehouse
Next practical task
One team will have to present a different solution to migrate the data
It should be a hands-on lab for the other students
I will upload the materials to my blog
Preferred time box: 45 - 90 minutes
First team will be: _________________________________
Next system: ?
Data Warehouse
ETL Teams
1. Team: Access – 2x Sebastian, Matthias
2. Team: Access → Access2MySQL → MySQL – Mercedes, Fabian, Marcus, Albert
3. Team: Silverlight → MS SQL – Sebastian, Patrick
4. Team: PHP → MySQL – Lars, Maurice, Jeff
Next system: ?
02 Data Mining Theory
Data Mining
Introduction (1/3)
Data mining is done by running software that examines a database and looks for patterns in the data.
A data warehouse by itself only responds to queries from users.
It will not tell users about patterns in the data that they may not have thought about.
Data mining is used to find such patterns and to extract key information from a data warehouse.
Data Mining
Introduction (2/3)
Data mining allows companies to collect information
  to make them more productive and
  to beat their competitors
Data mining helps to identify
  why customers buy certain products
  ideas for very direct marketing
  ideas for shelf placement
  training of employees vs. employee retention
  employee benefits vs. employee retention
Data Mining
Introduction (3/3)
Data mining attempts to find patterns in data that we did not know about.
Often "data mining" is just a new buzzword for statistics.
But data mining differs from (school) statistics in that large volumes of data are used.
Trivial information or well-known facts are not an aim of data mining!
Data Mining
Some Data Mining Algorithms
Machine learning
Statistics
Pattern recognition
Regression
Association rules
Decision trees
Neural networks
Clustering
Classification
etc.
Data Mining
Implementing DM on Top of a DW (1/2)
Data mining tools / mining algorithms require data!
There are two approaches:
  Copy data from the data warehouse and mine it
  Mine the data directly in the data warehouse
Popular tools use a variety of different data mining algorithms:
  association rules
  genetic algorithms
  decision trees
  neural networks
Data Mining
Implementing DM on Top of a DW (2/2)
a) Copy data from the data warehouse to data mining tools
  Advantage: Data mining tools may organize the data so they can run faster
  Disadvantage: It can be very "expensive" to move large amounts of data
b) Data mining tools access the data directly in the data warehouse
  Advantage: No copy of the data is needed for data mining
  Disadvantage: The data may not be organized in a way that is efficient for the tool
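The difference between the two approaches can be illustrated with a small Python sketch. An in-memory SQLite database stands in for the data warehouse here, and the `sales` table and its contents are made up for illustration. Approach a) moves every row to the tool before counting. Approach b) lets the database aggregate and moves only the result.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")  # stands in for the data warehouse
conn.execute("CREATE TABLE sales (customer TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("ann", "bread"), ("ann", "butter"), ("bob", "bread"), ("cat", "milk")],
)

# a) Copy the data out, then analyse it in the tool's own data structures.
copied = conn.execute("SELECT product FROM sales").fetchall()
counts_copied = Counter(product for (product,) in copied)

# b) Analyse in place: the warehouse aggregates, only the result is transferred.
counts_in_place = dict(
    conn.execute("SELECT product, COUNT(*) FROM sales GROUP BY product")
)

print(dict(counts_copied), counts_in_place)
```

Both variants give the same counts. With four rows the cost difference is invisible, but in a) the transfer grows with the number of rows, while in b) it grows only with the number of distinct products.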
Data Mining
The Data Mining Process
Step 1: Data preparation: cleanup ("scrubbing"), selection, check of the data by specialists (→ data warehouse).
Step 2: Analysis phase: process the data with a data mining algorithm.
Step 3: Evaluation of the output: check if something new was discovered.
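As a rough illustration, the three steps can be written as three small Python functions chained together. This is only a toy sketch: the function names, the sample rows and the "frequent means at least two occurrences" threshold are assumptions made for the example, and counting pairs stands in for a real mining algorithm.

```python
from collections import Counter


def prepare(raw_rows):
    """Step 1: keep only complete rows (a stand-in for cleanup and selection)."""
    return [row for row in raw_rows if all(value is not None for value in row)]


def analyse(rows):
    """Step 2: a trivial 'algorithm' that counts how often each row occurs."""
    return Counter(rows)


def evaluate(patterns, already_known):
    """Step 3: keep frequent patterns that were not already known."""
    return {
        pattern: count
        for pattern, count in patterns.items()
        if count >= 2 and pattern not in already_known
    }


raw = [
    ("mon", "A101"),
    ("mon", "A101"),
    ("tue", None),  # incomplete row, dropped in step 1
    ("tue", "B202"),
    ("tue", "B202"),
]
known = {("mon", "A101")}  # a well-known fact, not a discovery
new_findings = evaluate(analyse(prepare(raw)), known)
print(new_findings)
```

The evaluation step is where well-known facts are filtered out, so only the second pattern is reported as new.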
Data Mining
The Data Mining Process
Step 1 - Data preparation
It is useful to fetch the data from a data warehouse. This eliminates the need to collect data from different sources, filter it and handle inconsistencies. Theoretically a data warehouse is not absolutely necessary, but in practice it is.
The data preparation process includes data selection and manipulation.
Validating and cleaning are necessary to eliminate out-of-range values and to handle missing values in the raw data. This may include plausibility checks.
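A minimal Python sketch of such a validation and cleaning step is shown below. The age column, the plausible range of 0 to 120 and the choice to fill gaps with the median are all assumptions made for illustration. Which range is plausible and how gaps should be filled depends on the data and the problem domain.

```python
from statistics import median


def clean_ages(values, low=0, high=120):
    """Treat out-of-range values as missing, then fill gaps with the median."""
    in_range = [
        v if v is not None and low <= v <= high else None
        for v in values
    ]
    present = [v for v in in_range if v is not None]
    fill = median(present)  # raises StatisticsError if no valid value is left
    return [fill if v is None else v for v in in_range]


raw_ages = [34, None, 29, 250, 41, -3]  # hypothetical raw column
cleaned = clean_ages(raw_ages)
print(cleaned)
```

Here 250 and -3 fail the plausibility check and are handled like the missing value.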
Data Mining
The Data Mining Process
Step 1 - Data preparation
Even if the data in the data warehouse is already cleaned and filtered, experience shows that this is not good enough for data mining.
Data preparation also includes formatting, scaling and transformation of the raw data, depending on the needs of the data mining algorithm. Examples: scaling of numeric data, currency conversion or metric/inch conversion.
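Two of these transformations, unit conversion and scaling of numeric data, could look like this in Python. The sample widths are made up, and min-max scaling to the range 0 to 1 is just one common choice. Other algorithms may need a different scaling.

```python
CM_PER_INCH = 2.54


def inches_to_cm(values):
    """Convert lengths from inches to centimetres."""
    return [v * CM_PER_INCH for v in values]


def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to scale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


widths_in = [10.0, 20.0, 30.0]  # hypothetical measurements in inches
widths_cm = inches_to_cm(widths_in)
scaled = min_max_scale(widths_cm)
print(widths_cm, scaled)
```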
Data Mining
The Data Mining Process
Step 1 - Data preparation
Many joined tables may be involved, rows or columns may have to be selected, two fields may be combined into a ratio, or derived values may be needed. This process needs the guidance of someone with good knowledge of the data and the problem domain.
Data preparation usually consumes 50% to 80% of the data mining budget.
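The following Python sketch shows a join of two small tables followed by a derived ratio. The tables, the field names and the "spend per visit" ratio are hypothetical. In practice a domain expert would decide which derived values are meaningful.

```python
# Hypothetical fact rows: (customer_id, order_total)
orders = [(1, 120.0), (1, 80.0), (2, 50.0)]

# Hypothetical dimension table keyed by customer_id
customers = {
    1: {"name": "Ann", "visits": 10},
    2: {"name": "Bob", "visits": 5},
}


def spend_per_visit(orders, customers):
    """Join orders to customers and derive total spend divided by visits."""
    totals = {}
    for customer_id, order_total in orders:
        totals[customer_id] = totals.get(customer_id, 0.0) + order_total
    return {
        customers[cid]["name"]: totals[cid] / customers[cid]["visits"]
        for cid in totals
        if customers[cid]["visits"] > 0  # avoid division by zero
    }


ratios = spend_per_visit(orders, customers)
print(ratios)
```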
Data Mining
The Data Mining Process
Step 2 - Analysis phase
Process the data with a data mining algorithm (information discovery).
Built into Analysis Services:
  Association
  Clustering
  Decision Trees
  Linear Regression
  Logistic Regression
  Naive Bayes
  Neural Network
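To give a feeling for what an association algorithm computes, here is a hand-rolled Python sketch of the two basic measures, support and confidence, on made-up shopping baskets. It is not the Analysis Services implementation, which is configured in the tool and not written by hand.

```python
# Hypothetical shopping baskets, one set of products per purchase.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item of the itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Of the baskets containing the antecedent, the share that also
    contains the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


bread_and_butter = support({"bread", "butter"}, baskets)
bread_implies_butter = confidence({"bread"}, {"butter"}, baskets)
print(bread_and_butter, bread_implies_butter)
```

Bread and butter appear together in two of four baskets, and two of the three baskets with bread also contain butter. A rule such as "bread → butter" is the kind of pattern that could feed the shelf placement ideas mentioned earlier.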

Weitere Àhnliche Inhalte

Was ist angesagt?

Data Compression in Data mining and Business Intelligencs
Data Compression in Data mining and Business Intelligencs Data Compression in Data mining and Business Intelligencs
Data Compression in Data mining and Business Intelligencs ShahDhruv21
 
1.7 data reduction
1.7 data reduction1.7 data reduction
1.7 data reductionKrish_ver2
 
Mining 3-Clusters in Vertically Partitioned Data
Mining 3-Clusters in Vertically Partitioned DataMining 3-Clusters in Vertically Partitioned Data
Mining 3-Clusters in Vertically Partitioned DataFaris Alqadah
 
Data Reduction Stratergies
Data Reduction StratergiesData Reduction Stratergies
Data Reduction StratergiesAnjaliSoorej
 
Data reduction
Data reductionData reduction
Data reductionGowriLatha1
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingkayathri02
 
Dwdmunit1 a
Dwdmunit1 aDwdmunit1 a
Dwdmunit1 abhagathk
 
R Regression Models with Zelig
R Regression Models with ZeligR Regression Models with Zelig
R Regression Models with Zeligizahn
 
Feature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditFeature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditMichael BENESTY
 
Mit401 data warehousing and data mining
Mit401  data warehousing and data miningMit401  data warehousing and data mining
Mit401 data warehousing and data miningsmumbahelp
 
Review Over Sequential Rule Mining
Review Over Sequential Rule MiningReview Over Sequential Rule Mining
Review Over Sequential Rule Miningijsrd.com
 
2018 p 2019-ee-a2
2018 p 2019-ee-a22018 p 2019-ee-a2
2018 p 2019-ee-a2uetian12
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessingHarry Potter
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series dataKrish_ver2
 
Stock Market Prediction Using ANN
Stock Market Prediction Using ANNStock Market Prediction Using ANN
Stock Market Prediction Using ANNKrishna Mohan Mishra
 
Machine Learning and Real-World Applications
Machine Learning and Real-World ApplicationsMachine Learning and Real-World Applications
Machine Learning and Real-World ApplicationsMachinePulse
 
Session 06 machine learning.pptx
Session 06 machine learning.pptxSession 06 machine learning.pptx
Session 06 machine learning.pptxbodaceacat
 
An intelligent scalable stock market prediction system
An intelligent scalable stock market prediction systemAn intelligent scalable stock market prediction system
An intelligent scalable stock market prediction systemHarshit Agarwal
 
Heart disease classification
Heart disease classificationHeart disease classification
Heart disease classificationSnehaDey21
 

Was ist angesagt? (20)

Data Compression in Data mining and Business Intelligencs
Data Compression in Data mining and Business Intelligencs Data Compression in Data mining and Business Intelligencs
Data Compression in Data mining and Business Intelligencs
 
XL-MINER: Data Utilities
XL-MINER: Data UtilitiesXL-MINER: Data Utilities
XL-MINER: Data Utilities
 
1.7 data reduction
1.7 data reduction1.7 data reduction
1.7 data reduction
 
Mining 3-Clusters in Vertically Partitioned Data
Mining 3-Clusters in Vertically Partitioned DataMining 3-Clusters in Vertically Partitioned Data
Mining 3-Clusters in Vertically Partitioned Data
 
Data Reduction Stratergies
Data Reduction StratergiesData Reduction Stratergies
Data Reduction Stratergies
 
Data reduction
Data reductionData reduction
Data reduction
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Dwdmunit1 a
Dwdmunit1 aDwdmunit1 a
Dwdmunit1 a
 
R Regression Models with Zelig
R Regression Models with ZeligR Regression Models with Zelig
R Regression Models with Zelig
 
Feature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax auditFeature Importance Analysis with XGBoost in Tax audit
Feature Importance Analysis with XGBoost in Tax audit
 
Mit401 data warehousing and data mining
Mit401  data warehousing and data miningMit401  data warehousing and data mining
Mit401 data warehousing and data mining
 
Review Over Sequential Rule Mining
Review Over Sequential Rule MiningReview Over Sequential Rule Mining
Review Over Sequential Rule Mining
 
2018 p 2019-ee-a2
2018 p 2019-ee-a22018 p 2019-ee-a2
2018 p 2019-ee-a2
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
5.2 mining time series data
5.2 mining time series data5.2 mining time series data
5.2 mining time series data
 
Stock Market Prediction Using ANN
Stock Market Prediction Using ANNStock Market Prediction Using ANN
Stock Market Prediction Using ANN
 
Machine Learning and Real-World Applications
Machine Learning and Real-World ApplicationsMachine Learning and Real-World Applications
Machine Learning and Real-World Applications
 
Session 06 machine learning.pptx
Session 06 machine learning.pptxSession 06 machine learning.pptx
Session 06 machine learning.pptx
 
An intelligent scalable stock market prediction system
An intelligent scalable stock market prediction systemAn intelligent scalable stock market prediction system
An intelligent scalable stock market prediction system
 
Heart disease classification
Heart disease classificationHeart disease classification
Heart disease classification
 

Andere mochten auch

DMDW Lesson 01 - Introduction
DMDW Lesson 01 - IntroductionDMDW Lesson 01 - Introduction
DMDW Lesson 01 - IntroductionJohannes Hoppe
 
Ria 09 trends_and_technologies
Ria 09 trends_and_technologiesRia 09 trends_and_technologies
Ria 09 trends_and_technologiesJohannes Hoppe
 
DMDW Lesson 03 - Data Warehouse Theory
DMDW Lesson 03 - Data Warehouse TheoryDMDW Lesson 03 - Data Warehouse Theory
DMDW Lesson 03 - Data Warehouse TheoryJohannes Hoppe
 
DMDW Extra Lesson - NoSql and MongoDB
DMDW  Extra Lesson - NoSql and MongoDBDMDW  Extra Lesson - NoSql and MongoDB
DMDW Extra Lesson - NoSql and MongoDBJohannes Hoppe
 
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB fĂŒr .NET Entwickler)
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB fĂŒr .NET Entwickler)2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB fĂŒr .NET Entwickler)
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB fĂŒr .NET Entwickler)Johannes Hoppe
 
2017 - NoSQL Vorlesung Mosbach
2017 - NoSQL Vorlesung Mosbach2017 - NoSQL Vorlesung Mosbach
2017 - NoSQL Vorlesung MosbachJohannes Hoppe
 
04 data types & variables
04   data types & variables04   data types & variables
04 data types & variablesdhrubo kayal
 
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...Cloudera, Inc.
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data MiningValerii Klymchuk
 
2.3 bayesian classification
2.3 bayesian classification2.3 bayesian classification
2.3 bayesian classificationKrish_ver2
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kambererror007
 
Bayesian classification
Bayesian classificationBayesian classification
Bayesian classificationManu Chandel
 
Machine Learning and Data Mining: 13 Nearest Neighbor and Bayesian Classifiers
Machine Learning and Data Mining: 13 Nearest Neighbor and Bayesian ClassifiersMachine Learning and Data Mining: 13 Nearest Neighbor and Bayesian Classifiers
Machine Learning and Data Mining: 13 Nearest Neighbor and Bayesian ClassifiersPier Luca Lanzi
 
Data Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methodsData Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methodsSalah Amean
 
Belief Networks & Bayesian Classification
Belief Networks & Bayesian ClassificationBelief Networks & Bayesian Classification
Belief Networks & Bayesian ClassificationAdnan Masood
 
04 theories and classification of retailing
04 theories and classification of retailing04 theories and classification of retailing
04 theories and classification of retailingDr. Chandan Vichoray
 
NoSQL - Hands on
NoSQL - Hands onNoSQL - Hands on
NoSQL - Hands onJohannes Hoppe
 

Andere mochten auch (20)

DMDW Lesson 01 - Introduction
DMDW Lesson 01 - IntroductionDMDW Lesson 01 - Introduction
DMDW Lesson 01 - Introduction
 
Ria 09 trends_and_technologies
Ria 09 trends_and_technologiesRia 09 trends_and_technologies
Ria 09 trends_and_technologies
 
DMDW Lesson 03 - Data Warehouse Theory
DMDW Lesson 03 - Data Warehouse TheoryDMDW Lesson 03 - Data Warehouse Theory
DMDW Lesson 03 - Data Warehouse Theory
 
DMDW Extra Lesson - NoSql and MongoDB
DMDW  Extra Lesson - NoSql and MongoDBDMDW  Extra Lesson - NoSql and MongoDB
DMDW Extra Lesson - NoSql and MongoDB
 
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB fĂŒr .NET Entwickler)
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB fĂŒr .NET Entwickler)2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB fĂŒr .NET Entwickler)
2012-08-29 - NoSQL Bootcamp (Redis, RavenDB & MongoDB fĂŒr .NET Entwickler)
 
2017 - NoSQL Vorlesung Mosbach
2017 - NoSQL Vorlesung Mosbach2017 - NoSQL Vorlesung Mosbach
2017 - NoSQL Vorlesung Mosbach
 
04 data types & variables
04   data types & variables04   data types & variables
04 data types & variables
 
04 data mining : data generelization
04 data mining : data generelization04 data mining : data generelization
04 data mining : data generelization
 
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...
Hadoop Summit 2012 | Bayesian Counters AKA In Memory Data Mining for Large Da...
 
04 Classification in Data Mining
04 Classification in Data Mining04 Classification in Data Mining
04 Classification in Data Mining
 
Lecture 04 data resource management
Lecture 04 data resource managementLecture 04 data resource management
Lecture 04 data resource management
 
2.3 bayesian classification
2.3 bayesian classification2.3 bayesian classification
2.3 bayesian classification
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
 
Bayesian classification
Bayesian classificationBayesian classification
Bayesian classification
 
Machine Learning and Data Mining: 13 Nearest Neighbor and Bayesian Classifiers
Machine Learning and Data Mining: 13 Nearest Neighbor and Bayesian ClassifiersMachine Learning and Data Mining: 13 Nearest Neighbor and Bayesian Classifiers
Machine Learning and Data Mining: 13 Nearest Neighbor and Bayesian Classifiers
 
Data Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methodsData Mining: Concepts and techniques classification _chapter 9 :advanced methods
Data Mining: Concepts and techniques classification _chapter 9 :advanced methods
 
Belief Networks & Bayesian Classification
Belief Networks & Bayesian ClassificationBelief Networks & Bayesian Classification
Belief Networks & Bayesian Classification
 
04 theories and classification of retailing
04 theories and classification of retailing04 theories and classification of retailing
04 theories and classification of retailing
 
Naive bayes
Naive bayesNaive bayes
Naive bayes
 
NoSQL - Hands on
NoSQL - Hands onNoSQL - Hands on
NoSQL - Hands on
 

Ähnlich wie DMDW Lesson 04 - Data Mining Theory

Data warehousing interview questions
Data warehousing interview questionsData warehousing interview questions
Data warehousing interview questionsSatyam Jaiswal
 
Unit-IV-Introduction to Data Warehousing .pptx
Unit-IV-Introduction to Data Warehousing .pptxUnit-IV-Introduction to Data Warehousing .pptx
Unit-IV-Introduction to Data Warehousing .pptxHarsha Patel
 
Introduction to Data Mining
Introduction to Data Mining Introduction to Data Mining
Introduction to Data Mining Sushil Kulkarni
 
A Survey on Data Mining
A Survey on Data MiningA Survey on Data Mining
A Survey on Data MiningIOSR Journals
 
Data Mining
Data MiningData Mining
Data Miningksanthosh
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data scienceMahir Haque
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationDr. Abdul Ahad Abro
 
data mining
data miningdata mining
data miningmanasa polu
 
Data mining introduction
Data mining introductionData mining introduction
Data mining introductionBasma Gamal
 
A Practical Approach To Data Mining Presentation
A Practical Approach To Data Mining PresentationA Practical Approach To Data Mining Presentation
A Practical Approach To Data Mining Presentationmillerca2
 
Key Principles Of Data Mining
Key Principles Of Data MiningKey Principles Of Data Mining
Key Principles Of Data Miningtobiemuir
 
IRJET- Fault Detection and Prediction of Failure using Vibration Analysis
IRJET-	 Fault Detection and Prediction of Failure using Vibration AnalysisIRJET-	 Fault Detection and Prediction of Failure using Vibration Analysis
IRJET- Fault Detection and Prediction of Failure using Vibration AnalysisIRJET Journal
 
Data Mining & Data Warehousing Lecture Notes
Data Mining & Data Warehousing Lecture NotesData Mining & Data Warehousing Lecture Notes
Data Mining & Data Warehousing Lecture NotesFellowBuddy.com
 
Modern trends in information systems
Modern trends in information systemsModern trends in information systems
Modern trends in information systemsPreeti Sontakke
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data WarehousingJason S
 

Ähnlich wie DMDW Lesson 04 - Data Mining Theory (20)

Data warehousing interview questions
Data warehousing interview questionsData warehousing interview questions
Data warehousing interview questions
 
Unit 5
Unit 5 Unit 5
Unit 5
 
Unit-IV-Introduction to Data Warehousing .pptx
Unit-IV-Introduction to Data Warehousing .pptxUnit-IV-Introduction to Data Warehousing .pptx
Unit-IV-Introduction to Data Warehousing .pptx
 
Introduction to Data Mining
Introduction to Data Mining Introduction to Data Mining
Introduction to Data Mining
 
A Survey on Data Mining
A Survey on Data MiningA Survey on Data Mining
A Survey on Data Mining
 
Data Mining
Data MiningData Mining
Data Mining
 
Introduction to data science
Introduction to data scienceIntroduction to data science
Introduction to data science
 
Data mining
Data miningData mining
Data mining
 
Data mining
Data miningData mining
Data mining
 
Data mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, ClassificationData mining , Knowledge Discovery Process, Classification
Data mining , Knowledge Discovery Process, Classification
 
data mining
data miningdata mining
data mining
 
Data mining introduction
Data mining introductionData mining introduction
Data mining introduction
 
A Practical Approach To Data Mining Presentation
A Practical Approach To Data Mining PresentationA Practical Approach To Data Mining Presentation
A Practical Approach To Data Mining Presentation
 
Key Principles Of Data Mining
Key Principles Of Data MiningKey Principles Of Data Mining
Key Principles Of Data Mining
 
IRJET- Fault Detection and Prediction of Failure using Vibration Analysis
IRJET-	 Fault Detection and Prediction of Failure using Vibration AnalysisIRJET-	 Fault Detection and Prediction of Failure using Vibration Analysis
IRJET- Fault Detection and Prediction of Failure using Vibration Analysis
 
Data Mining & Data Warehousing Lecture Notes
Data Mining & Data Warehousing Lecture NotesData Mining & Data Warehousing Lecture Notes
Data Mining & Data Warehousing Lecture Notes
 
Abstract
AbstractAbstract
Abstract
 
Modern trends in information systems
Modern trends in information systemsModern trends in information systems
Modern trends in information systems
 
Introduction to Data Warehousing
Introduction to Data WarehousingIntroduction to Data Warehousing
Introduction to Data Warehousing
 
Z36149154
Z36149154Z36149154
Z36149154
 

Mehr von Johannes Hoppe

EinfĂŒhrung in Angular 2
EinfĂŒhrung in Angular 2EinfĂŒhrung in Angular 2
EinfĂŒhrung in Angular 2Johannes Hoppe
 
MDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und Ionic
MDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und IonicMDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und Ionic
MDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und IonicJohannes Hoppe
 
2015 02-09 - NoSQL Vorlesung Mosbach
2015 02-09 - NoSQL Vorlesung Mosbach2015 02-09 - NoSQL Vorlesung Mosbach
2015 02-09 - NoSQL Vorlesung MosbachJohannes Hoppe
 
2012-06-25 - MapReduce auf Azure
2012-06-25 - MapReduce auf Azure2012-06-25 - MapReduce auf Azure
2012-06-25 - MapReduce auf AzureJohannes Hoppe
 
2013-06-25 - HTML5 & JavaScript Security
2013-06-25 - HTML5 & JavaScript Security2013-06-25 - HTML5 & JavaScript Security
2013-06-25 - HTML5 & JavaScript SecurityJohannes Hoppe
 
2013-06-24 - Software Craftsmanship with JavaScript
2013-06-24 - Software Craftsmanship with JavaScript2013-06-24 - Software Craftsmanship with JavaScript
2013-06-24 - Software Craftsmanship with JavaScriptJohannes Hoppe
 
2013-06-15 - Software Craftsmanship mit JavaScript
2013-06-15 - Software Craftsmanship mit JavaScript2013-06-15 - Software Craftsmanship mit JavaScript
2013-06-15 - Software Craftsmanship mit JavaScriptJohannes Hoppe
 
2013 05-03 - HTML5 & JavaScript Security
2013 05-03 -  HTML5 & JavaScript Security2013 05-03 -  HTML5 & JavaScript Security
2013 05-03 - HTML5 & JavaScript SecurityJohannes Hoppe
 
2013-03-23 - NoSQL Spartakiade
2013-03-23 - NoSQL Spartakiade2013-03-23 - NoSQL Spartakiade
2013-03-23 - NoSQL SpartakiadeJohannes Hoppe
 
2013 02-26 - Software Tests with Mongo db
2013 02-26 - Software Tests with Mongo db2013 02-26 - Software Tests with Mongo db
2013 02-26 - Software Tests with Mongo dbJohannes Hoppe
 
2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices
2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices
2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best PracticesJohannes Hoppe
 
2012-10-16 - WebTechCon 2012: HTML5 & WebGL
2012-10-16 - WebTechCon 2012: HTML5 & WebGL2012-10-16 - WebTechCon 2012: HTML5 & WebGL
2012-10-16 - WebTechCon 2012: HTML5 & WebGLJohannes Hoppe
 
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
2012-10-12 - NoSQL in .NET - mit Redis und MongodbJohannes Hoppe
 
2012-09-18 - HTML5 & WebGL
2012-09-18 - HTML5 & WebGL2012-09-18 - HTML5 & WebGL
2012-09-18 - HTML5 & WebGLJohannes Hoppe
 
2012-09-17 - WDC12: Node.js & MongoDB
2012-09-17 - WDC12: Node.js & MongoDB2012-09-17 - WDC12: Node.js & MongoDB
2012-09-17 - WDC12: Node.js & MongoDBJohannes Hoppe
 
2012-05-14 NoSQL in .NET - mit Redis und MongoDB
2012-05-14 NoSQL in .NET - mit Redis und MongoDB2012-05-14 NoSQL in .NET - mit Redis und MongoDB
2012-05-14 NoSQL in .NET - mit Redis und MongoDBJohannes Hoppe
 
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDBJohannes Hoppe
 
2012-04-12 - AOP .NET UserGroup Niederrhein
2012-04-12 - AOP .NET UserGroup Niederrhein2012-04-12 - AOP .NET UserGroup Niederrhein
2012-04-12 - AOP .NET UserGroup NiederrheinJohannes Hoppe
 
2012-03-20 - Getting started with Node.js and MongoDB on MS Azure
2012-03-20 - Getting started with Node.js and MongoDB on MS Azure2012-03-20 - Getting started with Node.js and MongoDB on MS Azure
2012-03-20 - Getting started with Node.js and MongoDB on MS AzureJohannes Hoppe
 
2012-01-31 NoSQL in .NET
2012-01-31 NoSQL in .NET2012-01-31 NoSQL in .NET
2012-01-31 NoSQL in .NETJohannes Hoppe
 

Mehr von Johannes Hoppe (20)

EinfĂŒhrung in Angular 2
EinfĂŒhrung in Angular 2EinfĂŒhrung in Angular 2
EinfĂŒhrung in Angular 2
 
MDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und Ionic
MDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und IonicMDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und Ionic
MDC kompakt 2014: Hybride Apps mit Cordova, AngularJS und Ionic
 
2015 02-09 - NoSQL Vorlesung Mosbach
2015 02-09 - NoSQL Vorlesung Mosbach2015 02-09 - NoSQL Vorlesung Mosbach
2015 02-09 - NoSQL Vorlesung Mosbach
 
2012-06-25 - MapReduce auf Azure
2012-06-25 - MapReduce auf Azure2012-06-25 - MapReduce auf Azure
2012-06-25 - MapReduce auf Azure
 
2013-06-25 - HTML5 & JavaScript Security
2013-06-25 - HTML5 & JavaScript Security2013-06-25 - HTML5 & JavaScript Security
2013-06-25 - HTML5 & JavaScript Security
 
2013-06-24 - Software Craftsmanship with JavaScript
2013-06-24 - Software Craftsmanship with JavaScript2013-06-24 - Software Craftsmanship with JavaScript
2013-06-24 - Software Craftsmanship with JavaScript
 
2013-06-15 - Software Craftsmanship mit JavaScript
2013-06-15 - Software Craftsmanship mit JavaScript2013-06-15 - Software Craftsmanship mit JavaScript
2013-06-15 - Software Craftsmanship mit JavaScript
 
2013 05-03 - HTML5 & JavaScript Security
2013 05-03 -  HTML5 & JavaScript Security2013 05-03 -  HTML5 & JavaScript Security
2013 05-03 - HTML5 & JavaScript Security
 
2013-03-23 - NoSQL Spartakiade
2013-03-23 - NoSQL Spartakiade2013-03-23 - NoSQL Spartakiade
2013-03-23 - NoSQL Spartakiade
 
2013 02-26 - Software Tests with Mongo db
2013 02-26 - Software Tests with Mongo db2013 02-26 - Software Tests with Mongo db
2013 02-26 - Software Tests with Mongo db
 
2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices
2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices
2013-02-21 - .NET UG Rhein-Neckar: JavaScript Best Practices
 
2012-10-16 - WebTechCon 2012: HTML5 & WebGL
2012-10-16 - WebTechCon 2012: HTML5 & WebGL2012-10-16 - WebTechCon 2012: HTML5 & WebGL
2012-10-16 - WebTechCon 2012: HTML5 & WebGL
 
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
2012-10-12 - NoSQL in .NET - mit Redis und Mongodb
 
2012-09-18 - HTML5 & WebGL
2012-09-18 - HTML5 & WebGL2012-09-18 - HTML5 & WebGL
2012-09-18 - HTML5 & WebGL
 
2012-09-17 - WDC12: Node.js & MongoDB
2012-09-17 - WDC12: Node.js & MongoDB2012-09-17 - WDC12: Node.js & MongoDB
2012-09-17 - WDC12: Node.js & MongoDB
 
2012-05-14 NoSQL in .NET - mit Redis und MongoDB
2012-05-14 NoSQL in .NET - mit Redis und MongoDB2012-05-14 NoSQL in .NET - mit Redis und MongoDB
2012-05-14 NoSQL in .NET - mit Redis und MongoDB
 
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
2012-05-10 - UG Karlsruhe: NoSQL in .NET - mit Redis und MongoDB
 
2012-04-12 - AOP .NET UserGroup Niederrhein
2012-04-12 - AOP .NET UserGroup Niederrhein2012-04-12 - AOP .NET UserGroup Niederrhein
2012-04-12 - AOP .NET UserGroup Niederrhein
 
2012-03-20 - Getting started with Node.js and MongoDB on MS Azure
2012-03-20 - Getting started with Node.js and MongoDB on MS Azure2012-03-20 - Getting started with Node.js and MongoDB on MS Azure
2012-03-20 - Getting started with Node.js and MongoDB on MS Azure
 
2012-01-31 NoSQL in .NET
2012-01-31 NoSQL in .NET2012-01-31 NoSQL in .NET
2012-01-31 NoSQL in .NET
 

DMDW Lesson 04 - Data Mining Theory

  • 1. STUDIEREN UND DURCHSTARTEN. Author I: Dip.-Inf. (FH) Johannes Hoppe Author II: M.Sc. Johannes Hofmeister Author III: Prof. Dr. Dieter Homeister Date: 25.03.2011
  • 2. Data Mining Theory Author I: Dip.-Inf. (FH) Johannes Hoppe Author II: M.Sc. Johannes Hofmeister Author III: Prof. Dr. Dieter Homeister Date: 25.03.2011
  • 3. 01 Applied Data Warehousing
  • 4. Applied Data Warehousing: A typical data flow looks like this: [diagram]
  • 5. Data Warehouse, practical task: Create groups of 3 people. Grab “dmdw_rooms_fh_heidelberg-2006-04-03.xls”. Explore the data in the “room reservations” spreadsheet. Discuss and create a simple database table / document that matches the data. Find a way to migrate the data from the Excel spreadsheet to the database. For today's system I recommend SQL Server Business Intelligence Development Studio + SQL Server.
  • 6. Data Warehouse, next practical task: One team will have to present a different solution to migrate the data. It should be a hands-on lab for the other students. I will upload the materials to my blog. Preferred time box: 45-90 minutes. First team will be: _________________________________ Next system: ?
  • 7. Data Warehouse, ETL teams: Team 1: Access (2x Sebastian, Matthias). Team 2: Access → Access2MySQL → MySQL (Mercedes, Fabian, Marcus, Albert). Team 3: Silverlight → MS SQL (Sebastian, Patrick). Team 4: PHP → MySQL (Lars, Maurice, Jeff). Next system: ?
  • 8. 02 Data Mining Theory
  • 9. Data Mining, Introduction (1/3): Data mining is done by running software that examines a database and looks for patterns in the data. A data warehouse by itself only responds to queries from users; it will not tell users about patterns in the data that they may not have thought about. Data mining is used to find such patterns and to mine key information from a data warehouse.
  • 10. Data Mining, Introduction (2/3): Data mining allows companies to collect information that makes them more productive and helps them beat their competitors. Data mining helps to identify: why customers buy certain products; ideas for very direct marketing; ideas for shelf placement; training of employees vs. employee retention; employee benefits vs. employee retention.
  • 11. Data Mining, Introduction (3/3): Data mining attempts to find patterns in data that we did not know about. Often data mining is just a new buzzword for statistics, but it differs from (school) statistics in that large volumes of data are used. Trivial information or well-known facts are not an aim of data mining!
  • 12.
  • 18. Data Mining, Implementing DM on Top of a DW (1/2): Data mining tools / mining algorithms require data! There are two approaches: copy data from the data warehouse and mine it, or mine the data directly in the data warehouse. Popular tools use a variety of data mining algorithms: association rules, genetic algorithms, decision trees, neural networks.
  • 19. Data Mining, Implementing DM on Top of a DW (2/2): a) Copy data from the data warehouse to the data mining tools. Advantage: data mining tools may organize the data so they can run faster. Disadvantage: moving large amounts of data can be very "expensive". b) Data mining tools access the data directly in the data warehouse. Advantage: no copy of the data is needed for data mining. Disadvantage: the data may not be organized in a way that is efficient for the tool.
  • 20. Data Mining, The Data Mining Process: Step 1: Data preparation: cleanup ("scrubbing"), selection, and checks of the data by specialists (data warehouse). Step 2: Analysis phase: process the data with a data mining algorithm. Step 3: Evaluation of the output: check whether something new was discovered.
  • 21. Data Mining, The Data Mining Process, Step 1 - Data preparation: It is useful to fetch data from a data warehouse. This eliminates the need to collect data from different sources, filter it, and handle inconsistencies. Theoretically a data warehouse is not absolutely necessary, but in practice it is. The data preparation process includes data selection and manipulation. Validation and cleaning are necessary to eliminate out-of-range values and to handle missing values in the raw data. This may include plausibility checks.
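
The validation and cleaning described on this slide can be sketched roughly as follows. This is a minimal illustration, not the course's method: the `seats` field, the plausibility range of 1 to 500, and the choice of median imputation are all assumptions made for the example.

```python
from statistics import median
from typing import List, Optional

# Hypothetical plausibility range for a "seats" column (an assumption).
MIN_SEATS, MAX_SEATS = 1, 500


def clean_seats(raw: Optional[str]) -> Optional[int]:
    """Return a validated seat count, or None if missing or implausible."""
    if raw is None or raw.strip() == "":
        return None  # missing value
    try:
        value = int(raw)
    except ValueError:
        return None  # not a number at all
    if not MIN_SEATS <= value <= MAX_SEATS:
        return None  # out-of-range value fails the plausibility check
    return value


def impute_with_median(values: List[Optional[int]]) -> List[Optional[int]]:
    """Replace None entries with the median of the valid entries."""
    valid = [v for v in values if v is not None]
    if not valid:
        return values  # nothing to impute from
    fill = median(valid)
    return [fill if v is None else v for v in values]


raw_column = ["30", "", "-5", "abc", "20", "9999", "40"]
cleaned = [clean_seats(r) for r in raw_column]
prepared = impute_with_median(cleaned)
print(cleaned)
print(prepared)
```

Whether an implausible value should be dropped, imputed, or sent back to a domain specialist depends on the data and the problem; the sketch only shows one possible policy.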
  • 22. Data Mining, The Data Mining Process, Step 1 - Data preparation: Even if the data in the data warehouse is already cleaned and filtered, experience shows that this is not good enough for data mining. Data preparation also includes formatting, scaling and transformation of the raw data, depending on the needs of the data mining algorithm. Examples: scaling of numeric data, currency conversion, or metric/inch conversion.
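
As a small illustration of the scaling and unit conversion mentioned here, the following sketch converts inches to centimetres and applies min-max scaling to the range 0 to 1. Min-max scaling is only one common choice and is an assumption of this example; the slide does not name a specific method, and the sample values are invented.

```python
from typing import List

INCH_TO_CM = 2.54  # one inch is defined as exactly 2.54 cm


def inches_to_cm(inches: float) -> float:
    """Convert a length from inches to centimetres."""
    return inches * INCH_TO_CM


def min_max_scale(values: List[float]) -> List[float]:
    """Linearly rescale values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


widths_in = [10.0, 20.0, 30.0]          # invented sample data, in inches
widths_cm = [inches_to_cm(w) for w in widths_in]
scaled = min_max_scale(widths_in)
print(widths_cm)
print(scaled)
```

Some algorithms (for example neural networks) tend to be sensitive to the scale of their inputs, which is one reason this step is tailored to the algorithm being used.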
  • 23. Data Mining, The Data Mining Process, Step 1 - Data preparation: Many joined tables may be involved, rows or columns may have to be selected, two fields may be combined into a ratio, or derived values may be needed. This process needs guidance from someone with good knowledge of the data and the problem domain. Data preparation usually consumes 50% to 80% of the data mining budget.
  • 24. Data Mining, The Data Mining Process, Step 2 - Analysis phase: Process the data with a data mining algorithm (information discovery). Algorithms built into Analysis Services: Association, Clustering, Decision Trees, Linear Regression, Logistic Regression, Naive Bayes, Neural Network.
  • 25. Data Mining, The Data Mining Process, Step 3 - Evaluation of the output: The interpretation and presentation of the results. The purpose is either decision support or application development. Presentation: a graphical representation is often useful to present the results to executives. Example in text form: "If a customer buys washers or dryers, 61% buy a service agreement. This pattern is present in 1.0% of the transactions".
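
The two percentages in the washer/dryer example read like the usual association-rule measures: the 61% as confidence and the 1.0% as support. That reading is an interpretation, not something the slide states explicitly. A minimal sketch of how both numbers could be computed, using invented toy transactions:

```python
from typing import FrozenSet, List, Set, Tuple


def rule_stats(
    transactions: List[FrozenSet[str]],
    antecedent_any: Set[str],
    consequent: str,
) -> Tuple[float, float]:
    """Return (support, confidence) for the rule
    'buys any item of antecedent_any' -> 'buys consequent'."""
    n_total = len(transactions)
    # Transactions containing at least one antecedent item ("washers or dryers").
    with_antecedent = [t for t in transactions if t & antecedent_any]
    # Of those, the transactions that also contain the consequent.
    with_both = [t for t in with_antecedent if consequent in t]
    support = len(with_both) / n_total if n_total else 0.0
    confidence = len(with_both) / len(with_antecedent) if with_antecedent else 0.0
    return support, confidence


# Invented toy data; real mining would run over a very large transaction table.
transactions = [
    frozenset({"washer", "service agreement"}),
    frozenset({"dryer"}),
    frozenset({"washer", "dryer", "service agreement"}),
    frozenset({"toaster"}),
    frozenset({"toaster", "service agreement"}),
]

support, confidence = rule_stats(transactions, {"washer", "dryer"}, "service agreement")
print(f"support={support:.1%}, confidence={confidence:.1%}")
```

Here support is the share of all transactions in which the whole pattern occurs, and confidence is the share of washer-or-dryer transactions that also include a service agreement.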
  • 26. Data Mining, The Data Mining Process, Step 3 - Evaluation of the output: Interpretation of the output data might be necessary. Data mining may replace DSS/EIS (which is mainly a query application with a graphical display). In addition to traditional business software with a clearly visible idea and algorithm, it can also offer the possibility of constructing automated decision support.
  • 27. Data Mining, The Data Mining Process, Step 3 - Evaluation of the output: Automated decision support, example: every loan application at a bank is passed to a previously trained neural network, which returns a score ranging from "loan rejected" to "loan approved". The neural network is trained with the results of data mining on a large number of loan contracts. Such algorithms may work even if the underlying processes are not well understood. Warning: neural networks are a black box!!
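
To make the loan-scoring idea concrete, here is a deliberately tiny sketch. It trains a single logistic neuron, which is a stand-in for the neural network on the slide and not a real multi-layer network. The two features (an income score between 0 and 1 and a prior-default flag), the training data, the learning rate and the epoch count are all invented for illustration.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Tuple[float, float], int]  # ((income_score, prior_default), label)

# Invented historical loan contracts: label 1 = approved, 0 = rejected.
HISTORY: List[Sample] = [
    ((0.9, 0.0), 1),
    ((0.8, 0.0), 1),
    ((0.7, 0.0), 1),
    ((0.2, 1.0), 0),
    ((0.3, 1.0), 0),
    ((0.1, 0.0), 0),
]


def sigmoid(z: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train(data: List[Sample], epochs: int = 2000, lr: float = 0.5):
    """Fit weights and bias with stochastic gradient descent on log loss."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            z = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(z) - label  # gradient of log loss w.r.t. z
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def score(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Score between 0 (loan rejected) and 1 (loan approved)."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


weights, bias = train(HISTORY)
good_applicant = score(weights, bias, (0.85, 0.0))
risky_applicant = score(weights, bias, (0.15, 1.0))
print(round(good_applicant, 3), round(risky_applicant, 3))
```

With only two weights this model is still easy to inspect; the black-box warning on the slide applies to networks with many hidden units, where the learned weights no longer map to a readable rule.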
  • 28. References, additional books and references for data mining: David A. Grossman, Ophir Frieder: Introduction to Data Mining, Illinois Institute of Technology, 2005. J.P. Bigus: Data Mining with Neural Networks, McGraw-Hill, 1996. Olivia Parr Rud et al.: Data Mining Cookbook - Modeling Data for Marketing, Risk, and Customer Relationship Management, Wiley, 2001. Nong Ye (ed.): The Handbook of Data Mining, Lawrence Erlbaum Associates, 2003. http://www.eruditionhome.com/datamining/ http://en.wikipedia.org/wiki/Data_mining http://www.the-data-mine.com/bin/view/Misc/IntroductionToDataMining
  • 29. THANK YOU FOR YOUR ATTENTION