Introduction to Data Mining
Course Overview
Acknowledgement
Literature
Data Mining: Concepts and Techniques, by J. Han & M. Kamber, Morgan Kaufmann Publishers, 2001
Pattern Classification, by R. Duda, P. Hart and D. Stork, 2nd edition, John Wiley & Sons, 2001
Introduction to Knowledge Discovery in Databases and Data Mining
Computational Knowledge Discovery
Terminology
Terminology - A Working Definition
Data Mining: On What Kind of Data? (Structure: 3D Anatomy; Function: 1D Signal; Metadata: Annotation)
Data Mining: Confluence of Multiple Disciplines
20x20 ~ 2^400 ≈ 10^120 patterns
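The pattern count on this slide can be checked with a few lines of Python. The sketch assumes the figure refers to a 20x20 grid of binary (on/off) cells; that reading is an interpretation, since the slide does not spell it out.

```python
# Assumption: "20x20" means a grid of 400 binary cells, so each cell
# doubles the number of possible patterns.
cells = 20 * 20
patterns = 2 ** cells          # Python integers are arbitrary precision

# The number of decimal digits gives the order of magnitude.
digits = len(str(patterns))
print(f"{cells} cells -> 2^{cells} has {digits} digits")
```

A number with 121 decimal digits lies between 10^120 and 10^121, which matches the order of magnitude quoted on the slide.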
Why Do We Need Data Mining? How do you explore millions of records, tens or hundreds of fields, and find patterns?
Why Do We Need Data Mining?
Why Do We Need Data Mining? (figure labels: Query, Result, (Latitude, Longitude) 1, (Latitude, Longitude) 2)
What Is It?
Applications of Data Mining
Data Mining Applications
Market Analysis
Corporate Analysis & Risk Management
Fraud Detection & Mining Unusual Patterns
Data Mining and Business Intelligence
Knowledge Discovery in Databases Process
KDD Process (figure labels: Precision Farming, Filter)
KDD Process
Knowledge Discovery
Required Effort for Each KDD Step
Data Mining Tools
Commercial and Research Tools
Software Engineering in Data Mining
D2K - Software Environment for Data Mining
D2K Architecture
Data Flow Programming Environment: D2K (screenshot labels: Jump Up Panes, Workspace, Tool Bar, Tool Menu, Side Tab Panes)
D2K Programming and Runtime Environment
Streamlined Data Mining Environment: D2K SL (screenshot labels: KDD Steps, Session, KDD Options, Workspace)
Data Mining Techniques in D2K
Data Mining at Work
Chart axes: Data Sources (Single, Multiple, Numerous) and Project Objectives
Chart entries, in slide order: Diagnostics; Target Marketing; Effluent Quality Control; Decision Support; Automation; Transaction Management; Cost Prediction (Warranty, Insurance Claims); Warranty Clustering; Territorial Ratemaking; Web Information Retrieval, Archival and Clustering; Auto Loss Ratio Predictions; Precision Farming; Bio-Informatics; Functional Foods; Heterogeneous Data Visualization; Crime Data Analysis; Data Fusion and Visualization; Survey Study of Disability
Examples of Data Mining Methods
Three Primary Data Mining Paradigms
Association Rules and  Market Basket Analysis
What Is Market Basket Analysis?
Market Basket Example
- Is soda typically purchased with bananas? Does the brand of soda make a difference?
- Where should detergents be placed in the store to maximize their sales?
- Are window cleaning products purchased when detergents and orange juice are bought together?
- How are the demographics of the neighborhood affecting what customers are buying?
Association Rules
Results: Useful, Trivial, or Inexplicable?
How Does It Work?

Grocery Point-of-Sale Transactions
Customer  Items
1         Orange Juice, Soda
2         Milk, Orange Juice, Window Cleaner
3         Orange Juice, Detergent
4         Orange Juice, Detergent, Soda
5         Window Cleaner, Soda

Co-Occurrence of Products
                OJ  Window Cleaner  Milk  Soda  Detergent
OJ              4   1               1     2     2
Window Cleaner  1   2               1     1     0
Milk            1   1               1     0     0
Soda            2   1               0     3     1
Detergent       2   0               0     1     2
How Does It Work?
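The co-occurrence counts can be derived directly from the five point-of-sale transactions. The following is a minimal sketch in plain Python; the helper names `co_occurrence` and `pair_count` are illustrative and do not come from the slides.

```python
from collections import Counter
from itertools import combinations_with_replacement

# The five transactions from the slide ("OJ" abbreviates Orange Juice).
transactions = [
    {"OJ", "Soda"},
    {"Milk", "OJ", "Window Cleaner"},
    {"OJ", "Detergent"},
    {"OJ", "Detergent", "Soda"},
    {"Window Cleaner", "Soda"},
]

def co_occurrence(baskets):
    """Count how many baskets contain each product and each product pair."""
    counts = Counter()
    for basket in baskets:
        # Pairs (a, a) give the diagonal; pairs (a, b) the off-diagonal cells.
        for a, b in combinations_with_replacement(sorted(basket), 2):
            counts[(a, b)] += 1
    return counts

counts = co_occurrence(transactions)

def pair_count(a, b):
    """Look up a cell of the symmetric co-occurrence matrix."""
    return counts[tuple(sorted((a, b)))]

print(pair_count("OJ", "OJ"))          # baskets containing orange juice
print(pair_count("OJ", "Soda"))        # customers 1 and 4
print(pair_count("OJ", "Detergent"))   # customers 3 and 4
```

Storing only the sorted pair keeps the matrix symmetric by construction, so each cell is counted once per basket.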
How Good Are the Rules?
Confidence and Support - How Good Are the Rules?
Confidence and Support

Transaction ID  Items
1               {1, 2, 3}
2               {1, 3}
3               {1, 4}
4               {2, 5, 6}

One-item set  Support
{1}           75 %
{2}           50 %
{3}           50 %
{4}           25 %

Two-item set  Support
{1, 2}        25 %
{1, 3}        50 %
{1, 4}        25 %
{2, 3}        25 %

For minimum support = 50% = 2 transactions and minimum confidence = 50%, consider the rule 1 => 3:
Support = Support({1,3}) = 50%
Confidence(1->3) = Support({1,3}) / Support({1}) = 50% / 75% ≈ 67%
Confidence(3->1) = Support({1,3}) / Support({3}) = 50% / 50% = 100%
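The support and confidence figures can be reproduced from the four transactions. This is a small sketch; the function names `support` and `confidence` are chosen here for illustration.

```python
# The four transactions from the slide, as sets of item IDs.
transactions = [{1, 2, 3}, {1, 3}, {1, 4}, {2, 5, 6}]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)

def confidence(lhs, rhs, baskets):
    """Support of lhs and rhs together, divided by the support of lhs."""
    return support(set(lhs) | set(rhs), baskets) / support(lhs, baskets)

print(f"support({{1,3}})   = {support({1, 3}, transactions):.0%}")
print(f"confidence(1->3) = {confidence({1}, {3}, transactions):.1%}")
print(f"confidence(3->1) = {confidence({3}, {1}, transactions):.1%}")
```

Both rules share the same support because support depends only on the combined itemset; confidence differs because the denominators Support({1}) and Support({3}) differ.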
Association Examples
The Basic Process
Choosing the Right Set of Items
Partial Product Taxonomy (General -> Specific)
Frozen Foods
  Frozen Desserts: Frozen Yogurt, Frozen Fruit Bars, Ice Cream
    Ice Cream: Rocky Road, Chocolate, Strawberry, Vanilla, Cherry Garcia, Other
  Frozen Vegetables: Peas, Carrots, Mixed, Other
  Frozen Dinners
Example - Minimum Support Pruning / Rule Generation

Transaction ID  Items
1               {1, 3, 4}
2               {2, 3, 5}
3               {1, 2, 3, 5}
4               {2, 5}

Pass 1: Scan Database, Find Level of Support
Itemset  Support
{1}      2
{2}      3
{3}      3
{4}      1
{5}      3

After pruning
Itemset  Support
{2}      3
{3}      3
{5}      3

Pass 2: Find Pairings of {2}, {3}, {5}; Scan Database, Find Level of Support
Itemset  Support
{2, 3}   2
{2, 5}   3
{3, 5}   2

After pruning
Itemset  Support
{2, 5}   3

Two rules with the highest support for the two-item set: 2->5 and 5->2
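The two passes above can be sketched in Python. The slide does not state the pruning threshold; the sketch assumes a minimum support count of 3 (`MIN_COUNT`), which is the value the pruned tables imply. It covers only single items and pairs, not a full Apriori implementation.

```python
from itertools import combinations

transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]
MIN_COUNT = 3  # assumption: inferred from which itemsets survive on the slide

def count(itemset):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

# Pass 1: count single items, then prune those below the threshold.
items = sorted(set().union(*transactions))
level1 = {frozenset([i]): count([i]) for i in items}
frequent1 = {s: c for s, c in level1.items() if c >= MIN_COUNT}

# Pass 2: pair up the surviving items only, count, and prune again.
survivors = sorted(i for s in frequent1 for i in s)
level2 = {frozenset(p): count(p) for p in combinations(survivors, 2)}
frequent2 = {s: c for s, c in level2.items() if c >= MIN_COUNT}

# Rule generation: each frequent pair yields one rule per direction,
# with confidence = pair count / count of the left-hand side.
rules = []
for pair, c in frequent2.items():
    for lhs in pair:
        (rhs,) = pair - {lhs}
        rules.append((lhs, rhs, c / level1[frozenset([lhs])]))

for lhs, rhs, conf in sorted(rules):
    print(f"{lhs} -> {rhs}  confidence {conf:.0%}")
```

Pairing only the items that survive the first pass is what keeps the candidate set small: an itemset cannot be frequent if one of its subsets is not.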
Other Association Rule Applications
Strengths of Market Basket Analysis
Weaknesses of Market Basket Analysis
Decision Tree Learning
Example: Supervised Learning with Decision Trees
Decision Tree Learning
Decision Trees
Decision Tree for Concept: PlayTennis
Outlook?
  Sunny -> Humidity?
    High -> No
    Normal -> Yes
  Overcast -> Yes
  Rain -> Wind?
    Strong -> No
    Light -> Yes
Decision Trees and Decision Boundaries
How to Visualize Decision Trees? Example: Dividing Instance Space into Axis-Parallel Rectangles
(figure: + and - examples in the x-y plane, axis ticks at 1, 3, 5, 7; tree tests: y > 7?, x < 3?, y < 5?, x < 1?)
More than two variables?
An Illustrative Example
Training Examples for Concept PlayTennis

Day  Outlook   Temperature  Humidity  Wind    PlayTennis?
1    Sunny     Hot          High      Light   No
2    Sunny     Hot          High      Strong  No
3    Overcast  Hot          High      Light   Yes
4    Rain      Mild         High      Light   Yes
5    Rain      Cool         Normal    Light   Yes
6    Rain      Cool         Normal    Strong  No
7    Overcast  Cool         Normal    Strong  Yes
8    Sunny     Mild         High      Light   No
9    Sunny     Cool         Normal    Light   Yes
10   Rain      Mild         Normal    Light   Yes
11   Sunny     Mild         Normal    Strong  Yes
12   Overcast  Mild         High      Strong  Yes
13   Overcast  Hot          Normal    Light   Yes
14   Rain      Mild         High      Strong  No
Constructing a Decision Tree for PlayTennis: The Initial Decision Tree with One Leaf
All 14 training examples: [9+, 5-]
E(D) = min(9/14, 5/14) = 5/14 ≈ 36%
Constructing a Decision Tree for PlayTennis: Potential Splits of Root Node
Root: [9+, 5-]
Outlook: Sunny [2+, 3-], Overcast [4+, 0-], Rain [3+, 2-]
Temperature: Cool [3+, 1-], Mild [4+, 2-], Hot [2+, 2-]
Humidity: High [3+, 4-], Normal [6+, 1-]
Wind: Light [6+, 2-], Strong [3+, 3-]

E(Split/Outlook) = (5/14) - ((5/14)(min(2/5,3/5)) + (4/14)(min(4/4,0/4)) + (5/14)(min(3/5,2/5))) = 1/14 ≈ 7%
E(Split/Temperature) = (5/14) - ((4/14)(min(3/4,1/4)) + (6/14)(min(4/6,2/6)) + (4/14)(min(2/4,2/4))) = 0%
E(Split/Humidity) = (5/14) - ((7/14)(min(3/7,4/7)) + (7/14)(min(6/7,1/7))) = 1/14 ≈ 7%
E(Split/Wind) = (5/14) - ((8/14)(min(6/8,2/8)) + (6/14)(min(3/6,3/6))) = 0%
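The four split scores can be recomputed exactly with rational arithmetic. This sketch uses the slide's misclassification-error measure, min(p+, p-), rather than entropy; the function names `error` and `error_reduction` are illustrative.

```python
from fractions import Fraction

def error(pos, neg):
    """Misclassification error of a node that predicts its majority class."""
    return Fraction(min(pos, neg), pos + neg)

def error_reduction(parent, children):
    """Parent error minus the size-weighted error of the child nodes."""
    total = sum(parent)
    weighted = sum(Fraction(p + n, total) * error(p, n) for p, n in children)
    return error(*parent) - weighted

parent = (9, 5)  # [9+, 5-] at the root
splits = {
    "Outlook": [(2, 3), (4, 0), (3, 2)],      # Sunny, Overcast, Rain
    "Temperature": [(3, 1), (4, 2), (2, 2)],  # Cool, Mild, Hot
    "Humidity": [(3, 4), (6, 1)],             # High, Normal
    "Wind": [(6, 2), (3, 3)],                 # Light, Strong
}

reductions = {name: error_reduction(parent, kids) for name, kids in splits.items()}
for name, r in reductions.items():
    print(f"E(Split/{name}) = {r} = {float(r):.1%}")
```

Outlook and Humidity tie at 1/14 under this measure, so the score alone does not decide between them.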
Constructing a Decision Tree for PlayTennis
Outlook? Days 1-14 [9+, 5-]
  Sunny: days 1, 2, 8, 9, 11 [2+, 3-] -> Humidity?
    High: days 1, 2, 8 [0+, 3-] -> No
    Normal: days 9, 11 [2+, 0-] -> Yes
  Overcast: days 3, 7, 12, 13 [4+, 0-] -> Yes
  Rain: days 4, 5, 6, 10, 14 [3+, 2-] -> Wind?
    Strong: days 6, 14 [0+, 2-] -> No
    Light: days 4, 5, 10 [3+, 0-] -> Yes
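The finished tree is equivalent to a few nested conditions. The sketch below encodes it as a function (the name `play_tennis` is illustrative) and checks it against the 14 training examples from the PlayTennis slides. Temperature is omitted because the tree never tests it.

```python
def play_tennis(outlook, humidity, wind):
    """Classify one day by walking the PlayTennis tree from the root."""
    if outlook == "Overcast":
        return "Yes"
    if outlook == "Sunny":
        return "Yes" if humidity == "Normal" else "No"
    # Remaining branch: outlook == "Rain"
    return "Yes" if wind == "Light" else "No"

# (outlook, humidity, wind, label) for days 1..14 of the training set.
examples = [
    ("Sunny", "High", "Light", "No"),
    ("Sunny", "High", "Strong", "No"),
    ("Overcast", "High", "Light", "Yes"),
    ("Rain", "High", "Light", "Yes"),
    ("Rain", "Normal", "Light", "Yes"),
    ("Rain", "Normal", "Strong", "No"),
    ("Overcast", "Normal", "Strong", "Yes"),
    ("Sunny", "High", "Light", "No"),
    ("Sunny", "Normal", "Light", "Yes"),
    ("Rain", "Normal", "Light", "Yes"),
    ("Sunny", "Normal", "Strong", "Yes"),
    ("Overcast", "High", "Strong", "Yes"),
    ("Overcast", "Normal", "Light", "Yes"),
    ("Rain", "High", "Strong", "No"),
]

# Days (1-based) on which the tree disagrees with the training label.
misclassified = [
    day
    for day, (outlook, humidity, wind, label) in enumerate(examples, start=1)
    if play_tennis(outlook, humidity, wind) != label
]
print(f"misclassified training days: {misclassified}")
```

Every leaf of the tree is pure ([0+, 3-], [2+, 0-], and so on), so the tree reproduces all training labels; this says nothing about accuracy on unseen days.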
Strengths of Decision Trees
Weaknesses of Decision Trees
Visualization
Visualization Example: Naïve Bayesian Three Flower Types; Petal and Sepal Based Classification
Naïve Bayesian Visualization: Notice Iris-versicolor has a 33% likelihood
Rule Association Visualization
Discovery Using Rule Association
Parallel Coordinates - Visualization
Scatterplots - Visualization
Image To Knowledge (I2K): Data Visualization
Image To Knowledge (I2K): Visualization of Results
T2K - Text to Knowledge: Topic Evolution
Protein Consumption Dynamics
Data Comparison, Reduction & Synthesis
Summary

Weitere ähnliche Inhalte

Was ist angesagt?

Data mining slides
Data mining slidesData mining slides
Data mining slides
smj
 
Data mining (lecture 1 & 2) conecpts and techniques
Data mining (lecture 1 & 2) conecpts and techniquesData mining (lecture 1 & 2) conecpts and techniques
Data mining (lecture 1 & 2) conecpts and techniques
Saif Ullah
 

Was ist angesagt? (20)

Data mining slides
Data mining slidesData mining slides
Data mining slides
 
Data Mining: What is Data Mining?
Data Mining: What is Data Mining?Data Mining: What is Data Mining?
Data Mining: What is Data Mining?
 
Application of data mining
Application of data miningApplication of data mining
Application of data mining
 
Data Mining & Applications
Data Mining & ApplicationsData Mining & Applications
Data Mining & Applications
 
Kdd process
Kdd processKdd process
Kdd process
 
Data mining
Data miningData mining
Data mining
 
OLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSEOLAP & DATA WAREHOUSE
OLAP & DATA WAREHOUSE
 
Introduction to-data-mining chapter 1
Introduction to-data-mining  chapter 1Introduction to-data-mining  chapter 1
Introduction to-data-mining chapter 1
 
Introduction to Data mining
Introduction to Data miningIntroduction to Data mining
Introduction to Data mining
 
Data mining (lecture 1 & 2) conecpts and techniques
Data mining (lecture 1 & 2) conecpts and techniquesData mining (lecture 1 & 2) conecpts and techniques
Data mining (lecture 1 & 2) conecpts and techniques
 
Data mining presentation.ppt
Data mining presentation.pptData mining presentation.ppt
Data mining presentation.ppt
 
Data Warehouse
Data Warehouse Data Warehouse
Data Warehouse
 
Data Mining: Classification and analysis
Data Mining: Classification and analysisData Mining: Classification and analysis
Data Mining: Classification and analysis
 
Introduction to data science.pptx
Introduction to data science.pptxIntroduction to data science.pptx
Introduction to data science.pptx
 
Data Mining
Data MiningData Mining
Data Mining
 
Data analytics
Data analyticsData analytics
Data analytics
 
Data preprocessing
Data preprocessingData preprocessing
Data preprocessing
 
Data Mining Overview
Data Mining OverviewData Mining Overview
Data Mining Overview
 
Data Mining Concepts
Data Mining ConceptsData Mining Concepts
Data Mining Concepts
 
Data mining
Data miningData mining
Data mining
 

Andere mochten auch

Introduction to R for Data Mining (Feb 2013)
Introduction to R for Data Mining (Feb 2013)Introduction to R for Data Mining (Feb 2013)
Introduction to R for Data Mining (Feb 2013)
Revolution Analytics
 
Data mining- Association Analysis -market basket
Data mining- Association Analysis -market basketData mining- Association Analysis -market basket
Data mining- Association Analysis -market basket
Swapnil Soni
 

Andere mochten auch (16)

Data Mining: an Introduction
Data Mining: an IntroductionData Mining: an Introduction
Data Mining: an Introduction
 
Data mining
Data miningData mining
Data mining
 
introduction to data mining tutorial
introduction to data mining tutorial introduction to data mining tutorial
introduction to data mining tutorial
 
Data mining
Data miningData mining
Data mining
 
Introduction to DataMining
Introduction to DataMiningIntroduction to DataMining
Introduction to DataMining
 
Lecture 01 Data Mining
Lecture 01 Data MiningLecture 01 Data Mining
Lecture 01 Data Mining
 
Introduction to R for Data Mining (Feb 2013)
Introduction to R for Data Mining (Feb 2013)Introduction to R for Data Mining (Feb 2013)
Introduction to R for Data Mining (Feb 2013)
 
Data Mining
Data Mining Data Mining
Data Mining
 
Introduction to Data Mining
Introduction to Data MiningIntroduction to Data Mining
Introduction to Data Mining
 
Basic Overview of Data Mining
Basic Overview of Data MiningBasic Overview of Data Mining
Basic Overview of Data Mining
 
Market Basket Analysis in SAS
Market Basket Analysis in SASMarket Basket Analysis in SAS
Market Basket Analysis in SAS
 
What Is DATA MINING(INTRODUCTION)
What Is DATA MINING(INTRODUCTION)What Is DATA MINING(INTRODUCTION)
What Is DATA MINING(INTRODUCTION)
 
Data mining- Association Analysis -market basket
Data mining- Association Analysis -market basketData mining- Association Analysis -market basket
Data mining- Association Analysis -market basket
 
An introduction to data mining and its techniques
An introduction to data mining and its techniquesAn introduction to data mining and its techniques
An introduction to data mining and its techniques
 
Data Mining Techniques
Data Mining TechniquesData Mining Techniques
Data Mining Techniques
 
Market basket analysis
Market basket analysisMarket basket analysis
Market basket analysis
 

Ähnlich wie Introduction To Data Mining

Data Mining Xuequn Shang NorthWestern Polytechnical University
Data Mining Xuequn Shang NorthWestern Polytechnical UniversityData Mining Xuequn Shang NorthWestern Polytechnical University
Data Mining Xuequn Shang NorthWestern Polytechnical University
butest
 
Dwdmunit1 a
Dwdmunit1 aDwdmunit1 a
Dwdmunit1 a
bhagathk
 
Unit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.pptUnit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.ppt
PadmajaLaksh
 

Ähnlich wie Introduction To Data Mining (20)

Data mining
Data miningData mining
Data mining
 
Data Mining Xuequn Shang NorthWestern Polytechnical University
Data Mining Xuequn Shang NorthWestern Polytechnical UniversityData Mining Xuequn Shang NorthWestern Polytechnical University
Data Mining Xuequn Shang NorthWestern Polytechnical University
 
6months industrial training in data mining,ludhiana
6months industrial training in data mining,ludhiana6months industrial training in data mining,ludhiana
6months industrial training in data mining,ludhiana
 
6months industrial training in data mining, jalandhar
6months industrial training in data mining, jalandhar6months industrial training in data mining, jalandhar
6months industrial training in data mining, jalandhar
 
6 weeks summer training in data mining,ludhiana
6 weeks summer training in data mining,ludhiana6 weeks summer training in data mining,ludhiana
6 weeks summer training in data mining,ludhiana
 
6 weeks summer training in data mining,jalandhar
6 weeks summer training in data mining,jalandhar6 weeks summer training in data mining,jalandhar
6 weeks summer training in data mining,jalandhar
 
Introduction to data warehouse
Introduction to data warehouseIntroduction to data warehouse
Introduction to data warehouse
 
Data mining 1
Data mining 1Data mining 1
Data mining 1
 
Talk
TalkTalk
Talk
 
Introduction
IntroductionIntroduction
Introduction
 
Data mining 1 - Introduction (cheat sheet - printable)
Data mining 1 - Introduction (cheat sheet - printable)Data mining 1 - Introduction (cheat sheet - printable)
Data mining 1 - Introduction (cheat sheet - printable)
 
Data mining
Data miningData mining
Data mining
 
Dma unit 1
Dma unit   1Dma unit   1
Dma unit 1
 
Data mining final year project in jalandhar
Data mining final year project in jalandharData mining final year project in jalandhar
Data mining final year project in jalandhar
 
Data mining final year project in ludhiana
Data mining final year project in ludhianaData mining final year project in ludhiana
Data mining final year project in ludhiana
 
Introduction.ppt
Introduction.pptIntroduction.ppt
Introduction.ppt
 
Lect 1 introduction
Lect 1 introductionLect 1 introduction
Lect 1 introduction
 
Dwdmunit1 a
Dwdmunit1 aDwdmunit1 a
Dwdmunit1 a
 
Unit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.pptUnit 1 (Chapter-1) on data mining concepts.ppt
Unit 1 (Chapter-1) on data mining concepts.ppt
 
Data mining
Data miningData mining
Data mining
 

Mehr von Phi Jack

Zara's Fast-Fashion Edge
Zara's Fast-Fashion EdgeZara's Fast-Fashion Edge
Zara's Fast-Fashion Edge
Phi Jack
 
The vietnamese seafood sector - A value chain analysis
The vietnamese seafood sector - A value chain analysisThe vietnamese seafood sector - A value chain analysis
The vietnamese seafood sector - A value chain analysis
Phi Jack
 
Vietnam Retail Market Report, Nielsen
Vietnam Retail Market Report, NielsenVietnam Retail Market Report, Nielsen
Vietnam Retail Market Report, Nielsen
Phi Jack
 
ID.com's prospectus for IPO
ID.com's prospectus for IPOID.com's prospectus for IPO
ID.com's prospectus for IPO
Phi Jack
 
Color theory
Color theoryColor theory
Color theory
Phi Jack
 
China E-commerce Analytics [Credit Suisse]
China E-commerce Analytics [Credit Suisse]China E-commerce Analytics [Credit Suisse]
China E-commerce Analytics [Credit Suisse]
Phi Jack
 
How Businesses Fare with Daily Deals: A Multi-Site Analysis of Groupon, Livin...
How Businesses Fare with Daily Deals: A Multi-Site Analysis of Groupon, Livin...How Businesses Fare with Daily Deals: A Multi-Site Analysis of Groupon, Livin...
How Businesses Fare with Daily Deals: A Multi-Site Analysis of Groupon, Livin...
Phi Jack
 
User behavior
User behaviorUser behavior
User behavior
Phi Jack
 
Buoi Thuyet Trinh Philip Kotler
Buoi Thuyet Trinh Philip KotlerBuoi Thuyet Trinh Philip Kotler
Buoi Thuyet Trinh Philip Kotler
Phi Jack
 
Huong Dan Ap Dung ISO 9001
Huong Dan Ap Dung ISO 9001Huong Dan Ap Dung ISO 9001
Huong Dan Ap Dung ISO 9001
Phi Jack
 
FinalStyle Ms Excel
FinalStyle Ms ExcelFinalStyle Ms Excel
FinalStyle Ms Excel
Phi Jack
 
Nguoi Tieu Dung
Nguoi Tieu DungNguoi Tieu Dung
Nguoi Tieu Dung
Phi Jack
 
Google Story
Google StoryGoogle Story
Google Story
Phi Jack
 
e-Marketing
e-Marketinge-Marketing
e-Marketing
Phi Jack
 

Mehr von Phi Jack (20)

Vietnam Retail Store Modern Trade Trend 2022.pdf
Vietnam Retail Store Modern Trade Trend 2022.pdfVietnam Retail Store Modern Trade Trend 2022.pdf
Vietnam Retail Store Modern Trade Trend 2022.pdf
 
K-Beauty E-catalog
K-Beauty E-catalogK-Beauty E-catalog
K-Beauty E-catalog
 
Market Research on Beauty Industry in Vietnam
Market Research on Beauty Industry in VietnamMarket Research on Beauty Industry in Vietnam
Market Research on Beauty Industry in Vietnam
 
Hành Vi Người Dùng Internet Vietnam 2015 - Google
Hành Vi Người Dùng Internet Vietnam 2015 - GoogleHành Vi Người Dùng Internet Vietnam 2015 - Google
Hành Vi Người Dùng Internet Vietnam 2015 - Google
 
Rocket Internet 2014 & Q1 2015 Results Report
Rocket Internet 2014 & Q1 2015 Results ReportRocket Internet 2014 & Q1 2015 Results Report
Rocket Internet 2014 & Q1 2015 Results Report
 
Hành vi mua sắm Online của Phụ nữ Châu Á
Hành vi mua sắm Online của Phụ nữ Châu ÁHành vi mua sắm Online của Phụ nữ Châu Á
Hành vi mua sắm Online của Phụ nữ Châu Á
 
Zara's Fast-Fashion Edge
Zara's Fast-Fashion EdgeZara's Fast-Fashion Edge
Zara's Fast-Fashion Edge
 
The vietnamese seafood sector - A value chain analysis
The vietnamese seafood sector - A value chain analysisThe vietnamese seafood sector - A value chain analysis
The vietnamese seafood sector - A value chain analysis
 
Vietnam Retail Market Report, Nielsen
Vietnam Retail Market Report, NielsenVietnam Retail Market Report, Nielsen
Vietnam Retail Market Report, Nielsen
 
ID.com's prospectus for IPO
ID.com's prospectus for IPOID.com's prospectus for IPO
ID.com's prospectus for IPO
 
Color theory
Color theoryColor theory
Color theory
 
China E-commerce Analytics [Credit Suisse]
China E-commerce Analytics [Credit Suisse]China E-commerce Analytics [Credit Suisse]
China E-commerce Analytics [Credit Suisse]
 
How Businesses Fare with Daily Deals: A Multi-Site Analysis of Groupon, Livin...
How Businesses Fare with Daily Deals: A Multi-Site Analysis of Groupon, Livin...How Businesses Fare with Daily Deals: A Multi-Site Analysis of Groupon, Livin...
How Businesses Fare with Daily Deals: A Multi-Site Analysis of Groupon, Livin...
 
User behavior
User behaviorUser behavior
User behavior
 
Buoi Thuyet Trinh Philip Kotler
Buoi Thuyet Trinh Philip KotlerBuoi Thuyet Trinh Philip Kotler
Buoi Thuyet Trinh Philip Kotler
 
Huong Dan Ap Dung ISO 9001
Huong Dan Ap Dung ISO 9001Huong Dan Ap Dung ISO 9001
Huong Dan Ap Dung ISO 9001
 
FinalStyle Ms Excel
FinalStyle Ms ExcelFinalStyle Ms Excel
FinalStyle Ms Excel
 
Nguoi Tieu Dung
Nguoi Tieu DungNguoi Tieu Dung
Nguoi Tieu Dung
 
Google Story
Google StoryGoogle Story
Google Story
 
e-Marketing
e-Marketinge-Marketing
e-Marketing
 

Kürzlich hochgeladen

Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al MizharAl Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
allensay1
 
Mckinsey foundation level Handbook for Viewing
Mckinsey foundation level Handbook for ViewingMckinsey foundation level Handbook for Viewing
Mckinsey foundation level Handbook for Viewing
Nauman Safdar
 
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan CytotecJual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
ZurliaSoop
 
Quick Doctor In Kuwait +2773`7758`557 Kuwait Doha Qatar Dubai Abu Dhabi Sharj...
Quick Doctor In Kuwait +2773`7758`557 Kuwait Doha Qatar Dubai Abu Dhabi Sharj...Quick Doctor In Kuwait +2773`7758`557 Kuwait Doha Qatar Dubai Abu Dhabi Sharj...
Quick Doctor In Kuwait +2773`7758`557 Kuwait Doha Qatar Dubai Abu Dhabi Sharj...
daisycvs
 

Kürzlich hochgeladen (20)

Phases of Negotiation .pptx
 Phases of Negotiation .pptx Phases of Negotiation .pptx
Phases of Negotiation .pptx
 
PARK STREET 💋 Call Girl 9827461493 Call Girls in Escort service book now
PARK STREET 💋 Call Girl 9827461493 Call Girls in  Escort service book nowPARK STREET 💋 Call Girl 9827461493 Call Girls in  Escort service book now
PARK STREET 💋 Call Girl 9827461493 Call Girls in Escort service book now
 
Chennai Call Gril 80022//12248 Only For Sex And High Profile Best Gril Sex Av...
Chennai Call Gril 80022//12248 Only For Sex And High Profile Best Gril Sex Av...Chennai Call Gril 80022//12248 Only For Sex And High Profile Best Gril Sex Av...
Chennai Call Gril 80022//12248 Only For Sex And High Profile Best Gril Sex Av...
 
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al MizharAl Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
Al Mizhar Dubai Escorts +971561403006 Escorts Service In Al Mizhar
 
Arti Languages Pre Seed Teaser Deck 2024.pdf
Arti Languages Pre Seed Teaser Deck 2024.pdfArti Languages Pre Seed Teaser Deck 2024.pdf
Arti Languages Pre Seed Teaser Deck 2024.pdf
 
Ooty Call Gril 80022//12248 Only For Sex And High Profile Best Gril Sex Avail...
Ooty Call Gril 80022//12248 Only For Sex And High Profile Best Gril Sex Avail...Ooty Call Gril 80022//12248 Only For Sex And High Profile Best Gril Sex Avail...
Ooty Call Gril 80022//12248 Only For Sex And High Profile Best Gril Sex Avail...
 
Getting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAI
Getting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAIGetting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAI
Getting Real with AI - Columbus DAW - May 2024 - Nick Woo from AlignAI
 
SEO Case Study: How I Increased SEO Traffic & Ranking by 50-60% in 6 Months
SEO Case Study: How I Increased SEO Traffic & Ranking by 50-60%  in 6 MonthsSEO Case Study: How I Increased SEO Traffic & Ranking by 50-60%  in 6 Months
SEO Case Study: How I Increased SEO Traffic & Ranking by 50-60% in 6 Months
 
Mckinsey foundation level Handbook for Viewing
Mckinsey foundation level Handbook for ViewingMckinsey foundation level Handbook for Viewing
Mckinsey foundation level Handbook for Viewing
 
CROSS CULTURAL NEGOTIATION BY PANMISEM NS
CROSS CULTURAL NEGOTIATION BY PANMISEM NSCROSS CULTURAL NEGOTIATION BY PANMISEM NS
CROSS CULTURAL NEGOTIATION BY PANMISEM NS
 
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan CytotecJual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
Jual Obat Aborsi ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan Cytotec
 
Pre Engineered Building Manufacturers Hyderabad.pptx
Pre Engineered  Building Manufacturers Hyderabad.pptxPre Engineered  Building Manufacturers Hyderabad.pptx
Pre Engineered Building Manufacturers Hyderabad.pptx
 
UAE Bur Dubai Call Girls ☏ 0564401582 Call Girl in Bur Dubai
UAE Bur Dubai Call Girls ☏ 0564401582 Call Girl in Bur DubaiUAE Bur Dubai Call Girls ☏ 0564401582 Call Girl in Bur Dubai
UAE Bur Dubai Call Girls ☏ 0564401582 Call Girl in Bur Dubai
 
Unveiling Falcon Invoice Discounting: Leading the Way as India's Premier Bill...
Unveiling Falcon Invoice Discounting: Leading the Way as India's Premier Bill...Unveiling Falcon Invoice Discounting: Leading the Way as India's Premier Bill...
Unveiling Falcon Invoice Discounting: Leading the Way as India's Premier Bill...
 
Putting the SPARK into Virtual Training.pptx
Putting the SPARK into Virtual Training.pptxPutting the SPARK into Virtual Training.pptx
Putting the SPARK into Virtual Training.pptx
 
Katrina Personal Brand Project and portfolio 1
Katrina Personal Brand Project and portfolio 1Katrina Personal Brand Project and portfolio 1
Katrina Personal Brand Project and portfolio 1
 
Falcon Invoice Discounting: The best investment platform in india for investors
Falcon Invoice Discounting: The best investment platform in india for investorsFalcon Invoice Discounting: The best investment platform in india for investors
Falcon Invoice Discounting: The best investment platform in india for investors
 
Berhampur 70918*19311 CALL GIRLS IN ESCORT SERVICE WE ARE PROVIDING
Berhampur 70918*19311 CALL GIRLS IN ESCORT SERVICE WE ARE PROVIDINGBerhampur 70918*19311 CALL GIRLS IN ESCORT SERVICE WE ARE PROVIDING
Berhampur 70918*19311 CALL GIRLS IN ESCORT SERVICE WE ARE PROVIDING
 
Quick Doctor In Kuwait +2773`7758`557 Kuwait Doha Qatar Dubai Abu Dhabi Sharj...
Quick Doctor In Kuwait +2773`7758`557 Kuwait Doha Qatar Dubai Abu Dhabi Sharj...Quick Doctor In Kuwait +2773`7758`557 Kuwait Doha Qatar Dubai Abu Dhabi Sharj...
Quick Doctor In Kuwait +2773`7758`557 Kuwait Doha Qatar Dubai Abu Dhabi Sharj...
 
WheelTug Short Pitch Deck 2024 | Byond Insights
WheelTug Short Pitch Deck 2024 | Byond InsightsWheelTug Short Pitch Deck 2024 | Byond Insights
WheelTug Short Pitch Deck 2024 | Byond Insights
 

Introduction To Data Mining

  • 2.
  • 3.
  • 4. Literature Data Mining – Concepts and Techniques by J. Han & M. Kamber, Morgan Kaufmann Publishers, 2001 Pattern Classification by R. Duda, P. Hart and D. Stork, 2 nd edition, John Wiley & Sons, 2001
  • 5. Introduction to Knowledge Discovery in Databases and Data Mining
  • 7.
  • 8.
  • 9.
  • 10. Data Mining: Confluence of Multiple Disciplines ? 20x20 ~ 2^400  10^120 patterns
  • 11.
  • 12.
  • 13.
  • 14.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20. Data Mining and Business Intelligence
  • 21. Knowledge Discovery in Databases Process
  • 22.
  • 23.
  • 25.
  • 27.
  • 28.
  • 29.
  • 30.
  • 31. Data Flow Programming Environment: D2K (screenshot labels: Jump Up Panes, Workspace, Tool Bar, Tool Menu, Side Tab Panes)
  • 32. D2K Programming and Runtime Environment
  • 33. Streamlined Data Mining Environment: D2K SL (screenshot labels: KDD Steps, Session, KDD Options, Workspace)
  • 34.
  • 35. Data Mining at Work (chart of projects by Data Sources and Project Objectives)
      Data Sources: Single, Multiple, Numerous
      Chart labels: Diagnostics; Target Marketing; Effluent Quality Control; Decision Support; Automation; Transaction Management; Cost Prediction (Warranty, Insurance Claims); Warranty Clustering; Territorial Ratemaking; Web Information Retrieval, Archival and Clustering; Auto Loss Ratio Predictions; Precision Farming; Bio-Informatics; Functional Foods; Heterogeneous Data Visualization; Crime Data Analysis; Data Fusion and Visualization; Survey Study of Disability
  • 36. Examples of Data Mining Methods
  • 37.
  • 38. Association Rules and Market Basket Analysis
  • 39.
  • 40. Market Basket Example: Is soda typically purchased with bananas? Does the brand of soda make a difference? Where should detergents be placed in the store to maximize their sales? Are window cleaning products purchased when detergents and orange juice are bought together? How are the demographics of the neighborhood affecting what customers are buying?
  • 41.
  • 42.
  • 43. How Does It Work?
      Grocery Point-of-Sale Transactions
        Customer  Items
        1         Orange Juice, Soda
        2         Milk, Orange Juice, Window Cleaner
        3         Orange Juice, Detergent
        4         Orange Juice, Detergent, Soda
        5         Window Cleaner, Soda
      Co-Occurrence of Products
                          OJ   Window Cleaner   Milk   Soda   Detergent
        OJ                4    1                1      2      2
        Window Cleaner    1    2                1      1      0
        Milk              1    1                1      0      0
        Soda              2    1                0      3      1
        Detergent         2    0                0      1      2
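A co-occurrence table of this kind can be built by counting every pair of products that appears in the same basket. The following is a minimal sketch using the five transactions listed on the slide; the function names `co_occurrence` and `pair_count` are illustrative and not part of the original deck.

```python
from collections import Counter
from itertools import combinations_with_replacement

# The five point-of-sale transactions from the slide.
transactions = [
    {"Orange Juice", "Soda"},
    {"Milk", "Orange Juice", "Window Cleaner"},
    {"Orange Juice", "Detergent"},
    {"Orange Juice", "Detergent", "Soda"},
    {"Window Cleaner", "Soda"},
]

def co_occurrence(baskets):
    """Count how often each (sorted) pair of products shares a basket.

    A product paired with itself gives the diagonal of the table,
    i.e. the number of baskets that contain the product at all.
    """
    counts = Counter()
    for basket in baskets:
        for a, b in combinations_with_replacement(sorted(basket), 2):
            counts[(a, b)] += 1
    return counts

def pair_count(counts, a, b):
    """Look up a pair regardless of argument order."""
    return counts[tuple(sorted((a, b)))]

counts = co_occurrence(transactions)
```

Counted directly from these transactions, Orange Juice appears in four baskets and shares a basket with Detergent twice (customers 3 and 4) and with Soda twice (customers 1 and 4).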
  • 44.
  • 45.
  • 46.
  • 47. Confidence and Support
      Transaction ID #   Items
      1                  {1, 2, 3}
      2                  {1, 3}
      3                  {1, 4}
      4                  {2, 5, 6}
      Frequent One Item Set   Support
      {1}                     75%
      {2}                     50%
      {3}                     50%
      {4}                     25%
      Frequent Two Item Set   Support
      {1, 2}                  25%
      {1, 3}                  50%
      {1, 4}                  25%
      {2, 3}                  25%
      For minimum support = 50% (2 transactions) and minimum confidence = 50%, for the rule 1 => 3:
        Support = Support({1, 3}) = 50%
        Confidence(1 -> 3) = Support({1, 3}) / Support({1}) = 50% / 75% ≈ 66.7%
        Confidence(3 -> 1) = Support({1, 3}) / Support({3}) = 50% / 50% = 100%
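The support and confidence figures on this slide can be reproduced in a few lines. This is a minimal sketch over the four transactions shown; the helper names `support` and `confidence` are chosen here for illustration.

```python
# The four transactions from the slide.
transactions = [{1, 2, 3}, {1, 3}, {1, 4}, {2, 5, 6}]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Support of the combined itemset divided by support of the antecedent."""
    combined = set(antecedent) | set(consequent)
    return support(combined, baskets) / support(antecedent, baskets)

support_1_3 = support({1, 3}, transactions)            # 2 of 4 baskets
conf_1_to_3 = confidence({1}, {3}, transactions)       # 0.50 / 0.75
conf_3_to_1 = confidence({3}, {1}, transactions)       # 0.50 / 0.50
```

The two confidences differ because the rule is directional: item 1 occurs in three baskets but item 3 in only two, so 3 -> 1 holds every time item 3 appears while 1 -> 3 holds in two of three cases.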
  • 48.
  • 49.
  • 50. Choosing the Right Set of Items: Partial Product Taxonomy (from General to Specific)
      Frozen Foods
        Frozen Desserts
          Frozen Yogurt
          Frozen Fruit Bars
          Ice Cream
            Rocky Road, Chocolate, Strawberry, Vanilla, Cherry Garcia, Other
        Frozen Vegetables
          Peas, Carrots, Mixed, Other
        Frozen Dinners
  • 51. Example - Minimum Support Pruning / Rule Generation
      Transaction ID #   Items
      1                  {1, 3, 4}
      2                  {2, 3, 5}
      3                  {1, 2, 3, 5}
      4                  {2, 5}
      Step 1: Scan Database, Find Level of Support
        Itemset   Support
        {1}       2
        {2}       3
        {3}       3
        {4}       1
        {5}       3
      After pruning:
        Itemset   Support
        {2}       3
        {3}       3
        {5}       3
      Step 2: Find Pairings of {2}, {3}, {5}; Scan Database, Find Level of Support
        Itemset   Support
        {2, 3}    2
        {2, 5}    3
        {3, 5}    2
      After pruning:
        Itemset   Support
        {2, 5}    3
      Two rules with the highest support for two item set: 2 -> 5 and 5 -> 2
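The scan, prune, and pair steps above can be sketched as a level-wise search in the style of Apriori. The minimum support count of 3 is an assumption inferred from the slide (itemsets with a count of 2 are dropped); the slide does not state the threshold, and the function names are illustrative.

```python
from itertools import combinations

# The four transactions from the slide.
transactions = [{1, 3, 4}, {2, 3, 5}, {1, 2, 3, 5}, {2, 5}]

# Assumption: the pruning threshold is not given on the slide; a count of 3
# reproduces the tables shown there.
MIN_COUNT = 3

def count_support(candidates, baskets):
    """Scan the database once and count baskets containing each candidate."""
    return {c: sum(c <= basket for basket in baskets) for c in candidates}

def frequent_itemsets(baskets, min_count):
    """Return one dict of {itemset: count} per level (1-itemsets, 2-itemsets, ...)."""
    items = sorted(set().union(*baskets))
    candidates = [frozenset([item]) for item in items]
    levels = []
    k = 1
    while candidates:
        counts = count_support(candidates, baskets)
        frequent = {c: n for c, n in counts.items() if n >= min_count}
        if not frequent:
            break
        levels.append(frequent)
        k += 1
        surviving = sorted(set().union(*frequent))
        # Build size-k candidates whose (k-1)-subsets all survived pruning.
        candidates = [
            frozenset(c)
            for c in combinations(surviving, k)
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        ]
    return levels

levels = frequent_itemsets(transactions, MIN_COUNT)
```

With these inputs the first level keeps {2}, {3} and {5}, and the second level keeps only {2, 5}, from which the rules 2 -> 5 and 5 -> 2 are generated.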
  • 52.
  • 53.
  • 54.
  • 56. Example: Supervised Learning with Decision Trees
  • 57.
  • 58.
  • 59. Decision Tree for Concept: PlayTennis
      Outlook?
        Sunny: Humidity?
          High: No
          Normal: Yes
        Overcast: Yes
        Rain: Wind?
          Strong: No
          Light: Yes
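A decision tree like this one is equivalent to a set of nested conditionals. The sketch below assumes the branch layout Sunny -> Humidity, Overcast -> Yes, Rain -> Wind, which is consistent with the PlayTennis training examples in this deck; the function name `play_tennis` is illustrative.

```python
def play_tennis(outlook, humidity, wind):
    """Classify one day by walking the tree from the root (Outlook) down."""
    if outlook == "Overcast":
        return "Yes"
    if outlook == "Sunny":
        # Under Sunny, only Humidity matters.
        return "Yes" if humidity == "Normal" else "No"
    if outlook == "Rain":
        # Under Rain, only Wind matters.
        return "Yes" if wind == "Light" else "No"
    raise ValueError(f"unknown outlook: {outlook!r}")

# Day 1 of the training examples: Sunny, High humidity, Light wind.
day1 = play_tennis("Sunny", "High", "Light")
```

Temperature is not tested anywhere in the tree, so it does not appear as a parameter.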
  • 60. Decision Trees and Decision Boundaries. How to Visualize Decision Trees? Example: Dividing Instance Space into Axis-Parallel Rectangles. More than two variables? (Figure: + and - points plotted on x and y axes with ticks at 1, 3, 5, 7, next to a tree with the tests y > 7?, x < 3?, y < 5?, x < 1?, each with Yes/No branches.)
  • 61. An Illustrative Example: Training Examples for Concept PlayTennis
      Day   Outlook    Temperature   Humidity   Wind     PlayTennis?
      1     Sunny      Hot           High       Light    No
      2     Sunny      Hot           High       Strong   No
      3     Overcast   Hot           High       Light    Yes
      4     Rain       Mild          High       Light    Yes
      5     Rain       Cool          Normal     Light    Yes
      6     Rain       Cool          Normal     Strong   No
      7     Overcast   Cool          Normal     Strong   Yes
      8     Sunny      Mild          High       Light    No
      9     Sunny      Cool          Normal     Light    Yes
      10    Rain       Mild          Normal     Light    Yes
      11    Sunny      Mild          Normal     Strong   Yes
      12    Overcast   Mild          High       Strong   Yes
      13    Overcast   Hot           Normal     Light    Yes
      14    Rain       Mild          High       Strong   No
  • 62.
  • 63. Constructing a Decision Tree for PlayTennis: Potential Splits of Root Node
      Root: [9+, 5-]
      Outlook:      Sunny [2+, 3-], Overcast [4+, 0-], Rain [3+, 2-]
      Temperature:  Cool [3+, 1-], Mild [4+, 2-], Hot [2+, 2-]
      Humidity:     High [3+, 4-], Normal [6+, 1-]
      Wind:         Light [6+, 2-], Strong [3+, 3-]
      E(Split/Outlook) = (5/14) - ((5/14)(min(2/5,3/5)) + (4/14)(min(4/4,0/4)) + (5/14)(min(3/5,2/5))) = 7%
      E(Split/Temperature) = (5/14) - ((4/14)(min(3/4,1/4)) + (6/14)(min(4/6,2/6)) + (4/14)(min(2/4,2/4))) = 0%
      E(Split/Humidity) = (5/14) - ((7/14)(min(3/7,4/7)) + (7/14)(min(6/7,1/7))) = 7%
      E(Split/Wind) = (5/14) - ((8/14)(min(6/8,2/8)) + (6/14)(min(3/6,3/6))) = 0%
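The four E(Split/...) values are reductions in misclassification error: the error at the root (5/14) minus the size-weighted error of the branches. The sketch below recomputes them from the [positive, negative] counts on the slide, using exact fractions; the names `error` and `error_reduction` are illustrative.

```python
from fractions import Fraction

# (positive, negative) counts at the root and in each branch, from the slide.
root = (9, 5)
splits = {
    "Outlook": [(2, 3), (4, 0), (3, 2)],      # Sunny, Overcast, Rain
    "Temperature": [(3, 1), (4, 2), (2, 2)],  # Cool, Mild, Hot
    "Humidity": [(3, 4), (6, 1)],             # High, Normal
    "Wind": [(6, 2), (3, 3)],                 # Light, Strong
}

def error(pos, neg):
    """Misclassification error when predicting the majority class."""
    return Fraction(min(pos, neg), pos + neg)

def error_reduction(root_counts, branches):
    """Root error minus the branch errors weighted by branch size."""
    total = sum(root_counts)
    weighted = sum(Fraction(p + n, total) * error(p, n) for p, n in branches)
    return error(*root_counts) - weighted

reductions = {name: error_reduction(root, b) for name, b in splits.items()}
```

Outlook and Humidity each give a reduction of 1/14 (about 7%), while Temperature and Wind give 0, matching the percentages on the slide.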
  • 64.
  • 65.
  • 66.
  • 68. Visualization Example: Naïve Bayesian Three Flower Types; Petal and Sepal Based Classification
  • 69.
  • 70.
  • 71.
  • 72.
  • 74.
  • 75.
  • 76.
  • 77.
  • 78.
  • 79.