Data Mining and the Web: Past, Present and Future. Kyuseok Shim, S. Seshadri, Rajeev Rastogi, Minos N. Garofalakis (Bell Laboratories)
Agenda
Introduction
Data Mining Techniques – Association Rules
Association Rules example: Min. support 50%, Min. confidence 50%
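For reference, the two thresholds on this slide are usually defined as follows. These are the standard textbook definitions, not text recovered from the slide; the symbols X, Y, T and D (a rule's antecedent and consequent, a transaction, and the transaction database) are introduced here for illustration.

```latex
\[
\operatorname{support}(Z) = \frac{\lvert \{\, T \in D : Z \subseteq T \,\} \rvert}{\lvert D \rvert},
\qquad
\operatorname{support}(X \Rightarrow Y) = \operatorname{support}(X \cup Y),
\qquad
\operatorname{confidence}(X \Rightarrow Y) = \frac{\operatorname{support}(X \cup Y)}{\operatorname{support}(X)}
\]
```

With both minimums at 50%, a rule is kept only if X ∪ Y occurs in at least half of all transactions and at least half of the transactions containing X also contain Y.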
Association Rules – Apriori Algorithm
Apriori Algorithm Example (figure: repeated scans of database D produce candidate sets C1, C2, C3 and frequent itemsets L1, L2, L3)
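The level-wise pattern in this example (scan D, build candidates C_k, keep the frequent ones as L_k) can be sketched in Python as below. This is a minimal illustration, not the authors' implementation: the function name `apriori` and the four transactions in `D` are hypothetical, since the slide's actual database table was not recovered. The 50% minimum support matches the threshold quoted on the earlier slide.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frequent itemset (frozenset): support} using level-wise search."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]

    def frequent(candidates):
        # One scan of the database: count each candidate, keep those meeting min_support.
        result = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                result[c] = support
        return result

    # C1 is every single item; L1 is the frequent subset of C1.
    items = {item for t in transactions for item in t}
    level = frequent({frozenset([i]) for i in items})
    all_frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unions of two frequent (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        level = frequent(candidates)
        all_frequent.update(level)
        k += 1
    return all_frequent

# Hypothetical database D (illustrative only; not the table from the slide).
D = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
result = apriori(D, min_support=0.5)
for itemset, support in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

The prune step is what keeps the candidate sets small: an itemset is only counted against the database if all of its smaller subsets were already found frequent.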
Data Mining Techniques – Classification
Classification Process (1): Model Construction. Training Data is fed to Classification Algorithms, which produce the Classifier (Model): IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
Classification Process (2): Use the Model in Prediction. Testing Data and Unseen Data, e.g. (Jeff, Professor, 4), are passed to the Classifier: Tenured?
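As a small sketch of the prediction step, the rule from the model-construction slide can be applied to the unseen record (Jeff, Professor, 4). The function name `predict_tenured`, the case-insensitive comparison, and the 'no' result when the rule does not fire are assumptions made for illustration; the slide only states the 'yes' case.

```python
def predict_tenured(rank: str, years: int) -> str:
    """Apply the slide's rule: IF rank = 'professor' OR years > 6 THEN tenured = 'yes'."""
    # Assumption: any record the rule does not cover is labelled 'no'.
    return "yes" if rank.lower() == "professor" or years > 6 else "no"

# Unseen record from the slide: (name, rank, years).
name, rank, years = ("Jeff", "Professor", 4)
print(name, predict_tenured(rank, years))
```

Here the rank condition alone is enough for the rule to fire, even though the record has fewer than seven years.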
Classification
Data Mining Techniques – Cluster
Web Mining Techniques
Web Mining Research Issues
Reference
Q & A Thanks! ^_^
