Presentation from workshop "R & Data mining in action" given at JDD 2013.
Code samples with description (in Polish): https://gist.github.com/kmrowca/public
6. Agenda
• Quick glance at the theory – data mining
• Exercises on… paper
• Quick glance at the tool – R console
• Exercises – become friends with R
•…
8. Agenda
• Quick glance at the theory – data preparation
• Exercises
• Regression
• Time series
• Decision trees
• Cluster analysis
• Text mining
•…
13. What does "Google" say?
Data mining (the analysis step of the "Knowledge Discovery in
Databases" process, or KDD), an interdisciplinary subfield of computer
science, is the computational process of discovering patterns in large
data sets involving methods at the intersection of artificial intelligence,
machine learning, and statistics.
18. What does "Google" say?
The overall goal of the data mining process is to extract information
from a data set and transform it into an understandable structure for
further use.
21. What does "Google" say?
Aside from the raw analysis step, it involves database and data
management aspects, data pre-processing, model and inference
considerations, interestingness metrics, complexity considerations,
post-processing of discovered structures, visualization, and online
updating.
Source: Wikipedia
22. Data mining – what is "inside"
• Predictive:
  • Regression
  • Classification
  • Collaborative filtering
• Descriptive:
  • Clustering / similarity matching
  • Association rules and variants
  • Deviation detection
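To make the split between predictive and descriptive methods concrete, here is a minimal sketch in R. It is not taken from the workshop materials. It assumes only base R and the built-in `cars` and `iris` data sets, and the variable names (`fit`, `new_speeds`, `predicted`, `clusters`) are illustrative choices.

```r
# Predictive task (regression): learn how stopping distance depends on
# speed, then predict the distance for speeds we choose ourselves.
fit <- lm(dist ~ speed, data = cars)
new_speeds <- data.frame(speed = c(10, 20))
predicted <- predict(fit, newdata = new_speeds)
print(predicted)

# Descriptive task (clustering): group the iris flowers using only their
# four measurements. The Species column is deliberately left out.
set.seed(42)  # k-means starts from random centers, so fix the seed
clusters <- kmeans(iris[, 1:4], centers = 3)

# Cross-tabulate the discovered clusters against the known species
# to see how well the grouping matches.
print(table(clusters$cluster, iris$Species))
```

The regression answers a question about new data (what distance to expect at a given speed), while the clustering only describes structure in the data already collected.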