R & data mining in action
Katarzyna Mrowca
The art of reading between the lines,
or the R language and data mining in action
<me>

Katarzyna Mrowca

</me>
The deal 
Agenda
• Quick glance at theory - data mining
• Exercises on… paper
• Quick glance at the tool – R console
• Exercises – become friends with R
•…
Agenda
• Quick glance at theory - data preparation
• Exercises
• Regression
• Time series
• Decision trees
• Cluster analysis
• Text mining
•…
Quick glance at theory!
What is data mining?
What does "Google" say?
Data mining (the analysis step of the "Knowledge Discovery in
Databases" process, or KDD), an interdisciplinary subfield of computer
science, is the computational process of discovering patterns in large
data sets involving methods at the intersection of artificial intelligence,
machine learning, statistics.
What does "Google" say?
The overall goal of the data mining process is to extract information
from a data set and transform it into an understandable structure for
further use.
What does "Google" say?
Aside from the raw analysis step, it involves database and data
management aspects, data pre-processing, model and inference
considerations, interestingness metrics, complexity considerations,
post-processing of discovered structures, visualization, and online
updating.

Source: Wikipedia
Data mining – what is "inside"?
• Predictive:
  • Regression
  • Classification
  • Collaborative filtering

• Descriptive:
  • Clustering / similarity matching
  • Association rules and variants
  • Deviation detection
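
The split between predictive and descriptive tasks can be illustrated with a small sketch. It is written in Python rather than R, and the data, the names fit_line and find_outliers, and the threshold of 2 standard deviations are all assumptions made for illustration, not material from the talk. The predictive half fits a least-squares line to labelled examples and estimates an unseen case; the descriptive half has no target variable and only flags a value that deviates from the rest.

```python
from statistics import mean, pstdev

# Hypothetical toy data: advertising spend (x) and resulting sales (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return y_bar - slope * x_bar, slope


# Predictive (regression): learn from known outcomes, estimate a new one.
intercept, slope = fit_line(spend, sales)
predicted_sales = intercept + slope * 6.0


def find_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return []
    return [v for v in values if abs(v - mu) / sd > threshold]


# Descriptive (deviation detection): no target, only structure in the data.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
unusual = find_outliers(readings)

print(predicted_sales, unusual)
```

The regression needs a known outcome (sales) to learn from, while the deviation check works on the readings alone; that is the practical difference between the two branches of the list.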
What is data mining not?
Why is data mining so popular?
What is the difference between statistics and data mining?
Data preparation
Variables
Qualitative & Quantitative
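
A minimal sketch of why the distinction matters during data preparation, in Python rather than R. The records, the field names and the summarise helper are hypothetical. Postal codes and phone numbers are written with digits, yet they are labels, so they are counted rather than averaged; age and basket value are measurements, so arithmetic on them is meaningful.

```python
from collections import Counter
from statistics import mean

# Hypothetical customer records; every field name and value is made up.
customers = [
    {"postal_code": "30-001", "phone": "600100200", "age": 30, "basket": 120.0},
    {"postal_code": "30-001", "phone": "600100201", "age": 40, "basket": 80.0},
    {"postal_code": "00-950", "phone": "600100202", "age": 50, "basket": 100.0},
]

# Qualitative: values are labels, even when they are written with digits.
QUALITATIVE = {"postal_code", "phone"}
# Quantitative: values are measurements, so arithmetic on them makes sense.
QUANTITATIVE = {"age", "basket"}


def summarise(rows, column):
    """Pick a summary that suits the variable type of `column`."""
    values = [row[column] for row in rows]
    if column in QUANTITATIVE:
        return {"mean": mean(values), "min": min(values), "max": max(values)}
    if column in QUALITATIVE:
        mode, count = Counter(values).most_common(1)[0]
        return {"distinct": len(set(values)), "mode": mode, "mode_count": count}
    raise KeyError(column)


print(summarise(customers, "age"))
print(summarise(customers, "postal_code"))
```

Storing the codes as strings keeps leading zeros such as "00-950" intact and prevents an accidental "average postal code".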
Tame the R console!
NetBeans + R

Source: https://blogs.oracle.com/geertjan/entry/r_plugin_for_netbeans_ide
RHIPE <- R + Hadoop
Find out more: http://www.datadr.org/
Revolution Analytics <- R + Hadoop + Enterprise
Find out more: http://www.revolutionanalytics.com
Take a break 
Regression
Time series
Decision trees
Regression trees
Classification trees
K-means
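
As a rough illustration of the idea behind k-means, here is a one-dimensional sketch in Python rather than R. The points, the starting centroids, the fixed iteration count and the function name k_means_1d are assumptions chosen for the example. Each pass assigns every point to its nearest centroid and then moves each centroid to the mean of its points.

```python
def k_means_1d(points, centroids, iterations=10):
    """Plain k-means on numbers: assign to nearest centroid, then re-average."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical data with two visible groups.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
initial_centroids = [1.0, 2.0]
centroids, clusters = k_means_1d(points, initial_centroids)
print(centroids, clusters)
```

The result depends on the starting centroids, which is why practical implementations usually try several random starts and keep the best one.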
Text mining
Thank you!

Editor's notes

  1. Example with a postal code and a phone number