SlideShare ist ein Scribd-Unternehmen logo
1 von 66
Downloaden Sie, um offline zu lesen
Machine Learning with WEKA
      An Introduction

                   林松江
                 2005/9/30
WEKA: the bird




     Copyright: Martin Kramer (mkramer@wxs.nl)
WEKA is available at
 http://www.cs.waikato.ac.nz/ml/weka/
The format of Dataset in WEKA(1)
@relation heart-disease-simplified

@attribute   age numeric
@attribute   sex { female, male}
@attribute   chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute   cholesterol numeric
@attribute   exercise_induced_angina { no, yes}
@attribute   class { present, not_present}

@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
...
The format of Dataset in WEKA(2)
@relation heart-disease-simplified

@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal,
    atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}
@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
...
Machine Learning with WEKA
Data can be imported from a file
in various formats: ARFF, CSV,
C4.5, binary
Explorer: pre-processing the data
 Data can be imported from a file in
 various formats: ARFF, CSV, C4.5, binary
 Data can also be read from a URL or from
 an SQL database (using JDBC)
 Pre-processing tools in WEKA are called
 “filters”
 WEKA contains filters for:
   Discretization, normalization, resampling,
   attribute selection, transforming and
   combining attributes, …
Pre-processing tools in WEKA are called “filters”:
including discretization, normalization, re-
sampling, attribute selection, transforming and
combining attributes, …
Visualize class distribution for each attribute
Machine Learning with WEKA
Machine Learning with WEKA
Click here to choose filter algorithm
Machine Learning with WEKA
Click here to set the parameter for
filter algorithm
Set parameter
Machine Learning with WEKA
apply the filter algorithm
Equal frequence
Building “classifiers”
 Classifiers in WEKA are models for
 predicting nominal or numeric quantities
 Implemented learning schemes include:
   Decision trees and lists, instance-based
   classifiers, support vector machines,
   multi-layer perceptrons, logistic regression,
   Bayes’ nets, …
 “Meta”-classifiers include:
   Bagging, boosting, stacking, error-correcting
   output codes, locally weighted learning, …
Choose classification algorithm
J48 : Decision tree algorithm
Click here to set parameter
for decision tree
Information
about
decision
tree
algorithm
Output setting
Start to build classifier
Fitted result




Click right bottom to
view more information
View fitted tree
Machine Learning with WEKA
Crosses : correctly classified instances
Squares : incorrectly classified instances
SMO : Support Vector Machine algorithm
Choose RBF kernel
Click right bottom
to view more
information
clustering data
 WEKA contains “clusterers” for finding
 groups of similar instances in a dataset
 Implemented schemes are:
   k-Means, EM, Cobweb, X-means,
   FarthestFirst
 Clusters can be visualized and compared
 to “true” clusters (if given)
 Evaluation based on loglikelihood if
 clustering scheme produces a probability
 distribution
clustering data
Choose clustering algorithm
Click here to set parameter
Start to perform clustering
Machine Learning with WEKA
View cluster
Click right bottom   assignments
to view more
information
Finding associations
WEKA contains an implementation of the Apriori
algorithm for learning association rules
  Works only with discrete data
Can identify statistical dependencies between
groups of attributes:
  milk, butter ⇒ bread, eggs (with confidence 0.9 and
  support 2000)
Apriori can compute all rules that have a given
minimum support and exceed a given
confidence
Finding associations
Attribute selection
 Panel that can be used to investigate which
 (subsets of) attributes are the most predictive
 ones
 Attribute selection methods contain two parts:
   A search method: best-first, forward selection,
   random, exhaustive, genetic algorithm, ranking
   An evaluation method: correlation-based, wrapper,
   information gain, chi-squared, …
 Very flexible: WEKA allows (almost) arbitrary
 combinations of these two
Choose attribute evalutor




Choose search
method
Machine Learning with WEKA
Data visualization
Visualization very useful in practice: e.g. helps
to determine difficulty of the learning problem
WEKA can visualize single attributes (1-d) and
pairs of attributes (2-d)
  To do: rotating 3-d visualizations (Xgobi-style)
Color-coded class values
“Jitter” option to deal with nominal attributes
(and to detect “hidden” data points)
“Zoom-in” function
Adjust view   Click here to view
 point size   single result
Choose range to magnify
local result
submit to magnify local result
Machine Learning with WEKA
Performing experiments
 Experimenter makes it easy to compare the
 performance of different learning schemes
 For classification and regression problems
 Results can be written into file or database
 Evaluation options: cross-validation, learning
 curve, hold-out
 Can also iterate over different parameter
 settings
 Significance-testing built in!
Machine Learning with WEKA
Click here to run experiment




                                 Step 1 : Set output file

                               Step 3 : choose algorithm


     Step 2 : add dataset
Experiment status
View the experiment result
Click here to perform test
Machine Learning with WEKA
Knowledge Flow GUI
New graphical user interface for WEKA
Java-Beans-based interface for setting up and
running machine learning experiments
Data sources, classifiers, etc. are beans and
can be connected graphically
Data “flows” through components: e.g.,
“data source” -> “filter” -> “classifier” ->
“evaluator”
Layouts can be saved and loaded again later
Step 1 : choose data source
Step 2 : choose data
source format
Step 3 : set data connection
Visualize data
Step 4 : choose   Step 5 : choose       Step 6 : choose
  evaluation       filter method           classifier




                                    Set parameter
Final step : start to run
Finish
Thank you

Weitere ähnliche Inhalte

Was ist angesagt?

Weka a tool_for_exploratory_data_mining
Weka a tool_for_exploratory_data_miningWeka a tool_for_exploratory_data_mining
Weka a tool_for_exploratory_data_miningTony Frame
 
Analytics machine learning in weka
Analytics machine learning in wekaAnalytics machine learning in weka
Analytics machine learning in wekaSudhakar Chavan
 
Data mining techniques using weka
Data mining techniques using wekaData mining techniques using weka
Data mining techniques using wekaPrashant Menon
 
Weka presentation
Weka presentationWeka presentation
Weka presentationAbrar ali
 
Data mining techniques using weka
Data mining techniques using wekaData mining techniques using weka
Data mining techniques using wekarathorenitin87
 
1.5 weka an intoduction
1.5 weka an intoduction1.5 weka an intoduction
1.5 weka an intoductionKrish_ver2
 
Data mining with Weka
Data mining with WekaData mining with Weka
Data mining with WekaAlbanLevy
 
WEKA: Data Mining Input Concepts Instances And Attributes
WEKA: Data Mining Input Concepts Instances And AttributesWEKA: Data Mining Input Concepts Instances And Attributes
WEKA: Data Mining Input Concepts Instances And AttributesDataminingTools Inc
 
Data Mining Techniques using WEKA (Ankit Pandey-10BM60012)
Data Mining Techniques using WEKA (Ankit Pandey-10BM60012)Data Mining Techniques using WEKA (Ankit Pandey-10BM60012)
Data Mining Techniques using WEKA (Ankit Pandey-10BM60012)Ankit Pandey
 

Was ist angesagt? (17)

Weka a tool_for_exploratory_data_mining
Weka a tool_for_exploratory_data_miningWeka a tool_for_exploratory_data_mining
Weka a tool_for_exploratory_data_mining
 
Analytics machine learning in weka
Analytics machine learning in wekaAnalytics machine learning in weka
Analytics machine learning in weka
 
Weka bike rental
Weka bike rentalWeka bike rental
Weka bike rental
 
Weka
WekaWeka
Weka
 
Data mining techniques using weka
Data mining techniques using wekaData mining techniques using weka
Data mining techniques using weka
 
weka data mining
weka data mining weka data mining
weka data mining
 
Weka
WekaWeka
Weka
 
Weka presentation
Weka presentationWeka presentation
Weka presentation
 
Weka tutorial
Weka tutorialWeka tutorial
Weka tutorial
 
Wek1
Wek1Wek1
Wek1
 
Data mining techniques using weka
Data mining techniques using wekaData mining techniques using weka
Data mining techniques using weka
 
1.5 weka an intoduction
1.5 weka an intoduction1.5 weka an intoduction
1.5 weka an intoduction
 
Weka library, JAVA
Weka library, JAVAWeka library, JAVA
Weka library, JAVA
 
Data mining with Weka
Data mining with WekaData mining with Weka
Data mining with Weka
 
WEKA: Introduction To Weka
WEKA: Introduction To WekaWEKA: Introduction To Weka
WEKA: Introduction To Weka
 
WEKA: Data Mining Input Concepts Instances And Attributes
WEKA: Data Mining Input Concepts Instances And AttributesWEKA: Data Mining Input Concepts Instances And Attributes
WEKA: Data Mining Input Concepts Instances And Attributes
 
Data Mining Techniques using WEKA (Ankit Pandey-10BM60012)
Data Mining Techniques using WEKA (Ankit Pandey-10BM60012)Data Mining Techniques using WEKA (Ankit Pandey-10BM60012)
Data Mining Techniques using WEKA (Ankit Pandey-10BM60012)
 

Andere mochten auch

Clustering and Regression using WEKA
Clustering and Regression using WEKAClustering and Regression using WEKA
Clustering and Regression using WEKAVijaya Prabhu
 
Basics of Machine Learning
Basics of Machine LearningBasics of Machine Learning
Basics of Machine LearningPranav Challa
 
Linear Regression Parameters
Linear Regression ParametersLinear Regression Parameters
Linear Regression Parameterscamposer
 
DATA MINING WITH WEKA
DATA MINING WITH WEKADATA MINING WITH WEKA
DATA MINING WITH WEKAShubham Gupta
 
Discovering knowledge using web structure mining
Discovering knowledge using web structure miningDiscovering knowledge using web structure mining
Discovering knowledge using web structure miningAtul Khanna
 
Survey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSurvey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSivagowry Shathesh
 
Artificial Neural Networks Lect5: Multi-Layer Perceptron & Backpropagation
Artificial Neural Networks Lect5: Multi-Layer Perceptron & BackpropagationArtificial Neural Networks Lect5: Multi-Layer Perceptron & Backpropagation
Artificial Neural Networks Lect5: Multi-Layer Perceptron & BackpropagationMohammed Bennamoun
 
Multiple regression presentation
Multiple regression presentationMultiple regression presentation
Multiple regression presentationCarlo Magno
 
Presentation On Regression
Presentation On RegressionPresentation On Regression
Presentation On Regressionalok tiwari
 
Regression analysis
Regression analysisRegression analysis
Regression analysisRavi shankar
 
Multiple linear regression
Multiple linear regressionMultiple linear regression
Multiple linear regressionJames Neill
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionKhalid Aziz
 
LinkedIn SlideShare: Knowledge, Well-Presented
LinkedIn SlideShare: Knowledge, Well-PresentedLinkedIn SlideShare: Knowledge, Well-Presented
LinkedIn SlideShare: Knowledge, Well-PresentedSlideShare
 

Andere mochten auch (17)

Clustering and Regression using WEKA
Clustering and Regression using WEKAClustering and Regression using WEKA
Clustering and Regression using WEKA
 
Basics of Machine Learning
Basics of Machine LearningBasics of Machine Learning
Basics of Machine Learning
 
Clustering
ClusteringClustering
Clustering
 
Linear Regression Parameters
Linear Regression ParametersLinear Regression Parameters
Linear Regression Parameters
 
DATA MINING WITH WEKA
DATA MINING WITH WEKADATA MINING WITH WEKA
DATA MINING WITH WEKA
 
Discovering knowledge using web structure mining
Discovering knowledge using web structure miningDiscovering knowledge using web structure mining
Discovering knowledge using web structure mining
 
Survey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease predictionSurvey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease prediction
 
How to detect & diagnose congenital heart disease in children
How to detect & diagnose congenital heart disease in childrenHow to detect & diagnose congenital heart disease in children
How to detect & diagnose congenital heart disease in children
 
Artificial Neural Networks Lect5: Multi-Layer Perceptron & Backpropagation
Artificial Neural Networks Lect5: Multi-Layer Perceptron & BackpropagationArtificial Neural Networks Lect5: Multi-Layer Perceptron & Backpropagation
Artificial Neural Networks Lect5: Multi-Layer Perceptron & Backpropagation
 
Apriori Algorithm
Apriori AlgorithmApriori Algorithm
Apriori Algorithm
 
Multiple regression presentation
Multiple regression presentationMultiple regression presentation
Multiple regression presentation
 
Presentation On Regression
Presentation On RegressionPresentation On Regression
Presentation On Regression
 
Regression analysis
Regression analysisRegression analysis
Regression analysis
 
Multiple linear regression
Multiple linear regressionMultiple linear regression
Multiple linear regression
 
Techniques Machine Learning
Techniques Machine LearningTechniques Machine Learning
Techniques Machine Learning
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
LinkedIn SlideShare: Knowledge, Well-Presented
LinkedIn SlideShare: Knowledge, Well-PresentedLinkedIn SlideShare: Knowledge, Well-Presented
LinkedIn SlideShare: Knowledge, Well-Presented
 

Ähnlich wie Machine Learning with WEKA

Weka toolkit introduction
Weka toolkit introductionWeka toolkit introduction
Weka toolkit introductionbutest
 
Introduction to Weka and Preprocessing.ppt
Introduction to Weka and Preprocessing.pptIntroduction to Weka and Preprocessing.ppt
Introduction to Weka and Preprocessing.pptradhikadsu
 
XL-Miner: Classification
XL-Miner: ClassificationXL-Miner: Classification
XL-Miner: Classificationxlminer content
 
wekapresentation-130107115704-phpapp02.pdf
wekapresentation-130107115704-phpapp02.pdfwekapresentation-130107115704-phpapp02.pdf
wekapresentation-130107115704-phpapp02.pdfDr. Rajesh P Barnwal
 
Paper-Allstate-Claim-Severity
Paper-Allstate-Claim-SeverityPaper-Allstate-Claim-Severity
Paper-Allstate-Claim-SeverityGon-soo Moon
 
Machine learning Algorithms
Machine learning AlgorithmsMachine learning Algorithms
Machine learning AlgorithmsWalaa Hamdy Assy
 
Weka term paper(siddharth 10 bm60086)
Weka term paper(siddharth 10 bm60086)Weka term paper(siddharth 10 bm60086)
Weka term paper(siddharth 10 bm60086)Siddharth Verma
 
Task A. [20 marks] Data Choice. Name the chosen data set(s) .docx
Task A. [20 marks] Data Choice. Name the chosen data set(s) .docxTask A. [20 marks] Data Choice. Name the chosen data set(s) .docx
Task A. [20 marks] Data Choice. Name the chosen data set(s) .docxjosies1
 
Weka : A machine learning algorithms for data mining
Weka : A machine learning algorithms for data miningWeka : A machine learning algorithms for data mining
Weka : A machine learning algorithms for data miningKeshab Kumar Gaurav
 
An Introduction To Weka
An Introduction To WekaAn Introduction To Weka
An Introduction To Wekaweka Content
 
Weka Term Paper_VGSoM_10BM60011
Weka Term Paper_VGSoM_10BM60011Weka Term Paper_VGSoM_10BM60011
Weka Term Paper_VGSoM_10BM60011Amu Singh
 
IRJET- Study and Evaluation of Classification Algorithms in Data Mining
IRJET- Study and Evaluation of Classification Algorithms in Data MiningIRJET- Study and Evaluation of Classification Algorithms in Data Mining
IRJET- Study and Evaluation of Classification Algorithms in Data MiningIRJET Journal
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10Roger Barga
 
Machine learning(UNIT 4)
Machine learning(UNIT 4)Machine learning(UNIT 4)
Machine learning(UNIT 4)SURBHI SAROHA
 

Ähnlich wie Machine Learning with WEKA (20)

Weka toolkit introduction
Weka toolkit introductionWeka toolkit introduction
Weka toolkit introduction
 
Weka
Weka Weka
Weka
 
Introduction to Weka and Preprocessing.ppt
Introduction to Weka and Preprocessing.pptIntroduction to Weka and Preprocessing.ppt
Introduction to Weka and Preprocessing.ppt
 
Data Mining using Weka
Data Mining using WekaData Mining using Weka
Data Mining using Weka
 
XL Miner: Classification
XL Miner: ClassificationXL Miner: Classification
XL Miner: Classification
 
XL-Miner: Classification
XL-Miner: ClassificationXL-Miner: Classification
XL-Miner: Classification
 
wekapresentation-130107115704-phpapp02.pdf
wekapresentation-130107115704-phpapp02.pdfwekapresentation-130107115704-phpapp02.pdf
wekapresentation-130107115704-phpapp02.pdf
 
WEKA: The Explorer
WEKA: The ExplorerWEKA: The Explorer
WEKA: The Explorer
 
WEKA:The Explorer
WEKA:The ExplorerWEKA:The Explorer
WEKA:The Explorer
 
Paper-Allstate-Claim-Severity
Paper-Allstate-Claim-SeverityPaper-Allstate-Claim-Severity
Paper-Allstate-Claim-Severity
 
Machine learning Algorithms
Machine learning AlgorithmsMachine learning Algorithms
Machine learning Algorithms
 
Weka term paper(siddharth 10 bm60086)
Weka term paper(siddharth 10 bm60086)Weka term paper(siddharth 10 bm60086)
Weka term paper(siddharth 10 bm60086)
 
Task A. [20 marks] Data Choice. Name the chosen data set(s) .docx
Task A. [20 marks] Data Choice. Name the chosen data set(s) .docxTask A. [20 marks] Data Choice. Name the chosen data set(s) .docx
Task A. [20 marks] Data Choice. Name the chosen data set(s) .docx
 
Weka : A machine learning algorithms for data mining
Weka : A machine learning algorithms for data miningWeka : A machine learning algorithms for data mining
Weka : A machine learning algorithms for data mining
 
An Introduction To Weka
An Introduction To WekaAn Introduction To Weka
An Introduction To Weka
 
Weka Term Paper_VGSoM_10BM60011
Weka Term Paper_VGSoM_10BM60011Weka Term Paper_VGSoM_10BM60011
Weka Term Paper_VGSoM_10BM60011
 
Itb weka
Itb wekaItb weka
Itb weka
 
IRJET- Study and Evaluation of Classification Algorithms in Data Mining
IRJET- Study and Evaluation of Classification Algorithms in Data MiningIRJET- Study and Evaluation of Classification Algorithms in Data Mining
IRJET- Study and Evaluation of Classification Algorithms in Data Mining
 
Barga Data Science lecture 10
Barga Data Science lecture 10Barga Data Science lecture 10
Barga Data Science lecture 10
 
Machine learning(UNIT 4)
Machine learning(UNIT 4)Machine learning(UNIT 4)
Machine learning(UNIT 4)
 

Mehr von butest

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEbutest
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jacksonbutest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer IIbutest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.docbutest
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1butest
 
Facebook
Facebook Facebook
Facebook butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...butest
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...butest
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTbutest
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docbutest
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docbutest
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.docbutest
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!butest
 

Mehr von butest (20)

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBE
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jackson
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer II
 
PPT
PPTPPT
PPT
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.doc
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1
 
Facebook
Facebook Facebook
Facebook
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
 
hier
hierhier
hier
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
 

Machine Learning with WEKA

  • 1. Machine Learning with WEKA An Introduction 林松江 2005/9/30
  • 2. WEKA: the bird Copyright: Martin Kramer (mkramer@wxs.nl)
  • 3. WEKA is available at http://www.cs.waikato.ac.nz/ml/weka/
  • 4. The format of Dataset in WEKA(1) @relation heart-disease-simplified @attribute age numeric @attribute sex { female, male} @attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina} @attribute cholesterol numeric @attribute exercise_induced_angina { no, yes} @attribute class { present, not_present} @data 63,male,typ_angina,233,no,not_present 67,male,asympt,286,yes,present 67,male,asympt,229,yes,present 38,female,non_anginal,?,no,not_present ...
  • 5. The format of Dataset in WEKA(2) @relation heart-disease-simplified @attribute age numeric @attribute sex { female, male} @attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina} @attribute cholesterol numeric @attribute exercise_induced_angina { no, yes} @attribute class { present, not_present} @data 63,male,typ_angina,233,no,not_present 67,male,asympt,286,yes,present 67,male,asympt,229,yes,present 38,female,non_anginal,?,no,not_present ...
  • 7. Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary
  • 8. Explorer: pre-processing the data Data can be imported from a file in various formats: ARFF, CSV, C4.5, binary Data can also be read from a URL or from an SQL database (using JDBC) Pre-processing tools in WEKA are called “filters” WEKA contains filters for: Discretization, normalization, resampling, attribute selection, transforming and combining attributes, …
  • 9. Pre-processing tools in WEKA are called “filters”: including discretization, normalization, re- sampling, attribute selection, transforming and combining attributes, …
  • 10. Visualize class distribution for each attribute
  • 13. Click here to choose filter algorithm
  • 15. Click here to set the parameter for filter algorithm
  • 18. apply the filter algorithm
  • 20. Building “classifiers” Classifiers in WEKA are models for predicting nominal or numeric quantities Implemented learning schemes include: Decision trees and lists, instance-based classifiers, support vector machines, multi-layer perceptrons, logistic regression, Bayes’ nets, … “Meta”-classifiers include: Bagging, boosting, stacking, error-correcting output codes, locally weighted learning, …
  • 22. J48 : Decision tree algorithm
  • 23. Click here to set parameter for decision tree
  • 26. Start to build classifier
  • 27. Fitted result Click right bottom to view more information
  • 30. Crosses : correctly classified instances Squares : incorrectly classified instances
  • 31. SMO : Support Vector Machine algorithm
  • 33. Click right bottom to view more information
  • 34. clustering data WEKA contains “clusterers” for finding groups of similar instances in a dataset Implemented schemes are: k-Means, EM, Cobweb, X-means, FarthestFirst Clusters can be visualized and compared to “true” clusters (if given) Evaluation based on loglikelihood if clustering scheme produces a probability distribution
  • 37. Click here to set parameter
  • 38. Start to perform clustering
  • 40. View cluster Click right bottom assignments to view more information
  • 41. Finding associations WEKA contains an implementation of the Apriori algorithm for learning association rules Works only with discrete data Can identify statistical dependencies between groups of attributes: milk, butter ⇒ bread, eggs (with confidence 0.9 and support 2000) Apriori can compute all rules that have a given minimum support and exceed a given confidence
  • 43. Attribute selection Panel that can be used to investigate which (subsets of) attributes are the most predictive ones Attribute selection methods contain two parts: A search method: best-first, forward selection, random, exhaustive, genetic algorithm, ranking An evaluation method: correlation-based, wrapper, information gain, chi-squared, … Very flexible: WEKA allows (almost) arbitrary combinations of these two
  • 46. Data visualization Visualization very useful in practice: e.g. helps to determine difficulty of the learning problem WEKA can visualize single attributes (1-d) and pairs of attributes (2-d) To do: rotating 3-d visualizations (Xgobi-style) Color-coded class values “Jitter” option to deal with nominal attributes (and to detect “hidden” data points) “Zoom-in” function
  • 47. Adjust view Click here to view point size single result
  • 48. Choose range to magnify local result
  • 49. submit to magnify local result
  • 51. Performing experiments Experimenter makes it easy to compare the performance of different learning schemes For classification and regression problems Results can be written into file or database Evaluation options: cross-validation, learning curve, hold-out Can also iterate over different parameter settings Significance-testing built in!
  • 53. Click here to run experiment Step 1 : Set output file Step 3 : choose algorithm Step 2 : add dataset
  • 56. Click here to perform test
  • 58. Knowledge Flow GUI New graphical user interface for WEKA Java-Beans-based interface for setting up and running machine learning experiments Data sources, classifiers, etc. are beans and can be connected graphically Data “flows” through components: e.g., “data source” -> “filter” -> “classifier” -> “evaluator” Layouts can be saved and loaded again later
  • 59. Step 1 : choose data source
  • 60. Step 2 : choose data source format
  • 61. Step 3 : set data connection
  • 63. Step 4 : choose Step 5 : choose Step 6 : choose evaluation filter method classifier Set parameter
  • 64. Final step : start to run