IT for Business Intelligence




Data Mining Techniques: Classification and Regression Using WEKA



A. Kranthikumar (10BM60001)

Classification via decision trees using WEKA
Problem:
A bank is introducing a new financial product and wants to predict whether each new
customer will buy it. The bank already has data on its existing clients, including
whether they were interested in buying the new product.

Classification is a statistical technique that assigns a new client to one of the
existing groups. It builds a model from the existing (training) data and then uses
that model to classify the new data.

Steps to do classification in WEKA
Step 1: Create a data file in ARFF or CSV format; WEKA can read both. Here we use
a CSV file, Bank.csv.
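
For readers who want the ARFF version of the file, the conversion from CSV can be sketched in a few lines of Python. This is only a minimal sketch, not WEKA's own converter: the column names (age, income, buys) and the relation name are made up for illustration, and every non-numeric column is simply treated as nominal.

```python
import csv
import io


def is_number(text):
    """Return True if the text parses as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def csv_to_arff(csv_text, relation):
    """Convert CSV text (header row first) into ARFF text."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]
    lines = ["@relation " + relation, ""]
    for col, name in enumerate(header):
        values = [row[col] for row in data]
        if all(is_number(v) for v in values):
            lines.append("@attribute " + name + " numeric")
        else:
            # Nominal attribute: list the distinct values in order of appearance.
            distinct = list(dict.fromkeys(values))
            lines.append("@attribute " + name + " {" + ",".join(distinct) + "}")
    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


# Hypothetical sample; the real Bank.csv columns are not shown in this document.
SAMPLE_CSV = "age,income,buys\n25,30000,no\n47,52000,yes\n"
arff_text = csv_to_arff(SAMPLE_CSV, "bank")
print(arff_text)
```

The header section declares one attribute per CSV column, and the rows after `@data` are the unchanged CSV records.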

Step 2: Open the WEKA application. This opens the WEKA GUI Chooser.

Now click on the Explorer button. This opens the Explorer window.

Step 3: Load the data into WEKA

To do that, click on the “Open file” button and browse for the Bank.csv file. WEKA
then lists all the attributes in the file.

Step 4: View the data

      In the “Selected attribute” panel you can see the name and type of the
      selected attribute and a summary of its values.
      You can also view the frequency distributions of all attributes at once
      by clicking on the “Visualize All” button.

The summary and the histograms show the range of the data for each attribute, along
with statistics such as the mean and the frequency of each value. For example, age in
our data ranges from 18 to 67 with a mean of 42.5.
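
The kind of summary shown in this step can be reproduced outside WEKA. The sketch below is a rough illustration using only the Python standard library; the age and region values are made up and are not taken from Bank.csv.

```python
import statistics
from collections import Counter


def summarize_numeric(values):
    """Minimum, maximum and mean of a numeric attribute."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
    }


def frequency(values):
    """Count how often each value of a nominal attribute occurs."""
    return dict(Counter(values))


# Made-up values for illustration only.
ages = [18, 35, 50, 67]
regions = ["town", "rural", "town", "town"]

age_summary = summarize_numeric(ages)
region_counts = frequency(regions)
print(age_summary)
print(region_counts)
```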

Step 5: Classify the existing data

             To do this, select the Classify tab. Then click on the Choose button
             and select the J48 algorithm, which is listed under the trees node.

Step 6: Run the classification algorithm

             Select the dependent variable (the class attribute) to be predicted and
             click on Start.
             The classifier output panel then shows the result, including a text
             (ASCII) version of the tree.
             The text version is hard to read. To view the output as a tree,
             right-click on the trees.J48 entry in the result list and select the
             “Visualize tree” option. To enlarge the tree, right-click again on the
             tree view and select the full screen option.
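
J48 is WEKA's implementation of the C4.5 decision tree learner, which picks the attribute for each split with an entropy-based criterion. The sketch below computes plain information gain on a toy data set to show the idea. It is a simplification under stated assumptions: C4.5 actually uses the gain ratio and also prunes the tree, and the attribute names and rows here are invented.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting on `attribute`."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(
        len(group) / len(rows) * entropy(group) for group in groups.values()
    )
    return base - remainder


# Invented toy data: "married" separates the classes perfectly, "region" not at all.
rows = [
    {"married": "yes", "region": "north", "buys": "yes"},
    {"married": "yes", "region": "south", "buys": "yes"},
    {"married": "no", "region": "north", "buys": "no"},
    {"married": "no", "region": "south", "buys": "no"},
]

gain_married = information_gain(rows, "married", "buys")
gain_region = information_gain(rows, "region", "buys")
print(gain_married, gain_region)
```

An attribute with a higher gain separates the classes better, so a tree learner of this kind would split on "married" first in the toy data.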




Step 7: Analyze the model built from the existing data

      The classifier output reports a classification accuracy of 89% for the
      model.
      This means the model predicted the correct class for 89% of the evaluated
      instances. If new customers resemble the existing ones, we can expect a
      prediction of a new customer's buying decision to be correct with a
      probability of about 0.89.
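
Accuracy is the share of correctly classified instances, which can be read off the confusion matrix in the classifier output. A minimal sketch, assuming a two-class matrix with made-up counts (rows are actual classes, columns are predicted classes; these are not the counts from the bank data):

```python
def accuracy(confusion):
    """Fraction of instances on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical counts chosen for illustration only.
confusion = [
    [50, 6],   # actual "yes": 50 predicted yes, 6 predicted no
    [5, 39],   # actual "no":  5 predicted yes, 39 predicted no
]

model_accuracy = accuracy(confusion)
print(model_accuracy)
```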

Step 8: Classify the new customer data

      Create the new customer data in ARFF or CSV format with the same attributes
      as the existing data.
      Select the “Supplied test set” radio button and click on “Set...” to browse
      for the new data set.
      Then click on Start, which builds the tree again and applies it to the new
      data.
      Save the classification result as an ARFF file. This file contains a copy of
      the new instances with an additional column for the predicted value.

Regression Using WEKA
Problem: The goal is to find out how CPU performance is related to attributes
such as machine cycle time, minimum main memory, and cache memory.

Regression is a statistical tool that estimates how the dependent variable
(CPU performance) is related to the independent attributes.
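
To make the idea concrete, the sketch below fits a straight line by ordinary least squares with a single independent attribute. It is a simplification: the WEKA run in this section uses several attributes at once, and the numbers here are invented rather than taken from CPU.arff.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented values: one independent attribute and the dependent variable.
attribute_values = [1, 2, 3, 4]
performance = [3, 5, 7, 9]

slope, intercept = fit_line(attribute_values, performance)
print(slope, intercept)
```

The slope says how much the dependent variable changes per unit of the attribute, and the intercept is the predicted value when the attribute is zero.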

Steps to do regression in WEKA
Step 1: Create the data file and open WEKA in the same way as for
classification.

Step 2: Load the regression data file CPU.arff into WEKA.

       Click on “Open file” and browse for the file.




Step 3: Run the regression

       Click on the Classify tab, click on Choose, and select “LinearRegression”
       under the functions node.
       Click on Start. The classifier output panel then shows the result, which
       includes the regression equation.

Interpretation of the output:
   According to the output, CPU performance depends most strongly on CHMAX,
   followed by CACHE.
   The high correlation coefficient of 0.912 in the output suggests that the
   dependent variable is strongly associated with the independent variables.
   Given the attribute values of a new machine, we can also estimate its CPU
   performance with the regression equation.
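
The last two points can be sketched in Python: applying a regression equation to a new machine, and computing a correlation coefficient between two series of values. The intercept and coefficients below are placeholders, not the equation WEKA produced for CPU.arff, and the function names are chosen only for this illustration.

```python
import math


def predict(intercept, coefficients, machine):
    """Evaluate intercept + sum(coefficient * attribute value)."""
    return intercept + sum(
        weight * machine[name] for name, weight in coefficients.items()
    )


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder equation: performance = -10 + 1.0 * CHMAX + 0.5 * CACHE
intercept = -10.0
coefficients = {"CHMAX": 1.0, "CACHE": 0.5}
new_machine = {"CHMAX": 16, "CACHE": 32}

estimate = predict(intercept, coefficients, new_machine)
correlation = pearson([1, 2, 3], [2, 4, 6])
print(estimate, correlation)
```

A correlation close to 1 between actual and predicted values means the equation tracks the data closely.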
