SlideShare ist ein Scribd-Unternehmen logo
1 von 2
Use cluster analysis and decision tree induction algorithm on a | Analytics
| Harrisburg University of Science and Technology
Use the Cluster Analysis and Decision Tree induction algorithmon a weather forecast
problem. It is a binary classification problem to predict whether or not a location will get
rain the next day.Information about the dataset (Weather Forecast Training.csv):• Location:
The location name of the weather station• MinTemp: The minimum temperature in degrees
celsius• MaxTemp: The maximum temperature in degrees celsius• Rainfall: The amount of
rainfall recorded for the day in mm• Evaporation: The so-called Class A pan evaporation
(mm) in the 24 hours to 9am• Sunshine: The number of hours of bright sunshine in the
day.• WindGustDir: The direction of the strongest wind gust in the 24 hours to midnight•
WindGustSpeed: The speed (km/h) of the strongest wind gust in the 24 hours to midnight•
WindDir: Direction of the wind• WindSpeed: Wind speed (km/hr) averaged over 10
minutes• Humidity: Humidity (percent)• Pressure: Atmospheric pressure (hpa) reduced to
mean sea level• Cloud: Fraction of sky obscured by cloud This is measured in “oktas”, which
are a unit of eigths. Itrecords how many eigths of the sky are obscured by the cloud. A 0
measure indicates a completely clear skywhilst an 8 indicates that it is completely
overcast.• Temp: Temperature (degrees C)• RainTodayBoolean: 1 if precipitation (mm) in
the 24 hours to 9 am exceeds 1mm, otherwise 0• RainTomorrow: The target variable. Did it
rain tomorrow?Organize your report using the following template (with the section
breakdown and grading rubrics):Section 1: Data preparation (30%)• Discuss the potential
data quality issues you identify about the dataset and how you apply various data
preprocessing techniques to cope with those issues and perform Exploratory Data Analysis
(EDA). Specifically, discuss the type of techniques you carry out in order to prepare the
dataset for the machine learning algorithms you use in the next section. Whenever
appropriate, enhance your EDA with effective data visualization.Section 2: Build, tune and
evaluate cluster analysis and decision tree models (50%)• Apply both clustering algorithm
(kmeans and HAC) and decision tree induction algorithm to the weather forest training data
and construct models. Perform extensive model experiments with hyper-parameters’
tuning. Discuss your choice of hyper-parameters for each algorithm and produce tables
summarizing the best performing models and their corresponding model specifications (i.e.
the combination of hyper-parameters). Also, explain your choice of model performance
evaluation methods and metrics in order to produce unbiased and low variance estimates.•
In your analysis writeup, include the discussion regarding how to repurpose the
unsupervised learning algorithms like clustering for classification and how to judge the
performance of the algorithms.• For decision tree induction algorithm model performance
evaluation, generate Receiver OperatingCharacteristics (ROC) curve and calculate the Area
Under Curve (AUC) metric for the identified best performing model. Include a decision tree
output visualization and interpret the decision tree model.• Include a detailed explanation
of your modeling process and interpretation of the results in your analysiswriteup (with
markdown language) and structure such writeup in an easy-to-follow layout. Please limit
the program output only to the most relevant part which is used to support your analysis.
An excessive amount of less relevant outputs (e.g. display the whole dataset) in your report
is not neededSection 3: Prediction and interpretation (20%)• After building the
classification models, apply them to the test dataset (Weather Forcast Testing.csv)provided
to predict if each location will rain tomorrow.•Needed prediction results as a CSV file with
four columns (ID, kmeans, HAC, DT) for theclassification results out of the pre-specified
machine learning algorithms respectively.Needed rmarkdown or notebook document
together with a knitted report (HTML format) and prediction CSV file.

Weitere ähnliche Inhalte

Ähnlich wie Use cluster analysis and decision tree induction algorithm on a.docx

BREEZE 3D Analyst for the Advanced AERMOD Modeler
BREEZE 3D Analyst for the Advanced AERMOD ModelerBREEZE 3D Analyst for the Advanced AERMOD Modeler
BREEZE 3D Analyst for the Advanced AERMOD ModelerBREEZE Software
 
Fall 2016 Insurance Case Study – Finance 360Loss ControlLoss.docx
Fall 2016 Insurance Case Study – Finance 360Loss ControlLoss.docxFall 2016 Insurance Case Study – Finance 360Loss ControlLoss.docx
Fall 2016 Insurance Case Study – Finance 360Loss ControlLoss.docxlmelaine
 
Churn Modeling For Mobile Telecommunications
Churn Modeling For Mobile TelecommunicationsChurn Modeling For Mobile Telecommunications
Churn Modeling For Mobile TelecommunicationsSalford Systems
 
CS3114_09212011.ppt
CS3114_09212011.pptCS3114_09212011.ppt
CS3114_09212011.pptArumugam90
 
How might machine learning help advance solar PV research?
How might machine learning help advance solar PV research?How might machine learning help advance solar PV research?
How might machine learning help advance solar PV research?Anubhav Jain
 
CAD theory presentation.pptx .
CAD theory presentation.pptx                .CAD theory presentation.pptx                .
CAD theory presentation.pptx .Athar739197
 
Data Structures and Algorithms Unit 01
Data Structures and Algorithms Unit 01Data Structures and Algorithms Unit 01
Data Structures and Algorithms Unit 01Prashanth Shivakumar
 
Future guidelines the meteorological view - Isabel Martínez (AEMet)
Future guidelines the meteorological view - Isabel Martínez (AEMet)Future guidelines the meteorological view - Isabel Martínez (AEMet)
Future guidelines the meteorological view - Isabel Martínez (AEMet)IrSOLaV Pomares
 
computer networking
computer networkingcomputer networking
computer networkingAvi Nash
 
University of Victoria Talk - Metocean analysis and Machine Learning for Impr...
University of Victoria Talk - Metocean analysis and Machine Learning for Impr...University of Victoria Talk - Metocean analysis and Machine Learning for Impr...
University of Victoria Talk - Metocean analysis and Machine Learning for Impr...Aaron Barker
 
(MC²)²: A Generic Decision-Making Framework and its Application to Cloud Comp...
(MC²)²: A Generic Decision-Making Framework and its Application to Cloud Comp...(MC²)²: A Generic Decision-Making Framework and its Application to Cloud Comp...
(MC²)²: A Generic Decision-Making Framework and its Application to Cloud Comp...Dr.-Ing. Michael Menzel
 
Meteoio Introduction given by Mathias Bavey in Bozen
Meteoio Introduction given by Mathias Bavey in BozenMeteoio Introduction given by Mathias Bavey in Bozen
Meteoio Introduction given by Mathias Bavey in BozenRiccardo Rigon
 
Parallel Programming for Multi- Core and Cluster Systems - Performance Analysis
Parallel Programming for Multi- Core and Cluster Systems - Performance AnalysisParallel Programming for Multi- Core and Cluster Systems - Performance Analysis
Parallel Programming for Multi- Core and Cluster Systems - Performance AnalysisShah Zaib
 
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0Mohsen Sadok
 
Improving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC SystemsImproving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC SystemsHPCC Systems
 

Ähnlich wie Use cluster analysis and decision tree induction algorithm on a.docx (20)

Mannasim for NS2
Mannasim for NS2Mannasim for NS2
Mannasim for NS2
 
BREEZE 3D Analyst for the Advanced AERMOD Modeler
BREEZE 3D Analyst for the Advanced AERMOD ModelerBREEZE 3D Analyst for the Advanced AERMOD Modeler
BREEZE 3D Analyst for the Advanced AERMOD Modeler
 
Fall 2016 Insurance Case Study – Finance 360Loss ControlLoss.docx
Fall 2016 Insurance Case Study – Finance 360Loss ControlLoss.docxFall 2016 Insurance Case Study – Finance 360Loss ControlLoss.docx
Fall 2016 Insurance Case Study – Finance 360Loss ControlLoss.docx
 
Churn Modeling For Mobile Telecommunications
Churn Modeling For Mobile TelecommunicationsChurn Modeling For Mobile Telecommunications
Churn Modeling For Mobile Telecommunications
 
CS3114_09212011.ppt
CS3114_09212011.pptCS3114_09212011.ppt
CS3114_09212011.ppt
 
How might machine learning help advance solar PV research?
How might machine learning help advance solar PV research?How might machine learning help advance solar PV research?
How might machine learning help advance solar PV research?
 
CFD & ANSYS FLUENT
CFD & ANSYS FLUENTCFD & ANSYS FLUENT
CFD & ANSYS FLUENT
 
Recent and Planned Improvements to the System Advisor Model
Recent and Planned Improvements to the System Advisor ModelRecent and Planned Improvements to the System Advisor Model
Recent and Planned Improvements to the System Advisor Model
 
CAD theory presentation.pptx .
CAD theory presentation.pptx                .CAD theory presentation.pptx                .
CAD theory presentation.pptx .
 
Data Structures and Algorithms Unit 01
Data Structures and Algorithms Unit 01Data Structures and Algorithms Unit 01
Data Structures and Algorithms Unit 01
 
Future guidelines the meteorological view - Isabel Martínez (AEMet)
Future guidelines the meteorological view - Isabel Martínez (AEMet)Future guidelines the meteorological view - Isabel Martínez (AEMet)
Future guidelines the meteorological view - Isabel Martínez (AEMet)
 
Visual Techniques
Visual TechniquesVisual Techniques
Visual Techniques
 
computer networking
computer networkingcomputer networking
computer networking
 
University of Victoria Talk - Metocean analysis and Machine Learning for Impr...
University of Victoria Talk - Metocean analysis and Machine Learning for Impr...University of Victoria Talk - Metocean analysis and Machine Learning for Impr...
University of Victoria Talk - Metocean analysis and Machine Learning for Impr...
 
(MC²)²: A Generic Decision-Making Framework and its Application to Cloud Comp...
(MC²)²: A Generic Decision-Making Framework and its Application to Cloud Comp...(MC²)²: A Generic Decision-Making Framework and its Application to Cloud Comp...
(MC²)²: A Generic Decision-Making Framework and its Application to Cloud Comp...
 
Meteoio Introduction given by Mathias Bavey in Bozen
Meteoio Introduction given by Mathias Bavey in BozenMeteoio Introduction given by Mathias Bavey in Bozen
Meteoio Introduction given by Mathias Bavey in Bozen
 
Parallel Programming for Multi- Core and Cluster Systems - Performance Analysis
Parallel Programming for Multi- Core and Cluster Systems - Performance AnalysisParallel Programming for Multi- Core and Cluster Systems - Performance Analysis
Parallel Programming for Multi- Core and Cluster Systems - Performance Analysis
 
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
A study of Machine Learning approach for Predictive Maintenance in Industry 4.0
 
Improving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC SystemsImproving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC Systems
 
Modeling full scale-data(2)
Modeling full scale-data(2)Modeling full scale-data(2)
Modeling full scale-data(2)
 

Mehr von bkbk37

Range of.docx
Range of.docxRange of.docx
Range of.docxbkbk37
 
Ralph Waldo Emerson.docx
Ralph Waldo Emerson.docxRalph Waldo Emerson.docx
Ralph Waldo Emerson.docxbkbk37
 
Raising Minimum An explanation of the its.docx
Raising Minimum An explanation of the its.docxRaising Minimum An explanation of the its.docx
Raising Minimum An explanation of the its.docxbkbk37
 
Rail Project A goal of the Obama administration.docx
Rail Project A goal of the Obama administration.docxRail Project A goal of the Obama administration.docx
Rail Project A goal of the Obama administration.docxbkbk37
 
Racism toward Indigenous peoples in Canada.docx
Racism toward Indigenous peoples in Canada.docxRacism toward Indigenous peoples in Canada.docx
Racism toward Indigenous peoples in Canada.docxbkbk37
 
Race and.docx
Race and.docxRace and.docx
Race and.docxbkbk37
 
R2P and Syria.docx
R2P and Syria.docxR2P and Syria.docx
R2P and Syria.docxbkbk37
 
Racial Disparities.docx
Racial Disparities.docxRacial Disparities.docx
Racial Disparities.docxbkbk37
 
Race and Technology.docx
Race and Technology.docxRace and Technology.docx
Race and Technology.docxbkbk37
 
QuickBooks uses windows API to follow orders to get updates.docx
QuickBooks uses windows API to follow orders to get updates.docxQuickBooks uses windows API to follow orders to get updates.docx
QuickBooks uses windows API to follow orders to get updates.docxbkbk37
 
Questions What are the purposes of Just.docx
Questions What are the purposes of Just.docxQuestions What are the purposes of Just.docx
Questions What are the purposes of Just.docxbkbk37
 
Questions to Each group you read about is.docx
Questions to Each group you read about is.docxQuestions to Each group you read about is.docx
Questions to Each group you read about is.docxbkbk37
 
Questions that must be answered in your plus other.docx
Questions that must be answered in your plus other.docxQuestions that must be answered in your plus other.docx
Questions that must be answered in your plus other.docxbkbk37
 
Questions for Brief Explicit Spiritual.docx
Questions for Brief Explicit Spiritual.docxQuestions for Brief Explicit Spiritual.docx
Questions for Brief Explicit Spiritual.docxbkbk37
 
Question Libya recently announced that it is claiming a.docx
Question Libya recently announced that it is claiming a.docxQuestion Libya recently announced that it is claiming a.docx
Question Libya recently announced that it is claiming a.docxbkbk37
 
Question Use the Internet or the IGlobal Resource.docx
Question Use the Internet or the IGlobal Resource.docxQuestion Use the Internet or the IGlobal Resource.docx
Question Use the Internet or the IGlobal Resource.docxbkbk37
 
Question Please define motivation and discuss why it is.docx
Question Please define motivation and discuss why it is.docxQuestion Please define motivation and discuss why it is.docx
Question Please define motivation and discuss why it is.docxbkbk37
 
Question share your perspective on personal data as a.docx
Question share your perspective on personal data as a.docxQuestion share your perspective on personal data as a.docx
Question share your perspective on personal data as a.docxbkbk37
 
QEP Assignment Death Penalty.docx
QEP Assignment Death Penalty.docxQEP Assignment Death Penalty.docx
QEP Assignment Death Penalty.docxbkbk37
 
Question In your what are the main workforce.docx
Question In your what are the main workforce.docxQuestion In your what are the main workforce.docx
Question In your what are the main workforce.docxbkbk37
 

Mehr von bkbk37 (20)

Range of.docx
Range of.docxRange of.docx
Range of.docx
 
Ralph Waldo Emerson.docx
Ralph Waldo Emerson.docxRalph Waldo Emerson.docx
Ralph Waldo Emerson.docx
 
Raising Minimum An explanation of the its.docx
Raising Minimum An explanation of the its.docxRaising Minimum An explanation of the its.docx
Raising Minimum An explanation of the its.docx
 
Rail Project A goal of the Obama administration.docx
Rail Project A goal of the Obama administration.docxRail Project A goal of the Obama administration.docx
Rail Project A goal of the Obama administration.docx
 
Racism toward Indigenous peoples in Canada.docx
Racism toward Indigenous peoples in Canada.docxRacism toward Indigenous peoples in Canada.docx
Racism toward Indigenous peoples in Canada.docx
 
Race and.docx
Race and.docxRace and.docx
Race and.docx
 
R2P and Syria.docx
R2P and Syria.docxR2P and Syria.docx
R2P and Syria.docx
 
Racial Disparities.docx
Racial Disparities.docxRacial Disparities.docx
Racial Disparities.docx
 
Race and Technology.docx
Race and Technology.docxRace and Technology.docx
Race and Technology.docx
 
QuickBooks uses windows API to follow orders to get updates.docx
QuickBooks uses windows API to follow orders to get updates.docxQuickBooks uses windows API to follow orders to get updates.docx
QuickBooks uses windows API to follow orders to get updates.docx
 
Questions What are the purposes of Just.docx
Questions What are the purposes of Just.docxQuestions What are the purposes of Just.docx
Questions What are the purposes of Just.docx
 
Questions to Each group you read about is.docx
Questions to Each group you read about is.docxQuestions to Each group you read about is.docx
Questions to Each group you read about is.docx
 
Questions that must be answered in your plus other.docx
Questions that must be answered in your plus other.docxQuestions that must be answered in your plus other.docx
Questions that must be answered in your plus other.docx
 
Questions for Brief Explicit Spiritual.docx
Questions for Brief Explicit Spiritual.docxQuestions for Brief Explicit Spiritual.docx
Questions for Brief Explicit Spiritual.docx
 
Question Libya recently announced that it is claiming a.docx
Question Libya recently announced that it is claiming a.docxQuestion Libya recently announced that it is claiming a.docx
Question Libya recently announced that it is claiming a.docx
 
Question Use the Internet or the IGlobal Resource.docx
Question Use the Internet or the IGlobal Resource.docxQuestion Use the Internet or the IGlobal Resource.docx
Question Use the Internet or the IGlobal Resource.docx
 
Question Please define motivation and discuss why it is.docx
Question Please define motivation and discuss why it is.docxQuestion Please define motivation and discuss why it is.docx
Question Please define motivation and discuss why it is.docx
 
Question share your perspective on personal data as a.docx
Question share your perspective on personal data as a.docxQuestion share your perspective on personal data as a.docx
Question share your perspective on personal data as a.docx
 
QEP Assignment Death Penalty.docx
QEP Assignment Death Penalty.docxQEP Assignment Death Penalty.docx
QEP Assignment Death Penalty.docx
 
Question In your what are the main workforce.docx
Question In your what are the main workforce.docxQuestion In your what are the main workforce.docx
Question In your what are the main workforce.docx
 

Kürzlich hochgeladen

Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...PsychoTech Services
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024Janet Corral
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...fonyou31
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Celine George
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAssociation for Project Management
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 

Kürzlich hochgeladen (20)

Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
IGNOU MSCCFT and PGDCFT Exam Question Pattern: MCFT003 Counselling and Family...
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
Ecosystem Interactions Class Discussion Presentation in Blue Green Lined Styl...
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 

Use cluster analysis and decision tree induction algorithm on a.docx

  • 1. Use cluster analysis and decision tree induction algorithm on a | Analytics | Harrisburg University of Science and Technology Use the Cluster Analysis and Decision Tree induction algorithmon a weather forecast problem. It is a binary classification problem to predict whether or not a location will get rain the next day.Information about the dataset (Weather Forecast Training.csv):• Location: The location name of the weather station• MinTemp: The minimum temperature in degrees celsius• MaxTemp: The maximum temperature in degrees celsius• Rainfall: The amount of rainfall recorded for the day in mm• Evaporation: The so-called Class A pan evaporation (mm) in the 24 hours to 9am• Sunshine: The number of hours of bright sunshine in the day.• WindGustDir: The direction of the strongest wind gust in the 24 hours to midnight• WindGustSpeed: The speed (km/h) of the strongest wind gust in the 24 hours to midnight• WindDir: Direction of the wind• WindSpeed: Wind speed (km/hr) averaged over 10 minutes• Humidity: Humidity (percent)• Pressure: Atmospheric pressure (hpa) reduced to mean sea level• Cloud: Fraction of sky obscured by cloud This is measured in “oktas”, which are a unit of eigths. Itrecords how many eigths of the sky are obscured by the cloud. A 0 measure indicates a completely clear skywhilst an 8 indicates that it is completely overcast.• Temp: Temperature (degrees C)• RainTodayBoolean: 1 if precipitation (mm) in the 24 hours to 9 am exceeds 1mm, otherwise 0• RainTomorrow: The target variable. Did it rain tomorrow?Organize your report using the following template (with the section breakdown and grading rubrics):Section 1: Data preparation (30%)• Discuss the potential data quality issues you identify about the dataset and how you apply various data preprocessing techniques to cope with those issues and perform Exploratory Data Analysis (EDA). Specifically, discuss the type of techniques you carry out in order to prepare the dataset for the machine learning algorithms you use in the next section. Whenever appropriate, enhance your EDA with effective data visualization.Section 2: Build, tune and evaluate cluster analysis and decision tree models (50%)• Apply both clustering algorithm (kmeans and HAC) and decision tree induction algorithm to the weather forest training data and construct models. Perform extensive model experiments with hyper-parameters’ tuning. Discuss your choice of hyper-parameters for each algorithm and produce tables summarizing the best performing models and their corresponding model specifications (i.e. the combination of hyper-parameters). Also, explain your choice of model performance evaluation methods and metrics in order to produce unbiased and low variance estimates.• In your analysis writeup, include the discussion regarding how to repurpose the
  • 2. unsupervised learning algorithms like clustering for classification and how to judge the performance of the algorithms.• For decision tree induction algorithm model performance evaluation, generate Receiver OperatingCharacteristics (ROC) curve and calculate the Area Under Curve (AUC) metric for the identified best performing model. Include a decision tree output visualization and interpret the decision tree model.• Include a detailed explanation of your modeling process and interpretation of the results in your analysiswriteup (with markdown language) and structure such writeup in an easy-to-follow layout. Please limit the program output only to the most relevant part which is used to support your analysis. An excessive amount of less relevant outputs (e.g. display the whole dataset) in your report is not neededSection 3: Prediction and interpretation (20%)• After building the classification models, apply them to the test dataset (Weather Forcast Testing.csv)provided to predict if each location will rain tomorrow.•Needed prediction results as a CSV file with four columns (ID, kmeans, HAC, DT) for theclassification results out of the pre-specified machine learning algorithms respectively.Needed rmarkdown or notebook document together with a knitted report (HTML format) and prediction CSV file.