General Requirements

This section contains the general requirements which must be met by your submitted assignment. Marks will be deducted if you fail to meet any of the following general requirements.
- You must complete Tasks 1-3 in the Jupyter Notebook under the Python 3 kernel.
- All code must be written in one single ipynb file, where each task and the sub-tasks therein (if any) must be clearly separated via Markdown cells to ensure good readability.
- You must include code-level comments in the .ipynb file to explain the key parts of your code.
- You must follow the instructions given in each task to complete the corresponding task.
- You must follow the rules specified in the "Submission Requirements" section to make your final submission.
- Your code in the submitted ipynb file must be executable during marking, where all necessary files needed for executing the code must also be submitted, as detailed in the "Submission Requirements" section.
- All graphs must be properly sized and formatted to include a meaningful title, appropriate axis labels, and a legend. The fonts contained in the graph must be properly sized for good readability. The components of the graph should be appropriately coloured, if applicable.
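
As an illustration of the graph requirements above, the following is a minimal sketch using matplotlib. The brief does not prescribe a plotting library, so matplotlib is an assumption, and the class names, counts, sizes and colours are made-up placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for running as a script; omit this line inside Jupyter
import matplotlib.pyplot as plt

# Made-up values, purely for illustration.
class_names = ["class A", "class B"]
train_counts = [60, 40]
test_counts = [15, 10]

fig, ax = plt.subplots(figsize=(8, 5))  # explicit figure size
positions = list(range(len(class_names)))
bar_width = 0.4

# One colour and one legend entry per series.
ax.bar([p - bar_width / 2 for p in positions], train_counts,
       width=bar_width, color="tab:blue", label="Training set")
ax.bar([p + bar_width / 2 for p in positions], test_counts,
       width=bar_width, color="tab:orange", label="Test set")

ax.set_xticks(positions)
ax.set_xticklabels(class_names, fontsize=11)
ax.set_title("Number of records per class", fontsize=14)  # meaningful title
ax.set_xlabel("Class label", fontsize=12)                 # axis labels
ax.set_ylabel("Number of records", fontsize=12)
ax.legend(fontsize=11)                                    # legend
fig.tight_layout()
```

The same four calls (`set_title`, `set_xlabel`, `set_ylabel`, `legend`) apply to any other plot type drawn on `ax`.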
Task 1 - Problem Formulation, Data Acquisition and Preparation (12%)

Please visit the UCI repository at https://archive.ics.uci.edu/ml/datasets.php and click on the "Classification" link under the "Default Task" section, as illustrated in Figure 1, to check the available data sets that fall into the category of classification tasks. You can find the details about each of the listed data sets by clicking on its name (beside its icon), as illustrated in Figure 1, and accordingly gain a better understanding of the data and its domain. After that, you need to choose ONE data set which must satisfy the following criteria:
- The data set must contain at least 150 rows (i.e., data records).
- The data set must contain at least five columns, excluding the class label column.
- The data set must contain at least one categorical column, excluding the class label column.
- The data set must NOT be a multi-label data set, e.g., the Anuran Calls (MFCCs) data set, where each data record is associated with multiple different labels.
Note: If you choose a data set not satisfying the above criteria, your total marks for this assignment will be halved.

Once you have chosen a certain data set, you can click on the "Data Folder" link on the front page of that data set, as shown in Figure 2, to find and download the data file into your local Jupyter Notebook working folder. Note that some data files may not be in the format of .csv, .xls or .xlsx; in such cases, you need to first convert them into the .csv format before loading the data. Next, you need to load the data and perform necessary and appropriate data preparation operations to facilitate the subsequent data analysis and modelling. Note: If multiple data files exist in the "Data Folder", you may just choose one of them which you believe is the most appropriate one to work on. Furthermore, feature engineering might need to be performed in the step of data preparation.

You must describe your workflow (including the involved key components) for completing this task, present key observations and analyses, provide justifications of any choices you have made, and discuss any issues if encountered (including the ways you have used to address them) in the report required in Task 4.
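
The selection criteria listed in Task 1 can be checked programmatically once the data is loaded. The sketch below assumes pandas is used for loading, which the brief does not mandate. The helper name `check_dataset_criteria`, the column names and the toy data are all hypothetical. The multi-label criterion is not covered, because it has to be judged from the data set's description.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def check_dataset_criteria(df, label_col):
    """Return one boolean per selection criterion (hypothetical helper)."""
    features = df.drop(columns=[label_col])
    # Heuristic: treat every non-numeric feature column as categorical.
    # Categorical columns stored as integer codes are NOT detected this way.
    categorical = [c for c in features.columns
                   if not is_numeric_dtype(features[c])]
    return {
        "at_least_150_rows": len(df) >= 150,
        "at_least_5_feature_columns": features.shape[1] >= 5,
        "has_categorical_feature": len(categorical) >= 1,
    }


# Made-up data standing in for a file you would load with pd.read_csv(...).
toy = pd.DataFrame({
    "f1": range(150),
    "f2": [0.5] * 150,
    "f3": [1] * 150,
    "f4": [2.0] * 150,
    "colour": ["red", "blue", "green"] * 50,
    "label": ["yes", "no"] * 75,
})
report = check_dataset_criteria(toy, label_col="label")
```

Any criterion reported as `False` suggests the chosen data set should be reconsidered before moving on to data preparation.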
Task 2 - Data Exploration (16%)

Now you've finished Task 1. You can start to explore the data loaded and prepared in Task 1 by carrying out the following steps:

2.1 Exploring each column (i.e., attribute) by using appropriate descriptive statistics and/or graphical visualisations. If the data set contains more than 10 attributes, you just need to select 10 columns to explore. You must elaborate the way(s) you've used for exploration and present key observations, analyses and conclusions in the report required in Task 4.

2.2 Exploring the relationships between all pairs of columns (or 10 selected pairs of columns if the data set contains more than 10 attributes) by using appropriate descriptive statistics and/or graphical visualisations. You must elaborate the way(s) you've used for exploration and present key observations, analyses and conclusions regarding the relationships between the explored pairs in the report required in Task 4.

2.3 Posing one meaningful question and exploring the data by using appropriate methods to find its answer. You must state the question, describe the way you've used to find its answer, report key observations based upon numeric metrics (e.g., descriptive/inferential statistics) and/or graphical visualisations, and present any interesting takeaways in the report required in Task 4.

You must also describe your workflow (including the involved key components) for completing this task, provide justifications of any choices you've made, and discuss any issues if encountered (including the ways you've used to address them) in the report required in Task 4.
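
The descriptive-statistics side of steps 2.1 and 2.2 might look like the following sketch. It assumes pandas, which the brief does not mandate, and the toy columns `length`, `width`, `colour` and `label` are made up. Which statistics are appropriate depends on the column types in the data set you chose.

```python
import pandas as pd

# Made-up data, purely for illustration.
toy = pd.DataFrame({
    "length": [1.0, 2.0, 3.0, 4.0],
    "width": [2.0, 4.0, 6.0, 8.0],
    "colour": ["red", "red", "blue", "blue"],
    "label": ["a", "a", "b", "b"],
})

# 2.1 Single columns: summary statistics for a numeric column,
# frequency counts for a categorical column.
numeric_summary = toy["length"].describe()
category_counts = toy["colour"].value_counts()

# 2.2 Pairs of columns:
# numeric vs numeric -> Pearson correlation
length_width_corr = toy["length"].corr(toy["width"])
# categorical vs categorical -> contingency table
colour_by_label = pd.crosstab(toy["colour"], toy["label"])
# numeric vs categorical -> per-group mean
mean_length_by_label = toy.groupby("label")["length"].mean()
```

Each of these summaries has a natural graphical counterpart (histogram, bar chart, scatter plot, grouped box plot) if a visualisation communicates the observation better.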
Task 3 - Data Modelling (32%)

In this task, you are asked to choose TWO classification models, and carry out the following steps:

3.1 Splitting the data into a training set and a test set. Specifically, you need to split the data at the following ratios, respectively, to form three different suites of training and test sets:
- Suite1: 50% for training and 50% for testing
- Suite2: 60% for training and 40% for testing
- Suite3: 80% for training and 20% for testing
You must describe the way you have used for splitting the data to ensure the reproducibility of your work in the report required in Task 4.

3.2 Performing the following steps for each of the two chosen models on each of the above three suites:
- Identifying the method in the "scikit-learn" package which implements the chosen model.
- Selecting appropriate model parameters and using them to train the model via the above-identified method. You must elaborate and justify the way you've used for parameter selection in the report required in Task 4.
- Evaluating the performances of the model on the training and test sets, respectively, in terms of the following metrics, and reporting them in the report required in Task 4:
  o Confusion matrix
  o Classification accuracy
  o Precision
  o Recall
  o F1 score
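
Steps 3.1 and 3.2 can be sketched with scikit-learn as follows. This is only one possible arrangement: the two models (a decision tree and logistic regression), their parameter values, the seed, the macro averaging and the synthetic data are all assumptions chosen for illustration, not choices made by the brief. A fixed `random_state` is one common way to make the splits reproducible.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

SEED = 42  # fixed seed so that every split can be reproduced

# Synthetic stand-in for the prepared data set from Task 1.
X, y = make_classification(n_samples=200, n_features=6, n_informative=4,
                           random_state=SEED)

# Test-set fraction for each suite (training fraction is the remainder).
suites = {"Suite1": 0.5, "Suite2": 0.4, "Suite3": 0.2}

# Placeholder parameters; the brief asks you to select and justify your own.
models = {
    "decision_tree": lambda: DecisionTreeClassifier(max_depth=3, random_state=SEED),
    "logistic_regression": lambda: LogisticRegression(max_iter=1000),
}


def evaluate(model, X_part, y_part):
    """Compute the five required metrics on one data partition."""
    y_pred = model.predict(X_part)
    return {
        "confusion_matrix": confusion_matrix(y_part, y_pred),
        "accuracy": accuracy_score(y_part, y_pred),
        "precision": precision_score(y_part, y_pred, average="macro", zero_division=0),
        "recall": recall_score(y_part, y_pred, average="macro", zero_division=0),
        "f1": f1_score(y_part, y_pred, average="macro", zero_division=0),
    }


results = {}
for suite_name, test_fraction in suites.items():
    # stratify=y keeps the class proportions similar in both partitions.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_fraction, random_state=SEED, stratify=y)
    for model_name, make_model in models.items():
        model = make_model().fit(X_train, y_train)
        results[(suite_name, model_name)] = {
            "n_train": len(y_train),
            "n_test": len(y_test),
            "train": evaluate(model, X_train, y_train),
            "test": evaluate(model, X_test, y_test),
        }
```

`results` then holds one entry per (suite, model) combination, six in total, each with training and test metrics that can be tabulated for the report.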
You are asked to write a report for the data science "project" you've completed in Tasks 1-3. The report must have the following structure.
- A cover page, including
  o Report title
  o Your full name and student ID
  o Affiliation
  o Contact details
  o Date
- Abstract (i.e., an executive summary)
- Introduction (including the entire workflow of this data science "project")
- Task 1 (following the italic instructions related to the report as specified in Task 1)
- Task 2 (following the italic instructions related to the report as specified in Task 2)
- Task 3 (following the italic instructions related to the report as specified in Task 3)
- Discussion and Conclusions

The report must be saved in the PDF format and named "report.pdf" for submission. It MUST be written in the single column format with font size between 10 and 12 points and no more than is pages (including tables, graphs and/or references). Penalties will apply if the report does not satisfy these requirements. Moreover, the quality of the report will be considered when marking, e.g., organisation, clarity, and grammatical mistakes. Please remember to explicitly cite any sources which you've referred to when doing your work!

Submission Requirements

General Requirements This section contains the general requirements w.pdf

  • 1. General Requirements
This section contains the general requirements which must be met by your submitted assignment. Marks will be deducted if you fail to meet any of the following general requirements.
- You must complete Tasks 1-3 in the Jupyter Notebook under the Python 3 kernel.
- All code must be written in one single .ipynb file, where each task and the sub-tasks therein (if any) must be clearly separated via Markdown cells to ensure good readability.
- You must include code-level comments in the .ipynb file to explain the key parts of your code.
- You must follow the instructions given in each task to complete the corresponding task.
- You must follow the rules specified in the "Submission Requirements" section to make your final submission.
- Your code in the submitted .ipynb file must be executable during marking, where all necessary files needed for executing the code must also be submitted, as detailed in the "Submission Requirements" section.
- All graphs must be properly sized and formatted to include a meaningful title, appropriate axis labels, and a legend. The fonts contained in the graph must be properly sized for good readability. The components of the graph should be appropriately coloured, if applicable.
Task 1 - Problem Formulation, Data Acquisition and Preparation (12%)
Please visit the UCI repository at https://archive.ics.uci.edu/ml/datasets.php and click on the "Classification" link under the "Default Task" section, as illustrated in Figure 1, to check the available data sets that fall into the category of classification tasks. You can find the details about each of the listed data sets by clicking on its name (beside its icon), as illustrated in Figure 1, and accordingly gain a better understanding of the data and its domain. After that, you need to choose ONE data set which must satisfy the following criteria:
- The data set must contain at least 150 rows (i.e., data records).
- The data set must contain at least five columns, not counting the class label column.
- The data set must contain at least one categorical column, not counting the class label column.
- The data set must NOT be a multi-label data set, e.g., the Anuran Calls (MFCCs) data set, where each data record is associated with multiple different labels.
Note: If you choose a data set not satisfying the above criteria, your total marks for this assignment will be halved.
Once you have chosen a certain data set, you can click on the "Data Folder" link on the front page of that data set, as shown in Figure 2, to find and download the data file into your local Jupyter Notebook working folder. Note that some data files may not be in the format of .csv, .xls or .xlsx; in such cases, you need to first convert them into the .csv format before loading the data. Next, you need to load the data and perform necessary and appropriate data preparation operations to facilitate the subsequent data analysis and modelling.
Note: If multiple data files exist in the "Data Folder", you may just choose the one which you believe is the most appropriate to work on. Furthermore, feature engineering might need to be performed in the step of data preparation.
You must describe your workflow (including the involved key components) for completing this task,
  • 2. present key observations and analyses, provide justifications of any choices you have made, and discuss any issues if encountered (including the ways you have used to address them) in the report required in Task 4.
Task 2 - Data Exploration (16%)
Now you've finished Task 1. You can start to explore the data loaded and prepared in Task 1 by carrying out the following steps:
2.1 Exploring each column (i.e., attribute) by using appropriate descriptive statistics and/or graphical visualisations. If the data set contains more than 10 attributes, you just need to select 10 columns to explore. You must elaborate the way(s) you've used for exploration and present key observations, analyses and conclusions in the report required in Task 4.
2.2 Exploring the relationships between all pairs of columns (or 10 selected pairs of columns if the data set contains more than 10 attributes) by using appropriate descriptive statistics and/or graphical visualisations. You must elaborate the way(s) you've used for exploration and present key observations, analyses and conclusions regarding the relationships between the explored pairs in the report required in Task 4.
2.3 Posing one meaningful question and exploring the data by using appropriate methods to find its answer. You must state the question, describe the way you've used to find its answer, report key observations based upon numeric metrics (e.g., descriptive/inferential statistics) and/or graphical visualisations, and present any interesting takeaways in the report required in Task 4.
You must also describe your workflow (including the involved key components) for completing this task, provide justifications of any choices you've made, and discuss any issues if encountered (including the ways you've used to address them) in the report required in Task 4.
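The data set criteria from Task 1 and the per-column and pairwise exploration in steps 2.1 and 2.2 can be illustrated with a minimal pandas sketch. Everything named here is an assumption for illustration only: the synthetic table stands in for a downloaded UCI file (a real notebook would load the chosen file with `pd.read_csv`), and the column names, the `label` class column, and the helper `check_criteria` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a downloaded data file: 200 rows, five feature
# columns (one of them categorical) and one class label column.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "income": rng.normal(50_000, 12_000, size=n).round(2),
    "visits": rng.poisson(3, size=n),
    "score": rng.uniform(0, 1, size=n),
    "region": rng.choice(["north", "south", "east"], size=n),
    "label": rng.choice(["yes", "no"], size=n),
})


def check_criteria(data, label_col):
    """Check the three size/type criteria listed in Task 1 (hypothetical helper)."""
    features = data.drop(columns=[label_col])
    # Treat every non-numeric feature column as categorical.
    categorical = features.select_dtypes(exclude="number").columns
    return {
        "rows_ok": len(data) >= 150,
        "columns_ok": features.shape[1] >= 5,
        "categorical_ok": len(categorical) >= 1,
    }


criteria = check_criteria(df, "label")

# Step 2.1: descriptive statistics for each column.
numeric_summary = df.describe()               # count, mean, std, quartiles
region_counts = df["region"].value_counts()   # frequencies of a categorical column

# Step 2.2: relationships between pairs of columns.
corr = df.select_dtypes(include="number").corr()   # numeric vs numeric
region_vs_label = pd.crosstab(df["region"], df["label"])  # categorical vs categorical

print(criteria)
print(corr.round(2))
print(region_vs_label)
```

Which statistics or plots are appropriate depends on the column types of the chosen data set; correlation and cross-tabulation are only two common options.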
Task 3 - Data Modelling (32%)
In this task, you are asked to choose TWO classification models, and carry out the following steps:
3.1 Splitting the data into a training set and a test set. Specifically, you need to split the data at the following ratios, respectively, to form three different suites of training and test sets:
- Suite 1: 50% for training and 50% for testing
- Suite 2: 60% for training and 40% for testing
- Suite 3: 80% for training and 20% for testing
You must describe the way you have used for splitting the data to ensure the reproducibility of your work in the report required in Task 4.
3.2 Performing the following steps for each of the two chosen models on each of the above three suites:
- Identifying the method in the "scikit-learn" package which implements the chosen model.
- Selecting appropriate model parameters and using them to train the model via the above-identified method. You must elaborate and justify the way you've used for parameter selection in the report required in Task 4.
- Evaluating the performances of the model on the training and test sets, respectively, in terms of the following metrics, and reporting them in the report required in Task 4:
  o Confusion matrix
  o Classification accuracy
  o Precision
  o Recall
  o F1 score
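Steps 3.1 and 3.2 can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions: the data is synthetic (`make_classification`), the two models (a decision tree and logistic regression) and their parameter values are arbitrary examples rather than recommended choices, and the names `SUITES`, `MODELS` and `evaluate` are hypothetical. Fixing `random_state` is one common way to make the splits reproducible.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the prepared data set.
X, y = make_classification(n_samples=300, n_features=6, random_state=42)

# Test-set fraction for each suite (50/50, 60/40 and 80/20 splits).
SUITES = {"Suite 1": 0.5, "Suite 2": 0.4, "Suite 3": 0.2}

# Two example models; the parameter values are illustrative assumptions.
MODELS = {
    "decision tree": lambda: DecisionTreeClassifier(max_depth=3, random_state=42),
    "logistic regression": lambda: LogisticRegression(max_iter=1000),
}


def evaluate(model, X_part, y_part):
    """Compute the five required metrics for one fitted model on one data part."""
    pred = model.predict(X_part)
    # For a multi-class label, precision/recall/F1 would need an `average` argument.
    return {
        "confusion_matrix": confusion_matrix(y_part, pred),
        "accuracy": accuracy_score(y_part, pred),
        "precision": precision_score(y_part, pred),
        "recall": recall_score(y_part, pred),
        "f1": f1_score(y_part, pred),
    }


results = {}
for suite, test_size in SUITES.items():
    # A fixed random_state gives the same split on every run.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=42, stratify=y
    )
    for name, make_model in MODELS.items():
        model = make_model().fit(X_train, y_train)
        results[(suite, name)] = {
            "train": evaluate(model, X_train, y_train),
            "test": evaluate(model, X_test, y_test),
        }

for (suite, name), scores in results.items():
    print(f"{suite} | {name} | test accuracy = {scores['test']['accuracy']:.3f}")
```

Comparing the training and test scores for each suite shows how the amount of training data affects each model; the parameter values themselves would still need to be selected and justified separately, as the task requires.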
  • 3. You are asked to write a report for the data science "project" you've completed in Tasks 1-3. The report must have the following structure.
- A cover page, including
  o Report title
  o Your full name and student ID
  o Affiliation
  o Contact details
  o Date
- Abstract (i.e., an executive summary)
- Introduction (including the entire workflow of this data science "project")
- Task 1 (following the italic instructions related to the report as specified in Task 1)
- Task 2 (following the italic instructions related to the report as specified in Task 2)
- Task 3 (following the italic instructions related to the report as specified in Task 3)
- Discussion and Conclusions
The report must be saved in the PDF format and named "report.pdf" for submission. It MUST be written in the single column format with font size between 10 and 12 points and no more than 15 pages (including tables, graphs and/or references). Penalties will apply if the report does not satisfy these requirements. Moreover, the quality of the report will be considered when marking, e.g., organisation, clarity, and grammatical mistakes. Please remember to explicitly cite any sources which you've referred to when doing your work!
Submission Requirements