Course Recommender
System
Aakash Chotrani | Curtis Eichner
The Problem
The aim of this project is to recommend courses based on the student's
liking. Recommendations are generated from the courses the student has
already taken, from similarities between courses, and from the predicted
future performance of the student.
Students face a recurring dilemma when they have to pick one subject out
of several electives. If we can predict the grade a student would earn in
each elective, we can sort the predictions and recommend the electives in
which the student is likely to perform best.
Approach To The Problem
Dataset:
We have been given 5 CSV files that contain information about the
students and the courses offered by DigiPen. All the data is anonymous:
1) StudGPAInfo : Contains info about 937 students, including their
cumulative GPA and last-attended data. (These are students who passed
DigiPen. The student IDs do not match those in the other files, which
makes this file not useful for the application.)
2) StudGradesInfo : Contains info about 4102 students and their mid/final
grade in each subject.
3) StudChangeMajorInfo : Contains info about 448 students who changed
their major.
4) ProgramCourses : Contains info about all the courses offered from 2011
to 2017 (Credits, Core, Semester). DigiPen offers 478 unique courses.
5) CourseRequisites : Lists the courses that are prerequisites of each
course.
Currently the program uses only the StudGradesInfo and ProgramCourses
files to make predictions.
The first step is selecting the relevant features and setting up the
dataset to train our algorithm. We use Python to extract features and
create a sparse matrix of students and the subjects that they took.
               Student 1   Student 2   ...   Student 4102
Subject 1        .98         -1        ...      .88
Subject 2        -1          .77       ...      -1
...
Subject 478      .52         .73       ...      -1
Each number in the matrix is the grade that the student achieved at the
end of the semester in a particular subject: 0 means the student achieved
0% and 1 means 100%. If the student didn't take the subject, the cell is
set to -1. This gives a sparse matrix of dimension 478x4102. We export
this matrix as a text file to train our algorithm.
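The matrix construction described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the record layout `(student_id, course_id, grade)`, the course names, and the helper `build_grade_matrix` are assumptions made for the example.

```python
# Hypothetical records: (student_id, course_id, grade as a fraction in [0, 1]).
records = [
    ("s1", "CS100", 0.98),
    ("s1", "MAT140", 0.52),
    ("s2", "CS120", 0.77),
    ("s2", "MAT140", 0.73),
]

NOT_TAKEN = -1.0  # marker used in the report for "student did not take the subject"


def build_grade_matrix(records):
    """Return (courses, students, matrix), where matrix[course][student] is a grade."""
    courses = sorted({course for _, course, _ in records})
    students = sorted({student for student, _, _ in records})
    row = {course: i for i, course in enumerate(courses)}
    col = {student: j for j, student in enumerate(students)}
    # Start with every cell marked as "not taken", then fill in known grades.
    matrix = [[NOT_TAKEN] * len(students) for _ in courses]
    for student, course, grade in records:
        matrix[row[course]][col[student]] = grade
    return courses, students, matrix


courses, students, matrix = build_grade_matrix(records)
for course, grades in zip(courses, matrix):
    print(course, grades)
```

Rows are subjects and columns are students, matching the layout of the table in the report.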
Creating Training Set and Test Set:
In a conventional split we divide the dataset 70/30, using the first 70%
of the values as the training set and testing on the remaining 30%.
We can't apply that method here, because the model has to be trained on
every subject and on every student. Instead, we mask the data. For
example, in the table above we randomly select 30% of the cells that hold
a grade and move them into the test set. During training we treat those
cells as -1, as if the student hadn't taken the course, and then try to
predict the grade.
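A small sketch of the masking step, under the assumption that the matrix is a list of rows and that -1 marks an untaken course as described above. The function name `mask_for_test`, the fixed seed, and the sample grades are illustrative choices, not taken from the project.

```python
import random

NOT_TAKEN = -1.0


def mask_for_test(matrix, test_fraction=0.3, seed=0):
    """Hide a random fraction of the observed grades and return (train, test)."""
    rng = random.Random(seed)
    # Only cells that actually hold a grade are candidates for the test set.
    observed = [
        (i, j)
        for i, row in enumerate(matrix)
        for j, grade in enumerate(row)
        if grade != NOT_TAKEN
    ]
    held_out = rng.sample(observed, round(len(observed) * test_fraction))
    train = [row[:] for row in matrix]  # copy so the input is left untouched
    test = {}
    for i, j in held_out:
        test[(i, j)] = matrix[i][j]   # remember the true grade
        train[i][j] = NOT_TAKEN       # pretend the course was never taken
    return train, test


grades = [
    [0.98, -1.0, 0.88, 0.61],
    [-1.0, 0.77, -1.0, 0.90],
    [0.52, 0.73, 0.80, -1.0],
    [0.70, 0.65, -1.0, 0.55],
]
train, test = mask_for_test(grades)
print(len(test), "grades held out for testing")
```

Keeping the held-out grades in a dictionary keyed by `(subject, student)` makes it easy to compare predictions against them later.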
               Student 1   Student 2   ...   Student 4102
Subject 1        ? (-1)      -1        ...      .88
Subject 2        -1          .77       ...      -1
...
Subject 478      .52         ? (-1)    ...      -1
After training our algorithm we try to predict the ? values. If the
predicted value for Student 1 in Subject 1 is .90 and the actual grade is
.98, the absolute error is .08, which is a relative error of 8.16%. If we
can predict grades with an average error below 10%, we can estimate how a
student will perform in future elective subjects based on past
performance.
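The error figure in the example can be reproduced with a few lines. This sketch assumes the error is measured relative to the actual grade, which is what the numbers in the example imply; the function name is illustrative.

```python
def relative_error(predicted, actual):
    """Absolute error divided by the actual grade (actual must be non-zero)."""
    return abs(predicted - actual) / actual


# Student 1, Subject 1: predicted .90, actual .98
error = relative_error(0.90, 0.98)
print(f"{error * 100:.2f}%")  # 8.16%
```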
Learning:
The grade that a student receives depends on:
● Difficulty of the subject.
● Performance of the student.
Every subject has requirements that must be fulfilled in order to pass,
such as homework, assignments, midterm, final exam, etc. The student's
performance in the subject depends on his/her performance on each
requirement.
Thus we consider the subject and the student each to be a vector of 7
traits.
Subject : [x1, x2, x3, x4, x5, x6, x7]
Student : [y1, y2, y3, y4, y5, y6, y7]
We are trying to learn the individual weights in each vector. Each element
of the subject vector is the weight of one requirement, and the matching
element of the student vector is the student's performance on that
requirement.
The final grade is the dot product of the two vectors.
Grade = x1*y1 + x2*y2 + x3*y3 + x4*y4 + x5*y5 + x6*y6 + x7*y7
(Note : We chose 7 elements per vector after testing vector sizes from 2
to 10; the 7-element vectors gave the best accuracy. We will experiment
with other sizes in the future to see whether they affect the accuracy.)
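As a quick illustration of the dot-product prediction, the sketch below multiplies a subject vector by a student vector. The numeric values are made up for the example; in the project they would be the learned weights.

```python
def predict_grade(subject, student):
    """Predicted grade as the dot product of a subject and a student vector."""
    if len(subject) != len(student):
        raise ValueError("subject and student vectors must have the same length")
    return sum(x * y for x, y in zip(subject, student))


# Made-up 7-trait vectors, for illustration only.
subject = [0.2, 0.1, 0.1, 0.2, 0.2, 0.1, 0.1]
student = [0.9, 0.8, 1.0, 0.7, 0.6, 0.9, 0.8]
grade = predict_grade(subject, student)
print(round(grade, 2))
```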
The application uses a Q-Learning algorithm to adjust the weights, based
on the error term and the actual score the student received in that
subject.
Pseudocode for our algorithm (LearningSubjectVector):
For each course in all courses
    For each weight in the course vector
        For each student in all students
            If student has taken the course
                Prediction = dot product of the course and student vectors
                Error = Prediction - Actual
                TotalError += Error
                discountFactor = discountFactor * Stud_Actual_Score
                newWeight = Stud_Actual_Score + learning_rate * (TotalError - discountFactor)
This process is repeated 2000 times to learn the weights of subjects and
students. The same algorithm is used for learning the student vector.
After every 100 iterations the weights are saved into two separate text
files. These weights can be loaded back into the application when making
a new prediction.
Performance
The algorithm starts with 59% error and after 2000 iterations it reaches
27-28% error.
One next step is to narrow the inputs: to predict a student's grade in a
CS class, the prediction should be based only on CS and MAT classes
instead of all classes. This should improve the accuracy of the
predictions.
We are also considering batch training instead of training on all the
data at once, which could improve the predictions, as well as trying
different learning rates and discount factor values.
Another step is to change the main algorithm. Instead of using Q-Learning
to reach the optimal values, I would implement variations of gradient
descent such as vanilla gradient descent, SGD, and Adam, and compare the
results.
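For reference, the SGD variant mentioned above could look roughly like the following. This is a sketch of standard stochastic gradient descent on the squared error of the dot-product model, not the Q-Learning update currently used by the application; the function names, initial value range, learning rate, and epoch count are all assumptions.

```python
import random

NOT_TAKEN = -1.0


def train_sgd(matrix, n_traits=7, learning_rate=0.05, epochs=500, seed=0):
    """Learn course and student vectors with plain SGD on squared error."""
    rng = random.Random(seed)
    course_vecs = [[rng.uniform(0.1, 0.5) for _ in range(n_traits)] for _ in matrix]
    student_vecs = [[rng.uniform(0.1, 0.5) for _ in range(n_traits)] for _ in matrix[0]]
    cells = [(i, j) for i, row in enumerate(matrix)
             for j, g in enumerate(row) if g != NOT_TAKEN]
    for _ in range(epochs):
        rng.shuffle(cells)  # visit the observed grades in random order
        for i, j in cells:
            x, y = course_vecs[i], student_vecs[j]
            error = sum(a * b for a, b in zip(x, y)) - matrix[i][j]
            for k in range(n_traits):
                xk, yk = x[k], y[k]  # use the old values for both updates
                x[k] -= learning_rate * error * yk
                y[k] -= learning_rate * error * xk
    return course_vecs, student_vecs


def mean_abs_error(matrix, course_vecs, student_vecs):
    """Average absolute error over the cells that hold a grade."""
    errors = [abs(sum(a * b for a, b in zip(course_vecs[i], student_vecs[j])) - g)
              for i, row in enumerate(matrix)
              for j, g in enumerate(row) if g != NOT_TAKEN]
    return sum(errors) / len(errors)


grades = [[0.98, -1.0, 0.88], [-1.0, 0.77, -1.0], [0.52, 0.73, -1.0]]
error_before = mean_abs_error(grades, *train_sgd(grades, epochs=0))
error_after = mean_abs_error(grades, *train_sgd(grades, epochs=500))
print(error_before, error_after)
```

On a toy matrix this small the model can fit the training cells almost exactly, so the interesting comparison on real data would be the error on the masked test cells.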
If we reach an error below 10%, I will store the weights and try to
predict the best elective recommendation for a particular student after
the first year of performance data.
Alternate Methods
1) Course Specific Regression (CSR)
Undergraduate programs are structured so that the courses a student takes
early on prepare the student for later classes.
This method assumes that a student's performance in previous classes
directly affects performance in a future class.
Example: the grade in class CS300 depends on the student's performance in
MAT100 and CS200.
Hence, if we can calculate the weight/contribution of each previous class
and multiply it by the student's performance in that class, we can
calculate the final grade.
Future_Grade_For_Class_CS300 =
Grade_In_ClassMAT100 * Contribution_MAT100 +
Grade_In_ClassCS200 * Contribution_CS200
Suppose:
Grade_In_ClassMAT100 = 80%
Grade_In_ClassCS200 = 70%
Contribution_MAT100 = 30%
Contribution_CS200 = 70%
Future_Grade_For_Class_CS300 = (0.8)(0.3) + (0.7)(0.7)
Future_Grade_For_Class_CS300 = 0.73
Thus we get the formula
Future_Grade = [Grades_In_Previous_Class]^T* [Weights_Each_Class]
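The worked example above, written as a short sketch. The dictionary layout and the function name `future_grade` are illustrative; the grades and contributions are the ones from the example.

```python
def future_grade(previous_grades, contributions):
    """Weighted sum of previous grades, one weight per prerequisite class."""
    return sum(previous_grades[c] * contributions[c] for c in contributions)


previous_grades = {"MAT100": 0.80, "CS200": 0.70}
contributions = {"MAT100": 0.30, "CS200": 0.70}

predicted_cs300 = future_grade(previous_grades, contributions)
print(round(predicted_cs300, 2))  # 0.73
```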
How do we get the matrix of Weights_Each_Class?
We create a matrix that includes only the rows of students who have
actually taken the target class. For each of those students we include
the grades of the subjects they took before attempting the target course.
$\hat{Y}$ : Actual grade received in the subject
$\mathbf{1}$ : Vector of 1s
$b_0$ : Bias value
$W$ : Vector of weights (we are trying to learn)
$G$ : Matrix of grades of other students similar to the target student
$\lambda_1, \lambda_2$ : Regularization parameters to control overfitting

$$\min_{W,\, b_0} \; \lVert \hat{Y} - \mathbf{1} b_0 - G W \rVert^{2} + \lambda_1 \lVert W \rVert^{2} + \lambda_2 \lVert W \rVert^{2}$$
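One way to learn such weights is ridge regression. The sketch below is an assumption-laden illustration rather than the method's reference implementation: it uses a squared-error loss with a single squared-norm penalty `lam`, leaves the bias unpenalized by centering the data, and solves the resulting normal equations with a small hand-written solver. All function names and the toy data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_csr(G, y, lam=0.0):
    """Return (W, b0) minimizing ||y - 1*b0 - G W||^2 + lam * ||W||^2."""
    n, d = len(G), len(G[0])
    col_mean = [sum(row[k] for row in G) / n for k in range(d)]
    y_mean = sum(y) / n
    Gc = [[row[k] - col_mean[k] for k in range(d)] for row in G]  # centered grades
    yc = [v - y_mean for v in y]
    # Normal equations: (Gc^T Gc + lam * I) W = Gc^T yc
    A = [[sum(r[p] * r[q] for r in Gc) + (lam if p == q else 0.0)
          for q in range(d)] for p in range(d)]
    rhs = [sum(r[p] * t for r, t in zip(Gc, yc)) for p in range(d)]
    W = solve(A, rhs)
    b0 = y_mean - sum(w * m for w, m in zip(W, col_mean))
    return W, b0


# Toy data: columns are grades in MAT100 and CS200, y is the grade in CS300.
G = [[0.8, 0.7], [0.6, 0.9], [0.9, 0.5], [0.7, 0.8]]
y = [0.05 + 0.3 * g1 + 0.6 * g2 for g1, g2 in G]
W, b0 = fit_csr(G, y, lam=0.0)
W_reg, b0_reg = fit_csr(G, y, lam=0.1)
print(W, b0)
```

With `lam=0.0` the fit recovers the weights used to generate the toy grades; a positive `lam` shrinks the weights toward zero, which is the overfitting control the regularization parameters are meant to provide.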
2) Student Specific Regression (SSR)
The downside of the CSR (Course Specific Regression) method is that
students sometimes have too much flexibility in selecting subjects, so
they do not all take subjects in the same order.
An alternative method that addresses this problem is Student Specific
Regression (SSR).
We compare our target student (who has taken N courses up to now) with
each other student and check whether they have at least K subjects in
common with the target student (K is always less than N).
If so, we include the other student in our training data; otherwise we
exclude that student from the dataset while training, to improve
accuracy.
We also remove from the other students' data any subjects that the target
student has not taken. Thus we don't consider the weights of subjects
that have not been taken by the target student.
We then apply the same formula for learning the weights of each subject
and predicting the grade.
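The student-selection step of SSR can be sketched as follows. The dictionary layout, the course names, and the function name `ssr_training_rows` are assumptions for illustration; only the filtering rule (at least K common subjects, and only the target student's subjects kept) comes from the description above.

```python
def ssr_training_rows(target, others, k):
    """Keep students sharing at least k courses with the target student.

    target: dict mapping course -> grade for the target student.
    others: dict mapping student id -> dict of course -> grade.
    Returns the kept students, restricted to courses the target has taken.
    """
    target_courses = set(target)
    rows = {}
    for student, grades in others.items():
        common = target_courses & set(grades)
        if len(common) >= k:
            # Drop every course the target student has not taken.
            rows[student] = {course: grades[course] for course in common}
    return rows


target = {"CS100": 0.90, "MAT100": 0.80, "CS200": 0.70}
others = {
    "a": {"CS100": 0.85, "MAT100": 0.75, "ART101": 0.95},
    "b": {"CS100": 0.60, "PHY200": 0.70},
    "c": {"CS100": 0.92, "MAT100": 0.88, "CS200": 0.81, "CS300": 0.79},
}
rows = ssr_training_rows(target, others, k=2)
print(sorted(rows))  # ['a', 'c']
```

Student "b" is dropped because only one subject overlaps with the target student, and the extra subjects of "a" and "c" are removed from their rows.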

Weitere ähnliche Inhalte

Was ist angesagt?

Visibility control in java
Visibility control in javaVisibility control in java
Visibility control in javaTech_MX
 
Online Shopping Agent in AI
Online Shopping Agent in AIOnline Shopping Agent in AI
Online Shopping Agent in AIFazle Rabbi Ador
 
Performance analysis and randamized agoritham
Performance analysis and randamized agorithamPerformance analysis and randamized agoritham
Performance analysis and randamized agorithamlilyMalar1
 
Introduction to oops concepts
Introduction to oops conceptsIntroduction to oops concepts
Introduction to oops conceptsNilesh Dalvi
 
Inheritance in java
Inheritance in javaInheritance in java
Inheritance in javaRahulAnanda1
 
sum of subset problem using Backtracking
sum of subset problem using Backtrackingsum of subset problem using Backtracking
sum of subset problem using BacktrackingAbhishek Singh
 
Symbol table in compiler Design
Symbol table in compiler DesignSymbol table in compiler Design
Symbol table in compiler DesignKuppusamy P
 
I. Mini-Max Algorithm in AI
I. Mini-Max Algorithm in AII. Mini-Max Algorithm in AI
I. Mini-Max Algorithm in AIvikas dhakane
 
Spanning trees & applications
Spanning trees & applicationsSpanning trees & applications
Spanning trees & applicationsTech_MX
 
Grasp patterns and its types
Grasp patterns and its typesGrasp patterns and its types
Grasp patterns and its typesSyed Hassan Ali
 
Mathematical Analysis of Recursive Algorithm.
Mathematical Analysis of Recursive Algorithm.Mathematical Analysis of Recursive Algorithm.
Mathematical Analysis of Recursive Algorithm.mohanrathod18
 
Merge sort analysis and its real time applications
Merge sort analysis and its real time applicationsMerge sort analysis and its real time applications
Merge sort analysis and its real time applicationsyazad dumasia
 
Planning in AI(Partial order planning)
Planning in AI(Partial order planning)Planning in AI(Partial order planning)
Planning in AI(Partial order planning)Vicky Tyagi
 

Was ist angesagt? (20)

Visibility control in java
Visibility control in javaVisibility control in java
Visibility control in java
 
Online Shopping Agent in AI
Online Shopping Agent in AIOnline Shopping Agent in AI
Online Shopping Agent in AI
 
Chapter 05 classes and objects
Chapter 05 classes and objectsChapter 05 classes and objects
Chapter 05 classes and objects
 
Performance analysis and randamized agoritham
Performance analysis and randamized agorithamPerformance analysis and randamized agoritham
Performance analysis and randamized agoritham
 
Disk scheduling
Disk schedulingDisk scheduling
Disk scheduling
 
Introduction to oops concepts
Introduction to oops conceptsIntroduction to oops concepts
Introduction to oops concepts
 
Inheritance in java
Inheritance in javaInheritance in java
Inheritance in java
 
sum of subset problem using Backtracking
sum of subset problem using Backtrackingsum of subset problem using Backtracking
sum of subset problem using Backtracking
 
Symbol table in compiler Design
Symbol table in compiler DesignSymbol table in compiler Design
Symbol table in compiler Design
 
I. Mini-Max Algorithm in AI
I. Mini-Max Algorithm in AII. Mini-Max Algorithm in AI
I. Mini-Max Algorithm in AI
 
Spanning trees & applications
Spanning trees & applicationsSpanning trees & applications
Spanning trees & applications
 
OsI reference model
OsI reference modelOsI reference model
OsI reference model
 
LR(0) PARSER
LR(0) PARSERLR(0) PARSER
LR(0) PARSER
 
Chat Application
Chat ApplicationChat Application
Chat Application
 
Grasp patterns and its types
Grasp patterns and its typesGrasp patterns and its types
Grasp patterns and its types
 
Mathematical Analysis of Recursive Algorithm.
Mathematical Analysis of Recursive Algorithm.Mathematical Analysis of Recursive Algorithm.
Mathematical Analysis of Recursive Algorithm.
 
[OOP - Lec 18] Static Data Member
[OOP - Lec 18] Static Data Member[OOP - Lec 18] Static Data Member
[OOP - Lec 18] Static Data Member
 
Python-Inheritance.pptx
Python-Inheritance.pptxPython-Inheritance.pptx
Python-Inheritance.pptx
 
Merge sort analysis and its real time applications
Merge sort analysis and its real time applicationsMerge sort analysis and its real time applications
Merge sort analysis and its real time applications
 
Planning in AI(Partial order planning)
Planning in AI(Partial order planning)Planning in AI(Partial order planning)
Planning in AI(Partial order planning)
 

Ähnlich wie Course recommender system

An accurate ability evaluation method for every student with small problem it...
An accurate ability evaluation method for every student with small problem it...An accurate ability evaluation method for every student with small problem it...
An accurate ability evaluation method for every student with small problem it...Hideo Hirose
 
Project Allocation Linear Programming Optimisation
Project Allocation Linear Programming OptimisationProject Allocation Linear Programming Optimisation
Project Allocation Linear Programming OptimisationRistanti Ramadanti
 
“TSEWG” Model for Teaching Students How to Solve Exercises with GeoGebra Soft...
“TSEWG” Model for Teaching Students How to Solve Exercises with GeoGebra Soft...“TSEWG” Model for Teaching Students How to Solve Exercises with GeoGebra Soft...
“TSEWG” Model for Teaching Students How to Solve Exercises with GeoGebra Soft...theijes
 
利用模糊歸屬函數
利用模糊歸屬函數利用模糊歸屬函數
利用模糊歸屬函數acksinkwung
 
Item and Distracter Analysis
Item and Distracter AnalysisItem and Distracter Analysis
Item and Distracter AnalysisSue Quirante
 
software engineering powerpoint presentation foe everyone
software engineering powerpoint presentation foe everyonesoftware engineering powerpoint presentation foe everyone
software engineering powerpoint presentation foe everyonerebantaofficial
 
Learning Strategy with Groups on Page Based Students' Profiles
Learning Strategy with Groups on Page Based Students' ProfilesLearning Strategy with Groups on Page Based Students' Profiles
Learning Strategy with Groups on Page Based Students' Profilesaciijournal
 
ch11sped420PP
ch11sped420PPch11sped420PP
ch11sped420PPfiegent
 
Learning strategy with groups on page based students' profiles
Learning strategy with groups on page based students' profilesLearning strategy with groups on page based students' profiles
Learning strategy with groups on page based students' profilesaciijournal
 
Data Clustering in Education for Students
Data Clustering in Education for StudentsData Clustering in Education for Students
Data Clustering in Education for StudentsIRJET Journal
 
MSWord
MSWordMSWord
MSWordbutest
 
Math grade 2 training
Math grade 2 trainingMath grade 2 training
Math grade 2 trainingAnthony Smith
 
COP2800 - Java Programming Course Project Fall 2014 Page .docx
COP2800 - Java Programming  Course Project Fall 2014 Page .docxCOP2800 - Java Programming  Course Project Fall 2014 Page .docx
COP2800 - Java Programming Course Project Fall 2014 Page .docxmaxinesmith73660
 

Ähnlich wie Course recommender system (20)

C0364010013
C0364010013C0364010013
C0364010013
 
An accurate ability evaluation method for every student with small problem it...
An accurate ability evaluation method for every student with small problem it...An accurate ability evaluation method for every student with small problem it...
An accurate ability evaluation method for every student with small problem it...
 
Project Allocation Linear Programming Optimisation
Project Allocation Linear Programming OptimisationProject Allocation Linear Programming Optimisation
Project Allocation Linear Programming Optimisation
 
“TSEWG” Model for Teaching Students How to Solve Exercises with GeoGebra Soft...
“TSEWG” Model for Teaching Students How to Solve Exercises with GeoGebra Soft...“TSEWG” Model for Teaching Students How to Solve Exercises with GeoGebra Soft...
“TSEWG” Model for Teaching Students How to Solve Exercises with GeoGebra Soft...
 
利用模糊歸屬函數
利用模糊歸屬函數利用模糊歸屬函數
利用模糊歸屬函數
 
Kaggle KDD Cup Report
Kaggle KDD Cup ReportKaggle KDD Cup Report
Kaggle KDD Cup Report
 
Csrde discriminant analysis final
Csrde discriminant analysis finalCsrde discriminant analysis final
Csrde discriminant analysis final
 
Item and Distracter Analysis
Item and Distracter AnalysisItem and Distracter Analysis
Item and Distracter Analysis
 
software engineering powerpoint presentation foe everyone
software engineering powerpoint presentation foe everyonesoftware engineering powerpoint presentation foe everyone
software engineering powerpoint presentation foe everyone
 
Himani
HimaniHimani
Himani
 
Learning Strategy with Groups on Page Based Students' Profiles
Learning Strategy with Groups on Page Based Students' ProfilesLearning Strategy with Groups on Page Based Students' Profiles
Learning Strategy with Groups on Page Based Students' Profiles
 
ch11sped420PP
ch11sped420PPch11sped420PP
ch11sped420PP
 
Learning strategy with groups on page based students' profiles
Learning strategy with groups on page based students' profilesLearning strategy with groups on page based students' profiles
Learning strategy with groups on page based students' profiles
 
2014 11-13
2014 11-132014 11-13
2014 11-13
 
Data Clustering in Education for Students
Data Clustering in Education for StudentsData Clustering in Education for Students
Data Clustering in Education for Students
 
E1802023741
E1802023741E1802023741
E1802023741
 
Student Performance Data Mining Project Report
Student Performance Data Mining Project ReportStudent Performance Data Mining Project Report
Student Performance Data Mining Project Report
 
MSWord
MSWordMSWord
MSWord
 
Math grade 2 training
Math grade 2 trainingMath grade 2 training
Math grade 2 training
 
COP2800 - Java Programming Course Project Fall 2014 Page .docx
COP2800 - Java Programming  Course Project Fall 2014 Page .docxCOP2800 - Java Programming  Course Project Fall 2014 Page .docx
COP2800 - Java Programming Course Project Fall 2014 Page .docx
 

Mehr von Aakash Chotrani

Efficient Backpropagation
Efficient BackpropagationEfficient Backpropagation
Efficient BackpropagationAakash Chotrani
 
What is goap, and why is it not already mainstream
What is goap, and why is it not already mainstreamWhat is goap, and why is it not already mainstream
What is goap, and why is it not already mainstreamAakash Chotrani
 
Deep q learning with lunar lander
Deep q learning with lunar landerDeep q learning with lunar lander
Deep q learning with lunar landerAakash Chotrani
 
Artificial Intelligence in games
Artificial Intelligence in gamesArtificial Intelligence in games
Artificial Intelligence in gamesAakash Chotrani
 
Supervised Unsupervised and Reinforcement Learning
Supervised Unsupervised and Reinforcement Learning Supervised Unsupervised and Reinforcement Learning
Supervised Unsupervised and Reinforcement Learning Aakash Chotrani
 

Mehr von Aakash Chotrani (7)

Efficient Backpropagation
Efficient BackpropagationEfficient Backpropagation
Efficient Backpropagation
 
What is goap, and why is it not already mainstream
What is goap, and why is it not already mainstreamWhat is goap, and why is it not already mainstream
What is goap, and why is it not already mainstream
 
Deep q learning with lunar lander
Deep q learning with lunar landerDeep q learning with lunar lander
Deep q learning with lunar lander
 
Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement Learning
 
Artificial Intelligence in games
Artificial Intelligence in gamesArtificial Intelligence in games
Artificial Intelligence in games
 
Simple & Fast Fluids
Simple & Fast FluidsSimple & Fast Fluids
Simple & Fast Fluids
 
Supervised Unsupervised and Reinforcement Learning
Supervised Unsupervised and Reinforcement Learning Supervised Unsupervised and Reinforcement Learning
Supervised Unsupervised and Reinforcement Learning
 

Kürzlich hochgeladen

VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSAishani27
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 

Kürzlich hochgeladen (20)

VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
Ukraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICSUkraine War presentation: KNOW THE BASICS
Ukraine War presentation: KNOW THE BASICS
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 

Course recommender system

  • 2. The Problem The aim of this project is to recommend the courses based on the student liking. The recommendations will be generated based on the previous courses which student took, trying to find similarities between the courses and the predicted future performance of the student. Students face constant dilemma when they want to select a subject out of elective subjects. If we can determine the performance of student on each and every elective subject by predicting the grade we can sort the results and provide the student best recommendation out of all the elective subjects.
  • 3. Approach To The Problem Dataset​: We have been given 5 CSV files which contains information about the students and the courses offered by DigiPen. All the data is anonymous: 1)​StudGPAInfo : Contains info about 937 students and their cumulative GPA, last attended.(Students who passed DigiPen. The student id’s are not same as the other file which makes it not useful for the application). 2)​StudGradesInfo : Contains info about 4102 students and mid/final grade in each subject. 3)​StudChangeMajorInfo : Contains info about 448 students who changed their major. 4)​ProgramCourses : Contains info about all the courses offered since 2011 to 2017 (Credits,Core,Semester). There are 478 unique courses that are offered by DigiPen. 5)​CourseRequisites : All the courses which have pre requisites to current course. Currently the program only uses StudGradesInfo and ProgramCourses file to make predictions. The first step is selecting the relevant features and setting up the dataset to train our algorithm. We use python to extract features and create a sparse matrix of students and the subjects that they took.
  • 4. Student 1 Student 2 ………….. Student 4102 Subject 1 .98 -1 …... .88 Subject 2 -1 .77 …... -1 …………. …………. …………. Subject 478 .52 .73 …... -1 Each number in the matrix represents the grade that student achieved at the end of the semester in a particular subject. 0 represents student achieved 0% and 1 represents 100%. If the student didn’t take the subject it’s represented as -1. Hence we have a sparse matrix of dimension 478x4102. We export this file as a text file to train our algorithm. Creating Training Set and Test Set: In conventional data splitting stage we divide the dataset into 70/30 split by choosing first 70% values as training set and testing remaining 30% values on test set. We can’t apply the same method in this case since we have to train on each subject and train on each student. Thus, we use the method of masking the data. For example in the above table we will select 30% of cells randomly with GPA values on them and dump it into test set. We will treat it as -1 as if the student hasn’t taken that course and try to predict the grade.
  • 5.
                   Student 1   Student 2   ...   Student 4102
    Subject 1      ? (-1)      -1          ...   .88
    Subject 2      -1          .77         ...   -1
    ...
    Subject 478    .52         ? (-1)      ...   -1

    After training our algorithm we try to predict the ? values. If the predicted value for Student 1 in Subject 1 was .90, the absolute error is .08, which is 8.16% of the actual grade (.08/.98). If we can predict grades with an average error of less than 10%, we can determine how a student will perform on future elective subjects based on past performance.
    Learning: The grade that a student receives depends on:
    ● Difficulty of the subject.
    ● Performance of the student.
    Every subject has requirements that must be fulfilled in order to pass it, such as homework, assignments, midterm, final exam, etc. A student's grade depends on his/her performance on each requirement. Thus we consider the subject and the student each to be a vector of 7 traits.
    Subject: [x1, x2, x3, x4, x5, x6, x7]
    Student: [y1, y2, y3, y4, y5, y6, y7]
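The error figure quoted for the masked cell (predicted .90 against an actual .98) can be reproduced with a short sketch. The helper names are illustrative, and the relative error assumes the actual grade is greater than zero.

```python
def relative_error(predicted, actual):
    """Absolute error as a fraction of the actual grade (actual must be > 0)."""
    return abs(predicted - actual) / actual

def mean_relative_error(predictions, actuals):
    """Average relative error over held-out cells.

    Both arguments map a (subject, student) cell to a grade.
    """
    errors = [relative_error(predictions[cell], actuals[cell]) for cell in actuals]
    return sum(errors) / len(errors)

# The example from the text: Student 1, Subject 1.
err = relative_error(0.90, 0.98)
print(f"{err:.2%}")  # prints 8.16%
```

Averaging this value over all masked cells gives the figure that the report compares against its 10% target.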
  • 6. We are trying to learn the individual weights of each vector. Each element of the subject vector is the weight of one requirement, and the matching element of the student vector is the student's performance on that requirement. The final grade is the dot product of the two vectors.
    Grade = x1*y1 + x2*y2 + x3*y3 + x4*y4 + x5*y5 + x6*y6 + x7*y7
    (Note: We chose 7 elements per vector after testing vector sizes from 2 to 10; the 7-element vectors gave the best accuracy. We will experiment with other sizes in future to see whether they affect the accuracy.)
    The application uses a Q-Learning algorithm to adjust the weights, using the error term and the actual score the student received in that subject.
    Pseudocode for our algorithm (LearningSubjectVector):
      For each course in all courses
        For each weight in the course vector
          For each student in all students
            If student has taken the course
              Prediction = Calculate dot product from both the vectors
              Error = Prediction - Actual
              TotalError += Error
          discountFactor = discountFactor * Stud_Actual_Score
          newWeight = Stud_Actual_Score + learning_rate * (TotalError - discountFactor)
    This process is repeated 2000 times to learn the weights of subjects and students. The same algorithm is used for learning the student vector. After every 100 iterations the weights are saved into two separate text files. These weights can be loaded back into the application when making a new prediction.
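The prediction step inside the loop above is a plain dot product of the two 7-element vectors. A minimal sketch, with made-up trait values chosen only for illustration:

```python
N_TRAITS = 7  # vector size the report found to work best

def predict_grade(subject_vec, student_vec):
    """Predicted grade = dot product of subject weights and student traits."""
    if len(subject_vec) != len(student_vec):
        raise ValueError("subject and student vectors must have the same length")
    return sum(x * y for x, y in zip(subject_vec, student_vec))

# Hypothetical values: requirement weights that sum to 1, and a student
# who performs at 0.9 on every requirement.
subject = [0.2, 0.1, 0.1, 0.2, 0.1, 0.2, 0.1]
student = [0.9] * N_TRAITS
grade = predict_grade(subject, student)
```

In this toy case the prediction comes out at about 0.9, because the weights sum to 1 and the student scores 0.9 on each requirement; learned vectors need not have that property.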
  • 7. Performance
    The algorithm starts with 59% error and reaches 27-28% error after 2000 iterations.
    Next steps for improvement:
    ● When predicting a student's grade in a CS class, base the prediction only on CS and MAT classes instead of considering all classes. This is expected to improve the accuracy of the predictions.
    ● Consider batch training instead of training on all the data at once, which could improve the predictions.
    ● Experiment with different learning rates and discount factor values.
    ● Change the main algorithm. Instead of using Q-Learning to reach the optimal values, I would implement different variations of gradient descent (vanilla, SGD, Adam) and compare the results.
    If the error drops below 10%, I will store the weights and try to predict the best elective recommendation for a particular student after the first year of performance data.
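As an illustration of the gradient descent direction mentioned above, here is a minimal sketch of stochastic gradient descent on the squared prediction error of the dot-product model. This is not the update rule used in the application; the function names, initial value range and hyperparameters are assumptions made for the example.

```python
import random

MISSING = -1.0  # marker for "student did not take this subject"

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def train_sgd(matrix, n_traits=7, learning_rate=0.05, epochs=500, seed=0):
    """Learn subject and student vectors by SGD on squared error."""
    rng = random.Random(seed)
    n_subj, n_stud = len(matrix), len(matrix[0])
    subj = [[rng.uniform(0.1, 0.5) for _ in range(n_traits)] for _ in range(n_subj)]
    stud = [[rng.uniform(0.1, 0.5) for _ in range(n_traits)] for _ in range(n_stud)]
    known = [(i, j) for i in range(n_subj) for j in range(n_stud)
             if matrix[i][j] != MISSING]
    for _ in range(epochs):
        rng.shuffle(known)  # visit the known cells in random order
        for i, j in known:
            err = dot(subj[i], stud[j]) - matrix[i][j]
            for k in range(n_traits):
                x, y = subj[i][k], stud[j][k]  # use pre-update values
                subj[i][k] -= learning_rate * err * y
                stud[j][k] -= learning_rate * err * x
    return subj, stud

def mean_abs_error(matrix, subj, stud):
    """Mean absolute error over the cells that hold a grade."""
    cells = [(i, j) for i, row in enumerate(matrix)
             for j, g in enumerate(row) if g != MISSING]
    return sum(abs(dot(subj[i], stud[j]) - matrix[i][j]) for i, j in cells) / len(cells)

# Made-up grades: 3 subjects x 4 students.
grades = [
    [0.98, MISSING, 0.88, 0.70],
    [MISSING, 0.77, MISSING, 0.60],
    [0.52, 0.73, 0.65, MISSING],
]
before = mean_abs_error(grades, *train_sgd(grades, epochs=0))
after = mean_abs_error(grades, *train_sgd(grades, epochs=500))
```

The sketch measures error on the training cells only; a real comparison with the 27-28% figure would evaluate on the masked test cells and would likely need regularization.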
  • 8. Alternate Methods
    1) Course Specific Regression (CSR)
    Undergraduate programs are structured so that the courses a student takes early on prepare the student for later classes. This method assumes that a student's performance in previous classes directly affects performance in a future class. Example: the grade in CS300 depends on the student's performance in MAT100 and CS200. Hence, if we can estimate the weight (contribution) of each earlier class and multiply it by the student's grade in that class, we can calculate the final grade.
    Future_Grade_For_Class_CS300 = Grade_In_ClassMAT100 * Contribution_MAT100 + Grade_In_ClassCS200 * Contribution_CS200
    Suppose:
    Grade_In_ClassMAT100 = 80%
    Grade_In_ClassCS200 = 70%
    Contribution_MAT100 = 30%
    Contribution_CS200 = 70%
    Future_Grade_For_Class_CS300 = (0.8)(0.3) + (0.7)(0.7) = 0.24 + 0.49
    Future_Grade_For_Class_CS300 = 0.73
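The worked CSR example above can be checked with a few lines of Python. The function name `csr_predict` is illustrative; the grades and contributions are the ones supposed in the text.

```python
def csr_predict(prior_grades, contributions):
    """Weighted sum of earlier grades, one weight per prerequisite class."""
    return sum(prior_grades[course] * weight
               for course, weight in contributions.items())

prior_grades = {"MAT100": 0.80, "CS200": 0.70}
contributions = {"MAT100": 0.30, "CS200": 0.70}

predicted = csr_predict(prior_grades, contributions)
print(round(predicted, 2))  # prints 0.73
```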
  • 9. Thus we get the formula
    Future_Grade = [Grades_In_Previous_Class]^T * [Weights_Each_Class]
    How do we get the matrix Weights_Each_Class? We create a matrix that includes only the rows of students who have actually taken that class previously. For each of them we include the grade of every subject taken before attempting the current target course.
    \hat{Y}: Actual grade received in the subject
    \mathbf{1}: Vector of 1s
    b_0: Bias value
    W: Vector of weights (we are trying to learn)
    G: Matrix of grades of other students similar to the target student
    \lambda_1, \lambda_2: Regularization parameters to control overfitting
    The weights are learned by minimizing:
    \lVert \hat{Y} - \mathbf{1} b_0 - G W \rVert^2 + \lambda_1 \lVert W \rVert^2 + \lambda_2 \lVert W \rVert^2
    2) Student Specific Regression (SSR)
    The downside of the CSR (Course Specific Regression) method is that students sometimes have too much flexibility in selecting subjects, so they do not take subjects in a consistent order. An alternative method addresses this problem. It's called Student Specific Regression (SSR).
  • 10. We compare our target student (who has taken N courses up to now) with each other student and check whether they share at least K subjects with the target student (the K common subjects are a subset of the N, and always K < N). If yes, we include the other student in our training data; otherwise we exclude that student from the dataset while training, to improve accuracy. We also remove from the other students' data any subjects that the target student has not taken. Thus we don't consider the weights of subjects that have not been taken by the target student. We then apply the same formula to learn the weights of each subject and predict the grade.
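The SSR selection step described above can be sketched as a filter over the other students' records. The data layout (a dict of course to grade per student) and the function name are assumptions made for illustration.

```python
def select_similar_students(target_courses, others, k):
    """Keep students sharing at least k courses with the target student.

    target_courses: set of course ids taken by the target student.
    others: dict mapping student id -> {course id: grade}.
    Returns the kept students with courses outside target_courses removed.
    """
    selected = {}
    for student_id, grades in others.items():
        common = {c: g for c, g in grades.items() if c in target_courses}
        if len(common) >= k:
            selected[student_id] = common
    return selected

# Made-up records: the target has taken N = 3 courses, and we require K = 2.
target = {"CS100", "MAT100", "CS200"}
others = {
    "a": {"CS100": 0.9, "MAT100": 0.8, "ART101": 0.7},  # 2 in common: kept
    "b": {"CS100": 0.6, "ART101": 0.9},                  # 1 in common: dropped
}
training = select_similar_students(target, others, k=2)
```

The filtered records would then be used to build the grade matrix G for the same regression formula as in CSR.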