This is a UNIBZ research project devoted to personalized rankings and recommendations in the educational domain. Respondents are asked to rate a few universities, which are offered based on the geographic region they select. Based on these ratings, the system shows three lists of universities the respondent might be interested in.
This Master's thesis was written by Anna Lambrix under the supervision of Nabil El Ioini and Mehdi Elahi.
1. Personal Rankings of Educational Institutions
Master's Thesis Defense
Free University of Bozen-Bolzano
Author: Anna Alexander Lambrix
Supervisors: Nabil El Ioini & Mehdi Elahi
March 22, 2019
4. 1.1 Introduction >> Background & Problem
Background
» Growing variety of choices & information overload
» Recommender systems & personalized suggestions in various domains
» Recommenders in education: predicting college admissions and helping with course selection
Problem
» Tens of thousands of universities worldwide: how to choose one?
» Non-personalized universal ranking lists
» Education domain: problematic challenges that could be tackled by recommender systems
» Complexity of human decision making (preferences & personality interplay)
5. 1.2 Introduction >> Objectives
Objectives
To develop a system that would:
» Provide personalized ranking lists of the universities.
» Compare the results of different algorithms.
» Collect data in order to investigate the users’ decision-making process.
» Help uncover potential relations between personality and preferred university features and algorithms.
» Pass the usability test with a score above the accepted benchmark.
7. 2. Research Questions
1. Which recommender algorithm can be adopted, based on the preferences of users, in order to generate personalized university rankings?
2. Do recommender algorithm preferences depend on personality types?
3. What are the most important features that different users consider when deciding which university to study at?
4. Will the system for generating personalized university rankings be usable according to the users’ assessment?
9. 3.1 Methodology >> Design + Implementation
» Web-based application
» Customized WordPress install + LAMP stack
» Surveys to collect data
» RESTful APIs/JSON data
» Recommender: 3 collaborative filtering (CF) algorithms
  » Singular Value Decomposition
  » k-Nearest Neighbor Basic
  » k-Nearest Neighbor User Baseline
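To make the neighbor-based idea behind the two k-Nearest Neighbor recommenders concrete, here is a minimal sketch of basic user-based prediction. It is an illustration only, not the thesis implementation: the toy ratings, the cosine similarity measure, and the function names are all assumptions.

```python
from math import sqrt

# Hypothetical toy data: user -> {university: rating on a 1-5 scale}.
ratings = {
    "ann":  {"U1": 5, "U2": 4, "U3": 1},
    "bob":  {"U1": 4, "U2": 5, "U4": 4},
    "cara": {"U1": 1, "U3": 5, "U4": 2},
}

def cosine_sim(a, b):
    """Cosine similarity computed over the universities both users rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)

def predict(user, item, k=2):
    """Similarity-weighted mean of the k most similar users' ratings of item."""
    neighbours = [
        (cosine_sim(ratings[user], r), r[item])
        for other, r in ratings.items()
        if other != user and item in r
    ]
    top = sorted(neighbours, reverse=True)[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None  # no usable neighbour rated this item
    return sum(sim * rating for sim, rating in top) / total

# Predicted rating of university U4 for user "ann".
print(predict("ann", "U4"))
```

In this toy data "ann" agrees closely with "bob", so the prediction for U4 lands nearer to bob's rating than to cara's. A baseline variant would additionally subtract per-user and per-item rating offsets before weighting.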
10. 3.2 Methodology >> User Flow
● Registration step
● Age, gender, country of origin, education information
● Personality survey: Five Factor Model
● Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism
11. 3.2 Methodology >> User Flow
● Select features (at least 3 out of 11)
1. Low or free tuition
2. Prestigious brand
3. High-quality teaching
4. International diversity
5. High graduate employment rate
6. Family members have gone to that university
7. Size of the university
8. Research or internship opportunities
9. Party environment or extracurricular activities
10. Cost of food and rent in the area
11. Access to sport facilities and sport clubs
12. 3.2 Methodology >> User Flow
● Select country
● Rate universities (at least 3 out of 10 or more)
13. 3.2 Methodology >> User Flow
● Usability survey (SUS: 10-item questionnaire based on a 5-point Likert scale)
● Results (3 lists)
● Evaluation survey
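The usability step relies on the System Usability Scale (SUS). As a sketch of the standard SUS scoring rule (odd-numbered items contribute response minus 1, even-numbered items contribute 5 minus response, and the sum is multiplied by 2.5), the following function turns ten Likert responses into a 0-100 score. The function name is illustrative, and the deck does not show how the project itself computed the scores.

```python
def sus_score(responses):
    """Compute a SUS score (0-100) from ten responses on a 1-5 Likert scale.

    Odd-numbered items are positively worded, even-numbered items are
    negatively worded, following the standard SUS questionnaire.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 responses")
    total = 0
    for idx, response in enumerate(responses, start=1):
        if not 1 <= response <= 5:
            raise ValueError("responses must be between 1 and 5")
        # Odd items: higher is better; even items: lower is better.
        total += (response - 1) if idx % 2 == 1 else (5 - response)
    return total * 2.5

# A respondent who answers neutrally (3) on every item scores 50.
print(sus_score([3] * 10))
```

Because every item contributes 0 to 4 points, scores move in steps of 2.5, which matches values such as 42.5 and 100.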
15. 4.1 Experiment >> Users Count
Step: users (share of registered users)
● Registration: 67 (100%)
● Personality: 65 (96%)
● Features: 60 (88%)
● Country: 60 (88%)
● Rating: 58 (87%)
● Evaluate: 49 (72%)
● Usability: 46 (68%)
● Online evaluation, with real users
● ~ 1 month run
● Small dataset available from another study (universities, users & ratings)
● 67 new users attempted our experiment, 46 completed all the steps
● Data collected was analyzed in order to find possible patterns
16. 4.2 Experiment >> Users: Age, Gender, Education, Origin
Age
● Under 18 (2%)
● 18-24 (30%)
● 25-34 (37%)
● 35-44 (22%)
● 44-54 (6%)
● Over 55 (3%)
Gender
● Males (75%)
● Females (21%)
● Refused to disclose (4%)
Education
● High School (5%)
● Professional Degree (2%)
● Bachelor’s Degree (73%)
● Master’s Degree (16%)
● Doctorate Degree (5%)
Origin
● Italy (30%)
● Russia (10%)
● Germany (7%)
● India (6%)
● + Various other countries
18. 5.1 Results >> Algorithm Comparison
» 14-question evaluation survey
» 5 metrics: Accuracy, Diversity, Understand Me, Satisfaction, Novelty
» 3 lists presented for evaluation: SVD, KNN1, KNN2
» 58 participants provided ratings, 49 completed the evaluation survey
» 19 ratings per participant on average with a median of 6
SVD = Singular Value Decomposition, KNN1 = k-Nearest Neighbor Basic, KNN2 = k-Nearest Neighbor User Baseline
19. 5.1 Results >> Algorithm Comparison
Metric | Question | SVD | KNN1 | KNN2
Accuracy | 1. Which list has more selections that you find appealing? | 47% | 35% | 18%
Accuracy | 2. Which list has more obviously bad suggestions for you? | 22% | 24% | 53%
Diversity | 3. Which list has more universities that are similar to each other? | 45% | 27% | 29%
Diversity | 4. Which list has a more varied selection of universities? | 24% | 31% | 45%
Diversity | 5. Which list has universities that match a wider variety of preferences? | 29% | 41% | 31%
20. 5.1 Results >> Algorithm Comparison
Metric | Question | SVD | KNN1 | KNN2
Understand Me | 6. Which list better reflects your preferences in universities? | 53% | 29% | 18%
Understand Me | 7. Which list seems more personalized to your university ratings? | 49% | 37% | 18%
Understand Me | 8. Which list represents more mainstream ratings instead of your own? | 41% | 33% | 27%
Satisfaction | 9. Which list would better help you find universities to consider? | 47% | 35% | 18%
Satisfaction | 10. Which list would you be more likely to recommend to your friends? | 49% | 33% | 18%
21. 5.1 Results >> Algorithm Comparison
Metric | Question | SVD | KNN1 | KNN2
Novelty | 11. Which list has more universities you did not expect? | 18% | 16% | 65%
Novelty | 12. Which list has more universities that are familiar to you? | 45% | 39% | 16%
Novelty | 13. Which list has more pleasantly surprising universities? | 33% | 35% | 33%
Novelty | 14. Which list provides fewer new suggestions? | 41% | 35% | 24%
22. 5.1 Results >> Algorithm Comparison
Metric | Algorithm
Accuracy | SVD
Diversity | KNN1
Understand Me | SVD
Satisfaction | SVD
Novelty | KNN1

Comparison condition | One-tailed paired t-test, p-value
SVD vs KNN1 | 0.348
SVD vs KNN2 | 0.091
KNN1 vs KNN2 | 0.057
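For readers unfamiliar with the test named above, the following is a minimal sketch of a one-tailed paired t-test using only the Python standard library. The sample data are made up, the function names are illustrative, and the deck does not state which paired measurements the thesis actually compared, so this shows the technique rather than reproducing the reported p-values.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def paired_t_one_tailed(a, b, steps=10000):
    """Return (t, p) for the alternative hypothesis mean(a - b) > 0.

    The p-value is obtained by integrating the t density from 0 to |t|
    with Simpson's rule (steps must be even). Suitable for small samples
    only, since math.gamma overflows for very large df.
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    df = n - 1
    h = abs(t) / steps
    area = t_pdf(0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    p = 0.5 - area if t > 0 else 0.5 + area
    return t, p

# Hypothetical paired scores for two algorithms from the same five raters.
scores_a = [5, 6, 7, 8, 9]
scores_b = [4, 6, 5, 7, 7]
t_stat, p_value = paired_t_one_tailed(scores_a, scores_b)
print(t_stat, p_value)
```

A p-value below the chosen significance level (commonly 0.05) would indicate that the first algorithm scores significantly higher than the second; none of the three comparisons in the table reaches that level.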
23. RQ1: Which recommender algorithm can be adopted, based on the preferences of users, in order to generate personalized university rankings?
● SVD: better in terms of Accuracy, Understand Me, Satisfaction
● SVD: many mainstream suggestions
● KNN1: better in terms of Novelty & Diversity
● KNN2: judged to underperform by the majority of users across most metric categories
● T-tests inconclusive: no significant difference between any pair of algorithms (all p-values above 0.05)
24. 5.2 Results >> Personality + Algorithm Preference
» 5 personality traits: Openness, Conscientiousness, Extroversion, Agreeableness,
Neuroticism
» A/B group split (25%-50%-25%)
» A = answers higher than the median (e.g. “Strongly Agree”)
» B = lower than the median (e.g. “Strongly Disagree”)
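One way to read the 25%-50%-25% split is that group A holds the top quarter of respondents on a trait, group B the bottom quarter, and the middle half is left out. The sketch below implements that reading with quartile cut-offs. The interpretation, the function name, and the toy scores are assumptions, since the deck does not spell out the exact procedure.

```python
from statistics import quantiles

def split_ab(scores):
    """Split respondents into polarized groups by trait score.

    scores: dict mapping respondent id -> trait score.
    Returns (group_a, group_b): respondents above the upper quartile
    and respondents below the lower quartile.
    """
    q1, _, q3 = quantiles(scores.values(), n=4)
    group_a = [r for r, s in scores.items() if s > q3]
    group_b = [r for r, s in scores.items() if s < q1]
    return group_a, group_b

# Hypothetical Openness scores for eight respondents.
openness = {f"r{i}": i for i in range(1, 9)}
group_a, group_b = split_ab(openness)
print(group_a, group_b)
```

With many tied scores, as is typical for Likert answers, the two groups can end up smaller than a quarter each, so the cut-offs may need adjusting on real data.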
25. 5.2 Results >> Personality + Algorithm Preferences
[Table: algorithm preferred by group A and group B of each personality trait (Openness, Conscientiousness, Extroversion, Agreeableness, Neuroticism) under each metric (Accuracy, Diversity, Understand-me, Satisfaction, Novelty)]
26. 5.2 Results >> Personality + Algorithm Preference
Metric | Outperforming algorithm (based on the feedback from the polarized portion of respondents)
Accuracy | SVD/KNN1
Diversity | KNN2
Understand Me | SVD
Satisfaction | KNN1
Novelty | KNN1
27. RQ2: Do recommender algorithm preferences depend on personality types?
● People with different personality types may tend to prefer different recommendation algorithms
● SVD was considered better in the Understand Me metric and equally good in Accuracy
● KNN1 was considered better in Satisfaction and Novelty, and equally good in Accuracy
● KNN2 was considered better in Diversity and worst in the Accuracy and Understand Me metrics
28. 5.3 Results >> Feature Preferences
Overall (60)
1. High-quality teaching (80%)
2. Low/free tuition (55%)
3. Research/internship opportunities (53%)
4. High graduate employment rate (40%)
5. International diversity (40%)
6. Prestigious brand (33%)
7. Cost of food & rent in the area (28%)
8. Party environment/extracurricular activities (27%)
9. Access to sport facilities & sport clubs (23%)
10. Size of the university (22%)
11. Family members have gone to that university (2%)
Males (43)
1. High-quality teaching (81%)
2. Research/internship opportunities (51%)
3. Low/free tuition (49%)
4. High graduate employment rate (37%)
5. International diversity (35%), Prestigious brand (35%)
6. Cost of food & rent in the area (28%)
7. Party environment/extracurricular activities (25%)
8. Size of the university (23%)
9. Access to sport facilities & sport clubs (23%)
10. Family members have gone to that university (2%)
Females (14)
1. High-quality teaching (86%)
2. Low/free tuition (71%)
3. Research/internship opportunities (64%)
4. High graduate employment rate (57%), International diversity (57%)
5. Access to sport facilities & sport clubs (50%)
6. Party environment/extracurricular activities (36%)
7. Prestigious brand (29%), Cost of food & rent in the area (29%)
8. Size of the university (7%)
29. 5.3 Results >> Feature Preferences
» 5 personality traits: Openness, Conscientiousness, Extroversion, Agreeableness,
Neuroticism
» A/B group split (25%-50%-25%)
» A = answers higher than the median (e.g. “Strongly Agree”)
» B = lower than the median (e.g. “Strongly Disagree”)
31. RQ3: What are the most important features that different users consider when deciding which university to study at?
● Majority: High-quality teaching, Low or free tuition, Research or internship opportunities
● Different concerns exhibited by the polarized groups of respondents, with certain standard features in common
● Female respondents tend to select more features than male respondents
32. 5.4 Results >> Usability
SUS score interpretation
Score | Grade | Rating
>80.3 | A | Excellent
68 - 80.3 | B | Good
68 | C | Okay
51 - 68 | D | Poor
<51 | F | Awful
● 46 participants provided the data
● Average score = 72.9
● Median score = 75
● Lowest = 42.5
● Highest = 100
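The interpretation table can be applied mechanically to the reported scores. The sketch below maps a SUS score to its grade and rating. How the boundary values are handled (exactly 68 as "C", exactly 80.3 as "B", exactly 51 as "D") is an assumption, because the table leaves the band edges ambiguous, and the function name is illustrative.

```python
def sus_grade(score):
    """Map a SUS score (0-100) to the (grade, rating) bands of the table."""
    if score > 80.3:
        return "A", "Excellent"
    if score > 68:
        return "B", "Good"
    if score == 68:
        return "C", "Okay"
    if score >= 51:
        return "D", "Poor"
    return "F", "Awful"

# Average, median, lowest, and highest scores reported for the 46 participants.
for value in (72.9, 75, 42.5, 100):
    print(value, sus_grade(value))
```

Under these assumptions both the average (72.9) and the median (75) fall in the "B / Good" band, above the benchmark of 68.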
33. RQ4: Will the system for generating personalized university rankings be usable according to the users’ assessment?
● The system passed the usability test
● The system scored higher than the well-accepted benchmark
35. 6. Conclusion >> Future Work
» Extend the dataset with more preferences from users -> improve recommendations
» Develop the existing algorithms in order to exploit the content of the items (hybrid approach)
» Incorporate personality information in the prediction model
» Improve user-system interaction model (pre-filtering of the universities/more information)
» Rework the rating scale from 5 points to a larger range
» Include additional features