VII Jornadas eMadrid "Education in exponential times". Erkan Er: "Predicting Peer-Review Participation at Large Scale Using an Ensemble Learning Method". 04/07/2017.
1. Predicting Peer-Review Participation at Large Scale Using an Ensemble Learning Method
Erkan Er | Eduardo Gómez-Sánchez | Miguel Luis Bote-Lorenzo
Yannis Dimitriadis | Juan Ignacio Asensio-Pérez
Project ID: VA082U16, Funded by Junta de Castilla y León.
GSIC-EMIC Research Group, Universidad de Valladolid
2. Peer Reviews in MOOCs
• Learning benefits for both parties [1, 2]:
  • Feedback to improve!
  • Higher-order thinking!
• Impossible to assess thousands of student artefacts.
• Students receive constructive feedback,
• BUT ALSO get graded to continue.
3. Peer Reviews in MOOCs
• Problems with peer reviews at large scale:
  • LOW PARTICIPATION, driven by the diversity among MOOC learners and the lack of instructor facilitation,
  • leading to UNGRADED SUBMISSIONS, MISSED LEARNING BENEFITS, and DROPOUTS.
4. Proposed Solution and Motivation
• Predicting student engagement in peer reviews: the number of peer submissions that students will review.
6. Proposed Solution and Motivation
• Effective peer-review sessions:
  • Peer reviews based on the expected level of engagement,
  • Allotting adaptive time periods,
  • Providing incentives to motivate students.
• Effective collaborative learning activities:
  • Inter-homogeneous groups in terms of having some members with desire to review teammates’ work.
7. Our preceding work
• Predicting Student Participation in Peer Reviews in MOOCs
5th European MOOCs Stakeholders Summit,
Madrid, Spain, May 2017.
8. Our preceding work
• Limitations:
  • The model was built with a large feature set, particular to the context, which overfit the small data set.
  • A large part of the error was accounted for by students who submitted at assignment time but did not participate at peer-review time.
9. Current Work
• Reduced yet predictive feature set
• Helps minimize the probability of overfit,
• Enhances transferability within the same MOOC and across MOOCs.
10. Current Work
• Students with ZERO participation in peer reviews.
• Plain approach: ALL DATA → REGRESSION MODEL → PREDICTIONS → PERFORMANCE.
• Ensemble approach: ALL DATA → CLASSIFICATION MODEL → STUDENTS WITH ZERO PEER REVIEW vs. STUDENTS WITH PEER REVIEWS; only the latter feed the REGRESSION MODEL, and both outputs are combined to assess PERFORMANCE.
11. Context
• Canvas Network course (from dataverse)
• No contextual information regarding the courses,
• we do not know their learning design,
• we do not know the content/purpose of the activities.
• Therefore, we make some inferences based on the log data at
hand.
14. Method
• 10-fold cross-validation,
• LASSO as the regression method,
• Logistic regression as the classifier,
• Scikit-learn implementation of the predictors.
• Mean absolute error (MAE) as the performance metric
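As a minimal sketch of this setup, the following uses scikit-learn's LASSO with 10-fold cross-validation scored by MAE. The synthetic data and the regularization strength are illustrative assumptions, not the study's actual features or configuration.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold, cross_val_predict

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                                      # synthetic student features
y = np.clip(X[:, 0] + rng.normal(scale=0.5, size=200) + 2, 0, 4)   # reviews performed (0-4)

# LASSO regression evaluated with 10-fold cross-validation, scored by MAE
model = Lasso(alpha=0.1)
cv = KFold(n_splits=10, shuffle=True, random_state=0)
y_pred = cross_val_predict(model, X, y, cv=cv)
mae = mean_absolute_error(y, y_pred)
```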
16. Discussion
• Better prediction performance with the ensemble approach:
  • A classification phase before the regression model.
• Standard deviation of actual peer-review participation:
  • Ranging between 2.41 and 2.62.
• The performance of the ensemble method seems promising:
  • MAE ranging between 0.45 and 1.04.
17. Discussion
• However, error increased in the prediction of other levels
• Due to poor classification performance.
• More work needed on the classification model.
21. Practicality!
• Experimentation with post-hoc predictions [3]
• Useful in demonstrating relationships
• Exam scores → Dropouts,
• No practical use.
22. Practicality!
• Using only the information available at the time of prediction,
• Making predictions when they are useful to the instructor.
• Using TRANSFER LEARNING approaches [4] to create
operational prediction models.
23. Transfer learning: Practical predictions
• In-situ learning: Transferring a model across different weeks.
• After the course start, the nth peer reviews yield (features)n, used to train (MODEL)n.
• The (n+1)th peer reviews are then predicted at the time of the (n+1)th assignment submissions, using (MODEL)n fed with (features)n+1.
24. Transfer learning: Practical predictions
• Across contexts: Transferring a model across different courses.
• For each peer-review session of Course A (1st, 2nd, …, nth), a separate model is trained: (MODEL)1, (MODEL)2, …, (MODEL)n.
• Each model is then used to predict participation in the corresponding peer-review session of Course B.
29. Method
• 10-fold cross-validation vs transfer learning approaches,
• Logistic regression as the classifier,
• Scikit-learn
• Offers parameters to deal with imbalanced class distributions
• AUC scores were used as the metric of performance.
• More rigorous when the class distributions are imbalanced.
• 0.9–1.0: Excellent,
  0.8–0.9: Good,
  0.7–0.8: Fair,
  0.6–0.7: Poor,
  0.5–0.6: Fail
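These bands can be captured in a small helper for reading the weekly results; the function is just the slide's rule of thumb, not part of the study's code.

```python
def auc_band(auc: float) -> str:
    """Map an AUC score to the qualitative band listed above."""
    if auc >= 0.9:
        return "Excellent"
    if auc >= 0.8:
        return "Good"
    if auc >= 0.7:
        return "Fair"
    if auc >= 0.6:
        return "Poor"
    return "Fail"
```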
31. Results
• Transfer between Course #1 and Course #2
Using Course #1 to predict Course #2 on a weekly basis (AUC):
  Week #1: 0.559, Week #2: 0.681, Week #3: 0.656, Week #4: 0.717
Using Course #2 to predict Course #1 on a weekly basis (AUC):
  Week #1: 0.552, Week #2: 0.735, Week #3: 0.779, Week #4: 0.762
32. Results
• Transfer between Course #1 and Course #2
Using Course #2 to predict Course #1 on a weekly basis (AUC):
  Week #2: 0.787, Week #3: 0.802, Week #4: 0.806
Using Course #1 to predict Course #2 on a weekly basis (AUC):
  Week #2: 0.672, Week #3: 0.716, Week #4: 0.751
33. Discussion
• Using in-situ approach, the classification model has potential to be
utilized in an ongoing MOOC.
• Transferable over the weeks of the same course.
34. Discussion
• Transferable in both directions, but accuracy was not the same
• Training using whole course seems to perform better.
• More data points → better-trained model.
• Prediction of the participation in the first peer reviews is a challenge.
The classification model seems to be transferable
across different courses.
35. Future Work
Limitation -> Context was UNKNOWN
Connecting with Learning Design
• Replicating the model in one of our own MOOCs:
• Features that are informed by the learning design
• Transferring across KNOWN courses
• The role of the learning design and the context
36. Future Work
• Using this classification model in practice
• to produce a relevant input to the group-formation task,
• to offer peer-review related interventions
• Exploring other characteristics of peer-review engagement:
• Whether students are likely to provide feedback or not,
• The quality of feedback: related factors?
37. References
[1] Topping, K.: Peer assessment between students in colleges and universities. Rev. Educ. Res.
68, 249–276 (1998).
[2] Wen, M.L., Tsai, C.C., Chang, C.Y.: Attitudes towards peer assessment: a comparison of the
perspectives of pre‐service and in‐service teachers. Innov. Educ. Teach. Int. 43, 83–92 (2006).
[3] Kurka, D.B., Godoy, A., Von Zuben, F.J.: Delving Deeper into MOOC Student Dropout
Prediction. CEUR Workshop Proc. 1691, 21–27 (2016).
[4] Champaign, J., McCalla, G.: Transfer Learning for Predictive Models in Massive Open Online Courses. Lect. Notes Comput. Sci. 9112, 883 (2015).
Hello, my name is Erkan. I am a postdoctoral researcher at the GSIC research group. Today, I will present one of our recent works, which is about …
Peer reviews were already used a lot in the literature before MOOCs, and they have learning benefits not only for students whose work is reviewed but also for students who do the review. For example, when students perform a peer review they need to take a critical perspective using their existing knowledge, which may improve their higher-order thinking skills. Also, students whose work is reviewed can receive valuable feedback that they can use to improve their learning.
However, the main motivation behind the use of peer reviews in MOOCs is not these learning benefits. They are mainly used because instructors cannot assess thousands of student artifacts and give individual feedback to each.
With peer reviews in MOOCs, it is good that students get some feedback as formative assessment, but they also receive a grade for their submission so that they can continue or complete the course.
When peer reviews take place at large scale, some problems emerge. One big issue with peer reviews in MOOC contexts is low participation. Actually, this is not very surprising considering the diversity among MOOC learners, and such a diverse set of learners may not behave as desired, especially when there is no instructor facilitation.
When participation in peer reviews is low, it is likely that many students will receive neither a grade to move on in the course nor feedback to improve their learning. Such students are more likely to become disengaged over time and drop out.
So, the solution we propose to this problem is to predict student engagement in peer reviews using their preceding course activities. What I mean is predicting the number of peer submissions that students will review.
And we want to make this prediction at each peer-review activity; for example, if the course has 4 assignments that require peer reviews, then we make this prediction 4 different times.
Why should this prediction matter? Why are we interested in this? Well, because it can offer many pedagogical utilities in MOOCs.
First, instructors can possibly create effective peer-review activities. For example, students who seem likely to under-participate can be assigned fewer peer reviews, and vice versa. Also, students with low participation might be given longer times to complete their peer reviews.
Moreover, instructors may want to provide some incentives to motivate students to participate in peer review if, overall, the class seems to be uninterested.
Other than peer reviews, instructors can also use this prediction model for designing effective collaborative learning activities. For example, using the prediction results as another grouping criterion, instructors can form inter-homogeneous groups in terms of having members with a desire to review teammates’ work. In this way, these groups may function better and more effectively.
Actually, this is our follow-up work. In the first study, we worked on the same prediction task, but there were several issues.
The model was built with a large feature set that was particular to the context. Not only may this make the prediction model very hard to transfer to other contexts, but it may also lead to the overfitting problem.
Second, we observed that a large part of the error was accounted for by students who submitted their assignment but did not review any peer submission.
So, what have we done differently in the current study? First, we reduced the feature set to a smaller yet predictive set, which helps minimize the probability of overfitting and enhances the transferability of the model within the same MOOC and also across MOOCs; more on this later.
And, to reduce the error caused by students who have ZERO participation in peer reviews, we proposed an ensemble approach.
So, here is the regular regression-based prediction model. The regression model receives the data and produces the predictions; then we evaluate its performance.
With the ensemble approach, we use a classification model to identify students with zero peer reviews versus others. Then we feed the regression model only with students who have some peer-review participation. Finally, to assess the overall performance of the model, we combine the students that were predicted to have ZERO participation with the predictions obtained from the regression model.
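The pipeline just described can be sketched with scikit-learn roughly as follows. The synthetic data, the number of features, and the regularization strengths are illustrative assumptions, not the study's actual configuration.

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 4))                      # per-student features (synthetic)
active = X[:, 0] > -0.5                            # hidden "will participate" signal
y = np.where(active, np.clip(np.round(2 + X[:, 1]), 0, 4), 0.0)  # reviews done

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Step 1: classify zero-participation students vs. the rest
clf = LogisticRegression().fit(X_tr, y_tr > 0)

# Step 2: fit the regressor only on students with some participation
reg = Lasso(alpha=0.05).fit(X_tr[y_tr > 0], y_tr[y_tr > 0])

# Step 3: combine — predicted zero-reviewers keep 0, the rest get the regression output
pred = np.zeros_like(y_te)
mask = clf.predict(X_te)
pred[mask] = reg.predict(X_te[mask])

mae = mean_absolute_error(y_te, pred)              # overall performance
```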
The course was a Canvas Network course, and the data was obtained from Dataverse.
This course has more than three thousand five hundred enrollments, and it has 4 assignments with peer reviews; therefore, there are 4 peer-review sessions. Here is the distribution of the number of peer reviews performed by each student at each session. For example, in the first peer-review session, more than 100 students did not do any peer reviews even though they submitted their assignments. The other bars are for students with 1, 2, 3, and finally 4 peer reviews. As you may have already noticed, most of the students at each session performed 3 peer reviews, which might be because they were enforced or recommended to do 3 peer reviews.
We have generated features based on four types of student activities: their quiz activities, discussion activities, assignment activities, and peer-review activities. One thing to highlight is that we previously found past peer-review activities to be good predictors of future peer-review participation. For this reason, we have generated more features regarding students’ past peer-review participation.
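As an illustration only, per-student activity counts of this kind could be aggregated from event logs along the following lines; the column names and event types are assumptions, not the actual Canvas Network data schema.

```python
import pandas as pd

# toy event log: one row per logged student action (hypothetical schema)
logs = pd.DataFrame({
    "student": ["s1", "s1", "s2", "s2", "s2"],
    "event":   ["quiz", "peer_review", "discussion", "peer_review", "peer_review"],
})

# one count feature per activity type, one row per student
features = logs.groupby(["student", "event"]).size().unstack(fill_value=0)
```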
Let’s quickly look at the results. You may see that the ensemble approach that combines classification and regression models yields better accuracy for predicting students with 0 participation.
Also, the overall accuracy was higher with the ensemble approach compared to what we obtained in our previous study.
However, the improvement in accuracy was limited, since the classification error led to some problems in predicting participation at different levels.
And now I would like to present to you the peer-review prediction PART II: Classification and Transferability. This is the most recent research work that we have conducted on peer-review prediction.
In this follow-up research, we worked on the same problem, but this time we framed it as a classification problem instead of a regression one.
Here we have this bar, which represents the recommended level of participation in peer reviews, which is in general 3 in the courses that we have examined.
And what we propose here is to identify, early in a course, which students will do the recommended number of peer reviews or more, and which students will do less than what is required.
We thought such a classification of students could be more useful for instructors, rather than predicting exactly whether a student will do 1 or 2 reviews.
What makes this follow-up research different is that we propose a classification model that has some practical value. What I mean is a model that produces predictions to be used for designing interventions in real MOOC contexts, while the course is still running.
Previous research has been mostly based on post-hoc predictions, which are useful in demonstrating relationships between the input variables and the target variable. For example, exam scores are important predictors of dropouts. However, these predictions are post-hoc, meaning that they are possible only after we know that students have really dropped out. Therefore, they cannot be used in practice.
So, what we do differently is build a prediction model using only the information available at the time of prediction, and make this prediction early enough that it is useful for the instructor. For this purpose, we use transfer learning approaches.
The first transfer learning approach that we use is called in-situ learning, by which we can transfer a model across different weeks of the same course.
So let’s say we want to predict students’ peer-review participation for assignment n+1. For this, we train a model using past peer-review activities that took place in the same course. Then we use this (MODEL)n, along with (features)n+1, to make a prediction for the (n+1)th assignment. I hope this makes sense.
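A rough sketch of this in-situ step: a logistic-regression model trained on week n is reused, unchanged, on the features of week n+1, whose labels are not yet known at prediction time. All data here is synthetic and illustrative, not the study's dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)

def one_week(n_students=300):
    """Synthetic per-student features and a binary participation label."""
    X = rng.normal(size=(n_students, 3))
    y = (X[:, 0] + 0.3 * rng.normal(size=n_students) > 0).astype(int)
    return X, y

X_n, y_n = one_week()        # week n: participation already observed
X_next, y_next = one_week()  # week n+1: labels known only after the fact

# class_weight="balanced" is scikit-learn's option for imbalanced classes
model_n = LogisticRegression(class_weight="balanced").fit(X_n, y_n)
auc = roc_auc_score(y_next, model_n.predict_proba(X_next)[:, 1])
```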
The second transfer approach involves transferring a model across different courses. So, let’s say we have Course A, which was completed before Course B; we use Course A to train models for each week, and use each of them to predict the participation at the corresponding week of Course B.
So, here is the first course; we name it Course #1. As I described earlier, it has more than three thousand five hundred enrollments and 4 peer-review sessions, and most students performed 3 peer reviews at each session, probably because they were enforced or recommended to do 3 peer reviews.
The second course has 3632 enrollments and, similarly to Course #1, it has 4 peer-review sessions. The pattern of students’ peer-review participation is also similar, except that in this course most of the students have done 4 peer reviews, which makes us think that the recommended or required number of peer reviews in this course was probably 4.
In the third course in this study, there were even more enrollments and a higher number of peer-review sessions. Also, there was a sharp decrease in peer-review participation compared to the previous 2 courses.
As before, we generated features based on four types of student activities (quizzes, discussions, assignments, and peer reviews), with additional features regarding students’ past peer-review participation, since these had proven to be good predictors.
I won’t go into too much detail; we just used AUC as the performance metric, since it is not sensitive to imbalanced data.
Let’s move to the results. First, using cross-validation, we obtained a good level of accuracy for all weeks. One may argue that these results are highly optimistic since they are based on cross-validation.
However, almost the same prediction performance was achieved using the in-situ approach.
That is, the classification model built in one week was useful for making a prediction in the following week. And this transferability was achieved in three different courses.
This tells us that this model has much practical value.
Accuracy was much better starting from the 3rd peer reviews, as these models included features regarding past peer reviews.
Also, we trained models using weekly data from one course to make predictions for the corresponding week of another course. The prediction performance was pretty good, especially after the first week.
As another way of transferring models between these two courses, we attempted to train a single model using the whole course data and use that same model to make predictions in the other course for each week. So, here we first have one model trained using Course 2, and we have predictions for Weeks 2, 3, and 4 of Course 1. The accuracy of these predictions seems to be better than when we train models separately for each week.
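This whole-course variant can be sketched as follows: one model is trained on the pooled weeks of one course, then scored week by week on another. The data, the number of weeks, and the small distribution shift between the two "courses" are synthetic assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(3)

def make_week(n=200, shift=0.0):
    """One week of synthetic per-student features and a binary participation label."""
    X = rng.normal(loc=shift, size=(n, 3))
    y = (X[:, 0] + 0.3 * rng.normal(size=n) > shift).astype(int)
    return X, y

# "Course 2": pool all four weeks into a single training set
weeks_c2 = [make_week() for _ in range(4)]
X_train = np.vstack([X for X, _ in weeks_c2])
y_train = np.concatenate([y for _, y in weeks_c2])
model = LogisticRegression(class_weight="balanced").fit(X_train, y_train)

# "Course 1": evaluate the same single model on each week separately
weekly_auc = [roc_auc_score(y, model.predict_proba(X)[:, 1])
              for X, y in (make_week(shift=0.2) for _ in range(3))]
```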
And again, just for exploration, we repeated the same thing with the courses in reverse order. The results were good, but not as accurate as in the first direction.
So, students’ level of peer-review participation can be predicted using past course activities (quizzes, discussions, assignments, peer reviews). However, features about past peer-review activities seem to be the most important.
Using the in-situ approach, the classification model can be utilized in an ongoing MOOC, because a model trained using one week is transferable to the following week in the same course.
The classification model seems to be transferable also across different courses; the transfer worked in both directions, but accuracy was not the same. In some courses, predicting student engagement might be harder.
Training a model using the whole course (rather than weekly data) seems to perform better. With more data points and more diversity in the dataset, we were better able to train the model, which was then more accurate in predicting weekly peer-review participation.
So, one big limitation of this research was that context was unknown. The data provider decontextualized all the data for privacy concerns.
So, we have some future work in our agenda to connect this work with learning design.
First, we want to replicate this model in …, where we can generate features that are informed by the learning design.
Also, we need to explore transferring across KNOWN courses and examine the role of the learning design and the context of these courses in efficiently transferring predictive models.