The application of data science techniques to game learning analytics data obtained from serious games can provide a more scientific approach to improving the serious games lifecycle. By focusing on game analytics data, it is possible to take an evidence-based approach to the design, evaluation and deployment of serious games. For instance, game analytics techniques applied to users' gameplay interaction data can systematize the evaluation of games and allow both teachers and institutions to make better evidence-based decisions. The talk will address some of the new possibilities offered by game learning analytics and the requirements (e.g. standards) for its generalization in real settings (including some of the ethical implications).
1. Game Learning Analytics for
Evidence-based Serious Games
Baltasar Fernandez-Manjon
balta@fdi.ucm.es , @BaltaFM
e-UCM Research Group , www.e-ucm.es
CELDA 2019, Università degli Studi di Cagliari, Sardinia
2. The Uber game – Financial Times
https://source.opennews.org/articles/how-and-why-financial-times-made-uber-game/
https://ig.ft.com/uber-game/
3. Serious Games
• Any use of digital games with purposes other than
entertainment
(Michael & Chen, 2006)
• Applied successfully in many domains (medicine, military)
with different purposes (knowledge, awareness)
• But there is still low adoption of Serious Games in mainstream
education
• Serious Games are usually considered complementary
content
• Mainly used for motivational purposes
• No actual impact on the final mark
7. Serious Games: information about cancer
and its treatment
Improving adherence to the cancer
treatment
https://www.re-mission2.org/
8. Citizen science: using games to help
solve “difficult problems”
Play to Cure™: Genes in Space - a mobile game in which players collaborate to analyse real genetic data (Cancer Research UK, n.d.)
http://centerforgamescience.org/portfolio/foldit/
http://www.cancerresearchuk.org/support-us/play-to-cure-genes-in-space
9. Miller, J., Vázquez-cano, E., & Obligatoria, S. (2015). Exploring Application, Attitudes and Integration of Video Games: MinecraftEdu in Middle School. Educational
Technology & Society, 18(3), 114–128.
Educational version of commercial games
10. Do serious games actually work?
- Very few SG have a formal evaluation (e.g. pre-post)
- Usually tested with a very limited number of users
- Formal evaluation could be as expensive as creating
the game (or even more expensive)
- Evaluation is not considered as a strong requirement
- Difficult to deploy games in the classroom
- Teachers have very little info about what is happening
when a game is being used
11. Learning analytics
• Improving education based on analysis of actual data
• Data driven
• From only theory-driven to evidence-based
“the measurement, collection, analysis and reporting of
data about learners and their contexts, for purposes of
understanding and optimizing learning and the environment
in which it occurs” (Long & Siemens, 2011)
12. Game Analytics
• Application of analytics to game
development and research
• Telemetry
• Data obtained over distance
• Mobile games, MMOG
• Game metrics
• Interpretable measures of data related to
games
• Player behavior
• Mainly used with “commercial
purposes”
14. Game Learning Analytics (GLA)
• GLA is learning analytics applied to serious games
• Collect, analyze and visualize data from learners’ interactions
with SGs
• Breaking the game black box
model to obtain information
while students play
Manuel Freire, Ángel Serrano-Laguna, Borja Manero, Iván Martínez-Ortiz, Pablo Moreno-Ger, Baltasar Fernández-Manjón (2016): Game
Learning Analytics: Learning Analytics for Serious Games. In Learning, Design, and Technology (pp. 1–29). Cham: Springer International
Publishing. http://doi.org/10.1007/978-3-319-17727-4_21-1.
15. Uses of Game Learning Analytics in
educational games
• Game testing – game analytics
• Is the game reliable?
• How many students finish the game?
• Average time to complete the game?
• Game deployment in the class – tools for teachers
• Real-time information for supporting the teacher
• Knowing what is happening when the game is deployed in the class
• “Stealth” student evaluation
• Formal Game evaluation
• From pre-post test to evaluation based on game learning analytics??
17. Game Learning Analytics (GLA) or Informagic?
• Informagic
• False expectations of gaining full insight on the game
educational experience based only on very shallow game
interaction data
• Set more realistic expectations about learning analytics with
serious games
• Requirements
• Outcomes
• Uses
• Cost/Complexity
Perez-Colado, I. J., Alonso-Fernández, C., Freire-Moran, M., Martinez-Ortiz, I., & Fernández-Manjón, B. (2018). Game
Learning Analytics is not informagic! In IEEE Global Engineering Education Conference (EDUCON).
18. Minimum Game Requirements for GLA
• Most games are black boxes.
• No access to what is going on during game play
• We need access to game “guts”
• User interactions
• Changes of the game state or game variables
• Or the game must communicate with the outside world
• Using some logging framework
• What is the meaning of that data?
• Ethics: adequate experimental design and setting
• Are users informed?
• Anonymization of data could be required
22. Experience API for Serious Games: xAPI-SG Profile
Experience API (xAPI) is a new de facto standard that
enables the capture of data about human performance and
its context. It is now becoming an IEEE standard.
The e-UCM Research Group, in collaboration with ADL,
created the Experience API for Serious Games Profile (xAPI-SG),
an xAPI profile for the specific domain of Serious Games.
The xAPI-SG Profile defines a set of verbs, activity types and
extensions that allow tracking of all in-game interactions
as xAPI traces (e.g. level started or completed).
The model
https://adlnet.gov/news/a-serious-games-profile-for-xapi
https://xapi.e-ucm.es/vocab/seriousgames
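As a rough illustration of what a "level completed" trace could look like, the sketch below builds one xAPI statement as a Python dictionary. The actor/verb/object/result/timestamp layout follows the general xAPI statement structure. The activity-type IRI, the example.org identifiers and the function name are assumptions made for illustration and should be checked against the published xAPI-SG vocabulary.

```python
import json
from datetime import datetime, timezone

# Verb IRI from the ADL vocabulary.
VERB_COMPLETED = "http://adlnet.gov/expapi/verbs/completed"
# Assumed form of the xAPI-SG activity type for a level; verify before use.
ACTIVITY_LEVEL = "https://w3id.org/xapi/seriousgames/activity-types/level"


def level_completed(player_id, level_iri, score, success):
    """Build an xAPI statement for a 'level completed' interaction (hypothetical helper)."""
    return {
        # Pseudonymous account instead of a real name or e-mail.
        "actor": {"account": {"homePage": "https://example.org", "name": player_id}},
        "verb": {"id": VERB_COMPLETED, "display": {"en-US": "completed"}},
        "object": {"id": level_iri, "definition": {"type": ACTIVITY_LEVEL}},
        "result": {"success": success, "score": {"raw": score}},
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


statement = level_completed(
    "anon-7f3a", "https://example.org/games/demo/level/1", score=8, success=True
)
print(json.dumps(statement, indent=2))
```

A tracker would typically send such statements in batches to a learning record store; that part is omitted here.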
23. Game trackers and cloud analytics frameworks as open code (GitHub)
• Java xAPI Tracker
• Unity xAPI Tracker
• C# xAPI Tracker
https://github.com/e-ucm
Ángel Serrano-Laguna, Iván Martínez-Ortiz, Jason Haag, Damon Regan, Andy Johnson, Baltasar Fernández-Manjón (2017):
Applying standards to systematize learning analytics in serious games. Computer Standards & Interfaces 50 (2017) 116–123.
24. Systematization of Analytics Dashboards
As long as traces follow the xAPI-SG format, some
analyses do not require further configuration!
Also possible to configure game-dependent analysis and
visualizations for specific games and game characteristics.
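A game-independent analysis of this kind can be sketched as follows: if every game reports when it starts and when it is completed, the completion rate and the average completion time can be computed without knowing anything else about the game. The trace layout here is a simplified tuple invented for the example, not the real xAPI-SG format, and the data is synthetic.

```python
from statistics import mean

# Simplified traces: (player, verb, object type, seconds since session start).
# Synthetic data for illustration only.
traces = [
    ("p1", "initialized", "serious-game", 0),
    ("p1", "completed", "serious-game", 600),
    ("p2", "initialized", "serious-game", 0),
    ("p2", "completed", "serious-game", 900),
    ("p3", "initialized", "serious-game", 0),
]


def completion_summary(traces):
    """Return (share of players who finished, average seconds to finish)."""
    started, finished = {}, {}
    for player, verb, obj_type, t in traces:
        if obj_type != "serious-game":
            continue  # ignore level- or question-level traces
        if verb == "initialized":
            started[player] = t
        elif verb == "completed":
            finished[player] = t
    durations = [finished[p] - started[p] for p in finished if p in started]
    rate = len(durations) / len(started) if started else 0.0
    avg = mean(durations) if durations else None
    return rate, avg


rate, avg_seconds = completion_summary(traces)
print(f"completion rate: {rate:.0%}, average time: {avg_seconds} s")
```

The same two numbers answer the game-testing questions "How many students finish the game?" and "Average time to complete the game?".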
25. Real-time analytics: Alerts and Warnings
• Identify situations that may require teacher intervention
• More complex and fragile, requires a cloud infrastructure
• Fully customizable alert and warning system for real-time teacher feedback
• Inactive learner: triggers when no traces are received in # minutes
(e.g. 2 minutes)
• High % of incorrect answers: after a minimum number of
questions answered, triggers if more than # % of the answers are wrong
[Dashboard views: students that need attention; view for a specific student (name anonymized)]
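The two alert rules described on this slide can be sketched as plain functions. The thresholds, function names and data layout below are illustrative assumptions, not the configuration of the actual analytics platform.

```python
def inactive_learners(last_trace_time, now, max_idle_minutes=2):
    """Players whose last trace is older than the allowed idle time (seconds)."""
    limit = max_idle_minutes * 60
    return sorted(p for p, t in last_trace_time.items() if now - t > limit)


def high_error_learners(answers, min_answered=5, max_wrong_pct=50):
    """Players with enough answers and more than max_wrong_pct % wrong."""
    flagged = []
    for player, results in answers.items():
        if len(results) < min_answered:
            continue  # too few answers to judge
        wrong_pct = 100 * results.count(False) / len(results)
        if wrong_pct > max_wrong_pct:
            flagged.append(player)
    return sorted(flagged)


# Synthetic classroom snapshot: last trace timestamp (s) and answer history.
last_seen = {"s1": 1000, "s2": 1170, "s3": 820}
answers = {
    "s1": [True, False, False, False, False],
    "s2": [True, True, False],
    "s3": [True, True, True, True, True],
}

idle = inactive_learners(last_seen, now=1200)
struggling = high_error_learners(answers)
print("inactive:", idle, "high error rate:", struggling)
```

In a real-time deployment these checks would run each time a new batch of traces arrives, which is part of why the slide calls this setup more complex and fragile.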
26. uAdventure: xAPI GLA in games authoring
uAdventure authoring tool (on top of Unity)
• Helps to create educational
point & click adventure games
• Open code (github)
Full integration of xAPI-SG game learning analytics
into the uAdventure authoring tool
uAdventure games come with default analytics
Includes geolocated games
https://www.e-ucm.es/uadventure/
29. Search Queries
#1: Techniques
- Artificial intelligence
- Data mining
- Machine learning
- Data analysis
- Deep learning
#2: LA
- Learning Analytics
- Game Analytics
- Educational Data Mining
#3: Games
- Serious Games
- Educational Games
- Computer Games
- Video Games
- Games-based learning
- Online Games
Alternative terms, synonyms, variations...
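One plausible way to combine the three term groups into a single search string is to join alternatives with OR inside each group and join the groups with AND. The slide does not state the exact operators or database syntax, so this combination is an assumption.

```python
# Term groups as listed on the slide.
TECHNIQUES = ["Artificial intelligence", "Data mining", "Machine learning",
              "Data analysis", "Deep learning"]
ANALYTICS = ["Learning Analytics", "Game Analytics", "Educational Data Mining"]
GAMES = ["Serious Games", "Educational Games", "Computer Games",
         "Video Games", "Games-based learning", "Online Games"]


def build_query(groups):
    """OR the terms inside each group, then AND the groups together."""
    clauses = []
    for terms in groups:
        quoted = " OR ".join(f'"{term}"' for term in terms)
        clauses.append(f"({quoted})")
    return " AND ".join(clauses)


query = build_query([TECHNIQUES, ANALYTICS, GAMES])
print(query)
```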
30. Analysis and research questions about GLA in SG
● RQ1: What are the purposes for which DS has been applied to LA data
from SGs?
● RQ2: What DS techniques have been applied to LA data from SGs?
● RQ3: What stakeholders are the target to benefit from this information?
● RQ4: What conclusions have been drawn from these applications?
● Number of participants (N)
● Education level (e.g. primary, ages)
● Interaction data captured (e.g. times, progress, errors)
● Data format (e.g. xAPI, csv)
● Game purpose (e.g. teach)
● Game subject (e.g. maths)
31. Results: RQ1 GLA purposes
➔ Main focus: assess learning &
predict performance
➔ Games are indeed useful for
purposes beyond entertainment
➔ Interest now in analyzing
interaction data to measure
impact on players and relation to
players’ in-game behaviors
32. Results: RQ2 data science techniques
➔ Linear models and cluster
techniques commonly applied
➔ Classical techniques
➔ More powerful techniques (e.g.
neural networks) not broadly
applied yet
➔ Need of XAI
33. Results: RQ3 main stakeholders
➔ Purposes that cover interests of
many stakeholders
➔ Much research done in this area
➔ Students/Learners indirect
recipients of all results
34. Results: RQ4 conclusions and results
Results on assessment & student profiling:
➔ GLA data can accurately predict games’ impact
➔ Performance is related to players’ characteristics
Results on SG design:
➔ GLA data can validate SG design
➔ Assessment can & should be integrated in SG design
➔ Importance of SG characteristics
➔ Identified challenges when designing SG
➔ Proposed frameworks to simplify design
35. Results: Additional information
Serious games used:
➔ Main focus to teach
➔ Main domain maths and science-related topics
Participants in the validation studies:
➔ Small sample sizes used (<100)
➔ Primary & secondary education
Interaction data:
➔ Completion times, actions & scores commonly tracked
➔ Format not reported
36. Assessment with serious games
The most common methodology is the pre-post experiment:
Is there a significant difference between pre-test and post-test results?
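The pre-post comparison can be sketched with a paired test on each student's two scores. The slide does not name a test, so the paired t statistic below is a stand-in (slide 58 reports a Wilcoxon paired test for a different study). The scores are synthetic.

```python
from math import sqrt
from statistics import mean, stdev

# Synthetic pre-test and post-test scores for the same six students.
pre = [5, 7, 6, 9, 4, 8]
post = [8, 9, 6, 12, 7, 10]


def paired_t(pre, post):
    """Return (mean gain, t statistic, degrees of freedom) for paired scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return mean(diffs), t, n - 1


gain, t_stat, dof = paired_t(pre, post)
print(f"mean gain = {gain:.2f}, t = {t_stat:.2f}, df = {dof}")
```

The t statistic is then compared with the critical value for the given degrees of freedom (2.571 for a two-sided 5% test with df = 5) to decide whether the gain is significant.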
37. Methodology
Use GLA interaction data to predict knowledge after playing. Two steps:
1. Game validation phase:
○ create prediction models taking as input the interaction data
○ validate against actual results (post-test)
2. Game deployment phase:
○ students play and are automatically assessed based on their interactions (used
as input for prediction models)
○ pre-post are no longer required
We have tested this methodology with a case study.
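The two phases can be sketched with a deliberately small model: a one-variable least-squares line fitted during validation and reused during deployment. The case study uses richer models and many interaction variables, so the single "game score" feature, the helper names and the data below are all illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Phase 1, game validation: interaction data plus actual post-test scores.
game_scores = [20, 40, 60, 80, 100]   # synthetic in-game scores
post_tests = [5, 8, 9, 10, 13]        # synthetic post-test results (0-15)
slope, intercept = fit_line(game_scores, post_tests)


def predict(game_score):
    """Phase 2, deployment: estimate the post-test score from gameplay only."""
    estimate = slope * game_score + intercept
    return max(0.0, min(15.0, estimate))  # keep within the 0-15 test scale


# Validation error against the known post-test results.
mae = sum(abs(predict(x) - y) for x, y in zip(game_scores, post_tests)) / len(post_tests)
print(f"validation MAE = {mae:.2f}, predicted score for 50 points = {predict(50):.1f}")
```

Once the validation error is acceptable, `predict` can be applied to new players, which is the point at which pre-post tests are no longer needed.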
38. Research Questions
Q1.1 Can we predict student knowledge from pre-test + interactions with the SG? (pre+game)
Q1.2 If we can predict it, what prediction models perform best and what variables are most
relevant?
Q2.1 Can we predict student knowledge solely from interaction with the SG? (game-only)
Q2.2 If we can predict it, what prediction models perform best and what variables are most
relevant?
Q2.3 Is the pre+game condition (Q1.1) more effective than the game-only condition (Q2.1)?
Cristina Alonso-Fernández, Manuel Freire, Iván Martínez-Ortiz, Baltasar Fernández-Manjón (2019): Predicting students’
knowledge after playing a serious game based on learning analytics data: A case study. Journal of Computer Assisted
Learning (in press).
39. The game: First Aid Game
Game to teach first aid techniques
to players aged 12-16
Three initial situations:
● chest pain
● unconsciousness
● choking
Game previously validated with
pre-post and control group:
Video-game instruction in basic life support
maneuvers. Marchiori EJ, Ferrer G, Fernandez-Manjon
B, Povar Marco J, Suberviola González JF, Gimenez
Valverde A. (2012)
40. Pre-post experiments + GLA data
N = 227 students from a high school in Madrid (Spain)
Each student completed:
● pre-test: 15 questions assessing previous knowledge
about first aid techniques
● gameplay: a session of First Aid Game
● post-test: 15 questions assessing knowledge about
first aid techniques after playing
Both the pre-post test results and the GLA interaction data
from the game were collected (following the xAPI-SG Profile).
41. Prediction models
● Predict exact post-test score (range 0-15):
○ Regression tree
○ Linear regression
○ SVR (non-linear kernels)
● Predict post-test pass/fail category (pass as 8/15 correct answers):
○ Decision tree
○ Logistic regression
○ Naïve Bayes Classifier
All models tested taking as input:
pre-test + game GLA interaction data
(pre+game condition)
only game GLA interaction data
(game-only condition)
42. Results
Pass/fail prediction
Pre-test?        Prediction model           Precision  Recall  MR
Yes (pre+game)   Decision tree              81.6%      94.2%   16.2%
Yes (pre+game)   Logistic regression        89.8%      98.3%   10.5%
Yes (pre+game)   Naïve Bayes Classifier     92.6%      89.7%   15.1%
No (game-only)   Decision tree              88.6%      92.4%   17.3%
No (game-only)   Logistic regression        87.2%      98.8%   12.7%
No (game-only)   Naïve Bayes Classifier     89.7%      90.6%   16.9%
Score prediction (scale [0-15])
Pre-test?        Prediction model           Error mean (SD)
Yes (pre+game)   Regression tree            2.22 (0.55)
Yes (pre+game)   Linear regression          1.68 (1.44)
Yes (pre+game)   SVR (non-linear kernels)   1.47 (1.33)
No (game-only)   Regression tree            2.38 (0.62)
No (game-only)   Linear regression          1.89 (1.54)
No (game-only)   SVR (non-linear kernels)   1.56 (1.37)
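The columns reported for these models can be computed as in the sketch below. It assumes that MR stands for misclassification rate, that "pass" means at least 8 of 15 correct answers, and that the score error is a mean absolute error. These are readings of the slide, not confirmed definitions, and the scores are synthetic.

```python
PASS_MARK = 8  # assumed: pass means at least 8 of 15 correct answers


def to_pass(scores):
    return [score >= PASS_MARK for score in scores]


def classification_metrics(actual, predicted):
    """Return (precision, recall, misclassification rate) for pass/fail labels."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    mr = (fp + fn) / len(actual)
    return precision, recall, mr


def mean_abs_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Synthetic post-test scores and model predictions for eight students.
actual_scores = [12, 6, 9, 14, 7, 10, 5, 11]
predicted_scores = [11, 8, 10, 13, 6, 7, 4, 12]

precision, recall, mr = classification_metrics(
    to_pass(actual_scores), to_pass(predicted_scores)
)
score_error = mean_abs_error(actual_scores, predicted_scores)
print(precision, recall, mr, score_error)
```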
43. Results
Q1.1 Can we predict student knowledge from pre-test + interactions with the SG?
(pre+game)
Yes, highly accurate results obtained to predict knowledge
● As expected, more accurate predictions when simply predicting pass/fail
categories than exact scores.
Q1.2 If we can predict it, what prediction models perform best and what variables are
most relevant?
● Models: logistic regression for binary pass/fail predictions and SVR for
score predictions
● Variables: number of interactions with game character, game scores
44. Results
Q2.1 Can we predict student knowledge solely from interaction with the SG?
(game-only)
Yes, highly accurate results obtained to predict knowledge
Q2.2 If we can predict it, what prediction models perform best and what variables are
most relevant?
● Models: logistic regression for pass/fail; SVR for score predictions
● Variables: number of interactions with game character, scores in level
Q2.3 Is the pre+game condition (Q1.1) more effective than the game-only condition
(Q2.1)?
Yes, but only slightly. Models in game-only condition still obtain accurate
results.
45. New uses of games based on GLA
- Avoiding pre-test: Games for evaluation
- Avoiding post-test: Games for teaching and for measuring learning
With or without pre-test.
46. GLA Case study: Downtown
• Serious Game designed and developed to
teach young people with Down Syndrome
to move around the city using the subway
• Evaluated with 51 people with cognitive
disabilities (mainly Down Syndrome)
• 42 users with all data
• 3h Gameplay/User
• >120K analytics xAPI data (traces) to
analyze
47. Case Study: Downtown
• From user requirements to a game
design and its observables
• Know more about how and what is learned
by people with Down Syndrome
Ana Rus Cano, Alvaro Garcia-Tejedor, Baltasar Fernández-Manjón (2018): Using Game Learning Analytics for Validating the
Design of a Learning Game for Adults with Intellectual Disabilities. The British Journal of Educational Technology
48. Hyp 1: Users prefer to identify themselves
with the avatar • REFUTED
• None of the users selected the avatar with
Down syndrome features, even though the trainers showed
them the avatar and pointed out that the
character had Down syndrome.
• The majority of the users used the
preconfigured character even though they were
asked to customize the avatar at the beginning
of the game session.
• We did not observe significant differences in
play patterns between users who
customized the character and those who did not,
but it may be significant that the majority of
the users who changed the avatar had Down syndrome.
49. Hyp: High-Functioning users perform
better using the game
• To determine the cognitive skills and autonomy of the users we asked the trainers
to complete a test about each student
• 6 intellectual dimensions were measured (5-point Likert scale)
• General cognitive/intellectual ability
• Language and communication
• Memory acquisition
• Attention and distractibility
• Processing speed
• Executive functioning
• Users were divided into two groups: Medium-Functioning (≤ 3 avg.) and High-Functioning (> 3 avg.).
• MF = 19 (45.2%) HF = 23 (54.8%)
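The grouping rule can be written down directly: average the six trainer ratings and compare with 3. The function name and the sample ratings below are illustrative, and the last two lines simply re-derive the reported group percentages from the counts on the slide.

```python
from statistics import mean

DIMENSIONS = (
    "General cognitive/intellectual ability",
    "Language and communication",
    "Memory acquisition",
    "Attention and distractibility",
    "Processing speed",
    "Executive functioning",
)


def functioning_group(ratings):
    """Classify a user from six 5-point Likert ratings given by the trainer."""
    if len(ratings) != len(DIMENSIONS):
        raise ValueError("expected one rating per dimension")
    return "HF" if mean(ratings) > 3 else "MF"


# Hypothetical ratings for two users.
group_a = functioning_group([3, 3, 3, 3, 3, 3])  # average exactly 3
group_b = functioning_group([4, 3, 3, 4, 3, 3])  # average above 3

# Group shares from the reported counts (19 MF and 23 HF out of 42 users).
mf_share = round(100 * 19 / 42, 1)
hf_share = round(100 * 23 / 42, 1)
print(group_a, group_b, mf_share, hf_share)
```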
50. Hyp 2: High-Functioning users perform
better using the game
[Charts: number of MF users that played each level; number of HF users that played each level]
51. Hyp 2: High-Functioning users perform
better using the game
[Chart: Average time completing levels for MF. Levels: Easy, Medium, Expert. Data labels: 12:31:21 AM, 12:37:02 AM, 12:33:16 AM]
[Chart: Average time completing levels for HF. Levels: Easy, Medium, Hard, Expert. Data labels: 12:32:44 AM, 12:28:33 AM, 12:36:51 AM, 12:25:58 AM]
52. Hyp: ID users are engaged and motivated
while learning with a videogame
• Inactivity times reduced
by 70.7% on average from
session #1 to session #8
• Positive and motivational
learning environment
(98.2% of users show
improvement and
engagement performing the
videogame tasks)
[Chart: Average inactivity time evolution. x-axis: game session (1-8); y-axis: avg time. Data labels: 12:01:30 AM, 12:01:03 AM, 12:00:59 AM, 12:00:59 AM, 12:00:50 AM, 12:01:00 AM, 12:00:36 AM, 12:00:38 AM]
53. Hyp: The game design of Downtown is
effective as a learning tool
• 100% of the trainers agree that
the use of Downtown would
enhance the users' learning
acquisition (Perceived
Usefulness)
• 85.8% of the users were able to
follow the right path (both LF
and HF)
• 50.8% of the wrong paths
occurred during the first 30 min.
of playing
[Chart: Correct vs Incorrect Path per Game session (#correct stations vs #incorrect stations). x-axis: game session (1-6); y-axis: count (0-180)]
55. Educational design
● Adventure game: feelings and emotions are important
● Real situations familiar to students
● Events based on user decision making (but no aggression options)
● Scenarios based on research about bullying and cyberbullying
● Different roles of bullying represented
● Designed to be used in the classroom
56. Game mechanics
New student in school
Takes place over 5 days
Minigames as “nightmares”
Implications of the social networks
58. Significant increase in cyberbullying
perception (Wilcoxon paired test, p<0.001)
5.72
6.38
Antonio Calvo-Morata, Dan-Cristian Rotaru, Cristina Alonso-Fernández, Manuel Freire, Iván Martínez-Ortiz, Baltasar Fernández-Manjón (2018):
Validation of a Cyberbullying Serious Game Using Game Analytics. IEEE Transactions on Learning Technologies (early access)
59. Validation with teachers and future teachers
Evaluation with 84 teachers and 104 educational science
students
● Significant increase in their knowledge
● 99% consider that the game can be used in class VS 1%
that do not agree
● 82% willing to use it in their class VS 2% would not use
it
● 87% consider the game an effective tool VS 1% that do
not agree
Antonio Calvo-Morata, Manuel Freire, Iván Martínez-Ortiz, Baltasar Fernández-Manjón (2019): Applicability of a cyberbullying
videogame as a teacher tool: comparing teachers and educational sciences students. IEEE Access, DOI: 10.1109/ACCESS.2019.2913573
60. Simva
Simva is a tool for scientific validation of serious games.
Goal: to simplify the previously-identified issues:
● Connection with surveys system and learning analytics system
● Management of participants and surveys
● Data storage: questionnaires responses and traces of interactions
● Control while experiments are in play: questionnaires finished, data sent
Ivan Perez-Colado, Antonio Calvo-Morata, Cristina Alonso-Fernández, Manuel Freire, Iván Martínez-
Ortiz, Baltasar Fernández-Manjón (2019): Simva: Simplifying the scientific validation of serious games.
19th IEEE International Conference on Advanced Learning Technologies (ICALT), 15-18 July 2019,
Maceió-AL, Brazil.
62. Simva
● Validation of serious games is a complex, error-prone process
● Simva tool aims to simplify the possible issues
○ Before the experiments:
■ Managing users & surveys
■ Providing anonymous identifiers to users
○ During the experiments:
■ Collecting and storing all data collected
■ Relating different data from users
■ Allowing additional metadata
○ After the experiments:
■ Simplifying downloading of all data collected
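The idea of handing out anonymous identifiers and later relating each participant's questionnaires and traces through them can be sketched as below. This is not Simva's actual implementation; the token format, function names and data layout are assumptions made for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits


def anonymous_ids(count, length=4):
    """Generate `count` distinct random tokens to hand out instead of names."""
    ids = set()
    while len(ids) < count:
        ids.add("".join(secrets.choice(ALPHABET) for _ in range(length)))
    return sorted(ids)


def join_by_token(surveys, traces):
    """Relate questionnaire answers and game traces that share a token."""
    return {
        token: {"survey": answers, "traces": traces.get(token, [])}
        for token, answers in surveys.items()
    }


codes = anonymous_ids(30)          # e.g. one token per student in a class
first = codes[0]
records = join_by_token(
    {first: {"pre": 7, "post": 11}},        # synthetic questionnaire data
    {first: ["initialized", "completed"]},  # synthetic trace verbs
)
print(len(codes), records[first])
```

Because the mapping from token to student never enters the system, only the teacher or researcher who distributed the tokens can re-identify a participant.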
63. Conclusions
• Game Learning Analytics has a great potential for improving SGs
• Evidence based serious games
• Games as assessments (better “Stealth” student evaluation)
• Games as powerful research environments
• Still complex to implement GLA in SG
• Increases the (already high) cost of the games
• Requires expertise not always present in game developers, SME or research
groups
• Real time GLA is still complex and fragile (e.g. deployment in schools)
• New standards specifications (e.g. xAPI) and open software
development could greatly simplify GLA implementation and adoption
• Ethics should drive the GLA process