Presented on 5 Aug 2015 at the 17th International Conference on Human-Computer Interaction (#HCII2015) in Los Angeles, CA, USA. The paper was presented in the Cross-Cultural Design track, in the "Methodology of User Study" session chaired by Professor Patrick Rau, Tsinghua University, P.R. China. http://2015.hci.international/wednesday
Comparison of User Responses to English and Arabic Emotion-Elicitation Videos
1. Comparison of User Responses to English
and Arabic Emotion-Elicitation Video Clips
NAWAL AL-MUTAIRI1, SHARIFA ALGHOWINEM1,2 AND AREEJ AL-WABIL1
1 KING SAUD UNIVERSITY, COLLEGE OF COMPUTER AND INFORMATION SCIENCES, RIYADH, SAUDI ARABIA
2 AUSTRALIAN NATIONAL UNIVERSITY, AUSTRALIA
Aug 5, 2015
2. HCI Lab in King Saud University
@HCI_Lab
SKERG.ksu.edu.sa
3. Outline
◦ Introduction
◦ Background
◦ Emotion elicitation method
◦ Emotion response channel
◦ Emotions and cultural background
◦ Related work
◦ Research aim
◦ Feature extraction, analysis and classification
◦ Results
◦ Conclusion
4. Introduction
o Psychologically, emotion is often conflated with terms such as affect and feeling, and is
associated with sensations, moods, desires and attitudes.
o Emotion has been defined as uncontrollable feelings that affect our behavior and motivation.
o Emotional states have been divided into categories such as positive and negative, into
dimensions such as valence and arousal, and into a discrete set of emotions such as
Ekman’s basic emotions (happiness, sadness, surprise, fear, anger and disgust).
5. Background – Emotion elicitation method
o To study emotions, measurable and replicable ways to collect emotional responses are required.
o The most effective way is through emotion elicitation video clips, such as Gross and
Levenson’s set (1995), which is used to elicit Ekman’s basic emotions.
o The set contains more than one clip for each emotion, with different levels of eliciting the
target emotions.
6. Background – Emotion response channel
o Emotions can be expressed through different channels, such as:
◦ Facial expression
◦ Speech prosody
◦ Physiological signals such as heart rate, skin conductivity, brain waves, etc.
Ekman argued that emotions have discriminative patterns generated by the Autonomic Nervous
System (ANS). These patterns reflect the changes in human physiological signals when different
emotions are elicited. Ax was the first to observe that the ANS response differs between fear
and anger.
7. Background – Emotions and cultural background
o Several researchers found that emotional expressions differ between cultures, since they
depend on emotion triggers.
o Therefore, the same emotion elicitation stimuli could produce different results depending on
the subject’s cultural background.
8. Related work
The study concluded that the selected video clips were able to elicit the target emotions in German culture.
45 German subjects were invited to participate in a study by rating the emotions elicited by four
video clips from Gross and Levenson’s set.
9. Related work
The study concluded that most of the video clips have a universal capability to elicit the target emotions.
However, due to the unique characteristics of Japanese culture, some of the clips elicited non-target
emotions as well. For example, one of the video clips intended to elicit amusement also induced surprise,
interest, and embarrassment.
31 Japanese volunteers were invited to watch English video clips selected from Gross and
Levenson’s set to elicit six different emotions, and then to rate the emotions they felt.
10. Research Aim
o Given the differences between Arab culture and the Western and Japanese cultures, it was
not clear whether the same set of emotion elicitation clips is capable of eliciting the
target emotions in Arab subjects.
o Therefore, we investigate and compare emotional responses of Saudi subjects to the clips in
Gross and Levenson’s set and to an initial selection of Arabic emotion elicitation clips.
o The purpose of the Arabic set is to investigate the effect of cultural acceptance on the elicited
emotions in comparison with the English set.
o For validation, skin physiological responses are measured and analysed to identify the
differences between emotional responses to the English and Arabic emotion elicitation video
clips.
11. Method – Video Clips / Stimuli
o One video clip for each emotion was selected from Gross and Levenson’s emotion elicitation
clips. The clips were selected for compliance with Arab socio-cultural constraints.
o An initial set of Arabic clips was selected from a small collection of video clips gathered by the
project team members.
English film clips used as stimuli
Clip | Emotion
“Bill Cosby, Himself” | Amusement
“The Champ” | Sadness
“My Bodyguard” | Anger
“The Shining” | Fear
“Capricorn One” | Surprise
“Amputation of a hand” | Disgust
Arabic film clips used as stimuli
Clip | Emotion
“Bye-Bye London” | Amusement
“Bu kreem with seven woman” | Sadness
“Omar” | Anger
“Mother is a Basha” | Fear
“Tagreebah Falestineeah” | Surprise
“Arab Got Talent” | Disgust
12. Method – Data Collection
o An interface was coded to show the English video clip followed by the Arabic video clip for
the same emotion, in the following order: amusement, sadness, anger, fear, surprise, and
disgust.
o During the experiment, the 29 female Saudi participants were asked to wear the
Affectiva Q-sensor to record the skin response to emotional stimuli.
o After the experiment, participants rated the emotions they felt while watching
each clip using a modified Likert scale ranging from 0 (not felt) to 10 (felt strongly).
o Since not all video clips elicited the target emotion in all participants, we included
in the analysis only the segments where the participants felt the target emotion.
Number of selected segments for each emotion
Clips | Disgust | Surprise | Fear | Anger | Sadness | Amusement
English clips | 22 | 19 | 19 | 19 | 23 | 16
Arabic clips | 22 | 17 | 19 | 17 | 16 | 24
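The segment selection step can be illustrated with a minimal Python sketch. The record layout, the sample ratings and the cut-off `min_rating=1` (any non-zero rating counts as "felt") are assumptions made for illustration; the slides do not state the exact criterion or data format.

```python
from collections import Counter

# Hypothetical self-ratings: (participant, language, clip emotion, target-emotion rating 0-10).
ratings = [
    ("P01", "English", "amusement", 0),
    ("P01", "Arabic", "amusement", 7),
    ("P02", "English", "amusement", 4),
    ("P02", "Arabic", "amusement", 0),
    ("P01", "English", "sadness", 9),
]


def select_segments(rows, min_rating=1):
    """Keep only segments whose target-emotion rating is at least min_rating (assumed cut-off)."""
    return [row for row in rows if row[3] >= min_rating]


def count_by_language_and_emotion(rows):
    """Count the retained segments per (language, emotion) pair."""
    return Counter((language, emotion) for _, language, emotion, _ in rows)


selected = select_segments(ratings)
counts = count_by_language_and_emotion(selected)
```

Counting the retained segments per language and emotion gives a table of the same shape as the one on this slide.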
13. Feature extraction, analysis and classification
o Matlab code was used to segment the raw data offline using the clips’ time stamps. Moreover, we
excluded scenes presumed not to elicit the target emotions.
o From the skin conductance level (SCL), skin conductance response (SCR), original electrodermal
activity (EDA) and skin temperature, we extracted statistical features:
o Max, min, mean, range, STD and variance of EDA and skin temperature
o Max, min, mean, range, STD and variance of the slope of EDA and skin temperature
o Max, min and average values of the EDA and skin temperature peaks and valleys
o Average number of peaks and valleys per second in EDA and skin temperature
o Number of frames above and below a threshold (0.01 for SCL and 37° for skin temperature)
o Max, min, mean, range, STD and variance of temperature velocity and acceleration
o The extracted features were analysed using a one-way between-groups analysis of variance (ANOVA)
test, using the six emotions as groups for each of the extracted features, with p ≤ 0.01.
o To ensure an accurate comparison, features extracted from the English and Arabic clips were analysed
individually, and then the common features that differ significantly across the emotion groups in
both languages were selected for classification.
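As a rough illustration of the feature extraction and ANOVA screening above, here is a minimal Python sketch using NumPy and SciPy. The original analysis was done in Matlab; the function names, the sampling-rate parameter `fs` and the toy data are assumptions, and only the basic statistics and slope features are covered.

```python
import numpy as np
from scipy.stats import f_oneway


def basic_stats(x):
    """Max, min, mean, range, standard deviation and variance of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    return {
        "max": x.max(),
        "min": x.min(),
        "mean": x.mean(),
        "range": x.max() - x.min(),
        "std": x.std(ddof=1),
        "var": x.var(ddof=1),
    }


def segment_features(signal, fs):
    """Statistics of a segment and of its slope (first difference scaled by fs)."""
    signal = np.asarray(signal, dtype=float)
    slope = np.diff(signal) * fs
    feats = dict(basic_stats(signal))
    feats.update({"slope_" + k: v for k, v in basic_stats(slope).items()})
    return feats


def significant_features(groups, alpha=0.01):
    """Return the column indices whose one-way ANOVA p-value is <= alpha.

    groups maps each emotion to a 2-D array of shape (segments, features).
    """
    arrays = [np.asarray(g, dtype=float) for g in groups.values()]
    keep = []
    for j in range(arrays[0].shape[1]):
        _, p_value = f_oneway(*(a[:, j] for a in arrays))
        if p_value <= alpha:
            keep.append(j)
    return keep


# Toy data: feature 0 shifts with the emotion, feature 1 is identical in every group.
emotions = ["amusement", "sadness", "anger", "fear", "surprise", "disgust"]
base = np.arange(5.0)
demo = {e: np.column_stack([base + 10 * k, base]) for k, e in enumerate(emotions)}
selected = significant_features(demo)
```

Running the screening separately on the English and Arabic feature tables and intersecting the two index lists would give the common features described in the last bullet.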
14. Feature extraction, analysis and classification
o Features that pass the ANOVA test are normalized using min-max normalization.
o Training and testing data are constructed using leave-one-segment-out cross-validation,
without overlap between training and testing data.
o The LIBSVM classifier was used to automatically classify the selected features.
o To increase classification accuracy, a wide-range grid search with an RBF kernel is used to
select the best gamma and cost parameters.
o A one-versus-one approach is used: several SVMs are constructed to separate each pair
of classes, and the final decision is made by majority voting among the classifiers.
o The emotion elicitation self-ratings were analysed using a two-sample, two-tailed t-test assuming
unequal variance, with p = 0.05.
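The classification steps above could be sketched in Python with scikit-learn, whose `SVC` is built on LIBSVM and handles multiple classes with one-versus-one voting. This is an illustrative stand-in rather than the authors' Matlab/LIBSVM setup: the parameter grid, the helper name `build_search` and the toy data are assumptions, and the reported score is the leave-one-out accuracy of the best grid point.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, LeaveOneOut
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC


def build_search():
    """Min-max scaling + RBF SVM, tuned by grid search under leave-one-out CV."""
    pipeline = Pipeline([
        # Scaling sits inside the pipeline, so it is fitted on training folds only.
        ("scale", MinMaxScaler()),
        # SVC trains one classifier per pair of classes and predicts by voting.
        ("svc", SVC(kernel="rbf")),
    ])
    # Assumed grid; the slides only say a "wide range" was searched.
    grid = {"svc__C": [0.1, 1, 10, 100], "svc__gamma": [0.01, 0.1, 1, 10]}
    return GridSearchCV(pipeline, grid, cv=LeaveOneOut(), scoring="accuracy")


# Toy data: three well-separated classes, eight segments each, two features.
offsets = [(dx, dy) for dx in (-0.6, -0.2, 0.2, 0.6) for dy in (-0.3, 0.3)]
X = np.array([[5 * k + dx, 5 * k + dy] for k in range(3) for dx, dy in offsets])
y = np.repeat(np.arange(3), len(offsets))

search = build_search().fit(X, y)
accuracy = search.best_score_    # mean leave-one-out accuracy of the best parameters
best_params = search.best_params_
```

Because the parameters are chosen on the same leave-one-out folds that produce the score, the accuracy from this sketch is optimistic; a nested cross-validation would give a less biased estimate.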
15. Results – Self rating
o Both English and Arabic clips induced the target emotions similarly, with the exception of
the amusement and sadness clips.
o The English video clips have a universal capability to induce the target emotions,
except for the amusement clip.
o The similarity in inducing the target emotions with the initial Arabic clips suggests that
a refined selection of the Arabic set might improve the level and intensity of the induced
target emotions.
Figure: Average self-rating of the target emotion elicited by the English and Arabic clips
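The self-rating comparison relies on a two-sample, two-tailed t-test assuming unequal variance (Welch's t-test). A minimal sketch with SciPy follows; the ratings are made-up numbers for illustration only, and the strict `<` comparison against the 0.05 level is an assumption.

```python
from scipy.stats import ttest_ind


def compare_self_ratings(english, arabic, alpha=0.05):
    """Welch's two-sample, two-tailed t-test on the self-ratings of one emotion."""
    result = ttest_ind(english, arabic, equal_var=False)  # two-sided by default
    return result.statistic, result.pvalue, bool(result.pvalue < alpha)


# Hypothetical ratings on the 0-10 scale (not data from the study).
english_amusement = [2, 3, 1, 2, 3, 2, 1, 2]
arabic_amusement = [8, 9, 7, 8, 9, 8, 7, 9]
t_stat, p_value, differs = compare_self_ratings(english_amusement, arabic_amusement)

# Two groups with the same mean and spread should not differ significantly.
_, p_same, differs_same = compare_self_ratings([5, 6, 5, 6, 5, 6], [6, 5, 6, 5, 6, 5])
```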
17. Results – Classification
English clips yielded a higher classification accuracy (94%) than Arabic clips (70%).
The confusion between the sadness and disgust emotions occurred only in the Arabic elicitation clips,
which suggests that the features used in classification do not accurately differentiate between these
emotions.
18. Conclusion
o English video clips in Gross and Levenson’s set have a universal capability to elicit the target
emotions in a Saudi sample.
o A refined selection of Arabic emotion elicitation clips will be beneficial in eliciting the targeted
emotions with higher levels of intensity.
o The classification results suggest that skin response features are robust for emotion
recognition.
19. Thank you!
شكراً
Comparison of User Responses to English and Arabic
Emotion-Elicitation Video Clips
Areej Al-Wabil
Areej@mit.edu
Aug 5, 2015