Modern technology enables us to consume multimedia both for enjoyment and as a social experience. The traditional way of consuming multimedia together (e.g., with family or friends in the living room) is being superseded by a location-independent scenario in which geographically distributed users consume the same content while sharing a real-time communication channel. Inter-Destination Multimedia Synchronization (IDMS) is the tool of choice for providing users with a high-quality multimedia experience in this scenario. In this paper, we investigate the influence of asynchronism when geographically distributed users consume multimedia content together. In particular, we adopt the concept of human computation and develop a reaction game, which we use to conduct a crowdsourced subjective quality assessment in order to determine a threshold for multimedia synchronization within an IDMS scenario. Our results show a significant decrease in overall Quality of Experience (QoE) at an asynchronism level of 750 ms. At the same time, we were able to show that asynchronism of 400 ms does not differ significantly in QoE from the synchronous reference case.
Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation
1. Is One Second Enough? Evaluating QoE for Inter-Destination Multimedia Synchronization using Human Computation
Benjamin Rainer, Stefan Petscharnig, Christian Timmerer, and Hermann Hellwagner
Alpen-Adria-Universität Klagenfurt (AAU) | Faculty of Technical Sciences (TEWI) | Department of Information Technology (ITEC) | Multimedia Communication (MMC) | Sensory Experience Lab (SELab)
http://blog.timmerer.com | http://selab.itec.aau.at/ | http://dash.itec.aau.at | christian.timmerer@itec.aau.at
Chief Innovation Officer (CIO) at bitmovin GmbH
http://www.bitmovin.com | christian.timmerer@bitmovin.com
Slides: http://www.slideshare.net/christian.timmerer
QoMEX 2015, May 27, 2015
2. Outline
• Motivation
• Our Approach
• Reaction Game for Subjective Quality Assessment
• Evaluation Methodology
• Results
• Conclusions
3. Motivation
• Watching multimedia content online together while geographically distributed, e.g., sport events, Twitch, online quiz shows, …
• SocialTV scenario featuring real-time communication via text, voice, video
• Inter-Destination Multimedia Synchronization [0] == the playout of media streams at two or more geographically distributed locations in a time-synchronized manner
[Figure: chat between User 1 and User 2: "Goal!" "Did you see the goal?" "Which goal?" "Thanks for the spoiler!"]
[0] M. Montagud, F. Boronat, H. Stokking, R. Brandenburg, "Interdestination multimedia synchronization: schemes, use cases and standardization," Multimedia Systems, vol. 18, pp. 459–482, 2012.
4. Motivation (cont’d)
• Geerts et al.: Are we in sync? [1]
– Watching videos online together, while using voice and text chat
– Noticeability of asynchronism and its impact on annoyance and togetherness
– Recommendation: 1 second is enough (we don’t think so!)
• What is the lower threshold on asynchronism for IDMS?
– Alternatively: Above which level of asynchronism do users realize that they are not in sync?
• How to assess QoE in SocialTV scenarios?
[1] D. Geerts, et al., "Are we in sync?: synchronization requirements for watching online video together," Proc. of SIGCHI Conference on Human Factors in Computing Systems (CHI '11), pp. 311-314, 2011.
5. Our Approach
• We adopt a combination of
– Games with a purpose [2]
– Gamification [3]
– Crowdsourcing [4]
• We design and implement a game to evaluate the impact of asynchronism on
– Fairness
– Togetherness
– Annoyance
– QoE
[2] L. von Ahn, L. Dabbish, "Labeling images with a computer game," Proceedings of the SIGCHI Conf. on Human Factors in Computing Systems (CHI’04), pp. 319-326, 2004.
[3] E. D. Mekler, F. Bruhlmann, K. Opwis, A. N. Tuch, "Do points, levels and leaderboards harm intrinsic motivation?: An empirical analysis of common gamification elements," Proceedings of the First International Conference on Gameful Design, Research, and Applications (Gamification’13), pp. 66-73, 2013.
[4] T. Hossfeld, C. Keimel, M. Hirth, B. Gardlo, J. Habigt, K. Diepold, and P. Tran-Gia, "Best Practices for QoE Crowdtesting: QoE Assessment with Crowdsourcing," IEEE Transactions on Multimedia, vol. 16, no. 2, pp. 541-558, 2014.
6. Reaction Game for Subjective Quality Assessments
• Aligned to use case, synchronization
• Connected to video content, not a full game
• Crowdsourcable (simulated opponent)
• Game Idea: Collaborative reaction game
– Players have to react to game events
– Collaborative aspect: bonus score whenever both players click within a given time window
– Explicit user feedback (hit, miss, bonus)
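The collaborative scoring rule can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the point values, and the rule that the bonus is granted when both clicks are hits and lie within `bonus_window_ms` of each other are assumptions made for this example.

```python
from typing import Optional, Tuple

HIT_POINTS = 1    # hypothetical point value for a hit
BONUS_POINTS = 1  # hypothetical extra points for a joint hit


def is_hit(click_ms: Optional[float], event_start_ms: float, window_ms: float) -> bool:
    """A click counts as a hit if it falls inside the event's reaction window."""
    if click_ms is None:  # the player did not click at all
        return False
    return event_start_ms <= click_ms <= event_start_ms + window_ms


def score_event(
    click_a_ms: Optional[float],
    click_b_ms: Optional[float],
    event_start_ms: float,
    window_ms: float,
    bonus_window_ms: float,
) -> Tuple[str, int]:
    """Return (feedback, points) for player A on a single game event."""
    if not is_hit(click_a_ms, event_start_ms, window_ms):
        return "miss", 0
    both_hit = is_hit(click_b_ms, event_start_ms, window_ms)
    # Assumed bonus rule: both players hit and clicked close enough together.
    if both_hit and abs(click_a_ms - click_b_ms) <= bonus_window_ms:
        return "bonus", HIT_POINTS + BONUS_POINTS
    return "hit", HIT_POINTS
```

The three return labels mirror the explicit feedback named on the slide (hit, miss, bonus); in a crowdsourced run, `click_b_ms` would come from the simulated opponent.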
9. Evaluation Procedure
• Evaluation using the WESP [5] framework
• Structured in five phases
– Explain the experiment
– Gather demographic data
– Get participants used to the procedure
– Play a game round with subsequent evaluation for each test case
– Give feedback on the evaluation process
[5] B. Rainer, M. Waltl, C. Timmerer, "A Web based Subjective Evaluation Platform," Proceedings of the 5th International Workshop on Quality of Multimedia Experience (QoMEX’13), pp. 24–25, 2013.
10. Crowdsourcing
• Subjective quality assessment using crowdsourcing
– We used the Microworkers [6] crowdsourcing platform and paid 0.5 USD for each successful participation
– Duration about 15 minutes
– Simulated opponent
• Implicit Measures
– Number of browser focus changes
– Number of clicks
– Video playback length
– Score
– Number of pauses
– …
• Explicit Measures
– Fairness
– Togetherness
– Annoyance
– QoE
Slider with a continuous scale from 0 (very low) to 100 (very high) with initial position at 50 (medium)
[6] http://www.microworkers.com
11. Stimuli and Participants
• Videos: in-game footage of
– inFAMOUS: Second Son [7]
– Knack [8]
• Training phase
– inFAMOUS: Second Son 0 (00:54, 3 events)
• Main evaluation using three video sequences*
– inFAMOUS: Second Son 1 (01:46, 6 events)
– inFAMOUS: Second Son 2 (01:58, 8 events)
– Knack (01:50, 4 events)
• Video sequences pre-cached to avoid any bias caused by stalls
• Display of configurations in random order
Test Configuration | Asynchronism [ms] | Window length [ms] | Bonus window [ms]
Training | 0 | 2000 | 2000
Synchronous | 0 | 2000 | 2000
Small Async | 400 | 2000 | 1600
Medium Async | 750 | 2000 | 1250
Big Async | 1500 | 2000 | 500
[7] inFAMOUS: Second Son - Sucker Punch, http://infamous-second-son.com/
[8] Knack - SCE Japan Studio, http://www.playstation.com/en-us/games/knack-ps4/
* With a resolution of 720p, 29 fps, and approx. 2 Mbit/s
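The bonus-window values in the test configuration table are consistent with the overlap between two equally long reaction windows whose start times differ by the asynchronism, i.e., window length minus asynchronism. The slides do not state this formula, so the sketch below is an assumption checked only against the table rows; the function name is hypothetical.

```python
def bonus_window_ms(window_ms: int, asynchronism_ms: int) -> int:
    """Overlap of two windows of equal length, offset by the asynchronism."""
    # Player A's window is [0, window]; player B's window starts later
    # by the asynchronism and has the same length.
    overlap_start = max(0, asynchronism_ms)
    overlap_end = min(window_ms, asynchronism_ms + window_ms)
    return max(0, overlap_end - overlap_start)


# (asynchronism, window length, bonus window) rows copied from the table
TABLE = [(0, 2000, 2000), (400, 2000, 1600), (750, 2000, 1250), (1500, 2000, 500)]

for asynchronism, window, expected in TABLE:
    assert bonus_window_ms(window, asynchronism) == expected
```

Under this reading, the window in which both players can still earn the bonus shrinks as the asynchronism grows and would vanish once the asynchronism reaches the window length.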
12. Stimuli and Participants (cont’d)
• In total, 89 microworkers participated in the study
– The campaign was restricted to Europe, Northern America, Australia and New Zealand
• We screened 45 participants, by filtering them according to:
– Browser focus change (27)
– Total number of clicks < 1 (16)
– Number of clicks during any event < 1 (2)
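A screening step of this kind could be sketched as below. The record fields (`focus_changes`, `total_clicks`, `clicks_in_events`), the threshold of zero focus changes, and the order in which the criteria are applied are assumptions for illustration; the slides only name the three criteria and the number of participants each one affected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    """Hypothetical per-participant log; field names are not from the study."""
    focus_changes: int     # times the browser window lost focus
    total_clicks: int      # all clicks during the game rounds
    clicks_in_events: int  # clicks that fell inside any event window


def screening_reason(s: Session) -> Optional[str]:
    """Return the first criterion a session violates, or None if it passes."""
    if s.focus_changes > 0:  # assumed threshold: any focus change at all
        return "browser focus change"
    if s.total_clicks < 1:
        return "total number of clicks < 1"
    if s.clicks_in_events < 1:
        return "number of clicks during any event < 1"
    return None


# Made-up example sessions, only to exercise the filter.
sessions = [Session(0, 12, 9), Session(2, 10, 8), Session(0, 0, 0), Session(0, 5, 0)]
kept = [s for s in sessions if screening_reason(s) is None]

# The per-criterion counts reported on the slide add up to the screened total.
assert 27 + 16 + 2 == 45
```

Applying the criteria in a fixed order assigns each screened session to exactly one reason, which matches the way the slide's three counts sum to 45.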
13. Results: Togetherness & Annoyance
Significant difference in means between
• 0 ms and 750 ms (t = 1.68, p-value = 0.096, alpha = 0.1)
• 400 ms and 750 ms (t = 2.08, p-value = 0.040, alpha = 0.05)
Significant difference in means between
• 400 ms and 750 ms (t = -1.31, p-value = 0.049, alpha = 0.05)
14. Results: Fairness & QoE
Significant difference in means between
• 400 ms and 750 ms (t = 2.51, p-value = 0.014, alpha = 0.05)
• 400 ms and 1500 ms (t = 1.93, p-value = 0.057, alpha = 0.1)
• For the pairs of test cases (0 ms, 750 ms) and (0 ms, 1500 ms) the p-value is slightly above alpha = 0.1
Significant difference in means between
• 400 ms and 750 ms (t = 1.73, p-value = 0.087, alpha = 0.1)
• 400 ms and 1500 ms (t = 2.1, p-value = 0.039, alpha = 0.05)
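Each result on this slide is reported against one of two significance levels, 0.05 or 0.1. The helper below shows how a reported p-value maps to the strictest of these levels it satisfies. The function name and the dictionary labels are assumptions for illustration; the p-values are the ones listed on this slide, in the order they appear.

```python
from typing import Optional

ALPHAS = (0.05, 0.1)  # significance levels used on the slides, strictest first


def strictest_alpha(p_value: float) -> Optional[float]:
    """Return the smallest alpha with p_value < alpha, or None if not significant."""
    for alpha in ALPHAS:
        if p_value < alpha:
            return alpha
    return None


# p-values as reported on the Fairness & QoE slide
reported = {
    "first list, 400 ms vs 750 ms": 0.014,
    "first list, 400 ms vs 1500 ms": 0.057,
    "second list, 400 ms vs 750 ms": 0.087,
    "second list, 400 ms vs 1500 ms": 0.039,
}

for label, p in reported.items():
    print(f"{label}: p = {p}, significant at alpha = {strictest_alpha(p)}")
```

A p-value above 0.1, as mentioned for the pairs (0 ms, 750 ms) and (0 ms, 1500 ms), would yield `None` here.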
15. Results: Game Score
• Drop in score after 400 ms
• Same tendencies as in previous results
16. Conclusions
• Using a game to evaluate the impact of asynchronism on QoE, fairness, togetherness, and annoyance
• Our evaluation showed that there is significantly
– lower QoE
– lower fairness
– lower togetherness
– higher annoyance
above a threshold T (400 ms ≤ T ≤ 750 ms)
• Future work
– More precise threshold value
– Relationship between QoE and other variables (fairness, togetherness, annoyance)
One second is clearly not enough