QUALITY OF EXPERIENCE
Measuring Quality from the End-user Perspective
TEWI Kolloquium 20 Nov 2019
Dr. Raimund Schatz, AIT/AAU
About Me Raimund Schatz
Senior Scientist @ AIT
Austrian Institute of Technology
Recently joined ATHENA @ AAU
Dr. (Informatics)
MSc. (Telematics)
MBA (Creativity,
Innovation & Change)
MSc. (Int. Finance)
Research on QoE for more than 10 yrs,
Involved in >50 QoE user studies
Current Research:
• Data- & Diversity-Driven Experience Research
QoE, UX, Acceptance & Behavior
• Virtual & Augmented Reality
• Intelligent Experience Optimization
Agenda
§ Welcome & Introduction
§ What is QoE?
§ Origins & History
§ QoE Definition(s)
§ How to measure QoE?
§ Overview: Objective vs. Subjective
§ Conducting Subjective QoE Experiments
§ Conclusions/Outlook
PART 1: WHAT IS QOE?
AIT | 2019
Let's Warm Up a Bit …
How would you define "Quality"? How is the term being used?
What is Quality?
Quality – Is in Fact an Elephant!
The blind men and the elephant: Poem by John Godfrey Saxe
And so these men of Indostan
Disputed loud and long,
Each in his own opinion
Exceeding stiff and strong,
Though each was partly in the right,
And all were in the wrong!
So, oft in theologic wars
The disputants, I ween,
Rail on in utter ignorance
Of what each other mean,
And prate about an Elephant
Not one of them has seen!
What is Quality, anyway?
§ QD1: Quality as Qualitas
§ Essential nature, inherent characteristics, characteristic attribute
§ QD2: Quality as Excellence/Goodness
§ Quality as an expression for the intuitively evaluated
excellence/goodness
§ QD3: Quality as Standards
§ Quality is the totality of characteristics of an entity that bear on its ability
to satisfy stated or implied needs (ISO, 1995)
§ Quality is the ability of a set of inherent characteristics of a product,
system or process to fulfill requirements of customers and other
interested parties (ISO, 1999)
§ QD4: Quality as Event
§ Quality is not a static thing, it is the event at which awareness of subject
and object is made possible
M. & H. Martens. Multivariate Analysis of Quality. An Introduction. Wiley, 2001
QoS
Technology-centric:
throughput, delay, packet loss, etc.
QoE
User-centric:
what really matters to the
end-user: responsiveness,
interactivity, acceptability,
utility, satisfaction, etc.
QoE Origins: Need to bridge between user and technology
perspectives (around 2001)
QoE: Some Definition Attempts
§ QoE as a reloaded buzzword:
QoE has been defined as an extension of the traditional QoS in the sense
that QoE provides information regarding the delivered services from an
end-user point of view [Lopez et al. 2006]
§ QoE as a usability metric:
QoE is how a user perceives the usability of a service when in use – how
satisfied he/she is with a service in terms of, e.g., usability, accessibility,
retainability and integrity [Soldani 2006]
§ QoE as a hedonistic concept:
QoE describes the degree of delight of the user of a service, influenced
by content, network, device, application, user expectations and goals, and
context of use [Dagstuhl Seminar May 2009]
§ QoE as the ultimate answer to life, universe and everything:
Quality of Experience includes everything that really matters
[Kilkki@LinkedIn 2008]
[Figure: examples of QoE influencing factors]
§ User expectations regarding system performance
§ Device: Smartphone vs. Tablet vs. TV
§ Network: influence of speed, latencies, etc.
§ Application type: e.g. web browsing, IPTV
§ Context: Work vs. Entertainment
We need a proper QoE Definition! → Qualinet whitepaper (2012)
“… the degree of delight or annoyance of the user of an application or service …”
(Qualinet White Paper on Definitions of Quality of Experience, 2013)
“Quality of Experience (QoE) is the degree of delight or
annoyance of the user of an application or service. It results
from the fulfillment of his or her expectations with respect to
the utility and/or enjoyment of the application or service in the light
of the user’s personality and current state.”
§ Experience: An experience is an individual’s stream of perception and
interpretation of one or multiple events.
§ Quality Feature: A perceivable, recognized and namable characteristic of
the individual’s experience of a service which contributes to its quality.
§ Influencing Factors: In the context of communication services, QoE can be
influenced by factors such as service, content, network, device, application,
and context of use.
QoE Definition (Qualinet Whitepaper, 2012)
from Qualinet White Paper on Definitions of Quality of Experience
“Quality of Experience (QoE) is the degree of delight or annoyance of a
person whose experiencing involves an application, service, or system. It
results from the person’s evaluation of the fulfillment of his or her
expectations and needs with respect to the utility and/or enjoyment in
the light of the person’s context, personality and current state.”
§ Application: A software and/or hardware that enables usage and interaction
by a user for a given purpose. Such purpose may include entertainment or
information retrieval, or other.
§ Service: An episode in which an entity takes the responsibility that something
desirable happens on the behalf of another entity.
An Even Better Definition (QoE Book, 2013)
From Möller & Raake (Eds.) 2013, pp. 18-19
§ Fundamental relationships and data on quality perception
§ QoE as f(System, User state, Content, Context)
§ WQL Hypothesis, IQX Hypothesis, etc.
§ Guidelines for
§ Network planning and parametrization
§ Application, service or algorithm design
§ QoE Models and Metrics for
§ Predicting QoE based on technical measurements
§ QoE Measurement/Prediction Systems for
§ Monitoring and documenting health of system/network
based on user-centric KPIs (e.g. picture quality)
§ QoE-centric Net & App Management in order to
§ Ensure optimal end-user experience in economic ways
§ Distribute resources fairly among users
The Field: QoE Research & Applications
Analyze → Predict → Control
PART 2: HOW TO "MEASURE" QOE?
Common Quality Issues for Networked Multimedia
§ Web Browsing:
§ Long waiting time until anything happens
§ Slow page rendering
§ Unavailability of page/site
§ Bad site design, bad usability
§ ...
§ IPTV, Mobile TV:
§ Visual quality: blocking, blurring, freeze frames
§ Audio quality: noise, distortions
§ Audio/video out of sync
§ Stalling, rebuffering
§ Long startup time of service
§ Long zapping time
§ ...
→ Different services, different types of quality aspects/impairments
→ Quality impairments can have various causes (device, network, content, ...)
General Question: Can we directly "measure" experienced Quality (QoE)?
§ Answer: NO, not yet!
§ Why?
§ Elusive concept
§ No “objective” physiological / neural correlate
§ Mind-reading not possible (yet)
§ But: we can assess and estimate QoE (or parts/proxies of it)
to some extent …
How to assess or estimate QoE?
A) Subjective QoE Tests
§ Based on end-user involvement
§ Subjective measures: e.g. user opinion, ratings
§ Objective measures: e.g. task performance, behavior
B) "Objective" QoE Prediction/Estimation
§ "Metrics" based on analytical/statistical models
§ Translate input parameters to estimated QoE
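As a sketch of what such an "objective" metric can look like: the IQX hypothesis (referenced later in this talk; Fiedler et al. 2010) relates QoE exponentially to a QoS impairment. The coefficients below are illustrative placeholders, not fitted values:

```python
import math

def iqx_qoe(impairment, alpha=3.5, beta=0.15, gamma=1.0):
    """IQX hypothesis: QoE decays exponentially with a QoS impairment x,
    QoE = alpha * exp(-beta * x) + gamma.
    alpha, beta, gamma here are illustrative, not fitted coefficients."""
    return alpha * math.exp(-beta * impairment) + gamma

# Estimated MOS (1-5 scale) as packet loss grows:
for loss in [0, 2, 5, 10]:
    print(f"{loss:>2}% loss -> estimated MOS {iqx_qoe(loss):.2f}")
```

In practice such coefficients are obtained by fitting the curve to subjective test data, which is exactly what the experiments described next provide.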
[Diagram: (A) subjective testing: stimulus (test conditions, impairments) → user → response (subjective and objective measures); the resulting user quality perception data and insights feed QoE model development. (B) instrumental estimation: a model translates inputs from application log analysis and traffic analysis into estimated QoE.]
From (subjective) Experimental Data (A) to (objective) QoE Models/Metrics (B)
QoE's Core Business: Subjective User Experiments & Model Development
Overview: QoE Assessment Approaches
§ Subjective
§ Controlled Experiments (Lab)
§ Crowdsourcing
§ Field / Real Service
§ Objective
§ Signal-based: FR, RR, NR
§ Packet-level
§ Parametric
Subjective QoE Assessment
Key Question: How to assess QoE at maximum validity?
Answer: Subjective QoE Testing with Human Participants
Involves a delicate mix of choices:
§ Context: Lab, Field or Web (Crowdsourcing)?
§ Technical Setup?
§ Test Content and Procedure?
§ Data gathering: Qualitative vs. Quantitative?
Data collection methods?
§ Analytic or Utilitarian?
The Process
Planning → Setup → Pilot & Refinement → Execution → Results Analysis → Reporting
PLANNING & SETUP
Rule #1: Know your Purpose!
§ Every study is done for a purpose …
→ Know your purpose and clearly define the problem that you want to address accordingly!
§ (Very) different purposes:
§ Building a model/metric
• Example: image QoE as f(settings_of_codecXY)
§ Answering a question / testing a hypothesis
• Example: does QoE as f(latency) for web browsing differ by age?
§ Evaluating metric(s)
• Example: how well does the new VMAF metric reflect QoE?
§ Identifying experience dimensions/quality features
• Example: which are the main experience dimensions governing mobile AR?
§ Validating a new QoE assessment methodology
§ …
Test Methodology & Design
§ Variables
§ Which ones to manipulate, control, observe or ignore?
→ Avoid unintended/unnoticed influences from uncontrolled factors on results!
§ Subjects
§ Naïve or expert? N = ?
§ Instructions
§ Which questions to ask subjects and how
§ Training?
§ Presentation
§ Single or double stimulus, sequential or simultaneous?
§ Grading scales
§ How many items? Direct, indirect?
§ Numerical, Categorical? MOS?
→ Methodologies draw from several disciplines: HCI, UX, quality assessment,
psychology, sociology, experimental design theory, etc.
→ Make good use of this existing body of knowledge!
Recommended Reading
§ http://www.doesinc.com/knowledge.htm
§ http://www.statsoft.com/textbook/experimental-design/
§ ITU-T P.910: Subjective video quality assessment methods for multimedia
applications
§ ITU-R BT.500: Methodology for the subjective assessment of the quality of
television pictures
§ https://www.its.bldrdoc.gov/vqeg/vqeg-home.aspx
§ Book: Mitchell & Jolley, "Research Design Explained"
§ Ritter, F. E., Kim, J. W., Morgan, J. H., & Carlson, R. A. (2012). How to run
experiments: A practical guide to research with human participants.
Thousand Oaks, CA: Sage.
www.frankritter.com/rbs/rbs-handout-cogsci.pdf
Example: Subjective Image Quality Testing
§ Given: Source Image, System that impairs image (compression,
transmission errors, etc.)
§ Question: What is the impact of the system on experienced image
quality?
A typical lab assessment involves …
• 15 to 30 participants
• 1-2 hours per participant
• Informed consent / GDPR (signed)
• Instructions about tasks
• Some pre/post questionnaires (demographics, ratings, feedback, etc.)
• Several technical conditions and original media to evaluate
Test Content
§ Has considerable impact on quality perception
§ Content choice depends on study goals, e.g.
§ Typical content
§ Challenging / worst-case content
§ BUT: content choice also influences rating behavior!
• Likeability, emotions
Which User Data / QoE Correlates can we collect?
§ Subjective Opinion / Assessment
§ Quantitative: Ratings
§ Qualitative: Interviews, Thinking aloud
§ Behavioral Measurements
§ Behavior logs
§ Observational Coding
§ Behavioral performance (completion time, response time, error rates)
§ Physiological Measurements
§ Heart rate, Skin Response
§ Muscular activity, Eye activity
§ Brain activity (EEG, MRI)
THE MAIN VEHICLE
How to Obtain User Ratings: Scaling Methods
Scaling
§ Direct
§ Single stimulus: ACR
§ Double stimulus: DCR
§ Continuous: SAMVIQ & MUSHRA
§ Indirect
§ Ranking
§ Paired Comparison
Key Measure: MOS
§ Mean Opinion Score
§ Widely used in many fields:
§ Politics/Elections
§ Marketing/Advertisement
§ Food industry
§ Multimedia
§ MOS = The likely level of satisfaction with a service or product as appreciated
by an average user
§ Example question: “How would you rate the visual quality of this image?”
§ Challenge: test design that generates valid, objective, reliable (and thus
reproducible) results
§ Implementation is more complex and difficult than it seems a priori
(WYAIWYG problem: what you ask is what you get)
MOS | Quality   | Impairment
5   | Excellent | Imperceptible
4   | Good      | Perceptible
3   | Fair      | Slightly annoying
2   | Poor      | Annoying
1   | Bad       | Very annoying
Direct Scaling: ACR (Absolute Category Rating)
§ Discrete
§ Single stimulus
§ Multiple dimensions addressable
§ Usually 5-point scale, but can also be
7-, 9-, or 11-point
5 Excellent
4 Good
3 Fair
2 Poor
1 Bad
ACR
Stimulus A Stimulus B Stimulus C
Direct Scaling: DCR (Degradation Category Rating)
§ Discrete
§ Direct Comparison → Relative
§ Reference vs. processed sample
§ Highly sensitive
5: degradation is not perceivable
4: degradation is perceivable but not annoying
3: degradation is slightly annoying
2: degradation is annoying
1: degradation is very annoying
DCR
Ref A Stimulus A Ref B Stimulus B
Scaling: Continuous
§ Continuous / Sg or Dbl Stimulus
§ For assessing transient quality
artifacts in longer (media)
samples (videos, etc.)
Continuous
Exercise: Which Direct Scaling to Use?
§ Assessment Tasks
1) Impact of (constant) noise on QoE (image)
2) Impact of infrequent bursts of packet loss on QoE (video)
3) Added value of 4k/UHD vs. HD resolution (video)
§ Rating Methods / Scaling
1. ACR
2. DCR
3. Continuous
Scales: How to Map Opinions to Numbers?
§ Not all features that an entity has can be described by numbers
§ Example: a person
• Weight and height are numeric variables (more precisely
ratio variables)
• Education and socio-economic class are ordinal variables
• Sex and religion are nominal variables
§ This has direct consequences on
§ Design & usability of the rating scale
§ The kind of statistical analysis we can perform on the results!
Scale level      | Probabilities (mode) | Percentiles (median) | Any statistic (mean)
interval / ratio | x                    | x                    | x
ordinal          | x                    | x                    |
nominal          | x                    |                      |
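The scale-level rules above can be encoded as a small guard in an analysis script, so that nobody accidentally averages nominal codes. A minimal sketch; the helper name and dictionary are my own, not from the talk:

```python
# Permissible summary statistics per scale level (Stevens' typology);
# the mapping mirrors the table above.
PERMISSIBLE = {
    "nominal":  {"mode"},
    "ordinal":  {"mode", "median"},
    "interval": {"mode", "median", "mean"},
    "ratio":    {"mode", "median", "mean"},
}

def check_statistic(scale, statistic):
    """Raise if a statistic is not meaningful on the given scale level."""
    if statistic not in PERMISSIBLE[scale]:
        raise ValueError(f"'{statistic}' is not meaningful on a {scale} scale")
    return True

check_statistic("ordinal", "median")   # fine: medians are defined for ordinal data
# check_statistic("nominal", "mean")   # would raise ValueError
```

Building such a check into the analysis pipeline catches scale-level mistakes before they reach the report.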
Consequence: Another essential Rule …
§ ANTICIPATE, ANTICIPATE, ANTICIPATE!
§ Beyond clearly defining the problem …
§ … prepare analysis & reporting in advance!
§ Then use the prepared analysis for monitoring your results data
during piloting and test execution!
→ Risk management is key!
Some Scale Designs (ACR)
Some Scale Designs (ACR) ctd.
§ Ideal for Crowdsourcing (Gardlo, Egger & Hossfeld 2015)
Gardlo, B., Egger, S., & Hoßfeld, T. (2015). Do Scale-Design and
Training Matter for Video QoE Assessments through
Crowdsourcing? CrowdMM@ACM Multimedia.
Setup: Test Environment
§ Measurements have to be valid, objective & reliable
§ Subjective testing methodologies
§ High requirements on testing environment
§ Many influencing/confounding factors, e.g.
§ Type, performance and quality of devices (monitor, speakers,
etc.)
§ Light and acoustic conditions
§ Ambience, interior architecture
§ Watching distance and angle
Planning: Time & Test Conditions
§ Subjects' time & energy = a scarce resource
§ Max. 90 min of net testing time, requires a break of 5-10 mins
§ Time slots need to be 2h (min), better 2.5h
§ #conditions = net testing time / condition duration
§ For QoE lab studies, one typically goes for a within-subjects design
§ This severely limits what one test can actually cover!
§ Tricks: use between-subjects designs, latin squares, etc.
§ Further hints:
§ Use anchors: very good + very bad quality conditions, possibly with training
§ Do not forget to randomize the sequence of conditions!
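Randomizing the condition sequence per subject is easy to automate. A minimal sketch; seeding with the subject ID (a choice of mine, not prescribed here) keeps each order reproducible for the test log:

```python
import random

def condition_order(conditions, seed):
    """Per-subject randomized presentation order (hypothetical helper).
    Seeding with the subject ID makes each order reproducible for the log."""
    rng = random.Random(seed)
    order = list(conditions)   # copy so the master list stays untouched
    rng.shuffle(order)
    return order

# Anchor conditions (very good / very bad) are shuffled along with the rest:
conditions = ["anchor_high", "anchor_low", "c_2mbps", "c_8mbps", "c_16mbps"]
for subject_id in range(3):
    print(subject_id, condition_order(conditions, seed=subject_id))
```

For stricter balancing of order effects, a latin-square assignment would replace the plain shuffle.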
Number of Subjects - The Dreaded "N"!
§ Huge trade-off: Time & Money vs. Statistical Power / Reliability
§ Higher N → smaller p & smaller confidence intervals
§ BUT: diminishing returns
§ Recommendations for N:
§ ITU-T: 15
§ VQEG: 24
→ So … which N is truly "sufficient"?
§ Extensive analysis by Brunnström & Barkowsky (2018)
CI_j = Z · σ_j / √N
Analysis by Brunnström & Barkowsky (2018)
§ Traditional recommendations too optimistic, particularly when you have to correct
for multiple comparisons!
§ BTW: remember the funnel: #invited subjects > #tested subjects > #valid subjects
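The confidence-interval relation CI = Z · σ / √N can be inverted to estimate which N a target precision requires. A minimal sketch; the example σ of 0.8 MOS points is an assumed typical value, not a figure from the analysis above:

```python
import math

def required_n(sigma, half_width, z=1.96):
    """Smallest N for which the CI half-width Z * sigma / sqrt(N)
    stays within the target half_width."""
    return math.ceil((z * sigma / half_width) ** 2)

# sigma = 0.8 MOS points (assumed), target half-width = 0.3 MOS points:
print(required_n(0.8, 0.3))   # 28 subjects
```

Note how quickly N grows as the target precision tightens, which is exactly the diminishing-returns trade-off above, and why corrections for multiple comparisons push the required N up further.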
Sample Test Plan
PILOT & REFINEMENT & EXECUTION
§ Subjective tests are like a live performance
§ When the audience arrives and the show starts, everything has to be in place
and work smoothly!
§ Remember: each subject costs time & money, so don't waste them!
§ You have to rehearse = test & pilot in advance
§ Use subjects of increasing "expensiveness"
• You, colleagues, (friendly) users from the target audience
• Let them give feedback afterwards on the meta-level/process
§ There are always issues with the usability of your setup and your instructions
→ If you don't find any problems with the test setup, then you have not piloted it well enough!
§ Reserve enough time for refining the test setup & design!
Piloting & Refinement
§ Treat your subjects well – for them participation should be a positive
experience in itself!
§ Monitor your data, your users and all kinds of events – you need to be able
to debug your test
§ Keep a test assistant's log
§ Automate, automate, automate!
Execution
§ By TU Ilmenau, AVT Group
§ https://github.com/Telecommunication-Telemedia-Assessment/avrateNG
Rating Tools/Systems #1: avrateNG
Setting up a new user study can be tedious …
TheFragebogen.de: a software framework for user studies made simple.
Used in several QoE studies: audio/video, web, 2nd screen and crowdsourcing.
Features: open source, cross-platform, multi-device, multimedia, graphical scales, privacy-friendly, behavioral data, ready for crowdsourcing.
Dennis Guse | Henrique R. Orefice | Gabriel Reimers | Oliver Hohlfeld
DEMO
ANALYSIS & REPORTING
[Figure: rating data organized as a cube of Subjects × Conditions × Items]
MOS Data Analysis and Reporting
• Mean Opinion Scores (MOS) and confidence intervals
MOS_j = (1/N) · Σ_{i=1..N} m_ij

m_ij = score by subject i for test condition j.
N = number of subjects after outlier removal.
But: this is not enough …
MOS Data Analysis and Reporting, ctd.
BUT: MOS by itself only reports an average opinion, and thus
HIDES a lot of information …
[Figure: individual opinions ranging from Bad to Excellent average out to the same MOS of Fair = 3]
MOS Data Analysis and Reporting, ctd.
→ don't forget to analyze and report user opinion diversity using confidence intervals (bare minimum), histograms, CDFs, etc.!
MOS Data Analysis and Reporting
• Mean Opinion Scores (MOS) and confidence intervals
MOS_j = (1/N) · Σ_{i=1..N} m_ij
CI_j = Z · σ_j / √N

m_ij = score by subject i for test condition j.
N = number of subjects after outlier removal.
Z = z-value for required confidence level (1.96 for 95%).
σ_j = standard deviation of the scores across subjects for test condition j.
But: MOS + confidence intervals
can be considered only as
the bare minimum …
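The MOS and confidence-interval formulas above translate directly into a few lines of code. A minimal sketch with made-up ACR ratings:

```python
import math
import statistics

def mos_and_ci(scores, z=1.96):
    """MOS and confidence-interval half-width for one test condition
    (scores holds one rating per subject, outliers already removed)."""
    n = len(scores)
    mos = sum(scores) / n               # MOS_j = (1/N) * sum_i m_ij
    sigma = statistics.stdev(scores)    # std dev across subjects
    ci = z * sigma / math.sqrt(n)       # CI_j = Z * sigma_j / sqrt(N)
    return mos, ci

# Made-up ACR ratings (1-5) from 8 subjects for one condition:
mos, ci = mos_and_ci([4, 5, 3, 4, 4, 2, 5, 4])
print(f"MOS = {mos:.2f} ± {ci:.2f}")
```

Running this per condition gives the MOS curve with error bars that the reporting guidelines below treat as the bare minimum.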
Remember: ALWAYS be Cautious with Summary
Statistics!
F. J. Anscombe: Graphs in Statistical Analysis. In: American Statistician, vol. 27, no. 1, 1973, pp. 17–21.
More Extreme: The Datasaurus Dozen
https://www.autodeskresearch.com/publications/samestats
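Anscombe's point is easy to reproduce: the first two quartet datasets share virtually identical summary statistics while looking completely different when plotted. The data values below are the published ones from Anscombe (1973):

```python
import statistics

# y values of datasets I and II of Anscombe's quartet (Anscombe, 1973).
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

for name, y in [("I", y1), ("II", y2)]:
    print(f"set {name}: mean={statistics.mean(y):.2f}, "
          f"var={statistics.variance(y):.2f}")
# Both sets report virtually identical mean and variance, yet plotted against
# their common x values, set I is roughly linear while set II follows a clear
# curve: plot before you summarize!
```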
Reporting (as recommended by ITU-T P.910)
Further Reporting Examples …
Schatz et al. 2018
Further Reporting Examples (ctd)
Schatz et al. 2018
How to navigate the seven seas of statistical analysis?
[Figure: mapping input factors (QoS, encoder settings, etc.) to ratings etc. via statistical analysis]
How to navigate the seven seas of statistical analysis? (ctd)
Source: J. Wobbrock http://depts.washington.edu/acelab/proj/Rstats/index.html
Also check out this course: "Designing, Running and Analyzing Experiments" (Coursera)
§ http://www.panelcheck.com/
Tools from Food Sciences can make your life a bit easier …
Hint: check out the work (papers, R packages, tools) of P.B. Brockhoff (DTU)!
§ Think of Reproducible Science!
1. Publish the Dataset
§ Ideally: Content + Rating Data
2. When sharing: use standardized data formats
§ Example: suJSON
§ Universal format for exchange
§ Format specification
§ Tools
What else beyond reporting?
§ Emerging technologies: XR, VR video, point-cloud, mulsemedia, …
§ New targets → more work = good :)
→ Many more DOF → challenge!
§ QoE in the context of interactive realtime systems and applications
§ Web, Gaming, etc.
→ Confluence/overlap with UX
§ Rise of big data analytics and data-driven techniques
§ Availability of more (diverse) sources of data
→ QoE substitution by e.g. behavior
→ Testing driven by AI/ML – Active Learning
Outlook: Trends & Challenges
Thank you for your attention!
Any questions?
QoE-related Books
§ Quality Engineering (Möller, 2011)
§ Very good introductory book on quality engineering
§ Focus on audio/voice and video quality
§ Focus on communications context
§ Quality of Experience (Möller & Raake eds., 2013)
§ Covers QoE extensively
§ Great overview of the current state of the art
§ Addresses fundamentals (like assessment methods) and
applications (like video-conferencing)
Related Philosophy: Pirsig (1974)
§ Zen and the Art of Motorcycle Maintenance
§ possibly the most widely read philosophy book of the
20th century
§ Initially rejected by more than 121 publishers
§ Three interweaved narratives:
§ a motorcycle trip across America,
§ the reconciliation of the narrator with his son and former
"insane" self, Phaedrus, and
§ a number of philosophical discussions concerning the quality of
contemporary Western life.
§ Foundation of Metaphysics of Quality (MoQ)
§ What defines good writing?
§ And what in general defines “good” or "quality“?
Further Reading, ctd.
§ European Network on Quality of Experience in Multimedia Systems and
Services (QUALINET), “Definitions of Quality of Experience (QoE) and
related concepts,” White Paper, 2012.
§ Recommendation ITU-T P.10/G.100 (2006) - Amendment 2 (07/08),
Vocabulary for performance and quality of service - New definitions for
inclusion in Recommendation ITU-T P.10/G.100.
§ Recommendation ITU-T E.800 (2008), Terms and definitions related to
quality of service and network performance including dependability.
§ Recommendation ITU-T P.800 (1998), Methods for subjective determination
of transmission quality.
§ Recommendation ITU-T G.1011 (2010), Reference guide to quality of
experience assessment methodologies.
§ Recommendation ITU-R BT.500-13, Methodology for the subjective
assessment of the quality of television pictures.
Further Reading: Crowdsourcing
§ T. Hoßfeld, C. Keimel, M. Hirth, B. Gardlo, J. Habigt, K. Diepold, and P.
Tran-Gia, “Best practices for QoE crowdtesting: QoE assessment with
crowdsourcing,” IEEE Trans. Multimed., vol. 16, no. 2, pp. 541–558, Feb.
2014.
§ Egger-Lampl, Sebastian, Judith Redi, Tobias Hoßfeld, Matthias Hirth,
Sebastian Möller, Babak Naderi, Christian Keimel, and Dietmar Saupe.
"Crowdsourcing Quality of Experience Experiments". In "Evaluation in the
Crowd. Crowdsourcing and Human-Centered Experiments", Daniel
Archambault, Helen Purchase, and Tobias Hoßfeld (Eds.), Springer
International Publishing, 2017.
§ Gadiraju, Ujwal, Sebastian Möller, Martin Nöllenburg, Dietmar Saupe,
Sebastian Egger-Lampl, Daniel Archambault, and Brian Fisher.
"Crowdsourcing Versus the Laboratory: Towards Human-Centered
Experiments Using the Crowd". In "Evaluation in the Crowd. Crowdsourcing
and Human-Centered Experiments", Daniel Archambault, Helen Purchase,
and Tobias Hoßfeld (Eds.), Springer International Publishing, 2017.
Further Reading, ctd.
Balachandran, Athula, Vyas Sekar, Aditya Akella, Srinivasan Seshan, Ion Stoica, and
Hui Zhang. 2012. “A Quest for an Internet Video Quality-of-Experience Metric.” In
Proceedings of the 11th ACM Workshop on Hot Topics in Networks, 97–102.
ACM.
Balachandran, Athula, Vyas Sekar, Aditya Akella, Srinivasan Seshan, Ion Stoica, and
Hui Zhang. 2013. “Developing a Predictive Model of Quality of Experience for
Internet Video.” In Proceedings of the ACM SIGCOMM 2013 Conference on
SIGCOMM, 339–350.
Dobrian, Florin, Vyas Sekar, Asad Awan, Ion Stoica, Dilip Joseph, Aditya Ganjam,
Jibin Zhan, and Hui Zhang. 2011. “Understanding the Impact of Video Quality on
User Engagement.” In Proceedings of the ACM SIGCOMM 2011 Conference,
362–373. SIGCOMM ’11. New York, NY, USA: ACM.
Krasula, Lukáš, and Patrick Le Callet. "Chapter 4: Emerging Science of QoE in
Multimedia Applications", n.d., p. 35.
Further Reading, ctd.
Akamai. June 2006: Retail Web Site Performance: Consumer Reaction to a Poor Online
Shopping Experience. Akamai Technologies, http://www.akamai.com (accessed February 10,
2008).
Allan, L. G.: The perception of time, Perception Psychophysics, vol. 26, no. 5, pp. 340–354,
1979.
Bouch, A., Kuchinsky, A., Bhatti, N.: Quality is in the eye of the beholder: meeting users’
requirements for Internet quality of service. Proceedings of the SIGCHI conference on
Human factors in computing systems, pp. 297–304 (2000).
Egger, S., Reichl, P., Hossfeld, T., Schatz, R.: “Time is Bandwidth”? Narrowing the Gap between
Subjective Time Perception and Quality of Experience. Proceedings of the 2012 IEEE
International Conference on Communications.
Fiedler, M., Hossfeld, T., and Tran-Gia, P.: A generic quantitative relationship between quality of
experience and quality of service. Netwrk. Mag. of Global Internetwkg., vol. 24, pp. 36–41,
March 2010.
ITU-T Recommendations: G.1030, P.800, P.805, P.880, P.910, BT.500
Mitchell, M.L., Jolley, J.M.: Research Design Explained. Cengage Learning (2009).
Möller, S.: Quality Engineering: Qualität kommunikationstechnischer Systeme. Springer,
Heidelberg (2010).
Further Reading, ctd.
Strohmeier, Dominik. "Open profiling of quality: a mixed methods research approach
for audiovisual quality evaluations". Dissertation 4, no. 4 (2011): 5–6.
Sackl, Andreas. “Investigations on the Role of Expectations and Individual Decisions
in Quality Perception”. Dissertation, 2016
Schatz, R., Egger, S., Platzer, A.: Poor, Good Enough or Even Better? Bridging the
Gap between Acceptability and QoE of Mobile Broadband Data Services.
Proceedings of the 2011 IEEE International Conference on Communications, pp. 1–6 (2011).
Schatz, Hossfeld, Janowski, and Egger. From Packets to People: Quality of
Experience as New Measurement Challenge. In: Data Traffic Monitoring and
Analysis. Springer LNCS, 2013.
Schatz, Fiedler, and Skorin-Kapov. QoE-based Network and Application Management.
In: Quality of Experience: Advanced Concepts, Applications and Methods.
Springer LNCS, 2014.
Seow, S.C.: Designing and Engineering Time: The Psychology of Time Perception in
Software. Addison-Wesley Professional (2008).
The Digital Transformation of Education: A Hyper-Disruptive Era through Block...
 
A Game of Chess is Like a Swordfight.pdf
A Game of Chess is Like a Swordfight.pdfA Game of Chess is Like a Swordfight.pdf
A Game of Chess is Like a Swordfight.pdf
 
From Mind to Meta.pdf
From Mind to Meta.pdfFrom Mind to Meta.pdf
From Mind to Meta.pdf
 
Miniatures Design for Tabletop Games.pdf
Miniatures Design for Tabletop Games.pdfMiniatures Design for Tabletop Games.pdf
Miniatures Design for Tabletop Games.pdf
 
Distributed Systems in the Post-Moore Era.pptx
Distributed Systems in the Post-Moore Era.pptxDistributed Systems in the Post-Moore Era.pptx
Distributed Systems in the Post-Moore Era.pptx
 
Don't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptxDon't Treat the Symptom, Find the Cause!.pptx
Don't Treat the Symptom, Find the Cause!.pptx
 
Engineering Serverless Workflow Applications in Federated FaaS.pdf
Engineering Serverless Workflow Applications in Federated FaaS.pdfEngineering Serverless Workflow Applications in Federated FaaS.pdf
Engineering Serverless Workflow Applications in Federated FaaS.pdf
 
The Role of Machine Learning in Fluid Network Control and Data Planes.pdf
The Role of Machine Learning in Fluid Network Control and Data Planes.pdfThe Role of Machine Learning in Fluid Network Control and Data Planes.pdf
The Role of Machine Learning in Fluid Network Control and Data Planes.pdf
 
Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...
Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...
Nonequilibrium Network Dynamics_Inference, Fluctuation-Respones & Tipping Poi...
 
Towards a data driven identification of teaching patterns.pdf
Towards a data driven identification of teaching patterns.pdfTowards a data driven identification of teaching patterns.pdf
Towards a data driven identification of teaching patterns.pdf
 
Förderverein Technische Fakultät.pptx
Förderverein Technische Fakultät.pptxFörderverein Technische Fakultät.pptx
Förderverein Technische Fakultät.pptx
 
The Computing Continuum.pdf
The Computing Continuum.pdfThe Computing Continuum.pdf
The Computing Continuum.pdf
 
East-west oriented photovoltaic power systems: model, benefits and technical ...
East-west oriented photovoltaic power systems: model, benefits and technical ...East-west oriented photovoltaic power systems: model, benefits and technical ...
East-west oriented photovoltaic power systems: model, benefits and technical ...
 
Machine Learning in Finance via Randomization
Machine Learning in Finance via RandomizationMachine Learning in Finance via Randomization
Machine Learning in Finance via Randomization
 
IT does not stop
IT does not stopIT does not stop
IT does not stop
 
Advances in Visual Quality Restoration with Generative Adversarial Networks
Advances in Visual Quality Restoration with Generative Adversarial NetworksAdvances in Visual Quality Restoration with Generative Adversarial Networks
Advances in Visual Quality Restoration with Generative Adversarial Networks
 
Recent Trends in Personalization at Netflix
Recent Trends in Personalization at NetflixRecent Trends in Personalization at Netflix
Recent Trends in Personalization at Netflix
 
Industriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdf
Industriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdfIndustriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdf
Industriepraktikum_ Unterstützung bei Projekten in der Automatisierung.pdf
 
Introduction to 5G from radio perspective
Introduction to 5G from radio perspectiveIntroduction to 5G from radio perspective
Introduction to 5G from radio perspective
 

Recently uploaded

Recently uploaded (20)

Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 

Quality of Experience: Measuring Quality from the End-User Perspective

  • 1. QUALITY OF EXPERIENCE Measuring Quality from the End-user Perspective TEWI Kolloquium 20 Nov 2019 Dr. Raimund Schatz, AIT/AAU
  • 2. About Me Raimund Schatz Senior Scientist @ AIT Austrian Institute of Technology Recently joined ATHENA @ AAU Dr. (Informatics) MSc. (Telematics) MBA (Creativity, Innovation & Change) MSc. (Int. Finance) Research on QoE for more than 10 years, involved in >50 QoE user studies Current Research: • Data- & Diversity-Driven Experience Research QoE, UX, Acceptance & Behavior • Virtual & Augmented Reality • Intelligent Experience Optimization
  • 3. Agenda § Welcome & Introduction § What is QoE? § Origins & History § QoE Definition(s) § How to measure QoE? § Overview: Objective vs. Subjective § Conducting Subjective QoE Experiments § Conclusions/Outlook 3
  • 4. PART 1: WHAT IS QOE? 4 AIT | 2019
  • 5. Let's Warm Up a Bit … How would you define “Quality”? How is the term being used? What is Quality?
  • 6. Quality – Is in Fact an Elephant! The blind men and the elephant: Poem by John Godfrey Saxe And so these men of Indostan Disputed loud and long, Each in his own opinion Exceeding stiff and strong, Though each was partly in the right, And all were in the wrong! So, oft in theologic wars The disputants, I ween, Rail on in utter ignorance Of what each other mean, And prate about an Elephant Not one of them has seen!
  • 7. What is Quality, anyway? § QD1: Quality as Qualitas § Essential nature, inherent characteristics, characteristic attribute § QD2: Quality as Excellence/Goodness § Quality as an expression for the intuitively evaluated excellence/goodness § QD3: Quality as Standards § Quality is the totality of characteristics of an entity that bear on its ability to satisfy stated or implied needs (ISO, 1995) § Quality is the ability of a set of inherent characteristics of a product, system or process to fulfill requirements of customers and other interested parties (ISO, 1999) § QD4: Quality as Event § Quality is not a static thing, it is the event at which awareness of subject and object is made possible M. & H. Martens. Multivariate Analysis of Quality. An Introduction. Wiley, 2001
  • 8. QoS Technology-centric: throughput, delay, packet loss, etc. QoE User-centric: what really matters to the end-user: responsiveness, interactivity, acceptability, utility, satisfaction, etc. QoE Origins: Need to bridge between user and technology perspectives (around 2001)
  • 9. QoE: Some Definition Attempts § QoE as a reloaded buzzword: QoE has been defined as an extension of the traditional QoS in the sense that QoE provides information regarding the delivered services from an end-user point of view [Lopez et al. 2006] § QoE as a usability metric: QoE is how a user perceives the usability of a service when in use – how satisfied he/she is with a service in terms of, e.g., usability, accessibility, retainability and integrity [Soldani 2006] § QoE as a hedonistic concept: QoE describes the degree of delight of the user of a service, influenced by content, network, device, application, user expectations and goals, and context of use [Dagstuhl Seminar May 2009] § QoE as the ultimate answer to life, universe and everything: Quality of Experience includes everything that really matters [Kilkki@LinkedIn 2008]
  • 10. User Expectations regarding system performance Smartphone vs. Tablet vs. TV Influence of network speed, latencies, etc. Application type e.g. web browsing, IPTV We need a proper QoE Definition! à Qualinet whitepaper (2012) Work vs. Entertainment “… the degree of delight or annoyance of the user of an application or service …” (Qualinet White Paper on Definitions of Quality of Experience, 2013)
  • 11. “Quality of Experience (QoE) is the degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and/or enjoyment of the application or service in the light of the user’s personality and current state.” § Experience: An experience is an individual’s stream of perception and interpretation of one or multiple events. § Quality Feature: A perceivable, recognized and namable characteristic of the individual’s experience of a service which contributes to its quality. § Influencing Factors: In the context of communication services, QoE can be influenced by factors such as service, content, network, device, application, and context of use. QoE Definition (Qualinet Whitepaper, 2012) from Qualinet White Paper on Definitions of Quality of Experience
  • 12. “Quality of Experience (QoE) is the degree of delight or annoyance of a person whose experiencing involves an application, service, or system. It results from the person’s evaluation of the fulfillment of his or her expectations and needs with respect to the utility and/or enjoyment in the light of the person’s context, personality and current state.” § Application: A software and/or hardware that enables usage and interaction by a user for a given purpose. Such purpose may include entertainment or information retrieval, or other. § Service: An episode in which an entity takes the responsibility that something desirable happens on the behalf of another entity. An Even Better Definition (QoE Book, 2013) 12 From Möller & Raake (Eds) 2013, pp 18-19
  • 13. § Fundamental relationships and data on quality perception § QoE as f(System, User state, Content, Context) § WQL Hypothesis, IQX Hypothesis, etc. § Guidelines for § Network planning and parametrization § Application, service or algorithm design § QoE Models and Metrics for § Predicting QoE based on technical measurements § QoE Measurement/Prediction Systems for § Monitoring and documenting health of system/network based on user-centric KPIs (e.g. picture quality) § QoE-centric Net & App Management in order to § Ensure optimal end-user experience in economic ways § Distribute resources fairly among users The Field: QoE Research & Applications – Analyze, Predict, Control
  • 14. PART 2: HOW TO “MEASURE” QOE? 14 AIT | 2019
  • 15. Common Quality Issues for Networked Multimedia § Web Browsing: § Long waiting time until anything happens § Slow page rendering § Unavailability of page/site § Bad site design, bad usability § ... § IPTV, Mobile TV: § Visual quality: blocking, blurring, freeze frames § Audio quality: noise, distortions § Audio/video out of sync § Stalling, rebuffering § Long startup time of service § Long zapping time § ... → Different services, different types of quality aspects/impairments → Quality impairments can have various causes (device, network, content, ...)
  • 16. General Question: Can we directly “measure” experienced Quality (QoE)? § Answer: NO, not yet! § Why? § Elusive concept § No “objective” physiological / neural correlate § Mind-reading not possible (yet) § But: we can assess and estimate QoE (or parts/proxies of it) to some extent …
  • 17. How to assess or estimate QoE? A) Subjective QoE Tests § Based on end-user involvement § Subjective measures: e.g. user opinion, ratings § Objective measures: e.g. task performance, behavior (perception-based: stimulus → response under test conditions/impairments) B) “Objective” QoE Prediction/Estimation § “Metrics” based on analytical/statistical models § Translate input parameters to estimated QoE (instrumental: input → model → output)
  • 18. User App Net Application Log Analysis Traffic Analysis Subjective Testing Method User Quality Perception Data Insights QoE Models From (subjective) Experimental Data (A) to (objective) QoE Models/Metrics (B) QoE's Core Business: Subjective User Experiments & Model Development Context Factors
  • 19. Overview: QoE Assessment Approaches § Subjective: Controlled Experiments (Lab), Crowdsourcing, Field / Real Service § Objective: Signal-based (FR, RR, NR), Packet-level, Parametric 19
  • 20. Subjective QoE Assessment Key Question: How to assess QoE at maximum validity? Answer: Subjective QoE Testing with Human Participants Involves a delicate mix of choices: § Context: Lab, Field or Web (Crowdsourcing)? § Technical Setup? § Test Content and Procedure? § Data gathering: Qualitative vs. Quantitative? Data collection Methods? § Analytic or Utilitarian?
  • 23. Rule #1: Know your Purpose! § Every study is done for a purpose … → know your purpose and clearly define the problem that you want to address accordingly! § (Very) Different purposes: § Building a model/metric • Example: image QoE as f(settings_of_codecXY) § Answering a question / test a hypothesis • Example: does QoE as f(latency) for web browsing differ by age? § Evaluating metric(s) • Example: how well does the new VMAF metric reflect QoE? § Identify experience dimensions/quality features • Example: which are the main experience dimensions governing mobile AR? § Validate a new QoE assessment methodology …. 23
  • 24. Test Methodology & Design § Variables § Which ones to manipulate, control, observe or ignore? → Avoid unintended/unnoticed influences from uncontrolled factors on results! § Subjects § Naïve or Expert?, N=? § Instructions § Which questions to ask subjects and how § Training? § Presentation § Single or double stimulus, sequential or simultaneous? § Grading scales § How many items? Direct, indirect? § Numerical, Categorical? MOS? → Methodologies draw from several disciplines: HCI, UX, Quality assessment, Psychology, Sociology, Experimental Design Theory, etc. → Make good use of this existing body of knowledge!
  • 25. Recommended Reading § http://www.doesinc.com/knowledge.htm § http://www.statsoft.com/textbook/experimental-design/ § ITU-T P.910: Subjective video quality assessment methods for multimedia applications § ITU-R BT.500: Methodology for the subjective assessment of the quality of television pictures § https://www.its.bldrdoc.gov/vqeg/vqeg-home.aspx § Book: Mitchell & Jolley, “Research Design Explained” § Ritter, F. E., Kim, J. W., Morgan, J. H., & Carlson, R. A. (2012). How to run experiments: A practical guide to research with human participants. Thousand Oaks, CA: Sage. www.frankritter.com/rbs/rbs-handout-cogsci.pdf 25
  • 26. Example: Subjective Image Quality Testing § Given: Source Image, System that impairs image (compression, transmission errors, etc.) § Question: What is the impact of the system on experienced image quality?
  • 27. A typical lab assessment involves … • 15 to 30 participants • 1-2 hours per participant • Informed consent / GDPR (signed) • Instructions about tasks • Some pre/post questionnaires (demographics, ratings, feedback, etc.) • Several technical conditions and original media to evaluate 27
  • 28. Test Content § Has considerable impact on quality perception § Content choice depends on study goals, e.g. § Typical content § Challenging worst-case content § BUT: content choice also influences rating behavior! • Likeability, emotions 28
  • 29. Which User Data/Correlates with QoE can we collect? § Subjective Opinion / Assessment § Quantitative: Ratings § Qualitative: Interviews, Thinking aloud § Behavioural Measurements § Behaviour logs § Observational Coding § Behavioural performance (completion time, response time, error rates) § Physiological Measurements § Heart rate, Skin Response § Muscular activity, Eye activity § Brain activity (EEG, MRI) 29 THE MAIN VEHICLE
  • 30. How to Obtain User Ratings: Scaling Methods § Direct Scaling: Single Stimulus (ACR), Double Stimulus (DCR), Continuous (SAMVIQ & MUSHRA) § Indirect Scaling: Ranking, Paired Comparison 30
  • 31. Key Measure: MOS § Mean Opinion Score § Widely used in many fields: § Politics/Elections § Marketing/Advertisement § Food industry § Multimedia § MOS = The likely level of satisfaction with a service or product as appreciated by an average user § Example question: “How would you rate the visual quality of this image?” § Challenge: test design that generates valid, objective, reliable (and thus reproducible) results § Implementation more complex and difficult than it seems a priori (WYAIWYG problem: what you ask is what you get) Quality scale: 5 Excellent, 4 Good, 3 Fair, 2 Poor, 1 Bad; Impairment scale: 5 Imperceptible, 4 Perceptible, 3 Slightly annoying, 2 Annoying, 1 Very annoying
  • 32. Direct Scaling: ACR (Absolute Category Rating) § Discrete § Single stimulus § Multiple dimensions addressable § Usually 5-point scale, but can also be 7-, 9-, or 11-point 5 Excellent 4 Good 3 Fair 2 Poor 1 Bad ACR Stimulus A Stimulus B Stimulus C
  • 33. Direct Scaling: DCR (Degradation Category Rating) § Discrete § Direct Comparison → Relative § Reference vs. processed sample § Highly sensitive 5 degradation is not perceivable 4 degradation is perceivable but not annoying 3 degradation is slightly annoying 2 degradation is annoying 1 degradation is very annoying DCR Ref A Stimulus A Ref B Stimulus B
  • 34. Scaling: Continuous § Continuous / Sg or Dbl Stimulus § For assessing transient quality artifacts in longer (media) samples (videos, etc.) Continuous
  • 35. Exercise: Which Direct Scaling to Use? § Assessment Tasks 1) Impact of (constant) noise on QoE (image) 2) Impact infrequent bursts of packet loss on QoE (video) 3) Added value of 4k/UHD vs. HD resolution (video) § Rating Methods / Scaling 1. ACR 2. DCR 3. Continuous
  • 36. Scales: How to Map Opinions to Numbers? § Not all features that an entity has can be described by numbers § Example: a person • Weight and height are numeric variables (more precisely ratio variables) • Education and socio-economic class are ordinal variables • Sex and religion are nominal variables § This has direct consequences on § Design & usability of the rating scale § The kind of statistical analysis we can perform on the results! 36 Permissible statistics per scale type: nominal – mode (frequencies/probabilities); ordinal – mode, median (percentiles); interval/ratio – mode, median, mean (any statistic)
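The scale-type distinction on this slide has direct practical consequences. A minimal sketch (the example variables and values are hypothetical) of which central-tendency measures are permissible per scale type:

```python
from statistics import mode, median, mean

# Stevens' levels of measurement and their permissible averages:
#   nominal        -> mode only
#   ordinal        -> mode, median
#   interval/ratio -> mode, median, mean
religion = ["A", "B", "A", "C", "A"]        # nominal: categories, no order
grades   = [5, 4, 4, 3, 5]                  # ordinal: e.g. ACR categories
height_m = [1.62, 1.75, 1.80, 1.68, 1.75]   # ratio: true zero, equal units

print(mode(religion))            # nominal: mode is the only valid average
print(median(grades))            # ordinal: the median is also valid
print(round(mean(height_m), 2))  # ratio: the mean is valid as well
```

Note that whether ACR ratings may be treated as interval data, and thus averaged into a MOS, is itself a long-standing debate in the field.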
  • 37. Consequence: Another essential Rule … § ANTICIPATE, ANTICIPATE, ANTICIPATE! § Beyond clearly defining the problem … § … prepare analysis & reporting in advance! § Then use the prepared analysis for monitoring your results data during piloting and test execution! à Risk management is key! 37
  • 38. Some Scale Designs (ACR) 38
  • 39. Some Scale Designs (ACR) ctd. § Ideal for Crowdsourcing (Gardlo, Egger & Hossfeld 2015) 39 Gardlo, B., Egger, S., & Hoßfeld, T. (2015). Do Scale-Design and Training Matter for Video QoE Assessments through Crowdsourcing? CrowdMM@ACM Multimedia.
  • 40. Setup: Test Environment § Measurements have to be valid, objective & reliable § Subjective testing methodologies § High requirements on testing environment § Many influencing/confounding factors, e.g. § Type, performance and quality of devices (monitor, speakers, etc.) § Light and acoustic conditions § Ambience, interior architecture § Watching distance and angle
  • 41. Planning: Time & Test Conditions § Subject's Time & Energy = scarce resource § Max. 90 min of net testing time, requires a break of 5-10 mins § Time slots need to be 2h (min), better 2.5h § #conditions = net testing time / condition duration § For QoE lab studies, one typically goes for a within-subjects design § Severely limits what one test can actually cover! § Tricks: use between-subjects design, Latin squares, etc. § Further hints: § Use anchors: very good + very bad quality conditions, possibly training § Do not forget to randomize the sequence of conditions! 41
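The counterbalancing tricks mentioned on this slide can be sketched in a few lines (a minimal illustration, not the full experimental-design machinery):

```python
import random

def latin_square(n):
    """Cyclic Latin square: row i presents condition (i + j) % n at
    position j, so every condition appears exactly once per row and
    per column. Assigning each subject (or group) one row balances
    position/order effects across the study."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]

def randomized_order(conditions, seed=None):
    """Independent random presentation order per subject."""
    order = list(conditions)
    random.Random(seed).shuffle(order)
    return order

orders = latin_square(4)  # 4 conditions -> 4 counterbalanced orders
```

Full randomization removes systematic order bias on average; the Latin square additionally guarantees that each condition occupies each presentation position equally often.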
  • 42. Number of Subjects - The dreaded “N”! § Huge Trade-off: Time & Money vs. Statistical Power / Reliability § Higher N → smaller p & smaller confidence intervals § BUT: diminishing returns § Recommendations for N: § ITU-T: 15 § VQEG: 24 → So … which N is truly “sufficient”? § Extensive Analysis by Brunnström & Barkowsky (2018) 42 CI_j = Z · σ_j / √N
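The confidence-interval formula on this slide can be inverted to estimate a sufficient N before running the test: solving CI = Z·σ/√N for N gives N = (Z·σ/CI)². A minimal sketch (σ = 0.8 below is an assumed, typical ACR standard deviation, not a value from the slides):

```python
import math

def required_n(sigma, half_width, z=1.96):
    """Subjects needed so the confidence-interval half-width around
    the MOS does not exceed `half_width`, given rating std-dev
    `sigma` (derived from CI = z * sigma / sqrt(N))."""
    return math.ceil((z * sigma / half_width) ** 2)

# Assumed rating std-dev of 0.8; target CI of +/- 0.25 MOS points:
print(required_n(0.8, 0.25))  # -> 40
```

This also makes the diminishing returns visible: halving the target CI roughly quadruples the required N.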
  • 43. Analysis by Brunnström & Barkowsky (2018) § Traditional recommendations too optimistic, particularly when you have to correct for multiple comparisons! § BTW: remember the funnel: #invited subjects > #tested subjects > #valid subjects
  • 45. PILOT & REFINEMENT & EXECUTION 45
  • 46. 46 § Subjective tests are like a live performance § When the audience arrives and the show starts, everything has to be in place and work smoothly! § Remember: each subject costs time & money, so don't waste them! § You have to rehearse = Test & Pilot in advance § Use subjects of increasing “expensiveness” • You, colleagues, (friendly) users from the target audience • Let them give feedback afterwards on the meta-level/process § There are always issues with the usability of your setup and your instructions → If you don't find any problems with the test setup, then you have not piloted it well enough! § Reserve enough time for refining the test setup & design! Piloting & Refinement
  • 47. 47 § Treat your subjects well – for them participation should be a positive experience in itself! § Monitor your data, your users and all kinds of events – you need to be able to debug your test § Keep a test assistant's log § Automate, Automate, Automate! Execution
  • 48. 48 § By TU Ilmenau, AVT Group § https://github.com/Telecommunication-Telemedia-Assessment/avrateNG Rating Tools/Systems #1: avrateNG
  • 49. Setting up a new user study can be tedious… TheFragebogen.de A software framework for user studies made simple. Used in several QoE studies: audio/video, web 2nd screen and crowdsourcing. open source, cross platform, multi device, multimedia, graphical scales, privacy friendly, behavioral data, ready for crowdsourcing DEMO
  • 51. !"" #"" $"" %"" !&"" ! !'( # #'( ) )'( $ $'( ( *+, -./01,2./03456 7*8 ! ! 89: ;89: <=> MOS Data Analysis and Reporting • Mean Opinion Scores (MOS) and confidence intervals N m MOS N i ij j å= = 1 mij = score by subject i for test condition j. N = number of subjects after outliers removal. But: this is not enough …
  • 52. !"" #"" $"" %"" !&"" ! !'( # #'( ) )'( $ $'( ( *+, -./01,2./03456 7*8 ! ! 89: ;89: <=> MOS Data Analysis and Reporting, ctd. BUT: MOS by itself only reports an average opinion, and thus HIDES a lot of information … Excellent! Bad! Fair!Good! Poor! Æ Fair = 3
  • 53. !"" #"" $"" %"" !&"" ! !'( # #'( ) )'( $ $'( ( *+, -./01,2./03456 7*8 ! ! 89: ;89: <=> MOS Data Analysis and Reporting, ctd. à don’t forget to analyze and report user opinion diversity using confidence intervals (bare minimum), histograms, CDFs, etc.!
  • 54. !"" #"" $"" %"" !&"" ! !'( # #'( ) )'( $ $'( ( *+, -./01,2./03456 7*8 ! ! 89: ;89: <=> MOS Data Analysis and Reporting • Mean Opinion Scores (MOS) and confidence intervals N m MOS N i ij j å= = 1 N ZCI j j s ×= mij = score by subject i for test condition j. N = number of subjects after outliers removal. Z = z-value for required confidence level (1.96 for 95%). σj = standard deviation of the scores distribution across subjects for test condition j. But: MOS + confidence intervals can be considered only as the bare minimum …
  • 55. Remember: ALWAYS be Cautious with Summary Statistics! F. J. Anscombe: Graphs in Statistical Analysis. In: American Statistician, vol. 27, no. 1, 1973, pp. 17–21.
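Anscombe's warning is easy to reproduce: the first two quartet datasets (values from Anscombe 1973) share the same x-values, means, and correlation, yet one is linear and the other parabolic:

```python
from statistics import mean

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length samples."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / (sum((x - ma) ** 2 for x in a)
                  * sum((y - mb) ** 2 for y in b)) ** 0.5

# Anscombe's quartet, datasets I and II:
x  = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

# Identical summary statistics, very different shapes -> plot your data!
print(round(mean(y1), 2), round(mean(y2), 2))              # 7.5 7.5
print(round(pearson(x, y1), 2), round(pearson(x, y2), 2))  # 0.82 0.82
```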
  • 56. More Extreme: The Datasaurus Dozen https://www.autodeskresearch.com/publications/samestats
  • 57. Reporting (as recommended by ITU-T P.910) 57
  • 58. Further Reporting Examples … 58 Schatz et al. 2018
  • 59. Further Reporting Examples (ctd) 59 Schatz et al. 2018
  • 60. How to navigate the seven seas of statistical analysis? 60 Ratings, etc. QoS, Encoder settings, etc.
  • 61. 61 How to navigate the seven seas of statistical analysis? (ctd) Source: J. Wobbrock http://depts.washington.edu/acelab/proj/Rstats/index.html Also check out this course: “Designing, Running and Analyzing Experiments” (Coursera)
  • 62. 62 § http://www.panelcheck.com/ Tools from Food Sciences can make your life a bit easier … Hint: check out the work (papers, R packages, tools) of P.B. Brockhoff (DTU)!
  • 63. 63 § Think of Reproducible Science! 1. Publish the Dataset § Ideally: Content + Rating Data 2. When sharing: use standardized data formats § Example: suJSON § Universal format for exchange § Format specification § Tools What else beyond reporting?
  • 64. 64 § Emerging technologies: XR, VR video, point-cloud, mulsemedia, … § New targets → more work = good :) → Many more DOF → challenge! § QoE in the context of interactive real-time systems and applications § Web, Gaming, etc. → Confluence/overlap with UX § Rise of big data analytics and data-driven techniques § Availability of more (diverse) sources of data → QoE substitution by e.g. behavior → Testing driven by AI/ML – Active Learning Outlook: Trends & Challenges
  • 65. Thank you for your attention! Any questions?
  • 66. QoE-related Books § Quality Engineering (Möller, 2011) § Very good introductory book on quality engineering § Focus on audio/voice and video quality § Focus on communications context § Quality of Experience (Möller & Raake eds., 2013) § Covers QoE extensively § Great overview of the current state of the art § Addresses fundamentals (like assessment methods) and applications (like video-conferencing)
  • 67. Related Philosophy: Pirsig (1974) § Zen and the Art of Motorcycle Maintenance § possibly the most widely read philosophy book of the 20th century § Initially rejected by 121 publishers § Three interwoven narratives: § a motorcycle trip across America, § the reconciliation of the narrator with his son and former “insane” self, Phaedrus, and § a number of philosophical discussions concerning the quality of contemporary Western life. § Foundation of the Metaphysics of Quality (MoQ) § What defines good writing? § And what in general defines “good” or “quality”?
  • 68. Further Reading, ctd. § European Network on Quality of Experience in Multimedia Systems and Services (QUALINET), "Definitions of Quality of Experience (QoE) and related concepts," White Paper, 2012. § Recommendation ITU-T P.10/G.100 (2006) - Amendment 2 (07/08), Vocabulary for performance and quality of service - New definitions for inclusion in Recommendation ITU-T P.10/G.100. § Recommendation ITU-T E.800 (2008), Terms and definitions related to quality of service and network performance including dependability. § Recommendation ITU-T P.800 (1998), Methods for subjective determination of transmission quality. § Recommendation ITU-T G.1011 (2010), Reference guide to quality of experience assessment methodologies. § Recommendation ITU-R BT.500-13, Methodology for the subjective assessment of the quality of television pictures.
  • 69. Further Reading: Crowdsourcing § T. Hoßfeld, C. Keimel, M. Hirth, B. Gardlo, J. Habigt, K. Diepold, and P. Tran-Gia, "Best practices for QoE crowdtesting: QoE assessment with crowdsourcing," IEEE Trans. Multimed., vol. 16, no. 2, pp. 541–558, Feb. 2014. § Egger-Lampl, Sebastian, Judith Redi, Tobias Hoßfeld, Matthias Hirth, Sebastian Möller, Babak Naderi, Christian Keimel, and Dietmar Saupe. "Crowdsourcing Quality of Experience Experiments". In "Evaluation in the Crowd. Crowdsourcing and Human-Centered Experiments", Daniel Archambault, Helen Purchase, and Tobias Hoßfeld (Eds.), Springer International Publishing, 2017. § Gadiraju, Ujwal, Sebastian Möller, Martin Nöllenburg, Dietmar Saupe, Sebastian Egger-Lampl, Daniel Archambault, and Brian Fisher. "Crowdsourcing Versus the Laboratory: Towards Human-Centered Experiments Using the Crowd". In "Evaluation in the Crowd. Crowdsourcing and Human-Centered Experiments", Daniel Archambault, Helen Purchase, and Tobias Hoßfeld (Eds.), Springer International Publishing, 2017.
  • 70. Further Reading, ctd. Balachandran, Athula, Vyas Sekar, Aditya Akella, Srinivasan Seshan, Ion Stoica, and Hui Zhang. 2012. "A Quest for an Internet Video Quality-of-Experience Metric." In Proceedings of the 11th ACM Workshop on Hot Topics in Networks, 97–102. ACM. Balachandran, Athula, Vyas Sekar, Aditya Akella, Srinivasan Seshan, Ion Stoica, and Hui Zhang. 2013. "Developing a Predictive Model of Quality of Experience for Internet Video." In Proceedings of the ACM SIGCOMM 2013 Conference on SIGCOMM, 339–350. Dobrian, Florin, Vyas Sekar, Asad Awan, Ion Stoica, Dilip Joseph, Aditya Ganjam, Jibin Zhan, and Hui Zhang. 2011. "Understanding the Impact of Video Quality on User Engagement." In Proceedings of the ACM SIGCOMM 2011 Conference, 362–373. SIGCOMM '11. New York, NY, USA: ACM. Krasula, Lukáš, and Patrick Le Callet. "Chapter 4: Emerging Science of QoE in Multimedia Applications," n.d., 35.
  • 71. Further Reading, ctd. Akamai. June 2006: Retail Web Site Performance: Consumer Reaction to a Poor Online Shopping Experience. Akamai Technologies, http://www.akamai.com (accessed February 10, 2008). Allan, L. G.: The perception of time, Perception & Psychophysics, vol. 26, no. 5, pp. 340–354, 1979. Bouch, A., Kuchinsky, A., Bhatti, N.: Quality is in the eye of the beholder: meeting users' requirements for Internet quality of service. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 297–304 (2000). Egger, S., Reichl, P., Hossfeld, T., Schatz, R.: "Time is Bandwidth"? Narrowing the Gap between Subjective Time Perception and Quality of Experience. Proceedings of the 2012 IEEE International Conference on Communications. Fiedler, M., Hossfeld, T., and Tran-Gia, P.: A generic quantitative relationship between quality of experience and quality of service. IEEE Network: The Magazine of Global Internetworking, vol. 24, pp. 36–41, March 2010. ITU-T Recommendations: G.1030, P.800, P.805, P.880, P.910, BT.500. Mitchell, M.L., Jolley, J.M.: Research Design Explained. Cengage Learning (2009). Möller, S.: Quality Engineering: Qualität kommunikationstechnischer Systeme. Springer, Heidelberg (2010).
  • 72. Further Reading, ctd. Strohmeier, Dominik. "Open Profiling of Quality: A Mixed Methods Research Approach for Audiovisual Quality Evaluations". Dissertation (2011). Sackl, Andreas. "Investigations on the Role of Expectations and Individual Decisions in Quality Perception". Dissertation, 2016. Schatz, R., Egger, S., Platzer, A.: Poor, Good Enough or Even Better? Bridging the Gap between Acceptability and QoE of Mobile Broadband Data Services. Proceedings of the 2011 IEEE International Conference on Communications, pp. 1–6 (2011). Schatz, Hossfeld, Janowski, and Egger. From Packets to People: Quality of Experience as New Measurement Challenge. In: Data Traffic Monitoring and Analysis. Springer LNCS, 2013. Schatz, Fiedler, and Skorin-Kapov. QoE-based Network and Application Management. In: Quality of Experience: Advanced Concepts, Applications and Methods. Springer, 2014. Seow, S.C.: Designing and Engineering Time: The Psychology of Time Perception in Software. Addison-Wesley Professional (2008).