A comparison of human and machine learning based accuracy

A Comparison of Human and Machine Learning-based Accuracy for Valence Classification of Subjects in Video Fragments. John Schavemaker, Yorick Holkamp

A Comparison of Human and Machine Learning-based Accuracy for Valence Classification of Subjects in Video Fragments
Yorick Holkamp, John Schavemaker

MIME cycle
PLAY, WATCH, SENSE, ANNOTATE, RECOMMEND

Machine vs. human?
• to determine the performance of human annotators
• humans have good experience in assessing the emotions of their peers
• what are the practical limitations of facial expressions?

MAHNOB-HCI Dataset
• Soleymani, M., Lichtenauer, J., Pun, T., & Pantic, M. (2012). A multimodal database for affect recognition and implicit tagging. IEEE Transactions on Affective Computing, 3(1), 42-55.
• created for affect recognition and implicit tagging applications
• contains face video recordings, EEG data and more
• 24 subjects
• Inter-rater agreement 0.66 (‘substantial agreement’)
[Images: setup, example]
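The slide does not say which agreement statistic produced 0.66. The label ‘substantial agreement’ matches the Landis and Koch band for kappa values between 0.61 and 0.80, so a kappa-style coefficient is a plausible reading. The sketch below is an illustration only, assuming two raters and Cohen's kappa; the function name and the sample ratings are hypothetical and not taken from the dataset.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: both lists have the same length and use the same label set
    (for example -1, 0, 1 for valence).
    """
    n = len(ratings_a)
    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical ratings for four fragments.
rater_a = [1, 1, -1, -1]
rater_b = [1, 1, -1, 1]
kappa = cohens_kappa(rater_a, rater_b)
print(kappa)
```

With more than two raters per item, a multi-rater statistic such as Fleiss' kappa would be the usual choice instead.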

Negative valence: fragment from Hannibal (censored)

Baseline Machine Method: Overview
Sensor data → Data processing → Machine learning → Performance evaluation → Application

Machine Features
Noldus FaceReader output:
Time   Action Unit
00:00  6. Raise cheeks
00:00  12. Pull lip corners
00:01  6. Raise cheeks
...
Data aggregation:
Combination  # Starts  # Stops
6+12         3         3
1+4+15       5         4
...          ...
Facial expression-based method with onset and offset counting by Koelstra and Patras:
Koelstra, S., & Patras, I. (2013). Fusion of facial expressions and EEG for implicit affective tagging. Image and Vision Computing, 31(2), 164-174.
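The aggregation step above can be sketched as follows. This is a minimal illustration of onset and offset counting, not the implementation by Koelstra and Patras or the FaceReader API. It assumes the action-unit stream has already been converted to one set of active action units per time step; the input format and both function names are hypothetical.

```python
from collections import Counter


def count_onsets_offsets(frames):
    """Count how often each action-unit combination starts and stops.

    `frames` holds one entry per time step; each entry is the set of
    action-unit numbers active at that step (hypothetical input format).
    """
    onsets, offsets = Counter(), Counter()
    previous = frozenset()
    for active in frames:
        current = frozenset(active)
        if current != previous:
            if previous:
                offsets[previous] += 1  # the previous combination stops
            if current:
                onsets[current] += 1    # a new combination starts
        previous = current
    return onsets, offsets


def combination_label(combo):
    """Render a combination as in the table, e.g. frozenset({6, 12}) -> '6+12'."""
    return "+".join(str(au) for au in sorted(combo))


# Hypothetical stream: 6+12 active twice, 1+4+15 active once in between.
frames = [{6, 12}, {6, 12}, set(), {1, 4, 15}, {6, 12}]
onsets, offsets = count_onsets_offsets(frames)
for combo in sorted(onsets, key=combination_label):
    print(combination_label(combo), onsets[combo], offsets[combo])
```

A combination that is still active when the recording ends has an onset but no matching offset, which is one way a row such as "1+4+15: 5 starts, 4 stops" can arise.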

Machine Training
Input to machine learning: subject x’s history, a set of feature tables of the form:
Combination  Onsets  Offsets
6+12         3       3
1+4+15       5       4
...          ...

Machine Classification
Input to machine learning:
Combination  Onsets  Offsets
6+12         3       3
1+4+15       5       4
...          ...

Our Extensions (FEI)
• Facial Expression Intensity
• Different aggregation methods
  • Average activation level
  • Standard deviation in activation level
• Alternative training method
  • Train using 23 subjects, predict for 1
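The two extensions above can be sketched as follows. This is a hedged illustration under stated assumptions, not the authors' code: it assumes per-frame activation levels are available per action unit, uses the population standard deviation (the slides do not say which variant was used), and generates leave-one-subject-out splits for the 24 subjects. All function names, feature names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def intensity_features(intensities):
    """Aggregate per-frame activation levels into two features per action unit.

    `intensities` maps an action-unit number to its list of per-frame
    activation levels (hypothetical format).
    """
    features = {}
    for au, values in sorted(intensities.items()):
        features[f"au{au}_mean"] = mean(values)   # average activation level
        features[f"au{au}_std"] = pstdev(values)  # population std (assumption)
    return features


def leave_one_subject_out(subject_ids):
    """Yield (held_out, training_subjects): train on all but one, predict for one."""
    for held_out in subject_ids:
        yield held_out, [s for s in subject_ids if s != held_out]


# Hypothetical activation levels for two action units over four frames.
print(intensity_features({6: [0.0, 1.0, 0.0, 1.0], 12: [0.5, 0.5, 0.5, 0.5]}))

# 24 subjects: each split trains on 23 and predicts for the remaining one.
subjects = list(range(1, 25))
splits = list(leave_one_subject_out(subjects))
print(len(splits), len(splits[0][1]))
```

Splitting by subject rather than by fragment keeps all recordings of the held-out person out of the training data, so the score reflects performance on an unseen subject.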

Machine versus human annotator accuracy
[Figure: Machine - human accuracy. Vertical axis: percentage of correct ratings (0% to 100%). Horizontal axis: video fragments. Series: Machine, Human.]

Human accuracy per fragment
[Figure: Human accuracy per fragment. Vertical axis: percentage of correct votes (0% to 100%). Horizontal axis: video fragments. Series: Positive, Negative.]
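The quantity on the vertical axis, the percentage of correct votes per fragment, can be computed along these lines. This is a minimal sketch: the slides do not state how the reference label of a fragment was defined, so the `truth` mapping, the fragment identifiers and the votes below are all hypothetical.

```python
def accuracy_per_fragment(votes, truth):
    """Share of annotator votes that match the reference label, per fragment.

    `votes` maps a fragment id to the list of labels given by annotators;
    `truth` maps a fragment id to its reference label (assumed to be known).
    """
    return {
        fragment: sum(label == truth[fragment] for label in labels) / len(labels)
        for fragment, labels in votes.items()
    }


# Hypothetical votes: 1 = positive valence, -1 = negative valence.
votes = {"fragment_a": [1, 1, -1, 1], "fragment_b": [-1, -1]}
truth = {"fragment_a": 1, "fragment_b": -1}
for fragment, accuracy in accuracy_per_fragment(votes, truth).items():
    print(f"{fragment}: {accuracy:.0%}")
```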

Conclusions
• In this paper we present the results of a comparison between the classification accuracy of humans and that of machine-learning classifiers.
• For this we used the MAHNOB-HCI affective computing dataset, and we reproduced and extended the facial expression-based method by Koelstra and Patras.
• Our results show that humans and machine classifiers agree to a large extent on the appropriate class for video fragments.
• In our experiments, we found that human annotators obtained higher accuracy than the automatic classification methods.

