RANDOM FORESTS
FOR LAUGHTER DETECTION

Heysem Kaya, A. Mehdi Ercetin, A. Ali Salah, S. Fikret Gurgen
Department of Computer Engineering
Bogazici University / Istanbul

WASSS '13, Grenoble, 23.08.2013
Presentation Outline
● Introduction and Motivation
● A Brief Review of Literature
● The Challenge Corpus
● Methodology
  – Methods & Features Utilized
  – Experimental Results
● Conclusions
● Questions
Introduction
● New directions in ASR: emotional and social cues
● Social Signal Sub-Challenge of Interspeech 2013: detection of laughter and fillers
● Challenges: the garbage class dominates, and the number of samples is very large
● Best results are obtained with fusion (e.g. speech + vision)
● TUM baseline feature set
Brief Review of Related Work

Work | Classifier | Features | Other Notes
Ito et al., 2005 | GMM | MFCC, Δ-MFCC | Classifier fusion with the AND operator was found to increase precision
Truong and van Leeuwen, 2007 | GMM & SVM | PLP, prosody, voicing, modulation spectrum |
Knox and Mirghafori, 2007 | MLP | MFCC and prosody, along with first and second Δ | Class probabilities stacked to an ANN; Δ-MFCC outdid raw MFCC
Petridis and Pantic, 2008 | MLP, AdaBoost (feat. sel.) | PLP, prosody and their Δs, as well as video features | Multi-modal fusion was superior to single-modal models; best results obtained by stacking to an ANN
Vinciarelli et al., 2009 | (survey) | | Fusion with diverse meta classifiers was found to outperform fusion of classifiers with the same algorithm; the survey recommends classifier fusion for SSP
The SSP Challenge Dataset - Corpus
● Based on the SSPNet Vocalization Corpus
● Collected from microphone recordings of 60 phone calls
● Contains 2763 clips, each 11 s long
● At least one laughter or filler event in every clip
● Fillers are vocalizations used to hold the floor (e.g. “uhm”, “eh”, “ah”)
● To provide speaker independence:
  – calls #1-35 served as training,
  – #36-45 as development and the rest as test
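The speaker-independent split above can be expressed as a simple lookup by phone-call ID (a minimal sketch; the function name is mine, not part of the challenge tooling):

```python
def partition(call_id: int) -> str:
    """Map a phone-call ID (1-60) to its challenge partition,
    following the speaker-independent split described above."""
    if 1 <= call_id <= 35:
        return "train"
    if 36 <= call_id <= 45:
        return "devel"
    if 46 <= call_id <= 60:
        return "test"
    raise ValueError("call_id must be in 1..60")
```

With this mapping, the 60 calls fall into 35 training, 10 development and 15 test calls.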
The SSP Challenge Dataset – Statistics

Property | Statistic
# of Clips | 2763
Clip Duration | 11 sec.
# of Phone Calls | 60
# of Subjects | 120
# Male Subjects | 57
# Female Subjects | 63
# of Filler Events | 3.0 k
# of Laughter Events | 1.2 k
Methodology
● Classifiers:
  – Random Forests (RF)
  – Support Vector Machines (SVM)
● Challenge Features (TUM Baseline)
● Feature Selection:
  – minimum Redundancy Maximum Relevance (mRMR)
● Post-processing:
  – Gaussian Window Smoothing
The SSP Challenge Dataset - Features
● Non-overlapping frames of length 10 ms (over 3 × 10⁶ frames)
● Considering memory limitations, a small set of (141) affectively potent features is extracted (Schuller et al., 2013):
  – MFCC 1-12, logarithmic energy, voicing probability, HNR, F0 and zero crossing rate, along with their first-order Δ
  – Second-order Δ as well for MFCC and log. energy
  – Frame-wise LLDs are augmented with the mean and std. of the frame and its 8 neighbors
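The context augmentation in the last bullet can be sketched as follows. I assume the 8 neighbors are 4 preceding and 4 following frames (a 9-frame window); with 47 frame-wise LLDs, appending the window mean and std gives 3 × 47 = 141 features, matching the count above:

```python
import numpy as np

def add_context_stats(lld, half_win=4):
    """Append the per-dimension mean and std over the frame and its
    neighbors (half_win before and after) to each frame's LLD vector."""
    # Replicate boundary frames so every frame has a full window
    padded = np.pad(lld, ((half_win, half_win), (0, 0)), mode="edge")
    # Shape: (n_frames, n_dims, 2*half_win + 1)
    windows = np.lib.stride_tricks.sliding_window_view(
        padded, 2 * half_win + 1, axis=0)
    return np.hstack([lld, windows.mean(axis=-1), windows.std(axis=-1)])
```

For a clip with 47 LLDs per frame this turns an (n_frames, 47) matrix into (n_frames, 141).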
Random Forests
● A Random Forest is a fusion of decision tree predictors¹
● Each decision tree is grown with
  – a set of randomly selected instances (sampled with replacement) and
  – a subset of features, which are also randomly selected
● Sampling with replacement leaves on average about 1/3 of the instances 'out of the bag'
● RFs are shown to be superior to current algorithms in accuracy and to perform well on large databases

¹ L. Breiman, “Random Forests”, University of California, Berkeley, USA, 2001
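The 'out of the bag' fraction follows directly from bootstrap sampling: an instance is missed by one draw with probability 1 − 1/n, so it stays out of the bag with probability (1 − 1/n)^n → e⁻Âč ≈ 0.368, i.e. roughly 1/3. A minimal sketch of one bootstrap draw (illustrative only, not the WEKA RF used in the experiments):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000  # instances in the training set

# One bootstrap sample: n draws with replacement, as used to grow one tree
sample = rng.integers(0, n, size=n)

# Instances never drawn are 'out of the bag' for this tree
oob_fraction = 1 - len(np.unique(sample)) / n
# oob_fraction comes out close to 1/e ~ 0.368
```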
mRMR Based Feature Selection
● mRMR was introduced by Peng et al. (2005) as a feature ranking algorithm based on mutual information (MI)
● MI quantifies the amount of shared information between two random variables
● A candidate feature is selected having
  – max MI with the target variable
  – min MI with the already selected features
● For mRMR we used the authors' original implementation*

*http://penglab.janelia.org/proj/mRMR/
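The selection rule above can be sketched as a greedy loop over discrete features (an illustration of the criterion only; the experiments used the authors' original implementation linked above):

```python
import numpy as np

def mutual_info(a, b):
    """Discrete mutual information (in nats) between two label vectors."""
    a, b = np.asarray(a), np.asarray(b)
    mi = 0.0
    for va in np.unique(a):
        for vb in np.unique(b):
            p_ab = np.mean((a == va) & (b == vb))
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (np.mean(a == va) * np.mean(b == vb)))
    return mi

def mrmr_select(X, y, k):
    """Greedy mRMR: pick the feature maximizing MI with the target
    minus the mean MI with the already selected features."""
    n_feat = X.shape[1]
    relevance = [mutual_info(X[:, j], y) for j in range(n_feat)]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_feat):
            if j in selected:
                continue
            redundancy = np.mean([mutual_info(X[:, j], X[:, s]) for s in selected])
            if relevance[j] - redundancy > best_score:
                best_j, best_score = j, relevance[j] - redundancy
        selected.append(best_j)
    return selected
```

On a toy matrix where feature 1 duplicates feature 0, the redundancy term steers the second pick toward a less redundant but still informative feature.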
Experimental Results
● We used the WEKA* implementations of SVM and RF.
● For the Social Signal Sub-Challenge we consider the Area Under the Curve (AUC) measure of the laughter and filler classes and their unweighted average (UAAUC)

*M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, I. H. Witten, “The WEKA Data Mining Software”, 2009 (http://www.cs.waikato.ac.nz/ml/weka/)
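AUC can be computed from the Mann-Whitney rank-sum statistic; a small helper of my own (not the WEKA routine used for the official scores):

```python
import numpy as np

def auc(scores, labels):
    """ROC AUC via the rank-sum statistic (assumes no tied scores)."""
    scores, labels = np.asarray(scores), np.asarray(labels)
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def uaauc(laughter_auc, filler_auc):
    """Unweighted average of the two class-wise AUCs."""
    return (laughter_auc + filler_auc) / 2
```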
Baseline Results with Linear SVM (%AUC)

SVM Baseline on Development Set¹

CÂČ | Laughter | Filler | UAAUC
0.1 | 81.3 | 83.6 | 82.5
1 | 81.2 | 83.7 | 82.5
10 | 81.2 | 83.7 | 82.5

¹ The challenge paper reports 86.2% and 89.0% for laughter and filler, respectively
ÂČ SVM complexity parameter

SVM Performance with mRMR Features on Devel. Set (C=0.1)

# of mRMR Features | Laughter | Filler | UAAUC
90 | 80.7 | 83.3 | 82.0
70 | 79.9 | 83.0 | 81.5
50 | 78.8 | 82.4 | 80.6
Experiments with Random Forests
● The hyper-parameters of an RF are
  – the number of (randomly selected) features per decision tree (d) and
  – the number of trees (T) that form the forest
● We also investigated the effect of the number of mRMR-ranked features (D)
● The tested values for the hyper-parameters are d = {8, 16, 32}, T = {10, 20, 30} and D = {50, 70, 90, All}
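The grid above amounts to 3 × 3 × 4 = 36 configurations. In a modern toolkit d and T map naturally onto common RF parameters (e.g. scikit-learn's `max_features` and `n_estimators`; the actual experiments used WEKA). A sketch of the enumeration:

```python
from itertools import product

d_values = [8, 16, 32]           # features per tree (local dimension d)
T_values = [10, 20, 30]          # number of trees T in the forest
D_values = [50, 70, 90, "All"]   # top mRMR-ranked features kept (D)

# 36 (d, T, D) settings, each evaluated by %AUC on the development set
configs = list(product(d_values, T_values, D_values))
```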
Random Forest Performance with T=20, Varying d and D on Development Set (% AUC)

Local Dim. (d) | #mRMR (D) | Laughter | Filler | UAAUC
8 | All | 88.9 | 90.7 | 89.8
8 | 90 | 89.3 | 90.8 | 90.1
8 | 70 | 88.3 | 90.3 | 89.3
8 | 50 | 88.0 | 89.7 | 88.9
16 | All | 89.0 | 90.6 | 89.8
16 | 90 | 89.3 | 90.7 | 90.0
16 | 70 | 88.8 | 90.7 | 89.8
16 | 50 | 88.3 | 90.0 | 89.2
32 | All | 89.5 | 90.9 | 90.2
32 | 90 | 89.6 | 90.9 | 90.3
32 | 70 | 89.1 | 90.8 | 90.0
32 | 50 | 88.2 | 90.0 | 89.1
Random Forest Performance with Varying T, d and D on Development Set (% UAAUC)

Local Dim. (d) | #mRMR (D) | 10 Trees | 20 Trees | 30 Trees
8 | All | 87.4 | 89.8 | 89.7
8 | 90 | 88.0 | 90.1 | 90.1
8 | 70 | 87.9 | 89.3 | 89.8
8 | 50 | 87.5 | 88.9 | 89.4
16 | All | 88.2 | 89.8 | 90.4
16 | 90 | 88.6 | 90.0 | 90.5
16 | 70 | 88.5 | 89.8 | 90.2
16 | 50 | 87.8 | 89.2 | 89.6
32 | All | 88.8 | 90.2 | 90.4
32 | 90 | 89.0 | 90.3 | 90.8
32 | 70 | 88.7 | 90.0 | 90.4
32 | 50 | 87.9 | 89.1 | 89.6
Avg | | 88.2 | 89.7 | 90.0

Development set baseline reported in the paper: 87.6%; reproduced baseline: 82.5%
Gaussian Smoothing
● We further applied Gaussian window smoothing to the posteriors.
● The posteriors of each frame were re-calculated as a weighted sum over its 2K neighbors (K before & K after)
● The Gaussian weight function used to smooth frame i with a neighboring frame j is given as

  w_{i,j} = (2·π·B)^(−1/2) · exp(−(i−j)ÂČ / (2·B))

● We tested development set accuracy for K = 1, ..., 10
● Since the increase from K=8 to K=9 was less than 0.05%, we chose K=8
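A sketch of the smoothing step in NumPy. Two details are assumptions on my part: the slides do not state the bandwidth B (4.0 below is a placeholder), and I renormalize the weights so the smoothed posteriors keep the original scale:

```python
import numpy as np

def gaussian_smooth(posteriors, K=8, B=4.0):
    """Re-calculate each frame's posterior as a Gaussian-weighted sum
    over the frame and its K neighbors on each side (B: assumed variance)."""
    offsets = np.arange(-K, K + 1)
    w = (2 * np.pi * B) ** -0.5 * np.exp(-offsets**2 / (2 * B))
    w /= w.sum()  # renormalization (assumption; keeps posteriors on scale)
    # Replicate boundary frames so edge posteriors have full neighborhoods
    padded = np.pad(posteriors, (K, K), mode="edge")
    return np.convolve(padded, w, mode="valid")
```

For per-class posteriors stored as columns of a matrix, the same function would be applied column by column.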
The Development Set Accuracy vs K
The Challenge Test Set Result
● We re-trained on the training and development sets together with the best setting (T=30, D=90, d=32)
● We applied Gaussian smoothing with K=8
● Overall we attained
  – 89.6% and 87.3% AUC for laughter and fillers
  – a UAAUC of 88.4%, outperforming the challenge test set baseline (83.3%) by 5.1% (absolute)
Conclusions
● We proposed the use of RFs for laughter detection
● The results indicate superior performance in accuracy
● We observed that RFs benefit from feature reduction via mRMR, while SVMs deteriorate
● Together with Gaussian smoothing of the posteriors, we attained an increase of 5.1% (absolute) over the challenge test set baseline
Thanks!
● Thank you for your attention
