AcousticBrainz Genre Task
Content-based music genre recognition
from multiple sources
Dmitry Bogdanov, Alastair Porter (Universitat Pompeu Fabra)
Julián Urbano (Delft University of Technology)
Hendrik Schreiber (tagtraum industries incorporated)
Genre recognition in Music Information Retrieval
● A popular task in MIR (Sturm 2014)
● Only a small number of broad genres (e.g., rock, jazz, classical, electronic)
● Almost no studies on more specific genres (subgenres)
● Studies don’t consider the subjective nature of genre labels and taxonomies
● Treated as a single-label classification problem instead of a multi-label problem
● Genre hierarchy is not exploited
● Small datasets
B. L. Sturm. 2014. The State of the Art Ten Years After a State of the Art: Future Research in Music Information Retrieval. Journal of New Music Research 43, 2 (2014), 147–172.
AcousticBrainz
AcousticBrainz: a community database containing music features extracted
from audio (https://acousticbrainz.org) (Porter et al. 2015)
● Open data computed by open algorithms
● Built on submissions from the community
● Over 5,600,000 analyzed recordings (tracks)
● ~3,000 music features (bags-of-frames)
● Statistical information about spectral shape, rhythm, tonality, loudness, etc.
● Rich music metadata from MusicBrainz (https://musicbrainz.org)
● Lots of data... What can we do with it?
A. Porter, D. Bogdanov, R. Kaye, R. Tsukanov, and X. Serra. 2015. AcousticBrainz: a community platform for gathering music information obtained from audio. In International Society for Music Information Retrieval (ISMIR’15) Conference. Málaga, Spain, 786–792.
The 2017 AcousticBrainz MediaEval Task
Content-based music genre recognition from multiple ground truth sources
Goal: Predict genre and subgenre of unknown music recordings given
precomputed music features
Task novelty:
● Four different genre annotation sources (and taxonomies)
● Hundreds of specific subgenres
● Multi-label genre classification problem
● A very large dataset (~2 million recordings in total)
Sources of genre information
● Scrape from internet sources
● Discogs (discogs.com) and AllMusic (allmusic.com)
● Explicit genre and subgenre annotations at an album level
● Predefined taxonomies
● AcousticBrainz song → album → genre
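The song → album → genre chain above can be sketched as a simple lookup. This is a minimal illustration, assuming two hypothetical dictionaries (recording to album, album to genre labels); the IDs and function name are invented for the example and do not come from the task's tooling.

```python
def propagate_album_genres(recording_to_album, album_to_genres):
    """Give every recording the genre annotations of its album, if any exist."""
    return {
        recording: set(album_to_genres[album])
        for recording, album in recording_to_album.items()
        if album in album_to_genres  # recordings on unannotated albums are skipped
    }


# Hypothetical IDs and labels, for illustration only.
recording_to_album = {"rec1": "albumA", "rec2": "albumA", "rec3": "albumB"}
album_to_genres = {"albumA": {"electronic", "downtempo"}}
annotated = propagate_album_genres(recording_to_album, album_to_genres)
```

Every recording on an annotated album inherits the same labels, which is why album-level sources cannot distinguish between tracks of one release.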
Sources of genre information
● Tagtraum dataset based on beaTunes
○ Consumer application for Windows and Mac by
tagtraum industries incorporated
○ Encourages users to correct metadata
○ Collects anonymized, user-submitted metadata
○ Relationship Song:Genre is 1:n
● Last.fm
○ Folksonomy tags for each song
○ Relative strength (0-100)
● Tag cleaning (normalization and blacklisting)
● Automatic inference of genre-subgenre relations
Mapping user genre labels to a genre taxonomy
1. Normalization (lowercasing, smart substitutions, ...)
R&B → rnb
Rhythm and Blues → rnb
R and B → rnb
2. Removal of unwanted labels via blacklisting (80spop, love, charts, djonly, ...)
3. Inferring hierarchical relationships via co-occurrence
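Steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the cleaning code actually used: only the R&B variants and the four blacklist entries come from the slide, and the function names are assumptions.

```python
import re

# Substitution table; only the R&B variants are taken from the slide.
SUBSTITUTIONS = {
    "r&b": "rnb",
    "rhythm and blues": "rnb",
    "r and b": "rnb",
}

# Blacklist entries taken from the slide's examples.
BLACKLIST = {"80spop", "love", "charts", "djonly"}


def normalize(label):
    """Lowercase, collapse whitespace, then apply known substitutions."""
    cleaned = re.sub(r"\s+", " ", label.strip().lower())
    return SUBSTITUTIONS.get(cleaned, cleaned)


def clean_labels(labels):
    """Normalize labels, drop blacklisted ones, and remove duplicates in order."""
    result = []
    for label in labels:
        norm = normalize(label)
        if norm in BLACKLIST or norm in result:
            continue
        result.append(norm)
    return result
```

For example, `clean_labels(["R&B", "love", "Rhythm and Blues", "Rock"])` keeps a single `rnb` entry and `rock`, and drops the blacklisted `love`.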
Co-Occurrence matrix
If a song is labeled with Alternative, how often is it also labeled with Rock?
Co-Occurrence matrix
What about the other way around?
If a song is labeled with Rock, how often is it also labeled with
Alternative?
Co-Occurrence is
not symmetric!
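The asymmetry can be made concrete with a small sketch that computes, for every ordered pair of labels (a, b), the share of songs labeled a that are also labeled b. The toy data and the function name are invented for illustration; the slides do not state the exact rule used to turn these rates into genre-subgenre relations.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_rates(songs):
    """Return rates[(a, b)] = share of songs labeled a that are also labeled b."""
    label_counts = Counter()
    pair_counts = Counter()
    for labels in songs:
        unique = set(labels)
        label_counts.update(unique)
        # Ordered pairs, so (a, b) and (b, a) are counted separately.
        pair_counts.update(permutations(unique, 2))
    return {(a, b): n / label_counts[a] for (a, b), n in pair_counts.items()}


# Toy data, invented for illustration only.
songs = [
    {"rock", "alternative"},
    {"rock", "alternative"},
    {"rock"},
    {"rock"},
    {"rock", "metal"},
]
rates = cooccurrence_rates(songs)
print(rates[("alternative", "rock")])  # 1.0
print(rates[("rock", "alternative")])  # 0.4
```

In this toy data every song labeled alternative is also labeled rock, but not the other way around, which is one plausible signal that alternative sits below rock in the hierarchy.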
What does the data look like?
https://www.youtube.com/watch?v=zlaz7aR7B44
Subjectivity in music genre
● Classification tasks typically rely on an agreed answer for ground truth
● What should we do if we can’t find agreement between our ground truth?
● What if different sources use the same label, but each source defines it differently?
Reggae → Dub
Electronic → Ambient Dub
Electronic → World Fusion
World, Dub, Fusion
Sub-tasks
● Task 1: Build a separate system for each ground-truth dataset
● Task 2: Can we benefit from combining different ground truths into one
system?
[Figure: diagrams of Task 1 and Task 2]
Development and testing dataset split
● 4 development and 4 testing datasets (70%/15% split; the remaining 15% is kept for future use)
● Album filter
● Each label has at least 40 recordings from 6 release groups in the training dataset (20 from 3 for the test dataset)
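The slide does not say how the album filter was implemented. One common reading is that whole release groups are assigned to a single split, so that no album contributes recordings to both development and test data. The sketch below illustrates that assumption only; the function name, split names, and the choice to apply the fractions to release groups rather than recordings are all hypothetical.

```python
import random


def split_by_release_group(recordings, fractions=(0.70, 0.15, 0.15), seed=0):
    """Assign whole release groups to splits so no album is shared between them.

    `recordings` maps a recording ID to its release-group ID.
    Fractions apply to release groups, so recording counts are only approximate.
    """
    groups = sorted(set(recordings.values()))
    random.Random(seed).shuffle(groups)
    cut_a = round(fractions[0] * len(groups))
    cut_b = cut_a + round(fractions[1] * len(groups))
    group_split = {}
    for i, group in enumerate(groups):
        if i < cut_a:
            group_split[group] = "development"
        elif i < cut_b:
            group_split[group] = "test"
        else:
            group_split[group] = "held_out"
    splits = {"development": [], "test": [], "held_out": []}
    for rec, group in recordings.items():
        splits[group_split[group]].append(rec)
    return splits
```

A per-label check (at least 40 recordings from 6 release groups, as stated above) would then be applied to the resulting development split.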
● Development datasets statistics:
Results
Submissions
● Participants from five teams
● Maximum of 5 submissions for each subtask per team
● (5 submissions ✖ 2 tasks ✖ 4 datasets = 40 runs per team)
● 115 runs received in total
Baselines
● Random baseline: following the distribution of labels
● Popularity baseline: always predicts the most popular genre
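Both baselines can be sketched in a few lines. This is an illustration under stated assumptions, not the organizers' baseline code: it assumes each baseline emits exactly one label per recording, and the function names are invented.

```python
import random
from collections import Counter


def popularity_baseline(train_labels):
    """Return a predictor that always outputs the most frequent training label."""
    counts = Counter(label for labels in train_labels for label in labels)
    top_label = counts.most_common(1)[0][0]
    return lambda: {top_label}


def random_baseline(train_labels, seed=0):
    """Return a predictor that samples one label in proportion to its training frequency."""
    counts = Counter(label for labels in train_labels for label in labels)
    labels, weights = zip(*sorted(counts.items()))
    rng = random.Random(seed)
    return lambda: {rng.choices(labels, weights=weights, k=1)[0]}
```

Neither predictor looks at the music features, so any system that uses them should be expected to beat both.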
Methodologies
● Manual feature preselection
● Classifiers
○ Hierarchical (SVMs + extra trees)
○ Neural networks
○ Random Forest Classifiers
● Task 2 (combining datasets)
○ Genre similarity based on text string matching distance, voting
○ Genre/subgenre similarity based on co-occurrence, conversion matrix, weighting
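The string-matching idea for Task 2 can be illustrated with Python's standard `difflib`. The slides do not name the distance the participants used, so the similarity ratio, the 0.8 threshold, and the function name here are assumptions.

```python
from difflib import SequenceMatcher


def closest_label(label, target_labels, threshold=0.8):
    """Return the most similar target label, or None if nothing is close enough."""

    def similarity(candidate):
        return SequenceMatcher(None, label.lower(), candidate.lower()).ratio()

    best = max(target_labels, key=similarity)
    return best if similarity(best) >= threshold else None
```

With this sketch, `closest_label("Worldfusion", ["World Fusion", "Folk Metal", "Dub"])` maps to `"World Fusion"`, while a label with no close counterpart returns `None` and would need another strategy, such as the co-occurrence approach listed above.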
Evaluation metrics
Effectiveness: Precision, Recall and F-score
● Per recording, all labels (genres and subgenres)
● Per recording, only genres
● Per recording, only subgenres
● Per label, all recordings
● Per genre label, all recordings
● Per subgenre label, all recordings
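The two views (per recording and per label) differ only in what is treated as the unit being averaged. The sketch below assumes plain averaging over units and takes the label inventory from the ground truth; the slides do not specify either detail, and the function names are invented.

```python
def prf(true_set, pred_set):
    """Precision, recall and F-score for one pair of sets."""
    hits = len(true_set & pred_set)
    precision = hits / len(pred_set) if pred_set else 0.0
    recall = hits / len(true_set) if true_set else 0.0
    f_score = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f_score


def per_recording_scores(truth, predictions):
    """Average P/R/F over recordings; both arguments map recording ID -> label set."""
    scores = [prf(truth[rec], predictions.get(rec, set())) for rec in truth]
    return tuple(sum(column) / len(scores) for column in zip(*scores))


def per_label_scores(truth, predictions):
    """Average P/R/F over labels, treating each label as a set of recordings."""
    labels = set().union(*truth.values())
    scores = []
    for label in labels:
        true_recs = {rec for rec, ls in truth.items() if label in ls}
        pred_recs = {rec for rec, ls in predictions.items() if label in ls}
        scores.append(prf(true_recs, pred_recs))
    return tuple(sum(column) / len(scores) for column in zip(*scores))
```

The genre-only and subgenre-only variants follow by filtering the label sets before calling either function. Per-label averaging gives rare labels the same weight as frequent ones, so it penalizes systems that only predict popular genres.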
[Figures: per-track and per-label F-measure over all labels (genres and subgenres). Legend entries: JKU, DBIS, baselines (popularity, random)]
Results on genres vs subgenres
Conclusions: The Task is Challenging!
● Subgenre recognition is much more difficult; there is a lot of room for improvement!
● Datasets are heavily unbalanced
● High recall, but poor precision for many systems
● AllMusic dataset is the most difficult
● Systems should exploit hierarchies more
● No significant improvement from combining genre sources yet
Team results
● JKU consistently proposes the best systems across all datasets
● DBIS exploits hierarchies and is significantly better than baselines
● KART, SAM-IRIT and ICSI are similar or close to baselines
Future directions
● AcousticBrainz is an ongoing experiment in collaborative extraction of music
knowledge from audio
● MediaEval 2017 is our starting point
● Integrate promising systems into AcousticBrainz
Next iteration of the AcousticBrainz Genre Task
● Exploit hierarchies to improve predictions on subgenre level
● Better combination of multiple genre annotation sources (Task 2)
● New music features?
Reproducibility
● Open development data
○ Music features computed by open-source software (Essentia)
○ Most genre annotations are open and are gathered by open-source
software (MetaDB)
● Open-source code for evaluation and baselines
● Open validation datasets (will be published after workshop)
Thank you!
Differences between genre annotation sources
Recording "Ambassel" by Dub Colossus
● Source 1: Electronic→ambient dub and Electronic→Downtempo
● Source 2: Electronic→dub and Hip-Hop and Reggae→Dub
● Source 3: World→Worldfusion
● Source 4: World→African and world→Worldfusion
Recording "Como Poden" by In Extremo
● Source 1: Pop/Rock→Heavy Metal
● Source 2: Rock→Folk Rock
● Source 3: Metal→Folk Metal
● Source 4: Rock/Pop→Folk Metal and Rock/Pop→Metal
The MediaEval 2017 AcousticBrainz Genre Task: Content-based Music Genre Recognition from Multiple Sources
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
 
Stages in the normal growth curve
Stages in the normal growth curveStages in the normal growth curve
Stages in the normal growth curve
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Chemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfChemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdf
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptxClimate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
Climate Change Impacts on Terrestrial and Aquatic Ecosystems.pptx
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.Selaginella: features, morphology ,anatomy and reproduction.
Selaginella: features, morphology ,anatomy and reproduction.
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 

The MediaEval 2017 AcousticBrainz Genre Task: Content-based Music Genre Recognition from Multiple Sources

  • 1. AcousticBrainz Genre Task Content-based music genre recognition from multiple sources Dmitry Bogdanov, Alastair Porter (Universitat Pompeu Fabra) Julián Urbano (Delft University of Technology) Hendrik Schreiber (tagtraum industries incorporated)
  • 2. Genre recognition in Music Information Retrieval ● A popular task in MIR (Sturm 2014) ● Only a small number of broad genres (e.g., rock, jazz, classical, electronic) ● Almost no studies on more specific genres (subgenres) ● Studies don’t consider the subjective nature of genre labels and taxonomies ● Treated as a single-label classification problem instead of a multi-label problem ● Genre hierarchy is not exploited ● Small datasets B. L. Sturm. 2014. The State of the Art Ten Years After a State of the Art: Future Research in Music Information Retrieval. Journal of New Music Research 43, 2 (2014), 147–172.
  • 3. AcousticBrainz AcousticBrainz: a community database containing music features extracted from audio (https://acousticbrainz.org) (Porter et al. 2015) ● Open data computed by open algorithms ● Built on submissions from the community ● Over 5,600,000 analyzed recordings (tracks) ● ~3,000 music features (bags-of-frames) ● Statistical information about spectral shape, rhythm, tonality, loudness, etc. ● Rich music metadata from MusicBrainz (https://musicbrainz.org) ● Lots of data... What can we do with it? A Porter, D Bogdanov, R Kaye, R Tsukanov, and X Serra. 2015. AcousticBrainz: a community platform for gathering music information obtained from audio. In International Society for Music Information Retrieval (ISMIR’15) Conference. Málaga, Spain, 786–792.
  • 4. The 2017 AcousticBrainz MediaEval Task Content-based music genre recognition from multiple ground truth sources Goal: Predict genre and subgenre of unknown music recordings given precomputed music features Task novelty: ● Four different genre annotation sources (and taxonomies) ● Hundreds of specific subgenres ● Multi-label genre classification problem ● A very large dataset (~2 million recordings in total)
  • 5. Sources of genre information ● Scraped from internet sources ● Discogs (discogs.com) and AllMusic (allmusic.com) ● Explicit genre and subgenre annotations at an album level ● Predefined taxonomies ● AcousticBrainz song → album → genre
  • 6. Sources of genre information ● Tagtraum dataset based on beaTunes ○ Consumer application for Windows and Mac by tagtraum industries incorporated ○ Encourages users to correct metadata ○ Collects anonymized, user-submitted metadata ○ Relationship Song:Genre is 1:n ● Last.fm ○ Folksonomy tags for each song ○ Relative strength (0-100) ● Tag cleaning (normalization and blacklisting) ● Automatic inference of genre-subgenre relations
  • 7. Mapping user genre labels to a genre taxonomy 1. Normalization (lowercase, smart subs, ...) R&B → rnb Rhythm and Blues → rnb R and B → rnb 2. Removal of unwanted labels via blacklisting (80spop, love, charts, djonly, ...) 3. Inferring hierarchical relationships via co-occurrence
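The first two steps on slide 7 (normalization and blacklisting) could be sketched as follows. This is only an illustration: the substitution table `ALIASES`, the rule that strips non-alphanumeric characters, and the function names are assumptions, not the procedure actually used for the Tagtraum and Last.fm data. Only the three R&B spellings and the four blacklisted tags come from the slide.

```python
import re

# Hypothetical substitution table; the slide only shows the three R&B spellings.
ALIASES = {"r&b": "rnb", "rhythm and blues": "rnb", "r and b": "rnb"}
# Blacklisted tags named on the slide (the real list is longer).
BLACKLIST = {"80spop", "love", "charts", "djonly"}


def normalize(label):
    """Lowercase a tag, apply known substitutions, drop non-alphanumerics."""
    key = re.sub(r"\s+", " ", label.strip().lower())
    key = ALIASES.get(key, key)
    # Assumption: spaces and punctuation are removed, as in "80spop".
    return re.sub(r"[^a-z0-9]", "", key)


def clean(labels):
    """Normalize tags, drop blacklisted ones, and remove duplicates in order."""
    kept = []
    for label in labels:
        norm = normalize(label)
        if norm and norm not in BLACKLIST and norm not in kept:
            kept.append(norm)
    return kept


cleaned = clean(["R&B", "Love", "Rhythm and Blues", "Folk Rock"])
```

Here the two R&B spellings collapse into one normalized tag and "Love" is dropped by the blacklist.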
  • 8. Co-Occurrence matrix If a song is labeled with Alternative, how often is it also labeled with Rock?
  • 9. Co-Occurrence matrix What about the other way around? If a song is labeled with Rock, how often is it also labeled with Alternative? Co-Occurrence is not symmetric!
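The asymmetry described on slides 8 and 9 can be made concrete with a small sketch. It computes, for every ordered pair of labels (a, b), the fraction of songs labeled a that are also labeled b. The toy data, the function names, and the threshold rule in `looks_like_subgenre` are assumptions for illustration; the slides do not state the exact inference rule.

```python
from collections import Counter
from itertools import permutations


def cooccurrence(songs):
    """Map (a, b) to the fraction of songs labeled a that are also labeled b."""
    label_counts = Counter()
    pair_counts = Counter()
    for labels in songs:
        unique = set(labels)
        label_counts.update(unique)
        # Ordered pairs, so (a, b) and (b, a) are counted separately.
        pair_counts.update(permutations(unique, 2))
    return {(a, b): n / label_counts[a] for (a, b), n in pair_counts.items()}


def looks_like_subgenre(rates, child, parent, threshold=0.8):
    """Hypothetical rule: child nearly always implies parent, but not vice versa."""
    forward = rates.get((child, parent), 0.0)
    backward = rates.get((parent, child), 0.0)
    return forward >= threshold and forward > backward


# Toy data: every "alternative" song is also "rock", but not the other way round.
songs = [
    {"rock", "alternative"},
    {"rock", "alternative"},
    {"rock"},
    {"rock"},
    {"jazz"},
]
rates = cooccurrence(songs)
```

In this toy data `rates[("alternative", "rock")]` is 1.0 while `rates[("rock", "alternative")]` is 0.5, which is the kind of imbalance that suggests "alternative" sits below "rock" in a hierarchy.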
  • 10. What does the data look like? https://www.youtube.com/watch?v=zlaz7aR7B44
  • 14. Subjectivity in music genre ● Classification tasks typically rely on an agreed answer for ground truth ● What should we do if our ground-truth sources do not agree? ● What if different sources use the same label, but each source defines it differently? Reggae → Dub Electronic → Ambient Dub Electronic → World Fusion World, Dub, Fusion
  • 15. Sub-tasks ● Task 1: Build a separate system for each ground-truth dataset ● Task 2: Can we benefit from combining different ground truths into one system? Task 1 Task 2
  • 16. Development and testing dataset split ● 4 development and 4 testing datasets (70%/15% split, with the remaining 15% kept for future use) ● Album filter ● Each label has at least 40 recordings from 6 release groups in the training dataset (20 recordings from 3 release groups in the test dataset) ● Development dataset statistics:
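The minimum-support rule on slide 16 could be checked with a sketch like the one below. The input layout (tuples of recording ID, release group ID, and label set) and the function name are assumptions; the defaults are the training thresholds quoted on the slide.

```python
from collections import defaultdict


def supported_labels(annotations, min_recordings=40, min_release_groups=6):
    """Return labels backed by enough distinct recordings and release groups.

    `annotations` is an iterable of (recording_id, release_group_id, labels)
    tuples; this layout is assumed for illustration.
    """
    recordings = defaultdict(set)
    groups = defaultdict(set)
    for recording_id, group_id, labels in annotations:
        for label in labels:
            recordings[label].add(recording_id)
            groups[label].add(group_id)
    return {
        label
        for label in recordings
        if len(recordings[label]) >= min_recordings
        and len(groups[label]) >= min_release_groups
    }


# Tiny example with lowered thresholds so the effect is visible.
toy = [
    ("rec1", "album1", {"rock"}),
    ("rec2", "album2", {"rock", "jazz"}),
    ("rec3", "album2", {"jazz"}),
]
kept = supported_labels(toy, min_recordings=2, min_release_groups=2)
```

With the lowered thresholds, "rock" is kept because it spans two release groups, while "jazz" is dropped because both of its recordings come from the same one.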
  • 17. Results Submissions ● Participants from five teams ● Maximum of 5 submissions for each subtask per team ● (5 submissions ✖ 2 tasks ✖ 4 datasets = 40 runs per team) ● 115 runs received in total Baselines ● Random baseline: following the distribution of labels ● Popularity baseline: always predicts the most popular genre
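The two baselines on slide 17 could look roughly like this. It is a sketch under stated assumptions: each baseline predicts exactly one label per recording, label frequencies are taken from the training annotations, and all function names are hypothetical. The task's released baseline code may differ.

```python
import random
from collections import Counter


def label_frequencies(train_labels):
    """Count how many training recordings carry each label."""
    counts = Counter()
    for labels in train_labels:
        counts.update(set(labels))
    return counts


def popularity_baseline(train_labels, n_test):
    """Always predict the single most frequent training label."""
    top, _ = label_frequencies(train_labels).most_common(1)[0]
    return [{top} for _ in range(n_test)]


def random_baseline(train_labels, n_test, seed=0):
    """Draw one label per recording, weighted by training frequency."""
    counts = label_frequencies(train_labels)
    labels = sorted(counts)
    weights = [counts[label] for label in labels]
    rng = random.Random(seed)  # seeded so runs are repeatable
    return [{rng.choices(labels, weights=weights)[0]} for _ in range(n_test)]


train = [{"rock"}, {"rock", "metal"}, {"rock"}, {"jazz"}]
popular = popularity_baseline(train, 3)
sampled = random_baseline(train, 3)
```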
  • 18. Methodologies ● Manual feature preselection ● Classifiers ○ Hierarchical (SVMs + extra trees) ○ Neural networks ○ Random Forest Classifiers ● Task 2 (combining datasets) ○ Genre similarity based on text string matching distance, voting ○ Genre/subgenre similarity based on co-occurrence, conversion matrix, weighting
  • 19. Evaluation metrics Effectiveness: Precision, Recall and F-score ● Per recording, all labels (genres and subgenres) ● Per recording, only genres ● Per recording, only subgenres ● Per label, all recordings ● Per genre label, all recordings ● Per subgenre label, all recordings
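The difference between the per-recording and per-label views on slide 19 can be illustrated with a short sketch. The unweighted averaging, the convention of scoring 0 when a denominator is empty, and taking the label set from the ground truth are all assumptions; the official evaluation scripts may handle these cases differently. Genre-only or subgenre-only scores would follow by filtering the label sets before calling these functions.

```python
def prf(true, pred):
    """Precision, recall and F-score for two sets (0.0 when undefined)."""
    tp = len(true & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(true) if true else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


def mean_scores(scores):
    """Unweighted mean of a list of (precision, recall, F) triples."""
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))


def per_recording(truth, predictions):
    """Score each recording's label set, then average over recordings."""
    return mean_scores(
        [prf(truth[rec], predictions.get(rec, set())) for rec in truth]
    )


def per_label(truth, predictions):
    """Score each label's set of recordings, then average over labels."""
    labels = set().union(*truth.values())
    scores = []
    for label in labels:
        true_recs = {r for r, ls in truth.items() if label in ls}
        pred_recs = {r for r in truth if label in predictions.get(r, set())}
        scores.append(prf(true_recs, pred_recs))
    return mean_scores(scores)


truth = {"r1": {"rock", "folkrock"}, "r2": {"jazz"}}
predictions = {"r1": {"rock"}, "r2": {"rock"}}
by_recording = per_recording(truth, predictions)
by_label = per_label(truth, predictions)
```

On this toy input the per-recording F-score is 1/3 while the per-label F-score is 2/9: a system that only ever predicts a popular genre is penalized more when every label counts equally.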
  • 20. [Figure: per-track F-measure and per-label F-measure, all labels (genres and subgenres). Labels shown: JKU, DBIS, baselines (popularity, random)]
  • 21. Results on genres vs subgenres
  • 22. Conclusions: The Task is Challenging! ● Subgenre recognition is much more difficult, with much room for improvement! ● Datasets are heavily unbalanced ● High recall, but poor precision for many systems ● AllMusic dataset is the most difficult ● Systems should exploit hierarchies more ● No significant improvement from combining genre sources yet Team results ● JKU consistently proposes the best systems across all datasets ● DBIS exploits hierarchies and is significantly better than baselines ● KART, SAM-IRIT and ICSI are similar or close to baselines
  • 23. Future directions ● AcousticBrainz is an ongoing experiment in collaborative extraction of music knowledge from audio ● MediaEval 2017 is our starting point ● Integrate promising systems into AcousticBrainz Next iteration of the AcousticBrainz Genre Task ● Exploit hierarchies to improve predictions on subgenre level ● Better combination of multiple genre annotation sources (Task 2) ● New music features?
  • 24. Reproducibility ● Open development data ○ Music features computed by open-source software (Essentia) ○ Most genre annotations are open and are gathered by open-source software (MetaDB) ● Open-source code for evaluation and baselines ● Open validation datasets (will be published after workshop)
  • 26. Differences between genre annotation sources Recording "Ambassel" by Dub Colossus ● Source 1: Electronic→ambient dub and Electronic→Downtempo ● Source 2: Electronic→dub and Hip-Hop and Reggae→Dub ● Source 3: World→Worldfusion ● Source 4: World→African and world→Worldfusion Recording "Como Poden" by In Extremo ● Source 1: Pop/Rock→Heavy Metal ● Source 2: Rock→Folk Rock ● Source 3: Metal→Folk Metal ● Source 4: Rock/Pop→Folk Metal and Rock/Pop→Metal