2. ABSTRACT
• A novel approach that provides the user with an automatically generated playlist of songs based on the user's mood.
• Music plays a very important role in daily human life and in modern advanced technology.
• The difficulties in the creation of large playlists can be overcome here.
• This music player itself selects songs according to the current mood of the user.
3. Cont…
• Existing methods for automating the playlist generation process are computationally slow, less accurate, and sometimes even require additional hardware such as EEG devices or other sensors.
• The proposed system, based on the extraction of facial expressions, generates a playlist automatically, thereby reducing effort and time.
4. Cont…
• Both user-dependent and user-independent datasets can be handled here, and the facial expressions are captured using an inbuilt camera.
• The accuracy of the detection algorithm is around 85-90% for real-time images, while for static images it is around 98-100%.
5. INTRODUCTION
• Unlike other approaches, this novel approach removes the burden the user faces of manually browsing through the playlist to select songs.
• Here, an efficient and accurate algorithm generates a playlist based on the current emotional state and behavior of the user.
6. Cont…
• The introduction of Audio Emotion Recognition (AER) and Music Information Retrieval (MIR) into traditional music players enabled automatic parsing of the playlist into various classes of emotions and moods.
• Facial expression is the most ancient and natural way of expressing feelings, emotions, and mood, and facial-expression-based algorithms require less computation, time, and cost.
7. Emotion Basics
• Facial expressions are categorized into 5 classes: anger, joy, surprise, sadness, and excitement.
• An emotion model is proposed that classifies a song into one of 7 emotion classes, viz. sad, joy-anger, joy-surprise, joy-excitement, joy, anger, and sad-anger.
8. Requirement
• Track the user's emotion
• Recommend by sorting the playlist based on the user's current emotion
• Sort songs by 2 factors:
o Relevancy to user preference
o Effect on user emotion
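The two-factor sort above can be sketched in Python. This is only an illustrative sketch: the deck does not specify a scoring scheme, so the `relevance` and `emotion_effect` fields and the `sort_playlist` function are hypothetical.

```python
# Hypothetical song records: "relevance" models relevancy to user preference,
# "emotion_effect" models the song's effect on the user's current emotion.
songs = [
    {"title": "Track A", "relevance": 0.9, "emotion_effect": 0.2},
    {"title": "Track B", "relevance": 0.7, "emotion_effect": 0.9},
    {"title": "Track C", "relevance": 0.8, "emotion_effect": 0.8},
]

def sort_playlist(songs):
    """Sort songs by relevance first, then by emotion effect, best first."""
    return sorted(songs,
                  key=lambda s: (s["relevance"], s["emotion_effect"]),
                  reverse=True)
```

Sorting on a tuple makes relevance the primary key and emotion effect the tie-breaker; a weighted sum of the two factors would be an equally plausible design.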
9. METHODOLOGY
• Three major modules:
Emotion extraction module (EEM)
Audio feature extraction module (AEM)
Emotion-Audio recognition module
• EEM and AEM are two separate modules; the Emotion-Audio recognition module performs the mapping between them by querying the audio meta-data file.
• The EEM and AEM are combined in the Emotion-Audio integration module.
10. Emotion extraction module
• Done by analysis of images: an image of the user is captured using a webcam, or a stored image is read from the hard disk.
• The acquired image undergoes image enhancement in the form of tone mapping in order to restore the original contrast of the image.
• It is then converted into binary image format for face detection using the Viola-Jones algorithm (Frontal CART property).
11. Viola and Jones algorithm
• The Viola-Jones object detection framework, proposed in 2001 by Paul Viola and Michael Jones, was the first object detection framework to provide competitive object detection rates in real time.
• Although it can be trained to detect a variety of object classes, it was motivated primarily by the problem of face detection.
12. Cont…
• A computer needs precise instructions and constraints for the detection. To make the task more manageable, Viola-Jones requires full-view, frontal, upright faces.
• Thus, in order to be detected, the entire face must point towards the camera and should not be tilted to either side.
• Here the 'frontal CART' property is used; it detects only frontal faces that are upright and forward facing.
13. Cont…
• Here this property is employed with a merging threshold of 16. This value helps merge the multiple detected boxes into a single bounding box.
• A bounding box is the physical extent of a given object and can be used to graphically define the object's shape.
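The merging of multiple overlapping detections into one bounding box can be illustrated with a small sketch. This is not the actual Viola-Jones merging logic (MATLAB's detector handles that internally); the `merge_boxes` helper below is a simplified, hypothetical greedy union of overlapping [x, y, width, height] boxes.

```python
def overlaps(a, b):
    """True if two [x, y, w, h] boxes intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def merge_boxes(boxes):
    """Greedily coalesce overlapping boxes into single bounding boxes."""
    merged = []
    for box in boxes:
        for i, m in enumerate(merged):
            if overlaps(box, m):
                # Replace the stored box with the union of the two boxes.
                x = min(box[0], m[0])
                y = min(box[1], m[1])
                x2 = max(box[0] + box[2], m[0] + m[2])
                y2 = max(box[1] + box[3], m[1] + m[3])
                merged[i] = [x, y, x2 - x, y2 - y]
                break
        else:
            merged.append(list(box))
    return merged
```

Two overlapping detections of the same face collapse to one box, while a detection elsewhere in the image stays separate.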
14. MATLAB code using the frontal CART model for face detection

function [] = Viola_Jones_img(Img)
% Viola_Jones_img(Img)
%   Img - input image
%   Example call: Viola_Jones_img(imread('name_of_the_picture.jpg'))

% Create a cascade face detector (the default model is 'FrontalFaceCART')
faceDetector = vision.CascadeObjectDetector;

% Detect faces; each row of bboxes is [x y width height]
bboxes = step(faceDetector, Img);

% Show the image and draw a yellow rectangle around each detected face
figure, imshow(Img), title('Detected faces'); hold on
for i = 1:size(bboxes, 1)
    rectangle('Position', bboxes(i,:), 'LineWidth', 2, 'EdgeColor', 'y');
end
end
15. Advantages
Extremely fast feature computation
Efficient feature selection
Scale and location invariant detector
Instead of scaling the image itself (e.g. pyramid-filters), we
scale the features.
Such a generic detection scheme can be trained for detection
of other types of objects (e.g. cars, hands)
16. Disadvantages
Detector is most effective only on frontal images of faces
Sensitive to lighting conditions
We might get multiple detections of the same face, due to
overlapping sub-windows.
17. EEM Cont…
• The output of the Viola-Jones face detection block forms the input to the facial feature extraction block.
• Bounding box equations for extracting the features of the mouth:
1. X(start of mouth) = X(mid of nose) − (X(end of nose) − X(start of nose))
2. X(end of mouth) = X(mid of nose) + (X(end of nose) − X(start of nose))
3. Y(start of mouth) = Y(mid of nose) + 15
4. Y(end of mouth) = Y(start of mouth) + 103
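The four equations above translate directly into code. A minimal sketch, assuming pixel coordinates with the origin at the top-left; the constant offsets 15 and 103 come from the slide and presumably assume a fixed image resolution.

```python
def mouth_bounding_box(nose_start_x, nose_mid_x, nose_end_x, nose_mid_y):
    """Compute the mouth bounding box from nose landmarks (equations 1-4)."""
    nose_width = nose_end_x - nose_start_x
    mouth_start_x = nose_mid_x - nose_width     # eq. 1
    mouth_end_x = nose_mid_x + nose_width       # eq. 2
    mouth_start_y = nose_mid_y + 15             # eq. 3
    mouth_end_y = mouth_start_y + 103           # eq. 4
    return mouth_start_x, mouth_end_x, mouth_start_y, mouth_end_y
```

In effect, the mouth region is taken to be twice as wide as the nose, centered under it, with a fixed vertical extent.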
18. Audio feature extraction module
• The list of songs forms the input audio files; each file is converted to a 16-bit PCM mono signal at a sampling rate of 48.6 kHz. The conversion is done using Audacity.
• Feature extraction from the converted files is done by three toolbox units:
- MIR 1.5 Toolbox
- Chroma Toolbox
- Auditory Toolbox
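The deck performs the conversion in Audacity, but the core stereo-to-mono down-mix step of producing a 16-bit PCM mono signal can be illustrated in pure Python. The `stereo_to_mono` function is a hypothetical sketch that averages interleaved left/right 16-bit samples; it does not cover resampling.

```python
import struct

def stereo_to_mono(frames: bytes) -> bytes:
    """Average interleaved 16-bit little-endian stereo samples to mono."""
    n = len(frames) // 4  # 4 bytes per stereo frame (2 x int16)
    samples = struct.unpack("<%dh" % (2 * n), frames[:4 * n])
    mono = [(samples[2 * i] + samples[2 * i + 1]) // 2 for i in range(n)]
    return struct.pack("<%dh" % n, *mono)
```

For example, a stereo frame with left = 100 and right = 200 down-mixes to the single mono sample 150.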
19. MIR 1.5 Toolbox
• Features like tonality, rhythm, structure, etc. are extracted using an integrated set of functions written in MATLAB.
• In addition to the basic computational processes, the toolbox also includes higher-level musical feature extraction tools, whose alternative strategies, and their multiple combinations, can be selected by the user.
20. Cont…
• The toolbox was initially conceived in the context of the Brain
Tuning project financed by the European Union (FP6-NEST).
One main objective was to investigate the relation between
musical features and music-induced emotion and the
associated neural activity.
21. Chroma Toolbox
• It is a MATLAB implementation for extracting various types of novel pitch-based and chroma-based audio features.
• It contains feature extractors for pitch features as well as parameterized families of variants of chroma-like features.
22. Auditory Toolbox
• Features like spectral centroid, spectral flux, spectral roll-off, kurtosis, and 15 MFCC coefficients are extracted using the Auditory Toolbox, implemented in MATLAB.
• Audio signals are categorized into 8 types, viz. sad, joy-anger, joy-surprise, joy-excitement, joy, anger, sad-anger, and others.
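Of the features listed above, the spectral centroid is the simplest to state: it is the magnitude-weighted mean frequency of a spectrum. A minimal sketch (not the Auditory Toolbox implementation) operating on a precomputed magnitude spectrum:

```python
def spectral_centroid(freqs, magnitudes):
    """Magnitude-weighted mean frequency of a spectrum, in Hz.

    freqs      -- bin center frequencies
    magnitudes -- corresponding spectral magnitudes
    """
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: centroid is conventionally set to 0
    return sum(f * m for f, m in zip(freqs, magnitudes)) / total
```

A higher centroid corresponds to "brighter" audio, which is one reason it is useful for separating energetic from depressing songs.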
23. 8 Types
1. Songs that convey cheerfulness, energy, and playfulness are classified under joy.
2. Songs that sound very depressing are classified under sad.
3. Songs that reflect attitude or revenge are classified under anger.
4. Songs with anger in a playful mood are classified under the joy-anger category.
5. Songs with a very depressed and angry mood are classified under the sad-anger category.
6. Songs which reflect the excitement of joy are classified under the joy-excitement category.
7. Songs which reflect the surprise of joy are classified under the joy-surprise category.
8. All other songs fall under the 'others' category.
24. Emotion-Audio recognition module
• The emotion extraction module and the audio feature extraction module are finally mapped and combined using the Emotion-Audio integration module.
• The features extracted for the songs are stored as meta-data in the database. Mapping is performed by querying the meta-data database.
28. Related research
• Emotion recognition by appearance-based feature extraction and geometry-based feature extraction.
• An accurate and efficient statistics-based approach for analyzing extracted facial expression features…
29. CONCLUSION
• The total time taken by the proposed system is very low, thus helping achieve better real-time performance and efficiency.

Module                              Time Taken (sec)
Face Detection                      0.8126
Facial Feature Extraction           0.9216
Emotion Extraction Module           0.9994
Emotion-Audio Integration Module    0.0006
Proposed System                     1.0000
30. FUTURE SCOPE
• The future scope of the system would be to design a mechanism that would be helpful in music therapy treatment and provide the music therapist the help needed to treat patients suffering from disorders like mental stress, anxiety, acute depression, and trauma. The proposed system also aims to avoid, in the future, the unpredictable results produced in extremely bad lighting conditions and very poor camera resolution.
31. REFERENCE
• Chang, C. Hu, R. Feris, and M. Turk, "Manifold based analysis of facial expression," Image and Vision Computing, vol. 24, pp. 605–614, June 2006.
• A. Habibzad, Ninavin, and Mir Kamal Mirnia, "A new algorithm to classify face emotions through eye and lip feature by using particle swarm optimization."
• Byeong-jun Han, Seungmin Rho, Roger B. Dannenberg, and Eenjun Hwang, "SMERS: Music emotion recognition using support vector regression," 10th ISMIR, 2009.
• Alvin I. Goldman and Chandra Sekhar Sripada, "Simulationist models of face-based emotion recognition."
• Carlos A. Cervantes and Kai-Tai Song, "Embedded Design of an Emotion-Aware Music Player," IEEE International Conference on Systems, Man, and Cybernetics, pp. 2528–2533, 2013.