2. Problem Introduction
• Sentiment analysis aims to determine the
attitude of a speaker or a writer with respect
to some topic or the overall contextual
polarity of a document.
• In our project, we use pronoun-based
features to classify lyrics.
3. Problem Introduction
• We work on four classification topics. They
are:
– Mood:
• Self-Centered
• Self-Reflective
– Target Figure:
• Speaking Globally
• Speaking about a relationship
• We develop a binary classifier for each category to
decide whether a song belongs to it or not.
4. Dataset
• 222 songs from different genres and singers
are selected and manually labeled by Luce and
me.
5. Features
• We define features based on pronouns
– Pronoun frequency (PF)
– Regular expression (RE)
– Contextual (CON)
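As a rough illustration of the pronoun frequency (PF) feature, the sketch below counts how often each pronoun occurs relative to the number of tokens in a lyric. The pronoun list, the tokenizer, and the normalization by token count are assumptions made for illustration; the slides do not specify them.

```python
import re
from collections import Counter

# Hypothetical pronoun inventory; the list actually used in the project is not given.
PRONOUNS = ["i", "me", "my", "you", "your", "we", "us", "our",
            "he", "him", "she", "her", "they", "them"]

def pronoun_frequency(lyrics):
    """Return one relative frequency per entry in PRONOUNS."""
    # Lowercase and split into alphabetic tokens (apostrophes kept inside words).
    tokens = re.findall(r"[a-z']+", lyrics.lower())
    if not tokens:
        return [0.0] * len(PRONOUNS)
    counts = Counter(tokens)
    # Assumption: frequencies are normalized by the total token count.
    return [counts[p] / len(tokens) for p in PRONOUNS]

vector = pronoun_frequency("You raise me up. I miss her so much.")
```

Each song then maps to a fixed-length vector, one element per pronoun, which can be passed to a classifier.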
7. RE
• r"I .* her": e.g. “I miss her so much.”
• r"You .* my": e.g. “You got my attention.”
• r"^I .*": e.g. “I am riding a tank.”
• r"They .* us": e.g. “They wanna control us.”
• r"Let us .*": e.g. “Let us share the world.”
• r"You .* me": e.g. “You raise me up.”
• r"We .*you": e.g. “We will rock you for free.”
• ……
The value of each element in the feature vector is
$f_k = N_k / N_S$
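The regular expression feature can be sketched as follows. The slide does not define the two counts, so this sketch assumes that N_k is the number of lyric lines matching pattern k and N_S is the total number of non-empty lines in the song; the function name `regex_features` and the toy song are hypothetical.

```python
import re

# Patterns copied from the slide; the full list is elided there ("……").
PATTERNS = [r"I .* her", r"You .* my", r"^I .*", r"They .* us",
            r"Let us .*", r"You .* me", r"We .*you"]

def regex_features(lines):
    """Compute f_k = N_k / N_S for each pattern k.

    Assumption: N_k counts the lines matching pattern k, and
    N_S is the number of non-empty lines in the song.
    """
    lines = [ln.strip() for ln in lines if ln.strip()]
    n_s = len(lines)
    if n_s == 0:
        return [0.0] * len(PATTERNS)
    compiled = [re.compile(p) for p in PATTERNS]
    # re.search is used so unanchored patterns may match anywhere in a line.
    return [sum(1 for ln in lines if rx.search(ln)) / n_s for rx in compiled]

song = ["I miss her so much.", "You raise me up.",
        "They wanna control us.", "Nothing here."]
features = regex_features(song)
```

With this toy song, a pattern matched by one of the four lines yields a feature value of 0.25, and an unmatched pattern yields 0.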
9. Classifier - Boosting
• Boosting: Using multiple weak classifiers to
build a strong one.
• For the first three features, we train a linear
SVM on each. Then we build a boosting
classifier on top of these SVMs.
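The slides do not say which boosting algorithm is used. As one possible reading, the sketch below applies discrete AdaBoost to a fixed pool of weak classifiers, where each weak classifier stands in for one per-feature SVM and is represented only by its precomputed predictions (+1 or -1). The function names and the toy predictions are hypothetical, not results from the project.

```python
import math

def adaboost(weak_preds, labels, rounds=3):
    """Discrete AdaBoost over a fixed pool of weak classifiers.

    weak_preds[j][i] is classifier j's prediction (+1/-1) for sample i;
    labels[i] is the true label (+1/-1). Returns a list of (j, alpha) pairs.
    """
    n = len(labels)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Weighted error of each weak classifier under the current weights.
        errors = [sum(w for w, p, y in zip(weights, preds, labels) if p != y)
                  for preds in weak_preds]
        j = min(range(len(errors)), key=errors.__getitem__)
        err = min(max(errors[j], 1e-10), 1 - 1e-10)  # avoid log(0)
        if err >= 0.5:
            break  # no classifier is better than chance
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((j, alpha))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [w * math.exp(-alpha * y * p)
                   for w, p, y in zip(weights, weak_preds[j], labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, weak_preds, i):
    """Sign of the alpha-weighted vote for sample i."""
    score = sum(alpha * weak_preds[j][i] for j, alpha in ensemble)
    return 1 if score >= 0 else -1

# Toy data: made-up outputs of three SVMs (PF, RE, CON) on six songs.
labels = [1, 1, 1, -1, -1, -1]
weak_preds = [
    [1, 1, -1, -1, -1, -1],   # PF-based SVM, wrong on the third song
    [1, -1, 1, -1, -1, -1],   # RE-based SVM, wrong on the second song
    [1, 1, 1, 1, -1, -1],     # CON-based SVM, wrong on the fourth song
]
ensemble = adaboost(weak_preds, labels)
predictions = [predict(ensemble, weak_preds, i) for i in range(len(labels))]
```

In this toy case each weak classifier makes one mistake on a different song, and the weighted vote classifies all six songs correctly, which is the intuition behind combining weak classifiers into a stronger one.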
11. Difficulties and Challenges
• Feature Selection:
– From frequency to context
• Dataset Bias:
– All data is human labelled (by us)
– Mood metric
12. Conclusion
• We developed and tested three kinds of
pronoun features. In addition, we trained a
boosting classifier for a better result.
• Pronoun features are more effective on the
“Target Figure” problem than on the “Mood”
problem.
• This kind of detector can be used in song
search.