4. • “How can customers be grouped for targeted marketing?”
• “Which neighborhoods in a country are most similar to each other?”
• “What groups of insurance policy holders have high claim costs?”
• “How can the products in a store be grouped based on their attributes?”
• “How can pictures be grouped based on their descriptions?”
Examples of Clustering Applications
6. • “Which category of products is most interesting to this customer?”
• “Is this movie a romantic comedy, documentary, or thriller?”
• “Is this review written by a customer or a robot?”
• “Will the customer buy this product?”
• “Is this email spam or not spam?”
Examples of Classification Applications
8. • “Which movies should be recommended to a user?”
• “If the user just listened to a song, which song would they like next?”
• “Which news articles are relevant for a user in a particular context?”
• “Which advertisements should be displayed for a user on a mobile app?”
• “Which products are frequently bought together?”
Examples of Recommendation Applications
9. In May 2015, Flickr released an automatic image tagging capability that mistakenly labeled a Black man as an ape.
Soon afterwards, Google released a photo labeling tool similar to Flickr's, which made similar mistakes: Black people were tagged as gorillas.
A 2015 Carnegie Mellon University study showed that Google displayed ads in a way that discriminated based on the gender of the user.
Failures of Machine Learning
11. Derive new features from the initial data set:
1. Aggregations: Count, Sum, Min, Max, Avg, Std
2. Temporal: Elapsed time, Trends
3. Continuous to Categorical: Converting real values to enumerations.
4. Categorical to Binary: Converting enumerations to binary features.
5. Domain-specific derived features.
Feature Engineering
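The first four kinds of derived features listed on this slide can be illustrated with a minimal sketch using only the Python standard library. The purchase records, the reference date, the bucket edges, the segment names, and the helper functions `bucket` and `one_hot` are all hypothetical and chosen for illustration. Domain-specific features (item 5) depend on the problem and are not shown.

```python
from datetime import date
from statistics import mean, pstdev

# Hypothetical purchase records for one customer: (purchase date, amount),
# assumed to be sorted by date.
purchases = [
    (date(2015, 1, 10), 20.0),
    (date(2015, 2, 14), 35.0),
    (date(2015, 3, 1), 50.0),
]
reference_date = date(2015, 4, 1)  # assumed "today" for elapsed-time features
segment = "silver"                 # hypothetical categorical attribute
SEGMENTS = ["bronze", "silver", "gold"]


def bucket(value, edges=(25.0, 40.0), labels=("low", "medium", "high")):
    """Continuous to categorical: map a real value to an enumeration."""
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]


def one_hot(value, categories):
    """Categorical to binary: one 0/1 feature per category."""
    return {f"is_{c}": int(value == c) for c in categories}


amounts = [amount for _, amount in purchases]
dates = [d for d, _ in purchases]

features = {
    # 1. Aggregations
    "count": len(amounts),
    "sum": sum(amounts),
    "min": min(amounts),
    "max": max(amounts),
    "avg": mean(amounts),
    "std": pstdev(amounts),  # population standard deviation
    # 2. Temporal: elapsed time and a crude trend (last minus first amount)
    "days_since_last": (reference_date - max(dates)).days,
    "trend": amounts[-1] - amounts[0],
    # 3. Continuous to categorical
    "avg_bucket": bucket(mean(amounts)),
}
# 4. Categorical to binary
features.update(one_hot(segment, SEGMENTS))

print(features)
```

The result is a single flat record per customer, the shape most learning algorithms expect as input. In practice the same steps are usually applied to a whole table at once with a data-frame library.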
13. Question: How do you evaluate the predictive accuracy of your model?
Answer: Partition the data set into Train and Test sets.
Train is like the “past” you learn from, and Test is like the “future” you predict.
Evaluation
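The Train/Test partition can be sketched as follows, using only the Python standard library. The data set, the 25% test fraction, the random shuffle, and the toy threshold "model" are assumptions made for illustration and are not taken from the slides. For time-ordered data, splitting by date instead of shuffling would match the "past" and "future" analogy more closely.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_threshold(train):
    """Toy model: midpoint between the two class means of a single feature."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, rows):
    """Fraction of rows whose label matches the prediction x > threshold."""
    correct = sum(int(x > threshold) == y for x, y in rows)
    return correct / len(rows)


# Hypothetical labeled data: (feature value, label), label is 1 when x >= 10.
data = [(float(x), int(x >= 10)) for x in range(20)]

train, test = train_test_split(data)
threshold = fit_threshold(train)         # learn from the "past" only
test_accuracy = accuracy(threshold, test)  # score on the unseen "future"

print(f"train size={len(train)}, test size={len(test)}")
print(f"test accuracy={test_accuracy:.2f}")
```

The key point is that `fit_threshold` never sees the test rows, so the accuracy measured on them estimates how the model would behave on new data.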