A Framework for Holistic User Modeling Merging Heterogeneous Digital Footprints - HUM 2018 – Holistic User Modeling Workshop jointly held with
UMAP 2018 – 26th International
Conference on User Modeling,
Adaptation and Personalization
Singapore - July 8, 2018
1. @cataldomusto
A Framework for Holistic User Modeling
Merging Heterogeneous Digital Footprints
CATALDO MUSTO, GIOVANNI SEMERARO, COSIMO LOVASCIO,
MARCO DE GEMMIS, PASQUALE LOPS
UNIVERSITÀ DEGLI STUDI DI BARI ‘ALDO MORO’ - ITALY
cataldo.musto@uniba.it
HUM 2018 – Holistic User Modeling Workshop
jointly held with
UMAP 2018 – 26th International
Conference on User Modeling,
Adaptation and Personalization
Singapore - July 8, 2018
2. Background
Cataldo Musto, Giovanni Semeraro, Cosimo Lovascio, Marco de Gemmis, Pasquale Lops
A Framework for Holistic User Modeling Merging Heterogeneous Digital Footprints. Holistic User Modeling Workshop @UMAP 2018. Singapore. July 8, 2018
3. The ‘Egosystem’
4. Data Silos Problem
5. What about personalization?
6. Research Questions
“Is it possible to build a unique representation of the user by merging data extracted from personal devices with data extracted from social networks?”
7. Holistic User Model
8. Holistic User Model
Affects
Demographics
Interests
Behaviors
Social Relations
Knowledge and Skills
Physical States
Cognitive Aspects
Inspired by
Cena, F., Likavec, S., and Rapp, A. Real world user model: Evolution of user modeling triggered by advances in wearable and ubiquitous computing. Information Systems Frontiers, 2018
9. Holistic User Model
Name | Description
Demographics | Models demographic data (age, weight, name, etc.)
Interests | Models interests and preferences
Affects | Models mood and emotions of the user
Cognitive Aspects | Models cognitive traits (personality, empathy, etc.)
Behaviors | Models the activities of the user
Social Relations | Models the connections and the relations of the user
Physical States | Models physical data (sleep, food, heart rate, etc.)
Knowledge and Skills | Models knowledge and skills of the user
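The eight facets listed above could be represented as a single container type. The following is a minimal sketch, not the authors' implementation: the facet names come from the slides, while the class name `HolisticUserProfile` and the field types are assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


@dataclass
class HolisticUserProfile:
    """One field per facet; the field types are illustrative assumptions."""
    demographics: Dict[str, Any] = field(default_factory=dict)       # age, weight, name, ...
    interests: List[str] = field(default_factory=list)               # interests and preferences
    affects: Dict[str, Any] = field(default_factory=dict)            # mood and emotions
    cognitive_aspects: Dict[str, Any] = field(default_factory=dict)  # personality, empathy, ...
    behaviors: List[Dict[str, Any]] = field(default_factory=list)    # activities of the user
    social_relations: List[str] = field(default_factory=list)        # connections and relations
    physical_states: Dict[str, Any] = field(default_factory=dict)    # sleep, food, heart rate, ...
    knowledge_and_skills: List[str] = field(default_factory=list)    # knowledge and skills


# Names of all facets, derived from the dataclass definition.
FACETS = [f.name for f in fields(HolisticUserProfile)]

profile = HolisticUserProfile()
profile.demographics["name"] = "Alice"   # made-up sample value
profile.interests.append("sports")
```

Each facet starts empty and is filled independently, so a profile stays valid even when only some sources are connected.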
10. How can we build a Holistic User Model?
11. Holistic User Modeling Workflow
12. Holistic User Modeling Workflow
13. Data Acquisition - Sources
Twitter
Facebook
LinkedIn
Android
FitBit
Instagram
14. Data Acquisition
Twitter
Data | Description
Profile | Demographic information extracted from the profile (name, location, website, #followers, #following, etc.)
Post | Textual content of each post, date, language, #likes and #retweets, latitude and longitude (if any)
Connections | Username, kind of relation (following/follower)
15. Data Acquisition
Facebook
Data | Description
Profile | Demographic information extracted from the profile (name, surname, profile pic, sex, age, location)
Post | Textual content of the post, date, language, story (if any)
Friends | Username
Likes | Name, category, and description of the page
16. Data Acquisition
LinkedIn
Data | Description
Profile | Demographic information extracted from the profile (name, surname, profile pic, language, work category)
Position | Description and category of the current job
17. Data Acquisition
Android Devices
Data | Description
GPS Data | Latitude, longitude, accuracy, timestamp
Contacts | Name, phone number, interactions
App | Name of the app, category, daily usage
Display | Display mode (on/off)
Usage | Network used (Wi-Fi, 4G, etc.) and traffic
Device | Brand and model of the phone
18. Data Acquisition
Instagram
Data | Description
Profile | Demographic information extracted from the profile (name, surname, profile pic)
Post | Textual content of the post, hashtags, location (if any)
Friends | Following/followers
19. Data Acquisition
FitBit
Data | Description
Profile | Demographic information extracted from the profile (name, surname, profile pic, birth date, height, weight, sex)
Activities | Kind of activity (running, walking), duration, calories, distance
Heart Rate | Heart rate and timestamp
Sleep | Date, sleep duration, sleep quality
Food | Food, calories, date, time
20. Holistic User Modeling Workflow
21. Data Processing and Enrichment
Data are processed through Natural Language Processing and Machine Learning pipelines
Natural Language Processing
◦ Language Detection
◦ Stop-words Removal
◦ Lemmatization
◦ Entity Linking
◦ Wikipedia Categories Identification
Machine Learning (work-in-progress)
◦ Sentiment Analysis for Italian Tweets
◦ Empathy and Personality Detection from Text
◦ Models for Activity Detection
◦ etc.
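Some of the Natural Language Processing steps listed above can be chained as in the toy sketch below. It is only an illustration of the flow (stop-word removal, lemmatization, entity linking): the word lists, the lemma table, and the entity identifiers are tiny hypothetical stand-ins, language detection and category identification are left out, and a real pipeline would rely on dedicated tools for each step.

```python
import re
from typing import Dict, List

# Hypothetical, deliberately tiny resources used only for illustration.
STOP_WORDS = {"the", "a", "an", "is", "in", "of", "and", "to", "i"}
LEMMAS = {"running": "run", "matches": "match", "watched": "watch"}
ENTITIES = {"juventus": "Juventus_F.C.", "bari": "Bari"}  # made-up entity identifiers


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def process(text: str) -> Dict[str, List[str]]:
    """Run stop-word removal, lemmatization, and dictionary-based entity linking."""
    tokens = [t for t in tokenize(text) if t not in STOP_WORDS]
    lemmas = [LEMMAS.get(t, t) for t in tokens]
    entities = [ENTITIES[t] for t in lemmas if t in ENTITIES]
    return {"lemmas": lemmas, "entities": entities}


result = process("I watched the Juventus matches in Bari")
```

Keeping each step as a separate, small function makes it easy to swap a toy component for a real one without touching the rest of the chain.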
22. Holistic User Modeling Workflow
23. Holistic Profile Builder
Automatically maps raw data to the facets of the holistic user profile and sets privacy settings
Two mapping mechanisms are implemented
◦ Explicit Mapping
Raw data are copied as-is into the corresponding facet of the profile (e.g., the name is copied into the «demographics» facet)
◦ Implicit Mapping
Raw data are processed through algorithms and the output is used to populate a specific facet (e.g., a set of posts is processed through sentiment analysis to get the user's mood at a certain moment)
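The difference between the two mapping mechanisms can be sketched as follows. This is an illustrative sketch under stated assumptions: the rule table `EXPLICIT_RULES`, the function names, and the word-counting `toy_mood` function (a stand-in for a real sentiment model) are all hypothetical and not taken from the platform.

```python
from typing import Any, Callable, Dict

Profile = Dict[str, Dict[str, Any]]

# Hypothetical rule table: raw field name -> facet it is copied into.
EXPLICIT_RULES = {"name": "demographics", "weight": "demographics"}


def apply_explicit(raw: Dict[str, Any], profile: Profile) -> None:
    """Explicit mapping: copy raw values unchanged into their facet."""
    for field_name, facet in EXPLICIT_RULES.items():
        if field_name in raw:
            profile.setdefault(facet, {})[field_name] = raw[field_name]


def apply_implicit(raw: Dict[str, Any], profile: Profile, facet: str,
                   attribute: str, infer: Callable[[Dict[str, Any]], Any]) -> None:
    """Implicit mapping: store the output of an inference function in a facet."""
    profile.setdefault(facet, {})[attribute] = infer(raw)


def toy_mood(raw: Dict[str, Any]) -> str:
    """Stand-in for a sentiment model: compares counts of two cue words."""
    text = " ".join(raw.get("posts", [])).lower()
    return "positive" if text.count("love") >= text.count("hate") else "negative"


raw = {"name": "Alice", "posts": ["I love Singapore", "Love this workshop"]}
profile: Profile = {}
apply_explicit(raw, profile)
apply_implicit(raw, profile, "affects", "mood", toy_mood)
```

Passing the inference function as a parameter keeps the builder independent of any specific algorithm, which fits the slides' note that the Machine Learning components are still work in progress.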
24. Holistic Profile Builder
25. Holistic User Model
(recap)
Affects
Demographics
Interests
Behaviors
Social Relations
Knowledge and Skills
Physical States
Cognitive Aspects
26. Holistic Profile Builder
12 demographic attributes are modeled
◦ Name, surname, profile pic, email, gender, location, height, weight, working position, industry, language
Many attributes are general and available in many sources
◦ Name, surname, profile pic, etc.
Other attributes are source-specific
◦ e.g., height and weight from FitBit, working position from LinkedIn
Demographics
27. Holistic Profile Builder
Source-specific attributes are immediately mapped into the Holistic User Profile
The values for general attributes are selected by defining priority rules
◦ e.g., the LinkedIn name is the most reliable one, followed by the Facebook name, etc.
◦ The highest-priority source available is used to map the attribute into the Holistic User Profile
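A priority rule of this kind can be sketched as an ordered list of sources that is scanned until one provides a value. Only the first two positions (LinkedIn, then Facebook) come from the slide; the rest of the ordering and the function name `resolve` are assumptions for illustration.

```python
from typing import Any, Dict, List, Optional

# LinkedIn first, then Facebook (from the slide); the remaining order is assumed.
SOURCE_PRIORITY: List[str] = ["linkedin", "facebook", "twitter", "instagram", "fitbit"]


def resolve(attribute: str, values_by_source: Dict[str, Dict[str, Any]]) -> Optional[Any]:
    """Return the attribute value from the highest-priority source that has it."""
    for source in SOURCE_PRIORITY:
        value = values_by_source.get(source, {}).get(attribute)
        if value:
            return value
    return None


# Made-up sample data.
collected = {
    "facebook": {"name": "Ali"},
    "linkedin": {"name": "Alice Rossi"},
}
best_name = resolve("name", collected)
fallback_name = resolve("name", {"facebook": {"name": "Ali"}})
```

Because the rule is just an ordered list, a different ordering could be defined per attribute if some source turns out to be more reliable for, say, location than for name.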
Demographics
28. Holistic Profile Builder
Interests can be both explicitly and implicitly modeled
◦ Explicitly: interests are obtained from the categories of the Pages liked by the user on Facebook (e.g., sports, politics, etc.) and from the apps she used;
◦ Implicitly: interests are inferred by mining relevant entities mentioned in posts (with a positive or neutral sentiment) written by the user on Twitter and Facebook, from the hashtags used on Instagram, etc.
In both cases, the Interests facet is populated with keywords and entities, either extracted from the posts or copied from the Pages
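The two ways of populating the Interests facet can be sketched as below. The sketch assumes that entity extraction and sentiment scoring have already been done upstream, so each post carries hypothetical `entities` and `sentiment` fields; these field names and the sample data are made up for illustration.

```python
from typing import Any, Dict, List, Set


def explicit_interests(liked_pages: List[Dict[str, str]]) -> Set[str]:
    """Explicit: take the categories of the Pages liked by the user."""
    return {page["category"] for page in liked_pages}


def implicit_interests(posts: List[Dict[str, Any]]) -> Set[str]:
    """Implicit: keep entities only from posts with positive or neutral sentiment."""
    interests: Set[str] = set()
    for post in posts:
        if post["sentiment"] >= 0:
            interests.update(post["entities"])
    return interests


# Made-up sample data; sentiment is assumed to be a score where 0 is neutral.
pages = [{"name": "Local football club", "category": "sports"}]
posts = [
    {"entities": ["jazz"], "sentiment": 0.6},
    {"entities": ["traffic"], "sentiment": -0.8},
]
interest_facet = explicit_interests(pages) | implicit_interests(posts)
```

Filtering on sentiment keeps entities the user complains about (here, "traffic") out of the facet.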
Interests
29. Holistic Profile Builder
User Mood and Emotions are encoded in the ‘Affects’ facet
Affects are implicitly obtained by mining textual content
◦ Mood: obtained as the average ‘sentiment’ expressed by all the posts written by the user on a given day
◦ Emotions: an emotion is extracted from each post by mapping its textual content to Ekman’s emotions (anger, fear, disgust, sadness, joy)
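The daily mood described above is a per-day average of post-level sentiment. A minimal sketch, assuming each post has already been assigned a numeric sentiment score by some upstream model (the score range and the function name `daily_mood` are assumptions):

```python
from collections import defaultdict
from datetime import date
from statistics import mean
from typing import Dict, List, Tuple


def daily_mood(scored_posts: List[Tuple[date, float]]) -> Dict[date, float]:
    """Group sentiment scores by day and average them."""
    by_day: Dict[date, List[float]] = defaultdict(list)
    for day, score in scored_posts:
        by_day[day].append(score)
    return {day: mean(scores) for day, scores in by_day.items()}


# Made-up scores: two posts on the first day, one on the second.
scored = [
    (date(2018, 7, 8), 1.0),
    (date(2018, 7, 8), 0.0),
    (date(2018, 7, 9), -1.0),
]
mood = daily_mood(scored)
```

Days without posts simply do not appear in the result, so a consumer of the facet has to decide how to treat missing days.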
Affects
30. Holistic Profile Builder
User personality traits and inclination to empathy are encoded in the ‘Cognitive Aspects’ facet
Cognitive Aspects are implicitly obtained by mining textual content
◦ Personality traits are obtained by automatically inferring Big-5 traits (extraversion, openness to experience, conscientiousness, neuroticism, agreeableness) from content written by the user
◦ Inclination to Empathy is obtained by using a model developed by Polignano et al. (*). It takes as input all the posts of the user and returns her inclination (high, medium, low)
Cognitive Aspects
(*) Polignano, Marco, et al. "Learning inclination to empathy from social media footprints." Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization (UMAP), 2017.
31. Holistic Profile Builder
User Activities and Visited Places are encoded in the ‘Behaviors’ facet
User Activities are obtained from FitBit and Android phones
◦ Both of them exploit device sensors to implicitly infer users’ activities (running, walking, etc.)
Visited Places are implicitly inferred from the geotags available in Instagram pictures
Behaviors
32. Holistic Profile Builder
Social Relations are currently inferred by merging all the contacts available in the sources connected to the platform
Issue: no methodology for merging identities is implemented (e.g., Facebook and Twitter accounts of the same person are not mapped to the same identity)
◦ Future work!
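The issue noted above can be made concrete with a small sketch of a plain merge. Contacts are collected as (source, name) pairs with no identity resolution, so one person with two accounts shows up as two or more entries. The function name and the sample contacts are hypothetical.

```python
from typing import Dict, List, Tuple


def merge_contacts(contacts_by_source: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Union of all contacts, kept as (source, name) pairs without identity merging."""
    merged = set()
    for source, names in contacts_by_source.items():
        for name in names:
            merged.add((source, name))
    return sorted(merged)


# Made-up contacts: the same person appears on both networks.
contacts = {
    "facebook": ["Marco Rossi"],
    "twitter": ["@mrossi", "Marco Rossi"],
}
social_relations = merge_contacts(contacts)
```

Here the same person ends up in the facet three times, which is exactly the gap an identity-merging step would have to close.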
Social Relations
33. Holistic Profile Builder
Knowledge and Skills are currently inferred by identifying concepts associated with the working position of the user
Physical States of the user are explicitly extracted from FitBit data (food, heart rate, sleep quality, etc.)
Knowledge and Skills
Physical States
34. Holistic User Modeling Workflow
35. Data Exposure
Holistic User Profiles are made available to developers and third-party services via a high-level REST API
◦ e.g., http://90.147.102.243/api/profile/cataldo
◦ Only the facets the user explicitly labeled as ‘public’ are exposed via the endpoint
◦ Developers have to request an API key to access the user profiles
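The rule that only facets labeled as ‘public’ are exposed can be sketched as a filter applied before a profile is returned. This is an assumption about how such a filter might look, not the platform's code: the function name `public_view`, the visibility labels other than ‘public’, and the sample data are hypothetical, and API-key checking is left out.

```python
from typing import Any, Dict


def public_view(profile: Dict[str, Any], visibility: Dict[str, str]) -> Dict[str, Any]:
    """Return only the facets whose visibility label is 'public'."""
    return {
        facet: data
        for facet, data in profile.items()
        if visibility.get(facet) == "public"
    }


# Made-up profile and privacy settings.
profile = {
    "demographics": {"name": "Alice"},
    "physical_states": {"heart_rate": 62},
}
visibility = {"demographics": "public", "physical_states": "private"}
exposed = public_view(profile, visibility)
```

Facets with no visibility label are treated as not public, so a newly added facet is hidden until the user explicitly opts in.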
36. Holistic User Modeling Workflow
37. Data Visualization - Login
38. Data Visualization – Linking Identities
39. Data Visualization – Privacy Settings
40. Data Visualization – Controlling Data Exposure
41. Data Visualization – Controlling Data Exposure
42. Data Visualization – User Interests
43. Data Visualization – Emotion Monitoring
44. Data Visualization – Sleep Monitoring
45. Conclusions
Holistic User Profiling
◦ Conceptual model based on eight different facets
◦ Built by merging the digital footprints gathered from several heterogeneous sources
Myrror
◦ Platform supporting the creation of holistic user profiles
◦ Users have full control over the data extracted and exposed
◦ Profiles are made available through a REST API to third-party services and developers
46. Future Work
Data Sources
◦ Introduction of more sources for modeling user profiles
Data Processing and Enrichment
◦ Room for improvement: many algorithms can be integrated to implicitly infer users’ data from raw data
Experimental Evaluation
◦ Definition of an experimental scenario to assess the effectiveness of holistic user profiles
47. Thank you!
cataldo.musto@uniba.it
@cataldomusto
Want to try Myrror?
Contacts
Demo paper tomorrow and Tuesday at the poster session!