SlideShare ist ein Scribd-Unternehmen logo
1 von 17
Downloaden Sie, um offline zu lesen
Brave New Task: User Account
               Matching

Pisa – October 5, 2012


                                          C. Hauff, G. Friedland
                                                   TU Delft & ICSI




                         MediaEval 2012                              1
Users on the Social Web




     “Cooperative” users: publicly
   provide their respective accounts.

Not just one account, but many accounts.
                       MediaEval 2012      2
User Account Matching                                   time




                                  Can we identify the
   Social Web Stream                  same user in
                                  another social Web
                                   stream universe?



                                  Social Web Stream



                 MediaEval 2012                         3
User Account Matching
                                         1 vs. k
• Different scenarios



            1 vs. 1                                additional
                                                    evidence




 Our task setup.




                        MediaEval 2012               4
Example Recovery Questions
Why?
                                                             What is your favorite team?

• Benevolent uses                                           What is your favorite movie?

  • Enriched user models                              What is your favorite TV program?
  • Improved personalization
                                                    What is your least favorite nickname?
    effectiveness
  • To make users happier                                   What is your favorite sport?

                                                          Who was your childhood hero?
• Malicious uses
  • Password recovery (self-service             What was the first concert you attended?
    password reset) on a large scale                What time of the day were you born?
  • Discover “offline” information
    based on enriched profiles (e.g.                What was your dream job as a child?

    phone numbers)                          What is the middle name of your oldest child?

                                                      Source: http://goodsecurityquestions.com

                           MediaEval 2012                                         5
Existing work with a strong text-based bias
• Most previous work based directly on the profile
  information




                       MediaEval 2012                6
Existing work profile-information based
• Zafarani et al., 2009
   • Similarity of user names on different
     platforms
   • Automatic matching ground truth:
     BlogCatalog (cooperative users)


• Abel et al., 2010
   • Investigated the amount of user profile
     aggregation possible with cross-
     community linking (cross-links
     retrieved from the Social Graph API)

                Source: Abel et al., Interweaving Public User Proles on the Web, UMAP 2010

                              MediaEval 2012                                   7
Existing work beyond profiles
• Narayanan et al., 2009
   • Rely on the graph structure of social networks to de-anonymize the
     graph (no use of profile or content information)

• Iofciu et al., 2011
   • Used tags (StumbleUpon, Flickr, Delicious) of images and
     bookmarks to identify matching accounts
   • Ground truth based on the Social Graph API
   • Content-based matching (compared to user name matching) is a
     much more difficult task
      • Starting point for our work




                                MediaEval 2012                     8
Our task (1 vs. 1)


Assuming a set of uncooperative users, i.e. users that cannot
     be linked according to their self-reported profile
   information, to what extent is it still possible to determine
                          matches?

 Given a Flickr account …


                                  determine the corresponding
                                  Twitter account from a large set
                                  of potential streams.

                         MediaEval 2012                      9
Profile meta-data
Data Set: The Basics                                       is removed.

• 50,000 semi-random users selected on Twitter and
  followed for three months (04/2012-06/2012)
  • ~18,000 tweeted at least once in that time period

• Manually checked potential matching Flickr accounts
  • Potential matches: (i) tweets containing flickr.com,         (ii)
    existing Flickr account with the same user name




                           MediaEval 2012                         10
Data Set: User Distribution
                                   200 photos
                                   (Flickr limit)


    more tweets
    than images




                                                    more images
                                                    than tweets




                  MediaEval 2012                                  11
Data Set: The Temporal Dimension
                                   119 account pairs with
                                   overlapping time stamps




        No information available

                 MediaEval 2012                      12
Baseline
• Treat all tweets of a user as document
   • Corpus of Twitter user documents

• Treat all textual information from a user’s Flickr stream as
  a (very long) query

• Rank the documents with respect to the query
  (i.e. rank the Twitter accounts)


This is a standard ad hoc retrieval problem: we used Okapi.



                          MediaEval 2012                  13
Baseline Results                      RR=
                                                    1
                                          rank(matchingAccount)




  Account matching based on content is hard.


  The larger the number of Flickr images, the
              better the matching.


                     MediaEval 2012                    14
Baseline Results: Taking a Closer Look
• Distribution of the 233 RR values
180
160
140
120                                       Task is either very easy or
                                         very difficult. Less than 20%
100
 80
 60                                      of „queries‟ with non-0/1 RR.
 40
 20
  0
        0          1         Other




• Influence of time overlap in MRR                 Time overlap in
  119 account pairs with overlap      0.2134     streams makes the
  114 account pairs without overlap   0.1253         task easier!

                             MediaEval 2012                     15
Challenges
① Social networks have (strong) data gathering
  restrictions in place – requires long term setup
   • Twitter: complete user history not available
   • Flickr: max. 200 photos for users without “pro” accounts

② Users use different social networks at different time
  periods – makes the matching even more difficult

③ Automatic ground truth generation is error-prone:
  self-reported links can be arbitrary, link to friends, etc.
   •   Crowd-sourcing may be an option

④ Many encountered matches are not from private
  individuals, but belong to organizations or businesses
                            MediaEval 2012                      16
Thank you!

All suggestions are welcome!




          MediaEval 2012       17

Weitere ähnliche Inhalte

Andere mochten auch

Activities for journalistic skills
Activities for journalistic skillsActivities for journalistic skills
Activities for journalistic skillsJNavarro0321
 
14 10 21_презентация сту
14 10 21_презентация сту14 10 21_презентация сту
14 10 21_презентация стуStanislav Litvinenko
 
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskNII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskMediaEval2012
 
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...MediaEval2012
 
Designinteração– veda 3
Designinteração– veda 3Designinteração– veda 3
Designinteração– veda 3souzadea1
 
2010 Marketing Plan
2010 Marketing Plan2010 Marketing Plan
2010 Marketing PlanJPemberton15
 
Intro totransportphenomenanew
Intro totransportphenomenanewIntro totransportphenomenanew
Intro totransportphenomenanewilovepurin
 
Como hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharonComo hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharonSharon Jimenez
 
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search TaskThe TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search TaskMediaEval2012
 
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVMTUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVMMediaEval2012
 
6dicas– veda 4
6dicas– veda 46dicas– veda 4
6dicas– veda 4souzadea1
 
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...MediaEval2012
 
GTTS System for the Spoken Web Search Task at MediaEval 2012
GTTS System for the Spoken Web Search Task at MediaEval 2012GTTS System for the Spoken Web Search Task at MediaEval 2012
GTTS System for the Spoken Web Search Task at MediaEval 2012MediaEval2012
 
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...MediaEval2012
 
Core companies for eee
Core companies for eeeCore companies for eee
Core companies for eeenarenans
 
Mr. & Mrs. S Before & After
Mr. & Mrs. S Before & AfterMr. & Mrs. S Before & After
Mr. & Mrs. S Before & AfterMichael Kret
 
National publishing company-Titli case
National publishing company-Titli caseNational publishing company-Titli case
National publishing company-Titli caseVishwa Bhaskar
 

Andere mochten auch (19)

Activities for journalistic skills
Activities for journalistic skillsActivities for journalistic skills
Activities for journalistic skills
 
14 10 21_презентация сту
14 10 21_презентация сту14 10 21_презентация сту
14 10 21_презентация сту
 
Papiloma humano
Papiloma humanoPapiloma humano
Papiloma humano
 
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect TaskNII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
NII, Japan at MediaEval 2012 Violent Scenes Detection Affect Task
 
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
TUD at MediaEval 2012 genre tagging task: Multi-modality video categorization...
 
κειμενο
κειμενοκειμενο
κειμενο
 
Designinteração– veda 3
Designinteração– veda 3Designinteração– veda 3
Designinteração– veda 3
 
2010 Marketing Plan
2010 Marketing Plan2010 Marketing Plan
2010 Marketing Plan
 
Intro totransportphenomenanew
Intro totransportphenomenanewIntro totransportphenomenanew
Intro totransportphenomenanew
 
Como hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharonComo hacer una pagina web en wix sharon
Como hacer una pagina web en wix sharon
 
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search TaskThe TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
The TUM Cumulative DTW Approach for the Mediaeval 2012 Spoken Web Search Task
 
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVMTUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
TUKE MediaEval 2012: Spoken Web Search using DTW and Unsupervised SVM
 
6dicas– veda 4
6dicas– veda 46dicas– veda 4
6dicas– veda 4
 
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
ARF @ MediaEval 2012: An Uninformed Approach to Violence Detection in Hollywo...
 
GTTS System for the Spoken Web Search Task at MediaEval 2012
GTTS System for the Spoken Web Search Task at MediaEval 2012GTTS System for the Spoken Web Search Task at MediaEval 2012
GTTS System for the Spoken Web Search Task at MediaEval 2012
 
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
TUB @ MediaEval 2012 Tagging Task: Feature Selection Methods for Bag-of-(visu...
 
Core companies for eee
Core companies for eeeCore companies for eee
Core companies for eee
 
Mr. & Mrs. S Before & After
Mr. & Mrs. S Before & AfterMr. & Mrs. S Before & After
Mr. & Mrs. S Before & After
 
National publishing company-Titli case
National publishing company-Titli caseNational publishing company-Titli case
National publishing company-Titli case
 

Ähnlich wie Brave New Task: User Account Matching

Lecture 5: Mining, Analysis and Visualisation
Lecture 5: Mining, Analysis and VisualisationLecture 5: Mining, Analysis and Visualisation
Lecture 5: Mining, Analysis and VisualisationMarieke van Erp
 
VU University Amsterdam - The Social Web 2016 - Lecture 2
VU University Amsterdam - The Social Web 2016 - Lecture 2VU University Amsterdam - The Social Web 2016 - Lecture 2
VU University Amsterdam - The Social Web 2016 - Lecture 2Davide Ceolin
 
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social NetworksBang Hui Lim
 
Learning registry for Connected Educators
Learning registry for Connected EducatorsLearning registry for Connected Educators
Learning registry for Connected EducatorsMarie Bienkowski
 
ASA conference Feb 2013
ASA conference Feb 2013ASA conference Feb 2013
ASA conference Feb 2013mrkwr
 
ePortfolios in 2012 (according to Don) - CAPLA version
ePortfolios in 2012 (according to Don) - CAPLA versionePortfolios in 2012 (according to Don) - CAPLA version
ePortfolios in 2012 (according to Don) - CAPLA versionDon Presant
 
Lecture 5: Personalization on the Social Web (2014)
Lecture 5: Personalization on the Social Web (2014)Lecture 5: Personalization on the Social Web (2014)
Lecture 5: Personalization on the Social Web (2014)Lora Aroyo
 
Global lodlam_communities and open cultural data
Global lodlam_communities and open cultural dataGlobal lodlam_communities and open cultural data
Global lodlam_communities and open cultural dataMinerva Lin
 
Breaking Down Walls in Enterprise with Social Semantics
Breaking Down Walls in Enterprise with Social SemanticsBreaking Down Walls in Enterprise with Social Semantics
Breaking Down Walls in Enterprise with Social SemanticsJohn Breslin
 
TruSIS: Trust Accross Social Network
TruSIS: Trust Accross Social NetworkTruSIS: Trust Accross Social Network
TruSIS: Trust Accross Social NetworkLora Aroyo
 
VU University Amsterdam - The Social Web 2016 - Lecture 5
VU University Amsterdam - The Social Web 2016 - Lecture 5VU University Amsterdam - The Social Web 2016 - Lecture 5
VU University Amsterdam - The Social Web 2016 - Lecture 5Davide Ceolin
 
Linked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the SoftwareLinked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the SoftwareIMC Technologies
 
POWRR Tools: Lessons learned from an IMLS National Leadership Grant
POWRR Tools: Lessons learned from an IMLS National Leadership GrantPOWRR Tools: Lessons learned from an IMLS National Leadership Grant
POWRR Tools: Lessons learned from an IMLS National Leadership GrantLynne Thomas
 
Appleseed Social Networking
Appleseed Social NetworkingAppleseed Social Networking
Appleseed Social NetworkingFSCONS
 
Choosing the right crowd. Expert finding in social networks. edbt 2013
Choosing the right crowd. Expert finding in social networks. edbt 2013Choosing the right crowd. Expert finding in social networks. edbt 2013
Choosing the right crowd. Expert finding in social networks. edbt 2013Marco Brambilla
 
Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...
Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...
Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...Lora Aroyo
 
Enterprise 2.0 with Open Source Frameworks like Agorava
Enterprise 2.0 with Open Source Frameworks like AgoravaEnterprise 2.0 with Open Source Frameworks like Agorava
Enterprise 2.0 with Open Source Frameworks like AgoravaWerner Keil
 
Conversations in Context: A Twitter Case for Social Media Systems Design
Conversations in Context: A Twitter Case for Social Media Systems DesignConversations in Context: A Twitter Case for Social Media Systems Design
Conversations in Context: A Twitter Case for Social Media Systems DesignCommunitySense
 

Ähnlich wie Brave New Task: User Account Matching (20)

Lecture 5: Mining, Analysis and Visualisation
Lecture 5: Mining, Analysis and VisualisationLecture 5: Mining, Analysis and Visualisation
Lecture 5: Mining, Analysis and Visualisation
 
Lecture4 Social Web
Lecture4 Social Web Lecture4 Social Web
Lecture4 Social Web
 
VU University Amsterdam - The Social Web 2016 - Lecture 2
VU University Amsterdam - The Social Web 2016 - Lecture 2VU University Amsterdam - The Social Web 2016 - Lecture 2
VU University Amsterdam - The Social Web 2016 - Lecture 2
 
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
#mytweet via Instagram: Exploring User Behaviour Across Multiple Social Networks
 
Learning registry for Connected Educators
Learning registry for Connected EducatorsLearning registry for Connected Educators
Learning registry for Connected Educators
 
Social Media Principles for Enterprise Knowledge Management by Augustine Fou
Social Media Principles for Enterprise Knowledge Management by Augustine FouSocial Media Principles for Enterprise Knowledge Management by Augustine Fou
Social Media Principles for Enterprise Knowledge Management by Augustine Fou
 
ASA conference Feb 2013
ASA conference Feb 2013ASA conference Feb 2013
ASA conference Feb 2013
 
ePortfolios in 2012 (according to Don) - CAPLA version
ePortfolios in 2012 (according to Don) - CAPLA versionePortfolios in 2012 (according to Don) - CAPLA version
ePortfolios in 2012 (according to Don) - CAPLA version
 
Lecture 5: Personalization on the Social Web (2014)
Lecture 5: Personalization on the Social Web (2014)Lecture 5: Personalization on the Social Web (2014)
Lecture 5: Personalization on the Social Web (2014)
 
Global lodlam_communities and open cultural data
Global lodlam_communities and open cultural dataGlobal lodlam_communities and open cultural data
Global lodlam_communities and open cultural data
 
Breaking Down Walls in Enterprise with Social Semantics
Breaking Down Walls in Enterprise with Social SemanticsBreaking Down Walls in Enterprise with Social Semantics
Breaking Down Walls in Enterprise with Social Semantics
 
TruSIS: Trust Accross Social Network
TruSIS: Trust Accross Social NetworkTruSIS: Trust Accross Social Network
TruSIS: Trust Accross Social Network
 
VU University Amsterdam - The Social Web 2016 - Lecture 5
VU University Amsterdam - The Social Web 2016 - Lecture 5VU University Amsterdam - The Social Web 2016 - Lecture 5
VU University Amsterdam - The Social Web 2016 - Lecture 5
 
Linked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the SoftwareLinked Data for the Masses: The approach and the Software
Linked Data for the Masses: The approach and the Software
 
POWRR Tools: Lessons learned from an IMLS National Leadership Grant
POWRR Tools: Lessons learned from an IMLS National Leadership GrantPOWRR Tools: Lessons learned from an IMLS National Leadership Grant
POWRR Tools: Lessons learned from an IMLS National Leadership Grant
 
Appleseed Social Networking
Appleseed Social NetworkingAppleseed Social Networking
Appleseed Social Networking
 
Choosing the right crowd. Expert finding in social networks. edbt 2013
Choosing the right crowd. Expert finding in social networks. edbt 2013Choosing the right crowd. Expert finding in social networks. edbt 2013
Choosing the right crowd. Expert finding in social networks. edbt 2013
 
Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...
Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...
Lecture 4: How do we MINE, ANALYSE & VISUALISE the Social Web? (VU Amsterdam ...
 
Enterprise 2.0 with Open Source Frameworks like Agorava
Enterprise 2.0 with Open Source Frameworks like AgoravaEnterprise 2.0 with Open Source Frameworks like Agorava
Enterprise 2.0 with Open Source Frameworks like Agorava
 
Conversations in Context: A Twitter Case for Social Media Systems Design
Conversations in Context: A Twitter Case for Social Media Systems DesignConversations in Context: A Twitter Case for Social Media Systems Design
Conversations in Context: A Twitter Case for Social Media Systems Design
 

Mehr von MediaEval2012

MediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval2012
 
A Multimodal Approach for Video Geocoding
A Multimodal Approach for   Video Geocoding A Multimodal Approach for   Video Geocoding
A Multimodal Approach for Video Geocoding MediaEval2012
 
Brave New Task: Musiclef Multimodal Music Tagging
Brave New Task: Musiclef Multimodal Music TaggingBrave New Task: Musiclef Multimodal Music Tagging
Brave New Task: Musiclef Multimodal Music TaggingMediaEval2012
 
Search and Hyperlinking Task at MediaEval 2012
Search and Hyperlinking Task at MediaEval 2012Search and Hyperlinking Task at MediaEval 2012
Search and Hyperlinking Task at MediaEval 2012MediaEval2012
 
CUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking TaskCUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking TaskMediaEval2012
 
DCU Search Runs at MediaEval 2012: Search and Hyperlinking Task
DCU Search Runs at MediaEval 2012: Search and Hyperlinking TaskDCU Search Runs at MediaEval 2012: Search and Hyperlinking Task
DCU Search Runs at MediaEval 2012: Search and Hyperlinking TaskMediaEval2012
 
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...MediaEval2012
 
The CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and OnwardsThe CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and OnwardsMediaEval2012
 
Overview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskOverview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskMediaEval2012
 
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...MediaEval2012
 
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...MediaEval2012
 
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...MediaEval2012
 
The MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes DetectioThe MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes DetectioMediaEval2012
 
LIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodLIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodMediaEval2012
 
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...MediaEval2012
 
UNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging TaskUNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging TaskMediaEval2012
 
ARF @ MediaEval 2012: Multimodal Video Classification
ARF @ MediaEval 2012: Multimodal Video ClassificationARF @ MediaEval 2012: Multimodal Video Classification
ARF @ MediaEval 2012: Multimodal Video ClassificationMediaEval2012
 
Overview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskOverview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskMediaEval2012
 

Mehr von MediaEval2012 (20)

MediaEval 2012 Opening
MediaEval 2012 OpeningMediaEval 2012 Opening
MediaEval 2012 Opening
 
Closing
ClosingClosing
Closing
 
A Multimodal Approach for Video Geocoding
A Multimodal Approach for   Video Geocoding A Multimodal Approach for   Video Geocoding
A Multimodal Approach for Video Geocoding
 
Brave New Task: Musiclef Multimodal Music Tagging
Brave New Task: Musiclef Multimodal Music TaggingBrave New Task: Musiclef Multimodal Music Tagging
Brave New Task: Musiclef Multimodal Music Tagging
 
Search and Hyperlinking Task at MediaEval 2012
Search and Hyperlinking Task at MediaEval 2012Search and Hyperlinking Task at MediaEval 2012
Search and Hyperlinking Task at MediaEval 2012
 
CUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking TaskCUNI at MediaEval 2012: Search and Hyperlinking Task
CUNI at MediaEval 2012: Search and Hyperlinking Task
 
DCU Search Runs at MediaEval 2012: Search and Hyperlinking Task
DCU Search Runs at MediaEval 2012: Search and Hyperlinking TaskDCU Search Runs at MediaEval 2012: Search and Hyperlinking Task
DCU Search Runs at MediaEval 2012: Search and Hyperlinking Task
 
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
Ghent University-IBBT at MediaEval 2012 Search and Hyperlinking: Semantic Sim...
 
The CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and OnwardsThe CLEF Initiative From 2010 to 2012 and Onwards
The CLEF Initiative From 2010 to 2012 and Onwards
 
Overview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy TaskOverview of MediaEval 2012 Visual Privacy Task
Overview of MediaEval 2012 Visual Privacy Task
 
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
MediaEval 2012 Visual Privacy Task: Privacy and Intelligibility through Pixel...
 
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
Violent Scenes Detection with Large, Brute-forced Acoustic and Visual Feature...
 
mevd2012 esra_
 mevd2012 esra_ mevd2012 esra_
mevd2012 esra_
 
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
Technicolor/INRIA/Imperial College London at the MediaEval 2012 Violent Scene...
 
The MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes DetectioThe MediaEval 2012 Affect Task: Violent Scenes Detectio
The MediaEval 2012 Affect Task: Violent Scenes Detectio
 
LIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic methodLIG at MediaEval 2012 affect task: use of a generic method
LIG at MediaEval 2012 affect task: use of a generic method
 
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
Violence Detection in Video by Large Scale Multi-Scale Local Binary Pattern D...
 
UNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging TaskUNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
UNICAMP-UFMG at MediaEval 2012: Genre Tagging Task
 
ARF @ MediaEval 2012: Multimodal Video Classification
ARF @ MediaEval 2012: Multimodal Video ClassificationARF @ MediaEval 2012: Multimodal Video Classification
ARF @ MediaEval 2012: Multimodal Video Classification
 
Overview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging TaskOverview of the MediaEval 2012 Tagging Task
Overview of the MediaEval 2012 Tagging Task
 

Brave New Task: User Account Matching

  • 1. Brave New Task: User Account Matching Pisa – October 5, 2012 C. Hauff, G. Friedland TU Delft & ICSI MediaEval 2012 1
  • 2. Users on the Social Web “Cooperative” users: publicly provide their respective accounts. Not just one account, but many accounts. MediaEval 2012 2
  • 3. User Account Matching time Can we identify the Social Web Stream same user in another social Web stream universe? Social Web Stream MediaEval 2012 3
  • 4. User Account Matching 1 vs. k • Different scenarios 1 vs. 1 additional evidence Our task setup. MediaEval 2012 4
  • 5. Example Recovery Questions Why?  What is your favorite team? • Benevolent uses  What is your favorite movie? • Enriched user models  What is your favorite TV program? • Improved personalization What is your least favorite nickname? effectiveness • To make users happier   What is your favorite sport? Who was your childhood hero? • Malicious uses • Password recovery (self-service What was the first concert you attended? password reset) on a large scale What time of the day were you born? • Discover “offline” information based on enriched profiles (e.g. What was your dream job as a child? phone numbers) What is the middle name of your oldest child? Source: http://goodsecurityquestions.com MediaEval 2012 5
  • 6. Existing work with a strong text-based bias • Most previous work based directly on the profile information MediaEval 2012 6
  • 7. Existing work profile-information based • Zafarani et al., 2009 • Similarity of user names on different platforms • Automatic matching ground truth: BlogCatalog (cooperative users) • Abel et al., 2010 • Investigated the amount of user profile aggregation possible with cross- community linking (cross-links retrieved from the Social Graph API) Source: Abel et al., Interweaving Public User Proles on the Web, UMAP 2010 MediaEval 2012 7
  • 8. Existing work beyond profiles • Narayanan et al., 2009 • Rely on the graph structure of social networks to de-anonymize the graph (no use of profile or content information) • Iofciu et al., 2011 • Used tags (StumbleUpon, Flickr, Delicious) of images and bookmarks to identify matching accounts • Ground truth based on the Social Graph API • Content-based matching (compared to user name matching) is a much more difficult task • Starting point for our work MediaEval 2012 8
  • 9. Our task (1 vs. 1) Assuming a set of uncooperative users, i.e. users that cannot be linked according to their self-reported profile information, to what extent is it still possible to determine matches? Given a Flickr account … determine the corresponding Twitter account from a large set of potential streams. MediaEval 2012 9
  • 10. Profile meta-data Data Set: The Basics is removed. • 50,000 semi-random users selected on Twitter and followed for three months (04/2012-06/2012) • ~18,000 tweeted at least once in that time period • Manually checked potential matching Flickr accounts • Potential matches: (i) tweets containing flickr.com, (ii) existing Flickr account with the same user name MediaEval 2012 10
  • 11. Data Set: User Distribution 200 photos (Flickr limit) more tweets than images more images than tweets MediaEval 2012 11
  • 12. Data Set: The Temporal Dimension 119 account pairs with overlapping time stamps No information available MediaEval 2012 12
  • 13. Baseline • Treat all tweets of a user as document • Corpus of Twitter user documents • Treat all textual information from a user’s Flickr stream as a (very long) query • Rank the documents with respect to the query (i.e. rank the Twitter accounts) This is a standard ad hoc retrieval problem: we used Okapi. MediaEval 2012 13
  • 14. Baseline Results RR= 1 rank(matchingAccount) Account matching based on content is hard. The larger the number of Flickr images, the better the matching. MediaEval 2012 14
  • 15. Baseline Results: Taking a Closer Look • Distribution of the 233 RR values 180 160 140 120 Task is either very easy or very difficult. Less than 20% 100 80 60 of „queries‟ with non-0/1 RR. 40 20 0 0 1 Other • Influence of time overlap in MRR Time overlap in 119 account pairs with overlap 0.2134 streams makes the 114 account pairs without overlap 0.1253 task easier! MediaEval 2012 15
  • 16. Challenges ① Social networks have (strong) data gathering restrictions in place – requires long term setup • Twitter: complete user history not available • Flickr: max. 200 photos for users without “pro” accounts ② Users use different social networks at different time periods – makes the matching even more difficult ③ Automatic ground truth generation is error-prone: self-reported links can be arbitrary, link to friends, etc. • Crowd-sourcing may be an option ④ Many encountered matches are not from private individuals, but belong to organizations or businesses MediaEval 2012 16
  • 17. Thank you! All suggestions are welcome! MediaEval 2012 17