1. Finding Co-solvers on Twitter, with a
Little Help from Linked Data
Milan Stankovic, Hypios, Université Paris-Sorbonne, France*
Matthew Rowe, KMi, Open University, UK
Philippe Laublet, Université Paris-Sorbonne, France
2. Outline
• Context
• Problem
• Our Approach
• Evaluation
• Example of use
• Conclusion and questions
3. Context: Innovation on the Web
[Diagram: Innovation Seekers post problems on the Web; Solvers come from industry, academia, research, etc.]
5. Problem: Find Collaborators
• How to find collaborators that complement the solver's competence with regards to the problem
• How to find collaborators that are compatible with him in terms of teamwork
[Diagram: an Innovation Seeker's problem, a solver, and unknown potential collaborators]
6. Problem: Find Collaborators
• Complementary Competence
• Problem Interest Similarity
• Social Similarity
Inspired by social studies on team composition and the factors that influence good teamwork.
7. Our Approach
profiling >> profile extension >> calculation of similarities >> ranking
Implementation and tests performed using data from Twitter
10. Our Approach: Profiling
• Conceptual Profiles
– users: Zemanta is used to extract DBpedia concepts from textual elements that the user created on Twitter (tweets, bio, etc.). Profiles contain concepts and the frequency of their occurrence.
– problem: the text of the innovation problem is treated with Zemanta to extract concepts.
• Social Profiles
– contain all the contacts of a given user on Twitter
• Both types of profiles are in vector form.
• Deliberately simple: meant to capture most topics, not to specialize on the topics of highest expertise.
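The concept-frequency profile described above can be sketched as follows. This is a minimal illustration, assuming the annotator (e.g. Zemanta) has already returned a list of DBpedia concept names per tweet; the function name and data shapes are illustrative, not the authors' implementation.

```python
from collections import Counter

def conceptual_profile(annotated_texts):
    """Build a concept-frequency vector from per-text concept lists.

    annotated_texts: one list of extracted DBpedia concept names per
    textual element (tweet, bio, etc.). The resulting profile maps each
    concept to the frequency of its occurrence, as on the slide.
    """
    profile = Counter()
    for concepts in annotated_texts:
        profile.update(concepts)
    return dict(profile)

tweets = [["Semantic_Web", "RDF"], ["RDF", "Linked_Data"]]
print(conceptual_profile(tweets))
# {'Semantic_Web': 1, 'RDF': 2, 'Linked_Data': 1}
```

The same structure serves for both user and problem profiles; a social profile is the analogous vector over contacts instead of concepts.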
12. Our Approach: Profile Extension
• Why extend profiles:
– imperfection of the source data (tweets)
– incompleteness of coverage (due to differences in vocabulary, some concepts may go unnoticed)
– to perform broader/lateral matching
13. Our Approach: Profile Extension
• How
– HPSR (hyProximity): a graph-based measure using Linked Data (tested on DBpedia)
– DMSR: a distributional measure inspired by Normalized Google Distance
– PRF: Pseudo Relevance Feedback
14. Our Approach: Profile Extension
• HPSR (hyProximity)

HPSR(c_1, c_2) = \sum_{K_i \in K(c_1, c_2)} ic(K_i) + \sum_{p \in P} link(p, c_1, c_2) \cdot pond(p, c_1)

[Diagram: concepts linked to shared DBpedia categories via dct:subject, with categories related through skos:broader]
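The first term of HPSR (information content summed over the categories shared by the two concepts) can be illustrated on a toy DBpedia fragment. Everything here is an illustrative assumption: the `categories` and `category_sizes` data, the `ic` definition, and the flat `links`/`pond` stand-in for the property-link term, which would need the full Linked Data graph.

```python
import math

# Toy fragment of the DBpedia category graph (dct:subject edges).
categories = {
    "RDF": {"Semantic_Web", "Metadata"},
    "OWL": {"Semantic_Web", "Knowledge_representation"},
}
# Assumed number of concepts per category, out of an assumed total.
category_sizes = {"Semantic_Web": 50, "Metadata": 200,
                  "Knowledge_representation": 80}
TOTAL = 10000

def ic(category):
    # Information content: rarer categories are more informative.
    return -math.log(category_sizes[category] / TOTAL)

def hpsr(c1, c2, links=0, pond=1.0):
    """Sketch of HPSR: sum ic over the shared categories K(c1, c2),
    plus a placeholder for the weighted direct-link term."""
    shared = categories[c1] & categories[c2]
    return sum(ic(k) for k in shared) + links * pond

print(round(hpsr("RDF", "OWL"), 3))
```

Here RDF and OWL share only the (assumed) category Semantic_Web, so the score is its information content alone.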
15. Our Approach: Profile Extension
• DMSR – Distributional Measure of Semantic Relatedness

DMSR_\tau(c_1, c_2) = \frac{occurrence(c_1, c_2)}{occurrence(c_1) + occurrence(c_2)}

[Example: c1 co-occurs with c2 in more contexts than with c3, so c1 and c2 are more related than c1 and c3]
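The DMSR formula above is a straightforward ratio of co-occurrence to individual occurrence counts. A minimal sketch, assuming the counts have already been gathered (the dict-based data shapes are illustrative):

```python
def dmsr(cooc, occ, c1, c2):
    """Distributional relatedness, as on the slide:
    co-occurrences of c1 and c2 divided by the sum of their
    individual occurrence counts."""
    together = cooc.get(frozenset((c1, c2)), 0)
    return together / (occ[c1] + occ[c2])

# Toy counts: c1 appears with c2 far more often than with c3.
occ = {"c1": 10, "c2": 8, "c3": 12}
cooc = {frozenset(("c1", "c2")): 6, frozenset(("c1", "c3")): 1}

print(dmsr(cooc, occ, "c1", "c2"))  # c1 is more related to c2...
print(dmsr(cooc, occ, "c1", "c3"))  # ...than to c3
```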
16. Our Approach: Profile Extension
• PRF: Pseudo Relevance Feedback
– a distributional measure based on the profiles appearing in the n best-ranked solutions
– the same co-occurrence measure as DMSR, applied to the set of the first 10 suggestions
– this method can be applied with any ranking technique
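The pseudo-relevance-feedback step can be sketched as follows: treat the top-n candidate profiles as if they were relevant and pull their most frequent concepts back into the problem profile. The function and parameter names (`n`, `top_concepts`) are illustrative assumptions, not the authors' exact procedure.

```python
from collections import Counter

def prf_expand(problem_profile, ranked_profiles, n=10, top_concepts=5):
    """Pseudo relevance feedback over an initial ranking.

    problem_profile: concept -> frequency dict for the problem.
    ranked_profiles: candidate profiles, best-ranked first.
    Pools the concepts of the n best-ranked profiles and adds the most
    frequent ones (with their pooled counts) to the problem profile.
    """
    pooled = Counter()
    for profile in ranked_profiles[:n]:
        pooled.update(profile)
    expanded = dict(problem_profile)
    for concept, freq in pooled.most_common(top_concepts):
        expanded.setdefault(concept, freq)  # keep original frequencies
    return expanded

expanded = prf_expand({"Semantic_Web": 3},
                      [{"RDF": 2}, {"RDF": 1, "OWL": 1}],
                      n=2, top_concepts=2)
print(expanded)
```

Because the expansion only depends on an existing ranked list, it works with any of the ranking functions discussed later, as the slide notes.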
18. Our Approach: Similarities
• Complementarity (similarity with difference topics)
• Conceptual Similarity (similarity of conceptual profiles)
• Social Similarity (similarity of social profiles)
20. Our Approach: Ranking
• By one similarity measure
– complementarity
– conceptual similarity
– social similarity
• By a linear combination of measures
a*Comp+b*ConcSim+c*SocSim
• By a product of measures
Comp*ConcSim*SocSim
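The two composite ranking functions from the slide can be sketched directly; the candidate data and weights are illustrative. The product form rewards candidates who score on all three measures at once, while the linear form lets one strong measure compensate for the others:

```python
def linear_score(comp, conc, soc, a=1.0, b=1.0, c=1.0):
    """Linear combination a*Comp + b*ConcSim + c*SocSim."""
    return a * comp + b * conc + c * soc

def product_score(comp, conc, soc):
    """Product Comp*ConcSim*SocSim: any near-zero measure sinks the score."""
    return comp * conc * soc

# Toy candidates: (complementarity, conceptual sim., social sim.)
candidates = {"alice": (0.9, 0.2, 0.5), "bob": (0.6, 0.6, 0.6)}
ranking = sorted(candidates,
                 key=lambda u: product_score(*candidates[u]),
                 reverse=True)
print(ranking)  # bob's balanced scores win under the product
```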
21. Evaluation
• Evaluation 1
– recommending a collaborator to a group of solvers
– a group of 3 solvers (experts in the Semantic Web) is trying to solve 3 cross-disciplinary problems
– problems inspired by real challenges (workshops, calls for papers, etc.)
• Evaluation 2
– recommending collaborators to individual solvers
– 12 Twitter users, experts in the Semantic Web, look for collaborators for the same 3 problems
22. Evaluation: Metrics
• Discounted Cumulative Gain
– what is the value of considering the first 10 suggestions, and what is the quality of their ordering

DCG = rating_1 + \sum_{i=2}^{10} \frac{rating_i}{\log_2 i}
• Average Precision
– what is the cumulative benefit of considering each
next suggestion in a particular ranking
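The DCG formula above translates directly into code. A minimal sketch, taking graded ratings of the suggestions in ranked order (the sample ratings are made up for illustration):

```python
import math

def dcg(ratings):
    """Discounted Cumulative Gain over the first 10 suggestions:
    rating_1 + sum_{i=2..10} rating_i / log2(i)."""
    ratings = ratings[:10]
    if not ratings:
        return 0.0
    return ratings[0] + sum(r / math.log2(i)
                            for i, r in enumerate(ratings[1:], start=2))

# Higher-rated suggestions early in the list contribute more.
print(dcg([3, 2, 3, 0, 1]))
print(dcg([0, 1, 3, 2, 3]))  # same ratings, worse ordering
```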
25. Evaluation 2
• Composite Ranking Functions: Product
– Comp*ConcSim*SocSim
– PRF(Comp*ConcSim*SocSim): PRF problem profile expansion with
composite similarity.
– HPSR(Comp)*ConcSim*SocSim: HPSR expansion performed on difference
topics prior to calculating the complementarity (similarity with difference
topics)
– Comp*DMSR(ConcSim)*SocSim: DMSR expansion performed over the
seed user profile prior to calculating interest similarity.
– HPSR(Comp)*DMSR(ConcSim)*SocSim: composite function in which HPSR
is used to expand profile topics and DMSR to expand the seed user topic
profile prior to calculating the similarities.
28. Conclusions
• The Linked Data based concept expansion technique
(hyProximity) gives best results when expanding topics for
Compatibility measures. A distributional one works slightly
better for Conceptual Similarity measures.
• In a composite ranking function, expanding profiles with
hyProximity is beneficial if applied only to Compatibility.
Expansion in both Compatibility and Conceptual Similarity has
negative effects.
• All profile expansion techniques, applied individually, have
positive effects in comparison to direct similarity calculation
with no expansion.
29. Take Away
• For expanding the Compatibility (problem) side: hyProximity, a Linked Data-based measure
• For expanding the Conceptual Similarity side: DMSR, a distributional measure
30. Example
Problem: Semantic Web representation of start-up history for start-up performance indicators
User: Milan Stankovic (@milstan)
Suggestions:
– davidsrose (angel investor specialized in technology startups)
– fundingpost
– ECVentureCapita
– BVCA (investors and entrepreneurs, information technology)
– vc20
– AndySack
– CVCACanada
– Austin_Startups
– tgmtgm
– davidblerner (entrepreneur, social networks (KLOUT), metrics)