2. ABSTRACT
A magnet community is a community that
attracts significantly more of people's interest and
attention than other communities on similar
topics.
We study the magnet community identification
problem.
We observe several properties of magnet
communities.
We formalize these properties by combining
community feature extraction with
a graph ranking formulation.
4. INTRODUCTION
Importance:
Helps people understand the trends in their
domains.
Helps people make decisions when joining
communities.
Our goal:
Given the communities in a domain, we want to rank
them by their attractiveness to people relative to
the other communities of that domain.
In the end, the top-ranked communities are the ones
people tend to adhere to.
6. INTRODUCTION
Challenges:
How to extract features from the
heterogeneous sources of factors that affect
a community's attractiveness.
How to combine all heterogeneous
information into a unified ranking model.
How to handle noise.
Common properties:
Attention flow
Attention quality
Persistence of people's attention
7. CONTRIBUTIONS
A new direction in social network analysis,
namely
magnet community identification.
A definition of magnet communities based on
their identified properties.
We demonstrate the effectiveness of our
framework on one particular domain of magnet
community identification, namely identifying
companies that are magnet communities for employees.
11. MAGNET COMMUNITY IDENTIFICATION FRAMEWORK
Attractiveness features
Standalone features
Attention migrating matrix as dependency
features
Dependency features of communities:
An attention migrating matrix: $D = (d_{ij})_{k \times k}$
The attention vector: $A = (a_i)_{k \times 1} = D \cdot e$
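As a rough illustration of the attention vector A = D · e, the sketch below computes it for a small matrix. The slides do not define e or the orientation of d_ij; the code assumes e is the all-ones vector of length k, so each a_i is simply the sum of row i of D. The function name and the example counts are hypothetical.

```python
def attention_vector(D):
    """Return A = D . e for a square k x k matrix D.

    Assumption (not stated in the slides): e is the all-ones vector,
    so a_i is the sum of row i of D.
    """
    k = len(D)
    if any(len(row) != k for row in D):
        raise ValueError("D must be a square k x k matrix")
    e = [1.0] * k
    return [sum(d_ij * e_j for d_ij, e_j in zip(row, e)) for row in D]


# Hypothetical 3-community example; the entries are made-up counts.
D = [
    [0, 5, 2],
    [1, 0, 0],
    [3, 4, 0],
]
A = attention_vector(D)
```

Under a different choice of e (for example, a weighting over communities), the same function body applies once e is replaced accordingly.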
17. EVALUATION
Data collection and feature extraction
Data collection
www.linkedin.com
Standalone features: a company's revenue per
employee, industry, location, and age
Information on 39,527 companies in 142 industries
18. EVALUATION
Feature extraction
Industry: count how many people flow into it and
out of it, using company-level departure and arrival data.
Location: popularity of each location.
Founded year: the number of companies founded
in each year.
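The industry and founded-year features described above amount to simple counting, which can be sketched as follows. The input format (a list of `(from_company, to_company)` moves and a company-to-industry mapping) is an assumption made for illustration, as is the choice to count moves within the same industry in both totals; the slides do not specify either.

```python
from collections import Counter


def industry_flows(moves, company_industry):
    """Count arrivals into and departures from each industry.

    moves: iterable of (from_company, to_company) pairs (hypothetical format).
    company_industry: dict mapping a company name to its industry.
    Companies with no known industry are skipped.
    """
    inflow, outflow = Counter(), Counter()
    for src, dst in moves:
        src_industry = company_industry.get(src)
        dst_industry = company_industry.get(dst)
        if src_industry is not None:
            outflow[src_industry] += 1
        if dst_industry is not None:
            inflow[dst_industry] += 1
    return inflow, outflow


def founded_year_counts(founded_years):
    """Number of companies founded in each year."""
    return Counter(founded_years)


# Made-up example data.
company_industry = {"A": "IT", "B": "IT", "C": "Finance"}
moves = [("A", "C"), ("C", "B"), ("A", "B")]
inflow, outflow = industry_flows(moves, company_industry)
year_counts = founded_year_counts([1999, 2004, 1999])
```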
Ranking performance
Baseline description
PageRank
IT and financial
The 2011 ideal employer ranking proposed by
Universumglobal
The 2011 most admired company ranking by Fortune
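For readers unfamiliar with the PageRank baseline, a minimal power-iteration sketch is given below. The slides do not say which graph the baseline runs on; the code assumes a weighted directed graph in which `weights[i][j]` is the number of people moving from community i to community j. The damping factor and the example matrix are illustrative choices, not values from the source.

```python
def pagerank(weights, damping=0.85, tol=1e-10, max_iter=200):
    """Weighted PageRank by power iteration.

    weights[i][j] is the weight of the edge i -> j (assumed here to be
    the number of people moving from community i to community j).
    """
    n = len(weights)
    out_sums = [sum(row) for row in weights]
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # Rank held by nodes with no outgoing edges is spread uniformly.
        dangling = sum(rank[i] for i in range(n) if out_sums[i] == 0)
        new_rank = []
        for j in range(n):
            incoming = sum(
                rank[i] * weights[i][j] / out_sums[i]
                for i in range(n)
                if out_sums[i] > 0
            )
            new_rank.append(
                (1 - damping) / n + damping * (incoming + dangling / n)
            )
        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank


# Made-up example: communities 1 and 2 both lose people to community 0.
scores = pagerank([[0, 0, 0], [1, 0, 0], [1, 0, 0]])
```

In this toy graph community 0 receives all the flow, so it ends up with the highest score.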
24. JOINT TOPIC MODELING FOR EVENT SUMMARIZATION ACROSS NEWS AND SOCIAL MEDIA STREAMS
Qatar Computing Research Institute
Qatar Foundation, Doha, Qatar
25. ABSTRACT
A novel unsupervised approach based on topic
modeling that summarizes trending subjects by
jointly discovering representative and
complementary information from news and
tweets.
A topic modeling formalism that combines a two-dimensional topic-aspect model with a
cross-collection approach from the multi-document summarization literature.
Co-ranking of the news sentences and tweets
on both sides.
26. INTRODUCTION
News: well-crafted, fact-oriented long stories
written by professionals about recent
events.
Tweets: personalized, more opinionated, free-style short messages posted by ordinary
people in real time.
27. INTRODUCTION
Contributions
A novel problem: generating complementary
summaries.
A principled measure to assess the extent of
sentence-level complementarity.
A topic modeling approach called the cross-collection
topic-aspect model (ccTAM), which combines
ccLDA and the topic-aspect mixture model to
precisely estimate the proposed
complementarity measure.
A gold-standard dataset of complementary summaries.
37. GENERATE COMPLEMENTARY SUMMARIES
$G = (N \cup T, E)$
$N = \{n_1, n_2, \dots, n_{m_n}\}$, $T = \{t_1, t_2, \dots, t_{n_t}\}$
$E = \{(p(n_i \mid t_j), p(t_j \mid n_i)) \mid i = 1, \dots, m_n;\ j = 1, \dots, n_t\}$
is the set of directed edges
between the two sets of nodes, whose values are
node-to-node jumping probabilities.
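To make the bipartite graph concrete, the sketch below builds the two jumping probabilities p(t_j | n_i) and p(n_i | t_j) and runs a simple alternating random walk over them. It is a generic mutual-reinforcement sketch, not the authors' algorithm: the slides do not say how the probabilities are estimated, so the code assumes a hypothetical non-negative affinity matrix `S` between news sentences and tweets and normalises its rows and columns.

```python
def jump_probabilities(S):
    """Return (p_t_given_n, p_n_given_t) from affinities S[i][j].

    S[i][j] is an assumed affinity between news sentence i and tweet j.
    p_t_given_n[i][j] = p(t_j | n_i): S with each row normalised.
    p_n_given_t[i][j] = p(n_i | t_j): S with each column normalised.
    """
    m, n = len(S), len(S[0])
    row_sums = [sum(S[i]) for i in range(m)]
    col_sums = [sum(S[i][j] for i in range(m)) for j in range(n)]
    p_t_given_n = [
        [S[i][j] / row_sums[i] if row_sums[i] else 0.0 for j in range(n)]
        for i in range(m)
    ]
    p_n_given_t = [
        [S[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(m)
    ]
    return p_t_given_n, p_n_given_t


def co_rank(S, iters=50):
    """Alternate score propagation between the two node sets."""
    p_t_given_n, p_n_given_t = jump_probabilities(S)
    m, n = len(S), len(S[0])
    news = [1.0 / m] * m
    tweets = [1.0 / n] * n
    for _ in range(iters):
        # A walker on tweet j jumps to sentence i with probability p(n_i | t_j).
        news = [
            sum(p_n_given_t[i][j] * tweets[j] for j in range(n))
            for i in range(m)
        ]
        # A walker on sentence i jumps to tweet j with probability p(t_j | n_i).
        tweets = [
            sum(p_t_given_n[i][j] * news[i] for i in range(m))
            for j in range(n)
        ]
    return news, tweets


# Made-up 2 x 2 affinity matrix.
news_scores, tweet_scores = co_rank([[3, 1], [1, 1]])
```

With plain row and column normalisation the walk settles on scores proportional to each node's total affinity; a model-based estimate of the jumping probabilities would change the resulting ranking.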
41. EXPERIMENTS AND RESULTS
Gold-standard summaries
News summaries: English Wikipedia and
Wikinews
Tweet summaries
Baseline methods
BL-0: LexRank
BL-1: KL-divergence (KLD)
BL-2: Cosine and language modeling (LM)
BL-3: LexRank+Complementarity (LexComp)
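Two of the similarity measures named in the baselines, KL-divergence and cosine, can be sketched on bag-of-words representations as below. How the baselines actually apply them (what is compared with what, and how text is tokenised or smoothed) is not given in the slides; the add-alpha smoothing, the whitespace tokenisation, and the example sentences are assumptions for illustration only.

```python
import math
from collections import Counter


def unigram_dist(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def cosine(a_tokens, b_tokens):
    """Cosine similarity between raw term-count vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Made-up example texts.
news = "the storm closed the airport".split()
tweet = "airport closed again ugh".split()
vocab = sorted(set(news) | set(tweet))
p = unigram_dist(news, vocab)
q = unigram_dist(tweet, vocab)
divergence = kl_divergence(p, q)
similarity = cosine(news, tweet)
```

Smoothing keeps every probability strictly positive, so the logarithm in the KL term is always defined.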