Presentation given at ICCSS 2015, Helsinki, Finland. It illustrates an approach for classifying users of OSNs solely based on their interactions with other users.
Learning to Classify Users in Online Interaction Networks
1. Learning to Classify Users in Online Interaction Networks
Georgios Rizos, Symeon Papadopoulos, and Yiannis Kompatsiaris
Centre for Research and Technology Hellas (CERTH) - Information Technologies Institute (ITI)
ICCSS 2015, June 10, 2015, Helsinki, Finland
2. User Classification
Twitter Handle | Labels
@nytimes | usa, press, new york
@HuffPostBiz | finance
@BBCBreaking | press, journalist, tv
@StKonrath | journalist
Examples from SNOW 2014 dataset
3. User Classification in (and outside) OSNs
[Diagram labels: OSN, online activities, log files, APIs, behaviour, observation, profiling/classification]
4. Network-based User Classification
• People with similar interests tend to connect (homophily)
• Knowing a user's connections could reveal information about that user
• Knowing the whole network structure could reveal even more…
5. Related Work: User Classification
Graph-based semi-supervised learning:
• Label propagation (Zhu and Ghahramani, 2002)
• Local and global consistency (Zhou et al., 2004)
• Empirical evaluation of many graph kernels (Fouss et al., 2012)
Other approaches to user classification:
• Hybrid feature engineering for inferring user behaviors (Pennacchiotti and Popescu, 2011; Wagner et al., 2013)
• Crowdsourcing Twitter list keywords for popular users (Ghosh et al., 2012)
• Content-based, graph-regularized NMF for spammer detection (Hu et al., 2013)
6. Related Work: Graph Feature Extraction
First attempts at using community detection:
• EdgeCluster: Edge-centric k-means (Tang and Liu, 2009)
• MROC: Binary tree community hierarchy (Wang et al., 2013)
Low-rank matrix representation methods:
• Laplacian Eigenmaps: k eigenvectors of the graph Laplacian (Belkin and Niyogi, 2003; Tang and Liu, 2011)
• Random-Walk Modularity Maximization: Does not suffer from the resolution limit of ModMax (Devooght et al., 2014)
• DeepWalk: Deep representation learning (Perozzi et al., 2014)
7. Overview of Framework
[Framework diagram: online social interactions (retweets, mentions, etc.) → social interaction user graph → ARCTE → unsupervised graph feature representation → feature weighting (using partial/sparse annotation) → supervised graph feature representation → user label learning → classified users]
8. Network Features using ARCTE
• Based on user-centric community detection.
• For each user, we extract two types of user-centric communities.
• Base user-centric community: $c_v = N(v) \cup \{v\}$, where $N(v)$ is the set of neighbors of $v$.
• Extended user-centric community: Consider a vector $s_v$ that contains similarity values between the seed user $v$ and all the other users.
  - By truncating it appropriately, we can keep a community of the users most similar to the seed $v$.
  - We keep the fewest possible users such that we still include the seed user's direct neighbors.
• Denote the set of communities detected by $C$. We form the feature matrix $X$ as follows:
$$x_{vj} = \begin{cases} 1, & \text{if } v \in c_j \\ 0, & \text{otherwise} \end{cases}, \quad \forall c_j \in C$$
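The community definitions above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy graph, and the similarity values for seed "a" are made up for illustration, and the similarity vector is simply taken as given.

```python
from typing import Dict, Hashable, List, Set

Graph = Dict[Hashable, Set[Hashable]]  # undirected graph as adjacency sets


def base_community(graph: Graph, v: Hashable) -> Set[Hashable]:
    """Base user-centric community: the seed's neighbors plus the seed."""
    return set(graph[v]) | {v}


def extended_community(graph: Graph, v: Hashable,
                       similarity: Dict[Hashable, float]) -> Set[Hashable]:
    """Shortest prefix of the similarity ranking that covers all neighbors of v."""
    ranked = sorted((u for u in similarity if u != v),
                    key=lambda u: similarity[u], reverse=True)
    missing = set(graph[v])  # direct neighbors not yet included
    community = {v}
    for u in ranked:
        if not missing:
            break  # every direct neighbor is covered: truncate here
        community.add(u)
        missing.discard(u)
    return community


def feature_matrix(vertices: List[Hashable],
                   communities: List[Set[Hashable]]) -> List[List[int]]:
    """x[v][j] = 1 if vertex v belongs to community j, else 0."""
    return [[1 if v in c else 0 for c in communities] for v in vertices]


# Toy graph and made-up similarity values for seed "a" (illustrative only).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
similarity_to_a = {"b": 0.4, "d": 0.3, "c": 0.2, "e": 0.05}

base = base_community(graph, "a")
extended = extended_community(graph, "a", similarity_to_a)
X = feature_matrix(["a", "b", "c", "d", "e"], [base, extended])
```

In this toy case the extended community also picks up "d", because "d" ranks above the direct neighbor "c" in the similarity ordering.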
10. Fast Approximate User-centric PageRank
• Given a seed user $v$, we calculate the user-centric PageRank vector (i.e. the stationary distribution of a random walk with restarts whose restart distribution places probability 1 at $v$).
• The vector is localized and sparse; i.e. we neither propagate nor store negligible values.
• Instead of approximating the PageRank vector, we approximate cumulative PageRank differences. This gives a better approximation in fewer iterations.
• We alternate between two update rules:
  - Cumulative PR diff: $s^{(t+1)} = s^{(t)} + (1 - \rho)\, r_u^{(t-1)} W_u$
    (instead of PR: $p^{(t+1)} = p^{(t)} + \rho\, r_u^{(t)} I_u$ (Andersen et al., 2006))
  - Residual distribution: $r^{(t+1)} = r^{(t)} - r_u^{(t)} I_u + (1 - \rho)\, r_u^{(t)} W_u$
  where $\rho$ is the restart probability, $r_u^{(t)}$ is the residual at the vertex $u$ being updated, $W_u$ is the $u$-th row of $W = D^{-1} A$, and $I_u$ is the $u$-th row of the identity matrix $I$.
• Finally, we divide each element of the resulting vector by the degree of its vertex in order to get approximate, user-centric, regularized commute-times.
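One way to read the update rules above is as a push-style local computation, sketched below in Python. This is a hedged illustration rather than the authors' code: the names `push_pagerank`, `rho`, and `eps`, the queue-based vertex selection, and the stopping threshold (residual below `eps` times the degree, in the style of Andersen et al., 2006) are all assumptions. The sketch keeps both the ordinary PageRank estimate `p` and the cumulative-difference vector `s` so the two can be compared.

```python
from collections import deque
from typing import Dict, Hashable, Set, Tuple

Graph = Dict[Hashable, Set[Hashable]]  # undirected graph as adjacency sets


def push_pagerank(graph: Graph, seed: Hashable, rho: float = 0.1,
                  eps: float = 1e-4) -> Tuple[Dict, Dict, Dict]:
    """Return (p, s, r): PageRank estimate, cumulative differences, residual."""
    p: Dict[Hashable, float] = {}
    s: Dict[Hashable, float] = {}
    r: Dict[Hashable, float] = {seed: 1.0}  # all mass starts at the seed
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        r_u = r.get(u, 0.0)
        deg_u = len(graph[u])
        if deg_u == 0 or r_u < eps * deg_u:
            continue  # negligible residual: neither propagate nor store it
        p[u] = p.get(u, 0.0) + rho * r_u   # PR rule: p + rho * r_u * I_u
        r[u] = 0.0                         # residual rule: r - r_u * I_u
        share = (1.0 - rho) * r_u / deg_u  # entries of (1 - rho) * r_u * W_u
        for w in graph[u]:
            s[w] = s.get(w, 0.0) + share   # cumulative PR difference rule
            r[w] = r.get(w, 0.0) + share   # residual rule, second term
            if r[w] >= eps * len(graph[w]):
                queue.append(w)
    return p, s, r


# Toy graph (illustrative only).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
p, s, r = push_pagerank(graph, "a", rho=0.1, eps=1e-4)

# Final step from the slide: divide each entry by the degree of its vertex.
scores = {w: s[w] / len(graph[w]) for w in s}
```

Only vertices that actually receive mass appear in the dictionaries, which is what keeps the vectors localized and sparse on large graphs.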
11. Community Weighting
• We perform a supervised community weighting step to boost the importance of highly predictive communities.
• For each community $c_j$ we calculate a weight: $w_j = \chi^2_j \times \mathrm{ivf}(j)$
• The first factor is based on supervised chi-squared weighting, which quantifies the correlation of each feature-label pair.
  - PSNR aggregation across labels: $\chi^2_j = \frac{\max_l \chi^2_{j,l} - \min_l \chi^2_{j,l}}{\text{within-label variability}}$
• The second factor is unsupervised inverse vertex frequency.
  - Consider idf with communities as terms and vertices as documents.
• We multiply each column of $X$ with the corresponding weight.
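The weighting step can be sketched in Python as follows. Several parts are assumptions made for illustration: the chi-squared statistic is the standard Pearson statistic on a 2x2 contingency table computed over the labeled vertices only, the aggregation across labels is a plain maximum standing in for the PSNR aggregation (whose denominator is not spelled out on the slide), and ivf(j) is taken to be log(|V| / |c_j|). All function and variable names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def chi2_2x2(feature: Sequence[int], label: Sequence[int]) -> float:
    """Pearson chi-squared statistic for two binary vectors."""
    n = len(feature)
    a = sum(1 for f, y in zip(feature, label) if f and y)
    b = sum(1 for f, y in zip(feature, label) if f and not y)
    c = sum(1 for f, y in zip(feature, label) if not f and y)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def community_weights(X: List[List[int]],
                      labels: Dict[int, List[int]]) -> List[float]:
    """w_j = chi2_j * ivf(j); `labels` maps labeled row indices to label vectors."""
    n_vertices = len(X)
    labeled = sorted(labels)
    n_labels = len(labels[labeled[0]])
    weights = []
    for j in range(len(X[0])):
        column = [X[v][j] for v in labeled]
        # Assumed stand-in for the PSNR aggregation: max over labels.
        chi2_j = max(chi2_2x2(column, [labels[v][l] for v in labeled])
                     for l in range(n_labels))
        size_j = sum(row[j] for row in X)
        # Assumed idf-style form of the inverse vertex frequency.
        ivf_j = math.log(n_vertices / size_j) if size_j else 0.0
        weights.append(chi2_j * ivf_j)
    return weights


def apply_weights(X: List[List[int]], weights: List[float]) -> List[List[float]]:
    """Multiply each column of X by its community weight."""
    return [[x * w for x, w in zip(row, weights)] for row in X]


# Six vertices, two communities; vertices 0, 1, 3, 4 carry one binary label.
X = [[1, 1], [1, 0], [1, 1], [0, 1], [0, 0], [0, 1]]
labels = {0: [1], 1: [1], 3: [0], 4: [0]}
w = community_weights(X, labels)
X_weighted = apply_weights(X, w)
```

Here the first community separates the labeled vertices perfectly and keeps a positive weight, while the second is uncorrelated with the label and is weighted to zero.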
12. Evaluation: Dataset Description
Dataset | Labels | Vertices | Vertex Type | Edges | Edge Type
SNOW2014 Graph (Papadopoulos et al., 2014) | 90 | 533,874 | Twitter Account | 949,661 | Mentions + Retweets
IRMV-PoliticsUK (Greene & Cunningham, 2013) | 5 | 419 | Twitter Account | 11,349 | Mentions + Retweets
ASU-YouTube (Mislove et al., 2007) | 47 | 1,134,890 | YouTube Channel | 2,987,624 | Subscriptions
ASU-Flickr (Tang and Liu, 2009) | 195 | 80,513 | Flickr Account | 5,899,882 | Contacts
Ground truth generation:
• SNOW2014 Graph: Twitter list aggregation & post-processing
• IRMV-PoliticsUK: Manual annotation
• ASU-YouTube: User membership in groups
• ASU-Flickr: User subscription to interest groups
13. Evaluation: SNOW 2014 dataset
SNOW2014 Graph (534K, 950K): Twitter mentions + retweets
ground truth based on Twitter list processing
14. Evaluation: Insight Politics UK
Insight-Multiview-PoliticsUK (419, 11K): mentions + retweets
ground truth based on manual annotation
18. Conclusion
• Key ideas:
  - new user feature representation based on user-centric communities
  - community weighting based on sparse annotations
  - consistently good performance on both interaction (mention/retweet) and affiliation (follow/subscribe) graphs
• Future Work:
  - integration of additional signals (content)
  - investigating feasibility on other classification problems, e.g. spammer detection
20. References (1/3)
• Belkin, M., & Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data representation. Neural computation, 15(6), 1373-1396.
• Tang, L., & Liu, H. (2011). Leveraging social media networks for classification. Data Mining and Knowledge Discovery, 23(3), 447-478.
• Devooght, R., Mantrach, A., Kivimäki, I., Bersini, H., Jaimes, A., & Saerens, M. (2014, April). Random walks based modularity: application to semi-supervised learning. In Proceedings of the 23rd international conference on World wide web (pp. 213-224). International World Wide Web Conferences Steering Committee.
• Perozzi, B., Al-Rfou, R., & Skiena, S. (2014, August). Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 701-710). ACM.
• Tang, L., & Liu, H. (2009, November). Scalable learning of collective behavior based on sparse social dimensions. In Proceedings of the 18th ACM conference on Information and knowledge management (pp. 1107-1116). ACM.
• Wang, X., Tang, L., Liu, H., & Wang, L. (2013). Learning with multi-resolution overlapping communities. Knowledge and information systems, 36(2), 517-535.
21. References (2/3)
• Zhu, X., & Ghahramani, Z. (2002). Learning from labeled and unlabeled data with label propagation. Technical Report CMU-CALD-02-107, Carnegie Mellon University.
• Zhou, D., Bousquet, O., Lal, T. N., Weston, J., & Schölkopf, B. (2004). Learning with local and global consistency. Advances in neural information processing systems, 16(16), 321-328.
• Fouss, F., Francoisse, K., Yen, L., Pirotte, A., & Saerens, M. (2012). An experimental investigation of kernels on graphs for collaborative recommendation and semisupervised classification. Neural Networks, 31, 53-72.
• Pennacchiotti, M., & Popescu, A. M. (2011, August). Democrats, republicans and starbucks afficionados: user classification in twitter. In Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 430-438). ACM.
• Ghosh, S., Sharma, N., Benevenuto, F., Ganguly, N., & Gummadi, K. (2012, August). Cognos: crowdsourcing search for topic experts in microblogs. In Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval (pp. 575-590). ACM.
• Hu, X., Tang, J., Zhang, Y., & Liu, H. (2013, August). Social spammer detection in microblogging. In Proceedings of the Twenty-Third international joint conference on Artificial Intelligence (pp. 2633-2639). AAAI Press.
• Wagner, C., Asur, S., & Hailpern, J. (2013, September). Religious politicians and creative photographers: Automatic user categorization in twitter. In Social Computing (SocialCom), 2013 International Conference on (pp. 303-310). IEEE.
22. References (3/3)
• Andersen, R., Chung, F., & Lang, K. (2006, October). Local graph partitioning using pagerank vectors. In Foundations of Computer Science, 2006. FOCS'06. 47th Annual IEEE Symposium on (pp. 475-486). IEEE.
• Papadopoulos, S., Corney, D., & Aiello, L. M. (2014). SNOW 2014 Data Challenge: Assessing the Performance of News Topic Detection Methods in Social Media. In SNOW-DC@ WWW (pp. 1-8).
• Greene, D., & Cunningham, P. (2013, May). Producing a unified graph representation from multiple social network views. In Proceedings of the 5th Annual ACM Web Science Conference (pp. 118-121). ACM.
• Mislove, A., Marcon, M., Gummadi, K. P., Druschel, P., & Bhattacharjee, B. (2007, October). Measurement and analysis of online social networks. In Proceedings of the 7th ACM SIGCOMM conference on Internet measurement (pp. 29-42). ACM.
• Tang, L., & Liu, H. (2009, June). Relational learning via latent social dimensions. In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 817-826). ACM.
24. Classifying Users using Network Structure
• We apply user-centric community detection to the problem of graph-based user classification. We name our approach ARCTE.
• Improved approximate, user-centric PageRank calculation for better local graph exploration.
• Supervised community weighting step that boosts the importance of highly predictive communities in the feature representation.
• Extensive comparative study of numerous state-of-the-art network feature extraction methods on several social interaction datasets.
Editor's Notes
Topics
Political/social attitudes
News stories
Geographical area
User types/roles
Useful for news search/discovery
Potential privacy issues
Different kinds of user classification:
topic-oriented (e.g., interest/expertise)
role-based/behavioral (e.g., bot/spammer)
geographical location
Useful for advertising, user recommendation, expert search, etc.
For personal accounts, user classification raises privacy concerns
Challenges
multi-linguality
brevity
informal language