Modelling and Analysis of User Behaviour in Online Communities
1. Modelling and Analysis of User
Behaviour in Online
Communities
Sofia Angeletou, Matthew Rowe and Harith Alani
Knowledge Media Institute, The Open University, Milton
Keynes, United Kingdom
International Semantic Web Conference 2011.
Bonn, Germany. 2011
2. The Utility of
Online Communities
• Online communities yield value in terms of:
– Idea generation
– Customer support
– Problem solving
• Managing and hosting communities can be
– Expensive
– Time-consuming
• Large investments in communities, therefore they must:
– flourish and remain active
– remain… ‘healthy’
3. Increasingly Active
Community
What did the community look like at this point?
4. Increasingly Inactive
Community
What were the conditions
at this point?
5. Gauging Health
• How can we gauge community health?
– Post Count?
– User Count?
– Communication/Interaction?
– Behaviour?
• Domination of one behaviour could lead to churn
– Preece, 2000
• Behaviour in online community is influenced by the roles that
users assume
– Preece, 2001
• To provide health insights we need to monitor behaviour over
time
– Combined with basic health metrics (e.g. post count)
6. Supporting
Community Owners
1. Monitor and capture member activities
2. Analyse emerging behaviour over time
3. Understand the correlation of behaviour with
community evolution
4. Learn when to intervene to influence the community
8. Contributions
• Ontology to model behavioural roles and behaviour
features
– Capturing time stamped user attributes
• Method to infer user roles in online communities
– Using semantic rules
• Analysis of community health through role composition
– Identifying composition patterns for healthy communities
9. Outline
• Behaviour Ontology
• Behaviour Features
• Community Roles
• Approach for Behaviour Analysis
– Constructing Semantic Rules
– Applying Semantic Rules
• Analysis of Community Health
• Predicting Community Health
• Findings
• Future Work
• Conclusions
10. Behaviour Ontology
http://purl.org/net/oubo/0.3
11. Behaviour Features
• In-degree Ratio
– Proportion of users that reply to user ui
• Posts Replied Ratio
– Proportion of posts by ui that yield a reply
• Thread Initiation Ratio
– Proportion of threads started by ui
• Bi-directional Threads Ratio
– Proportion of threads where ui is involved in a reciprocal action
• Bi-directional Neighbours Ratio
– Proportion of ui‘s neighbours with whom a reciprocal action has
taken place
• Average Posts per Thread
– Mean number of posts in the threads that ui has participated in
• Standard Deviation of Posts per Thread
– Standard deviation of posts in the threads that ui has posted in
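A minimal sketch of how three of these ratios might be computed from a flat list of posts. The `Post` record, its field names, and the choice of denominators (all other users in the forum for the in-degree ratio, all threads in the forum for the thread initiation ratio) are assumptions made for illustration; the slides do not define them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Post:
    # Hypothetical record layout; the slides do not prescribe one.
    post_id: int
    author: str
    thread_id: int
    reply_to: Optional[int]  # post_id being replied to, None for a thread starter


def behaviour_features(posts, user):
    """Return three behaviour ratios for `user` (denominators are assumptions)."""
    all_users = {p.author for p in posts}
    own_ids = {p.post_id for p in posts if p.author == user}

    # Replies written by other users to one of `user`'s posts.
    replies_to_user = [
        p for p in posts if p.reply_to in own_ids and p.author != user
    ]
    repliers = {p.author for p in replies_to_user}
    replied_posts = {p.reply_to for p in replies_to_user}

    threads = {p.thread_id for p in posts}
    started = {
        p.thread_id for p in posts if p.reply_to is None and p.author == user
    }

    others = len(all_users - {user})
    return {
        "in_degree_ratio": len(repliers) / others if others else 0.0,
        "posts_replied_ratio": len(replied_posts) / len(own_ids) if own_ids else 0.0,
        "thread_initiation_ratio": len(started) / len(threads) if threads else 0.0,
    }


# Toy forum: two threads, three users.
posts = [
    Post(1, "ann", 10, None),
    Post(2, "bob", 10, 1),
    Post(3, "cat", 10, 1),
    Post(4, "ann", 10, 2),
    Post(5, "bob", 11, None),
    Post(6, "ann", 11, 5),
]
ann = behaviour_features(posts, "ann")
bob = behaviour_features(posts, "bob")
```

In this toy forum both other users reply to ann, so her in-degree ratio is 1.0, while only one of her three posts receives a reply.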
12. Community Roles
Elitist
Grunt
Joining Conversationalist
Popular Initiator
Popular Participant
Supporter
Taciturn
Ignored
Jeffrey Chan, Conor Hayes, and Elizabeth Daly. Decomposing
discussion forums using common user roles. In Proc. Web Science
Conf. (WebSci10), Raleigh, NC: US, 2010.
18. Community Roles
Table 1. Roles and the feature-to-level mappings

Role                       Feature                          Level
Elitist                    In-Degree Ratio                  low
                           Bi-directional Threads Ratio     high
                           Bi-directional Neighbours Ratio  low
Grunt                      Bi-directional Threads Ratio     med
                           Bi-directional Neighbours Ratio  med
                           Average Posts per Thread         low
                           STD of Posts per Thread          low
Joining Conversationalist  Thread Initiation Ratio          low
                           Average Posts per Thread         high
                           STD of Posts per Thread          high
Popular Initiator          In-Degree Ratio                  high
                           Thread Initiation Ratio          high
Popular Participants       In-Degree Ratio                  high
                           Thread Initiation Ratio          low
                           Average Posts per Thread         med
                           STD of Posts per Thread          med
Supporter                  In-Degree Ratio                  med
                           Bi-directional Threads Ratio     med
                           Bi-directional Neighbours Ratio  med
Taciturn                   Bi-directional Threads Ratio     low
                           Bi-directional Neighbours Ratio  low
                           Average Posts per Thread         low
                           STD of Posts per Thread          low
Ignored                    Posts Replied Ratio              low
19. Constructing Rules
• Structural, social network, reciprocity, persistence, participation
• Feature levels change with the dynamics of the community
• Based on related work, we associate roles with a collection of feature-to-level mappings
– e.g. in-degree -> high, out-degree -> high
• Run rules over each user’s features and derive the community role composition
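A rough sketch of the idea of feature-to-level rules, written in plain Python rather than as the semantic rules the talk uses. The tercile cut-offs are an assumption (the talk only says that feature bounds depend on the community's distribution), the function and variable names are hypothetical, and only two roles from Table 1 (slide 18) are encoded.

```python
def level_of(value, community_values):
    """Map a value to 'low', 'med' or 'high' using tercile cut-offs (an assumption)."""
    ordered = sorted(community_values)
    n = len(ordered)
    low_cut = ordered[n // 3]
    high_cut = ordered[(2 * n) // 3]
    if value < low_cut:
        return "low"
    if value < high_cut:
        return "med"
    return "high"


# Two feature-to-level mappings taken from Table 1; the dict layout is illustrative.
RULES = {
    "PopularInitiator": {"in_degree_ratio": "high", "thread_initiation_ratio": "high"},
    "Ignored": {"posts_replied_ratio": "low"},
}


def infer_roles(community):
    """Return {user: [matching roles]}; users matching no rule get an empty list."""
    feature_names = {f for rule in RULES.values() for f in rule}
    # Cut-offs are derived from the whole community, so levels shift as it changes.
    distributions = {
        f: [features[f] for features in community.values()] for f in feature_names
    }
    result = {}
    for user, features in community.items():
        levels = {f: level_of(features[f], distributions[f]) for f in feature_names}
        result[user] = [
            role
            for role, rule in RULES.items()
            if all(levels[f] == wanted for f, wanted in rule.items())
        ]
    return result


community = {
    "u1": {"in_degree_ratio": 0.9, "thread_initiation_ratio": 0.7, "posts_replied_ratio": 0.8},
    "u2": {"in_degree_ratio": 0.1, "thread_initiation_ratio": 0.0, "posts_replied_ratio": 0.0},
    "u3": {"in_degree_ratio": 0.5, "thread_initiation_ratio": 0.3, "posts_replied_ratio": 0.5},
}
roles = infer_roles(community)
```

Here `u3` matches neither rule and stays unlabelled, which is the kind of gap a rule base with fixed level combinations can leave.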
20. Applying Rules
CONSTRUCT {
?role a ?t .
?this social-reality:count_as ?role .
?context a social-reality:C .
?role social-reality:context ?context .
?temp a oubo:TemporalContext .
?forum a sioc:Forum .
?forum oubo:belongsToContext ?context .
?temp oubo:belongsToContext ?context
} WHERE {
BIND (oubo:fn_getRoleType(?this) AS ?type) .
BIND(smf:buildURI("oubo:Role{?type}") AS ?t) .
.....
}
21. Applying Rules
(CONSTRUCT query as on slide 20)
1. SPIN function fn_getRoleType() matches the user (?this) with the relevant role type
http://spinrdf.org/spin.html
22. Applying Rules
(CONSTRUCT query as on slide 20)
2. Build the URI for the behaviour role class of the user, based on the ?type match
23. Applying Rules
(CONSTRUCT query as on slide 20)
3. The user (?this) is associated with the role in the given time span (?temp) and forum (?forum)
24. Analysis of
Community Health
• How is community role composition associated with activity?
• Dataset
– Irish community message board: Boards.ie
– All posts used from 2004 – 2006
– Selected 3 forums for analysis
• F246: Commuting and Transport
• F388: Rugby
• F411: Mobile Phones and PDAs
• Measured at 12-week increments:
– Forum composition (% of roles)
• E.g. 20% elitists, 10% grunts, etc
– Number of posts
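A small sketch of the two measurements, assuming that each post carries a date and that each user has already been labelled with a role for the window. The helper names and the choice of 2004-01-01 as the start of the first window are hypothetical.

```python
from collections import Counter
from datetime import date

WINDOW_DAYS = 12 * 7  # 12-week increments, as on the slide


def window_index(day, start):
    """Index of the 12-week window that `day` falls into, counting from `start`."""
    return (day - start).days // WINDOW_DAYS


def composition(role_labels):
    """Percentage of users holding each role within one window."""
    counts = Counter(role_labels)
    total = sum(counts.values())
    return {role: 100.0 * count / total for role, count in counts.items()}


start = date(2004, 1, 1)  # assumed start of the first window
first = window_index(date(2004, 3, 24), start)   # day 83, still the first window
second = window_index(date(2004, 3, 25), start)  # day 84, the second window

labels = ["Elitist"] * 2 + ["Grunt"] + ["Taciturn"] * 5 + ["Ignored"] * 2
shares = composition(labels)
```

With these ten labelled users, `shares` gives 20% Elitists and 10% Grunts, matching the form of the example on the slide.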
25. Analysis: Results (1)
Forum 246 – Commuting and Transport
26. Analysis: Results (2)
Forum 246 – Commuting and Transport
Forum 388 – Rugby
Forum 411 – Mobile Phones and PDAs
27. Analysis: Results (3)
Forum 246 – Commuting and Transport
28. Analysis: Results (4)
Forum 246 – Commuting and Transport
Forum 388 – Rugby
Forum 411 – Mobile Phones and PDAs
29. Predicting
Community Health
• Can we predict community health from role composition?
1. Predict either an increase or decrease in activity
– Features: roles and percentages
– Class label: increase/decrease
– Performed 10-fold cross validation with J48 decision tree
2. Predict post count from role composition
– Independent variables: roles and percentages
– Dependent variable: post count
– Induced linear regression model and assessed the model
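As an illustration of the first task, the sketch below shows one way the class labels could be derived from per-window post counts and paired with the role percentages as features. The function names, the sample numbers, and the treatment of an unchanged post count as a decrease are assumptions; the classifier itself is not reproduced here.

```python
def label_changes(post_counts):
    """Label each window after the first as 'pos' (increase) or 'neg' (otherwise)."""
    return [
        "pos" if current > previous else "neg"
        for previous, current in zip(post_counts, post_counts[1:])
    ]


def build_rows(compositions, post_counts):
    """Pair each window's role percentages with its activity-change label."""
    labels = label_changes(post_counts)
    # The first window has no predecessor, so it contributes no training row.
    return list(zip(compositions[1:], labels))


# Made-up numbers for four consecutive windows of one forum.
post_counts = [120, 150, 140, 200]
compositions = [
    {"Elitist": 10.0, "Taciturn": 60.0},
    {"Elitist": 20.0, "Taciturn": 50.0},
    {"Elitist": 15.0, "Taciturn": 55.0},
    {"Elitist": 25.0, "Taciturn": 40.0},
]
rows = build_rows(compositions, post_counts)
```

Each row is a (features, label) pair that could then be passed to a decision tree learner.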
30. Prediction: Results (1)
... having either increased (pos) or decreased (neg) since the previous time window. For our classification task we used the J48 decision tree classifier in a 10-fold cross validation setting (due to the limited size of the datasets) by: first, identifying increases and decreases in each of the forums, and secondly, identifying activity changes across communities, by combining forum datasets together into a single dataset. To report on the performance of our approach we used precision, recall, f-measure (setting β = 1) and the area under the Receiver Operator Characteristic Curve (ROC).

Table 2. Results from detecting changes in activity using community composition

Forum  P      R      F1     ROC
246    0.799  0.769  0.780  0.800
388    0.603  0.615  0.605  0.775
411    0.765  0.692  0.714  0.617
All    0.583  0.667  0.607  0.466

Table 2 presents the results from our classification experiments. For forum 246 we achieve the highest F1 value due to the activity in the forum steadily increasing over time and the precision value indicating that in this forum the composition patterns account for fluctuations in activity. For forum 388 we return the lowest F1 value, indicating that the variance in activity renders the prediction of activity increase difficult within this forum; this could possibly be due to the seasonal fluctuations in interest surrounding the rugby season.
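For reference, a minimal sketch of how precision, recall and F1 (the f-measure with β = 1) are computed for one class from actual and predicted labels. The labels below are made up, and how the reported figures were averaged over the two classes is not stated in the excerpt, so this is not a reproduction of Table 2.

```python
def precision_recall_f1(actual, predicted, positive="pos"):
    """Precision, recall and F1 for the `positive` class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Made-up labels for five time windows.
actual = ["pos", "pos", "neg", "pos", "neg"]
predicted = ["pos", "neg", "neg", "pos", "pos"]
p, r, f1 = precision_recall_f1(actual, predicted)
```

In this example there are two true positives, one false positive and one false negative, so all three scores equal 2/3.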
31. Prediction: Results (2)
... this analysis we have identified four key take-home messages:
1. Healthy communities contain more elitists and popular participants.
2. Unhealthy communities contain many taciturns and ignored users.
3. Communities exhibit idiosyncratic compositions, thus reflecting the differing dynamics that are required/exhibited by individual communities.
4. A stable composition, with a mix of roles, increases community health.

Table 3. Linear regression model induced from the forum composition of f388

Role                       Est. Coefficient  Standard Error  t-Value  P(x > t)
Joining Conversationalist    69.20             43.82           1.579   0.1751
Popular Initiators          173.41             54.72           3.169   0.0248 **
Taciturns                  -135.97            101.91          -1.334   0.2397
Supporters                 -266.53            109.60          -2.432   0.0592 *
Elitists                   -105.19             55.88          -1.882   0.1185
Popular Participants        372.44            103.24           3.608   0.0154 **
Ignored                     -75.69             33.39          -2.267   0.0727 *

Summary: Res. St Err: 311.5, Adj R^2: 0.8514, F(7,5): 10.82, p-value: 0.0092
Signif. codes: p-value < 0.001 *** 0.01 ** 0.05 * 0.1 . 1

5 Discussion and Future Work
The communities we chose to analyse in this paper were forums from Boards.ie. It is possible of course that different behavioural patterns could emerge when analysing different communities. However, there is no reason to assume that our ...
32. Findings
1. Active communities contain more Elitists and Popular Participants
2. Unhealthy communities contain more Taciturns and Ignored users
3. Communities exhibit idiosyncratic compositions
4. A stable, mixed composition increases activity
33. Future Work
• Micro-level role analysis
– Development of a ‘role lifecycle’
• Identification of key community users
– To avoid such users ‘churning’
• Explore alternative methods for role labelling
– Current approach misses ~29% of users
• Extend analysis to other community types
– Enterprise communities
– Social networking platforms
34. Conclusions
• Presented an approach to label users with roles based
on their behaviour
– Ontology captures user behaviour as numeric attributes
– Semantic rules are employed to infer user roles
• The behaviour roles used are only a subset of those in the literature
– Roles differ based on the community type
– Our approach is portable to other roles
• Correlated community composition with activity
– Increase in Elitists and Popular Participants = increased
activity
– Increase in Taciturns and Ignored = decreased activity
– Stable, mixed composition = increased health
In-degree ratio = concentration
Posts replied ratio = popularity
Thread initiation ratio = propensity to initiate discussions
Bi-directional threads ratio = reciprocity and interaction
Bi-directional neighbours ratio = reciprocity
Average posts per thread = level of discussion
SD of posts per thread = captures variance of discussions
Roles from Chan et al’s 2010 Web Science paper
Forms our skeleton rule base
For each role type we:
– Construct an instance of oubo:RoleClassifier
– Assign features with numeric ranges for their values to each rule
The skeleton rule base provides the form that the rules should take. The min and max of each feature are derived dynamically, depending on the distribution of the community.
Rules are SPARQL CONSTRUCT queries with SPIN functions included.
Each user is an instance of oubo:UserAccount.
A SPIN function classifies the user into different role types; we have a function for each role.
Users can have different roles in different contexts, both in time and location (forum)
Increase in Elitists and Participants is associated with increased activity: these are users who communicate often with other users.
Increase in Taciturns and Ignored is associated with decreased activity: Taciturns contribute little.
Common patterns across all three forums analysed.
Certain roles are more important than others in differing communities:
– Conversationalists are important in Commuting and Transport and in Rugby, but not in Mobile Phones and PDAs, where conversation is not a driving factor
– Supporters were found to negatively impact upon activity in forum 411, again because conversation is not a common action in the community: members are more interested in support
Activity increases as the composition reaches a relatively stable setting, i.e. little variation and fluctuation in the roles.
Composition stability is associated with increased activity in 246 and 411.
Fluctuation in activity in the rugby forum is correlated with variation in roles.
Best results for 246 – steady increase in activity over time.
Worst results for 388 – fluctuation in composition and activity making it hard to perform predictions.
Cross-community patterns are not reliable – idiosyncratic behaviour in each community.
Induced linear regression using the role percentages as independent variables and the post count as the dependent variable.
Showing f388 as it had the highest R^2 value.
Statistically significant variables: if the community increases in initiators and popular participants and decreases in supporters and ignored users, then activity in the community will increase.