Nowadays, social networks and media such as Facebook and Twitter affect how we communicate and exchange knowledge more than ever. But what benefits can social media offer beyond easy interaction with friends, and how can they be used to create additional value for companies and institutions? These are the questions that the Social Computing area at the Know-Center addresses in detail.
In this talk I will give a brief overview of the industry and non-industry research projects that my group, Social Computing at the Know-Center, has recently been involved in, in the context of Big Data and social media. In particular, the talk will highlight research outcomes and work in progress that use social media data to help people explore the rapidly growing, overloaded information space more efficiently.
Social Computing in the area of Big Data at the Know-Center Austria's leading competence center for data driven business and Big Data analytics
1. Social Computing @ Know-Center
Social Computing in the area of Big Data
at the Know-Center
Christoph Trattner
Know-Center
ctrattner@know-center.at
@Graz University of Technology, Austria
. Christoph Trattner 29.8.2014 – PUC, Chile
2. Social Computing @ Know-Center
Before I start, let me say a bit about myself and how I became
Head of the Social Computing Research Area at the Know-Center,
Austria’s leading competence center for data-driven business
and Big Data analytics.
3. Social Computing @ Know-Center
Where do I come from (Austria)?
4. Social Computing @ Know-Center
Graz
5. Social Computing @ Know-Center
Academic background
• Studied Computer Science at Graz University of Technology & University of Pittsburgh
• Have worked since 2009 as a scientific researcher at the KMI & IICM (BSc 2008, MSc 2009)
• My PhD thesis was on search & navigation in social tagging systems (defended 2012)
• Since Feb. 2013 @ Know-Center
  • Leading the SC area
• At TUG:
  • Web Science
  • Semantic Technologies
6. Social Computing @ Know-Center
My team
2 Post-Docs, 5 Pre-Docs (4 more to join soon :)), 2 MSc students, 2 BSc students
DI. Dieter Theiler, DI. Dominik Kowald, Dr. Peter Kraker, Dr. Elisabeth Lex, Mag. Sebastian Dennerlein, Mag. Matthias Rella, DI. Emanuel Lacic
7. Social Computing @ Know-Center
Thanks to my Collaborators
8. Social Computing @ Know-Center
What is my group doing?
… we research novel methods and tools that exploit
social data to generate greater value for
individuals, communities, companies and society as a
whole.
Our competences:
• Network & Web Science
• Science 2.0
• Predictive Modeling
• Social Network Analysis
• Information Quality Assessment
• User Modeling
• Machine Learning and Data Mining
• Collaborative Systems
Our Services:
• Social Analytics: Hub, Expert, Community,
Influencer, Information Flow and Trend
(Event) Detection, etc.
• Information Quality Assessment
• Social & Location-based Recommender
Systems
• Customer Segmentation
• Social Systems Design
9. Social Computing @ Know-Center
What type of projects are we running?
[Diagram: Industry Projects and Non-Industrial Projects; labels shown: COMET, NON-K, FWF, EU, FFG, ...]
10. Social Computing @ Know-Center
Some industry partners...
11. Social Computing @ Know-Center
The Projects
Project 1: Mendeley – UK startup (recently acquired by Elsevier):
Interested in the problem of hierarchical concept-based
navigation.
Project 2: Blanc Noir – Austrian startup: Interested in the problem
of recommending items to users through social data.
Project 3: University of Pittsburgh & several Austrian
companies: Interested in the usefulness of Twitter at academic
conferences.
12. Social Computing @ Know-Center
OK, let's start...
13. Social Computing @ Know-Center
Project 1
Mendeley – UK Startup (recently acquired by Elsevier):
Interested in the problem of hierarchical concept-based
navigation.
14. Social Computing @ Know-Center
Research Question 1:
What kind of meta-data is more useful for the task of
navigation in information systems - tags or keywords?
Externals involved:
• Mendeley, London, UK
Helic, D., Körner, C., Granitzer, M., Strohmaier, M. and Trattner, C. 2012. Navigational Efficiency of Broad vs.
Narrow Folksonomies. In Proceedings of the 23rd ACM Conference on Hypertext and Social Media (HT
2012), ACM, New York, NY, USA, pp. 63-72.
15. Social Computing @ Know-Center
Mendeley
16. Social Computing @ Know-Center
[Screenshot: Mendeley Desktop, with tags and keywords highlighted]
17. Social Computing @ Know-Center
Task
What is the best way to extract hierarchies from entities such
as social tags or keywords? What is more useful for
navigation: keyword or tag hierarchies?
18. Social Computing @ Know-Center
Different types of hierarchy induction
algorithms
Helic, D., Strohmaier, M., Trattner, C., Muhr, M. and Lerman, K.: Pragmatic Evaluation of Folksonomies, In
Proceedings of the 20th International Conference on World Wide Web (WWW 2011), ACM, New York, NY, USA,
417-426, 2011.
19. Social Computing @ Know-Center
Issue (!!!)
...no literature on what type of hierarchy is best suited
for the task of navigation...
D. J. Watts, P. S. Dodds, and M. E. J. Newman. Identity and
search in social networks. Science, 296:1302–1305, 2002.
J. M. Kleinberg. Navigation in a small world. Nature,
406(6798):845, August 2000.
20. Social Computing @ Know-Center
Stanley Milgram (1933-1984)
• A social psychologist
• Yale and Harvard University
• Study on the Small World Problem, beyond well-defined communities
and relations (such as actors, scientists, …)
• “An Experimental Study of the Small World Problem”
21. Social Computing @ Know-Center
Set Up
• Target person: a Boston stockbroker
• Three starting populations:
  • 100 “Nebraska stockholders”
  • 96 “Nebraska random”
  • 100 “Boston random”
22. Social Computing @ Know-Center
Results
• How many of the starters would be able to establish
contact with the target?
  • 64 out of 296 reached the target
• How many intermediaries would be required to link
starters with the target?
  • Well, that depends: the overall mean was 5.2 links
  • Through hometown: 6.1 links
  • Through business: 4.6 links
  • The Boston group was faster than the Nebraska groups
  • Nebraska stockholders were not faster than Nebraska random
• What form would the distribution of chain lengths
take?
23. Social Computing @ Know-Center
Hierarchical decentralized searcher
[Diagram labels: Information Network, Hierarchy]
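The idea of a hierarchical decentralized searcher can be sketched roughly as follows. This is only a minimal illustration, assuming the searcher sees just the neighbours of the current node and greedily moves to the unvisited neighbour that is closest to the target in the hierarchy. The toy network, the toy hierarchy and the names hierarchy_distance and greedy_navigate are hypothetical and not taken from the talk.

```python
def hierarchy_distance(parent, a, b):
    """Number of tree edges between a and b; the hierarchy is a child -> parent map."""
    def path_to_root(node):
        path = [node]
        while node in parent:
            node = parent[node]
            path.append(node)
        return path

    depth_from_a = {node: d for d, node in enumerate(path_to_root(a))}
    for d_b, node in enumerate(path_to_root(b)):
        if node in depth_from_a:  # first common ancestor
            return depth_from_a[node] + d_b
    raise ValueError("nodes are not in the same hierarchy")


def greedy_navigate(network, parent, start, target, max_clicks=20):
    """Return the clicked path from start to target, or None if the search fails."""
    path, visited, current = [start], {start}, start
    while current != target and len(path) - 1 < max_clicks:
        candidates = [n for n in network.get(current, ()) if n not in visited]
        if not candidates:
            return None  # dead end
        # Greedy step: neighbour with the smallest hierarchy distance to the target
        # (ties broken alphabetically so the sketch is deterministic).
        current = min(candidates, key=lambda n: (hierarchy_distance(parent, n, target), n))
        visited.add(current)
        path.append(current)
    return path if current == target else None


# Hypothetical toy data: a small topic hierarchy and a link network over the same nodes.
parent = {"science": "root", "arts": "root", "physics": "science",
          "biology": "science", "music": "arts"}
network = {"music": ["arts", "biology"], "arts": ["root", "music"],
           "biology": ["science", "music"], "science": ["physics", "biology", "root"],
           "root": ["science", "arts"], "physics": ["science"]}

path = greedy_navigate(network, parent, "music", "physics")
print(path, "clicks:", len(path) - 1)
```

Counting the clicks of such simulated paths for two different hierarchies over the same network is one way to compare how navigable they are.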
24. Social Computing @ Know-Center
Validation
• We compared the simulations with human click trails from the online game
The Wiki Game (http://thewikigame.com/)
• The dataset contains 1,500,000 click trails of more than 500,000 users with
(start; target) information.
25. Social Computing @ Know-Center
Hierarchy Creation (1)
Two types of hierarchies were evaluated.
1.) The first type is based on our previous work
• Categorial concepts:
  • Tags from Delicious
  • Category labels from Wikipedia
Wikipedia Category Label Dataset:
2,300,000 category labels, 4,500,000 articles, 30,000,000 category label assignments
Delicious Tag Dataset:
440,000 tags, 580,000 articles and 3,400,000 tag assignments
[Diagram labels: Similarity Graph, Latent Hierarchical Taxonomy]
26. Social Computing @ Know-Center
Hierarchy Creation (2)
2.) The second type is based on the work of [Muchnik et al. 2007]
Simple idea: the algorithm iterates through all links in the network and
decides whether each link is of a hierarchical type. If it is, the link
remains in the network; otherwise it is removed.
Dataset: directed link network of the English Wikipedia from February 2012.
All in all, the dataset includes around 10,000,000 articles and
around 250,000,000 links.
Muchnik, L., Itzhack, R., Solomon S. and Louzoun Y.: Self-emergence of knowledge trees: Extraction
of the Wikipedia hierarchies, PHYSICAL REVIEW E 76, 016106 (2007)
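The link-filtering idea described on this slide can be sketched as below. The slide does not say how a link is judged to be hierarchical, so the rule used here is purely an assumption for illustration: a link is kept when it points to a node with a strictly higher in-degree (a rough proxy for a more general article). The criterion in the cited work may well differ, and the function name and toy links are hypothetical.

```python
from collections import Counter


def filter_hierarchical_links(links):
    """Keep only links judged 'hierarchical' under an ASSUMED in-degree rule."""
    in_degree = Counter(target for _, target in links)
    kept = []
    for source, target in links:
        # Assumed rule (not from the slides): keep a link only if its target
        # has a strictly higher in-degree than its source.
        if in_degree[target] > in_degree[source]:
            kept.append((source, target))
    return kept


# Hypothetical toy link network (source article, target article).
links = [("Graz", "Austria"), ("Vienna", "Austria"),
         ("Austria", "Graz"), ("Graz", "Vienna")]
print(filter_hierarchical_links(links))
```

Whatever rule is plugged in, the structure stays the same: one pass over all links, with each link either kept or removed.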
27. Social Computing @ Know-Center
Validation Human Navigators
28. Social Computing @ Know-Center
...OK, let's come back to the Mendeley “problem”...
29. Social Computing @ Know-Center
Are keyword hierarchies more navigable
than social tag hierarchies?
Results: With simulations we find that tag-based hierarchies are more efficient
for navigation than keyword hierarchies.
Results: Our Greedy Navigator (= Simulator) needs on average one click
more to reach the target node with keywords than with tags.
[Figure panels: Tags, Keywords]
30. Social Computing @ Know-Center
...OK, let's move on to some (social) networking stuff :)
31. Social Computing @ Know-Center
Project 2
Blanc Noir – Austrian Startup: Interested in the problem
of recommending items to users through social &
location-based (social) data.
32. Social Computing @ Know-Center
Research Question 2:
To what extent are social network and location-based data
useful for predicting trades or products in online and offline
marketplaces?
Externals involved:
• Blanc Noir
• PUC, Chile
Trattner, C., Parra, D., Eberhard, L. and Wen, X.: Who will Trade with Whom? Predicting Buyer-Seller
Interactions in Online Trading Platforms through Social Networks, In Proceedings of the ACM World Wide
Web Conference (WWW 2014), ACM, New York, NY, 2014.
33. Social Computing @ Know-Center
How did we answer that question?
• Major issue: there are no freely available datasets
• Idea: crawl data from the virtual world of Second Life
• It comprises both:
  • Online social network & location-based (social) data
  • An Amazon/eBay-like marketplace
• https://my.secondlife.com/
• https://marketplace.secondlife.com/
34. Social Computing @ Know-Center
Features
• In our analysis we focused on content (e.g., common
interests) and network features (e.g., common
interaction partners)
Example of network features we used in our analysis
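As a rough illustration of a network feature such as "common interaction partners", the sketch below computes the set of shared partners and a Jaccard overlap for a user pair. The data layout (a dict mapping each user to the set of users they interacted with), the function names and the toy values are assumptions; the exact feature set used in the study is not listed on this slide.

```python
def common_partners(interactions, u, v):
    """Users that both u and v have interacted with."""
    return interactions.get(u, set()) & interactions.get(v, set())


def jaccard_overlap(interactions, u, v):
    """Shared partners divided by all partners of u or v (0.0 if both have none)."""
    a, b = interactions.get(u, set()), interactions.get(v, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy data: who interacted with whom in the social network.
interactions = {
    "buyer": {"ann", "bob", "cat"},
    "seller": {"bob", "cat", "dan"},
}
print(sorted(common_partners(interactions, "buyer", "seller")))
print(jaccard_overlap(interactions, "buyer", "seller"))
```

Features like these can be computed for every candidate buyer-seller pair and then passed to a classifier.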
35. Social Computing @ Know-Center
Evaluation
• We split the dataset into two sets (one
for training and one for testing)
• We trained a binary classifier
• Evaluation metric: Area Under the Curve (AUC)
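To make the evaluation metric concrete, here is a minimal sketch of AUC for a binary classifier: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The labels and scores are made-up test-set values, and the classifier that would produce such scores is not shown.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out test set: 1 = a trade happened, 0 = no trade.
test_labels = [1, 0, 1, 0]
test_scores = [0.9, 0.8, 0.4, 0.1]
print(auc(test_labels, test_scores))
```

A scorer that gives every pair the same score ends up at 0.5, which is why 0.5 serves as the random-guessing baseline.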
36. Social Computing @ Know-Center
Results: seller/buyer prediction
Baseline: 0.5 (random guessing)
Dataset:
• 131,087 seller profiles with 268,852
trading interactions.
• 169,035 social profiles with overall
3,175,304 social interactions.
Results:
Although the combination of features from
both social and trading networks did not
show a significant improvement over trading
network data alone, our experiments
indicate that the online social network data
improve the predictive accuracy of trading
interactions over random guessing by 28%
in a cold-start setting.
37. Social Computing @ Know-Center
Follow-up (1)
Experiment with location-based
social network data
Task: Predict items for users
User-based collaborative filtering
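User-based collaborative filtering can be sketched as follows. This is a generic textbook-style illustration, not the implementation used in the project: it assumes each user is represented by a dict of item weights (for example visited locations), uses cosine similarity between users, and scores unseen items by the similarity-weighted votes of the k most similar users. All names and data are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse item -> weight dicts."""
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm = sqrt(sum(x * x for x in a.values())) * sqrt(sum(x * x for x in b.values()))
    return dot / norm if norm else 0.0


def recommend(profiles, user, k=2, top_n=3):
    """Recommend items the user has not seen, based on the k most similar users."""
    neighbours = sorted(
        ((cosine(profiles[user], profiles[other]), other)
         for other in profiles if other != user),
        reverse=True,
    )[:k]
    scores = {}
    for similarity, other in neighbours:
        if similarity <= 0:
            continue  # ignore users with nothing in common
        for item, weight in profiles[other].items():
            if item not in profiles[user]:
                scores[item] = scores.get(item, 0.0) + similarity * weight
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked][:top_n]


# Hypothetical toy profiles: 1 means the user visited that place.
profiles = {
    "u1": {"cafe": 1, "mall": 1},
    "u2": {"cafe": 1, "mall": 1, "gym": 1},
    "u3": {"park": 1},
}
print(recommend(profiles, "u1"))
```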
38. Social Computing @ Know-Center
Follow-up (2)
39. Social Computing @ Know-Center
Recsium Framework
• Near Real-Time Updates
• Real Time Recommendations
• Deals with various sources of data
• RESTful API
40. Social Computing @ Know-Center
Demo - Recsium
http://recsium.know-center.tugraz.at/recsium/
41. Social Computing @ Know-Center
...currently working on
Location-based services for shopping malls and train stations
Technology: iBeacons
Tasks: indoor navigation, indoor marketing, etc.
42. Social Computing @ Know-Center
Project 3
University of Pittsburgh: Interested in the usefulness
of Twitter at academic conferences.
43. Social Computing @ Know-Center
Research Question 3:
To what extent is Twitter useful for engaging newcomers
(junior researchers) at academic conferences?
Externals involved:
• University of Pittsburgh, Pittsburgh, USA
• PUC, Chile
Wen, X., Parra, D. and Trattner, C.: How groups of people interact with each other on Twitter during academic
conferences, In Proceedings of the 2014 ACM Conference on Computer Supported Cooperative Work
(CSCW 2014), ACM, Baltimore, Maryland, USA.
44. Social Computing @ Know-Center
Dataset
• Data: We collected tweet data by searching for the hashtags of four
conferences: Hypertext 2012 (#ht2012), UMAP 2012 (#umap2012),
RecSys 2012 (#recsys2012), and ECTEL 2012 (#ectel2012).
• Tweet types: a) mentions, b) replies, c) retweets, and d) isolated
tweets (none of a), b), c))
• Twitter user groups: a) Junior researcher (JR), b) Senior researcher (SR), c)
Faculty (F), d) Industry (I), and e) Organizations (O).
Conference  Dates captured  # Users  # Total tweets  a) Mentions  b) Replies  c) RT  Not a), b), c)  % Users re-tweeted, mentioned, replied-to  # F  # I  # JR  # O  # SR
HT 12       June 24-28      61       254             24           19          105    106             34.40%                                    19   16   6     4    15
UMAP 12     July 16-20      51       234             32           16          104    82              37.30%                                    23   7    3     8    18
RECSYS 12   Sept. 10-13     266      2022            265          60          1087   610             34.60%                                    61   120  6     19   53
ECTEL 12    Sept. 18-21     91       434             17           138         38     241             46.20%                                    51   17   3     11   15
45. Social Computing @ Know-Center
Who is receiving the attention?
Results:
Junior researchers show the lowest
group attention and conversion
ratio among all groups.
[Figures: three bar charts by user group (Faculty, Senior Researcher, Junior Researcher, Organization, Industry) for HT 12, UMAP 12, RECSYS 12 and ECTEL 12: Average Group Attention Per User, Average Group Contribution Per User, Conversion Ratio]
Conversion Ratio (CR) = Attention / Contribution = (|mentioned| + |replied| + |RT|) / |tweets|
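The conversion ratio defined on this slide is a simple quotient of attention received over tweets contributed, and can be computed as in the sketch below. The counts in the example are made up for illustration, and returning 0.0 for a user without tweets is an assumption, since the slide does not cover that case.

```python
def conversion_ratio(mentioned, replied, retweeted, tweets):
    """CR = attention / contribution = (|mentioned| + |replied| + |RT|) / |tweets|."""
    if tweets == 0:
        return 0.0  # assumed convention: no contribution, no conversion
    return (mentioned + replied + retweeted) / tweets


# Hypothetical user: 10 tweets posted, mentioned 2 times,
# replied to once, retweeted 3 times.
print(conversion_ratio(mentioned=2, replied=1, retweeted=3, tweets=10))
```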
46. Social Computing @ Know-Center
Who interacts with whom?
Results:
Junior researchers are less involved
in the conversation on Twitter than
any other group of users.
From/To interaction matrix per conference (rows: from, columns: to)
HT12
From \ To               F     SR    JR    O     I
Faculty (F)             0.43  0.16  0.20  0.16  0.05
Senior Researcher (SR)  0.46  0.19  0.15  0.12  0.08
Junior Researcher (JR)  0.52  0.00  0.12  0.20  0.16
Organization (O)        0.26  0.30  0.15  0.26  0.04
Industry (I)            0.27  0.31  0.19  0.19  0.04
UMAP12
From \ To               F     SR    JR    O     I
Faculty (F)             0.53  0.42  0.00  0.02  0.04
Senior Researcher (SR)  0.32  0.60  0.00  0.01  0.06
Junior Researcher (JR)  0.40  0.60  0.00  0.00  0.00
Organization (O)        0.50  0.40  0.00  0.10  0.00
Industry (I)            0.42  0.50  0.00  0.08  0.00
RECSYS12
From \ To               F     SR    JR    O     I
Faculty (F)             0.36  0.30  0.01  0.00  0.34
Senior Researcher (SR)  0.22  0.33  0.01  0.02  0.42
Junior Researcher (JR)  0.21  0.38  0.08  0.00  0.33
Organization (O)        0.15  0.26  0.02  0.08  0.49
Industry (I)            0.26  0.25  0.00  0.02  0.47
ECTEL12
From \ To               F     SR    JR    O     I
Faculty (F)             0.73  0.14  0.00  0.02  0.11
Senior Researcher (SR)  0.42  0.13  0.00  0.16  0.29
Junior Researcher (JR)  1.00  0.00  0.00  0.00  0.00
Organization (O)        0.20  0.20  0.00  0.27  0.33
Industry (I)            0.58  0.20  0.00  0.13  0.10
47. Social Computing @ Know-Center
Has usage changed over time?
Results:
Retweets and Mentions increase
over time. Replies and Mentions stay
steady over time.
48. Social Computing @ Know-Center
Has interaction changed over time?
Results:
Our analysis reveals a steady growth
in the communication over Twitter
over time. Interestingly, these
conversations get less connected
over time.
49. Social Computing @ Know-Center
What keeps users returning over time?
Results:
Eigenvector centrality is the most
important feature for predicting future
conference participation, followed by
degree centrality.
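For readers unfamiliar with the two centrality features named here, the sketch below computes degree centrality and eigenvector centrality on a small undirected graph. It is a generic power-iteration illustration, not the feature extraction used in the study: the graph is a made-up star network, the function names are hypothetical, and each iteration adds the node's own score (iterating on A + I) as a common trick to avoid oscillation on bipartite graphs.

```python
def degree_centrality(graph):
    """Number of neighbours per node (graph: node -> list of neighbours)."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def eigenvector_centrality(graph, iterations=100, tol=1e-9):
    """Power iteration on (A + I); scores are scaled so the maximum is 1."""
    score = {node: 1.0 for node in graph}
    for _ in range(iterations):
        # A node is important if its neighbours are important.
        new = {node: score[node] + sum(score[m] for m in graph[node]) for node in graph}
        largest = max(new.values()) or 1.0
        new = {node: value / largest for node, value in new.items()}
        if max(abs(new[node] - score[node]) for node in graph) < tol:
            return new
        score = new
    return score


# Hypothetical toy interaction network: one hub talking to three other users.
graph = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
centrality = eigenvector_centrality(graph)
print(degree_centrality(graph))
print({node: round(value, 3) for node, value in centrality.items()})
```

In a prediction setting, values like these would be computed per user on the conference interaction network and used as input features.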
50. Social Computing @ Know-Center
...OK, that's basically it :)
51. Social Computing @ Know-Center
...of course there are other projects
52. Social Computing @ Know-Center
Thank you!
Christoph Trattner
Email: ctrattner@know-center.at
Web: christophtrattner.info
Twitter: @ctrattner
Sponsors:
53. Social Computing @ Know-Center
Any questions?