The Web Science MacroScope: Mixed-methods Approach for
Understanding Web Activity
Markus Luczak-Roesch (some slides based on work of Ramine Tinati)
University of Southampton (UK), Web and Internet Science Group
@mluczak | http://markus-luczak.de
 Image source: https://en.wikipedia.org/wiki/File:Compound_Microscope_(cropped).JPG, CC BY-SA 4.0
The World Wide Web
Image source: screenshot taken from https://www.w3.org/History/1989/proposal.html
Essential part of the data science story of the
WWW: Web Observatories
Data Sources
Challenges:
-  Who are the providers?
-  Is the service reliable/stable?

Data Collection
Challenges:
-  API limitations/restrictions
-  Data schemas/consistency
-  Does it change over time?

Data Storage
Challenges:
-  Storage approaches (relational, flat, linked?)

Data Analysis and Modelling
Challenges:
-  What methods/models?
-  How is the data sampled?

Data Visualisation
Challenges:
-  Misrepresentation of data? e.g. visualising “filtered” data

Data Querying and Transformation
Statistical and computational analysis
Methods

Data Interpretation
Challenges:
-  Are the questions being asked relevant to the data?
-  Are insights being fed back into the analysis?

Add or update initial stored data
Update current harvesting strategy (req. for real-time analysis)
Image source: https://en.wikipedia.org/wiki/File:Sphinx_Observatory.jpg, CC BY-SA 2.0
What to observe? Social Machines!
“Real life is and must be full of all kinds of social constraint
– the very processes from which society arises. Computers
can help if we use them to create abstract social
machines on the Web: processes in which
the people do the creative work and the
machine does the administration.”
Berners-Lee, Tim; Mark Fischetti (1999). Weaving the Web: The Original Design and
Ultimate Destiny of the World Wide Web by its inventor. Britain: Orion Business. ISBN
0-7528-2090-7.
Topic outbreaks across systems
Peak in tweets containing topic ‘x’
Peak in Wikipedia views of articles ‘x’
‘Lag diffusion’ time
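The lag between the two peaks can be estimated in a few lines. The following is a minimal sketch, assuming both activity series are already binned into equal-width time intervals that start at the same moment; the function names and the hourly bin width are illustrative and not taken from the slides.

```python
def peak_index(series):
    """Return the index of the first maximum value in a non-empty series."""
    return max(range(len(series)), key=series.__getitem__)


def lag_between_peaks(leading, trailing, bin_seconds):
    """Seconds between the peak of `leading` and the peak of `trailing`.

    Both series must use the same bin width and the same start time.
    A positive result means `trailing` peaks later than `leading`.
    """
    return (peak_index(trailing) - peak_index(leading)) * bin_seconds


# Hypothetical hourly counts for a topic 'x' (illustrative numbers only).
tweets_per_hour = [1, 3, 9, 4, 2, 1]
wiki_views_per_hour = [0, 1, 2, 5, 8, 3]
lag = lag_between_peaks(tweets_per_hour, wiki_views_per_hour, bin_seconds=3600)
```

Comparing single peaks is the simplest possible reading of the figure; for noisy series a cross-correlation over several candidate lags would probably be more robust.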
Topic outbreaks across cultures
Participation in Citizen Science projects by
communication patterns
[Excerpt from Tinati et al. (2015), cropped at the left margin in the slide. Recoverable details: the dataset includes 835,732 chat messages; each chat record holds the player’s ID, timestamp, and message text; 10,714 ‘active’ players appear in both games and chat; a further subset of these players is described as sustaining activity over 30 days.]
Figure 4: Distribution of games, chat messages, and account
durations (games and chat) for all EyeWire players.
Figure 5: Timeline of chat and gaming activity for the EyeWire
platform.
5.1.1 Player Cohorts
As shown in Figure 4, the analysis of chat and gaming account
duration reveals that for gaming activity, there are many players
Stage                 Criteria
Before Game (Q0)      30s < Game Start
Start of Game (Q1)    Game Start < x < 1st Quartile Game Duration
During Game (Q2-3)    1st Quartile Game Duration < x < 3rd Quartile Game Duration
End of Game (Q4)      3rd Quartile Game Duration < x < Game End
After Game (Q5)       30s < Game End
Table 1: Chat Message Stages: Boundary Conditions
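The boundary conditions in Table 1 can be sketched as a small classifier. This is an illustration under stated assumptions rather than the authors' implementation: timestamps are plain seconds, the "30s" rows are read as a 30-second window before the start and after the end of a game, the quartile boundaries are taken as 25% and 75% of the game duration, and the half-open interval choices are mine.

```python
WINDOW_SECONDS = 30.0  # assumed reading of the "30s" rows in Table 1


def chat_stage(message_time, game_start, game_end):
    """Return the stage label for a chat message, or None if it falls
    outside every window around the game."""
    if game_end <= game_start:
        raise ValueError("game_end must be later than game_start")
    duration = game_end - game_start
    first_quartile = game_start + 0.25 * duration
    third_quartile = game_start + 0.75 * duration

    if game_start - WINDOW_SECONDS <= message_time < game_start:
        return "Q0"    # before game
    if game_start <= message_time < first_quartile:
        return "Q1"    # start of game
    if first_quartile <= message_time < third_quartile:
        return "Q2-3"  # during game
    if third_quartile <= message_time <= game_end:
        return "Q4"    # end of game
    if game_end < message_time <= game_end + WINDOW_SECONDS:
        return "Q5"    # after game
    return None
```

Because stages Q1 to Q4 scale with the game duration, the same message offset can land in different stages for a short and a long game.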
Tinati, R., Luczak-Rösch, M., Simperl, E., Hall, W., & Shadbolt, N. (2015, May). ‘/Command’ and conquer: analysing discussion in a citizen science game. In ACM Web Science Conference 2015.
Main Interface in EyeWire
[Excerpt from Tinati et al. (2015), cropped at the left margin in the slide. Recoverable details: EyeWire provides a real-time chat alongside points and achievements; game commands are issued in the chat with a leading slash (‘/’), for example ‘/silence’ and the group message command ‘/gm’; beyond the internal chat, the main interface links to a project blog and a forum for asynchronous discussion, including error reports.]
METHODS
Figure 2: Embedded Chat Interface in EyeWire
given time frame. The cohort analysis examines monthly cohorts of players based on their first chat and game entry, and provides a measure of sustained activity. Based on the monthly player retention values, we are able to differentiate between different sets of users, as described in the following section.
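A monthly cohort retention table of the kind described here can be sketched as follows. This is a minimal illustration, assuming each activity record is a (player_id, date) pair and that a player's cohort is the calendar month of their first recorded activity; the function names and the sample events are hypothetical.

```python
from collections import defaultdict
from datetime import date


def month_index(day):
    """Map a date to a running month number so months can be subtracted."""
    return day.year * 12 + day.month - 1


def month_label(index):
    """Inverse of month_index, formatted as 'YYYY-MM'."""
    return f"{index // 12:04d}-{index % 12 + 1:02d}"


def monthly_retention(events):
    """Return {cohort: {months_since_first: fraction of cohort active}}."""
    active_months = defaultdict(set)
    for player_id, day in events:
        active_months[player_id].add(month_index(day))

    # Group players by the month of their first activity.
    cohorts = defaultdict(list)
    for months in active_months.values():
        cohorts[min(months)].append(months)

    retention = {}
    for first, members in cohorts.items():
        last_offset = max(max(months) for months in members) - first
        retention[month_label(first)] = {
            offset: sum(first + offset in months for months in members) / len(members)
            for offset in range(last_offset + 1)
        }
    return retention


# Hypothetical activity log (illustrative only).
events = [
    ("a", date(2014, 1, 5)), ("a", date(2014, 2, 1)),
    ("b", date(2014, 1, 20)),
    ("c", date(2014, 2, 3)), ("c", date(2014, 4, 1)),
]
table = monthly_retention(events)
```

Running the function separately on chat records and on game records would give one retention table per activity type, which is one way to compare sustained chat use against sustained gaming.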
To examine the context and discourse within the chat messages, we perform text analysis to extract the use of EyeWire game commands, and also perform topic modelling on the content of the chat messages. To achieve this we use LDA [5] to derive topic models which contain common vocabulary used by players. We combine this with the different categories of chat messages in order to determine the context of chat during different stages of completing a game.
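Extracting game commands from chat text can be sketched with a regular expression. The example assumes that a command is a word following a slash at the very start of a message, as with the ‘/silence’ and ‘/gm’ commands mentioned in the paper excerpt; the sample messages are invented, and the topic modelling step with LDA is not shown.

```python
import re
from collections import Counter

# A slash at the start of the message followed by a command word.
COMMAND_PATTERN = re.compile(r"^/(\w+)")


def count_commands(messages):
    """Count how often each game command is issued in a list of chat messages."""
    counts = Counter()
    for text in messages:
        match = COMMAND_PATTERN.match(text.strip())
        if match:
            counts[match.group(1).lower()] += 1
    return counts


# Invented chat messages (illustrative only).
sample = ["/silence bob", "hello all", "/gm team go", " /GM again", "not /a command"]
command_counts = count_commands(sample)
```

Messages without a leading command can then be passed on to the topic model, so that command usage and free-text discussion are analysed separately.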
As we are interested in the relationship between a player’s gaming session and use of chat, we construct a model of player chat messages which classifies chat activity at different stages of when a game is being performed. As described in Table 1 and illustrated in Figure 3, we categorise the chat messages into 5 stages around the process of gaming. Stages Q1 to Q4 are relative to the time it took for the game to be completed. For example, if a game was completed in 10 seconds, then Q1 would represent 0-2 seconds,
Participation in Citizen Science projects by
communication patterns
Luczak-Roesch, M., Tinati, R., Simperl, E., Van Kleek, M., Shadbolt, N., & Simpson, R. (2014). Why won't aliens talk to us? Content and community dynamics in online citizen science. Proceedings of the Eighth AAAI Conference on Weblogs and Social Media, ICWSM 2014, Ann Arbor, Michigan, USA, June 1-4, 2014.
Image source: David Miller, https://daily.zooniverse.org/2013/11/21/an-ever-expanding-zooniverse/
Participation in Citizen Science projects by
location
Temporal networks of information co-occurrence
for system-agnostic exploratory data analysis
Markus Luczak-Roesch, Ramine Tinati, Max van Kleek, and Nigel Shadbolt. 2015. From coincidence to
purposeful flow? Properties of transcendental information cascades. In IEEE/ACM International
Conference on Advances in Social Networks Analysis and Mining (ASONAM), Paris, FR.
Where is the MacroScope?
[Web Observatory workflow diagram, shown again on this slide: Data Sources, Data Collection, Data Storage, Data Analysis and Modelling, Data Visualisation, Data Interpretation.]
Image sources: https://en.wikipedia.org/wiki/File:Compound_Microscope_(cropped).JPG, CC
BY-SA 4.0 & https://en.wikipedia.org/wiki/File:Sphinx_Observatory.jpg, CC BY-SA 2.0
What is the MacroScope?
Other data visualization capacities
Image source: screenshot from https://www.imperial.ac.uk/data-science/kpmg-data-observatory-/technical-specifications/
Other data visualization capacities
Image source: screenshot from http://approach.rpi.edu/2015/11/18/immersive-experience-the-campfire/
What is the MacroScope?
“Wow, they don’t even know
that this is happening!”
Do we really think this is an event to be
addressed in a purely quantitative fashion?
Source: United Nations Development Programme, https://goo.gl/Z1uXdV, CC BY-NC-ND 2.0
A qualitative investigation of crowdsourced
disaster response
•  Haiti (Ushahidi, N=298)
   – requests for help from identified local source
•  Congo (Ushahidi, N=102)
   – information about the situation but not who is responsible for this information
   – more non-local sources
•  Ebola (Twitter, N=298)
   – comments
      •  tasteless jokes
      •  racist comments
      •  concern that the crisis could spread and call to governments to close the borders
Joint project with Silke Roth
Boundaries of crowdsourced disaster response
•  Wrong things go viral
•  Crowdsourcing informativeness of social media information not synchronized with crises
[Chart legend: negative, neutral, positive]
“When you tell a […] kid that is has got Ebola”
Serendipitous discoveries in Citizen Science
Hanny’s Voorwerp: Galaxy Zoo [2007]
Green Pea Galaxies: Galaxy Zoo [2007]
Yellow Balls: Milky Way Project [2009]
Circumbinary Planet PH1b: Planet Hunters [2012]
Convict Worm: Seafloor Explorer [2012]
Spanish Flu: Operation War Diary [2014]
From information co-occurrence to the discovery
of hidden structure in Wikipedia
Figure 1: Wikipedia edits in a three dimensional space. The dimensions are (1) time; (2) information diversity as the chronologi-
Tinati, R., Luczak-Rösch, M., & Hall, W. (to appear). Finding Structure in Wikipedia Edit Activity:
An Information Cascade Approach. In WikiWorkshop 2016, co-located with WWW 2016.
Events detected:
•  Edward Snowden speech at SXSW conference
•  US Supreme Court case on same-sex marriage
(a) Cascade Article Network (CAN): Nodes represent unique Wikipedia articles, edges are shared edits based on a shared identifier matched. A force directed layout has been applied, with edge path lengths determined by edge weight. The strongly connected component (A) contains articles associated with South Korean media, (B) and (C) contain articles related to the USA.
(b) Cascade-to-Cascade path network graph: Nodes are cascades, Edges are the shared articles between cascades. The central strongly connected component is established by the Identifiers shown in Table 3. A force directed layout has been applied, with edge path lengths determined by edge weight.
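The article network in caption (a) can be approximated as a weighted co-occurrence graph. The sketch below is a simplification under stated assumptions: each edit is reduced to an (article, identifier) pair, two articles are linked once for every identifier they share, and the temporal ordering that the cascade approach relies on is ignored. The function name and sample data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def article_network(edits):
    """Build weighted edges between articles whose edits share an identifier.

    `edits` is an iterable of (article, identifier) pairs. The result maps
    each sorted article pair to the number of identifiers the two share.
    """
    articles_by_identifier = defaultdict(set)
    for article, identifier in edits:
        articles_by_identifier[identifier].add(article)

    edges = Counter()
    for articles in articles_by_identifier.values():
        for first, second in combinations(sorted(articles), 2):
            edges[(first, second)] += 1
    return edges


# Invented edits (illustrative only).
sample_edits = [
    ("A", "x"), ("B", "x"), ("C", "x"),
    ("A", "y"), ("B", "y"),
    ("D", "z"),
]
network = article_network(sample_edits)
```

The resulting edge weights could feed a force directed layout like the one in the figure, with heavier edges drawn shorter.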
The MacroScope is technology
External APIs
•  Twitter
•  Wikipedia
•  Instagram
•  Google Trends
•  Yahoo Trends
Pre-processing Stage:
1. Enrich streams
2. Unify feeds into WO JSON format
Streaming Stage:
1. Post incoming stream to RabbitMQ exchange (each source has its own exchange)
Hadoop Storage Stage:
1. Apache Flume for each stream
HDFS
HTTP Streaming Stage:
1. Send stream to Web Observatory server
Daily Storage Stage:
1. MapReduce daily results
MongoDB
Data flow labels: unstructured Web streams or Web-scraped pages; Web Observatory JSON data schema; RabbitMQ JSON stream; Socket.IO
MacroScope
Socket.IO
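The "unify feeds" step of the pre-processing stage can be illustrated with a small normaliser. The slides do not show the Web Observatory JSON schema, so the output fields below (source, timestamp, text, raw) and the per-source input fields are assumptions made purely for illustration.

```python
import json
from datetime import datetime, timezone

# Hypothetical per-source extractors: each returns (text, unix_timestamp).
EXTRACTORS = {
    "twitter": lambda raw: (raw["text"], raw["timestamp"]),
    "wikipedia": lambda raw: (raw["title"], raw["timestamp"]),
}


def to_unified_record(source, raw):
    """Map a source-specific record onto one common, JSON-serialisable shape."""
    if source not in EXTRACTORS:
        raise ValueError(f"no extractor registered for source {source!r}")
    text, unix_time = EXTRACTORS[source](raw)
    return {
        "source": source,
        "timestamp": datetime.fromtimestamp(unix_time, tz=timezone.utc).isoformat(),
        "text": text,
        "raw": raw,  # keep the original payload for later enrichment
    }


record = to_unified_record("twitter", {"text": "hi", "timestamp": 0})
payload = json.dumps(record)  # the string that would be posted downstream
```

Keeping one extractor per source mirrors the diagram's note that each source has its own exchange: a new source only needs a new entry in the mapping.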
•  six screens in WAIS labs
•  as part of presentations
•  as a mobile exhibit
•  as a Web application
There is more than one MacroScope
Cross-disciplinary research
Scholars from discipline A	 Scholars from discipline B	
Adaptive
epistemological
framework
Engagement with the general public
Scholars	 People from the general public	
demonstrating the
power and the danger
of individuals sharing
information online 

developing a new
“situational ethics of
data”
The MacroScope
Scholars	 The public
The MacroScope
Surveys, interviews, focus groups, observations	
Scholars	 The public
A mantra for the MacroScope:
“Overview first, zoom and filter, then details-on-demand”* and capture engagement.
* Shneiderman, B. (1996, September). The eyes have it: A task by
data type taxonomy for information visualizations. In Visual
Languages, 1996. Proceedings., IEEE Symposium on (pp. 336-343).
IEEE.
 Image source: screenshot taken from http://data.shopsavvy.mobi/globe
The Web Science MacroScope:
Mixed-methods Approach for
Understanding Web Activity










Markus Luczak-Roesch
@mluczak | http://markus-luczak.de
Image source: https://en.wikipedia.org/wiki/File:Compound_Microscope_(cropped).JPG, CC BY-SA 4.0
Discover

Describe

 
Directly engage

Weitere ähnliche Inhalte

Was ist angesagt?

IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
 IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
IRJET - Implementation of Twitter Sentimental Analysis According to Hash TagIRJET Journal
 
Finding important nodes in social networks based on modified pagerank
Finding important nodes in social networks based on modified pagerankFinding important nodes in social networks based on modified pagerank
Finding important nodes in social networks based on modified pagerankcsandit
 
Scalable recommendation with social contextual information
Scalable recommendation with social contextual informationScalable recommendation with social contextual information
Scalable recommendation with social contextual informationeSAT Journals
 
Who to follow and why: link prediction with explanations
Who to follow and why: link prediction with explanationsWho to follow and why: link prediction with explanations
Who to follow and why: link prediction with explanationsNicola Barbieri
 
Link Prediction in (Partially) Aligned Heterogeneous Social Networks
Link Prediction in (Partially) Aligned Heterogeneous Social NetworksLink Prediction in (Partially) Aligned Heterogeneous Social Networks
Link Prediction in (Partially) Aligned Heterogeneous Social NetworksSina Sajadmanesh
 
Techniques that Facebook use to Analyze and QuerySocial Graphs
Techniques that Facebook use to Analyze and QuerySocial GraphsTechniques that Facebook use to Analyze and QuerySocial Graphs
Techniques that Facebook use to Analyze and QuerySocial GraphsHaneen Droubi
 
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...
Project prSentiment Analysis  of Twitter Data Using Machine Learning Approach...Project prSentiment Analysis  of Twitter Data Using Machine Learning Approach...
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...Geetika Gautam
 
Tweet sentiment analysis
Tweet sentiment analysisTweet sentiment analysis
Tweet sentiment analysisAnil Shrestha
 
Sentiment analysis of twitter data
Sentiment analysis of twitter dataSentiment analysis of twitter data
Sentiment analysis of twitter dataBhagyashree Deokar
 
Dissemination of Awareness Evolution “What is really going on?” Pilkada 2015 ...
Dissemination of Awareness Evolution “What is really going on?” Pilkada 2015 ...Dissemination of Awareness Evolution “What is really going on?” Pilkada 2015 ...
Dissemination of Awareness Evolution “What is really going on?” Pilkada 2015 ...Andry Alamsyah
 
Social Network Analysis
Social Network AnalysisSocial Network Analysis
Social Network AnalysisFred Stutzman
 
Ego web qqml presentation 2016 pdf export
Ego web qqml presentation 2016 pdf exportEgo web qqml presentation 2016 pdf export
Ego web qqml presentation 2016 pdf exportDavid Kennedy
 
Sentiment analysis using ml
Sentiment analysis using mlSentiment analysis using ml
Sentiment analysis using mlPravin Katiyar
 
Selecting User Influence on Twitter Data Using Skyline Query under MapReduce ...
Selecting User Influence on Twitter Data Using Skyline Query under MapReduce ...Selecting User Influence on Twitter Data Using Skyline Query under MapReduce ...
Selecting User Influence on Twitter Data Using Skyline Query under MapReduce ...TELKOMNIKA JOURNAL
 
Poster presentation 5th BENet (Belgium Network Research Meting), Namur
Poster presentation 5th BENet (Belgium Network Research Meting), Namur Poster presentation 5th BENet (Belgium Network Research Meting), Namur
Poster presentation 5th BENet (Belgium Network Research Meting), Namur Nanyang Technological University
 

Was ist angesagt? (18)

IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
 IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
IRJET - Implementation of Twitter Sentimental Analysis According to Hash Tag
 
Finding important nodes in social networks based on modified pagerank
Finding important nodes in social networks based on modified pagerankFinding important nodes in social networks based on modified pagerank
Finding important nodes in social networks based on modified pagerank
 
Scalable recommendation with social contextual information
Scalable recommendation with social contextual informationScalable recommendation with social contextual information
Scalable recommendation with social contextual information
 
Abstract
AbstractAbstract
Abstract
 
Who to follow and why: link prediction with explanations
Who to follow and why: link prediction with explanationsWho to follow and why: link prediction with explanations
Who to follow and why: link prediction with explanations
 
Ijsea04031005
Ijsea04031005Ijsea04031005
Ijsea04031005
 
Link Prediction in (Partially) Aligned Heterogeneous Social Networks
Link Prediction in (Partially) Aligned Heterogeneous Social NetworksLink Prediction in (Partially) Aligned Heterogeneous Social Networks
Link Prediction in (Partially) Aligned Heterogeneous Social Networks
 
Techniques that Facebook use to Analyze and QuerySocial Graphs
Techniques that Facebook use to Analyze and QuerySocial GraphsTechniques that Facebook use to Analyze and QuerySocial Graphs
Techniques that Facebook use to Analyze and QuerySocial Graphs
 
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...
Project prSentiment Analysis  of Twitter Data Using Machine Learning Approach...Project prSentiment Analysis  of Twitter Data Using Machine Learning Approach...
Project prSentiment Analysis of Twitter Data Using Machine Learning Approach...
 
Content-based link prediction
Content-based link predictionContent-based link prediction
Content-based link prediction
 
Tweet sentiment analysis
Tweet sentiment analysisTweet sentiment analysis
Tweet sentiment analysis
 
Sentiment analysis of twitter data
Sentiment analysis of twitter dataSentiment analysis of twitter data
Sentiment analysis of twitter data
 
Dissemination of Awareness Evolution “What is really going on?” Pilkada 2015 ...
Dissemination of Awareness Evolution “What is really going on?” Pilkada 2015 ...Dissemination of Awareness Evolution “What is really going on?” Pilkada 2015 ...
Dissemination of Awareness Evolution “What is really going on?” Pilkada 2015 ...
 
Social Network Analysis
Social Network AnalysisSocial Network Analysis
Social Network Analysis
 
Ego web qqml presentation 2016 pdf export
Ego web qqml presentation 2016 pdf exportEgo web qqml presentation 2016 pdf export
Ego web qqml presentation 2016 pdf export
 
Sentiment analysis using ml
Sentiment analysis using mlSentiment analysis using ml
Sentiment analysis using ml
 
Selecting User Influence on Twitter Data Using Skyline Query under MapReduce ...
Selecting User Influence on Twitter Data Using Skyline Query under MapReduce ...Selecting User Influence on Twitter Data Using Skyline Query under MapReduce ...
Selecting User Influence on Twitter Data Using Skyline Query under MapReduce ...
 
Poster presentation 5th BENet (Belgium Network Research Meting), Namur
Poster presentation 5th BENet (Belgium Network Research Meting), Namur Poster presentation 5th BENet (Belgium Network Research Meting), Namur
Poster presentation 5th BENet (Belgium Network Research Meting), Namur
 

Ähnlich wie The Web Science MacroScope: Mixed-methods Approach for Understanding Web Activity

What network simulator questions do users ask? a large-scale study of stack o...
What network simulator questions do users ask? a large-scale study of stack o...What network simulator questions do users ask? a large-scale study of stack o...
What network simulator questions do users ask? a large-scale study of stack o...nooriasukmaningtyas
 
Profile Analysis of Users in Data Analytics Domain
Profile Analysis of   Users in Data Analytics DomainProfile Analysis of   Users in Data Analytics Domain
Profile Analysis of Users in Data Analytics DomainDrjabez
 
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...reshma reshu
 
Semantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningSemantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningEditor IJCATR
 
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...cscpconf
 
STOCKSENTIX: A MACHINE LEARNING APPROACH TO STOCKMARKET
STOCKSENTIX: A MACHINE LEARNING APPROACH TO STOCKMARKETSTOCKSENTIX: A MACHINE LEARNING APPROACH TO STOCKMARKET
STOCKSENTIX: A MACHINE LEARNING APPROACH TO STOCKMARKETIRJET Journal
 
Derogatory Comment Classification
Derogatory Comment ClassificationDerogatory Comment Classification
Derogatory Comment ClassificationIRJET Journal
 
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...csandit
 
MODELING THE ADAPTION RULE IN CONTEXTAWARE SYSTEMS
MODELING THE ADAPTION RULE IN CONTEXTAWARE SYSTEMSMODELING THE ADAPTION RULE IN CONTEXTAWARE SYSTEMS
MODELING THE ADAPTION RULE IN CONTEXTAWARE SYSTEMSijasuc
 
Modeling the Adaption Rule in Contextaware Systems
Modeling the Adaption Rule in Contextaware SystemsModeling the Adaption Rule in Contextaware Systems
Modeling the Adaption Rule in Contextaware Systemsijasuc
 
Temporal Exploration in 2D Visualization of Emotions on Twitter Stream
Temporal Exploration in 2D Visualization of Emotions on Twitter StreamTemporal Exploration in 2D Visualization of Emotions on Twitter Stream
Temporal Exploration in 2D Visualization of Emotions on Twitter StreamTELKOMNIKA JOURNAL
 
IRJET- Sentimental Prediction of Users Perspective through Live Streaming : T...
IRJET- Sentimental Prediction of Users Perspective through Live Streaming : T...IRJET- Sentimental Prediction of Users Perspective through Live Streaming : T...
IRJET- Sentimental Prediction of Users Perspective through Live Streaming : T...IRJET Journal
 
IEEE 2014 C# Projects
IEEE 2014 C# ProjectsIEEE 2014 C# Projects
IEEE 2014 C# ProjectsVijay Karan
 
IEEE 2014 C# Projects
IEEE 2014 C# ProjectsIEEE 2014 C# Projects
IEEE 2014 C# ProjectsVijay Karan
 
IRJET- Monitoring Suspicious Discussions on Online Forums using Data Mining
IRJET- Monitoring Suspicious Discussions on Online Forums using Data MiningIRJET- Monitoring Suspicious Discussions on Online Forums using Data Mining
IRJET- Monitoring Suspicious Discussions on Online Forums using Data MiningIRJET Journal
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGdannyijwest
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED  ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED  ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGdannyijwest
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGIJwest
 
Ui Design And Usability For Everybody
Ui Design And Usability For EverybodyUi Design And Usability For Everybody
Ui Design And Usability For EverybodyEmpatika
 
WMJ&GMBwosc08-Effective Learning & Production Via Modelling
WMJ&GMBwosc08-Effective Learning & Production Via ModellingWMJ&GMBwosc08-Effective Learning & Production Via Modelling
WMJ&GMBwosc08-Effective Learning & Production Via ModellingGary Boyd
 

Ähnlich wie The Web Science MacroScope: Mixed-methods Approach for Understanding Web Activity (20)

What network simulator questions do users ask? a large-scale study of stack o...
What network simulator questions do users ask? a large-scale study of stack o...What network simulator questions do users ask? a large-scale study of stack o...
What network simulator questions do users ask? a large-scale study of stack o...
 
Profile Analysis of Users in Data Analytics Domain
Profile Analysis of   Users in Data Analytics DomainProfile Analysis of   Users in Data Analytics Domain
Profile Analysis of Users in Data Analytics Domain
 
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
Testing Vitality Ranking and Prediction in Social Networking Services With Dy...
 
Semantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data MiningSemantically Enriched Knowledge Extraction With Data Mining
Semantically Enriched Knowledge Extraction With Data Mining
 
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
 
STOCKSENTIX: A MACHINE LEARNING APPROACH TO STOCKMARKET
STOCKSENTIX: A MACHINE LEARNING APPROACH TO STOCKMARKETSTOCKSENTIX: A MACHINE LEARNING APPROACH TO STOCKMARKET
STOCKSENTIX: A MACHINE LEARNING APPROACH TO STOCKMARKET
 
Derogatory Comment Classification
Derogatory Comment ClassificationDerogatory Comment Classification
Derogatory Comment Classification
 
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
LINKING SOFTWARE DEVELOPMENT PHASE AND PRODUCT ATTRIBUTES WITH USER EVALUATIO...
 
MODELING THE ADAPTION RULE IN CONTEXTAWARE SYSTEMS
MODELING THE ADAPTION RULE IN CONTEXTAWARE SYSTEMSMODELING THE ADAPTION RULE IN CONTEXTAWARE SYSTEMS
MODELING THE ADAPTION RULE IN CONTEXTAWARE SYSTEMS
 
Modeling the Adaption Rule in Contextaware Systems
Modeling the Adaption Rule in Contextaware SystemsModeling the Adaption Rule in Contextaware Systems
Modeling the Adaption Rule in Contextaware Systems
 
Temporal Exploration in 2D Visualization of Emotions on Twitter Stream
Temporal Exploration in 2D Visualization of Emotions on Twitter StreamTemporal Exploration in 2D Visualization of Emotions on Twitter Stream
Temporal Exploration in 2D Visualization of Emotions on Twitter Stream
 
IRJET- Sentimental Prediction of Users Perspective through Live Streaming : T...
IRJET- Sentimental Prediction of Users Perspective through Live Streaming : T...IRJET- Sentimental Prediction of Users Perspective through Live Streaming : T...
IRJET- Sentimental Prediction of Users Perspective through Live Streaming : T...
 
IEEE 2014 C# Projects
IEEE 2014 C# ProjectsIEEE 2014 C# Projects
IEEE 2014 C# Projects
 
IEEE 2014 C# Projects
IEEE 2014 C# ProjectsIEEE 2014 C# Projects
IEEE 2014 C# Projects
 
IRJET- Monitoring Suspicious Discussions on Online Forums using Data Mining
IRJET- Monitoring Suspicious Discussions on Online Forums using Data MiningIRJET- Monitoring Suspicious Discussions on Online Forums using Data Mining
IRJET- Monitoring Suspicious Discussions on Online Forums using Data Mining
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED  ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED  ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
 
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKINGINTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
INTELLIGENT SOCIAL NETWORKS MODEL BASED ON SEMANTIC TAG RANKING
 
Ui Design And Usability For Everybody
Ui Design And Usability For EverybodyUi Design And Usability For Everybody
Ui Design And Usability For Everybody
 
WMJ&GMBwosc08-Effective Learning & Production Via Modelling
WMJ&GMBwosc08-Effective Learning & Production Via ModellingWMJ&GMBwosc08-Effective Learning & Production Via Modelling
WMJ&GMBwosc08-Effective Learning & Production Via Modelling
 

The Web Science MacroScope: Mixed-methods Approach for Understanding Web Activity

  • 1. The Web Science MacroScope: Mixed-methods Approach for Understanding Web Activity Markus Luczak-Roesch (some slides based on work of Ramine Tinati) University of Southampton (UK), Web and Internet Science Group @mluczak | http://markus-luczak.de Image source: https://en.wikipedia.org/wiki/File:Compound_Microscope_(cropped).JPG, CC BY-SA 4.0
  • 2. The World Wide Web Image source: screenshot taken from https://www.w3.org/History/1989/proposal.html
  • 3. Essential part of the data science story of the WWW: Web Observatories Data Sources Challenges: - Who are the providers? - Is the service reliable/stable? Data Collection Challenges: - API Limitations/Restrictions - Data Schemas/Consistency - Does it change over time? Data Storage Challenges: - Storage approaches (relational, flat, linked?) Data Analysis and Modelling Challenges: - What methods/models? - How is the data sampled? Data Visualisation Challenges: - Misrepresentation of data? e.g. visualise “filtered” data Data Querying and Transformation Statistical and computational analysis Methods Data Interpretation Challenges: - Are the questions being asked relevant to the data? - Are insights being fed back into the analysis? Add or update initial stored data Update current harvesting strategy (req. for real-time analysis) Image source: https://en.wikipedia.org/wiki/File:Sphinx_Observatory.jpg, CC BY-SA 2.0
  • 4. What to observe? Social Machines! “Real life is and must be full of all kinds of social constraint – the very processes from which society arises. Computers can help if we use them to create abstract social machines on the Web: processes in which the people do the creative work and the machine does the administration.” Berners-Lee, Tim; Mark Fischetti (1999). Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by its inventor. Britain: Orion Business. ISBN 0-7528-2090-7.
  • 5. Topic outbreaks across systems Peak in tweets containing topic ‘x’ Peak in Wikipedia views of articles ‘x’ ‘Lag diffusion’ time
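The ‘lag diffusion’ time on slide 5 can be illustrated with a minimal sketch. It assumes both systems are sampled on the same time grid (for example hourly counts) and simply compares where each series peaks. The function name `peak_lag` and the sample counts are invented for illustration; the slide does not say how the lag was actually measured.

```python
def peak_lag(tweet_counts, view_counts):
    """Return how many time steps the Wikipedia-view peak trails the tweet peak.

    Both inputs are equally spaced count series on the same time grid.
    A positive result means the view peak comes after the tweet peak.
    """
    tweet_peak = max(range(len(tweet_counts)), key=tweet_counts.__getitem__)
    view_peak = max(range(len(view_counts)), key=view_counts.__getitem__)
    return view_peak - tweet_peak


# Hypothetical hourly counts for one topic, invented for illustration only.
tweets = [2, 5, 40, 12, 6, 3]
views = [10, 11, 15, 30, 90, 25]
lag = peak_lag(tweets, views)
print(f"Wikipedia views peak {lag} step(s) after tweets")
```

Comparing single peaks is the simplest possible reading; a cross-correlation over the whole series would be more robust to noisy data.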
  • 7. Participation in Citizen Science projects by communication patterns
    Source: Tinati, R., Luczak-Rösch, M., Simperl, E., Hall, W., & Shadbolt, N. (2015, May). ‘/Command’ and conquer: analysing discussion in a citizen science game. In ACM Web Science Conference 2015.
    [The slide shows page excerpts from the paper. Text cropped at the page edges is omitted here; the legible parts follow.]
    Legible fragments mention 835,732 chat messages, statistics computed for 10,714 ‘active’ players, and game commands issued with a slash (‘/’), such as ‘/silence’ and group message (‘/gm’).
    Figure 2: Embedded Chat Interface in EyeWire
    Figure 4: Distribution of games, chat messages, and account durations (games and chat) for all EyeWire players.
    Figure 5: Timeline of chat and gaming activity for the EyeWire platform.
    Methods excerpt: The cohort analysis examines monthly cohorts of players based on their first chat and game entry, and provides a measure of sustained activity. Based on the monthly player retention values, we are able to differentiate between different sets of users. To examine the context and discourse within the chat messages, we perform text analysis to extract the use of EyeWire game commands, and also perform topic modelling on the content of the chat messages. To achieve this we use LDA [5] to derive topic models which contain common vocabulary used by players. We combine this with the different categories of chat messages in order to determine the context of chat during different stages of completing a game. As we are interested in the relationship between a player's gaming session and use of chat, we construct a model of player chat messages which classifies chat activity at different stages of when a game is being performed. As described in Table 1 and illustrated in Figure 3, we categorise the chat messages into 5 stages around the process of gaming. Stages Q1 to Q4 are relative to the time it took for the game to be completed. For example, if a game was completed in 10 seconds, then Q1 would represent 0-2 seconds, [...]
    Table 1: Chat Message Stages: Boundary Conditions
      Stage                 Criteria
      Before Game (Q0)      30s < Game Start
      Start of Game (Q1)    Game Start < x < 1st Quartile Game Duration
      During Game (Q2-3)    1st Quartile Game Duration < x < 3rd Quartile Game Duration
      End of Game (Q4)      3rd Quartile Game Duration < x < Game End
      After Game (Q5)       30s < Game End
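The boundary conditions in Table 1 can be sketched as a small classifier. This is an illustration under stated assumptions, not the authors' implementation: the function name `chat_stage` is hypothetical, the "1st" and "3rd quartile game duration" are read as 25% and 75% of the elapsed game time, and the handling of exact boundary values is chosen arbitrarily. The paper's own 10-second example is cut off on the slide, so the exact cut points are not confirmed.

```python
def chat_stage(msg_time, game_start, game_end, margin=30.0,
               first_cut=0.25, third_cut=0.75):
    """Classify a chat timestamp (in seconds) relative to one game.

    Returns "Q0", "Q1", "Q2-3", "Q4", "Q5", or None when the message falls
    outside the 30-second windows around the game. The cut fractions and
    the inclusive/exclusive boundaries are assumptions, not from the source.
    """
    duration = game_end - game_start
    q1 = game_start + first_cut * duration   # assumed 1st-quartile boundary
    q3 = game_start + third_cut * duration   # assumed 3rd-quartile boundary

    if game_start - margin <= msg_time < game_start:
        return "Q0"    # up to 30 s before the game starts
    if game_start <= msg_time < q1:
        return "Q1"    # start of game
    if q1 <= msg_time < q3:
        return "Q2-3"  # during game
    if q3 <= msg_time <= game_end:
        return "Q4"    # end of game
    if game_end < msg_time <= game_end + margin:
        return "Q5"    # up to 30 s after the game ends
    return None


# Hypothetical game lasting from t=0 s to t=100 s.
for t in (-10, 10, 50, 90, 120, 200):
    print(t, chat_stage(t, game_start=0, game_end=100))
```

Making the cut fractions parameters keeps the sketch usable if the paper's quartile definition turns out to differ from the 25%/75% reading.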
  • 8. Participation in Citizen Science projects by communication patterns Luczak-Roesch, M., Tinati, R., Simperl, E., Van Kleek, M., Shadbolt, N., & Simpson, R. (2014). Why won't aliens talk to us? Content and community dynamics in online citizen science. Proceedings of the Eighth AAAI Conference on Weblogs and Social Media, ICWSM 2014, Ann Arbor, Michigan, USA, June 1-4, 2014. Image source: David Miller, https://daily.zooniverse.org/2013/11/21/an-ever-expanding-zooniverse/
  • 9. Participation in Citizen Science projects by location
  • 10. Temporal networks of information co-occurrence for system-agnostic exploratory data analysis Markus Luczak-Roesch, Ramine Tinati, Max van Kleek, and Nigel Shadbolt. 2015. From coincidence to purposeful flow? Properties of transcendental information cascades. In IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Paris, FR.
  • 11. Where is the MacroScope? Data Sources Challenges: - Who are the providers? - Is the service reliable/stable? Data Collection Challenges: - API Limitations/Restrictions - Data Schemas/Consistency - Does it change over time? Data Storage Challenges: - Storage approaches (relational, flat, linked?) Data Analysis and Modelling Challenges: - What methods/models? - How is the data sampled? Data Visualisation Challenges: - Misrepresentation of data? e.g. visualise “filtered” data Data Querying and Transformation Statistical and computational analysis Methods Data Interpretation Challenges: - Are the questions being asked relevant to the data? - Are insights being fed back into the analysis? Add or update initial stored data Update current harvesting strategy (req. for real-time analysis) Image sources: https://en.wikipedia.org/wiki/File:Compound_Microscope_(cropped).JPG, CC BY-SA 4.0 & https://en.wikipedia.org/wiki/File:Sphinx_Observatory.jpg, CC BY-SA 2.0
  • 12. What is the MacroScope?
  • 13. Other data visualization capacities Image source: screenshot from https://www.imperial.ac.uk/data-science/kpmg-data-observatory-/technical-specifications/
  • 14. Other data visualization capacities Image source: screenshot from http://approach.rpi.edu/2015/11/18/immersive-experience-the-campfire/
  • 15. What is the MacroScope? “Wow, they don’t even know that this is happening!”
  • 16. Do we really think this is an event to be addressed in a purely quantitative fashion? Source: United Nations Development Programme, https://goo.gl/Z1uXdV, CC BY-NC-ND 2.0
  • 17. A qualitative investigation of crowdsourced disaster response • Haiti (Ushahidi, N=298) – requests for help from identified local source • Congo (Ushahidi, N=102) – information about the situation but not who is responsible for this information – more non-local sources • Ebola (Twitter, N=298) – comments: tasteless jokes; racist comments; concern that the crisis could spread and call to governments to close the borders. Joint project with Silke Roth
  • 18. Boundaries of crowdsourced disaster response • Wrong things go viral • Crowdsourcing informativeness of social media information not synchronized with crises [Chart legend: negative, neutral, positive] “When you tell a […] kid that is has got Ebola”
  • 19. Serendipitous discoveries in Citizen Science Hanny’s Voorwerp Galaxy Zoo [2007] Green Pea Galaxies Galaxy Zoo [2007] Yellow Balls Milky Way [2009] Circumbinary Planet Ph1b Planet Hunter [2012] Convict Worm Seafloor Explorer [2012] Spanish Flu Operation War Diaries [2014]
  • 20. From information co-occurrence to the discovery of hidden structure in Wikipedia Figure 1: Wikipedia edits in a three dimensional space. The dimensions are (1) time; (2) information diversity as the chronologi[…] Tinati, R., Luczak-Rösch, M., & Hall, W. (to appear). Finding Structure in Wikipedia Edit Activity: An Information Cascade Approach. In WikiWorkshop 2016, co-located with WWW 2016. Events detected: • Edward Snowden speech at SXSW conference • US Supreme Court case on same sex marriage (a) Cascade Article Network (CAN): Nodes represent unique Wikipedia articles, edges are shared edits based on a shared identifier matched. A force directed layout has been applied, with edge path lengths determined by edge weight. The strongly connected component (A) contains articles associated with South Korean media, (B) and (C) contain articles related to the USA. (b) Cascade-to-Cascade path network graph: Nodes are cascades, edges are the shared articles between cascades. The central strongly connected component is established by the identifiers shown in Table 3. A force directed layout has been applied, with edge path lengths determined by edge weight.
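To make the Cascade Article Network description on slide 20 concrete, here is a minimal sketch that links two articles whenever their edits share an identifier, and weights the edge by the number of shared identifiers. It deliberately ignores the temporal ordering that the cascade approach relies on. The function name, the input shape, and the sample edits are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cascade_article_network(edits):
    """Build weighted, undirected article-to-article edges.

    `edits` is an iterable of (article, identifiers) pairs, where
    `identifiers` is the set of identifiers matched in one edit.
    The edge weight counts how many distinct identifiers two articles share.
    """
    articles_by_identifier = {}
    for article, identifiers in edits:
        for identifier in identifiers:
            articles_by_identifier.setdefault(identifier, set()).add(article)

    edges = Counter()
    for articles in articles_by_identifier.values():
        # Sort so each undirected edge has one canonical (a, b) key.
        for a, b in combinations(sorted(articles), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical edits, invented for illustration only.
sample_edits = [
    ("Article A", {"x", "y"}),
    ("Article B", {"x"}),
    ("Article C", {"y"}),
    ("Article B", {"y"}),
]
network = cascade_article_network(sample_edits)
for (a, b), weight in sorted(network.items()):
    print(a, "--", b, "weight", weight)
```

The resulting edge weights are the kind of value a force-directed layout could use for edge lengths, as the caption describes.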
  • 21. The MacroScope is technology. External APIs: Twitter, Wikipedia, Instagram, Google Trends, Yahoo Trends. Pre-processing Stage: 1. Enrich Streams 2. Unify feeds into WO JSON Format. Streaming Stage: 1. Post incoming stream to RabbitMQ exchange (each source has its own exchange). Hadoop Storage Stage: 1. Apache Flume for each stream. HDFS. HTTP Streaming Stage: 1. Send Stream to Web Observatory Server. Unstructured Web Streams or Web Scraped Pages. Web Observatory JSON Data Schema. RabbitMQ JSON Stream. Socket.IO. Daily Storage Stage: 1. MapReduce Daily Results. MongoDB. MacroScope. Socket.IO
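The pre-processing step on slide 21 ("Unify feeds into WO JSON Format", one exchange per source) might look roughly like the sketch below. The slide does not show the Web Observatory JSON schema, so every field name here, the `wo.` exchange prefix, and the sample records are hypothetical. Publishing to RabbitMQ is left out so the example runs with the standard library only.

```python
import json
from datetime import datetime, timezone


def unify(source, raw, text_field, time_field):
    """Map a source-specific record onto one shared JSON-friendly shape.

    The field names below are placeholders; the real WO JSON schema
    is not given in the slides.
    """
    return {
        "source": source,
        "text": raw[text_field],
        "timestamp": raw[time_field],
        "harvested_at": datetime.now(timezone.utc).isoformat(),
    }


def exchange_for(source):
    """Name the per-source exchange (naming scheme is an assumption)."""
    return f"wo.{source}"


# Hypothetical raw records from two differently shaped feeds.
tweet = {"body": "topic x is trending", "created": "2016-01-01T00:00:00Z"}
edit = {"comment": "update topic x", "ts": "2016-01-01T00:05:00Z"}

records = [
    unify("twitter", tweet, text_field="body", time_field="created"),
    unify("wikipedia", edit, text_field="comment", time_field="ts"),
]
for record in records:
    print(exchange_for(record["source"]), json.dumps(record))
```

Once every feed has the same shape, the downstream stages can treat all sources uniformly.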
  • 22. There is more than one MacroScope • six screens in WAIS labs • as part of presentations • as a mobile exhibit • as a Web application
  • 23. Cross-disciplinary research Scholars from discipline A Scholars from discipline B Adaptive epistemological framework
  • 24. Engagement with the general public Scholars People from the general public demonstrating the power and the danger of individuals sharing information online developing a new “situational ethics of data”
  • 26. The MacroScope Surveys, interviews, focus groups, observations Scholars The public
  • 27. A mantra for the MacroScope: “Overview first, zoom and filter, then details-on-demand”* and capture engagement. * Shneiderman, B. (1996, September). The eyes have it: A task by data type taxonomy for information visualizations. In Visual Languages, 1996. Proceedings., IEEE Symposium on (pp. 336-343). IEEE. Image source: screenshot taken from http://data.shopsavvy.mobi/globe
  • 28. The Web Science MacroScope: Mixed-methods Approach for Understanding Web Activity Markus Luczak-Roesch @mluczak | http://markus-luczak.de Image source: https://en.wikipedia.org/wiki/File:Compound_Microscope_(cropped).JPG, CC BY-SA 4.0 Discover Describe Directly engage