From coincidence to purposeful flow? Properties of transcendental
information cascades.
Markus Luczak-Roesch
University of Southampton, Web and Internet Science Group
@mluczak | http://sociam.org
Zooniverse
Serendipity through talk
Task and talk participation
[Figure labels: talk contributions, classifications; 40.5%]
Community-level linguistic change
Project | Initial 10% | Most recent 10%
PH | transit, star, day, aph, look, one, planet, like, possibl, dip | day, transit, httparchive. . . , possibl, star, kid, dip, look, planet, like
SF | like, look, fish, sea, scallop, thing, imag, right, star, left | corallinealga, anemon, object, hermitcrab, bryozoan, stalkedtun, shrimp, left, cerianthid, sanddollar
NN | field, record, one, use, enter, get, work, can, specimen, button | like, field, record, date, name, can, click, look, get, label
Stable domain specific vocabulary
Emerging domain specific vocabulary
Stable problem/error reporting
Dominance of microposts and implicit coordination
[Figure: vocabulary shift (axis 0 to 10) and microposts per project (PH, SG, SW, NN, GZ, CC, PF, SF, AP, WS); 91%]
Luczak-Roesch, M., Tinati, R., Simperl, E., Van Kleek, M., Shadbolt, N., & Simpson, R. (2014). Why won't aliens talk to us? Content and community dynamics in online citizen science. Proceedings of the Eighth AAAI Conference on Weblogs and Social Media, ICWSM 2014, Ann Arbor, Michigan, USA, June 1-4, 2014.
Networks within and out of the Zooniverse
Crowd mapping
Crisis response on social media
A qualitative investigation of crowdsourced
disaster response
•  Haiti (Ushahidi, N=298)
– requests for help from
identified local source
•  Congo (Ushahidi, N=102)
– information about the
situation but not who is
responsible for this
information
– more non-local sources
•  Ebola (Twitter, N=298)
– comments
•  tasteless jokes
•  racist comments
•  concern that the crisis could
spread and call to
governments to close the
borders
Boundaries of crowdsourced disaster response
•  Wrong things go viral
•  Crowdsourcing informativeness
of social media information not
synchronized with crises*
[Figure labels: negative, neutral, positive]
“When you tell a […] kid that is has got Ebola”
*Olteanu, A., Vieweg, S., & Castillo, C. (2015). What to Expect When the Unexpected Happens: Social Media Communications Across Crises. In Proc. of 18th ACM Computer Supported Cooperative Work and Social Computing (CSCW’15), (No. EPFL-CONF-203562).
The future of disaster crowd work
Synchronization
Coordination
We can observe situations in which online communication does not
happen along explicit social ties (especially in critical situations,
when time to make decisions is scarce). Instead of talking
explicitly with each other, people are
broadcasting about the same event or topic.
Source: United Nations Development Programme, https://goo.gl/Z1uXdV, CC BY-NC-ND 2.0
“An informational
cascade occurs when it is
optimal for an individual,
having observed the actions of
those ahead of him, to follow the
behavior of the preceding
individual without regard to
his own information.” [1]
[2]	
  
[1] Bikhchandani, Sushil, David Hirshleifer, and Ivo Welch. "A theory of fads, fashion, custom, and cultural
change as informational cascades." Journal of Political Economy (1992): 992-1026.
[2] Cheng, Justin, et al. "Can cascades be predicted?." Proceedings of the 23rd international conference
on World wide web. International World Wide Web Conferences Steering Committee, 2014.
Boundaries of context-rich approaches
[Figure labels: Twitter, Facebook, Quora, ?; System A, System B, System C; time axis t]
Collective action?
Does the accumulated information propagation behaviour on the
Web form giant purposeful processes?
Source: Michael Dales, https://goo.gl/IKXs4X, CC BY-NC 2.0
Discovering the algorithms of Social Machines
Socio-technical Computation
The computational capability embodied in cascades of information
sharing activities on the Web that are not necessarily conditioned by
system-specific or social network features but only time and inherent
properties of pairs of resources.
Markus Luczak-Roesch, Ramine Tinati, Kieron O'Hara, and Nigel Shadbolt. 2015. Socio-technical Computation. In
Proceedings of the 18th ACM Conference Companion on Computer Supported Cooperative Work & Social Computing
(CSCW'15 Companion). ACM, New York, NY, USA, 139-142. http://doi.acm.org/10.1145/2685553.2698991
Content streams as automata [3]
[Figure: 2-state model (HF, LF) and infinite-state model; axes: time, number of observed documents]
[3] Kleinberg, Jon. "Bursty and hierarchical structure in streams." Data
Mining and Knowledge Discovery 7.4 (2003): 373-397.
Transcendental
information cascades
[Figure: example cascade along a time axis t with the identifier sets #A, #A#B, #A#B#C, #B#D, #C]
Building transcendental information cascades
[…]nding of its use but also an abstract global […] propose a new model that we call transcendental information cascades. Informed by Kleinberg's work on […] document streams [2] it regards time as […]le condition for relationships between any […], meaning that we focus on coincidence of […] activities rather than socially-determined conditionality.

In [20] we presented the initial definition of a transcendental information cascade as a 4-tuple $TC = (V, E, R, F)$. This 4-tuple represents a directed network consisting of a set of nodes $V$ and edges $E$, derived when applying a set of matching functions $F$ to a set of resources $R = \{r_1, r_2, ..., r_m\}$, $r_i = (u_i, t_i, c_i)$, where every $u_i$ is a unique identifier of a resource $r_i$ that was shared at the time $t_i$ with the content $c_i$. Nodes in the network are those resources from $R$ that contain a set $I_i$ of one or multiple cascade identifiers. A cascade identifier is any unique informational pattern that is recognized by applying a matching function to the content or any other inherent properties of a resource (e.g. simple string matching algorithms to identify keywords in content). Formally a matching function $f_k \in F$, $k \in \mathbb{N}$, $k \le n$ is defined as:

\[
f_k(c_i) =
\begin{cases}
\{i_1, i_2, ..., i_x\} & \text{if } f_k \text{ matches patterns } \{i_1, i_2, ..., i_x\} \text{ in } c_i,\ x \in \mathbb{N} \\
\emptyset & \text{otherwise}
\end{cases}
\]

Nodes $V$ and edges $E$ are then given as follows

\[
\begin{aligned}
V &= \{v_1, v_2, ..., v_p\}, & v_y &= (u_y, t_y, I_y), \\
E &= \{e_1, e_2, ..., e_q\}, & e_z &= (u_a, u_b, \Lambda_z)
\end{aligned}
\]

with $I_i = \{i_1, i_2, ..., i_o\} = f_1(c_i) \cup f_2(c_i) \cup ... \cup f_n(c_i)$ being the result of the concatenation of all identifiers found by all matching functions². An edge exists between any two nodes that share a unique subset of all the cascade identifiers that were found for them. Neither this subset nor any of its subsets is part of the identifiers found for any node that was created in the time period between when the two linked nodes were created.

\[
\Lambda_z = \{\, i_r \mid i_r \in I_a \wedge i_r \in I_b,\ \forall i_r \rightarrow V' = \{ v_c \mid v_c = (u_c, t_c, I_c),\ i_r \in I_c \wedge t_a < t_c < t_b \} = \emptyset,\ v_c \in V,\ r \in \mathbb{N},\ r \le |I_b| \,\}
\]

A node that contains a cascade identifier that was not detected for any other node before is called an identifier root. Besides this, we call a node without any incoming edges a network root and a node that has no outgoing edges a stub.
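The 4-tuple definition can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `hashtag_matcher`, `build_cascade` and `classify` are hypothetical, timestamps are assumed to be distinct, and "between" is read as strictly between the two linked nodes. Under these assumptions an identifier links a node to the most recent earlier node that carries the same identifier.

```python
import re


def hashtag_matcher(content):
    """A matching function f_k: returns the set of hashtags found in the content."""
    return set(re.findall(r"#\w+", content))


def build_cascade(resources, matchers):
    """Derive nodes V and edges E from resources R = [(u, t, c), ...]."""
    nodes = []
    for u, t, c in sorted(resources, key=lambda r: r[1]):
        identifiers = set()
        for f in matchers:
            identifiers |= f(c)  # I_i is the union over all matching functions
        if identifiers:  # only resources with at least one identifier become nodes
            nodes.append((u, t, frozenset(identifiers)))

    edges = []
    last_seen = {}  # identifier -> u of the most recent node containing it
    for u, _, identifiers in nodes:
        labels = {}  # source node -> identifiers shared with no node in between
        for i in identifiers:
            if i in last_seen:
                labels.setdefault(last_seen[i], set()).add(i)
        edges.extend((a, u, frozenset(label)) for a, label in labels.items())
        for i in identifiers:
            last_seen[i] = u
    return nodes, edges


def classify(nodes, edges):
    """Return identifier roots, network roots and stubs as lists of node ids."""
    targets = {b for _, b, _ in edges}
    sources = {a for a, _, _ in edges}
    seen, identifier_roots = set(), []
    for u, _, identifiers in nodes:
        if identifiers - seen:  # node introduces a previously unseen identifier
            identifier_roots.append(u)
        seen |= identifiers
    network_roots = [u for u, _, _ in nodes if u not in targets]
    stubs = [u for u, _, _ in nodes if u not in sources]
    return identifier_roots, network_roots, stubs


# Illustrative input: nine resources whose position doubles as id and timestamp.
contents = ["#A", "#A#B", "#A", "#A", "#A#B#C", "#C", "#A", "#B#D", "#A"]
resources = [(n, n, c) for n, c in enumerate(contents, start=1)]
nodes, edges = build_cascade(resources, [hashtag_matcher])
identifier_roots, network_roots, stubs = classify(nodes, edges)
for a, b, label in sorted(edges, key=lambda e: (e[1], e[0])):
    print(a, "->", b, sorted(label))
```

Tracking only the most recent holder of each identifier keeps the construction linear in the number of identifier occurrences; a different reading of the interval condition would require a different bookkeeping.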
Our cascade model clearly yields different outputs depending on the data to hand (e.g. determined by the extent of the Web crawl), and the matching algorithms determining which cascade identifiers will be spotted (e.g. reuse of hashtags, URIs, quotes, images, or maybe exploiting wider semantics or sentiment) as depicted in Figure ??.

² Please note that [20] contains an unintentionally malformed equation for this as the wrong symbol was used to refer to the concatenation of the matching functions.

Fig. 1. Depending on the applied matching functions, different transcendental information cascade representations can be generated for the same input data.

A fictive example of a transcendental cascade based on our model is shown in Figure 2. Consider a system that features hashtags as an established form of identifying content patterns. The visualisation uses the following approach to represent distinct identifiers and time: Nodes are chronologically ordered alongside the horizontal dimension from left (the oldest node) to right (the most recent node); additionally nodes are ordered alongside the vertical dimension depending on the set of identifiers present in a node (each unique set is assigned to a distinct level). Consequently, the visualisation represents the content creation sequence (“#A”) - (“#A#B”) - (“#A”) - (“#A”) - (“#A#B#C”) - (“#C”) - (“#A”) - (“#B#D”) - (“#A”).

Fig. 2. Example of a cascade that emerges along five different identifiers. #A, #B, #A#B#C, #B#D and #C are fictive hashtags (or hashtag combinations respectively) treated as the identifying content patterns.

In order to understand how edges are labelled we highlight the sub-graph involving the nodes 2, 3, 4, and 5. Conforming to our cascade model an edge exists between nodes 2 and 3 […]
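The layout rule described for the visualisation (chronological order on the horizontal axis, one vertical level per unique identifier set) can be sketched as follows. The function name `layout` is hypothetical, and numbering the levels in order of first appearance is an assumption; the text only states that each unique set gets a distinct level.

```python
def layout(identifier_sets):
    """Assign (x, y) positions: x is the chronological rank, y the level of the set."""
    levels = {}  # unique identifier set -> vertical level
    positions = []
    for x, identifiers in enumerate(identifier_sets):
        key = frozenset(identifiers)
        y = levels.setdefault(key, len(levels))  # new sets open a new level
        positions.append((x, y))
    return positions, levels


# The content creation sequence from the example, oldest node first.
sequence = [
    {"#A"}, {"#A", "#B"}, {"#A"}, {"#A"}, {"#A", "#B", "#C"},
    {"#C"}, {"#A"}, {"#B", "#D"}, {"#A"},
]
positions, levels = layout(sequence)
print(positions)
print(len(levels), "distinct levels")
```

For this sequence the sketch produces five levels, one per distinct identifier set in the sequence.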
Transcendental
information cascades
[Figure: cascade along a time axis t with the nodes #A, #A#B, #A#B#C, #A, #A, #B, #A]
Capturing the unintended action resulting from information
sharing activities of human collectives.
[Figure: document stream and transcendental information cascade along a time axis t]
Temporal text/data mining
[Excerpt from [5], cropped on the left; gaps are marked with […]]
[Equation for the theme strength over a time window of width W; not recoverable from the crop]
[…] of each theme can then be modeled as the […] theme strengths over time.
[…] of theme life cycles thus involves the follow[…] (1) Construct an HMM to model how themes […] each other in the collection. (2) Estimate the […] parameters of the HMM using the whole stream […] observed example sequence. (3) Decode the col[…] label each word with the hidden theme model […] is generated. (4) For each trans-collection […] when it starts, when it terminates, and how […].
[…] EXPERIMENTS AND RESULTS
[…] Preparation
[…]ts are constructed to evaluate the proposed […] methods. The first, tsunami news data, con[…] articles about the event of Asia Tsunami dated […] to Feb. 8 2005. We downloaded 7468 news […] selected sources, with the keyword query […] shown in Table 1, three of the sources are in […] are in Europe and the rest are in the U.S.
[Table 1: news sources of Asia Tsunami data set, partially cropped; legible entries: Times of India (India), VOA (US), Washington Post (US), Washington Times (US), Xinhua News (China)]
with the previous one. We use the mixture model discussed
in Section 3 to extract the most salient themes in each time
interval. We set the background parameter λB = 0.95 and
number of themes in each time interval to be 6. The varia-
tion of λB is discussed later. Table 3 shows the top 10 words
with the highest probabilities in each theme span. We see
that most of these themes suggest meaningful subtopics in
the context of the Asia tsunami event.
Figure 6: Theme evolution graph for Asia Tsunami
With these theme spans, we use KL-divergence to further
identify evolutionary transitions. Figure 6 shows a theme
evolution graph discovered from Asia Tsunami data when
the threshold for evolution distance is set to ξ = 12. From
Figure 6, we can see several interesting evolution threads
which are annotated with symbols.
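To make the thresholding step concrete, here is a small sketch of comparing two theme word distributions with KL-divergence. It is only an illustration: the function names, the additive smoothing constant `epsilon`, the direction of the divergence (later theme against earlier theme) and the toy distributions are assumptions, not details taken from [5].

```python
import math


def kl_divergence(p, q, epsilon=1e-9):
    """D(p || q) for word distributions given as dicts, smoothed over the joint vocabulary."""
    vocabulary = set(p) | set(q)
    ps = {w: p.get(w, 0.0) + epsilon for w in vocabulary}
    qs = {w: q.get(w, 0.0) + epsilon for w in vocabulary}
    zp, zq = sum(ps.values()), sum(qs.values())  # renormalise after smoothing
    return sum(
        (ps[w] / zp) * math.log((ps[w] / zp) / (qs[w] / zq)) for w in vocabulary
    )


def evolves_into(earlier, later, xi):
    """Treat the pair as an evolutionary transition if the distance is below xi."""
    return kl_divergence(later, earlier) < xi


# Toy distributions, invented for illustration only.
warning = {"tsunami": 0.5, "warning": 0.5}
warning_system = {"tsunami": 0.4, "warning": 0.4, "system": 0.2}
donations = {"aid": 0.5, "donation": 0.5}

print(evolves_into(warning, warning_system, xi=12))
print(evolves_into(warning, donations, xi=12))
```

The smoothing constant strongly influences the divergence of themes with disjoint vocabularies, so a threshold such as ξ = 12 is only meaningful relative to the smoothing that is used.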
The thread labeled with a may be about warning systems
[4] Subašić, I., & Berendt, B. (2013). Story graphs: Tracking document set evolution using dynamic graphs.
Intelligent Data Analysis, 17(1), 125-147.
[5] Mei, Q., & Zhai, C. (2005, August). Discovering evolutionary theme patterns from text: an exploration of
temporal text mining. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge
discovery in data mining (pp. 198-207). ACM.
[5]	
  
“The key notion of
TTM is burstiness –
sudden increases in
frequency of text
fragments, and all TTM
methods aim to model
burstiness.” [4]
[Figure: matching functions F1 … Fn and cascades C11, C21, C22, C23 over resources shared at times t0 … t8, with labels of the form t6 - t0, t2 - t1, t8 - t2, t4 - t2, t7 - t4, t5 - t3 and t1 - t0, t2 - t1, t4 - t1, t4 - t3, t6 - t5, t8 - t6, t7 - t4, t5 - t4, t3 - t2]
There is more than one “reality”
Analyzing low-level properties of the multiple
states of a system that exist at the same time
[Figure: cascade motifs for the Tags, URIs, and KID & APH cascades; values shown: 4, 1, 15, 10]
•  Single node motifs
•  long uniform paths
•  short uniform paths
•  long non-uniform paths
Analyzing low-level properties of the multiple
states of a system that exist at the same time
[Figure: identifier entropy for the Tags, URIs, and KID & APH cascades]
Fig. 4. Overview of the results of the cascade comparison. Cascade size distribution and Wiener index are plotted on a log-log scale; identifier entropy is plotted with a log scale on the y-axis.
[…]ain one or few identifiers equally distributed. Very large identifiers […] large identifiers (KID, APH, URIs), cascades which are based on […]
Varying profiles of increasing randomness with growing cascade size
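The two structural measures named in the caption can be sketched in a few lines. This is a hedged illustration rather than the measurement code behind the figure: identifier entropy is assumed here to be the Shannon entropy of identifier occurrences within one cascade, the Wiener index is computed on the cascade treated as an undirected, unweighted graph, and the function names are hypothetical.

```python
import math
from collections import Counter, deque


def identifier_entropy(identifier_sets):
    """Shannon entropy (bits) of identifier occurrences across the nodes of a cascade."""
    counts = Counter(i for identifiers in identifier_sets for i in identifiers)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def wiener_index(edges):
    """Sum of shortest-path lengths over all unordered pairs of connected nodes."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    total = 0
    for source in adjacency:
        distance = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from every node
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    return total // 2  # every pair was counted from both endpoints


uniform_path = [(1, 2), (2, 3), (3, 4)]
print(wiener_index(uniform_path))
print(identifier_entropy([{"#A"}, {"#A"}, {"#A"}]))
print(identifier_entropy([{"#A"}, {"#B"}]))
```

A cascade that carries a single identifier has zero entropy under this reading, while identifiers that occur equally often maximise it.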
  
Cascade motifs as an indicator of state?
[Figure: matching functions F1 … Fn and cascades C11, C21, C22, C23 along a time axis t, annotated with a question mark]
Formalising the
multiple possible
representations of
a system at any time
and their relationships.
Not all representing
purposeful action but
reflecting useful
informational properties.
By focusing only on the
coincidence of
information occurrence,
we can capture and
analyse emergent
collective action across
system boundaries and
independent from social
network contexts.
Markus Luczak-Roesch
@mluczak
http://markus-luczak.de
Source: Giulia Forsythe, http://goo.gl/6hpZ0W, CC BY-NC-SA 2.0
References
•  Markus Luczak-Roesch, Ramine Tinati, Kieron O'Hara, and Nigel Shadbolt. 2015. Socio-technical Computation. In Proceedings of the 18th ACM Conference Companion on Computer Supported Cooperative Work & Social Computing (CSCW'15 Companion). ACM, New York, NY, USA, 139-142. http://doi.acm.org/10.1145/2685553.2698991
•  Markus Luczak-Roesch, Ramine Tinati, and Nigel Shadbolt. 2015. When Resources Collide: Towards a Theory of Coincidence in Information Spaces. To appear in WWW’15 Companion, May 18–22, 2015, Florence, Italy. http://dx.doi.org/10.1145/2740908.2743973

Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)Areesha Ahmad
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPirithiRaju
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 

Kürzlich hochgeladen (20)

Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)GBSN - Biochemistry (Unit 1)
GBSN - Biochemistry (Unit 1)
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 

From coincidence to purposeful flow? Properties of transcendental information cascades.

  • 1. From coincidence to purposeful flow? Properties of transcendental information cascades. Markus Luczak-Roesch University of Southampton, Web and Internet Science Group @mluczak | http://sociam.org
  • 4. Task and talk participation 40.5% [Chart labels: Talk contributions, Classifications]
  • 5. Community-level linguistic change
    Project | initial 10% | most recent 10%
    PH | transit, star, day, aph, look, one, planet, like, possibl, dip | day, transit, httparchive..., possibl, star, kid, dip, look, planet, like
    SF | like, look, fish, sea, scallop, thing, imag, right, star, left | corallinealga, anemon, object, hermitcrab, bryozoan, stalkedtun, shrimp, left, cerianthid, sanddollar
    NN | field, record, one, use, enter, get, work, can, specimen, button | like, field, record, date, name, can, click, look, get, label
    Annotations: Stable domain specific vocabulary; Emerging domain specific vocabulary; Stable problem/error reporting
  • 6. Dominance of microposts and implicit coordination 91%
    [Figure: vocabulary shift and microposts per project (PH, SG, SW, NN, GZ, CC, PF, SF, AP, WS)]
    Luczak-Roesch, M., Tinati, R., Simperl, E., Van Kleek, M., Shadbolt, N., & Simpson, R. (2014). Why won't aliens talk to us? Content and community dynamics in online citizen science. Proceedings of the Eighth AAAI Conference on Weblogs and Social Media, ICWSM 2014, Ann Arbor, Michigan, USA, June 1-4, 2014.
  • 7. Networks within and out of the Zooniverse
  • 9. Crisis response on social media
  • 10. A qualitative investigation of crowdsourced disaster response
    •  Haiti (Ushahidi, N=298)
       – requests for help from identified local source
    •  Congo (Ushahidi, N=102)
       – information about the situation but not who is responsible for this information
       – more non-local sources
    •  Ebola (Twitter, N=298)
       – comments
         •  tasteless jokes
         •  racist comments
         •  concern that the crisis could spread and call to governments to close the borders
  • 11. Boundaries of crowdsourced disaster response
    •  Wrong things go viral
    •  Crowdsourcing informativeness of social media information not synchronized with crises*
    [Chart legend: negative, neutral, positive]
    “When you tell a […] kid that is has got Ebola”
    *Olteanu, A., Vieweg, S., & Castillo, C. (2015). What to Expect When the Unexpected Happens: Social Media Communications Across Crises. In Proc. of 18th ACM Computer Supported Cooperative Work and Social Computing (CSCW’15), (No. EPFL-CONF-203562).
  • 12. The future of disaster crowd work Synchronization Coordination
  • 13. We can observe situations in which online communication does not happen along explicit social ties (especially in critical situations when time to make decisions is scarce). Instead of talking explicitly with each other, people are broadcasting about the same event or topic. Source: United Nations Development Programme, https://goo.gl/Z1uXdV, CC BY-NC-ND 2.0
  • 14. Boundaries of context-rich approaches
    “An informational cascade occurs when it is optimal for an individual, having observed the actions of those ahead of him, to follow the behavior of the preceding individual without regard to his own information.” [1] [2]
    [1] Bikhchandani, Sushil, David Hirshleifer, and Ivo Welch. "A theory of fads, fashion, custom, and cultural change as informational cascades." Journal of Political Economy (1992): 992-1026.
    [2] Cheng, Justin, et al. "Can cascades be predicted?" Proceedings of the 23rd International Conference on World Wide Web. International World Wide Web Conferences Steering Committee, 2014.
  • 16.
  • 17. System A System B System C Collective action? t  
  • 18. Does the accumulated information propagation behaviour on the Web form giant purposeful processes? Source: Michael Dales, https://goo.gl/IKXs4X, CC BY-NC 2.0
  • 19. Discovering the algorithms of Social Machines
    Socio-technical Computation: The computational capability embodied in cascades of information sharing activities on the Web that are not necessarily conditioned by system-specific or social network features but only time and inherent properties of pairs of resources.
    Markus Luczak-Roesch, Ramine Tinati, Kieron O'Hara, and Nigel Shadbolt. 2015. Socio-technical Computation. In Proceedings of the 18th ACM Conference Companion on Computer Supported Cooperative Work & Social Computing (CSCW'15 Companion). ACM, New York, NY, USA, 139-142. http://doi.acm.org/10.1145/2685553.2698991
  • 20. Content streams as automata [3]
    [Figure: 2-state model and infinite-state model; state labels: HF, LF; axes: Time, Number of observed documents]
    [3] Kleinberg, Jon. "Bursty and hierarchical structure in streams." Data Mining and Knowledge Discovery 7.4 (2003): 373-397.
  • 22. Transcendental information cascades t   #A   #A#B   #A#B#C   #B#D   #C  
  • 23. Building transcendental information cascades
    […] we propose a new model that we call transcendental information cascades. Informed by Kleinberg's work on […] document streams [2], it regards time as […] condition for relationships between any […], meaning that we focus on coincidence of activities rather than socially-determined conditionality.

    In [20] we presented the initial definition of a transcendental information cascade as a 4-tuple $TC = (V, E, R, F)$. This 4-tuple represents a directed network consisting of a set of nodes $V$ and edges $E$, derived when applying a set of matching functions $F$ to a set of resources $R = \{r_1, r_2, \ldots, r_m\}$, $r_i = (u_i, t_i, c_i)$, where every $u_i$ is a unique identifier of a resource $r_i$ that was shared at the time $t_i$ with the content $c_i$. Nodes in the network are those resources from $R$ that contain a set $I_i$ of one or multiple cascade identifiers. A cascade identifier is any unique informational pattern that is recognized by applying a matching function to the content or any other inherent properties of a resource (e.g. simple string matching algorithms to identify keywords in content). Formally, a matching function $f_k \in F$, $k \in \mathbb{N}$, $k \le n$ is defined as:

    $$f_k(c_i) = \begin{cases} \{i_1, i_2, \ldots, i_x\} & \text{if } f_k \text{ matches patterns } \{i_1, i_2, \ldots, i_x\} \text{ in } c_i,\ x \in \mathbb{N} \\ \emptyset & \text{otherwise} \end{cases}$$

    Nodes $V$ and edges $E$ are then given as follows:

    $$V = \{v_1, v_2, \ldots, v_p\}, \quad v_y = (u_y, t_y, I_y)$$
    $$E = \{e_1, e_2, \ldots, e_q\}, \quad e_z = (u_a, u_b, \Lambda_z)$$

    with $I_i = \{i_1, i_2, \ldots, i_o\} = f_1(c_i) \cup f_2(c_i) \cup \ldots \cup f_n(c_i)$ being the result of the concatenation of all identifiers found by all matching functions². An edge exists between any two nodes that share a unique subset of all the cascade identifiers that were found for them. This subset and none of its subsets is part of the identifiers found for any node that was created in the time period between when the two linked nodes were created.

    $$\Lambda_z = \{ i_r \mid i_r \in I_a \wedge i_r \in I_b,\ \forall i_r \rightarrow V' = \{ v_c \mid v_c = (u_c, t_c, I_c),\ i_r \in I_c \wedge t_a \le t_c \le t_b \} = \emptyset,\ v_c \in V,\ r \in \mathbb{N},\ r \le |I_b| \}$$

    A node that contains a cascade identifier that was not detected for any other nodes before is called the identifier root. Besides this, we call a node without any incoming edges a network root and a node that has no outgoing edges a stub. Our cascade model clearly yields different outputs depending on the data to hand (e.g. determined by the extent of the Web crawl), and the matching algorithms determining which cascade identifiers will be spotted (e.g. reuse of hashtags, URIs, quotes, images, or maybe exploiting wider semantics or sentiment) as depicted in Figure 1.

    ² Please note that [20] contains an unintentionally malformed equation for this as the wrong symbol was used to refer to the concatenation of the matching functions.

    Fig. 1. Depending on the applied matching functions, different transcendental information cascade representations can be generated for the same input data.

    A fictive example of a transcendental cascade based on our model is shown in Figure 2. Consider a system that features hashtags as an established form of identifying content patterns. The visualisation uses the following approach to represent distinct identifiers and time: nodes are chronologically ordered alongside the horizontal dimension from left (the oldest node) to right (the most recent node); additionally, nodes are ordered alongside the vertical dimension depending on the set of identifiers present in a node (each unique set is assigned to a distinct level). Consequently, the visualisation represents the content creation sequence (“#A”) - (“#A#B”) - (“#A”) - (“#A”) - (“#A#B#C”) - (“#C”) - (“#A”) - (“#B#D”) - (“#A”).

    Fig. 2. Example of a cascade that emerges along five different identifiers. #A, #B, #A#B#C, #B#D and #C are fictive hashtags (or hashtag combinations respectively) treated as the identifying content patterns.

    In order to understand how edges are labelled we highlight the sub-graph involving the nodes 2, 3, 4, and 5. Conforming to our cascade model an edge exists between nodes 2 and 3 […]
  • 24. Transcendental information cascades t   #A   #A#B   #A#B#C   #A   #A   #B   #A  
  • 25. Capturing the unintended action resulting from information sharing activities of human collectives. t   Document stream   Transcendental Information Cascade  
  • 26. Temporal text/data mining
    “The key notion of TTM is burstiness – sudden increases in frequency of text fragments, and all TTM methods aim to model burstiness.” [4]

    Page excerpt from [5], cropped on the slide: “[…] of each theme can then be modeled as the theme strengths over time. […] (1) Construct an HMM to model how themes […] each other in the collection. (2) Estimate the […] of the HMM using the whole stream […] example sequence. (3) Decode the […] each word with the hidden theme model […] is generated. (4) For each trans-collection […] when it starts, when it terminates, and how […]. […] are constructed to evaluate the proposed methods. The first, tsunami news data, […] about the event of Asia Tsunami dated […] Feb. 8 2005. We downloaded 7468 news […] selected sources, with the keyword query […]. [Table: news sources of Asia Tsunami data set, columns News Source and Nation] […] We use the mixture model discussed in Section 3 to extract the most salient themes in each time interval. We set the background parameter λB = 0.95 and number of themes in each time interval to be 6. The variation of λB is discussed later. Table 3 shows the top 10 words with the highest probabilities in each theme span. We see that most of these themes suggest meaningful subtopics in the context of the Asia tsunami event. [Figure 6: Theme evolution graph for Asia Tsunami] With these theme spans, we use KL-divergence to further identify evolutionary transitions. Figure 6 shows a theme evolution graph discovered from Asia Tsunami data when the threshold for evolution distance is set to ξ = 12. From Figure 6, we can see several interesting evolution threads which are annotated with symbols. The thread labeled with a may be about warning systems […]”

    [4] Subašić, I., & Berendt, B. (2013). Story graphs: Tracking document set evolution using dynamic graphs. Intelligent Data Analysis, 17(1), 125-147.
    [5] Mei, Q., & Zhai, C. (2005, August). Discovering evolutionary theme patterns from text: an exploration of temporal text mining. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining (pp. 198-207). ACM.
  • 27. There is more than one “reality”
    [Figure: matching functions F1 … Fn and cascades C11, C21, C22, C23 over timestamps t0 to t8; edges labelled with time differences such as t1 - t0, t2 - t1, t6 - t0]
  • 28. Analyzing low-level properties of the multiple states of a system that exist at the same time
    [Figure: panels for Tags, URIs, and KID & APH; motif types: single node motifs, long uniform paths, short uniform paths, long non-uniform paths; values shown: 4, 1, 15, 10]
  • 29. Analyzing low-level properties of the multiple states of a system that exist at the same time
    [Figure: identifier entropy for Tags, URIs, and KID & APH]
    Fig. 4. Overview of the results of the cascade comparison. Cascade size distribution and Wiener index are plotted on a log-log scale; identifier entropy is plotted with a log scale on the y-axis.
    […] contain one or few identifiers equally distributed. Very large identifiers (KID, APH, URIs), cascades which are based on […] varying profiles of increasing randomness with growing cascade size […]
  • 30. Cascade motifs as an indicator of state?
  • 31. Formalising the multiple possible representations of a system at any time and their relationships. Not all representing purposeful action but reflecting useful informational properties.
    [Figure: timeline t; matching functions F1 … Fn; cascades C11, C21, C22, C23]
  • 32. By focusing only on the coincidence of information occurrence, we can capture and analyse emergent collective action across system boundaries and independent from social network contexts. Markus Luczak-Roesch @mluczak http://markus-luczak.de Source: Giulia Forsythe, http://goo.gl/6hpZ0W, CC BY-NC-SA 2.0
  • 33. References
    •  Markus Luczak-Roesch, Ramine Tinati, Kieron O'Hara, and Nigel Shadbolt. 2015. Socio-technical Computation. In Proceedings of the 18th ACM Conference Companion on Computer Supported Cooperative Work & Social Computing (CSCW'15 Companion). ACM, New York, NY, USA, 139-142. http://doi.acm.org/10.1145/2685553.2698991
    •  Markus Luczak-Roesch, Ramine Tinati, and Nigel Shadbolt. 2015. When Resources Collide: Towards a Theory of Coincidence in Information Spaces. To appear in WWW’15 Companion, May 18–22, 2015, Florence, Italy. http://dx.doi.org/10.1145/2740908.2743973
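
The cascade construction quoted on slide 23 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `hashtag_matcher`, `build_cascade` and `roots_and_stubs` are hypothetical, the single matching function is a simple hashtag regex, and the "time period between" two linked nodes is read here as strictly between them (so each identifier links a node to the most recent earlier node carrying that identifier). The input is the fictive hashtag sequence from the slide.

```python
import re
from collections import defaultdict


def hashtag_matcher(content):
    """Hypothetical matching function f_k: the set of hashtags found in the content."""
    return set(re.findall(r"#\w+", content))


def build_cascade(resources, matchers):
    """Build nodes and labelled edges from resources given as (u, t, c) triples.

    Assumption: an identifier links a node to the most recent earlier node
    that carries the same identifier (no node in between carries it).
    """
    nodes = []
    for u, t, c in sorted(resources, key=lambda r: r[1]):
        # I_i is the union of the results of all matching functions.
        identifiers = set().union(*(f(c) for f in matchers))
        if identifiers:  # resources without identifiers do not become nodes
            nodes.append((u, t, frozenset(identifiers)))

    last_seen = {}             # identifier -> id of the latest node carrying it
    labels = defaultdict(set)  # (u_a, u_b) -> identifiers labelling that edge
    for u, _, identifiers in nodes:
        for i in identifiers:
            if i in last_seen:
                labels[(last_seen[i], u)].add(i)
        for i in identifiers:
            last_seen[i] = u

    edges = [(a, b, frozenset(ids)) for (a, b), ids in labels.items()]
    return nodes, edges


def roots_and_stubs(nodes, edges):
    """Network roots have no incoming edges; stubs have no outgoing edges."""
    sources = {a for a, _, _ in edges}
    targets = {b for _, b, _ in edges}
    network_roots = [u for u, _, _ in nodes if u not in targets]
    stubs = [u for u, _, _ in nodes if u not in sources]
    return network_roots, stubs


# Fictive content creation sequence from slide 23; node ids double as timestamps.
contents = ["#A", "#A#B", "#A", "#A", "#A#B#C", "#C", "#A", "#B#D", "#A"]
resources = [(n, n, c) for n, c in enumerate(contents, start=1)]
nodes, edges = build_cascade(resources, [hashtag_matcher])
network_roots, stubs = roots_and_stubs(nodes, edges)
```

Under these assumptions node 2 is linked to node 3 via #A and to node 5 via #B, which matches the sub-graph of nodes 2, 3, 4 and 5 that the excerpt singles out. Swapping in other matching functions (URIs, quotes) would yield a different cascade for the same resources.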