2011 04 troussov_graph_basedmethods-weakknowledge

Alexander Troussov, Ph.D., IBM Dublin Software Lab
16th of April 2011, Mathlingvo Seminar, St.Petersburg State University, Russia

Graph-based methods
to exploit “weak” knowledge

© 2011 Alexander Troussov

About AT

IBM Ireland Center for Advanced Studies - Chief Scientist
IBM LanguageWare group – the Architect
National Geophysical Data Center, Boulder, CO, USA - Visiting scientist
– Fuzzy logic based search engine for search in large databases when exact parameters
of search are hard to define
Observatoire de la Côte d’Azur, Nice, France – Visiting scientist
– numerical simulation in stochastic physics
Institute of Physics of the Earth (Russian Academy of Sciences) and the International
Institute for Earthquake Prediction Theory and Mathematical Geophysics, Moscow, Russia -
Lead Researcher
– R&D in geophysics and geoinformatics
System programming at the Institute of Precise Mechanics, Moscow
PhD in Mathematics from Lomonosov Moscow State University


Natural Language Understanding is Inferencing (?)

From a computational point of view
natural language understanding
is inferencing

– Text which mentions
Malahide
is probably about
Canada (??)

Malahide (Canada 2006 Census population
8,828) is a township in Elgin County, Ontario,
Canada

Source: Troussov et al. MITACS, Canada, 2010


Inferencing

Terms are ambiguous, and our knowledge is never “the truth, the whole truth, and nothing
but the truth”
– Malahide, Co. Dublin
– Malahide is a township in Elgin County, Ontario, Canada.
– Paradis Gisenyi Malahide is a hotel in Rwanda
Solution (Troussov et al., MITACS, Canada, 2010): propagate activation from multiple concepts. For
instance, the initial seed for the activation propagation comprises two nodes in a geographical
taxonomy, Malahide (Ontario) and Malahide (Co. Dublin), as well as the other concepts
mentioned in the text
• Text which mentions Malahide and Europe – is a little bit more likely to be about
Ireland than about Canada
• Text which mentions Malahide and Clontarf – is more likely to be about
• …
• Cohesive, coherent text which mentions Malahide, Mulhuddart, Lansdowne,
Clontarf, Donabate – is almost certainly about Dublin


Knowledge, Lexico-Semantic Resource

Text Relevancy


Text – Semantic Network

[Diagram: term mentions (Mention, Mention, …) in the TEXT are mapped to concepts in the
NETWORK OF CONCEPTS, where the “focus” concept is then found]


NLU as inferencing

The concept of a car is relevant to a text.
Car IS-A “on-land travel” (?)
Therefore “on-land travel” is somewhat relevant to the text, …


Demo

– 2 1 Spreading Activation.pdf


Agenda

Introduction
Building Semantic Model
Spreading Activation (SA)
Research Challenges
– Why SA
– Reliability of inferencing
– What is the purpose of graph operations
Centrality, network flow methods
Zoo of algorithms
Nepomuk Recommender


Spreading Activation Methods


There is an increased need for a generic and formal understanding of spreading
activation as a class of algorithms rather than as one particular algorithm with many parameters.
Spreading activation (also known as spread of activation) is a method for searching
associative networks, neural networks or semantic networks. The method is based on the
idea of quickly spreading an associative relevancy measure over the network. Our goal is to
give an expanded introduction to the method and to show in sufficient detail that it can be
applied to very diverse problems and applications. We present the method as a general
framework: a very general class of algorithms on large (or very large) so-called
multidimensional networks, which serve as the mathematical model.

Source: Troussov, Levner, Bogdan, Judge, Botvich “Spreading activation methods”

We present spreading activation in a generic form, as a set of methods suitable for mining
multidimensional networks with oriented weighted links. These graphmining methods might
produce results similar to those which might be achieved by soft clustering and fuzzy
inferencing. The input object is a function on nodes of the network, and the spread of
activation is a technique which provides “spreading” of this function through the network
links. The result of the spreading activation is a new function on the nodes. The properties of
that function strongly depend on the original function and the parameters of the spreading
activation. For instance, when the underlying network is a network of ontological concepts,
parameters governing spread might be chosen in such a way that allows “smoothing” of the
original function and interpreting the resulting function as “conceptual” summaries of the
initial non-zero valued nodes.


Origin of Spreading Activation Methods

In neurophysiology, interactions between neurons are modeled by way of activation which
propagates from one neuron to another via connections called synapses, which transmit
information using chemical signals. The first spreading activation models were used in
cognitive psychology to model the processes of memory retrieval (Collins, A.M. & Loftus,
E.F., 1975; Anderson, J., 1983).
This framework was later exploited in Artificial Intelligence (AI) as a processing framework
for semantic networks and ontologies, and applied to Information Retrieval (Crestani, F.,
1997; Aleman-Meza, Halaschek, Arpinar, & Sheth, 2003; Rocha, C, Schwabe, D. & Poggi de
Aragao, M., 2004; …) as the result of direct transfer of information retrieval ideas from
cognitive sciences to AI.


Notation

A multidimensional network can be modeled as a directed graph, which is a pair
G = (V, E)
where
V – is the set of vertices v_i
E – is the set of edges e_j (in oriented graphs, edges are referred to as arcs)
init: E → V – is the mapping which provides initial nodes for arcs
term: E → V – is the mapping which provides terminal nodes for arcs
imp – is the importance value of arcs and nodes.
For instance, imp(v), where the node v is a geographical location, might be the population; imp(e)
might be the number of phone calls from person init(e) to person term(e).
w – “weights”, for instance, a sigmoidal function of imp.
w(e_j) = 0 means that effectively the arc e_j is ignored
w(e_j) = 1 means that the activation of init(e_j) strongly affects the activation of term(e_j). For instance,
when the nodes represent “words”, synonym links might be assigned the value 1.
F – is the “activation” function, usually a real-valued function on the nodes of the network, F: V → R.
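
A minimal sketch of this notation in Python, assuming arcs are stored as small records. The names Arc and sigmoid_weight, the link type label, the people, and the midpoint and steepness values are illustrative assumptions, not part of the talk.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class Arc:
    init: str    # init(e): the initial node of the arc
    term: str    # term(e): the terminal node of the arc
    kind: str    # link type (hypothetical label)
    imp: float   # imp(e): importance, e.g. a number of phone calls

def sigmoid_weight(imp, midpoint=10.0, steepness=0.5):
    """Map an importance value to a weight w in (0, 1); both parameters are assumed."""
    return 1.0 / (1.0 + math.exp(-steepness * (imp - midpoint)))

arcs = [
    Arc("Anna", "Boris", "phone_call", imp=30.0),   # many calls: w close to 1
    Arc("Anna", "Clara", "phone_call", imp=2.0),    # few calls: w close to 0
]
w = {arc: sigmoid_weight(arc.imp) for arc in arcs}
F = {"Anna": 1.0}   # activation function F, stored only for non-zero valued nodes
```

Storing F only for non-zero valued nodes matches the way the framework keeps a list of active nodes rather than a value for every node of a very large network.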


Generic description of spreading activation methods (SAM)
framework

1. Initialisation
Sets the parameters of the algorithm, the network, and the initial function F, given as a list of
non-zero valued nodes V_n
2. Iterations
(each iteration is one pulse of SAM)
– a. List Expansion
The list is expanded to include neighbors (both neighbors reached by following outgoing
links and neighbors which have links to the nodes in the list). Newly added nodes
receive a zero level of activation.
– b. Recomputation
The value at each node in the list is recomputed based on the values of the function on
the nodes which have links to the given node, and on the types of the connections.
– c. List Purging
The list is purged: nodes with values below a threshold are excluded.
– d. Conditions Check To Break Iterations
For example, a maximum number of iterations to be performed.
3. Output
The list of nodes (the values of the function after the spread of activation), ranked according to
their F values.
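
The steps above can be sketched in a few lines of Python. This is one possible instance of the framework, not the authors' implementation: the function name, the default parameters, the tiny geographical taxonomy and its unit weights are all assumptions made for illustration, and the recomputation rule is the simple additive one.

```python
from collections import defaultdict

def spread_activation(arcs, seed, pulses=2, threshold=0.01, beta=1.0):
    """arcs: (init, term, w) triples; seed: dict mapping node -> initial activation."""
    out_arcs = defaultdict(list)
    neighbors = defaultdict(set)
    for init, term, w in arcs:
        out_arcs[init].append((term, w))
        neighbors[init].add(term)    # neighbor reached by an outgoing link
        neighbors[term].add(init)    # neighbor that links to this node

    F = dict(seed)                                   # 1. initialisation
    for _ in range(pulses):                          # 2d. stop after a fixed number of pulses
        for v in list(F):                            # 2a. list expansion
            for u in neighbors[v]:
                F.setdefault(u, 0.0)
        incoming = defaultdict(float)                # 2b. recomputation
        for v, activation in F.items():
            for term, w in out_arcs[v]:
                signal = activation / len(out_arcs[v]) ** beta   # I(e)
                incoming[term] += signal * w                     # O(e) = I(e) * w(e)
        F = {v: a + incoming[v] for v, a in F.items()}           # Fnew(v) = F(v) + Input(v)
        F = {v: a for v, a in F.items() if a >= threshold}       # 2c. list purging
    return sorted(F.items(), key=lambda item: item[1], reverse=True)   # 3. output

# Hypothetical taxonomy fragment: both readings of "Malahide" are seeded, plus Clontarf.
arcs = [
    ("Malahide (Co. Dublin)", "Dublin", 1.0),
    ("Clontarf", "Dublin", 1.0),
    ("Dublin", "Ireland", 1.0),
    ("Malahide (Ontario)", "Elgin County", 1.0),
    ("Elgin County", "Ontario", 1.0),
]
seed = {"Malahide (Co. Dublin)": 1.0, "Malahide (Ontario)": 1.0, "Clontarf": 1.0}
ranked = spread_activation(arcs, seed)
```

In this toy run "Dublin" ends up as the most activated node because it collects activation from two seed concepts, while "Elgin County" collects it from only one.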


Generic description of recomputation phase

We have the list of nodes V_n.
1. Input/Output Through Links Computation.
– For each node v we compute the input signal to each arc e such that init(e) = v. When the
signal (“activation”) passes through a link e, the activation usually experiences decay by
a factor w(e)
2. Input/Output of Node Activation
– Before the pulse, the node v has the activation level F(v).
• Through incoming links, v gets more activation. By dissipating activation through
outgoing links, the node v might lose activation.
3. Computation of the New Level of Activation
– A new value F(v) is computed based on F(v), Input(v), and Output(v)



1. Input/Output Through Links Computation.
For each node v we compute the input signal to each arc e such that init(e) = v. This
computation can be based on the value F(v), the outdegree of the node, etc. For instance, if
the node v has n outgoing arcs of the same type, each arc e might get the input signal:
I(e) = F(init(e)) · (1 / outdegree(v)^β)
where β might be equal to 1. It could also be less than one, in which case the node v will
propagate more activation to its neighbors than it has.
When the signal (“activation”) passes through a link e, the activation usually experiences
decay by a factor w(e):
O(e) = I(e) · w(e)
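
A small worked example of the two formulas above, assuming a node with activation 1.0 and four outgoing arcs; the function name and the weights are made up for illustration.

```python
def arc_signals(activation, weights, beta=1.0):
    """Return (I(e), O(e)) for every outgoing arc of a node with the given F(v)."""
    signal_in = activation / len(weights) ** beta          # I(e) = F(v) / outdegree(v)^beta
    return [(signal_in, signal_in * w) for w in weights]   # O(e) = I(e) * w(e)

weights = [1.0, 0.5, 0.5, 0.0]                     # four outgoing arcs (assumed weights)
conserving = arc_signals(1.0, weights, beta=1.0)   # the I(e) values sum to F(v)
amplifying = arc_signals(1.0, weights, beta=0.5)   # the I(e) values sum to more than F(v)
```

With beta equal to 1 each arc receives 0.25 and the node sends out exactly what it has. With beta equal to 0.5 each arc receives 0.5, so the node sends out twice its own activation before the decay by w(e) is applied.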


Generic description of input/output phase

2. Input/Output of Node Activation
Before the pulse, the node v has the activation level F(v).

Through incoming links, v gets more activation:
Input(v) = Σ O(e)
over all links e such that init(e) ∈ V_n, term(e) = v.

By dissipating activation through outgoing links, the node v might lose activation:
Output(v) = Σ I(e)
over all links e such that init(e) = v, term(e) ∈ V_n



3. Computation of the New Level of Activation
A new value F(v) is computed based on F(v), Input(v), and Output(v), for example
F_new(v) = F(v) + Input(v)


SAM and Methods of Numerical Simulation in Physics

Spreading activation algorithms were introduced in the 1990s; however, the same iterative
methods were used long before in numerical simulation in physics, mechanics, chemistry
and engineering sciences. The major distinctions between these algorithms and what is now
called spreading activation are:
– a) in physics, such algorithms usually work on a regular mesh (so that the local
topology of the graph is encoded into the formulas of the recomputation stage)
– b) in physics, initial conditions (the initial activation) are usually assigned to all nodes
of the mesh, so algorithms for efficient graph traversal are not needed. For
instance, steps 2a (List Expansion) and 2c (List Purging) in the generic description of
the SAM framework might be skipped.
For instance, the one-dimensional heat transfer equation might be numerically simulated on a
one-dimensional mesh by iterative methods. On each iteration, the recomputation stage is
based on the formula below:
F_new(v) = ( F(RightNeighbor(v)) + F(LeftNeighbor(v)) ) / 2
Using a different formula, one can simulate the behavior of an oscillating string (although this
will require storing three values at each node: the position, mass and velocity of the material
point corresponding to the node).
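
The averaging formula can be tried on a five-node mesh. In this sketch the two end nodes are held at fixed values, which is an assumed boundary condition that the slide does not specify; the function name and the initial values are also illustrative.

```python
def heat_pulse(F):
    """One pulse on a 1-D mesh: every interior node becomes the mean of its two neighbors.
    The two end nodes are held fixed, which is an assumed boundary condition."""
    interior = [(F[i - 1] + F[i + 1]) / 2 for i in range(1, len(F) - 1)]
    return [F[0]] + interior + [F[-1]]

F = [0.0, 0.0, 8.0, 0.0, 4.0]    # initial activation on all five mesh nodes
for _ in range(200):
    F = heat_pulse(F)
# F is now very close to the straight line [0, 1, 2, 3, 4] between the fixed ends
```

No list expansion or purging is needed here: every node is active from the start, and the neighborhood of each node is hard-coded in the index arithmetic.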

SAM and Methods of Numerical Simulation in Physics

Using the same iterative algorithm, with one set of parameters one can emulate heat
transfer; with another set of parameters the same algorithm will show us the behavior of
oscillating strings. But the phenomena of heat propagation and string oscillation are quite
different (for instance, heat propagation might lead to “thermal death”, the state of
equilibrium where the level of activation is the same for all nodes, while oscillation might
continue forever). Our illustration concerns only the basics, while real modeling might be much
more complicated; for instance, heat transfer might lead to combustion, where after reaching
some level of activation a node generates more “heat” than it gets from neighboring nodes.


Spreading Activation as a Graphmining Technique

The technique of SAM is quite polymorphic. On this slide we interpret the results of
spreading activation in terms of graph mining.
– First of all, one can think that after running SAM the most activated nodes will be those
nodes, which get the activation from multiple sources, or, in other words, those nodes
which minimize the “distance” to the nodes which were initially activated. Therefore
these nodes might be considered as potential centroids of strong clusters induced by the
initial activation. Since partitioning of the nodes according to these clusters is not
immediately available (and is not needed in many applications), SAM algorithms might
be considered as methods of soft clustering.
– On the other hand, the most activated nodes are those nodes, which are connected to
the initial conditions by particular types of directed links (arcs with large weights).
Therefore we might consider SAM as an efficient scheme for computing fuzzy
inferencing. For such applications replacing a single valued function F by a vector
function might be useful.
We conclude by noting that SAM algorithms might be used for soft clustering and fuzzy
inferencing on networks.


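[Diagram: a network whose node labels are written in Greek]
Γαλλία (France)
Παρίσι (Paris)
Ναπολέων (Napoleon)    Αλέξανδρος (Alexander)

People
Geographical
artifacts
Relations
• Friends
• Part of, Instance of, Subclass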
• Created


France Russia

Paris Moscow

Napoleon Alexander

Borodino

Kutuzov

Meeting:
Battle of Austerlitz

Meeting:
Battle of Borodino

Project:
Invasion of Russia


Diagram on the previous slide …

What does it represent?
How can it be used?


[Diagram as on the earlier slide: France, Russia, Paris, Moscow, Napoleon, Alexander,
Borodino, Kutuzov; Meeting: Battle of Austerlitz; Meeting: Battle of Borodino;
Project: Invasion of Russia]

How could this diagram be used?
1. A network flow process could show the nodes most relevant
to the pair “Napoleon” & “Meeting”
- Selection WHO – whom to invite
- Other nodes – explain recommendations
2. When Napoleon opens an email or a web page containing W&P,
he will be advised that the content of this resource is relevant
to his project “Invasion of Russia”

Diagram on the previous slide … What does it represent?

Data from Facebook, data from Napoleon’s Lotus Notes calendar, the structure of a Wiki,
a network of collocations or relations between the entities in W&P, …
– The proliferation of Web 2.0 and Enterprise 2.0 technologies has led to the emergence
of massive networks connecting people and various digital artifacts. These networks can
be treated as “weak” knowledge, which nevertheless might be used for recommendations
and even for such traditional applications as knowledge-based text processing
Or an instantiation of an ontology related to W&P by Leo Tolstoy
– In which case we would probably know that Napoleon is the emperor of France, Paris is the
capital (not an instantiation of a subclass) of France, etc.
An ontology provides conceptualization and allows inferencing, but these advantages per se are
useless without tedious manual work to encode the rules for using this additional
knowledge. By contrast, the knowledge encoded in the topology of a multidimensional network is
ready to use, provided that the methods are tolerant to errors and inconsistencies in the data, i.e.
the methods are methods of “soft mathematics”: fuzzy inferencing, soft clustering, …


Social Context = Knowledge ?

A New Mathematical Model of Horse Racing
Assume, without loss of generality, that each horse in the
race is modelled by a wooden ball of radius R_i.

= a ball ? ☺


Representing social context as knowledge allows us to
benefit from the experience of knowledge-based
applications.


For instance, the social context modeled as a network is not much different from the semantic networks
which are formed from concepts represented in ontologies, and it is possible to use such networks
for knowledge-based text processing. Representing social context as knowledge allows us to draw
on experience from such a mature R&D area as knowledge-based text processing


How to model the social context

As multidimensional networks
– The primary source - network models of instantiations of techno-social systems
As a “Knowledge” – represented as objects, clauses, XML, graphs, some combination of
these


The primary source – network models of techno-social systems

Log files of techno-social systems (like Facebook or IBM’s Lotus Connections)
keep track of who did what.
[Diagram: triples labeled with relations such as Invited, Joined, Created]
Triples could be aggregated into a
network.
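
A minimal sketch of this aggregation, assuming each log entry is a (subject, relation, object) triple. The people, artifacts and events below are invented for illustration; repeated events are simply counted, so that the count can later serve as the importance imp(e) of the arc.

```python
from collections import Counter

log = [                                   # hypothetical log entries: who did what
    ("Anna", "Invited", "Boris"),
    ("Boris", "Joined", "Wiki community"),
    ("Anna", "Created", "Blog post"),
    ("Anna", "Invited", "Boris"),
]
arc_importance = Counter(log)             # one typed arc per distinct triple
nodes = {node for subject, _, obj in log for node in (subject, obj)}
```

People and digital artifacts end up as nodes of the same network, and the relation name becomes the type of the arc.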


Examples of Graph Models:
Folksonomies – Tripartite Hypergraph

Social bookmarking systems (Del.icio.us, …)
– Where to keep my bookmarks?
– Users (actors), resources, tags
In social bookmarking systems users describe bookmarks by keywords called tags. The
structure behind these social systems, called folksonomies, can be viewed as a tripartite
hypergraph of actor, tag and resource nodes.
– Three types of first-class citizens, and hyperplanes
– If the hyperplanes are made from rubber, they could be shrunk to a node, so the
hyperplanes will also be first-class citizens
Advantages of the network models (see next slide)
– Extensibility
– Ease of merging heterogeneous information
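
One way to read the “rubber” remark is sketched below: each tag assignment, which links a user, a tag and a resource at once, is shrunk to a node of its own and connected to its three members by ordinary edges. The users, tags, resources and the "post:" naming are assumptions for illustration.

```python
tag_assignments = [            # (user, tag, resource) triples; all names are made up
    ("anna", "graphs", "resource-1"),
    ("boris", "graphs", "resource-2"),
    ("anna", "nlp", "resource-3"),
]
edges = []
for i, (user, tag, resource) in enumerate(tag_assignments):
    post = f"post:{i}"         # the three-way link "shrunk" to a node of its own
    edges += [(post, f"user:{user}"), (post, f"tag:{tag}"), (post, f"resource:{resource}")]
nodes = {node for edge in edges for node in edge}
```

The result is an ordinary graph with four node types, which is easy to extend with further node types and to merge with other networks.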

Source: Hypergraphs: see Jäschke et al., "Logsonomy – A Search Engine Folksonomy", ICWSM 2008, AAAI Press (2008)

Inferencing – “Soft methods” could provide reliable inferencing

For instance, the social context modeled as a network is not much different from the semantic networks
which are formed from concepts represented in ontologies, and it is possible to use such networks
for knowledge-based text processing. Representing social context as knowledge allows us to draw
on experience from such a mature R&D area as knowledge-based text processing



Inferencing

Terms are ambiguous, and our knowledge is never “the truth, the whole truth, and nothing
but the truth”
– Malahide, Co. Dublin
– Malahide is a township in Elgin County, Ontario, Canada.
– Paradis Gisenyi Malahide is a hotel in Rwanda
Solution (Troussov et al. MITACS, Canada, 2010 ): propagation from multiple concepts, for
instance, the initial seed for the activation propagation starts at two nodes in a geographical
taxonomy: Malahide (Ontario) and Malahide (Co. Dublin) as well as from other concepts
mentioned in the text
• Text which mentions Malahide and Europe – is a little bit more likely to be about
Ireland than about Canada
• Text which mentions Malahide and Clontarf – is more likely to be about
• …
• Cohesive, coherent text which mentions Malahide, Mulhuddart, Lansdowne,
Clontarf, Donabate – is almost certainly about Dublin
Such rapid “phase transition” from uncertainty to certainty is similar to the
transition related to percolation threshold


from Uncertainty to Certainty in Inferencing: phase transitions as a function
of seed size in analogy to ones in percolation

In (semantic) networks with high local density
the reliability of inferencing from a single concept is almost never sufficient,
reliability could be low when inferencing starts from a small number of seed concepts,
but inferencing becomes very reliable at some level of the number of the initial seed
concepts (which could be explained by combinatorics)

[Plot: reliability of inferencing (vertical axis) against the number of nodes in the seed (horizontal axis)]

And could be explained by combinatorics

A graph showing the approximate probability of at least two people sharing a birthday
amongst a certain number of people.
In probability theory, the birthday problem, or birthday paradox, pertains to the probability
that in a set of randomly chosen people some pair of them will have the same birthday. By
the pigeonhole principle, the probability reaches 100% when the number of people reaches
366 (ignoring February 29 births). But perhaps counter-intuitively, 99% probability is reached
with a mere 57 people, and 50% probability with 23 people.
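
The figures quoted above can be checked directly, assuming 365 equally likely birthdays; the function name is illustrative.

```python
def shared_birthday_probability(people, days=365):
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for k in range(people):
        p_all_distinct *= max(days - k, 0) / days   # the next person avoids k taken days
    return 1.0 - p_all_distinct

p23 = shared_birthday_probability(23)     # just above one half
p57 = shared_birthday_probability(57)     # just above 0.99
p366 = shared_birthday_probability(366)   # certainty, by the pigeonhole principle
```

The number of pairs grows quadratically with the group size, which is why the probability climbs so quickly. The same combinatorial effect is what makes a handful of mutually close seed concepts such strong evidence.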

Simulation
The network (such as a taxonomy of geographical
locations) is a tree of 20,000 nodes. The text is modeled
as a list of 100 terms, each of which is ambiguous and
could be mapped to 8 network nodes. When such a
mapping happens, we consider that the node (the
geographical location represented by the node) could
be relevant to the text.
We are looking for clusters, that is, groups of N
nodes such that each node is mentioned in the text and the
graph distance between each pair of nodes in the
cluster is less than three.
Such graph structures have a low probability of
occurring by chance for small N (N=1 or 2), and this probability
sharply decreases to zero for bigger N;
correspondingly, our certainty that the graph structure
signifies the topicality of the text increases to 1.0
– Text which mentions Malahide, Mulhuddart,
Lansdowne, Clontarf, Donabate – is almost certainly
about Dublin (Ireland)

Source: F. Darena and A. Troussov 2010


Processes in Networks

How do we study the Earth?
– By looking at the results of the propagation of
waves through the Earth
[Figure: propagation of a seismic wave in the ground
and the effect of the presence of a land mine]
Similarly, one can study networks
by network flow methods
– introducing processes where something
is flowing from node to node across the
edges


Processes

Used goods- trail
Money - walk
Gossip - replication rather than transference (trails rather than walks)
E-mail - diffusion by replication
Attitudes - spread through replication rather than transfer
Infection - spreads like gossip, but does not re-infect
Packages - usually the shortest route possible
Relevancy in semantic networks
Trust - Shortest path or volume?


we are talking about consumability of centrality measurements
produced by network flow methods like these (DEMO)


Key difference between SNA and other approaches to social science

Social sciences usually focus
on attributes of individual actors


Key difference between SNA and other approaches to social science

SNA focuses on relationships
between actors
“Social network analysis reflects a shift from the
individualism common in the social sciences towards a
structural analysis”.
Garton et al. Studying Online Social Networks
Structuralism is an approach to the human sciences
that attempts to analyze a specific field (for instance,
mythology) as a complex system of interrelated parts.
linguists Roman Jakobson and Nikolai Trubetzkoy
anthropologist Lévi-Strauss
~ Complex systems
Sociogram:
– Jacob Levy Moreno (1889-1974) was a leading Austrian-American
psychiatrist and psychosociologist, thinker and
educator, the founder of psychodrama, and the foremost
pioneer of group psychotherapy.
Among Moreno’s primary contributions to sociometrics was
the sociogram. The sociogram is a method of representing
individuals as points on graphs and using lines and arcs to
represent the relationships between the individuals.

Graphics from Prof. Hendrik Speck's tutorial at 5th Karlsruhe Symposium for Knowledge
Management in Theory and Praxis, 2007

Prominence

The study of the structural properties of networks and their interplay with the processes taking
place on the network has been one of the main problems in recent years in the field of complex
network analysis
A primary use of graph theory in social network analysis is to identify “important”
actors.
Centrality and prestige concepts seek to quantify graph theoretic ideas about an individual
actor’s prominence within a network by summarizing structural relations among the graph
nodes.
An actor’s prominence reflects its greater visibility to the other network actors (an audience).
An actor’s prominent location takes account of the direct sociometric choices made and
choices received (outdegrees and indegrees), as well as the indirect ties with other actors.
The two basic prominence classes:

– Centrality: Actor has high involvement in many relations, regardless of send/receive
directionality (volume of activity)
– Prestige: Actor receives many directed ties, but initiates few relations
(popularity > extensivity)
Source: Wasserman & Faust, “Social Network Analysis” (W&F)
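
As a rough illustration of the two classes, raw degree counts can serve as the simplest proxies. This is an assumption of the sketch (W&F define several more refined indices), and the actors and ties are invented.

```python
from collections import Counter

ties = [                      # directed sociometric choices (chooser, chosen); made-up data
    ("Anna", "Boris"), ("Clara", "Boris"), ("Dmitri", "Boris"),
    ("Anna", "Clara"), ("Anna", "Dmitri"),
]
outdegree = Counter(chooser for chooser, _ in ties)   # choices made
indegree = Counter(chosen for _, chosen in ties)      # choices received
```

Here Boris receives three ties and initiates none, which is the prestige pattern. Anna initiates three ties and receives none, so she is active but not prestigious.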

Centrality: Eigenvector Centrality

Eigenvector centrality was introduced by Phillip Bonacich (1972); Bonacich power centrality followed in 1987
“Google's workhorse search engine ranking algorithm, PageRank, is actually a variant on an
SNA concept - Bonacich Power Centrality.
– Bonacich (1987) hypothesized that someone's power in society depends on the power of his or her
social contacts. Bonacich formalized this mathematically:
c_i = B (c_1 R_{i1} + c_2 R_{i2} + ... + c_n R_{in}),
where c_i is the person in question, B is the magnitude of the effect, and R_{ij} is the strength of the
relationship between the person in question, i, and each of the other people, j, under consideration.
If B = 1, the formula becomes eigenvector centrality, of which PageRank is a variant. Now, Page, et
al. (1998) do not cite Bonacich, I am not claiming that they stole the idea - I am merely stating that a
social network analyst appears to me to have been the first to think up the concept”.
Solomon Messing http://www.stanford.edu/~messing/RforSNA.html


Centrality and the network flow methods

Most centrality measures are based on the network flow process, “that focuses on
the outcomes for nodes in a network where something is flowing from node to node across
the edges” (Borgatti and Everett, 2006)
We interpret this “something” as a relevancy measure; for instance, the initial seed input
value which shows nodes of interest in the network. Propagating the relevancy measure
through outgoing links allows us to compute the relevancy measure for other network nodes
and dynamically rank these nodes according to the relevancy measures.
The same paradigm could be used to address the centrality measurements in social network
analysis. Centralisation of the network can be achieved when we assume that all the nodes
are equally important, and iteratively recompute the relevancy measure based on the
connections between nodes.
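
A minimal sketch of this iterative recomputation, assuming a small symmetric relationship matrix R. Rescaling by the largest value after every pulse is a choice made here to keep the numbers bounded, and the function name and the four-node example are illustrative.

```python
def eigenvector_centrality(R, pulses=100):
    """R[i][j] is the strength of the relationship between nodes i and j."""
    n = len(R)
    c = [1.0] * n                       # all nodes start out equally important
    for _ in range(pulses):
        new = [sum(R[i][j] * c[j] for j in range(n)) for i in range(n)]
        scale = max(new) or 1.0         # rescale so that the largest value is 1
        c = [value / scale for value in new]
    return c

# Nodes 0, 1, 2 form a triangle; node 3 is attached to node 0 only.
R = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
c = eigenvector_centrality(R)
```

Node 0 comes out as the most central node and the pendant node 3 as the least central one. On a bipartite network this plain iteration can oscillate instead of settling, which is one reason practical variants add damping.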


Master Equation Numerical Solution

Bonacich Power Centrality, Eigenvector Centrality, Google’s PageRank



Master Equation Numerical Solution

Computation

Master equation easily leads us to a numerical solution


It is great to have “the right master equation”!
What is the shape of a hanging chain?

– What is the shape of a hanging chain when supported at its ends
and acted on only by its own weight?

Plotting geometric arrangements and forces acting on small
segments of the chain
Integrating the results




What is the shape of a hanging chain when
supported at its ends and acted on only by
its own weight?
• Galileo: “This chain will assume the form
of a parabola”
y = x^2
• But the shape is different:
y = (a / 2) (e^{x/a} + e^{-x/a})
which was established later by applying
calculus
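
A short worked expansion, added here as an illustration, suggests why the parabola was such a tempting guess. The formula above is a hyperbolic cosine, and its Taylor series around the lowest point of the chain starts with a parabola:

```latex
y = \frac{a}{2}\left(e^{x/a} + e^{-x/a}\right) = a\cosh\frac{x}{a}
  = a + \frac{x^{2}}{2a} + \frac{x^{4}}{24a^{3}} + \dots
```

When |x| is much smaller than a, the quartic term is negligible, so near the bottom the chain is very close to a parabola. The difference shows up only further away from the lowest point.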

“In 1669, Jungius disproved Galileo's claim that the curve of a chain
hanging under gravity would be a parabola (MacTutor Archive). The
curve is also called the alysoid and chainette. The equation was
obtained by Leibniz, Huygens, and Johann Bernoulli in 1691 in
response to a challenge by Jakob Bernoulli”.
http://mathworld.wolfram.com/Catenary.html
[Figures: Leibniz's solution is on the left. Huygens's illustration is on the right.]

“Plotting geometric arrangements and forces acting on small segments” evolved into
– Finite difference method
• In mathematics, finite-difference methods are numerical methods for approximating
the solutions to differential equations using finite difference equations to approximate
derivatives.
– Stencil
• In mathematics, especially the areas of numerical analysis concentrating on the
numerical solution of partial differential equations, a stencil is a geometric
arrangement of a nodal group that relates to the point of interest by using a numerical
approximation routine. Stencils are the basis for many algorithms to numerically
solve partial differential equations.
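
As a small illustration of both notions, the sketch below approximates a second derivative with the standard central three-point stencil, whose coefficients are (1, -2, 1) / h². The test function is the catenary a·cosh(x/a) with a = 2, whose exact second derivative at x = 0 is 1/a = 0.5. The function name and the step size are assumptions.

```python
import math

def second_derivative(f, x, h=1e-3):
    """Central three-point stencil: (f(x - h) - 2 f(x) + f(x + h)) / h**2."""
    return (f(x - h) - 2.0 * f(x) + f(x + h)) / h ** 2

def catenary(x, a=2.0):
    return a * math.cosh(x / a)

approx = second_derivative(catenary, 0.0)   # close to the exact value 0.5
```

The stencil uses only the node of interest and its two neighbors, which is the same locality that the recomputation step of spreading activation relies on.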


Numerical Solution NO Master Equation

“Integrating” evolved into …
– Well, in financial mathematics solutions are tuned on “stencils”.
Numerical solutions are known.
The master equation is not known,
and is not interesting to know.
“Master equation is not known” – this is ok.
– But we need to be aware of emergent effects in complex systems:
learning how to do something right on a small scale doesn’t necessarily imply that we’ll
do the right things on a bigger scale


Leibniz, Huygens, and Johann Bernoulli knew geometry and mechanics. We don't know the
“geometry” and “mechanics” of techno-social systems (and we don’t even know the “geometry”
and “mechanics” of semantic networks, social networks, …)
but we can create small “nodal arrangements” modeling multidimensional networks (for
instance, folksonomies),
apply known and novel numerical algorithms, and utilize state-of-the-art knowledge to decide
which algorithms provide better results.
The next step is to check whether the good properties of the numerical solutions on the micro-level hold
true on the meso-level

Source: Troussov at MITACS Workshop in Vancouver, Canada, 2010


Recommender systems and global/local ranking

Link analysis is frequently employed for ranking and navigation
Graph-based recommender systems should recommend
“Important” objects (nodes, links, subgraphs)
which are also
– Close enough to the initial points of interest (query, focus, initial seed)
(for instance, in physical space)
Global ranking ~ PageRank
Breadth first search (BFS) ? Local Ranking !?

Recommending a suitable restaurant near New York's 9th Avenue (next slide)
or the music you might like, the advertisement you should see, etc.


Graphics: http://strangemaps.wordpress.com/2007/02/07/72-the-world-as-seen-from-new-yorks-9th-avenue/


Global Ranking (like Google’s PageRank) –
a view on the network from external point - modern, “Copernican” approach

Source: NOAA


Local Ranking – is needed for recommenders – should rely on Ego-
centered Ptolemaic view (actually, Poly-Centered, see next slide)

LOCAL RANKING
Ego-centered or “personal” networks
provide Ptolemaic views of
networks from the perspective of the
persons (egos) at the centers of their
networks.

Graphics: http://strangemaps.wordpress.com/2007/02/07/72-the-world-as-seen-from-new-yorks-9th-avenue/


POLY-CENTRIC
In physical space, navigation
is from one point to another.
In applications to virtual spaces,
navigation is not simply
browsing from a single object
to another, but dealing with
several objects at the same
time.
For instance, to get better
results in Google we add
terms, we remove terms, …
To compute the recommendation
“Whom to invite to the meeting”,
one can start navigation from
two objects: the one representing the
user whom the recommendation is
for, and the meeting in question

Graph-based recommender systems should
recommend
“Important” objects (nodes)
which are also located
Close to the initial points of interest (query,
initial seed)
One of the leading approaches in recommenders is:
Results of Global Ranking (Link analysis)
are “filtered” according to their proximity to the query
In this paper we introduce novel algorithms which could
replace the two-step procedure mentioned above with one
step:
Local Ranking
which simultaneously computes proximity and importance
Web and Communities

Communities in Social Sciences: A tribe learning to survive, a group of engineers working on similar
problems, …
Communities in computer sciences - any empirically found group of people

“Recent advances in digital technologies invite consideration of organizing as a process that is
accomplished by global, flexible, adaptive, and ad hoc networks that can be created, maintained,
dissolved, and reconstituted with remarkable alacrity”.
Prof. N. Contractor


Community detection … but What is a Community?

Are you Russian? Yes. Are you Irish? Yes. Are you a mathematician? Yes. Are you a
practitioner? Yes.
– Communities easily overlap: multiple membership and fuzzy belonging
At the same time, some communities SHOULD be kept separate
– Remember “Strange Case of Dr Jekyll and Mr Hyde” (Robert Louis Stevenson, 1886).
• How Google failed to understand an essential property of real-world
social networks
• So by testing their social service inside a single context (Google employees only),
the developers failed to notice that in real life, people participate in multiple
contexts (family, work, friends, etc.) that they work actively to keep
separate. The reasons for wanting to keep these groups separate can range from
wanting to keep an illicit affair secret from your spouse to political activists in
oppressive regimes wanting to keep certain connections secret from the
government. Another important reason to keep our communities separate is that
we often play different roles and communicate differently.

http://www.iq.harvard.edu/blog/netgov/2010/03/worlds_colliding.html


New methods for community detection are needed

Multiple membership
– Are you Russian? Yes. Are you Irish? Yes. Are you a mathematician? Yes. Are you a
practitioner? Yes. …
Fuzzy belonging
– We don’t know the social structures behind on-line “communities”:
members of an on-line community don’t necessarily have the sense of identity that
members of real-life social communities have; on-line communities could be project teams or
networks of knowledge, …
High performance and scalability (agglomerative, local, …)
– Clustering as simple partitioning is ruled out because of multiple membership
– Clustering as partitioning is not possible in real time for many business applications
• IBM Intranet: 400K employees, 10K on-line communities (the biggest has 23K
members), ...
Contextualisation of Community Detection
– Collaborative filtering systems provide recommendations based on the detection of like-minded
users. But the user of a techno-social system whom the prediction is for could be
"Mathematician", "Irish", etc., or a kind of Dr. Jekyll / Mr. Hyde person (see next
slide).

An example of clustering around a node using propagation


Future work in local dynamic clustering

Troussov et al. (“Vectorised Spreading Activation”, 2010) theorize that the future development
of spreading activation (SA) methods might be driven by
“physics-inspired” and “logic-inspired” algorithms
– SA algorithms have roots in the numerical simulation of various physical phenomena,
particularly by finite difference methods.
– On the other hand, the iterative procedure of SA is essentially the same as the
procedure that determines the new state of a cell in cellular automata such as Conway’s
Game of Life. Although cellular automata usually operate on rectangular (cubic, etc.)
grids, the extension to arbitrary networks is feasible.
~ Marker propagation, MajorClust, Chinese whispers graph clustering algorithm, …
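
As an illustration of the extension to arbitrary networks, here is a small sketch of a Life-like automaton whose cells are the nodes of a graph rather than the squares of a grid. The birth and survival thresholds are borrowed from Conway's rules, which were designed for eight-neighbour grids, so treating them as sensible defaults on an arbitrary graph is an assumption; the function name and the toy graph are hypothetical.

```python
def life_step(graph, alive, birth=(3,), survive=(2, 3)):
    """One synchronous update of a Life-like automaton on an arbitrary graph.

    graph: dict mapping node -> list of neighbours.
    alive: set of nodes that are currently alive.
    Returns the set of nodes alive after the update.
    """
    new_alive = set()
    for node, neighbours in graph.items():
        # The new state depends only on the node's own state and on the
        # states of its neighbours, as in one spreading-activation iteration.
        live_count = sum(1 for nb in neighbours if nb in alive)
        if node in alive and live_count in survive:
            new_alive.add(node)
        elif node not in alive and live_count in birth:
            new_alive.add(node)
    return new_alive


# Hypothetical four-node network.
graph = {
    "c": ["x", "y", "z"],
    "x": ["c", "y"],
    "y": ["c", "x", "z"],
    "z": ["c", "y"],
}
print(life_step(graph, {"x", "y", "z"}))
```

The loop has the same shape as one iteration of spreading activation: retrieve a node's neighbours, read their states, compute the node's new state.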


Conway's Game of Life


Logic-inspired VSA

Finite difference approximations to differential equations were one of the precursors of cellular
automata (Stephen Wolfram, "A New Kind of Science") and of the method of spreading
activation (Troussov et al. 2009).
The iterative computational procedures in cellular automata are the same as in SA.
This identity of computational procedures makes it possible to develop VSA algorithms with hybrid
operations over the components of the activation vector.
– For instance, “physical” operations could be responsible for the propagation of
activation around the initial seeds; the level of activation indicates the relevance of
the nodes to the initial seeds.
– “Logical” operations could propagate markers, which indicate the potential membership of
nodes in clusters.

Such hybrid operations combine ranking with clustering, and are
computationally efficient on massive networks since the most time-consuming operation,
the retrieval of nodes, serves both “physical” and “logical” operations. The clustering does not
involve partitioning of the whole network.
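
A minimal sketch of such a hybrid operation, assuming a two-component state per node: a numeric activation (the "physical" part, summed along edges) and a set of markers (the "logical" part, united along edges). The function name, decay factor, and path-shaped toy graph are assumptions made for illustration and are not taken from the VSA papers.

```python
def hybrid_propagation(graph, seeds, decay=0.5, iterations=2):
    """Propagate (activation, markers) pairs from the seed nodes.

    graph: dict mapping node -> list of neighbours.
    seeds: dict mapping seed node -> marker label for its cluster.
    Returns dict mapping node -> (activation, set of markers received).
    """
    activation = {node: 1.0 for node in seeds}
    markers = {node: {label} for node, label in seeds.items()}
    frontier = dict(activation)
    for _ in range(iterations):
        next_frontier = {}
        # Snapshot the markers so the update is synchronous.
        snapshot = {node: set(markers[node]) for node in frontier}
        for node, value in frontier.items():
            neighbours = graph.get(node, [])  # one retrieval serves both parts
            if not neighbours:
                continue
            share = decay * value / len(neighbours)
            for nb in neighbours:
                # "Physical" operation: numeric activation is summed.
                next_frontier[nb] = next_frontier.get(nb, 0.0) + share
                # "Logical" operation: marker sets are united.
                markers.setdefault(nb, set()).update(snapshot[node])
        for node, value in next_frontier.items():
            activation[node] = activation.get(node, 0.0) + value
        frontier = next_frontier
    return {node: (activation[node], markers[node]) for node in activation}


# Hypothetical path a - b - c - d - e with one seed at each end.
graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
result = hybrid_propagation(graph, {"a": "A", "e": "E"})
for node in sorted(result):
    print(node, result[node])
```

Node "c" receives both markers, so it can be read as belonging to both clusters at once, and only the nodes reached by the propagation are ever touched; the rest of the network is not partitioned.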


VSA & Marker propagation – combining ranking with clustering

My University

An Expert

A topic
I’m interested in


VSA & Clustering (Cont.)


My University

An Expert

A topic
I’m interested in


Tasks / Methods

Terminology varies across domains (for instance, from the point of view of IM many tasks fall into the
category of hidden knowledge discovery)

Multidimensional network point of view (A.T.) | Techno-Social Systems: tasks | Networks Theory and Graph Theory terminology
Centralisation | Recommender Systems, PageRank etc., Expertise location | Random walks, Eigenvector centrality
Local topology | Recommender Systems, Link prediction | Motifs
Ad hoc generalisation across dimensions | Expertise location, Recommender Systems | Clustering


Tasks

Avenues to deep socio-semantic analytics and the possibility of high-quality
functionalities for techno-social systems (like recommending people to
invite into your social network) hinge on the availability of engines which are able
– to provide hidden knowledge discovery, such as
• the structural importance of nodes
• discovering a new relation in a network
(for instance, that based on the strength of multiple connectivity between the nodes
of a social network one can conclude
that Dr. Jekyll is related to Mr. Hyde),
– to provide ad hoc generalisation across dimensions,
• for instance, the ability to detect that a particular person might serve as a
representative of a community or as an expert on a particular topic (an example
of such generalisation is the expression frequently attributed to Louis XIV, "L'état,
c'est moi" (I am the State)).


“Three steps away” ?

John B. Axel P. Dan B. Tim B.

Why did the recommender decide
that this three-steps-away
connection is a strong
connection?

John and Tim –
The recommender computes that this is a
strong connection because there are
multiple ways in which they are connected

Shortest Path vs. Volume of traffic
Friends-of-Friends

Interest
Workplace


John and Derek

The recommender computes
that this type of
connectivity is a weak
connection
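
The contrast between shortest path and volume of traffic can be sketched as follows. Both pairs below are three steps apart, but one pair is joined by several independent routes. The measure used here (a decay-weighted count of simple paths), the helper names, and the intermediate nodes are assumptions chosen for illustration; the slides do not state which measure the recommender actually uses.

```python
def undirected(edges):
    """Build an adjacency dict from a list of undirected edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph


def connection_strength(graph, source, target, max_steps=3, decay=0.5):
    """Sum decay**length over all simple paths of at most max_steps edges."""
    def walk(node, visited, steps):
        if node == target:
            return decay ** steps
        if steps == max_steps:
            return 0.0
        return sum(
            walk(nb, visited | {nb}, steps + 1)
            for nb in graph.get(node, [])
            if nb not in visited
        )
    return walk(source, {source}, 0)


# Hypothetical networks: three parallel three-step routes versus a single one.
strong = undirected([
    ("John", "friend1"), ("friend1", "friend2"), ("friend2", "Tim"),
    ("John", "interest1"), ("interest1", "interest2"), ("interest2", "Tim"),
    ("John", "work1"), ("work1", "work2"), ("work2", "Tim"),
])
weak = undirected([("John", "x1"), ("x1", "x2"), ("x2", "Derek")])

print(connection_strength(strong, "John", "Tim"))
print(connection_strength(weak, "John", "Derek"))
```

The shortest path is three steps in both cases, yet the first pair scores three times higher because three distinct routes carry the connection.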


Tasks: Generalisation Across Domains - Whom is Claudia connected with?

All of these people

Dirk

Martin

Claudia Elaine Researcher

John

Hanna


Ranking

2

1 3


Ranking
3

1 2


Ranking

1 2

…
…


Nepomuk Recommender

NEPOMUK (Networked Environment for Personalized, Ontology-based Management of Unified
Knowledge) is an open-source software specification that is concerned with the development of a social
semantic desktop that enriches and interconnects data from different desktop applications using semantic
metadata stored as RDF.
Initially, it was developed in the EU 6th Framework integrated project Nepomuk (2006-2008): 17 million
euros, of which 11.5 million was funded by the European Union.


Nepomuk Recommender (Cont.)

Troussov et al., “Social Context as Machine Processable Knowledge”, presented the
architecture of the hybrid recommender system in the activity-centric environment Nepomuk-Simple
(EU 6th Framework Project NEPOMUK).
“Real” desktops usually have piles of things on them, where the users (consciously or
unconsciously) have grouped together items which are related to each other or to a task. The
so-called “Pile” UI used in Nepomuk-Simple imitates this type of data and metadata
organisation, which helps to avoid premature categorisation and reduces the retention of
useless documents.
Metadata describing the user data are stored in the Nepomuk personal information
management ontology (PIMO). Proper recommendations, such as recommendations of
additional items to add to the pile, should apparently be based on the PIMO and on the textual
content of the items in the pile. Although methods of natural language processing for
information retrieval could be useful, the most important type of textual processing is the kind
which relates concepts in the PIMO to the processed texts. Since the PIMO changes over
time, this type of natural language processing can’t be performed as preprocessing of all
textual content related to the user. Hybrid recommendation needs on-the-fly textual
processing with the ability to aggregate the current instantiation of the PIMO with the results of
textual processing.


Nepomuk

Representing and modeling this ontology as a multidimensional network makes it possible to augment
the ontology on the fly with new information, such as the “semantic” content of the textual
information in user documents. Recommendations in Nepomuk-Simple are computed on
the fly by graph-based methods operating on the unified multidimensional network of
concepts from the personal information management ontology, augmented with concepts
extracted from the documents pertaining to the activity in question.
Troussov et al. 2008 classify Nepomuk-Simple recommendations into two major types.
– The first type is the recommendation of additional items for the
pile when the user is working on an activity.
– The second type arises, for instance, when the user is browsing
the Web; Nepomuk-Simple can suggest that the current resource might be relevant to
one or more activities performed by the user.
In both cases there is a need to operate
with Clouds (fuzzy sets of PIMO nodes): Clouds describe the topicality of documents in
terms of the PIMO, and the pile itself is a Cloud.
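
One way to picture operating with Clouds is to store each Cloud as a mapping from PIMO node to a membership degree between 0 and 1 and to compare Clouds with standard fuzzy-set operations. The sketch below uses min for intersection, max for union, and their ratio (a fuzzy Jaccard index) as a similarity. These choices, the function names, and the example concepts are assumptions; the slides do not say how Nepomuk-Simple compares Clouds.

```python
def fuzzy_jaccard(cloud_a, cloud_b):
    """Similarity of two Clouds: fuzzy intersection size over fuzzy union size."""
    nodes = set(cloud_a) | set(cloud_b)
    inter = sum(min(cloud_a.get(n, 0.0), cloud_b.get(n, 0.0)) for n in nodes)
    union = sum(max(cloud_a.get(n, 0.0), cloud_b.get(n, 0.0)) for n in nodes)
    return inter / union if union else 0.0


def most_relevant_activity(page_cloud, piles):
    """Return the name of the pile whose Cloud is most similar to the page's Cloud."""
    return max(piles, key=lambda name: fuzzy_jaccard(page_cloud, piles[name]))


# Hypothetical Clouds: a web page being browsed and two activity piles.
page = {"graph": 0.8, "ranking": 0.4}
piles = {
    "CID": {"graph": 1.0, "ranking": 0.5, "meeting": 0.2},
    "Travel": {"flight": 1.0, "graph": 0.1},
}
print(most_relevant_activity(page, piles))
```

With these example memberships the page is matched to the "CID" pile, which corresponds to the second type of recommendation: the resource being browsed might be relevant to that activity.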


Pile UI


Nepomuk use case: activity management

A user starts to work on a new project, CID.
Using the Nepomuk SSD, she collects a “pile” of
resources she needs while working on the project
(MS-Word documents, contacts, etc.)
by dragging and dropping resources from her desktop and
by linking resources from e-mail (Mozilla
Thunderbird) and web browser (Firefox)
applications.


Nepomuk use case: activity management using IBM recommender
codenamed “Galaxy”

Galaxy (IBM hybrid recommender)
analyses the pile content and linkage
structure
as a multidimensional network of concepts
extracted from documents and links between
concepts, projects, project participants,
meetings, document authors, …
and provides handy recommendations of
resources she might need


Nepomuk use case: activity management

Galaxy can spot what the user might miss:
“This web page might be relevant to your CID
activity”


Thank you !

