Toward Personalized Query Expansion

                  Marin Bertier                                      Rachid Guerraoui                         Anne-Marie Kermarrec
           INSA de Rennes, France                                      EPFL, Switzerland                        INRIA Rennes, France
                                                                        Vincent Leroy
                                                                  INSA de Rennes, France



ABSTRACT

Social networking and tagging have taken off at an unexpected scale and speed, opening huge opportunities to enhance the user search experience. We present Gossple¹, a new, user-centric approach to improve the exploration of the Internet. Underlying Gossple lies the intuition that while social networks provide news from your old buddies, you can learn a lot more from people you don’t know, but with whom you share many (tagging) interests. More specifically, considering a collaborative tagging system with active taggers annotating content, Gossple expands the search query of any user u with tags that are considered “close” enough with respect to users that are “close” to u.

Gossple users create their own network of social acquaintances in a gossip-based manner, by dynamically computing an estimate of the distance between taggers, based on cosine similarity between tags and items. These connections are used to feed a TagMap: our central abstraction that captures the personalised relationships between tags. The TagMap is then used by Gossple to meaningfully expand queries leveraging the personalised network. This is achieved through the TagRank algorithm, an adaptation of the celebrated PageRank algorithm, which automatically determines which tags best expand a list of tags in a given query.

Gossple has no central authority: every user stores its own items and its tagging behaviour is stored only by its neighbours. The resulting networks are live, dynamic and do not require any underlying structure. We report on our evaluation of Gossple with CiteULike traces, involving 33,834 users. In short, we show that, with little information stored at every peer, Gossple makes it possible to retrieve items that cannot be retrieved with state-of-the-art search systems (completeness).

¹ This work is supported by the ERC Starting Grant 204742

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SNS’09, March 31, 2009 Nuremberg, Germany
Copyright 2009 ACM 978-1-60558-463-8 ...$5.00.

1.   MOTIVATION

The Web revolution.
   The Web has turned from a read-only infrastructure with passive participants into a read-write platform with active players. The content of the Web is no longer generated only by experts but pretty much by everyone (YouTube, Flickr, Last.fm, Delicious, etc.). Like any popular revolution, this goes through democratising the language: instead of subject indexing with a controlled vocabulary, freely chosen keywords are used to tag billions of items, e.g. URLs (Delicious). The user-generated taxonomy is called folksonomy (folk + taxonomy) and is used to label and share user-generated content (e.g. photographs), or to collaboratively label existing content (e.g. Web sites, books, or blog entries). Part of the appeal of a folksonomy is its inherent subversiveness: folksonomies can be seen as a rejection of the traditional search engine status quo in favour of tools that are created by the community. In theory, precisely because folksonomies develop Internet-mediated personalised environments, one could dynamically discover the tag sets of another user who tends to interpret and tag content in a similar manner. The result could be a rewarding gain in the user’s capacity to find related content, a practice known as “pivot browsing”.

Personalisation goes with decentralisation.
   While intriguing, this Web revolution is still in a preliminary stage, for at least two main reasons. First, most collaborative tagging networks are controlled by centralised systems. So as much as users are first class citizens of the system and are free to introduce new items and tag them in their own language, they are not free to choose where these items are stored and, more importantly, cannot usually freely decide to remove items and tags. In the long run, this might dissuade users from generating new content and expressing their tagging behaviour in an explicit manner. Furthermore, and no matter how powerful servers


can be, centralised solutions do not promote the maintenance of personalised relations between users, which might prove crucial in the search as we will discuss below. These relations grow exponentially with the size of the system and the success of social tagging might simply kill the underlying centralised infrastructure.

   Second, while the success of collaborative networks is clearly related to the freedom left to the users, this is also a drawback. The facts that such systems are not governed by specific structures (as opposed to ontologies for instance), and that tags are informally defined and continually changing, mean there is no assurance that the tagging behaviour of a user on some content makes any sense for another one, nor does it prevent junk tagging and synonyms, which introduce significant noise in the process. The reactivity offered by fully decentralised solutions may solve this issue.

Beyond friends: discovering similar users.
   We believe that the salvation can only come from pushing the revolution further. Basically, we argue for a fully user-centric approach where every participant stores and controls not only her own items and tagging behaviour, but also her perspective on what portion of the network is relevant to her own search. Every user query is then expanded with tags that are considered appropriate with respect to the personalised network of that user.

   To illustrate the motivation behind our approach, consider the following (real) example. After living for several years in the UK, Anne is back in Rennes in France and, to maintain her kids’ skills in English, is looking for an English-speaking student who would be willing to trade baby-sitting hours against accommodation. Given the high number of students in Rennes, there is no doubt that such an offer would be of interest to many English-speaking students. Anne’s Google request “English baby-sitter Rennes” does not give anything interesting, for baby-sitter is immediately associated with child minders or local (French) baby-sitting companies. Her Facebook buddies in Rennes or in the UK cannot really help either, for none has ever looked for an English-speaking baby-sitter in Rennes. Consider now Alice, living in Bordeaux after several years in the US, who is looking for a similar deal for her kids. Alice is however lucky to discover that teaching assistants in primary school are a very good match: they have the same working hours as kids, and they do have a salary but would enjoy living with a family. Now if Alice associates “english-speaking baby-sitter” with “teaching assistant” in her search request, she does indeed find very good candidates. Clearly, if Anne could reuse Alice’s discovery, she would also find good candidates in Rennes. Nevertheless, Alice and Anne do not know each other, nor do they live in the same area, nor even have similar jobs. Yet, their past history makes their links clear through the fact that they both lived in English-speaking countries, both have kids around the same age and both need baby-sitters. Should a system be able to make the connection between Alice and Anne, the association between tags “teaching assistant” and “baby-sitter” could be helpful. Therefore a mechanism that would expand Anne’s query “english-speaking baby-sitter” to “assistant etranger” or “teaching assistant” would render her request solvable by any search engine.

Contributions.
   The observation we drew from this example which, as we pointed out, is inspired by a real scenario, is that, in contrast to old buddies that do not bring much to the search, unknown people who share similar interests can do the job. Expanding a user’s query by identifying her connection with personalised acquaintances is not immediate, for this requires, within a huge dynamic system, maintaining implicit connections and deriving complementary tags on the fly for every query. This is the challenge addressed by Gossple. In short, Gossple automatically infers personalised connections between users and provides them with semantically related tags as companions to their queries.

   At the heart of Gossple lies the TagMap abstraction through which we capture the personalised relationship between tags. Every peer locally stores its TagMap which is fed by a (discovered) personal network. This network is dynamically created in a gossip-based manner, computing an estimate of the distance between taggers based on tagging behaviour. Cosine similarities on tags and items are used to create each user’s TagMap. A key feature of Gossple is to expand queries in a meaningful way, leveraging the TagMap. To this end, we use an algorithm called TagRank, so named because it is inspired by the PageRank algorithm, to extract the most relevant tags from the TagMap for a given request in order to expand the query.

   We report on our evaluation of Gossple with CiteULike traces, involving 33,834 users. In short, we show that, with little information stored at every peer, Gossple makes it possible to retrieve items that cannot be retrieved with state-of-the-art search systems (completeness) without hampering accuracy (increasing the number of false positives).

2.   GOSSPLE IN A NUTSHELL

System model.
   We consider a system composed of a set of users U. Users may tag a set of items I with tags from the set of tags T. The information space (IS) is defined as a set of triplets (u, i, t) ∈ U × I × T representing the relationships


defined by users between items and tags.

             Figure 1: Gossple overview

   The information space can be accessed by a set of functions defined as follows: FunctionName(parameters) returns a set of FunctionName for the fixed values of the parameters. For instance Item({u_1}, {t_1, t_2}) returns the set of items tagged by u_1 with t_1 or t_2. Similarly ItemTag({u_1}) returns the set of (i, t) such that (u_1, i, t) ∈ IS (u_1 tagged i with t). This represents the profile of a user.

   We consider that the users are connected through a connected network, typically through the use of a random peer sampling service [1].

Overview of Gossple.
   Query expansion definition: query expansion is a process that transforms a user query in order to improve the performance of the search engine. It involves transforming the query terms (correcting spelling and stemming words), adding new terms and weighting them. We will not consider spelling correction and stemming: since they usually rely on known local algorithms and dictionaries, they do not require knowledge of the personal network. Query expansion adds terms to the query, increasing the number of results given to the user. The query expansion has to be precise enough to add relevant documents to the result set while keeping the number of irrelevant documents low. This is a trade-off between recall and precision.

   The goal of Gossple is to discover users sharing similar interests (personalised network), and to gather their information in order to improve query expansion. Figure 1 presents an overview of Gossple. The first step is to identify the relevant users to form the personal network. This is achieved by relying on a distance metric between users, using the cosine similarity between sets of items. The personal network, much smaller than the whole network (typically 20 neighbours are enough in a 33,834-user system, as we show in the experiments), is used to build each user’s TagMap. The TagMap represents a personalised view of the relation between tags, as a distance between tags. The query expansion algorithm, called TagRank, exploits the user’s TagMap to expand a given query.

   In the sequel we describe how each user creates its TagMap, a matrix TM_u, capturing the relationships between tags, where TM_u[t_i, t_j] contains a value reflecting the relationship between tags t_i and t_j as seen by the user u. This is computed from the information of each user’s personal network. We then present the TagRank algorithm, exploiting the TagMap to expand the query on a per-query basis. Finally, we present the way each user discovers the neighbours to form its personal network in a fully decentralised way through gossip protocols.

3.    THE TAGMAP: A PERSONALISED VIEW OF THE RELATIONS BETWEEN TAGS

   In this section, we first present the metric used to compute the distance between users based on their tagging behaviour, namely cosine similarity between items².

3.1   Rating the users: items cosine similarity
   The TagMap captures the distance between tags; this is extracted automatically from the tagging behaviour. Detecting users sharing interests requires being able to compute a distance between users. The most natural metric to consider is the overlap between the items they tag. However, this simple metric suffers from several drawbacks. Users that have a very high tagging activity will exhibit a high overlap with any other user, while this does not reflect any specific interest. In addition, with a relevant metric the proximity between users should not only increase when interests are similar, but also decrease if many other interests are not shared.

   Instead we use a well-known metric from data mining, namely the cosine similarity between items. This can be seen as a normalised overlap. Items are represented as vectors in a multidimensional space, the number of dimensions being |I|.

   More formally, the cosine between two vectors of items is defined as follows:

   \[ \cos(v_1, v_2) = \frac{v_1 \cdot v_2}{\|v_1\| \times \|v_2\|} \]

   The item cosine is defined as follows:

   \[ ItemCos(u_1, u_2) = \frac{|Item(\{u_1\}) \cap Item(\{u_2\})|}{\sqrt{|Item(\{u_1\})| \times |Item(\{u_2\})|}} \]

   The score between two users increases when interests are shared and decreases when they are not.

3.2   Creating the TagMap
   The distance between users is used by users to create their personal network, so that the information about tags collected from a user u_i is incorporated in the TagMap of user u_j depending on its distance to u_i. We consider that each user has a personalised network; we will come back to the discovery of such a personalised network in the next section. Neighbours(u) is defined as the set of users in the personalised network of u.

² Note that there are many metrics that could be used; we chose this one for the purpose of comparison with related works.
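The access functions of the system model and the item cosine of Section 3.1 can be illustrated with a minimal Python sketch. The toy triplets, the user names and the function names `items`, `item_tags` and `item_cos` are assumptions made for illustration; the paper does not prescribe an implementation.

```python
from math import sqrt

# Toy information space: a set of (user, item, tag) triplets.
# All names and data here are illustrative assumptions, not taken from the paper.
IS = {
    ("anne", "doc1", "baby-sitter"),
    ("anne", "doc2", "english"),
    ("alice", "doc1", "teaching-assistant"),
    ("alice", "doc2", "english"),
    ("alice", "doc3", "bordeaux"),
    ("bob", "doc4", "football"),
}


def items(users, tags=None):
    """Item(users, tags): items tagged by any of `users`, with any of `tags` if given."""
    return {i for (u, i, t) in IS if u in users and (tags is None or t in tags)}


def item_tags(users):
    """ItemTag(users): the (item, tag) pairs of the given users, i.e. their profile."""
    return {(i, t) for (u, i, t) in IS if u in users}


def item_cos(u1, u2):
    """ItemCos(u1, u2) = |Item(u1) & Item(u2)| / sqrt(|Item(u1)| * |Item(u2)|)."""
    a, b = items({u1}), items({u2})
    if not a or not b:
        return 0.0  # assumption: a user without items is similar to nobody
    return len(a & b) / sqrt(len(a) * len(b))


print(item_cos("anne", "alice"))  # two shared items out of 2 and 3
print(item_cos("anne", "bob"))    # no shared item
```

Because the denominator grows with the number of items each user has tagged, a very active tagger does not automatically look close to everybody, which is the normalisation effect described above.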


Again there are many ways to use the information provided by the neighbours’ profiles to fill the TagMap, depending on whether the query expansion should rely on a dictionary of synonyms or a hierarchical relationship [2]. For space reasons, we focus on synonyms in this paper.

   The information needed to fill the TagMap is, for each tag, the number of occurrences of that tag per item, namely: for all t ∈ Tag(Neighbours(u)), a vector V_t of dimension |I| is maintained such that V_t[item_i] = |User(Neighbours(u), {item_i}, {t})|, namely the number of times the item item_i has been tagged with t by the neighbours of u. The TagMap is then filled as follows:

   \[ TM[t_i, t_j] = \cos(V_{t_i}, V_{t_j}) \]

4.   THE TAGRANK ALGORITHM: PERSONALISED QUERY EXPANSION

   The TagMap represents the personalised relationships between pairs of tags to be used to expand queries. A straightforward solution, used in [3], to exploit the TagMap directly is to consider only tags close to the tags of the query. This is an issue because the items suffer from a high sparsity: as there is a very large number of items, relationships between tags are sometimes hidden and can hardly be discovered. Consider for example a query on t_1, where the TagMap provides a link between t_1 and t_2 (based on a set of items). Consider now that t_2 and t_3 are also close in the same TagMap (based on a different set of items): this straightforward solution will never discover a link between t_1 and t_3.

   By iterating on the set of added tags, more relevant tags could be added to the query. To this end, we designed an algorithm called TagRank, inspired by PageRank [4]. The TagMap is represented as a graph in which all the tags in the TagMap are vertices. They are connected by weighted edges so that weight(t_i, t_j) = TagMap(t_i, t_j) and weight(t_i, t_i) = 1³. In PageRank, a random surfer walks in a graph of Web pages. The importance of each page is the probability of the surfer being on that page at any time. At each step of the walk, the surfer either follows a link on the page or moves to a page chosen uniformly at random in the whole graph. In TagRank, the transition probability from one tag to another depends on the edge weight:

   \[ TransitionProbability(t_1, t_2) = \frac{TagMap[t_1, t_2]}{\sum_{t \in T} TagMap[t_1, t]} \]

³ This is directly inferred from the metric based on cosine of vectors of items.

   The original PageRank algorithm computes a score for each vertex, but that score only depends on the structure of the graph, not on a user query. As in personalised versions of PageRank, we modify the set of vertices the surfer can move to at random and limit it to the tags of the query. Therefore, the score computed is biased by the query, the query tags being the ones that spread importance into the graph. Calculating exact TagRank scores in a big graph can be a long process. Since this process is repeated at each query, this might be an issue in the long run. Therefore, we use an algorithm from [5] in order to provide a more efficient approach. For each query, the computation is split in order to compute partial scores for each tag in the query. At the end, all the partial scores are added to get the TagRank score of each tag. This saves a lot of processing time. The partial scores are approximated through random walks.

   TagRank(query, TagMap) outputs the list of all the tags in the TagMap, each associated with a weight. Since each weight is a probability, they sum up to one. The expanded query consists of the original query, plus additional terms chosen by descending weight. The system can either use the top-k extra tags, or add enough tags to “capture” a given amount of the weight.

5.   CREATING THE PERSONAL NETWORK

   Our algorithm is based on profile proximity between users. As presented in Section 3.2, the TagMap of a user is created from the profiles of the users who belong to her personalised network. The aim of the personal network is thus to connect a user with her k closest users according to the metric presented in Section 3.1. k represents the trade-off between the amount of available information and the personalisation degree of this information (in other words, its quality).

   We assume that the users are connected through an unstructured overlay implemented by a peer sampling service [1]. Basically, each user is provided with a (changing) random sample of the network (a view of, say, 20 random users). This protocol ensures that the network is connected and that new relevant users may be discovered when maintaining the personal network.

   The creation of the personal network is achieved through a clustering gossip protocol. To this end each user maintains a view of k neighbours forming its personalised network. Starting from a random sample (typically provided by the underlying peer sampling service), this network is refined as follows. Periodically, a user contacts another user from her neighbours to exchange neighbours. When a user receives new neighbours upon a gossip interaction, it keeps, from its own neighbours and the discovered ones, the k closest according to the metric defined in Section 3.1. This process is iterated and converges in a few cycles [6]. The TagMap of each user is then built from the profiles of those k users, forming the personal network.

   In order to reduce the message size, users exchange a Bloom filter representing a hash of their item vectors instead of the whole profile. The Bloom filter provides a reasonably good approximation of the user profile that can be used to compute the cosine similarity with a small error margin. If the value of the cosine between the user’s vector and the one inferred from the Bloom filter is high enough, the users are considered close enough and the entire profile is then exchanged. Otherwise, there is no further exchange. This avoids the transfer of useless entire profiles.
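The neighbour refinement step of the clustering gossip protocol can be sketched as follows. This is a simplified illustration under stated assumptions: profiles are plain item sets available to everybody (no Bloom filter pre-check), the peer sampling layer is left out, and the names `merge_views` and `gossip_round`, the toy profiles and the fixed random seed are all hypothetical.

```python
import random
from math import sqrt


def item_cos(a, b):
    """Cosine similarity between two item sets (the metric of Section 3.1)."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def merge_views(me, my_view, received, profiles, k):
    """Keep the k users closest to `me` among current and received neighbours."""
    candidates = (set(my_view) | set(received)) - {me}
    # Sort by decreasing similarity; break ties by name to stay deterministic.
    ranked = sorted(candidates,
                    key=lambda v: (-item_cos(profiles[me], profiles[v]), v))
    return ranked[:k]


def gossip_round(views, profiles, k, rng):
    """One cycle: each user swaps views with one neighbour and both sides merge."""
    for user in sorted(views):
        partner = rng.choice(views[user])
        sent_by_user = views[user] + [user]
        sent_by_partner = views[partner] + [partner]
        views[user] = merge_views(user, views[user], sent_by_partner, profiles, k)
        views[partner] = merge_views(partner, views[partner], sent_by_user, profiles, k)


# Toy data (assumption): two interest groups, initial views mix both groups.
profiles = {
    "a1": {1, 2, 3}, "a2": {1, 2, 4}, "a3": {2, 3, 4},
    "b1": {10, 11, 12}, "b2": {10, 11, 13}, "b3": {11, 12, 13},
}
views = {
    "a1": ["b1", "b2"], "a2": ["a1", "b3"], "a3": ["a2", "b1"],
    "b1": ["a3", "b2"], "b2": ["b3", "a1"], "b3": ["a2", "b1"],
}
rng = random.Random(0)
for _ in range(5):
    gossip_round(views, profiles, k=2, rng=rng)
print(views)
```

Since each merge keeps the best k out of a superset of the current view, the total similarity of a view never decreases from one cycle to the next, which is what drives the refinement.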




6.   EVALUATION

   In this section, we present preliminary experimental results. We run experiments using the CiteULike dataset of 2008-10-09: |U| = 33,834, |I| = 1,134,167, |T| = 237,450, |IS| = 4,064,310. We build a profile for each user u ∈ U.

   Figure 2: recall performance evaluation. The plot shows recall (y-axis, 0.45 to 0.8) against query expansion size (x-axis, 0 to 50) for three configurations: personalized TagMap + TagRank, personalized TagMap + one step query expansion, global TagMap + one step query expansion.
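To make the configurations compared in Figure 2 concrete, the following minimal sketch builds a TagMap from neighbour triplets (Section 3.2) and contrasts one-step expansion with a TagRank-style expansion (Section 4) on the t1, t2, t3 example. It is an illustration under assumptions: the scores are computed by plain power iteration rather than by the random-walk approximation of [5], the damping value 0.85 is not given in the paper, query tags are assumed to be present in the TagMap, and all function names and data are hypothetical.

```python
from math import sqrt


def build_tagmap(triples):
    """TagMap[ti][tj] = cosine of the per-item count vectors V_ti and V_tj."""
    vectors = {}  # tag -> {item: number of neighbours who tagged item with tag}
    for _user, item, tag in triples:
        vectors.setdefault(tag, {})
        vectors[tag][item] = vectors[tag].get(item, 0) + 1

    def cos(v, w):
        dot = sum(x * w.get(i, 0) for i, x in v.items())
        return dot / (sqrt(sum(x * x for x in v.values())) *
                      sqrt(sum(x * x for x in w.values())))

    return {a: {b: cos(va, vb) for b, vb in vectors.items()}
            for a, va in vectors.items()}


def one_step(tagmap, query, k):
    """Baseline: keep tags with a direct, non-zero TagMap link to a query tag."""
    scores = {t: max(tagmap[q][t] for q in query)
              for t in tagmap if t not in query}
    ranked = sorted((t for t in scores if scores[t] > 0),
                    key=lambda t: (-scores[t], t))
    return ranked[:k]


def tagrank(tagmap, query, damping=0.85, iterations=50):
    """Personalised PageRank by power iteration; random jumps land on query tags only."""
    tags = sorted(tagmap)
    score = {t: (1.0 / len(query) if t in query else 0.0) for t in tags}
    for _ in range(iterations):
        nxt = {t: ((1 - damping) / len(query) if t in query else 0.0)
               for t in tags}
        for src in tags:
            total = sum(tagmap[src].values())  # includes the self-loop of weight 1
            for dst, w in tagmap[src].items():
                nxt[dst] += damping * score[src] * w / total
        score = nxt
    return score


def expand(tagmap, query, k):
    """Top-k non-query tags by descending TagRank score (zero scores dropped)."""
    score = tagrank(tagmap, query)
    ranked = sorted((t for t in score if t not in query and score[t] > 0),
                    key=lambda t: (-score[t], t))
    return ranked[:k]


# Toy neighbour triplets (assumption): t1-t2 share item A, t2-t3 share item B.
triples = [("n1", "A", "t1"), ("n1", "A", "t2"),
           ("n2", "B", "t2"), ("n2", "B", "t3"),
           ("n3", "C", "t4")]
tagmap = build_tagmap(triples)
print(one_step(tagmap, ["t1"], 3))  # only the directly linked tag
print(expand(tagmap, ["t1"], 3))    # also reaches t3 through t2
```

The one-step baseline stops at t2 because t1 and t3 share no item, whereas the walk propagates weight from t1 through t2 to t3, while the unrelated tag t4 receives no weight.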
                                                                    of TagRank over a simple query expansion mechanism.
Workload.
                                                                       Figure 2 shows the results of our simulations. In all
   To evaluate our algorithm, we generate queries to ex-
                                                                    cases, a query expansion size of 0 gives a recall of 47%.
pand. After the query expansion, we launch the query
                                                                    That means that in 47% of cases, when the item has
to build a result set. The result set contains the items
                                                                    been tagged by at least 2 users, at least one other
which match the query’s tags. We generate a query for
                                                                    user has used one tag in common. In all the other cases,
each item i ∈ Item({u}) such that |User({i})| > 1 (an
                                                                    the system has to rely on the query expansion process
item has to be tagged by at least 2 users). For an item
                                                                    to add relevant tags to the query and improve the recall
i, we choose a user u and we use the tags used by u
                                                                    rate.
on the item i to fill the query. As u will launch the
                                                                       We observe that the personalised TagMap performs
query, we delete from the Information Space, the in-
                                                                    a lot better than the global TagMap, with on average
formation used by the query generation (IS − (u, i, t),
                                                                    8% more recall. This shows first that the personal net-
t ∈ T ag({u} , {i})).
                                                                    work is effective to personalise in a meaningful way
   The query succeeds when i is in the result set. The
                                                                    the TagMap and generate a substantially more accu-
query goes through the query expansion process and we
                                                                    rate query expansion. Second, it shows that only a
modify the number of tags added to the query in order
                                                                    small portion of the network is required to personalise
to evaluate the impact on the recall, which in that case
                                                                    in an effective manner the TagMap. The TagMap con-
is the proportion of items found using the tags that were
                                                                    tains much less information, but since this information
assigned to them by a given user.
                                                                    is centred on the user, the tags added through the query
                                                                    expansion are more relevant.
Settings.
                                                                       Finally, we observe that TagRank also contributes to
   To evaluate our approach we run the following exper-
                                                                    improving the quality of the results, especially when it
iments on the same trace.
                                                                    comes to producing a long query expansion. The recall
   Global TagMap, simple query expansion: a global
                                                                    is improved by up to 4% with a query expansion size of
TagMap is built based on the same metric, namely co-
                                                                    50. This experiment demonstrates the limits of the one
sine of item vectors. The distance between tags is there-
                                                                    step distance when using the TagMap. The sparsity of
fore not personalised as it takes into account the infor-
                                                                    the information in folksonomies limits the number of re-
mation of all users. The query expansion is the simple
                                                                    lated tags that can be found. Since TagRank distributes
one considered before, used in [3] considering only the
                                                                    weight in the whole graph, it can find tags that seem
tags related to the query tags. This is typically repre-
                                                                    not related to the query but are still relevant.
sentative of a centralised approach, where personalised
TagMap are too space intensive to maintain.
                                                                    7.      RELATED WORK & CONCLUDING RE-
   Personalised TagMap, simple query expansion: the
TagMap is personalised, based on the profile of the                          MARKS
(k = 20) closest neighbours. The simple query expan-                  Collaborative social tagging schemes have received a
sion mechanism is used here. The goal is to evaluate                growing attention, they provide a huge potential for dis-
the impact of the TagMap personalisation.                           covering new information through implicit connections.
   Personalised TagMap, TagRank based query expan-                  In this paper, we presented the query expansion fea-
sion: the TagMap is personalised, based on the profile               ture of Gossple, a user centric system to discover and
of the (k = 20) closest neighbours. TagRank is used to              maintain such acquaintances. The Gossple query ex-
expand the queries. This enables to evaluate the benefit             pansion mechanism improves the completeness of the


                                                                5
search queries over state of the art alternatives without hampering the search accuracy.
This is achieved with little information maintained at each peer, in the form of the
TagMap. Interestingly, each peer discovers its personal network and locally stores the
relationships between tags that are relevant to itself, and its tagging behaviour is only
recorded by its personal neighbours. The TagMap is exploited by an original TagRank
algorithm to expand queries effectively.
   Recently, several centralised systems have addressed the personalisation of search in
the context of folksonomies. These approaches mostly focus on top-k processing. In [7],
the authors investigate network-aware top-k processing. They show that full
personalisation, which would require maintaining data structures (typically inverted
lists) on a per-user basis, is too space intensive. Instead, the proposed algorithms rely
on maintaining such data structures per cluster of socially related users and adapt the
traditional centralised top-k algorithms to that setting.
   In [3], a centralised system proposing both query expansion and top-k processing also
relies on tag association. Yet the tag association is not personalised, nor is the system
decentralised. The personalisation is addressed only in the top-k processing. One of the
main reasons is that a personalised association between tags is too space intensive in a
centralised system. Our experiments showed the benefit of the Gossple personalised query
expansion over this approach.
   In [8], several types of social decentralised routing strategies are considered.
Although they do not deal with query expansion, they confirm our intuition that explicit
social connections à la Facebook are useless for many requests. Instead, they show that
semantic routing (contacting neighbours chosen according to the content of the request)
or spiritual routing (contacting neighbours having behavioural affinity) provides the
best results.
   In [9], the authors explore different ways of providing personalised query expansion.
They show that adding information extracted from the user's profile can help increase the
rank of a relevant document in the result list. Their approach is based on using the
user's profile only, while our algorithms take advantage of the knowledge of the other
users in the system. Therefore, our approach is more relevant for discovering new tags
and increasing the recall of the requests.
   Many centralised search engines provide non-personalised query expansion. They add to
the query synonyms and related concepts found in a taxonomy, in order to improve the
quality of the results. They rely on hand-generated information like Yahoo! Directory
(http://dir.yahoo.com/), Wordnet (http://wordnet.princeton.edu/) or the Open Directory
Project (http://www.dmoz.org/). The main difference with our system is that this data is
neutral and objective, while our system aims at a subjective, user-related query
expansion. Furthermore, our system is able to directly extract the knowledge from the
information space, while those information sources need to be maintained by users.
Although we limited our approach to adding synonyms in this paper, Gossple can determine
different kinds of relations between tags using the same approach, and is also able to
build a biased taxonomy that reflects the interests of the user.
   We believe that the way to personalise Internet search, in a world where users are
free to express their opinions and interests, goes with a fully decentralised system. To
the best of our knowledge, we are the first to propose a personalised query expansion in
a fully decentralised manner. We foresee many perspectives for this work, such as
leveraging the TagMap for recommendation systems and addressing dynamic networks.

8.   ACKNOWLEDGEMENT
   We warmly thank Vivien Quéma for his help at early stages of the work.

9.   REFERENCES
[1] M. Jelasity, R. Guerraoui, A.-M. Kermarrec, and M. van Steen. The peer sampling
    service: experimental evaluation of unstructured gossip-based implementations. In
    Middleware '04: Proceedings of the 5th ACM/IFIP/USENIX international conference on
    Middleware, pages 79–98, New York, NY, USA, 2004. Springer-Verlag New York, Inc.
[2] C. Cattuto, D. Benz, A. Hotho, and G. Stumme. Semantic analysis of tag similarity
    measures in collaborative tagging systems. In Proceedings of the 3rd Workshop on
    Ontology Learning and Population (OLP3), pages 39–43, July 2008.
    ISBN 978-960-89282-6-8.
[3] V. Zanardi and L. Capra. Social ranking: uncovering relevant content using tag-based
    recommender systems. In Proceedings of the 2008 ACM conference on Recommender
    systems, pages 51–58, 2008.
[4] S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine.
    Computer Networks and ISDN Systems, 30:107–117, 1998.
[5] D. Fogaras, B. Rácz, K. Csalogány, and T. Sarlós. Towards scaling fully personalized
    pagerank: Algorithms, lower bounds, and experiments. Internet Mathematics, pages
    333–358, 2005.
[6] M. Jelasity and O. Babaoglu. T-Man: Gossip-Based Overlay Topology Management.
    Engineering Self-Organising Systems, 3910:1–15, 2006.
[7] S. Amer-Yahia, M. Benedikt, L. Lakshmanan, and J. Stoyanovich. Efficient network
    aware search in collaborative tagging sites. Proc. VLDB Endow., pages 710–721, 2008.
[8] M. Bender, T. Crecelius, M. Kacimi, S. Miche, J. Xavier Parreira, and G. Weikum.
    Peer-to-peer information search: Semantic, social, or spiritual? Bulletin of the IEEE
    Computer Society Technical Committee on Data Engineering, 2007.
[9] M. Carman, M. Baillie, and F. Crestani. Tag data and personalized information
    retrieval. In SSM '08: Proceeding of the 2008 ACM workshop on Search in social media,
    pages 27–34, New York, NY, USA, 2008. ACM.
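
As an illustration of the evaluation protocol used in the experiments (a query built from the tags a user assigned to an item, removal of those annotations from the Information Space, recall measured as the expansion size grows), the following is a minimal Python sketch. It covers only the global TagMap with simple query expansion and is not the authors' implementation. The function names (build_tagmap, expand, recall), the use of per-item annotation counts as tag vectors, the rule that an item is in the result set as soon as one remaining annotation on it matches a tag of the expanded query, and the choice to issue one query per (user, item) pair are all assumptions made for this example.

```python
from collections import defaultdict
from math import sqrt


def build_tagmap(triples):
    """Cosine similarity between tags; each tag is a vector of per-item counts (assumption)."""
    vec = defaultdict(lambda: defaultdict(int))
    for _user, item, tag in triples:
        vec[tag][item] += 1
    norm = {t: sqrt(sum(c * c for c in v.values())) for t, v in vec.items()}
    tagmap = defaultdict(dict)
    tags = list(vec)
    for a in tags:
        for b in tags:
            if a == b:
                continue
            dot = sum(c * vec[b].get(i, 0) for i, c in vec[a].items())
            if dot:
                tagmap[a][b] = dot / (norm[a] * norm[b])
    return tagmap


def expand(query_tags, tagmap, size):
    """Simple expansion: add the `size` tags most similar to any query tag."""
    scores = defaultdict(float)
    for t in query_tags:
        for other, s in tagmap.get(t, {}).items():
            if other not in query_tags:
                scores[other] = max(scores[other], s)
    extra = sorted(scores, key=lambda t: (-scores[t], t))[:size]
    return set(query_tags) | set(extra)


def recall(triples, expansion_size):
    """Fraction of generated queries whose target item is in the result set."""
    triples = list(triples)
    taggers = defaultdict(set)
    for u, i, _ in triples:
        taggers[i].add(u)
    hits = total = 0
    # Hypothetical choice: one query per (user, item) pair of the trace.
    for u, i in sorted({(u, i) for u, i, _ in triples}):
        if len(taggers[i]) < 2:  # the item must be tagged by at least 2 users
            continue
        query = {t for v, j, t in triples if v == u and j == i}
        # Remove the annotations used to build the query from the information space.
        rest = [(v, j, t) for v, j, t in triples if not (v == u and j == i)]
        expanded = expand(query, build_tagmap(rest), expansion_size)
        # Assumed result-set rule: some remaining annotation on i matches the query.
        found = any(j == i and t in expanded for _, j, t in rest)
        hits += found
        total += 1
    return hits / total if total else 0.0
```

Calling recall(trace, n) for increasing n on a list of (user, item, tag) triples gives a recall curve over the expansion size, in the spirit of the experiment. A personalised variant would, under the same assumptions, call build_tagmap on the annotations of the querying user's closest neighbours only.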


YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITYISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
ISYU TUNGKOL SA SEKSWLADIDA (ISSUE ABOUT SEXUALITY
 

paper

Toward Personalized Query Expansion

Marin Bertier (INSA de Rennes, France), Rachid Guerraoui (EPFL, Switzerland), Anne-Marie Kermarrec (INRIA Rennes, France), Vincent Leroy (INSA de Rennes, France)

ABSTRACT

Social networking and tagging have taken off at an unexpected scale and speed, opening huge opportunities to enhance the user search experience. We present Gossple¹, a new, user-centric approach to improve the exploration of the Internet. Underlying Gossple lies the intuition that while social networks provide news from your old buddies, you can learn a lot more from people you don’t know, but with whom you share many (tagging) interests. More specifically, considering a collaborative tagging system with active taggers annotating content, Gossple expands the search query of any user u with tags that are considered “close” enough with respect to users that are “close” to u.

Gossple users create their own network of social acquaintances in a gossip-based manner, by dynamically computing the estimation of a distance between taggers, based on cosine similarity between tags and items. These connections are used to feed a TagMap: our central abstraction that captures the personalised relationships between tags. The TagMap is then used by Gossple to meaningfully expand queries leveraging the personalised network. This is achieved through the TagRank algorithm, an adaptation of the celebrated PageRank algorithm, which automatically determines which tags best expand a list of tags in a given query.

Gossple has no central authority: every user stores its own items and its tagging behaviour is stored only by its neighbours. The resulting networks are live, dynamic and do not require any underlying structure. We report on our evaluation of Gossple with CiteULike traces, involving 33,834 users. In short, we show that, with little information stored at every peer, Gossple makes it possible to retrieve items that cannot be retrieved with state of the art search systems (completeness).

¹ This work is supported by the ERC Starting Grant 204742.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SNS’09, March 31, 2009, Nuremberg, Germany. Copyright 2009 ACM 978-1-60558-463-8 ...$5.00.

1. MOTIVATION

The Web revolution.

The Web has turned from a read-only infrastructure with passive participants into a read-write platform with active players. The content of the Web is no longer generated only by experts but pretty much by everyone (YouTube, Flickr, Last.fm, Delicious, etc.). Like any popular revolution, this goes through democratising the language: instead of subject indexing with a controlled vocabulary, freely chosen keywords are used to tag billions of items, e.g. URLs (Delicious). The user-generated taxonomy is called a folksonomy (folk + taxonomy) and is used to label and share user-generated content (e.g. photographs), or to collaboratively label existing content (e.g. Web sites, books, or blog entries). Part of the appeal of a folksonomy is its inherent subversiveness: folksonomies can be seen as a rejection of the traditional search engine status quo in favour of tools that are created by the community. In theory, precisely because folksonomies develop Internet-mediated personalised environments, one could dynamically discover the tag sets of another user who tends to interpret and tag content in a similar manner. The result could be a rewarding gain in the user’s capacity to find related content, a practice known as “pivot browsing”.

Personalisation goes with decentralisation.

While intriguing, this Web revolution is still in a preliminary stage, for at least two main reasons. First, most collaborative tagging networks are controlled by centralised systems. So as much as users are first class citizens of the system and are free to introduce new items and tag them in their own language, they are not free to choose where these items are stored and, more importantly, cannot usually freely decide to remove items and tags. In the long run, this might dissuade users from generating new content and expressing their tagging behaviour in an explicit manner. Furthermore, and no matter how powerful servers
can be, centralised solutions do not promote the maintenance of personalised relations between users, which might prove crucial in the search, as we discuss below. These relations grow exponentially with the size of the system, and the success of social tagging might simply kill the underlying centralised infrastructure.

Second, while the success of collaborative networks is clearly related to the freedom left to the users, this is also a drawback. The facts that such systems are not governed by specific structures (as opposed to ontologies, for instance), and that tags are informally defined and continually changing, mean there is no assurance that the tagging behaviour of a user on some content makes any sense for another one, nor does it prevent junk tagging and synonyms, which introduce significant noise in the process. The reactivity offered by fully decentralised solutions may solve this issue.

Beyond friends: discovering similar users.

We believe that the salvation can only come from pushing the revolution further. Basically, we argue for a fully user-centric approach where every participant stores and controls not only her own items and tagging behaviour, but also her perspective on what portion of the network is relevant to her own search. Every user query is then expanded with tags that are considered appropriate with respect to the personalised network of that user.

To illustrate the motivation behind our approach, consider the following (real) example. After living for several years in the UK, Anne is back in Rennes, France and, to maintain her kids’ skills in English, is looking for an English-speaking student who would be willing to trade baby-sitting hours against accommodation. Given the high number of students in Rennes, there is no doubt that such an offer would be of interest to many English-speaking students. Anne’s Google request “English baby-sitter Rennes” does not give anything interesting, for baby-sitter is immediately associated with child minders or local (French) baby-sitting companies. Her Facebook buddies in Rennes or in the UK cannot really help either, for none has ever looked for an English-speaking baby-sitter in Rennes. Consider now Alice, living in Bordeaux after several years in the US, who is looking for a similar deal for her kids. Alice is however lucky to discover that teaching assistants in primary school are a very good match: they have the same working hours as kids, and they do have a salary but would enjoy living with a family. Now if Alice associates “english-speaking baby-sitter” with “teaching assistant” in her search request, she does indeed find very good candidates. Clearly, if Anne could reuse Alice’s discovery, she would also find good candidates in Rennes. Nevertheless, Alice and Anne do not know each other, nor do they live in the same area, nor even have similar jobs. Yet, their past history makes their links clear: they both lived in English-speaking countries, and both have kids around the same age and need baby-sitters. Should a system be able to make the connection between Alice and Anne, the association between the tags “teaching assistant” and “baby-sitter” could be helpful. Therefore a mechanism that would expand Anne’s query “english-speaking baby-sitter” to “assistant etranger” or “teaching assistant” would render her request solvable by any search engine.

Contributions.

The observation we drew from this example which, as we pointed out, is inspired by a real scenario, is that, in contrast to old buddies that do not bring much to the search, unknown people who share similar interests can do the job. Expanding a user’s query by identifying her connection with personalised acquaintances is not immediate, for this requires, within a huge dynamic system, maintaining implicit connections and deriving complementary tags on the fly for every query. This is the challenge addressed by Gossple. In short, Gossple automatically infers personalised connections between users and provides them with semantically related tags as companions to their queries.

At the heart of Gossple lies the TagMap abstraction, through which we capture the personalised relationship between tags. Every peer locally stores its TagMap, which is fed by a (discovered) personal network. This network is dynamically created in a gossip-based manner by computing the estimation of a distance between taggers, based on tagging behaviour. Cosine similarities on tags and items are used to create each user’s TagMap. A key feature of Gossple is to expand queries in a meaningful way, leveraging the TagMap. To this end, we use an algorithm called TagRank, for it is inspired by the PageRank algorithm, to extract the most relevant tags from the TagMap for a given request in order to expand the query.

We report on our evaluation of Gossple with CiteULike traces, involving 33,834 users. In short, we show that, with little information stored at every peer, Gossple makes it possible to retrieve items that cannot be retrieved with state of the art search systems (completeness) without hampering accuracy (i.e. without increasing the number of false positives).

2. GOSSPLE IN A NUTSHELL

System model.

We consider a system composed of a set of users U. Users may tag a set of items I with tags from the set of tags T. The information space (IS) is defined as a set of triplets (u, i, t) ∈ U × I × T representing the relationships
defined by users between items and tags.

The information space can be accessed by a set of functions defined as follows: FunctionName(parameters) returns a set of FunctionName for the fixed values of the parameters. For instance, Item({u1}, {t1, t2}) returns the set of items tagged by u1 with t1 or t2. Similarly, ItemTag({u1}) returns the set of (i, t) such that (u1, i, t) ∈ IS (u1 tagged i with t). This represents the profile of a user.

We consider that the users are connected through a connected network, typically through the use of a random peer sampling service [1].

Overview of Gossple.

Query expansion definition: query expansion is a process that transforms a user query in order to improve the performance of the search engine. It involves transforming the query terms (correcting spelling and stemming words), adding new terms and weighting them. We do not consider spelling correction and stemming: they usually rely on known local algorithms and dictionaries, and do not require knowledge of the personal network. Query expansion adds terms to the query, increasing the number of results given to the user. The query expansion has to be precise enough to add relevant documents to the result set while keeping the number of irrelevant documents low. This is a trade-off between recall and precision.

The goal of Gossple is to discover users sharing similar interests (personalised network), and to gather their information in order to improve query expansion. Figure 1 presents an overview of Gossple. The first step is to identify the relevant users to form the personal network. This is achieved by relying on a distance metric between users, using the cosine similarity between sets of items. The personal network is much smaller than the whole network (typically 20 neighbours are enough in a 33,834-user system, as we show in the experiments) and is used to build each user’s TagMap. The TagMap represents a personalised view of the relation between tags, as a distance between tags. The query expansion algorithm, called TagRank, exploits the user’s TagMap to expand a given query.

[Figure 1: Gossple overview]

In the sequel we describe how each user creates its TagMap, a matrix TM_u capturing the relationships between tags, where TM_u[ti, tj] contains a value reflecting the relationship between tags ti and tj as seen by the user u. This is computed from the information of each user’s personal network. We then present the TagRank algorithm, which exploits the TagMap to expand the query on a per-query basis. Finally, we present the way each user discovers the neighbours that form its personal network in a fully decentralised way through gossip protocols.

3. THE TAGMAP: A PERSONALISED VIEW OF THE RELATIONS BETWEEN TAGS

In this section, we first present the metric used to compute the distance between users based on their tagging behaviour, namely cosine similarity between items².

² Note that there are many metrics that could be used; we chose this one for the purpose of comparison with related works.

3.1 Rating the users: items cosine similarity

The TagMap captures the distance between tags; this is extracted automatically from the tagging behaviour. Detecting users sharing interests requires being able to compute a distance between users. The most natural metric to consider is the overlap between the items they tag. However, this simple metric suffers from several drawbacks. Users that have a very high tagging activity will exhibit a high overlap with any other user, while this does not reflect any specific interest. In addition, with a relevant metric, the proximity between users should not only increase when interests are similar, but also decrease if many other interests are not shared.

Instead we use a well-known metric from data mining, namely the cosine similarity between items. This can be seen as a normalised overlap. Items are represented as vectors in a multidimensional space, the number of dimensions being |I|.

More formally, the cosine between two vectors of items is defined as follows:

\[ \cos(v_1, v_2) = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert \times \lVert v_2 \rVert} \]

The item cosine is defined as follows:

\[ \mathrm{ItemCos}(u_1, u_2) = \frac{|\mathrm{Item}(\{u_1\}) \cap \mathrm{Item}(\{u_2\})|}{\sqrt{|\mathrm{Item}(\{u_1\})| \times |\mathrm{Item}(\{u_2\})|}} \]

The score between two users increases when interests are shared and decreases when they are not.

3.2 Creating the TagMap

The distance between users is used by users to create their personal network, so that the information about tags collected from a user ui is incorporated in the TagMap of user uj depending on its distance to ui. We consider that each user has a personalised network; we come back to the discovery of such a personalised network in Section 5. Neighbours(u) is defined as the set of users in the personalised network of u.
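The information-space functions and the item cosine of Section 3.1 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy information space `IS`, the user, item and tag names, and the helper names `items`, `item_tags` and `item_cos` are all assumptions introduced here for illustration, and returning 0 for a user with no items is a convention chosen for this sketch.

```python
import math

# Hypothetical toy information space: a set of (user, item, tag) triplets.
IS = {
    ("anne", "i1", "baby-sitter"),
    ("anne", "i2", "english"),
    ("alice", "i1", "teaching-assistant"),
    ("alice", "i2", "english"),
    ("alice", "i3", "bordeaux"),
    ("bob", "i4", "music"),
}


def items(info_space, user):
    """Item({u}): the set of items tagged by `user`."""
    return {i for (u, i, _t) in info_space if u == user}


def item_tags(info_space, user):
    """ItemTag({u}): the (item, tag) pairs that form the profile of `user`."""
    return {(i, t) for (u, i, t) in info_space if u == user}


def item_cos(info_space, u1, u2):
    """ItemCos(u1, u2): item overlap normalised by the geometric mean of the
    two item-set sizes (the cosine of the users' binary item vectors)."""
    a, b = items(info_space, u1), items(info_space, u2)
    if not a or not b:
        # Assumption of this sketch: a user without items is similar to nobody.
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


print(item_cos(IS, "anne", "alice"))  # two shared items out of 2 and 3
print(item_cos(IS, "anne", "bob"))    # no shared item
```

The normalisation is what distinguishes this score from a plain overlap: a very active tagger shares items with many users, but the growing size of her own item set keeps her score low unless the overlap grows with it.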
Again, there are many ways to use the information provided by the neighbours’ profiles to fill the TagMap, depending on whether the query expansion should rely on a dictionary of synonyms or on a hierarchical relationship [2]. For space reasons, we focus on synonyms in this paper.

The information needed to fill the TagMap is, for each tag, the number of occurrences of that tag per item, namely: for all t ∈ Tag(Neighbours(u)), a vector V_t of dimension |I| is maintained such that V_t[item_i] = |User(Neighbours(u), {i}, {t})|, namely the number of times the item item_i has been tagged with t by the neighbours of u. The TagMap is then filled as follows:

\[ TM[t_i, t_j] = \cos(V_{t_i}, V_{t_j}) \]

4. THE TAGRANK ALGORITHM: PERSONALISED QUERY EXPANSION

The TagMap represents the personalised relationships between pairs of tags to be used to expand queries.

A straightforward solution, used in [3], to exploit the TagMap directly is to consider only tags close to the tags of the query. This is an issue because the items suffer from high sparsity: as there is a very large number of items, relationships between tags are sometimes hidden and can hardly be discovered. Consider for example a query on t1, where the TagMap provides a link between t1 and t2 (based on a set of items). Consider now that t2 and t3 are also close in the same TagMap (based on a different set of items): this straightforward solution will never discover a link between t1 and t3.

By iterating on the set of added tags, more relevant tags could be added to the query. To this end, we designed an algorithm called TagRank, inspired by PageRank [4]. The TagMap is represented as a graph in which all the tags in the TagMap are vertices. They are connected by weighted edges so that weight(ti, tj) = TagMap(ti, tj) and weight(ti, ti) = 1³. In PageRank, a random surfer walks in a graph of Web pages. The importance of each page is the probability of the surfer being on that page at any time. At each step of the walk, the surfer either follows a link on the page or moves to a page chosen uniformly at random in the whole graph. In TagRank, the transition probability from one tag to another depends on the edge weight:

\[ \mathrm{TransitionProbability}(t_1, t_2) = \frac{\mathrm{TagMap}[t_1, t_2]}{\sum_{t \in T} \mathrm{TagMap}[t_1, t]} \]

³ This is directly inferred from the metric based on the cosine of vectors of items.

The original PageRank algorithm computes a score for each vertex, but that score only depends on the structure of the graph, not on a user query. As in personalised versions of PageRank, we modify the set of vertices the surfer can move to at random and limit it to the tags of the query. Therefore, the score computed is biased by the query, the query tags being the ones that spread importance into the graph. Calculating exact TagRank scores in a big graph can be a long process. Since this process is repeated at each query, this might be an issue in the long run. Therefore, we use an algorithm from [5] in order to provide a more efficient approach. For each query, the computation is split in order to compute partial scores for each tag in the query. At the end, all the partial scores are added to get the TagRank score of each tag. This saves a lot of processing time. The partial scores are approximated through random walks.

TagRank(query, TagMap) outputs the list of all the tags in the TagMap, each associated with a weight. Since each weight is a probability, they sum up to one. The expanded query consists of the original query plus additional terms chosen by descending weight. The system can either use the top-k extra tags, or add enough tags to “capture” a given amount of the weight.

5. CREATING THE PERSONAL NETWORK

Our algorithm is based on profile proximity between users. As presented in Section 3.2, the TagMap of a user is created from the profiles of the users which belong to her personalised network. The aim of the personal network is thus to connect a user with her k closest users according to the metric presented in Section 3.1. k represents the trade-off between the amount of available information and the personalisation degree of this information (in other words, its quality).

We assume that the users are connected through an unstructured overlay implemented by a peer sampling service [1]. Basically, each user is provided with a (changing) random sample of the network (a view of, say, 20 random users). This protocol ensures that the network is connected and that new relevant users may be discovered when maintaining the personal network.

The creation of the personal network is achieved through a clustering gossip protocol. To this end, each user maintains a view of k neighbours forming its personalised network. Starting from a random sample (typically provided by the underlying peer sampling service), this network is refined as follows. Periodically, a user contacts another user from her neighbours to exchange neighbours. When a user receives new neighbours upon a gossip interaction, it keeps, from its own neighbours and the discovered ones, the k closest according to the metric defined in Section 3.1. This process is iterated and converges in a few cycles [6]. The TagMap of each user is then built from the profiles of those k users, forming the personal network.

In order to reduce the message size, users exchange a Bloom filter representing a hash of their item vectors instead of the whole profile. The Bloom filter provides a reasonably good approximation of the user profile that
can be used to compute the cosine similarity with a small error margin. If the cosine between the user’s vector and the one inferred from the Bloom filter is high enough, the users are considered close enough and the entire profile is then exchanged. Otherwise, there is no further exchange. This avoids the transfer of useless entire profiles.

6. EVALUATION

In this section, we present preliminary experimental results. We run experiments using the CiteULike dataset of 2008-10-09: |U| = 33,834, |I| = 1,134,167, |T| = 237,450, |IS| = 4,064,310. We build a profile for each user u ∈ U.

Workload.

To evaluate our algorithm, we generate queries to expand. After the query expansion, we launch the query to build a result set. The result set contains the items which match the query’s tags. We generate a query for each item i ∈ Item({u}) such that |User({i})| > 1 (an item has to be tagged by at least 2 users). For an item i, we choose a user u and we use the tags used by u on the item i to fill the query. As u will launch the query, we delete from the information space the information used by the query generation (IS − {(u, i, t) : t ∈ Tag({u}, {i})}).

The query succeeds when i is in the result set. The query goes through the query expansion process, and we vary the number of tags added to the query in order to evaluate the impact on the recall, which in this case is the proportion of items found using the tags that were assigned to them by a given user.

Settings.

To evaluate our approach we run the following experiments on the same trace.

Global TagMap, simple query expansion: a global TagMap is built based on the same metric, namely the cosine of item vectors. The distance between tags is therefore not personalised, as it takes into account the information of all users. The query expansion is the simple one considered before, used in [3], considering only the tags related to the query tags. This is typically representative of a centralised approach, where personalised TagMaps are too space intensive to maintain.

Personalised TagMap, simple query expansion: the TagMap is personalised, based on the profiles of the (k = 20) closest neighbours. The simple query expansion mechanism is used here. The goal is to evaluate the impact of the TagMap personalisation.

Personalised TagMap, TagRank based query expansion: the TagMap is personalised, based on the profiles of the (k = 20) closest neighbours. TagRank is used to expand the queries. This makes it possible to evaluate the benefit of TagRank over a simple query expansion mechanism.

[Figure 2: recall performance evaluation. x-axis: query expansion size (0 to 50); y-axis: recall (0.45 to 0.8). Curves: personalized TagMap + TagRank; personalized TagMap + one step query expansion; global TagMap + one step query expansion.]

Figure 2 shows the results of our simulations. In all cases, a query expansion size of 0 gives a recall of 47%. That means that in 47% of cases, when the item has been tagged by at least 2 users, at least one other user has used one tag in common. In all the other cases, the system has to rely on the query expansion process to add relevant tags to the query and improve the recall rate.

We observe that the personalised TagMap performs a lot better than the global TagMap, with on average 8% more recall. This shows first that the personal network is effective in personalising the TagMap in a meaningful way and in generating a substantially more accurate query expansion. Second, it shows that only a small portion of the network is required to personalise the TagMap effectively. The TagMap contains much less information, but since this information is centred on the user, the tags added through the query expansion are more relevant.

Finally, we observe that TagRank also contributes to improving the quality of the results, especially when it comes to producing a long query expansion. The recall is improved by up to 4% with a query expansion size of 50. This experiment demonstrates the limits of the one-step distance when using the TagMap. The sparsity of the information in folksonomies limits the number of related tags that can be found. Since TagRank distributes weight over the whole graph, it can find tags that seem unrelated to the query but are still relevant.

7. RELATED WORK & CONCLUDING REMARKS

Collaborative social tagging schemes have received growing attention; they provide a huge potential for discovering new information through implicit connections. In this paper, we presented the query expansion feature of Gossple, a user-centric system to discover and maintain such acquaintances. The Gossple query expansion mechanism improves the completeness of the
search queries over state of the art alternatives without hampering the search accuracy. This is achieved with little information maintained at each peer, in the form of the TagMap. Interestingly, each peer discovers its personal network and locally stores the relationships between tags that are relevant to itself, and its tagging behaviour is only recorded by its personal neighbours. The TagMap is exploited by an original TagRank algorithm to expand queries in an effective way.

Recently, several centralised systems have addressed the personalisation of search in the context of folksonomies. These approaches mostly focus on top-k processing. In [7], the authors investigate network-aware top-k processing. They show that full personalisation, which would result in maintaining data structures (typically inverted lists) on a per-user basis, is too space intensive. Instead, the proposed algorithms rely on maintaining such data structures per cluster of socially related users and adapt the traditional centralised top-k algorithms to that setting.

In [3], a centralised system proposing both query expansion and top-k processing also relies on tag association. Yet, the tag association is not personalised, nor is the system decentralised. The personalisation is addressed only in the top-k processing. One of the main reasons is that a personalised association between tags is too space intensive in a centralised system. Our experiments showed the benefit of the Gossple personalised query expansion over this approach.

In [8], several types of social decentralised routing strategies are considered. Although they do not deal with query expansion, they confirm our intuition that explicit social connections à la Facebook are useless for many requests. Instead they show that semantic routing, where the neighbours contacted for a given request depend on the content of the request, or spiritual routing, which contacts neighbours having behavioural affinity, provide the best results.

In [9], the authors explore different ways of providing personalised query expansion. They show that adding information extracted from the user’s profile can help increase the rank of a relevant document in the result list. Their approach is based on using the user’s profile only, while our algorithms take advantage of the knowledge of the other users in the system. Therefore, our approach is more relevant for discovering new tags and increasing the recall of the requests.

Many centralised search engines provide non-personalised query expansion. They add to the query synonyms and related concepts found in a taxonomy in order to improve the quality of the results. They rely on hand-generated information like Yahoo! Directory⁴, Wordnet⁵ or the Open Directory Project⁶. The main difference with our system is that this data is neutral and objective, while our system aims at a subjective, user-related query expansion. Furthermore, our system is able to directly extract the knowledge from the information space, while those information sources need to be maintained by users. Although we limited our approach to adding synonyms in this paper, Gossple can determine different kinds of relations between tags using the same approach, and is also able to build a biased taxonomy that reflects the interests of the user.

⁴ http://dir.yahoo.com/
⁵ http://wordnet.princeton.edu/
⁶ http://www.dmoz.org/

We believe that the way to personalise Internet search, in a world where users are free to express their opinions and interests, goes with a fully decentralised system. To the best of our knowledge, we are the first to propose a personalised query expansion in a fully decentralised manner. We foresee many perspectives for this work, such as leveraging the TagMap for recommendation systems and addressing dynamic networks.

8. ACKNOWLEDGEMENT

We warmly thank Vivien Quéma for his help at early stages of the work.

9. REFERENCES

[1] M. Jelasity, R. Guerraoui, A.-M. Kermarrec, and M. van Steen. The peer sampling service: experimental evaluation of unstructured gossip-based implementations. In Middleware ’04: Proceedings of the 5th ACM/IFIP/USENIX international conference on Middleware, pages 79–98, New York, NY, USA, 2004. Springer-Verlag New York, Inc.
[2] C. Cattuto, D. Benz, A. Hotho, and G. Stumme. Semantic analysis of tag similarity measures in collaborative tagging systems. In Proceedings of the 3rd Workshop on Ontology Learning and Population (OLP3), pages 39–43, July 2008. ISBN 978-960-89282-6-8.
[3] V. Zanardi and L. Capra. Social ranking: uncovering relevant content using tag-based recommender systems. In Proceedings of the 2008 ACM conference on Recommender systems, pages 51–58, 2008.
[4] S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30:107–117, 1998.
[5] D. Fogaras, B. Rácz, K. Csalogány, and T. Sarlós. Towards scaling fully personalized pagerank: Algorithms, lower bounds, and experiments. Internet Mathematics, pages 333–358, 2005.
[6] M. Jelasity and O. Babaoglu. T-Man: Gossip-Based Overlay Topology Management. Engineering Self-Organising Systems, 3910:1–15, 2006.
[7] S. Amer-Yahia, M. Benedikt, L. Lakshmanan, and J. Stoyanovich. Efficient network aware search in collaborative tagging sites. Proc. VLDB Endow., pages 710–721, 2008.
[8] M. Bender, T. Crecelius, M. Kacimi, S. Michel, J. Xavier Parreira, and G. Weikum. Peer-to-peer information search: Semantic, social, or spiritual? Bulletin of the IEEE Computer Society Technical Committee on Data Engineering, 2007.
[9] M. Carman, M. Baillie, and F. Crestani. Tag data and personalized information retrieval. In SSM ’08: Proceeding of the 2008 ACM workshop on Search in social media, pages 27–34, New York, NY, USA, 2008. ACM.