WP2 2nd Review

Combining Human and Computational Intelligence
Ilya Zaihrayeu, Pierre Andrews, Juan Pane

Semantic annotation lifecycle
Overall problem: how to combine human and computational intelligence to support the generation and consumption of semantic content?
Users currently produce free-text annotations. What if they could use semantic annotations instead, to leverage semantic technology services (reasoning, semantic search, …)?
Semantic annotation = structure and/or meaning.
Problem 1: help the user find and understand the meaning of semantic annotations.
Problem 2: extract (semantic) annotations from the context of the user's resource at publishing time.
Problem 3: QoS of semantics-enabled services.
Problem 4: semi-automatic semantification of existing annotations.
4/14/2011

Index: meaning summarization
Problem 1: help the user find and understand the meaning of semantic annotations.

Meaning summarization: why?
The right meaning of the words used for an annotation is in the mind of the people using them. E.g., Java:
  island: an island in Indonesia south of Borneo; one of the world's most densely populated regions
  beverage: a beverage consisting of an infusion of ground coffee beans; "he ordered a cup of coffee"
  programming language: a simple platform-independent object-oriented programming language used for writing applets that are downloaded from the World Wide Web by a client and run on the client's machine
These descriptions are too long for the user to grasp the meaning immediately, which sets too high a barrier to start generating semantic annotations.

Meaning summarization: an example
One-word summaries are generated from the relations in the knowledge base, sense definitions, synonyms and hypernym terms.
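A minimal sketch of how such a summary could be picked, assuming a toy sense record with synonyms and hypernyms. The `Sense` class, its field names, and the "hypernym first, then synonym" preference are assumptions made for illustration; the slide does not specify how the algorithm ranks its sources.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Sense:
    """A toy word sense; the field names are assumptions for this sketch."""
    definition: str
    synonyms: List[str] = field(default_factory=list)
    hypernyms: List[str] = field(default_factory=list)


def summarize(word: str, sense: Sense) -> Optional[str]:
    """Return a short label for `sense`, or None if nothing usable is found."""
    # Prefer the closest hypernym: it names the category the sense belongs to.
    if sense.hypernyms:
        return sense.hypernyms[0]
    # Otherwise fall back to a synonym that differs from the word itself.
    for synonym in sense.synonyms:
        if synonym.lower() != word.lower():
            return synonym
    return None


# Hand-written stand-ins for the three senses of "Java" shown in the slides.
JAVA_SENSES = [
    Sense("an island in Indonesia south of Borneo", hypernyms=["island"]),
    Sense("a beverage consisting of an infusion of ground coffee beans",
          synonyms=["coffee", "java"], hypernyms=["beverage"]),
    Sense("a simple platform-independent object-oriented programming language",
          hypernyms=["programming language"]),
]

summaries = [summarize("java", sense) for sense in JAVA_SENSES]
print(summaries)
```

With this toy data the labels match the short summaries used in the Java example (island, beverage, programming language).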

Meaning summarization: evaluation results
Best precision: 63%
If we talk about Java, does the word "coffee" mean the same as "island"?
Discriminating power: 76.4%

Index: gold standard dataset
Problem 3: QoS of semantics-enabled services?
Problem 4: semi-automatic semantification of existing annotations.
In order to evaluate the performance of the algorithms, a gold standard dataset is needed.

Proposed Approach
Create a gold standard of a folksonomy annotated with senses.
Pipeline: Tag, then Preprocessing (80% accuracy), then Tokens, then Disambiguation (59% accuracy), then Senses.
Example tag: javaisland
  Candidate tokenizations: "Java island", "Java is land", …
  Java: an island in Indonesia to the south of Borneo
  Island: a land mass that is surrounded by water
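The preprocessing step in the "javaisland" example can be illustrated with a small dictionary-based splitter. This is a sketch under stated assumptions, not the project's preprocessing code: the function name `split_tag` and the four-word vocabulary are hypothetical, and a real system would draw its vocabulary from the knowledge base.

```python
from typing import List, Set


def split_tag(tag: str, vocabulary: Set[str]) -> List[List[str]]:
    """Return every way to split `tag` into consecutive vocabulary words."""
    tag = tag.lower()
    if not tag:
        # One way to split the empty string: into no words at all.
        return [[]]
    splits = []
    for end in range(1, len(tag) + 1):
        head = tag[:end]
        if head in vocabulary:
            # Keep this prefix and recursively split the remainder.
            for rest in split_tag(tag[end:], vocabulary):
                splits.append([head] + rest)
    return splits


# Hypothetical vocabulary, just large enough for the example.
VOCABULARY = {"java", "island", "is", "land"}

candidates = split_tag("javaisland", VOCABULARY)
for tokens in candidates:
    print(" ".join(tokens))
```

The splitter only enumerates candidates. Choosing between "java island" and "java is land" is a separate ranking decision, which is where manual validation or disambiguation comes in.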

A Platform for Gold Standards of Semantic Annotation Systems
Features: manual validation, RDF export.
Evaluation of: preprocessing, WSD, BoW search, convergence.
Open source: 7 modules, 25K lines of code, 26% of comments.
http://sourceforge.net/projects/tags2con/

Delicious RDF Dataset @ LOD cloud
Dereferenceable at: http://disi.unitn.it/~knowdive/dataset/delicious/

Index: QoS for semantic search
Problem 3: QoS of semantics-enabled services?

Semantic search: why?
With free-text search, the following problems may reduce precision and recall:
  synonymy problem: searching for "images" should return resources annotated with "picture"
  polysemy problem: searching for "java" (island) should not return resources annotated with "java" (coffee beverage)
  specificity gap problem: searching for "animals" should also return resources annotated with "dogs"
Semantic, meaning-based search can address the problems listed above.
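The three problems can be made concrete with a toy comparison between string matching and concept matching. Everything in this sketch is invented for illustration: the resource ids, the concept names such as `java(island)`, and the single is-a link are assumptions, not data from the project.

```python
# Free-text tags as users typed them (hypothetical resources r1..r4).
FREE_TEXT_TAGS = {
    "r1": {"picture"},
    "r2": {"java"},   # meant as the coffee beverage
    "r3": {"java"},   # meant as the island
    "r4": {"dogs"},
}

# The same resources annotated with disambiguated concepts.
SEMANTIC_ANNOTATIONS = {
    "r1": {"picture"},
    "r2": {"java(coffee)"},
    "r3": {"java(island)"},
    "r4": {"dog"},
}

# Words map to concepts; synonyms share one concept.
WORD_TO_CONCEPTS = {
    "images": {"picture"},
    "picture": {"picture"},
    "java": {"java(island)", "java(coffee)"},
    "animals": {"animal"},
    "dogs": {"dog"},
}

# is-a links: child concept -> parent concept.
PARENT = {"dog": "animal"}


def free_text_search(word):
    """Return resources whose tags contain exactly the query string."""
    return sorted(r for r, tags in FREE_TEXT_TAGS.items() if word in tags)


def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or lies below it in the hierarchy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False


def semantic_search(concept):
    """Return resources annotated with `concept` or any more specific concept."""
    return sorted(
        r for r, concepts in SEMANTIC_ANNOTATIONS.items()
        if any(is_a(c, concept) for c in concepts)
    )


(picture_concept,) = WORD_TO_CONCEPTS["images"]
print(free_text_search("images"), semantic_search(picture_concept))   # synonymy
print(free_text_search("java"), semantic_search("java(island)"))      # polysemy
print(free_text_search("animals"), semantic_search("animal"))         # specificity gap
```

In this sketch the query is assumed to be already disambiguated to a concept; how the user picks that concept is the subject of the meaning summarization slides.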

Semantics vs Folksonomy
Three kinds of queries are built from each tag:
  "raw" queries: javaisland
  BoW queries: java island
  semantic queries (correct and complete): Java(island) island(land)
Semantic search returns complete and correct results and serves as the baseline.
Specificity Gap (SG) between the user's query and a resource annotation: for the query "vehicle", an annotation "car" has SG=1 and an annotation "taxi" has SG=2.
Recall goes down as the specificity gap increases.
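One possible reading of the specificity gap is the number of is-a steps between the annotation and the query concept. The sketch below follows that reading using the vehicle, car and taxi example; the hierarchy, the resource ids and the `recall_with_max_gap` helper are assumptions made for illustration, not the evaluation code behind the slide.

```python
from typing import Optional

# Hypothetical is-a links: child concept -> parent concept.
PARENT = {"taxi": "car", "car": "vehicle"}

# Hypothetical resources, each annotated with a single concept.
ANNOTATIONS = {"r1": "vehicle", "r2": "car", "r3": "taxi"}


def specificity_gap(query: str, annotation: str) -> Optional[int]:
    """Count is-a steps from `annotation` up to `query`; None if unrelated."""
    steps = 0
    node = annotation
    while node is not None:
        if node == query:
            return steps
        node = PARENT.get(node)
        steps += 1
    return None


def recall_with_max_gap(query: str, max_gap: int) -> float:
    """Recall of a search that only reaches annotations within `max_gap` steps."""
    gaps = {r: specificity_gap(query, a) for r, a in ANNOTATIONS.items()}
    relevant = [r for r, gap in gaps.items() if gap is not None]
    if not relevant:
        return 0.0
    retrieved = [r for r in relevant if gaps[r] <= max_gap]
    return len(retrieved) / len(relevant)


print(specificity_gap("vehicle", "car"), specificity_gap("vehicle", "taxi"))
for max_gap in (0, 1, 2):
    print(max_gap, recall_with_max_gap("vehicle", max_gap))
```

A search that matches only the exact concept (gap 0) misses the resources annotated with "car" and "taxi", so the resources it loses are exactly those with a larger gap.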

Index: semantic convergence
Problem 4: semi-automatic semantification of existing annotations.

Semantic convergence: Why?
Example tags: Ajax, Mac, Apple, CSS, …
Random: programming and web domain
"General" domains: cooking, travel, education

Semantic convergence: proposed solution
Find new senses of terms
Find different senses of the same term (word sense)
Find synonyms of a term (synonym sets, or synsets)
Place the new synset in the vocabulary is-a hierarchy
What we improve:
  Better use of Machine Learning techniques
  The polysemy issue is not considered in the state of the art
  Missing or "subjective" evaluations in the state of the art
  Evaluation using the Delicious dataset

Convergence Evaluation: Finding Senses
Approaches compared: Tag Collocation, User Collocation, Random Baseline.
Results, in the order they appear on the slide:
  Precision: 56%, Recall: 73%
  Precision: 57%, Recall: 68%
  Precision: 42%, Recall: 29%
(Diagram: users U1 and U2, bookmarks B1 to B4, tags t1 to t5.)
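The slide does not say how precision and recall were computed for the induced senses. One common option for comparing a grouping of bookmarks against a gold standard is pairwise precision and recall, sketched below. The pairwise scheme and the example clusters are assumptions chosen for illustration; only the bookmark names B1 to B4 echo the slide's diagram.

```python
from itertools import combinations


def same_cluster_pairs(clusters):
    """Return the set of unordered item pairs that share a cluster."""
    pairs = set()
    for cluster in clusters:
        # Sorting makes each pair canonical, e.g. ("B1", "B2") not ("B2", "B1").
        pairs.update(combinations(sorted(cluster), 2))
    return pairs


def pairwise_precision_recall(predicted, gold):
    """Compare two clusterings by the item pairs they group together."""
    predicted_pairs = same_cluster_pairs(predicted)
    gold_pairs = same_cluster_pairs(gold)
    agreed = predicted_pairs & gold_pairs
    precision = len(agreed) / len(predicted_pairs) if predicted_pairs else 0.0
    recall = len(agreed) / len(gold_pairs) if gold_pairs else 0.0
    return precision, recall


# Hypothetical example: bookmarks of one tag grouped by the sense they use.
gold = [{"B1", "B2"}, {"B3", "B4"}]
predicted = [{"B1", "B2", "B3"}, {"B4"}]

precision, recall = pairwise_precision_recall(predicted, gold)
print(precision, recall)
```

Here the predicted grouping puts three pairs together, of which one is also in the gold standard, and it recovers one of the two gold pairs.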

Conclusions: the semantic annotation lifecycle revisited
Combining human and computational intelligence to move from free-text annotations to semantic annotations (structure and/or meaning) that leverage semantic technology services (reasoning, semantic search, …).
Problem 1: help the user understand the meaning of semantic annotations?
Problem 2: extract (semantic) annotations from the context of the user's resource at publishing time?
Problem 3: QoS of semantics-enabled services?
Problem 4: semi-automatic semantification of existing annotations.

Conclusions
How human and computer intelligence can be combined in the generation and consumption of semantic annotations:
  We developed and evaluated a meaning summarization algorithm.
  We developed a "semantic folksonomy" evaluation platform.
  We studied the effect of semantics on social tagging systems: how much can semantics help? How much does the user need to be involved?
  We developed and evaluated a knowledge base enrichment algorithm.
  We built and used a gold standard dataset for evaluating: Word Sense Disambiguation, Tag Preprocessing, Semantic Search, Semantic Convergence.

Integration with the use cases
