WP2 2nd Review
1. Combining Human and Computational Intelligence Ilya Zaihrayeu, Pierre Andrews, Juan Pane
2. Semantic annotation lifecycle
Overall problem: How can human and computational intelligence be combined to support the generation and consumption of semantic content?
What if users could use semantic annotations instead, to leverage semantic technology services? Semantic annotation = structure and/or meaning.
Problem 1: help the user find and understand the meaning of semantic annotations
Problem 2: extract (semantic) annotations from the context of the user's resource at publishing time
Problem 3: QoS of semantics-enabled services
Problem 4: semi-automatic semantification of existing annotations
(Diagram: User, Context, free text annotations, Reasoning, Semantic search, …)
4/14/2011
3. Index: meaning summarization
Problem 1: help the user find and understand the meaning of semantic annotations
(Diagram: User, Reasoning, Semantic search, …)
4. Meaning summarization: why?
The right meaning of the words used for an annotation is in the mind of the people using them. E.g., Java:
an island in Indonesia south of Borneo; one of the world's most densely populated regions
a beverage consisting of an infusion of ground coffee beans; "he ordered a cup of coffee"
a simple platform-independent object-oriented programming language used for writing applets that are downloaded from the World Wide Web by a client and run on the client's machine
Descriptions are too long for the user to grasp the meaning immediately, which sets too high a barrier to start generating semantic annotations.
Short summaries: island; beverage; programming language
5. Meaning summarization: an example
One-word summaries are generated from the relations in the knowledge base, sense definitions, synonyms, and hypernym terms.
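To make the idea concrete, here is a minimal sketch that uses only the hypernym of each sense as its one-word summary. The `SENSES` table and the `summarize` function are hypothetical stand-ins for a real knowledge base; the slide also mentions sense definitions and synonyms as sources, which this sketch does not use.

```python
# Toy knowledge base (illustrative entries only, not the project's data):
# each sense of a word has a long gloss and a hypernym term.
SENSES = {
    "java": [
        {"gloss": "an island in Indonesia south of Borneo",
         "hypernym": "island"},
        {"gloss": "a beverage consisting of an infusion of ground coffee beans",
         "hypernym": "beverage"},
        {"gloss": "a simple platform-independent object-oriented programming language",
         "hypernym": "programming language"},
    ]
}


def summarize(word):
    """Return one short summary per sense, here simply the hypernym term."""
    return [sense["hypernym"] for sense in SENSES.get(word.lower(), [])]


summaries = summarize("Java")
```

With this toy table, the three long glosses for "Java" collapse to "island", "beverage", and "programming language", which a user can scan at a glance.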
6. Meaning summarization: evaluation results
Best precision: 63%
If we talk about Java, does the word coffee mean the same as island?
Discriminating power: 76.4%
7. Index: gold standard dataset
In order to evaluate the performance of the algorithms, a gold standard dataset is needed.
Problem 3: QoS of semantics-enabled services?
Problem 4: semi-automatic semantification of existing annotations
(Diagram: User, Reasoning, Semantic search, …)
8. Proposed Approach
Create a gold standard of a folksonomy annotated with senses.
Pipeline: Tag -> Preprocessing -> Tokens -> Disambiguation -> Senses
Disambiguation: 59% accuracy. Preprocessing: 80% accuracy.
Example tag: javaisland
Candidate token splits: "Java island", "Java is land", …
Senses: Java: an island in Indonesia to the south of Borneo. Island: a land mass that is surrounded by water.
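The preprocessing step, which turns a raw tag into candidate token sequences, can be sketched as a dictionary-based split. This is only an illustration under assumptions: the function name `split_tag` and the tiny `VOCAB` set are made up here, and the platform's actual preprocessing algorithm is not described on the slide.

```python
def split_tag(tag, vocabulary):
    """Return every way to split `tag` into a sequence of vocabulary words."""
    tag = tag.lower()
    if not tag:
        return [[]]  # one way to split the empty string: no tokens
    splits = []
    for end in range(1, len(tag) + 1):
        head = tag[:end]
        if head in vocabulary:
            # Keep this prefix and recursively split the remainder.
            for rest in split_tag(tag[end:], vocabulary):
                splits.append([head] + rest)
    return splits


# Hypothetical vocabulary, just large enough for the slide's example.
VOCAB = {"java", "island", "is", "land"}
candidates = split_tag("javaisland", VOCAB)
```

For the tag `javaisland` this produces both "java island" and "java is land"; choosing between such candidates, and then picking a sense for each token, is what the manual validation in the gold standard settles.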
9. A Platform for Gold Standards of Semantic Annotation Systems
Manual validation; RDF export
Evaluation of: preprocessing, WSD, BoW search, convergence
Open source: 7 modules, 25K lines of code, 26% comments
http://sourceforge.net/projects/tags2con/
11. Index: QoS for semantic search
Problem 3: QoS of semantics-enabled services?
(Diagram: User, Reasoning, Semantic search, …)
12. Semantic search: why?
With free text search, the following problems may reduce precision and recall:
synonymy problem: searching for "images" should return resources annotated with "picture"
polysemy problem: searching for "java" (island) should not return resources annotated with "java" (coffee beverage)
specificity gap problem: searching for "animals" should also return resources annotated with "dogs"
Semantic, meaning-based search can address the problems listed above.
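The three problems above can be illustrated with a small concept-based search over toy data. Everything in this sketch is an assumption made for illustration: the concept identifiers (such as `picture#1`), the `LEXICON`, `IS_A`, and `RESOURCES` tables, and the `semantic_search` function are invented, and queries are assumed to be already disambiguated to a concept.

```python
# Toy data, invented for illustration only.
LEXICON = {  # surface word -> concept; synonyms share one concept
    "images": "picture#1",
    "picture": "picture#1",
    "animals": "animal#1",
    "dogs": "dog#1",
}
IS_A = {"dog#1": "animal#1"}  # child concept -> parent concept

RESOURCES = {  # resource id -> concepts it is annotated with
    "photo-1": {LEXICON["picture"]},
    "trip-1": {"java#island"},
    "cafe-1": {"java#beverage"},
    "pets-1": {LEXICON["dogs"]},
}


def ancestors(concept):
    """Collect all more general concepts reachable through is-a links."""
    found = set()
    while concept in IS_A:
        concept = IS_A[concept]
        found.add(concept)
    return found


def semantic_search(query_concept):
    """Return resources annotated with the query concept or a more specific one."""
    hits = []
    for resource, concepts in RESOURCES.items():
        if any(c == query_concept or query_concept in ancestors(c) for c in concepts):
            hits.append(resource)
    return sorted(hits)
```

In this sketch, a query for "images" finds the resource annotated with "picture" (synonymy), a query for Java the island skips the coffee resource (polysemy), and a query for "animals" finds the resource annotated with "dogs" (specificity gap).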
13. Semantics vs Folksonomy
Example tag: javaisland
Used to build "raw" queries: javaisland
Used to build BoW queries: java island
Used to build semantic queries (correct and complete): Java(island) island(land)
Semantic search: complete and correct results (the baseline)
Specificity Gap (SG): the user submits the query "vehicle"; a resource annotated with "car" has SG=1, and one annotated with "taxi" has SG=2
Recall goes down as the specificity gap increases
(Diagram: User, query, submit, result, resource, annotation, link)
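One way to read the specificity gap is as the number of is-a steps between the query concept and the concept used in the annotation. The sketch below assumes that reading; the `IS_A` table and the function `specificity_gap` are hypothetical and only mirror the vehicle, car, taxi example from the slide.

```python
# Hypothetical is-a hierarchy matching the slide's example: taxi -> car -> vehicle.
IS_A = {"taxi": "car", "car": "vehicle"}


def specificity_gap(query, annotation):
    """Count is-a steps from the annotation concept up to the query concept.

    Returns None when the annotation is not a descendant of the query.
    """
    gap = 0
    node = annotation
    while node != query:
        if node not in IS_A:
            return None  # reached the top without meeting the query concept
        node = IS_A[node]
        gap += 1
    return gap


gap_car = specificity_gap("vehicle", "car")
gap_taxi = specificity_gap("vehicle", "taxi")
```

Under this reading, a free text search for "vehicle" misses both resources because neither is literally tagged "vehicle", while a semantic search can follow the is-a links to reach them.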
14. Index: semantic convergence
Problem 4: semi-automatic semantification of existing annotations
(Diagram: User, Reasoning, Semantic search, …)
15. Semantic convergence: Why?
Ajax, Mac, Apple, CSS, …
Random: programming and web domain
"General" domains: cooking, travel, education
16. Semantic convergence: proposed solution
Find new senses of terms:
Find different senses of the same term (word senses)
Find synonyms of a term (synonym sets, or synsets)
Place the new synset in the vocabulary is-a hierarchy
What we improve:
Better use of machine learning techniques
The polysemy issue is not considered in the state of the art
Evaluations in the state of the art are missing or "subjective"
Evaluation using the Delicious dataset
17. Convergence Evaluation: Finding Senses
Panels: Tag Collocation, User Collocation, Random Baseline
(Diagram: nodes t1 to t5, U1 to U2, B1 to B4)
Results in slide order: Precision: 56%, Recall: 73%; Precision: 57%, Recall: 68%; Precision: 42%, Recall: 29%
18. Semantic annotation lifecycle: Conclusions
Combining human and computational intelligence
What if users could use semantic annotations instead, to leverage semantic technology services? Semantic annotation = structure and/or meaning.
Problem 1: help the user understand the meaning of semantic annotations?
Problem 2: extract (semantic) annotations from the context of the user's resource at publishing time?
Problem 3: QoS of semantics-enabled services?
Problem 4: semi-automatic semantification of existing annotations
(Diagram: User, Context, free text annotations, Reasoning, Semantic search, …)
19. Conclusions
We developed and evaluated a meaning summarization algorithm
We developed a "semantic folksonomy" evaluation platform
We studied the effect of semantics on social tagging systems: how much can semantics help? How much does the user need to be involved?
How human and computer intelligence can be combined in the generation and consumption of semantic annotations
We developed and evaluated a knowledge base enrichment algorithm
We built and used a gold standard dataset for evaluating: Word Sense Disambiguation, Tag Preprocessing, Semantic Search, Semantic Convergence
22. Semantic Annotation of Images on Flickr. Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu; ESWC 2011
23. A Classification of Semantic Annotation Systems. Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu; Semantic Web Journal, second review phase
24. Sense Induction in Folksonomies. Pierre Andrews, Juan Pane, and Ilya Zaihrayeu; IJCAI-LHD 2011, under review
25. Evaluating the Quality of Service in Semantic Annotation Systems. Ilya Zaihrayeu, Pierre Andrews, and Juan Pane; in preparation
Editor's notes
Say how it differs from the TAGora dataset: we have gold standard preprocessing and disambiguation, with agreement between at least two annotators.
The first platform for building gold standards for the evaluation of concept-based search algorithms, vocabulary convergence algorithms, etc. in folksonomies
The first gold standard dataset produced and published
The first evaluation of a keywords-based search algorithm w.r.t. the gold standard semantic search in a folksonomy
Tag preprocessing algorithm, WSD algorithm, concept-based search algorithm