
Exploratory Search Based on Linked Open Data

Exploratory Search and Intelligent Recommendations, CrEDIBLE 2014 Workshop, Sophia-Antipolis, France, 08-10.10.2014

  1. The Journey is the Reward: Explorative Semantic Search based on Linked Open Data. Sophia Antipolis, 9 October 2014. Dr. Harald Sack, Hasso-Plattner-Institute for IT Systems Engineering, University of Potsdam. Thursday, 9 October 2014
  2. Hasso Plattner Institute for IT Systems Engineering: Semantic Technologies & Multimedia Retrieval Research Group
  3. Hasso Plattner Institute for IT Systems Engineering: Semantic Technologies & Multimedia Retrieval Research Group
     • Research Topics
       □ Semantic Web Technologies
       □ Knowledge Discovery
       □ Ontological Engineering
       □ Multimedia Analysis & Retrieval
       □ Social Networking
       □ Data/Information Visualization
     • Research Projects:
  4. The Journey is the Reward: Explorative Semantic Search based on Linked Open Data. Overview:
     (1) Search & Retrieval, and why we are not always content with it...
     (2) Semantic Analysis, to better “understand” the content
     (3) Explorative Semantic Search, switching from “retrieval” to “discovery”
     (4) Intelligent Recommendation: variatio delectat (variation is delectable)
  5. Search & Retrieval today...
  6. Autocompletion; Google Knowledge Graph
  7. Query by Example; Visual Analysis; Recommendations
  8. The Ordinary Archive is a Small World... Jules Verne
  9. Information Retrieval Paradigm (Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, New York 1983)
     Diagram labels: Set of Documents (files of records) ➞ Indexing ➞ Document Index; Set of Queries (information requests) ➞ Query Formulation ➞ query; query and index are compared by String Matching based on “similarity”
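The paradigm on slide 9 can be sketched as a tiny inverted index with term matching. This is a minimal illustration, assuming whitespace tokenization and a score equal to the number of matched query terms; the sample documents and the names `build_index` and `search` are hypothetical and not taken from the talk.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return document ids ranked by how many distinct query terms they contain."""
    scores = defaultdict(int)
    for term in set(query.lower().split()):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken alphabetically by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical toy collection, for illustration only.
DOCS = {
    "d1": "a trip to the moon silent movie",
    "d2": "apollo moon landing spaceflight",
    "d3": "silent movie stars",
}
INDEX = build_index(DOCS)
print(search(INDEX, "moon spaceflight"))
```

Pure term matching of this kind only finds documents that literally contain the query words, which is the limitation the following slides illustrate.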
  10. Let's assume you are looking for something and you don't know how to phrase your search correctly...
  11. moon
  12. moon spaceflight
  13. moon spaceflight impact
  14-15. moon spaceflight impact silent movie
  16. Why query matching is not enough:
     • Sometimes simple query matching against text content or metadata alone is not sufficient to fulfill the user's information needs.
     • What is often missing are the relational connections and circumstances, i.e. contextual information is needed to answer the query.
     • To achieve this, the content must be “understood”.
     ➞ Semantic Analysis
  17. Overview (repeated): (1) Search & Retrieval; (2) Semantic Analysis; (3) Explorative Semantic Search; (4) Intelligent Recommendation
  18. How to Determine the Meaning of (Meta)data?
     (Meta)data sources:
     • Authoritative: structured data; semi-structured data; natural language text
     • Non-authoritative: (free) user tags and comments; restricted vocabularies
     • (Media) Analysis: low level features; high level features
     Semantic Analysis (diagram labels): level of abstraction; accuracy; reliability; context; pragmatics; location dependency; time dependency
  19. From Raw (Text) Data to Semantic Entities
     Text: Neil Armstrong, the 38-year-old civilian commander, radioed to earth and the mission control room here: “Houston, Tranquility Base here. The Eagle has landed.”
     Diagram labels: Entities (Neil Armstrong rdf:type Astronaut); Ontologies (classes Astronaut, Person, SpaceMission, Event, linked by rdfs:subClassOf); properties dbpedia-owl:crewMember, dbpedia-owl:birth_name (string), dbpedia-owl:birth_date (date), dbpedia-owl:crewSize (integer); annotation sources: tag, text, image annotation
  20. Web of Data = Linked Open Data
     Diagram labels: Neil Armstrong rdf:type Astronaut; classes Astronaut, Person, SpaceMission, Event, linked by rdfs:subClassOf; properties dbpedia-owl:crewMember, dbpedia-owl:birth_name (string), dbpedia-owl:birth_date (date), dbpedia-owl:crewSize (integer)
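The graph on slides 19 and 20 can be read as a set of subject-predicate-object triples. The following is a minimal sketch, assuming that Astronaut is a subclass of Person and SpaceMission a subclass of Event (one plausible reading of the flattened diagram); the `ex:` identifiers and the helpers `superclasses` and `types_of` are hypothetical.

```python
# Hypothetical triples; the subclass links are an assumed reading of the slide.
TRIPLES = {
    ("ex:Neil_Armstrong", "rdf:type", "ex:Astronaut"),
    ("ex:Astronaut", "rdfs:subClassOf", "ex:Person"),
    ("ex:SpaceMission", "rdfs:subClassOf", "ex:Event"),
}


def superclasses(cls, triples):
    """Collect all classes reachable from cls via rdfs:subClassOf."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def types_of(entity, triples):
    """Return the asserted types of an entity plus the types implied by subclassing."""
    direct = {o for s, p, o in triples if s == entity and p == "rdf:type"}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


print(sorted(types_of("ex:Neil_Armstrong", TRIPLES)))
```

The point of the ontology layer is visible here: the data only states that Neil Armstrong is an astronaut, yet a query for persons can still find him.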
  21. Named Entity Resolution
     Text: the Neil Armstrong example sentence from slide 19
     Diagram labels: Neil Armstrong; text; image annotation
  22. Named Entity Resolution: (1) Determine possible Entity Candidates
     Text: the Neil Armstrong example sentence from slide 19
     • linguistic analysis (POS tagging)
     • n-gram analysis
     • normalization (stemming)
     • encoding and spelling
     • language dependent spellings
     • abbreviations & acronyms
     • type dependent spellings
     • alternative names and synonyms
     • fuzzy string mapping
     • ...
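Of the techniques listed on slide 22, n-gram analysis is the easiest to sketch: every word n-gram of the text is looked up in a dictionary of known surface forms. This is a minimal illustration, assuming lowercase normalization and exact lookup; the label dictionary, the `ex:` identifiers, and the function names are hypothetical.

```python
import re

# Hypothetical surface forms mapped to candidate entities.
LABELS = {
    "houston": ["ex:Houston_city", "ex:Whitney_Houston"],
    "tranquility base": ["ex:Tranquility_Base"],
    "eagle": ["ex:Eagle_bird", "ex:Eagle_lunar_module"],
}


def ngrams(tokens, max_n):
    """Yield all word n-grams of length max_n down to 1."""
    for n in range(max_n, 0, -1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def entity_candidates(text, labels, max_n=3):
    """Return every known surface form found in the text with its candidates."""
    tokens = re.findall(r"\w+", text.lower())
    return {gram: labels[gram] for gram in ngrams(tokens, max_n) if gram in labels}


TEXT = "Houston, Tranquility Base here. The Eagle has landed."
print(entity_candidates(TEXT, LABELS))
```

The result is deliberately ambiguous ("Houston" and "Eagle" keep two candidates each); resolving that ambiguity is the job of the later filtering and disambiguation steps.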
  23. Named Entity Resolution: (2) Subsequent Filtering of Entity Candidates
     Text: the Neil Armstrong example sentence from slide 19
     • Named Entity Tagging: Persons; Locations; Organization; Time; Date; Money; ...
  24. (3) Disambiguation of Correct Entity
     • Which entity candidate to choose depends on the context
     • Context Analysis takes into account Ambiguity, Accuracy, and Reliability of source data and mapping
     Diagram labels: Temporal Context; Spatial Context; Social Context; Context Item; Contextual Description; Class Diversity; Level of Structure; Source Reliability; Source Diversity; Context Dimensions; influences; Ambiguity; Accuracy; determines; Relevance
     Source: N. Steinmetz, H. Sack: Semantic Multimedia Information Retrieval Based on Contextual Descriptions, ESWC 2013
  25. (3) Disambiguation of Correct Entity
     • Determine candidates for all entities within the given context: Neil Armstrong; Tranquility Base; Houston; Earth; Commander; Eagle; Mission Control
  26. (3) Disambiguation of Correct Entity
     • Look for existing connections/relations among entity candidates within the given context: Tranquility Base; Neil Armstrong; Mission Control
  27. (3) Disambiguation of Correct Entity
     • Link Graph Analysis
     Text: the Neil Armstrong example sentence from slide 19
  28. (3) Disambiguation of Correct Entity
     • Link Graph Analysis
     Text: the Neil Armstrong example sentence from slide 19
     Entities: Tranquility Base; Neil Armstrong; Mission Control; Houston; Eagle; Earth
  29. (3) Disambiguation of Correct Entity
     Diagram: the terms Neil Armstrong, Houston, Tranquility Base, mission control, Eagle, and earth, each mapped to its candidate entities
     • Link Graph Analysis
       □ identify connected components that cover the most term partitions
       □ only one node per partition should be covered
       □ strongly connected components consolidate the disambiguation
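The link graph analysis on slide 29 can be sketched as follows: build a graph over all entity candidates, find its connected components, and pick the component that covers the most terms. This is a simplified illustration, not the method from the cited paper. It assumes undirected links, that each candidate belongs to exactly one term, and that alphabetical order decides when a component holds several candidates for the same term; all `ex:` identifiers and links are hypothetical.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as sets of nodes."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def disambiguate(candidates, links):
    """Pick one entity per term from the component covering the most terms."""
    term_of = {e: term for term, entities in candidates.items() for e in entities}
    edges = [(a, b) for a, b in links if a in term_of and b in term_of]
    components = connected_components(term_of, edges)
    best = max(components, key=lambda comp: len({term_of[e] for e in comp}))
    chosen = {}
    for entity in sorted(best):
        # Simplification: keep only the first candidate per term partition.
        chosen.setdefault(term_of[entity], entity)
    return chosen


# Hypothetical candidates (term -> entities) and links between entities.
CANDIDATES = {
    "Neil Armstrong": ["ex:Neil_Armstrong"],
    "Tranquility Base": ["ex:Tranquility_Base"],
    "Houston": ["ex:Houston_city", "ex:Whitney_Houston"],
    "Eagle": ["ex:Eagle_bird", "ex:Eagle_lunar_module"],
}
LINKS = [
    ("ex:Neil_Armstrong", "ex:Tranquility_Base"),
    ("ex:Neil_Armstrong", "ex:Eagle_lunar_module"),
    ("ex:Tranquility_Base", "ex:Eagle_lunar_module"),
    ("ex:Houston_city", "ex:Eagle_lunar_module"),
]
print(disambiguate(CANDIDATES, LINKS))
```

In this toy graph the bird and the singer stay isolated, so the component around the moon landing wins and fixes the meaning of "Houston" and "Eagle".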
  30. (3) Disambiguation of Correct Entity
     • for our example
     Text: the Neil Armstrong example sentence from slide 19
  31. Overview (repeated): (1) Search & Retrieval; (2) Semantic Analysis; (3) Explorative Semantic Search; (4) Intelligent Recommendation
  32. Search vs. Exploration
  33. Search vs. Exploration
     VERNE, Jules: From the Earth to the Moon, Direct in 97 Hours 20 Minutes and a Trip Round It, Sampson Low, Marston & Company, London (1873), viii, 323 p. plates. GRC C.194.a.659, 12516.g.20
  34. Search vs. Exploration
     • Find another (“comparable”) book (that will interest me...)
     • Find books on related topics?
     • How did the author / the topic develop over time?
     • What else would I like to read?
  35. Search vs. Exploration
     • Find another (“comparable”) book (that will interest me...)
     • Find books on related topics?
     • How did the author / the topic develop over time?
     • What else would I like to read?
     ➞ Exploratory Search
  36. (Traditional) Libraries also enable Exploratory Search
  37. (Traditional) Librarians enable “intelligent” Recommendations
  38. Overview (repeated): (1) Search & Retrieval; (2) Semantic Analysis; (3) Explorative Semantic Search; (4) Intelligent Recommendation
  39. Exploratory Search based on Linked Open Data: http://dbpedia.org/resource/From_the_Earth_to_the_Moon
  40. Exploratory Search based on Linked Open Data
     Graph around :From_the_Earth_to_the_Moon (diagram labels):
     • rdf:type: dbpedia-owl:Book
     • dbpedia-owl:author: :Jules_Verne
     • dbpedia-owl:influenced: links :Jules_Verne and :H._G._Wells
     • dbprop:preceded_by: :In_Search_of_the_Castaways
     • dcterms:subject: category:1865_novels; category:French_science_fiction_novels; category:Novels_by_Jules_Verne; category:Moon_in_fiction; category:Fictional_rivalries; category:Novels_set_in_Florida; category:1860s_science_fiction_novels; ...
  41-42. Similar Results ➞ belong to consistent categories
     • category:French_science_fiction_novels
     • category:Moon_in_fiction
     • category:Novels_by_Jules_Verne
     • category:American_Civil_War_novels
     • category:Novels_set_in_Florida
     • category:1860s_science_fiction_novels
     • category:1865_novels
     • category:Fictional_rivalries
  43. Similar Results ➞ belong to consistent categories (same category list as on slides 41-42)
     Problem: too “similar” Recommendations (in the long run)
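Category-based similarity as on slides 41 to 43 can be sketched with a set-overlap measure. The talk does not name a specific measure; Jaccard similarity is swapped in here as one common choice, and the catalog entries are hypothetical placeholders rather than real DBpedia data.

```python
def jaccard(a, b):
    """Share of categories two items have in common, between 0.0 and 1.0."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(target, catalog):
    """Rank all other items by category overlap with the target, best first."""
    scored = [
        (jaccard(catalog[target], cats), name)
        for name, cats in catalog.items()
        if name != target
    ]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical items with hypothetical category sets.
CATALOG = {
    "seed": {"french_sf", "moon_in_fiction", "novels_by_verne", "1865_novels"},
    "book_x": {"french_sf", "moon_in_fiction", "novels_by_verne"},
    "book_y": {"french_sf"},
    "book_z": {"poetry"},
}
print(most_similar("seed", CATALOG))
```

Ranking by overlap keeps returning items from the same few categories, which is exactly the "too similar in the long run" problem stated on slide 43.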
  45. Serendipity helps to improve the Quality of Discovery & Exploration
     Serendipity = Relevance + Unexpectedness
     • finding a solution to a problem that is relevant but not intentionally thought of
     • a recommended item is a serendipitous discovery if it is interesting and positively surprising because one was not in search of anything of its kind
  46. Serendipity helps to improve the Quality of Discovery & Exploration
     Relevance:
     • In the general case, often referenced (cited) facts are considered more relevant
     • In the special case, the relevance of a fact must be adapted to the current (personal) context
     Unexpectedness:
     • the likelihood of co-occurrence should be low
     Exploration: combine
     • similarity based recommendations
     • with serendipitous but relevant findings
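The formula "Serendipity = Relevance + Unexpectedness" from slide 45 can be turned into a toy scoring function. This is a literal reading under stated assumptions: both inputs are normalized to [0, 1], unexpectedness is taken as one minus the co-occurrence likelihood, and the two parts are weighted equally. None of these choices come from the talk, and the candidate scores are hypothetical.

```python
def serendipity(relevance, cooccurrence_probability):
    """Relevance plus unexpectedness, where low co-occurrence means high surprise."""
    unexpectedness = 1.0 - cooccurrence_probability
    return relevance + unexpectedness


def rank_by_serendipity(candidates):
    """candidates: name -> (relevance, co-occurrence probability), both in [0, 1]."""
    return sorted(
        candidates,
        key=lambda name: (-serendipity(*candidates[name]), name),
    )


# Hypothetical scores, for illustration only.
CANDIDATES = {
    "similar_and_expected": (0.75, 0.75),
    "relevant_but_surprising": (0.5, 0.125),
    "surprising_but_irrelevant": (0.0, 0.0),
}
print(rank_by_serendipity(CANDIDATES))
```

With a plain sum, an irrelevant but maximally surprising item ties with a relevant but expected one, so a real system would likely need a relevance threshold or different weights.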
  47. Serendipity: Skip most common (similar) categories (classes)
     dcterms:subject: category:1865_novels; category:French_science_fiction_novels; category:Novels_by_Jules_Verne; category:Moon_in_fiction; category:Fictional_rivalries; category:Novels_set_in_Florida; category:1860s_science_fiction_novels; ...
     • The most common (similar) category for a specific entity contains the most similar entities for this specific entity
  48. Similarity ≈ Sharing common Properties
     Diagram: two entities point to the same value through the same property:
     • dcterms:subject ➞ category:Moon_in_fiction
     • dcterms:subject ➞ category:French_science_fiction_novels
     • dbprop:country ➞ France
     • dbpedia-owl:series ➞ dbpedia:Voyages_Extraordinaires
     • dbpedia-owl:literaryGenre ➞ dbpedia:Science_Fiction
     • rdf:type ➞ dbpedia-owl:Book
     • dbpedia-owl:author ➞ dbpedia:Jules_Verne
  49-52. Serendipity: Skip most common (similar) categories (classes)
     • category:French_science_fiction_novels
     • category:Moon_in_fiction
     • category:Novels_by_Jules_Verne
     • category:American_Civil_War_novels
     • category:Novels_set_in_Florida
     • category:1860s_science_fiction_novels
     • category:1865_novels
     • category:Fictional_rivalries
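One possible reading of "skip the most common categories" is sketched below: count how many other items share each of the seed's categories, drop the most widely shared ones, and recommend only items that are reachable through the remaining categories. The transcript does not spell out the actual procedure, so this is an assumption, and the catalog, the `skip` parameter, and the function name are hypothetical.

```python
def rare_category_picks(seed, catalog, skip=1):
    """Recommend items that share only the seed's less common categories."""
    seed_cats = catalog[seed]
    # Count how many other items share each category of the seed.
    counts = {cat: 0 for cat in seed_cats}
    for name, cats in catalog.items():
        if name != seed:
            for cat in cats & seed_cats:
                counts[cat] += 1
    ranked = sorted(seed_cats, key=lambda cat: (-counts[cat], cat))
    skipped, kept = set(ranked[:skip]), set(ranked[skip:])
    return sorted(
        name
        for name, cats in catalog.items()
        if name != seed and cats & kept and not cats & skipped
    )


# Hypothetical items with hypothetical category sets.
CATALOG = {
    "seed": {"verne", "moon", "florida"},
    "a": {"verne", "moon"},
    "b": {"verne"},
    "c": {"florida"},
    "d": {"moon", "poetry"},
}
print(rare_category_picks("seed", CATALOG, skip=1))
```

Raising `skip` moves the recommendations further away from the seed's dominant categories, trading similarity for surprise.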
  53. Serendipity: look for the least expected yet relevant...
  54-56. Serendipity: look for the least expected yet relevant...
     • category:French_science_fiction_novels
     • category:Moon_in_fiction
     • category:Novels_by_Jules_Verne
     • category:American_Civil_War_novels
     • category:Novels_set_in_Florida
     • category:1860s_science_fiction_novels
     • category:1865_novels
     • category:Fictional_rivalries
  57. Content-Based Recommendations. Industry Project: cENTERTAIN.me video recommendation, http://mediaglobe.yovisto.com:8080/c.me-gui-0.0.1-SNAPSHOT2/
  58. Overview (repeated): (1) Search & Retrieval; (2) Semantic Analysis; (3) Explorative Semantic Search; (4) Intelligent Recommendation
     Contact: Dr. Harald Sack, Hasso-Plattner-Institut für Softwaresystemtechnik, Universität Potsdam, Prof.-Dr.-Helmert-Str. 2-3, D-14482 Potsdam
