Improving the Search Experience
in a Social Network with Cross
Media Contents
Daniele Cenni, Paolo Nesi
University of Florence
Department of Systems and Informatics
Distributed Systems and Internet Technology Laboratory
Paolo.nesi@unifi.it
cenni@dsi.unifi.it , http://www.disit.dinfo.unifi.it
DMS2013, August 2013, UK, Paolo Nesi 1
ECLAP Social Network
 ECLAP is a Digital Library on Performing
Arts connected with Europeana
 ECLAP is a Best Practice and Social
Network (blogs, forums, comments,
tagging, voting, …)
Goals/Requirements
 Develop an Indexing/Searching solution for ECLAP Social
Network allowing:
 Indexing multilingual crossmedia content metadata and
data (e.g. documents)
 Indexing portal blogs, forums, events, group pages,
comments, etc.
 Efficient multilingual search (keyword search and advanced
search) supporting:
 misspelled words (e.g. shespeare)
 partial word search
 Sorting and filtering search results
 Re-indexing all data without blocking the system
 Logging and monitoring user activity
 …
 Evaluate the Indexing/Searching service
ECLAP: any kind of content
 Informative Content
 Video, audio, images,
documents
 3D, animations, Braille
 Slide, Video-Slide, courses
 eBook, ePub, Mpeg21,
intelligent
 Aggregated Content:
 Playlist, Collections
 Annotations,
Synchronization
 Support and networking
content:
 Blog, WebPage, Events,
comments,
forum, votes, messages, …
comments
rating
relationships
technical
Dynamic
recommend
……………
• Performance
• Master classes
• Scene Sketches
• Scenography
• Scenes
• Private lives of
artists
• Scores
• Braille
• BackStage Stills
• Choreography
• Morals
• Poster
• Booklets
• Magazines Music
• Audio ballets
ECLAP Semantic Model 1
[Figure: diagram of the ECLAP content model. Labels recoverable from the slide: Media Object, Video, Audio, Document, Image, Group/Channel, Collection, Playlist, AVObject, Annotation (Main Annotation, Side Annotation), Forum, WebPage, Blog, Comment, Content, TaxonomyTerm, GeoName, Crossmedia, Archive, Event, epub, 3D, Braille Music Score, Metadata, Performing Arts, Dublin Core, Technical, IPR. The associations and their multiplicities (0..n, 1..n, 1..2, 1) are not recoverable from the extracted text.]
ECLAP Semantic Model 2
[Figure: diagram of the relationships among User, Group/Channel, Content, Media Object, Comment, Annotation and TaxonomyTerm. Relation labels recoverable from the slide: foaf:member, admin, isProvidedBy, isFavouriteOf, dc:creator, foaf:topic_interest, isFeaturedBy, foaf:knows.]
Indexing
 Indexing & Search system
 Based on Apache Solr
 Multilingual aspects
 Translate the metadata or translate the query? Both:
 metadata translation
 Query translation
 Indexing schema
 Dublin Core + DCTerms (multi language)
 Performing Arts
 Technical (provider, content type, GPS, IPR, duration, quality, …)
 Groups associations (multi language)
 Taxonomy associations (multi language)
 Comments & multi language tags
 FullText of the textual digital resources
Indexing
Metadata Schema Indexing
Search Facilities
 Full text search
 Uses the catch-all fields to search for keywords in
the most important fields in all languages (title,
description, text, body, subject, …)
 Fuzzy search
 Allows matching mistyped words
 Deep search
 Allows searching for partial words
 Faceted Search
 Maximizing Precision and Recall:
 Relevance & boosting terms
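The fuzzy and partial-word facilities listed above can be illustrated with a minimal sketch. It assumes the Lucene/Solr query syntax (`term~N` for fuzzy matching, `prefix*` for prefix matching); the slides do not say which mechanism ECLAP actually uses, and the helper names here are hypothetical. The edit-distance function is a plain Levenshtein implementation, included only to show why the misspelling "shespeare" from the requirements slide is within reach of a two-edit fuzzy query.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_term(word: str, max_edits: int = 2) -> str:
    """Hypothetical helper: Lucene/Solr fuzzy syntax, e.g. 'shespeare~2'."""
    return f"{word}~{max_edits}"


def prefix_term(fragment: str) -> str:
    """Hypothetical helper: Lucene/Solr prefix syntax, e.g. 'shake*'."""
    return f"{fragment}*"


if __name__ == "__main__":
    print(edit_distance("shespeare", "shakespeare"))
    print(fuzzy_term("shespeare"), prefix_term("shake"))
```

The misspelled word differs from "shakespeare" by two missing letters, so a fuzzy query allowing two edits would be expected to match it.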
Search Facilities vs Information
Searching
 Faceted search
Weighted Query Model
 Where, for the query “q”:
 The weights are per-field boosts
 title is DC.Title, description is DC.Description, …
 body is the textual body, subject, …
 taxonomy is the full description of the taxonomy
branch
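The formula on this slide did not survive extraction, so the following is only a sketch of how per-field boosts are commonly expressed in Solr, not the authors' exact model. It assumes the eDisMax query parser and its `qf` parameter; the field names and weight values are hypothetical placeholders.

```python
from typing import Dict
from urllib.parse import urlencode


def build_weighted_query(q: str, weights: Dict[str, float]) -> Dict[str, str]:
    """Build Solr request parameters with one boost per field.

    Assumption: the eDisMax parser is used, where `qf` lists the
    searched fields as `field^boost` pairs separated by spaces.
    """
    qf = " ".join(f"{field}^{boost}" for field, boost in weights.items())
    return {"q": q, "defType": "edismax", "qf": qf}


if __name__ == "__main__":
    # Hypothetical weights; the tuned values are not given in the slides.
    example_weights = {"title": 3.0, "description": 2.0,
                       "body": 1.0, "taxonomy": 0.5}
    params = build_weighted_query("dance", example_weights)
    print(urlencode(params))
```

Keeping the weights in a plain dictionary makes it straightforward to hand the same structure to an optimizer that searches for better boost values.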
Model Optimization
 Optimization of Precision and Recall to
improve search quality
 50 reference queries
 Optimization Methods
 Simulated Annealing
 Genetic Algorithms
 7 parameters
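As an illustration of the first optimization method named above, here is a minimal simulated annealing sketch over a vector of seven weights. The objective is a toy stand-in (closeness to a made-up target vector); in the setting of the slides it would presumably be a retrieval-quality score computed over the 50 reference queries. The cooling schedule, step size and target values are all assumptions.

```python
import math
import random


def simulated_annealing(objective, initial, steps=2000, start_temp=1.0,
                        cooling=0.995, step_size=0.1, seed=0):
    """Maximize `objective` over a list of non-negative weights."""
    rng = random.Random(seed)
    current = list(initial)
    current_score = objective(current)
    best, best_score = list(current), current_score
    temp = start_temp
    for _ in range(steps):
        # Perturb one randomly chosen weight, keeping it non-negative.
        candidate = list(current)
        i = rng.randrange(len(candidate))
        candidate[i] = max(0.0, candidate[i] + rng.uniform(-step_size, step_size))
        score = objective(candidate)
        delta = score - current_score
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, score
            if current_score > best_score:
                best, best_score = list(current), current_score
        temp *= cooling
    return best, best_score


# Hypothetical target, used only to give the toy objective a maximum.
TARGET = [3.0, 2.0, 1.0, 0.5, 1.5, 0.8, 0.2]


def toy_objective(weights):
    """Negative squared distance to TARGET (higher is better)."""
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))


if __name__ == "__main__":
    best, score = simulated_annealing(toy_objective, [1.0] * 7)
    print(best, score)
```

A genetic algorithm could be plugged into the same objective interface, which is one way to compare the two methods on equal terms.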
Monte Carlo Analysis
MAP: Mean Average Precision
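For readers unfamiliar with the metric, the standard definition of Mean Average Precision can be sketched as follows. The document identifiers and relevance judgments in the usage example are made up; the slides do not describe how the reference judgments were collected.

```python
def average_precision(ranked, relevant):
    """Average of precision@k taken at each rank k holding a relevant item."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    # Relevant items that were never retrieved contribute zero.
    return total / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_list, relevant_set) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical results for two queries.
    runs = [
        (["a", "b", "c"], {"a", "c"}),
        (["x", "y"], {"y"}),
    ]
    print(mean_average_precision(runs))
```

In the first query the relevant items sit at ranks 1 and 3, giving (1/1 + 2/3) / 2; in the second the single relevant item sits at rank 2, giving 1/2.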
Some weights’ Trends
Comparative Results
MAP: Mean Average Precision
Usage Results
 More than 500,000 visits
 Time spent on the portal: 7.29 minutes
Assessment of Search Facility
 Distribution of performed clicks
First page
Conclusions
 An indexing solution for:
 cross-media content, with multilingual metadata and texts
 Improved searching and filtering of results, and thus the quality of the
user experience
 Providing full-text search (with operators), advanced search, faceted search, etc.
 Precision and Recall analysis made it possible to tune the search
services
 Simulated Annealing and Genetic Algorithms produced similar
results
 User behavior assessment has shown that appreciation of the search
facility has improved with respect to the earlier
settings, which were grounded on common sense and classical
metadata relevance
