Sirio, Orione and Pan: an Integrated Web System for
Ontology-based Video Search and Annotation
Marco Bertini, Gianpaolo D’Amico, Andrea Ferracani,
Marco Meoni and Giuseppe Serra
Media Integration and Communication Center, University of Florence, Italy
{bertini, damico, ferracani, meoni, serra}@dsi.unifi.it
http://www.micc.unifi.it/vim
SUBMITTED to ACM MULTIMEDIA 2010 DEMO PROGRAM
ABSTRACT
In this technical demonstration we show an integrated web system for
ontology-based video search and annotation. The system consists of
three components: the Orione¹ ontology-based search engine, the Sirio²
search interface, and the Pan³ web-based video annotation tool. The
system is currently being developed within the EU IM3I⁴ project. Its
goal is to provide an integrated environment for video annotation and
retrieval, for both technical and non-technical users. The search
engine has different interfaces that permit different query
modalities: free-text, natural language, graphical composition of
concepts using Boolean and temporal relations, and query by visual
example. In addition, the ontology structure is exploited to encode
semantic relations between concepts, which makes it possible, for
example, to expand queries to synonyms and concept specializations.
The annotation tool can be used to create ground-truth annotations to
train automatic annotation systems, or to complement the results of
automatic annotation, e.g. by adding geolocalized information.
Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information
Search and Retrieval—Search process; H.3.5 [Information
Storage and Retrieval]: Online Information Services—
Web-based services
General Terms
Algorithms, Experimentation
¹ Orione was a heroic and mythical hunter, first cited in Homer’s
Odyssey.
² Sirio was the hound of Orione. It was a dog so swift that no prey
could escape it.
³ Pan is a Greek divinity; his name means “all”.
⁴ http://www.im3i.eu
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
MM’10, October 25–29, 2010, Firenze, Italy.
Copyright 2010 ACM X-XXXXX-XX-X/XX/XX ...$10.00.
Keywords
Video retrieval, ontologies, web services
1. INTRODUCTION
Video search engines are the product of progress in many
technologies: visual and audio analysis, machine learning techniques,
as well as visualization and interaction. Automatic video annotation
systems are based on large sets of concept classifiers [4], typically
built with supervised machine learning techniques such as SVMs;
therefore there is a need to easily create ground-truth annotations
of videos, indicating the objects and events of interest. Current
video search engines are based on lexicons of semantic concepts and
perform keyword-based queries [3]. These systems are generally
desktop applications or have simple web interfaces that show the
results of the query as a ranked list of keyframes [2, 4]. They do
not let users perform composite queries that include temporal
relations between concepts, and they do not allow searching for
concepts that are not in the lexicon. In addition, desktop
applications require installation on the end-user computer and cannot
be used in a distributed environment.
Within the EU VidiVideo project, a survey was conducted to gather
user requirements for a video search engine; it was formulated
following the guidelines of IEEE Standard 830-1984 Guide to Software
Requirements Specifications. Forty qualified users from 10 different
countries participated; half of the users were from the scientific
and cultural heritage sector and half from the broadcasting and
entertainment industry. 73.5% consider a web-based interface
“mandatory” and 20.5% “desirable”; this requirement also emerged as
fundamental during the collection of user requirements carried out
within the IM3I project. Users also requested the possibility to
formulate complex composite queries and to expand queries based on
ontology reasoning, as well as an interface able to show concepts and
their relations. Regarding current annotation practices, the vast
majority of users rely on controlled lexicons and ontologies. All
these requirements have been taken into account in the design of the
integrated system.
Manual annotation tools have become popular on several mainstream
video sharing sites such as YouTube and Viddler, because they extend
the popular idea of image tagging by applying it to videos. This
information is managed in a collaborative web 2.0 style, since
annotations can be inserted not only by content owners (video
professionals, archivists, etc.), but also by end users, providing an
enhanced knowledge representation of multimedia content.
In this demonstration we present an integrated system composed of
i) a web video search engine that allows semantic retrieval by
content for different domains (broadcast news, surveillance, cultural
heritage documentaries) with query interaction and visualization;
ii) a web-based system for manual annotation of videos, developed
with the aim of collaboratively creating ground-truth annotations
that can be used for training and evaluating automatic video
annotation systems. Annotations can be exported to MPEG-7 and OWL
ontologies and are directly integrated with the ontology used by the
search engine.
The search engine permits different query modalities (free text,
natural language, graphical composition of concepts using Boolean and
temporal relations, and query by visual example) and visualizations,
resulting in an advanced tool for retrieval and exploration of video
archives for both technical and non-technical users. In addition, the
use of ontologies makes it possible to exploit semantic relations
between concepts through reasoning. Finally, our web system, built
according to the Rich Internet Application (RIA) paradigm, does not
require any installation and provides a responsive user interface.
2. THE SYSTEM
The search engine. The Sirio system⁵ comprises three different
interfaces, shown in Fig. 1: a GUI to build composite queries that
may include Boolean and temporal operators and visual examples, a
natural language interface for simpler queries with Boolean/temporal
operators, and a free-text interface for Google-like searches. When
building queries, the GUI lets users inspect a local view of the
ontology graph, to better understand how one concept is related to
the others, and thus suggests possible changes to the composition of
the query. In all the interfaces it is possible to extend queries by
adding synonyms and concept specializations through ontology
reasoning and the use of WordNet. Consider, for instance, the query
“Find shots with vehicles”: expansion to concept specializations
through the ontology structure retrieves not only the shots annotated
with vehicle, but also those annotated with its specializations
(cars, trucks, etc.). In particular, WordNet query expansion using
synonyms is required for natural language and free-text queries,
since the user cannot be forced to formulate a query by selecting
terms from a lexicon, as is done in the GUI.
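The expansion of the “vehicles” query can be illustrated with a
minimal Python sketch. It is not the Orione implementation: the toy
hierarchy, the synonym table standing in for a WordNet lookup, and the
function names are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical toy hierarchy: concept -> direct specializations ("is a" children).
SPECIALIZATIONS = {
    "vehicle": ["car", "truck"],
    "car": ["taxi"],
}

# Hypothetical synonym table standing in for a WordNet lookup.
SYNONYMS = {
    "car": ["automobile"],
}


def expand_concept(concept, use_synonyms=True):
    """Return the concept plus its transitive specializations and synonyms."""
    expanded = []
    seen = set()
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # guard against cycles and duplicate entries
        seen.add(current)
        expanded.append(current)
        # Breadth-first walk down the "is a" hierarchy.
        queue.extend(SPECIALIZATIONS.get(current, []))
        if use_synonyms:
            queue.extend(SYNONYMS.get(current, []))
    return expanded


def matching_shots(annotations, concept):
    """Return sorted ids of shots annotated with any term of the expanded query."""
    terms = set(expand_concept(concept))
    return sorted(shot for shot, labels in annotations.items()
                  if terms & set(labels))
```

With this sketch, a shot annotated only with "taxi" is returned for the
query "vehicle", because "taxi" is reached through "car" during the
walk down the hierarchy.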
The Orione search engine uses an ontology that has been created
automatically from a flat lexicon, using WordNet to create concept
relations (is a, is part of and has part) as shown, partially, in
Fig. 3. The ontology is modelled following the Dynamic Pictorially
Enriched Ontology model [1], which includes both concepts and visual
concept prototypes. These prototypes represent the different visual
modalities in which a concept can manifest itself; users can select
them to perform query by example. Concepts, concept relations, video
annotations and visual concept prototypes are defined using the
standard Web Ontology Language (OWL), so that the ontology can be
easily reused and shared. The queries created in each interface are
translated by the search engine into SPARQL, the W3C standard
ontology query language.
⁵ http://deckard.micc.unifi.it/im3i/
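As a rough illustration of the translation step, the following Python
sketch turns a list of (already expanded) concept names into a SPARQL
query string. The namespace, the `ex:hasAnnotation` property and the
function name are hypothetical and do not come from the Orione
ontology; the `VALUES` clause belongs to SPARQL 1.1, which is an
assumption about the target endpoint.

```python
import re

# Conservative check so that concept names are safe to use as local names.
_LOCAL_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_sparql(concepts, ns="http://example.org/video#"):
    """Build a SELECT query for shots annotated with any of the concepts.

    The namespace and the ex:hasAnnotation property are illustrative only.
    """
    if not concepts:
        raise ValueError("at least one concept is required")
    for name in concepts:
        if not _LOCAL_NAME.match(name):
            raise ValueError(f"unsupported concept name: {name!r}")
    values = " ".join(f"ex:{name}" for name in concepts)
    return (
        f"PREFIX ex: <{ns}>\n"
        "SELECT DISTINCT ?shot WHERE {\n"
        f"  VALUES ?concept {{ {values} }}\n"
        "  ?shot ex:hasAnnotation ?concept .\n"
        "}"
    )
```

Calling `build_sparql(["vehicle", "car", "truck"])` yields a single
query that matches a shot annotated with any of the three concepts, so
query expansion can stay outside the query language itself.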
The system backend is currently based on open source tools (i.e.
Apache Tomcat and the Red5 video streaming server) or freely
available commercial tools (Adobe Media Server has a free developer
edition). The RTMP video streaming protocol is used. The search
engine is developed in Java and supports multiple ontologies and
ontology reasoning services. The serialization of the ontology
structure and of concept instances has been designed so that
inference can be executed simultaneously on multiple ontologies
without slowing down retrieval; this design avoids the need to select
a specific ontology when creating a query. The engine has also been
designed to fit into a service-oriented architecture, so that it can
be incorporated into the customizable search systems, other than
Sirio, that are developed within the IM3I project. Audio-visual
concepts are automatically annotated using either the VidiVideo
annotation engine [4] or the IM3I annotation engine. The search
results are returned in RSS 2.0 XML format with paging, so that they
can be treated as RSS feeds. Results of the query are shown in the
interface, and the first frame of each video clip in the result set
is displayed. These frames are obtained from the video streaming
server and are shown within a small video player. Users can then play
the video sequence and, if interested, zoom in on each result,
displaying it in a larger player that provides more details on the
video metadata and allows better video browsing.
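A paged RSS 2.0 result list of the kind described above might be
produced along the following lines. This is only a sketch using the
Python standard library: the element contents, the `base_url`
parameter and the page-numbering scheme are assumptions, not the
format actually emitted by Orione.

```python
import xml.etree.ElementTree as ET


def results_to_rss(results, page, page_size,
                   base_url="http://example.org/search"):
    """Serialize one page of (title, video_url) results as RSS 2.0.

    Pages are numbered from 1; base_url is a placeholder address.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = f"Search results, page {page}"
    ET.SubElement(channel, "link").text = f"{base_url}?page={page}"
    ET.SubElement(channel, "description").text = (
        f"{len(results)} results in total")
    for title, video_url in results[start:start + page_size]:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = video_url
    return ET.tostring(rss, encoding="unicode")
```

Because each page is a complete RSS document, any feed reader or XML
parser can consume it without knowing anything about the search
engine.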
Figure 2: The video annotator tool.
The annotation tool. The video annotation system allows users to
browse videos, select concepts from an ontology structure, insert or
modify annotations, and download previously saved annotations
(Fig. 2). The annotation tool can work with multiple ontologies, to
accommodate different video domains. Ontologies can be imported and
exported in the MPEG-7 or OWL formats, or created from scratch by
users. The system administrator can set the ontologies as modifiable
by end users. Annotations, managed by the Orione search engine, can
be exported to both MPEG-7 and OWL ontology formats.
Figure 1: Search interfaces: GUI query builder (with ontology tree
and ontology graph); natural language search; Google-like search.
The system is implemented according to the Rich Internet Application
paradigm: all the user interfaces of the system are written in Adobe
Flex and ActionScript 3, and run in a Flash virtual machine inside a
web browser. RIAs offer many advantages for the user experience,
because they avoid the usual slow and synchronous loop for user
interactions, typical of web-based environments that use only the
HTML widgets available to standard browsers. This makes it possible
to implement a visual querying mechanism with a look and feel
approaching that of a desktop environment, with the fast response
that users expect. The approach combines the high levels of
interaction and collaboration obtainable with a web application with
the robustness in multimedia environments typical of a desktop
application. With this solution no installation is required, since
the application is updated on the server and runs regardless of the
operating system in use. All the modules of the system are connected
using HTTP POST, XML and SOAP web services. This allows the system to
be integrated within a web-services architecture such as that of the
IM3I project.
3. USABILITY TESTS
The video search engine (Sirio+Orione) was tested at the end of the
VidiVideo project to evaluate the usability of the system, with a set
of field trials. A group of 12 people from broadcasting, media
industry and cultural heritage institutions, in Italy and in the
Netherlands, tested the system on-line (running on the UniFi-MICC
servers), performing a set of pre-defined queries and interacting
with the system. The methodology follows the practices defined in the
ISO 9241 standard, and gathered: i) observational notes taken during
test sessions by monitors, ii) verbal feedback noted by the test
monitor and iii) an online survey completed after the tests by all
the users. The observational notes and the verbal feedback of the
users were analyzed to understand the most critical parts of the
system and, during the development of the IM3I project, were used to
redesign the functionalities of the Orione search engine and of the
Sirio interface. Fig. 4 summarizes the results of the tests. The
overall experience was very positive and the system proved to be easy
to use, despite the objective difficulty of interacting with a
complex system for which the testers received only very limited
training. The type of interaction that proved most suitable for the
majority of the users is the GUI-based interface, followed by the
Google-like one. The NLP interface received the lowest marks when
users were asked to rate its usefulness for their work, but it was
still considered easy to use. This is probably due to the fact that
this type of interface is not commonly used, so the average user is
not familiar enough with it.
Figure 3: An example of the relations between the DOME ontology
concepts.
4. DEMONSTRATION
We demonstrate the search modalities of the system in three different
video domains: broadcast news, video surveillance and cultural
heritage documentaries. We show how each interface is suitable for
different users: the GUI allows users to build composite queries that
also take metadata into account, as required by professional
archivists; the natural language interface allows users to build
simple queries with Boolean and temporal relations between concepts;
the free-text interface provides the popular Google-like search. We
also demonstrate the use of the annotation tool to integrate the
results obtained by automatic annotation systems, and the addition of
geolocalized annotations. Unlike other web-based tools, the proposed
system allows users to annotate the presence of objects and events in
frame-accurate time intervals, instead of annotating a few keyframes
or having to rely on offline video segmentation. The possibility for
annotators to extend the proposed ontologies allows expert users to
better represent the video domains as the videos are inspected. The
import and export capabilities based on the MPEG-7 standard format
support interoperability of the system.
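Frame-accurate interval annotations and the temporal relations used in
composite queries can be pictured with the small Python sketch below.
The `Annotation` class, its field names and the two relation functions
are hypothetical; they only illustrate the idea and are not taken from
Pan or Orione.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A concept observed over an inclusive range of frame numbers."""
    concept: str
    start_frame: int  # first frame in which the concept is present
    end_frame: int    # last frame in which the concept is present

    def __post_init__(self):
        if self.start_frame < 0 or self.end_frame < self.start_frame:
            raise ValueError("invalid frame interval")

    def seconds(self, fps):
        """Return the (start, end) of the interval in seconds at a given fps."""
        return (self.start_frame / fps, (self.end_frame + 1) / fps)


def before(a, b):
    """True if annotation a ends strictly before annotation b starts."""
    return a.end_frame < b.start_frame


def overlaps(a, b):
    """True if the two annotations share at least one frame."""
    return a.start_frame <= b.end_frame and b.start_frame <= a.end_frame
```

Storing frame numbers rather than keyframe identifiers keeps the
annotation independent of any prior shot segmentation, and temporal
relations reduce to integer comparisons.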
Acknowledgments.
This work was partially supported by the EU IST VidiVideo project
(www.vidivideo.info - contract FP6-045547) and the EU IST IM3I
project (http://www.im3i.eu/ - contract FP7-222267). The authors
thank Thomas M. Alisi and Nicola Martorana for their help in software
development.
5. REFERENCES
[1] M. Bertini, A. Del Bimbo, G. Serra, C. Torniai,
R. Cucchiara, C. Grana, and R. Vezzani. Dynamic
pictorially enriched ontologies for digital video libraries.
IEEE MultiMedia, 16(2):42–51, Apr/Jun 2009.
[2] A. Natsev, J. Smith, J. Tešić, L. Xie, R. Yan, W. Jiang,
and M. Merler. IBM Research TRECVID-2008 video
retrieval system. In Proceedings of the 6th TRECVID
Workshop, 2008.
[3] A. Smeaton, P. Over, and W. Kraaij. High-level feature
detection from video in TRECVid: a 5-year
retrospective of achievements. Multimedia Content
Analysis, Theory and Applications, pages 151–174,
2009.
[4] C. G. M. Snoek, K. E. A. van de Sande, O. de Rooij,
B. Huurnink, J. R. R. Uijlings, M. van Liempt,
M. Bugalho, I. Trancoso, F. Yan, M. A. Tahir,
K. Mikolajczyk, J. Kittler, M. de Rijke, J.-M.
Geusebroek, T. Gevers, M. Worring, D. C. Koelma,
and A. W. M. Smeulders. The MediaMill TRECVID
2009 semantic video search engine. In Proceedings of
the 7th TRECVID Workshop, Gaithersburg, USA,
November 2009.
Figure 4: Overview of usability tests: overall usability of the
system, preferred search modalities, usability of the combination of
search modalities.

Weitere ähnliche Inhalte

Ähnlich wie Sirio Orione and Pan

Automatic semantic content extraction in videos using a fuzzy ontology and ru...
Automatic semantic content extraction in videos using a fuzzy ontology and ru...Automatic semantic content extraction in videos using a fuzzy ontology and ru...
Automatic semantic content extraction in videos using a fuzzy ontology and ru...IEEEFINALYEARPROJECTS
 
Review on content based video lecture retrieval
Review on content based video lecture retrievalReview on content based video lecture retrieval
Review on content based video lecture retrievaleSAT Journals
 
Automatic Subtitle Generation for Sound in Videos
Automatic Subtitle Generation for Sound in VideosAutomatic Subtitle Generation for Sound in Videos
Automatic Subtitle Generation for Sound in VideosIRJET Journal
 
Video Data Visualization System : Semantic Classification and Personalization
Video Data Visualization System : Semantic Classification and Personalization  Video Data Visualization System : Semantic Classification and Personalization
Video Data Visualization System : Semantic Classification and Personalization ijcga
 
Video Data Visualization System : Semantic Classification and Personalization
Video Data Visualization System : Semantic Classification and Personalization  Video Data Visualization System : Semantic Classification and Personalization
Video Data Visualization System : Semantic Classification and Personalization ijcga
 
Automatic Subtitle Generation For Sound In Videos
Automatic Subtitle Generation For Sound In VideosAutomatic Subtitle Generation For Sound In Videos
Automatic Subtitle Generation For Sound In VideosAsia Smith
 
Video Liveness Verification
Video Liveness VerificationVideo Liveness Verification
Video Liveness Verificationijtsrd
 
The_IoT-vision_of_INATO_Ltd
The_IoT-vision_of_INATO_LtdThe_IoT-vision_of_INATO_Ltd
The_IoT-vision_of_INATO_LtdLyubomir Blagoev
 
Smart India Hackathon Idea Submission
Smart India Hackathon Idea SubmissionSmart India Hackathon Idea Submission
Smart India Hackathon Idea SubmissionGaurav Ganna
 
OGC Web Service Shibboleth Interoperability Experiment
OGC Web Service Shibboleth Interoperability ExperimentOGC Web Service Shibboleth Interoperability Experiment
OGC Web Service Shibboleth Interoperability ExperimentEDINA, University of Edinburgh
 
Video Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataVideo Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataTrieu Nguyen
 
Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708Sandro D'Elia
 

Ähnlich wie Sirio Orione and Pan (20)

Interactive Video Search and Browsing Systems
Interactive Video Search and Browsing SystemsInteractive Video Search and Browsing Systems
Interactive Video Search and Browsing Systems
 
Arneb
ArnebArneb
Arneb
 
Media Pick
Media PickMedia Pick
Media Pick
 
On Annotation of Video Content for Multimedia Retrieval and Sharing
On Annotation of Video Content for Multimedia  Retrieval and SharingOn Annotation of Video Content for Multimedia  Retrieval and Sharing
On Annotation of Video Content for Multimedia Retrieval and Sharing
 
On Linked Open Data (LOD)-based Semantic Video Annotation Systems
On Linked Open Data (LOD)-based  Semantic Video Annotation SystemsOn Linked Open Data (LOD)-based  Semantic Video Annotation Systems
On Linked Open Data (LOD)-based Semantic Video Annotation Systems
 
Automatic semantic content extraction in videos using a fuzzy ontology and ru...
Automatic semantic content extraction in videos using a fuzzy ontology and ru...Automatic semantic content extraction in videos using a fuzzy ontology and ru...
Automatic semantic content extraction in videos using a fuzzy ontology and ru...
 
Review on content based video lecture retrieval
Review on content based video lecture retrievalReview on content based video lecture retrieval
Review on content based video lecture retrieval
 
Automatic Subtitle Generation for Sound in Videos
Automatic Subtitle Generation for Sound in VideosAutomatic Subtitle Generation for Sound in Videos
Automatic Subtitle Generation for Sound in Videos
 
Video Data Visualization System : Semantic Classification and Personalization
Video Data Visualization System : Semantic Classification and Personalization  Video Data Visualization System : Semantic Classification and Personalization
Video Data Visualization System : Semantic Classification and Personalization
 
Video Data Visualization System : Semantic Classification and Personalization
Video Data Visualization System : Semantic Classification and Personalization  Video Data Visualization System : Semantic Classification and Personalization
Video Data Visualization System : Semantic Classification and Personalization
 
Automatic Subtitle Generation For Sound In Videos
Automatic Subtitle Generation For Sound In VideosAutomatic Subtitle Generation For Sound In Videos
Automatic Subtitle Generation For Sound In Videos
 
Video Liveness Verification
Video Liveness VerificationVideo Liveness Verification
Video Liveness Verification
 
Icme2011 demo poster
Icme2011 demo posterIcme2011 demo poster
Icme2011 demo poster
 
The_IoT-vision_of_INATO_Ltd
The_IoT-vision_of_INATO_LtdThe_IoT-vision_of_INATO_Ltd
The_IoT-vision_of_INATO_Ltd
 
Smart India Hackathon Idea Submission
Smart India Hackathon Idea SubmissionSmart India Hackathon Idea Submission
Smart India Hackathon Idea Submission
 
Shibboleth Federations and Secure SDI
Shibboleth Federations and Secure SDIShibboleth Federations and Secure SDI
Shibboleth Federations and Secure SDI
 
OGC Web Service Shibboleth Interoperability Experiment
OGC Web Service Shibboleth Interoperability ExperimentOGC Web Service Shibboleth Interoperability Experiment
OGC Web Service Shibboleth Interoperability Experiment
 
Video Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big dataVideo Ecosystem and some ideas about video big data
Video Ecosystem and some ideas about video big data
 
Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708Summer school bz_fp7research_20100708
Summer school bz_fp7research_20100708
 
K1803027074
K1803027074K1803027074
K1803027074
 

Kürzlich hochgeladen

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Kürzlich hochgeladen (20)

AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
Sirio, Orione and Pan: an Integrated Web System for Ontology-based Video Search and Annotation
Marco Bertini, Gianpaolo D’Amico, Andrea Ferracani, Marco Meoni and Giuseppe Serra
Media Integration and Communication Center, University of Florence, Italy
{bertini, damico, ferracani, meoni, serra}@dsi.unifi.it
http://www.micc.unifi.it/vim
SUBMITTED to ACM MULTIMEDIA 2010 DEMO PROGRAM

ABSTRACT
In this technical demonstration we show an integrated web system for video search and annotation based on ontologies. The system is composed of three components: the Orione¹ ontology-based search engine, the Sirio² search interface, and the Pan³ web-based video annotation tool. The system is currently being developed within the EU IM3I⁴ project. The goal of the system is to provide an integrated environment for video annotation and retrieval, for both technical and non-technical users. The search engine has different interfaces that permit different query modalities: free-text, natural language, graphical composition of concepts using Boolean and temporal relations, and query by visual example. In addition, the ontology structure is exploited to encode semantic relations between concepts, which makes it possible, for example, to expand queries to synonyms and concept specializations. The annotation tool can be used to create ground-truth annotations to train automatic annotation systems, or to complement the results of automatic annotation, e.g. by adding geolocalized information.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - Search process; H.3.5 [Information Storage and Retrieval]: Online Information Services - Web-based services

General Terms
Algorithms, Experimentation

¹ Orione was a heroic and mythical hunter, first cited in Homer’s Odyssey.
² Sirio was the hound of Orione. It was a dog so swift that no prey could escape it.
³ Pan was a Greek divinity; his name means “all”.
⁴ http://www.im3i.eu

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MM’10, October 25–29, 2010, Firenze, Italy. Copyright 2010 ACM X-XXXXX-XX-X/XX/XX ...$10.00.

Keywords
Video retrieval, ontologies, web services

1. INTRODUCTION
Video search engines are the product of progress in many technologies: visual and audio analysis, machine learning techniques, as well as visualization and interaction. Automatic video annotation systems are based on large sets of concept classifiers [4], typically built with supervised machine learning techniques such as SVMs; there is therefore a need to easily create ground-truth annotations of videos that indicate the objects and events of interest. Current video search engines are based on lexicons of semantic concepts and perform keyword-based queries [3]. These systems are generally desktop applications or have simple web interfaces that show the results of the query as a ranked list of keyframes [2, 4]. They do not let users perform composite queries that include temporal relations between concepts, nor do they allow users to look for concepts that are not in the lexicon. In addition, desktop applications require installation on the end-user computer and cannot be used in a distributed environment.

Within the EU VidiVideo project, a survey was conducted to gather user requirements for a video search engine; it was formulated following the guidelines of the IEEE Standard 830-1984 Guide to Software Requirements Specifications.
A total of 40 qualified users from 10 different countries participated; half of the users were from the scientific and cultural heritage sector and half from the broadcasting and entertainment industry. A web-based interface was considered “mandatory” by 73.5% of them and “desirable” by 20.5%; this requirement also emerged as fundamental during the collection of user requirements within the IM3I project. Users also requested the ability to formulate complex composite queries and to expand queries based on ontology reasoning, as well as an interface able to show concepts and their relations. Regarding current annotation practices, the vast majority use controlled lexicons and ontologies. All these requirements have been taken into account in the design of the integrated system.

Manual annotation tools have become popular on several mainstream video sharing sites such as YouTube and Viddler, because they extend the popular idea of image tagging by applying it to videos. This information is managed in a collaborative Web 2.0 style, since annotations can be inserted not only by content owners (video professionals, archivists, etc.) but also by end users, providing an enhanced knowledge representation of multimedia content.

In this demonstration we present an integrated system composed of i) a web video search engine that allows semantic retrieval by content for different domains (broadcast news, surveillance, cultural heritage documentaries) with query interaction and visualization; ii) a web-based system for manual annotation of videos, developed with the aim of collaboratively creating ground-truth annotations that can be used for training and evaluating automatic video annotation systems. Annotations can be exported to MPEG-7 and OWL ontologies and are directly integrated with the ontology used by the search engine.

The search engine permits different query modalities (free text, natural language, graphical composition of concepts using Boolean and temporal relations, and query by visual example) and visualizations, resulting in an advanced tool for retrieval and exploration of video archives for both technical and non-technical users. In addition, the use of ontologies makes it possible to exploit semantic relations between concepts through reasoning. Finally, our web system, which follows the Rich Internet Application (RIA) paradigm, does not require any installation and provides a responsive user interface.

2. THE SYSTEM
The search engine. The Sirio system⁵ is composed of three different interfaces, shown in Fig. 1: a GUI to build composite queries that may include Boolean and temporal operators and visual examples, a natural language interface for simpler queries with Boolean/temporal operators, and a free-text interface for Google-like searches. When building queries, the GUI lets users inspect a local view of the ontology graph, to better understand how one concept is related to the others; this suggests possible changes to the composition of the query.
In all the interfaces it is possible to extend queries by adding synonyms and concept specializations through ontology reasoning and the use of WordNet. Consider, for instance, the query “Find shots with vehicles”: expansion to concept specializations through the ontology structure retrieves not only the shots annotated with vehicle, but also those annotated with its specializations (cars, trucks, etc.). WordNet query expansion using synonyms is required in particular for natural language and free-text queries, since in those interfaces the user cannot be forced to formulate a query by selecting terms from a lexicon, as is done in the GUI.

The Orione search engine uses an ontology that has been created automatically from a flat lexicon, using WordNet to create concept relations (is a, is part of and has part), as partially shown in Fig. 3. The ontology is modelled following the Dynamic Pictorially Enriched Ontology model [1], which includes both concepts and visual concept prototypes. These prototypes represent the different visual modalities in which a concept can manifest; they can be selected by users to perform query by example. Concepts, concept relations, video annotations and visual concept prototypes are defined using the standard Web Ontology Language (OWL), so that the ontology can be easily reused and shared. The queries created in each interface are translated by the search engine into SPARQL, the W3C standard ontology query language.

⁵ http://deckard.micc.unifi.it/im3i/

The system backend is currently based on open source tools (i.e. Apache Tomcat and the Red5 video streaming server) or freely available commercial tools (Adobe Media Server has a free developer edition). The RTMP video streaming protocol is used. The search engine is developed in Java and supports multiple ontologies and ontology reasoning services.
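
The concept-specialization expansion described above for the query “Find shots with vehicles” can be illustrated with a minimal sketch. It is not the Orione implementation (which is written in Java and relies on OWL reasoning and SPARQL): the plain dictionary standing in for the ontology, the concept names, the shot identifiers and the functions `expand_concept` and `find_shots` are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical "is a" hierarchy: each concept maps to its direct specializations.
# These names are illustrative and are not taken from the system's ontology.
SPECIALIZATIONS = {
    "vehicle": ["car", "truck"],
    "car": ["taxi"],
}

# Hypothetical shot annotations: shot id -> concepts annotated on that shot.
ANNOTATIONS = {
    "shot_1": ["vehicle"],
    "shot_2": ["taxi", "street"],
    "shot_3": ["person"],
}


def expand_concept(concept, specializations):
    """Return the concept followed by all its direct and indirect specializations."""
    expanded = [concept]
    seen = {concept}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for child in specializations.get(current, []):
            if child not in seen:  # guard against repeated or cyclic entries
                seen.add(child)
                expanded.append(child)
                queue.append(child)
    return expanded


def find_shots(concept, annotations, specializations):
    """Return the ids of shots annotated with the concept or any of its specializations."""
    wanted = set(expand_concept(concept, specializations))
    return sorted(
        shot for shot, concepts in annotations.items() if wanted & set(concepts)
    )
```

With these sample data, `find_shots("vehicle", ANNOTATIONS, SPECIALIZATIONS)` also returns the shot annotated only with the hypothetical concept `taxi`, because the expansion follows the hierarchy transitively; a shot annotated only with `person` is not returned.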

The serialization of the ontology structure and of concept instances has been designed so that inference can be executed simultaneously on multiple ontologies without slowing retrieval; this design avoids the need to select a specific ontology when creating a query. The engine has also been designed to fit into a service-oriented architecture, so that it can be incorporated into the customizable search systems, other than Sirio, that are developed within the IM3I project. Audio-visual concepts are automatically annotated using either the VidiVideo annotation engine [4] or the IM3I annotation engine. The search results are in RSS 2.0 XML format with paging, so that they can be treated as RSS feeds. Results of the query are shown in the interface; for each video clip in the result set the first frame is shown. These frames are obtained from the video streaming server and are shown within a small video player. Users can then play the video sequence and, if interested, zoom in on each result, displaying it in a larger player that provides more details on the video metadata and allows better video browsing.

Figure 1: Search interfaces: GUI query builder (with ontology tree and ontology graph); natural language search; Google-like search.

Figure 2: The video annotator tool.

The annotation tool. The video annotation system allows users to browse videos, select concepts from an ontology structure, insert or modify annotations and download previously saved annotations (Fig. 2). The annotation tool can work with multiple ontologies, to accommodate different video domains. Ontologies can be imported and exported using the MPEG-7 or OWL ontology formats, or created from scratch by users. The system administrator can set the ontologies as modifiable by end users. Annotations, managed by the Orione search engine, can be exported to both MPEG-7 and OWL ontology formats.

The system is implemented according to the Rich Internet Application paradigm: all the user interfaces of the system are written in Adobe Flex and ActionScript 3, and run in a Flash virtual machine inside a web browser. RIAs offer many advantages for the user experience, because they avoid the slow, synchronous interaction loop typical of web-based environments that use only the HTML widgets available in standard browsers. This makes it possible to implement a visual querying mechanism with a look and feel approaching that of a desktop environment, with the fast response that users expect. Users thus get both the high levels of interaction and collaboration obtainable with a web application and the robustness in multimedia environments typical of a desktop application. With this solution no installation is required, since the application is updated on the server and runs anywhere, regardless of the operating system used.

All the modules of the system are connected using HTTP POST, XML and SOAP web services. This allows the system to be integrated within a web-services architecture such as that of the IM3I project.

3. USABILITY TESTS
The video search engine (Sirio+Orione) was tested at the end of the VidiVideo project, with a set of field trials, to evaluate the usability of the system. A group of 12 people from broadcasting, media industry and cultural heritage institutions in Italy and the Netherlands tested the system online (running on the UniFi-MICC servers), performing a set of predefined queries and interacting with the system. The methodology follows the practices defined in the ISO 9241 standard, and gathered: i) observational notes taken during test sessions by monitors, ii) verbal feedback noted by the test monitor and iii) an online survey completed after the tests by all the users. The observational notes and verbal feedback of the users have been analyzed to identify the most critical parts of the system and, during the development of the IM3I project, have been used to redesign the functionalities of the Orione search engine and of the Sirio interface.

Figure 3: An example of the relations between the DOME ontology concepts.

Fig. 4 summarizes the results of the tests. The overall experience was very positive and the system proved easy to use, despite the objective difficulty of interacting with a complex system for which the testers received only very limited training. The type of interaction that proved most suitable for the majority of the users was the GUI-based interface, followed by the Google-like one. The NLP interface received the lowest marks when users were asked to rate its usefulness for their work, but it was still considered easy to use. This is probably because this type of interface is not commonly used, so the average user is not familiar enough with it.

Figure 4: Overview of usability tests: overall usability of the system, preferred search modalities, usability of the combination of search modalities.

4. DEMONSTRATION
We demonstrate the search modalities of the system in three different video domains: broadcast news, video surveillance and cultural heritage documentaries. We show how each interface is suitable for different users: the GUI allows users to build composite queries that also take metadata into account, as required by professional archivists; the natural language interface allows users to build simple queries with Boolean and temporal relations between concepts; the free-text interface provides the popular Google-like search.

We also demonstrate the use of the annotation tool to integrate the results obtained by automatic annotation systems, and the addition of geolocalized annotations. Unlike other web-based tools, the proposed system allows users to annotate the presence of objects and events in frame-accurate time intervals, instead of annotating a few keyframes or having to rely on offline video segmentation. The possibility for annotators to extend the proposed ontologies allows expert users to better represent the video domains as the videos are inspected. The import and export capabilities using the MPEG-7 standard format allow interoperability of the system.

Acknowledgments.
This work was partially supported by the EU IST VidiVideo project (www.vidivideo.info - contract FP6-045547) and the EU IST IM3I project (http://www.im3i.eu/ - contract FP7-222267). The authors thank Thomas M. Alisi and Nicola Martorana for their help in software development.

5. REFERENCES
[1] M. Bertini, A. Del Bimbo, G. Serra, C. Torniai, R. Cucchiara, C. Grana, and R. Vezzani. Dynamic pictorially enriched ontologies for digital video libraries. IEEE MultiMedia, 16(2):42–51, Apr/Jun 2009.
[2] A. Natsev, J. Smith, J. Tešić, L. Xie, R. Yan, W. Jiang, and M. Merler. IBM Research TRECVID-2008 video retrieval system. In Proceedings of the 6th TRECVID Workshop, 2008.
[3] A. Smeaton, P. Over, and W. Kraaij. High-level feature detection from video in TRECVid: a 5-year retrospective of achievements. Multimedia Content Analysis, Theory and Applications, pages 151–174, 2009.
[4] C. G. M. Snoek, K. E. A. van de Sande, O. de Rooij, B. Huurnink, J. R. R. Uijlings, M. van Liempt, M. Bugalho, I. Trancoso, F. Yan, M. A. Tahir, K. Mikolajczyk, J. Kittler, M. de Rijke, J.-M. Geusebroek, T. Gevers, M. Worring, D. C. Koelma, and A. W. M. Smeulders. The MediaMill TRECVID 2009 semantic video search engine. In Proceedings of the 7th TRECVID Workshop, Gaithersburg, USA, November 2009.