A framework for visual search in broadcasting companies' multimedia archives
1. A framework for visual search in broadcast archives
Speakers:
Federico Maria Pandolfi, Davide Desirello
Rai Teche
2. Importance of proper organization and management of contents
Efficient search and retrieval methodologies are a must
Typical MAM (Media Asset Management) systems: text-based queries over textual information and metadata
Pros: reliability, robustness
Cons: metadata extraction is expensive, time-consuming and may not be available for every entry
No semantic or analytical representation of contents
No query-by-example or near-duplicate detection
Introduction
3. Rai's digital archives include (videos and images as of end 2015):
1,540,032 hours of video material
102,300 music sheets and documents
18,720 photos of scenic costumes
1,700 photos of set furniture
1,552 photos of Centro Elettronico Rai
The archive grows at a rate of approx. 130,000 hr/year, from both new and digitized legacy material
Only about 46% is annotated
Case study: numbers
4. Case study: possible IR scenarios
Archives
Correlating non-annotated material with similar pre-annotated contents (video-to-video search)
Retrieving a specific video/image in the multimedia archive from a clip, a single frame or a similar image (image/video-to-image/video search)
News
Linking an edited news report to its raw footage, and vice versa (video-to-video search)
Web
Finding a specific show from an image/clip (image/video-to-video search)
5. Content-Based Image Retrieval (CBIR) solutions are necessary
Representation of images by means of features automatically extracted from the contents themselves; no annotation needed
Large number of CBIR solutions available
Highly customizable to address specific needs (e.g. global/local/DCNN features, many efficient indexing and retrieval options, etc.)
The importance of CBIR
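As a toy illustration of what "features automatically extracted from the contents themselves" means, the sketch below computes a global color histogram, a deliberately simple stand-in for descriptors like CEDD; the synthetic pixel data and bin count are illustrative and not part of the Rai framework:

```python
def color_histogram(pixels, bins_per_channel=4):
    """Global descriptor: quantize each RGB channel and count occurrences."""
    step = 256 // bins_per_channel
    hist = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel ** 2
               + (g // step) * bins_per_channel
               + (b // step))
        hist[idx] += 1
    total = len(pixels) or 1
    return [count / total for count in hist]  # normalize to sum to 1

def l1_distance(h1, h2):
    """Simple L1 distance between two normalized histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

# Two synthetic "images": one mostly red, one mostly blue
red_img = [(200, 10, 10)] * 90 + [(10, 10, 200)] * 10
blue_img = [(10, 10, 200)] * 90 + [(200, 10, 10)] * 10
print(l1_distance(color_histogram(red_img), color_histogram(blue_img)))
```

No annotation is involved: the descriptor is derived purely from pixel values, which is what makes CBIR applicable to the non-annotated majority of the archive.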
6. Issue: many options for image search, few for video-to-video search
Issue: cutting-edge solutions offer solid absolute performance, but rely on complex systems and/or non-patent-free algorithms
An expensive and difficult-to-maintain platform is not ideal in an enterprise environment
Our approach
Solution: a new framework based on ready-to-use components compatible with Rai's enterprise infrastructure
Solution: a first approach based on a simpler, open-source solution
LIRe (Lucene Image Retrieval):
• CBIR platform with strong community support
• Easy to integrate with Apache Solr (widely used in Rai)
• Easy distributed search, index replication and scalability
7. Modules composing the framework (and their implementation):
Listener (custom files and folders manager)
Scene detector/key-frame extractor (FFmpeg)
Feature extractor (CEDD, LIRESOLR plugin)
Indexer (LIRESOLR plugin)
Retriever (LIRESOLR plugin)
Goal
Implement the workflow as independent logic blocks to:
Develop code in parallel
Easily replace blocks with better/more efficient solutions
Allow faster debugging and maintenance operations
Proposed workflow: modularity
8. Proposed workflow: Listener and indexing
The chain starts by indexing reference videos into the database; various entry points:
Shared folder
RESTful APIs
Listener:
A background process watching a shared folder (container) with files to be indexed
Manages the whole flow by issuing specific commands to the various components
Manages the folder structure
Triggered by a JSON token file containing the file list and parameters for the indexing process
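A minimal sketch of how such a JSON token might look and be validated. The field names (`files`, `scene_threshold`, `core`) are hypothetical, since the framework's actual token schema is not shown here:

```python
import json

# Hypothetical token-file content; the real schema used by the Rai
# listener is not public, so these fields are illustrative only.
token = json.dumps({
    "files": ["show_ep001.mxf", "show_ep002.mxf"],
    "scene_threshold": 0.4,
    "core": "ImageCore",
})

def parse_token(raw):
    """Validate a JSON token dropped in the watched folder."""
    data = json.loads(raw)
    if not data.get("files"):
        raise ValueError("token must list at least one file to index")
    return data

job = parse_token(token)
print(len(job["files"]))  # → 2 videos queued for indexing
```

Validating the token up front lets the listener reject malformed drops before it starts issuing commands to the downstream components.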
9. Framework targeted at image search on video files
Scene detection and key-frame extraction with FFmpeg
Generation of a CEDD feature descriptor (lightweight, low computational cost) for each key-frame
Indexing entries in Solr, using two cores:
ImageCore (ID, URI, descriptor)
MetaCore (other available metadata)
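FFmpeg's `select` filter can perform the scene-detection step described above. A sketch of building such a command line; the 0.4 scene-change threshold and output naming are illustrative defaults, not necessarily the framework's settings:

```python
def keyframe_cmd(video, out_dir, threshold=0.4):
    """Build an FFmpeg command that keeps one frame per detected scene cut.

    gt(scene, T) passes a frame only when FFmpeg's scene-change score
    exceeds T; -vsync vfr drops the timestamps of discarded frames.
    """
    return [
        "ffmpeg", "-i", video,
        "-vf", f"select='gt(scene,{threshold})'",
        "-vsync", "vfr",
        f"{out_dir}/keyframe_%04d.jpg",
    ]

cmd = keyframe_cmd("episode.mxf", "frames")
print(" ".join(cmd))
# subprocess.run(cmd, check=True) would execute it (requires FFmpeg on PATH)
```

Each extracted JPEG is then a candidate key-frame from which a CEDD descriptor can be computed and indexed.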
10. Simple retrieval algorithm:
1. Computation of the query image descriptor
2. A descriptor-specific distance is evaluated for each entry in the database
3. LIRe tweakable parameters:
Accuracy
Number of candidates
4. Results sorted by relevance, using distance as score
Proposed workflow: retrieval
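Step 2 above can be sketched with the Tanimoto distance, the measure LIRe commonly uses for CEDD descriptors. The 4-bin toy descriptors below are illustrative; real CEDD vectors have 144 bins:

```python
def tanimoto_distance(a, b):
    """Tanimoto distance: 0 for identical vectors, up to 1 for disjoint ones."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    return 0.0 if denom == 0 else 1.0 - dot / denom

def rank(query, index, candidates=10):
    """Score every indexed descriptor and return the closest matches."""
    scored = sorted(index.items(), key=lambda kv: tanimoto_distance(query, kv[1]))
    return scored[:candidates]

# Toy index of key-frame descriptors (IDs and values are made up)
index = {
    "frame_a": [3, 0, 1, 0],
    "frame_b": [0, 2, 0, 3],
    "frame_c": [3, 0, 1, 1],
}
print(rank([3, 0, 1, 0], index)[0][0])  # → frame_a
```

An exact duplicate of the query scores a distance of 0 and ranks first, which matches the slide's observation that indexed query shots are retrieved with near-perfect precision at rank 1.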
11. How to test the framework?
Lack of copyright-free datasets and evaluation frameworks targeting our specific use-case (to use as a reference)
Performing image search on the whole of Rai's archive is impossible; datasets selected (not annotated):
TG Leonardo (2200 episodes, approx. 360 hrs): thematic, science-focused newscast, suitable for news/reportage and raw-footage retrieval
Medita (2000 episodes, approx. 2000 hrs): educational show, suitable for testing pure image search and tagging-aid capabilities
Query images extracted from indexed videos using different techniques:
FFMpeg shot detection
Rai’s Shotfinder
Preliminary evaluation
12. Preliminary evaluation
The best match is not always found among the very first results
CEDD is a very compact descriptor: images with similar colours and textures may have very similar descriptors
Increasing the accuracy parameter increases retrieval time, with only slightly better results
Difficult to evaluate precision and recall for query images different from the indexed images (datasets not annotated yet)
If the query shot is indexed: P@1 ≈ 1; otherwise the distance increases substantially
Might be good enough for the raw footage/final edit matching use-case
13. Not yet able to find instances of the same objects across different videos and under different conditions (e.g. different video quality, framing, etc.); no semantic search
Likely a limitation of CEDD and, more generally, of global descriptors
Compact global descriptors may be good for specific tasks, but a more semantic approach is required
The quantitative tests presented are not yet mature
Building a proper dataset takes time, and our framework is still at an early stage of development
We plan to build our own annotated dataset using the company's archive material
Conclusions
14. Future work
Creation of a new
annotated dataset
containing raw and
edited material
Evaluation of better key-frame extraction and shot detection
algorithms:
Reduce the number of extracted key-frames
Weight key-frames according to their relevance within the
related sequence
Improve retrieval performance: decrease index size, reduce disk occupation and speed up search times
Evaluation of more sophisticated feature extraction
algorithms (local features, BoVW, DCNN feature vectors, ...)
In some cases a semantic search (based on image
contents) might be more useful
15. Thank you for watching
F. M. Pandolfi, D. Desirello
Rai Teche