Compact Descriptors 4 Visual Search
 Danilo Pau (danilo.pau@st.com)
 Senior Principal Engineer
 Senior Member of Technical Staff
 SMIEEE
 SI/CVRP
 STMicroelectronics/AST




                                    Courtesy: M. Funamizu
Agenda                             2



• Visual Search: Context

• MPEG initiative on Visual Search

• Compact Descriptors for Visual Search

• Implementation

• Use Cases

• Visual Search Evolution: Moving Pictures and 3D

• Questions and Answers




                                                     Presentation Title   15/01/2013
Visual Search Context                                       4


• Millions of images and videos continue to be uploaded to remote
  servers all over the world

   • 300 million photos are uploaded to Facebook each day

   • Roughly 58 photos uploaded each second

   • One hour of video uploaded to YouTube every second
Content-Based Image Retrieval (CBIR)                                                           5



• CBIR covers the concept of search that analyzes the actual content in
  the image, rather than relying on metadata.

• The development of this concept incorporated many algorithms and
  techniques from fields such as statistics, pattern recognition and
  computer vision.

• CBIR has attracted a lot of attention and, after many years of
  research, has reached the marketplace.

• The application of CBIR to the mobile market is called Mobile Visual Search.

• Visual Search is the capability to initiate a search using an
  image of a rigid object as the query.
   • The market potential of mobile visual search covers any mobile device with a camera
     (phones, tablets and hybrids).
CBIR vs QR Codes                                         6



• Quick Response (QR) codes are a type of two-dimensional barcode.

• The mobile imager scans the code to produce a URL for redirection
  and browsing.

• QR codes are used by 6.2% of smartphone users in the USA.
Lots of Existing Applications                           7

• Google’s Goggles
• Nokia’s Point and Find
• oMoby
• Like.com
• Kooaba
• Moodstocks
• Snaptell
• pixlinQ
• Bing




Existing Apps Use JPEG                                         8



• Existing applications use the mobile imager to send JPEG-compressed
  queries

[Figure: the mobile device sends JPEG images to a remote server, which consults its database and returns the visual search result]
An Example of Visual Search                              9




[Figure: a query image processed in three stages: interest point description, descriptor pairing, inliers. Courtesy Telecom Italia]
The Rise of Compressed Descriptors                                                                10



• Alternatively, send “compact features” extracted from raw images

• For example, Scale Invariant Feature Transform (SIFT) visual
  descriptors

• Consider 1200 descriptors per frame, each one 128 bytes plus 4 bytes
  for coordinates, at 30 fps: the network load is nearly 38 Mbit/s,
  which is unacceptable

[Figure: bar chart for a VGA image, size in KB (axis 0 to 160), comparing JPEG High, JPEG Low and SIFT]
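The 38 Mbit/s figure can be checked with a few lines of arithmetic. The sketch below only restates the numbers from the slide. The function names are illustrative, and 1 Mbit is assumed to mean 10^6 bits.

```python
# Back-of-the-envelope load for streaming uncompressed SIFT features.
DESCRIPTORS_PER_FRAME = 1200   # from the slide
DESCRIPTOR_BYTES = 128         # one byte per SIFT element
COORDINATE_BYTES = 4           # keypoint coordinates
FRAMES_PER_SECOND = 30


def sift_payload_bytes(n=DESCRIPTORS_PER_FRAME):
    """Bytes needed for one frame of raw descriptors plus coordinates."""
    return n * (DESCRIPTOR_BYTES + COORDINATE_BYTES)


def sift_bitrate_mbit(fps=FRAMES_PER_SECOND):
    """Network load in Mbit/s, assuming 1 Mbit = 10**6 bits."""
    return sift_payload_bytes() * 8 * fps / 1e6


print(sift_payload_bytes())   # 158400 bytes per frame
print(sift_bitrate_mbit())    # 38.016 Mbit/s
```

At roughly 155 KB per frame, the raw SIFT payload alone is already close to the top of the 0 to 160 KB axis of the chart.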
Systems Considered   11




• Instead of sending images (a),

• the application can send compact descriptors (b)

• and even perform the search locally (c).
Previous Attempts                                       12



• Hashing
   • Locality Sensitive Hashing [Yeo et al., 2008]
   • Similarity Sensitive Coding [Torralba et al., 2008]
   • Spectral Hashing [Weiss et al., 2008]

• Transform Coding
   • Karhunen-Loève Transform [Chandrasekhar et al., 2009]
   • ICA-based Transform [Narozny et al., 2008]

• Vector Quantization
   • Product Quantization [Jegou et al., 2010]
   • Tree Structured Vector Quantization [Nistér et al., 2006]

• Alternative to SIFT
   • Compressed Histogram of Gradients [Chandrasekhar et al., 2011]
Is a standard on Visual Search needed?                                14




• Reduce the load on wireless networks carrying visual search-related
  information

• Ensure interoperability of visual search applications and databases

• Enable hardware support for descriptor extraction and matching in
  mobile devices

• Enable a high level of performance of implementations conformant to
  the standard

• Simplify the design of descriptor extraction and matching for visual
  search applications
What is a suitable standardization body?                                              15
• Informal title:
       • Moving Picture Experts Group (MPEG)


• Formal title:
       • ISO/IEC JTC1 SC29 WG11 (Coding of Moving Pictures and Audio)

• Parent SDOs:
       •   ISO:     International Organization for Standardization
       •   IEC:     International Electrotechnical Commission
       •   JTC 1:   Joint Technical Committee One
       •   SC29:    Subcommittee 29: Coding of Audio, Picture,
                     Multimedia and Hypermedia Information

• Hierarchy: JTC 1 > SC29 > WG11 (MPEG)

• Members: National Bodies (25 voting, 16 observers)
CDVS : Scope                               18



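• Descriptor extraction process needed to ensure interoperability.

• Bitstream of compact descriptors

[Figure: Query image -> Descriptor extraction -> Descriptor bitstream -> Descriptor matching (against the Database) -> Geometric verification -> List of results. A box labelled "Standard" marks the descriptor extraction and descriptor bitstream stages.]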
Requirements                       19



Robustness
    High matching accuracy shall be achieved at least for images of textured
    rigid objects, landmarks, and printed documents.
    The matching accuracy shall be robust to changes in vantage points,
    camera parameters, lighting conditions, as well as in the presence of partial
    occlusions.
Sufficiency
    Descriptors shall be self-contained, in the sense that no other data are
    necessary for matching.
Compactness
    Shall minimize lengths/size of image descriptors
Scalability
    Shall allow adaptation of descriptor lengths to support the required
    performance level and database size.
    Shall enable design of web-scale visual search applications and
    databases.
How to achieve robustness                           20


• Image content is transformed into visual features whose coordinates
  are invariant to illumination, scale, rotation, affine and
  perspective transforms
Types of invariance   25




• Illumination

• Scale

• Rotation

• Affine Transform

• Full Perspective
Compactness                                        26




[Figure: bar chart for a VGA image, size in KB (axis 0 to 160), comparing JPEG High, JPEG Low, SIFT and compact descriptors of 512B, 1KB, 2KB, 4KB, 8KB and 16KB]
Extraction Pipeline                        27



[Figure: extraction pipeline]
Image -> Resizing -> Local description extraction (DoG -> SIFT) -> Keypoint selection -> Encoding -> Compact descriptors

Encoding blocks shown in the figure:
   • H Mode: Transform & SQ -> Arithmetic coding
   • S Mode: MSVQ encoding
   • Coordinate coding
   • SCFV descriptor

        H-Mode uses SQ encoding (256B)

        S-Mode uses MSVQ encoding (38KB)

        Both modes use SCFV (49KB)
Properties of SIFT           28

David Lowe’s local descriptor detection and extraction (1999-2004)
Extraordinarily robust matching technique
   • Can handle changes in viewpoint
      • Up to about 30 degrees of out-of-plane rotation
   • Can handle significant changes in illumination
      • Sometimes even day vs. night (below)
   • Lots of code available     http://www.vlfeat.org (BSD license)
Pyramid of DoG                          29

[Figure: Difference-of-Gaussians (DoG) pyramid, from Octave 1 to Octave n, each octave spanning Scale 1 to Scale m and producing a stack of DoGs]
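The pyramid in the figure can be sketched in plain Python. This is a simplified illustration, not the detector used in the deck. The base sigma of 1.6, the number of scales per octave and the naive 2x subsampling between octaves are all assumptions, and all function names are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    """Normalised 1-D Gaussian taps covering about three sigma each side."""
    radius = max(1, int(3 * sigma + 0.5))
    taps = [math.exp(-(i * i) / (2 * sigma * sigma))
            for i in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def blur(img, sigma):
    """Separable Gaussian blur with border replication."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])
    # Horizontal pass, then vertical pass.
    tmp = [[sum(k[j + r] * row[min(max(x + j, 0), w - 1)]
                for j in range(-r, r + 1))
            for x in range(w)] for row in img]
    return [[sum(k[j + r] * tmp[min(max(y + j, 0), h - 1)][x]
                 for j in range(-r, r + 1))
             for x in range(w)] for y in range(h)]


def dog_pyramid(img, octaves=2, scales=3, sigma0=1.6):
    """Return a list of octaves, each holding `scales` DoG images."""
    step = 2 ** (1.0 / scales)
    pyramid = []
    for _ in range(octaves):
        blurred = [blur(img, sigma0 * step ** s) for s in range(scales + 1)]
        # Each DoG is the difference of two adjacent blur levels.
        dogs = [[[b - a for a, b in zip(row_lo, row_hi)]
                 for row_lo, row_hi in zip(lo, hi)]
                for lo, hi in zip(blurred, blurred[1:])]
        pyramid.append(dogs)
        img = [row[::2] for row in img[::2]]   # next octave: half resolution
    return pyramid


# A single bright pixel on a dark 16x16 background.
image = [[0.0] * 16 for _ in range(16)]
image[8][8] = 255.0
pyr = dog_pyramid(image)
print(len(pyr), len(pyr[0]), len(pyr[0][0]), len(pyr[1][0]))
```

Interest points are then sought as local extrema across position and scale in these DoG stacks. That search is left out of the sketch.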
Actual Interest Point Detector Output   30
Building a Descriptor                                             31

• Take a 16x16 patch window around the detected interest point

• Subdivide the patch into 4x4 sub-patches

• For each sub-patch, create an 8-bin histogram over edge orientations
  (angles from 0 to 2π), weighted by gradient magnitude

• These lead to a 4x4x8 = 128 element vector: the SIFT descriptor
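The three steps above can be sketched as follows. This is a deliberately simplified illustration: it omits the Gaussian weighting, interpolation between bins, rotation normalisation and final vector normalisation of full SIFT. The function name is hypothetical.

```python
import math


def sift_like_descriptor(patch):
    """Build a 4x4x8 = 128 element orientation-histogram vector.

    `patch` is a 16x16 list of rows of grey levels.
    """
    size, cells, bins = 16, 4, 8
    cell = size // cells                      # each sub-patch is 4x4 pixels
    desc = [0.0] * (cells * cells * bins)
    for y in range(size):
        for x in range(size):
            # Central differences, clamped at the patch border.
            dx = patch[y][min(x + 1, size - 1)] - patch[y][max(x - 1, 0)]
            dy = patch[min(y + 1, size - 1)][x] - patch[max(y - 1, 0)][x]
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)   # in [0, 2*pi)
            b = int(angle / (2 * math.pi) * bins) % bins
            c = (y // cell) * cells + (x // cell)
            desc[c * bins + b] += magnitude              # magnitude-weighted vote
    return desc


# Horizontal ramp: every gradient points along +x, so only bin 0 is filled.
ramp = [[float(x) for x in range(16)] for _ in range(16)]
d = sift_like_descriptor(ramp)
print(len(d))    # 128
print(d[:8])     # histogram of the first sub-patch
```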
Key point selection                    32



• Basic idea: inlier features behave differently from outlier features
  in a statistical sense.

• Each keypoint gets a relevance value that takes into account its
  distance from the center, scale, orientation, peak, and the mean and
  variance of the SIFT descriptor.
Local Descriptor Compression H mode                       33



• Main idea is to generate a compressed descriptor from
  uncompressed SIFT by
   • Simple linear combinations of histograms
   • Scalar quantisation of resultant values
   • Adaptive Arithmetic coding

• Main benefits
   • Very low computational complexity
   • Negligible memory requirements
   • Highly scalable
   • Allows for very efficient matching and retrieval
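The first two stages listed above might look like the sketch below. The particular linear combinations (differences of opposite orientation bins), the ternary quantiser and its threshold are illustrative assumptions, not the combinations defined by CDVS. The adaptive arithmetic coder that would compress the resulting symbols is omitted.

```python
def transform_cell(h):
    """Simple linear combinations of one 8-bin orientation histogram.

    Assumption for illustration: differences of opposite orientation bins.
    """
    return [h[0] - h[4], h[1] - h[5], h[2] - h[6], h[3] - h[7]]


def quantise(value, threshold):
    """Ternary scalar quantiser: -1, 0 or +1 (threshold is hypothetical)."""
    if value > threshold:
        return 1
    if value < -threshold:
        return -1
    return 0


def compress_descriptor(sift, threshold=10):
    """Map a 128-element SIFT vector to 64 ternary symbols."""
    if len(sift) != 128:
        raise ValueError("expected a 128-element descriptor")
    symbols = []
    for c in range(16):                       # 16 sub-patch histograms
        cell = sift[c * 8:(c + 1) * 8]
        symbols.extend(quantise(v, threshold) for v in transform_cell(cell))
    return symbols


example = [0] * 128
example[0], example[4] = 50, 5     # strong response in bin 0 of the first cell
example[9], example[13] = 3, 40    # strong response in bin 5 of the second cell
symbols = compress_descriptor(example)
print(len(symbols))    # 64
print(symbols[:8])     # [1, 0, 0, 0, 0, -1, 0, 0]
```

Only additions, subtractions and comparisons are involved, which is in line with the low complexity and negligible memory claimed on the slide.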
Vector Quantizer Scheme: S-Mode    34
Location Encoding                     35




• Histogram Map: the positions of the nonzero bins are encoded as
  binary words by scanning the columns, and the words are compressed
  by arithmetic coding.
• Histogram Count: the number of coordinates in the nonzero bins is
  encoded iteratively, by first specifying which bins contain more
  than 1 keypoint, then which among these contain more than 2
  keypoints, and so forth.
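A minimal sketch of the two structures described above, before any arithmetic coding is applied. The block size, the scan order of the count layers and all function names are assumptions made for illustration.

```python
def location_histogram(points, width, height, block):
    """Count keypoints per block x block spatial bin (rows of counts)."""
    cols = (width + block - 1) // block
    rows = (height + block - 1) // block
    hist = [[0] * cols for _ in range(rows)]
    for x, y in points:
        hist[y // block][x // block] += 1
    return hist


def histogram_map(hist):
    """One binary word per column: 1 where the bin is non-empty."""
    rows, cols = len(hist), len(hist[0])
    return [[1 if hist[r][c] > 0 else 0 for r in range(rows)]
            for c in range(cols)]


def histogram_count_layers(hist):
    """Layer k flags which of the remaining bins hold more than k keypoints."""
    rows, cols = len(hist), len(hist[0])
    counts = [hist[r][c] for c in range(cols) for r in range(rows)
              if hist[r][c] > 0]              # non-empty bins, column scan
    layers, k = [], 1
    while counts:
        layers.append([1 if n > k else 0 for n in counts])
        counts = [n for n in counts if n > k]  # only these bins go on
        k += 1
    return layers


points = [(1, 1), (2, 3), (9, 1), (9, 2), (10, 3), (13, 14)]
hist = location_histogram(points, width=16, height=16, block=4)
print(histogram_map(hist))
# [[1, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
print(histogram_count_layers(hist))
# [[1, 1, 0], [0, 1], [0]]
```

In the example the three non-empty bins hold 2, 3 and 1 keypoints. The first layer flags the two bins with more than one keypoint, the second flags the one with more than two, and the third closes the sequence.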
Extraction times                                                          37




• SIFT interest point detection and feature extraction account for the
  largest share of the extraction time

• Global descriptor extraction is as complex as interest point detection

• Local descriptor and coordinate encoding are very fast

                                      Quantitative evaluation of CDVS extraction and pairwise matching   15/01/2013
Mobile Visual Search: Music CDs   39




[Figure: a query image of a music CD; the search result is used to stream music]
Visual Search: eReaders, Printers                                                 40

[Figure: use-case diagram with the following labels]
   • Paper copy: Snapshot, Initiate Visual Search, Send Compact Query
   • Multimedia Content Retrieval from the cloud, Selective quality & content printing
   • Content Augmentation: Mass Storage (3D models and markers), Transmission of markers and 3D models,
     Augmentation Rendering (2D / 3D Rendering), Composition of augmentations and image
News Finder: Still Pictures - Visual Search              41
Application and Use Cases from Broadcaster point of view                  42
• Logo Detection

• Interactive Fruition

                         Courtesy RAI
Automotive 3D Top View     43




            Cam
      ECU
Cam                                    Cam

            Cam
Automotive 3D Top View   44
Moving Pictures Visual Search                   45




                      Courtesy Telecom Design
Inter Predicted Descriptors                                    47



           Desirable Properties:

           • An inter descriptor is coded in a compact visual stream and
             expressed in terms of one or more temporally neighboring
             descriptors.
           • The "inter" part of the term refers to the use of inter-frame
             prediction.
           • Designed to achieve higher compression rates and/or better
             precision-recall performance.
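The idea of expressing a descriptor in terms of a temporal neighbour can be illustrated with plain residual coding. This is a generic predictive-coding sketch, not the scheme proposed in the deck. The descriptor values are made up, and the sum of absolute values is only a rough proxy for coding cost.

```python
def encode_inter(current, reference):
    """Residual of a descriptor against a temporally neighbouring one."""
    return [c - r for c, r in zip(current, reference)]


def decode_inter(residual, reference):
    """Rebuild the descriptor from the residual and the same reference."""
    return [d + r for d, r in zip(residual, reference)]


def magnitude(values):
    """Sum of absolute values, used here as a rough proxy for coding cost."""
    return sum(abs(v) for v in values)


# Made-up descriptors of the same keypoint in two consecutive frames.
previous = [12, 40, 7, 0, 3, 25, 18, 9]
current = [13, 38, 7, 1, 3, 26, 18, 8]

residual = encode_inter(current, previous)
print(residual)                                  # [1, -2, 0, 1, 0, 1, 0, -1]
print(magnitude(current), magnitude(residual))   # 114 6
print(decode_inter(residual, previous) == current)   # True
```

When neighbouring frames are similar, the residual is mostly small values, so an entropy coder can represent it in fewer bits than the descriptor itself.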
3D Mobile Devices Will Surpass 148 Million in 2015                                                          48
• Advances in 3D technology are very fast

• Industry adoption opens new opportunities: 3D Visual Search

• From In-Stat studies:
   • ~30% of all handheld game consoles will be 3D by 2015.
   • 3D mobile devices will increase demand for image sensors by 130%.
   • In 2012, notebooks will be the first 3D-enabled mobile devices to reach 1 million
     units.
   • By 2014, 18% of all tablets will be 3D.
   • Nintendo, Fuji, GoPro, Sony, ViewSonic, LG, Origin, Toshiba, Fujitsu, HP, ASUS,
     Lenovo, Dell, Alienware, HTC and Sharp are focusing on autostereoscopic mobile
     technologies
                                                                                       49

[Figure: 3D devices and services: Microsoft Kinect, Asus Xtion, LG Optimus 3D P920, LG Optimus Pad, 3DS by Nintendo, Google 3D Warehouse, HTC EVO 3D, Sharp Aquos SH-12C]
3D Object Recognition with Kinect                                            50


SHOT: Unique Signatures of Histograms for Local Surface Description




         http://www.youtube.com/watch?v=eRW1zG_aONk
         Courtesy: CV laboratory University of Bologna

Weitere ähnliche Inhalte

Ähnlich wie Compact Descriptors for Visual Search

Transcoding Characteristics of Web Images
Transcoding Characteristics of Web ImagesTranscoding Characteristics of Web Images
Transcoding Characteristics of Web ImagesVideoguy
 
A Ensemble Learning-based No Reference QoE Model for User Generated Contents
A Ensemble Learning-based No Reference QoE Model for User Generated ContentsA Ensemble Learning-based No Reference QoE Model for User Generated Contents
A Ensemble Learning-based No Reference QoE Model for User Generated ContentsDuc Nguyen
 
QoS for Media Networks
QoS for Media NetworksQoS for Media Networks
QoS for Media NetworksAmine Choukir
 
CGM (Computer Graphics Metafile) v SVG (Scalable Vector Graphic)
CGM (Computer Graphics Metafile) v SVG (Scalable Vector Graphic)CGM (Computer Graphics Metafile) v SVG (Scalable Vector Graphic)
CGM (Computer Graphics Metafile) v SVG (Scalable Vector Graphic)Vizualsite LLC
 
Recent trends and challenges in 360-degree video compression
Recent trends and challenges in 360-degree video compressionRecent trends and challenges in 360-degree video compression
Recent trends and challenges in 360-degree video compressionYan Ye
 
"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG
"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG
"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LGEdge AI and Vision Alliance
 
Software System Scalability: Concepts and Techniques (keynote talk at ISEC 2009)
Software System Scalability: Concepts and Techniques (keynote talk at ISEC 2009)Software System Scalability: Concepts and Techniques (keynote talk at ISEC 2009)
Software System Scalability: Concepts and Techniques (keynote talk at ISEC 2009)David Rosenblum
 
IRJET - Applications of Image and Video Deduplication: A Survey
IRJET -  	  Applications of Image and Video Deduplication: A SurveyIRJET -  	  Applications of Image and Video Deduplication: A Survey
IRJET - Applications of Image and Video Deduplication: A SurveyIRJET Journal
 
GWT 2014: Energy Conference - 02 Le soluzioni Geospaziali per il mondo energy
GWT 2014: Energy Conference - 02 Le soluzioni Geospaziali per il mondo energyGWT 2014: Energy Conference - 02 Le soluzioni Geospaziali per il mondo energy
GWT 2014: Energy Conference - 02 Le soluzioni Geospaziali per il mondo energyPlanetek Italia Srl
 
AIDC India - AI Vision Slides
AIDC India - AI Vision SlidesAIDC India - AI Vision Slides
AIDC India - AI Vision SlidesIntel® Software
 
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...Igor De Souza
 
“Selecting the Right Camera for Your Embedded Computer Vision Project,” a Pre...
“Selecting the Right Camera for Your Embedded Computer Vision Project,” a Pre...“Selecting the Right Camera for Your Embedded Computer Vision Project,” a Pre...
“Selecting the Right Camera for Your Embedded Computer Vision Project,” a Pre...Edge AI and Vision Alliance
 
World’s Fastest Image Serving Technology
World’s Fastest Image Serving TechnologyWorld’s Fastest Image Serving Technology
World’s Fastest Image Serving TechnologySiyathokoza Ngcobo
 
Cognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & Cons
Cognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & ConsCognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & Cons
Cognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & ConsSenturus
 
Track and Trace Solution Details
Track and Trace Solution DetailsTrack and Trace Solution Details
Track and Trace Solution DetailsPropix Technologies
 
Makine Öğrenmesi ile Görüntü Tanıma | Image Recognition using Machine Learning
Makine Öğrenmesi ile Görüntü Tanıma | Image Recognition using Machine LearningMakine Öğrenmesi ile Görüntü Tanıma | Image Recognition using Machine Learning
Makine Öğrenmesi ile Görüntü Tanıma | Image Recognition using Machine LearningAli Alkan
 
SMT Global Services
SMT Global ServicesSMT Global Services
SMT Global Servicessmtmarketing
 
O&M Challenge: BIM and Laser Scanning, a Hospital Project from the Middle East
O&M Challenge: BIM and Laser Scanning, a Hospital Project from the Middle EastO&M Challenge: BIM and Laser Scanning, a Hospital Project from the Middle East
O&M Challenge: BIM and Laser Scanning, a Hospital Project from the Middle EastCCT International
 

Ähnlich wie Compact Descriptors for Visual Search (20)

Transcoding Characteristics of Web Images
Transcoding Characteristics of Web ImagesTranscoding Characteristics of Web Images
Transcoding Characteristics of Web Images
 
A Ensemble Learning-based No Reference QoE Model for User Generated Contents
A Ensemble Learning-based No Reference QoE Model for User Generated ContentsA Ensemble Learning-based No Reference QoE Model for User Generated Contents
A Ensemble Learning-based No Reference QoE Model for User Generated Contents
 
QoS for Media Networks
QoS for Media NetworksQoS for Media Networks
QoS for Media Networks
 
CGM (Computer Graphics Metafile) v SVG (Scalable Vector Graphic)
CGM (Computer Graphics Metafile) v SVG (Scalable Vector Graphic)CGM (Computer Graphics Metafile) v SVG (Scalable Vector Graphic)
CGM (Computer Graphics Metafile) v SVG (Scalable Vector Graphic)
 
CGM versus SVG
CGM versus SVGCGM versus SVG
CGM versus SVG
 
Recent trends and challenges in 360-degree video compression
Recent trends and challenges in 360-degree video compressionRecent trends and challenges in 360-degree video compression
Recent trends and challenges in 360-degree video compression
 
"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG
"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG
"The Coming Shift from Image Sensors to Image Sensing," a Presentation from LG
 
Software System Scalability: Concepts and Techniques (keynote talk at ISEC 2009)
Software System Scalability: Concepts and Techniques (keynote talk at ISEC 2009)Software System Scalability: Concepts and Techniques (keynote talk at ISEC 2009)
Software System Scalability: Concepts and Techniques (keynote talk at ISEC 2009)
 
Parking Lot App
Parking Lot AppParking Lot App
Parking Lot App
 
IRJET - Applications of Image and Video Deduplication: A Survey
IRJET -  	  Applications of Image and Video Deduplication: A SurveyIRJET -  	  Applications of Image and Video Deduplication: A Survey
IRJET - Applications of Image and Video Deduplication: A Survey
 
GWT 2014: Energy Conference - 02 Le soluzioni Geospaziali per il mondo energy
GWT 2014: Energy Conference - 02 Le soluzioni Geospaziali per il mondo energyGWT 2014: Energy Conference - 02 Le soluzioni Geospaziali per il mondo energy
GWT 2014: Energy Conference - 02 Le soluzioni Geospaziali per il mondo energy
 
AIDC India - AI Vision Slides
AIDC India - AI Vision SlidesAIDC India - AI Vision Slides
AIDC India - AI Vision Slides
 
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...
Data Engineer, Patterns & Architecture The future: Deep-dive into Microservic...
 
“Selecting the Right Camera for Your Embedded Computer Vision Project,” a Pre...
“Selecting the Right Camera for Your Embedded Computer Vision Project,” a Pre...“Selecting the Right Camera for Your Embedded Computer Vision Project,” a Pre...
“Selecting the Right Camera for Your Embedded Computer Vision Project,” a Pre...
 
World’s Fastest Image Serving Technology
World’s Fastest Image Serving TechnologyWorld’s Fastest Image Serving Technology
World’s Fastest Image Serving Technology
 
Cognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & Cons
Cognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & ConsCognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & Cons
Cognos Dynamic Cubes:Set To Retire Transformer?: 10.2.2 Update: Pros & Cons
 
Track and Trace Solution Details
Track and Trace Solution DetailsTrack and Trace Solution Details
Track and Trace Solution Details
 
Makine Öğrenmesi ile Görüntü Tanıma | Image Recognition using Machine Learning
Makine Öğrenmesi ile Görüntü Tanıma | Image Recognition using Machine LearningMakine Öğrenmesi ile Görüntü Tanıma | Image Recognition using Machine Learning
Makine Öğrenmesi ile Görüntü Tanıma | Image Recognition using Machine Learning
 
SMT Global Services
SMT Global ServicesSMT Global Services
SMT Global Services
 
O&M Challenge: BIM and Laser Scanning, a Hospital Project from the Middle East
O&M Challenge: BIM and Laser Scanning, a Hospital Project from the Middle EastO&M Challenge: BIM and Laser Scanning, a Hospital Project from the Middle East
O&M Challenge: BIM and Laser Scanning, a Hospital Project from the Middle East
 

Mehr von Antonio Capone

CommTech Talks: Vehicle-to-Vehicle Communications in LTE and beyond
CommTech Talks: Vehicle-to-Vehicle Communications in LTE and beyondCommTech Talks: Vehicle-to-Vehicle Communications in LTE and beyond
CommTech Talks: Vehicle-to-Vehicle Communications in LTE and beyondAntonio Capone
 
CommTech Talks: Challenges for Video on Demand (VoD) services
CommTech Talks: Challenges for Video on Demand (VoD) servicesCommTech Talks: Challenges for Video on Demand (VoD) services
CommTech Talks: Challenges for Video on Demand (VoD) servicesAntonio Capone
 
CommTech Talks: Elastic Optical Devices for Software Defined Optical Networks
Compact Descriptors for Visual Search

  • 1. Compact Descriptors 4 Visual Search. Danilo Pau (danilo.pau@st.com), Senior Principal Engineer, Senior Member of Technical Staff, SMIEEE, SI/CVRP, STMicroelectronics/AST. Courtesy: M. Funamizu
  • 2. Agenda • Visual Search: Context • MPEG initiative on Visual Search • Compact Descriptors for Visual Search • Implementation • Use Cases • Visual Search Evolution: Moving Pictures and 3D • Questions and Answers Presentation Title 15/01/2013
  • 3. Agenda • Visual Search: Context • MPEG initiative on Visual Search • Compact Descriptors for Visual Search • Implementation • Use Cases • Visual Search Evolution: Moving Pictures and 3D • Questions and Answers
  • 4. Visual Search Context • Millions of images and videos are continually uploaded to remote servers all over the world • Each day 300 million photos are uploaded to Facebook • Roughly 58 photos are uploaded each second • One hour of video is uploaded to YouTube every second
  • 5. Content Based Image Recognition • CBIR covers search that analyzes the actual content of the image rather than relying on metadata. • Its development incorporated many algorithms and techniques from fields such as statistics, pattern recognition and computer vision. • CBIR attracted a lot of attention and, after many years of research, has expanded into the marketplace. • CBIR’s application in the mobile market is called Mobile Visual Search. • Visual Search is the capability to initiate a search using an image of a rigid object as the query. • The market potential of mobile visual search covers any mobile device with a camera (phones, tablets and hybrids).
  • 6. CBIR vs QR Codes • Quick Response (QR) codes are a type of two-dimensional barcode. • The code is scanned by the mobile imager to produce a URL for redirection and browsing. • QR codes are used by 6.2% of smartphone users in the USA.
  • 7. Lots of Existing Applications • Google’s Goggles • Nokia’s Point and Find • oMoby • Like.com • Kooaba • Moodstocks • Snaptell • pixlinQ • Bing
  • 8. Existing Apps use JPEG • Existing applications use the mobile imager to send JPEG-compressed queries [Diagram: the mobile device sends JPEG images to a remote server with a database, which returns the visual search result]
  • 9. An Example of Visual Search [Figure panels: Query, Interest Point Description, Descriptor pairing, Inliers] Courtesy Telecom Italia
  • 10. The Rise of Compressed Descriptors • Alternatively, send “compact features” extracted from raw images • For example, Scale Invariant Feature Transform (SIFT) visual descriptors • Consider 1200 descriptors per frame, each of 128 bytes plus 4 bytes for coordinates, at 30 fps: the network load is nearly 38 Mbit/s, which is unacceptable [Chart: size in KB of a VGA image as JPEG High, JPEG Low and SIFT; y-axis from 0 to 160 KB]
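The 38 Mbit/s figure on slide 10 can be reproduced with a short calculation. The sketch below uses only the numbers quoted on the slide; the constant and function names are illustrative, not part of the original deck.

```python
# Back-of-the-envelope network load for streaming raw SIFT descriptors.
# All figures come from slide 10; the names are illustrative.
DESCRIPTORS_PER_FRAME = 1200
DESCRIPTOR_BYTES = 128      # 128-element SIFT vector, one byte per element
COORDINATE_BYTES = 4        # keypoint coordinates
FRAMES_PER_SECOND = 30


def sift_load_bits_per_second(n=DESCRIPTORS_PER_FRAME, fps=FRAMES_PER_SECOND):
    """Return the uncompressed descriptor bit rate in bit/s."""
    bytes_per_frame = n * (DESCRIPTOR_BYTES + COORDINATE_BYTES)
    return bytes_per_frame * 8 * fps


load = sift_load_bits_per_second()
print(f"{load / 1e6:.1f} Mbit/s")  # prints "38.0 Mbit/s"
```

One frame alone carries 1200 x 132 = 158,400 bytes, which is in line with the SIFT bar of the chart on the same slide.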
  • 11. Systems Considered • Instead of sending images (a), • the application can send compact descriptors (b) • and even perform the search locally (c).
  • 12. Previous Attempts • Hashing • Locality Sensitive Hashing [Yeo et al., 2008] • Similarity Sensitive Coding [Torralba et al., 2008] • Spectral Hashing [Weiss et al., 2008] • Transform Coding • Karhunen-Loève Transform [Chandrasekhar et al., 2009] • ICA-based Transform [Narozny et al., 2008] • Vector Quantization • Product Quantization [Jegou et al., 2010] • Tree Structured Vector Quantization [Nistér et al., 2006] • Alternative to SIFT • Compressed Histogram of Gradients [Chandrasekhar et al., 2011]
  • 13. Agenda • Visual Search: Context • MPEG initiative on Visual Search • Compact Descriptors for Visual Search • Implementation • Use Cases • Visual Search Evolution: Moving Pictures and 3D • Questions and Answers
  • 14. Is a standard on Visual Search needed? • Reduce the load on wireless networks carrying visual search-related information, • Ensure interoperability of visual search applications and databases, • Enable hardware support for descriptor extraction and matching in mobile devices, • Enable a high level of performance of implementations conformant to the standard, • Simplify the design of descriptor extraction and matching for visual search applications.
  • 15. What is a suitable standardization body? • Informal title: Moving Picture Experts Group (MPEG) • Formal title: ISO/IEC JTC1 SC29 WG11 (Coding of Moving Pictures and Audio) • Parent SDOs: • ISO: International Organization for Standardization • IEC: International Electrotechnical Commission • JTC 1: Joint Technical Committee One • SC29: Subcommittee 29: Coding of Audio, Picture, Multimedia and Hypermedia Information • Members: National Bodies (25 voting, 16 observers) [Diagram: hierarchy JTC 1 > SC29 > WG11 (MPEG)]
  • 16.
  • 17. Agenda • Visual Search: Context • MPEG initiative on Visual Search • Compact Descriptors for Visual Search • Implementation • Use Cases • Visual Search Evolution: Moving Pictures and 3D • Questions and Answers
  • 18. CDVS: Scope • Descriptor extraction process needed to ensure interoperability. • Bitstream of compact descriptors. [Diagram, labelled "Standard": Query image, Descriptor extraction, Descriptor bitstream, Descriptor matching (with Database), Geometric verification, List of results]
  • 19. Requirements • Robustness: High matching accuracy shall be achieved at least for images of textured rigid objects, landmarks, and printed documents. The matching accuracy shall be robust to changes in vantage points, camera parameters, lighting conditions, as well as in the presence of partial occlusions. • Sufficiency: Descriptors shall be self-contained, in the sense that no other data are necessary for matching. • Compactness: Shall minimize the length/size of image descriptors. • Scalability: Shall allow adaptation of descriptor lengths to support the required performance level and database size. Shall enable design of web-scale visual search applications and databases.
  • 20. How to achieve robustness • Image content is transformed into visual features with coordinates that are invariant to illumination, scale, rotation, affine and perspective transforms
  • 21. Types of invariance • Illumination
  • 22. Types of invariance • Illumination • Scale
  • 23. Types of invariance • Illumination • Scale • Rotation
  • 24. Types of invariance • Illumination • Scale • Rotation • Affine Transform
  • 25. Types of invariance • Illumination • Scale • Rotation • Affine Transform • Full Perspective
  • 26. Compactness [Chart: size in KB of a VGA image as JPEG High, JPEG Low and SIFT, compared with compact descriptors of 512B, 1KB, 2KB, 4KB, 8KB and 16KB; y-axis from 0 to 160 KB]
  • 27. Extraction Pipeline [Diagram: Extraction (Image resizing, DoG, SIFT, Keypoint selection) followed by Encoding into Compact descriptors: Local description with Mode selection between H Mode (Transform & SQ, Arithmetic coding) and S Mode (MSVQ encoding), Coordinate coding, and the SCFV descriptor] • H-Mode uses SQ encoding (256B) • S-Mode uses MSVQ encoding (38KB) • Both modes use SCFV (49KB)
  • 28. Properties of SIFT • David Lowe’s local descriptor detection and extraction (1999-2004) • Extraordinarily robust matching technique • Can handle changes in viewpoint • Up to about 30 degrees of out-of-plane rotation • Can handle significant changes in illumination • Sometimes even day vs. night • Lots of code available: http://www.vlfeat.org (BSD license)
  • 29. Pyramid of DoG [Diagram: Octave 1 to Octave n, each spanning Scale 1 to Scale m and producing its DoGs]
  • 30. Actual Interest Point Detector Output
  • 31. Building a Descriptor • Take a 16x16 patch window around the detected interest point • Subdivide the patch into 4x4 sub-patches • For each sub-patch, create an 8-bin histogram over edge orientations (angle from 0 to 2π), weighted by magnitude • These lead to a 4x4x8 = 128 element vector: the SIFT descriptor
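The steps on slide 31 can be sketched in a few lines of plain Python. This is a simplified illustration, not Lowe's full method: it assumes the patch is already a 16x16 grid of grey levels, and it omits the Gaussian weighting, interpolation between bins and rotation normalisation that real SIFT implementations apply. The function name is hypothetical.

```python
import math


def sift_like_descriptor(patch):
    """Build a 4x4x8 = 128 element orientation histogram from a 16x16 patch.

    patch: 16x16 list of lists of grey levels.
    Simplified sketch: no Gaussian weighting, no bin interpolation,
    no rotation normalisation.
    """
    size, cells, bins = 16, 4, 8
    cell = size // cells                      # each sub-patch is 4x4 pixels
    hist = [0.0] * (cells * cells * bins)
    for y in range(size):
        for x in range(size):
            # Central differences, clamped at the patch border.
            dx = patch[y][min(x + 1, size - 1)] - patch[y][max(x - 1, 0)]
            dy = patch[min(y + 1, size - 1)][x] - patch[max(y - 1, 0)][x]
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)   # 0 to 2*pi
            b = min(int(angle / (2 * math.pi) * bins), bins - 1)
            index = ((y // cell) * cells + (x // cell)) * bins + b
            hist[index] += magnitude          # vote weighted by magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# A horizontal ramp has gradients pointing along +x only (orientation bin 0).
ramp = [[x for x in range(16)] for _ in range(16)]
descriptor = sift_like_descriptor(ramp)
print(len(descriptor))  # prints 128
```

The final L2 normalisation is a common choice for making the vector less sensitive to contrast changes; it is included here as an assumption rather than something the slide states.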
  • 32. Key point selection • Basic idea: inlier features do not behave, in a statistical sense, like outlier features. • A relevance value is computed by taking into account the distance from the center, scale, orientation, peak, and the mean and variance of the SIFT descriptor.
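One way to picture the relevance value of slide 32 is as a product of per-characteristic scores, with the highest-ranked keypoints kept. The sketch below assumes that combination rule, and its score functions are made-up placeholders; in practice they would be learned from the statistics of inliers and outliers in training data. All names are hypothetical.

```python
def relevance(keypoint, score_functions):
    """Multiply the per-characteristic scores into one relevance value."""
    value = 1.0
    for name, score in score_functions.items():
        value *= score(keypoint[name])
    return value


def select_keypoints(keypoints, score_functions, budget):
    """Keep the `budget` keypoints with the highest relevance."""
    ranked = sorted(
        keypoints,
        key=lambda k: relevance(k, score_functions),
        reverse=True,
    )
    return ranked[:budget]


# Placeholder score functions (assumptions, not taken from the standard).
SCORES = {
    "distance_from_center": lambda d: 1.0 / (1.0 + d),  # d normalised to [0, 1]
    "peak": lambda p: min(abs(p), 1.0),                 # DoG peak response
}

keypoints = [
    {"id": "a", "distance_from_center": 0.1, "peak": 0.8},
    {"id": "b", "distance_from_center": 0.9, "peak": 0.9},
    {"id": "c", "distance_from_center": 0.2, "peak": 0.1},
]
kept = select_keypoints(keypoints, SCORES, budget=2)
print([k["id"] for k in kept])  # prints ['a', 'b']
```

The remaining characteristics named on the slide (scale, orientation, mean and variance of the descriptor) would be added as further entries in the score table.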
  • 33. Local Descriptor Compression: H mode • The main idea is to generate a compressed descriptor from the uncompressed SIFT by • Simple linear combinations of histograms • Scalar quantisation of the resultant values • Adaptive arithmetic coding • Main benefits • Very low computational complexity • Negligible memory requirements • Highly scalable • Allows for very efficient matching and retrieval
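The first two H-mode steps can be illustrated with a toy example. The slide does not say which linear combinations or quantiser levels are used, so the sketch below assumes differences of neighbouring orientation bins and a three-level quantiser with a fixed threshold; the adaptive arithmetic coding stage is left out. Function names and the threshold are hypothetical.

```python
def transform(hist8):
    """Assumed linear combination: circular differences of neighbouring bins."""
    n = len(hist8)
    return [hist8[i] - hist8[(i + 1) % n] for i in range(n)]


def scalar_quantise(values, threshold):
    """Three-level scalar quantiser: -1, 0 or +1 per element."""
    return [-1 if v < -threshold else 1 if v > threshold else 0 for v in values]


def compress_descriptor(sift128, threshold=0.05):
    """Transform and quantise each of the 16 eight-bin histograms in turn."""
    symbols = []
    for start in range(0, 128, 8):
        cell_hist = sift128[start:start + 8]
        symbols.extend(scalar_quantise(transform(cell_hist), threshold))
    return symbols  # these symbols would then be fed to an arithmetic coder


example = [0.0] * 128
example[0] = 1.0                       # all energy in the first bin
print(compress_descriptor(example)[:8])  # prints [1, 0, 0, 0, 0, 0, 0, -1]
```

Both steps need only additions and comparisons per element, which is consistent with the low complexity and negligible memory listed as benefits on the slide.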
  • 35. Location Encoding • Histogram Map: the positions of the nonzero bins are encoded as binary words by scanning columns, and the words are compressed by arithmetic coding. • Histogram Count: the number of coordinates in the nonzero bins is encoded iteratively, by first specifying which bins contain more than 1 keypoint, then which among these contain more than 2 keypoints, and so forth.
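The two structures on slide 35 can be made concrete with a small sketch. It assumes keypoint coordinates are quantised onto a regular grid of square blocks (the block size is an arbitrary choice here) and stops before the arithmetic coding stage. Function names are hypothetical, and coordinates are assumed to lie inside the image.

```python
def location_histogram(points, width, height, block):
    """Count keypoints per block; returns counts[row][col]."""
    cols = -(-width // block)    # ceiling division
    rows = -(-height // block)
    counts = [[0] * cols for _ in range(rows)]
    for x, y in points:
        counts[int(y) // block][int(x) // block] += 1
    return counts


def histogram_map(counts):
    """One binary word per column: 1 where a bin holds at least one keypoint."""
    rows, cols = len(counts), len(counts[0])
    return [[1 if counts[r][c] > 0 else 0 for r in range(rows)]
            for c in range(cols)]


def histogram_count(counts):
    """Iterative planes: which nonzero bins hold more than 1, then 2, ..."""
    rows, cols = len(counts), len(counts[0])
    active = [counts[r][c] for c in range(cols) for r in range(rows)
              if counts[r][c] > 0]
    planes, level = [], 1
    while active:
        planes.append([1 if n > level else 0 for n in active])
        active = [n for n in active if n > level]
        level += 1
    return planes


points = [(1, 1), (2, 2), (3, 3), (10, 1), (10, 12), (11, 13)]
counts = location_histogram(points, width=16, height=16, block=8)
print(counts)                   # prints [[3, 1], [0, 2]]
print(histogram_map(counts))    # prints [[1, 0], [1, 1]]
print(histogram_count(counts))  # prints [[1, 0, 1], [1, 0], [0]]
```

Each count plane only covers the bins that survived the previous plane, so sparse keypoint layouts produce very short planes before entropy coding.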
  • 36. Agenda • Visual Search: Context • MPEG initiative on Visual Search • Compact Descriptors for Visual Search • Implementation • Use Cases • Visual Search Evolution: Moving Pictures and 3D • Questions and Answers
  • 37. Extraction times • SIFT interest point detection and feature extraction make the biggest contribution • Global descriptors are as complex as interest point detection • Local descriptor and coordinate encoding are very fast • Quantitative evaluation of CDVS extraction and pairwise matching
  • 38. Agenda • Visual Search: Context • MPEG initiative on Visual Search • Compact Descriptors for Visual Search • Implementation • Use Cases • Visual Search Evolution: Moving Pictures and 3D • Questions and Answers
  • 39. Mobile Visual Search: Music CDs [Diagram: Query, Stream Music]
  • 40. Visual Search: eReaders, Printers [Diagram labels: Snapshot; Mass Storage; Augmentation; Paper-copy; Initiate Visual Search; 3D models and markers; Send Compact Query; Transmission of markers and 3D models; Augmentation Rendering; 2D / 3D Rendering; Selective quality & content printing; Multimedia Content Retrieval from the cloud; Composition of augmentations and image; Content Augmentation]
  • 41. News Finder • Still Pictures - Visual Search
  • 42. Application and Use Cases from Broadcaster point of view • Logo Detection • Interactive Fruition Courtesy RAI
  • 43. Automotive 3D Top View [Diagram: four cameras (Cam) connected to an ECU]
  • 44. Automotive 3D Top View
  • 45. Moving Pictures Visual Search Courtesy Telecom Design
  • 46. Agenda • Visual Search: Context • MPEG initiative on Visual Search • Compact Descriptors for Visual Search • Implementation • Use Cases • Visual Search Evolution: Moving Pictures and 3D • Questions and Answers
  • 47. Inter Predicted Descriptors • Desirable properties: • An inter descriptor is coded in a compact visual stream • It is expressed in terms of one or more temporally neighboring descriptors • The "inter" part of the term refers to the use of inter-frame prediction • Designed to achieve higher compression rates and/or better precision-recall performance
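The idea of expressing a descriptor in terms of a temporally neighbouring one can be sketched as plain residual coding. The example below assumes integer (already quantised) descriptor elements and a single reference descriptor from the previous frame; the slide does not specify the actual prediction scheme, and the function names are hypothetical.

```python
def encode_inter(current, reference):
    """Residual of the current descriptor against a neighbouring-frame one."""
    return [c - r for c, r in zip(current, reference)]


def decode_inter(residual, reference):
    """Rebuild the current descriptor from the residual and the reference."""
    return [d + r for d, r in zip(residual, reference)]


previous_frame = [10, 20, 30, 40]   # descriptor of a keypoint in frame t-1
current_frame = [10, 21, 30, 38]    # same keypoint tracked into frame t

residual = encode_inter(current_frame, previous_frame)
print(residual)                                 # prints [0, 1, 0, -2]
print(decode_inter(residual, previous_frame))   # prints [10, 21, 30, 38]
```

When consecutive frames are similar, the residual is mostly zeros or small values, which an entropy coder can represent in far fewer bits than the descriptor itself.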
  • 48. 3D Mobile Devices Will Surpass 148 Million in 2015 • Advances in 3D technology are very fast • Industry adoption opens new opportunities for 3D Visual Search • From In-Stat studies: • ~30% of all handheld game consoles will be 3D by 2015. • 3D mobile devices will increase demand for image sensors by 130%. • In 2012, notebooks will be the first 3D-enabled mobile devices to reach 1 million units. • By 2014, 18% of all tablets will be 3D. • Nintendo, Fuji, GoPro, Sony, ViewSonic, LG, Origin, Toshiba, Fujitsu, HP, ASUS, Lenovo, Dell, Alienware, HTC and Sharp are focusing on autostereoscopic mobile technologies
  • 49. Microsoft Kinect • Asus Xtion • LG Optimus 3D P920 • LG Optimus Pad • 3DS by Nintendo • Google 3D Warehouse • HTC EVO 3D • Sharp Aquos SH-12C
  • 50. 3D Object Recognition with Kinect • SHOT: Unique Signatures of Histograms for Local Surface Description http://www.youtube.com/watch?v=eRW1zG_aONk Courtesy: CV laboratory, University of Bologna
  • 51. Agenda • Visual Search: Context • MPEG initiative on Visual Search • Compact Descriptors for Visual Search • Implementation • Use Cases • Visual Search Evolution: Moving Pictures and 3D • Questions and Answers
  • 52.