Slides for talk at Haystack Conference 2018. Covers evolution of an Image Similarity Search Proof of Concept built to identify similar medical images. Discusses various image vectorizing techniques that were considered in order to convert images into searchable entities, an evaluation strategy to rank these techniques, as well as various indexing strategies to allow searching for similar images at scale.
Evolving a Medical Image Similarity Search
1. Evolving a Medical Image Similarity Search
Presented by Sujit Pal
April 10-11, 2018
Haystack 2018, Charlottesville, VA
2. Background
• Early user of Solr at CNET before it was open-sourced
• Search at Healthline (consumer health)
  ▪ Lucene/Solr
  ▪ Taxonomy-backed "Concept" Search
• Medical image classification at Elsevier
  ▪ Deep Learning / Caffe
  ▪ Machine Learning (Logistic Regression)
• Duplicate Image Detection
  ▪ Computer Vision / OpenCV, LIRE (Lucene Image Retrieval Engine)
  ▪ Deep Learning / Keras
• Medical Similarity Search
  ▪ Semantic rather than structural similarity
3. Acknowledgements
• Ron Daniel
  ▪ Help and expertise in Computer Vision techniques
• Matt Corkum
  ▪ Caption-based Image Search Platform
  ▪ Tooling and integration for Image Search done against this platform
• Adrian Rosebrock
  ▪ PyImageSearch and OpenCV
• Doug Turnbull
  ▪ Elastic{ON} 2016 talk about Image Search
4. Image Search Workflow
• Internal application for image review and tagging
5. Steps
• Feature Extraction
  ▪ Converting images to feature vectors
• Indexing Strategies
  ▪ Represent vectors using a (text-based) search index
• Evaluation
  ▪ Search quality metrics
6. Feature Extraction - Global Features
• Global Features
  ▪ Color
  ▪ Texture (Edge)
• Quantize image
• Build histogram
• Histogram is the feature vector
• Descriptors
  ▪ RGB
  ▪ HSV
  ▪ Opponent
  ▪ CEDD
  ▪ FCTH
  ▪ JCD
Image Credits: Shutterstock, 7-Themes.com, Kids Britannica, Pexels.com, and OpenCV Tutorials
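The quantize, histogram, feature-vector steps on this slide can be sketched in a few lines. This is a minimal pure-Python illustration, not the LIRE implementation: it assumes an image is simply a list of 8-bit (r, g, b) tuples, and the name `color_histogram` and the choice of 4 bins per channel are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Quantize RGB pixels and return a normalized histogram as the feature vector."""
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Quantize each 8-bit channel to a bin index in [0, bins_per_channel).
        rq = r * bins_per_channel // 256
        gq = g * bins_per_channel // 256
        bq = b * bins_per_channel // 256
        # Combine the three bin indices into a single histogram cell.
        hist[(rq * bins_per_channel + gq) * bins_per_channel + bq] += 1.0
    total = sum(hist) or 1.0
    return [count / total for count in hist]


# Toy 4-pixel "image": three reddish pixels and one blue pixel.
image = [(250, 10, 10), (240, 20, 5), (255, 0, 0), (0, 0, 255)]
features = color_histogram(image)
```

The resulting 64-element vector is what would be compared between images; descriptors such as CEDD or FCTH follow the same histogram idea with more elaborate quantization.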
7. Feature Extraction - Local Features
• Local Features
  ▪ Edges and Corners
  ▪ Scale-Invariant Feature Transform (SIFT)
  ▪ Speeded-Up Robust Features (SURF)
  ▪ Difference of Gaussians (DoG)
• Tile image and compute features per tile
• Cluster features
• Centroids are vocabulary words
• Image represented as a histogram of vocabulary words
Image Credits: OpenCV Tutorials, ScienceDirect.com
8. Feature Extraction - Deep Learning Features
• Deep Learning models outperform traditional models for CV tasks
• Works like edge and color detectors at lower layers, and object detectors at higher layers
• Encodes semantics of image rather than just color, texture and shapes
• Learns transformation from image to vector as a series of convolutions
• Many high-performing models trained on large image datasets are available
Image Credits: CAIS++, Distill.pub
9. Feature Extraction - Deep Learning Features (cont'd)
• Deep Learning models are a sequence of convolution and pooling operations
• Each successive layer computes deeper features (more convolution operations) over a larger part of the image (pooling).
Image Credits: i-systems.github.io and ufldl.stanford.edu
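To make the convolution and pooling operations concrete, here is a small pure-Python sketch. The kernel is a hand-picked vertical edge detector chosen for illustration; a trained network would learn its kernels from data. The function names are hypothetical.

```python
def convolve2d(image, kernel):
    """Valid-mode 2D cross-correlation (what deep learning calls convolution)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; each output cell summarizes a size x size patch."""
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


# 5x5 image: two dark columns on the left, three bright columns on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Responds where brightness increases from left to right.
kernel = [[-1, 1], [-1, 1]]

edges = convolve2d(image, kernel)   # 4x4 feature map, non-zero only at the edge
pooled = max_pool(edges)            # 2x2 summary covering a larger image area
```

Stacking several such convolution and pooling stages is what lets later layers respond to larger and more abstract structures.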
10. Feature Extraction - Deep Learning Features (cont'd)
• Idea of using convolutions for feature extraction is not new to CV, e.g., used in Haar Cascades
• But traditional CV uses specific convolutions for a task to extract features for that task
• Deep Learning starts with random convolutions and uses (image, label) pairs to learn convolutions appropriate to the task
Image Credit: Greg Borenstein
11. Feature Extraction - Deep Learning Features (cont'd)
[Figure label: woman]
• Image to vector transformation == sequence of learned convolutions and pooling operations
• Remove classification layer from pre-trained network.
• Run images through truncated network to produce image vectors.
Image Credits: DeepLearning.net
12. Indexing Strategies
• Naïve approaches
  ▪ Linear search - LIRE default
  ▪ Pre-compute K (approximate) nearest neighbors
• Text-based indexes
  ▪ Index-able unit is a document (stream of tokens from an alphabet)
  ▪ Image needs to be converted into a sequence of tokens from a "visual" alphabet
    - Locality Sensitive Hashing (LSH)
    - Metric Spaces Indexing
    - Bag of Visual Words (BoVW)
• Text+Payload based indexes
  ▪ Represent vectors as payloads with custom similarity
• Tensor-based indexes
  ▪ Supports indexing and querying of image feature vectors natively
  ▪ Uses approximate nearest neighbor techniques
  ▪ NMSLib - Non-Metric Space Library (ok for <= 1M vectors)
  ▪ FAISS - Facebook AI Similarity Search
• Hybrid indexes
  ▪ Vespa.ai - supports both text and tensor based queries
13. Indexing Strategy - Payloads + Custom Similarity
• Image vectors written out as "index0|score0 index1|score1 ..."
• Query image vectorized and sparsified, then provided as a string consisting of the non-zero indices after sparsification, for example, "index50 index54 index67".
• Payload similarity implementation provided as a Groovy script to the Elasticsearch 1.5 (ES) engine; returns cosine similarity
• Find similar images using the ES function_score query
• Did not scale beyond a few hundred images in the index
• Recent ES versions require a custom Java ScriptEngine implementation registered as a plugin, so this approach probably scales better now.
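The index-time and query-time string formats described on this slide, plus the cosine score, can be sketched as follows. This is a pure-Python illustration, not the Groovy script or any Elasticsearch API. The slides do not say how vectors were sparsified, so keeping the `top_n` largest entries is an assumption, and all function names are hypothetical.

```python
import math


def sparsify(vector, top_n):
    """Keep the top_n largest entries; return {index: value} for the survivors."""
    ranked = sorted(range(len(vector)), key=lambda i: vector[i], reverse=True)
    return {i: vector[i] for i in sorted(ranked[:top_n]) if vector[i] != 0.0}


def to_payload_string(sparse):
    """Index-time form: 'index0|score0 index1|score1 ...'."""
    return " ".join(f"index{i}|{v:.4f}" for i, v in sparse.items())


def to_query_string(sparse):
    """Query-time form: only the non-zero indices, e.g. 'index50 index54'."""
    return " ".join(f"index{i}" for i in sparse)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(i, 0.0) for i, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


doc = sparsify([0.0, 0.9, 0.1, 0.0, 0.4], top_n=2)
query = sparsify([0.0, 0.8, 0.0, 0.3, 0.5], top_n=2)
doc_field = to_payload_string(doc)    # what gets indexed for the image
query_text = to_query_string(query)   # what gets sent as the text query
score = cosine(doc, query)            # what the custom similarity would return
```

In the setup described on the slide, the text query would select candidate documents by matching index tokens and the payload values would then feed the similarity script.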
14. Indexing Strategy - Locality Sensitive Hashing
• LSH: similar objects are hashed to the same bin.
• Assume image feature vectors V of rank (dimensionality) R.
• Generate k values of vector A_i (also of rank R) and b_i from a random normal distribution.
• Compute k hash values h_i using the following formula:
  [formula shown as an image on the slide; not recoverable from the transcript]
• If at least m of k hashes for a pair of images match, then the images are near duplicates.
• No ranking of similarities possible.
• Good for finding near duplicates.
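A minimal sketch of the hashing scheme described on this slide. The slide's own hash formula is not available in this transcript, so the code assumes the commonly used form h_i(V) = floor((A_i · V + b_i) / w), where the bucket width `w` is a hypothetical parameter not mentioned in the talk. Function names are illustrative.

```python
import math
import random


def make_hashes(dim, k, seed=42):
    """Draw k pairs (A_i, b_i) from a standard normal distribution."""
    rng = random.Random(seed)
    return [([rng.gauss(0, 1) for _ in range(dim)], rng.gauss(0, 1))
            for _ in range(k)]


def lsh_signature(vector, hashes, w=4.0):
    """Assumed form: h_i(V) = floor((A_i . V + b_i) / w), with bucket width w."""
    return [
        math.floor((sum(a * v for a, v in zip(A, vector)) + b) / w)
        for A, b in hashes
    ]


def near_duplicates(sig1, sig2, m):
    """Near duplicates if at least m of the k hashes match."""
    return sum(h1 == h2 for h1, h2 in zip(sig1, sig2)) >= m


hashes = make_hashes(dim=8, k=10)
original = [0.5, 1.0, 0.0, 2.0, 1.5, 0.3, 0.7, 1.1]
copy = list(original)                     # an exact duplicate image vector
sig_a = lsh_signature(original, hashes)
sig_b = lsh_signature(copy, hashes)
is_dup = near_duplicates(sig_a, sig_b, m=8)
```

Because the result is a yes/no decision on matching hash counts, this only flags candidate near duplicates and gives no ordering among them, which matches the "no ranking" caveat above.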
15. Indexing Strategy - Metric Spaces Indexing
• Also known as Perspective-based Space Transformation
• Based on the idea that objects that are similar to a set of reference objects are similar to each other.
• Randomly select k (≈ 2√N) images as reference objects RO
• Compute distance of each object from each reference image in RO using the following distance formula:
  [formula shown as an image on the slide; not recoverable from the transcript]
• Posting list for each image is the m nearest reference objects ordered by distance.
• Haven't tried this, but looks promising.
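The reference-object idea can be sketched as below. The slide's distance formula is not available in this transcript, so plain Euclidean distance is used as a stand-in assumption. The token format `RO<i>` and the function names are hypothetical.

```python
import math
import random


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def choose_references(vectors, seed=0):
    """Randomly pick k ~ 2 * sqrt(N) vectors as reference objects."""
    k = min(len(vectors), round(2 * math.sqrt(len(vectors))))
    return random.Random(seed).sample(vectors, k)


def to_tokens(vector, references, m):
    """Represent a vector by the IDs of its m nearest reference objects, nearest first."""
    order = sorted(range(len(references)),
                   key=lambda i: euclidean(vector, references[i]))
    return " ".join(f"RO{i}" for i in order[:m])


# Four fixed reference objects so the example is easy to follow by hand.
references = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]]
doc_a = to_tokens([1.0, 2.0], references, m=2)
doc_b = to_tokens([1.5, 2.5], references, m=2)
doc_c = to_tokens([9.0, 8.0], references, m=2)
```

Here `doc_a` and `doc_b` see the reference objects in the same order and so produce the same token string, while `doc_c` does not; a text index can then match images on these tokens.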
16. Indexing Strategy - Bag of Visual Words (BoVW)
• Briefly touched upon this when talking about Local Features
• Tile image, compute local descriptors (such as SIFT, SURF, etc.) for each tile
• Cluster these descriptors across all images
• Generate a vocabulary of visual words out of the centroids of these clusters
• Represent each image in the index as a sequence of visual words
• During query, tile and compute local descriptors, then find the closest words for each descriptor in the vocabulary, and search using this sequence of visual words.
• Used LIRE's built-in support for generating a BoVW based index but results not very encouraging.
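The descriptor-to-word mapping can be sketched as follows, assuming the vocabulary (cluster centroids) has already been computed. This is an illustration only and not LIRE's implementation. The `vw<i>` token format, the function names, and the 2-D toy descriptors are hypothetical (real SIFT descriptors have 128 dimensions).

```python
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Index of the centroid (visual word) closest to a local descriptor."""
    def sq_dist(centroid):
        return sum((d - c) ** 2 for d, c in zip(descriptor, centroid))
    return min(range(len(vocabulary)), key=lambda i: sq_dist(vocabulary[i]))


def to_visual_words(descriptors, vocabulary):
    """Turn an image's descriptors into a token string for a text index."""
    return " ".join(f"vw{nearest_word(d, vocabulary)}" for d in descriptors)


def word_histogram(tokens, vocab_size):
    """Count how often each visual word occurs in the token string."""
    counts = Counter(tokens.split())
    return [counts.get(f"vw{i}", 0) for i in range(vocab_size)]


# Hypothetical vocabulary of three centroids and four descriptors from one image.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
descriptors = [[0.1, 0.2], [0.9, 1.2], [4.8, 5.1], [1.1, 0.8]]
tokens = to_visual_words(descriptors, vocabulary)
histogram = word_histogram(tokens, len(vocabulary))
```

The same `to_visual_words` call is applied to the query image, and the resulting token string is submitted as an ordinary text query.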
17. Indexing Strategy - Precompute K nearest neighbors
• Produces approximate nearest neighbors
• Cluster image vectors into smaller clusters. Size of each cluster should be chosen such that brute force KNN (with KD-Tree support if available) is tractable
• For each cluster, compute K nearest neighbors for each image in the cluster
• Save ordered list of neighbor image IDs against each image
• At search time, the neighbors are simply looked up using the source image ID
• Works well for my Similar Images functionality (closed system)
• For an unknown query image, two-step process to find the cluster and then find K nearest neighbors
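A minimal sketch of the precompute-and-lookup approach, assuming cluster assignments already exist from a prior clustering step (not shown). Euclidean distance, the image IDs, and the function name are illustrative assumptions. `math.dist` requires Python 3.8 or later.

```python
import math


def knn_table(vectors, clusters, k):
    """Brute-force K nearest neighbors within each cluster.

    vectors:  {image_id: feature vector}
    clusters: {cluster_id: [image_id, ...]} from a prior clustering step
    returns:  {image_id: [neighbor ids, nearest first]}
    """
    table = {}
    for members in clusters.values():
        for source in members:
            others = [m for m in members if m != source]
            others.sort(key=lambda m: math.dist(vectors[source], vectors[m]))
            table[source] = others[:k]
    return table


vectors = {
    "img1": [0.0, 0.0], "img2": [0.1, 0.0], "img3": [0.5, 0.5],
    "img4": [9.0, 9.0], "img5": [9.2, 9.1],
}
clusters = {0: ["img1", "img2", "img3"], 1: ["img4", "img5"]}
neighbors = knn_table(vectors, clusters, k=2)

# Search time is a plain lookup by source image ID.
similar_to_img1 = neighbors["img1"]
```

Neighbors are only searched for inside a cluster, which is why the results are approximate: a true nearest neighbor that landed in another cluster is never considered.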
18. Evaluation
• Data Collection
  ▪ 4 similarity levels (Essentially Identical, Very Similar, Similar, Different)
• Metrics
  ▪ Precision @k
  ▪ Mean Average Precision (MAP)
  ▪ Recall
  ▪ F1-score
  ▪ nDCG
  ▪ Correlation
19. Evaluation - Data Collection
• Similarity Page has a Reset Similarity button for each similar image.
• Default is Similar, overridden if needed and captured into a logging database
• About 2000 pairs (220 unique source images) captured using the interface
20. Evaluation - Precision @k
• Almost Identical and Very Similar count as a full hit (+1), Similar counts as half (+0.5), and Different as a miss (+0).
• Precision @k results

k    precision
1    0.3287
3    0.1096
5    0.0657
10   0.0329
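The graded scoring rule on this slide can be sketched as below. Dividing the summed weights by k is the usual precision@k normalization and is an assumption here, since the slides do not spell it out. The judgment list is made-up illustration data, not the talk's evaluation set.

```python
# Hit weights as stated on the slide.
WEIGHTS = {
    "Almost Identical": 1.0,
    "Very Similar": 1.0,
    "Similar": 0.5,
    "Different": 0.0,
}


def precision_at_k(judgments, k):
    """Graded precision@k: sum of hit weights in the top k results, divided by k."""
    return sum(WEIGHTS[label] for label in judgments[:k]) / k


# Human judgments for one query's results, in ranked order (hypothetical).
ranked = ["Very Similar", "Similar", "Different", "Almost Identical", "Similar"]
p1 = precision_at_k(ranked, 1)
p3 = precision_at_k(ranked, 3)
p5 = precision_at_k(ranked, 5)
```

Averaging `precision_at_k` over all judged source images would give one number per k, as in the table above.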
21. Evaluation - Correlation Results
• Distance Metric: Cosine Similarity
• Features used:
  ▪ Baseline: LIRE Global Features
  ▪ Best: vectors from Xception

Correlation  Baseline  Xception
Pearson      -0.102    -0.566
Spearman     -0.071    -0.495
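For reference, the two correlation measures can be computed as in the pure-Python sketch below. The data and the numeric encoding of the similarity levels (1 = Essentially Identical through 4 = Different) are assumptions made for illustration; the slides do not say how levels were encoded.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical judged pairs: human level versus cosine similarity from the search.
human_level = [1, 2, 2, 3, 4]
cosine_sim = [0.95, 0.80, 0.85, 0.60, 0.20]
r = pearson(human_level, cosine_sim)
rho = spearman(human_level, cosine_sim)
```

Under this assumed encoding, larger levels mean less similar, so a more strongly negative correlation would indicate better agreement between the search scores and the human judgments.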
22. Future Work
• Include captions for image search
  ▪ We have tried word2vec and skip-thoughts to generate caption vectors but it didn't result in appreciable improvement
  ▪ Two-stage search: caption search + refine with image, or vice versa
• Investigate metric spaces indexing approach
• Investigate dimensionality reduction, since the curse of dimensionality seems to be a common issue mentioned in computer vision literature
• Investigate using indexing approaches that allow tensor search
• Incorporate outputs of multiple classifiers to create faceted search functionality that can be overlaid on results
  ▪ By genre - radiology, data graphics, microscopy, etc.
  ▪ By anatomical part
  ▪ By specialty
  ▪ By keywords in caption
  ▪ By concepts in caption