Visual search has the potential to change how we discover new ideas and products. As such, the likes of Pinterest, Google, and Amazon have made visual search a priority.
It is a complex field, however, requiring sophisticated technology and huge quantities of training data.
In this presentation, you will discover:
- What visual search is
- How visual search works
- How effective the main visual search engines are today
- What you can do to start optimizing for visual search today
6. “Visual search is a common visual activity that we engage in on a daily basis.
For example, we spend time looking for a friend in the airport crowd, looking for our car in the parking lot, or looking for the tomatoes in the vegetable aisle at the supermarket.”
- Encyclopedia of Neuroscience
8. 600,000,000+ visual searches via Pinterest Lens every month
● Pinterest’s image ads have an 8.5% conversion rate
● 97% of all Pinterest searches are non-branded
10. Camera-based search leads to:
● 48% more product views
● 75% greater likelihood to return
● 51% higher time on site
● 9% higher average order value
- BloomReach
Google image search is already a huge
opportunity. As Google integrates Lens
into more products, brands will be
able to connect with their audience in
new and more effective ways.
12. “In the English language there’s something like 180,000 words, and we only use 3,000 to 5,000 of them.
If you’re trying to do voice recognition, there’s a really small set of things you actually need to be able to recognize.
Think about how many objects there are in the world, distinct objects, billions, and they all come in different shapes and sizes.”
- Clay Bavor, Google
14. “A field linguist has gone to visit a culture whose language is entirely different from our own. The linguist is trying to learn some words from a helpful native speaker, when a rabbit scurries by.
The native speaker declares ‘gavagai’, and the linguist is left to infer the meaning of this new word. The linguist is faced with an abundance of possible inferences, including that ‘gavagai’ refers to rabbits, animals, white things, that specific rabbit, or ‘undetached parts of rabbits’. There is an infinity of possible inferences to be made. How are people able to choose the correct one?”
- DeepMind
15. Teaching a machine to understand images
- Deep neural networks are put through
their paces in tests like the one to the
left, with the expectation that they will
mimic the functioning of the human
brain in identifying targets.
- The assumptions (or ‘inductive biases’, as
they are known) that allow us to make
sense of these patterns are more
difficult to integrate into a machine.
- When processing an image, should a
machine prioritize shape, color, or
size? How does a person do this? Do
we even know for sure, or do we only
know the output?
16.
1. Query understanding: Object shape, size, color. Annotations help connect this to Pinterest’s inventory.
2. Blender: Results pulled from Visual Search (similar aesthetic), Object Search (similar objects) and Image Search (similar text search results). Blending ratios are weighted dynamically based on query understanding, including related boards and past user behavior.
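The blending step can be sketched in a few lines of Python. This is an illustration only, not Pinterest's implementation: the function name `blend_results`, the source names, the scores, and the weights are all hypothetical, and the weights stand in for whatever query understanding would produce for a given query.

```python
from collections import defaultdict

def blend_results(candidates, weights):
    """Merge scored results from several sources into one ranked list.

    candidates: {source_name: {item_id: relevance score between 0 and 1}}
    weights:    {source_name: blending weight chosen for this query}
    Both structures are hypothetical stand-ins for illustration.
    """
    total = sum(weights.values())
    blended = defaultdict(float)
    for source, items in candidates.items():
        share = weights.get(source, 0.0) / total  # normalize the weights
        for item_id, score in items.items():
            blended[item_id] += share * score
    # Highest blended score first; ties broken by item id for stable output
    return sorted(blended, key=lambda item_id: (-blended[item_id], item_id))

# Made-up scores from the three result sources
candidates = {
    "visual_search": {"pin_a": 0.9, "pin_b": 0.4},
    "object_search": {"pin_b": 0.8, "pin_c": 0.6},
    "image_search": {"pin_c": 0.5},
}

# A style-led query might lean on visual similarity...
style_weights = {"visual_search": 0.6, "object_search": 0.3, "image_search": 0.1}
# ...while an object-led query might lean on object matches
object_weights = {"visual_search": 0.1, "object_search": 0.7, "image_search": 0.2}

print(blend_results(candidates, style_weights))
print(blend_results(candidates, object_weights))
```

The same candidates come back in a different order when the weights change, which is the point of weighting the blend per query rather than using one fixed ratio.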
17. “The images that appear in both the style ideas and similar items grids are also algorithmically ranked, and will prioritize those that focus on a particular product type or that appear as a complete look and are from authoritative sites.”
- Google
18. Visual search both extends and shortens the search journey
Intent state                              Input                       Output
Open to ideas
Looking for a style
Looking for a specific type of product    “White sneakers”
Ready to buy                              “Adidas originals white”
- Visual search creates a
new space for
image-driven,
inspiration-based
connections.
- It can also collapse the
purchase journey,
allowing someone to
go from image to
purchase in just a few
moments.
20. If a visual search engine can
identify this object and return
results that are similar in aesthetic
quality and cheerful disposition, it
can be said to have passed The
Loafie Test.
Loafie is:
● A cushion
● Shaped like a loaf of bread
● Soft/plushy
● Brown and white
● Cute
● Smiling and winking
27. What this tells us:
- Pinterest: Metadata (Pins, board names, image tags) helps it understand context. The ‘blender’ that decides the weighting
of shape/color/texture is dynamic and effective. The high quantity of visual searches on the platform is helping Pinterest
improve accuracy of object recognition. Pinterest is closest to understanding the ‘essence’ of an object, beyond its form
or color. This allows it to deliver satisfactory results for fashion and decor image queries.
- Google: For now, Google turns the image into a text query based on the object it recognizes. Search with an image of a
mug using Google Lens and it will return results for mugs, but it will not detect the style of the mug based on any design
patterns it contains. Google is very effective at picking out text on items such as clothing and uses this text to form queries,
too. Knowledge Graph and Maps integration will see Google’s results improve, but for now it is behind Pinterest in
identifying the intangibles that escape the grasp of language.
- Amazon: Amazon could not recognize any aspects of Loafie, even though the item is in its inventory. In general, Amazon
is effective at recognizing everyday objects and returning related results. Search with an image of a kettle and it will
return ‘kettle’ results. This makes it useful for its core purpose as a retailer, but Amazon will also want to branch out into
the ‘inspiration’ space. Better visual search results will be an important part of this strategy.
- Bing: The results for Bing show that it focuses on identifying color and shape, but not necessarily the category of the
object. Bing has some useful new features that allow for object isolation within complex images, but the algorithms
require more data and training if Bing is to expand its remit beyond everyday objects. The results in this test were the
most erratic of all the technology providers.
- CamFind: As a specialist visual search tool, CamFind performed impressively, even if the results were not entirely
accurate. Years of training on a wide range of objects have led to a more nuanced understanding of objects beyond just
color and shape. It will be interesting to see where CamFind sits in this market as the likes of Google make visual search a
priority and integrate it into services like Shopping.
28. In summary:
- There are numerous layers of interpretation when analyzing
an image: size, shape, color, object purpose, style, context…
- Different technologies approach this in different ways.
Pinterest is best at blending these factors, for now.
- Even when an object exists in the image inventory (in the
case of Amazon), there are no guarantees that the visual
search engine will recognize it.
- We need to help search engines as much as possible.
29. How can I optimize for
visual search?
Some practical tips to get started
34. 2. Use this knowledge to organize your products
35. 3. Image search best practices
➔ Compress images to reduce file size.
➔ Alt attributes and captions: include key concepts.
➔ Insert images on high authority, relevant pages.
➔ Only use stock photography if it has been edited to make it unique.
➔ Maintain a consistent aesthetic - this helps search engines understand the relation between images.
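As a rough way to check the alt-attribute tip, the sketch below scans a page's HTML for images whose alt attribute is missing or empty, using only Python's standard library. The class name `AltAudit` and the sample markup are made up for illustration.

```python
from html.parser import HTMLParser

class AltAudit(HTMLParser):
    """Collect the src of every <img> whose alt attribute is missing or blank."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        alt = attr_map.get("alt") or ""  # valueless attributes arrive as None
        if not alt.strip():
            self.missing_alt.append(attr_map.get("src", "(no src)"))

# Sample markup invented for this example
page = """
<img src="/img/white-sneakers.jpg" alt="White leather sneakers, side view">
<img src="/img/banner.jpg">
<img src="/img/mug.jpg" alt="">
"""

audit = AltAudit()
audit.feed(page)
print(audit.missing_alt)
```

An empty alt attribute is acceptable for purely decorative images, so treat the output as a list to review rather than a list of errors.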
36. 3. Remove clutter from images
Automatic object detection works best when
the focal points of the image are in the
foreground.
Tell search engines what your image is about
by giving prominence to the items you want
to rank for.
This allows the technology to produce a
feature map, which can be used to find
relevant images in the database.
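To make the last point concrete: once each image has been reduced to a vector of features, finding relevant images becomes a nearest-neighbor lookup. The sketch below uses cosine similarity with tiny made-up vectors in place of real feature maps. The file names, the vectors, and the helper `most_similar` are assumptions for illustration; real systems typically use learned features with far more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query, database, top_k=2):
    """Return the top_k image names ranked by similarity to the query vector."""
    ranked = sorted(
        database,
        key=lambda name: cosine_similarity(query, database[name]),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical 3-dimensional features for three catalog images
database = {
    "red_kettle.jpg": [0.9, 0.1, 0.2],
    "blue_mug.jpg": [0.1, 0.9, 0.3],
    "red_teapot.jpg": [0.8, 0.2, 0.3],
}

# Features extracted from the photo a shopper has just taken (also made up)
query = [0.85, 0.15, 0.2]
print(most_similar(query, database))
```

A cluttered photo produces features that mix several objects, which pulls the query vector away from any single catalog item. That is one way to picture why a clear focal point helps.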
37. 4. Structured data
Help search engines understand
your content by using structured
data for all relevant elements of
images.
As a guideline, always mark up:
● Price
● Availability
● Image
● Product name
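One common way to mark up these four elements is schema.org Product markup in JSON-LD. The sketch below builds such an object in Python and prints it as a script tag ready to paste into a product page. The helper `product_jsonld` and the product details are placeholders, and search engines may expect additional properties beyond these four.

```python
import json

def product_jsonld(name, image_url, price, currency, in_stock):
    """Build a schema.org Product object covering name, image, price and availability."""
    availability = (
        "https://schema.org/InStock" if in_stock else "https://schema.org/OutOfStock"
    )
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "image": image_url,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": availability,
        },
    }

# Placeholder product details for illustration
markup = product_jsonld(
    "White leather sneakers",
    "https://www.example.com/img/white-sneakers.jpg",
    79.0,
    "USD",
    True,
)

print('<script type="application/ld+json">')
print(json.dumps(markup, indent=2))
print("</script>")
```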
38. 5. Use visual search to unite the physical and
digital worlds
Maps integration through augmented reality.
PinCodes in store lead consumers to online listings. Ensure they have a cohesive experience across channels.
39. Key Tips
1. Upload your product inventory to your website and social media profiles
a. Image XML sitemap
b. Check indexation status of images
2. Research trends (both keywords and styles)
a. Map keywords to images
b. Logical taxonomy
3. Optimize your images
a. Remove clutter from images
b. Maintain a consistent aesthetic
c. Don’t use stock images or, at the very least, edit them to make them unique
4. Make context clear
a. Structured data is essential (!)
5. Link your physical and digital presences
a. PinCodes
b. Maps optimization
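For tip 1a, an image XML sitemap lists the images that appear on each page so search engines can find them. A minimal sketch that generates one with Python's standard library, using the sitemap and Google image-sitemap namespaces. The function name `build_image_sitemap` and the example.com URLs are placeholders.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

def build_image_sitemap(pages):
    """Turn {page_url: [image_url, ...]} into sitemap XML returned as a string."""
    # Register prefixes so the output uses xmlns and xmlns:image
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")

# Placeholder pages and images
pages = {
    "https://www.example.com/sneakers": [
        "https://www.example.com/img/white-sneakers-side.jpg",
        "https://www.example.com/img/white-sneakers-top.jpg",
    ],
    "https://www.example.com/mugs": [
        "https://www.example.com/img/striped-mug.jpg",
    ],
}

sitemap_xml = build_image_sitemap(pages)
print(sitemap_xml)
```

After submitting a sitemap like this, tip 1b still applies: check which of the listed images have actually been indexed.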