Searching Images by Color 
Chris Becker 
Search Engineering @ Shutterstock
What is Shutterstock? 
• Shutterstock sells stock images, videos & music. 
• Crowdsourced from artists around the world 
• Shutterstock reviews and indexes them for search 
• Customers buy a subscription and download them
Why search by color?
Stock photography on the internet… 
images from www.shutterstock.com
Color is one of many visual 
attributes that you can use 
to create an engaging 
image search experience
Shutterstock Labs 
Spectrum 
Palette
Diving into Color Data
Color Spaces 
• RGB 
• HSL 
• Lab 
• LCH 
images from www.wikipedia.org
Calculating Distances Between Colors 
• Euclidean distance works reasonably well in any color space 
distRGB = sqrt((r2 - r1)^2 + (g2 - g1)^2 + (b2 - b1)^2) 
distHSL = sqrt((h2 - h1)^2 + (s2 - s1)^2 + (l2 - l1)^2) 
distLCH = sqrt((L2 - L1)^2 + (C2 - C1)^2 + (H2 - H1)^2) 
distLAB = sqrt((L2 - L1)^2 + (a2 - a1)^2 + (b2 - b1)^2) 
• More sophisticated equations that better account for human 
perception can be found at 
http://en.wikipedia.org/wiki/Color_difference
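A minimal sketch of the Euclidean distances above, in Python. The function names are illustrative and not from the talk. One assumption goes beyond the slide: hue is an angle, so the sketch measures the hue difference around the color wheel instead of as a plain subtraction.

```python
import colorsys
import math

def dist_rgb(c1, c2):
    """Euclidean distance between two (r, g, b) tuples with 0-255 channels."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def dist_hsl(c1, c2):
    """Euclidean distance in HSL, with hue wrapped around the color wheel."""
    # colorsys returns (hue, lightness, saturation), each scaled to 0..1
    h1, l1, s1 = colorsys.rgb_to_hls(*(v / 255 for v in c1))
    h2, l2, s2 = colorsys.rgb_to_hls(*(v / 255 for v in c2))
    dh = abs(h2 - h1)
    dh = min(dh, 1 - dh)  # hue 0.95 and hue 0.05 are neighbors, not opposites
    return math.sqrt(dh ** 2 + (s2 - s1) ** 2 + (l2 - l1) ** 2)

red, orange, blue = (255, 0, 0), (255, 153, 0), (0, 0, 255)
print(dist_rgb(red, orange), dist_rgb(red, blue))
print(dist_hsl(red, orange), dist_hsl(red, blue))
```

In both spaces orange comes out closer to red than blue does, which is the ordering a color search needs.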
Images are just numbers 
[ 
[[054,087,058], [054,116,206], [017,226,194], [234,203,215], [188,205,000], [229,156,182]], 
[[214,238,109], [064,190,104], [191,024,161], [104,071,036], [222,081,005], [204,012,113]], 
[[197,100,189], [159,204,024], [228,214,054], [250,098,125], [050,144,093], [021,122,101]], 
[[255,146,010], [115,156,002], [174,023,137], [161,141,077], [154,189,005], [242,170,074]], 
[[113,146,064], [196,057,200], [123,203,160], [066,090,234], [200,186,103], [099,074,037]], 
[[194,022,018], [226,045,008], [123,023,087], [171,029,021], [040,001,143], [255,083,194]], 
[[115,186,246], [025,064,109], [029,071,001], [140,031,002], [248,170,244], [134,112,252]], 
[[116,179,059], [217,205,159], [157,060,251], [151,205,058], [036,214,075], [107,103,130]], 
[[052,003,227], [184,037,078], [161,155,181], [051,070,186], [082,235,108], [129,233,211]], 
[[047,212,209], [250,236,085], [038,128,148], [115,171,113], [186,092,227], [198,130,024]], 
[[225,210,064], [123,049,199], [173,207,164], [161,069,220], [002,228,184], [170,248,075]], 
[[234,157,201], [168,027,113], [117,080,236], [168,131,247], [028,177,060], [187,147,084]], 
[[184,166,096], [107,117,037], [154,208,093], [237,090,188], [007,076,086], [224,239,210]], 
[[105,230,058], [002,122,240], [036,151,107], [101,023,149], [048,010,225], [109,102,195]], 
[[050,019,169], [219,235,027], [061,064,133], [218,221,113], [009,032,125], [109,151,137]], 
[[010,037,189], [216,010,101], [000,037,084], [166,225,127], [203,067,214], [110,020,245]], 
[[180,147,130], [045,251,177], [127,175,215], [237,161,084], [208,027,218], [244,194,034]], 
[[089,235,226], [106,219,220], [010,040,006], [094,138,058], [148,081,166], [249,216,177]], 
[[121,110,034], [007,232,255], [214,052,035], [086,100,020], [191,064,105], [129,254,207]], 
]
Any operation you can do on a set of 
numbers, you can do on an image 
• getting histograms 
• computing median values 
• standard deviations / variance 
• other statistics
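As a rough illustration of treating an image as numbers, the sketch below computes per-channel statistics over a tiny pixel grid using only the Python standard library. The 2x3 grid reuses the first pixels from the array above; the helper names are made up for this example.

```python
import statistics

# A tiny 2x3 "image": rows of (r, g, b) pixels.
image = [
    [(54, 87, 58), (54, 116, 206), (17, 226, 194)],
    [(214, 238, 109), (64, 190, 104), (191, 24, 161)],
]

def channel_values(img, channel):
    """Flatten one channel (0=r, 1=g, 2=b) of a nested pixel list."""
    return [pixel[channel] for row in img for pixel in row]

def channel_stats(img, channel):
    """Median, standard deviation and variance of one channel."""
    values = channel_values(img, channel)
    return {
        "median": statistics.median(values),
        "stdev": statistics.pstdev(values),
        "variance": statistics.pvariance(values),
    }

for index, name in enumerate("rgb"):
    print(name, channel_stats(image, index))
```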
Extracting Color Data
Tools & Libraries 
• ImageMagick 
• Python Imaging Library (PIL) 
• ImageJ
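# Python example to get a histogram from an image 
from pprint import pprint 
from PIL import Image 
# convert to RGB so that every pixel is an (r, g, b) tuple 
image = Image.open('./samplephoto.jpg').convert('RGB') 
width, height = image.size 
# getcolors() returns (count, color) pairs; using width*height as the limit 
# means it never returns None for having too many distinct colors 
colors = image.getcolors(width * height) 
hist = {} 
for count, (r, g, b) in colors: 
    hex_color = '%02x%02x%02x' % (r, g, b) 
    hist[hex_color] = count 
pprint(hist)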
Indexing & Searching 
in Solr
Indexing color histograms 
• index colors just like you would index text 
• amount of color = frequency of the term 
color_txt = "cfebc2 
cfebc2 cfebc2 cfebc2 
cfebc2 cfebc2 cfebc2 
cfebc2 cfebc2 cfebc2 
95bf40 95bf40 95bf40 
95bf40 95bf40 95bf40 
2e6b2e 2e6b2e 2e6b2e 
ff0000 …"
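One possible way to produce a field value like the one above from a histogram, sketched in Python. The `total_tokens` normalization and the function name are assumptions for illustration; the talk does not say how the repeat counts were chosen.

```python
def histogram_to_text(hist, total_tokens=100):
    """Turn {hex_color: pixel_count} into a whitespace-separated token string.

    Each color is repeated in proportion to its share of the pixels, so the
    term frequency in the index approximates how much of the image it covers.
    total_tokens is an assumed normalization, not a value from the talk.
    """
    total_pixels = sum(hist.values())
    tokens = []
    # Most frequent colors first, purely for readability of the output.
    for color, count in sorted(hist.items(), key=lambda kv: -kv[1]):
        repeats = round(count / total_pixels * total_tokens)
        tokens.extend([color] * repeats)
    return " ".join(tokens)

hist = {"cfebc2": 5000, "95bf40": 3000, "2e6b2e": 1500, "ff0000": 500}
color_txt = histogram_to_text(hist, total_tokens=20)
print(color_txt)
```

Colors whose share rounds to zero tokens drop out of the field entirely, which also keeps the index small.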
Solr Schema & Queries 
<field name="color" type="text_ws" …> 
• Can use Solr’s default ranking effectively 
/solr/select?q=ff0000 e2c2d2&qf=color&defType=edismax… 
• or use term frequencies directly for specific sort functions: 
sort=product(tf(color,"ff0000"),tf(color,"e2c2d2")) desc
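The two query styles above can be assembled programmatically, which also takes care of URL-encoding the spaces, quotes and parentheses. This is a sketch only: the host and the core name `images` are hypothetical, and the function name is invented for the example.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host and core name to your setup.
SOLR_SELECT = "http://localhost:8983/solr/images/select"

def color_query_url(colors, use_tf_sort=False):
    """Build a select URL that searches the `color` field for hex colors."""
    params = {"q": " ".join(colors), "qf": "color", "defType": "edismax"}
    if use_tf_sort:
        tf_terms = ",".join('tf(color,"%s")' % c for c in colors)
        # product() needs at least two arguments; use a bare tf() for one color
        score = "product(%s)" % tf_terms if len(colors) > 1 else tf_terms
        params["sort"] = score + " desc"
    return SOLR_SELECT + "?" + urlencode(params)

print(color_query_url(["ff0000", "e2c2d2"]))
print(color_query_url(["ff0000", "e2c2d2"], use_tf_sort=True))
```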
Indexing color statistics 
Represent aggregate statistics of each image 
lightness: 
median: 2 
standard dev: 1 
largest bin: 0 
largest bin size: 50 
saturation 
median: 0 
standard dev: 0 
largest bin: 0 
largest bin size: 100 
…
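A sketch of how statistics like these might be computed for one channel. It assumes the channel is first reduced to a small number of bins and that "largest bin size" is a percentage of the pixels; neither detail is spelled out in the talk, and the names are illustrative.

```python
import statistics
from collections import Counter

def binned_stats(values, max_value=255, bins=10):
    """Summarize one channel (e.g. lightness) after reducing it to coarse bins."""
    # Map each 0..max_value reading to a bin index in 0..bins-1.
    binned = [min(v * bins // (max_value + 1), bins - 1) for v in values]
    largest_bin, largest_count = Counter(binned).most_common(1)[0]
    return {
        "median": statistics.median(binned),
        "standard_dev": round(statistics.pstdev(binned)),
        "largest_bin": largest_bin,
        # share of the pixels in the fullest bin, as a percentage
        "largest_bin_size": round(100 * largest_count / len(binned)),
    }

lightness = [10, 12, 20, 25, 60, 70, 80, 90]  # made-up sample readings
print(binned_stats(lightness))
```

Each value in the returned dict would then map to its own integer field in the index.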
Solr Fields & Queries 
<field name="hue_median" type="int" …> 
• Sort by the distance between input param 
and median value for each image 
/solr/select?q=*&sort=abs(sub($query,hue_median)) asc
Ranking & Relevance
How much of the image has the color [color swatch]? 
image from www.shutterstock.com
Is this relevant if I search for [color swatch]? 
image from www.shutterstock.com
Which image is more relevant if I search for [color swatch]? 
image from www.shutterstock.com
Is this relevant if I search for [color swatch]? 
image from www.shutterstock.com
How do we account for these factors?
How much of the image contains the 
selected color? 
• Score each color by the number of pixels 
sort=tf(color,"cfebc2") desc
Balance Precision and Recall 
• Reduce your colorspace enough 
to balance: 
• color accuracy 
• index size 
• query complexity 
• result counts 
• only need 100-200 colors for a good UX 
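One simple way to reduce the color space, shown as a sketch: snap each RGB channel to a few evenly spaced levels. With 5 levels per channel this gives 5 x 5 x 5 = 125 possible colors, inside the 100-200 range mentioned above. A uniform RGB grid is an assumption made for this example; the talk does not say which reduction was used.

```python
def quantize(rgb, levels=5):
    """Snap each 0-255 channel to the nearest of `levels` evenly spaced values."""
    step = 255 / (levels - 1)
    return tuple(round(round(v / step) * step) for v in rgb)

def to_hex(rgb):
    """Format an (r, g, b) tuple as a 6-digit hex token for indexing."""
    return "%02x%02x%02x" % rgb

original = (207, 235, 194)  # cfebc2
reduced = quantize(original)
print(to_hex(original), "->", to_hex(reduced))
```

Fewer levels mean a smaller index and more results per query, at the cost of color accuracy.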
Weighing Multiple Colors Together 
• If you search for 2 or more colors, the top result should have 
the most even distribution of those colors 
• simple option: 
sort=product(tf(color,"ff9900"),tf(color,"2280e2")) desc 
• more complex: compute the standard deviation or variance 
of the term frequencies of matching color values for each 
image, and sort the results with the lowest variance first.
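The two options can be compared offline with a few made-up term-frequency tables. This sketch mimics the ranking in plain Python and does not call Solr; the image names and counts are invented.

```python
import math
import statistics

def product_score(tf, colors):
    """Product of the term frequencies, as in product(tf(...),tf(...))."""
    return math.prod(tf.get(c, 0) for c in colors)

def variance_score(tf, colors):
    """Variance of the term frequencies; lower means a more even mix."""
    return statistics.pvariance([tf.get(c, 0) for c in colors])

# Made-up term frequencies for three images.
images = {
    "even":   {"ff9900": 40, "2280e2": 40},
    "skewed": {"ff9900": 75, "2280e2": 5},
    "sparse": {"ff9900": 2, "2280e2": 2},
}
colors = ["ff9900", "2280e2"]

by_product = sorted(images, key=lambda n: product_score(images[n], colors), reverse=True)
by_variance = sorted(images, key=lambda n: variance_score(images[n], colors))
print(by_product)
print(by_variance)
```

In this toy data the variance ordering ranks the "sparse" image above the "skewed" one even though it contains very little of either color, so a variance-based sort probably needs to be combined with the total amount of matching color.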
Weighing Similar & Different Colors 
• The score for one color should reflect all the colors in the image. 
• At indexing time, increase the score based on similar colors; 
decrease it based on differing colors.
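The talk gives no formula for this adjustment, so the following is only one possible shape for it. The distance thresholds `near` and `far`, the `penalty` weight and the function names are all assumptions chosen for illustration.

```python
import math

def rgb(hex_color):
    """Parse a 6-digit hex color into an (r, g, b) tuple."""
    return tuple(int(hex_color[i:i + 2], 16) for i in (0, 2, 4))

def adjusted_score(target, hist, near=60.0, far=200.0, penalty=0.5):
    """Score `target` using every color in the image's histogram.

    Colors within `near` (RGB distance) add to the score, with full credit
    at distance 0; colors beyond `far` subtract a fixed fraction of their
    count. All three parameters are illustrative assumptions.
    """
    score = 0.0
    for color, count in hist.items():
        d = math.dist(rgb(target), rgb(color))
        if d <= near:
            score += count * (1 - d / near)
        elif d >= far:
            score -= count * penalty
    return score

mostly_red = {"ff0000": 50, "ee1111": 30, "0000ff": 20}
half_blue = {"ff0000": 50, "0000ff": 50}
print(adjusted_score("ff0000", mostly_red))
print(adjusted_score("ff0000", half_blue))
```

Both images have the same number of pure red pixels, but the one with additional near-red pixels scores higher and the one that is half blue scores lower.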
Conclusion
• Steps for building color search in Solr: 
• Extract colors using a tool like the Python Imaging Library 
• Score colors based on the number of pixels 
• Adjust scores based on similar / different colors 
• Index colors into Solr as a text field 
• In your query, sort by the term frequency values for each 
color
One more demo…
