Presented by John Marc Imbrescia, Senior Software Engineer, Etsy.com
Etsy recently chose to bring our location services in-house. We used the open source GeoNames data set and built the tools we needed on top of it so that members can select their location, see translated place names, and so that we can feed location data into our search database for local, regional, and country-based searches.
This talk will cover the implementation details and the decisions we made along the way: how we mapped places from our old data set to the GeoNames data; the internal tools we built, including a Solr core for place name autosuggest; the modifications to our Listings Search and Shop Search cores; and the different ways we use location-based search around the site, both distance-based and region-based, using GeoNames hierarchy data.
There will also be a discussion of why we chose to release some of the tools we built for this project as open source, and of the non-search (display, etc.) elements of the project: which tools we chose for them and why.
3. The world’s online handmade marketplace.
What is Etsy.com?
Wednesday, May 1, 13
4. What is Etsy.com?
•20 million unique items
•18 million daily item searches
•800,000 sellers
•28 million unique views per month
•Developer blog: codeascraft.etsy.com
•450 worldwide employees
5. Our Problem
Location names were only in English
•Search based on English names
•Display and search needed to be i18n friendly.
•API limits and speed concerns meant we needed a new solution.
6. What do we use Location for?
More than just search
•Display
•Local Search
•No Mapping
•No Bounding boxes
7. What do we use Location for?
Item Search
8. What do we use Location for?
Item Search
9. What do we use Location for?
Item Search
10. What do we use Location for?
Location Display
11. How did this use to work?
•Yahoo API
•Every lookup was an API call
•Stored user input and API response
•Searched based on text match of API response
•No radius search using lat/lon
•No way to internationalize
12. What services did Etsy need to internalize location services?
•Lookup - Autosuggest
•Update - Scripts to refresh data
•Display - Built into the PHP stack
•Search - Existing, modified for new pattern
13. What we have now
•GeoNames as a data source
•Feeds “geonamessuggest” Solr Core
•SQLite database for place name lookup
•GeoName IDs used for local search
•Leverages GeoName hierarchy data
•Built-in internationalization
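
The place name lookup bullet above can be illustrated with a small sketch. This is not Etsy's actual schema or code: the table `place_names`, its columns, and the helper names are assumptions made for illustration, and the rows are a handful of sample translations. It shows one way a local SQLite file could return a translated place name for a GeoName ID, falling back to English when no translation exists.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical place_names table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE place_names ("
        " geoname_id INTEGER, lang TEXT, name TEXT,"
        " PRIMARY KEY (geoname_id, lang))"
    )
    # Sample rows only; a real deployment would load these from GeoNames dumps.
    conn.executemany(
        "INSERT INTO place_names VALUES (?, ?, ?)",
        [
            (2988507, "en", "Paris"),
            (2988507, "it", "Parigi"),
            (3017382, "en", "France"),
            (3017382, "de", "Frankreich"),
        ],
    )
    return conn


def place_name(conn, geoname_id, lang, fallback="en"):
    """Return the name in `lang`, else in `fallback`, else None."""
    row = conn.execute(
        "SELECT name FROM place_names"
        " WHERE geoname_id = ? AND lang IN (?, ?)"
        " ORDER BY lang = ? DESC LIMIT 1",  # prefer the requested language
        (geoname_id, lang, fallback, lang),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = build_demo_db()
    print(place_name(conn, 2988507, "it"))
    print(place_name(conn, 2988507, "de"))
```

Because the lookup is a local file read rather than a remote API call, it sidesteps the rate limit and latency concerns raised on the earlier slides.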
14. How did we get here?
•Mapped old locations to GeoNames
•Added GeoName ID hierarchy to listing search
•Pushed out SQLite database to webservers
•Slowly transitioned lookup and search services
•Did side by side testing to look for anomalies
15. What are the data types?
GeoNames
21. GeonameId Hierarchy
Local listing search
•Each listing gets a hierarchy of geonameids
•Local search is a filter on this ID
•Fast & Reliable
•Enables powerful functionality
•Kept old data fields
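
As a rough illustration of "each listing gets a hierarchy of geonameids", the sketch below walks a child-to-parent map from a listing's place up to its continent. The `PARENT` map and the function name are hypothetical, and the chain is simplified (real GeoNames hierarchy data has more intermediate levels); it is meant only to show the shape of the data that could be indexed with a listing.

```python
# Hypothetical, simplified child -> parent map of GeoName IDs.
PARENT = {
    5128581: 5128638,  # New York City -> New York (state)
    5128638: 6252001,  # New York (state) -> United States
    6252001: 6255149,  # United States -> North America
}


def geoname_hierarchy(geoname_id, parent=PARENT):
    """Return [place, ..., top-level ancestor] for a GeoName ID."""
    chain = []
    seen = set()  # guards against accidental cycles in the data
    current = geoname_id
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = parent.get(current)
    return chain


if __name__ == "__main__":
    # The full chain would be stored on the listing's search document.
    print(geoname_hierarchy(5128581))
```

Storing the whole chain on every listing means a search for a city, a state, or a country is the same kind of lookup: does the listing's list of IDs contain the requested one.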
22. GeonameId Collection
Local listing search
•Each listing gets a hierarchy of geonameids
•Local search is a filter query on this ID
•Fast & Reliable
•Enables powerful functionality
•Kept old data fields
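
"Local search is a filter query on this ID" might look something like the following when building Solr request parameters. The field name `geoname_ids` and the helper are assumptions for illustration, not Etsy's actual schema; `q` and `fq` are the standard Solr query and filter query parameters.

```python
from urllib.parse import urlencode


def local_search_params(query, geoname_id, field="geoname_ids"):
    """Build Solr params that restrict a text query to one place.

    `field` is a hypothetical multi-valued field holding every GeoName ID
    in the listing's hierarchy (city, state, country, and so on).
    """
    return {
        "q": query,
        # A filter query narrows the result set without affecting scoring.
        "fq": f"{field}:{int(geoname_id)}",
    }


if __name__ == "__main__":
    params = local_search_params("ceramic mug", 5128638)
    print(urlencode(params))
```

Since the filter is an exact match on an integer ID, the same request shape serves local, regional, and country-level searches, with no text matching on place names.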
23. CONFERENCE PARTY
The Tipsy Crow: 770 5th Ave
Starts after Stump The Chump
Your conference badge gets you in the door
TOMORROW
Breakfast starts at 7:30
Keynotes start at 8:30
CONTACT
John Marc Imbrescia
@thejohnmarc
johnmarc@etsy.com