Using Solr in Online Travel Shopping to Improve User Experience
1. Using Solr in Online Travel to
Improve User Experience
Sudhakar Karegowdra, Esteban Donato
Travelocity, May 25th, 2011
{sudhakar.karegowdra, esteban.donato}@travelocity.com
2. What We Will Cover
§ Travelocity
§ Speakers Background
§ Merchandising & Solr
• Challenges
• Solution
• Sizing and performance data
• Take Away
§ Location Resolution & Solr
• Challenges
• Solution
• Sizing and performance data
• Take Away
§ Q&A
3. Travelocity
§ First online travel agency (OTA), launched in 1996
§ Grown to 3,000 employees and is one of the largest
travel agencies worldwide
§ Headquartered in Dallas/Fort Worth with satellite
offices in San Francisco, New York, London,
Singapore, Bangalore, Buenos Aires to name a few
§ In 2004, the Roaming Gnome became the
centerpiece of marketing efforts and has become an
international pop icon
§ Owned by Sabre Holdings - sister companies include
Travelocity Business, IgoUgo.com, lastminute.com,
Zuji among others
4. Speakers Background
§ Sudhakar Karegowdra
• Principal Architect, Travelocity.com
§ My experience
– 13+ years
– Solr/Lucene: 3 years
– Implementing Hadoop, Pig and Hive for data warehouse
§ Topic: Merchandising
§ Esteban Donato
• Lead Architect, Travelocity.com
§ My experience
– 10+ years
– Solr: 2 years
– Analyzing Mahout and Carrot2 for document clustering engine
§ Topic: Location Resolution
6. The Challenge
§ Market Drivers
• Build Landing Pages with Faceted Navigation
• Enable Content Segmentation and delivery
• Support Roll out of Promotions
• Roll up Data to a higher level
§ E.g., "all 5-star hotels in California" brings together the
5-star hotels from SFO, LAX, SAN, etc.
• Faster time to market for new ideas
• Rapidly scale to accommodate global brands
with disparate data sources
7. The Challenge
§ Traditional Database approach
• Longer time to market
• Specialized skill set to design and optimize
database structures and queries
• Aggregating data and changing structures is
quite complex
• Building faceted navigation needs complex logic,
leading to high maintenance cost
8. Solution - Overview
§ Data from various sources aggregated and
ingested into Solr
• Core per Locale and Product Type
§ Wrapper service to combine some data across
product cores and manage configuration rules
§ Solr’s built-in search and faceting power the
navigation
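The core-per-locale layout and faceted navigation described above can be sketched as a small URL builder. This is only an illustration: the host, the `product_locale` core naming scheme, and the field names (`state`, `star_rating`, `city`, `amenity`) are assumptions, not details from the talk. The request parameters (`q`, `fq`, `facet`, `facet.field`, `facet.mincount`) are standard Solr parameters.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # assumed host, not from the talk


def core_name(product, locale):
    """Hypothetical naming scheme: one core per product type and locale."""
    return f"{product}_{locale}"


def facet_query_url(product, locale, filters, facet_fields, rows=10):
    """Build a Solr select URL that filters on `filters` and facets on `facet_fields`."""
    params = [
        ("q", "*:*"),
        ("rows", rows),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", 1),
    ]
    # One fq per filter, so each filter can be cached independently by Solr.
    params += [("fq", f'{field}:"{value}"') for field, value in filters.items()]
    # One facet.field per navigation dimension shown on the landing page.
    params += [("facet.field", field) for field in facet_fields]
    return f"{SOLR_BASE}/{core_name(product, locale)}/select?{urlencode(params)}"


# Roll-up example: filter on the state instead of individual airports or cities.
url = facet_query_url(
    "hotel",
    "en_US",
    filters={"state": "California", "star_rating": "5"},
    facet_fields=["city", "amenity"],
)
print(url)
```

Filtering on a higher-level field such as the state is one way to get the "roll up" behaviour from the challenge slide; the facet counts on `city` then give the navigation links for drilling back down.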
10. Solution - Achievements
§ Millions of unique Long Tail Landing Pages
§ E.g.,
http://www.travelocity.com/hotel-d4980-nevada-las-vegas-hotels_5-star_business-center_green
§ Faster search across products
§ E.g., Beach Deals under $500
§ Segmented Content delivery through tagging
§ Scaled well to distribute the content to different
brands, partners and advertisers
§ Opened up for other innovative applications
§ Deals on Map, Deals on Mobile, Wizards, etc.
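As a rough illustration of how a long-tail landing page URL could map onto Solr filters, the sketch below splits the example slug into a location part and facet tags. The slug layout (`<product>-d<id>-<location words>-hotels_<tag>_<tag>...`), the meaning of `d4980` as a destination id, and the field names `destination_id` and `tag` are all guesses made for the example; the talk does not describe the actual URL scheme.

```python
def parse_landing_slug(slug):
    """Split a landing-page slug into location information and facet tags.

    Assumed layout: '<product>-d<id>-<location words>-hotels_<tag>_<tag>...'
    """
    head, *tags = slug.split("_")
    parts = head.split("-")
    return {
        "product": parts[0],
        "destination_id": parts[1][1:],      # drop the leading 'd'
        "location": " ".join(parts[2:-1]),   # words between the id and 'hotels'
        "tags": tags,
    }


def to_filter_queries(parsed):
    """Turn the parsed slug into Solr fq clauses (field names are hypothetical)."""
    fqs = [f'destination_id:{parsed["destination_id"]}']
    fqs += [f'tag:"{tag}"' for tag in parsed["tags"]]
    return fqs


parsed = parse_landing_slug(
    "hotel-d4980-nevada-las-vegas-hotels_5-star_business-center_green"
)
print(parsed)
print(to_filter_queries(parsed))
```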
11. Solution – Road Ahead
§ Migration to Solr 3.1
• Geo spatial search
• CSV output format
§ Query boosting by Search pattern
§ Near Real time Updates
§ Deal and user behavior mining in Hadoop –
MapReduce and Solr to Serve the Content
§ Move Slaves to Cloud
12. Sizing & Performance
§ Index Stats
§ Number of Cores : 25
§ Number of Documents : ~ 1 Million Records
§ Response
§ Requests : 70 tps
§ Average response time : 0.005 seconds (5 ms)
§ Software Versions
§ Solr Version 1.4.0
– filterCache size : 30000
§ Tomcat – 5.5.9
§ JDK1.6
13. Take Away
§ Semi Structured Storage in Solr helps
aggregate disparate sources easily
Remember Dynamic fields
§ Multiple Cores to manage multiple locale data
§ Solr is a great enabler of “Innovations”
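The point about dynamic fields can be illustrated with a small mapper that turns records from different sources into Solr documents without a schema change per source. The suffixes `_s`, `_i`, `_f` and `_b` mirror the dynamic field patterns in Solr's example schema; which patterns Travelocity actually used is not stated in the talk, and the sample records are invented.

```python
def to_solr_doc(doc_id, source, record):
    """Map an arbitrary source record onto dynamic-field names by value type."""
    doc = {"id": doc_id, "source_s": source}
    for key, value in record.items():
        if isinstance(value, bool):      # check bool first: bool is a subclass of int
            doc[f"{key}_b"] = value
        elif isinstance(value, int):
            doc[f"{key}_i"] = value
        elif isinstance(value, float):
            doc[f"{key}_f"] = value
        else:
            doc[f"{key}_s"] = str(value)
    return doc


# Two invented records with different shapes, as if from two data sources.
hotel = to_solr_doc("h-1", "hotel_feed", {"city": "Las Vegas", "stars": 5, "green": True})
deal = to_solr_doc("d-7", "deal_feed", {"theme": "beach", "price": 499.0})
print(hotel)
print(deal)
```

Both documents can live in the same core because every field name matches a dynamic field pattern, which is what makes aggregating disparate sources cheap.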
15. The Challenge
§ How to develop a global location resolution
service?
§ Flexible to change
§ General enough to cover everyone's needs
§ Multi language
§ Performance and scalability
§ Configurable by site
16. Architecture of the solution
§ Master/Slave architecture
§ Multi-core: each core represents a language
§ SolrJ client
§ binary format
§ Solr response cache
§ Remote Streaming indexing
§ CSV format
Diagram components: Auto-complete, Resolution, Solr Slave, Solr Master, Management Tool, Batch Job, Location DB
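The batch indexing path (CSV format plus remote streaming) might look roughly like the sketch below: a job dumps locations to CSV, then asks the master to fetch the file itself through Solr's `/update/csv` handler using the `stream.url` parameter. The host, the core name `en`, the column names and the file URL are assumptions for illustration. Remote streaming also has to be enabled in `solrconfig.xml` for `stream.url` to be accepted.

```python
import csv
import io
from urllib.parse import urlencode

SOLR_MASTER = "http://localhost:8983/solr"  # assumed master host


def locations_to_csv(rows, fieldnames):
    """Serialize location rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def csv_update_url(core, csv_location, commit=True):
    """URL asking the master to pull a CSV file itself via remote streaming."""
    params = {
        "stream.url": csv_location,
        "stream.contentType": "text/plain;charset=utf-8",
        "commit": str(commit).lower(),
    }
    return f"{SOLR_MASTER}/{core}/update/csv?{urlencode(params)}"


csv_text = locations_to_csv(
    [{"code": "LAS", "name": "Las Vegas"}, {"code": "PAR", "name": "Paris"}],
    fieldnames=["code", "name"],
)
update_url = csv_update_url("en", "http://batch.example.com/locations_en.csv")
print(csv_text)
print(update_url)
```

With one core per language, the same job would run once per language file, each against its own core on the master, and the slaves would pick up the changes through replication.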
17. Auto-complete
§ System has to suggest options as the users
type their desired location
§ Examples: “san” => San Francisco, “veg” =>
Las Vegas
§ Relevancy: not all the locations are equally
important. “par” => “Paris, France”; “Parana,
Argentina”
§ Users can search by various fields: location
code, location name, city code, city name,
state/province code, state/province name,
country code, country name.
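The auto-complete requirements above (prefix matching on several fields, ordered by relevancy) can be sketched in plain Python. This is an in-memory illustration of the behaviour, not the Solr implementation from the talk; the `relevancy` weights and the sample data are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    code: str
    name: str
    country: str
    relevancy: float  # hypothetical importance weight, higher is more important


LOCATIONS = [
    Location("PAR", "Paris", "France", 0.95),
    Location("PRA", "Parana", "Argentina", 0.30),
    Location("SFO", "San Francisco", "United States", 0.90),
    Location("LAS", "Las Vegas", "United States", 0.85),
]


def suggest(prefix, locations=LOCATIONS, limit=5):
    """Return locations where any word of any field starts with `prefix`,
    most relevant first."""
    needle = prefix.strip().lower()
    if not needle:
        return []

    def matches(loc):
        words = f"{loc.code} {loc.name} {loc.country}".lower().split()
        return any(word.startswith(needle) for word in words)

    hits = [loc for loc in locations if matches(loc)]
    hits.sort(key=lambda loc: loc.relevancy, reverse=True)
    return [f"{loc.name}, {loc.country}" for loc in hits[:limit]]


print(suggest("par"))
print(suggest("veg"))
```

Matching on word starts rather than only on the start of the full name is what lets “veg” find Las Vegas.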
19. Resolution
§ System has to resolve the location requested
by the users.
§ Handles aliases. Big Apple => New York
§ Handles ambiguities.
§ Handles misspellings. Lomdon => London
§ NGramDistance algorithm.
§ How to combine distance with relevancy
§ Errors suggesting the correct location when the input is a prefix.
Lond => London
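One possible way to combine string distance with relevancy is sketched below. It does not reproduce Lucene's `NGramDistance`; it substitutes a simpler Dice coefficient over character bigrams, and blends it with a relevancy weight using a linear formula chosen only for illustration. The location list, the weights, `min_similarity` and `relevancy_weight` are all assumptions.

```python
ALIASES = {"big apple": "New York"}  # alias example taken from the slide
# Hypothetical relevancy weights in [0, 1].
LOCATIONS = {"London": 0.9, "New York": 0.95, "Londrina": 0.2}


def bigrams(text):
    """Character bigrams of the lowercased text, padded with spaces."""
    padded = f" {text.lower()} "
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


def similarity(a, b):
    """Dice coefficient over character bigrams (1.0 means identical)."""
    grams_a, remaining = bigrams(a), bigrams(b)
    total = len(grams_a) + len(remaining)
    shared = 0
    for gram in grams_a:
        if gram in remaining:
            remaining.remove(gram)  # each bigram of b can be matched only once
            shared += 1
    return 2 * shared / total


def resolve(query, min_similarity=0.5, relevancy_weight=0.2):
    """Resolve a user string to a location name, or None if nothing is close."""
    key = query.strip().lower()
    if key in ALIASES:
        return ALIASES[key]
    best, best_score = None, 0.0
    for name, relevancy in LOCATIONS.items():
        sim = similarity(key, name)
        if sim < min_similarity:
            continue
        # Blend string similarity with how important the location is.
        score = (1 - relevancy_weight) * sim + relevancy_weight * relevancy
        if score > best_score:
            best, best_score = name, score
    return best


print(resolve("Lomdon"))
print(resolve("Big Apple"))
```

The `relevancy_weight` knob is where the "distance versus relevancy" trade-off lives: a higher value favours well-known locations over closer spellings.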