Using Solr in Online Travel to Improve User Experience
Sudhakar Karegowdra, Esteban Donato
Travelocity, May 25th, 2011
{sudhakar.karegowdra, esteban.donato}@travelocity.com

What We Will Cover
  • Travelocity
  • Speakers Background
  • Merchandising & Solr: Challenges, Solution, Sizing and performance data, Take Away
  • Location Resolution & Solr: Challenges, Solution, Sizing and performance data, Take Away
  • Q&A

Travelocity
  • First Online Travel Agency (OTA), launched in 1996
  • Grown to 3,000 employees; one of the largest travel agencies worldwide
  • Headquartered in Dallas/Fort Worth, with satellite offices in San Francisco, New York, London, Singapore, Bangalore, and Buenos Aires, to name a few
  • In 2004, the Roaming Gnome became the centerpiece of marketing efforts and has since become an international pop icon
  • Owned by Sabre Holdings; sister companies include Travelocity Business, IgoUgo.com, lastminute.com, and Zuji, among others

Speakers Background

Sudhakar Karegowdra
Principal Architect, Travelocity.com
  • Experience: 13+ years
  • Solr/Lucene: 3 years
  • Implementing Hadoop, Pig, and Hive for the data warehouse
  • Topic: Merchandising

Esteban Donato
Lead Architect, Travelocity.com
  • Experience: 10+ years
  • Solr: 2 years
  • Analyzing Mahout and Carrot2 for a document clustering engine
  • Topic: Location Resolution

Merchandising
By Sudhakar Karegowdra

The Challenge: Market Drivers
  • Build landing pages with faceted navigation
  • Enable content segmentation and delivery
  • Support rollout of promotions
  • Roll up data to a higher level, e.g., "all 5-star hotels in California" should bring in the 5-star hotels from SFO, LAX, SAN, etc.
  • Faster time to market for new ideas
  • Rapidly scale to accommodate global brands with disparate data sources

The Challenge: Traditional Database Approach
  • Longer time to market
  • Specialized skill set needed to design and optimize database structures and queries
  • Aggregating data and changing structures is quite complex
  • Building faceted navigation requires complex logic, leading to high maintenance cost

Solution - Overview
  • Data from various sources is aggregated and ingested into Solr
  • One core per locale and product type
  • A wrapper service combines some data across product cores and manages configuration rules
  • Solr's built-in search and faceting power the navigation
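
The "core per locale and product type" layout combined with Solr faceting can be sketched as a small request builder. This is only an illustration, not the presenters' wrapper service: the core naming scheme (`product_locale`), the host, and the field names (`state`, `star_rating`, `city`, `amenity`) are assumptions. The request parameters (`q`, `fq`, `facet`, `facet.field`, `facet.mincount`) are standard Solr parameters.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, locale, product, filters, facet_fields, rows=10):
    """Build a faceted /select URL against a per-locale, per-product core."""
    # Hypothetical core naming: one core per product type and locale.
    core = f"{product}_{locale}"
    params = [
        ("q", "*:*"),            # match everything, narrow with filter queries
        ("rows", rows),
        ("facet", "true"),
        ("facet.mincount", 1),   # hide facet values with no matching documents
    ]
    # One filter query per selected facet value (field names are assumptions).
    params += [("fq", f'{field}:"{value}"') for field, value in filters.items()]
    # Facet counts drive the navigation links on the landing page.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/{core}/select?{urlencode(params)}"


url = build_facet_query(
    "http://localhost:8983/solr",
    locale="en_US",
    product="hotel",
    filters={"state": "California", "star_rating": "5"},
    facet_fields=["city", "amenity"],
)
print(url)
```

Filtering on a state-level field while faceting on city is one way to get the "roll up" behaviour: a single request returns all matching hotels plus the per-city counts.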

Solution - Architecture View
[Diagram: UI, Widgets, Mobile; Services/Business Logic; Solr Slaves (Multi Core); Solr Master (Multi Core); Offer Management Tool; Oracle; ETL; Products; Deals; ……]

Solution - Achievements
  • Millions of unique long-tail landing pages, e.g., http://www.travelocity.com/hotel-d4980-nevada-las-vegas-hotels_5-star_business-center_green
  • Faster search across products, e.g., beach deals under $500
  • Segmented content delivery through tagging
  • Scaled well to distribute content to different brands, partners, and advertisers
  • Opened the door to other innovative applications: Deals on Map, Deals on Mobile, Wizards, etc.

Solution - Road Ahead
  • Migration to Solr 3.1
  • Geospatial search
  • CSV output format
  • Query boosting by search pattern
  • Near-real-time updates
  • Deal and user behavior mining in Hadoop (MapReduce), with Solr serving the content
  • Move slaves to the cloud

Sizing & Performance
Index stats
  • Number of cores: 25
  • Number of documents: ~1 million records
Response
  • Requests: 70 TPS
  • Average response time: 0.005 seconds (5 ms)
Software versions
  • Solr 1.4.0
  • filterCache size: 30000
  • Tomcat 5.5.9
  • JDK 1.6

Take Away
  • Semi-structured storage in Solr makes it easy to aggregate disparate sources
  • Remember dynamic fields
  • Use multiple cores to manage data for multiple locales
  • Solr is a great enabler of "innovations"
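
To make the "remember dynamic fields" point concrete, here is a minimal sketch of mapping records from different feeds onto dynamic field names, so new attributes need no schema change. It assumes the schema declares suffix-based dynamic fields (`*_s`, `*_i`, `*_f`, `*_b`); the function name, the `source_s` field, and the sample record are hypothetical.

```python
def to_solr_doc(doc_id, source, record):
    """Map an arbitrary source record to a Solr document using dynamic fields.

    Assumes dynamicField patterns *_s (string), *_i (int), *_f (float)
    and *_b (boolean) exist in the schema.
    """
    doc = {"id": doc_id, "source_s": source}
    for key, value in record.items():
        # bool must be checked before int because bool is a subclass of int.
        if isinstance(value, bool):
            suffix = "_b"
        elif isinstance(value, int):
            suffix = "_i"
        elif isinstance(value, float):
            suffix = "_f"
        else:
            suffix, value = "_s", str(value)
        doc[f"{key}{suffix}"] = value
    return doc


hotel = to_solr_doc(
    "h1",
    "hotel_feed",
    {"star_rating": 5, "city": "Las Vegas", "price": 129.0, "green": True},
)
print(hotel)
```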

Location Resolution
By Esteban Donato

The Challenge
How to develop a global location resolution service?
  • Flexible to changes
  • General enough to cover everyone's needs
  • Multi-language
  • Performance and scalability
  • Configurable by site

Architecture of the solution
[Diagram: Auto-complete; Resolution; Solr Slave; Solr Master; Location DB; Batch Job; Management Tool]
  • Multi-core: each core represents a language
  • Remote streaming indexing
  • CSV format
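
Remote-streaming CSV indexing into a per-language core might look roughly like the sketch below, which only builds the update URL a batch job would call. It assumes remote streaming is enabled in `solrconfig.xml` and that the CSV update handler is mounted at `/update/csv`. The host name, the core name `en`, and the file path are hypothetical; `stream.file`, `stream.contentType`, and `commit` are standard Solr parameters.

```python
from urllib.parse import urlencode


def csv_update_url(base_url, core, csv_path, commit=True):
    """Build a URL asking Solr to read and index a CSV file local to the server."""
    params = {
        # Path is resolved on the Solr master, not on the calling machine.
        "stream.file": csv_path,
        "stream.contentType": "text/plain;charset=utf-8",
        "commit": "true" if commit else "false",
    }
    return f"{base_url}/{core}/update/csv?{urlencode(params)}"


# Hypothetical host, core (one per language), and export path.
update_url = csv_update_url("http://master:8983/solr", "en", "/data/locations_en.csv")
print(update_url)
```

Letting Solr read the file directly avoids pushing the whole export over HTTP from the batch job.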

  • 26. Solr schema
      <dynamicField name="RANK*" type="int" required="false" indexed="true" stored="true" />
      <field name="GLS_FULL_SEARCH" type="glsSearchField" required="false" indexed="true" stored="false" multiValued="true"/>
      <fieldType name="glsSearchField" class="solr.TextField" positionIncrementGap="100">
        <analyzer>
          <tokenizer class="solr.PatternTokenizerFactory" pattern="[/ ]+" />
          <filter class="solr.LowerCaseFilterFactory" />
          <filter class="solr.TrimFilterFactory" />
          <filter class="solr.ISOLatin1AccentFilterFactory" />
          <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
          <filter class="solr.PatternReplaceFilterFactory" pattern="[,.]" replacement="" replace="all"/>
        </analyzer>
      </fieldType>
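
To see what the `glsSearchField` analyzer chain does to a location string, the following sketch approximates it in plain Python. It is an emulation for illustration only, not Solr's code: accent folding uses Unicode decomposition as a stand-in for `ISOLatin1AccentFilterFactory`, and the duplicate-removal filter is omitted because it only drops identical tokens at the same position, which this simple tokenizer never produces. The sample input is made up.

```python
import re
import unicodedata


def fold_accents(token):
    """Approximate accent folding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def analyze(text):
    """Emulate the analyzer chain: split, lowercase, trim, fold, strip [,.]."""
    # PatternTokenizerFactory pattern="[/ ]+": split on runs of slashes/spaces.
    tokens = [t for t in re.split(r"[/ ]+", text) if t]
    result = []
    for token in tokens:
        token = token.lower()               # LowerCaseFilterFactory
        token = token.strip()               # TrimFilterFactory
        token = fold_accents(token)         # ISOLatin1AccentFilterFactory (approx.)
        token = re.sub(r"[,.]", "", token)  # PatternReplaceFilterFactory
        result.append(token)
    return result


tokens = analyze("São Paulo/Brazil, SP.")
print(tokens)
```

Because the same chain runs at index and query time, "Sao Paulo" and "são paulo" end up matching the same indexed tokens.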
  • 27. Resolution
      • The system has to resolve the location requested by the user.
      • Accounts for aliases: Big Apple => New York
      • Accounts for ambiguities.
      • Accounts for misspellings: Lomdon => London
      • NGramDistance algorithm
      • How to combine distance with relevancy
      • Error suggesting the correct location when the input is a prefix: Lond => London
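
The alias and misspelling handling can be illustrated with a toy resolver. Note the substitution: this sketch uses a simple bigram Dice coefficient, not Lucene's `NGramDistance`, and it ignores relevancy entirely. The alias table, the location list, and the `min_score` threshold are all hypothetical.

```python
ALIASES = {"big apple": "New York"}                       # hypothetical alias table
LOCATIONS = ["London", "Lisbon", "Longmont", "New York"]  # hypothetical index


def bigrams(text):
    """Return the set of adjacent character pairs in text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def dice(a, b):
    """Bigram Dice coefficient in [0, 1]; 1.0 means identical bigram sets."""
    grams_a, grams_b = bigrams(a), bigrams(b)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def resolve(query, min_score=0.5):
    """Resolve a user query to a known location, or None if nothing is close."""
    key = query.strip().lower()
    if key in ALIASES:                      # 1. aliases
        return ALIASES[key]
    for name in LOCATIONS:                  # 2. exact match
        if name.lower() == key:
            return name
    # 3. closest name by bigram similarity (stand-in for a spellchecker)
    best = max(LOCATIONS, key=lambda name: dice(key, name.lower()))
    return best if dice(key, best.lower()) >= min_score else None


print(resolve("Big Apple"), resolve("Lomdon"), resolve("Lond"))
```

A pure string-distance ranking like this has no notion of which location is more popular, which is the "how to combine distance with relevancy" question the slide raises.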
  • 28. Spellchecker configuration
      <fieldType name="spellcheckType" class="solr.TextField" positionIncrementGap="100">
        <analyzer>
          <tokenizer class="solr.KeywordTokenizerFactory" />
          <filter class="solr.LowerCaseFilterFactory" />
          <filter class="solr.TrimFilterFactory" />
          <filter class="solr.ISOLatin1AccentFilterFactory" />
          <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
          <filter class="solr.PatternReplaceFilterFactory" pattern="[,.]" replacement="" replace="all"/>
        </analyzer>
      </fieldType>
  • 29. Sizing & Performance
      • 4 cores with ~500,000 documents indexed each
      • Response times
          • Auto-complete: 15 ms, 20 TPS
          • Resolution: 10 ms, 2 TPS
      • Cache configuration
          • queryResultCache: maxSize=1024
          • documentCache: maxSize=1024
          • fieldValueCache and filterCache disabled
  • 30. Wrap Up
      • Performance always as the top priority
      • Develop simple but robust services
      • Provide a simple API
  • 32. Contact
      • Esteban Donato: Esteban.donato@travelocity.com, Twitter: @eddonato
      • Sudhakar Karegowdra: Sudhakar.karegowdra@travelocity.com, Twitter: @skaregowdra
      • https://www.facebook.com/travelocity
      • Twitter: @travelocity and @RoamingGnome