Wanderu – Lessons from Building a Travel Site with Neo4j - Eddy Wong @ GraphConnect NY 2013

3,534 views


Wanderu is a consumer-focused search engine for buses and trains. Eddy will recount the architectural, modeling and other technical “lessons learned” and “lessons unlearned” in implementing our geospatial and search features using Neo4j in the context of a NoSQL polyglot solution.

Published in: Technology


  1. Wanderu: Lessons Learned
     Lessons Learned and Unlearned from Building a Travel Site with Graphs and Neo4j
     Eddy Wong, CTO, Wanderu.com (@eddywongch)
  2. About Wanderu.com
     Search Engine for (Intercity) Buses and Trains
  3. Demo
  4. From pt A to pt B
     A Shortest Path Problem as a function of depart, arrive, price, duration, date times
     Diagram: stations A: NYC, Philly, B: DC; trips labeled "MEG, $9, 11/07/2013", "MEG, $4, 11/07/2013", "BOLT, $13, 11/07/2013"
     Nomenclature: Stations, Trips
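A minimal sketch of the shortest-path framing on this slide, using price as the only weight. The `Trip` class, the `cheapest_path` function, and the assignment of the slide's three fares to specific station pairs are all assumptions made for illustration; the transcript does not say which trip connects which stations, and the talk does not show Wanderu's actual search code.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    carrier: str
    depart: str   # departure station code
    arrive: str   # arrival station code
    price: float  # fare in dollars


def cheapest_path(trips, source, target):
    """Dijkstra's algorithm over stations, with trip price as the edge weight."""
    outgoing = {}
    for trip in trips:
        outgoing.setdefault(trip.depart, []).append(trip)

    # Heap entries: (total price, insertion counter, station, trips taken so far).
    # The counter breaks ties so that Trip objects are never compared.
    heap = [(0.0, 0, source, ())]
    counter = 1
    settled = set()
    while heap:
        cost, _, station, taken = heapq.heappop(heap)
        if station == target:
            return cost, list(taken)
        if station in settled:
            continue
        settled.add(station)
        for trip in outgoing.get(station, []):
            if trip.arrive not in settled:
                heapq.heappush(
                    heap, (cost + trip.price, counter, trip.arrive, taken + (trip,))
                )
                counter += 1
    return None  # target is unreachable from source


# Hypothetical assignment of the slide's fares to station pairs.
TRIPS = [
    Trip("MEG", "NYC", "PHL", 9.0),
    Trip("MEG", "PHL", "DC", 4.0),
    Trip("BOLT", "NYC", "DC", 13.0),
]

result = cheapest_path(TRIPS, "NYC", "DC")
```

Under this hypothetical assignment the direct trip and the two-leg route cost the same ($9 + $4 = $13), which hints at why a real search would also need to weigh duration and departure and arrival times rather than price alone.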
  5. Lessons
     • Architectural
     • Modeling
     • Geo
     Learned / UnLearned / Idea
  6. Our Story
     • 2 yr startup, Tech started about 1+ yr ago
     • Beta in Mar 2013, Launch in Aug 2013
     • Knew nothing about Neo4j when we started (Jun 2012)
     • Did not like the relational model: wanted schema-less and no self-joins
     • Wanted a graph model
  7. Workflow
     Diagram labels: Scraping, Bus Websites, JSON, Non-uniform Data, Server, Store, Uniform Data
  8. Architectural Lessons (Art: MC Escher)
  9. Our Situation
     • Data is written only in one direction
     • Users search for paths, then segments
     • Searches are done by date
     • Needed online capability
     • Trip info (price/avail) could change on some
  10. Solution
     Diagram labels: Scraping, Bus Websites, JSON, Uniform Data, Non-uniform Data, Replica Mechanism, Nodes & Edges, Neo4j, Mongo Conn, MongoDB
  11. MongoConnector
     • MongoDB Lab project, open source, unsupported
     • Uses Replica Mechanism: Oplog
     • Eventually Consistent (not real time)
     • Written in Python
     • Main methods: Upserts and Deletes, passes doc
     • Implement DocMgr->Neo4jDocMgr->py2neo
     • We can add new properties easily on the fly
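The "upserts and deletes, passes doc" idea can be sketched roughly as below. This is not the mongo-connector interface and not py2neo code: `InMemoryGraph`, `GraphDocManager`, the method names `upsert` and `remove`, and the `light_fields` parameter are all hypothetical stand-ins, chosen so the example runs without a database. A real doc manager would forward the same two operations to Neo4j.

```python
class InMemoryGraph:
    """Stand-in for a graph store; a real setup would talk to Neo4j instead."""

    def __init__(self):
        self.nodes = {}  # node key -> property dict

    def merge_node(self, key, properties):
        # Create the node if it is missing, otherwise update its properties.
        self.nodes.setdefault(key, {}).update(properties)

    def delete_node(self, key):
        self.nodes.pop(key, None)


class GraphDocManager:
    """Hypothetical doc manager: receives documents replayed from the oplog."""

    def __init__(self, graph, light_fields=("code", "name")):
        self.graph = graph
        # Only these fields are copied into the graph; extending the tuple is
        # how a new property would be added "on the fly" in this sketch.
        self.light_fields = light_fields

    def upsert(self, doc):
        key = str(doc["_id"])
        properties = {f: doc[f] for f in self.light_fields if f in doc}
        self.graph.merge_node(key, properties)

    def remove(self, doc):
        self.graph.delete_node(str(doc["_id"]))


graph = InMemoryGraph()
manager = GraphDocManager(graph)
manager.upsert({"_id": "BOS", "code": "BOS", "name": "Boston", "schedule": ["..."]})
```

Because the manager is driven by a replication log, the graph lags the document store slightly, which matches the slide's "eventually consistent (not real time)" caveat.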
  12. Polyglot Arch
     Station pairs shown: (BOS, NYC), (BOS, PHL), (NYC, DC), (NYC, PHL)
     Diagram labels: Scraping, Bus Websites, JSON, Non-uniform Data, Replica Mechanism, MongoDB, REST Server, Nodes & Edges, Neo4j, Mongo Conn
  13. Modeling Lessons (Art: MC Escher)
  14. Our Story
     • We tried to “dump” all data into Neo4j
     • Edges had dates -> too many Edges -> “Super Node Problem”
     • Query perf was terrible (1+ mins) and worse as # edges increased
     • Tried Gremlin -> No improvements
     • Needed range queries on Edges
  15. “Dehydrate”
     • Don’t store everything in Neo4j, only metadata
     • Use Neo4j as a “connection index”
     • Don’t store entities in Nodes, only keys
     • Don’t store heavy properties in Edges
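One way to picture the "dehydrate" step is the sketch below, which collapses many dated trips into a single light edge per station pair and leaves the full records for the document store. The function names, the field names (`depart`, `arrive`, `date`, `price`, `carrier`), and the `edge_id` format are assumptions for illustration, not the schema used in the talk.

```python
def dehydrate(trip):
    """Split a full trip record into a light graph edge and a document.

    The edge keeps only keys (no dates, prices, or other heavy properties);
    the document keeps everything plus the edge id to join back on.
    """
    edge_id = f"{trip['depart']}-{trip['arrive']}"
    edge = {"id": edge_id, "from": trip["depart"], "to": trip["arrive"]}
    document = {**trip, "edge_id": edge_id}
    return edge, document


def build_connection_index(trips):
    """Return (edges keyed by id, documents) for a batch of trip records."""
    edges = {}
    documents = []
    for trip in trips:
        edge, document = dehydrate(trip)
        edges[edge["id"]] = edge  # same station pair -> same single edge
        documents.append(document)
    return edges, documents
```

With this split, the number of edges grows with the number of connected station pairs rather than with the number of dated departures, which is the property the slide's "connection index" wording suggests.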
  16. Neo4j Model (source: Wes Freeman, Tobias Lindaaker)
  17. Our Solution
     • Serve paths from Neo4j
     • Segments from MongoDB (with date constraints)
     • Back to “Joins”
     • “Join” across Neo4j + MongoDB: 1 != 525d9031e6c9236072114387
  18. Joins across DBs
     MongoDB: Stations <-> Neo4j: Nodes (BOS, NYC, DC, ...)
     • Forget seq id generated by dbs
     • Use a human-created “UUID” string for id
     MongoDB: Trips <-> Neo4j: Edges (BOS-NYC, BOS-DC, NYC-DC, ...)
     • Convert pair into id: depart-arrive
     • For example: BOSNYC
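The cross-database "join" described on the last two slides might look roughly like this: a path of station codes (as the graph would return it) is turned into pair ids, and each id is looked up in the trip store with a date constraint. The dictionary `TRIPS_BY_SEGMENT` stands in for a MongoDB collection, and the sample trips, the function names, and the hyphenated id format are hypothetical.

```python
from datetime import date

# Stand-in for a MongoDB trips collection, keyed by the shared pair id.
TRIPS_BY_SEGMENT = {
    "BOS-NYC": [
        {"carrier": "MEG", "date": date(2013, 11, 7), "price": 15.0},
    ],
    "NYC-DC": [
        {"carrier": "BOLT", "date": date(2013, 11, 7), "price": 13.0},
        {"carrier": "MEG", "date": date(2013, 11, 8), "price": 9.0},
    ],
}


def segment_ids(path):
    """Turn a station path such as ['BOS', 'NYC', 'DC'] into pair ids."""
    return [f"{a}-{b}" for a, b in zip(path, path[1:])]


def join_path_with_trips(path, trips_by_segment, travel_date):
    """Attach the trips available on travel_date to each leg of a path.

    Returns a list of (segment id, trips) tuples, or None if any leg has
    no trip on that date.
    """
    legs = []
    for seg_id in segment_ids(path):
        candidates = [
            trip for trip in trips_by_segment.get(seg_id, [])
            if trip["date"] == travel_date
        ]
        if not candidates:
            return None
        legs.append((seg_id, candidates))
    return legs


legs = join_path_with_trips(["BOS", "NYC", "DC"], TRIPS_BY_SEGMENT, date(2013, 11, 7))
```

The join only works because both stores agree on the same human-readable key; an auto-generated sequence id in one store and an ObjectId in the other (the slide's `1 != 525d9031e6c9236072114387`) would not match.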
  19. Geo Lessons (Art: MC Escher)
  20. Hybrid Solution
     • Google Autocomplete
     • Google Maps
     • MongoDB station geo lookup
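The station geo lookup presumably maps a resolved location to the closest known station. The sketch below shows that idea in plain Python with a haversine distance; it is a stand-in for a database geo query, not the MongoDB call used in the talk. The station table uses approximate city-center coordinates, and all names in it are assumptions.

```python
import math

# Approximate city-center coordinates (latitude, longitude), for illustration.
STATIONS = {
    "BOS": (42.3601, -71.0589),
    "NYC": (40.7128, -74.0060),
    "PHL": (39.9526, -75.1652),
    "DC": (38.9072, -77.0369),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_station(point, stations=STATIONS):
    """Return the code of the station closest to the given (lat, lon) point."""
    return min(stations, key=lambda code: haversine_km(point, stations[code]))
```

A linear scan like this is only reasonable for a small station table; a geospatial index in the database would serve the same lookup at larger scale.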
  21. Lessons of Lessons
     • Really understand the Neo4j Runtime Model
     • Pick universal human generated ids
     • Join across dbs better than RDBMS: 10s paths x 100s segments vs. 500k x 500k
     • Glad to have picked Neo4j: doing content gen and more geo features now
  22. Useful Links
     • Neo4j Internals: slideshare.net/thobe/an-overview-of-neo4j-internals
     • Aseem’s Lessons Learned with Neo4j: http://aseemk.com/talks/neo4j-lessons-learned#/14
     • Wes Freeman, Neo4j Internals: http://wes.skeweredrook.com/graphdb-meetup-may-2013.pdf
     • MongoConnector: blog.mongodb.org/post/29127828146/introducing-mongo-connector
