SlideShare ist ein Scribd-Unternehmen logo
1 von 38
Downloaden Sie, um offline zu lesen
Wanderu: Lessons
Learned
Lessons Learned and Unlearned from Building a Travel
Site with Graphs and Neo4j
Eddy Wong
CTO, Wanderu.com
@eddywongch
About Wanderu.com
Search Engine for (Intercity) Buses and Trains
Demo
From pt A to pt B
A: Boston B: DC
NYC
Nomenclature: Stations,Trips
Amtrak, $101, 09/26/2013
Bolt, $25, 09/26/2013 Mega, $24, 09/26/2013
From pt A to pt B
B: Brooklyn, NY
A: Cambridge, MA 31st & 9th Ave, NYC
South Station, Boston
28st & 7th Ave, NYC
34st & 8th Ave, NYC
Our Story
• Tech Started about 1+ yr ago
• Beta in Mar, Launch in Aug
• Knew nothing about Neo4j when we
started (Jun 2012)
• Did not like the relational model: wanted
schema-less and no self-joins
• Wanted a graph model
Relational vs. Graph
Lessons
Learned
UnLearned
Idea
•Architectural
•Modeling
•Geo
Architectural
Lessons
Art: MC Escher
Our Story
• Started with MongoDB as a general store:
easy to manipulate and organize data
• Wanted a db that could preserve the
Graph Model
• Debated: Document vs. Graph
• Could not find one single db that could do
both: general store + graph
Workflow
Store
Scraping JSON
Bus Websites Non-uniform
Data
Uniform
Data
Server
noSQL
• You need to make a choice of one noSQL
database
• You need ONE (centralized) database
• The word “database” is a loaded term
• Lots of (very diff) noSQL dbs options
Our Situation
• Data is written only in one direction
• Users search for paths, then segments
• Searches are done by date
• Needed online capability
• Trip info (price/avail) could change on some
Our Solution
• Use Both: MongoDB + Neo4j
• “Docugraph” = Document + Graph
• Syncing two kinds of databases
• Eventual consistency
Pipeline
Scraping JSON
Bus Websites Non-uniform
Data
Uniform
Data
MongoDBNeo4j
Mongo
Conn
Nodes & Edges
Replica
Mechanism
MongoConnector
• MongoDB Lab project, open source, unsupported
• Uses Replica Mechanism: Oplog
• Eventually Consistent (not real time)
• Written in Python
• Main methods: Upserts and Deletes, passes doc
• Implement DocMgr->Neo4jDocMgr->py2neo
• Other impls: MongoDocMgr, SolrDocMgr,
ESDocMgr
Populating Neo4j (2)
• Created our own way of creating Edges
• Auto Node creation when Edge is created:
Could add Stations (nodes) on the fly
• py2neo requires 2 “node ref”s to create an
edge, ie. might need two round trips to
Neo4j
Edge Creator P-code
hashtable allStations = load_stations
w_create_edge (station_id a, station_id b, otherdata)
look_up a in allStations
If found -> ref_a = allStations.get(a)
If not found ->
ref_a = py2neo.create_node(a)
Add a to allStations
...
py2neo.create_edge(ref_a, ref_b, ...)
Pipeline
Scraping JSON
Bus Websites Non-uniform
Data
MongoDB
Neo4j
Mongo
ConnNodes & Edges
Replica
Mechanism
REST
Server
BOS, NYC
BOS, PHL
NYC, DC
NYC, PHL
Modeling
Lessons
Art: MC Escher
Our Story
• We tried to “dump” all data into Neo4j
• Stations -> Nodes,Trips -> Edges
• Problem: Edges had dates -> too many
Edges -> “Super Node”
• Query perf was terrible (1+ mins) and
worse as # edges increased
Our Story (2)
• Went from Cypher to Gremlin, thinking
that would have improve performance
• Needed range queries on Edges
Our Solution
• Don’t store everything in the Neo4j, only
metadata
• Use Neo4j as an index
• Don’t store entities in Nodes, only keys
• Don’t store heavy properties in Edges
Neo4j Model
source:Tobias Lindaaker, Wes Freeman
Neo4j RuntimeModel
• Relationships are in a linked list
• Properties are in a linked list
• Therefore:There is NO random access for
Relationships or Properties
• A range query of relationships required a
full scan
Our Solution (2)
• Needed ability to do range queries on
Edges
• Serve paths from Neo4j, segments from
MongoDB
• The one thing we tried to avoid we ended
up doing: Joins
• Came up with “Docugraph” approach
Docugraph
• MongoDB Collections for Nodes and Edges
• Neo4j: Only keys for nodes
• Neo4j: Only Properties relevant for queries
Nodes & Edges
• Collection for Stations (nodes)
{id: “BOS”, name: “Boston South
Station”, address: “Summer
St”, ...}
• Collection for Trips (edges)
{depart_id: “BOS”, arrive_id:
“NYC”, carrier: “Megabus”, price:
24.0, ...}
Modeling
• Storing info in two or more dbs
• Doing a “join” across multiple dbs
Joins across DBs
MongoDB: Stations Neo4j: Nodes
BOS BOS
NYC NYC
DC DC
... ...
MongoDB: Trips Neo4j: Edges
BOS-NYC BOS-NYC
BOS-DC BOS-DC
NYC-DC NYC-DC
... ...
• Forget seq id
generated by dbs
• Use a human-created
long string for id
• Convert pair into id:
depart-arrive
• For example: BOS-
NYC
Indexing Technique
• Index Trips by {origin-dest, datetime}
Querying
• REST API in node.js
• Assemble results from two sources
• Paths from Neo4j
• Segments from MongoDB
• Sort by price, duration
Geo Lessons
Art: MC Escher
Our Story
• Wanted to mix public transport data with
intercity data
• Did not want to host all public transport
data
• Created a hybrid solution
Our Solution
• Hybrid:
• Google
Autocomplete
• Google Maps
• In house station geo
lookup
Geo
• Neo4j geo func was not out of the box
• Requires jar install
• Run a Java program to index
• Needed better doc
• Ended up using MongoDB geo instead
• Make geo func out of the box
Conclusions
• Even with a join across dbs -> solution
better than relational
• 10s paths x 100s segments vs. 500k x 500k
• Glad to have picked Neo4j: doing content
gen and more geo features now
• Graph model will be useful for future
analytics->Big Data
Useful Links
• Neo4j Internals
slideshare.net/thobe/an-overview-of-neo4j-internals
• Aseem’s Lessons Learned with Neo4j
http://aseemk.com/talks/neo4j-lessons-learned#/14
• Wes Freeman, Neo4j Internals
http://wes.skeweredrook.com/graphdb-meetup-may-2013.pdf
• MongoConnector
blog.mongodb.org/post/29127828146/introducing-mongo-connector

Weitere ähnliche Inhalte

Andere mochten auch

GraphConnect Europe 2016 - Semantic PIM: Using a Graph Data Model at Toy Manu...
GraphConnect Europe 2016 - Semantic PIM: Using a Graph Data Model at Toy Manu...GraphConnect Europe 2016 - Semantic PIM: Using a Graph Data Model at Toy Manu...
GraphConnect Europe 2016 - Semantic PIM: Using a Graph Data Model at Toy Manu...Neo4j
 
GraphDay Stockholm - Graphs in the Real World: Top Use Cases for Graph Databases
GraphDay Stockholm - Graphs in the Real World: Top Use Cases for Graph DatabasesGraphDay Stockholm - Graphs in the Real World: Top Use Cases for Graph Databases
GraphDay Stockholm - Graphs in the Real World: Top Use Cases for Graph DatabasesNeo4j
 
Graph cafe-lightning
Graph cafe-lightningGraph cafe-lightning
Graph cafe-lightningVolker Pacher
 
Management des issues Github avec Neo4j et NLP
Management des issues Github avec Neo4j et NLPManagement des issues Github avec Neo4j et NLP
Management des issues Github avec Neo4j et NLPChristophe Willemsen
 
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...Sebastian Verheughe
 
Sustainability in Household - Global Product Innovation and Consumer Insights...
Sustainability in Household - Global Product Innovation and Consumer Insights...Sustainability in Household - Global Product Innovation and Consumer Insights...
Sustainability in Household - Global Product Innovation and Consumer Insights...Revista H&C
 
Why would I store my data in more than one database?
Why would I store my data in more than one database?Why would I store my data in more than one database?
Why would I store my data in more than one database?Kurtosys Systems
 
Route Finding in Time Dependent Graphs - Nima Montazeri and Ben Earlam @ Grap...
Route Finding in Time Dependent Graphs - Nima Montazeri and Ben Earlam @ Grap...Route Finding in Time Dependent Graphs - Nima Montazeri and Ben Earlam @ Grap...
Route Finding in Time Dependent Graphs - Nima Montazeri and Ben Earlam @ Grap...Neo4j
 
How we use neo4j for finding public transport routes
How we use neo4j for finding public transport routesHow we use neo4j for finding public transport routes
How we use neo4j for finding public transport routesEvgenii Kozhanov
 
Redis persistence in practice
Redis persistence in practiceRedis persistence in practice
Redis persistence in practiceEugene Fidelin
 
Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...
Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...
Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...MLconf
 
GraphDay Stockholm - Fraud Prevention
GraphDay Stockholm - Fraud PreventionGraphDay Stockholm - Fraud Prevention
GraphDay Stockholm - Fraud PreventionNeo4j
 
Machine Learning and GraphX
Machine Learning and GraphXMachine Learning and GraphX
Machine Learning and GraphXAndy Petrella
 
Graph Databases, a little connected tour (Codemotion Rome)
Graph Databases, a little connected tour (Codemotion Rome)Graph Databases, a little connected tour (Codemotion Rome)
Graph Databases, a little connected tour (Codemotion Rome)fcofdezc
 
Intro to Neo4j presentation
Intro to Neo4j presentationIntro to Neo4j presentation
Intro to Neo4j presentationjexp
 
Microservices + Oracle: A Bright Future
Microservices + Oracle: A Bright FutureMicroservices + Oracle: A Bright Future
Microservices + Oracle: A Bright FutureKelly Goetsch
 
GraphTalks Rome - Introducing Neo4j
GraphTalks Rome - Introducing Neo4jGraphTalks Rome - Introducing Neo4j
GraphTalks Rome - Introducing Neo4jNeo4j
 
Working With a Real-World Dataset in Neo4j: Import and Modeling
Working With a Real-World Dataset in Neo4j: Import and ModelingWorking With a Real-World Dataset in Neo4j: Import and Modeling
Working With a Real-World Dataset in Neo4j: Import and ModelingNeo4j
 
NOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jNOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jTobias Lindaaker
 

Andere mochten auch (20)

GraphConnect Europe 2016 - Semantic PIM: Using a Graph Data Model at Toy Manu...
GraphConnect Europe 2016 - Semantic PIM: Using a Graph Data Model at Toy Manu...GraphConnect Europe 2016 - Semantic PIM: Using a Graph Data Model at Toy Manu...
GraphConnect Europe 2016 - Semantic PIM: Using a Graph Data Model at Toy Manu...
 
GraphDay Stockholm - Graphs in the Real World: Top Use Cases for Graph Databases
GraphDay Stockholm - Graphs in the Real World: Top Use Cases for Graph DatabasesGraphDay Stockholm - Graphs in the Real World: Top Use Cases for Graph Databases
GraphDay Stockholm - Graphs in the Real World: Top Use Cases for Graph Databases
 
Graph cafe-lightning
Graph cafe-lightningGraph cafe-lightning
Graph cafe-lightning
 
Management des issues Github avec Neo4j et NLP
Management des issues Github avec Neo4j et NLPManagement des issues Github avec Neo4j et NLP
Management des issues Github avec Neo4j et NLP
 
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...
Using Graph Databases in Real-Time to Solve Resource Authorization at Telenor...
 
Sustainability in Household - Global Product Innovation and Consumer Insights...
Sustainability in Household - Global Product Innovation and Consumer Insights...Sustainability in Household - Global Product Innovation and Consumer Insights...
Sustainability in Household - Global Product Innovation and Consumer Insights...
 
Why would I store my data in more than one database?
Why would I store my data in more than one database?Why would I store my data in more than one database?
Why would I store my data in more than one database?
 
Route Finding in Time Dependent Graphs - Nima Montazeri and Ben Earlam @ Grap...
Route Finding in Time Dependent Graphs - Nima Montazeri and Ben Earlam @ Grap...Route Finding in Time Dependent Graphs - Nima Montazeri and Ben Earlam @ Grap...
Route Finding in Time Dependent Graphs - Nima Montazeri and Ben Earlam @ Grap...
 
How we use neo4j for finding public transport routes
How we use neo4j for finding public transport routesHow we use neo4j for finding public transport routes
How we use neo4j for finding public transport routes
 
How NOSQL Paid off for Telenor
How NOSQL Paid off for TelenorHow NOSQL Paid off for Telenor
How NOSQL Paid off for Telenor
 
Redis persistence in practice
Redis persistence in practiceRedis persistence in practice
Redis persistence in practice
 
Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...
Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...
Ehtsham Elahi, Senior Research Engineer, Personalization Science and Engineer...
 
GraphDay Stockholm - Fraud Prevention
GraphDay Stockholm - Fraud PreventionGraphDay Stockholm - Fraud Prevention
GraphDay Stockholm - Fraud Prevention
 
Machine Learning and GraphX
Machine Learning and GraphXMachine Learning and GraphX
Machine Learning and GraphX
 
Graph Databases, a little connected tour (Codemotion Rome)
Graph Databases, a little connected tour (Codemotion Rome)Graph Databases, a little connected tour (Codemotion Rome)
Graph Databases, a little connected tour (Codemotion Rome)
 
Intro to Neo4j presentation
Intro to Neo4j presentationIntro to Neo4j presentation
Intro to Neo4j presentation
 
Microservices + Oracle: A Bright Future
Microservices + Oracle: A Bright FutureMicroservices + Oracle: A Bright Future
Microservices + Oracle: A Bright Future
 
GraphTalks Rome - Introducing Neo4j
GraphTalks Rome - Introducing Neo4jGraphTalks Rome - Introducing Neo4j
GraphTalks Rome - Introducing Neo4j
 
Working With a Real-World Dataset in Neo4j: Import and Modeling
Working With a Real-World Dataset in Neo4j: Import and ModelingWorking With a Real-World Dataset in Neo4j: Import and Modeling
Working With a Real-World Dataset in Neo4j: Import and Modeling
 
NOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jNOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4j
 

Ähnlich wie Wanderu - Lessons from Building a Travel Site with Neo4j

OSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross LawleyOSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross LawleyNETWAYS
 
Mongodb intro
Mongodb introMongodb intro
Mongodb introchristkv
 
MongoDB World 2018: Tutorial - MongoDB & NodeJS: Zero to Hero in 80 Minutes
MongoDB World 2018: Tutorial - MongoDB & NodeJS: Zero to Hero in 80 MinutesMongoDB World 2018: Tutorial - MongoDB & NodeJS: Zero to Hero in 80 Minutes
MongoDB World 2018: Tutorial - MongoDB & NodeJS: Zero to Hero in 80 MinutesMongoDB
 
Schema Design (Mongo Austin)
Schema Design (Mongo Austin)Schema Design (Mongo Austin)
Schema Design (Mongo Austin)MongoDB
 
S01 e01 schema-design
S01 e01 schema-designS01 e01 schema-design
S01 e01 schema-designMongoDB
 
Node.js, From Simple to Complex
Node.js, From Simple to ComplexNode.js, From Simple to Complex
Node.js, From Simple to ComplexAlexandra Anghel
 
Webinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDBWebinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDBMongoDB
 
Schema design
Schema designSchema design
Schema designchristkv
 
Building your First MEAN App
Building your First MEAN AppBuilding your First MEAN App
Building your First MEAN AppMongoDB
 
Conceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkConceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkMongoDB
 
Analyzing NYC Transit Data
Analyzing NYC Transit DataAnalyzing NYC Transit Data
Analyzing NYC Transit DataWork-Bench
 
Schema Design by Example ~ MongoSF 2012
Schema Design by Example ~ MongoSF 2012Schema Design by Example ~ MongoSF 2012
Schema Design by Example ~ MongoSF 2012hungarianhc
 
Pre-Aggregated Analytics And Social Feeds Using MongoDB
Pre-Aggregated Analytics And Social Feeds Using MongoDBPre-Aggregated Analytics And Social Feeds Using MongoDB
Pre-Aggregated Analytics And Social Feeds Using MongoDBRackspace
 
Learn Learn how to build your mobile back-end with MongoDB
Learn Learn how to build your mobile back-end with MongoDBLearn Learn how to build your mobile back-end with MongoDB
Learn Learn how to build your mobile back-end with MongoDBMarakana Inc.
 
MongoDB: a gentle, friendly overview
MongoDB: a gentle, friendly overviewMongoDB: a gentle, friendly overview
MongoDB: a gentle, friendly overviewAntonio Pintus
 
Combine Spring Data Neo4j and Spring Boot to quickl
Combine Spring Data Neo4j and Spring Boot to quicklCombine Spring Data Neo4j and Spring Boot to quickl
Combine Spring Data Neo4j and Spring Boot to quicklNeo4j
 
Marc s01 e02-crud-database
Marc s01 e02-crud-databaseMarc s01 e02-crud-database
Marc s01 e02-crud-databaseMongoDB
 
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...MongoDB
 
Migrating from MongoDB to Neo4j - Lessons Learned
Migrating from MongoDB to Neo4j - Lessons LearnedMigrating from MongoDB to Neo4j - Lessons Learned
Migrating from MongoDB to Neo4j - Lessons LearnedNick Manning
 

Ähnlich wie Wanderu - Lessons from Building a Travel Site with Neo4j (20)

OSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross LawleyOSDC 2012 | Building a first application on MongoDB by Ross Lawley
OSDC 2012 | Building a first application on MongoDB by Ross Lawley
 
Mongodb intro
Mongodb introMongodb intro
Mongodb intro
 
MongoDB World 2018: Tutorial - MongoDB & NodeJS: Zero to Hero in 80 Minutes
MongoDB World 2018: Tutorial - MongoDB & NodeJS: Zero to Hero in 80 MinutesMongoDB World 2018: Tutorial - MongoDB & NodeJS: Zero to Hero in 80 Minutes
MongoDB World 2018: Tutorial - MongoDB & NodeJS: Zero to Hero in 80 Minutes
 
MongoDB Basics
MongoDB BasicsMongoDB Basics
MongoDB Basics
 
Schema Design (Mongo Austin)
Schema Design (Mongo Austin)Schema Design (Mongo Austin)
Schema Design (Mongo Austin)
 
S01 e01 schema-design
S01 e01 schema-designS01 e01 schema-design
S01 e01 schema-design
 
Node.js, From Simple to Complex
Node.js, From Simple to ComplexNode.js, From Simple to Complex
Node.js, From Simple to Complex
 
Webinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDBWebinar: Working with Graph Data in MongoDB
Webinar: Working with Graph Data in MongoDB
 
Schema design
Schema designSchema design
Schema design
 
Building your First MEAN App
Building your First MEAN AppBuilding your First MEAN App
Building your First MEAN App
 
Conceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation FrameworkConceptos básicos. Seminario web 5: Introducción a Aggregation Framework
Conceptos básicos. Seminario web 5: Introducción a Aggregation Framework
 
Analyzing NYC Transit Data
Analyzing NYC Transit DataAnalyzing NYC Transit Data
Analyzing NYC Transit Data
 
Schema Design by Example ~ MongoSF 2012
Schema Design by Example ~ MongoSF 2012Schema Design by Example ~ MongoSF 2012
Schema Design by Example ~ MongoSF 2012
 
Pre-Aggregated Analytics And Social Feeds Using MongoDB
Pre-Aggregated Analytics And Social Feeds Using MongoDBPre-Aggregated Analytics And Social Feeds Using MongoDB
Pre-Aggregated Analytics And Social Feeds Using MongoDB
 
Learn Learn how to build your mobile back-end with MongoDB
Learn Learn how to build your mobile back-end with MongoDBLearn Learn how to build your mobile back-end with MongoDB
Learn Learn how to build your mobile back-end with MongoDB
 
MongoDB: a gentle, friendly overview
MongoDB: a gentle, friendly overviewMongoDB: a gentle, friendly overview
MongoDB: a gentle, friendly overview
 
Combine Spring Data Neo4j and Spring Boot to quickl
Combine Spring Data Neo4j and Spring Boot to quicklCombine Spring Data Neo4j and Spring Boot to quickl
Combine Spring Data Neo4j and Spring Boot to quickl
 
Marc s01 e02-crud-database
Marc s01 e02-crud-databaseMarc s01 e02-crud-database
Marc s01 e02-crud-database
 
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
Webinarserie: Einführung in MongoDB: “Back to Basics” - Teil 3 - Interaktion ...
 
Migrating from MongoDB to Neo4j - Lessons Learned
Migrating from MongoDB to Neo4j - Lessons LearnedMigrating from MongoDB to Neo4j - Lessons Learned
Migrating from MongoDB to Neo4j - Lessons Learned
 

Mehr von Neo4j

Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Neo4j
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeNeo4j
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsNeo4j
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j
 
Neo4j_Jesus Barrasa_The Art of the Possible with Graph.pptx.pdf
Neo4j_Jesus Barrasa_The Art of the Possible with Graph.pptx.pdfNeo4j_Jesus Barrasa_The Art of the Possible with Graph.pptx.pdf
Neo4j_Jesus Barrasa_The Art of the Possible with Graph.pptx.pdfNeo4j
 
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...Neo4j
 
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AIDeloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AINeo4j
 
Ingka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignIngka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignNeo4j
 
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Neo4j
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxNeo4j
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxNeo4j
 
Identification of insulin-resistance genes with Knowledge Graphs topology and...
Identification of insulin-resistance genes with Knowledge Graphs topology and...Identification of insulin-resistance genes with Knowledge Graphs topology and...
Identification of insulin-resistance genes with Knowledge Graphs topology and...Neo4j
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNeo4j
 
EY: Graphs as Critical Enablers for LLM-based Assistants- the Case of Custome...
EY: Graphs as Critical Enablers for LLM-based Assistants- the Case of Custome...EY: Graphs as Critical Enablers for LLM-based Assistants- the Case of Custome...
EY: Graphs as Critical Enablers for LLM-based Assistants- the Case of Custome...Neo4j
 
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptxGraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptxNeo4j
 
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptxThe Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptxNeo4j
 
KUBRICK Graphs: A journey from in vogue to success-ion
KUBRICK Graphs: A journey from in vogue to success-ionKUBRICK Graphs: A journey from in vogue to success-ion
KUBRICK Graphs: A journey from in vogue to success-ionNeo4j
 
SKY Paradigms, change and cake: the steep curve of introducing new technologies
SKY Paradigms, change and cake: the steep curve of introducing new technologiesSKY Paradigms, change and cake: the steep curve of introducing new technologies
SKY Paradigms, change and cake: the steep curve of introducing new technologiesNeo4j
 

Mehr von Neo4j (20)

Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!Webinar - IA generativa e grafi Neo4j: RAG time!
Webinar - IA generativa e grafi Neo4j: RAG time!
 
IA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG timeIA Generativa y Grafos de Neo4j: RAG time
IA Generativa y Grafos de Neo4j: RAG time
 
Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)Neo4j: Data Engineering for RAG (retrieval augmented generation)
Neo4j: Data Engineering for RAG (retrieval augmented generation)
 
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdfNeo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
 
Enabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge GraphsEnabling GenAI Breakthroughs with Knowledge Graphs
Enabling GenAI Breakthroughs with Knowledge Graphs
 
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdfNeo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
Neo4j_Anurag Tandon_Product Vision and Roadmap.Benelux.pptx.pdf
 
Neo4j_Jesus Barrasa_The Art of the Possible with Graph.pptx.pdf
Neo4j_Jesus Barrasa_The Art of the Possible with Graph.pptx.pdfNeo4j_Jesus Barrasa_The Art of the Possible with Graph.pptx.pdf
Neo4j_Jesus Barrasa_The Art of the Possible with Graph.pptx.pdf
 
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
SWIFT: Maintaining Critical Standards in the Financial Services Industry with...
 
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AIDeloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
Deloitte & Red Cross: Talk to your data with Knowledge-enriched Generative AI
 
Ingka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by DesignIngka Digital: Linked Metadata by Design
Ingka Digital: Linked Metadata by Design
 
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
Discover Neo4j Aura_ The Future of Graph Database-as-a-Service Workshop_3.13.24
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
 
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptxEmil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
Emil Eifrem at GraphSummit Copenhagen 2024 - The Art of the Possible.pptx
 
Identification of insulin-resistance genes with Knowledge Graphs topology and...
Identification of insulin-resistance genes with Knowledge Graphs topology and...Identification of insulin-resistance genes with Knowledge Graphs topology and...
Identification of insulin-resistance genes with Knowledge Graphs topology and...
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4j
 
EY: Graphs as Critical Enablers for LLM-based Assistants- the Case of Custome...
EY: Graphs as Critical Enablers for LLM-based Assistants- the Case of Custome...EY: Graphs as Critical Enablers for LLM-based Assistants- the Case of Custome...
EY: Graphs as Critical Enablers for LLM-based Assistants- the Case of Custome...
 
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptxGraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
GraphSummit London Feb 2024 - ABK - Neo4j Product Vision and Roadmap.pptx
 
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptxThe Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
The Art of the Possible with Graph by Dr Jim Webber Neo4j.pptx
 
KUBRICK Graphs: A journey from in vogue to success-ion
KUBRICK Graphs: A journey from in vogue to success-ionKUBRICK Graphs: A journey from in vogue to success-ion
KUBRICK Graphs: A journey from in vogue to success-ion
 
SKY Paradigms, change and cake: the steep curve of introducing new technologies
SKY Paradigms, change and cake: the steep curve of introducing new technologiesSKY Paradigms, change and cake: the steep curve of introducing new technologies
SKY Paradigms, change and cake: the steep curve of introducing new technologies
 

Kürzlich hochgeladen

Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdfPedro Manuel
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfinfogdgmi
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopBachir Benyammi
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Websitedgelyza
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfDaniel Santiago Silva Capera
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPathCommunity
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6DianaGray10
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarPrecisely
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintMahmoud Rabie
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxUdaiappa Ramachandran
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Brian Pichman
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostMatt Ray
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1DianaGray10
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding TeamAdam Moalla
 

Kürzlich hochgeladen (20)

201610817 - edge part1
201610817 - edge part1201610817 - edge part1
201610817 - edge part1
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
Nanopower In Semiconductor Industry.pdf
Nanopower  In Semiconductor Industry.pdfNanopower  In Semiconductor Industry.pdf
Nanopower In Semiconductor Industry.pdf
 
Videogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdfVideogame localization & technology_ how to enhance the power of translation.pdf
Videogame localization & technology_ how to enhance the power of translation.pdf
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
NIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 WorkshopNIST Cybersecurity Framework (CSF) 2.0 Workshop
NIST Cybersecurity Framework (CSF) 2.0 Workshop
 
Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
COMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a WebsiteCOMPUTER 10 Lesson 8 - Building a Website
COMPUTER 10 Lesson 8 - Building a Website
 
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdfIaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
IaC & GitOps in a Nutshell - a FridayInANuthshell Episode.pdf
 
UiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation DevelopersUiPath Community: AI for UiPath Automation Developers
UiPath Community: AI for UiPath Automation Developers
 
UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6UiPath Studio Web workshop series - Day 6
UiPath Studio Web workshop series - Day 6
 
AI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity WebinarAI You Can Trust - Ensuring Success with Data Integrity Webinar
AI You Can Trust - Ensuring Success with Data Integrity Webinar
 
Empowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership BlueprintEmpowering Africa's Next Generation: The AI Leadership Blueprint
Empowering Africa's Next Generation: The AI Leadership Blueprint
 
Building AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptxBuilding AI-Driven Apps Using Semantic Kernel.pptx
Building AI-Driven Apps Using Semantic Kernel.pptx
 
Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )Building Your Own AI Instance (TBLC AI )
Building Your Own AI Instance (TBLC AI )
 
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCostKubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
KubeConEU24-Monitoring Kubernetes and Cloud Spend with OpenCost
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 

Wanderu - Lessons from Building a Travel Site with Neo4j

  • 1. Wanderu: Lessons Learned Lessons Learned and Unlearned from Building a Travel Site with Graphs and Neo4j Eddy Wong CTO, Wanderu.com @eddywongch
  • 2. About Wanderu.com Search Engine for (Intercity) Buses and Trains
  • 4. From pt A to pt B A: Boston B: DC NYC Nomenclature: Stations,Trips Amtrak, $101, 09/26/2013 Bolt, $25, 09/26/2013 Mega, $24, 09/26/2013
  • 5. From pt A to pt B B: Brooklyn, NY A: Cambridge, MA 31st & 9th Ave, NYC South Station, Boston 28st & 7th Ave, NYC 34st & 8th Ave, NYC
  • 6. Our Story • Tech Started about 1+ yr ago • Beta in Mar, Launch in Aug • Knew nothing about Neo4j when we started (Jun 2012) • Did not like the relational model: wanted schema-less and no self-joins • Wanted a graph model
  • 10. Our Story • Started with MongoDB as a general store: easy to manipulate and organize data • Wanted a db that could preserve the Graph Model • Debated: Document vs. Graph • Could not find one single db that could do both: general store + graph
  • 11. Workflow Store Scraping JSON Bus Websites Non-uniform Data Uniform Data Server
  • 12. noSQL • You need to make a choice of one noSQL database • You need ONE (centralized) database • The word “database” is a loaded term • Lots of (very diff) noSQL dbs options
  • 13. Our Situation • Data is written only in one direction • Users search for paths, then segments • Searches are done by date • Needed online capability • Trip info (price/avail) could change on some
  • 14. Our Solution • Use Both: MongoDB + Neo4j • “Docugraph” = Document + Graph • Syncing two kinds of databases • Eventual consistency
  • 15. Pipeline Scraping JSON Bus Websites Non-uniform Data Uniform Data MongoDBNeo4j Mongo Conn Nodes & Edges Replica Mechanism
  • 16. MongoConnector • MongoDB Lab project, open source, unsupported • Uses Replica Mechanism: Oplog • Eventually Consistent (not real time) • Written in Python • Main methods: Upserts and Deletes, passes doc • Implement DocMgr->Neo4jDocMgr->py2neo • Other impls: MongoDocMgr, SolrDocMgr, ESDocMgr
  • 17. Populating Neo4j (2) • Created our own way of creating Edges • Auto Node creation when Edge is created: Could add Stations (nodes) on the fly • py2neo requires 2 “node ref”s to create an edge, ie. might need two round trips to Neo4j
  • 18. Edge Creator P-code hashtable allStations = load_stations w_create_edge (station_id a, station_id b, otherdata) look_up a in allStations If found -> ref_a = allStations.get(a) If not found -> ref_a = py2neo.create_node(a) Add a to allStations ... py2neo.create_edge(ref_a, ref_b, ...)
  • 19. Pipeline Scraping JSON Bus Websites Non-uniform Data MongoDB Neo4j Mongo ConnNodes & Edges Replica Mechanism REST Server BOS, NYC BOS, PHL NYC, DC NYC, PHL
  • 21. Our Story • We tried to “dump” all data into Neo4j • Stations -> Nodes,Trips -> Edges • Problem: Edges had dates -> too many Edges -> “Super Node” • Query perf was terrible (1+ mins) and worse as # edges increased
  • 22. Our Story (2) • Went from Cypher to Gremlin, thinking that would have improve performance • Needed range queries on Edges
  • 23. Our Solution • Don’t store everything in the Neo4j, only metadata • Use Neo4j as an index • Don’t store entities in Nodes, only keys • Don’t store heavy properties in Edges
  • 25. Neo4j RuntimeModel • Relationships are in a linked list • Properties are in a linked list • Therefore:There is NO random access for Relationships or Properties • A range query of relationships required a full scan
  • 26. Our Solution (2) • Needed ability to do range queries on Edges • Serve paths from Neo4j, segments from MongoDB • The one thing we tried to avoid we ended up doing: Joins • Came up with “Docugraph” approach
  • 27. Docugraph • MongoDB Collections for Nodes and Edges • Neo4j: Only keys for nodes • Neo4j: Only Properties relevant for queries
  • 28. Nodes & Edges • Collection for Stations (nodes) {id: “BOS”, name: “Boston South Station”, address: “Summer St”, ...} • Collection for Trips (edges) {depart_id: “BOS”, arrive_id: “NYC”, carrier: “Megabus”, price: 24.0, ...}
  • 29. Modeling • Storing info in two or more dbs • Doing a “join” across multiple dbs
  • 30. Joins across DBs MongoDB: Stations Neo4j: Nodes BOS BOS NYC NYC DC DC ... ... MongoDB: Trips Neo4j: Edges BOS-NYC BOS-NYC BOS-DC BOS-DC NYC-DC NYC-DC ... ... • Forget seq id generated by dbs • Use a human-created long string for id • Convert pair into id: depart-arrive • For example: BOS- NYC
  • 31. Indexing Technique • Index Trips by {origin-dest, datetime}
  • 32. Querying • REST API in node.js • Assemble results from two sources • Paths from Neo4j • Segments from MongoDB • Sort by price, duration
  • 34. Our Story • Wanted to mix public transport data with intercity data • Did not want to host all public transport data • Created a hybrid solution
  • 35. Our Solution • Hybrid: • Google Autocomplete • Google Maps • In house station geo lookup
  • 36. Geo • Neo4j geo func was not out of the box • Requires jar install • Run a Java program to index • Needed better doc • Ended up using MongoDB geo instead • Make geo func out of the box
  • 37. Conclusions • Even with a join across dbs -> solution better than relational • 10s paths x 100s segments vs. 500k x 500k • Glad to have picked Neo4j: doing content gen and more geo features now • Graph model will be useful for future analytics->Big Data
  • 38. Useful Links • Neo4j Internals slideshare.net/thobe/an-overview-of-neo4j-internals • Aseem’s Lessons Learned with Neo4j http://aseemk.com/talks/neo4j-lessons-learned#/14 • Wes Freeman, Neo4j Internals http://wes.skeweredrook.com/graphdb-meetup-may-2013.pdf • MongoConnector blog.mongodb.org/post/29127828146/introducing-mongo-connector