Performance of Neo4J versus MongoDB for Social
actions
May 7, 2014
Santosh S Ravi, University of Southern California, sathyavi@usc.edu
Kalyanaraman Santhanam, University of Southern California, ksanthan@usc.edu
Abstract
Data collected today is highly connected, owing to the social nature of the
way it is accumulated by social networks and other internet companies. Social
network analysis (SNA) is the analysis of such data; it views social
relationships in terms of network theory, consisting of users and the
relationships between them. Graph databases such as Neo4j have risen to handle
these requirements by allowing efficient index-free lookups. We try to
understand the performance of Neo4j relative to other NoSQL data stores,
especially MongoDB. We use three social metrics for our analysis: distance,
network closure and assortativity.
1 Introduction
Relational databases have been popular for storing large amounts of structured
data for the past few decades because of their ACID capabilities. The recent
growth of large volumes of data from social networks and cloud services led to
the development of non-traditional NoSQL data stores such as MongoDB, Neo4j and
HBase. Given our requirement to model highly interconnected social networking
data, graph databases are particularly interesting because such data fits
directly into the graph model.
Modeling social networking data in a relational database requires many-to-many
relations, and even a simple path traversal between two actors requires many
join operations. Graph databases are designed to store data in a way that makes
such traversals easy. Popular benchmarking frameworks such as the Yahoo! Cloud
Serving Benchmark (YCSB) aid in evaluating the performance of emerging cloud
serving systems under different workloads. However, YCSB is not suited to
evaluating the performance of popular social networking actions, such as View
Profile and List Friends, in cloud serving systems. To overcome this
limitation, the BG benchmark evaluates the performance of data stores for
interactive social networking actions and sessions. BG computes either a Social
Action Rating (SoAR) or a Socialites rating of a data store. These ratings
quantify the number of concurrent actions a system can perform while a fixed
percentage of requests satisfies a specified service-level agreement, such as a
response-time bound.
We leveraged BG to assess the performance of social metrics such as Distance,
Network Closure and Assortativity in both the Neo4j and MongoDB data stores.
Neo4j is a popular Java-based graph database that offers high performance,
availability and ACID transactions. Neo4j supports a query language, Cypher, to
access data in the database. We compared the performance of the social metrics
for Neo4j Embedded, Neo4j Cypher over REST, and MongoDB.
2 Description
2.1 Data stores
• Neo4J Community 2.0.0
Run mode: RESTful and embedded
Query mode: Java API and Cypher 2.0
• MongoDB 2.6
2.2 Test setup
All benchmarks are performed on a single machine with specifications as follows:
2.6 GHz Intel Core i5 with 8GB 1600 MHz DDR3 RAM, 256GB SSD, OS X
10.9.2.
2.3 BGBenchmark
We used BGBenchmark (http://bgbenchmark.org) v0.1.4776 to analyze social
networking actions such as Assortativity, Network Closure and Distance. We also
leveraged the viewprofile action in BGBenchmark to test these social actions.
2.4 Data Model and workload
Figure 1 shows BGBenchmark's data model. The workload used for benchmarking
includes 10,000 users with 4 friends per user and 10 resources per user. The
friend relationships were created such that the data forms a torus model, i.e.
every user is connected to every other user via friends-of-friends
relationships. The users are given unique userid values in the range 0..9999 by
the BGWorkload generator.
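The torus layout described above can be sketched as a ring lattice. This is only an illustration under stated assumptions, not BG's actual generator: the function name `build_ring_friendships` is hypothetical, and it assumes that "4 friends per user" means 2 outgoing friendships (to the next two userid values, wrapping around) plus the 2 incoming ones they imply.

```python
def build_ring_friendships(num_users, outgoing_per_user):
    """Return {userid: sorted friend userids} for a ring lattice.

    Assumption (not taken from BG): user u sends friendships to
    u+1 .. u+outgoing_per_user (mod num_users), and friendships are
    symmetric, so each user ends up with 2 * outgoing_per_user friends.
    Requires num_users > 2 * outgoing_per_user.
    """
    friends = {u: set() for u in range(num_users)}
    for u in range(num_users):
        for step in range(1, outgoing_per_user + 1):
            v = (u + step) % num_users  # wrap around to close the ring
            friends[u].add(v)
            friends[v].add(u)
    return {u: sorted(fs) for u, fs in friends.items()}


if __name__ == "__main__":
    graph = build_ring_friendships(10_000, 2)
    print(graph[0])  # [1, 2, 9998, 9999]
```

Under these assumptions every user has exactly four friends, and any user can reach any other by repeatedly following friendships around the ring.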
2.5 Social Metrics
The following are the social metrics identified for the scope of this project.
Network Closure: A measure of the completeness of relational triads.
An individual’s assumption of network closure (i.e. that their friends are also
friends) is called transitivity. Transitivity is an outcome of the individual or
situational trait of Need for Cognitive Closure.
Assortativity: The extent to which actors form ties with similar versus
dissimilar others. Similarity can be defined by gender, race, age, occupation,
educational achievement, status, values or any other salient characteristic.
Figure 1: Data model used for Benchmarking
Distance: The minimum number of ties required to connect two particular
actors, as popularized by the idea of ‘six degrees of separation’.
2.6 Implementation
The code developed can be classified into three sections: Embedded Neo4j,
RESTful Neo4j and MongoDB. The goal was to find the best implementation for the
social metrics identified in Section 2.5. To remain fair in our comparison, we
used the same algorithm in all three sections. The Distance metric is computed
using the breadth-first search (BFS) algorithm; the Assortativity metric
involves iterating through the properties/attributes and finding the
intersection; and Network Closure retrieves all the nodes/actors one hop away
from a given node/actor in a single query.
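The three algorithms can be sketched independently of any data store. The following is a minimal in-memory illustration, not the code used in the benchmark: the function names, the adjacency dictionary `friends` and the profile dictionaries are all hypothetical stand-ins for what each data store would return.

```python
from collections import deque


def bfs_distance(friends, source, target):
    """Distance: minimum number of friendship hops, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        user, hops = queue.popleft()
        for friend in friends.get(user, ()):
            if friend == target:
                return hops + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, hops + 1))
    return None


def shared_attributes(profile_a, profile_b):
    """Assortativity: attributes on which both users hold the same value."""
    return {k: v for k, v in profile_a.items()
            if k in profile_b and profile_b[k] == v}


def one_hop_friends(friends, user):
    """Network closure building block: every actor one hop from `user`."""
    return set(friends.get(user, ()))


def mutual_friends(friends, a, b):
    """Friends that `a` and `b` have in common (intersection of one-hop sets)."""
    return one_hop_friends(friends, a) & one_hop_friends(friends, b)


if __name__ == "__main__":
    friends = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1, 4], 4: [3]}
    print(bfs_distance(friends, 0, 4))    # 3
    print(mutual_friends(friends, 0, 1))  # {2}
    print(shared_attributes({"country": "US", "organization": "USC"},
                            {"country": "US", "organization": "UCLA"}))
    # {'country': 'US'}
```

In a real data store each `friends.get(...)` call would correspond to a query or a relationship traversal, which is where the three implementations differ.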
2.7 Test Suite
The implementations were tested for accuracy using a JUnit test suite. Network
Closure and Assortativity are straightforward to test compared to the Distance
action. Since we already know that the graph topology forms a torus model,
Network Closure results can be validated by adding the outgoing friend count
to, and subtracting the incoming friend count from, the userid. To test the
Assortativity results, we inserted 'country name' and 'organization' properties
with synthetic data for every 100 users. We also used a formula to test the
Distance metric, since it proved less tedious than performing the actual BFS
test.
$$\text{distance} = \min\left(\frac{|d - s|}{f},\ \frac{|N - d + s|}{f}\right),$$

where s, d are the source and destination userid values and f is the number of
outgoing friend relationships
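As a worked illustration, the closed-form check can be written as a small function. This sketch makes three assumptions that the text does not state: N is taken to be the total number of users, the result is rounded up to a whole number of hops, and s and d are ordered so that s <= d before the formula is applied (distance on an undirected ring is symmetric). The name `ring_distance` is hypothetical.

```python
import math


def ring_distance(s, d, n, f):
    """Closed-form hop count between userids s and d on the ring.

    Assumptions (ours, not the paper's): n is the total number of users,
    the source and destination are ordered so that s <= d, and the result
    is rounded up because a partial hop still costs one full hop.
    """
    lo, hi = min(s, d), max(s, d)
    forward = abs(hi - lo) / f      # |d - s| / f
    wrap = abs(n - hi + lo) / f     # |N - d + s| / f
    return math.ceil(min(forward, wrap))


if __name__ == "__main__":
    print(ring_distance(0, 5, 10_000, 2))     # 3
    print(ring_distance(0, 9998, 10_000, 2))  # 1
```

With 10,000 users and f = 2, user 0 reaches user 5 in three hops (for example 0, 2, 4, 5), while user 9998 is a direct neighbour of user 0 across the wrap-around.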
Figure 2: Throughput (actions/sec) comparison
3 Findings
3.1 Observation
Comparing performance across the social metrics, we find that network closure
and assortativity have much higher throughput than the distance metric. This is
because the distance metric performs graph traversals, whereas the other two
metrics just perform lookups on userid, an indexed attribute in Neo4j. Between
network closure and assortativity, network closure performs worse, as expected,
since it involves iterating over all neighbours of the given nodes to find the
intersection of friends between the user nodes.
As expected, the Embedded Java API outperforms the others significantly,
because it eliminates network overhead and object marshalling/unmarshalling
overhead. We also believe the main reasons for Neo4j Cypher performing poorly
are network overhead and the query parsing and optimization performed by the
Cypher engine.
Neo4j's index-free lookup takes center stage for the distance metric. MongoDB
is slower by a factor of at least 10 compared to Neo4j RESTful and at least 100
compared to Neo4j Embedded. The workload examined for the distance metric
requires on average 2000-3000 index lookups to perform the BFS traversal to the
goal node. Since Neo4j uses a Relationship Expander for path traversals, it
avoids the user index lookups that MongoDB must perform. For the other metrics,
MongoDB performs more or less on par with Neo4j REST. However, we believe the
MongoDB wire protocol plays a significant part in MongoDB's higher throughput
for assortativity. The MongoDB protocol operates directly over TCP/IP using the
BSON format, whereas Neo4j REST adds RESTful HTTP protocol headers and JSON
encoding/decoding on top of TCP/IP.
Figure 3: Throughput (actions/sec) comparison for distance metric
4 Future Work
We would like to include these social metrics in the BGBenchmark framework to
extend the current set of actions. We believe that implementing new social
metrics would give a better understanding of how data stores handle complex
graph operations, compared to the existing simple operations.
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Performance neo4j-versus (2)

Performance of Neo4J versus MongoDB for Social Actions
May 7, 2014

Santosh S Ravi, University of Southern California, sathyavi@usc.edu
Kalyanaraman Santhanam, University of Southern California, ksanthan@usc.edu

Abstract

Data collected today is highly connected, owing to the social way in which it is accumulated by social networks and other internet companies. Social network analysis (SNA) is the analysis of such data; it views social relationships in terms of network theory, consisting of users and the relationships between them. Graph databases like Neo4j have risen to handle these requirements by allowing efficient index-free lookups. We try to understand the performance of Neo4j relative to other NoSQL stores, especially MongoDB. We used three social metrics for our analysis: distance, network closure and assortativity.

1 Introduction

Relational databases have been popular for storing large amounts of structured data for the past few decades because of their ACID capabilities. The recent growth of large volumes of data from social networks and cloud services led to the development of non-traditional NoSQL data stores such as MongoDB, Neo4j and HBase. Given our requirement to model highly interconnected social networking data, graph databases are particularly interesting because they fit the model directly.

Modeling social networking data in relational databases requires many-to-many relations and many join operations for a simple path traversal between two actors. Graph databases are designed to store the data in a way that makes such traversals easy. Popular benchmarks like the Yahoo! Cloud Serving Benchmark (YCSB) framework aid in evaluating the performance of emerging cloud serving systems under different workloads. However, YCSB is not suited to evaluating the performance of popular social networking actions, such as View Profile and List Friends, in cloud serving systems.
BG is a benchmark that addresses this limitation by evaluating the performance of data stores for interactive social networking actions and sessions. BG computes either a Social Action Rating (SoAR) or a Socialites rating of a data store. These ratings measure the number of concurrent actions a system can perform for a fixed percentage of requests.

We leveraged BG to assess the performance of social metrics such as Distance, Network Closure and Assortativity in both the Neo4j and MongoDB data stores. Neo4j is a popular Java-based graph database that offers high performance, availability and ACID transactions. Neo4j supports a query language, Cypher, to access the data in the database. We compared the performance of the social metrics for Neo4j Embedded, Neo4j Cypher REST and the MongoDB data store.

2 Description

2.1 Data stores

• Neo4J Community 2.0.0
  Run mode: RESTful and embedded
  Query mode: Java API and Cypher 2.0
• MongoDB 2.6

2.2 Test setup

All benchmarks were performed on a single machine with the following specifications: 2.6 GHz Intel Core i5, 8 GB 1600 MHz DDR3 RAM, 256 GB SSD, OS X 10.9.2.

2.3 BGBenchmark

We used BGBenchmark (http://bgbenchmark.org) v0.1.4776 to analyze the social networking actions Assortativity, Network Closure and Distance. We also leveraged the viewprofile action in BGBenchmark to test these social actions.

2.4 Data Model and workload

Figure 1 shows BGBenchmark's data model. The workload used for benchmarking includes 10,000 users with 4 friends per user and 10 resources per user. The friend relationships were created such that the data forms a torus model, i.e. all users are connected to all other users via friends-of-friends relationships. Users are given unique userids in the range 0..9999 by the BGWorkload generator.

2.5 Social Metrics

The following social metrics were identified for the scope of this project.

Network Closure: A measure of the completeness of relational triads. An individual's assumption of network closure (i.e. that their friends are also friends) is called transitivity. Transitivity is an outcome of the individual or situational trait of Need for Cognitive Closure.

Assortativity: The extent to which actors form ties with similar versus dissimilar others. Similarity can be defined by gender, race, age, occupation, educational achievement, status, values or any other salient characteristic.
Distance: The minimum number of ties required to connect two particular actors, as popularized by the idea of 'six degrees of separation'.

Figure 1: Data model used for benchmarking

2.6 Implementation

The code we developed falls into three parts: Embedded Neo4j, RESTful Neo4j and MongoDB. The goal was to find the best implementation for the social metrics identified in Section 2.5. To keep the comparison fair, we used the same algorithm in all three parts. The Distance metric is computed using the breadth-first search (BFS) algorithm. The Assortativity metric iterates through the properties/attributes and finds their intersection. Network closure retrieves all the nodes/actors within one hop of a given node/actor in a single query.

2.7 Test Suite

The implementations were tested for accuracy using a JUnit test suite. Network closure and Assortativity are straightforward to test compared to the Distance action. Since we already know that the graph topology forms a torus model, Network closure results can be validated by adding or subtracting the outgoing and incoming friend counts to or from the userid. To test the Assortativity results, we inserted 'country name' and 'organization' properties with synthetic data for every 100 users. We also used a formula to test the distance metric, since it proved less tedious than performing the actual BFS test:

\[
\text{distance} = \min\left( \frac{|d - s|}{f},\ \frac{|N - d + s|}{f} \right)
\]

where s and d are the source and destination userids, f is the number of outgoing friend relationships, and N is the total number of users.
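The closed-form check above can be compared against an actual BFS on a small graph. The following Python sketch is illustrative only and rests on assumptions that the paper does not state: user i is taken to be friends with users i±1 … i±f modulo N, friendships are traversed in both directions, the quotient is rounded up to a whole number of hops, and the wrap-around term is written as N − |d − s| so that the check also holds when d < s. The names `ring_friends`, `bfs_distance` and `formula_distance` are hypothetical and are not part of BG or of the authors' code.

```python
from collections import deque


def ring_friends(user, n, f):
    """Neighbours of `user` on an assumed ring: f users ahead and f behind (mod n)."""
    friends = set()
    for k in range(1, f + 1):
        friends.add((user + k) % n)
        friends.add((user - k) % n)
    friends.discard(user)  # only matters on tiny rings where an offset wraps onto the user
    return friends


def bfs_distance(s, d, n, f):
    """Hop count from s to d found by breadth-first search over the assumed ring."""
    if s == d:
        return 0
    visited = {s}
    queue = deque([(s, 0)])
    while queue:
        node, hops = queue.popleft()
        for neighbour in ring_friends(node, n, f):
            if neighbour == d:
                return hops + 1
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None  # unreachable; cannot happen on a connected ring


def formula_distance(s, d, n, f):
    """Closed-form hop count: the shorter way round the ring, f positions per hop."""
    gap = abs(d - s)
    # -(-a // b) is integer ceiling division (rounding up is an assumption of this sketch)
    return min(-(-gap // f), -(-(n - gap) // f))
```

Under these assumptions the two functions agree for every pair of users on small rings, which is the kind of agreement the test suite relies on. With N = 10,000 and f = 2, `formula_distance(0, 5000, 10000, 2)` evaluates to 2500 without any traversal.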
Figure 2: Throughput (actions/sec) comparison

3 Findings

3.1 Observation

On comparing the social metrics, we find that network closure and assortativity have much higher throughput than the distance metric. This is because the distance metric performs graph traversals, whereas the other two metrics just perform lookups on userid, an indexed attribute in Neo4j. Between network closure and assortativity, network closure performs worse, as expected, since it involves iterating over all neighbours of the given nodes to find the intersection of friends between the user nodes.

As expected, the Embedded Java API outperforms the others significantly, because it eliminates network overhead and object marshalling/unmarshalling overhead. We also believe the main reasons for Neo4j Cypher performing poorly are network overhead and the query parsing and optimization performed by the Cypher engine.

Neo4j's index-free lookup takes center stage for the distance metric. MongoDB is slower by a factor of at least 10 compared to Neo4j RESTful and at least 100 compared to Neo4j Embedded. The workload examined for the distance metric requires on average 2000-3000 index lookups to perform the BFS traversal and arrive at the goal node. Since Neo4j uses a Relationship Expander for path traversals, it avoids the user index lookups that MongoDB has to perform.

For the other metrics, MongoDB performs more or less on par with Neo4j REST. However, we believe the MongoDB protocol plays a significant part in MongoDB's higher throughput for assortativity: it runs directly over TCP/IP using the BSON format, whereas Neo4j REST adds RESTful HTTP protocol headers and JSON encoding/decoding on top of TCP/IP.
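The contrast drawn in the observations, where network closure and assortativity amount to per-user lookups followed by an intersection while distance needs a traversal, can be made concrete with a small in-memory sketch. This is one possible reading of the two metrics, not the authors' implementation: `mutual_friends` and `shared_attributes` are hypothetical helper names, and the friend lists and profile values below are made-up sample data (the property keys loosely mirror the 'country name' and 'organization' properties mentioned in the test suite).

```python
def mutual_friends(friends_of, a, b):
    """Friends that users a and b have in common (intersection of one-hop neighbourhoods)."""
    return friends_of[a] & friends_of[b]


def shared_attributes(profile_of, a, b):
    """Attribute names on which users a and b hold the same value."""
    pa, pb = profile_of[a], profile_of[b]
    return {key for key in pa.keys() & pb.keys() if pa[key] == pb[key]}


# Hypothetical sample data: one dictionary lookup per user, no graph traversal needed.
friends_of = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1, 3},
    3: {0, 2},
}
profile_of = {
    0: {"country": "US", "organization": "USC"},
    1: {"country": "US", "organization": "UCLA"},
    2: {"country": "IN", "organization": "USC"},
    3: {"country": "IN", "organization": "IISc"},
}

common = mutual_friends(friends_of, 0, 2)       # users 1 and 3 are friends of both
similar = shared_attributes(profile_of, 0, 1)   # users 0 and 1 share only the country
```

Each call touches only the records of the two users involved, which fits the explanation that these metrics scale with the size of a neighbourhood or profile rather than with the length of a path through the graph.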
Figure 3: Throughput (actions/sec) comparison for the distance metric

4 Future Work

We would like to include these social metrics in the BGBenchmark framework to extend its current set of actions. We believe that implementing new social metrics would give a better understanding of how data stores handle complex graph operations, compared to the existing simple operations.

References

[1] Neo4j Documentation, http://docs.neo4j.org/
[2] MongoDB Documentation, http://docs.mongodb.org/manual/
[3] Florian Holzschuher and René Peinl, Performance of graph query languages: comparison of Cypher, Gremlin and native access in Neo4j.
[4] Social Network Analysis, http://en.wikipedia.org/wiki/Socialnetworkanalysis
[5] BG Benchmark, http://bgbenchmark.org/BG/overview.html