This document benchmarks TPC-H queries in MongoDB against MySQL. It introduces MongoDB and describes the TPC-H data setup, in which all tables are embedded into a single MongoDB collection. Six sample queries are run using Map-Reduce and the Aggregation Framework. In the reported results MongoDB is slower than MySQL on five of the six queries (Query 20 is the exception), which the document attributes to data conversion difficulties and an immature Aggregation Framework. It concludes that MongoDB suits some applications but is not well suited to complex queries like those in TPC-H, citing its lack of a standard query language and limited server-side processing.
2. Agenda
• Introduction to MongoDB
• TPC-H Data Setup
• Schema
• Advantages and Disadvantages of New Schema
• Queries
o Pricing Summary Report Query
o National Market Share Query
o Top Supplier Query
o Potential Part Promotion Query
o Suppliers Who Kept Orders Waiting Query
o Global Sales Opportunity Query
• Benchmark result
• Discussion
• Demonstration
3. Introduction to MongoDB
• Open source, document-oriented and schema-free
• Stores data in BSON format
• Easy to understand
• Flexible, Scalable & lightweight
• Ease of use
• No ‘join’ operation
• SQL to MongoDB sample query
o SQL: SELECT * FROM users WHERE status = 'A' ORDER BY user_id DESC
o MongoDB: db.users.find( { status: "A" } ).sort( { user_id: -1 } )
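To make the mapping concrete, the following is a minimal sketch in plain JavaScript of what the find/sort pair does, run against an in-memory array instead of a MongoDB server. The sample documents and the helper name findActiveUsersDesc are assumptions made for illustration; only the field names status and user_id come from the slide.

```javascript
// Hypothetical sample documents; only the field names come from the slide.
const users = [
  { user_id: 1, status: "A" },
  { user_id: 2, status: "B" },
  { user_id: 3, status: "A" },
];

// In-memory equivalent of find({ status: "A" }).sort({ user_id: -1 }):
// keep the matching documents, then order them by user_id descending.
function findActiveUsersDesc(docs) {
  return docs
    .filter((doc) => doc.status === "A")
    .sort((a, b) => b.user_id - a.user_id);
}

const activeUsers = findActiveUsersDesc(users);
console.log(activeUsers.map((u) => u.user_id)); // [ 3, 1 ]
```

The filter corresponds to the WHERE clause and the comparator to ORDER BY ... DESC; in MongoDB both steps run on the server.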
4. TPC-H Data Setup
• Import data into MongoDB
o Use MongoVue to import from MySQL
o Time-consuming and difficult
• To achieve flexibility:
o Embed all tables into a single collection
o Starting from the lineitem table, replace every foreign key with the object it references
o lineitem chosen as the root table because it has no primary key
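The embedding step can be sketched roughly as follows. This is not the conversion actually used for the benchmark (that was done with MongoVue); it is a minimal plain-JavaScript illustration under stated assumptions: the tiny tables and their values are made up, column names follow the usual TPC-H naming, and the helpers indexBy, embed and buildLineitemDocuments are hypothetical. Only the lineitem, order, customer, nation, region chain is shown; partsupp, part and supplier would follow the same pattern.

```javascript
// Hypothetical miniature tables; values are invented for illustration.
const region = [{ r_regionkey: 1, r_name: "AMERICA" }];
const nation = [{ n_nationkey: 24, n_name: "UNITED STATES", n_regionkey: 1 }];
const customer = [{ c_custkey: 7, c_name: "Customer#000000007", c_nationkey: 24 }];
const orders = [{ o_orderkey: 100, o_custkey: 7, o_orderdate: "1995-03-15" }];
const lineitem = [
  { l_orderkey: 100, l_linenumber: 1, l_quantity: 17, l_extendedprice: 2100.5 },
];

// Build a key -> row lookup for one table.
function indexBy(rows, key) {
  return new Map(rows.map((row) => [row[key], row]));
}

// Return a copy of `row` in which the foreign-key column is dropped and
// the referenced row is stored under `field` instead.
function embed(row, foreignKey, lookup, field) {
  const { [foreignKey]: keyValue, ...rest } = row;
  return { ...rest, [field]: lookup.get(keyValue) };
}

// Resolve references from the innermost table outwards, ending at lineitem.
function buildLineitemDocuments() {
  const regions = indexBy(region, "r_regionkey");
  const nations = indexBy(
    nation.map((n) => embed(n, "n_regionkey", regions, "region")),
    "n_nationkey"
  );
  const customers = indexBy(
    customer.map((c) => embed(c, "c_nationkey", nations, "nation")),
    "c_custkey"
  );
  const orderDocs = indexBy(
    orders.map((o) => embed(o, "o_custkey", customers, "customer")),
    "o_orderkey"
  );
  return lineitem.map((l) => embed(l, "l_orderkey", orderDocs, "order"));
}

const documents = buildLineitemDocuments();
console.log(JSON.stringify(documents[0], null, 2));
```

Each resulting document carries a full copy of its order, customer, nation and region, so no join is needed at query time. The cost is that the same order and customer data is repeated in every lineitem document that refers to it.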
5. Schema
• Final Schema of TPC-H in MongoDB
[Schema diagram: lineitem document with embedded Order, Customer, Nation, Region, Partsupp, Part and Supplier (N, R) objects]
6. Advantages and Disadvantages of New Schema
• Advantages
o Easier to understand than SQL schema
o One document: one record
o No need to join tables
• Disadvantages
o Higher memory usage
o Update operations become more demanding
o Converting to BSON takes time
o Requires a lot of computational power
o Only about 300,000 lineitem rows (5%) could be converted
7. Queries
• Six queries selected to run on MongoDB with Map-Reduce and the Aggregation Framework
• Results compared with MySQL
PROBLEMS
• Outputs differ from MySQL because part of the data conversion failed
• The Aggregation Framework is still in development
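As a rough illustration of the two approaches, the sketch below runs a simplified version of Query 1 (grouping by return flag and line status) as a map step and a reduce step in plain JavaScript, and then shows what a corresponding aggregation pipeline might look like. It is not the query code used in the benchmark. The sample documents, the cutoff date, the field names (TPC-H column names kept at the top level of each document) and the helpers mapLineitem, reduceValues and runMapReduce are all assumptions.

```javascript
// Hypothetical lineitem documents; values are invented for illustration.
const docs = [
  { l_returnflag: "A", l_linestatus: "F", l_quantity: 10, l_extendedprice: 100, l_shipdate: "1994-01-10" },
  { l_returnflag: "A", l_linestatus: "F", l_quantity: 5, l_extendedprice: 50, l_shipdate: "1995-06-01" },
  { l_returnflag: "N", l_linestatus: "O", l_quantity: 7, l_extendedprice: 70, l_shipdate: "1998-08-01" },
  { l_returnflag: "N", l_linestatus: "O", l_quantity: 3, l_extendedprice: 30, l_shipdate: "1998-12-15" },
];
const cutoff = "1998-09-02"; // assumed ship-date cutoff (ISO strings compare correctly as text)

// Map step: emit one (key, value) pair per document that passes the filter.
function mapLineitem(doc) {
  if (doc.l_shipdate > cutoff) return [];
  const key = `${doc.l_returnflag}|${doc.l_linestatus}`;
  return [[key, { sum_qty: doc.l_quantity, sum_base_price: doc.l_extendedprice, count_order: 1 }]];
}

// Reduce step: fold all values emitted for one key into a single summary.
function reduceValues(values) {
  return values.reduce(
    (acc, v) => ({
      sum_qty: acc.sum_qty + v.sum_qty,
      sum_base_price: acc.sum_base_price + v.sum_base_price,
      count_order: acc.count_order + v.count_order,
    }),
    { sum_qty: 0, sum_base_price: 0, count_order: 0 }
  );
}

// Tiny in-memory driver: group emitted values by key, then reduce each group.
function runMapReduce(documents) {
  const groups = new Map();
  for (const doc of documents) {
    for (const [key, value] of mapLineitem(doc)) {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    }
  }
  const result = {};
  const sorted = [...groups.entries()].sort(([a], [b]) => a.localeCompare(b));
  for (const [key, values] of sorted) result[key] = reduceValues(values);
  return result;
}

console.log(runMapReduce(docs));

// The same grouping written as an aggregation pipeline (plain data here;
// on a server it would be passed to db.<collection>.aggregate(pipeline)).
const pipeline = [
  { $match: { l_shipdate: { $lte: cutoff } } },
  {
    $group: {
      _id: { returnflag: "$l_returnflag", linestatus: "$l_linestatus" },
      sum_qty: { $sum: "$l_quantity" },
      sum_base_price: { $sum: "$l_extendedprice" },
      count_order: { $sum: 1 },
    },
  },
  { $sort: { "_id.returnflag": 1, "_id.linestatus": 1 } },
];
```

The pipeline form is declarative and closer to the SQL GROUP BY it replaces, while the Map-Reduce form requires writing the grouping logic by hand. The full Query 1 also computes discounted prices and averages, which are left out here.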
14. Benchmark result
• All benchmarks run on an Intel Core i7-3610QM (2.30 GHz, 6 MB cache), 4 GB DDR3, 750 GB 7200 RPM disk, 64-bit Windows
• Query 1
MongoDB 6.1 sec
MySQL 0.2 sec
• Query 8
MongoDB 1.6 sec
MySQL 0.1 sec
• Query 15
MongoDB 0.7 sec
MySQL 0.4 sec
15. Benchmark result (cont.)
• Query 20
MongoDB 1.1 sec
MySQL 174.4 sec
• Query 21
MongoDB 6.2 sec
MySQL 5.5 sec
• Query 22
MongoDB 7.6 sec
MySQL 0.8 sec
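The timings above are easier to compare as ratios. The following short sketch computes MongoDB time divided by MySQL time for each query; the numbers are copied from the benchmark slides, while the helper name compare and the output format are assumptions.

```javascript
// Timings in seconds, copied from the benchmark slides.
const timings = [
  { query: 1, mongodb: 6.1, mysql: 0.2 },
  { query: 8, mongodb: 1.6, mysql: 0.1 },
  { query: 15, mongodb: 0.7, mysql: 0.4 },
  { query: 20, mongodb: 1.1, mysql: 174.4 },
  { query: 21, mongodb: 6.2, mysql: 5.5 },
  { query: 22, mongodb: 7.6, mysql: 0.8 },
];

// A ratio above 1 means MongoDB took longer than MySQL for that query.
function compare(rows) {
  return rows.map(({ query, mongodb, mysql }) => ({
    query,
    ratio: mongodb / mysql,
    faster: mongodb < mysql ? "MongoDB" : "MySQL",
  }));
}

for (const row of compare(timings)) {
  console.log(`Query ${row.query}: MongoDB/MySQL = ${row.ratio.toFixed(2)} (${row.faster} faster)`);
}
```

By these figures MySQL is faster on Queries 1, 8, 15, 21 and 22, and MongoDB is faster on Query 20. Because only about 5% of the lineitem rows were converted and the outputs differ between the two systems, the ratios are indicative at best.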
16. Discussion & Conclusion
• MongoDB slower than MySQL in five of the six queries (Query 20 is the exception)
o Design problem
o Aggregation framework problem
o No standard query language
o Server-side query processing is not a natural fit for NoSQL
o Complex SQL cannot be converted easily
• Suitable only for applications such as:
o Business card database
o Web Blog
o Applications without complex transactions