SlideShare ist ein Scribd-Unternehmen logo
1 von 63
Downloaden Sie, um offline zu lesen
High Performance JSON
PostgreSQL vs. MongoDB
Percona Live Europe 2017
Dominic Dwyer
Wei Shan Ang
VS
GlobalSign
● GlobalSign identity & crypto services provider
● WebTrust certified Certificate Authority - 3rd in the world
● High volume services - IoT devices, cloud providers
● Cryptographic identities, timestamping, signing, etc
About Me
● Database Engineer with 6 years of experience
● From Singapore but now based in London
● Originally worked on Oracle systems
● Switched to open-source databases in 2015
● Currently working in GlobalSign as a “Data Reliability Engineer”
Motivation
● Most benchmark results are biased - commercial interests
● We feel that benchmark results are measured “creatively”
● We use PostgreSQL and MongoDB a lot!
● We wanted to test the latest versions
PostgreSQL
PostgreSQL
● Around for 21 years
● JSON supported since 9.2
● JSONB supported since 9.4
● Does not have any statistics about the internals of document types like JSON
or JSONB
○ Can be overcome with default_statistics_target or ALTER TABLE TABLE_NAME ALTER int4
SET STATISTICS 2000;
● Many JSON/JSONB operator/functions released since 9.2 (jsonb_set,
jsonb_insert)
● Many JSON/JSONB bug fixes too
PostgreSQL Ecosystem
● “Build it yourself”
● Many High Availability solutions - all 3rd party
○ repmgr, pacemaker/corosync, Slony, Patroni and many more
● Connection Pooling
○ pgBouncer (single-threaded), pgpool-II
● Sharding
○ CitusDB
● Live version upgrades - tricky!
○ pg_upgrade, Slony, pg_dump and pg_logical
MongoDB
MongoDB
● Relatively “young” database software
○ 8 years since first release
● Known as a /dev/null database since the early days (jepsen.io)
○ Tremendous stability improvements since then
○ All known reliability issues has been fixed since 3.2.12
● Lots of WiredTiger bug fixes since 3.2
○ Cache eviction
○ Checkpoints
○ Lost updates and dirty writes
Source: jepsen.io
● Everything comes as standard:
○ Built-in replication
○ Built-in sharding
○ Live cluster version upgrades (ish)
■ Shutdown slave, upgrade slave, startup slave, repeat
MongoDB
Server Hardware
● 2x Intel(R) Xeon(R) CPU E5-2630 v4
○ 20 cores / 40 threads
● 32GB Memory
● FreeBSD 11
● ZFS Filesystem
● 2 x 1.6TB (Intel SSD DC S3610, MLC)
Why do we use ZFS?
● Highly tunable filesystem
○ Layered caching (ARC, L2ARC, ZIL)
○ Advanced cache eviction
■ Most Recently Used (MRU)
■ Most Frequently Used (MFU)
● Free snapshots
● Block level checksums
● Fast compression
● Nexenta, Delphix, Datto, Joyent, Tegile, Oracle (obviously) and many more!
The Setup
● 1-3 Client machines (depending on the test)
● 1 Server, two jails - one for Postgres & one for Mongo
● PostgreSQL 9.6.5 with pgBouncer 1.7.2
● MongoDB 3.4.9 with WiredTiger
Performance Tuning
● We had to tune PostgreSQL heavily
○ System V IPC (shmmax, shmall, semmns and etc)
○ pgBouncer (single threaded, we need multiple instances to handle the load)
● MongoDB tuning was easy!
○ WiredTiger cache size
○ Compression settings
○ Default settings are usually good enough
● ZFS tuning
○ atime
○ recordsize
○ checksum
○ compression
Sample JSON Document
{
"_id" : NumberLong(2),
"name" : "lPAyAYpUvUDGiCd",
"addresses" : [
{
"number" : 59,
"line1" : "EPJKLhmEPrrdYqaFxxEVMF",
"line2" : "Rvlgkmb"
},
{
"number" : 59,
"line1" : "DdCBXEW",
"line2" : "FEV"
}
],
"phone_number" : "xPOYCOfSpieIxbGxpYEpi",
"dob" : ISODate("2017-09-05T00:03:28.956Z"),
"age" : 442006075,
"balance" : 0.807247519493103,
"enabled" : false,
"counter" : 442006075,
"padding" : BinData(0,"")
}
Sub-documents
Varying field types
Randomised binary blob
About Me
● Engineer on the High Performance Platforms team
○ Our team builds a high volume CA platform & distributed systems
○ Based in Old Street, London
○ Greenfields project, all new stuff!
● Day job has me breaking all the things
○ Simulating failures, network partitions, etc
○ Assessing performance and durability
● Maintain performance fork of Go MongoDB driver
○ github.com/globalsign/mgo
MPJBT Benchmark Tool
● MongoDB PostgreSQL JSONB Benchmarking Tool
○ Seriously, we’re open to better names…….
● Written in Golang
● Will be released on Github after the conference
● Models typical workloads (but maybe not yours!)
○ Inserts, selects, select-updates, range queries, etc.
● Lockless outside of the database drivers
○ Low contention improves ability to push servers
Why Go?
● Designed from the start for high concurrency
○ Thousands of concurrent workers is totally fine
● Co-operative scheduler can maximise I/O throughput
○ When blocked, Go switches to another worker
○ Blocked worker is woken up when it’s unblocked
○ Much cheaper context switching - occurs in userland
● Familiarity - I use it every day!
Does it deliver?
Insert 10,000,000 records
Average isn’t very helpful
● I have an average of 52.2ms
Average isn’t very helpful
● I have an average of 52.2ms
120.080231
36.237584
25.904811
44.053916
66.617778
59.713100
74.620329
1.689589
90.641940
27.202953
51.162331
52.202392
52.511745
50.439697
52.975609
52.567941
53.067609
52.122890
51.159180
52.390616
OR
Inserts - Latency Histogram
Inserts - Latency Histogram
1s 2s 3s
Inserts - Throughput
insert 30877op/s avg.0ms
insert 27509op/s avg.0ms
insert 29997op/s avg.0ms
insert 31143op/s avg.0ms
insert 22576op/s avg.0ms
insert 0op/s avg.0ms
insert 0op/s avg.0ms
insert 1op/s avg.2561ms
insert 0op/s avg.0ms
insert 20703op/s avg.6ms
insert 31154op/s avg.0ms
insert 31298op/s avg.0ms
insert 30359op/s avg.0ms
insert 26081op/s avg.0ms
insert 25938op/s avg.0ms
insert 26649op/s avg.0ms
insert 26009op/s avg.0ms
insert 26029op/s avg.0ms
insert 25522op/s avg.0ms
insert 25960op/s avg.0ms
insert 26000op/s avg.0ms
insert 25576op/s avg.0ms
insert 26159op/s avg.0ms
insert 25628op/s avg.0ms
insert 26071op/s avg.0ms
insert 25856op/s avg.0ms
MongoDB cache eviction bug?
● Show some cool graphs
● https://jira.mongodb.org/browse/SERVER-29311
●
MongoDB cache eviction bug - not a bug?
● Reported to MongoDB
○ https://jira.mongodb.org/browse/SERVER-29311
○ Offered to run any tests and analyse data
● Ran 36 different test combinations
○ ZFS compression: lz4, zlib, off
○ MongoDB compression: snappy, zlib, off
○ Filesystem block sizes
○ Disk configurations
○ Tried running on Linux/XFS
● Always saw the same pauses
○ Described as an I/O bottleneck
(twice!)
Profile with Dtrace!
● Dynamic tracer built into FreeBSD (and others)
○ Originally created by Sun for Solaris
○ Ported to FreeBSD
○ Low profiling overhead
● Traces in both kernel and userspace
○ Hook syscalls, libc, application functions, etc
○ Access function arguments, kernel structures, etc
● Hooks expressed in D like DSL
○ Conditionally trigger traces
○ Really simple to use
Trace the Virtual File System
● Measures application file system operations
○ Kernel level
○ File system agnostic (XFS, ZFS, anything)
● Records data size & latency:
○ Reads - vfs::vop_read
○ Writes - vfs::vop_write
● Configured to output ASCII histograms
○ Per second aggregations
○ Broken down by type
○ Timestamped for correlating with MPJBT logs
VFS Writes vs. Throughput - PostgreSQL
VFS Writes vs. Throughput - MongoDB
Insert / Update / Select comparison
● Preloaded 10,000,000 records in the table
○ No padding - records are ~320 bytes
● 3 clients running different workloads
○ 50 workers inserting
○ 50 workers updating
○ 50 workers performing a range over partial index
● Both databases become CPU bound
○ Database server is under maximum load
○ Typically avoided in a production environment
○ Always good to know your maximum numbers
MongoDB
Insert
99th%
13ms
Average
18,070 op/s
Update
99th%
11ms
Average
22,304 op/s
Select
99th%
12ms
Average
18,960 op/s
MongoDB
PostgreSQL
Insert
99th%
4ms
Average
25,244 op/s
Update
99th%
4ms
Average
26,085 op/s
Select
99th%
3ms
Average
27,778 op/s
PostgreSQL
Workload - 1MB Inserts
Lower is better
99th Percentile Latency
● MongoDB => 4.5s
● PostgreSQL => 1.5s
Insert Performance
CPU 35%
insert 65543op/s avg.0ms
insert 65113op/s avg.0ms
insert 69881op/s avg.0ms
insert 55728op/s avg.0ms
insert 57502op/s avg.0ms
insert 64428op/s avg.0ms
insert 64872op/s avg.6ms
insert 68804op/s avg.0ms
insert 63204op/s avg.0ms
insert 63279op/s avg.0ms
insert 42011op/s avg.0ms
insert 53330op/s avg.0ms
insert 57815op/s avg.0ms
insert 54331op/s avg.0ms
insert 39616op/s avg.0ms
insert 51919op/s avg.0ms
insert 53366op/s avg.0ms
insert 56678op/s avg.0ms
insert 40283op/s avg.0ms
insert 47300op/s avg.0ms
CPU 40%
Update Performance
CPU 85%
update 2416 op/s avg.0ms
update 0 op/s avg.0ms
update 0 op/s avg.0ms
update 2856 op/s avg.33ms
update 21425op/s avg.0ms
update 0 op/s avg.0ms
update 0 op/s avg.0ms
update 12798op/s avg.5ms
update 11094op/s avg.0ms
update 21302op/s avg.0ms
update 31252op/s avg.0ms
update 32706op/s avg.0ms
update 33801op/s avg.0ms
update 28276op/s avg.0ms
update 34749op/s avg.0ms
update 29972op/s avg.0ms
update 28565op/s avg.0ms
update 32286op/s avg.0ms
update 30905op/s avg.0ms
update 32052op/s avg.0ms
CPU 65%
So...why are we even using MongoDB?
PostgreSQL is awesome right?
Vacuum Performance - Insert Workload
Horizontally Scalable
Table Size Comparison
Lower is better
Summary
Summary
● There is no such thing as the best database in the world!
● Choosing the right database for your application is never easy
○ How well does it scale?
○ How easy is it to perform upgrades?
○ How does it behave under stress?
● What is your application requirements?
○ Do you really need ACID?
● Do your own research!
Summary - PostgreSQL
● PostgreSQL has poor performance out of the box
○ Requires a decent amount of tuning to get good performance out of it
● Does not scale well with large number of connections
○ pgBouncer is a must
● Combines ACID compliance with schemaless JSON
● Queries not really intuitive
Summary - MongoDB
● MongoDB has decent performance out of the box.
● Unstable throughput and latency
● Scale well with large number of connections
● Strong horizontal scalability
● Throughput bug is annoying
● MongoDB rolling upgrades are ridiculously easy
● Developer friendly - easy to use!
TODO
● Release MPJBT on Github
○ Open source for all
○ github.com/domodwyer/mpjbt
● Run similar tests against MongoRocks
○ You guys have inspired us to keep looking!
● Create our 3rd bug report on MongoDB Jira
Thank You!
Like what you see?
We are hiring!
Come and speak to us!
Questions?
References
● https://people.freebsd.org/~seanc/postgresql/scale15x-2017-postgresql_zfs_b
est_practices.pdf
● https://jepsen.io/analyses/mongodb-3-4-0-rc3
● https://dba.stackexchange.com/questions/167525/inconsistent-statistics-on-js
onb-column-with-btree-index
● https://github.com/domodwyer/mpjbt
Previous Benchmark Results
● http://tiborsimko.org/postgresql-mongodb-json-select-speed.html
● http://erthalion.info/2015/12/29/json-benchmarks/
● https://www.enterprisedb.com/postgres-plus-edb-blog/marc-linster/postgres-o
utperforms-mongodb-and-ushers-new-developer-reality
● https://pgconf.ru/media/2017/04/03/20170317H2_O.Bartunov_json-2017.pdf
● https://www.slideshare.net/toshiharada/ycsb-jsonb
Appendix
pgbouncer.ini
● PostgreSQL does not support connection pooling
● PgBouncer is an extremely lightweight connection pooler
● Setting up and tearing down a new connection is expensive
● Each PostgreSQL connection forks a new process
● Configuration
○ pool_mode = transaction
○ max_client_conn = 300
postgresql.conf
● shared_buffer = 16GB
● max_connections = 400
● fsync = on
● synchronous_commit = on
● full_page_writes = off
● wal_compression = off
● wal_buffers = 16MB
● min_wal_size = 2GB
● max_wal_size = 4GB
● checkpoint_completion_target = 0.9
● work_mem = 33554KB
● maintenance_work_mem = 2GB
● wal_level=replica
mongod.conf
● wiredTiger.engineConfig.cacheSizeGB: 19
● wiredTiger.engineConfig.journalCompressor: snappy
● wiredTiger.collectionConfig.blockCompressor: snappy
● wiredTiger.indexConfig.prefixCompression: true
● net.maxIncomingConnections: 65536
● wiredTigerConcurrentReadTransactions: 256
● wiredTigerConcurrentWriteTransactions: 256
ZFS Tuning
● No separate L2ARC
● No separate ZIL
● 1 dataset for O/S
● 1 dataset for data directory
○ checksum=on
○ atime=off
○ recordsize=8K
○ compression=lz4 (PostgreSQL) or off (MongoDB)
/boot/loader.conf
● kern.maxusers=1024
● kern.ipc.semmns=2048
● kern.ipc.semmni=1024
● kern.ipc.semmnu=1024
● kern.ipc.shmall=34359738368
● kern.ipc.shmmax=34359738368
● kern.ipc.maxsockets=256000
● kern.ipc.maxsockbuf=2621440
● kern.ipc.shmseg=1024
/etc/sysctl.conf
● net.inet.tcp.keepidle=3000000
● net.inet.tcp.keepintvl=60000
● net.inet.tcp.keepinit=60000
● security.jail.sysvipc_allowed=1
● kern.ipc.shmmax=34359738368
● kern.ipc.shmall=16777216
● kern.ipc.semmap=256
● kern.ipc.shm_use_phys=1
● kern.ipc.nmbclusters=66560
● kern.maxfiles=2621440
● kern.maxfilesperproc=2621440
● kern.threads.max_threads_per_proc=65535
● kern.ipc.somaxconn=65535
● kern.eventtimer.timer=HPET
● kern.timecounter.hardware=HPET
● vfs.zfs.arc_max: 8589934592 for PostgreSQL or 1073741824 for MongoDB

Weitere ähnliche Inhalte

Was ist angesagt?

Salt Stack - Subhankar Sengupta
Salt Stack - Subhankar SenguptaSalt Stack - Subhankar Sengupta
Salt Stack - Subhankar SenguptaDevOpsBangalore
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageKernel TLV
 
Understanding Open vSwitch
Understanding Open vSwitch Understanding Open vSwitch
Understanding Open vSwitch YongKi Kim
 
Kernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at FacebookKernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at FacebookAnne Nicolas
 
OpenStack networking
OpenStack networkingOpenStack networking
OpenStack networkingSim Janghoon
 
Large scale overlay networks with ovn: problems and solutions
Large scale overlay networks with ovn: problems and solutionsLarge scale overlay networks with ovn: problems and solutions
Large scale overlay networks with ovn: problems and solutionsHan Zhou
 
Modern Linux Tracing Landscape
Modern Linux Tracing LandscapeModern Linux Tracing Landscape
Modern Linux Tracing LandscapeKernel TLV
 
Salt Stack pt. 2 : Configuration Management
Salt Stack pt. 2 : Configuration ManagementSalt Stack pt. 2 : Configuration Management
Salt Stack pt. 2 : Configuration ManagementUmberto Nicoletti
 
RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsBrendan Gregg
 
Performance Lessons learned in vRouter - Stephen Hemminger
Performance Lessons learned in vRouter - Stephen HemmingerPerformance Lessons learned in vRouter - Stephen Hemminger
Performance Lessons learned in vRouter - Stephen Hemmingerharryvanhaaren
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking ExplainedThomas Graf
 
Accelerating Neutron with Intel DPDK
Accelerating Neutron with Intel DPDKAccelerating Neutron with Intel DPDK
Accelerating Neutron with Intel DPDKAlexander Shalimov
 
Performance: Observe and Tune
Performance: Observe and TunePerformance: Observe and Tune
Performance: Observe and TunePaul V. Novarese
 
OSNoise Tracer: Who Is Stealing My CPU Time?
OSNoise Tracer: Who Is Stealing My CPU Time?OSNoise Tracer: Who Is Stealing My CPU Time?
OSNoise Tracer: Who Is Stealing My CPU Time?ScyllaDB
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringScyllaDB
 
Openv switchの使い方とか
Openv switchの使い方とかOpenv switchの使い方とか
Openv switchの使い方とかkotto_hihihi
 
2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful ServicesThomas Graf
 
[2018.10.19] Andrew Kong - Tunnel without tunnel (Seminar at OpenStack Korea ...
[2018.10.19] Andrew Kong - Tunnel without tunnel (Seminar at OpenStack Korea ...[2018.10.19] Andrew Kong - Tunnel without tunnel (Seminar at OpenStack Korea ...
[2018.10.19] Andrew Kong - Tunnel without tunnel (Seminar at OpenStack Korea ...OpenStack Korea Community
 

Was ist angesagt? (20)

Salt Stack - Subhankar Sengupta
Salt Stack - Subhankar SenguptaSalt Stack - Subhankar Sengupta
Salt Stack - Subhankar Sengupta
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast Storage
 
Understanding Open vSwitch
Understanding Open vSwitch Understanding Open vSwitch
Understanding Open vSwitch
 
Kernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at FacebookKernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at Facebook
 
OpenStack networking
OpenStack networkingOpenStack networking
OpenStack networking
 
Large scale overlay networks with ovn: problems and solutions
Large scale overlay networks with ovn: problems and solutionsLarge scale overlay networks with ovn: problems and solutions
Large scale overlay networks with ovn: problems and solutions
 
Linux Kernel Live Patching
Linux Kernel Live PatchingLinux Kernel Live Patching
Linux Kernel Live Patching
 
Salt at school
Salt at schoolSalt at school
Salt at school
 
Modern Linux Tracing Landscape
Modern Linux Tracing LandscapeModern Linux Tracing Landscape
Modern Linux Tracing Landscape
 
Salt Stack pt. 2 : Configuration Management
Salt Stack pt. 2 : Configuration ManagementSalt Stack pt. 2 : Configuration Management
Salt Stack pt. 2 : Configuration Management
 
RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance Results
 
Performance Lessons learned in vRouter - Stephen Hemminger
Performance Lessons learned in vRouter - Stephen HemmingerPerformance Lessons learned in vRouter - Stephen Hemminger
Performance Lessons learned in vRouter - Stephen Hemminger
 
Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking Explained
 
Accelerating Neutron with Intel DPDK
Accelerating Neutron with Intel DPDKAccelerating Neutron with Intel DPDK
Accelerating Neutron with Intel DPDK
 
Performance: Observe and Tune
Performance: Observe and TunePerformance: Observe and Tune
Performance: Observe and Tune
 
OSNoise Tracer: Who Is Stealing My CPU Time?
OSNoise Tracer: Who Is Stealing My CPU Time?OSNoise Tracer: Who Is Stealing My CPU Time?
OSNoise Tracer: Who Is Stealing My CPU Time?
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uring
 
Openv switchの使い方とか
Openv switchの使い方とかOpenv switchの使い方とか
Openv switchの使い方とか
 
2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services2015 FOSDEM - OVS Stateful Services
2015 FOSDEM - OVS Stateful Services
 
[2018.10.19] Andrew Kong - Tunnel without tunnel (Seminar at OpenStack Korea ...
[2018.10.19] Andrew Kong - Tunnel without tunnel (Seminar at OpenStack Korea ...[2018.10.19] Andrew Kong - Tunnel without tunnel (Seminar at OpenStack Korea ...
[2018.10.19] Andrew Kong - Tunnel without tunnel (Seminar at OpenStack Korea ...
 

Ähnlich wie High performance json- postgre sql vs. mongodb

PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json  postgre-sql vs. mongodbPGConf APAC 2018 - High performance json  postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json postgre-sql vs. mongodbPGConf APAC
 
An evening with Postgresql
An evening with PostgresqlAn evening with Postgresql
An evening with PostgresqlJoshua Drake
 
kranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High loadkranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High loadKrivoy Rog IT Community
 
MongoDB: Advantages of an Open Source NoSQL Database
MongoDB: Advantages of an Open Source NoSQL DatabaseMongoDB: Advantages of an Open Source NoSQL Database
MongoDB: Advantages of an Open Source NoSQL DatabaseFITC
 
Logs @ OVHcloud
Logs @ OVHcloudLogs @ OVHcloud
Logs @ OVHcloudOVHcloud
 
Ledingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartLedingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartMukesh Singh
 
LMG Lightning Talks - SFO17-205
LMG Lightning Talks - SFO17-205LMG Lightning Talks - SFO17-205
LMG Lightning Talks - SFO17-205Linaro
 
Open Source Data Deduplication
Open Source Data DeduplicationOpen Source Data Deduplication
Open Source Data DeduplicationRedWireServices
 
Streaming huge databases using logical decoding
Streaming huge databases using logical decodingStreaming huge databases using logical decoding
Streaming huge databases using logical decodingAlexander Shulgin
 
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...Odinot Stanislas
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2aspyker
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixC4Media
 
Eko10 workshop - OPEN SOURCE DATABASE MONITORING
Eko10 workshop - OPEN SOURCE DATABASE MONITORINGEko10 workshop - OPEN SOURCE DATABASE MONITORING
Eko10 workshop - OPEN SOURCE DATABASE MONITORINGPablo Garbossa
 
Load testing in Zonky with Gatling
Load testing in Zonky with GatlingLoad testing in Zonky with Gatling
Load testing in Zonky with GatlingPetr Vlček
 
Eko10 Workshop Opensource Database Auditing
Eko10  Workshop Opensource Database AuditingEko10  Workshop Opensource Database Auditing
Eko10 Workshop Opensource Database AuditingJuan Berner
 
Experiences building a distributed shared log on RADOS - Noah Watkins
Experiences building a distributed shared log on RADOS - Noah WatkinsExperiences building a distributed shared log on RADOS - Noah Watkins
Experiences building a distributed shared log on RADOS - Noah WatkinsCeph Community
 
Introduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Introduction to Docker (and a bit more) at LSPE meetup SunnyvaleIntroduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Introduction to Docker (and a bit more) at LSPE meetup SunnyvaleJérôme Petazzoni
 
Snowflake Automated Deployments / CI/CD Pipelines
Snowflake Automated Deployments / CI/CD PipelinesSnowflake Automated Deployments / CI/CD Pipelines
Snowflake Automated Deployments / CI/CD PipelinesDrew Hansen
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022StreamNative
 

Ähnlich wie High performance json- postgre sql vs. mongodb (20)

PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json  postgre-sql vs. mongodbPGConf APAC 2018 - High performance json  postgre-sql vs. mongodb
PGConf APAC 2018 - High performance json postgre-sql vs. mongodb
 
An evening with Postgresql
An evening with PostgresqlAn evening with Postgresql
An evening with Postgresql
 
kranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High loadkranonit S06E01 Игорь Цинько: High load
kranonit S06E01 Игорь Цинько: High load
 
MongoDB: Advantages of an Open Source NoSQL Database
MongoDB: Advantages of an Open Source NoSQL DatabaseMongoDB: Advantages of an Open Source NoSQL Database
MongoDB: Advantages of an Open Source NoSQL Database
 
The Accidental DBA
The Accidental DBAThe Accidental DBA
The Accidental DBA
 
Logs @ OVHcloud
Logs @ OVHcloudLogs @ OVHcloud
Logs @ OVHcloud
 
Ledingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @LendingkartLedingkart Meetup #2: Scaling Search @Lendingkart
Ledingkart Meetup #2: Scaling Search @Lendingkart
 
LMG Lightning Talks - SFO17-205
LMG Lightning Talks - SFO17-205LMG Lightning Talks - SFO17-205
LMG Lightning Talks - SFO17-205
 
Open Source Data Deduplication
Open Source Data DeduplicationOpen Source Data Deduplication
Open Source Data Deduplication
 
Streaming huge databases using logical decoding
Streaming huge databases using logical decodingStreaming huge databases using logical decoding
Streaming huge databases using logical decoding
 
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...
Hands-on Lab: How to Unleash Your Storage Performance by Using NVM Express™ B...
 
Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2Netflix Open Source Meetup Season 4 Episode 2
Netflix Open Source Meetup Season 4 Episode 2
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFix
 
Eko10 workshop - OPEN SOURCE DATABASE MONITORING
Eko10 workshop - OPEN SOURCE DATABASE MONITORINGEko10 workshop - OPEN SOURCE DATABASE MONITORING
Eko10 workshop - OPEN SOURCE DATABASE MONITORING
 
Load testing in Zonky with Gatling
Load testing in Zonky with GatlingLoad testing in Zonky with Gatling
Load testing in Zonky with Gatling
 
Eko10 Workshop Opensource Database Auditing
Eko10  Workshop Opensource Database AuditingEko10  Workshop Opensource Database Auditing
Eko10 Workshop Opensource Database Auditing
 
Experiences building a distributed shared log on RADOS - Noah Watkins
Experiences building a distributed shared log on RADOS - Noah WatkinsExperiences building a distributed shared log on RADOS - Noah Watkins
Experiences building a distributed shared log on RADOS - Noah Watkins
 
Introduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Introduction to Docker (and a bit more) at LSPE meetup SunnyvaleIntroduction to Docker (and a bit more) at LSPE meetup Sunnyvale
Introduction to Docker (and a bit more) at LSPE meetup Sunnyvale
 
Snowflake Automated Deployments / CI/CD Pipelines
Snowflake Automated Deployments / CI/CD PipelinesSnowflake Automated Deployments / CI/CD Pipelines
Snowflake Automated Deployments / CI/CD Pipelines
 
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
Towards a ZooKeeper-less Pulsar, etcd, etcd, etcd. - Pulsar Summit SF 2022
 

Kürzlich hochgeladen

Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.Kamal Acharya
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VDineshKumar4165
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaOmar Fathy
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . pptDineshKumar4165
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfRagavanV2
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityMorshed Ahmed Rahath
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapRishantSharmaFr
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projectssmsksolar
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf203318pmpc
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlysanyuktamishra911
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxJuliansyahHarahap1
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756dollysharma2066
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdfKamal Acharya
 

Kürzlich hochgeladen (20)

Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort ServiceCall Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Netaji Nagar, Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
 
Employee leave management system project.
Employee leave management system project.Employee leave management system project.
Employee leave management system project.
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
Thermal Engineering Unit - I & II . ppt
Thermal Engineering  Unit - I & II . pptThermal Engineering  Unit - I & II . ppt
Thermal Engineering Unit - I & II . ppt
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Unit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdfUnit 1 - Soil Classification and Compaction.pdf
Unit 1 - Soil Classification and Compaction.pdf
 
A Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna MunicipalityA Study of Urban Area Plan for Pabna Municipality
A Study of Urban Area Plan for Pabna Municipality
 
Unleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leapUnleashing the Power of the SORA AI lastest leap
Unleashing the Power of the SORA AI lastest leap
 
2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects2016EF22_0 solar project report rooftop projects
2016EF22_0 solar project report rooftop projects
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
(INDIRA) Call Girl Meerut Call Now 8617697112 Meerut Escorts 24x7
 
22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf
 
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghlyKubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
 
Work-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptxWork-Permit-Receiver-in-Saudi-Aramco.pptx
Work-Permit-Receiver-in-Saudi-Aramco.pptx
 
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
FULL ENJOY Call Girls In Mahipalpur Delhi Contact Us 8377877756
 
Hostel management system project report..pdf
Hostel management system project report..pdfHostel management system project report..pdf
Hostel management system project report..pdf
 

High performance json- postgre sql vs. mongodb

  • 1. High Performance JSON PostgreSQL vs. MongoDB Percona Live Europe 2017 Dominic Dwyer Wei Shan Ang
  • 2. VS
  • 3. GlobalSign ● GlobalSign identity & crypto services provider ● WebTrust certified Certificate Authority - 3rd in the world ● High volume services - IoT devices, cloud providers ● Cryptographic identities, timestamping, signing, etc
  • 4. About Me ● Database Engineer with 6 years of experience ● From Singapore but now based in London ● Originally worked on Oracle systems ● Switched to open-source databases in 2015 ● Currently working in GlobalSign as a “Data Reliability Engineer”
  • 5. Motivation ● Most benchmark results are biased - commercial interests ● We feel that benchmark results are measured “creatively” ● We use PostgreSQL and MongoDB a lot! ● We wanted to test the latest versions
  • 7. PostgreSQL ● Around for 21 years ● JSON supported since 9.2 ● JSONB supported since 9.4 ● Does not have any statistics about the internals of document types like JSON or JSONB ○ Can be overcome with default_statistics_target or ALTER TABLE TABLE_NAME ALTER int4 SET STATISTICS 2000; ● Many JSON/JSONB operator/functions released since 9.2 (jsonb_set, jsonb_insert) ● Many JSON/JSONB bug fixes too
  • 8. PostgreSQL Ecosystem ● “Build it yourself” ● Many High Availability solutions - all 3rd party ○ repmgr, pacemaker/corosync, Slony, Patroni and many more ● Connection Pooling ○ pgBouncer (single-threaded), pgpool-II ● Sharding ○ CitusDB ● Live version upgrades - tricky! ○ pg_upgrade, Slony, pg_dump and pg_logical
  • 10. MongoDB ● Relatively “young” database software ○ 8 years since first release ● Known as a /dev/null database since the early days (jepsen.io) ○ Tremendous stability improvements since then ○ All known reliability issues has been fixed since 3.2.12 ● Lots of WiredTiger bug fixes since 3.2 ○ Cache eviction ○ Checkpoints ○ Lost updates and dirty writes Source: jepsen.io
  • 11.
  • 12. ● Everything comes as standard: ○ Built-in replication ○ Built-in sharding ○ Live cluster version upgrades (ish) ■ Shutdown slave, upgrade slave, startup slave, repeat MongoDB
  • 13. Server Hardware ● 2x Intel(R) Xeon(R) CPU E5-2630 v4 ○ 20 cores / 40 threads ● 32GB Memory ● FreeBSD 11 ● ZFS Filesystem ● 2 x 1.6TB (Intel SSD DC S3610, MLC)
  • 14. Why do we use ZFS? ● Highly tunable filesystem ○ Layered caching (ARC, L2ARC, ZIL) ○ Advanced cache eviction ■ Most Recently Used (MRU) ■ Most Frequently Used (MFU) ● Free snapshots ● Block level checksums ● Fast compression ● Nexenta, Delphix, Datto, Joyent, Tegile, Oracle (obviously) and many more!
  • 15. The Setup ● 1-3 Client machines (depending on the test) ● 1 Server, two jails - one for Postgres & one for Mongo ● PostgreSQL 9.6.5 with pgBouncer 1.7.2 ● MongoDB 3.4.9 with WiredTiger
  • 16. Performance Tuning ● We had to tune PostgreSQL heavily ○ System V IPC (shmmax, shmall, semmns and etc) ○ pgBouncer (single threaded, we need multiple instances to handle the load) ● MongoDB tuning was easy! ○ WiredTiger cache size ○ Compression settings ○ Default settings are usually good enough ● ZFS tuning ○ atime ○ recordsize ○ checksum ○ compression
  • 17. Sample JSON Document { "_id" : NumberLong(2), "name" : "lPAyAYpUvUDGiCd", "addresses" : [ { "number" : 59, "line1" : "EPJKLhmEPrrdYqaFxxEVMF", "line2" : "Rvlgkmb" }, { "number" : 59, "line1" : "DdCBXEW", "line2" : "FEV" } ], "phone_number" : "xPOYCOfSpieIxbGxpYEpi", "dob" : ISODate("2017-09-05T00:03:28.956Z"), "age" : 442006075, "balance" : 0.807247519493103, "enabled" : false, "counter" : 442006075, "padding" : BinData(0,"") } Sub-documents Varying field types Randomised binary blob
  • 18. About Me ● Engineer on the High Performance Platforms team ○ Our team builds a high volume CA platform & distributed systems ○ Based in Old Street, London ○ Greenfields project, all new stuff! ● Day job has me breaking all the things ○ Simulating failures, network partitions, etc ○ Assessing performance and durability ● Maintain performance fork of Go MongoDB driver ○ github.com/globalsign/mgo
  • 19. MPJBT Benchmark Tool ● MongoDB PostgreSQL JSONB Benchmarking Tool ○ Seriously, we’re open to better names……. ● Written in Golang ● Will be released on Github after the conference ● Models typical workloads (but maybe not yours!) ○ Inserts, selects, select-updates, range queries, etc. ● Lockless outside of the database drivers ○ Low contention improves ability to push servers
  • 20. Why Go? ● Designed from the start for high concurrency ○ Thousands of concurrent workers is totally fine ● Co-operative scheduler can maximise I/O throughput ○ When blocked, Go switches to another worker ○ Blocked worker is woken up when it’s unblocked ○ Much cheaper context switching - occurs in userland ● Familiarity - I use it every day!
  • 23. Average isn’t very helpful ● I have an average of 52.2ms
  • 24. Average isn’t very helpful ● I have an average of 52.2ms 120.080231 36.237584 25.904811 44.053916 66.617778 59.713100 74.620329 1.689589 90.641940 27.202953 51.162331 52.202392 52.511745 50.439697 52.975609 52.567941 53.067609 52.122890 51.159180 52.390616 OR
  • 25. Inserts - Latency Histogram
  • 26. Inserts - Latency Histogram 1s 2s 3s
  • 27. Inserts - Throughput insert 30877op/s avg.0ms insert 27509op/s avg.0ms insert 29997op/s avg.0ms insert 31143op/s avg.0ms insert 22576op/s avg.0ms insert 0op/s avg.0ms insert 0op/s avg.0ms insert 1op/s avg.2561ms insert 0op/s avg.0ms insert 20703op/s avg.6ms insert 31154op/s avg.0ms insert 31298op/s avg.0ms insert 30359op/s avg.0ms insert 26081op/s avg.0ms insert 25938op/s avg.0ms insert 26649op/s avg.0ms insert 26009op/s avg.0ms insert 26029op/s avg.0ms insert 25522op/s avg.0ms insert 25960op/s avg.0ms insert 26000op/s avg.0ms insert 25576op/s avg.0ms insert 26159op/s avg.0ms insert 25628op/s avg.0ms insert 26071op/s avg.0ms insert 25856op/s avg.0ms
  • 28. MongoDB cache eviction bug? ● Show some cool graphs ● https://jira.mongodb.org/browse/SERVER-29311 ●
  • 29. MongoDB cache eviction bug - not a bug? ● Reported to MongoDB ○ https://jira.mongodb.org/browse/SERVER-29311 ○ Offered to run any tests and analyse data ● Ran 36 different test combinations ○ ZFS compression: lz4, zlib, off ○ MongoDB compression: snappy, zlib, off ○ Filesystem block sizes ○ Disk configurations ○ Tried running on Linux/XFS ● Always saw the same pauses ○ Described as an I/O bottleneck (twice!)
  • 30. Profile with Dtrace! ● Dynamic tracer built into FreeBSD (and others) ○ Originally created by Sun for Solaris ○ Ported to FreeBSD ○ Low profiling overhead ● Traces in both kernel and userspace ○ Hook syscalls, libc, application functions, etc ○ Access function arguments, kernel structures, etc ● Hooks expressed in D like DSL ○ Conditionally trigger traces ○ Really simple to use
  • 31. Trace the Virtual File System ● Measures application file system operations ○ Kernel level ○ File system agnostic (XFS, ZFS, anything) ● Records data size & latency: ○ Reads - vfs::vop_read ○ Writes - vfs::vop_write ● Configured to output ASCII histograms ○ Per second aggregations ○ Broken down by type ○ Timestamped for correlating with MPJBT logs
  • 32. VFS Writes vs. Throughput - PostgreSQL
  • 33. VFS Writes vs. Throughput - MongoDB
  • 34. Insert / Update / Select comparison ● Preloaded 10,000,000 records in the table ○ No padding - records are ~320 bytes ● 3 clients running different workloads ○ 50 workers inserting ○ 50 workers updating ○ 50 workers performing a range over partial index ● Both databases become CPU bound ○ Database server is under maximum load ○ Typically avoided in a production environment ○ Always good to know your maximum numbers
  • 37.
  • 40. Workload - 1MB Inserts Lower is better 99th Percentile Latency ● MongoDB => 4.5s ● PostgreSQL => 1.5s
  • 41. Insert Performance CPU 35% insert 65543op/s avg.0ms insert 65113op/s avg.0ms insert 69881op/s avg.0ms insert 55728op/s avg.0ms insert 57502op/s avg.0ms insert 64428op/s avg.0ms insert 64872op/s avg.6ms insert 68804op/s avg.0ms insert 63204op/s avg.0ms insert 63279op/s avg.0ms insert 42011op/s avg.0ms insert 53330op/s avg.0ms insert 57815op/s avg.0ms insert 54331op/s avg.0ms insert 39616op/s avg.0ms insert 51919op/s avg.0ms insert 53366op/s avg.0ms insert 56678op/s avg.0ms insert 40283op/s avg.0ms insert 47300op/s avg.0ms CPU 40%
  • 42. Update Performance CPU 85% update 2416 op/s avg.0ms update 0 op/s avg.0ms update 0 op/s avg.0ms update 2856 op/s avg.33ms update 21425op/s avg.0ms update 0 op/s avg.0ms update 0 op/s avg.0ms update 12798op/s avg.5ms update 11094op/s avg.0ms update 21302op/s avg.0ms update 31252op/s avg.0ms update 32706op/s avg.0ms update 33801op/s avg.0ms update 28276op/s avg.0ms update 34749op/s avg.0ms update 29972op/s avg.0ms update 28565op/s avg.0ms update 32286op/s avg.0ms update 30905op/s avg.0ms update 32052op/s avg.0ms CPU 65%
  • 43. So...why are we even using MongoDB? PostgreSQL is awesome right?
  • 44.
  • 45. Vacuum Performance - Insert Workload
  • 49.
  • 50. Summary ● There is no such thing as the best database in the world! ● Choosing the right database for your application is never easy ○ How well does it scale? ○ How easy is it to perform upgrades? ○ How does it behave under stress? ● What is your application requirements? ○ Do you really need ACID? ● Do your own research!
  • 51. Summary - PostgreSQL ● PostgreSQL has poor performance out of the box ○ Requires a decent amount of tuning to get good performance out of it ● Does not scale well with large number of connections ○ pgBouncer is a must ● Combines ACID compliance with schemaless JSON ● Queries not really intuitive
  • 52. Summary - MongoDB ● MongoDB has decent performance out of the box. ● Unstable throughput and latency ● Scale well with large number of connections ● Strong horizontal scalability ● Throughput bug is annoying ● MongoDB rolling upgrades are ridiculously easy ● Developer friendly - easy to use!
  • 53. TODO ● Release MPJBT on Github ○ Open source for all ○ github.com/domodwyer/mpjbt ● Run similar tests against MongoRocks ○ You guys have inspired us to keep looking! ● Create our 3rd bug report on MongoDB Jira
  • 54. Thank You! Like what you see? We are hiring! Come and speak to us! Questions?
  • 55. References ● https://people.freebsd.org/~seanc/postgresql/scale15x-2017-postgresql_zfs_b est_practices.pdf ● https://jepsen.io/analyses/mongodb-3-4-0-rc3 ● https://dba.stackexchange.com/questions/167525/inconsistent-statistics-on-js onb-column-with-btree-index ● https://github.com/domodwyer/mpjbt
  • 56. Previous Benchmark Results ● http://tiborsimko.org/postgresql-mongodb-json-select-speed.html ● http://erthalion.info/2015/12/29/json-benchmarks/ ● https://www.enterprisedb.com/postgres-plus-edb-blog/marc-linster/postgres-o utperforms-mongodb-and-ushers-new-developer-reality ● https://pgconf.ru/media/2017/04/03/20170317H2_O.Bartunov_json-2017.pdf ● https://www.slideshare.net/toshiharada/ycsb-jsonb
  • 58. pgbouncer.ini ● PostgreSQL does not support connection pooling ● PgBouncer is an extremely lightweight connection pooler ● Setting up and tearing down a new connection is expensive ● Each PostgreSQL connection forks a new process ● Configuration ○ pool_mode = transaction ○ max_client_conn = 300
  • 59. postgresql.conf ● shared_buffer = 16GB ● max_connections = 400 ● fsync = on ● synchronous_commit = on ● full_page_writes = off ● wal_compression = off ● wal_buffers = 16MB ● min_wal_size = 2GB ● max_wal_size = 4GB ● checkpoint_completion_target = 0.9 ● work_mem = 33554KB ● maintenance_work_mem = 2GB ● wal_level=replica
  • 60. mongod.conf ● wiredTiger.engineConfig.cacheSizeGB: 19 ● wiredTiger.engineConfig.journalCompressor: snappy ● wiredTiger.collectionConfig.blockCompressor: snappy ● wiredTiger.indexConfig.prefixCompression: true ● net.maxIncomingConnections: 65536 ● wiredTigerConcurrentReadTransactions: 256 ● wiredTigerConcurrentWriteTransactions: 256
  • 61. ZFS Tuning ● No separate L2ARC ● No separate ZIL ● 1 dataset for O/S ● 1 dataset for data directory ○ checksum=on ○ atime=off ○ recordsize=8K ○ compression=lz4 (PostgreSQL) or off (MongoDB)
  • 62. /boot/loader.conf ● kern.maxusers=1024 ● kern.ipc.semmns=2048 ● kern.ipc.semmni=1024 ● kern.ipc.semmnu=1024 ● kern.ipc.shmall=34359738368 ● kern.ipc.shmmax=34359738368 ● kern.ipc.maxsockets=256000 ● kern.ipc.maxsockbuf=2621440 ● kern.ipc.shmseg=1024
  • 63. /etc/sysctl.conf ● net.inet.tcp.keepidle=3000000 ● net.inet.tcp.keepintvl=60000 ● net.inet.tcp.keepinit=60000 ● security.jail.sysvipc_allowed=1 ● kern.ipc.shmmax=34359738368 ● kern.ipc.shmall=16777216 ● kern.ipc.semmap=256 ● kern.ipc.shm_use_phys=1 ● kern.ipc.nmbclusters=66560 ● kern.maxfiles=2621440 ● kern.maxfilesperproc=2621440 ● kern.threads.max_threads_per_proc=65535 ● kern.ipc.somaxconn=65535 ● kern.eventtimer.timer=HPET ● kern.timecounter.hardware=HPET ● vfs.zfs.arc_max: 8589934592 for PostgreSQL or 1073741824 for MongoDB