SlideShare ist ein Scribd-Unternehmen logo
1 von 15
Bitsy Graph Database

Sridhar Ramachandran
Founder, LambdaZen LLC
What is Bitsy?
● A small, fast, embeddable,
durable, in-memory graph
database.
● Maintains an on-disk copy of
the graph database.
● Designed for multi-threaded
OLTP applications.
● Provides ACID guarantees
and optimistic concurrency
control for transactions.
● Compatible with
Tinkerpop/Blueprints -- the
graph database standard.

Tinkerpop software stack
From https://github.com/tinkerpop/blueprints/wiki
In-memory and durable?
● Bitsy maintains a copy of the entire graph in memory
data-structures.
● Bitsy saves all changes made to the database, to the
disk, during a commit operation.
● Commits from different threads are forced to the disk at
once, thereby improving the write performance in a
multithreaded OLTP environment.
● The database is loaded from files during startup.
● All database files are append-only text files with JSONencoded vertices and edges.
● The database files are periodically compacted by a
background thread.
Design Principle #1: No Seek
● Bitsy appends all changes to an
unordered transaction log, unlike
most databases which persist data in
B-Trees and other ordered
structures.
● Ordered data structures perform
multiple seeks per updated element.
● Seek operations on the hard-disk are
expensive (5-15 ms).
● Bitsy avoids seeks per element, and
addresses rotational latency by
combining commits from concurrent
transactions.

Hard disk head: Seek
operations require a
mechanical movement of
the hard disk head which
takes 5-15ms.
Rotational latency is the
time taken for the
requested sector in the
rotating platter to reach
the head. Takes 2-4ms.
Design Principle #2: No Socket
● Typical databases run in a separate
process exposing a socket-based
protocol to applications.
● The cost of serializing and
deserializing the requests and
responses, and calling OS-level
functions, reduces the overall
throughput of the database.
● By avoiding a socket-based protocol
between the application and the
database, Bitsy can achieve submicrosecond query latencies.

The OSI model requires
serialization and
deserialization as the
packet crosses from one
layer to another
Design Principle #3: No SQL
● Tuning a SQL database is
a non-trivial task.
● The biggest factor in a
SQL query's efficiency is
its execution plan.
● By avoiding SQL and the
execution plans that come
with it, Bitsy ensures that
all queries and updates
are efficient*.

An example execution plan from Oracle's
documentation

* The "allow full-graph scan" option must be disabled to guarantee quick responses.
Concurrency Model
● Bitsy is designed to work in multi-threaded OLTP
environments.
● It implements optimistic concurrency control where
edges and vertices are tied to version numbers that are
incremented on updates.
● A BitsyRetryException is raised during a transaction
commit, if an updated vertex/edge has a different
version at the time of commit, than at the time of query.
● The application should retry the entire transaction in
case of conflict.
Write Algorithms
●
●

●
●

●

●

The write algorithms operate on
three levels of "double buffers".
The transaction buffers capture
transactions to be committed
simultaneously.
The commit waits for the buffer to
flush to a transaction file (A/B).
Transaction files are moved to
vertex and edge files on exceeding
a threshold size (default is 4MB).
Vertex and edge files are
reorganized after a period of growth
(default is +1x initial size).
Online backups trigger a
transaction flush, and then copy the
backup the vertex and edge files
representing the DB snapshot.
Write throughput in an OLTP setting
●
●
●

The plot below shows the throughput of a test application* that repeatedly
commits a small transaction (1 vertex + 1 edge) from multiple threads.
The throughput exceeds 50K ops/second at 750 concurrent threads.
The comparison with Neo4J 1.9.2 illustrates the benefit of "No Seek".

* Tests performed on a $600 HP p7-1287c desktop PC with a single 7200 rpm hard disk.
Read throughput in an OLTP setting
●
●

The plot below shows the read throughput of threads, repeatedly traversing
separate portions of the graph in a desktop PC*.
Bitsy implements mostly lock-free read algorithms that can perform close
to 20M ops/second at 1000 threads -- on par with Neo4J’s warm caches.

* Tests performed on a $600 HP p7-1287c desktop PC with 4 cores
Monitoring and Management
● Offline backup and
restore operations are
simple file copy
operations on the
database directory.
● Bitsy exposes a JMX
interface to make online
backups, and adjust
database parameters.
● Bitsy logs messages
using the SLF4J API with
logger names starting
with "com.lambdazen".

Online backup through jconsole
Dependencies
●
●
●
●

Blueprints Core
Jackson JSON Processor
SLF4J API
Ness Computing Core Component: For fast UUID
serialization/deserialization
License
● Bitsy is a dual-licensed product.
● The AGPL v3 license can be used for open-source
●

projects and internally-used closed-source projects.
The commercial license is an extremely liberal license
that provides rights to modify and use Bitsy in an
unlimited number of instances, products* and services.
Pricing details with a 15% promotional discount (till Feb 2014)
Startups and small
businesses
(1-10 employees)

Medium-sized enterprises
(10-500 employees)

Large-sized
enterprises
(500+ employees)

$425 annual
$1699 perpetual

$849 annual
$3399 perpetual

$1275 annual
$5099 perpetual

* The products must not encourage the direct use of Bitsy APIs.
Wrap-up
● Bitsy is a small, fast, embeddable, durable, in-memory
graph database, with the following features:
○ ACID guarantees and clean recovery from crashes
○ Query latency in sub-microseconds
○ High transaction throughput in an OLTP setting with multiple
clients/threads accessing the database

●

○ Well-defined optimistic concurrency model
○ Support for online backups
○ Human-readable database files
○ Small code footprint (~1.5MB with dependencies)
Bitsy is dual-licensed under AGPL and a liberal
commercial license for unlimited enterprise-wide use.
Questions and Feedback
● The project is hosted at https://bitbucket.
org/lambdazen/bitsy with publicly accessible
○ Documentation and install instructions (in Wiki)
○ Links to downloads
○ Issue management

● Please email your questions and feedback to
bisty@lambdazen.com

Weitere Àhnliche Inhalte

Was ist angesagt?

The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowDataWorks Summit
 
File Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and ParquetFile Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and ParquetDataWorks Summit/Hadoop Summit
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controllerconfluent
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013mumrah
 
Machine Learning at Scale with MLflow and Apache Spark
Machine Learning at Scale with MLflow and Apache SparkMachine Learning at Scale with MLflow and Apache Spark
Machine Learning at Scale with MLflow and Apache SparkDatabricks
 
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...HostedbyConfluent
 
Parquet - Data I/O - Philadelphia 2013
Parquet - Data I/O - Philadelphia 2013Parquet - Data I/O - Philadelphia 2013
Parquet - Data I/O - Philadelphia 2013larsgeorge
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compactionMIJIN AN
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021StreamNative
 
Click-Through Example for Flink’s KafkaConsumer Checkpointing
Click-Through Example for Flink’s KafkaConsumer CheckpointingClick-Through Example for Flink’s KafkaConsumer Checkpointing
Click-Through Example for Flink’s KafkaConsumer CheckpointingRobert Metzger
 
The Apache Spark File Format Ecosystem
The Apache Spark File Format EcosystemThe Apache Spark File Format Ecosystem
The Apache Spark File Format EcosystemDatabricks
 
Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Flink Forward
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsAlluxio, Inc.
 
Ch04 æœƒè©±çźĄç†
Ch04 æœƒè©±çźĄç†Ch04 æœƒè©±çźĄç†
Ch04 æœƒè©±çźĄç†Justin Lin
 
Apache doris (incubating) introduction
Apache doris (incubating) introductionApache doris (incubating) introduction
Apache doris (incubating) introductionleanderlee2
 
Etsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureDan McKinley
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangDatabricks
 
ORC: 2015 Faster, Better, Smaller
ORC: 2015 Faster, Better, SmallerORC: 2015 Faster, Better, Smaller
ORC: 2015 Faster, Better, SmallerDataWorks Summit
 

Was ist angesagt? (20)

Amazon EBS: Deep Dive
Amazon EBS: Deep DiveAmazon EBS: Deep Dive
Amazon EBS: Deep Dive
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
 
File Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and ParquetFile Format Benchmark - Avro, JSON, ORC and Parquet
File Format Benchmark - Avro, JSON, ORC and Parquet
 
A Deep Dive into Kafka Controller
A Deep Dive into Kafka ControllerA Deep Dive into Kafka Controller
A Deep Dive into Kafka Controller
 
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
Introduction and Overview of Apache Kafka, TriHUG July 23, 2013
 
Machine Learning at Scale with MLflow and Apache Spark
Machine Learning at Scale with MLflow and Apache SparkMachine Learning at Scale with MLflow and Apache Spark
Machine Learning at Scale with MLflow and Apache Spark
 
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
Apache Kafka’s Transactions in the Wild! Developing an exactly-once KafkaSink...
 
Parquet - Data I/O - Philadelphia 2013
Parquet - Data I/O - Philadelphia 2013Parquet - Data I/O - Philadelphia 2013
Parquet - Data I/O - Philadelphia 2013
 
RocksDB compaction
RocksDB compactionRocksDB compaction
RocksDB compaction
 
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
 
Click-Through Example for Flink’s KafkaConsumer Checkpointing
Click-Through Example for Flink’s KafkaConsumer CheckpointingClick-Through Example for Flink’s KafkaConsumer Checkpointing
Click-Through Example for Flink’s KafkaConsumer Checkpointing
 
The Apache Spark File Format Ecosystem
The Apache Spark File Format EcosystemThe Apache Spark File Format Ecosystem
The Apache Spark File Format Ecosystem
 
Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!Near real-time statistical modeling and anomaly detection using Flink!
Near real-time statistical modeling and anomaly detection using Flink!
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data Analytics
 
Ch04 æœƒè©±çźĄç†
Ch04 æœƒè©±çźĄç†Ch04 æœƒè©±çźĄç†
Ch04 æœƒè©±çźĄç†
 
Apache doris (incubating) introduction
Apache doris (incubating) introductionApache doris (incubating) introduction
Apache doris (incubating) introduction
 
Etsy Activity Feeds Architecture
Etsy Activity Feeds ArchitectureEtsy Activity Feeds Architecture
Etsy Activity Feeds Architecture
 
Tomcat Server
Tomcat ServerTomcat Server
Tomcat Server
 
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang WangApache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
Apache Spark Data Source V2 with Wenchen Fan and Gengliang Wang
 
ORC: 2015 Faster, Better, Smaller
ORC: 2015 Faster, Better, SmallerORC: 2015 Faster, Better, Smaller
ORC: 2015 Faster, Better, Smaller
 

Andere mochten auch

Improvements in Bitsy 1.5
Improvements in Bitsy 1.5Improvements in Bitsy 1.5
Improvements in Bitsy 1.5LambdaZen LLC
 
HyperGraphDb
HyperGraphDbHyperGraphDb
HyperGraphDbborislav
 
HypergraphDB
HypergraphDBHypergraphDB
HypergraphDBJan Drozen
 
OrientDB distributed architecture 1.1
OrientDB distributed architecture 1.1OrientDB distributed architecture 1.1
OrientDB distributed architecture 1.1Luca Garulli
 
Pinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastorePinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastoreKishore Gopalakrishna
 
fluent-plugin-norikra #fluentdcasual
fluent-plugin-norikra #fluentdcasualfluent-plugin-norikra #fluentdcasual
fluent-plugin-norikra #fluentdcasualSATOSHI TAGOMORI
 

Andere mochten auch (6)

Improvements in Bitsy 1.5
Improvements in Bitsy 1.5Improvements in Bitsy 1.5
Improvements in Bitsy 1.5
 
HyperGraphDb
HyperGraphDbHyperGraphDb
HyperGraphDb
 
HypergraphDB
HypergraphDBHypergraphDB
HypergraphDB
 
OrientDB distributed architecture 1.1
OrientDB distributed architecture 1.1OrientDB distributed architecture 1.1
OrientDB distributed architecture 1.1
 
Pinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastorePinot: Realtime Distributed OLAP datastore
Pinot: Realtime Distributed OLAP datastore
 
fluent-plugin-norikra #fluentdcasual
fluent-plugin-norikra #fluentdcasualfluent-plugin-norikra #fluentdcasual
fluent-plugin-norikra #fluentdcasual
 

Ähnlich wie Bitsy graph database

Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8MongoDB
 
PostgreSQL 10: What to Look For
PostgreSQL 10: What to Look ForPostgreSQL 10: What to Look For
PostgreSQL 10: What to Look ForAmit Langote
 
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle Ashnikbiz
 
GlusterFS Presentation FOSSCOMM2013 HUA, Athens, GR
GlusterFS Presentation FOSSCOMM2013 HUA, Athens, GRGlusterFS Presentation FOSSCOMM2013 HUA, Athens, GR
GlusterFS Presentation FOSSCOMM2013 HUA, Athens, GRTheophanis Kontogiannis
 
WiredTiger & What's New in 3.0
WiredTiger & What's New in 3.0WiredTiger & What's New in 3.0
WiredTiger & What's New in 3.0MongoDB
 
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...HostedbyConfluent
 
in-memory database system and low latency
in-memory database system and low latencyin-memory database system and low latency
in-memory database system and low latencyhyeongchae lee
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageMayaData Inc
 
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...In-Memory Computing Summit
 
Db2 analytics accelerator on ibm integrated analytics system technical over...
Db2 analytics accelerator on ibm integrated analytics system   technical over...Db2 analytics accelerator on ibm integrated analytics system   technical over...
Db2 analytics accelerator on ibm integrated analytics system technical over...Daniel Martin
 
MySQL 5.6 - Operations and Diagnostics Improvements
MySQL 5.6 - Operations and Diagnostics ImprovementsMySQL 5.6 - Operations and Diagnostics Improvements
MySQL 5.6 - Operations and Diagnostics ImprovementsMorgan Tocker
 
PCIX Gigabit Ethernet Card Design
PCIX Gigabit Ethernet Card DesignPCIX Gigabit Ethernet Card Design
PCIX Gigabit Ethernet Card DesignMohamad Tisani
 
M|18 Intel and MariaDB: Strategic Collaboration to Enhance MariaDB Functional...
M|18 Intel and MariaDB: Strategic Collaboration to Enhance MariaDB Functional...M|18 Intel and MariaDB: Strategic Collaboration to Enhance MariaDB Functional...
M|18 Intel and MariaDB: Strategic Collaboration to Enhance MariaDB Functional...MariaDB plc
 
Sparc t4 systems customer presentation
Sparc t4 systems customer presentationSparc t4 systems customer presentation
Sparc t4 systems customer presentationsolarisyougood
 
Cómo se diseña una base de datos que pueda ingerir mås de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir mås de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir mås de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir mås de cuatro millones de ...javier ramirez
 
https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...
https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...
https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...MongoDB
 
9.6_Course Material-Postgresql_002.pdf
9.6_Course Material-Postgresql_002.pdf9.6_Course Material-Postgresql_002.pdf
9.6_Course Material-Postgresql_002.pdfsreedb2
 

Ähnlich wie Bitsy graph database (20)

Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8Webinar: High Performance MongoDB Applications with IBM POWER8
Webinar: High Performance MongoDB Applications with IBM POWER8
 
PostgreSQL 10: What to Look For
PostgreSQL 10: What to Look ForPostgreSQL 10: What to Look For
PostgreSQL 10: What to Look For
 
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
Ashnik EnterpriseDB PostgreSQL - A real alternative to Oracle
 
GlusterFS Presentation FOSSCOMM2013 HUA, Athens, GR
GlusterFS Presentation FOSSCOMM2013 HUA, Athens, GRGlusterFS Presentation FOSSCOMM2013 HUA, Athens, GR
GlusterFS Presentation FOSSCOMM2013 HUA, Athens, GR
 
WiredTiger & What's New in 3.0
WiredTiger & What's New in 3.0WiredTiger & What's New in 3.0
WiredTiger & What's New in 3.0
 
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
Running Production CDC Ingestion Pipelines With Balaji Varadarajan and Pritam...
 
in-memory database system and low latency
in-memory database system and low latencyin-memory database system and low latency
in-memory database system and low latency
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
 
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
IMCSummit 2015 - Day 1 Developer Track - Evolution of non-volatile memory exp...
 
Db2 analytics accelerator on ibm integrated analytics system technical over...
Db2 analytics accelerator on ibm integrated analytics system   technical over...Db2 analytics accelerator on ibm integrated analytics system   technical over...
Db2 analytics accelerator on ibm integrated analytics system technical over...
 
EQUNIX - PPT 11DB-Postgresℱ.pdf
EQUNIX - PPT 11DB-Postgresℱ.pdfEQUNIX - PPT 11DB-Postgresℱ.pdf
EQUNIX - PPT 11DB-Postgresℱ.pdf
 
VM-aware Adaptive Storage Cache Prefetching
VM-aware Adaptive Storage Cache PrefetchingVM-aware Adaptive Storage Cache Prefetching
VM-aware Adaptive Storage Cache Prefetching
 
MySQL 5.6 - Operations and Diagnostics Improvements
MySQL 5.6 - Operations and Diagnostics ImprovementsMySQL 5.6 - Operations and Diagnostics Improvements
MySQL 5.6 - Operations and Diagnostics Improvements
 
PCIX Gigabit Ethernet Card Design
PCIX Gigabit Ethernet Card DesignPCIX Gigabit Ethernet Card Design
PCIX Gigabit Ethernet Card Design
 
M|18 Intel and MariaDB: Strategic Collaboration to Enhance MariaDB Functional...
M|18 Intel and MariaDB: Strategic Collaboration to Enhance MariaDB Functional...M|18 Intel and MariaDB: Strategic Collaboration to Enhance MariaDB Functional...
M|18 Intel and MariaDB: Strategic Collaboration to Enhance MariaDB Functional...
 
Apache ignite v1.3
Apache ignite v1.3Apache ignite v1.3
Apache ignite v1.3
 
Sparc t4 systems customer presentation
Sparc t4 systems customer presentationSparc t4 systems customer presentation
Sparc t4 systems customer presentation
 
Cómo se diseña una base de datos que pueda ingerir mås de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir mås de cuatro millones de ...Cómo se diseña una base de datos que pueda ingerir mås de cuatro millones de ...
Cómo se diseña una base de datos que pueda ingerir mås de cuatro millones de ...
 
https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...
https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...
https://docs.google.com/presentation/d/1DcL4zK6i3HZRDD4xTGX1VpSOwyu2xBeWLT6a_...
 
9.6_Course Material-Postgresql_002.pdf
9.6_Course Material-Postgresql_002.pdf9.6_Course Material-Postgresql_002.pdf
9.6_Course Material-Postgresql_002.pdf
 

KĂŒrzlich hochgeladen

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 

KĂŒrzlich hochgeladen (20)

WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 

Bitsy graph database

  • 1. Bitsy Graph Database Sridhar Ramachandran Founder, LambdaZen LLC
  • 2. What is Bitsy? ● A small, fast, embeddable, durable, in-memory graph database. ● Maintains an on-disk copy of the graph database. ● Designed for multi-threaded OLTP applications. ● Provides ACID guarantees and optimistic concurrency control for transactions. ● Compatible with Tinkerpop/Blueprints -- the graph database standard. Tinkerpop software stack From https://github.com/tinkerpop/blueprints/wiki
  • 3. In-memory and durable? ● Bitsy maintains a copy of the entire graph in memory data-structures. ● Bitsy saves all changes made to the database, to the disk, during a commit operation. ● Commits from different threads are forced to the disk at once, thereby improving the write performance in a multithreaded OLTP environment. ● The database is loaded from files during startup. ● All database files are append-only text files with JSONencoded vertices and edges. ● The database files are periodically compacted by a background thread.
  • 4. Design Principle #1: No Seek ● Bitsy appends all changes to an unordered transaction log, unlike most databases which persist data in B-Trees and other ordered structures. ● Ordered data structures perform multiple seeks per updated element. ● Seek operations on the hard-disk are expensive (5-15 ms). ● Bitsy avoids seeks per element, and addresses rotational latency by combining commits from concurrent transactions. Hard disk head: Seek operations require a mechanical movement of the hard disk head which takes 5-15ms. Rotational latency is the time taken for the requested sector in the rotating platter to reach the head. Takes 2-4ms.
  • 5. Design Principle #2: No Socket ● Typical databases run in a separate process exposing a socket-based protocol to applications. ● The cost of serializing and deserializing the requests and responses, and calling OS-level functions, reduces the overall throughput of the database. ● By avoiding a socket-based protocol between the application and the database, Bitsy can achieve submicrosecond query latencies. The OSI model requires serialization and deserialization as the packet crosses from one layer to another
  • 6. Design Principle #3: No SQL ● Tuning a SQL database is a non-trivial task. ● The biggest factor in a SQL query's efficiency is its execution plan. ● By avoiding SQL and the execution plans that come with it, Bitsy ensures that all queries and updates are efficient*. An example execution plan from Oracle's documentation * The "allow full-graph scan" option must be disabled to guarantee quick responses.
  • 7. Concurrency Model ● Bitsy is designed to work in multi-threaded OLTP environments. ● It implements optimistic concurrency control where edges and vertices are tied to version numbers that are incremented on updates. ● A BitsyRetryException is raised during a transaction commit, if an updated vertex/edge has a different version at the time of commit, than at the time of query. ● The application should retry the entire transaction in case of conflict.
  • 8. Write Algorithms ● ● ● ● ● ● The write algorithms operate on three levels of "double buffers". The transaction buffers capture transactions to be committed simultaneously. The commit waits for the buffer to flush to a transaction file (A/B). Transaction files are moved to vertex and edge files on exceeding a threshold size (default is 4MB). Vertex and edge files are reorganized after a period of growth (default is +1x initial size). Online backups trigger a transaction flush, and then copy the backup the vertex and edge files representing the DB snapshot.
  • 9. Write throughput in an OLTP setting ● ● ● The plot below shows the throughput of a test application* that repeatedly commits a small transaction (1 vertex + 1 edge) from multiple threads. The throughput exceeds 50K ops/second at 750 concurrent threads. The comparison with Neo4J 1.9.2 illustrates the benefit of "No Seek". * Tests performed on a $600 HP p7-1287c desktop PC with a single 7200 rpm hard disk.
  • 10. Read throughput in an OLTP setting ● ● The plot below shows the read throughput of threads, repeatedly traversing separate portions of the graph in a desktop PC*. Bitsy implements mostly lock-free read algorithms that can perform close to 20M ops/second at 1000 threads -- on par with Neo4J’s warm caches. * Tests performed on a $600 HP p7-1287c desktop PC with 4 cores
  • 11. Monitoring and Management ● Offline backup and restore operations are simple file copy operations on the database directory. ● Bitsy exposes a JMX interface to make online backups, and adjust database parameters. ● Bitsy logs messages using the SLF4J API with logger names starting with "com.lambdazen". Online backup through jconsole
  • 12. Dependencies ● ● ● ● Blueprints Core Jackson JSON Processor SLF4J API Ness Computing Core Component: For fast UUID serialization/deserialization
  • 13. License ● Bitsy is a dual-licensed product. ● The AGPL v3 license can be used for open-source ● projects and internally-used closed-source projects. The commercial license is an extremely liberal license that provides rights to modify and use Bitsy in an unlimited number of instances, products* and services. Pricing details with a 15% promotional discount (till Feb 2014) Startups and small businesses (1-10 employees) Medium-sized enterprises (10-500 employees) Large-sized enterprises (500+ employees) $425 annual $1699 perpetual $849 annual $3399 perpetual $1275 annual $5099 perpetual * The products must not encourage the direct use of Bitsy APIs.
  • 14. Wrap-up ● Bitsy is a small, fast, embeddable, durable, in-memory graph database, with the following features: ○ ACID guarantees and clean recovery from crashes ○ Query latency in sub-microseconds ○ High transaction throughput in an OLTP setting with multiple clients/threads accessing the database ● ○ Well-defined optimistic concurrency model ○ Support for online backups ○ Human-readable database files ○ Small code footprint (~1.5MB with dependencies) Bitsy is dual-licensed under AGPL and a liberal commercial license for unlimited enterprise-wide use.
  • 15. Questions and Feedback ● The project is hosted at https://bitbucket. org/lambdazen/bitsy with publicly accessible ○ Documentation and install instructions (in Wiki) ○ Links to downloads ○ Issue management ● Please email your questions and feedback to bisty@lambdazen.com