SlideShare ist ein Scribd-Unternehmen logo
1 von 11
Downloaden Sie, um offline zu lesen
PalDB
Introduction to PalDB
Mathieu Bastian - October 2015
Summary
❖ PalDB is an embeddable write-once key-value store
❖ Written in Java, no dependencies and only 110K JAR
❖ Very fast read performance, 2M+ reads/second
❖ Simple, works like an immutable un-typed HashMap
❖ Compact, holds in a single binary file
❖ Open-sourced at LinkedIn in 2015
Why PalDB?
❖ Need for an efficient solution to package side-data
❖ Inappropriate existing solutions
‣ Raw data files (CSV, JSON, Avro, Thrift) require complex
parsing code and in-memory data structures
‣ Embeddable key-value stores (LevelDB, RocksDB) have large
overhead due to read/write capabilities
‣ Traditional in-memory data structures (List, HashSet, HashMap)
take too much memory and require load time
Features
✓ All primitives and arrays, no schema needed
✓ Random read & iteration (unsorted)
✓ No load time, and uses off-heap memory
✓ Custom serializers can be defined
✓ Read from store file, stream or resources within JAR
✓ Holds in a single binary file
Write-once
❖ Write-once, read many
❖ Once a store has been written and closed, it can’t be
modified
❖ Typical use-case is to transport pre-created datasets
❖ Principal benefit is a more compact store size
Code: Write store
Java
StoreWriter writer = PalDB.createWriter(new File("store.paldb"));
writer.put("foo", "bar");
writer.put(1213, new int[] {1, 2, 3});
writer.close();
Scala
val writer: StoreWriter = PalDB.createWriter(new File("store.paldb"));
writer.put("foo", "bar");
writer.put(1213, Array(1, 2, 3));
writer.close();
Code: Read store
Java
StoreReader reader = PalDB.createReader(new File("store.paldb"));
String val1 = reader.get("foo");
int[] val2 = reader.get(1213);
reader.close();
Scala
val reader: StoreReader = PalDB.createReader(new File("store.paldb"));
val val1: String = reader.get("foo");
var val2: Array[Int] = reader.get(1213);
reader.close();
Benchmark summary
❖ When compared to embeddable key-value stores
(LevelDB, RocksDB)
‣ PalDB has 5X to 15X higher throughput on datasets
fitting in memory*
❖ When compared to in-memory Java HashSet/HashMap
‣ PalDB has 2X to 5X lower throughput
‣ Uses 6X less memory
* PalDB does not intend to scale to very large disk indices like RocksDB or LevelDB
Throughput
❖ Throughput benchmark between PalDB, LevelDB and
RocksDB (higher is better)
Memory
❖ Memory usage benchmark between PalDB and a Java
HashSet (lower is better)
PalDB © 2015 LinkedIn Corp. Licensed under the terms of the Apache License, Version 2.0.
Code & documentation available on GitHub

https://github.com/linkedin/PalDB
PalDB

Weitere ähnliche Inhalte

Was ist angesagt?

Azure DocumentDB 101
Azure DocumentDB 101Azure DocumentDB 101
Azure DocumentDB 101Ike Ellis
 
Replicating application data into materialized views
Replicating application data into materialized viewsReplicating application data into materialized views
Replicating application data into materialized viewsZach Cox
 
Updating materialized views and caches using kafka
Updating materialized views and caches using kafkaUpdating materialized views and caches using kafka
Updating materialized views and caches using kafkaZach Cox
 
Draft slide of Demystifying DHT in GlusterFS
Draft slide of Demystifying DHT in GlusterFSDraft slide of Demystifying DHT in GlusterFS
Draft slide of Demystifying DHT in GlusterFSAnkit Raj
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisArnab Mitra
 
Ceph Day Beijing: Containers and Ceph
Ceph Day Beijing: Containers and Ceph Ceph Day Beijing: Containers and Ceph
Ceph Day Beijing: Containers and Ceph Ceph Community
 
MongoDB_Sharan_Prakash_Babu
MongoDB_Sharan_Prakash_BabuMongoDB_Sharan_Prakash_Babu
MongoDB_Sharan_Prakash_BabuSharan
 
FOXX - a Javascript application framework on top of ArangoDB
FOXX - a Javascript application framework on top of ArangoDBFOXX - a Javascript application framework on top of ArangoDB
FOXX - a Javascript application framework on top of ArangoDBArangoDB Database
 
Operationalizing MongoDB at AOL
Operationalizing MongoDB at AOLOperationalizing MongoDB at AOL
Operationalizing MongoDB at AOLradiocats
 
MongoDB Aggregation MongoSF May 2011
MongoDB Aggregation MongoSF May 2011MongoDB Aggregation MongoSF May 2011
MongoDB Aggregation MongoSF May 2011Chris Westin
 
Comparison with storing data using NoSQL(CouchDB) and a relational database.
Comparison with storing data using NoSQL(CouchDB) and a relational database.Comparison with storing data using NoSQL(CouchDB) and a relational database.
Comparison with storing data using NoSQL(CouchDB) and a relational database.eross77
 
PENXY - Redis in Azure
PENXY - Redis in AzurePENXY - Redis in Azure
PENXY - Redis in Azuremourhoon
 
Visualize your graph database
Visualize your graph databaseVisualize your graph database
Visualize your graph databaseMichael Hackstein
 
CouchDB: replicated data store for distributed proxy server
CouchDB: replicated data store for distributed proxy serverCouchDB: replicated data store for distributed proxy server
CouchDB: replicated data store for distributed proxy servertkramar
 

Was ist angesagt? (20)

Azure DocumentDB 101
Azure DocumentDB 101Azure DocumentDB 101
Azure DocumentDB 101
 
Mongodb lab
Mongodb labMongodb lab
Mongodb lab
 
MongoDB
MongoDBMongoDB
MongoDB
 
Replicating application data into materialized views
Replicating application data into materialized viewsReplicating application data into materialized views
Replicating application data into materialized views
 
Updating materialized views and caches using kafka
Updating materialized views and caches using kafkaUpdating materialized views and caches using kafka
Updating materialized views and caches using kafka
 
Draft slide of Demystifying DHT in GlusterFS
Draft slide of Demystifying DHT in GlusterFSDraft slide of Demystifying DHT in GlusterFS
Draft slide of Demystifying DHT in GlusterFS
 
ArangoDB
ArangoDBArangoDB
ArangoDB
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Ceph Day Beijing: Containers and Ceph
Ceph Day Beijing: Containers and Ceph Ceph Day Beijing: Containers and Ceph
Ceph Day Beijing: Containers and Ceph
 
MongoDB_Sharan_Prakash_Babu
MongoDB_Sharan_Prakash_BabuMongoDB_Sharan_Prakash_Babu
MongoDB_Sharan_Prakash_Babu
 
Mongo db
Mongo dbMongo db
Mongo db
 
FOXX - a Javascript application framework on top of ArangoDB
FOXX - a Javascript application framework on top of ArangoDBFOXX - a Javascript application framework on top of ArangoDB
FOXX - a Javascript application framework on top of ArangoDB
 
Operationalizing MongoDB at AOL
Operationalizing MongoDB at AOLOperationalizing MongoDB at AOL
Operationalizing MongoDB at AOL
 
MongoDB Aggregation MongoSF May 2011
MongoDB Aggregation MongoSF May 2011MongoDB Aggregation MongoSF May 2011
MongoDB Aggregation MongoSF May 2011
 
KeyValue Stores
KeyValue StoresKeyValue Stores
KeyValue Stores
 
Comparison with storing data using NoSQL(CouchDB) and a relational database.
Comparison with storing data using NoSQL(CouchDB) and a relational database.Comparison with storing data using NoSQL(CouchDB) and a relational database.
Comparison with storing data using NoSQL(CouchDB) and a relational database.
 
PENXY - Redis in Azure
PENXY - Redis in AzurePENXY - Redis in Azure
PENXY - Redis in Azure
 
Visualize your graph database
Visualize your graph databaseVisualize your graph database
Visualize your graph database
 
Redis IU
Redis IURedis IU
Redis IU
 
CouchDB: replicated data store for distributed proxy server
CouchDB: replicated data store for distributed proxy serverCouchDB: replicated data store for distributed proxy server
CouchDB: replicated data store for distributed proxy server
 

Ähnlich wie Introduction to PalDB

Hadoop and object stores can we do it better
Hadoop and object stores  can we do it betterHadoop and object stores  can we do it better
Hadoop and object stores can we do it bettergvernik
 
Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?gvernik
 
Project Voldemort: Big data loading
Project Voldemort: Big data loadingProject Voldemort: Big data loading
Project Voldemort: Big data loadingDan Harvey
 
Boston Apache Spark User Group (the Spahk group) - Introduction to Spark - 15...
Boston Apache Spark User Group (the Spahk group) - Introduction to Spark - 15...Boston Apache Spark User Group (the Spahk group) - Introduction to Spark - 15...
Boston Apache Spark User Group (the Spahk group) - Introduction to Spark - 15...spinningmatt
 
Introduction to Apache Spark Ecosystem
Introduction to Apache Spark EcosystemIntroduction to Apache Spark Ecosystem
Introduction to Apache Spark EcosystemBojan Babic
 
How pulsar stores data at Pulsar-na-summit-2021.pptx (1)
How pulsar stores data at Pulsar-na-summit-2021.pptx (1)How pulsar stores data at Pulsar-na-summit-2021.pptx (1)
How pulsar stores data at Pulsar-na-summit-2021.pptx (1)Shivji Kumar Jha
 
Scalding by Adform Research, Alex Gryzlov
Scalding by Adform Research, Alex GryzlovScalding by Adform Research, Alex Gryzlov
Scalding by Adform Research, Alex GryzlovVasil Remeniuk
 
NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and HowBigBlueHat
 
JDD 2016 - Michal Matloka - Small Intro To Big Data
JDD 2016 - Michal Matloka - Small Intro To Big DataJDD 2016 - Michal Matloka - Small Intro To Big Data
JDD 2016 - Michal Matloka - Small Intro To Big DataPROIDEA
 
OCF.tw's talk about "Introduction to spark"
OCF.tw's talk about "Introduction to spark"OCF.tw's talk about "Introduction to spark"
OCF.tw's talk about "Introduction to spark"Giivee The
 
Efficient In-situ Processing of Various Storage Types on Apache Tajo
Efficient In-situ Processing of Various Storage Types on Apache TajoEfficient In-situ Processing of Various Storage Types on Apache Tajo
Efficient In-situ Processing of Various Storage Types on Apache TajoDataWorks Summit
 
Efficient in situ processing of various storage types on apache tajo
Efficient in situ processing of various storage types on apache tajoEfficient in situ processing of various storage types on apache tajo
Efficient in situ processing of various storage types on apache tajoHyunsik Choi
 
Efficient In­‐situ Processing of Various Storage Types on Apache Tajo
Efficient In­‐situ Processing of Various Storage Types on Apache TajoEfficient In­‐situ Processing of Various Storage Types on Apache Tajo
Efficient In­‐situ Processing of Various Storage Types on Apache TajoGruter
 
Ceph Day New York 2014: Future of CephFS
Ceph Day New York 2014:  Future of CephFS Ceph Day New York 2014:  Future of CephFS
Ceph Day New York 2014: Future of CephFS Ceph Community
 
Real time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkReal time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkRahul Jain
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra ExplainedEric Evans
 
Introduction to NoSql
Introduction to NoSqlIntroduction to NoSql
Introduction to NoSqlOmid Vahdaty
 

Ähnlich wie Introduction to PalDB (20)

Hadoop and object stores can we do it better
Hadoop and object stores  can we do it betterHadoop and object stores  can we do it better
Hadoop and object stores can we do it better
 
Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?Hadoop and object stores: Can we do it better?
Hadoop and object stores: Can we do it better?
 
Project Voldemort: Big data loading
Project Voldemort: Big data loadingProject Voldemort: Big data loading
Project Voldemort: Big data loading
 
Taming NoSQL with Spring Data
Taming NoSQL with Spring DataTaming NoSQL with Spring Data
Taming NoSQL with Spring Data
 
Boston Apache Spark User Group (the Spahk group) - Introduction to Spark - 15...
Boston Apache Spark User Group (the Spahk group) - Introduction to Spark - 15...Boston Apache Spark User Group (the Spahk group) - Introduction to Spark - 15...
Boston Apache Spark User Group (the Spahk group) - Introduction to Spark - 15...
 
Introduction to Apache Spark Ecosystem
Introduction to Apache Spark EcosystemIntroduction to Apache Spark Ecosystem
Introduction to Apache Spark Ecosystem
 
How pulsar stores data at Pulsar-na-summit-2021.pptx (1)
How pulsar stores data at Pulsar-na-summit-2021.pptx (1)How pulsar stores data at Pulsar-na-summit-2021.pptx (1)
How pulsar stores data at Pulsar-na-summit-2021.pptx (1)
 
מיכאל
מיכאלמיכאל
מיכאל
 
Scalding by Adform Research, Alex Gryzlov
Scalding by Adform Research, Alex GryzlovScalding by Adform Research, Alex Gryzlov
Scalding by Adform Research, Alex Gryzlov
 
NoSQL: Why, When, and How
NoSQL: Why, When, and HowNoSQL: Why, When, and How
NoSQL: Why, When, and How
 
JDD 2016 - Michal Matloka - Small Intro To Big Data
JDD 2016 - Michal Matloka - Small Intro To Big DataJDD 2016 - Michal Matloka - Small Intro To Big Data
JDD 2016 - Michal Matloka - Small Intro To Big Data
 
OCF.tw's talk about "Introduction to spark"
OCF.tw's talk about "Introduction to spark"OCF.tw's talk about "Introduction to spark"
OCF.tw's talk about "Introduction to spark"
 
Efficient In-situ Processing of Various Storage Types on Apache Tajo
Efficient In-situ Processing of Various Storage Types on Apache TajoEfficient In-situ Processing of Various Storage Types on Apache Tajo
Efficient In-situ Processing of Various Storage Types on Apache Tajo
 
In-memory database
In-memory databaseIn-memory database
In-memory database
 
Efficient in situ processing of various storage types on apache tajo
Efficient in situ processing of various storage types on apache tajoEfficient in situ processing of various storage types on apache tajo
Efficient in situ processing of various storage types on apache tajo
 
Efficient In­‐situ Processing of Various Storage Types on Apache Tajo
Efficient In­‐situ Processing of Various Storage Types on Apache TajoEfficient In­‐situ Processing of Various Storage Types on Apache Tajo
Efficient In­‐situ Processing of Various Storage Types on Apache Tajo
 
Ceph Day New York 2014: Future of CephFS
Ceph Day New York 2014:  Future of CephFS Ceph Day New York 2014:  Future of CephFS
Ceph Day New York 2014: Future of CephFS
 
Real time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkReal time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache Spark
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
 
Introduction to NoSql
Introduction to NoSqlIntroduction to NoSql
Introduction to NoSql
 

Kürzlich hochgeladen

Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnAmarnathKambale
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfryanfarris8
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech studentsHimanshiGarg82
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 

Kürzlich hochgeladen (20)

Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 

Introduction to PalDB

  • 1. PalDB Introduction to PalDB Mathieu Bastian - October 2015
  • 2. Summary ❖ PalDB is an embeddable write-once key-value store ❖ Written in Java, no dependencies and only 110K JAR ❖ Very fast read performance, 2M+ reads/second ❖ Simple, works like an immutable un-typed HashMap ❖ Compact, holds in a single binary file ❖ Open-sourced at LinkedIn in 2015
  • 3. Why PalDB? ❖ Need for an efficient solution to package side-data ❖ Inappropriate existing solutions ‣ Raw data files (CSV, JSON, Avro, Thrift) require complex parsing code and in-memory data structures ‣ Embeddable key-value stores (LevelDB, RocksDB) have large overhead due to read/write capabilities ‣ Traditional in-memory data structures (List, HashSet, HashMap) take too much memory and require load time
  • 4. Features ✓ All primitives and arrays, no schema needed ✓ Random read & iteration (unsorted) ✓ No load time, and uses off-heap memory ✓ Custom serializers can be defined ✓ Read from store file, stream or resources within JAR ✓ Holds in a single binary file
  • 5. Write-once ❖ Write-once, read many ❖ Once a store has been written and closed, it can’t be modified ❖ Typical use-case is to transport pre-created datasets ❖ Principal benefit is a more compact store size
  • 6. Code: Write store Java StoreWriter writer = PalDB.createWriter(new File("store.paldb")); writer.put("foo", "bar"); writer.put(1213, new int[] {1, 2, 3}); writer.close(); Scala val writer: StoreWriter = PalDB.createWriter(new File("store.paldb")); writer.put("foo", "bar"); writer.put(1213, Array(1, 2, 3)); writer.close();
  • 7. Code: Read store Java StoreReader reader = PalDB.createReader(new File("store.paldb")); String val1 = reader.get("foo"); int[] val2 = reader.get(1213); reader.close(); Scala val reader: StoreReader = PalDB.createReader(new File("store.paldb")); val val1: String = reader.get("foo"); var val2: Array[Int] = reader.get(1213); reader.close();
  • 8. Benchmark summary ❖ When compared to embeddable key-value stores (LevelDB, RocksDB) ‣ PalDB has 5X to 15X higher throughput on datasets fitting in memory* ❖ When compared to in-memory Java HashSet/HashMap ‣ PalDB has 2X to 5X lower throughput ‣ Uses 6X less memory * PalDB does not intend to scale to very large disk indices like RocksDB or LevelDB
  • 9. Throughput ❖ Throughput benchmark between PalDB, LevelDB and RocksDB (higher is better)
  • 10. Memory ❖ Memory usage benchmark between PalDB and a Java HashSet (lower is better)
  • 11. PalDB © 2015 LinkedIn Corp. Licensed under the terms of the Apache License, Version 2.0. Code & documentation available on GitHub
 https://github.com/linkedin/PalDB PalDB