CrateDB & PostgreSQL
OldSQL to NewSQL
11th July 2017
@claus__m
About
~2yrs at Crate.io
DevRel/Field Engineering/Support/
Integrations/…
Speaking
Conferences, meetups, ...
Working with customers
Consulting, pre- and post-sales
Agenda
Failures
What, how, and when?
PostgreSQL
Concept overview
CrateDB
Concept overview
Discussion
NewSQL or not? Benefits and drawbacks.
Use Cases
Wrap up
Failures
Database Failures
Consequences
Data loss
Lost updates, dirty reads, ...
Service interruptions
Services can’t work without their database
Slow performance
Users may lose interest
Pressure
DBAs in the spotlight
What Makes Databases
Fail?
Overloaded
Insufficient hardware (RAM, CPU, disk),
swapping, inefficient queries
Failure
Hardware may fail on many levels: e.g.
Network, disk, RAM
Platform
Configuration errors, updates, resource
sharing, bugs
People
Malicious intent, sloppiness, ...
Assessment
Overview
Concepts and other things
Index and data
How the database creates indices, stores and
retrieves data
Search and scans
How the data is found
Replication and high availability
Distribution and achieving zero downtime
PostgreSQL
Overview
Multi-process System
fork() to clone processes from postmaster to
postgres instances with shared memory
Technology
Written in C, natively compiled
Optimization
Cost-based optimizer
Transactional
ACID compliant
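A cost-based optimizer estimates the cost of each candidate plan and picks the cheapest one. The sketch below is a toy model, not PostgreSQL's planner: the function names and the formulas are simplified assumptions, and only the three constants mirror PostgreSQL's documented defaults for seq_page_cost, random_page_cost and cpu_tuple_cost.

```python
# Toy cost model (an assumption for illustration, far simpler than the real planner).
SEQ_PAGE_COST = 1.0      # PostgreSQL default for seq_page_cost
RANDOM_PAGE_COST = 4.0   # PostgreSQL default for random_page_cost
CPU_TUPLE_COST = 0.01    # PostgreSQL default for cpu_tuple_cost


def seq_scan_cost(pages: int, rows: int) -> float:
    """Read every page in order and evaluate the predicate on every row."""
    return pages * SEQ_PAGE_COST + rows * CPU_TUPLE_COST


def index_scan_cost(rows: int, selectivity: float) -> float:
    """Assume one random page fetch per matching row (a pessimistic simplification)."""
    matching = rows * selectivity
    return matching * (RANDOM_PAGE_COST + CPU_TUPLE_COST)


def choose_plan(pages: int, rows: int, selectivity: float) -> str:
    """Return the name of the cheaper plan under the toy model."""
    seq = seq_scan_cost(pages, rows)
    idx = index_scan_cost(rows, selectivity)
    return "index scan" if idx < seq else "sequential scan"
```

With these assumptions, a highly selective predicate favors the index, while a predicate matching half the table favors reading every page sequentially.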
Index And Data
Tree-based
A B-Tree by default, stored on disk and cached in
memory; created with CREATE INDEX or by a
constraint in CREATE TABLE or ALTER TABLE
In Memory & On Disk
8K data pages in shared buffer cache and on
disk
Item Pointers
Index entries point at row locations in the data
pages; only major changes (e.g. INSERTs/DELETEs)
are reflected in the index
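The relationship between index entries, item pointers and data pages can be sketched as follows. This is a minimal illustration under stated assumptions: a sorted Python list stands in for the B-Tree leaf level, the pages are tiny lists instead of 8K blocks, and the names heap, index and index_lookup are made up for this example.

```python
import bisect

# Heap: page number -> list of rows. Real pages are 8K blocks; these are tiny lists.
heap = {
    0: [("alice", 31), ("bob", 45)],
    1: [("carol", 27), ("dave", 45)],
}

# Index on the second column: sorted (key, (page, offset)) entries.
# A sorted list stands in for the B-Tree leaf level (an assumption for brevity).
index = sorted(
    (row[1], (page, offset))
    for page, rows in heap.items()
    for offset, row in enumerate(rows)
)
keys = [key for key, _ in index]


def index_lookup(value):
    """Binary-search the index, then follow the item pointers into the heap."""
    lo = bisect.bisect_left(keys, value)
    hi = bisect.bisect_right(keys, value)
    return [heap[page][offset] for _, (page, offset) in index[lo:hi]]
```

The index never stores the row itself, only where to find it, which is why a lookup is a search in the index followed by a page fetch.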
Figure source: http://use-the-index-luke.com/sql/anatomy/the-tree
Searches And Scans
Sequential
Go over every block and execute a predicate
Index-based
Find something using an index on that column,
or a full index scan
Bitmap-based
Mark matches in boolean queries for results
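The difference between a sequential scan and a bitmap-based scan can be sketched in a few lines. This is an illustration only: the sample rows, the dictionary-based index and the use of a Python integer as the bitmap are assumptions, not how PostgreSQL stores any of this.

```python
rows = [
    {"id": 0, "city": "Berlin", "status": "active"},
    {"id": 1, "city": "Vienna", "status": "active"},
    {"id": 2, "city": "Berlin", "status": "inactive"},
    {"id": 3, "city": "Berlin", "status": "active"},
]


def sequential_scan(predicate):
    """Visit every row and keep those that satisfy the predicate."""
    return [row["id"] for row in rows if predicate(row)]


def build_index(column):
    """Map each column value to the positions of the rows holding it."""
    index = {}
    for position, row in enumerate(rows):
        index.setdefault(row[column], []).append(position)
    return index


def bitmap(index, value):
    """Turn an index lookup into an integer bitmap, one bit per row position."""
    bits = 0
    for position in index.get(value, []):
        bits |= 1 << position
    return bits


def bitmap_and_scan(city, status):
    """AND the two bitmaps, then fetch only the rows whose bit is set."""
    combined = bitmap(build_index("city"), city) & bitmap(build_index("status"), status)
    return [rows[p]["id"] for p in range(len(rows)) if (combined >> p) & 1]
```

Both approaches return the same rows; the bitmap variant only touches rows that matched every indexed condition.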
Replication And
High Availability
Disk based
By sharing a disk or continuously cloning a disk
Log-shipping
Send the write-ahead-log to the standby server,
which can answer reads
Master/Master
Sends rows to the other master, can answer
reads and writes, locks rows/tables
Client-sharding
Shard the data on a client/proxy and route
accordingly
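Log-shipping can be sketched as a primary that appends every write to a log and a standby that replays the shipped records. The classes Primary and Standby and their methods are hypothetical and kept in memory; a real setup ships write-ahead-log data over the network or via files.

```python
class Primary:
    """Applies writes locally and records them in a write-ahead log."""

    def __init__(self):
        self.data = {}
        self.wal = []  # ordered list of (lsn, key, value) records

    def write(self, key, value):
        lsn = len(self.wal) + 1  # log sequence number, starting at 1
        self.wal.append((lsn, key, value))
        self.data[key] = value

    def records_after(self, lsn):
        """Return the log records a standby has not replayed yet."""
        return self.wal[lsn:]


class Standby:
    """Replays shipped log records and serves (possibly stale) reads."""

    def __init__(self):
        self.data = {}
        self.replayed_lsn = 0

    def replay(self, records):
        for lsn, key, value in records:
            self.data[key] = value
            self.replayed_lsn = lsn

    def read(self, key):
        return self.data.get(key)
```

Until the next batch of records is replayed, the standby answers reads from an older state, which is the trade-off of asynchronous log-shipping.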
CrateDB
Overview
Multi-threaded System
Thread-pools to read/write Lucene segments
Technology
Java/JVM based
Optimization
Naive optimization on query levels
Eventually Consistent
Atomic operations per row, optimistic
concurrency only
Distributed By Default
Transparent partitioning and sharding
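Sharding is commonly done by hashing a routing value and taking the result modulo the number of shards. The sketch below assumes exactly that scheme; the shard count, the crc32 hash and the function names are stand-ins chosen for this example, not CrateDB internals.

```python
import zlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(routing_value: str, num_shards: int = NUM_SHARDS) -> int:
    """Pick a shard from a stable hash of the routing value."""
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def distribute(rows, routing_column):
    """Group rows by the shard their routing value hashes to."""
    shards = {shard: [] for shard in range(NUM_SHARDS)}
    for row in rows:
        shards[shard_for(str(row[routing_column]))].append(row)
    return shards
```

Because the hash is stable, any node can compute where a row lives without asking a central coordinator, which is what makes the sharding transparent to the client.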
Index And Data
Inverted index
Term dictionary where field values point to
rows (posting list)
Field cache
“Inverted inverted index”, column names point
to the possible values and their rows
On disk, cached in memory
Immutable segments on disk, binary search in
each segment, cached with mmap() into ram
pages
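An inverted index can be sketched as a dictionary from terms to posting lists. The sample documents and the function names are assumptions made for this example; Lucene's on-disk format is considerably more involved.

```python
from collections import defaultdict

documents = {
    1: "distributed sql database",
    2: "sql for machine data",
    3: "distributed search engine",
}


def build_inverted_index(docs):
    """Map each term to the sorted list of row ids containing it (posting list)."""
    postings = defaultdict(set)
    for row_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(row_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search_all(index, *terms):
    """Return the rows that contain every term by intersecting posting lists."""
    lists = [set(index.get(term, [])) for term in terms]
    return sorted(set.intersection(*lists)) if lists else []
```

A query for several terms then becomes an intersection of posting lists instead of a scan over every row.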
Example Posting List
Index And Data
Shards
Compounds of multiple immutable segments,
merged occasionally
Rows are documents, columns are fields
Vector space model to weight and score
searches (_score field)
Multi-threaded index access
Shards are multiple segments, each is read
with a thread
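The idea of a shard as a set of immutable segments that are merged occasionally can be sketched as below. The Shard class is hypothetical, and letting newer segments shadow older ones is a simplification made for this example rather than a description of how Lucene handles updates and deletes.

```python
import bisect


class Shard:
    """A shard as a list of immutable, sorted segments."""

    def __init__(self):
        self.segments = []  # each segment is a tuple of sorted (key, value) pairs

    def flush(self, rows):
        """Write a batch as a new segment; existing segments are never modified."""
        self.segments.append(tuple(sorted(rows.items())))

    def get(self, key):
        """Search the newest segment first so later writes shadow earlier ones."""
        for segment in reversed(self.segments):
            index = bisect.bisect_left(segment, (key,))
            if index < len(segment) and segment[index][0] == key:
                return segment[index][1]
        return None

    def merge(self):
        """Combine all segments into one, keeping the newest value per key."""
        merged = {}
        for segment in self.segments:  # oldest to newest
            merged.update(segment)
        self.segments = [tuple(sorted(merged.items()))]
```

Each lookup is a binary search per segment, so merging keeps the number of searches per read small.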
Replication And
High Availability
Shared nothing architecture
Every node handles every task
Shard-based
Replicas are copies of shards that are
distributed in the cluster evenly
Consistency
Elected leader maintains and distributes a
consistent cluster state
CAP
Tunable consistency with synchronous inserts
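Even distribution of shard copies can be sketched with a round-robin placement. The function place_shards and the node names are hypothetical; the only rule encoded here is that two copies of the same shard never land on the same node.

```python
def place_shards(nodes, num_shards, num_replicas):
    """Assign shard copies round-robin; copies of one shard never share a node."""
    if num_replicas + 1 > len(nodes):
        raise ValueError("need at least one node per shard copy")
    placement = {node: [] for node in nodes}
    for shard in range(num_shards):
        for copy in range(num_replicas + 1):  # copy 0 is the primary
            node = nodes[(shard + copy) % len(nodes)]
            role = "primary" if copy == 0 else "replica"
            placement[node].append((shard, role))
    return placement
```

With three nodes, three shards and one replica each, every node ends up holding two shard copies, so losing a single node leaves a copy of every shard available.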
Discussion
PostgreSQL: Strengths
Single-Node-Performance
Predictable and fast
SQL Sophistication
Lots of features, many of them heavily
optimized
Transactions
ACID compliance, concurrency control
PostgreSQL: Weaknesses
Distribution
High availability or working with huge data sets
requires 3rd party software, partitioning
Ingest speed
ACID compliance slows down inserts
Operational Complexity/DevOps Readiness
Highly controllable features make it hard to
manage
Schema Flexibility
Schema evolution management required
CrateDB: Strengths
Distribution
Distributed by nature, with tunable consistency
Ingest speed
Solid insert speeds with bulk inserts
Operational Complexity/DevOps Readiness
High flexibility, containerization, sane defaults
Schema Flexibility
Schema evolution on the fly
Built-in Search
Fulltext capabilities
CrateDB: Weaknesses
Single-Node-Performance
Distribution overhead requires a certain cluster
size to be efficient
SQL Features
Many features are still missing or hard to
implement in a distributed system
Transactions
No ACID compliance, eventual
consistency/optimistic concurrency requires
client-side handling
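Client-side handling of optimistic concurrency usually means reading a row together with its version, writing only if the version is unchanged, and retrying on conflict. The sketch below uses an in-memory Table with a per-row version counter; the class, the exception and the retry count are assumptions for illustration, not a database client API.

```python
class VersionConflict(Exception):
    """Raised when a row changed since it was read."""


class Table:
    """In-memory stand-in for a table whose rows carry a version counter."""

    def __init__(self):
        self.rows = {}  # key -> (value, version)

    def read(self, key):
        return self.rows.get(key, (None, 0))

    def write(self, key, value, expected_version):
        """Apply the write only if nobody updated the row in the meantime."""
        _, current_version = self.read(key)
        if current_version != expected_version:
            raise VersionConflict(key)
        self.rows[key] = (value, current_version + 1)


def increment(table, key, retries=3):
    """Client-side handling: re-read and retry when the version check fails."""
    for _ in range(retries):
        value, version = table.read(key)
        try:
            table.write(key, (value or 0) + 1, version)
            return True
        except VersionConflict:
            continue
    return False
```

No locks are held between the read and the write, so the cost of a conflict moves from the database to the client, which has to decide how often to retry.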
Use Cases
Use Cases: PostgreSQL
ORMs
Broad integration in various object-relational
mappers in frameworks (Hibernate, …)
Transaction-based workloads
Single, high-value transactions
Extensive SQL compliance
Required support for views, stored procedures,
…
Small data sets
Hundreds of MBs to several GB
Use Cases: CrateDB
DevOps
Flexible schemas, ad-hoc queries, easy
maintenance
Analytics, machine learning
Large scale inserts/queries, high concurrency,
SQL
Fulltext search
Built-in tools for text-mining/analysis, built on
the de-facto standard of search
Thanks!
Links
https://github.com/crate
https://crate.io
Follow us on Twitter
@crateio @claus__m
Next webinar: Scale your SQL database
with Docker, 27th July
Q & A
