Ultra-Scalable Transactional Management
Ricardo Jimenez-Peris
LeanXcale CEO & Founder
Solved how to scale transactions to large scale (i.e. 100 million update transactions per second) in a fully seamless way.
Breakthrough result of 15+ years of research by a tenacious team.
Ultra-Scalable Transactions
• Scalability & Performance: scale out linearly from 1 to 100s of nodes.
• Full SQL: simple and powerful queries.
• Full ACID: always consistent.
• Patented Ultra-Scalable Transactional Management: scale to millions of update transactions per second.
• Ultra-Efficient Storage Engine: runs efficiently on today's multi-core and NUMA hardware.
Evaluation of the transactional manager alone, without data manager or logging, on 16 nodes with 12 cores and 128 GB each: 2.35 million transactions per second.
Scalability
[Figure: operations/second and throughput (txn/second) versus number of cores]
Yahoo Cloud Serving Benchmark (YCSB)
Linear Scalability
TPC-C Results
[Figure: throughput (tpmC) and latency (ms) versus number of worker nodes]
Operational data is copied (ETL) into a separate store for analytical queries:
× ETL accounts for 80% of the cost of business analytics.
× Analytical queries run on obsolete data.
Current Landscape at Enterprises
Solution: Blending Two Worlds
Real Time Analytics: Analytical Queries
over Operational Data
Operational database (OLTP) + data warehouse (OLAP)
LeanXcale: OLTP + OLAP
Dealing with the Polyglot World
× Data silos: no queries across SQL & NoSQL data stores.
Solution: Queries across SQL & NoSQL
Queries across SQL & NoSQL data stores:
• SQL, Neo4j, MongoDB, HBase.
• Subqueries are written in the native query language/API, giving the full power of the underlying data stores.
• Result sets of subqueries are exposed as temporary SQL tables.
• The integration query is written in simple SQL.
What is the Magic?
The transactional management provides ultra-scalability
Fully transparent:
• No sharding.
• No a priori knowledge required about the rows to be accessed.
• Syntactically: no changes required in the application.
• Semantically: equivalent behavior to a centralized system.
Provides Snapshot Isolation (the isolation level provided by Oracle when set to “Serializable” isolation)
Transactional Processing
Ultra-Scalable Transactions
LeanXcale processes and commits transactions in parallel while providing a consistent view.
vs.
Traditional transactional databases have a single-node bottleneck.
Snapshot isolation splits atomicity into two points: one at the start of the transaction, where all reads happen, and one at the end of the transaction, where all writes happen.
Serializability provides a fully atomic view of a transaction: reads and writes happen atomically at a single point in time.
Snapshot Isolation vs. Serializability
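The practical difference between the two levels can be illustrated with the classic write-skew anomaly. The sketch below is a toy model, not LeanXcale code: the on-call table, the rule "at least one doctor stays on call", and the function names are assumptions made for illustration. Under snapshot isolation both transactions read the same start snapshot and write disjoint keys, so no write-write conflict is detected and both commit; a serial execution gives a different result.

```python
def run_snapshot_isolation(db):
    """Two concurrent transactions: reads at the start point, writes at the end point."""
    snapshot = dict(db)  # both transactions read the same start snapshot
    writes_t1, writes_t2 = {}, {}
    # T1: Alice goes off call if two doctors are on call in her snapshot.
    if snapshot["alice"] + snapshot["bob"] >= 2:
        writes_t1["alice"] = 0
    # T2: Bob goes off call if two doctors are on call in his snapshot.
    if snapshot["alice"] + snapshot["bob"] >= 2:
        writes_t2["bob"] = 0
    # The write sets are disjoint, so there is no write-write conflict.
    assert not (writes_t1.keys() & writes_t2.keys())
    result = dict(db)
    result.update(writes_t1)  # writes become visible at commit
    result.update(writes_t2)
    return result


def run_serial(db):
    """The same two transactions executed one after the other."""
    result = dict(db)
    if result["alice"] + result["bob"] >= 2:
        result["alice"] = 0
    if result["alice"] + result["bob"] >= 2:
        result["bob"] = 0
    return result


on_call = {"alice": 1, "bob": 1}
print(run_snapshot_isolation(on_call))  # nobody is left on call
print(run_serial(on_call))              # Bob stays on call
```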
Traditional Approach: Centralized Transaction Manager
A central TM handles atomicity, isolation (reads and writes), durability, and consistency on one node.
Single-Node Bottleneck
Scaling ACID Properties
Each property is handled separately: atomicity, isolation of reads, isolation of writes, and durability.
Components: local TMs, conflict managers, loggers, commit sequencer, snapshot server.
Separation of commit from the visibility of committed data
Proactive pre-assignment of commit timestamps to committing
transactions
Transactions can commit in parallel because:
• They do not conflict.
• Their commit timestamp is already assigned and determines their serialization order.
• Visibility is regulated separately to guarantee that reads see fully consistent states.
Detection and resolution of conflicts before commit
Main Principles
Transactional Life Cycle: Start
The local transaction manager gets the “start TS” from the snapshot server.
Transactional Life Cycle: Execution
Sequence: get start TS, run on the start TS snapshot.
The transaction reads the state as of the “start TS”.
Write-write conflicts are detected by conflict managers on the fly.
Transactional Life Cycle: Commit
Sequence: get start TS, run on the start TS snapshot, commit.
The local transaction manager orchestrates the commit.
Commit steps orchestrated by the local transaction manager:
• Get the commit TS from the commit sequencer.
• Log the writeset, with its commit TS, at the logger.
• Make the updates public at the data store (writeset with its commit TS).
• Report to the snapshot server.
Snapshot Server
The snapshot server keeps track of the most recent consistent snapshot:
• Its TS must be such that every earlier commit TS is either durable and readable or has been discarded.
• That is, it keeps the longest prefix of used/discarded TSs that has no gaps.
The snapshot server:
• Gets reports of durable & readable TSs.
• Gets reports of discarded TSs.
• Keeps track of and reports the most recent consistent TS.
In this way, transactions can commit in parallel while consistency is preserved.
Example at the snapshot server (time runs left to right):
Sequence of timestamps received by the snapshot server:   11 15 12 14 13
Evolution of the current snapshot at the snapshot server: 11 11 12 12 15
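The gap-free prefix rule can be sketched in a few lines. This is a minimal illustration, not the LeanXcale implementation: the class and method names are hypothetical, timestamps are assumed to be consecutive integers, and the snapshot is assumed to stand at 10 before the first report arrives.

```python
class SnapshotServer:
    """Tracks the longest gap-free prefix of reported timestamps (sketch)."""

    def __init__(self, last_consistent_ts):
        self.snapshot = last_consistent_ts  # most recent consistent TS
        self.pending = set()                # reported TSs beyond a gap

    def report(self, ts):
        """Record ts as durable & readable (or discarded); return the snapshot."""
        self.pending.add(ts)
        # Advance while the next consecutive timestamp has been reported.
        while self.snapshot + 1 in self.pending:
            self.snapshot += 1
            self.pending.remove(self.snapshot)
        return self.snapshot


server = SnapshotServer(last_consistent_ts=10)
evolution = [server.report(ts) for ts in [11, 15, 12, 14, 13]]
print(evolution)  # [11, 11, 12, 12, 15]
```

Reports 15 and 14 do not move the snapshot because 13 is still missing; once 13 arrives the snapshot jumps directly to 15.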
There can be as many conflict managers as needed; they scale in the same way as hash-based key-value data stores.
Each conflict manager takes care of a set of keys.
Because concurrency control is done at the conflict managers, which are far fewer than the data managers, batching is much more effective.
With TPC-C, there are 20 query engine/region server nodes per node devoted to concurrency management (a 20-to-1 ratio), which makes batching 20 times more efficient.
Conflict Managers
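A minimal sketch of hash-partitioned write-write conflict detection follows. The slides do not give the conflict rule or any API, so everything here is an assumption: the class names, the CRC32-based routing, and a "first writer wins" rule in which a write conflicts if another uncommitted transaction already wrote the key, or if the key was committed after the writer's start TS.

```python
import zlib


class ConflictManager:
    """Tracks writes for the keys routed to it (hypothetical sketch)."""

    def __init__(self):
        self.last_commit_ts = {}  # key -> commit TS of the latest committed write
        self.in_flight = {}       # key -> id of the uncommitted txn that wrote it

    def check_write(self, txn_id, start_ts, key):
        """Return True if txn_id may write key, False on a write-write conflict."""
        owner = self.in_flight.get(key)
        if owner is not None and owner != txn_id:
            return False  # a concurrent uncommitted txn already wrote this key
        if self.last_commit_ts.get(key, 0) > start_ts:
            return False  # the key was committed after this txn's snapshot
        self.in_flight[key] = txn_id
        return True

    def finish(self, txn_id, key, commit_ts=None):
        """Release key; record commit_ts if the txn committed (None = aborted)."""
        if self.in_flight.get(key) == txn_id:
            del self.in_flight[key]
            if commit_ts is not None:
                self.last_commit_ts[key] = commit_ts


class ConflictManagerPool:
    """Routes each key to one conflict manager by hashing the key."""

    def __init__(self, count):
        self.managers = [ConflictManager() for _ in range(count)]

    def manager_for(self, key):
        index = zlib.crc32(key.encode("utf-8")) % len(self.managers)
        return self.managers[index]

    def check_write(self, txn_id, start_ts, key):
        return self.manager_for(key).check_write(txn_id, start_ts, key)

    def finish(self, txn_id, keys, commit_ts=None):
        for key in keys:
            self.manager_for(key).finish(txn_id, key, commit_ts)


pool = ConflictManagerPool(4)
print(pool.check_write("t1", 5, "x"))  # True: first writer of x
print(pool.check_write("t2", 5, "x"))  # False: conflicts with t1
```

Because each key always hashes to the same manager, no coordination between managers is needed to decide a conflict on that key.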
Each logger takes care of a fraction of the log records.
Loggers log in parallel and are uncoordinated.
There can be as many loggers as needed to provide the IO bandwidth necessary to log the rate of updates.
Loggers can be replicated. If this is the case, durability can be configured as:
• In the memory of a majority of logger replicas (replicated memory durability).
• In the persistent storage of one logger replica (1-safe durability).
• In the persistent storage of a majority of logger replicas (n-safe durability).
The client gets the commit reply after the writeset is durable (according to the configured durability).
Loggers
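The three durability settings can be expressed as a small predicate over the state of the logger replicas. This is only a sketch: the mode strings and the (in_memory, on_disk) representation of a replica are hypothetical and not taken from LeanXcale.

```python
def is_durable(mode, replicas):
    """Decide whether a writeset counts as durable under the given mode.

    replicas: one (in_memory, on_disk) pair of booleans per logger replica.
    """
    majority = len(replicas) // 2 + 1
    in_memory = sum(1 for mem, _ in replicas if mem)
    on_disk = sum(1 for _, disk in replicas if disk)
    if mode == "replicated-memory":
        return in_memory >= majority  # memory of a majority of replicas
    if mode == "1-safe":
        return on_disk >= 1           # persistent storage of one replica
    if mode == "n-safe":
        return on_disk >= majority    # persistent storage of a majority
    raise ValueError(f"unknown durability mode: {mode}")


# Three replicas: one has persisted the writeset, one holds it in memory only.
state = [(True, True), (True, False), (False, False)]
for mode in ("replicated-memory", "1-safe", "n-safe"):
    print(mode, is_durable(mode, state))
```

In this state the commit reply could already be sent under the first two settings, while n-safe durability still has to wait for a second replica to persist the writeset.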
The approach described so far is the original, reactive one. It results in multiple messages per update transaction.
The adopted approach is proactive:
• The local transaction managers periodically report the number of committed update transactions per second.
• The commit sequencer distributes batches of commit timestamps to the local transaction managers.
• The snapshot server periodically gets batches of timestamps (both used and discarded) from the local transaction managers.
• The snapshot server periodically reports the most current consistent snapshot to the local transaction managers.
Increasing efficiency
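The batching of commit timestamps can be sketched as follows. The class names, the fixed batch size, and the shape of the report are assumptions for illustration; the slides say only that batches are distributed and that used and discarded timestamps are reported periodically, not how batch sizes are derived from the reported commit rates.

```python
class CommitSequencer:
    """Hands out disjoint batches of consecutive commit timestamps (sketch)."""

    def __init__(self, first_ts=1):
        self.next_ts = first_ts

    def allocate(self, size):
        batch = range(self.next_ts, self.next_ts + size)
        self.next_ts += size
        return batch


class LocalTxnManager:
    """Consumes pre-assigned timestamps without a message per commit."""

    def __init__(self, sequencer, batch_size):
        self.sequencer = sequencer
        self.batch_size = batch_size  # assumption: fixed size per manager
        self.available = []
        self.used = []

    def next_commit_ts(self):
        if not self.available:
            self.available = list(self.sequencer.allocate(self.batch_size))
        ts = self.available.pop(0)
        self.used.append(ts)
        return ts

    def flush_report(self):
        """End of a period: report used TSs and discard the unused ones."""
        report = {"used": self.used, "discarded": self.available}
        self.used, self.available = [], []
        return report


sequencer = CommitSequencer()
tm_a = LocalTxnManager(sequencer, batch_size=3)
tm_b = LocalTxnManager(sequencer, batch_size=3)
print(tm_a.next_commit_ts(), tm_b.next_commit_ts(), tm_a.next_commit_ts())  # 1 4 2
print(tm_a.flush_report())  # {'used': [1, 2], 'discarded': [3]}
print(tm_b.flush_report())  # {'used': [4], 'discarded': [5, 6]}
```

Discarded timestamps matter because the snapshot server can only advance over a prefix without gaps, so unused timestamps must be reported as well.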
SQL processing is performed at the SQL engine tier.
A SQL engine instance:
• Transforms SQL code into a query plan.
• Optimizes the query plan according to the collected statistics (e.g. cardinality of keys).
• Orchestrates the query plan execution on top of the distributed data store.
• Returns the result of the SQL execution to the client.
• Keeps the statistics in the data store up to date.
The SQL engine was obtained by forking the query engine of Apache Derby (same SQL dialect as DB2).
The scan operators have been modified to access KiVi instead of local storage.
The metadata is stored in KiVi instead of local storage.
Increasing efficiency
KiVi Key-Value Data Store
OLTP & OLAP Query Engine
Ultra-Scalable Transactions
LeanXcale Layers
SQL is translated into a query plan represented as a tree of algebraic operators. Algebraic operators are written in Java plus bytecode.
SELECT
  s.id, s.location
FROM
  Store s
  INNER JOIN Catalog c ON s.id=c.s_id
  INNER JOIN Widget w ON c.w_id=w.id
WHERE
  s.location='Rome' AND w.color='red'
Query plan: the selection σ location = 'Rome' and color = 'red' is applied on top of the joins Store ⋈ (s_id = id) Catalog ⋈ (id = w_id) Widget.
At the leaves of the query plan there are scan operators with predicate filtering, aggregation, grouping, and sorting capabilities. They have been rewritten to access KiVi instead of local storage. They make it possible to push down all algebraic operators below a join.
Query Engine
Selection Push Down
Select *
from Store s, Inventory i, Catalog c
where i.cat_id = c.id
and s.inv_id = i.id
and s.location = 'Rome'
and c.color = 'red'
Before: the selection location = 'Rome' and color = 'red' is applied on top of the joins (inv_id = id, cat_id = id) over Store, Inventory, and Catalog.
After: selections are pushed down to the scans: location = 'Rome' on Store and color = 'red' on Catalog.
Aggregation Push Down
select sum(i.units)
from inventory i
Without push down: the query engine instance performs the global aggregation of units. All values travel from the data engine instances (each holding part of Inventory) to the query engine.
With push down: each data engine instance performs a local aggregation of units over its part of Inventory, and the query engine instance performs the global aggregation. A single value travels from each data engine instance to the query engine.
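The effect of pushing the aggregation down can be shown with a toy model in which each data engine instance is a plain list of rows. The function names and the sample data are made up for illustration; the second element of each result counts how many values travel to the query engine.

```python
def local_sum(partition):
    """Local aggregation, run at one data engine instance."""
    return sum(row["units"] for row in partition)


def sum_without_pushdown(partitions):
    """Every value is shipped; the query engine does all the work."""
    shipped = [row["units"] for partition in partitions for row in partition]
    return sum(shipped), len(shipped)


def sum_with_pushdown(partitions):
    """Each instance ships one partial sum; the query engine adds them up."""
    partials = [local_sum(partition) for partition in partitions]
    return sum(partials), len(partials)


inventory = [
    [{"units": 3}, {"units": 4}],  # data engine instance 1
    [{"units": 5}],                # data engine instance 2
]
print(sum_without_pushdown(inventory))  # (12, 3)
print(sum_with_pushdown(inventory))     # (12, 2)
```

The result is the same in both cases because sum is decomposable: the sum of partial sums equals the sum of all values.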
[Diagram: two query engine instances, each with a local projection over Store, Catalog, and Widget, connected by a redistribution step]
OLAP Query Engine
LeanXcale’s
Distributed Storage Engine
• KiVi is a fully ACID and highly efficient relational key-value data store.
• Unlike existing key-value data stores, it has a schema.
• It implements a novel data structure that combines the advantages of B+ trees for range queries with those of LSM-trees for random updates and inserts.
Disruptive Innovations
Ultra-Scalable Operational Database:
• Scales from 1 to 100s of nodes, up to millions of transactions per second.
• Full ACID & Full SQL.
• Standard JDBC Driver.
Analytical Queries over Operational Data:
• Distributed Data Warehouse working over Operational Data.
• Real-Time Analytical Queries.
Ultra-Efficient Storage Engine:
• Designed to work efficiently in multi-core and many-core HW.
• Ultra-NUMA efficient.
KiVi has a radically new architecture:
• The NUMA architecture is exploited with a shared-nothing approach.
• A very high level of efficiency is achieved by avoiding multi-threading and its associated costs (context switches, thread synchronization).
• There are no NUMA remote accesses.
KiVi exploits the vector and SIMD capabilities of current commodity server hardware, making it possible to process tens of items with a single instruction.
KiVi is also columnar, which gives columnar acceleration for analytical queries.
KiVi Efficiency
Disruptive Innovations
Online Aggregations:
• Removes hotspots by computing aggregates within transactions without conflicts.
• Aggregate analytical queries become costless single-row queries.
Continuous Dynamic Load Balancing:
• Multi-resource dynamic load balancing.
• Minimizes footprint for any workload.
• Maximizes performance.
Non-Intrusive Elasticity:
• Grows and shrinks the cluster size as needed to process the incoming load.
• Minimizes operational costs by reducing HW resources to actual needs (in cloud, on premise).
Another key feature of KiVi is its ability to perform dynamic reconfiguration actions without stopping processing, i.e., a data region can be moved while transactions are updating and reading its data.
This is possible thanks to the properties of the transaction manager, which make it possible to move a data region in any partial state and re-apply updates at the target node with idempotence guarantees, so that each update is applied exactly once.
KiVi Dynamic Reconfiguration
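One way to obtain such an idempotence guarantee is to identify every update by its key and commit timestamp and to skip identifiers already applied. The slides do not describe the actual mechanism, so the sketch below is an assumption throughout: the Region class, its methods, and the use of (key, commit TS) as the update identifier are hypothetical.

```python
class Region:
    """A data region that applies each (key, commit TS) update at most once."""

    def __init__(self):
        self.values = {}
        self.applied = set()  # (key, commit_ts) pairs already applied

    def apply_delta(self, key, commit_ts, delta):
        """Apply an increment; return False if it had been applied before."""
        if (key, commit_ts) in self.applied:
            return False  # duplicate delivery: ignore
        self.values[key] = self.values.get(key, 0) + delta
        self.applied.add((key, commit_ts))
        return True

    def copy_partial_state_to(self, target):
        """Move whatever state exists right now to the target node."""
        target.values = dict(self.values)
        target.applied = set(self.applied)


updates = [("stock", 11, +5), ("stock", 12, -2), ("stock", 13, +4)]

source, target = Region(), Region()
for update in updates[:2]:
    source.apply_delta(*update)
source.copy_partial_state_to(target)  # region moved in a partial state
source.apply_delta(*updates[2])       # an update arrives during the move

# All updates are sent again to the target; only the missing one takes effect.
results = [target.apply_delta(*update) for update in updates]
print(results, target.values)  # [False, False, True] {'stock': 7}
```

Without the duplicate check, re-sending the first two increments would double-count them at the target.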
The problem of multi-resource load balancing is NP-Hard.
We have conceived a greedy algorithm that computes solutions
close to the optimal in an affordable time.
It is a novel multi-resource algorithm that considers all resources
in a way proportional to their scarcity.
KiVi Dynamic Load Balancing
When the average utilization of a resource (e.g. CPU) is above a predefined threshold, a new node is provisioned and the load balancing algorithm then takes care of moving regions to balance the load across the set of nodes.
Similarly, when the average utilization is low enough that a node can be decommissioned while keeping the average utilization below a predefined threshold, the load of one of the nodes is distributed to the rest of the nodes (again using the load balancing algorithm) and the node is decommissioned.
KiVi Elasticity
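The two threshold rules can be written as a small decision function. This is a sketch for a single resource; the function name, the two separate threshold parameters, and the sample values are assumptions, since the slides say only that the thresholds are predefined.

```python
def scaling_decision(utilizations, provision_above, decommission_below):
    """Decide an elasticity action from per-node utilization of one resource.

    utilizations: one value per node, e.g. CPU utilization between 0.0 and 1.0.
    """
    nodes = len(utilizations)
    total = sum(utilizations)
    if total / nodes > provision_above:
        return "provision"
    # Average utilization if the same load ran on one node fewer.
    if nodes > 1 and total / (nodes - 1) < decommission_below:
        return "decommission"
    return "keep"


print(scaling_decision([0.9, 0.8, 0.85], 0.8, 0.6))  # provision
print(scaling_decision([0.3, 0.2, 0.25], 0.8, 0.6))  # decommission
print(scaling_decision([0.6, 0.7], 0.8, 0.6))        # keep
```

The decision only says whether the cluster size should change; moving regions to or from the affected node is left to the load balancing algorithm.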
Disruptive Innovations
Efficient for Range Queries & Random Updates:
• As efficient as B+ trees for range queries (used by relational DBs).
• As efficient as LSM-trees for random updates/inserts (used by key-value data stores).
Costless Multi Version Concurrency Control:
• Novel MVCC with almost zero overhead.
• Avoids stop-the-world obsolete version cleaning.
• Avoids resource waste for multi-versioning.
Efficient High Availability:
• Active-Active (Synchronous) Replication without Contention and without Synchronization Overhead.
It provides active-active replication (multi-master) without
hampering performance:
• Novel replication approach avoiding the redundancy for attaining
atomicity at the transactional level and the replication level.
It will provide geo-replication without any penalty in throughput:
• Novel geo-replication algorithm that streams the logs in parallel to
the backup data center.
• The backup data center can run read-only transactions in a fully local
way.
• The backup data center can run update transactions remotely at the
primary data center.
KiVi High Availability
Acknowledgements
LeanXcale R&D has been and is partially supported by different funding sources including the
European Union’s FP7 and H2020 research and innovation programme under grants
BigDataStack (779747), CloudDBAppliance (732051), Vineyard (687628), CoherentPaaS
(611068), LeanBigData (619606), CrowdHealth (727560), Cybele (825355), Infinitech.
LeanXcale has been partially funded by Spanish CDTI and Spanish Ministry of Economy and Competitiveness in the NEOTEC
program under grant SNEO-20151285.
Ricardo Jimenez-Peris
LeanXcale CEO & Co-Founder
info@leanxcale.com
www.LeanXcale.com
@LeanXcale
https://www.linkedin.com/company/leanxcale
https://www.facebook.com/leanxcale/

How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 

Kürzlich hochgeladen (20)

Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 

LeanXcale Presentation - Waterloo University

  • 2. Ultra-Scalable Transactions: Solved how to scale transactions to large scale (i.e. 100 million update transactions per second) in a fully seamless way. Breakthrough result of 15+ years of research by a tenacious team.
  • 3. Scalability & Performance: Scale out linearly from 1 to 100s of nodes. • Full SQL: simple and powerful queries. • Full ACID: always consistent. • Patented Ultra-Scalable Transactional Management: scale to millions of update transactions per second. • Ultra-Efficient Storage Engine: run efficiently on today’s multi-core and NUMA hardware.
  • 4. Scalability: 2.35 million transactions per second. Evaluation of the transactional manager only (without data manager or logging), on 16 nodes with 12 cores and 128 GB each.
  • 5. [Figure: Yahoo Cloud Serving Benchmark (YCSB), linear scalability. Axes: operations/second and throughput (txn/second) versus number of cores.]
  • 6. [Figure: TPC-C results. Axes: throughput (tpmC) and latency (ms) versus number of worker nodes.]
  • 7. Current Landscape at Enterprises: Operational data in the operational database (OLTP) goes through a copy process (ETL) into the data warehouse (OLAP), where analytical queries run. × Costs of ETLs represent 80% of business analytics. × Analytical queries run on obsolete data. Solution: Blending Two Worlds. Real-time analytics: analytical queries over operational data. LeanXcale: OLTP + OLAP.
  • 8. Dealing with the Polyglot World: × Data silos: lack of queries across SQL & NoSQL data stores. Solution: queries across SQL & NoSQL data stores: • SQL, Neo4J, MongoDB, HBase. • Subqueries are written in the native query language/API, keeping the full power of the underlying data stores. • Result sets of subqueries are exposed as temporary SQL tables. • The integration query is written in simple SQL.
  • 9. What is the Magic?
  • 10. Transactional Processing: The transactional management provides ultra-scalability. Fully transparent: • No sharding. • No a priori knowledge required about the rows to be accessed. • Syntactically: no changes required in the application. • Semantically: equivalent behavior to a centralized system. Provides Snapshot Isolation (the isolation level provided by Oracle when set to “Serializable” isolation).
  • 11. Ultra-Scalable Transactions: LeanXcale vs. traditional transactional DB. LeanXcale processes and commits transactions in parallel and provides a consistent view. Traditional systems have a single-node bottleneck.
  • 12. Snapshot Isolation vs. Serializability: Snapshot isolation splits atomicity into two points: one at the beginning of the transaction, where all reads happen, and one at the end of the transaction, where all writes happen. Serializability provides a fully atomic view of a transaction: reads and writes happen atomically at a single point in time.
  • 13. Traditional Approach: A centralized transaction manager (Central TM) is responsible for atomicity, consistency, isolation and durability. Single-node bottleneck.
  • 17. Main Principles: Separation of commit from the visibility of committed data. Proactive pre-assignment of commit timestamps to committing transactions. Detection and resolution of conflicts before commit. Transactions can commit in parallel because: • They do not conflict. • They already have their commit timestamp assigned, which determines their serialization order. • Visibility is regulated separately to guarantee that only fully consistent states are read.
  • 18. Transactional Life Cycle: Start. The local transaction manager gets the “start TS” from the snapshot server.
  • 19. Transactional Life Cycle: Execution. After getting the start TS, the transaction runs on the start TS snapshot: it reads the state as of “start TS”. Write-write conflicts are detected by conflict managers on the fly.
  • 20. Transactional Life Cycle: Commit. After getting the start TS and running on the start TS snapshot, the transaction commits. The local transaction manager orchestrates the commit.
  • 21. Transactional Life Cycle: Commit. [Diagram: the local transaction manager gets a commit TS from the commit sequencer, logs the writeset with its commit TS at the logger, makes the updates public at the data store (writeset and commit TS), and reports to the snapshot server.]
  • 22. Transactional Life Cycle: Commit (Snapshot Server). The snapshot server keeps track of and reports the most recent consistent TS. It gets reports of durable & readable TSs and reports of discarded TSs. It keeps track of the most recent snapshot that is consistent: • Its TS should be such that every previous commit TS is either durable and readable or has been discarded. • That is, it keeps the longest prefix of used/discarded TSs such that there are no gaps. In this way transactions can commit in parallel and consistency is preserved.
  • 23. Transactional Life Cycle: Commit. Sequence of timestamps received by the Snapshot Server over time: 11, 15, 12, 14, 13. Evolution of the current snapshot at the Snapshot Server: 11, 11, 12, 12, 15.
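
The gap-free prefix rule from slides 22 and 23 can be sketched in a few lines of Python. This is only an illustration: the class name `SnapshotServer`, the method `report`, and the initial snapshot value of 10 are assumptions made for the example, not LeanXcale's actual interfaces.

```python
class SnapshotServer:
    """Hypothetical sketch: tracks the longest gap-free prefix of reported TSs."""

    def __init__(self, initial_snapshot):
        self.snapshot = initial_snapshot  # most recent consistent TS
        self._pending = set()             # reported TSs still behind a gap

    def report(self, ts):
        """Record ts as durable & readable (or discarded); return the snapshot."""
        if ts > self.snapshot:
            self._pending.add(ts)
        # Advance while the next timestamp has already been reported.
        while self.snapshot + 1 in self._pending:
            self.snapshot += 1
            self._pending.remove(self.snapshot)
        return self.snapshot


# Assumption: the snapshot stands at 10 before the sequence on slide 23 arrives.
server = SnapshotServer(initial_snapshot=10)
evolution = [server.report(ts) for ts in (11, 15, 12, 14, 13)]
print(evolution)  # [11, 11, 12, 12, 15]
```

Timestamps 15 and 14 arrive early and stay invisible until 13 closes the gap, which is why the snapshot jumps from 12 straight to 15.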
  • 24. Conflict Managers: Each conflict manager takes care of a set of keys. There can be as many conflict managers as needed; they scale in the same way as hashing-based key-value data stores. Because concurrency control is done at the conflict managers, which are far fewer than the data managers, batching is much more effective. With TPC-C, the ratio between query engine/region server nodes and nodes devoted to concurrency management is 20 to 1 (resulting in 20 times more efficient batching).
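
A minimal Python sketch of hash-partitioned conflict managers detecting write-write conflicts under snapshot isolation follows. The names (`ConflictManager`, `try_write`, `finish`, `manager_for`) and the first-writer-wins policy are assumptions for illustration; the slides do not specify how conflicts are resolved.

```python
import zlib


class ConflictManager:
    """Hypothetical sketch: write-write conflict detection for a set of keys."""

    def __init__(self):
        self.last_commit_ts = {}  # key -> commit TS of the latest committed write
        self.active_writer = {}   # key -> id of the in-flight transaction writing it

    def try_write(self, txn_id, start_ts, key):
        """Return True if txn_id may write key, False on a write-write conflict."""
        owner = self.active_writer.get(key)
        if owner is not None and owner != txn_id:
            return False  # a concurrent, uncommitted transaction holds the key
        if self.last_commit_ts.get(key, 0) > start_ts:
            return False  # key was overwritten after this transaction's snapshot
        self.active_writer[key] = txn_id
        return True

    def finish(self, txn_id, key, commit_ts=None):
        """Release key; commit_ts=None means the transaction aborted."""
        if self.active_writer.get(key) == txn_id:
            del self.active_writer[key]
            if commit_ts is not None:
                self.last_commit_ts[key] = commit_ts


def manager_for(key, managers):
    """Route a key to its conflict manager by hashing (assumed scheme)."""
    return managers[zlib.crc32(key.encode()) % len(managers)]


managers = [ConflictManager() for _ in range(4)]
cm = manager_for("account:42", managers)
ok_t1 = cm.try_write("t1", 10, "account:42")  # first writer proceeds
ok_t2 = cm.try_write("t2", 10, "account:42")  # concurrent writer is rejected
cm.finish("t1", "account:42", commit_ts=11)
ok_t3 = cm.try_write("t3", 10, "account:42")  # snapshot 10 predates commit 11
ok_t4 = cm.try_write("t4", 11, "account:42")  # snapshot 11 already sees commit 11
print(ok_t1, ok_t2, ok_t3, ok_t4)
```

Because each key maps to exactly one manager, adding managers spreads the keys without any coordination between managers.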
  • 25. Loggers: Each logger takes care of a fraction of the log records. Loggers log in parallel and are uncoordinated. There can be as many loggers as needed to provide the IO bandwidth necessary to log the rate of updates. Loggers can be replicated. If this is the case, durability can be configured as: • To be in the memory of a majority of logger replicas (replicated memory durability). • To be in persistent storage of one logger replica (1-safe durability). • To be in persistent storage of a majority of logger replicas (n-safe durability). The client gets the commit reply after the writeset is durable (with respect to the configured durability).
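
The three durability settings on slide 25 can be read as a predicate over replica acknowledgements. The sketch below is an assumption-laden illustration: the enum `Durability`, the function `is_durable`, and its parameters are hypothetical names, not a LeanXcale API.

```python
from enum import Enum


class Durability(Enum):
    REPLICATED_MEMORY = "replicated memory"  # in memory of a majority of replicas
    ONE_SAFE = "1-safe"                      # on persistent storage of one replica
    N_SAFE = "n-safe"                        # on persistent storage of a majority


def is_durable(mode, replicas, in_memory, on_disk):
    """Return True once the writeset meets the configured durability.

    replicas  -- number of logger replicas
    in_memory -- replicas that hold the writeset in memory
    on_disk   -- replicas that have written it to persistent storage
    """
    majority = replicas // 2 + 1
    if mode is Durability.REPLICATED_MEMORY:
        return in_memory >= majority
    if mode is Durability.ONE_SAFE:
        return on_disk >= 1
    if mode is Durability.N_SAFE:
        return on_disk >= majority
    raise ValueError(f"unknown durability mode: {mode}")


print(is_durable(Durability.REPLICATED_MEMORY, replicas=3, in_memory=2, on_disk=0))
print(is_durable(Durability.N_SAFE, replicas=3, in_memory=3, on_disk=1))
```

In this reading, the commit reply is sent to the client at the moment the predicate first becomes true for the transaction's writeset.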
  • 26. Increasing efficiency: The approach described so far is the original reactive approach. It results in multiple messages per update transaction. The adopted approach is proactive: • The local transaction managers periodically report the number of committed update transactions per second. • The commit sequencer distributes batches of commit timestamps to the local transaction managers. • The snapshot server periodically gets batches of timestamps (both used and discarded) from the local transaction managers. • The snapshot server periodically reports the most current consistent snapshot to the local transaction managers.
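
The proactive scheme can be sketched as a sequencer that hands out ranges of commit timestamps, with each local transaction manager consuming its range locally and returning what it did not use. The class and method names below are hypothetical, and the fixed batch size is a simplification: per the slide, batches would be sized from the commit rates that the local managers report.

```python
class CommitSequencer:
    """Hypothetical sketch: hands out disjoint batches of commit timestamps."""

    def __init__(self, first_ts=1):
        self._next = first_ts

    def grant_batch(self, size):
        batch = range(self._next, self._next + size)
        self._next += size
        return batch


class LocalTxnManager:
    """Hypothetical sketch: commits locally using pre-assigned timestamps."""

    def __init__(self, sequencer, batch_size):
        self._sequencer = sequencer
        self._batch_size = batch_size  # assumed fixed here for simplicity
        self._available = iter(())
        self.used = []

    def next_commit_ts(self):
        ts = next(self._available, None)
        if ts is None:  # batch exhausted: one message fetches a whole new batch
            self._available = iter(self._sequencer.grant_batch(self._batch_size))
            ts = next(self._available)
        self.used.append(ts)
        return ts

    def discard_unused(self):
        """Give up leftover TSs so they can be reported as discarded."""
        leftover = list(self._available)
        self._available = iter(())
        return leftover


seq = CommitSequencer(first_ts=11)
a = LocalTxnManager(seq, batch_size=3)
b = LocalTxnManager(seq, batch_size=3)
a_ts = [a.next_commit_ts(), a.next_commit_ts()]
b_ts = [b.next_commit_ts()]
a_left, b_left = a.discard_unused(), b.discard_unused()
print(a_ts, b_ts, a_left, b_left)  # [11, 12] [14] [13] [15, 16]
```

Reporting the leftover timestamps as discarded matters: otherwise they would remain as gaps and the snapshot server could never advance past them.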
  • 27. Increasing efficiency: SQL processing is performed at the SQL engine tier. A SQL engine instance: • Transforms SQL code into a query plan. • Optimizes the query plan according to the collected statistics (e.g. cardinality of keys). • Orchestrates the query plan execution on top of the distributed data store. • Returns the result of the SQL execution to the client. • Keeps the statistics in the data store up to date. The SQL engine was obtained by forking the query engine of Apache Derby (same SQL dialect as DB2). The scan operators have been modified to access KiVi instead of local storage. The metadata is stored in KiVi instead of local storage.
  • 28. LeanXcale Layers: KiVi Key-Value Data Store; OLTP & OLAP Query Engine; Ultra-Scalable Transactions.
  • 29. Query Engine: SQL is translated into a query plan represented as a tree of algebraic operators. Algebraic operators are written in Java plus bytecode. At the leaves of the query plan there are scan operators that have predicate filtering, aggregation, grouping and sorting capabilities. They have been rewritten to access KiVi instead of local storage. They make it possible to push down all algebraic operators below a join. Example query: SELECT s.id, s.location FROM Store s INNER JOIN Catalog c ON s.id=c.s_id INNER JOIN Widget w ON c.w_id=w.id WHERE s.location='Rome' AND w.color='red'. [Query plan diagram: selection σ location = 'Rome' and color = 'red' over the joins s_id = id and id = w_id of Store, Catalog and Widget.]
  • 30. Selection Push Down: SELECT * FROM Store s, Inventory i, Catalog c WHERE i.cat_id = c.id AND s.inv_id = i.id AND s.location = 'Rome' AND c.color = 'red'. Selections are pushed down. [Query plan diagram: the original plan applies location = 'Rome' and color = 'red' above the joins cat_id = id and inv_id = id; after push down, σ location = 'Rome' (Store) and σ color = 'red' (Item) sit directly over the scans of Store, Inventory and Catalog. Nodes shown: Data Engine Instance 1, Data Engine Instance 2, Query Engine Instance.]
  • 31. Aggregation Push Down: select sum(i.units) from inventory i. Before push down, the global aggregation over units runs at the query engine instance, on top of the Inventory data held by data engine instances 1 and 2. All values travel from the data engine instances to the query engine.
  • 32. Aggregation Push Down: select sum(i.units) from inventory i. Each data engine instance computes a local aggregation over its part of Inventory (units at instance 1, units at instance 2), and the query engine instance computes the global aggregation. A single value travels from each data engine instance to the query engine.
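
The difference between slides 31 and 32 can be made concrete with a small Python sketch that counts how many values cross from the data engine instances to the query engine. The function names and the sample `inventory` partitions are invented for illustration.

```python
def sum_without_push_down(partitions):
    """All values travel to the query engine, which aggregates globally."""
    shipped = [units for part in partitions for units in part]
    return sum(shipped), len(shipped)


def sum_with_push_down(partitions):
    """Each data engine instance aggregates locally and ships a single value."""
    partials = [sum(part) for part in partitions]
    return sum(partials), len(partials)


# Hypothetical 'units' column of Inventory, split across two data engine instances.
inventory = [[5, 7, 3], [10, 2]]

total_plain, shipped_plain = sum_without_push_down(inventory)
total_pushed, shipped_pushed = sum_with_push_down(inventory)
print(total_plain, shipped_plain)    # 27 5
print(total_pushed, shipped_pushed)  # 27 2
```

Both plans return the same total; only the number of values sent over the network changes, from one per row to one per instance.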
  • 33. OLAP Query Engine: [Diagram: Query Engine Instance 1 and Query Engine Instance 2 each run the plan over Store, Catalog and Widget with local projection, and redistribution takes place between the instances.]
  • 34. LeanXcale’s Distributed Storage Engine: • KiVi is a fully ACID and highly efficient relational key-value data store. • Unlike existing key-value data stores, it has a schema. • It implements a novel data structure that combines the advantages of B+ trees for range queries with those of LSM trees for random updates and inserts.
  • 35. Disruptive Innovations. Ultra-Scalable Operational Database: • Scales from 1 to 100s of nodes to millions of transactions per second. • Full ACID & Full SQL. • Standard JDBC Driver. Analytical Queries over Operational Data: • Distributed Data Warehouse working over Operational Data. • Real-Time Analytical Queries. Ultra-Efficient Storage Engine: • Designed to work efficiently in multi-core and many-core HW. • Ultra-NUMA efficient.
  • 36. KiVi Efficiency: KiVi has a radically new architecture: • The NUMA architecture is exploited with a shared-nothing approach. • A very high level of efficiency is achieved by avoiding multi-threading and its associated costs (context switches, thread synchronization). • There are no NUMA remote accesses. KiVi exploits the vector and SIMD capabilities of current commodity server hardware, which makes it possible to process 10s of items with a single instruction. KiVi is also columnar, which gives columnar acceleration for analytical queries.
  • 37. Disruptive Innovations. Online Aggregations: • Removes hotspots by computing aggregates on a transaction without conflicts. • Aggregate analytical queries become costless single-row queries. Continuous Dynamic Load Balancing: • Multi-resource dynamic load balancing. • Minimizes footprint for any workload. • Maximizes performance. Non-Intrusive Elasticity: • Grows and shrinks the cluster size as needed to process the incoming load. • Minimizes operational costs by reducing HW resources to actual needs (in cloud, on premise).
  • 38. KiVi Dynamic Reconfiguration: Another key feature of KiVi is its ability to perform dynamic reconfiguration actions without stopping processing, i.e., a data region can be moved while transactions are updating and reading the data. This is possible thanks to the properties of the transaction manager, which make it possible to move a data region in any partial state and apply updates again at the target node with idempotence guarantees, so that each update is applied exactly once.
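
One way to picture the idempotence guarantee is a region that identifies every update by its key and commit timestamp, so replaying an update that already arrived with the moved state has no effect. This is an assumed mechanism for illustration only; the slides do not say how KiVi implements it, and `Region` and `apply` are hypothetical names.

```python
class Region:
    """Hypothetical sketch: a data region with idempotent update application."""

    def __init__(self):
        self.versions = {}  # (key, commit_ts) -> value

    def apply(self, key, commit_ts, value):
        """Apply an update once; replays of the same (key, commit_ts) are ignored."""
        if (key, commit_ts) in self.versions:
            return False
        self.versions[(key, commit_ts)] = value
        return True


# The region is moved in a partial state: update 11 was copied, update 12 was not.
target = Region()
target.apply("stock:7", 11, 40)

# Updates are then applied again at the target node from the log.
log = [("stock:7", 11, 40), ("stock:7", 12, 38)]
applied = [target.apply(key, ts, value) for key, ts, value in log]
print(applied)               # [False, True]
print(len(target.versions))  # 2
```

Whatever partial state the copy captured, replaying the log converges to the same set of versions.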
  • 39. KiVi Dynamic Load Balancing: The problem of multi-resource load balancing is NP-hard. We have conceived a greedy algorithm that computes solutions close to the optimal in an affordable time. It is a novel multi-resource algorithm that considers all resources in a way proportional to their scarcity.
  • 40. KiVi Elasticity: When the average utilization of a resource (e.g. CPU) is above a predefined threshold, a new node is provisioned and the load balancing algorithm then takes care of moving regions to balance the load across the set of nodes. Similarly, when the average utilization is low enough that a node can be decommissioned while keeping the average utilization below a predefined threshold, the load of one of the nodes is distributed to the rest of the nodes (again using the load balancing algorithm) and the node is decommissioned.
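
The elasticity rule on slide 40 reduces to two threshold checks on average utilization. In the sketch below, the function name `elasticity_action` and the threshold values 0.8 and 0.6 are assumptions chosen for the example; the slides only say that the thresholds are predefined.

```python
def elasticity_action(cpu_utilization, upper=0.8, lower=0.6):
    """Decide whether to grow or shrink the cluster (hypothetical sketch).

    cpu_utilization -- per-node utilization of one resource, each in [0, 1]
    upper           -- assumed threshold above which a node is provisioned
    lower           -- assumed threshold the average must stay below after
                       removing one node
    """
    nodes = len(cpu_utilization)
    total = sum(cpu_utilization)
    if total / nodes > upper:
        return "provision"
    # Average utilization if the same load ran on one node fewer.
    if nodes > 1 and total / (nodes - 1) < lower:
        return "decommission"
    return "steady"


print(elasticity_action([0.90, 0.85, 0.90]))  # provision
print(elasticity_action([0.20, 0.30, 0.25]))  # decommission
print(elasticity_action([0.50, 0.60, 0.55]))  # steady
```

Keeping the decommission threshold below the provisioning threshold leaves a band in which the cluster size stays put, which avoids alternating between adding and removing a node.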
  • 41. Disruptive Innovations. Efficient for Range Queries & Random Updates: • As efficient as B+-trees for range queries (used by relational DBs). • As efficient as LSM-trees for random updates/inserts (used by key-value data stores). Costless Multi Version Concurrency Control: • Novel MVCC with almost zero overhead. • Avoids stop-the-world obsolete version cleaning. • Avoids resource waste for multi-versioning. Efficient High Availability: • Active-Active (Synchronous) Replication without Contention and without Synchronization Overhead.
  • 42. KiVi High Availability: It provides active-active replication (multi-master) without hampering performance: • Novel replication approach that avoids the redundancy of attaining atomicity at both the transactional level and the replication level. It will provide geo-replication without any penalty in throughput: • Novel geo-replication algorithm that streams the logs in parallel to the backup data center. • The backup data center can run read-only transactions in a fully local way. • The backup data center can run update transactions remotely at the primary data center.
  • 43. Acknowledgements: LeanXcale R&D has been and is partially supported by different funding sources including the European Union’s FP7 and H2020 research and innovation programme under grants BigDataStack (779747), CloudDBAppliance (732051), Vineyard (687628), CoherentPaaS (611068), LeanBigData (619606), CrowdHealth (727560), Cybele (825355), Infinitech. LeanXcale has been partially funded by Spanish CDTI and the Spanish Ministry of Economy and Competitiveness in the NEOTEC program under grant SNEO-20151285.
  • 44. Ricardo Jimenez-Peris LeanXcale CEO & Co-Founder info@leanxcale.com www.LeanXcale.com @LeanXcale https://www.linkedin.com/company/leanxcale https://www.facebook.com/leanxcale/