Predictable Big Data Performance in Real-time
Srini V. Srinivasan
17th Big Data London Meetup
April 22, 2013
Database Landscape
STRUCTURED DATA
➤ TRANSACTIONS (OLTP)
  • Response time: Seconds
  • Gigabytes of data
  • Balanced Reads/Writes
➤ ANALYTICS (OLAP)
  • Response time: Seconds
  • Terabytes of data
  • Read Intensive
UNSTRUCTURED DATA
➤ REAL-TIME BIG DATA (Real-time Transactions)
  • Response time: < 10 ms
  • 1-20 TB
  • Balanced Reads/Writes
  • 24x7x365 Availability
➤ BIG DATA ANALYTICS
  • Response time: Hours, Weeks
  • TB to PB
  • Read Intensive
© 2013 Aerospike. All rights reserved. Confidential Pg. 2
Requirements for Internet Enterprises
1. Know who the interaction is with
  • Monitor 200+ Million US Consumers, 5+ Billion mobile devices and sensors
2. Determine intent based on current context
  • Page views, search terms, game state, last purchase, friends list, ads served, location
3. Respond now, use big data for more accurate decisions
  • Display the most relevant Ad
  • Recommend the best product
  • Deliver the richest gaming experience
  • Eliminate fraud…
4. Service can NEVER go down!
Challenges
1. Handle extremely high rates of persistent read/write transactions
2. Avoid hot spots to maintain tight latency SLAs
3. Provide immediate consistency with replication
4. Allow long-running tasks with transactions
5. Scale linearly as data sizes increase
6. Add capacity with no service interruption
Native Flash → Performance
➤ Low Latency at High Throughput
“Only Aerospike was able to function in synchronous mode with a replication factor of two… it is a significant advantage that Aerospike is able to function reliably on a smaller amount of hardware while still maintaining true consistency.”
Shared-Nothing Architecture
OHIO Data Center
➤ Every node in a cluster is identical and handles both transactions and long-running tasks
➤ Data is replicated synchronously with immediate consistency within the cluster
➤ Data is replicated asynchronously across data centers
Distributed Hash Table
How Data Is Distributed (Replication Factor 2)
➤ Every key is hashed into a 20-byte (fixed-length) string using the RIPEMD160 hash function
➤ This hash plus additional data (fixed 64 bytes) is stored in RAM in the index
➤ 4 bytes of this hash are used to compute the partition id
➤ There are 4096 partitions
➤ Partition id maps to node id based on cluster membership
Key:  cookie-abcdefg-12345678
Hash: 182023kh15hh3kahdjsh

Partition ID   Master node   Replica node
…              1             4
1820           2             3
1821           3             2
4096           4             1
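The key-to-partition step can be sketched in a few lines of Python. This is a minimal illustration, not Aerospike's code: the slide says only that 4 bytes of the 20-byte hash are used, so which bytes and which byte order are assumptions here, and the names `key_digest` and `partition_id` are hypothetical. Some OpenSSL builds no longer expose RIPEMD-160 through `hashlib`, so the sketch falls back to SHA-1 (also 20 bytes) purely to stay runnable.

```python
import hashlib

N_PARTITIONS = 4096  # partition count stated on the slide


def key_digest(key: bytes) -> bytes:
    """Return a 20-byte digest of the key.

    RIPEMD-160 is the function named on the slide. The SHA-1 fallback is a
    stand-in for environments without RIPEMD-160, not what Aerospike uses.
    """
    try:
        h = hashlib.new("ripemd160")
    except ValueError:
        h = hashlib.sha1()
    h.update(key)
    return h.digest()


def partition_id(digest: bytes) -> int:
    # Assumption: first 4 bytes, little-endian, reduced modulo 4096.
    return int.from_bytes(digest[:4], "little") % N_PARTITIONS


digest = key_digest(b"cookie-abcdefg-12345678")
pid = partition_id(digest)
print(len(digest), pid)
```

Because the partition id depends only on the key, any client or node can compute it locally and then look up the owning node in the partition table.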
Organizing the cluster
➤ Automatic multicast gossip protocol for node discovery
➤ Paxos consensus algorithm determines nodes in cluster
➤ Ordered list of nodes determines data location
➤ Data partitions balanced for minimal data motion
➤ Vote initiated and terminated in 100 milliseconds
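One way to see how an ordered list of nodes can determine data location while keeping data motion small is rendezvous (highest-random-weight) hashing. The sketch below uses that technique as a stand-in; the slides do not say which algorithm Aerospike uses, and `node_order` and `partition_map` are hypothetical names.

```python
import hashlib

N_PARTITIONS = 4096


def node_order(partition: int, nodes: list[str]) -> list[str]:
    """Order nodes for one partition by a per-(partition, node) hash score."""
    def score(node: str) -> int:
        h = hashlib.sha256(f"{partition}:{node}".encode()).digest()
        return int.from_bytes(h[:8], "big")
    return sorted(nodes, key=score, reverse=True)


def partition_map(nodes: list[str], replication_factor: int = 2) -> dict:
    # For each partition, the first node is the master, the rest are replicas.
    return {p: node_order(p, nodes)[:replication_factor]
            for p in range(N_PARTITIONS)}


before = partition_map(["A", "B", "C", "D"])
after = partition_map(["A", "B", "C", "D", "E"])

# Count partitions whose master changed when node E joined.
moved = sum(1 for p in before if before[p][0] != after[p][0])
print(f"masters moved after adding E: {moved} of {N_PARTITIONS}")
```

Every node that knows the membership list computes the same map, and a master only changes for partitions that the new node now ranks first on, so roughly one fifth of the partitions move in this example.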
How it Works
Writing with Immediate Consistency
[Diagram: master, replica]
1. Write sent to row master
2. Latch against simultaneous writes
3. Apply write to master memory and replica memory synchronously
4. Queue operations to disk
5. Signal completed transaction (optional storage commit wait)
6. Master applies conflict resolution policy (rollback / rollforward)
Adding a Node
1. Cluster discovers new node via gossip protocol
2. Paxos vote determines new data organization
3. Partition migrations scheduled
4. When a partition migration starts, write journal starts on destination
5. Partition moves atomically
6. Journal is applied and source data deleted
Transactions continue during the migration.
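The write path with immediate consistency can be sketched as a toy in-process model. Everything here is hypothetical (the `Node` and `Partition` classes, the per-key latch table, the in-memory "disk queue"); it only mirrors steps 2 to 5 of the write sequence and does not model conflict resolution or the network.

```python
import threading
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.memory = {}            # in-memory copy of the records
        self.disk_queue = deque()   # writes waiting to be flushed to storage

    def apply(self, key, value):
        self.memory[key] = value                # apply the write in memory
        self.disk_queue.append((key, value))    # queue the operation to disk


class Partition:
    def __init__(self, master, replica):
        self.master = master
        self.replica = replica
        self._latches = {}
        self._latches_guard = threading.Lock()

    def _latch(self, key):
        # One lock per key so simultaneous writes to the same row serialize.
        with self._latches_guard:
            return self._latches.setdefault(key, threading.Lock())

    def write(self, key, value):
        with self._latch(key):                  # latch against simultaneous writes
            self.master.apply(key, value)       # master memory + disk queue
            self.replica.apply(key, value)      # replica, synchronously
        return "ok"                             # signal completion to the client


master, replica = Node("node-2"), Node("node-3")
part = Partition(master, replica)
ack = part.write("cookie-abcdefg-12345678", {"segment": "sports"})
print(ack, master.memory == replica.memory)
```

The acknowledgement is returned only after both copies hold the new value, which is the property the slide calls immediate consistency.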
Intelligent Client
Shields Applications from the Complexity of the Cluster
➤ Implements Aerospike API
➤ Optimistic row locking
➤ Optimized binary protocol
➤ Cluster tracking
  • Learns about cluster changes, partition map
  • Gossip protocol
➤ Transaction semantics
  • Global transaction ID
  • Retransmit and timeout
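A cluster-aware client along these lines might route each request through a cached partition map, and on failure refresh the map and retransmit under the same transaction ID until a timeout. The sketch below is an assumption-laden illustration, not the Aerospike client API: `SketchClient`, `NodeUnavailable`, and the `fetch_partition_map` / `send` callables are hypothetical, and SHA-1 stands in for the key hash.

```python
import hashlib
import time

N_PARTITIONS = 4096


class NodeUnavailable(Exception):
    """Raised by the (hypothetical) transport when a node cannot be reached."""


class SketchClient:
    def __init__(self, fetch_partition_map, send, retries=2, timeout_s=0.05):
        self.fetch_partition_map = fetch_partition_map  # () -> {partition: node}
        self.send = send                # (node, txn_id, key, value) -> ack
        self.retries = retries
        self.timeout_s = timeout_s
        self.partition_map = fetch_partition_map()
        self._next_txn = 0

    def _partition(self, key: str) -> int:
        d = hashlib.sha1(key.encode()).digest()   # stand-in 20-byte hash
        return int.from_bytes(d[:4], "little") % N_PARTITIONS

    def put(self, key, value):
        self._next_txn += 1
        txn_id = self._next_txn          # same ID is reused on every retransmit
        deadline = time.monotonic() + self.timeout_s
        for _ in range(self.retries + 1):
            node = self.partition_map[self._partition(key)]
            try:
                return self.send(node, txn_id, key, value)
            except NodeUnavailable:
                # Cluster changed: learn the new partition map, then retry.
                self.partition_map = self.fetch_partition_map()
            if time.monotonic() > deadline:
                break
        raise TimeoutError(f"transaction {txn_id} not acknowledged")
```

Keeping the routing and retry logic in the client is what lets the application call `put` without knowing which node currently owns the key.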
Cross Data Center Replication (XDR)
➤ Asynchronous replication for long link delays and outages
➤ Namespace is configured to replicate to a destination cluster – master / slave, including star and ring
➤ Replication process
  • Transaction journal on partition master and replica
  • XDR process writes batches to destination
  • Transmission state shared with source replica
  • Retransmission in case of network fault
  • When data arrives back at originating cluster, transaction ID matching prevents subsequent application and forwarding
➤ In master / master replication, conflict resolution via multiple versions, or timestamp
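Two of the XDR ideas above, transaction ID matching to stop loops and timestamp-based conflict resolution, can be sketched as follows. This is a simplified model under stated assumptions: the `Cluster` and `Record` classes are hypothetical, "latest timestamp wins" is only one of the two policies the slide mentions, and batching, journals and retransmission are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    value: object
    timestamp: float   # time of the update at the cluster that wrote it
    txn_id: str        # global transaction ID


@dataclass
class Cluster:
    name: str
    data: dict = field(default_factory=dict)
    seen_txns: set = field(default_factory=set)
    counter: int = 0

    def local_write(self, key, value, timestamp):
        self.counter += 1
        record = Record(value, timestamp, f"{self.name}-{self.counter}")
        self.data[key] = record
        self.seen_txns.add(record.txn_id)
        return record

    def receive(self, key, record):
        """Apply a shipped record; return True if it should be forwarded on."""
        if record.txn_id in self.seen_txns:
            return False                   # already seen: do not apply or forward
        self.seen_txns.add(record.txn_id)
        current = self.data.get(key)
        if current is None or record.timestamp > current.timestamp:
            self.data[key] = record        # newer timestamp wins
        return True


a, b = Cluster("A"), Cluster("B")
r1 = a.local_write("user:1", "v1", timestamp=100.0)
r2 = b.local_write("user:1", "v2", timestamp=105.0)   # concurrent update
a.receive("user:1", r2)
b.receive("user:1", r1)
print(a.data["user:1"].value, b.data["user:1"].value)
print(a.receive("user:1", r1))   # r1 arriving back at its origin is dropped
```

Both clusters converge on the value with the later timestamp, and a record that travels around a ring is discarded when it returns to the cluster that created it.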
Multi-core Optimization
➤ Right Architecture
  • Shared nothing
  • In-memory (or multiple SSDs)
  • Tight code loop
  • Lock-free isolation
➤ OS, Programming Language, Libraries
  • Modern Linux kernel
  • C language
  • Use epoll
➤ Tweaks
  • Pin threads to processor cores
  • IRQ affinity settings for NIC
  • CPU Socket Isolation via pairing of CPU to NIC
Russ’s 10 Ingredient Recipe for
Making 1 Million TPS on $5K Hardware
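As an illustration of the "pin threads to processor cores" ingredient, the sketch below pins worker threads to individual CPUs using the Linux scheduler affinity calls exposed by Python's `os` module. The recipe itself is about C code, so this is only a small stand-in; `pin_current_thread` and `worker` are hypothetical names, and on platforms without `os.sched_setaffinity` the sketch does nothing.

```python
import os
import threading


def pin_current_thread(cpu: int) -> set:
    """Pin the calling thread to one CPU where the OS supports it (Linux)."""
    if not hasattr(os, "sched_setaffinity"):
        return set()                     # unsupported platform: leave as is
    os.sched_setaffinity(0, {cpu})       # 0 means the calling thread on Linux
    return os.sched_getaffinity(0)


def worker(cpu, results):
    # A real worker would run its event loop here after pinning.
    results[cpu] = pin_current_thread(cpu)


if hasattr(os, "sched_getaffinity"):
    allowed = sorted(os.sched_getaffinity(0))[:4]   # at most 4 workers for the demo
else:
    allowed = []

results = {}
threads = [threading.Thread(target=worker, args=(cpu, results)) for cpu in allowed]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

With one worker per core, a thread's working set tends to stay in that core's caches, which is the motivation for this tweak.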
Flash-optimized Storage Layer
➤ Direct device access
  • Direct attach performance
  • Data written in flash-optimal large block patterns
  • All indexes in RAM for low wear
  • Constant background defragmentation
  • Log-structured file system, “copy on write”
  • Clean restart through shared memory
➤ Random distribution using hash does not require RAID hardware
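The combination of a RAM index, large block writes, copy-on-write updates and defragmentation can be illustrated with a toy log-structured store. This is a sketch under simplifying assumptions, not Aerospike's storage engine: `LogStructuredStore` is a hypothetical class, blocks are Python lists rather than device writes, and defragmentation rewrites everything at once instead of running continuously in the background.

```python
class LogStructuredStore:
    """Toy append-only store: RAM index, block-sized writes, copy-on-write."""

    def __init__(self, block_size=4):
        self.block_size = block_size   # records per block (tiny, for the demo)
        self.blocks = []               # flushed, immutable blocks (the "device")
        self.write_buffer = []         # block currently being filled in memory
        self.index = {}                # key -> (block_no, slot), kept in RAM

    def put(self, key, value):
        # Copy-on-write: never update in place, always append a new version.
        slot = len(self.write_buffer)
        self.write_buffer.append((key, value))
        self.index[key] = (len(self.blocks), slot)
        if len(self.write_buffer) == self.block_size:
            self._flush()

    def _flush(self):
        self.blocks.append(self.write_buffer)   # one large sequential write
        self.write_buffer = []

    def get(self, key):
        block_no, slot = self.index[key]
        if block_no == len(self.blocks):        # still in the write buffer
            return self.write_buffer[slot][1]
        return self.blocks[block_no][slot][1]

    def defragment(self):
        """Rewrite live records into fresh blocks, dropping stale versions."""
        live = [(k, self.get(k)) for k in self.index]
        self.blocks, self.write_buffer, self.index = [], [], {}
        for k, v in live:
            self.put(k, v)


store = LogStructuredStore(block_size=2)
for version in range(5):
    store.put("a", version)        # five versions of the same key
store.put("b", "x")
blocks_before = len(store.blocks)
store.defragment()
print(blocks_before, len(store.blocks), store.get("a"), store.get("b"))
```

Reads need only one index lookup in RAM and one block access, and the device sees large sequential writes only, which is the access pattern the slide describes as flash-optimal.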
…
SSD performance varies widely
• Aerospike has a certified hardware list
• Free SSD certification tool, CIO, is also available
Native Flash → 17x better TCO
“…data-in-DRAM implementations like SAP HANA… should be bypassed… current leading data-in-flash database for transactional analytic apps is Aerospike.” - David Floyer, CTO, Wikibon
http://wikibon.org/wiki/v/Data_in_DRAM_is_a_Flash_in_the_Pan
Case studies
Proven in Production
➤ AppNexus – #2 RTB after Google
  • 27 Billion auctions per day
  • 600+ QPS
  • Aerospike servers in 6 clusters in 3 data centers
➤ Chango – #2 Search after Google
  • Sees more Searches than Yahoo! + Bing
  • Data on 300 Million users
➤ TradeDesk – first Ad Exchange
  • Facebook Exchange partner
  • FBX serves 25% of Ads on the Internet
  • 1200% growth in 2012
“Aerospike has operated without interruptions and easily scaled to meet our performance demands.” – Mike Nolet, CTO, AppNexus
Proven in Production
➤ eXelate – Data on 500 Million users
  • Online data plus Nielsen, Mastercard, Autobytel, Bizo data…
  • Data on 400 million users
  • 20 Billion Transactions per month
  • 4x2 TB data per cluster
  • 4 clusters across 4 data centers
  • “Scale. Real-time performance. Real-time replication at 4 datacenters. Aerospike delivered.” - Elad Efraim, eXelate CTO
➤ BlueKai – Serves half the Fortune 30
  • #1 Data Exchange
  • 2 Trillion Transactions per month
What makes Aerospike… Fast? Scale & Never Fail?
➤ Cluster-aware Client Layer
➤ Per Node Optimizations
  • Thread-core-pinning
  • Real-time prioritization
➤ Extremely efficient primary index scheme
  • Index in DRAM
  • 64-byte index entry size
  • Kernel-quality C code; no degradation due to Java garbage collection
➤ Flash-optimized Data Layer
➤ Shared-nothing Distribution Layer
  • Intelligent data migration and rebalancing
  • Smart data expiration and eviction
  • Rolling upgrades and background backups
➤ Cross Datacenter Replication (XDR)
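The fixed 64-byte index entry makes DRAM sizing a matter of simple arithmetic. The sketch below works through one hypothetical workload (1 billion records, replication factor 2, 4 nodes); the record count and node count are made up for illustration, and it assumes every replica copy carries its own index entry, which the slides do not state.

```python
INDEX_ENTRY_BYTES = 64   # index entry size stated on the slide


def index_ram_bytes(records: int, replication_factor: int = 2) -> int:
    """Cluster-wide RAM for primary index entries (assumes one per copy)."""
    return records * replication_factor * INDEX_ENTRY_BYTES


def per_node_gib(records: int, nodes: int, replication_factor: int = 2) -> float:
    """Index RAM per node in GiB, assuming an even spread across nodes."""
    return index_ram_bytes(records, replication_factor) / nodes / 2**30


total = index_ram_bytes(1_000_000_000)
print(total)                                      # 128000000000 bytes
print(round(per_node_gib(1_000_000_000, 4), 1))   # GiB per node
```

Because the entry size does not depend on the key or the record size, index memory grows linearly with the record count and can be planned independently of the data kept on flash.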
Mission
➤ Build the Modern Real-time Data Platform
1. Scaling the Internet of Everything
2. Pushing the limits of modern hardware
3. No data loss and No downtime
AEROSPIKE REAL-TIME DATA PLATFORM
Publish & Subscribe
• ASQL & NoSQL
• Powerful Aggregations (MapReduce++)
• Secondary Index Queries
Transactions
• User Defined Functions (UDF)
Security
Encryption
Compression
• Distribution – Shared Nothing, ACID, Scale-out, Multiple datacenters
• Data Types – Int, Str, Blob, List, Map, Large Stack, Large Set, Large List
• Storage – DRAM, SSD, HDD
How to get Aerospike?
Free Community Edition
➤ For developers looking for speed and stability, and to scale transparently as they grow
  • All features for 2 nodes, 100 GB, 1 cluster, 1 datacenter
  • Community support
Enterprise Edition
➤ For mission-critical apps needing to scale right from the start
  • Unlimited number of nodes, clusters, data centers
  • Cross data center replication
  • Premium 24x7 support
  • Priced by TBs of unique data (not replicas)
Questions
© 2013 Aerospike. All rights reserved. Confidential Pg. 22

Weitere ähnliche Inhalte

Was ist angesagt?

The role of NoSQL in the Next Generation of Financial Informatics
The role of NoSQL in the Next Generation of Financial InformaticsThe role of NoSQL in the Next Generation of Financial Informatics
The role of NoSQL in the Next Generation of Financial InformaticsAerospike, Inc.
 
Tectonic Shift: A New Foundation for Data Driven Business
Tectonic Shift: A New Foundation for Data Driven BusinessTectonic Shift: A New Foundation for Data Driven Business
Tectonic Shift: A New Foundation for Data Driven BusinessAerospike, Inc.
 
Aerospike Architecture
Aerospike ArchitectureAerospike Architecture
Aerospike ArchitecturePeter Milne
 
Using Databases and Containers From Development to Deployment
Using Databases and Containers  From Development to DeploymentUsing Databases and Containers  From Development to Deployment
Using Databases and Containers From Development to DeploymentAerospike, Inc.
 
WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...
WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...
WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...Aerospike, Inc.
 
Getting Started with Apache Spark and Alluxio for Blazingly Fast Analytics
Getting Started with Apache Spark and Alluxio for Blazingly Fast AnalyticsGetting Started with Apache Spark and Alluxio for Blazingly Fast Analytics
Getting Started with Apache Spark and Alluxio for Blazingly Fast AnalyticsAlluxio, Inc.
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?Aerospike, Inc.
 
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio Ceph Community
 
Ceph optimized Storage / Global HW solutions for SDS, David Alvarez
Ceph optimized Storage / Global HW solutions for SDS, David AlvarezCeph optimized Storage / Global HW solutions for SDS, David Alvarez
Ceph optimized Storage / Global HW solutions for SDS, David AlvarezCeph Community
 
Hadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native EraHadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native EraDataWorks Summit
 
Best Practices for Using Alluxio with Apache Spark with Gene Pang
Best Practices for Using Alluxio with Apache Spark with Gene PangBest Practices for Using Alluxio with Apache Spark with Gene Pang
Best Practices for Using Alluxio with Apache Spark with Gene PangSpark Summit
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed_Hat_Storage
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit
 
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedPGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedEqunix Business Solutions
 
Red Hat Storage: Emerging Use Cases
Red Hat Storage: Emerging Use CasesRed Hat Storage: Emerging Use Cases
Red Hat Storage: Emerging Use CasesRed_Hat_Storage
 
Red Hat Storage for Mere Mortals
Red Hat Storage for Mere MortalsRed Hat Storage for Mere Mortals
Red Hat Storage for Mere MortalsRed_Hat_Storage
 
Improving Presto performance with Alluxio at TikTok
Improving Presto performance with Alluxio at TikTokImproving Presto performance with Alluxio at TikTok
Improving Presto performance with Alluxio at TikTokAlluxio, Inc.
 

Was ist angesagt? (20)

Redis vs Aerospike
Redis vs AerospikeRedis vs Aerospike
Redis vs Aerospike
 
The role of NoSQL in the Next Generation of Financial Informatics
The role of NoSQL in the Next Generation of Financial InformaticsThe role of NoSQL in the Next Generation of Financial Informatics
The role of NoSQL in the Next Generation of Financial Informatics
 
Tectonic Shift: A New Foundation for Data Driven Business
Tectonic Shift: A New Foundation for Data Driven BusinessTectonic Shift: A New Foundation for Data Driven Business
Tectonic Shift: A New Foundation for Data Driven Business
 
Aerospike Architecture
Aerospike ArchitectureAerospike Architecture
Aerospike Architecture
 
Using Databases and Containers From Development to Deployment
Using Databases and Containers  From Development to DeploymentUsing Databases and Containers  From Development to Deployment
Using Databases and Containers From Development to Deployment
 
Introduction to Aerospike
Introduction to AerospikeIntroduction to Aerospike
Introduction to Aerospike
 
WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...
WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...
WEBINAR: Architectures for Digital Transformation and Next-Generation Systems...
 
Getting Started with Apache Spark and Alluxio for Blazingly Fast Analytics
Getting Started with Apache Spark and Alluxio for Blazingly Fast AnalyticsGetting Started with Apache Spark and Alluxio for Blazingly Fast Analytics
Getting Started with Apache Spark and Alluxio for Blazingly Fast Analytics
 
There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?There are 250 Database products, are you running the right one?
There are 250 Database products, are you running the right one?
 
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
Ceph Day San Jose - Enable Fast Big Data Analytics on Ceph with Alluxio
 
Ceph optimized Storage / Global HW solutions for SDS, David Alvarez
Ceph optimized Storage / Global HW solutions for SDS, David AlvarezCeph optimized Storage / Global HW solutions for SDS, David Alvarez
Ceph optimized Storage / Global HW solutions for SDS, David Alvarez
 
Hadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native EraHadoop Storage in the Cloud Native Era
Hadoop Storage in the Cloud Native Era
 
Best Practices for Using Alluxio with Apache Spark with Gene Pang
Best Practices for Using Alluxio with Apache Spark with Gene PangBest Practices for Using Alluxio with Apache Spark with Gene Pang
Best Practices for Using Alluxio with Apache Spark with Gene Pang
 
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based HardwareRed hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
Red hat Storage Day LA - Designing Ceph Clusters Using Intel-Based Hardware
 
Spark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod NarasimhaSpark Summit EU talk by Debasish Das and Pramod Narasimha
Spark Summit EU talk by Debasish Das and Pramod Narasimha
 
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar AhmedPGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
PGConf.ASIA 2019 Bali - Tune Your LInux Box, Not Just PostgreSQL - Ibrar Ahmed
 
Red Hat Storage: Emerging Use Cases
Red Hat Storage: Emerging Use CasesRed Hat Storage: Emerging Use Cases
Red Hat Storage: Emerging Use Cases
 
HDFS Erasure Coding in Action
HDFS Erasure Coding in Action HDFS Erasure Coding in Action
HDFS Erasure Coding in Action
 
Red Hat Storage for Mere Mortals
Red Hat Storage for Mere MortalsRed Hat Storage for Mere Mortals
Red Hat Storage for Mere Mortals
 
Improving Presto performance with Alluxio at TikTok
Improving Presto performance with Alluxio at TikTokImproving Presto performance with Alluxio at TikTok
Improving Presto performance with Alluxio at TikTok
 

Andere mochten auch

Ansn ind 14_ir_suyamto
Ansn ind 14_ir_suyamtoAnsn ind 14_ir_suyamto
Ansn ind 14_ir_suyamtolukmanft21
 
Integrative Nutrition Pictures
Integrative Nutrition PicturesIntegrative Nutrition Pictures
Integrative Nutrition PicturesSara S
 
Slide Golle Ira
Slide Golle IraSlide Golle Ira
Slide Golle Iras5irgoll
 
You Snooze You Lose or How to Win in Ad Tech?
You Snooze You Lose or How to Win in Ad Tech?You Snooze You Lose or How to Win in Ad Tech?
You Snooze You Lose or How to Win in Ad Tech?Aerospike, Inc.
 
IT in Private Cardiology Practice, 2011
IT in Private Cardiology Practice, 2011IT in Private Cardiology Practice, 2011
IT in Private Cardiology Practice, 2011David Lee Scher, MD
 
Medical apps in clinical practice
Medical apps in clinical practiceMedical apps in clinical practice
Medical apps in clinical practiceDavid Lee Scher, MD
 
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/HourRunning a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/HourAerospike, Inc.
 
01282016 Aerospike-Docker webinar
01282016 Aerospike-Docker webinar01282016 Aerospike-Docker webinar
01282016 Aerospike-Docker webinarAerospike, Inc.
 
Patient Access to Implantable Cardiac Rhythm Device Data
Patient Access to Implantable Cardiac Rhythm Device DataPatient Access to Implantable Cardiac Rhythm Device Data
Patient Access to Implantable Cardiac Rhythm Device DataDavid Lee Scher, MD
 
Keys to Building a Successful Mobile Health Strategy
Keys to Building a Successful Mobile Health StrategyKeys to Building a Successful Mobile Health Strategy
Keys to Building a Successful Mobile Health StrategyDavid Lee Scher, MD
 
Three digital health companies will change pharma
Three digital health companies will change pharmaThree digital health companies will change pharma
Three digital health companies will change pharmaDavid Lee Scher, MD
 
From the Archives, 2008:Clinical and Economic Advantages Implantable Defibril...
From the Archives, 2008:Clinical and Economic Advantages Implantable Defibril...From the Archives, 2008:Clinical and Economic Advantages Implantable Defibril...
From the Archives, 2008:Clinical and Economic Advantages Implantable Defibril...David Lee Scher, MD
 
Linux container, namespaces & CGroup.
Linux container, namespaces & CGroup. Linux container, namespaces & CGroup.
Linux container, namespaces & CGroup. Neeraj Shrimali
 
What the Spark!? Intro and Use Cases
What the Spark!? Intro and Use CasesWhat the Spark!? Intro and Use Cases
What the Spark!? Intro and Use CasesAerospike, Inc.
 

Andere mochten auch (16)

Ansn ind 14_ir_suyamto
Ansn ind 14_ir_suyamtoAnsn ind 14_ir_suyamto
Ansn ind 14_ir_suyamto
 
Integración de Portafolio
Integración de PortafolioIntegración de Portafolio
Integración de Portafolio
 
Integrative Nutrition Pictures
Integrative Nutrition PicturesIntegrative Nutrition Pictures
Integrative Nutrition Pictures
 
Slide Golle Ira
Slide Golle IraSlide Golle Ira
Slide Golle Ira
 
You Snooze You Lose or How to Win in Ad Tech?
You Snooze You Lose or How to Win in Ad Tech?You Snooze You Lose or How to Win in Ad Tech?
You Snooze You Lose or How to Win in Ad Tech?
 
IT in Private Cardiology Practice, 2011
IT in Private Cardiology Practice, 2011IT in Private Cardiology Practice, 2011
IT in Private Cardiology Practice, 2011
 
Medical apps in clinical practice
Medical apps in clinical practiceMedical apps in clinical practice
Medical apps in clinical practice
 
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/HourRunning a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
Running a High Performance NoSQL Database on Amazon EC2 for Just $1.68/Hour
 
01282016 Aerospike-Docker webinar
01282016 Aerospike-Docker webinar01282016 Aerospike-Docker webinar
01282016 Aerospike-Docker webinar
 
Patient Access to Implantable Cardiac Rhythm Device Data
Patient Access to Implantable Cardiac Rhythm Device DataPatient Access to Implantable Cardiac Rhythm Device Data
Patient Access to Implantable Cardiac Rhythm Device Data
 
Keys to Building a Successful Mobile Health Strategy
Keys to Building a Successful Mobile Health StrategyKeys to Building a Successful Mobile Health Strategy
Keys to Building a Successful Mobile Health Strategy
 
Three digital health companies will change pharma
Three digital health companies will change pharmaThree digital health companies will change pharma
Three digital health companies will change pharma
 
From the Archives, 2008:Clinical and Economic Advantages Implantable Defibril...
From the Archives, 2008:Clinical and Economic Advantages Implantable Defibril...From the Archives, 2008:Clinical and Economic Advantages Implantable Defibril...
From the Archives, 2008:Clinical and Economic Advantages Implantable Defibril...
 
The Digital KOL 6.2015
The Digital KOL 6.2015The Digital KOL 6.2015
The Digital KOL 6.2015
 
Linux container, namespaces & CGroup.
Linux container, namespaces & CGroup. Linux container, namespaces & CGroup.
Linux container, namespaces & CGroup.
 
What the Spark!? Intro and Use Cases
What the Spark!? Intro and Use CasesWhat the Spark!? Intro and Use Cases
What the Spark!? Intro and Use Cases
 

Ähnlich wie Predictable Big Data Performance in Real-time

Aerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower ManhattanAerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower ManhattanAerospike
 
What a Modern Database Enables_Srini Srinivasan.pdf
What a Modern Database Enables_Srini Srinivasan.pdfWhat a Modern Database Enables_Srini Srinivasan.pdf
What a Modern Database Enables_Srini Srinivasan.pdfAerospike, Inc.
 
What enterprises can learn from Real Time Bidding (RTB)
What enterprises can learn from Real Time Bidding (RTB)What enterprises can learn from Real Time Bidding (RTB)
What enterprises can learn from Real Time Bidding (RTB)bigdatagurus_meetup
 
What enterprises can learn from Real Time Bidding
What enterprises can learn from Real Time BiddingWhat enterprises can learn from Real Time Bidding
What enterprises can learn from Real Time BiddingAerospike
 
Brian Bulkowski. Aerospike
Brian Bulkowski. AerospikeBrian Bulkowski. Aerospike
Brian Bulkowski. AerospikeVolha Banadyseva
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle CoherenceBen Stopford
 
Flash Economics and Lessons learned from operating low latency platforms at h...
Flash Economics and Lessons learned from operating low latency platforms at h...Flash Economics and Lessons learned from operating low latency platforms at h...
Flash Economics and Lessons learned from operating low latency platforms at h...Aerospike, Inc.
 
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...HostedbyConfluent
 
Cassandra summit-2013
Cassandra summit-2013Cassandra summit-2013
Cassandra summit-2013dfilppi
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...DataStax
 
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...
[PASS Summit 2016] Blazing Fast, Planet-Scale Customer Scenarios with Azure D...Andrew Liu
 
Rapid Application Design in Financial Services
Rapid Application Design in Financial ServicesRapid Application Design in Financial Services
Rapid Application Design in Financial ServicesAerospike
 
MongoDB Sharding Webinar 2014
MongoDB Sharding Webinar 2014MongoDB Sharding Webinar 2014
MongoDB Sharding Webinar 2014Dylan Tong
 
Data has a better idea the in-memory data grid
Data has a better idea   the in-memory data gridData has a better idea   the in-memory data grid
Data has a better idea the in-memory data gridBogdan Dina
 
Elastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent MemoryElastify Cloud-Native Spark Application with Persistent Memory
Elastify Cloud-Native Spark Application with Persistent MemoryDatabricks
 
fdocuments.in_aerospike-key-value-data-access.ppt
fdocuments.in_aerospike-key-value-data-access.pptfdocuments.in_aerospike-key-value-data-access.ppt
fdocuments.in_aerospike-key-value-data-access.pptyashsharma863914
 
WarsawITDays_ ApacheNiFi202
WarsawITDays_ ApacheNiFi202WarsawITDays_ ApacheNiFi202
WarsawITDays_ ApacheNiFi202Timothy Spann
 
DDN: Massively-Scalable Platforms and Solutions Engineered for the Big Data a...
DDN: Massively-Scalable Platforms and Solutions Engineered for the Big Data a...DDN: Massively-Scalable Platforms and Solutions Engineered for the Big Data a...
DDN: Massively-Scalable Platforms and Solutions Engineered for the Big Data a...inside-BigData.com
 
Real time analytics at uber @ strata data 2019
Real time analytics at uber @ strata data 2019Real time analytics at uber @ strata data 2019
Real time analytics at uber @ strata data 2019Zhenxiao Luo
 
Pilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster RedundancyPilot Hadoop Towards 2500 Nodes and Cluster Redundancy
Pilot Hadoop Towards 2500 Nodes and Cluster RedundancyStuart Pook
 

Ähnlich wie Predictable Big Data Performance in Real-time (20)

Aerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower ManhattanAerospike AdTech Gets Hacked in Lower Manhattan
Aerospike AdTech Gets Hacked in Lower Manhattan
 
What a Modern Database Enables_Srini Srinivasan.pdf
What a Modern Database Enables_Srini Srinivasan.pdfWhat a Modern Database Enables_Srini Srinivasan.pdf
What a Modern Database Enables_Srini Srinivasan.pdf
 
What enterprises can learn from Real Time Bidding (RTB)
What enterprises can learn from Real Time Bidding (RTB)What enterprises can learn from Real Time Bidding (RTB)
What enterprises can learn from Real Time Bidding (RTB)
 
What enterprises can learn from Real Time Bidding
What enterprises can learn from Real Time BiddingWhat enterprises can learn from Real Time Bidding
What enterprises can learn from Real Time Bidding
 
Brian Bulkowski. Aerospike
Brian Bulkowski. AerospikeBrian Bulkowski. Aerospike
Brian Bulkowski. Aerospike
 
Data Grids with Oracle Coherence
Data Grids with Oracle CoherenceData Grids with Oracle Coherence
Data Grids with Oracle Coherence
 
Flash Economics and Lessons learned from operating low latency platforms at h...
Flash Economics and Lessons learned from operating low latency platforms at h...Flash Economics and Lessons learned from operating low latency platforms at h...
Flash Economics and Lessons learned from operating low latency platforms at h...
 
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
Distributed Data Storage & Streaming for Real-time Decisioning Using Kafka, S...
 
Cassandra summit-2013
Cassandra summit-2013Cassandra summit-2013
Cassandra summit-2013
 
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...Webinar: Dyn + DataStax - helping companies deliver exceptional end-user expe...
Predictable Big Data Performance in Real-time

  • 1. Predictable Big Data Performance in Real-time
       Srini V. Srinivasan
       17th Big Data London Meetup, April 22, 2013
  • 2. Database Landscape
       STRUCTURED DATA
       ➤ Transactions (OLTP): response time in seconds; gigabytes of data; balanced reads/writes
       ➤ Analytics (OLAP): response time in seconds; terabytes of data; read intensive
       UNSTRUCTURED DATA
       ➤ Big data analytics: response time of hours to weeks; TB to PB of data; read intensive
       ➤ Real-time big data (real-time transactions): response time < 10 ms; 1-20 TB of data; balanced reads/writes; 24x7x365 availability
       © 2013 Aerospike. All rights reserved. Confidential
  • 3. Requirements for Internet Enterprises
       1. Know who the interaction is with
         • Monitor 200+ million US consumers, 5+ billion mobile devices and sensors
       2. Determine intent based on current context
         • Page views, search terms, game state, last purchase, friends list, ads served, location
       3. Respond now, use big data for more accurate decisions
         • Display the most relevant ad
         • Recommend the best product
         • Deliver the richest gaming experience
         • Eliminate fraud…
       4. Service can NEVER go down!
  • 4. Challenges
       1. Handle extremely high rates of persistent read/write transactions
       2. Avoid hot spots to maintain tight latency SLAs
       3. Provide immediate consistency with replication
       4. Allow long running tasks with transactions
       5. Scale linearly as data sizes increase
       6. Add capacity with no service interruption
  • 5. Native Flash: Performance
       ➤ Low Latency at High Throughput
  • 6. “Only Aerospike was able to function in synchronous mode with a replication factor of two.. it is a significant advantage that Aerospike is able to function reliably on a smaller amount of hardware while still maintaining true consistency.”
  • 7. Shared-Nothing Architecture (diagram: OHIO Data Center)
       ➤ Every node in a cluster is identical and handles both transactions and long running tasks
       ➤ Data is replicated synchronously with immediate consistency within the cluster
       ➤ Data is replicated asynchronously across data centers
  • 8. Distributed Hash Table: How Data Is Distributed (Replication Factor 2)
       ➤ Every key is hashed into a 20 byte (fixed length) string using the RIPEMD160 hash function
       ➤ This hash + additional data (fixed 64 bytes) are stored in RAM in the index
       ➤ 4 bytes of this hash are used to compute the partition id
       ➤ There are 4096 partitions
       ➤ Partition id maps to node id based on cluster membership
       Example from the diagram: key cookie-abcdefg-12345678, digest 182023kh15hh3kahdjsh

       Partition ID | Master node | Replica node
       …            | 1           | 4
       1820         | 2           | 3
       1821         | 3           | 2
       4096         | 4           | 1
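The hashing steps on slide 8 can be sketched roughly as follows. This is only an illustration under stated assumptions: the slide does not say which 4 bytes of the digest are used or how they are reduced to a partition id, so the sketch takes the first 4 bytes as a little-endian integer modulo 4096. The function names are hypothetical. Some Python builds do not ship RIPEMD160, so the sketch falls back to SHA-1 (also 20 bytes) purely as a stand-in.

```python
import hashlib

N_PARTITIONS = 4096  # partition count stated on the slide


def key_digest(key):
    """Return a 20-byte digest of the key (bytes).

    The slide names RIPEMD160. If this Python/OpenSSL build lacks it,
    fall back to SHA-1 as a stand-in with the same digest length.
    """
    try:
        return hashlib.new("ripemd160", key).digest()
    except ValueError:
        # Stand-in only; not the hash function named on the slide.
        return hashlib.sha1(key).digest()


def partition_id(digest):
    """Map a digest to a partition id in [0, N_PARTITIONS).

    Assumption: first 4 bytes, little-endian, modulo the partition count.
    """
    return int.from_bytes(digest[:4], "little") % N_PARTITIONS


if __name__ == "__main__":
    d = key_digest(b"cookie-abcdefg-12345678")
    print(len(d), partition_id(d))
```

Because the digest has a fixed length, every index entry can have a fixed size regardless of how long the application key is, which is consistent with the fixed 64-byte index entry mentioned on the slide.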
  • 9. Organizing the cluster
       ➤ Automatic multicast gossip protocol for node discovery
       ➤ Paxos consensus algorithm determines nodes in cluster
       ➤ Ordered list of nodes determines data location
       ➤ Data partitions balanced for minimal data motion
       ➤ Vote initiated and terminated in 100 milliseconds
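Slide 9 says that an ordered list of nodes determines data location and that partitions are balanced for minimal data motion, but it does not describe the algorithm. One well-known technique with those properties is rendezvous (highest-random-weight) hashing, used below as a stand-in rather than as a description of Aerospike's actual scheme. Node names and function names are hypothetical.

```python
import hashlib


def node_order(partition_id, nodes):
    """Order nodes for one partition: first is master, next is replica.

    Each (node, partition) pair gets a pseudo-random weight; sorting by
    weight gives every cluster member the same answer from the same
    membership list, with no coordination beyond agreeing on the list.
    """
    def weight(node):
        h = hashlib.sha256(f"{node}:{partition_id}".encode()).digest()
        return int.from_bytes(h[:8], "big")

    return sorted(nodes, key=weight, reverse=True)


def partition_map(nodes, n_partitions=4096, replication_factor=2):
    """Build partition id -> [master, replica, ...] for the whole cluster."""
    return {
        pid: node_order(pid, nodes)[:replication_factor]
        for pid in range(n_partitions)
    }


if __name__ == "__main__":
    before = partition_map(["A", "B", "C", "D"])
    after = partition_map(["A", "B", "C", "D", "E"])
    moved = sum(1 for p in before if before[p][0] != after[p][0])
    print("masters that moved after adding E:", moved)
```

With this scheme, adding a node only moves the partitions that the new node wins; masters never shuffle between the existing nodes.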
  • 10. How it Works
       Writing with Immediate Consistency (diagram: master, replica)
         1. Write sent to row master
         2. Latch against simultaneous writes
         3. Apply write to master memory and replica memory synchronously
         4. Queue operations to disk
         5. Signal completed transaction (optional storage commit wait)
         6. Master applies conflict resolution policy (rollback / rollforward)
       Adding a Node (transactions continue)
         1. Cluster discovers new node via gossip protocol
         2. Paxos vote determines new data organization
         3. Partition migrations scheduled
         4. When a partition migration starts, write journal starts on destination
         5. Partition moves atomically
         6. Journal is applied and source data deleted
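Steps 1 to 5 of the write path on slide 10 can be illustrated with a small in-process model. This is a sketch only: the class names are hypothetical, the "replica" is just another Python object rather than a remote node, and step 6 (conflict resolution) is left out because the slide gives no detail on it.

```python
import threading
from collections import deque


class Node:
    """A node with an in-memory copy of the data and a queue of disk writes."""

    def __init__(self, name):
        self.name = name
        self.memory = {}
        self.disk_queue = deque()


class RecordMaster:
    """Hypothetical master for a set of records with one synchronous replica."""

    def __init__(self, master, replica):
        self.master = master
        self.replica = replica
        self._latches = {}
        self._latches_guard = threading.Lock()

    def _latch(self, key):
        # One lock per record, created on first use.
        with self._latches_guard:
            return self._latches.setdefault(key, threading.Lock())

    def write(self, key, value):
        with self._latch(key):                            # step 2: latch
            self.master.memory[key] = value               # step 3: master memory
            self.replica.memory[key] = value              # step 3: replica memory
            self.master.disk_queue.append((key, value))   # step 4: queue to disk
            self.replica.disk_queue.append((key, value))
        return "ok"                                       # step 5: signal completion


if __name__ == "__main__":
    rm = RecordMaster(Node("n1"), Node("n2"))
    print(rm.write("user:42", {"segment": "sports"}))
```

The point of the ordering is that the acknowledgement in step 5 is only returned after both in-memory copies agree, while the slower storage write is deferred to a queue.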
  • 11. Intelligent Client: Shields Applications from the Complexity of the Cluster
       ➤ Implements Aerospike API
       ➤ Optimistic row locking
       ➤ Optimized binary protocol
       ➤ Cluster tracking
         • Learns about cluster changes, partition map
         • Gossip protocol
       ➤ Transaction semantics
         • Global transaction ID
         • Retransmit and timeout
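The "retransmit and timeout" item on slide 11 might look something like the following on the client side. This is not the Aerospike client API; `call_with_retry`, the `send` callable, and the default limits are all hypothetical. The sketch reuses one transaction ID across retransmissions, on the assumption that a server could use it to recognize duplicates.

```python
import itertools
import time

_txn_ids = itertools.count(1)  # hypothetical source of transaction IDs


def call_with_retry(send, request, timeout_s=0.05, max_attempts=3):
    """Send a request, retransmitting on connection errors until a deadline.

    `send(txn_id, request)` is any callable that returns a response or
    raises ConnectionError. The same txn_id is used for every attempt.
    """
    txn_id = next(_txn_ids)
    deadline = time.monotonic() + timeout_s
    last_error = None
    for _ in range(max_attempts):
        if time.monotonic() >= deadline:
            break
        try:
            return send(txn_id, request)
        except ConnectionError as exc:
            last_error = exc  # remember the failure and retransmit
    raise TimeoutError(f"transaction {txn_id} did not complete") from last_error


if __name__ == "__main__":
    attempts = []

    def flaky(txn_id, request):
        attempts.append(txn_id)
        if len(attempts) < 2:
            raise ConnectionError("dropped")
        return request.upper()

    print(call_with_retry(flaky, "get user:42", timeout_s=1.0))
```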
  • 12. Cross Data Center Replication (XDR)
       ➤ Asynchronous replication for long link delays and outages
       ➤ Namespace is configured to replicate to a destination cluster: master / slave, including star and ring
       ➤ Replication process
         • Transaction journal on partition master and replica
         • XDR process writes batches to destination
         • Transmission state shared with source replica
         • Retransmission in case of network fault
         • When data arrives back at originating cluster, transaction ID matching prevents subsequent application and forwarding
       ➤ In master / master replication, conflict resolution via multiple versions, or timestamp
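The loop-prevention rule on slide 12 (transaction ID matching stops a write from being re-applied when it returns to the originating cluster) can be shown with a toy ring of clusters. The `Cluster` class is hypothetical, and forwarding here is a direct synchronous call for brevity, whereas the slide describes asynchronous, batched shipping.

```python
class Cluster:
    """Toy cluster that forwards each new write to its destinations once."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.seen = set()          # transaction IDs already applied here
        self.destinations = []     # clusters this one replicates to
        self.applied_count = 0

    def local_write(self, txn_id, key, value):
        self._apply(txn_id, key, value)

    def receive(self, txn_id, key, value):
        if txn_id in self.seen:
            return                 # already applied: do not apply or forward again
        self._apply(txn_id, key, value)

    def _apply(self, txn_id, key, value):
        self.seen.add(txn_id)
        self.data[key] = value
        self.applied_count += 1
        for dest in self.destinations:
            dest.receive(txn_id, key, value)


if __name__ == "__main__":
    a, b, c = Cluster("a"), Cluster("b"), Cluster("c")
    a.destinations, b.destinations, c.destinations = [b], [c], [a]  # ring
    a.local_write("txn-1", "user:42", "v1")
    print(a.applied_count, b.applied_count, c.applied_count)
```

Without the `seen` check, the ring topology would forward the same write forever.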
  • 13. Multi-core Optimization: Russ’s 10 Ingredient Recipe for Making 1 Million TPS on $5K Hardware
       ➤ Right Architecture
         • Shared nothing
         • In-memory (or multiple SSDs)
         • Tight code loop
         • Lock free isolation
       ➤ OS, Programming Language, Libraries
         • Modern Linux kernel
         • C language
         • Use epoll
       ➤ Tweaks
         • Pin threads to processor cores
         • IRQ affinity settings for NIC
         • CPU socket isolation via pairing of CPU to NIC
  • 14. Flash-optimized Storage Layer
       ➤ Direct device access
         • Direct attach performance
         • Data written in flash optimal large block patterns
         • All indexes in RAM for low wear
         • Constant background defragmentation
         • Log structured file system, “copy on write”
         • Clean restart through shared memory
       ➤ Random distribution using hash does not require RAID hardware
       ➤ SSD performance varies widely
         • Aerospike has a certified hardware list
         • Free SSD certification tool, CIO, is also available
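The combination on slide 14 of a log-structured, copy-on-write layout, an index held in RAM, and background defragmentation can be illustrated with a tiny in-memory model. This is a conceptual sketch, not Aerospike's storage engine: `LogStore` is a hypothetical name and a Python list stands in for the flash device.

```python
class LogStore:
    """Append-only store with an in-RAM index and a compaction pass."""

    def __init__(self):
        self.log = []     # append-only (key, value) entries; stands in for flash blocks
        self.index = {}   # RAM index: key -> position of the live copy in the log

    def put(self, key, value):
        # Copy on write: never overwrite in place, always append a new copy.
        self.log.append((key, value))
        self.index[key] = len(self.log) - 1

    def get(self, key):
        # One index lookup in RAM, then one read from the log.
        return self.log[self.index[key]][1]

    def defragment(self):
        # Keep only live copies, in log order, and rebuild the index.
        live = sorted(self.index.items(), key=lambda kv: kv[1])
        self.log = [(key, self.log[pos][1]) for key, pos in live]
        self.index = {key: i for i, (key, _) in enumerate(self.log)}


if __name__ == "__main__":
    store = LogStore()
    store.put("a", 1)
    store.put("b", 2)
    store.put("a", 3)          # the old copy of "a" becomes garbage
    store.defragment()
    print(len(store.log), store.get("a"), store.get("b"))
```

Keeping the index in RAM means updates to it cause no device writes, which is consistent with the "low wear" point on the slide.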
  • 15. Native Flash: 17x better TCO
       “…data-in-DRAM implementations like SAP HANA..should be bypassed… ..current leading data-in-flash database for transactional analytic apps is Aerospike.” - David Floyer, CTO, Wikibon
       http://wikibon.org/wiki/v/Data_in_DRAM_is_a_Flash_in_the_Pan
  • 17. Proven in Production
       ➤ AppNexus: #2 RTB after Google
         • 27 Billion auctions per day
         • 600+ QPS
         • Aerospike servers in 6 clusters in 3 data centers
       ➤ Chango: #2 Search after Google
         • Sees more searches than Yahoo! + Bing
         • Data on 300 Million users
       ➤ TradeDesk: first Ad Exchange
         • Facebook Exchange partner
         • FBX serves 25% of ads on the Internet
         • 1200% growth in 2012
       “Aerospike has operated without interruptions and easily scaled to meet our performance demands.” - Mike Nolet, CTO, AppNexus
  • 18. Proven in Production
       ➤ eXelate: Data on 500 Million users
         • Online data plus Nielsen, Mastercard, Autobytel, Bizo data..
         • Data on 400 million users
         • 20 Billion transactions per month
         • 4x2 TB data per cluster
         • 4 clusters across 4 data centers
         • “Scale. Real-time performance. Real-time replication at 4 datacenters. Aerospike delivered.” - Elad Efraim, eXelate CTO
       ➤ BlueKai: Serves half the Fortune 30
         • #1 Data Exchange
         • 2 Trillion transactions per month
  • 19. What makes Aerospike… Fast? Scale & Never Fail?
       ➤ Cluster-aware Client Layer
       ➤ Per Node Optimizations
         • Thread-core-pinning
         • Real-time prioritization
       ➤ Extremely efficient primary index scheme
         • Index in DRAM
         • 64 byte index entry size
         • Kernel quality C code; no degradation due to Java garbage collection
       ➤ Flash-optimized Data Layer
       ➤ Shared-nothing Distribution Layer
         • Intelligent data migration and rebalancing
         • Smart data expiration and eviction
         • Rolling upgrades and background backups
       ➤ Cross Datacenter Replication (XDR)
  • 20. Mission
       ➤ Build the Modern Real-time Data Platform
         1. Scaling the Internet of Everything
         2. Pushing the limits of modern hardware
         3. No data loss and no downtime
       AEROSPIKE REAL-TIME DATA PLATFORM (diagram labels)
         • Publish & Subscribe
         • ASQL & NoSQL
         • Powerful Aggregations (MapReduce++)
         • Secondary Index Queries
         • Transactions
         • User Defined Functions (UDF)
         • Security, Encryption, Compression
         • Distribution: Shared Nothing, ACID, Scale-out, Multiple datacenters
         • Data Types: Int, Str, Blob, List, Map, Large Stack, Large Set, Large List
         • Storage: DRAM, SSD, HDD
  • 21. How to get Aerospike?
       Free Community Edition
       ➤ For developers looking for speed and stability, and to scale transparently as they grow
         • All features for 2 nodes, 100 GB, 1 cluster, 1 datacenter
         • Community support
       Enterprise Edition
       ➤ For mission critical apps needing to scale right from the start
         • Unlimited number of nodes, clusters, data centers
         • Cross data center replication
         • Premium 24x7 support
         • Priced by TBs of unique data (not replicas)
  • 22. Questions