Transactional Streaming
If you can compute it, you can probably stream it.
John Hugg
March 30th, 2016
@johnhugg / jhugg@voltdb.com
Who Am I?
• First developer on the VoltDB project.
• Previously at Vertica and other data
startups.
• Have made so many bad decisions
over the years, that now I almost know
what I'm talking about.
• jhugg@voltdb.com
• @johnhugg
• http://chat.voltdb.com
Operations at Scale
• Ingest data from several sources into a horizontally scalable system.
• Process data on arrival (i.e., transform, correlate, filter, and aggregate data).
• Understand, act, and record.
• Push relevant data to a downstream, big data system.
Data Movement
Processing Logic
State Management
Right Now
One Size
Fits All
• Analytics and operational
stateful stores require
different storage engines to
be optimal.

Columns vs. Rows

Vertica vs. VoltDB
• Machine Learning

Multi-Dim Math

Search
• Microservices?
• Data Value?
Specifically:
Operational Stream Processing
and
Operational State
Where integration makes sense:
Leading Edge
Operations
What’s the Difference?
• Non-integrated systems mean you write glue code, or you use
someone else’s glue code.
• Operational glue code is different from batch-oriented glue code.
• Batch or OLAP has huge safety nets for glue code:
• HDFS, CSV, immutable data sets
• “Blow it away and reload”
• Much less time pressure
[Figure: glue code (“You wrote this. 1 user.”) connecting components that are each “Tested well, 1000s of users” or “Community supplied, many users”.]
But I’m not writing “glue code”
“I’m just using the well-tested Cassandra driver in my Storm code.”
• You’re using a computer network. They are not always reliable.
• Storm might fail in the middle of processing.
• Cassandra might fail in the middle of processing.
• Both systems are tested for this, but not together, using your glue
code.
Operational Glue
Code is Hard
Main Point:
Minimize it
Transactional Stream
Processing
Use the same system for
state and processing.
Ensures they are tested together.
No independent failures.
1 Transaction = 1 Event
ACID
• Atomic: Either 100% done or 0% done. No in-between.
• (Consistent)
• Isolated: Two concurrent operations can’t interfere with each other
• Durable: If it says it’s done, then it is done.
[Figure: processing code for a single event writing to Database / State; some of the writes are marked x. Not Atomic.]
Romeo And Juliet Explain “Atomicity”
Operation 1:
Fake your death
Operation 2:
Tell Romeo
[Figure: two instances of processing code for a single event sharing one Database / State. Not Isolated.]
“A good example is
the best sermon.”
- Benjamin Franklin
Call Center Management
http://www.publicdomainpictures.net/
3000 Agents, Millions of Customers
Dashboards & Alerts, Billing
Actions
Events
Processing
State
Call Center Management
Events
• “Begin Call”: Calling Number, Agent Id, Start Time, etc.
• “End Call”: Calling Number, Agent Id, End Time, etc.
What Kind of Problems
• Correlation - Streaming Join
• Out-of-order delivery
• At least once delivery - How to dedup
• Generate new event on call completion - once
• Precise Accounting
• Precise Stats - Event time vs processing time
Public Code
https://github.com/VoltDB/app-callcenter
It’s not finished as of today…
What’s the Hardest Part?
BeginCall code
EndCall code
State
Fake Call Generator
(Makes event pairs
with delay)
Bad Network
Transformer
(Duplicate & delay)
My Client Code
Correlation
Requires State
Schema for Call Center Example
CREATE TABLE opencalls
(
    call_id  BIGINT NOT NULL,
    agent_id INTEGER NOT NULL,
    phone_no VARCHAR(20 BYTES) NOT NULL,
    start_ts TIMESTAMP DEFAULT NULL,
    end_ts   TIMESTAMP DEFAULT NULL,
    PRIMARY KEY (call_id, agent_id, phone_no)
);
CREATE TABLE completedcalls
(
    call_id  BIGINT NOT NULL,
    agent_id INTEGER NOT NULL,
    phone_no VARCHAR(20 BYTES) NOT NULL,
    start_ts TIMESTAMP NOT NULL,
    end_ts   TIMESTAMP NOT NULL,
    duration INTEGER NOT NULL,
    PRIMARY KEY (call_id, agent_id, phone_no)
);
Unpaired call begin/end events
Can arrive in any order
Any match transactionally
moves to the completed
calls table
Filtering Duplicates
Requires Idempotence
Idempotence is the property of certain operations in
mathematics and computer science that can be
applied multiple times without changing the
result beyond the initial application.
Idempotent:
• set x = 5;
  same as
  set x = 5; set x = 5;
• if (x % 2 == 0) x++;
  same as
  if (x % 2 == 0) x++; if (x % 2 == 0) x++;
• spill coffee on brown pants
Not Idempotent:
• x++;
  not same as
  x++; x++;
• if (x % 2 == 0) x *= 2;
  not same as
  if (x % 2 == 0) x *= 2; if (x % 2 == 0) x *= 2;
• eat whole plate of spaghetti
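The four code snippets in the comparison above can be checked mechanically. The following is a minimal sketch, assuming each operation is modeled as a pure function from the old value of x to the new one; the function names and the `is_idempotent` helper are made up for illustration and are not from the talk.

```python
def set_to_five(x):
    # set x = 5;
    return 5

def increment(x):
    # x++;
    return x + 1

def bump_if_even(x):
    # if (x % 2 == 0) x++;
    return x + 1 if x % 2 == 0 else x

def double_if_even(x):
    # if (x % 2 == 0) x *= 2;
    return x * 2 if x % 2 == 0 else x

def is_idempotent(op, samples):
    """True if applying op twice equals applying it once, for every sample."""
    return all(op(op(x)) == op(x) for x in samples)

if __name__ == "__main__":
    samples = range(-10, 11)
    for op in (set_to_five, increment, bump_if_even, double_if_even):
        print(op.__name__, is_idempotent(op, samples))
```

Checking a finite sample cannot prove idempotence in general, but a single failing sample is enough to rule it out.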
At-Least-Once Delivery + Idempotent Operations = Exactly Once Semantics
How to make BeginCall Idempotent?
• If the call record is in completed calls, ignore.
• If the call record is in open calls and is missing end time, ignore.
• If the call record is in open calls, check if this event completes the call. (Yes, handle swapped begin & end.)
• Otherwise, create a new record in the open calls table.
Tables: open calls, completed calls
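The BeginCall rules above can be sketched as a single transaction. This is an illustration only, not the code from the app-callcenter repository: it uses Python's built-in sqlite3 module instead of a VoltDB stored procedure, keys both tables on `call_id` alone to stay short, stores timestamps as integers, and the names `make_db` and `begin_call` are hypothetical.

```python
import sqlite3

def make_db():
    """In-memory stand-in for the two call tables (simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE opencalls (
            call_id INTEGER PRIMARY KEY, agent_id INTEGER NOT NULL,
            phone_no TEXT NOT NULL, start_ts INTEGER, end_ts INTEGER);
        CREATE TABLE completedcalls (
            call_id INTEGER PRIMARY KEY, agent_id INTEGER NOT NULL,
            phone_no TEXT NOT NULL, start_ts INTEGER NOT NULL,
            end_ts INTEGER NOT NULL, duration INTEGER NOT NULL);
    """)
    return conn

def begin_call(conn, call_id, agent_id, phone_no, start_ts):
    """Apply one "Begin Call" event; safe to replay any number of times."""
    with conn:  # commits on success, rolls back if an exception escapes
        done = conn.execute(
            "SELECT 1 FROM completedcalls WHERE call_id = ?", (call_id,)
        ).fetchone()
        if done:
            return "ignored"  # call already completed
        row = conn.execute(
            "SELECT end_ts FROM opencalls WHERE call_id = ?", (call_id,)
        ).fetchone()
        if row is None:
            conn.execute(
                "INSERT INTO opencalls VALUES (?, ?, ?, ?, NULL)",
                (call_id, agent_id, phone_no, start_ts),
            )
            return "opened"
        end_ts = row[0]
        if end_ts is None:
            return "ignored"  # duplicate begin: open record has no end time
        # The end event arrived first: this begin completes the call.
        conn.execute("DELETE FROM opencalls WHERE call_id = ?", (call_id,))
        conn.execute(
            "INSERT INTO completedcalls VALUES (?, ?, ?, ?, ?, ?)",
            (call_id, agent_id, phone_no, start_ts, end_ts, end_ts - start_ts),
        )
        return "completed"
```

Because the delete and insert share one transaction, a failure between them leaves the call in exactly one table, which is what lets a replayed event be ignored safely.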
The list of steps above
is a transaction.
Accounting & Statistics May Require: Actual Math
https://www.flickr.com/photos/kimmanleyort/13148718593
Counting
• Counting is hard at
scale.
• 2 Kinds of fail:
• Missed counts
• Extra counts
Counting
Value: x=27 → x=27 → x=28 → x=28
Incrementer 1: Read (x=27), then Write 28
Incrementer 2: Read (x=27), then Write 28
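The lost update in the counting diagram can be replayed in a few lines. This is a minimal sketch, not code from the talk: `unsafe_interleaving` hard-codes the schedule in which both incrementers read before either writes, and the `Counter` class (a hypothetical name) shows one way to isolate the read-modify-write, here with an in-process lock standing in for what a transactional store would provide.

```python
import threading

def unsafe_interleaving(start):
    """Both incrementers read the same value, then both write value + 1."""
    value = start
    read_1 = value      # Incrementer 1 reads
    read_2 = value      # Incrementer 2 reads the same, stale value
    value = read_1 + 1  # Incrementer 1 writes
    value = read_2 + 1  # Incrementer 2 overwrites; one increment is lost
    return value

class Counter:
    """Counter whose read-modify-write is isolated by a lock."""
    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self.value += 1

def run_isolated(start, n_threads, per_thread):
    """Increment concurrently from several threads and return the total."""
    counter = Counter(start)

    def work():
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

Starting from 27, the unsafe schedule ends at 28 even though two increments ran, while the isolated version with two single increments ends at 29.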
[Figure: two instances of processing code for a single event sharing one Database / State. Not Isolated.]
Counting
• Systems with single-key consistency
• Systems with special features to enable counters
• ACID transactional systems
• Systems that enforce a single writer
As we say in New England… Performance is wicked variable.
Not “Read Committed”
Accounting
• Accounting is just counting, but more so.
• Need to be able to increment by amount (or decrement).
• Often need to increment/decrement things in groups.
Accounting
• When a gamer buys a Mystical Sword of Hegemony, update the following:
• Debit the gamer’s rubies or whatever.
• Update real-world region stats, like swords sold in gamer’s geo-region, total money spent in gamer’s geo-region, etc.
• Update game region stats for the current game location, say the “Tar Shoals of Dintymoore”, like number of MSoHs in the region.
• Increment any offer-related stats, like record whether the MSoH was offered because of customer engagement algorithm X15 or B12.
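The grouped updates above only stay consistent if they succeed or fail together. Here is a minimal sketch of that idea, assuming Python's built-in sqlite3 module as a stand-in for a transactional store; the tables, column names, and the `buy_sword` function are invented for illustration, and only two of the listed updates are shown.

```python
import sqlite3

def make_db():
    """Hypothetical schema: one gamer with 100 rubies, one region."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE gamers (
            gamer_id INTEGER PRIMARY KEY,
            rubies INTEGER NOT NULL CHECK (rubies >= 0));
        CREATE TABLE region_stats (
            region TEXT PRIMARY KEY,
            swords_sold INTEGER NOT NULL,
            rubies_spent INTEGER NOT NULL);
        INSERT INTO gamers VALUES (1, 100);
        INSERT INTO region_stats VALUES ('north', 0, 0);
    """)
    return conn

def buy_sword(conn, gamer_id, region, price):
    """Update stats and debit the gamer atomically; False if it cannot be paid."""
    try:
        with conn:  # one transaction: commit all changes or roll all back
            conn.execute(
                "UPDATE region_stats SET swords_sold = swords_sold + 1, "
                "rubies_spent = rubies_spent + ? WHERE region = ?",
                (price, region),
            )
            # The CHECK constraint rejects a negative balance, which aborts
            # the transaction and undoes the stats update above.
            conn.execute(
                "UPDATE gamers SET rubies = rubies - ? WHERE gamer_id = ?",
                (price, gamer_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

The stats update is deliberately placed before the debit so that a failed purchase visibly exercises the rollback: after a rejected sale, the region counters are unchanged.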
[Figure: processing code for a single event writing to Database / State; some of the writes are marked x. Not Atomic.]
Accounting
• Systems with single-key consistency
• Systems with special features to enable counters
• ACID transactional systems
• Systems that enforce a single writer
As we say in New England… Performance is wicked variable.
?
Last Dollar Problem
• Ad-Tech app wants to show a user an ad from a campaign.
• The price of the ad is $0.90.
• Advertiser has $1.00 campaign budget left.
• If the budget check and the display aren’t ACID, it’s possible to
decide to show the ad twice.
• Ad-Tech app is forced to choose between over or under-billing.
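One way to avoid the double display is to make the budget check and the debit a single atomic step. The sketch below assumes Python's built-in sqlite3 module as the transactional store and stores money in cents; the table layout and the `try_show_ad` function are hypothetical, not from the talk.

```python
import sqlite3

def make_db(budget_cents):
    """Hypothetical single-campaign table holding the remaining budget."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE campaigns ("
        "campaign_id INTEGER PRIMARY KEY, budget_cents INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO campaigns VALUES (1, ?)", (budget_cents,))
    conn.commit()
    return conn

def try_show_ad(conn, campaign_id, price_cents):
    """Debit the budget only if it covers the price; True means show the ad."""
    with conn:  # check and debit happen in one transaction
        cur = conn.execute(
            "UPDATE campaigns SET budget_cents = budget_cents - ? "
            "WHERE campaign_id = ? AND budget_cents >= ?",
            (price_cents, campaign_id, price_cents),
        )
        # rowcount is 1 only when the WHERE clause matched, i.e. the
        # budget was sufficient at the moment of the debit.
        return cur.rowcount == 1
```

With a $1.00 budget and a $0.90 ad, the first call returns True and the second returns False, leaving $0.10, so the app neither over-bills nor shows an unpaid ad.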
Aggregation
• Aggregation is just counting and accounting that the system does
for you.
• Often this is counting chopped up by groups.
• E.g., sword sales by region, % success by offer.
• In Call Center, it could be average call length by agent.
Accounting, Aggregation
• Systems with single-key consistency
• Systems with special features to enable counters
• ACID transactional systems
• Systems that enforce a single writer
As we say in New England… Performance is wicked variable.
?
How to Aggregate Without Consistency?
• Use a stand-alone stream processor.
• Best fit for aggregation by time, and specifically by processing
time, not event time.
• Run a query on all the data every time you want the aggregation.
• BOO!
Actual Math
What’s the mean and standard deviation of call length, chopped up various ways?
Running Variance
is my next band name.
Running Standard Deviation
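The slides do not say which formula is used for the running standard deviation. A common choice is Welford's online algorithm, sketched here as an assumption rather than as the talk's method; the `RunningStats` class name is made up. It keeps three numbers per group (count, mean, and the sum of squared deviations), so each completed call updates the statistics in constant time.

```python
import math

class RunningStats:
    """Running mean and standard deviation via Welford's online algorithm."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the current mean

    def add(self, x):
        """Fold one new observation (e.g. a call duration) into the stats."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Population variance; 0.0 when no observations have been added."""
        return self.m2 / self.count if self.count else 0.0

    def stddev(self):
        return math.sqrt(self.variance())
```

Updating these three fields is itself a read-modify-write of shared state, so it is subject to the same lost-update and partial-write hazards as plain counting unless each update runs in isolation.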
The Details (mostly) Don’t Matter
• Still need to think about performance and likely horizontal
partitioning of work.
• Integration of State & Processing + Full ACID Transactions => I can program this math without thinking about:
• Failure
• Interference from weak isolation.
• Partial Visibility to State
Bonus Topics!
Latency
Low Latency Can Affect the Decision
[Figure: latency scale marked at 500ms, with regions labeled “Want to be here” and “You lose money here”]
Get Into the “Fast Path”
• Policy Enforcement in Telco
• Fraud Detection “Smoke Tests”
• Change what a user sees in response to action:
• Change the next webpage content based on recent
website actions.
• Pick what’s behind the magic door based on how the
game is going.
Does your data matter?
Problem
Factory full of robots
Sometimes they break
They log metadata
When Imperfect is Enough
• Before: No metadata. Maintenance works on stuff based on their experience, schedules and visual inspection.
• Now: Basic stream processing system is up 99% of the time, and provides much richer guidance to maintenance. Robots fail less often and cost less to operate.
• Possible Future: More sophisticated stream processing is up 99.99% of the time and offers even more insight. Robots fail a tiny bit less often and costs are a tiny bit down.
When Imperfect Isn’t Worth It
Cost of System X + (# of Operations × Probability of Failure (under System X) × Expected Average Failure Cost)
• I’ve worked on Ad-Tech use cases => High # Operations
• Complex Multi-Cluster/System Monsters => High % failure
• Billing systems and fraud systems => High cost per failure
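The cost expression on this slide is simple enough to evaluate directly. The sketch below is illustrative only: the function name and every number in the example are hypothetical and chosen to show how a large operation count can outweigh a higher system cost.

```python
def total_expected_cost(system_cost, operations, failure_probability,
                        cost_per_failure):
    """Cost of System X + (# operations x P(failure) x average failure cost)."""
    return system_cost + operations * failure_probability * cost_per_failure

if __name__ == "__main__":
    # Hypothetical inputs: one billion operations, $500 per failure.
    weaker = total_expected_cost(
        system_cost=100_000, operations=1_000_000_000,
        failure_probability=1e-6, cost_per_failure=500)
    stronger = total_expected_cost(
        system_cost=250_000, operations=1_000_000_000,
        failure_probability=1e-8, cost_per_failure=500)
    print(f"weaker guarantees:   ${weaker:,.0f}")
    print(f"stronger guarantees: ${stronger:,.0f}")
```

With these made-up inputs the failure term dominates for the cheaper system, which is the situation the three bullets above describe: any one of high operation count, high failure rate, or high cost per failure can make the cheaper system the more expensive choice.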
Licenses
Hardware
Engineering
(Switching Tech)
More consistent
systems don’t have to
be more expensive
Easier to develop => Less Engineering
More Efficient => Less Hardware
Conclusion - Thank You!
• Operations => Integration Wins
• Analytics, Batch => Use Specialized Tools
• With transactions, complex math becomes
mostly typing.
• Many of these problems can be solved without
transactional streaming, but…
• It’s going to be harder
• It might be less accurate
[Diagram: THIS TALK, shown relative to “Stuff I Know”, “Stuff I Don't Know”, and “BS”]
http://chat.voltdb.com
@johnhugg
jhugg@voltdb.com
all images from wikimedia w/ cc license unless otherwise noted

Weitere ähnliche Inhalte

Was ist angesagt?

101 ways to configure kafka - badly (Kafka Summit)
101 ways to configure kafka - badly (Kafka Summit)101 ways to configure kafka - badly (Kafka Summit)
101 ways to configure kafka - badly (Kafka Summit)Henning Spjelkavik
 
Multi-Cluster and Failover for Apache Kafka - Kafka Summit SF 17
Multi-Cluster and Failover for Apache Kafka - Kafka Summit SF 17Multi-Cluster and Failover for Apache Kafka - Kafka Summit SF 17
Multi-Cluster and Failover for Apache Kafka - Kafka Summit SF 17Gwen (Chen) Shapira
 
Reducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive StreamsReducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive Streamsjimriecken
 
Building Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache KafkaBuilding Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache KafkaBrian Ritchie
 
Strata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache Kafka
Strata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache KafkaStrata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache Kafka
Strata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache Kafkaconfluent
 
Apache BookKeeper Distributed Store- a Salesforce use case
Apache BookKeeper Distributed Store- a Salesforce use caseApache BookKeeper Distributed Store- a Salesforce use case
Apache BookKeeper Distributed Store- a Salesforce use caseSalesforce Engineering
 
Kafka At Scale in the Cloud
Kafka At Scale in the CloudKafka At Scale in the Cloud
Kafka At Scale in the Cloudconfluent
 
Reactive Supply To Changing Demand
Reactive Supply To Changing DemandReactive Supply To Changing Demand
Reactive Supply To Changing DemandJonas Bonér
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsJonas Bonér
 
Tale of two streaming frameworks- Apace Storm & Apache Flink
Tale of two streaming frameworks- Apace Storm & Apache FlinkTale of two streaming frameworks- Apace Storm & Apache Flink
Tale of two streaming frameworks- Apace Storm & Apache FlinkKarthik Deivasigamani
 
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...Natan Silnitsky
 
101 ways to configure kafka - badly
101 ways to configure kafka - badly101 ways to configure kafka - badly
101 ways to configure kafka - badlyHenning Spjelkavik
 
Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache KafkaBuilding Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache KafkaGuozhang Wang
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservicespflueras
 
Disaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache KafkaDisaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache Kafkaconfluent
 
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...Clustrix
 

Was ist angesagt? (20)

101 ways to configure kafka - badly (Kafka Summit)
101 ways to configure kafka - badly (Kafka Summit)101 ways to configure kafka - badly (Kafka Summit)
101 ways to configure kafka - badly (Kafka Summit)
 
Multi-Cluster and Failover for Apache Kafka - Kafka Summit SF 17
Multi-Cluster and Failover for Apache Kafka - Kafka Summit SF 17Multi-Cluster and Failover for Apache Kafka - Kafka Summit SF 17
Multi-Cluster and Failover for Apache Kafka - Kafka Summit SF 17
 
Reducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive StreamsReducing Microservice Complexity with Kafka and Reactive Streams
Reducing Microservice Complexity with Kafka and Reactive Streams
 
Building Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache KafkaBuilding Event-Driven Systems with Apache Kafka
Building Event-Driven Systems with Apache Kafka
 
Strata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache Kafka
Strata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache KafkaStrata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache Kafka
Strata+Hadoop 2017 San Jose: Lessons from a year of supporting Apache Kafka
 
Apache BookKeeper Distributed Store- a Salesforce use case
Apache BookKeeper Distributed Store- a Salesforce use caseApache BookKeeper Distributed Store- a Salesforce use case
Apache BookKeeper Distributed Store- a Salesforce use case
 
Kafka At Scale in the Cloud
Kafka At Scale in the CloudKafka At Scale in the Cloud
Kafka At Scale in the Cloud
 
Reactive Supply To Changing Demand
Reactive Supply To Changing DemandReactive Supply To Changing Demand
Reactive Supply To Changing Demand
 
Scalability, Availability & Stability Patterns
Scalability, Availability & Stability PatternsScalability, Availability & Stability Patterns
Scalability, Availability & Stability Patterns
 
Tale of two streaming frameworks- Apace Storm & Apache Flink
Tale of two streaming frameworks- Apace Storm & Apache FlinkTale of two streaming frameworks- Apace Storm & Apache Flink
Tale of two streaming frameworks- Apace Storm & Apache Flink
 
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
8 Lessons Learned from Using Kafka in 1000 Scala microservices - Scale by the...
 
Five steps perform_2013
Five steps perform_2013Five steps perform_2013
Five steps perform_2013
 
101 ways to configure kafka - badly
101 ways to configure kafka - badly101 ways to configure kafka - badly
101 ways to configure kafka - badly
 
Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache KafkaBuilding Stream Infrastructure across Multiple Data Centers with Apache Kafka
Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
 
Fraud Detection Architecture
Fraud Detection ArchitectureFraud Detection Architecture
Fraud Detection Architecture
 
Stability Patterns for Microservices
Stability Patterns for MicroservicesStability Patterns for Microservices
Stability Patterns for Microservices
 
Disaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache KafkaDisaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache Kafka
 
Dev Ops without the Ops
Dev Ops without the OpsDev Ops without the Ops
Dev Ops without the Ops
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...
Tech Talk Series, Part 4: How do you achieve high availability in a MySQL env...
 

Ähnlich wie Transactional Streaming: If you can compute it, you can probably stream it.

Rate Limiting at Scale, from SANS AppSec Las Vegas 2012
Rate Limiting at Scale, from SANS AppSec Las Vegas 2012Rate Limiting at Scale, from SANS AppSec Las Vegas 2012
Rate Limiting at Scale, from SANS AppSec Las Vegas 2012Nick Galbreath
 
Netflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesNetflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesEd Hunter
 
AWS re:Invent 2016: How Fulfillment by Amazon (FBA) and Scopely Improved Resu...
AWS re:Invent 2016: How Fulfillment by Amazon (FBA) and Scopely Improved Resu...AWS re:Invent 2016: How Fulfillment by Amazon (FBA) and Scopely Improved Resu...
AWS re:Invent 2016: How Fulfillment by Amazon (FBA) and Scopely Improved Resu...Amazon Web Services
 
Using Time Series for Full Observability of a SaaS Platform
Using Time Series for Full Observability of a SaaS PlatformUsing Time Series for Full Observability of a SaaS Platform
Using Time Series for Full Observability of a SaaS PlatformDevOps.com
 
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Brian Brazil
 
The Art of The Event Streaming Application: Streams, Stream Processors and Sc...
The Art of The Event Streaming Application: Streams, Stream Processors and Sc...The Art of The Event Streaming Application: Streams, Stream Processors and Sc...
The Art of The Event Streaming Application: Streams, Stream Processors and Sc...confluent
 
Kakfa summit london 2019 - the art of the event-streaming app
Kakfa summit london 2019 - the art of the event-streaming appKakfa summit london 2019 - the art of the event-streaming app
Kakfa summit london 2019 - the art of the event-streaming appNeil Avery
 
Respond to and troubleshoot production incidents like an sa
Respond to and troubleshoot production incidents like an saRespond to and troubleshoot production incidents like an sa
Respond to and troubleshoot production incidents like an saTom Cudd
 
Event-Driven Architectures Done Right | Tim Berglund, Confluent
Event-Driven Architectures Done Right | Tim Berglund, ConfluentEvent-Driven Architectures Done Right | Tim Berglund, Confluent
Event-Driven Architectures Done Right | Tim Berglund, ConfluentHostedbyConfluent
 
Building a Database for the End of the World
Building a Database for the End of the WorldBuilding a Database for the End of the World
Building a Database for the End of the Worldjhugg
 
Universal Analytics Common Issues - MeasureFest
Universal Analytics Common Issues - MeasureFestUniversal Analytics Common Issues - MeasureFest
Universal Analytics Common Issues - MeasureFestdarafitzgerald
 
Observability - the good, the bad, and the ugly
Observability - the good, the bad, and the uglyObservability - the good, the bad, and the ugly
Observability - the good, the bad, and the uglyAleksandr Tavgen
 
Dynamics CRM high volume systems - lessons from the field
Dynamics CRM high volume systems - lessons from the fieldDynamics CRM high volume systems - lessons from the field
Dynamics CRM high volume systems - lessons from the fieldStéphane Dorrekens
 
Big Data presentation at GITPRO 2013
Big Data presentation at GITPRO 2013Big Data presentation at GITPRO 2013
Big Data presentation at GITPRO 2013Sameer Wadkar
 
Five finger audit
Five finger auditFive finger audit
Five finger auditBertil Hatt
 
Using InfluxDB for Full Observability of a SaaS Platform by Aleksandr Tavgen,...
Using InfluxDB for Full Observability of a SaaS Platform by Aleksandr Tavgen,...Using InfluxDB for Full Observability of a SaaS Platform by Aleksandr Tavgen,...
Using InfluxDB for Full Observability of a SaaS Platform by Aleksandr Tavgen,...InfluxData
 
B.tech admission in india
B.tech admission in indiaB.tech admission in india
B.tech admission in indiaEdhole.com
 
Decimal arithmetic in Processors
Decimal arithmetic in ProcessorsDecimal arithmetic in Processors
Decimal arithmetic in ProcessorsPeeyush Pashine
 
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
(DVO205) Monitoring Evolution: Flying Blind to Flying by InstrumentAmazon Web Services
 
Data-driven product management
Data-driven product managementData-driven product management
Data-driven product managementArseny Kravchenko
 

Ähnlich wie Transactional Streaming: If you can compute it, you can probably stream it. (20)

Rate Limiting at Scale, from SANS AppSec Las Vegas 2012
Rate Limiting at Scale, from SANS AppSec Las Vegas 2012Rate Limiting at Scale, from SANS AppSec Las Vegas 2012
Rate Limiting at Scale, from SANS AppSec Las Vegas 2012
 
Netflix SRE perf meetup_slides
Netflix SRE perf meetup_slidesNetflix SRE perf meetup_slides
Netflix SRE perf meetup_slides
 
AWS re:Invent 2016: How Fulfillment by Amazon (FBA) and Scopely Improved Resu...
AWS re:Invent 2016: How Fulfillment by Amazon (FBA) and Scopely Improved Resu...AWS re:Invent 2016: How Fulfillment by Amazon (FBA) and Scopely Improved Resu...
AWS re:Invent 2016: How Fulfillment by Amazon (FBA) and Scopely Improved Resu...
 
Using Time Series for Full Observability of a SaaS Platform
Using Time Series for Full Observability of a SaaS PlatformUsing Time Series for Full Observability of a SaaS Platform
Using Time Series for Full Observability of a SaaS Platform
 
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
Your data is in Prometheus, now what? (CurrencyFair Engineering Meetup, 2016)
 
The Art of The Event Streaming Application: Streams, Stream Processors and Sc...
The Art of The Event Streaming Application: Streams, Stream Processors and Sc...The Art of The Event Streaming Application: Streams, Stream Processors and Sc...
The Art of The Event Streaming Application: Streams, Stream Processors and Sc...
 
Kakfa summit london 2019 - the art of the event-streaming app
Kakfa summit london 2019 - the art of the event-streaming appKakfa summit london 2019 - the art of the event-streaming app
Kakfa summit london 2019 - the art of the event-streaming app
 
Respond to and troubleshoot production incidents like an sa
Respond to and troubleshoot production incidents like an saRespond to and troubleshoot production incidents like an sa
Respond to and troubleshoot production incidents like an sa
 
Event-Driven Architectures Done Right | Tim Berglund, Confluent
Event-Driven Architectures Done Right | Tim Berglund, ConfluentEvent-Driven Architectures Done Right | Tim Berglund, Confluent
Event-Driven Architectures Done Right | Tim Berglund, Confluent
 
Building a Database for the End of the World
Building a Database for the End of the WorldBuilding a Database for the End of the World
Building a Database for the End of the World
 
Universal Analytics Common Issues - MeasureFest
Universal Analytics Common Issues - MeasureFestUniversal Analytics Common Issues - MeasureFest
Universal Analytics Common Issues - MeasureFest
 
Observability - the good, the bad, and the ugly
Observability - the good, the bad, and the uglyObservability - the good, the bad, and the ugly
Observability - the good, the bad, and the ugly
 
Dynamics CRM high volume systems - lessons from the field
Dynamics CRM high volume systems - lessons from the fieldDynamics CRM high volume systems - lessons from the field
Dynamics CRM high volume systems - lessons from the field
 
Big Data presentation at GITPRO 2013
Big Data presentation at GITPRO 2013Big Data presentation at GITPRO 2013
Big Data presentation at GITPRO 2013
 
Five finger audit
Five finger auditFive finger audit
Five finger audit
 
Using InfluxDB for Full Observability of a SaaS Platform by Aleksandr Tavgen,...
Using InfluxDB for Full Observability of a SaaS Platform by Aleksandr Tavgen,...Using InfluxDB for Full Observability of a SaaS Platform by Aleksandr Tavgen,...
Using InfluxDB for Full Observability of a SaaS Platform by Aleksandr Tavgen,...
 
B.tech admission in india
B.tech admission in indiaB.tech admission in india
B.tech admission in india
 
Decimal arithmetic in Processors
Decimal arithmetic in ProcessorsDecimal arithmetic in Processors
Decimal arithmetic in Processors
 
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
(DVO205) Monitoring Evolution: Flying Blind to Flying by Instrument
 
Data-driven product management
Data-driven product managementData-driven product management
Data-driven product management
 

Kürzlich hochgeladen

%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrainmasabamasaba
 
%in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park %in kempton park+277-882-255-28 abortion pills for sale in kempton park
%in kempton park+277-882-255-28 abortion pills for sale in kempton park masabamasaba
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfayushiqss
 
Pharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodologyPharm-D Biostatistics and Research methodology
Pharm-D Biostatistics and Research methodologyAnusha Are
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
ManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ - Sprint 236 Review - Slide Deck
ManageIQ - Sprint 236 Review - Slide DeckManageIQ
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid


Transactional Streaming: If you can compute it, you can probably stream it.

  • 1. Transactional Streaming If you can compute it, you can probably stream it. John Hugg March 30th, 2016 @johnhugg / jhugg@voltdb.com
  • 2. Who Am I? • First developer on the VoltDB project. • Previously at Vertica and other data startups. • Have made so many bad decisions over the years, that now I almost know what I'm talking about. • jhugg@voltdb.com • @johnhugg • http://chat.voltdb.com
  • 4. Operations at Scale • Ingest data from several sources into a horizontally scalable system. • Process data on arrival (i.e., transform, correlate, filter, and aggregate data). • Understand, act, and record. • Push relevant data to a downstream, big data system.
  • 11. One Size Fits All • Analytics and operational stateful stores require different storage engines to be optimal: Columns vs. Rows, Vertica vs. VoltDB. • Machine Learning, Multi-Dim Math, Search • Microservices? • Data Value?
  • 12. Specifically: Operational Stream Processing and Operational State Where integration makes sense: Leading Edge Operations
  • 13. What’s the Difference? • Non-integrated systems mean you write glue code, or you use someone’s glue code. • Operational glue code is different from batch-oriented glue code. • Batch or OLAP has huge safety nets for glue code: • HDFS, CSV, immutable data sets • “Blow it away and reload” • Much less time pressure
  • 14. Glue Glue You wrote this. 1 User. Tested Well 1000s of users Tested Well 1000s of users Tested Well 1000s of users Community Supplied Many Users
  • 15. But I’m not writing “glue code” “I’m just using the well-tested Cassandra driver in my Storm code.” • You’re using a computer network. They are not always reliable. • Storm might fail in the middle of processing. • Cassandra might fail in the middle of processing. • Both systems are tested for this, but not together, using your glue code.
  • 16. Operational Glue Code is Hard Main Point: Minimize it
  • 18. Use the same system for state and processing. Ensures they are tested together. No independent failures.
  • 19. 1 Transaction = 1 Event ACID • Atomic: Either 100% done or 0% done. No in-between. • (Consistent) • Isolated: Two concurrent operations can’t interfere with each other • Durable: If it says it’s done, then it is done.
  • 20. Processing Code for a Single Event Database / State
  • 21. Processing Code for a Single Event Database / State x x x x Not Atomic
  • 22. Romeo And Juliet Explain “Atomicity” Operation 1: Fake your death Operation 2: Tell Romeo
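The Romeo and Juliet slide can be sketched in code. This is a minimal illustration, assuming SQLite (via Python's standard `sqlite3` module) as a stand-in for the transactional store; the `state` table, its keys, and the `run_plan` function are hypothetical names, not something from the talk. Both operations run inside one transaction, so a failure between them undoes the first.

```python
import sqlite3

# Hypothetical state: two flags that must change together or not at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE state (key TEXT PRIMARY KEY, value INTEGER NOT NULL);
INSERT INTO state VALUES ('juliet_appears_dead', 0);
INSERT INTO state VALUES ('romeo_knows_plan', 0);
""")

def read_state(conn):
    """Return the current flags as a dict."""
    return dict(conn.execute("SELECT key, value FROM state"))

def run_plan(conn, messenger_fails):
    """Run both operations in one transaction; any failure undoes both."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Operation 1: fake your death.
            conn.execute("UPDATE state SET value = 1 WHERE key = 'juliet_appears_dead'")
            if messenger_fails:
                raise RuntimeError("the message never reached Romeo")
            # Operation 2: tell Romeo.
            conn.execute("UPDATE state SET value = 1 WHERE key = 'romeo_knows_plan'")
    except RuntimeError:
        pass  # the rollback has already happened
    return read_state(conn)

after_failure = run_plan(conn, messenger_fails=True)
after_success = run_plan(conn, messenger_fails=False)
print(after_failure)
print(after_success)
```

Without the surrounding transaction, the failed run would leave Juliet apparently dead and Romeo uninformed, which is the "in-between" state atomicity rules out.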
  • 23. Processing Code for a Single Event Database / State Processing Code for a Single Event Not Isolated
  • 24. “A good example is the best sermon.” - Benjamin Franklin
  • 25. Call Center Management http://www.publicdomainpictures.net/ 3000 Agents Millions of Customers Dashboards & Alerts Billing Actions Events Processing State
  • 26. Call Center Management Events • “Begin Call”: Calling Number, Agent Id, Start Time, etc… • “End Call”: Calling Number, Agent Id, End Time, etc…
  • 27. What Kind of Problems • Correlation - Streaming Join • Out-of-order delivery • At least once delivery - How to dedup • Generate new event on call completion - once • Precise Accounting • Precise Stats - Event time vs processing time
  • 29. What’s the Hardest Part? BeginCall code EndCall code State Fake Call Generator (Makes event pairs with delay) Bad Network Transformer (Duplicate & delay) My Client Code
  • 31. Schema for Call Center Example
    CREATE TABLE opencalls (
      call_id BIGINT NOT NULL,
      agent_id INTEGER NOT NULL,
      phone_no VARCHAR(20 BYTES) NOT NULL,
      start_ts TIMESTAMP DEFAULT NULL,
      end_ts TIMESTAMP DEFAULT NULL,
      PRIMARY KEY (call_id, agent_id, phone_no)
    );
    CREATE TABLE completedcalls (
      call_id BIGINT NOT NULL,
      agent_id INTEGER NOT NULL,
      phone_no VARCHAR(20 BYTES) NOT NULL,
      start_ts TIMESTAMP NOT NULL,
      end_ts TIMESTAMP NOT NULL,
      duration INTEGER NOT NULL,
      PRIMARY KEY (call_id, agent_id, phone_no)
    );
    Notes: opencalls holds unpaired call begin/end events, which can arrive in any order. Any match transactionally moves to the completed calls table.
  • 33. Idempotence is the property of certain operations in mathematics and computer science that can be applied multiple times without changing the result beyond the initial application.
  • 34. Idempotent vs. Not Idempotent
    Idempotent: • "set x = 5;" is the same as "set x = 5; set x = 5;" • "if (x % 2 == 0) x++;" is the same as "if (x % 2 == 0) x++; if (x % 2 == 0) x++;" • spill coffee on brown pants
    Not idempotent: • "x++;" is not the same as "x++; x++;" • "if (x % 2 == 0) x *= 2;" is not the same as "if (x % 2 == 0) x *= 2; if (x % 2 == 0) x *= 2;" • eat whole plate of spaghetti
  • 35. At-Least-Once Delivery + Idempotent Operations = Exactly-Once Semantics
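A small sketch of why idempotent handling turns at-least-once delivery into exactly-once results. The delivery simulator and both handlers are hypothetical illustrations, assuming each event carries a unique id; nothing here is taken from the talk's own code.

```python
def deliver_at_least_once(events):
    """Simulate a flaky network that delivers every event twice."""
    for event in events:
        yield event
        yield event  # duplicate delivery after a presumed-lost acknowledgement

def count_naive(deliveries):
    """Non-idempotent handler: each delivery increments the total."""
    total = 0
    for _ in deliveries:
        total += 1
    return total

def count_idempotent(deliveries):
    """Idempotent handler: recording an id a second time changes nothing."""
    seen = set()
    for event_id in deliveries:
        seen.add(event_id)
    return len(seen)

events = ["call-1", "call-2", "call-3"]
naive_count = count_naive(deliver_at_least_once(events))
exact_count = count_idempotent(deliver_at_least_once(events))
print(naive_count, exact_count)
```

The naive handler double-counts every duplicate, while the idempotent one reports the true number of events no matter how often each is redelivered.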
  • 36. How to make BeginCall Idempotent? • If call record is in completed calls, ignore. • If the call record is in open calls and is missing end time, ignore. • If call record is in open calls, check if this event completes the call. If yes, handle swapped begin & end. • Otherwise, create a new record in open calls table. Tables: open calls, completed calls
  • 38. • If call record is in completed calls, ignore. • If the call record is in open calls and is missing end time, ignore. • If call record is in open calls, check if this event completes the call. If yes, handle swapped begin & end. • Otherwise, create a new record in open calls table. This thing to the left is a transaction.
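The BeginCall steps above can be sketched as one transaction. This is an illustration only, assuming SQLite through Python's `sqlite3` module rather than VoltDB; the column types are simplified (integer timestamps, `TEXT` phone numbers), and `begin_call` with its return strings is a hypothetical name, not the talk's stored procedure.

```python
import sqlite3

# Simplified version of the slide's schema (types adapted for SQLite).
SCHEMA = """
CREATE TABLE opencalls (
    call_id INTEGER NOT NULL, agent_id INTEGER NOT NULL, phone_no TEXT NOT NULL,
    start_ts INTEGER DEFAULT NULL, end_ts INTEGER DEFAULT NULL,
    PRIMARY KEY (call_id, agent_id, phone_no));
CREATE TABLE completedcalls (
    call_id INTEGER NOT NULL, agent_id INTEGER NOT NULL, phone_no TEXT NOT NULL,
    start_ts INTEGER NOT NULL, end_ts INTEGER NOT NULL, duration INTEGER NOT NULL,
    PRIMARY KEY (call_id, agent_id, phone_no));
"""
KEY = "call_id = ? AND agent_id = ? AND phone_no = ?"

def begin_call(conn, call_id, agent_id, phone_no, start_ts):
    """Handle one Begin Call event idempotently, inside a single transaction."""
    key = (call_id, agent_id, phone_no)
    with conn:  # commit on success, roll back on any exception
        # 1. Already completed: a late duplicate, ignore it.
        if conn.execute(f"SELECT 1 FROM completedcalls WHERE {KEY}", key).fetchone():
            return "ignored: already completed"
        row = conn.execute(f"SELECT end_ts FROM opencalls WHERE {KEY}", key).fetchone()
        # 4. Nothing known about this call yet: record the begin event.
        if row is None:
            conn.execute("INSERT INTO opencalls VALUES (?, ?, ?, ?, NULL)",
                         key + (start_ts,))
            return "opened"
        end_ts = row[0]
        # 2. Open record with no end time: this begin was already seen.
        if end_ts is None:
            return "ignored: duplicate begin"
        # 3. The end event arrived first: this begin completes the call.
        conn.execute(f"DELETE FROM opencalls WHERE {KEY}", key)
        conn.execute("INSERT INTO completedcalls VALUES (?, ?, ?, ?, ?, ?)",
                     key + (start_ts, end_ts, end_ts - start_ts))
        return "completed"

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
first = begin_call(conn, 1, 7, "555-0100", 100)
duplicate = begin_call(conn, 1, 7, "555-0100", 100)
# Simulate an End Call event that arrived before its Begin Call.
conn.execute("INSERT INTO opencalls VALUES (2, 7, '555-0101', NULL, 250)")
conn.commit()
swapped = begin_call(conn, 2, 7, "555-0101", 200)
replayed = begin_call(conn, 2, 7, "555-0101", 200)
duration = conn.execute("SELECT duration FROM completedcalls WHERE call_id = 2").fetchone()[0]
print(first, duplicate, swapped, replayed, duration)
```

Because the checks and the writes share one transaction, replaying any event leaves the tables in the same state as processing it once.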
  • 40. Counting • Counting is hard at scale. • 2 Kinds of fail: • Missed counts • Extra counts
  • 41. Counting
    Value:         x=27   x=27   x=28       x=28
    Incrementer 1: Read          Write 28
    Incrementer 2:        Read              Write 28
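The interleaving on this slide can be replayed deterministically. The sketch below is a hypothetical single-threaded simulation (the `Store` class and both functions are illustrative names): two incrementers each read the value before either writes, so one increment is lost.

```python
class Store:
    """A trivial single-value store with separate read and write steps."""
    def __init__(self, value):
        self.value = value

    def read(self):
        return self.value

    def write(self, value):
        self.value = value

def interleaved_increments(store):
    """Both incrementers read before either writes: not isolated."""
    seen_by_1 = store.read()      # Incrementer 1 reads 27
    seen_by_2 = store.read()      # Incrementer 2 reads 27
    store.write(seen_by_1 + 1)    # Incrementer 1 writes 28
    store.write(seen_by_2 + 1)    # Incrementer 2 also writes 28
    return store.value

def isolated_increments(store):
    """Each read-modify-write finishes before the next one starts."""
    for _ in range(2):
        store.write(store.read() + 1)
    return store.value

lost_update_result = interleaved_increments(Store(27))
isolated_result = isolated_increments(Store(27))
print(lost_update_result, isolated_result)
```

Two increments were requested in both runs, but only the isolated run ends at 29; the interleaved run is a missed count.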
  • 42. Processing Code for a Single Event Database / State Processing Code for a Single Event Not Isolated
  • 43. Counting Systems with single-key consistency Systems with special features to enable counters ACID transactional systems Systems that enforce a single writer As we say in New England… Performance is wicked variable. Not “Read Committed”
  • 44. Accounting • Accounting is just counting, but more so. • Need to be able to increment by amount (or decrement). • Often need to increment/decrement things in groups.
  • 45. Accounting • When gamer buys a Mystical Sword of Hegemony, update the following: • Debit the gamer’s rubies or whatever. • Update real-world region stats, like swords sold in gamer’s geo-region, total money spent in gamer’s geo-region etc… • Update game region stats for the current game location, say the “Tar Shoals of Dintymoore”, like number of MSoHs in the region. • Increment any offer-related stats, like record whether the MSoH was offered because of customer engagement algorithm X15 or B12.
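A sketch of the grouped update described above, assuming SQLite via Python's `sqlite3` module as the transactional store. The table layout, counter names, and the `buy_sword` function are all hypothetical. A `CHECK` constraint rejects a debit that would make the balance negative, and the surrounding transaction then rolls back every stat already bumped for that purchase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE gamers (name TEXT PRIMARY KEY,
                     rubies INTEGER NOT NULL CHECK (rubies >= 0));
CREATE TABLE stats (name TEXT PRIMARY KEY, n INTEGER NOT NULL);
INSERT INTO gamers VALUES ('ada', 100);
""")

def bump(conn, name, amount=1):
    """Add to a named counter, creating it at zero if needed."""
    conn.execute("INSERT OR IGNORE INTO stats VALUES (?, 0)", (name,))
    conn.execute("UPDATE stats SET n = n + ? WHERE name = ?", (amount, name))

def stat(conn, name):
    row = conn.execute("SELECT n FROM stats WHERE name = ?", (name,)).fetchone()
    return row[0] if row else 0

def buy_sword(conn, gamer, price, geo_region, game_region, offer):
    """Debit the gamer and update every related stat, all or nothing."""
    try:
        with conn:  # one transaction for the whole group of updates
            bump(conn, f"geo:{geo_region}:swords_sold")
            bump(conn, f"geo:{geo_region}:rubies_spent", price)
            bump(conn, f"game:{game_region}:swords")
            bump(conn, f"offer:{offer}:accepted")
            # Fails the CHECK constraint if the gamer cannot afford it.
            conn.execute("UPDATE gamers SET rubies = rubies - ? WHERE name = ?",
                         (price, gamer))
        return True
    except sqlite3.IntegrityError:
        return False  # the stat bumps above were rolled back too

first_purchase = buy_sword(conn, "ada", 60, "new-england", "tar-shoals", "X15")
second_purchase = buy_sword(conn, "ada", 60, "new-england", "tar-shoals", "X15")
rubies_left = conn.execute("SELECT rubies FROM gamers WHERE name = 'ada'").fetchone()[0]
swords_sold = stat(conn, "geo:new-england:swords_sold")
print(first_purchase, second_purchase, rubies_left, swords_sold)
```

The second purchase fails at the debit, yet the region and offer counters still show exactly one sale, because the partial updates never became visible.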
  • 46. Processing Code for a Single Event Database / State x x x x Not Atomic
  • 47. Accounting Systems with single-key consistency Systems with special features to enable counters ACID transactional systems Systems that enforce a single writer As we say in New England… Performance is wicked variable. ?
  • 48. Last Dollar Problem • Ad-Tech app wants to show a user an ad from a campaign. • The price of the ad is $0.90. • Advertiser has $1.00 campaign budget left. • If the budget check and the display aren’t ACID, it’s possible to decide to show the ad twice. • Ad-Tech app is forced to choose between over or under-billing.
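One way to picture the Last Dollar Problem, again assuming SQLite through Python's `sqlite3` module; the `campaigns` table and `try_show_ad` function are hypothetical. The budget check and the debit happen in a single conditional update inside one transaction, so the second request for a $0.90 ad against a $1.00 budget is refused rather than over-billed. Amounts are kept in integer cents to avoid floating-point rounding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE campaigns (id INTEGER PRIMARY KEY, budget_cents INTEGER NOT NULL);
INSERT INTO campaigns VALUES (1, 100);
""")

def try_show_ad(conn, campaign_id, price_cents):
    """Check the budget and debit it as one atomic step."""
    with conn:
        cursor = conn.execute(
            "UPDATE campaigns SET budget_cents = budget_cents - ? "
            "WHERE id = ? AND budget_cents >= ?",
            (price_cents, campaign_id, price_cents))
        # rowcount is 1 only if the budget covered the price.
        return cursor.rowcount == 1

decisions = [try_show_ad(conn, 1, 90), try_show_ad(conn, 1, 90)]
remaining = conn.execute("SELECT budget_cents FROM campaigns WHERE id = 1").fetchone()[0]
print(decisions, remaining)
```

If the check and the debit were separate, non-isolated steps, both requests could pass the check against the same $1.00 and the campaign would be billed $1.80.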
  • 49. Aggregation • Aggregation is just counting and accounting that the system does for you. • Often this is counting chopped up by groups. • E.g., sword sales by region, % success by offer. • In Call Center, it could be average call length by agent.
  • 50. Accounting Aggregation Systems with single-key consistency Systems with special features to enable counters ACID transactional systems Systems that enforce a single writer As we say in New England… Performance is wicked variable. ?
  • 51. How to Aggregate Without Consistency? • Use a stand-alone stream processor. • Best fit for aggregation by time, and specifically by processing time, not event time. • Run a query on all the data every time you want the aggregation. • BOO!
  • 52. Actual Math: What’s the mean and standard deviation of call length chopped up various ways?
  • 53. Running Variance is my next band name.
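For readers who want to see what a running variance looks like, here is a sketch using Welford's online algorithm. That choice is an assumption: the transcript does not show which formula the talk uses, and the `RunningStats` class is a hypothetical name. The state is three numbers per group, updated once per completed call, which is the kind of read-modify-write that needs isolation when done against shared state.

```python
import math

class RunningStats:
    """Running mean and variance via Welford's online algorithm."""

    def __init__(self):
        self.n = 0        # number of values seen
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Population variance of the values seen so far."""
        return self.m2 / self.n if self.n else 0.0

    def stddev(self):
        return math.sqrt(self.variance())

# Hypothetical call lengths in minutes for one agent.
stats = RunningStats()
for call_length in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.add(call_length)
print(stats.mean, stats.stddev())
```

Keeping one such record per agent (or per region, per hour, and so on) gives the "chopped up various ways" breakdown without rescanning all calls.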
  • 55. The Details (mostly) Don’t Matter • Still need to think about performance and likely horizontal partitioning of work. • Integration of State & Processing + Full ACID Transactions => I can program this math without thinking about: • Failure • Interference from weak isolation • Partial visibility to state
  • 58. Low Latency Can Affect the Decision 500ms Want to be here You lose money here
  • 59. Get Into the “Fast Path” • Policy Enforcement in Telco • Fraud Detection “Smoke Tests” • Change what a user sees in response to action: • Change the next webpage content based on recent website actions. • Pick what’s behind the magic door based on how the game is going.
  • 61. Problem Factory full of robots Sometimes they break They log metadata
  • 62. When Imperfect is Enough • Before: No metadata. Maintenance works on stuff based on their experience, schedules and visual inspection. • Now: Basic stream processing system is up 99% of the time, and provides much richer guidance to maintenance. Robots fail less often and cost less to operate. • Possible Future: More sophisticated stream processing is up 99.99% of the time and offers even more insight. Robots fail a tiny bit less often and costs are a tiny bit down.
  • 63. When Imperfect Isn’t Worth It
    Total cost = Cost of System X + (# of Operations × Probability of Failure (under System X) × Expected Average Failure Cost)
    Cost of System X: Licenses, Hardware, Engineering (Switching Tech)
    • I’ve worked on Ad-Tech use cases => High # Operations • Complex Multi-Cluster/System Monsters => High % failure • Billing systems and fraud systems => High cost per failure
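The cost comparison on this slide is simple arithmetic, sketched below. The function name and every number are made up for illustration; none of them come from the talk. The point is only that a high operation count can make a small difference in failure probability outweigh a difference in system price.

```python
def expected_total_cost(system_cost, operations, failure_probability, cost_per_failure):
    """Cost of the system plus the expected cost of its failures."""
    return system_cost + operations * failure_probability * cost_per_failure

# Hypothetical workload: one billion operations, 5 cents lost per failure.
OPERATIONS = 1_000_000_000
COST_PER_FAILURE = 0.05

# Hypothetical systems: one pricier but more consistent, one cheaper but sloppier.
consistent_total = expected_total_cost(200_000, OPERATIONS, 1e-4, COST_PER_FAILURE)
sloppy_total = expected_total_cost(150_000, OPERATIONS, 1e-2, COST_PER_FAILURE)
print(round(consistent_total), round(sloppy_total))
```

With these invented figures the cheaper system ends up costing more overall, because its expected failure cost dwarfs the savings on licenses and hardware.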
  • 64. More consistent systems don’t have to be more expensive Easier to develop => Less Engineering More Efficient => Less Hardware
  • 65. Conclusion - Thank You! • Operations => Integration Wins. Analytics, Batch => Use Specialized Tools. • With transactions, complex math becomes mostly typing. • Many of these problems can be solved without transactional streaming, but… • It’s going to be harder • It might be less accurate. (Diagram labels: BS, Stuff I Don't Know, Stuff I Know, This Talk.) http://chat.voltdb.com @johnhugg jhugg@voltdb.com All images from Wikimedia w/ CC license unless otherwise noted.