1
DESIGNING MODERN STREAMING
DATA APPLICATIONS
ARUN KEJARIWAL, KARTHIK RAMASAMY
@arun_kejariwal @karthikz
2
ABOUT US
3
4
Connected World
5
Ubiquity of Real-Time Data Streams
6
Data-Driven Decision Making
UNLOCKING INSIGHTS
7
GUIDE DECISION MAKING
DETERMINING TOP TRAFFIC INTENSIVE IP ADDRESSES
IDENTIFYING ARBITRAGE OPPORTUNIT...
ON-THE-FLY INSIGHTS
8
“PERISHABLE”
RECENCY IS CRITICAL
STREAMING/FAST DATA PROCESSING
9
✦ Events are analyzed and processed as they arrive
✦ Decisions are timely, contextual and...
MICROSERVICES
MODEL INFERENCE WORKFLOWS ANALYTICS
MONITORING
STREAMING APPLICATION PATTERNS
STREAM PROCESSING PATTERN
11
Compute Messaging
Storage
Data Ingestion, Data Processing
Results Storage, Data Storage
Data Serv...
LANDSCAPE
PLATFORMS
APPLICATION DOMAINS
PRACTICAL TAKEAWAYS
ANALYTICS
BUSINESS IMPACT
OUTLINE
12
FOCUS AREAS
13
Landscape of Platforms for Streaming Data
PLATFORM
14
COMPONENTS
PROCESSING
VELOCITY
STORAGE
UNBOUNDED
VARIETY
MESSAGING
VOLUME
15
APACHE KAFKA RABBITMQ
AKKA METAQ
DATABUS ACTIVEMQ
APACHE FLUME KESTREL
SCRIBE APACHE PULSAR
Messaging
NEXT GENERATION MESSAGING
16
DURABILITY CONSISTENCY
EASE OF OPERATIONS ISOLATION
RESILIENCY
SCALABILITY
KEY PROPERTIES
...
APACHE PULSAR
17
Flexible Messaging + Streaming System
backed by a durable log storage
APACHE PULSAR
18
Bookie Bookie Bookie
Broker Broker Broker
Producer Consumer
SERVING
Brokers can be added independently
Tr...
APACHE PULSAR - BROKER
19
✦ Broker is the only point of interaction for clients (producers and consumers)
✦ Brokers acquir...
APACHE PULSAR - BROKER
20
APACHE PULSAR - CONSISTENCY
21
Bookie
Bookie
Bookie Broker Producer
APACHE PULSAR - DURABILITY
22
Bookie
Bookie
Bookie Broker Producer
Journal
Journal
Journal
fsync
fsync
fsync
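The durability slide shows each bookie writing to a journal and calling fsync. A minimal sketch of that "append, fsync, then acknowledge" step using only Java NIO follows; the class name `JournalSketch`, the length-prefixed record layout, and the `demo` helper are assumptions made for illustration and are not BookKeeper's actual journal format or API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of "append, fsync, then acknowledge"; not BookKeeper code.
final class JournalSketch implements AutoCloseable {
    private final FileChannel channel;

    JournalSketch(Path file) throws IOException {
        channel = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND);
    }

    // Appends one length-prefixed entry; returns the journal size once the entry is on disk.
    long append(byte[] payload) throws IOException {
        ByteBuffer record = ByteBuffer.allocate(Integer.BYTES + payload.length);
        record.putInt(payload.length);
        record.put(payload);
        record.flip();
        while (record.hasRemaining()) {
            channel.write(record);
        }
        channel.force(true); // fsync: flush data and metadata before acknowledging
        return channel.size();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    // Writes one entry to a temporary journal and returns the resulting file size.
    static long demo() {
        try {
            Path file = Files.createTempFile("journal", ".log");
            try (JournalSketch journal = new JournalSketch(file)) {
                return journal.append("my-message".getBytes(StandardCharsets.UTF_8));
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("journal size after ack: " + demo() + " bytes");
    }
}
```

The point of the sketch is the ordering: the caller only sees a return value (the stand-in for an acknowledgement) after `force(true)` has completed.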
APACHE PULSAR - ISOLATION
23
APACHE PULSAR - SEGMENT STORAGE
24
234…20212223…40414243…60616263…
Segment 1
Segment 3
Segment 2
Segment 2
Segment 1
Segme...
APACHE PULSAR - RESILIENCY
25
1234…20212223…40414243…60616263…
Segment 1
Segment 3
Segment 2
Segment 2
Segment 1
Segment 3...
APACHE PULSAR - SEAMLESS CLUSTER EXPANSION
26
1234…20212223…40414243…60616263…
Segment 1
Segment 3
Segment 2
Segment 2
Seg...
APACHE PULSAR - TIERED STORAGE
27
Low Cost Storage
1234…20212223…40414243…60616263…
Segment 1
Segment 3
Segment 2
Segment ...
PARTITIONS VS SEGMENTS - WHY SHOULD YOU CARE?
28
PARTITIONS VS. SEGMENTS - WHY SHOULD YOU CARE?
29
✦ In Kafka, partitions are assigned to brokers “permanently”
✦ A single ...
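To contrast with partition-centric storage, the segment slides show a topic's log split into segments that are spread over the bookies. The sketch below is one simple way to picture that: each new segment gets its own set of bookies, so a newly added bookie can take new segments without moving old ones. `SegmentPlacementSketch`, the round-robin rule, and the replica count are assumptions for illustration; real BookKeeper ensemble selection is more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical round-robin segment placement; not BookKeeper's placement policy.
final class SegmentPlacementSketch {
    // Picks `replicas` bookies for a segment, rotating the starting bookie per segment.
    // Assumes replicas <= bookies.size() so the chosen bookies are distinct.
    static List<String> place(int segmentId, List<String> bookies, int replicas) {
        List<String> ensemble = new ArrayList<>();
        for (int i = 0; i < replicas; i++) {
            ensemble.add(bookies.get((segmentId + i) % bookies.size()));
        }
        return ensemble;
    }

    public static void main(String[] args) {
        List<String> bookies = new ArrayList<>(List.of("bookie-1", "bookie-2", "bookie-3"));
        for (int segment = 0; segment < 3; segment++) {
            System.out.println("segment " + segment + " -> " + place(segment, bookies, 2));
        }
        bookies.add("bookie-4"); // cluster expansion: already placed segments stay where they are
        System.out.println("segment 3 -> " + place(3, bookies, 2));
    }
}
```

Because placement is decided per segment rather than per partition, expansion only affects segments created afterwards, which is the property the slides contrast with rebalancing.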
UNIFIED MESSAGING MODEL - STREAMING
30
Pulsar topic/
partition
Producer 2
Producer 1
Consumer 1
Consumer 2
Subscription
A
...
UNIFIED MESSAGING MODEL - STREAMING
31
Pulsar topic/
partition
Producer 2
Producer 1
Consumer 1
Consumer 2
Subscription
B
...
UNIFIED MESSAGING MODEL - QUEUING
32
Pulsar topic/
partition
Producer 2
Producer 1
Consumer 2
Consumer 3
Subscription
C
M4...
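The three subscription slides (Exclusive, Failover, Shared) differ in which consumer receives each message. The following is a small in-memory simulation of that dispatch rule, not the Pulsar client API; `SubscriptionSketch`, `Mode`, and `dispatch` are hypothetical names, and round-robin is assumed as the way Shared traffic is spread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of message dispatch per subscription mode.
final class SubscriptionSketch {
    enum Mode { EXCLUSIVE, FAILOVER, SHARED }

    // Returns the messages each connected consumer receives.
    // `alive` lists the connected consumers in connection order (names must be unique).
    static Map<String, List<String>> dispatch(Mode mode, List<String> alive, List<String> messages) {
        Map<String, List<String>> delivered = new LinkedHashMap<>();
        for (String consumer : alive) {
            delivered.put(consumer, new ArrayList<>());
        }
        if (alive.isEmpty()) {
            return delivered;
        }
        for (int i = 0; i < messages.size(); i++) {
            // SHARED (queuing): spread messages round-robin over all consumers.
            // EXCLUSIVE / FAILOVER (streaming): only the first consumer is active.
            // In this sketch a second EXCLUSIVE consumer is simply never selected.
            String target = (mode == Mode.SHARED) ? alive.get(i % alive.size()) : alive.get(0);
            delivered.get(target).add(messages.get(i));
        }
        return delivered;
    }

    public static void main(String[] args) {
        List<String> messages = List.of("M0", "M1", "M2", "M3", "M4");
        System.out.println(dispatch(Mode.FAILOVER, List.of("consumer-1", "consumer-2"), messages));
        // After consumer-1 fails, the standby becomes the active consumer.
        System.out.println(dispatch(Mode.FAILOVER, List.of("consumer-2"), messages));
        System.out.println(dispatch(Mode.SHARED, List.of("consumer-1", "consumer-2", "consumer-3"), messages));
    }
}
```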
DISASTER RECOVERY
33
Topic (T1) Topic (T1)
Topic (T1)
Subscription (S1)
Subscription (S1)
Producer (P1)
Consumer (C1)
...
MULTITENANCY
34
Apache Pulsar Cluster
Product
Safety
ETL
Fraud
Detection
Topic-1
Account History
Topic-2
User Clustering
T...
PULSAR CLIENTS
35
Apache Pulsar Cluster
Java
Python
Go
C++ C
PULSAR PRODUCER
36
PulsarClient client = PulsarClient.create(
"http://broker.usw.example.com:8080");
Producer producer = c...
PULSAR CONSUMER
37
PulsarClient client = PulsarClient.create(
"http://broker.usw.example.com:8080");
Consumer consumer = c...
SCHEMA REGISTRY
38
✦ Provides type safety to applications built on top of Pulsar
✦ Two approaches
๏ Client side - type saf...
PULSAR SCHEMAS - HOW DO THEY WORK?
39
✦ Enforced at the topic level
✦ Pulsar schemas consists of
๏ Name - Name refers to t...
SCHEMA VERSIONING
40
PulsarClient client = PulsarClient.builder()
.serviceUrl("pulsar://broker.usw.example.com:6650")
.build...
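The schema versioning slide pairs the producer snippet with three scenarios: no schema exists yet, the same schema is already stored, or a new compatible schema arrives. A minimal in-memory sketch of that decision logic follows. `SchemaRegistrySketch`, `connectProducer`, and the pluggable compatibility predicate are hypothetical; rejecting an incompatible schema is also an assumption, since the slide only covers the compatible cases.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical per-topic schema store; not Pulsar's schema registry implementation.
final class SchemaRegistrySketch {
    private final Map<String, List<String>> versionsByTopic = new HashMap<>();
    private final BiPredicate<String, String> compatible; // (latest stored, proposed) -> compatible?

    SchemaRegistrySketch(BiPredicate<String, String> compatible) {
        this.compatible = compatible;
    }

    // Returns the 0-based schema version the producer ends up using.
    int connectProducer(String topic, String schema) {
        List<String> versions = versionsByTopic.computeIfAbsent(topic, t -> new ArrayList<>());
        if (versions.isEmpty()) {
            versions.add(schema);            // scenario 1: no schema exists for the topic
            return 0;
        }
        int existing = versions.indexOf(schema);
        if (existing >= 0) {
            return existing;                 // scenario 2: schema is already stored
        }
        String latest = versions.get(versions.size() - 1);
        if (!compatible.test(latest, schema)) {
            // Assumption: incompatible schemas are refused.
            throw new IllegalArgumentException("incompatible schema for topic " + topic);
        }
        versions.add(schema);                // scenario 3: stored as a new schema
        return versions.size() - 1;
    }
}
```

A toy compatibility rule such as `(stored, proposed) -> proposed.startsWith(stored)` is enough to exercise all three scenarios.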
HOW TO PROCESS DATA MODELED AS STREAMS
41
✦ Consume data as it is produced (pub/sub)
✦ Light weight compute - transform an...
LIGHT WEIGHT COMPUTE
42
f(x)
Incoming	Messages Output	Messages
ABSTRACT VIEW OF COMPUTE REPRESENTATION
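The f(x) picture above can be sketched with a plain `java.util.function.Function`, which is also the interface the SDK-less Pulsar Functions slide later implements. The class name `LightweightComputeSketch`, the `EXCLAIM` function, and the in-memory lists standing in for topics are illustrative assumptions.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in for "incoming messages -> f(x) -> output messages".
final class LightweightComputeSketch {
    // f(x): produces one output message per incoming message.
    static final Function<String, String> EXCLAIM = input -> input + "!";

    // Applies f to every incoming message, preserving order.
    static List<String> process(List<String> incoming, Function<String, String> f) {
        return incoming.stream().map(f).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(process(List.of("hello", "world"), EXCLAIM));
    }
}
```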
TRADITIONAL COMPUTE REPRESENTATION
43
DAG
Source 1
Source 2
Action
Action
Action
Sink 1
Sink 2
REALIZING COMPUTATION - EXPLICIT CODE
44
public static class SplitSentence extends BaseBasicBolt {
@Override
public void d...
REALIZING COMPUTATION - FUNCTIONAL
45
Builder.newBuilder()
.newSource(() -> StreamletUtils.randomFromList(SENTENCES))
.fla...
TRADITIONAL REAL TIME - SEPARATE SYSTEMS
46
Messaging Compute
TRADITIONAL REAL TIME SYSTEMS
47
DEVELOPER EXPERIENCE
✦ Powerful API but complicated
๏ Does everyone really need to learn ...
TRADITIONAL REAL TIME SYSTEMS
48
OPERATIONAL EXPERIENCE
✦ Multiple systems to operate
๏ IoT deployments routinely have tho...
LESSONS LEARNED - USE CASES
49
✦ Data transformations
✦ Data classification
✦ Data enrichment
✦ Data routing
✦ Data extract...
EMERGENCE OF CLOUD - SERVERLESS
50
✦ Simple function API
✦ Functions are submitted to the system
✦ Runs per events
✦ Compo...
SERVERLESS VS. STREAMING
51
✦ Both are event driven architectures
✦ Both can be used for analytics and data serving
✦ Both...
STREAM NATIVE COMPUTE USING FUNCTIONS
52
✦ Simplest possible API -function or a procedure
✦ Support for multi language
✦ U...
PULSAR FUNCTIONS
53
SDK LESS API
import java.util.function.Function;
public class ExclamationFunction implements Function<...
PULSAR FUNCTIONS
54
SDK API
import org.apache.pulsar.functions.api.PulsarFunction;
import org.apache.pulsar.functions.api....
PULSAR FUNCTIONS
55
✦ Simplest possible API -function or a procedure
✦ Support for multi language
✦ Use of native API for ...
PULSAR FUNCTIONS
56
✦ Function executed for every message of input topic
✦ Support for multiple topics as inputs
✦ Functio...
PROCESSING GUARANTEES
57
✦ ATMOST_ONCE
๏ Message acked to Pulsar as soon as we receive it
✦ ATLEAST_ONCE
๏ Message acked t...
DEPLOYING FUNCTIONS - BROKER
58
Broker 1
Worker
Function
wordcount-1
Function
transform-2
Broker 1
Worker
Function
transfo...
DEPLOYING FUNCTIONS - WORKER NODES
59
Worker
Function
wordcount-1
Function
transform-2
Worker
Function
transform-1
Functio...
DEPLOYING FUNCTIONS - KUBERNETES
60
Function
wordcount-1
Function
transform-1
Function
transform-3
Pod 1 Pod 2 Pod 3
Broke...
BUILT-IN STATE MANAGEMENT IN FUNCTIONS
61
✦ Functions can store state in inbuilt storage
๏ Framework provides a simple lib...
DISTRIBUTED STATE IN FUNCTIONS
62
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.P...
PULSAR - DATA IN AND OUT
63
✦ Users can write custom code using Pulsar producer and consumer API
✦ Challenges
๏ Where shou...
PULSAR IO TO THE RESCUE
64
Apache Pulsar Cluster Source Sink
INTERACTIVE QUERYING OF STREAMS - PULSAR SQL
65
1234…20212223…40414243…60616263…
Segment 1
Segment 3
Segment 2
Segment 2
S...
PULSAR IO - EXECUTION
66
Broker 1
Worker
Sink
Cassandra-1
Source
Kinesis-2
Broker 2
Worker
Source
Kinesis-1
Source
Twitter...
EASE OF OPERATIONS
67
✦ Brokers don’t have durable state
๏ Easily replaceable
๏ Topics are immediately reassigned to healt...
CONSUMER BACKLOG
68
✦ Metrics are available to make assessments
๏ When a problem started
๏ How big is backlog? messages? d...
ENFORCING MULTITENANCY
69
✦ Ensure tenants don’t cause performance issues on other tenants
๏ Backlog quotas
๏ Soft isolati...
PULSAR PERFORMANCE
70
PULSAR PERFORMANCE - LATENCY
71
APACHE PULSAR VS. APACHE KAFKA
72
Multi-tenancy
A single cluster can support many
tenants and use cases
Seamless Cluster ...
73
SPARK STREAMING APACHE APEX
APACHE FLINK SAMZA
APACHE BEAM
(MILLWHEEL, DATAFLOW)
ATHENAX
STYLUS APACHE STORM
STREAMALER...
PROCESSING
74
TASK ISOLATION LOW LATENCY
MULTIPLE APIS MULTIPLE SEMANTICS
DIVERSE WORKLOADS
FAULT TOLERANCE
KEY PROPERTIES
HERON TERMINOLOGY
75
Topology
Directed acyclic graph
vertices = computation, and
edges = streams of data tuples
Spouts
S...
HERON TOPOLOGY
76
Spout 1
Spout 2
Bolt 1
Bolt 2
Bolt 3
Bolt 4
Bolt 5
HERON TOPOLOGY - PHYSICAL EXECUTION
77
Spout 1
Spout 2
Bolt 1
Bolt 2
Bolt 3
Bolt 4
Bolt 5
HERON GROUPINGS
78
01
Shuffle Grouping
Random distribution of tuples
/
Fields Grouping
Group tuples by a field or
multiple ...
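The two groupings named on this slide decide which downstream bolt instance receives a tuple. A minimal sketch of both routing rules follows; `GroupingSketch` and its method names are hypothetical, and hashing the field value modulo the instance count is an assumed (though common) way to keep equal field values on the same instance, not Heron's actual code.

```java
import java.util.Random;

// Hypothetical routing helpers illustrating shuffle vs. fields grouping.
final class GroupingSketch {
    // Shuffle grouping: pick a downstream instance at random.
    static int shuffleGrouping(Random rng, int instances) {
        return rng.nextInt(instances);
    }

    // Fields grouping: tuples with the same field value always map to the same instance.
    static int fieldsGrouping(Object fieldValue, int instances) {
        return Math.floorMod(fieldValue.hashCode(), instances);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (String word : new String[] {"heron", "pulsar", "heron"}) {
            System.out.println(word + ": shuffle -> " + shuffleGrouping(rng, 4)
                    + ", fields -> " + fieldsGrouping(word, 4));
        }
    }
}
```

With fields grouping both occurrences of "heron" land on the same instance, which is what makes per-key counting possible.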
HERON TOPOLOGY - PHYSICAL EXECUTION
79
Spout 1
Spout 2
Bolt 1
Bolt 2
Bolt 3
Bolt 4
Bolt 5
Shuffle ...
HERON TOPOLOGY - PHYSICAL EXECUTION
80
Spout 1
Spout 2
Bolt 1
Bolt 2
Bolt 3
Bolt 4
Bolt 5
Shuffle ...
HERON ARCHITECTURE
81
Scheduler
Topology 1 Topology 2 Topology N
Topology
Submission
HERON ARCHITECTURE
82
Topology Master
ZK
Cluster
Stream
Manager
I1 I2 I3 I4
Stream
Manager
I1 I2 I3 I4
Logical Plan,
Physi...
TOPOLOGY MASTER
83
Monitoring of containers Gateway for metrics Assigns role
STREAM MANAGER
84
Routes tuples Implements backpressure Processing Semantics
STREAM MANAGER
85
% %
S1 B2 B3
%
B4
STREAM MANAGER
86
S1 B2
B3
Stream
Manager
Stream
Manager
Stream
Manager
Stream
Manager
S1 B2
B3 B4
S1 B2
B3
S1 B2
B3 B4
B4...
STREAM MANAGER BACK PRESSURE
87
Spout based back pressure, TCP back pressure, Stage by stage back pressure
HERON INSTANCE
88
Runs only one task (spout/bolt)
Exposes Heron API
Collects several metrics
API
G
HERON INSTANCE
89
Stream
Manager
Metrics
Manager
Gateway
Thread
Task Execution
Thread
data-in queue
data-out queue
metrics...
HERON DEPLOYMENT
90
Topology 1
Topology 2
Topology N
Heron
Tracker
Heron
API Server
Heron
Web
ZK
Cluster
Aurora Services
O...
HERON VISUALIZATION
91
HERON TOPOLOGY COMPLEXITY
92
HERON TOPOLOGY SCALE
93
CONTAINERS - 1 TO 600 INSTANCES - 10 TO 6000
HERON HAPPY FACTS :)
94
✦ No more pages during midnight for Heron team
๏ Backlog quotas
๏ Soft isolation - flow control and...
95
Coffee Break
HERON DEVELOPER ISSUES
96
01 02
Container resource allocation
Parallelism tuning
HERON OPERATIONAL ISSUES
97
01 02 03
Slow Hosts Network Issues Data Skew
/ .
-
04
Load Variations
,
05
SLA Violations
/
SLOW HOSTS
98
Memory Parity Errors Impending Disk Failures Lower GHZ
NETWORK ISSUES
99
Network Partitioning
G
Network Slowness
NETWORK SLOWNESS
100
01 02 03
Delays processing
Data is accumulating
Timeliness of results
is affected
DATA SKEW
101
Multiple Keys
Several keys map into a single
instance and their count is
high
Single Key
Single key maps ...
LOAD VARIATIONS
102
Multiple Keys
Several keys map into a single
instance and their count is
high
Single Key
Single key ...
SELF REGULATING STREAMING SYSTEMS
103
Automate Tuning
SLO
Maintenance
Self Regulating
Streaming Systems
Tuning
Manual, ...
SELF REGULATING STREAMING SYSTEMS
104
Self tuning Self stabilizing Self healing
G !g
Several tuning knobs
Time consuming t...
ENTER DHALION
105
Dhalion periodically executes well-
specified policies that optimize
execution based on some objective.
...
DHALION POLICY FRAMEWORK
106
Symptom
Detector 1
Symptom
Detector 2
Symptom
Detector 3
Symptom
Detector N
....
Diagnoser 1
...
DYNAMIC RESOURCE PROVISIONING
107
Policy
This policy reacts to unexpected
load variations (workload spikes)
Goal
Goal is t...
DYNAMIC RESOURCE PROVISIONING
108
Pending Tuples
Detector
Backpressure
Detector
Processing Rate
Skew Detector
Resource Ove...
PROCESSING HERON
PULSAR
BOOKKEEPER
UNIFICATION
109
MESSAGING
STORAGE
STREAMLIO
STREAMLIO
110
UNIFIED ARCHITECTURE
Interactive
Querying
Storm API Streamlet SQL
Application
Builder
Pulsar
API
BK/
HDFS
AP...
STREAMLIO ARCHITECTURE
111
KEY HIGHLIGHTS
DAG PROCESSING
GEO-REPLICATION
DURABILITY
CONSISTENCY FUNCTION PROCESSING
INFINI...
112
ETL
DATA PROCESSING
113
CLEANSING AND TRANSFORMATION
Data format consistency
M:0:Male, F:1:Female
Missing Data
Imputation
Pre-...
DATA ENRICHMENT
114
MULTIPLE DATA SOURCES
Operators
Associate
Aggregate
Filter
Properties
Scalable
Fault-tolerant
Secure
C...
ETL DATA PIPELINES - RETAIL
115
Scenario
Consumer retail company migrating multiple data
and analytics applications to pub...
ETL DATA PIPELINES - RETAIL
116
Cloud data warehouse
Batch reporting
and analytics
Real-time dashboards and
alerts
Data so...
117
Enterprise Event Bus
DATA SILOS
118
INTEGRATION
Disparate Data Sources
Multiple Data Types
Different Frequency
Data Fidelity
DATA FUSION
119
EXTRACTING INSIGHTS
Why?
Complementarity → Completeness
Redundancy → Accuracy
Cooperative → Dependability
...
120
STREAMING JOINS
CHALLENGES
Bounded sliding window of tuples
Traditional join semantics
Hard to exploit many-core paral...
ENTERPRISE EVENT BUS
121
Apache Pulsar Cluster
Application 2
Application 5
Application 1 Application 3
Application 4 Appli...
ENTERPRISE EVENT BUS - YAHOO!
122
Scenario
Need to collect and distribute user and data
events to distributed global appli...
123
UNBOUNDED
UNORDERED
EVENT TIME
PROCESSING TIME
LOW LATENCY
LIMITED MEMORY
SINGLE PASS , INCREMENTAL
APPROXIMATE
APPROXIMATION
124
Guaranteed bounds on
approximation factors
Error greater than 𝜖 with
probability 𝛿
BOUNDS
PROBABILISTICD...
APPROXIMATION
125
CHEBYSHEV
HOEFFDING
MARKOV
PROBABILISTIC BOUNDS
X: Random variable, μ: expectation, 𝜖>0
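For reference, the standard statements of the three bounds named on this slide are given below, using the slide's symbols X, μ and 𝜖. The extra symbols (σ² for the variance, n for the number of samples, [a, b] for the range of each sample, and X̄ for the sample mean) are not on the slide and are introduced here only to state the inequalities.

```latex
% Markov (requires X \ge 0):
\Pr[X \ge \epsilon] \le \frac{\mu}{\epsilon}

% Chebyshev (\sigma^2 = \mathrm{Var}(X)):
\Pr\big[\,|X - \mu| \ge \epsilon\,\big] \le \frac{\sigma^2}{\epsilon^2}

% Hoeffding (\bar{X} is the mean of n independent X_i \in [a, b], \mu = \mathbb{E}[\bar{X}]):
\Pr\big[\,|\bar{X} - \mu| \ge \epsilon\,\big] \le 2\exp\!\left(-\frac{2 n \epsilon^2}{(b - a)^2}\right)
```

Markov needs only the mean, Chebyshev adds the variance, and Hoeffding exploits boundedness and independence to get an exponentially decaying tail.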
126
BIAS VS. VARIANCE
PROPERTIES
BIAS VARIANCE
How much the average of the
estimate differs from the true mean
Variance of ...
127
BIAS VS. VARIANCE
TRADE-OFF
* Illustration borrowed from https://web.stanford.edu/~hastie/ElemStatLearn/
*
DATA SKETCHES
128
129
SAMPLING
FILTERING
CARDINALITY
QUANTILE
FREQUENT
MOMENTS
SPECTRUM OF
DATA SKETCHES
SAMPLING
130
Volume
๏ Unbounded
Velocity
Types
๏ Uniform
๏ Non-uniform
❖ Class imbalance
Latency
SAMPLING
131
✦ Maintain dynamic sample
๏ A data stream is a continuous process
๏ Not known in advance how many points may ...
SAMPLING
132
✦ Sliding window approach*
(sample size k, window width n)
๏ Sequence-based
❖ Replace expired element with ne...
SAMPLING
133
COMPRESSED SENSING
Distributed
๏ Communication cost
๏ Non-uniform sampling rates
High dimensionality
๏ Sparsi...
134
COMPRESSED SENSING
EARLY WORK
[Candès et al. 2006]
[Donoho 2006]
[Cormode and Muthukrishnan 2006]
[Freris et al. 2013]...
135
136
Document 2015
EVENTS
E-COMMERCE OPPORTUNITY
MODEL
137
INCREMENTAL
HANDLE NON-STATIONARITY, CONCEPT DRIFT
EVOLVE CONTINUOUSLY
PERSONALIZATION
138
THE HOLY GRAIL
Continuously evolving data streams
Time decay
Multi-armed bandit
Contextual
Deep learni...
INVENTORY MANAGEMENT
139
FORECASTING
Mining retail transactions
Seasonality
Virality
New product launches
140
Real-Time Trending
141
HERBERT SIMON
“A wealth of information
creates a poverty of
attention.”
142
CONTINUOUS (PREFERENCE) TOP-K ELEMENTS
* Mouratidis et al., “Continuous monitoring of top-k queries over sliding windows...
143
TWITTER
TRENDING
NEWS
TRENDING
144
Spotify, Apple Music, Pandora
AUDIO
145
YouTube, Vimeo, DailyMotion
TRENDING
VIDEO
Facebook Live, Periscope, Twitch
146
TRENDING
LIVE VIDEO
147
FAKE
ADDRESSING INTEGRITY
148
BIAS
VARIOUS SOURCES
Data bias
๏ Public image datasets
❖ Geographic representation
๏ Induces algorithmic bias[1]
❖ Onl...
149
FACT CHECKING
CROWDSOURCING
REAL TIME TRENDING IN PULSAR & HERON
150
Streamlio (Apache Pulsar and Apache Heron)
Data
Source 2
clean-fn 2
Data
Source 1...
LATENCY
151
USER EXPERIENCE
THE TAIL[1]
152
CLASSIFICATION[2]
✦ Whether elements can only be added or can be both
inserted and deleted?
๏ Cash registe...
QUANTILE ESTIMATION
153
[Arasu and Manku 2005]
SLIDING WINDOWS
[Cormode et al. 2005]
[Yi et al. 2009]
CONTINUOUS
MONITORIN...
QUANTILE ESTIMATION
154
PICK
[Blum et al. 1973]
EXACT MEDIAN
[Munro and Patterson 1980]
Q-DIGEST
[Shrivastava et al. 2002]...
Max signal
value
# Elements
Compression
Factor
Complete binary tree
Q-DIGEST[1]
155
✦ Groups values in variable size bucke...
Q-DIGEST (CONTD.)
156
✦ Building a q-digest
✦ q-digests can be constructed in a
distributed fashion
๏ Merge q-digests
T-DIGEST[1]
157
[1] T. Dunning and O. Ertl, “Computing Extremely Accurate Quantiles using t-digests”, 2017. https://github.co...
T-DIGEST (CONTD.)
158
✦ Group samples into sub-sequences
๏ Smaller sub-sequences near the ends
๏ Larger sub-sequences in t...
T-DIGEST (CONTD.)
159
✦ Estimating quantile via interpolation
๏ Sub-sequences contain centroid of the samples
๏ Estimate t...
160
RELATED WORKS
SUMMARY DATA
STRUCTURE
[Greenwald and Khanna 2001]
[Luo et al. 2016]
RANDOM
BQ-SUMMARY
[Zhang et al. 200...
161
PERFORMANCE
162
MONITORING
CPM (Cost per 1000 Impressions), vCPM, CPCV
CPC (Cost per Click)
CPI (Cost per Install)
CPE (Co...
PERFORMANCE
163
MONITORING
AD
FORMATS
POP UNDER
INTERSTITIAL
REDIRECT BANNERS
REWARDED
VIDEO
POP UP
SOCIAL
VIDEO
FRAUD
164
DETECTION
IMPRESSION
CLICK
INSTALL
ENGAGEMENT
FRAUD DETECTION USING APACHE PULSAR
165
Apache Pulsar
Data
Source 2
clean-fn 2
Data
Source 1
Data
Source 3
clean-fn 1 frau...
IP/DEVICE ID BLACKLISTING
166
CUCKOO FILTER
[Fan et al. CoNext 2014]
QUOTIENT FILTER
[Bender et al. Hot Storage 2011]
MORT...
BLOOM FILTER
167
[1] Illustration borrowed from http://www.eecs.harvard.edu/~michaelm/postscripts/im2005b.pdf
[1]
BLOOM FILTER
168
✦		Natural generalization of hashing
✦ False positives are possible
✦ No false negatives
No deletions all...
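The properties listed on this slide (false positives possible, no false negatives) can be seen in a very small Bloom filter. This is a sketch under stated assumptions: `BloomFilterSketch` is a hypothetical class, and deriving the k probe positions from `String.hashCode()` by double hashing is a simplification chosen for brevity, not a recommendation for production hashing.

```java
import java.util.BitSet;

// Hypothetical minimal Bloom filter over strings (e.g. blacklisted IPs or device IDs).
final class BloomFilterSketch {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    BloomFilterSketch(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // i-th probe position, derived from two base hashes (double hashing; an assumption).
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = (h1 >>> 16) | 1; // forced odd so successive probes differ
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String item) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(item, i));
        }
    }

    // false means "definitely not added"; true means "possibly added".
    boolean mightContain(String item) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BloomFilterSketch blacklist = new BloomFilterSketch(1 << 16, 3);
        blacklist.add("10.0.0.1");
        System.out.println(blacklist.mightContain("10.0.0.1"));
    }
}
```

Bits are only ever set, never cleared, which is why an added item can never be reported as absent and why this basic variant cannot support deletion.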
VARIATIONS
169
BLOOM FILTER
[Mitzenmacher 2002]
COMPRESSED
[Fan et al. 2000]
COUNTING
[Cohen and Matias 2003]
SPECTRAL
[Ch...
QUOTIENT FILTER
170
✦		Supports insertion, deletion
✦ Merging/resizing feasible
✦ Lookups incur a single cache miss
✦ 20% ...
VARIATIONS
171
QUOTIENT FILTER
[Bender et al. 2011]
CASCADE
[Pandey et al. 2017]
COUNTING RANK-AND-SELECT
✦		Multiset of i...
CUCKOO FILTER
172
✦ Key Highlights
๏ Add and remove items dynamically
๏ For false positive rate ε < 3%, more space efficient...
Cuckoo Hashing [1]
CUCKOO FILTER
173
[1] R. Pagh and F. Rodler. Cuckoo hashing. Journal of Algorithms, 51(2):122-144, 2004...
CUCKOO FILTER
174
✦ Deletion
๏ Item must have been previously
inserted
✦ Partial-key cuckoo hashing
✦ Fingerprint hashing...
CONCURRENT
CUCKOO HASHING
VARIATIONS
175
CUCKOO HASHING AND CUCKOO FILTER
[Li et al. 2014]
[Eppstein et al. 2017]
2-3 CUCK...
MORTON FILTER
176
✦ Key Highlights
๏ Insertions heavily biased towards the hash function H1
❖ Fingerprint retrieval requir...
MORTON FILTER
177
✦ Hashing
๏ Reduces TLB misses, DRAM row buffer misses and page faults
๏ Does not require total # buckets...
SPEND
178
DYNAMIC (RE)-ALLOCATION
SANs
Facebook, Google, Twitter, Yahoo, Snapchat
Ad Networks
Affiliates
Offer wall
TV
179
EXCHANGES
180
PROGRAMMATIC
DoubleClick
MoPub
OpenX
AppNexus
USER PROFILE
181
MULTI-DIMENSIONAL TARGETING
Platform
Device Type
OS Version
Apps Installed
Age, Gender
STATISTICAL ARBITRAGE
182
PREDICTION
Click Through Rate (CTR)
Click to Install (CTI)
Life Time Value (LTV)
Churn probabili...
ANOMALY DETECTION
183
Temporal
Distribution based
A
N
O
M
A
L
Y
DIFFERENT FLAVORS
safaribooksonline.com/library/view/under...
ANOMALY DETECTION
184
ROOTED IN STUDIES IN PROBABILITY
1777
1713
Reconciling discrepant observations
Euler 1749
Irregulari...
ANOMALY DETECTION
185
LONG HISTORY
First formalization of an exact rule for rejection of anomalies
ANOMALY DETECTION
186
CHARACTERISTICS
MAGNITUDE
Severity
WIDTH
Actionability
FREQUENCY
Reliability
DIRECTION
Positive/Nega...
ANOMALY DETECTION
187
WHY IT’S NON-TRIVIAL
SEASONALITY TREND BREAKOUT STATIONARITY NOISE
ANOMALY DETECTION
188
ASSUMPTIONS
Normality, Stationarity
MOVING AVERAGES
Params: Width, Decay
RULE BASED
µ ± σ
COMMON APP...
ANOMALY DETECTION
189
MAD[1]
Median Absolute
Deviation
MEDIAN
MCD[2]
Minimum Covariance
Determinant
MVEE[3,4]
Minimum Volu...
ANOMALY DETECTION
190
CONTEXT
✦ Data types
๏ Text, Audio, Video
✦ Data veracity
๏ Wearables
๏ Smart cities, Connected
Home...
191
ANOMALY DETECTION
DIFFERENT APPLICATION DOMAINS
RADIO ANOMALY
DETECTION
MARKER
DISCOVERY
CROWDED SCENE
ANALYSIS
ANOMAL...
192
Real-Time Security
THREAT DETECTION
193
POTENTIAL HIGH BUSINESS DOWNSIDE
DDoS Attack
Network Intrusion
Harassment
Account hacking
Identity t...
MONITORING
194
AUDIT LOGS
Answering the “What, When, Who, Why”
Downloading of unusual amount of data
Troubleshoot systems,...
REAL TIME SECURITY USING APACHE PULSAR
195
Apache Pulsar
Data
Source 2
threat-fn 2
Data
Source 1
Data
Source 3
threat-fn 1...
DEVICE IDS/IP ADDRESSES
196
FREQUENT/TOP-K ELEMENTS, HEAVY HITTERS
✦ Require large amount of memory
✦ Long tail - low busi...
FLAVORS
197
FREQUENT/TOP-K ELEMENTS, HEAVY HITTERS
l 1 heavy hitter
๏ m can be known/unknown
๏ Deterministic
❖ [Misra and ...
FLAVORS
198
FREQUENT/TOP-K ELEMENTS, HEAVY HITTERS
l 2 heavy hitter
๏ Example
❖ [Charikar et al. 2004]
๏ Applications
❖ Co...
COUNT-MIN SKETCH[1]
199
✦ A two-dimensional array counts with w columns and d rows
✦ Each entry of the array is initially ...
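The structure described on this slide, a counts array with d rows and w columns, can be sketched as follows. `CountMinSketchExample` is a hypothetical class, and the per-row hash (a bit-mixing of `String.hashCode()` with the row index) is an assumption made to keep the example self-contained; the original analysis relies on pairwise-independent hash functions.

```java
// Hypothetical minimal Count-Min sketch for string keys (e.g. IP addresses).
final class CountMinSketchExample {
    private final long[][] counts; // depth rows x width columns, all zero initially
    private final int width;
    private final int depth;

    CountMinSketchExample(int width, int depth) {
        this.width = width;
        this.depth = depth;
        this.counts = new long[depth][width];
    }

    // Column for `item` in the given row; the mixing constants are an illustrative choice.
    private int column(String item, int row) {
        int h = item.hashCode() + row * 0x9E3779B9;
        h ^= (h >>> 16);
        h *= 0x85EBCA6B;
        h ^= (h >>> 13);
        return Math.floorMod(h, width);
    }

    // Update: increment one counter per row.
    void add(String item, long count) {
        for (int row = 0; row < depth; row++) {
            counts[row][column(item, row)] += count;
        }
    }

    // Point query: the minimum over rows; never below the true count.
    long estimate(String item) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, counts[row][column(item, row)]);
        }
        return min;
    }

    public static void main(String[] args) {
        CountMinSketchExample sketch = new CountMinSketchExample(1024, 4);
        sketch.add("10.0.0.1", 3);
        sketch.add("10.0.0.2", 5);
        System.out.println(sketch.estimate("10.0.0.1"));
    }
}
```

Collisions can only inflate a counter, so taking the minimum across rows gives an estimate that overcounts at worst.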
COUNT-MIN SKETCH
200
✦ Count-Min sketch with conservative update (CU sketch)
๏ Update an item with frequency c
๏ Avoid unn...
COUNTER TREE[1]
201
✦ Motivation
๏ Equi-bitwidth counters → inefficient space usage owing to variance in counts
✦ Two-dimension...
COUNTER TREE (CONTD.)
202
✦ Counting range
✦ Virtual counter array Vf
๏ Vf[i] = V[hi(f)], 0 ≤ i < r,
✦ Recording
๏ Increme...
COUNTER TREE (CONTD.)
203
✦ Estimation
๏ CT based Estimation (CTE)
❖ k = d^(h-1)
❖ Xi is the value of the subtree rooted at C...
VARIATIONS
204
FREQUENT/TOP-K ELEMENTS, HEAVY HITTERS
LOSSY COUNTING
[Manku and Motwani, 2002]
[Homem and Carvalho, 2010]
...
205
HEALTH CARE
206
WEARABLES
Smart watches, smart glasses, smart clothing,
implantables
Data veracity
๏ Heart rate (ECG and HR...
MONITORING
207
OIL DRILLING
Early detection of leaks and/or spills
๏ Oil
๏ Flammable gases
๏ Hazardous chemicals
Equipment...
QUALITY OF SERVICE
208
CUSTOMER FOCUS
WiFi, Video streaming, E-commerce
Metrics
๏ Availability
๏ Throughput
๏ Response tim...
INDUSTRIAL IOT USING APACHE PULSAR
209
Apache Pulsar Cloud
Data
Source 2
clean-fn 2
Data
Source 1
Data
Source 3 clean-fn 1
...
NUMBER OF DISTINCT DATA SOURCES
210
CARDINALITY ESTIMATION
✦ Hash values as strings
✦ Occurrence of particular patterns in...
CARDINALITY ESTIMATION
211
ANOTHER CLASSIFICATION
SKETCH-BASED
Scan the entire data set once
Hash the items
Create sketch
...
HYPERLOGLOG
212
where
✦ Apply hash function h to every element in a multiset
✦ Cardinality of multiset is 2^max(ϱ) where 0ϱ...
HYPERLOGLOG
213
OPTIMIZATIONS
✦ Use of 64-bit hash function
๏ Total memory requirement 5 * 2^p -> 6 * 2^p, where p is the pr...
MIN-COUNT
VARIATIONS
214
CARDINALITY ESTIMATION
[Giroire, 2009]
Optimal statistical efficiency
[Ting, 2014]
DISCRETE MAX-COU...
WRAPPING UP
215
PLATFORM UNIFICATION APPLICATIONS ANALYTICS
QUESTIONS
216
STAY IN TOUCH
@arun_kejariwal
@karthikz
TWITTER
EMAIL
arun_kejariwal@acm.org
karthik@streaml.io
217
218
@arun_kejariwal @karthikz
READINGS
219
“Data Streams: Algorithms and Applications”,
by S. Muthukrishnan
“Sketching as a Tool for Numerical Linear Alg...
READINGS
220
“The algebra of stream processing functions”, TCS 2001.
STREAM CALCULUS
“A coinductive calculus of streams”, M...
READINGS
221
“Streaming Algorithms for Robust Distinct
Elements”, SIGMOD 2016.
“Pyramid Sketch”, VLDB 2017. “Bias-Aware Sk...
222
READINGS
“Streaming Anomaly Detection Using Randomized
Matrix Sketching”, VLDB 2015.
“Graph Sketches: Sparsification, S...
OPEN SOURCE
223
Data Sketches (https://datasketches.github.io/)
Algebird (https://github.com/twitter/algebird)
StreamDM (h...
RESOURCES
224
https://www.slideshare.net/arunkejariwal/correlation-analysis-on-live-data-streams (O’Reilly Strata)
https:/...
RESOURCES
225
http://www.cs.toronto.edu/~koudas/courses/csc2508/SQL-on-Hadoop-Final.pdf (VLDB 2015)
https://www.cse.ust.hk...
RESOURCES
226
https://people.cs.umass.edu/~mcgregor/slides/10-jhu1.pdf
http://dmac.rutgers.edu/Workshops/WGUnifyingTheory/...
RESOURCES
227
https://www.cc.gatech.edu/~jx/reprints/talks/sigm07_tutorial.pdf (SIGMETRICS 2007)
hanj.cs.illinois.edu/bk3/...
Designing Modern Streaming Data Applications


Many industry segments have been grappling with fast data (high-volume, high-velocity data). The enterprises in these industry segments need to process this fast data just in time to derive insights and act upon it quickly. Such tasks include but are not limited to enriching data with additional information, filtering and reducing noisy data, enhancing machine learning models, providing continuous insights on business operations, and sharing these insights just in time with customers. In order to realize these results, an enterprise needs to build an end-to-end data processing system, from data acquisition, data ingestion, data processing, and model building to serving and sharing the results. This presents a significant challenge, due to the presence of multiple messaging frameworks and several streaming computing frameworks and storage frameworks for real-time data.

In this tutorial, we lead a journey through the landscape of state-of-the-art systems for each stage of an end-to-end data processing pipeline: messaging frameworks, streaming computing frameworks, storage frameworks for real-time data, and more. We also share case studies from the IoT, gaming, and healthcare, as well as our experience operating these systems at internet scale at Twitter and Yahoo. We conclude by offering our perspectives on how advances in hardware technology and the emergence of new applications will impact the evolution of messaging systems, streaming systems, storage systems for streaming data, and reinforcement learning-based systems that will power fast processing and analysis of a large (potentially of the order of hundreds of millions) set of data streams.

Topics include:

* An introduction to streaming
* Common data processing patterns
* Different types of end-to-end stream processing architectures
* How to seamlessly move data across different frameworks
* Case studies: Healthcare and the IoT
* Data sketches for mining insights from data streams

Published in: Technology


  1. DESIGNING MODERN STREAMING DATA APPLICATIONS. ARUN KEJARIWAL, KARTHIK RAMASAMY (@arun_kejariwal, @karthikz)
  2. ABOUT US
  3.
  4. Connected World
  5. Ubiquity of Real-Time Data Streams
  6. Data-Driven Decision Making
  7. UNLOCKING INSIGHTS, GUIDE DECISION MAKING: DETERMINING TOP TRAFFIC INTENSIVE IP ADDRESSES; IDENTIFYING ARBITRAGE OPPORTUNITIES IN FINANCIAL TRADING; DETERMINING SESSIONS WHOSE DURATION IS >2X NORMAL; FORECASTING DETERIORATION IN HEALTH METRICS; # UNIQUE USERS/DAY OF A MOBILE APP; TARGETED ADVERTISING
  8. ON-THE-FLY INSIGHTS: “PERISHABLE”, RECENCY IS CRITICAL
  9. STREAMING/FAST DATA PROCESSING
     ✦ Events are analyzed and processed as they arrive
     ✦ Decisions are timely, contextual and based on fresh data
     ✦ Decision latency is eliminated
     ✦ Data in motion
     Ingest/Buffer, Analyze, Act
  10. STREAMING APPLICATION PATTERNS: MICROSERVICES, MODEL INFERENCE, WORKFLOWS, ANALYTICS, MONITORING
  11. STREAM PROCESSING PATTERN: Compute, Messaging, Storage; Data Ingestion, Data Processing, Results Storage, Data Storage, Data Serving
  12. OUTLINE, FOCUS AREAS: LANDSCAPE, PLATFORMS, APPLICATION DOMAINS, PRACTICAL TAKEAWAYS, ANALYTICS, BUSINESS IMPACT
  13. Landscape of Platforms for Streaming Data
  14. PLATFORM COMPONENTS: PROCESSING, STORAGE, MESSAGING; VELOCITY, UNBOUNDED, VARIETY, VOLUME
  15. Messaging: APACHE KAFKA, RABBITMQ, AKKA, METAQ, DATABUS, ACTIVEMQ, APACHE FLUME, KESTREL, SCRIBE, APACHE PULSAR
  16. NEXT GENERATION MESSAGING, KEY PROPERTIES: DURABILITY, CONSISTENCY, EASE OF OPERATIONS, ISOLATION, RESILIENCY, SCALABILITY, INFINITE RETENTION
  17. APACHE PULSAR: Flexible Messaging + Streaming System backed by a durable log storage
  18. APACHE PULSAR [diagram: Producer, Consumer, three Brokers, three Bookies]
      SERVING: Brokers can be added independently. Traffic can be shifted quickly across brokers.
      STORAGE: Bookies can be added independently. New bookies will ramp up traffic quickly.
  19. APACHE PULSAR - BROKER
      ✦ Broker is the only point of interaction for clients (producers and consumers)
      ✦ Brokers acquire ownership of groups of topics and “serve” them
      ✦ Broker has no durable state
      ✦ Provides a service discovery mechanism for clients to connect to the right broker
  20. APACHE PULSAR - BROKER
  21. APACHE PULSAR - CONSISTENCY [diagram: Producer, Broker, three Bookies]
  22. APACHE PULSAR - DURABILITY [diagram: Producer, Broker, three Bookies, each with a Journal and fsync]
  23. APACHE PULSAR - ISOLATION
  24. APACHE PULSAR - SEGMENT STORAGE [diagram: entry log split into Segments 1 to 4, each segment shown in three copies]
  25. APACHE PULSAR - RESILIENCY [diagram: entry log split into Segments 1 to 4, each segment shown in three copies]
  26. APACHE PULSAR - SEAMLESS CLUSTER EXPANSION [diagram: Segments 1 to 4 as before, plus new Segments X, Y, Z]
  27. APACHE PULSAR - TIERED STORAGE [diagram: Segments 1 to 4 as before, plus Low Cost Storage]
  28. PARTITIONS VS. SEGMENTS - WHY SHOULD YOU CARE?
  29. PARTITIONS VS. SEGMENTS - WHY SHOULD YOU CARE?
      ✦ In Kafka, partitions are assigned to brokers “permanently”
      ✦ A single partition is stored entirely in a single node
      ✦ Retention is limited by the storage capacity of a single node
      ✦ Failure recovery and capacity expansion require expensive “rebalancing”
      ✦ Rebalancing has a big impact on the system, affecting regular traffic
  30. UNIFIED MESSAGING MODEL - STREAMING [diagram: Producer 1 and Producer 2 publish M0 to M4 to a Pulsar topic/partition; Subscription A, Exclusive; Consumer 1, Consumer 2 (marked X)]
  31. UNIFIED MESSAGING MODEL - STREAMING [diagram: Producer 1 and Producer 2 publish M0 to M4 to a Pulsar topic/partition; Subscription B, Failover; Consumer 1, Consumer 2] In case of failure in consumer 1
  32. UNIFIED MESSAGING MODEL - QUEUING [diagram: Producer 1 and Producer 2 publish M0 to M4 to a Pulsar topic/partition; Subscription C, Shared; Consumer 1, Consumer 2, Consumer 3] Traffic is equally distributed across consumers
  33. DISASTER RECOVERY [diagram: Topic (T1) in Data Center A, Data Center B and Data Center C; Subscription (S1); Producers (P1), (P2), (P3); Consumers (C1), (C2)] Integrated in the broker message flow; simple configuration to add/remove regions; asynchronous (default) and synchronous replication
  34. MULTITENANCY [diagram: Apache Pulsar Cluster; Product Safety: ETL, Fraud Detection, Topic-1 Account History, Topic-2 User Clustering, Topic-1 Risk Classification; Marketing: Campaigns, ETL, Topic-1 Budgeted Spend, Topic-2 Demographic Classification, Topic-1 Location Resolution; Data Serving: Microservice, Topic-1 Customer Authentication; sizes 10 TB, 7 TB, 5 TB, 3 TB, 2 TB, 8 TB, 4 TB, 3 TB]
      ✦ Authentication
      ✦ Authorization
      ✦ Software isolation
        ๏ Storage quotas, flow control, back pressure, rate limiting
      ✦ Hardware isolation
        ๏ Constrain some tenants on a subset of brokers/bookies
  35. PULSAR CLIENTS [diagram: Apache Pulsar Cluster]: Java, Python, Go, C++, C
  36. PULSAR PRODUCER
      PulsarClient client = PulsarClient.create(
          "http://broker.usw.example.com:8080");
      Producer producer = client.createProducer(
          "persistent://my-property/us-west/my-namespace/my-topic");
      // handles retries in case of failure
      producer.send("my-message".getBytes());
      // Async version:
      producer.sendAsync("my-message".getBytes()).thenRun(() -> {
          // Message was persisted
      });
  37. PULSAR CONSUMER
      PulsarClient client = PulsarClient.create(
          "http://broker.usw.example.com:8080");
      Consumer consumer = client.subscribe(
          "persistent://my-property/us-west/my-namespace/my-topic",
          "my-subscription-name");
      while (true) {
          // Wait for a message
          Message msg = consumer.receive();
          // Decode the payload; printing the byte[] directly would print an array reference
          System.out.println("Received message: " + new String(msg.getData()));
          // Acknowledge the message so that it can be deleted by broker
          consumer.acknowledge(msg);
      }
  38. 38. SCHEMA REGISTRY 38 ✦ Provides type safety to applications built on top of Pulsar ✦ Two approaches ๏ Client side - type safety enforcement up to the application ๏ Server side - system enforces type safety and ensures that producers and consumers remain synced ✦ Schema registry enables clients to upload data schemas on a topic basis. ✦ Schemas dictate which data types are recognized as valid for that topic
  39. 39. PULSAR SCHEMAS - HOW DO THEY WORK? 39 ✦ Enforced at the topic level ✦ Pulsar schemas consist of ๏ Name - the topic to which the schema is applied ๏ Payload - Binary representation of the schema ๏ Schema type - JSON, Protobuf or Avro ๏ User defined properties - Map of strings to strings (application specific, e.g., git hash of the schema)
  40. 40. SCHEMA VERSIONING 40
      PulsarClient client = PulsarClient.builder()
          .serviceUrl("pulsar://broker.usw.example.com:6650")
          .build();
      Producer<SensorReading> producer = client.newProducer(JSONSchema.of(SensorReading.class))
          .topic("sensor-data")
          .sendTimeout(3, TimeUnit.SECONDS)
          .create();

      Scenario | What happens
      No schema exists for the topic | Producer is created using the given schema
      Schema already exists; producer connects using the same schema that's already stored | Schema is transmitted to the broker, which determines that it is already stored
      Schema already exists; producer connects using a new schema that is compatible | Schema is transmitted, compatibility is determined, and it is stored as a new schema
  41. 41. HOW TO PROCESS DATA MODELED AS STREAMS 41 ✦ Consume data as it is produced (pub/sub) ✦ Light weight compute - transform and react to data as it arrives ✦ Heavy weight compute - continuous data processing ✦ Interactive query of stored streams
  42. 42. LIGHT WEIGHT COMPUTE 42 f(x) Incoming Messages Output Messages ABSTRACT VIEW OF COMPUTE REPRESENTATION
  43. 43. TRADITIONAL COMPUTE REPRESENTATION 43 DAG % % % % % Source 1 Source 2 Action Action Action Sink 1 Sink 2
  44. 44. REALIZING COMPUTATION - EXPLICIT CODE 44
      public static class SplitSentence extends BaseBasicBolt {
          @Override
          public void declareOutputFields(OutputFieldsDeclarer declarer) {
              declarer.declare(new Fields("word"));
          }

          @Override
          public Map<String, Object> getComponentConfiguration() {
              return null;
          }

          @Override
          public void execute(Tuple tuple, BasicOutputCollector basicOutputCollector) {
              String sentence = tuple.getStringByField("sentence");
              String[] words = sentence.split(" ");
              for (String w : words) {
                  basicOutputCollector.emit(new Values(w));
              }
          }
      }
      STITCHED BY PROGRAMMERS
  45. 45. REALIZING COMPUTATION - FUNCTIONAL 45
      Builder.newBuilder()
          .newSource(() -> StreamletUtils.randomFromList(SENTENCES))
          // split on runs of whitespace (the regex needs the escaped backslash)
          .flatMap(sentence -> Arrays.asList(sentence.toLowerCase().split("\\s+")))
          .reduceByKeyAndWindow(word -> word, word -> 1,
              WindowConfig.TumblingCountWindow(50),
              (x, y) -> x + y);
  46. 46. TRADITIONAL REAL TIME - SEPARATE SYSTEMS 46 Messaging Compute
  47. 47. TRADITIONAL REAL TIME SYSTEMS 47 DEVELOPER EXPERIENCE ✦ Powerful API but complicated ๏ Does everyone really need to learn functional programming? ✦ Configurable and scalable but management overhead ✦ Edge systems have resource and management constraints
  48. 48. TRADITIONAL REAL TIME SYSTEMS 48 OPERATIONAL EXPERIENCE ✦ Multiple systems to operate ๏ IoT deployments routinely have thousands of edge systems ✦ Semantic differences ๏ Mismatch and duplication between systems ๏ Creates developer and operator friction
  49. 49. LESSONS LEARNED - USE CASES 49 ✦ Data transformations ✦ Data classification ✦ Data enrichment ✦ Data routing ✦ Data extraction and loading ✦ Real time aggregation ✦ Microservices Significant set of processing tasks are exceedingly simple
  50. 50. EMERGENCE OF CLOUD - SERVERLESS 50 ✦ Simple function API ✦ Functions are submitted to the system ✦ Runs per events ✦ Composition APIs to do complex things ✦ Wildly popular
  51. 51. SERVERLESS VS. STREAMING 51 ✦ Both are event driven architectures ✦ Both can be used for analytics and data serving ✦ Both have composition APIs ๏ Configuration based for serverless ๏ DSL based for streaming ✦ Serverless typically does not guarantee ordering ✦ Serverless is pay per action
  52. 52. STREAM NATIVE COMPUTE USING FUNCTIONS 52 APPLYING INSIGHT GAINED FROM SERVERLESS ✦ Simplest possible API - a function or a procedure ✦ Support for multiple languages ✦ Use of native API for each language ✦ Scale developers ✦ Use of message bus native concepts - input and output as topics ✦ Flexible runtime - simple standalone applications vs managed system applications
  53. 53. PULSAR FUNCTIONS 53 SDK LESS API
      import java.util.function.Function;

      public class ExclamationFunction implements Function<String, String> {
          @Override
          public String apply(String input) {
              return input + "!";
          }
      }
  54. 54. PULSAR FUNCTIONS 54 SDK API
      import org.apache.pulsar.functions.api.Context;
      import org.apache.pulsar.functions.api.PulsarFunction;

      public class ExclamationFunction implements PulsarFunction<String, String> {
          @Override
          public String process(String input, Context context) {
              return input + "!";
          }
      }
  55. 55. PULSAR FUNCTIONS 55 ✦ Simplest possible API - a function or a procedure ✦ Support for multiple languages ✦ Use of native API for each language ✦ Scale developers ✦ Use of message bus native concepts - input and output as topics ✦ Flexible runtime - simple standalone applications vs managed system applications
  56. 56. PULSAR FUNCTIONS 56 ✦ Function executed for every message of input topic ✦ Support for multiple topics as inputs ✦ Function output goes into output topic - can be void topic as well ✦ SerDe takes care of serialization/deserialization of messages ๏ Custom SerDe can be provided by the users ๏ Integration with schema registry
  57. 57. PROCESSING GUARANTEES 57 ✦ ATMOST_ONCE ๏ Message acked to Pulsar as soon as it is received ✦ ATLEAST_ONCE ๏ Message acked to Pulsar after the function completes ๏ Default behavior - we don't want people to lose data ✦ EFFECTIVELY_ONCE ๏ Uses Pulsar's built-in effectively-once semantics ✦ Controlled at runtime by user
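The difference between the first two guarantees comes down to when the ack happens relative to running the function. A minimal sketch of that ordering, assuming a simulated runtime: `AckOrderingSketch`, `MAX_DELIVERIES` and `failsOnce` are hypothetical names for illustration and are not part of the Pulsar Functions API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical stand-in for a function runtime; not the Pulsar Functions API.
final class AckOrderingSketch {
    enum Guarantee { ATMOST_ONCE, ATLEAST_ONCE }

    // Assumed cap on redeliveries so the simulation always terminates.
    static final int MAX_DELIVERIES = 3;

    /** Runs fn over the inputs and returns the outputs that were actually produced. */
    static List<String> run(List<String> inputs, Function<String, String> fn, Guarantee g) {
        List<String> outputs = new ArrayList<>();
        for (String msg : inputs) {
            // ATMOST_ONCE: the message is acked on receipt, before the function runs.
            boolean ackedOnReceipt = (g == Guarantee.ATMOST_ONCE);
            for (int attempt = 0; attempt < MAX_DELIVERIES; attempt++) {
                try {
                    outputs.add(fn.apply(msg));
                    break; // ATLEAST_ONCE: the ack would be sent here, after completion
                } catch (RuntimeException e) {
                    if (ackedOnReceipt) {
                        break; // already acked, so there is no redelivery: message lost
                    }
                    // Not acked yet: simulate the redelivery by trying again.
                }
            }
        }
        return outputs;
    }

    /** A function that fails the first time it sees each message, then succeeds. */
    static Function<String, String> failsOnce() {
        Set<String> seen = new HashSet<>();
        return s -> {
            if (seen.add(s)) {
                throw new RuntimeException("transient failure");
            }
            return s + "!";
        };
    }
}
```

With a function that fails once per message, the at-most-once run produces no output, while the at-least-once run produces every output after one redelivery.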
  58. 58. DEPLOYING FUNCTIONS - BROKER 58 Broker 1 Worker Function wordcount-1 Function transform-2 Broker 1 Worker Function transform-1 Function dataroute-1 Broker 1 Worker Function wordcount-2 Function transform-3 Node 1 Node 2 Node 3
  59. 59. DEPLOYING FUNCTIONS - WORKER NODES 59 Worker Function wordcount-1 Function transform-2 Worker Function transform-1 Function dataroute-1 Worker Function wordcount-2 Function transform-3 Node 1 Node 2 Node 3 Broker 1 Broker 2 Broker 3 Node 4 Node 5 Node 6
  60. 60. DEPLOYING FUNCTIONS - KUBERNETES 60 Function wordcount-1 Function transform-1 Function transform-3 Pod 1 Pod 2 Pod 3 Broker 1 Broker 2 Broker 3 Pod 7 Pod 8 Pod 9 Function dataroute-1 Function wordcount-2 Function transform-2 Pod 4 Pod 5 Pod 6
  61. 61. BUILT-IN STATE MANAGEMENT IN FUNCTIONS 61 ✦ Functions can store state in inbuilt storage ๏ Framework provides a simple library to store and retrieve state ✦ Support server side operations like counters ✦ Simplified application development ๏ No need to standup an extra system
  62. 62. DISTRIBUTED STATE IN FUNCTIONS 62
      import org.apache.pulsar.functions.api.Context;
      import org.apache.pulsar.functions.api.PulsarFunction;

      public class CounterFunction implements PulsarFunction<String, Void> {
          @Override
          public Void process(String input, Context context) throws Exception {
              // split() takes a regex, so the literal dot must be escaped
              for (String word : input.split("\\.")) {
                  context.incrCounter(word, 1);
              }
              return null;
          }
      }
  63. 63. PULSAR - DATA IN AND OUT 63 ✦ Users can write custom code using Pulsar producer and consumer API ✦ Challenges ๏ Where should the application that publishes data to or consumes data from Pulsar run? ๏ How should the application publish data to or consume data from Pulsar? ✦ Current systems have no organized and fault tolerant way to run applications that ingress and egress data from and to external systems
  64. 64. PULSAR IO TO THE RESCUE 64 Apache Pulsar ClusterSource Sink
  65. 65. INTERACTIVE QUERYING OF STREAMS - PULSAR SQL 65 1234…20212223…40414243…60616263… Segment 1 Segment 3 Segment 2 Segment 2 Segment 1 Segment 3 Segment 4 Segment 3 Segment 2 Segment 1 Segment 4 Segment 4 Segment Reader Segment Reader Segment Reader Segment Reader Coordinator
  66. 66. PULSAR IO - EXECUTION 66 Broker 1 Worker Sink Cassandra-1 Source Kinesis-2 Broker 2 Worker Source Kinesis-1 Source Twitter-1 Broker 3 Worker Sink Cassandra-2 Source Kinesis-3 Node 1 Node 2 Node 3 Fault tolerance Parallelism Elasticity Load Balancing On-demand updates
  67. 67. EASE OF OPERATIONS 67 ✦ Brokers don’t have durable state ๏ Easily replaceable ๏ Topics are immediately reassigned to healthy brokers ✦ Expanding capacity ๏ Simply add new broker node ๏ If other brokers are overloaded, traffic will be automatically assigned ✦ Load manager ๏ Monitor traffic load on all brokers (CPU, memory, network, topics) ๏ Initially place topics to least loaded brokers ๏ Reassign topics when a broker is overloaded
  68. 68. CONSUMER BACKLOG 68 ✦ Metrics are available to make assessments ๏ When did the problem start? ๏ How big is the backlog (messages, disk space)? ๏ How fast is it draining? ๏ What's the ETA to catch up with publishers? ✦ Establish where the bottleneck is ๏ Application is not fast enough ๏ Disk read I/O
  69. 69. ENFORCING MULTITENANCY 69 ✦ Ensure tenants don't cause performance issues for other tenants ๏ Backlog quotas ๏ Soft isolation - flow control and throttling ✦ In cases when user behavior is triggering performance degradation ๏ Hard isolation as a last resort for quick reaction while a proper fix is deployed ๏ Isolate tenant on a subset of brokers ๏ Can also be applied at the storage node level
  70. 70. PULSAR PERFORMANCE 70
  71. 71. PULSAR PERFORMANCE - LATENCY 71
  72. 72. APACHE PULSAR VS. APACHE KAFKA 72
      Multi-tenancy | A single cluster can support many tenants and use cases
      Seamless Cluster Expansion | Expand the cluster without any down time
      High throughput & Low Latency | Can reach 1.8 M messages/s in a single partition and publish latency of 5ms at 99pct
      Durability | Data replicated and synced to disk
      Geo-replication | Out of box support for geographically distributed applications
      Unified messaging model | Support both Topic & Queue semantic in a single model
      Tiered Storage | Hot/warm data for real time access and cold event data in cheaper storage
      Pulsar Functions | Flexible light weight compute
      Highly scalable | Can support millions of topics, makes data modeling easier
  73. 73. 73 SPARK STREAMING APACHE APEX APACHE FLINK SAMZA APACHE BEAM (MILLWHEEL, DATAFLOW) ATHENAX STYLUS APACHE STORM STREAMALERT APACHE HERON Processing
  74. 74. PROCESSING 74 TASK ISOLATIONLOW LATENCY MULTIPLE API'S MULTIPLE SEMANTICS DIVERSE WORKLOADS FAULT TOLERANCE KEY PROPERTIES
  75. 75. HERON TERMINOLOGY 75 Topology: Directed acyclic graph; vertices = computation, and edges = streams of data tuples. Spouts: Sources of data tuples for the topology; examples - Pulsar/Kafka/MySQL/Postgres. Bolts: Process incoming tuples and emit outgoing tuples; examples - filtering/aggregation/join/any function
  76. 76. HERON TOPOLOGY 76 % % % % % Spout 1 Spout 2 Bolt 1 Bolt 2 Bolt 3 Bolt 4 Bolt 5
  77. 77. HERON TOPOLOGY - PHYSICAL EXECUTION 77 % % % % % Spout 1 Spout 2 Bolt 1 Bolt 2 Bolt 3 Bolt 4 Bolt 5 %% %% %% %% %%
  78. 78. HERON GROUPINGS 78 01 Shuffle Grouping: Random distribution of tuples 02 Fields Grouping: Group tuples by a field or multiple fields 03 All Grouping: Replicates tuples to all tasks 04 Global Grouping: Send the entire stream to one task
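The four groupings differ only in how a tuple is mapped to downstream task indices. A minimal sketch of that mapping, assuming tasks are numbered 0 to numTasks-1; `GroupingSketch` is a hypothetical helper, and routing by `hashCode()` modulo the task count is an illustrative assumption rather than Heron's actual implementation.

```java
import java.util.Random;

// Hypothetical routing helper; not part of the Heron API.
final class GroupingSketch {
    /** Shuffle grouping: each tuple goes to a randomly chosen task. */
    static int shuffle(int numTasks, Random rng) {
        return rng.nextInt(numTasks);
    }

    /** Fields grouping: tuples with the same key always reach the same task. */
    static int fields(Object key, int numTasks) {
        // floorMod keeps the index non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), numTasks);
    }

    /** All grouping: every task receives a copy of the tuple. */
    static int[] all(int numTasks) {
        int[] targets = new int[numTasks];
        for (int i = 0; i < numTasks; i++) {
            targets[i] = i;
        }
        return targets;
    }

    /** Global grouping: the whole stream goes to one task (task 0 is an assumption here). */
    static int global() {
        return 0;
    }
}
```

Fields grouping is what makes per-key aggregation such as word count correct, since every occurrence of a key lands on the same bolt instance; it is also the grouping that exposes a topology to data skew.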
  79. 79. HERON TOPOLOGY - PHYSICAL EXECUTION 79 % % % % % Spout 1 Spout 2 Bolt 1 Bolt 2 Bolt 3 Bolt 4 Bolt 5 %% %% %% %% %% Shuffle Grouping Shuffle Grouping Fields Grouping Fields Grouping Fields Grouping Fields Grouping
  80. 80. HERON TOPOLOGY - PHYSICAL EXECUTION 80 % % % % % Spout 1 Spout 2 Bolt 1 Bolt 2 Bolt 3 Bolt 4 Bolt 5 %% %% %% %% %% Shuffle Grouping Shuffle Grouping Fields Grouping Fields Grouping Fields Grouping Fields Grouping
  81. 81. HERON ARCHITECTURE 81 Scheduler Topology 1 Topology 2 Topology N Topology Submission
  82. 82. HERON ARCHITECTURE 82 Topology Master ZK Cluster Stream Manager I1 I2 I3 I4 Stream Manager I1 I2 I3 I4 Logical Plan, Physical Plan and Execution State Sync Physical Plan DATA CONTAINER DATA CONTAINER Metrics Manager Metrics Manager Metrics Manager Health Manager MASTER CONTAINER
  83. 83. TOPOLOGY MASTER 83 Monitoring of containers Gateway for metrics Assigns role
  84. 84. STREAM MANAGER 84 Routes tuples Implements backpressure Processing Semantics
  85. 85. STREAM MANAGER 85 % % S1 B2 B3 % B4
  86. 86. STREAM MANAGER 86 S1 B2 B3 Stream Manager Stream Manager Stream Manager Stream Manager S1 B2 B3 B4 S1 B2 B3 S1 B2 B3 B4 B4 PHYSICAL EXECUTION
  87. 87. STREAM MANAGER BACK PRESSURE 87 Spout based back pressureTCP backpressure Stage by stage back pressure
  88. 88. HERON INSTANCE 88 Runs only one task (spout/bolt) Exposes Heron API Collects several metrics API G
  89. 89. HERON INSTANCE 89 Stream Manager Metrics Manager Gateway Thread Task Execution Thread data-in queue data-out queue metrics-out queue
  90. 90. HERON DEPLOYMENT 90 Topology 1 Topology 2 Topology N Heron Tracker Heron API Server Heron Web ZK Cluster Aurora Services Observability
  91. 91. HERON VISUALIZATION 91
  92. 92. HERON TOPOLOGY COMPLEXITY 92
  93. 93. HERON TOPOLOGY SCALE 93 CONTAINERS - 1 TO 600 INSTANCES - 10 TO 6000
  94. 94. HERON HAPPY FACTS :) 94 ✦ No more pages during midnight for Heron team ๏ Backlog quotas ๏ Soft isolation - flow control and throttling ✦ Very rare incidents for Heron customer teams ✦ Easy to debug during incidents for quick turn around ๏ Reduced resource utilization saving cost (~3x cost savings)
  95. 95. 95 Coffee Break
  96. 96. HERON DEVELOPER ISSUES 96 01 Container resource allocation 02 Parallelism tuning
  97. 97. HERON OPERATIONAL ISSUES 97 01 Slow Hosts 02 Network Issues 03 Data Skew 04 Load Variations 05 SLA Violations
  98. 98. SLOW HOSTS 98 Memory Parity Errors Impending Disk Failures Lower GHZ
  99. 99. NETWORK ISSUES 99 Network Partitioning G Network Slowness
  100. 100. NETWORK SLOWNESS 100 01 Delays processing 02 Data is accumulating 03 Timeliness of results is affected
  101. 101. DATA SKEW 101 Multiple Keys: Several keys map into a single instance and their count is high. Single Key: A single key maps into an instance and its count is high
  102. 102. LOAD VARIATIONS 102 Multiple Keys: Several keys map into a single instance and their count is high. Single Key: A single key maps into an instance and its count is high
  103. 103. SELF REGULATING STREAMING SYSTEMS 103 AUTOMATE Tuning: Manual, time-consuming and error-prone task of tuning various system knobs to achieve SLOs. SLO Maintenance: Maintenance of SLOs in the face of unpredictable load variations and hardware or software performance degradation. Self Regulating Streaming Systems: System that adjusts itself to environmental changes and continues to produce results
  104. 104. SELF REGULATING STREAMING SYSTEMS 104 Self tuning: Several tuning knobs; time consuming tuning phase; the system should take as input an SLO and automatically configure the knobs. Self stabilizing: Stream jobs are long running; load variations are common; the system should react to external shocks and automatically reconfigure itself. Self healing: System performance affected by hardware or software delivering degraded quality of service; the system should identify internal faults and attempt to recover from them
  105. 105. ENTER DHALION 105 Dhalion periodically executes well-specified policies that optimize execution based on some objective. We created policies that dynamically provision resources in the presence of load variations and auto-tune streaming applications so that a throughput SLO is met. Dhalion is a policy based framework integrated into Heron
  106. 106. DHALION POLICY FRAMEWORK 106 Metrics → Symptom Detection (Symptom Detector 1 ... N) → Symptom 1 ... N → Diagnosis Generation (Diagnoser 1 ... M) → Diagnosis 1 ... M → Resolution (Resolver Selection among Resolver 1 ... M, then Resolver Invocation)
  107. 107. DYNAMIC RESOURCE PROVISIONING 107 Policy: This policy reacts to unexpected load variations (workload spikes). Goal: Scale the topology resources up and down as needed while keeping the topology in a steady state where back pressure is not observed
  108. 108. DYNAMIC RESOURCE PROVISIONING 108 Pending Tuples Detector Backpressure Detector Processing Rate Skew Detector Resource Over provisioning Diagnoser Resource Under Provisioning Diagnoser Data Skew Diagnoser Resolver Invocation Diagnosis Symptoms Symptom Detection Diagnosis Generation Resolution Metrics Slow Instances Diagnoser Bolt Scale Down Resolver Bolt Scale Up Resolver Data Skew Resolver Restart Instances Resolver
  109. 109. PROCESSING HERON PULSAR BOOKKEEPER UNIFICATION 109 MESSAGING STORAGE STREAMLIO
  110. 110. STREAMLIO 110 UNIFIED ARCHITECTURE Interactive Querying Storm API Streamlet SQL Application Builder Pulsar API BK/ HDFS API Kubernetes Metadata Management Operational Monitoring Chargeback Security Authentication Quota Management Kafka API
  111. 111. STREAMLIO ARCHITECTURE 111 KEY HIGHLIGHTS DAG PROCESSING GEO-REPLICATION DURABILITY CONSISTENCY FUNCTION PROCESSING INFINITE RETENTION SQL
  112. 112. 112 ETL
  113. 113. DATA PROCESSING 113 CLEANSING AND TRANSFORMATION Data format consistency M:0:Male, F:1:Female Missing Data Imputation Pre-computation Sum, Max, Min, Distance Timestamp Hour of the day, Date
  114. 114. DATA ENRICHMENT 114 MULTIPLE DATA SOURCES Operators Associate Aggregate Filter Properties Scalable Fault-tolerant Secure Challenges Encryption Merge Union Sort
  115. 115. ETL DATA PIPELINES - RETAIL 115 Scenario Consumer retail company migrating multiple data and analytics applications to public cloud Challenges Needed a solution to connect data to dashboards, data warehouse, and applications Solution Cloud data pipeline built on Apache Pulsar to support data movement, transformation, and access
  116. 116. ETL DATA PIPELINES - RETAIL 116 Cloud data warehouse Batch reporting and analytics Real-time dashboards and alerts Data sources and applications Messaging, queuing, data transformations Cloud data lake
  117. 117. 117 Enterprise Event Bus
  118. 118. DATA SILOS 118 INTEGRATION Disparate Data Sources Multiple Data Types Different Frequency Data Fidelity
  119. 119. DATA FUSION 119 EXTRACTING INSIGHTS Why? Complementarity → Completeness Redundancy → Accuracy Cooperative → Dependability Planning Decision-Making Applications Smart Buildings Remote sensing Transportation Multi-Target Tracking (MTT)
  120. 120. 120 STREAMING JOINS CHALLENGES Bounded sliding window of tuples Traditional join semantics Hard to exploit many-core parallelism Sequential operation model (store and process) Linear data flow model (left-to-right / right-to-left) Performance ๏ Deployment parameters ❖ Processing capacity, level of parallelism ๏ Time-varying input parameters ❖ Varying sources, Rate of tuples SplitJoin * Gulisano et al., “Performance Modeling of Stream Joins”, DEBS 2017.
  121. 121. ENTERPRISE EVENT BUS 121 Apache Pulsar Cluster Application 2 Application 5 Application 1 Application 3 Application 4 Application 6
  122. 122. ENTERPRISE EVENT BUS - YAHOO! 122 Scenario Need to collect and distribute user and data events to distributed global applications at Internet scale Challenges ✦ Multiple technologies to handle messaging needs ✦ Multiple, siloed messaging clusters ✦ Hard to meet scale and performance ✦ Complex, fragile environment Solution ✦ Central event data bus using Apache Pulsar ✦ Consolidated multiple technologies and clusters into a single solution ✦ Fully replicated across 8 global datacenters ✦ Processing >100B messages / day, 2.3M topics
  123. 123. 123 UNBOUNDED UNORDERED EVENT TIME PROCESSING TIME LOW LATENCY LIMITED MEMORY SINGLE PASS , INCREMENTAL APPROXIMATE
  124. 124. APPROXIMATION 124 Guaranteed bounds on approximation factors Error greater than 𝜖 with probability 𝛿 BOUNDS PROBABILISTICDETERMINISTIC
  125. 125. APPROXIMATION 125 CHEBYSHEV HOEFFDING MARKOV PROBABILISTIC BOUNDS X: Random variable, μ: expectation, 𝜖>0
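The formulas on this slide did not survive extraction. For reference, the textbook statements of the three named bounds are given below in the slide's notation (X a random variable, μ its expectation, ε > 0). The Hoeffding form assumes n independent variables X_i bounded in [a_i, b_i] with sample mean X̄; those symbols are introduced here and are not from the slide.

```latex
% Markov (requires X >= 0):
\Pr[X \ge \epsilon] \le \frac{\mu}{\epsilon}

% Chebyshev (requires finite variance):
\Pr\bigl[\,|X - \mu| \ge \epsilon\,\bigr] \le \frac{\operatorname{Var}(X)}{\epsilon^{2}}

% Hoeffding (X_1, ..., X_n independent, a_i <= X_i <= b_i,
% \bar{X} = (1/n) \sum_i X_i):
\Pr\bigl[\,|\bar{X} - \mathbb{E}[\bar{X}]| \ge \epsilon\,\bigr]
  \le 2 \exp\!\left( - \frac{2 n^{2} \epsilon^{2}}{\sum_{i=1}^{n} (b_i - a_i)^{2}} \right)
```

Markov needs only the mean, Chebyshev adds the variance, and Hoeffding exploits boundedness and independence to get an exponentially decaying tail, which is why it is the usual tool for sizing samples and sketches.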
  126. 126. 126 BIAS VS. VARIANCE PROPERTIES BIAS VARIANCE How much the average of the estimate differs from the true mean Variance of the estimate around its mean
  127. 127. 127 BIAS VS. VARIANCE TRADE-OFF * Illustration borrowed from https://web.stanford.edu/~hastie/ElemStatLearn/
  128. 128. DATA SKETCHES 128
  129. 129. 129 SAMPLING FILTERING CARDINALITY QUANTILE FREQUENT MOMENTS SPECTRUM OF DATA SKETCHES
  130. 130. SAMPLING 130 Volume ๏ Unbounded Velocity Types ๏ Uniform ๏ Non-uniform ❖ Class imbalance Latency
  131. 131. SAMPLING 131 OBTAINING A REPRESENTATIVE SAMPLE ✦ Maintain dynamic sample ๏ A data stream is a continuous process ๏ Not known in advance how many points may elapse before an analyst may need to use a representative sample ✦ Reservoir sampling[1] ๏ Probabilistic insertions and deletions on arrival of new stream points ๏ Probability of successive insertion of new points reduces with progression of the stream ❖ An unbiased sample contains a larger and larger fraction of points from the distant history of the stream ✦ Practical perspective ๏ Data stream may evolve and hence, the majority of the points in the sample may represent the stale history [1] J. S. Vitter. Random Sampling with a Reservoir. ACM Transactions on Mathematical Software, Vol. 11(1):37–57, March 1985.
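The insertion rule described above can be sketched with the basic reservoir scheme (often called Algorithm R): the first k points fill the reservoir, and the t-th point afterwards is kept with probability k/t, evicting a uniformly chosen resident. This is a minimal sketch, not Vitter's optimized variants; the class name `ReservoirSampler` is an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Minimal reservoir sampler (basic Algorithm R); the class name is hypothetical.
final class ReservoirSampler<T> {
    private final int k;          // sample size
    private final Random rng;
    private final List<T> sample;
    private long seen = 0;        // number of stream points observed so far

    ReservoirSampler(int k, Random rng) {
        this.k = k;
        this.rng = rng;
        this.sample = new ArrayList<>(k);
    }

    /** Offers the next stream point to the reservoir. */
    void offer(T item) {
        seen++;
        if (sample.size() < k) {
            sample.add(item); // the first k points are always kept
            return;
        }
        // Draw j roughly uniformly from [0, seen); the point is kept iff j < k,
        // i.e. with probability k/seen, which shrinks as the stream progresses.
        long j = (long) (rng.nextDouble() * seen);
        if (j < k) {
            sample.set((int) j, item); // evict the resident in slot j
        }
    }

    /** Returns a copy of the current sample. */
    List<T> sample() {
        return new ArrayList<>(sample);
    }

    long seen() {
        return seen;
    }
}
```

Because the acceptance probability decays as k/t, an unbiased reservoir over a long stream is dominated by old points, which is the staleness concern raised under the practical perspective.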
  132. 132. SAMPLING 132 ✦ Sliding window approach* (sample size k, window width n) ๏ Sequence-based ❖ Replace expired element with newly arrived element ‣ Disadvantage: highly periodic ❖ Chain-sample approach ‣ Select the ith element with probability Min(i,n)/n ‣ Select uniformly at random an index from [i+1, i+n] of the element which will replace the ith item ‣ Maintain k independent chain samples ๏ Timestamp-based ❖ # elements in a moving window may vary over time ❖ Priority-sample approach * B. Babcock. Sampling From a Moving Window Over Streaming Data. In Proceedings of SODA, 2002. [Figure: chain sampling over the example stream 3 5 1 4 6 2 8 5 2 3 5 4 2 2 5 0 9 8 4 6 7 3]
  133. 133. SAMPLING 133 COMPRESSED SENSING Distributed ๏ Communication cost ๏ Non-uniform sampling rates High dimensionality ๏ Sparsity Applications Sensor networks Network traffic monitoring Facial recognition Mitigate latency impact
  134. 134. 134 COMPRESSED SENSING EARLY WORK [Candès et al. 2006] [Donoho 2006] [Cormode and Muthukrishnan 2006] [Freris et al. 2013] Recursive approach COMPRESSED SENSING FOR STREAMING DATA [Yan et al. 2015] DISTRIBUTED OUTLIER DETECTION COMPRESSIVE SAMPLING [Candès and Wakin 2008] RELATED WORKS
  135. 135. 135
  136. 136. 136 Document 2015 EVENTS E-COMMERCE OPPORTUNITY
  137. 137. MODEL 137 INCREMENTAL HANDLE NON-STATIONARITY, CONCEPT DRIFT EVOLVE CONTINUOUSLY
  138. 138. PERSONALIZATION 138 THE HOLY GRAIL Continuously evolving data streams Time decay Multi-armed bandit Contextual Deep learning Privacy
  139. 139. INVENTORY MANAGEMENT 139 FORECASTING Mining retail transactions Seasonality Virality New product launches
  140. 140. 140 Real-Time Trending
  141. 141. 141 HERBERT SIMON “A wealth of information creates a poverty of attention.”
  142. 142. 142 CONTINUOUS (PREFERENCE) TOP-K ELEMENTS * Mouratidis et al., “Continuous monitoring of top-k queries over sliding windows”, SIGMOD 2006. # Yu et al., “Processing a large number of continuous preference top-k queries”, SIGMOD 2012. ^ Zhang et al., “Reverse k-ranks query”, VLDB 2014.
  143. 143. 143 TWITTER TRENDING NEWS
  144. 144. TRENDING 144 Spotify, Apple Music, Pandora AUDIO
  145. 145. 145 YouTube, Vimeo, DailyMotion TRENDING VIDEO
  146. 146. Facebook Live, Periscope, Twitch 146 TRENDING LIVE VIDEO
  147. 147. 147 FAKE ADDRESSING INTEGRITY
  148. 148. 148 BIAS VARIOUS SOURCES Data bias ๏ Public image datasets ❖ Geographic representation ๏ Induces algorithmic bias[1] ❖ Online advertising ❖ Jobs recommendation ❖ Customer profiling Anomalies Data fidelity ๏ Bad actors ❖ Bots ❖ Data farms * Figure borrowed from Baeza-Yates, “Bias on the web”, 2018. [1] Hajian et al., “Algorithmic bias: From Discrimination Discovery to Fairness-aware Data mining”, KDD 2016.
  149. 149. 149 FACT CHECKING CROWDSOURCING
  150. 150. REAL TIME TRENDING IN PULSAR & HERON 150 Streamlio (Apache Pulsar and Apache Heron) Data Source 2 clean-fn 2 Data Source 1 Data Source 3 clean-fn 1 trend- topology 3 Trending Application T1 T2 T3
  151. 151. LATENCY 151 USER EXPERIENCE
  152. 152. THE TAIL[1] 152 QUANTILE ESTIMATION CLASSIFICATION[2] ✦ Whether elements can only be added or can be both inserted and deleted? ๏ Cash register model ๏ Turnstile model ✦ What operations are allowed on the elements? ๏ Comparison model ๏ Fixed-universe model ✦ Whether the algorithm is deterministic/randomized? ๏ Monte Carlo randomization [1] J. Dean and L. Barroso, “The tail at scale”, 2013. [2] G. Luo et al., “Quantiles over data streams: experimental comparisons, new analyses, and further improvements”, 2016.
  153. 153. QUANTILE ESTIMATION 153 [Arasu and Manku 2005] SLIDING WINDOWS [Cormode et al. 2005] [Yi et al. 2009] CONTINUOUS MONITORING [Shrivastava et al. 2004] [Agarwal et al. 2012] DISTRIBUTED DATA [Cormode et al. 2006] BIASED FLAVORS * Wang et al., “Quantiles over Data Streams: An Experimental Study”, SIGMOD 2013.
  154. 154. QUANTILE ESTIMATION 154 PICK [Blum et al. 1973] EXACT MEDIAN [Munro and Paterson 1980] (p passes, space) Q-DIGEST [Shrivastava et al. 2004] T-DIGEST [Dunning and Ertl 2017] Interest in this problem may be traced to the realm of sports and the design of (traditionally, tennis) tournaments to select the first- and second-best players. In 1883, Lewis Carroll published an article denouncing the unfair method by which the second-best player is usually determined in a "knockout tournament" -- the loser of the final match is often not the second-best! (Any of the players who lost only to the best player may be second-best.) Around 1930, Hugo Steinhaus brought the problem into the realm of algorithmic complexity by asking for the minimum number of matches required to (correctly) select both the first- and second-best players from a field of n contestants.
  155. 155. Q-DIGEST[1] 155 ✦ Groups values in variable size buckets of almost equal weights ๏ Unlike a traditional histogram, buckets can overlap ✦ Key features ๏ Detailed information about frequent values preserved ๏ Less frequent values lumped into larger buckets ✦ Using message of size m, answer within an error of ✦ Except root and leaf nodes, a node v ∈ q-digest iff [Formula legend: max signal value, # elements, compression factor, complete binary tree] [1] Shrivastava et al., “Medians and Beyond: New Aggregation Techniques for Sensor Networks”. SenSys, 2004.
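The two formulas on this slide were images and are missing from the transcript. As a hedged reconstruction from the cited SenSys 2004 paper, using the slide's legend (σ the max signal value, n the number of elements, k the compression factor), the digest property and error bound are usually stated as below; the symbols v_p and v_s for the parent and sibling of node v are naming assumptions made here.

```latex
% q-digest property for a node v of the complete binary tree over [1, \sigma]
% (the root is exempt from the second condition and leaves from the first):
\mathrm{count}(v) \le \left\lfloor \frac{n}{k} \right\rfloor ,
\qquad
\mathrm{count}(v) + \mathrm{count}(v_p) + \mathrm{count}(v_s)
  > \left\lfloor \frac{n}{k} \right\rfloor

% Error of a quantile query when the digest fits in a message of size m:
\text{error} = O\!\left( \frac{\log \sigma}{m} \right)
```

The first condition keeps any single bucket from becoming too heavy, and the second forces sparse children to be merged into their parent, which is what bounds the digest size.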
  156. 156. Q-DIGEST (CONTD.) 156 ✦ Building a q-digest ✦ q-digests can be constructed in a distributed fashion ๏ Merge q-digests
  157. 157. T-DIGEST[1] 157 ✦ Approximation of rank-based statistics ๏ Compute quantile q with an accuracy relative to max(q, 1-q) ๏ Compute hybrid statistics such as trimmed statistics ✦ Key features ๏ Robust with respect to highly skewed distributions ๏ Independent of the range of input values (unlike q-digest) ๏ Relative error is bounded ❖ Non-equal bin sizes ❖ Few samples contribute to the bins corresponding to the extreme quantiles ๏ Merging independent t-digests ❖ Reasonable accuracy [1] T. Dunning and O. Ertl, “Computing Extremely Accurate Quantiles using t-digests”, 2017. https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf
  158. 158. T-DIGEST (CONTD.) 158 ✦ Group samples into sub-sequences ๏ Smaller sub-sequences near the ends ๏ Larger sub-sequences in the middle ✦ Scaling function ๏ Mapping k is monotonic ❖ k(0) = 1 and k(1) = δ ❖ k-size of each subsequence < 1 s Notional Index Compression parameter Quantile
  159. 159. T-DIGEST (CONTD.) 159 ✦ Estimating quantile via interpolation ๏ Sub-sequences contain centroid of the samples ๏ Estimate the boundaries of the sub-sequences ✦ Error ๏ Scales quadratically in # samples ๏ Small # samples in the sub-sequences near q=0 and q=1 improves accuracy ๏ Lower accuracy in the middle of the distribution ❖ Larger sub-sequences in the middle ✦ Two flavors ๏ Progressive merging (buffering based) and clustering variant s
  160. 160. 160 RELATED WORKS SUMMARY DATA STRUCTURE [Greenwald and Khanna 2001] [Luo et al. 2016] RANDOM BQ-SUMMARY [Zhang et al. 2006] One-scan randomized MR [Cormode et al. 2006] A SMALL SNAPSHOT
  161. 161. 161
  162. 162. PERFORMANCE 162 MONITORING CPM (Cost per 1000 Impressions), vCPM, CPCV CPC (Cost per Click) CPI (Cost per Install) CPE (Cost per Engagement) CPA (Cost per Action), CPO
  163. 163. PERFORMANCE 163 MONITORING AD FORMATS POP UNDER INTERSTITIAL REDIRECTBANNERS REWARDED VIDEO POP UP SOCIAL VIDEO
  164. 164. FRAUD 164 DETECTION IMPRESSION CLICK INSTALL ENGAGEMENT
  165. 165. FRAUD DETECTION USING APACHE PULSAR 165 Apache Pulsar Data Source 2 clean-fn 2 Data Source 1 Data Source 3 clean-fn 1 fraud-fn 1 Fraud Detection T1 T2 T3 fraud-fn 2
  166. 166. IP/DEVICE ID BLACKLISTING 166 CUCKOO FILTER [Fan et al. CoNext 2014] QUOTIENT FILTER [Bender et al. Hot Storage 2011] MORTON FILTER [Breslow and Jayasena, VLDB 2018] BLOOM FILTER [Bloom, CACM 1970] FILTERING
  167. 167. BLOOM FILTER 167 [1] Illustration borrowed from http://www.eecs.harvard.edu/~michaelm/postscripts/im2005b.pdf
  168. 168. BLOOM FILTER 168 ✦ Natural generalization of hashing ✦ False positives are possible ✦ No false negatives ✦ No deletions allowed ✦ For false positive rate ε, # hash functions = log2(1/ε) where n = # elements, k = # hash functions, m = # bits in the array
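A minimal sketch of these properties for IP blacklisting, sized with the standard formulas m = -n ln ε / (ln 2)² and k = log2(1/ε), both rounded up. The class name `BloomFilterSketch`, the use of `String.hashCode()`, and deriving the k indices by double hashing from two base hashes are illustrative assumptions, not something the slides prescribe.

```java
import java.util.BitSet;

// Minimal Bloom filter sketch; names and hash choices are assumptions.
final class BloomFilterSketch {
    private final BitSet bits;
    private final int m; // # bits in the array
    private final int k; // # hash functions

    BloomFilterSketch(int expectedItems, double fpRate) {
        double ln2 = Math.log(2);
        // m = -n ln(eps) / (ln 2)^2 and k = log2(1/eps), both rounded up.
        this.m = Math.max(1, (int) Math.ceil(-expectedItems * Math.log(fpRate) / (ln2 * ln2)));
        this.k = Math.max(1, (int) Math.ceil(Math.log(1.0 / fpRate) / ln2));
        this.bits = new BitSet(m);
    }

    /** Bit mixer used as the second base hash; forced odd so it is never zero. */
    private static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h | 1;
    }

    /** i-th index via double hashing: (h1 + i * h2) mod m. */
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = mix(h1);
        return Math.floorMod(h1 + i * h2, m);
    }

    void add(String item) {
        for (int i = 0; i < k; i++) {
            bits.set(index(item, i));
        }
    }

    /** False means definitely absent; true means present or a false positive. */
    boolean mightContain(String item) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }

    int numBits() { return m; }
    int numHashes() { return k; }
}
```

Bits are only ever set, never cleared, which is why there are no false negatives and also why plain Bloom filters cannot support deletion.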
  169. 169. VARIATIONS 169 BLOOM FILTER [Mitzenmacher 2002] COMPRESSED [Fan et al. 2000] COUNTING [Cohen and Matias 2003] SPECTRAL [Chazelle et al. 2004] BLOOMIER FILTER [Guo et al. 2010] BUFFERED [Lall and Ogihara, 2007] BITWISE [Yoon 2010] A2 [Canim et al. 2010] DYNAMIC [Einziger and Friedman, 2014] BLOOM-g [Debnath et al. 2011] Bloomflash [Naor and Yogev, 2013] SLIDING [Qiao et al. 2014] Tinyset
  170. 170. QUOTIENT FILTER 170 ✦ Supports insertion, deletion ✦ Merging/resizing feasible ✦ Lookups incur a single cache miss ✦ 20% bigger than Bloom Filter ✦ Stores p-bit fingerprints of elements ✦ Compact hash table ✦ Employs quotienting: remainder (r least significant bits) is stored in the bucket indexed by the quotient (q most significant bits) ✦ P(hard collision) = where ⍺ = n/m, m = 2^q, n: # elements, m: # slots
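Quotienting itself is just a bit split of the p-bit fingerprint, with p = q + r. A minimal sketch of that split and its inverse follows; `QuotientingSketch` is a hypothetical helper that covers only the split, not the slot metadata or collision handling a full quotient filter needs.

```java
// Hypothetical helper showing only the quotient/remainder split of a fingerprint.
final class QuotientingSketch {
    /** Quotient: the q most significant bits, used as the bucket index. */
    static int quotient(int fingerprint, int r) {
        return fingerprint >>> r;
    }

    /** Remainder: the r least significant bits, the value stored in the bucket. */
    static int remainder(int fingerprint, int r) {
        return fingerprint & ((1 << r) - 1);
    }

    /** The full fingerprint is recoverable from the bucket index plus stored remainder. */
    static int fingerprint(int quotient, int remainder, int r) {
        return (quotient << r) | remainder;
    }
}
```

Because the bucket index carries the top q bits implicitly, each slot stores only r bits, and the ability to recover the full fingerprint is what makes merging and resizing feasible without rehashing the original keys.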
  171. 171. VARIATIONS 171 QUOTIENT FILTER [Bender et al. 2011] CASCADE [Pandey et al. 2017] COUNTING RANK-AND-SELECT ✦ Multiset of integer, each of width p-bits ✦ l =log2( n/M) + O(1) in-flash QFs of exponentially increasing size, QF1, QF2, …, QFl stored contiguously, M: RAM size ✦ Search: O(log(n/M)) block reads ✦ Insert: O((log( n/M))/B) amortized block writes/erases, B: natural block size of the flash [Dutta et al. 2014] STREAMING [Bungeroth, 2018]
  172. 172. CUCKOO FILTER 172 ✦ Key Highlights ๏ Add and remove items dynamically ๏ For false positive rate ε < 3%, more space efficient than Bloom filter ๏ Higher performance than Bloom filter for many real workloads ๏ Asymptotically worse performance than Bloom filter ❖ Min fingerprint size α log (# entries in table) ✦ Overview ๏ Stores only a fingerprint of an item inserted ❖ Original key and value bits of each item not retrievable ๏ Set membership query for item x: search hash table for fingerprint of x
  173. 173. CUCKOO FILTER 173 Cuckoo Hashing [1] ✦ High space occupancy ✦ Practical implementations: multiple items/bucket ✦ Example uses: Software-based Ethernet switches Cuckoo Filter [2] ✦ Uses a multi-way associative Cuckoo hash table ✦ Employs partial-key cuckoo hashing ๏ Store fingerprint of an item ๏ Relocate existing fingerprints to their alternative locations [Figure: illustration of Cuckoo hashing [2]] [1] R. Pagh and F. Rodler. Cuckoo hashing. Journal of Algorithms, 51(2):122-144, 2004. [2] Illustration borrowed from “Fan et al. Cuckoo Filter: Practically Better Than Bloom. In Proceedings of the 10th ACM International on Conference on Emerging Networking Experiments and Technologies, 2014.”
  174. 174. CUCKOO FILTER 174 ✦ Deletion ๏ Item must have been previously inserted ✦ Partial-key cuckoo hashing ✦ Fingerprint hashing ensures uniform distribution of items in the table ✦ Length of fingerprint << Size of h1 or h2 ✦ Possible to have multiple entries of a fingerprint in a bucket Alternate bucket Significantly shorter than h1 and h2
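The partial-key cuckoo hashing idea can be sketched as follows. In Fan et al. the two candidate buckets are h1(x) = hash(x) and h2(x) = h1(x) XOR hash(fingerprint(x)), so the alternate bucket can be computed from the fingerprint alone. The class below is an illustrative toy, not the authors' implementation: the class name, parameters, SHA-256 hashing, and the power-of-two bucket count (needed for the XOR trick) are all assumptions.

```python
import hashlib
import random


def _h(data: bytes) -> int:
    """64-bit hash taken from SHA-256 (illustrative choice)."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class CuckooFilter:
    def __init__(self, num_buckets=64, bucket_size=4, fp_bits=8,
                 max_kicks=500, seed=0):
        # XOR-based alternate index needs a power-of-two bucket count.
        assert num_buckets & (num_buckets - 1) == 0
        self.n = num_buckets
        self.bucket_size = bucket_size
        self.fp_bits = fp_bits
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    def _fingerprint(self, item: str) -> int:
        return _h(b"fp:" + item.encode("utf-8")) & ((1 << self.fp_bits) - 1)

    def _index1(self, item: str) -> int:
        return _h(item.encode("utf-8")) % self.n

    def _alt(self, index: int, fp: int) -> int:
        # Alternate bucket depends only on the current bucket and the fingerprint.
        return (index ^ _h(fp.to_bytes(8, "big"))) % self.n

    def insert(self, item: str) -> bool:
        fp = self._fingerprint(item)
        i1 = self._index1(item)
        i2 = self._alt(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets full: evict a resident fingerprint and relocate it.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(len(self.buckets[i]))
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # table considered full

    def contains(self, item: str) -> bool:
        fp = self._fingerprint(item)
        i1 = self._index1(item)
        return fp in self.buckets[i1] or fp in self.buckets[self._alt(i1, fp)]

    def delete(self, item: str) -> bool:
        # Only safe for items that were actually inserted.
        fp = self._fingerprint(item)
        i1 = self._index1(item)
        for i in (i1, self._alt(i1, fp)):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

Deleting an item that was never inserted could remove another item's identical fingerprint, which is why the slide requires that a deleted item must have been previously inserted.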
  175. 175. CONCURRENT CUCKOO HASHING VARIATIONS 175 CUCKOO HASHING AND CUCKOO FILTER [Li et al. 2014] [Eppstein et al. 2017] 2-3 CUCKOO FILTER HORTON TABLES [Sun et al. 2017] SMART CUCKOO [Breslow et al. 2016]
  176. 176. MORTON FILTER 176 ✦ Key Highlights ๏ Insertions heavily biased towards the hash function H1 ❖ Fingerprint retrieval requires fewer hardware cache accesses ๏ Overflow Tracking Array (OTA): simple bit vector ❖ Tracks when fingerprints cannot be placed using H1 ❖ Most negative lookups only require accessing a single bucket ๏ Fullness counters ❖ Replace the empty slots (compression) and track the load of each logical bucket ❖ Reads and updates happen in-situ without explicit need for materialization ๏ Most insertions require accessing a single cache line ๏ Fewer fingerprint comparisons than a Cuckoo filter
  177. 177. MORTON FILTER 177 ✦ Hashing ๏ Reduces TLB misses, DRAM row buffer misses and page faults ๏ Does not require total # buckets to be a power of 2 K’s fingerprint Buckets per block Total buckets (multiple of 2) Bucket index where K’s fingerprint is stored Hash function ๏ Compute H’(K) and remap K’s fingerprint without knowing K itself ๏ For any key, candidate buckets mostly fall within the same page of memory Power of 2
  178. 178. SPEND 178 DYNAMIC (RE)-ALLOCATION SANs Facebook, Google, Twitter, Yahoo, Snapchat Ad Networks Affiliates Offer wall TV
  179. 179. 179
  180. 180. EXCHANGES 180 PROGRAMMATIC DoubleClick MoPub OpenX AppNexus
  181. 181. USER PROFILE 181 MULTI-DIMENSIONAL TARGETING Platform Device Type OS Version Apps Installed Age, Gender
  182. 182. STATISTICAL ARBITRAGE 182 PREDICTION Click Through Rate (CTR) Click to Install (CTI) Life Time Value (LTV) Churn probability REAL-TIME ANOMALY DETECTION
  183. 183. ANOMALY DETECTION 183 Temporal Distribution based A N O M A L Y DIFFERENT FLAVORS safaribooksonline.com/library/view/understanding-anomaly-detection/9781491983676/ “On the Runtime-Efficacy Trade-off of Anomaly Detection Techniques for Real-Time Streaming Data”, by Choudhary et al. 2017. (https://arxiv.org/abs/1710.04735)
  184. 184. ANOMALY DETECTION 184 ROOTED IN STUDIES IN PROBABILITY 1777 1713 Reconciling discrepant observations Euler 1749 Irregularities in the motion of Saturn and Jupiter Mayer 1750 Study of lunar libration Boscovich 1755 Measurements of the mean ellipticity of the Earth
  185. 185. ANOMALY DETECTION 185 LONG HISTORY First formalization of an exact rule for rejection of anomalies
  186. 186. ANOMALY DETECTION 186 CHARACTERISTICS MAGNITUDE Severity WIDTH Actionability FREQUENCY Reliability DIRECTION Positive/Negative Global Local
  187. 187. ANOMALY DETECTION 187 WHY IT’S NON-TRIVIAL SEASONALITY TREND BREAKOUT STATIONARITY NOISE
  188. 188. ANOMALY DETECTION 188 ASSUMPTIONS Normality, Stationarity MOVING AVERAGES Params: Width, Decay RULE BASED µ ± σ COMMON APPROACHES
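As a rough illustration of the rule-based µ ± σ approach over a moving window, the sketch below flags a point when it falls more than k standard deviations from the mean of the preceding `width` points. The function name, the default window width, and k = 3 are assumptions for illustration, not values taken from the slides.

```python
from collections import deque
import statistics


def rolling_sigma_anomalies(values, width=5, k=3.0):
    """Return indices of points outside mean +/- k * stddev of the previous window."""
    window = deque(maxlen=width)
    flagged = []
    for t, x in enumerate(values):
        if len(window) == width:
            mu = statistics.mean(window)
            sigma = statistics.pstdev(window)
            if abs(x - mu) > k * sigma:
                flagged.append(t)
        window.append(x)  # the point joins the window even if it was flagged
    return flagged


series = [10, 11, 10, 9, 10, 10, 50, 10, 11]
print(rolling_sigma_anomalies(series))  # [6]
```

Note that once the spike at index 6 enters the window, it inflates both the mean and the standard deviation for the following points. That sensitivity to the assumptions of normality and stationarity is one motivation for robust measures.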
  189. 189. ANOMALY DETECTION 189 ROBUST MEASURES MEDIAN MAD[1] Median Absolute Deviation MCD[2] Minimum Covariance Determinant MVEE[3,4] Minimum Volume Enclosing Ellipsoid [1] P. J. Rousseeuw and C. Croux, “Alternatives to the Median Absolute Deviation”, 1993. [2] http://onlinelibrary.wiley.com/wol1/doi/10.1002/wics.61/abstract [3] P. J. Rousseeuw and A. M. Leroy, “Robust Regression and Outlier Detection”, 1987. [4] M. J. Todd and E. A. Yıldırım, “On Khachiyan's algorithm for the computation of minimum-volume enclosing ellipsoids”, 2007.
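A small sketch of how the median and MAD might be used for outlier flagging follows. It uses the commonly cited modified z-score, 0.6745 · (x − median) / MAD, with a cutoff of 3.5; both the constant and the cutoff are conventional choices brought in for illustration and are not stated on the slides. The function name is hypothetical.

```python
import statistics


def mad_anomalies(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(x - med) for x in values])
    if mad == 0:
        # More than half the points equal the median; the score is undefined.
        return []
    return [
        i for i, x in enumerate(values)
        if 0.6745 * abs(x - med) / mad > threshold
    ]


series = [10, 12, 9, 11, 13, 8, 50, 10, 12]
print(mad_anomalies(series))  # [6]
```

Unlike the mean and standard deviation, the median and MAD barely move when the single value 50 is present, so the spike does not mask itself.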
  190. 190. ANOMALY DETECTION 190 CONTEXT ✦ Data types ๏ Text, Audio, Video ✦ Data veracity ๏ Wearables ๏ Smart cities, Connected Home ๏ Internet of Things LIVE DATA ✦ Multi-dimensional ✦ Low memory footprint ✦ Accuracy-speed tradeoff CHALLENGES
  191. 191. 191 ANOMALY DETECTION DIFFERENT APPLICATION DOMAINS RADIO ANOMALY DETECTION MARKER DISCOVERY CROWDED SCENE ANALYSIS ANOMALOUS HUMAN BEHAVIOR HUMAN TRAJECTORY PREDICTION [Popoola and Wang 2012] [Alahi et al. 2016] [O’Shea et al. 2016] [Schlegl et al. 2017] [Sabokrou et al. 2017]
  192. 192. 192 Real-Time Security
  193. 193. THREAT DETECTION 193 POTENTIAL HIGH BUSINESS DOWNSIDE DDoS Attack Network Intrusion Harassment Account hacking Identity theft Purchases
  194. 194. MONITORING 194 AUDIT LOGS Answering the “What, When, Who, Why” Downloading of unusual amount of data Troubleshoot systems, networks Real-time packet analysis Challenge: Encryption Suspicious access pattern
  195. 195. REAL TIME SECURITY USING APACHE PULSAR 195 Apache Pulsar Data Source 2 threat-fn 2 Data Source 1 Data Source 3 threat-fn 1 Query Audit Logs (SQL) Streaming threats & anomalies T1 T2 T3 Audit Log Investigation
  196. 196. DEVICE IDS/IP ADDRESSES 196 FREQUENT/TOP-K ELEMENTS, HEAVY HITTERS ✦ Require large amount of memory ✦ Long tail - low business value EXACT ✦ Types ๏ Counter-based ❖ Example: Frequent Algorithm [Demaine et al. 2002] ๏ Sketch-based ❖ Example: GroupTest [Cormode and Muthukrishnan, 2002] ✦ Parameters ๏ # counters, # update operations, error APPROXIMATE
  197. 197. FLAVORS 197 FREQUENT/TOP-K ELEMENTS, HEAVY HITTERS ℓ1 heavy hitter ๏ m can be known/unknown ๏ Deterministic ❖ [Misra and Gries 1982, Karp et al. 2003] ๏ Randomized ❖ [Estan and Varghese 2003, Kumar and Xu 2006] [1] Woodruff, 2016. “New Algorithms for Heavy Hitters in Data Streams”. ICDT. Frequency of item i # elements in stream Output
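The deterministic counter-based approach cited here [Misra and Gries 1982] can be sketched in a few lines. With k − 1 counters, every item whose frequency exceeds m/k (m = # elements in stream) is guaranteed to survive, and each reported count underestimates the true frequency by at most m/k. The function name and the example stream are made up for illustration.

```python
def misra_gries(stream, k):
    """Return candidate heavy hitters using at most k - 1 counters."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement everything, dropping counters that hit zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


stream = ["a", "b", "a", "c", "a", "d", "a", "b", "a"]
print(misra_gries(stream, k=3))  # {'a': 3}
```

Here "a" occurs 5 times out of m = 9, so it exceeds m/k = 3 and is retained; its reported count of 3 is within the m/k error bound. A second pass over the data would be needed to obtain exact counts for the surviving candidates.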
  198. 198. FLAVORS 198 FREQUENT/TOP-K ELEMENTS, HEAVY HITTERS ℓ2 heavy hitter ๏ Example ❖ [Charikar et al. 2004] ๏ Applications ❖ Compressed sensing ❖ Numerical linear algebra ❖ Cascaded aggregates Frequency of item i # distinct items Output
  199. 199. COUNT-MIN SKETCH[1] 199 ✦ A two-dimensional array of counts with w columns and d rows ✦ Each entry of the array is initially zero ✦ d hash functions $h_1, \ldots, h_d : \{1, \ldots, n\} \to \{1, \ldots, w\}$ are chosen uniformly at random from a pairwise independent family ✦ Update ๏ For a new element i, for each row j and $k = h_j(i)$, increment the counter in row j, column k by one ✦ Point query ๏ $\hat{a}_i = \min_j \text{sketch}[j, h_j(i)]$ where sketch is the table ✦ Parameters $(\varepsilon, \delta)$ ๏ $w = \lceil e/\varepsilon \rceil$, $d = \lceil \ln(1/\delta) \rceil$ [1] Cormode, Graham; S. Muthukrishnan (2005). "An Improved Data Stream Summary: The Count-Min Sketch and its Applications". J. Algorithms 55: 29–38.
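The update and point-query rules above can be sketched as follows, with the table sized from (ε, δ) as w = ⌈e/ε⌉ and d = ⌈ln(1/δ)⌉. One caveat: the row hashes here are row-salted SHA-256 digests, chosen for reproducibility; that is an assumption of this sketch and not a pairwise independent family in the formal sense. The class and method names are illustrative.

```python
import hashlib
import math


class CountMinSketch:
    def __init__(self, epsilon: float, delta: float):
        self.w = math.ceil(math.e / epsilon)      # columns
        self.d = math.ceil(math.log(1 / delta))   # rows
        self.table = [[0] * self.w for _ in range(self.d)]

    def _column(self, row: int, item: str) -> int:
        # Row-salted hash stands in for the j-th hash function h_j.
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.w

    def update(self, item: str, count: int = 1) -> None:
        for row in range(self.d):
            self.table[row][self._column(row, item)] += count

    def query(self, item: str) -> int:
        # Collisions only ever add to a counter, so the minimum is the tightest bound.
        return min(self.table[row][self._column(row, item)]
                   for row in range(self.d))


cms = CountMinSketch(epsilon=0.01, delta=0.01)
for ip in ["10.0.0.1"] * 5 + ["10.0.0.2"] * 2:
    cms.update(ip)
assert cms.query("10.0.0.1") >= 5  # estimates never undercount
```

With ε = δ = 0.01 this gives a 272 × 5 table, independent of the number of distinct items in the stream.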
  200. 200. COUNT-MIN SKETCH 200 ✦ Count-Min sketch with conservative update (CU sketch) ๏ Update an item with frequency c ๏ Avoid unnecessary updating of counter values => Reduce over-estimation error ๏ Prone to over-estimation error on low-frequency items ✦ Lossy Conservative Update (LCU) - SWS ๏ Divide stream into windows ๏ At window boundaries, ∀ 1 ≤ i ≤ w, 1 ≤ j ≤ d, decrement sketch[i,j] if 0 < sketch[i,j] ≤ [1] Cormode, G. 2009. Encyclopedia entry on ’Count-MinSketch’. In Encyclopedia of Database Systems. Springer., 511–516. VARIANTS [1]
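To make the conservative-update idea concrete, the sketch below contrasts it with the plain update. For an item arriving with count c, conservative update raises each of the item's counters only as far as (current estimate + c), leaving counters that are already larger untouched. The class name, the `conservative` flag, and the salted SHA-256 hashing are assumptions of this sketch; the lossy windowed variant (LCU) is not shown.

```python
import hashlib


class CountMin:
    def __init__(self, width: int, depth: int, conservative: bool = False):
        self.w, self.d = width, depth
        self.conservative = conservative
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item: str):
        """Yield the (row, column) pair this item maps to in every row."""
        for row in range(self.d):
            digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.w

    def update(self, item: str, c: int = 1) -> None:
        cells = list(self._cells(item))
        if self.conservative:
            # Raise counters only up to the new lower bound on the item's count.
            target = min(self.table[r][k] for r, k in cells) + c
            for r, k in cells:
                self.table[r][k] = max(self.table[r][k], target)
        else:
            for r, k in cells:
                self.table[r][k] += c

    def query(self, item: str) -> int:
        return min(self.table[r][k] for r, k in self._cells(item))
```

Fed the same stream, the conservative variant never reports a larger estimate than the plain sketch and still never undercounts, which is the reduction in over-estimation error the slide refers to.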
  201. 201. COUNTER TREE[1] 201 ✦ Motivation ๏ Equi-bitwidth counters → inefficient space usage owing to variance in counts ✦ Two-dimensional counter sharing ✦ m virtual counters (V[i], 0 ≤ i < m) ๏ Path from a leaf node to the root ๏ Counter at layer j in V[i] given by: [1] Chen et al., “Counter Tree: A Scalable Counter Architecture for Per-Flow Traffic Measurement”, TON 2017. # leaf nodes Total space (# bits) Counter bit width Tree height Degree of a non-leaf node
  202. 202. COUNTER TREE (CONTD.) 202 ✦ Counting range ✦ Virtual counter array Vf ๏ Vf[i] = V[hi(f)], 0 ≤ i < r, ✦ Recording ๏ Increment a counter in Vf, chosen uniformly at random, by one ๏ May result in update of multiple counters # counters in the bottom layer
  203. 203. COUNTER TREE (CONTD.) 203 ✦ Estimation ๏ CT based Estimation (CTE) ❖ $k = d^{h-1}$ ❖ Xi is the value of the subtree rooted at C[v], where C[v] is the component counter of V[u] at the highest level and ๏ CT based Maximum Likelihood Estimation (CTM) Reduce positive bias (overestimation)
  204. 204. VARIATIONS 204 FREQUENT/TOP-K ELEMENTS, HEAVY HITTERS LOSSY COUNTING [Manku and Motwani, 2002] [Homem and Carvalho, 2010] FILTERED SPACE SAVING COUNT-SKETCH [Dimitropoulos et al. 2008] PROBABILISTIC LOSSY COUNTING [Charikar et al. 2002] [Pitel and Fouquier, 2015] COUNT-MIN-LOG SPACE SAVING ALGORITHM [Metwally et al. 2005] FDPM [Yu et al., 2006] [Sivaraman et al. 2017] HASHPIPE
  205. 205. 205
  206. 206. HEALTH CARE 206 WEARABLES Smart watches, smart glasses, smart clothing, implantables Data veracity ๏ Heart rate (ECG and HRV) ๏ Brainwave (EEG) ๏ Muscle bio-signals (EMG) ๏ Blood pressure ๏ Calories burned ๏ Body temperature ๏ Steps walked Other applications ๏ Blood alcohol content ๏ Athletic performance
  207. 207. MONITORING 207 OIL DRILLING Early detection of leaks and/or spills ๏ Oil ๏ Flammable gases ๏ Hazardous chemicals Equipment health ๏ Corrosion Operational metrics ๏ Pressure ๏ Temperature ๏ Flow ๏ Air quality
  208. 208. QUALITY OF SERVICE 208 CUSTOMER FOCUS WiFi, Video streaming, E-commerce Metrics ๏ Availability ๏ Throughput ๏ Response time ๏ Document Complete ๏ Jitter ๏ Packet loss ratio ๏ Delivery success rate (messaging) ๏ Location accuracy Common issues ๏ Anomalies, Breakouts
  209. 209. INDUSTRIAL IOT USING APACHE PULSAR 209 Apache Pulsar Cloud Data Source 2 clean-fn 2 Data Source 1 Data Source 3 clean-fn 1 logic-fn 1 T1 T2 T3 logic-fn 2 Apache Pulsar Edge filter-fn 1 filter-fn 1
  210. 210. NUMBER OF DISTINCT DATA SOURCES 210 CARDINALITY ESTIMATION BIT-PATTERN OBSERVABLES ✦ Hash values as strings ✦ Occurrence of particular patterns in the binary representation ✦ Example: HyperLogLog [Flajolet et al. 2008] ORDER STATISTIC OBSERVABLES ✦ Hash values as real numbers ✦ k-th smallest value ๏ Insensitive to distribution of repeated values ✦ Examples: MinCount [Giroire, 2000]
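The order-statistic idea can be illustrated with a k-minimum-values style estimator: hash each item to a number in (0, 1], keep the k smallest distinct hash values, and estimate the cardinality as (k − 1) / (k-th smallest value). This is a generic sketch of the principle, not the MinCount algorithm as published; the function names, k, and the SHA-256 hash are assumptions.

```python
import hashlib
import heapq


def unit_hash(item: str) -> float:
    """Map an item to a pseudo-uniform value in (0, 1]."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return (int.from_bytes(digest[:8], "big") + 1) / 2 ** 64


def kmv_estimate(items, k: int = 256) -> float:
    heap = []        # negated values: a max-heap over the k smallest hashes
    members = set()  # hash values currently held in the heap
    for item in items:
        h = unit_hash(item)
        if h in members:
            continue  # repeated values never change the sketch
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        return float(len(heap))  # fewer than k distinct values: exact count
    return (k - 1) / -heap[0]
```

Because only distinct hash values are retained, replaying the same stream several times yields the same estimate, which is the insensitivity to repeated values noted on the slide.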
  211. 211. CARDINALITY ESTIMATION 211 ANOTHER CLASSIFICATION SKETCH-BASED Scan the entire data set once Hash the items Create sketch SAMPLING Potentially large estimation error UNIFORM HASHING Employs a uniform hash function LOGARITHMIC HASHING Keeps track of the most uncommon element observed so far Example: HyperLogLog INTERVAL BASED How packed is an interval? BUCKET BASED P(bucket is non-empty)
  212. 212. HYPERLOGLOG 212 ✦ Apply hash function h to every element in a multiset ✦ Cardinality of multiset is estimated as $2^{\max(\varrho)}$, where $0^{\varrho-1}1$ is the bit pattern observed at the beginning of a hash value ✦ The above suffers from high variance ๏ Employ stochastic averaging ๏ Partition input stream into m sub-streams $S_i$ using first p bits of hash values ($m = 2^p$)
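A compact sketch of the steps above: the first p bits of a 64-bit hash pick one of m = 2^p registers, each register keeps the maximum ϱ (position of the leftmost 1-bit in the remaining bits), and the registers are combined with a harmonic mean. The bias constant α_m = 0.7213 / (1 + 1.079/m) and the linear-counting correction for small cardinalities follow the original HyperLogLog paper; the class name and the SHA-256 hash are assumptions of this sketch.

```python
import hashlib
import math


class HyperLogLog:
    def __init__(self, p: int = 10):
        assert 7 <= p <= 16  # the alpha formula below assumes m >= 128
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")      # 64-bit hash value
        idx = x >> (64 - self.p)                   # first p bits: sub-stream index
        w = x & ((1 << (64 - self.p)) - 1)         # remaining 64 - p bits
        rho = (64 - self.p) - w.bit_length() + 1   # position of leftmost 1-bit
        self.registers[idx] = max(self.registers[idx], rho)

    def estimate(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw
```

With p = 10 the sketch keeps 1024 small registers and has a typical relative error of about 1.04/√m ≈ 3%, regardless of how many distinct sources are observed.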
  213. 213. HYPERLOGLOG 213 OPTIMIZATIONS ✦ Use of 64-bit hash function ๏ Total memory requirement $5 \cdot 2^p \to 6 \cdot 2^p$, where p is the precision ✦ Empirical bias correction ๏ Uses empirically determined data for cardinalities smaller than 5m and uses the unmodified raw estimate otherwise ✦ Sparse representation ๏ For n ≪ m, store an integer obtained by concatenating the bit patterns for idx and ϱ(w) ๏ Use variable length encoding for integers that uses variable number of bytes to represent integers ๏ Use difference encoding - store the difference between successive elements ✦ Other optimizations [1, 2] [1] http://druid.io/blog/2014/02/18/hyperloglog-optimizations-for-real-world-systems.html [2] http://antirez.com/news/75
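The two encodings mentioned for the sparse representation can be illustrated together: sort the integers, store the difference between successive elements, and write each difference with a variable number of bytes (7 payload bits per byte, high bit set when more bytes follow). This is a generic sketch of the techniques, not the exact on-disk layout of any particular HyperLogLog implementation; the function names are hypothetical.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer using 7 bits per byte, low bits first."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def delta_varint_encode(sorted_values) -> bytes:
    """Store differences between successive sorted values as varints."""
    out = bytearray()
    prev = 0
    for v in sorted_values:
        out += encode_varint(v - prev)
        prev = v
    return bytes(out)


def delta_varint_decode(data: bytes):
    values, current, shift, prev = [], 0, 0, 0
    for byte in data:
        current |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            prev += current       # undo the difference encoding
            values.append(prev)
            current, shift = 0, 0
    return values


entries = [3, 300, 305, 70000]
packed = delta_varint_encode(entries)
assert delta_varint_decode(packed) == entries
print(len(packed))  # 7 bytes instead of 4 fixed-width integers
```

Small gaps between sorted entries fit in a single byte, which is why combining the two encodings is effective when n ≪ m.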
  214. 214. MIN-COUNT VARIATIONS 214 CARDINALITY ESTIMATION [Giroire, 2009] Optimal statistical efficiency [Ting, 2014] DISCRETE MAX-COUNT F0 AND L0 ESTIMATION [Chen et al. 2011] S-BITMAP [Kane et al. 2010] Optimal space complexity
  215. 215. WRAPPING UP 215 PLATFORM UNIFICATION APPLICATIONS ANALYTICS
  216. 216. QUESTIONS 216
  217. 217. STAY IN TOUCH @arun_kejariwal @karthikz TWITTER EMAIL arun_kejariwal@acm.org karthik@streaml.io 217
  218. 218. 218 @arun_kejariwal @karthikz
  219. 219. READINGS 219 “Data Streams: Algorithms and Applications”, by S. Muthukrishnan. “Sketching as a Tool for Numerical Linear Algebra”, by D. Woodruff. “Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches”, by G. Cormode, M. Garofalakis and P. J. Haas. “Data Streams: Models and Algorithms”, by C. Aggarwal. “Graph Streaming Algorithms”, by A. McGregor. “Data Sketching”, by G. Cormode.
  220. 220. READINGS 220 STREAM CALCULUS “The algebra of stream processing functions”, TCS 2001. “A coinductive calculus of streams”, MSCS 2005. “Temporal stream algebra”, 2012. “Stream processing coalgebraically”, SCP 2013. “An algebra for pattern matching, time aware aggregates and partitions on relational data streams”, DEBS 2015.
  221. 221. READINGS 221 “Streaming Algorithms for Robust Distinct Elements”, SIGMOD 2016. “Pyramid Sketch”, VLDB 2017. “Bias-Aware Sketches”, VLDB 2017. “Efficient Adaptive Detection of Complex Event Patterns”, VLDB 2018. “Cardinality Estimation: An Experimental Survey”, VLDB 2018. “Augmented Sketch: Faster and More Accurate Stream Processing”, SIGMOD 2016. “BPTree: an L2 heavy hitters algorithm using constant memory”, PODS 2017. “A comparative analysis of state-of-the-art SQL-on-Hadoop systems for interactive analytics”, BigData 2018. “Unsupervised real-time anomaly detection for streaming data”, Neurocomputing 2017. “Time Adaptive Sketches (Ada-Sketches) for Summarizing Data Streams”, SIGMOD 2016.
  222. 222. 222 READINGS “Streaming Anomaly Detection Using Randomized Matrix Sketching”, VLDB 2015. “Graph Sketches: Sparsification, Spanners, and Subgraphs”, PODS 2012. “Stream Order and Order Statistics: Quantile Estimation in Random-Order Streams”, SIAM JoC 2009. “Querying and mining data streams: You only get one look”, SIGMOD 2002. “Models and Issues in Data Stream Systems”, PODS 2002. “Clustering Data Streams”, FOCS 2000. “Persistent Data Sketching”, VLDB 2015. “SCREEN: Stream Data Cleaning under Speed Constraints”, SIGMOD 2015. “An optimal algorithm for the distinct elements problem”, PODS 2010. “Statistical Analysis of Sketch Estimators”, SIGMOD 2007.
  223. 223. OPEN SOURCE 223 Data Sketches (https://datasketches.github.io/) Algebird (https://github.com/twitter/algebird) StreamDM (http://huawei-noah.github.io/streamDM/) stream (https://cran.r-project.org/web/packages/stream/) FluRS (https://github.com/takuti/flurs)
  224. 224. RESOURCES 224 https://www.slideshare.net/arunkejariwal/correlation-analysis-on-live-data-streams (O’Reilly Strata) https://www.slideshare.net/arunkejariwal/modern-realtime-streaming-architectures (O’Reilly Strata) https://www.slideshare.net/arunkejariwal/live-anomaly-detection-80287265 (O’Reilly Strata) https://www.slideshare.net/arunkejariwal/anomaly-detection-in-realtime-data-streams-using-heron (O’Reilly Strata) https://www.slideshare.net/arunkejariwal/real-time-analytics-algorithms-and-systems (VLDB 2015)
  225. 225. RESOURCES 225 http://www.cs.toronto.edu/~koudas/courses/csc2508/SQL-on-Hadoop-Final.pdf (VLDB 2015) https://www.cse.ust.hk/vldb2002/program-info/tutorial-slides/T5garofalalis.pdf (VLDB 2002) https://datasketches.github.io/docs/Research.html https://people.cs.umass.edu/~mcgregor/slides/07-nipsslides.pdf (NIPS 2007) http://dimacs.rutgers.edu/~graham/pubs/slides/streammining.pdf
  226. 226. RESOURCES 226 https://people.cs.umass.edu/~mcgregor/slides/10-jhu1.pdf http://dmac.rutgers.edu/Workshops/WGUnifyingTheory/Slides/cormode.pdf https://archive.siam.org/meetings/sdm13/gupta.pdf (SDM 2013) http://www.outlier-analytics.org/odd13kdd/papers/slides_charu_aggarwal.pdf (ODD 2013) https://www.siam.org/meetings/sdm08/TS2.ppt (SDM 2008)
  227. 227. RESOURCES 227 https://www.cc.gatech.edu/~jx/reprints/talks/sigm07_tutorial.pdf (SIGMETRICS 2007) hanj.cs.illinois.edu/bk3/bk3_slides/12Outlier.ppt https://gist.github.com/debasishg/8172796 https://www.euroscipy.org/2017/descriptions/19827.html
