High-volume event streams (traditional network data, media, IoT sensor data, activity events on social media, etc.) are becoming widespread in the telecom industry. In particular, live analysis of telco log files and performance metrics allows network operators to observe the status of the system and identify possible problems using online aggregations and machine-learning algorithms. (Offline batch analysis of streams with tools like MapReduce is often too slow to react to what is happening right now, which makes it a poor fit for this task.)
Ignacio Manuel Mulas Viela and Nicolas Seyvet demonstrate an analytics pipeline setup for a telco use case that processes an unbounded dataset of logs and performance metrics. Raw data, logs, and cloud telemetry information are extracted from a production cloud infrastructure using Collectd, OpenStack Ceilometer, and Logstash. This is piped into a distributed messaging system, Kafka, then analyzed by Apache Flink—a distributed stream analysis framework capable of analyzing thousands of messages per second and extracting insights that can be monitored by humans—and visualized using the ELK (Elasticsearch, Logstash, Kibana) stack.
Ignacio and Nicolas discuss the challenges and benefits of building an analytics pipeline following the Kappa architecture paradigm using the aforementioned tools and demonstrate Kappa’s value through an example use case. The use case analyzes and extracts statistical information from a stream of data and uses machine-learning techniques to develop an advanced anomaly detector, using two online machine-learning algorithms implemented on top of Flink: the online k-means detector and the Bayesian detector.
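The online k-means detector mentioned above runs inside Flink in the actual pipeline; as a stand-alone illustration of the underlying idea, here is a minimal sketch. The cluster positions, decaying learning rate, and distance threshold are illustrative choices, not values from the talk.

```python
# Minimal online k-means anomaly detector: centroids are updated one
# sample at a time; a point far from every centroid is flagged.
import math

class OnlineKMeans:
    def __init__(self, centroids, threshold):
        self.centroids = [list(c) for c in centroids]
        self.counts = [1] * len(centroids)
        self.threshold = threshold

    def observe(self, point):
        # Find the nearest centroid by Euclidean distance.
        dists = [math.dist(c, point) for c in self.centroids]
        i = min(range(len(dists)), key=dists.__getitem__)
        anomaly = dists[i] > self.threshold
        # Move the winning centroid toward the point (decaying step size),
        # so the model keeps adapting to the stream.
        self.counts[i] += 1
        eta = 1.0 / self.counts[i]
        self.centroids[i] = [c + eta * (p - c)
                             for c, p in zip(self.centroids[i], point)]
        return anomaly

detector = OnlineKMeans(centroids=[(0.0, 0.0), (10.0, 10.0)], threshold=3.0)
normal = detector.observe((0.5, 0.2))    # near a centroid -> not anomalous
outlier = detector.observe((50.0, 50.0)) # far from both -> anomalous
```

Because the update is one sample at a time, the same logic fits naturally into a stateful stream operator.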
2. Ericsson Internal | 2011-10-19 | Page 4
Once Upon A Time…
Flink Forward | Ignacio Mulas | 12-October-2015
Flink Meetup | Ignacio Mulas | 26-November-2015
Strata London | Ignacio Mulas & Nicolas Seyvet | 3-June-2016
“I want an advanced real-time analytics system to
monitor my cloud infrastructure.”
… By your most precious client
› Data source
– Events (metrics, logs) from physical and virtual servers
› Analytics:
– Real-time
– Statistical analysis
– Anomaly or novelty detection
High Level View
…
Data source
Analytics
› Bounded: a start and an end
– Finite; ingestion stops
› Unbounded: a start but no end
– Infinite, ever-growing
Data Set
Bounded: t0, t1, t2, t3, …, tn
Unbounded: t0, t1, t2, t3, …, t∞
› Twitter’s Nathan Marz
› But
– Two independent pipelines
– Complex maintenance
– Complex merge
Lambda Architecture
Kappa Architecture
› New model to abstract data processing
– MillWheel, Spark Streaming, Dataflow, Stratosphere (Flink)
› Stream engines
› Correctness
– Strong consistency
– Exactly-once processing
› Resilience, fault tolerance
› Tools that can deal with time *
› APIs
The (Short) Evolution
Principles
Kappa Architecture
› Everything is a stream
› Immutable data sources
› Single analytics framework
› Stream replay
› Stream representation
– Unbounded dataset composed of a sequence of events
› Data pipeline:
– Sequence of transformations on an unbounded data set that generates another set with more insightful data
– UNIX pipes
Basics
…
Pub/Sub
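The UNIX-pipe analogy above can be sketched with lazy iterators: each stage is a transformation over an (in principle unbounded) stream of events, and nothing runs until a consumer pulls. The field names and the alert threshold below are hypothetical, for illustration only.

```python
# A data pipeline as composed transformations over an iterator of events.
import itertools

def parse(lines):
    # First transformation: raw text -> structured events.
    for line in lines:
        host, value = line.split(",")
        yield {"host": host, "value": float(value)}

def enrich(events):
    # Second transformation: add insight (here, a toy alert flag).
    for e in events:
        e["alert"] = e["value"] > 90.0  # illustrative threshold
        yield e

raw = iter(["web1,42.0", "web2,97.5", "web1,13.2"])  # stand-in for Kafka
pipeline = enrich(parse(raw))        # composed; nothing runs yet
results = list(itertools.islice(pipeline, 3))
```

Like a shell pipe, each stage consumes the previous stage's output incrementally, so the composition works the same whether the input ever ends or not.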
Our Stack
[Diagram: data sources → Kafka (raw data) → Flink (Analytics job 1, Analytics job 2, …) → per-job results back to Kafka → Elasticsearch → Kibana]
First Data Pipeline
[Diagram: raw data → statistical analysis → enriched data → dashboard]
› Event time, which is when an event occurred
› Processing time, which is when an event is observed in the system
Time
[Diagram: processing time vs. event time; ideally the two coincide ("reality"), but in practice arrival lags and skew accumulates]
› Time drifts
› Unordered events
Event Time
[Diagram: events e0…e3 occur at times t0…t3 and enter the system tagged with processing time, <tp0,e0>…<tp3,e3>; an EventTimeExtractor() with enableTimestamps() re-tags them with event time, <te0,e0>…<te3,e3>; window() then groups them into windows w0, w1, w2 over execution time, with a watermark per window]
e: event; tp: processing time; te: event time
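The effect shown here, grouping by event time regardless of arrival order, can be sketched without any framework. The tumbling-window size is illustrative, and this toy version assumes all events have arrived (no watermark logic).

```python
# Group out-of-order events into tumbling windows by *event* time,
# ignoring processing (arrival) order.
from collections import defaultdict

def tumbling_windows(events, size):
    """events: (event_time, payload) pairs in arrival order."""
    windows = defaultdict(list)
    for te, payload in events:
        # Window index depends only on the event timestamp.
        windows[te // size].append(payload)
    return dict(windows)

# Arrival order != event-time order: e2 is late.
arrivals = [(1, "e0"), (3, "e1"), (11, "e3"), (2, "e2")]
wins = tumbling_windows(arrivals, size=10)
# e0, e1 and the late e2 all land in window 0; e3 in window 1.
```

In a real stream engine the watermark decides when a window like w0 can be closed despite possible stragglers; here that concern is simply elided.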
2nd Client meeting…
“I want an advanced real-time analytics system to
monitor my cloud infrastructure.”
… By your most precious client
It is nice, but… I cannot look at thousands of
numbers simultaneously, can you do better?
› Machine learning
– Automatically detect anomalies in the infrastructure
– Learn using raw and advanced metrics
› … add a new transformation to my unbounded data!
Advanced Data Pipeline
[Diagram: data source → stats → ML analytics → …]
› Unsupervised machine learning
› Create a statistical model for “normal” behavior
– Poisson: count-based parameters
– Gaussian: value-based parameters
› Model adapts over time
Bayesian Detector
[Diagram: metric over time, with regions labeled OK and ANOMALY]
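A minimal sketch of the Gaussian (value-based) half of such a detector: a running mean and variance model "normal" behavior and adapt over time, using Welford's online update. The 3-sigma decision rule is an illustrative choice, not necessarily the talk's.

```python
# Online Gaussian anomaly detector with an adapting model.
import math

class GaussianDetector:
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def observe(self, x):
        # Score against the *current* model first...
        verdict = "OK"
        if self.n >= 2:
            std = math.sqrt(self.m2 / (self.n - 1))
            if std > 0 and abs(x - self.mean) > 3 * std:
                verdict = "ANOMALY"
        # ...then fold the sample in (Welford's update), so the notion
        # of "normal" keeps adapting as the stream evolves.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return verdict

det = GaussianDetector()
for v in [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.1, 9.9]:
    det.observe(v)            # learn "normal" around 10
verdict = det.observe(500.0)  # far outside the learned distribution
```

The Poisson (count-based) case follows the same pattern, with the rate parameter updated online instead of mean and variance.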
Log-Frequency Novelty Detector
[Diagram: events in each time window are mapped to template frequencies (Frequency_1 … Frequency_n) and compared against history; Phase 1: LEARN! builds the frequency model, Phase 2: DETECT! flags each window as OK or NOK]
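The LEARN/DETECT phases above can be sketched as follows, assuming log lines have already been reduced to templates. Keeping a min/max frequency envelope per template is an illustrative modeling choice, not necessarily the one used in the talk.

```python
# Log-frequency novelty detector: learn per-window template frequencies,
# then flag windows whose frequencies fall outside the learned range.
from collections import Counter

class FrequencyDetector:
    def __init__(self):
        self.low = Counter()   # lowest count seen per template
        self.high = Counter()  # highest count seen per template

    def learn(self, window):
        # Phase 1: LEARN! Widen the per-template frequency envelope.
        counts = Counter(window)
        for template in set(self.low) | set(counts):
            c = counts[template]
            if template not in self.low:
                self.low[template] = self.high[template] = c
            else:
                self.low[template] = min(self.low[template], c)
                self.high[template] = max(self.high[template], c)

    def detect(self, window):
        # Phase 2: DETECT! Any template outside its envelope -> NOK.
        counts = Counter(window)
        for template in set(self.low) | set(counts):
            if not (self.low[template] <= counts[template] <= self.high[template]):
                return "NOK"
        return "OK"

det = FrequencyDetector()
det.learn(["login", "login", "heartbeat"])
det.learn(["login", "heartbeat"])
ok = det.detect(["login", "login", "heartbeat"])   # within history
nok = det.detect(["error"] * 5 + ["heartbeat"])    # novel template
```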
Multi-Variable Detector
[Diagram: .keyBy(host) splits the stream so each host (hi, hk, hm) keeps its own detector state and windows from t0]
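The per-host partitioning that .keyBy(host) provides in Flink can be sketched with a dictionary of per-key states. The detector here is a toy running sum with a hypothetical threshold; the point is only that each key carries independent state.

```python
# Route each event to per-key state so every host is scored by its own
# detector, mimicking keyed stateful processing.
from collections import defaultdict

def key_by(events, key, make_state, update):
    states = defaultdict(make_state)
    out = []
    for e in events:
        k = key(e)
        # update returns (new_state, verdict) for this key only.
        states[k], verdict = update(states[k], e)
        out.append((k, verdict))
    return out

events = [{"host": "h1", "v": 1}, {"host": "h2", "v": 9}, {"host": "h1", "v": 2}]
results = key_by(
    events,
    key=lambda e: e["host"],
    make_state=int,  # per-host state starts at 0
    update=lambda s, e: (s + e["v"], "ALERT" if s + e["v"] > 8 else "OK"),
)
```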
Improved Data Pipeline
[Diagram: raw data → statistical analysis → enriched data → Bayesian novelty detector → anomalies → dashboard]
3rd Client meeting
“I want an advanced real-time analytics system to
monitor my cloud infrastructure.”
… By your most precious client
Great! I can now spot when and where changes occur… I'll buy it!
› Tools, abstractions and APIs unifying stream/batch
› Consistency, resiliency, fault-tolerance
› Event time handling
› Kappa architecture simplifies Big Data
– One stack, many pipelines (batch/stream)
– Flexible/extensible architecture
› Machine learning can be applied on unbounded data sets
– Treated as a complex transformation
– Some caveats
Summary
Please feel free to contact us if you have suggestions/comments/questions:
ignacio.mulas.viela@ericsson.com / @immulvi
nicolas.seyvet@ericsson.com / @NicolasSeyvet
Thank you!
Editor's Notes
2015-09-30
Ericsson is a telecommunications equipment supplier.
Ever growing volumes of data, shorter time constraints and increasing needs for accuracy are defining the new analytics environment. In the telecom industry, traditional user and network data co-exists with machine-to-machine (M2M) traffic, media data, social activities, etc. In terms of volumes, this can be referred to as an “explosion” of data. This is a great business opportunity for Telco operators and a key angle to take full advantage of current infrastructure investments (4G, LTE). Add some animations with trucks and sensors, etc.
Ericsson is moving to cloud and running Virtual Network Functions, which are basically cloud-based telco applications for the core network.
There are OSS and monitoring systems on the market but how could we do this better?
Build a data pipeline to stream events and perform real-time analytics in order to eventually do some machine learning.
The story is that, in general, there are two kinds of data sets: either it is a bunch of data, i.e., there is a beginning and an end to it, or it is infinite.
https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101, The term “streaming” is used today to mean a variety of different things (and for simplicity, I’ve been using it somewhat loosely up until now), which can lead to misunderstandings about what streaming really is, or what streaming systems are actually capable of. As such, I would prefer to define the term somewhat precisely.
The crux of the problem is that many things that ought to be described by what they are (e.g., unbounded data processing, approximate results, etc.), have come to be described colloquially by how they historically have been accomplished (i.e., via streaming execution engines). This lack of precision in terminology clouds what streaming really means, and in some cases burdens streaming systems themselves with the implication that their capabilities are limited to characteristics frequently described as “streaming,” such as approximate or speculative results. Given that well-designed streaming systems are just as capable (technically more so) of producing correct, consistent, repeatable results as any existing batch engine, the term is better reserved for a type of data processing engine designed with infinite data sets in mind.
The principles: Bounding unbounded data with windows
We use the term unbounded data for an infinite, ever-growing data stream, and the term bounded data for a data stream that happens to have a beginning and an end (data ingestion stops after a while). It is clear that the notion of an unbounded data stream includes (is a superset of) the notion of a bounded data set:
Streaming applications create bounded data from unbounded data using windows, i.e., creating bounds using some characteristic of the data, most prominently the timestamps of events. For example, one can choose to create a window of all records that belong to the same session (a session being defined as a period of activity followed by a period of inactivity). The simplest form of a window, when we know the input is bounded, is to include all the data in one window. Let's call this a “global window”. This way, we have created a streaming program that does “batch processing”:
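The session windows described above can be sketched directly: a new session starts whenever the gap since the previous event exceeds the inactivity threshold. The gap value is illustrative.

```python
# Group event timestamps into session windows: a session ends after a
# period of inactivity longer than `gap`.
def session_windows(timestamps, gap):
    sessions, current = [], []
    for t in sorted(timestamps):
        if current and t - current[-1] > gap:
            sessions.append(current)  # inactivity -> close the session
            current = []
        current.append(t)
    if current:
        sessions.append(current)
    return sessions

sessions = session_windows([1, 2, 3, 20, 21, 50], gap=5)
# -> [[1, 2, 3], [20, 21], [50]]
```

With `gap` set to infinity this degenerates into one window containing everything, i.e., the "global window" that turns a streaming program into batch processing.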
Early streaming systems suffered from efficiency problems (record-by-record event processing and ACKs). This led to the belief that a streaming layer can only complement a batch system, or that hybrids of streaming and batching (micro-batching) are required for efficiency.
Lambda advocates using a batch system for the heavy lifting, augmented with a streaming system that keeps up with data ingestion and produces early but incomplete (approximate) results. Serving logic then tries to merge the two sets of results: the streaming system gives you low-latency but inaccurate results (either because it uses an approximation algorithm or because it does not itself provide correctness), and some time later a batch system rolls along and provides the correct output.
Well known.
Merge -> synchronization problems.
The Lambda architecture had well-known disadvantages, in particular that the merging process was often painful, as was the fact that two separate codebases that express the same logic need to be maintained.
Later, Jay Kreps advocated that only one system, the stream processor, should be used for the entirety of data transformations, drastically simplifying the whole architecture:
https://www.oreilly.com/ideas/questioning-the-lambda-architecture
Given that well-designed streaming systems are just as capable (technically more so) of producing correct, consistent, repeatable results as any existing batch engine, I prefer to isolate the term streaming to a very specific meaning: a type of data processing engine that is designed with infinite data sets in mind.
Early streaming systems suffered from efficiency problems due to design choices that sacrificed throughput, in particular, record-by-record event processing and acknowledgement
Spark lineage in batch vs check-pointing. Something that is easy to do with batch is much harder with streams.
Tools for reasoning about time
Mostly correct is not good enough: exactly-once processing is required for repeatable results, and without repeatable results streaming cannot replace batch. Correctness: this gets you parity with batch.
http://data-artisans.com/why-apache-beam/
Taking a cue from this foundational work, we rewrote Flink's DataStream API in Flink 0.10 to incorporate many of the concepts described in the Dataflow paper, moving away from the old Flink 0.9 DataStream API. We retained this API with Flink 1.0 and made it stable and backwards compatible.
As you can see from these tables, Flink is the runner which currently fulfills those requirements. With Flink, Beam becomes a truly compelling platform for the industry.
Lambda -> batch + processing
Kappa -> everything is a stream
Immutable data sources -> deterministic results and the possibility to generate different views (see Martin Kleppmann). You store the logs, so you store the events and their sequence.
Whenever the pipeline evolves, you can replay the sequence of events to restore the state of the computations and restart.
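The replay idea above can be sketched as a fold over the stored events: the log never changes, and a new pipeline version simply re-runs its transformation over the same sequence. The two "views" below are hypothetical pipeline versions, purely for illustration.

```python
# Rebuild state by replaying an immutable event log through a transform.
def replay(log, transform, initial_state):
    state = initial_state
    for event in log:
        state = transform(state, event)
    return state

log = [("cpu", 40), ("cpu", 80), ("cpu", 60)]  # stored once, never mutated

# v1 of the pipeline tracked the maximum load seen...
v1 = replay(log, lambda s, e: max(s, e[1]), 0)
# ...v2 counts high-load events instead; same log, different view.
v2 = replay(log, lambda s, e: s + (e[1] >= 60), 0)
```

Because the log is the source of truth, both views are deterministic and can be regenerated at any time, which is what makes evolving a Kappa pipeline safe.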
Single analytics framework: you have transformations and operators to perform analytics.
Add some content
Make it better, change image
Can take a subset. The stream is the main abstraction over the dataset: a list of ordered events, a single object with a representation and a set of operators.
Put distributed data source with Logstash
Frequency, gradients, median, std dev,
To do this we needed to take into consideration aspects like time, since this is an unbounded data set.
How to deal with time?
Batch slices datasets into bounded data sets, then computes.
But how to deal with late events, events that might never arrive, latency in the network leading to a distortion between expected time and actual arrival time?
New data will arrive, old data may be retracted or updated. Any system should be able to cope with these facts on its own, with completeness being a convenient optimization rather than a semantic necessity.
Varying event time skew, meaning it is not possible to expect most of the data for a given event time X within some constant e of time Y (Y= X + e).