Many use cases in the telecommunications industry require producing counters, quality metrics, and alarms in a streaming fashion with very low latency. Most of these metrics are only valuable when they are made available as soon as the associated events happen. In our company we are looking for a system able to produce this kind of real-time indicator. It must handle massive amounts of data (400,000 events per second) with frequent peak loads (such as New Year's Eve) and out-of-order events, for example during massive network disorder. Low latency and flexible window management with specific watermark emission are also must-haves. Heterogeneous formats, multiple flow correlation, and the possibility of late data arrival are other challenges. Since Flink is already widely used at Bouygues Telecom for real-time data integration, its features made it the obvious candidate for the future system. In this talk, we'll present a real use case of streaming analytics using Flink, Kafka & HBase along with other legacy systems.
A Brief History of Time with Apache Flink Real-Time Monitoring
1. A Brief History of Time with Apache Flink
Real-time monitoring and analysis with Flink, Kafka and HBase
Flink Forward 2016
Berlin, September 12th, 2016
Thomas LAMIRAULT
Mohamed Amine ABDESSEMED
2. The speakers
Mohamed Amine ABDESSEMED (@AminouvicTweets)
• Software engineer & solution architect @ Bouygues Telecom since 2013
• A Flinker since the early beginnings
Thomas LAMIRAULT (@thomaslamirault)
• Big Data software engineer @ Bouygues Telecom since 2015
• A Flink master enthusiast
3. Outline
• Who is Bouygues Telecom
• Data Value and Streaming Analytics
• Use case
• Challenges
• Streaming analytics with Flink
• Results
4. Who is Bouygues Telecom?
Mobile · Fixed · TV · Internet · Cloud
• 15M clients
– 12.1M mobile subscribers
– 2.9M fixed customers
• A very innovative company
– First Android TV box
– Leader 4G/4G+
– VoLTE
– UHSM
6. LUX: Logged User eXperience
Mobile QoE
• Our Big Data platform
• Produces Mobile QoE indicators based on massive network equipment event logs (billions of events/day).
• Goals:
– QoE (User) instead of QoS (Machine)
– Real-time Diagnostic (<60’ end-to-end latency)
– Business Intelligence
– Reporting
– Real-time alarming
8. Analytic Data value
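[Figure: data value plotted against time around an important event occurrence. Axes: Time, Data Value. Time ticks: T-x, T0, T1, T2, Ty. Time regions: before important event, NOW, after important event. Curve labels: predictive analytics, streaming analytics, advanced analytics.]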
10. Analytic Data value
• Data is most valuable when made available as soon as important events occur.
• Get the most out of data:
– Collect data fast.
– (Pre)process it fast.
– Analyze it and create added value to act faster!
12. The use case
• A simple and valuable use case
• Need to analyze the entire call traffic:
– Considering multiple aggregation axes
– Fine-grained analysis
– Detect in real time when something is happening somewhere
– Compare with historical value trends
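The slides do not say how current values are compared with historical trends. As a minimal sketch, assuming a simple standard-deviation test stands in for that comparison, the idea of checking each aggregation key against its own history could look like the following. The function names, the `(region, service)` keys, the threshold, and the sample numbers are all hypothetical.

```python
from statistics import mean, pstdev


def is_anomalous(current, history, n_sigmas=3.0):
    """Return True when `current` deviates strongly from past values."""
    if len(history) < 2:
        return False  # not enough history to judge
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return current != mu
    return abs(current - mu) > n_sigmas * sigma


def detect(current_kpis, historical_kpis, n_sigmas=3.0):
    """Check every aggregation key and return the keys that look abnormal."""
    return sorted(
        key
        for key, value in current_kpis.items()
        if is_anomalous(value, historical_kpis.get(key, []), n_sigmas)
    )


# Hypothetical data: keys are (region, service) tuples standing in for
# the aggregation axes; values are call counts per window.
history = {
    ("north", "voice"): [100, 102, 98, 101, 99],
    ("south", "voice"): [50, 51, 49, 50, 50],
}
current = {("north", "voice"): 103, ("south", "voice"): 12}
alerts = detect(current, history)
```

Here only the key whose current count falls far outside its own past range is reported, which is the "something is happening somewhere" signal per aggregation axis.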
13. Challenges
• Low latency & streaming fashion counters
• Quickly available KPIs = value
• Massive amounts of data + peak loads
• Reliability
• Multiple flow correlation
• Time management:
– Out-of-order & late events: our worst enemies
– Flexible window management
– Specific watermark emission
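To make the time-management terms above concrete, here is a plain-Python sketch (not Flink API code) of how a tumbling window start is derived from an event timestamp, how a bounded-out-of-orderness watermark advances, and how an event can be classified as on time, late but accepted, or dropped. The window size, disorder bound, and lateness values are hypothetical, and the boundary conditions are simplified compared with Flink's own operators.

```python
WINDOW_MS = 60_000            # hypothetical tumbling window size
MAX_OUT_OF_ORDER_MS = 5_000   # hypothetical bound on event disorder
ALLOWED_LATENESS_MS = 30_000  # hypothetical grace period for late data


def window_start(ts_ms, size_ms=WINDOW_MS):
    """Start of the tumbling window that contains the timestamp."""
    return ts_ms - (ts_ms % size_ms)


class BoundedOutOfOrdernessWatermark:
    """Watermark that trails the highest timestamp seen by a fixed delay."""

    def __init__(self, max_out_of_order_ms=MAX_OUT_OF_ORDER_MS):
        self.max_out_of_order_ms = max_out_of_order_ms
        self.max_seen_ts = None

    def on_event(self, ts_ms):
        # Only a newer timestamp can move the watermark forward.
        if self.max_seen_ts is None or ts_ms > self.max_seen_ts:
            self.max_seen_ts = ts_ms

    def current(self):
        if self.max_seen_ts is None:
            return float("-inf")
        return self.max_seen_ts - self.max_out_of_order_ms


def classify(ts_ms, watermark_ms):
    """Decide what happens to an event given the current watermark."""
    window_end = window_start(ts_ms) + WINDOW_MS
    if watermark_ms < window_end:
        return "on_time"
    if watermark_ms < window_end + ALLOWED_LATENESS_MS:
        return "late_but_accepted"
    return "dropped"


def replay(timestamps):
    """Classify each event against the watermark built from earlier events."""
    wm = BoundedOutOfOrdernessWatermark()
    results = []
    for ts in timestamps:
        results.append(classify(ts, wm.current()))
        wm.on_event(ts)
    return results
```

Calling `replay([10_000, 70_000, 58_000, 200_000, 59_000])` shows the three cases: the third event arrives after its window has closed but within the grace period, while the fifth arrives after the grace period has also expired.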
20. Streaming application Details
• Tn: multiple input topics
• filter: keep only the records that interest us (one filter per input topic)
• union: correlate the input flows
• extract timestamp: extract the relevant event timestamp, with custom watermark emission
• keyBy: group by the considered aggregation axes
• window: tumbling windows
• trigger: custom trigger that allows configurable late data and manages lagged data
• reduce: produce KPIs
• sink: write data into HBase
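The stages on this slide can be walked through with a small plain-Python simulation. This is only a sketch of the data flow (filter, union, timestamp extraction, keyBy, tumbling window, reduce, sink), not the speakers' Flink job: the record fields, the filter rule, the KPI definition, and the row-key layout are hypothetical, a dict stands in for the HBase table, and watermarks and the custom trigger are left out.

```python
from collections import defaultdict
from itertools import chain

WINDOW_MS = 60_000  # hypothetical tumbling window size


def keep(record):
    """filter: hypothetical rule that keeps only call events."""
    return record.get("type") == "call"


def run_pipeline(topics, window_ms=WINDOW_MS):
    """Run the batch equivalent of the pipeline over in-memory topics."""
    # filter each input topic, then union the filtered flows
    unioned = chain.from_iterable(filter(keep, topic) for topic in topics)

    kpis = defaultdict(lambda: {"calls": 0, "failures": 0})
    for record in unioned:
        ts = record["ts_ms"]                      # extract timestamp
        key = (record["cell"], record["service"]) # keyBy aggregation axes
        win = ts - ts % window_ms                 # tumbling window start
        agg = kpis[(key, win)]                    # reduce into KPIs
        agg["calls"] += 1
        agg["failures"] += 1 if record["failed"] else 0

    # sink: one row per key and window; the dict stands in for HBase
    table = {}
    for ((cell, service), win), agg in kpis.items():
        table[f"{cell}|{service}|{win}"] = dict(agg)
    return table


# Hypothetical input: two topics with mixed record types.
topic_a = [
    {"type": "call", "ts_ms": 1_000, "cell": "A", "service": "voice", "failed": False},
    {"type": "sms", "ts_ms": 2_000, "cell": "A", "service": "sms", "failed": False},
]
topic_b = [
    {"type": "call", "ts_ms": 59_000, "cell": "A", "service": "voice", "failed": True},
    {"type": "call", "ts_ms": 61_000, "cell": "A", "service": "voice", "failed": False},
]
table = run_pipeline([topic_a, topic_b])
```

Records from both topics that share a key and fall into the same window end up in one KPI row, which is the flow correlation that the union stage provides.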
21. The results
Production metrics
• Low latency (<100 ms)
• Input: up to ~80,000 events/sec
• Output: ~40,000 KPIs produced per window
23. Benefits
• Monitor and improve customer experience
• Reduced incident detection time
• Help GNOC alarm prioritization based on customer experience
• Reduced operating costs
24. Difficulties
• Massive amounts of data in both input and output
• Savepoint/checkpoint cost
• HBase analytic limitations
– Read vs. write
– Long scan
• Massive out-of-order events