18.02.05_IAAI2018_Mobile Network Failure Event Detection and Forecasting with Multiple User Activity Data Sets
1. Mobile Network Failure Event Detection and
Forecasting with Multiple User Activity Data Sets
Yukio UEMATSU
02/05/2018 IAAI 2018
Koh TAKEUCHI, Motoyuki OKI
2. Introduction
Providing stable and high-quality service is a critical
issue for mobile network service providers.
Network failures …
– cause service outages
– degrade user satisfaction
[Figure: users reacting to a failure ("Error?", "I can't connect.", "Outage?") while the provider issues an alert]
Users publish their impressions on various data sources:
Tweets, Search, Calls, Web Access, RSS, News
⇒ User activity-based failure detection and forecasting
6. Failure Detection and Forecasting
Failure detection
• Monitoring network traffic and logs [Gill+2011, Brutlag2000]
• Using tweet data
– Users detect outages before providers do [Qiu+2010]
– A keyword filter and SVM [Takeshita+2015]
Event detection based on multiple data sets
– Influenza, airport threats, civil unrest, etc.
– Different data sets represent different aspects of user activity
Focus : network failure detection and forecasting
using multiple user activity data sets
10. User Activity Data Sets
• Number of observations in each data set
– Social (Tweets) and Telecom (Web Access Logs and Search
Queries) data sets
[Figure: time series of observation counts per data set, with a zoomed view asking whether the counts change around a failure]
12. Research Contributions
• Framework for failure detection and forecasting
from multiple user activity data sets
– Feature construction methods and a model ensemble method
• Extensive experiments using multiple real-world data sets
13. Problem Formulation
• Estimate a prediction model that predicts a
failure event from feature vectors
– a simple binary classification problem
• Two variables
– x_t = (x_{1,t}, …, x_{D,t}) ∈ ℝ^D is a feature vector of user
activity data at timestamp t ∈ {1, …, T}
– y_t ∈ {1, −1} is a label indicating
whether an event occurs at timestamp t or not
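The formulation above can be sketched in code. The data below is a hypothetical stand-in (the paper's real features come from tweets, search queries, and web access logs), and the model is a minimal logistic regression trained by gradient descent, chosen only because LR is one of the classifiers the deck compares; the helper names are ours, not the authors'.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: T timestamps, D user-activity features,
# imbalanced labels y_t in {1, -1} (failures are rare).
T, D = 1000, 3
X = rng.poisson(2.0, size=(T, D)).astype(float)
y = np.where(rng.random(T) < 0.05, 1, -1)
X[y == 1] += 5.0  # assume failures coincide with elevated activity

def train_logreg(X, y, lr=0.1, epochs=500):
    """Fit weights by gradient descent on the logistic loss with labels in {1, -1}."""
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        margins = y * (Xb @ w)
        grad = -(Xb * (y / (1 + np.exp(margins)))[:, None]).mean(axis=0)
        w -= lr * grad
    return w

def predict_score(X, w):
    """Sigmoid score: estimated probability that a failure occurs at t."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1 / (1 + np.exp(-(Xb @ w)))

w = train_logreg(X, y)
scores = predict_score(X, w)
```

Thresholding `scores` then yields the binary failure/no-failure decision per timestamp.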
14. Data Sets and Failure Events
x_t : Three data sets
y_t : Nine failure events
⇒ Five training and test data sets
Special characteristics
– Imbalanced labels (see Ratio) ⇒ Bi-Normal Separation*
– Very sparse (see Sparsity) ⇒ Simple Moving Average*
[*] See our paper
19. Three Key Tasks for Failure Detection
Training : 210 days / Test : 15 days
• (1) Entire period (EP) detection : the entire test period
• (2) Early detection (ED) : only the first 60 mins after a failure
event starts, plus the rest interval
• (3) Forecast : failures α minutes ahead (t = T+1, …, T+T′)
[Figure: timeline from t = 1 to t = T with failure events (start–end), the 60-min early-detection window, and an α-min forecast horizon beyond T]
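The three task definitions above can be expressed as label constructions over a timeline. This is a sketch under our own reading of the deck: the event intervals, window arithmetic, and the decision to mask (rather than relabel) the late portion of each event for ED are assumptions for illustration.

```python
import numpy as np

# Hypothetical per-minute failure intervals over a short test period.
T = 200                          # number of timestamps (minutes)
events = [(30, 120), (150, 160)]  # (start, end) of failure events

# (1) EP: label every timestamp inside any failure interval as positive.
y_ep = np.full(T, -1)
for start, end in events:
    y_ep[start:end] = 1

# (2) ED: keep positives only in the first 60 min of each event;
# the remainder of each event is excluded from evaluation via a mask.
ed_window = 60
y_ed = np.full(T, -1)
mask_ed = np.ones(T, dtype=bool)
for start, end in events:
    y_ed[start:min(start + ed_window, end)] = 1
    mask_ed[start + ed_window:end] = False  # drop the late portion

# (3) Forecast: at time t, predict whether a failure occurs
# alpha minutes later, i.e. shift the EP labels backwards by alpha.
alpha = 15
y_forecast = np.full(T, -1)
y_forecast[:T - alpha] = y_ep[alpha:]
```

A classifier trained on `(x_t, y_forecast[t])` pairs then answers "will a failure be active α minutes from now?".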
22. Experiment Outline
• Experiment 1 (Two tasks : EP and ED)
– Comparison of multiple classification and anomaly detection
models with respect to AUC
– LR (Logistic Regression) / ADA (AdaBoost of decision
stumps) / RF (Random Forest) / NN (Neural Network) /
OCS (One-Class SVM) / AE (Auto-Encoder)
• Experiment 2 (Two tasks : EP and ED)
– Effective approach to combining multiple models and data sets
• Experiment 3 (One task : Forecast)
– Forecasting performance
23. A single data set is not sufficient
• Experiment 1
– Tasks : EP and ED
– Comparison methods : LR (Logistic Regression) / ADA
(AdaBoost of decision stumps) / RF (Random Forest) / NN
(Neural Network) / OCS (One-Class SVM) / AE (Auto-Encoder)
Results : Optimal combinations of models and data sets are different
25. Tweet data detects failures early
• Experiment 1
– Tasks : EP and ED
Assumption : users post tweets first, then access web pages
28. Combine multiple user activity data sets
• A model ensemble approach to find effective
combinations of data sets and models
– Level 1 : train models M_1, M_2, M_3 (e.g., LR, RF, AE) on
each data set D_1, D_2, D_3 with {(x_t, y_t)}_{t=1:T}
– Level 2 : collect the Level-1 outputs into p_t and train a
final LR model f(·) on {(p_t, y_t)}_{t=1:T}
– p_t : prediction scores of failure events from model M_i
on data set D_j
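The two-level stacking scheme above can be sketched as follows. The data sets are synthetic stand-ins, and for simplicity every Level-1 model here is a small logistic regression (the deck's Level 1 mixes LR, RF, and AE); in practice the Level-1 scores used to train Level 2 should come from held-out folds to avoid leakage.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-ins for three user-activity data sets sharing labels.
T = 600
y = np.where(rng.random(T) < 0.1, 1, -1)
datasets = []
for _ in range(3):
    X = rng.normal(size=(T, 4))
    X[y == 1] += rng.uniform(0.5, 1.5)  # each set carries a weak signal
    datasets.append(X)

def logreg_fit(X, y, lr=0.1, epochs=300):
    """Gradient descent on the logistic loss, labels in {1, -1}."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        m = y * (Xb @ w)
        w += lr * (Xb * (y / (1 + np.exp(m)))[:, None]).mean(axis=0)
    return w

def logreg_score(X, w):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1 / (1 + np.exp(-(Xb @ w)))

# Level 1: one model per data set, each emitting a failure score.
level1 = [logreg_fit(X, y) for X in datasets]
P = np.column_stack([logreg_score(X, w) for X, w in zip(datasets, level1)])

# Level 2: a final logistic regression combines the score vector p_t.
w2 = logreg_fit(P, y)
ensemble_scores = logreg_score(P, w2)
```

The Level-2 model effectively learns how much to trust each (model, data set) pair, which is the point of the ensemble.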
29. • Experiment 2
– Tasks : EP and ED (60 min)
– Model : ME (model ensemble)
– Comparison method : Data Ensemble (DE), which feeds the
concatenation of all data sets into a single LR model
• Results : the model ensemble method achieved the best scores
33. User activity can forecast future failures
• Experiment 3
– Task : Forecast
– Model : ME
– Comparison methods : DE and LR on each data set
• Results : ME showed the best performance
35. Conclusions
• Summary
– Proposed a failure detection and forecasting framework
based on multiple user activity data sets
– Demonstrated through extensive experiments that the
proposed methods improve AUC scores
• Future Work
– Utilizing additional data sets
– Deep neural networks (e.g., LSTM models)
– Feature analysis
– Deployment in a real-time monitoring system
Thank you !
37. Research Contributions
• Framework for failure detection and forecasting
• Extensive experiments using multiple real-world data sets
[Figure — overview of our framework: users who hit a failure ("Error?",
"I can't connect.", "Outage?") leave traces in social data (Tweets, RSS,
News) and telecom data (Search, Calls, Web Access), which form x_t, the
multiple user activity data; the service provider's failure reports form
y_t; the framework performs detection and forecasting and raises alerts]
38. Method : Feature Construction
• First method : Simple Moving Average (SMA)
– Sparseness of user activities
→ Replace each feature with its average over the past
time steps (t − S) through t
• Second method : BNS (Bi-Normal Separation [Forman2003])
– Imbalanced labels
→ Feature scaling for imbalanced data sets

tfbns(x_{d,t}) = tf(x_{d,t}) × bns(d)
tf(x_{d,t}) = x_{d,t} / Σ_{d=1}^{D} x_{d,t}
bns(d) = F^{-1}(tpr_d) − F^{-1}(fpr_d)

F^{-1}(·) : inverse cumulative normal distribution
tpr : true positive rate, fpr : false positive rate
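The two feature-construction steps can be sketched directly from the formulas above. The counts are hypothetical, a feature is assumed to "fire" when its count is nonzero (our simplification for computing tpr/fpr per feature), and rates are clipped away from 0 and 1 so the inverse normal CDF stays finite.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(2)
T, D = 500, 3
X = rng.poisson(0.3, size=(T, D)).astype(float)  # sparse activity counts
y = np.where(rng.random(T) < 0.05, 1, -1)

def sma(X, S=10):
    """Simple moving average: mean of each feature over steps (t-S)..t."""
    out = np.empty_like(X)
    for t in range(len(X)):
        out[t] = X[max(0, t - S):t + 1].mean(axis=0)
    return out

def bns_weights(X, y, eps=1e-4):
    """Bi-Normal Separation per feature d: F^{-1}(tpr_d) - F^{-1}(fpr_d)."""
    inv = NormalDist().inv_cdf
    active = X > 0                       # feature d fires at timestamp t
    pos, neg = y == 1, y == -1
    tpr = np.clip(active[pos].mean(axis=0), eps, 1 - eps)
    fpr = np.clip(active[neg].mean(axis=0), eps, 1 - eps)
    return np.array([inv(t) - inv(f) for t, f in zip(tpr, fpr)])

def tf_bns(X, w):
    """tfbns(x_{d,t}) = tf(x_{d,t}) * bns(d), guarding all-zero rows."""
    totals = X.sum(axis=1, keepdims=True)
    tf = np.divide(X, totals, out=np.zeros_like(X), where=totals > 0)
    return tf * w

X_sma = sma(X)                                   # densify sparse series
features = tf_bns(X_sma, bns_weights(X, y))      # rescale for imbalance
```

SMA addresses sparsity by smearing isolated spikes over a window, while the BNS weight boosts features whose firing rate differs most between failure and normal periods.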
39. Experiment : Effects of Feature Construction
• Experimental Setup
– Evaluation set : EP
– Model : Logistic regression with ridge regularization
– Comparison methods
• td (original features) / tf-idf ([Salton+1986]) / tf-bns / +sma
(simple moving average)
• Results : Improved AUC by 31% on average
– SMA for time-series features and BNS scaling for imbalanced
data sets are effective
* Values are mean and standard deviation of AUC.