Anomaly detection in dynamic networks using multi-view time-series hypersphere learning
Xian Teng, Yu-Ru Lin
School of Computing and Information, University of Pittsburgh
Our proposed method has an advantage in detecting events that involve anomalous temporal dynamics. It highlights the need to extract temporal patterns and to exploit multiple data sources. As part of future work, we plan to relax the assumption that the streaming data can be partitioned into periodic, well-aligned temporal segments with similar patterns. In addition, we plan to incorporate the interplay among individual objects (e.g., vertices or edges) into the analysis.
1. F. Chen and D. B. Neill. Non-parametric scan statistics for event detection and forecasting in heterogeneous social media graphs. In SIGKDD, pages 1166–1175. ACM, 2014.
2. P. Rozenshtein, A. Anagnostopoulos, A. Gionis, and N. Tatti. Event detection in activity networks. In SIGKDD, pages 1176–1185. ACM, 2014.
Anomaly detection in dynamic network systems has attracted considerable attention in recent years. Traditional techniques primarily focus on single-view data [1,2], that is, data captured from a single or homogeneous data source. Moreover, most prior work does not take temporal variation into account: it divides streaming data into fixed-length segments and uses integrated features as inputs to train models. Integrating attributes in this way may discard temporal information that is critical for anomaly detection.
In this work, we propose a novel approach called
Multi-View Time-Series Hypersphere Learning
(MTHL) to tackle this challenge.
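The loss of temporal information under feature integration can be illustrated with a toy example. This is a minimal sketch with made-up numbers, not data from our experiments; the function `integrated_features` and its choice of summary statistics (mean, standard deviation, maximum) are assumptions for illustration only.

```python
import numpy as np

# Two hypothetical hourly activity segments for one zone (made-up numbers).
normal = np.array([1.0, 2.0, 4.0, 8.0, 4.0, 2.0, 1.0, 2.0])
# Same values in a different order: the temporal shape changes, the summaries do not.
shuffled = np.array([8.0, 1.0, 2.0, 1.0, 2.0, 4.0, 2.0, 4.0])


def integrated_features(segment):
    """Collapse a segment into order-independent summary features (illustrative choice)."""
    return np.array([segment.mean(), segment.std(), segment.max()])


# The integrated features cannot tell the two segments apart ...
same_summary = bool(np.allclose(integrated_features(normal),
                                integrated_features(shuffled)))
# ... although the raw time series clearly differ.
shape_distance = float(np.linalg.norm(normal - shuffled))

print("identical integrated features:", same_summary)
print("distance between raw series:", shape_distance)
```

A model trained only on the integrated features would treat both segments as the same input, which is why the temporal pattern itself needs to be modeled.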
Given a dynamic network with multiple time-varying multivariate attributes, called “multi-view multivariate time-series”, we seek to extract the normal temporal patterns from a historical reference data set, so as to detect anomalies (when and where) in real time.
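The general idea of hypersphere-based detection can be sketched as follows. This is not the MTHL algorithm: it is a simplified, single-view illustration for one zone that assumes the center is the mean of the reference segments and the radius is a quantile of their distances to that center. The function names, the 0.95 quantile, and the synthetic daily pattern are all hypothetical.

```python
import numpy as np


def fit_hypersphere(reference, quantile=0.95):
    """Fit a center and radius to reference segments of shape (n_segments, n_steps).

    Simplifying assumption: center = mean segment, radius = distance quantile.
    """
    center = reference.mean(axis=0)
    distances = np.linalg.norm(reference - center, axis=1)
    radius = float(np.quantile(distances, quantile))
    return center, radius


def is_anomalous(segment, center, radius):
    """Flag a segment that falls outside the fitted hypersphere."""
    return bool(np.linalg.norm(segment - center) > radius)


# Hypothetical reference data: 200 days of a noisy daily rhythm for one zone.
rng = np.random.default_rng(0)
t = np.arange(24)
daily_pattern = np.sin(2 * np.pi * t / 24)
reference = daily_pattern + 0.1 * rng.standard_normal((200, 24))
center, radius = fit_hypersphere(reference)

# A day whose evening hours deviate strongly from the usual pattern.
disrupted_day = daily_pattern.copy()
disrupted_day[18:22] += 3.0

print("radius:", radius)
print("disrupted day flagged:", is_anomalous(disrupted_day, center, radius))
```

Applying such a test per zone and per time window is one simple way to answer both "where" and "when"; how MTHL combines multiple views is not covered by this sketch.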
Introduction & Motivation
Problem Definition
Method: MTHL
Results: Synthetic Data
Conclusion
Results: Real-world Data
References
[Figure labels: multiple data sources; temporal regularity; anomalous zones; anomalous time window]
I. Data Generation
We simulate a dynamic network (e.g., a city system) to produce synthetic time-series data. Nodes represent small zones in the city, and edges represent population flows between zones.
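A simulation of this kind could look like the sketch below. The poster does not specify the generator, so everything here is an assumption: the function `simulate_city`, the sinusoidal daily rhythm, the uniform base intensities, and the Gaussian noise are illustrative choices only.

```python
import numpy as np


def simulate_city(n_zones=5, n_days=7, steps_per_day=24, noise=0.05, seed=0):
    """Return flows[t, i, j]: hypothetical population flow from zone i to zone j at step t."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_days * steps_per_day)
    # Shared daily rhythm in [0, 1] that repeats every day.
    rhythm = 0.5 * (1.0 + np.sin(2 * np.pi * t / steps_per_day))
    # Fixed base intensity for every ordered pair of zones.
    base = rng.uniform(1.0, 5.0, size=(n_zones, n_zones))
    flows = base[None, :, :] * rhythm[:, None, None]
    flows += noise * rng.standard_normal(flows.shape)
    flows = np.clip(flows, 0.0, None)   # flows cannot be negative
    idx = np.arange(n_zones)
    flows[:, idx, idx] = 0.0            # no self-flows
    return flows


flows = simulate_city()
outflow = flows.sum(axis=2)   # per-zone outgoing flow, shape (time, zones)
inflow = flows.sum(axis=1)    # per-zone incoming flow, shape (time, zones)

print(flows.shape, outflow.shape, inflow.shape)
```

Edge-level series (`flows`) and node-level series (`outflow`, `inflow`) could then serve as separate views of the same network.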
II. Evaluation Metric
The kappa statistic compares the overall (observed) accuracy with the accuracy expected by random chance.
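Assuming the standard Cohen's kappa definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed accuracy and p_e is the chance agreement computed from the label frequencies, the metric can be sketched as below. The function name `kappa_statistic` and the example labels are hypothetical.

```python
def kappa_statistic(y_true, y_pred):
    """Cohen's kappa for two equal-length label sequences."""
    y_true, y_pred = list(y_true), list(y_pred)
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    # Observed accuracy: fraction of positions where the labels agree.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance accuracy: agreement expected if labels were assigned independently
    # with the same marginal frequencies.
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1.0 - expected)


# Made-up imbalanced example: 2 anomalies (1) among 10 samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
kappa = kappa_statistic(y_true, y_pred)
print("kappa:", kappa)
```

In this example the raw accuracy is 0.8, yet kappa is only about 0.375, because predicting the majority class already yields high chance agreement. This makes kappa a more informative measure than accuracy when labels are imbalanced.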
III. Results
• Performance versus anomaly pollution
• Performance versus data noise
• Performance versus label imbalance
Our method consistently outperforms the state-of-the-art baseline methods in the face of anomaly pollution, data noise, and label imbalance.
• Runtime
MTHL makes very fast decisions when detecting outliers.
Post-election day, Nov 10, 2016: MTHL suggests that Midtown Center, Midtown East, Upper Manhattan, and Greenwich Village exhibit anomalous activity.
New Year’s Eve, Dec 31, 2016: MTHL indicates that Times Square and the nearby zones are anomalous, probably because of the “Ball Drop” event at Times Square.