Expert Systems with Applications 38 (2011) 1709–1715
Preprocessing expert system for mining association rules
in telecommunication networks
Tong-Yan Li a,⇑, Xing-Ming Li b
a Department of Communication Engineering, Chengdu University of Information Technology, Chengdu 610225, China
b Key Laboratory of Broadband Optical Fiber Transmission and Communication Networks of Ministry of Education, UESTC, Chengdu 610054, China
ARTICLE INFO

Keywords: Alarm correlation analysis; Preprocessing expert system; Association rules mining; Neural network; Weighted association rules

ABSTRACT

Recently, the application of association rules mining has become an important research area in alarm correlation analysis. However, the original alarms in telecommunication networks cannot be used to mine association rules directly. This paper proposes a novel preprocessing expert system model to deal with the original alarms. The model uses two techniques: the time window technique converts original alarms into transactions, and the neural network technique classifies the alarms into different levels according to the characteristics of telecommunication networks, so that weighted association rules can be mined. Simulation results and real-world applications demonstrate the effectiveness and practicality of this preprocessing expert system.

© 2010 Elsevier Ltd. All rights reserved.
⇑ Corresponding author.
E-mail address: sunny60138800@yahoo.com.cn (T.-Y. Li).
0957-4174/$ - see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2010.07.096

1. Introduction

Recent global expansion in the demand for telecommunications services has resulted in a considerable growth of networks in terms of size, complexity and bandwidth. Networks often consist of hundreds or even thousands of interconnected nodes from different manufacturers using various transport media and systems. As a result, when a network problem or failure occurs, a very large volume of alarms may be generated. These alarms carry detailed but very fragmented information about the problems. Typically, the alarm flow is useful for finding and isolating faults. However, it is also very difficult to determine the root cause of the faults from it. Alarm correlation is known to be helpful in fault diagnosis and localization (Amani, Fathi, & Dehghan, 2005; Hou & Zhang, 2008; Tang, Al-Shaer, & Boutaba, 2008). In the past, the knowledge of alarm correlation was mainly obtained by network experts. With the development of telecommunication networks, it is increasingly difficult for experts to keep up with the rapid change of networks and to discover the really useful knowledge from alarms. Therefore, researchers have adopted many advanced methods, including data mining, to analyze alarm correlation. Data mining is the science of extracting implicit, previously unknown, and potentially useful information from large data sets or databases, also known as knowledge discovery in databases (KDD). Telecommunication alarm correlation analysis based on data mining now plays an important part in current research and is drawing more and more attention.

An alarm correlation system should adapt to the fast-changing technical advances in the telecommunication domain. It is well known that TASA (Telecommunication Alarm Sequence Analyzer) (Hatonen et al., 1996a, 1996b; Klemettinen, Mannila, & Toivonen, 1999) is a classical knowledge discovery system for analyzing large alarm databases from telecommunication networks. TASA supports two central phases of the knowledge discovery process: the pattern discovery phase and the rule presentation phase. In the first phase, TASA automatically finds episode rules and association rules, and in the rule presentation phase, powerful pruning, ordering, and grouping tools are used to support large sets of rules. The algorithms of TASA in the pattern discovery phase are based on the Apriori algorithm (Agrawal & Srikant, 1994; Ng, Lakshmanan, Han, & Pang, 1998; Sarawagi, Thomas, & Agrawal, 1998; Srikant, Vu, & Agrawal, 1997), which fails to reflect some characteristics of alarms effectively. For example, alarms from a telecommunication network are generally of unequal importance, and they are usually short messages in general textual formats. In particular, such a message includes information about the creation time of the alarm, the observed symptom of the fault and the device issuing the alarm. Therefore, we consider that the items should be given different weights to reflect their importance in alarm correlation analysis. On the other hand, the strategy of finding frequent items prunes off infrequent items, which may include some useful association patterns. In fact, although rare events do not happen often or regularly, they often have special meaning or play an important role in some situations, such as predicting telecommunication equipment failures. Weighted alarms can help find this rare but important information. In addition, alarms in the telecommunication
networks are massive, bursty and intermittent. Although many methods (Bouloustas, Calo, & Finkel, 1994; Marilly, Aghasaryan, Betgé-Brezetz, O’Martinot, & Delègue, 2002; Weiss & Hirsh, 1998) have been proposed to analyze alarm correlation, few of them consider how to deal with the original alarm data.

In this paper, we propose a novel preprocessing expert system to resolve the above problems. In order to find the root cause of alarms and locate the faults accurately by using alarm correlation analysis, the processing time should be shortened to meet the needs of both intelligent network management and automation. During data preprocessing, the framework of the knowledge discovery task is formalized and the alarm weights are determined. Meanwhile, we design a binary neural network whose input vector consists of some key elements that can represent alarms. After training on sample data, alarms with similar weights are assigned to the same class. The weights of the neural network not only reflect the knowledge of the experts but also change automatically when the input changes.

This paper is organized as follows. In Section 2, we introduce our system model and its operation process. Section 3 describes the experimental platform and the experimental results in a telecommunication network environment. Finally, a conclusion is drawn in Section 4.

2. Preprocessing experts system proposed

2.1. Problem description of the original alarms

In the process of data preprocessing, we are interested in making the original alarms clean and useful. The preprocessing includes alarm data collection and data cleaning (cleaning means filling in missing data, discarding redundant data and reducing the volume of data). By preprocessing, we can convert original alarm data into alarm transactions.

Alarms are short messages, generally of textual format, that are symptomatic of a change in condition (often an abnormality) in a system. According to the X.733 protocol of the ITU-T standard recommendations, an alarm typically contains several fields giving information about the following attributes: equipment name, device type, equipment address, interface types, alarm level, alarm types, alarm status, alarm time, etc.

Unfortunately, an alarm does not usually contain significant information about the network fault. When a network fault appears, it triggers a series of alarms, but not all alarms are correlated with the fault. It is therefore necessary to analyze all alarms to find the relationships among them and determine the root cause of the fault.

Original alarms from an actual network often suffer from several major problems, such as information redundancy, incompleteness, lack of time synchronization, a lot of noise that has no relationship with the association rules, and attributes of different importance. For these reasons, ARM-ACAS cannot deal with the original alarms directly. In order to be suitable for data mining, original alarms must be converted into transactions and assigned different weights. In general, the main problems of the original alarms are as follows:

- A fault often triggers many alarms: it has been observed that in most cases faults occur in bursts, because any change in the behavior of a single node of a complex network can perturb, and therefore may cause faults in, other nodes of the same network. As a result, equipment faults may occur intermittently and cause multiple alarms.
- An alarm typically contains many attributes, some of which are of no use for mining the association rules. Only a part of the attributes can be extracted to form alarm events for mining weighted association rules.
- Incomplete data: in some special circumstances, some information may be lost because alarm data are missing. Both interrupted network management channels and unsuccessful information transmission may lead to this problem.
- Noise in the data: in the mining process, data unrelated to fault diagnosis can be called noise. Noise greatly interferes with alarm correlation analysis, so it must be removed in preprocessing.
- Time non-synchronization: in a large network, the clocks of the equipment usually cannot be standardized, so the recorded times differ. It is therefore no surprise that the many time errors can make the mining very difficult.
- Alarm data, which are short messages, generally in textual formats, typically with several fields including creation time, location and some alarm conditions, are of unequal importance. These items should be given different weights to reflect their different importance.

From the above analysis of the problems in alarm correlation analysis, the extraction and the time synchronization of alarms are the two most important factors in data preprocessing. The methods to resolve these two problems are provided in Section 2.2.

2.2. Alarm extraction and time synchronization

All the analysis methods need history logs of alarm data. However, we do not need any information about the topology of the networks. In this respect, the analysis system can be used in different networks. We extract the alarm time, alarm level, alarm sort and equipment address to form an alarm event and mark it as a 4-tuple (a, s, l, t). These four attributes are important enough to represent the alarm. Fig. 1 shows the information extraction process.

A time synchronization problem exists in the original alarms. Some alarms may happen within one or two seconds of each other, and sometimes some alarms even occur at the same time. Mining the original alarms directly is too inefficient. In order to deal with this problem, we convert the original alarms into appropriate alarm transactions by examining the original data over a user-specified time window.

Definition 1. Given a set $E$, the alarm sequence $S = \{s, T_s, T_e\}$ is an ascending sequence occurring in the time interval $[T_s, T_e]$. $S_w = \{w, t_s, t_e\}$ is a time window of the sequence $S$, in which $t_s \ge T_s$, $t_e \le T_e$ and $w = \{w \subseteq S \mid t_s \le t \le t_e\}$. $t_e - t_s$ is the width of the window, denoted $W$.

Definition 2. Given the time window width $W$ and the window sliding step $s$, let a window start at time $T_i$ and end at time $T_e$; the time span $T_e - T_i = W$ is called the width of the window. After one sliding step, the window starts at $T_i + s$ and ends at $T_e + s$.

Fig. 2 shows the time window, in which the alarm sequence is {A, C, . . . , A, F}, the width of the window is 5 and the sliding step is 2.

We define a transaction time interval to deal with the asynchronous problem of alarms. The alarms whose ‘‘alarm time” values fall within the same time window may be caused by the same fault. All the alarms satisfying the following conditions are combined to form an alarm transaction:

(1) The values of alarm time are within the same time interval.
(2) The values of cleared time are within the same time interval.
(3) The orders of alarms in the time interval are the same.
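The sliding time window of Definitions 1 and 2 can be illustrated with a short Python sketch. It is not the authors' implementation (their system is written in Java): the names `Alarm` and `to_transactions` are hypothetical, only condition (1) above (alarm time inside the window) is modelled, and a transaction is assumed to be the ordered set of distinct alarm sorts observed in one window.

```python
from typing import List, NamedTuple, Tuple


class Alarm(NamedTuple):
    """Alarm event as the 4-tuple (a, s, l, t) described in the text."""
    address: str   # a: equipment address
    sort: str      # s: alarm sort
    level: int     # l: alarm level
    time: float    # t: alarm time, in seconds


def to_transactions(alarms: List[Alarm], width: float,
                    step: float) -> List[Tuple[str, ...]]:
    """Slide a window of the given width over the alarms, moving by step.

    Each non-empty window [start, start + width] yields one transaction:
    the distinct alarm sorts in order of first occurrence (an assumption).
    """
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    ordered = sorted(alarms, key=lambda alarm: alarm.time)
    if not ordered:
        return []
    transactions = []
    start = ordered[0].time
    last = ordered[-1].time
    while start <= last:
        end = start + width
        items: List[str] = []
        for alarm in ordered:
            # Definition 1: keep alarms with t_s <= t <= t_e.
            if start <= alarm.time <= end and alarm.sort not in items:
                items.append(alarm.sort)
        if items:
            transactions.append(tuple(items))
        start += step  # Definition 2: next window starts at T_i + s
    return transactions
```

Because consecutive windows overlap whenever the step is smaller than the width, the same alarm can contribute to more than one transaction; this is a property of the sketch and may differ from how the original system counts transactions.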
Fig. 1. Information extraction.

Fig. 2. An example of the time window.

Through the time window operation, the original alarms are converted into transactions. The ultimate goal of using the time window is to improve the mining efficiency, to locate faults quickly and to predict severe network faults accurately.

2.3. Neural network proposed for confirming alarm weights

An artificial neural network is an information processing paradigm inspired by the way biological nervous systems, such as the brain, process information. A neural network, with its remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. There are already many methods for weight analysis in mining association rules, but they are unfit for determining alarm weights in telecommunication networks. A neural network can handle alarm weights well (Li & Li, 2007). In this paper, we propose a binary neural network to determine the weights of alarm data effectively. During the training of the neural network, we determine a set of link weights that reduces the system error as close to zero as possible. Relevant data are then entered into the neural network in order to identify patterns automatically. When the neural network has been trained successfully, we can use it to determine the alarm weights.

2.3.1. Design the neural network

In the neural network, the inputs are three key factors which influence the alarm weights, and the outputs are the different classifications of alarms. With a neural network of binary neurons, the learning process can divide the alarms into four classes. The model can be seen as the simplest kind of feed-forward neural network, containing three inputs, six link weights, two neurons and two outputs.

In this neural network model, three things must be determined: (1) the description of the inputs; (2) the link weights of the neural network and (3) the transfer function.

In our study, the input datasets are alarm data from a telecommunication network. Alarm attributes contain many factors, three of which (the node degree of the alarm equipment, the alarm level, which reflects the severity, and the alarm type) influence the telecommunication network most, so they are chosen as the inputs of the neural network. The outputs of the neural network are the alarm classes we need. Alarms of similar importance are assigned to the same category, and different classes have different weight values. At first, we input the sample values set from the experience of experts. After training the link weights of the neural network with two neurons, we obtain the neural network model. The construction process of the neural network is shown in Fig. 3.

Fig. 3. Binary neural network classifiers.

Select the samples $\{p_1, q_1\}, \ldots, \{p_n, q_n\}$, where $P_1 = (a_{11}, \ldots, a_{1m})^T, \ldots, P_n = (a_{n1}, \ldots, a_{nm})^T$ ($p$ is the input vector and $q$ is the output vector), and define the initial values of the link weights as $\{w_1^0, \ldots, w_m^0\}$. Alarm data entries are multidimensional, and the input can be expressed as an $n \times m$ matrix. The link weights of the neural network can be expressed as an $m \times 1$ vector; the neural network is computed by matrix multiplication, and the output can be expressed as an $n \times 1$ vector. The specifics are as follows.

The inputs of the neural network are given as $P_1 = [a_{11}, \ldots, a_{1m}]^T, \ldots, P_n = [a_{n1}, \ldots, a_{nm}]^T$, where

$$a_{n \times m} = \begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix} = \begin{bmatrix} P_1^T \\ \vdots \\ P_n^T \end{bmatrix}.$$

The link weights are $W_{m \times 1} = [w_1, \ldots, w_m]^T$.

The net input of the neural network is written as

$$n = a \cdot W + b = [P_1, \ldots, P_n]^T \cdot [w_1, \ldots, w_m]^T + b.$$

The output can be written as

$$Q = f(n) = f(a \cdot W + b) = f(a_{n \times m} \cdot W_{m \times 1} + b) = f([P_1, \ldots, P_n]^T \cdot [w_1, \ldots, w_m]^T + b),$$

where $f$ is the transfer function of the neural network and $b$ is the external input.
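As an illustration of the forward computation $Q = f(a \cdot W + b)$ for the two-neuron model (three inputs, six link weights, two outputs), the following Python sketch may help. It assumes a hard-limiting transfer function that returns −1 for negative net input and 1 otherwise; the function names and any weight values passed in are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def hardlim(v: float) -> int:
    """Hard limiter assumed here: -1 for v < 0, +1 otherwise."""
    return -1 if v < 0 else 1


def classify(inputs: Sequence[float],
             weights: Sequence[Sequence[float]],
             biases: Sequence[float]) -> Tuple[int, int]:
    """Return the output pair (Q1, Q2) for one alarm.

    inputs  : the three quantified alarm attributes
    weights : two rows of three link weights (six weights in total)
    biases  : one external input b per neuron
    """
    if len(weights) != 2 or len(biases) != 2:
        raise ValueError("the model has exactly two neurons")
    outputs = []
    for row, b in zip(weights, biases):
        if len(row) != len(inputs):
            raise ValueError("one link weight is needed per input")
        net = sum(x * w for x, w in zip(inputs, row)) + b  # n = a . W + b
        outputs.append(hardlim(net))
    return outputs[0], outputs[1]
```

With two outputs that each take the value −1 or 1, the pair (Q1, Q2) can take four values, which is how two binary neurons separate the alarms into four classes.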
The outputs $Q_1$ and $Q_2$ each take one of the two values −1 and 1. The input vectors can therefore be divided into four classes: (−1, −1), (−1, 1), (1, −1) and (1, 1), and the sample values are set from experience. After classification by the neural network, alarms of similar importance get the same weights.

Choosing the transfer function is a crucial step in the neural network design process. In our design, according to the characteristics of alarm data transmission, the input value is set as −1 or 1, and the transfer function is chosen as the hard limiting function hardlim.

The hard limiting function hardlim is given as

$$f(v) = \begin{cases} -1, & v < 0 \\ 1, & v \ge 0 \end{cases} \qquad (1)$$

2.3.2. Training process of the link weights

After learning from the samples and training the link weights, the data become linearly classified. Training the link weights of the neural network includes the steps described below. Let $a(n)$ denote the input vectors, $Q(n)$ the actual output vectors and $q(n)$ the theoretical (target) output values; $\eta$ is the learning step, a positive number less than 1 (the symbols have the same meaning as before).

(1) Choose the sample data $\{p_1, q_1\}, \ldots, \{p_n, q_n\}$, where $P_1 = (a_{11}, \ldots, a_{1m})^T, \ldots, P_n = (a_{n1}, \ldots, a_{nm})^T$ ($p$ denotes the input elements, $q$ denotes the output elements).
(2) Set the initial values of the link weights as $\{w_1^0, \ldots, w_m^0\}$.
(3) In step $n$ ($n = 0, 1, 2, \ldots$), input the vector $a(n)$ and calculate the actual output $Q(n) = f(a(n) \cdot w(n) + b)$.
(4) Adjust the link weights according to the following rules, in which $e$ denotes the system error:

$$w(n+1) = w(n) + \eta\,[q(n) - Q(n)]\,a(n) \qquad (2)$$

$$e = q(n) - Q(n) \qquad (3)$$

(5) Set $n = n + 1$; if $|e| \ge \varepsilon$, go back to step (4), else if $|e| < \varepsilon$, end ($\varepsilon$ is a specified small positive value).

The link weights are adjusted until the error falls within the system error tolerance $\varepsilon$, and the training process is repeated until all the patterns have been trained. The training process of the link weights is shown in Fig. 4.

Fig. 4. Training process of the link weights.

2.3.3. Convergence analysis of the design

Proof. Select the sample data as $\{p_1, q_1\}, \ldots, \{p_n, q_n\}$, in which the expected outputs $q_n$ take the value 1 or −1. Let $X = [P \;\; b]^T$ be the input vector, where $b$ is an external input value, and set the link weight vector $Y = [W \;\; 1]^T$, with 1 as its offset value. The net input of the neural network is $n = P^T W + b = X^T Y$, and the update rule of the link weights is $Y_{new} = Y_{old} + Xq$. Let $e = q(n) - Q(n)$ be the error between the true output value and the target value; the adjustment rule can then be written as

$$\begin{cases} e > 0: & Y_{new} = Y_{old} + X \\ e < 0: & Y_{new} = Y_{old} - X \\ e = 0: & Y_{new} = Y_{old} \end{cases} \qquad \text{let } q = \eta e = \begin{cases} 1, & e = 0 \\ \dfrac{e}{\|e\|}, & e \ne 0 \end{cases}$$

We prove that if a weight vector that classifies the samples exists, the training converges to it. The weight vector after the $k$th iteration is

$$Y(k) = Y(k-1) + X'(k-1) \qquad (4)$$

in which $X'(k-1)$ is an element of the set $\{X_1, X_2, \ldots, X_Q, -X_1, \ldots, -X_Q\}$. Assume that a vector $Y^*$ classifies all $Q$ inputs correctly, that is,

$$t_q = 1: \quad Y^{*T} X_q > \delta > 0 \qquad (5)$$

$$t_q = 0: \quad Y^{*T} X_q < -\delta < 0 \qquad (6)$$

In order to prove the convergence of the rule, upper and lower bounds on the weight vector are needed.

Set the initial weight vector to zero, $Y(0) = 0$. The weight vector after the $k$th iteration is

$$Y(k) = Y(k-1) + X'(k-1) = X'(0) + X'(1) + \cdots + X'(k-1) \qquad (7)$$

Taking the inner product with $Y^*$ gives

$$Y^{*T} Y(k) = Y^{*T} X'(0) + Y^{*T} X'(1) + \cdots + Y^{*T} X'(k-1). \qquad (8)$$

$$\because\; Y^{*T} X_q > \delta \;\Rightarrow\; Y^{*T} X'(j) > \delta \;\Rightarrow\; Y^{*T} Y(k) > k\delta \qquad (9)$$

By the Cauchy–Schwarz inequality,

$$(Y^{*T} Y(k))^2 \le \|Y^*\|^2 \, \|Y(k)\|^2 \qquad (10)$$

in which $\|Y(k)\|^2 = Y^T(k) Y(k)$. From (9) and (10) we can conclude that

$$\|Y(k)\|^2 \ge \frac{(Y^{*T} Y(k))^2}{\|Y^*\|^2} > \frac{(k\delta)^2}{\|Y^*\|^2}. \qquad (11)$$

$$\because\; \|Y(k)\|^2 = Y^T(k) Y(k) = [Y(k-1) + X'(k-1)]^T [Y(k-1) + X'(k-1)] = Y^T(k-1) Y(k-1) + 2 Y^T(k-1) X'(k-1) + X'^T(k-1) X'(k-1). \qquad (12)$$

The weights are updated only when the classification is wrong, and in this case the two quantities have opposite signs, so that $Y^T(k-1) X'(k-1) \le 0$. Hence (12) simplifies to $\|Y(k)\|^2 \le \|Y(k-1)\|^2 + \|X'(k-1)\|^2$. Applying this repeatedly gives

$$\|Y(k)\|^2 \le \|X'(0)\|^2 + \|X'(1)\|^2 + \cdots + \|X'(k-1)\|^2 \qquad (13)$$

Let $\varphi = \max\{\|X'(j)\|^2\}$; then

$$\|Y(k)\|^2 \le k\varphi \qquad (14)$$
The upper and lower bounds on the weight vector can therefore be written as

$$\frac{(k\delta)^2}{\|Y^*\|^2} < \|Y(k)\|^2 \le k\varphi. \qquad (15)$$

Therefore we have

$$k < \frac{\varphi \|Y^*\|^2}{\delta^2}. \qquad (16)$$

Eq. (16) shows that the number of updates is bounded, and therefore the training algorithm converges: after a finite number of training steps, we obtain the neural network we need. This completes the proof. □

2.4. Preprocessing expert system modeling

Based on the above descriptions, our proposed preprocessing expert system model has two parts: time window processing and neural network processing. In the whole system, the time window processing module handles the original alarms first, and the cleaned data are then input into the neural network to get their weights. Fig. 5 shows the working process of the preprocessing expert system.

Fig. 5. Preprocessing expert system model.

From this figure, we can see that the alarm preprocessing expert system is a rule-based expert system, and each module has its own man-machine interface.

Fig. 6. The topology of the network.
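The link weights used by the neural network module come from the training procedure of Section 2.3.2. A minimal Python sketch of that procedure for a single neuron, using the update of Eq. (2), is given below. It is an illustration under stated assumptions rather than the authors' code: the name `train_neuron`, the default learning step `eta = 0.5`, the epoch limit, the zero initial weights, and the choice to update the bias b together with the weights (the usual perceptron convention) are all assumptions.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (input vector a, target q in {-1, 1})


def hardlim(v: float) -> int:
    """Hard limiting transfer function of Eq. (1)."""
    return -1 if v < 0 else 1


def train_neuron(samples: Sequence[Sample], eta: float = 0.5,
                 max_epochs: int = 100) -> Tuple[List[float], float, int]:
    """Train one neuron with w <- w + eta * (q - Q) * a (Eq. (2)).

    Returns (weights, bias, number of weight updates). Training stops
    after the first full pass over the samples with zero error.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    updates = 0
    for _ in range(max_epochs):
        changed = False
        for a, q in samples:
            net = sum(w * x for w, x in zip(weights, a)) + bias
            e = q - hardlim(net)  # system error of Eq. (3)
            if e != 0:
                weights = [w + eta * e * x for w, x in zip(weights, a)]
                bias += eta * e  # assumption: bias is trained as well
                updates += 1
                changed = True
        if not changed:
            return weights, bias, updates
    raise RuntimeError("no convergence; samples may not be linearly separable")
```

For linearly separable samples the loop ends after a finite number of updates, which is the behavior that the bound in Eq. (16) describes; for non-separable samples the sketch gives up after `max_epochs` passes.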
The time window part has four independent modules, and each module has its own rules. For example, time synchronization is handled by the time synchronization rules, and the extraction of alarm transactions follows the rules for extracting alarm transactions. Meanwhile, the neural network processing part follows the rules for setting the alarm weights.

The rules of the system take the form P ⇒ Q, or IF P THEN Q, in which P is the prerequisite and Q is the conclusion. For the whole system, the rules of each part are given by the network management expert. For example, the sliding step and the time window width are set from the experience of the experts for different networks, and the learning samples of the neural network are also set by the experts.

The preprocessing expert system is an important part of the whole mining system, because it provides clean and appropriate data for finding alarm association rules.

3. Experiments and results

3.1. The experimental setup

A series of experiments was carried out to show the performance of our system on an AMD Sempron(tm) Processor 2800+ machine with 512 MB of main memory, running the Microsoft Windows XP Professional operating system. All code and interfaces are written in Java. The alarm data are obtained from the simulated telecommunication network according to the principles given in Section 3.2. Fig. 6 shows the topology of a real-world network with twenty nodes. Among them there are three root nodes, 1, 10 and 18, while alarms of the other nodes are triggered by these three root nodes. The bandwidth of the root nodes is 8M, the bandwidth of the other links is 2M, and the network is finally connected to the entire CHINANET with 100M bandwidth.

3.2. Simulation principle

A method for simulating the occurrence of alarms is also incorporated in the simulation. We construct the network and produce alarms in order to verify that our algorithm is correct and efficient. Based on the characteristics of telecommunication networks, we generate original alarms using the following principles:

- Alarms in the lower level of an arbitrary node cause corresponding alarms in the upper level of the same node.
- Alarms of an edge node may be transmitted, with probability p, to the center node it connects to.
- Alarms of a center node are transmitted to one randomly chosen edge node that it connects to.
- Alarms of a center node are transmitted to all center nodes that it connects to.

Using the above principles, the original alarms reflect the relationships of the network elements truthfully. In this case, an alarm generated by an edge node may be correlated with the alarms generated by the center node.

The preprocessing works on the simulated alarm datasets. In the experiments, we selected three time window settings to process different numbers of original alarms, ranging from 1000 to 10,000; the resulting transactions were generated and stored for mining the association rules. After preprocessing, the number of transactions was less than half the number of original alarms. Table 1 shows not only the preprocessing results for the original alarms, but also how strongly the time window width influences the number of frequent items.

Table 1
The preprocessing of the alarms (number of transactions for each time window setting).

Number of original alarms            1000  2000  3000  4000  5000  6000  7000  8000  9000  10,000
Time window 5 s, sliding step 3 s     350   715  1064  1372  1739  2077  2404  2728  3093   3538
Time window 10 s, sliding step 3 s    369   721  1041  1369  1716  2084  2396  2784  3106   3482
Time window 10 s, sliding step 6 s    166   355   523   678   875  1037  1205  1384  1547   1725

In the first test, we set the time window to 5 s and the sliding step to 3 s. The preprocessing expert system converts the original alarms into a smaller number of transactions. From the table we can see that the number of transactions is roughly 1/3 of the number of original alarms. For example, when the number of original alarms reaches 10,000, we have only 3538 transactions. In the second test, we change the time window to 10 s but keep the sliding step at 3 s. Compared with the first test, the number of transactions is slightly larger when the number of alarms is small (369 versus 350 for 1000 alarms), but as the number of original alarms increases, the difference is not obvious. In the final test, we double both the time window and the sliding step used in the first test. Correspondingly, the number of transactions is almost half that of the first test. For association rules mining, the number of transactions in the third test is too small to find enough correlated rules. Considering all of this, we decided to use the first test setting in our simulation, because the results show that with a window of 5 s and a sliding step of 3 s the number of transactions changes very steadily. In this situation, we get enough transactions to find association rules while keeping the preprocessing efficient.

3.3. The test of training neural network

In order to determine the alarm weights, we must first classify the alarm data. Classification is based on the three important attributes of the alarms: the node degree of the network equipment, the alarm level and the alarm type. Before the data are input to the neural network, the three attributes of the alarms must be processed into a triple (a, s, l), and the descriptive attributes must be quantified. Given the topology of the communication network, we can get the node degree, and then classify the alarm into 4 levels: (1) serious alarm, quantified as 1; (2) major alarm, quantified as 2; (3) minor alarm, quantified as 3; (4) indicative alarm, quantified as 4. According to the network topology, a node that is directly linked to the root node and has a node degree of more than 4 generates serious alarms; a node generates major alarms when it is directly linked to the root node and has a node degree between 2 and 4; a node that generates minor alarms must meet one of the following conditions: it has a node degree of less than 2 but is directly linked to the root node, or it has a node degree of more than 3 but is indirectly linked to the root node; a node generates indicative alarms in all other situations. Alarm types are divided into five categories: (1) communication alarm, quantified as 1; (2) device alarm, quantified as 2; (3) environmental alarm, quantified as 3; (4) running alarm, quantified as 4; (5) service alarm, quantified as 5.
This neural network has three layers. The first layer is the input layer, where the vectors are input to the neural network after being quantified according to the quantification principle. The second layer is the middle layer of the neural network, and it consists of two neurons. The third layer is the output layer, which outputs the classification values. According to the various attributes of the alarms, we divide the alarms into 4 categories with the values 0.1, 0.2, 0.3 and 0.4. The neural network is considered to have converged when the squared learning error on the samples is less than 0.0001. After 47 iterations, the test meets the convergence condition and the neural network is fully constructed. Finally, we can get the weights of all the alarms.

4. Conclusions

The application of association rules mining in telecommunication networks is an important area. In the special telecommunication environment, fault management and alarm correlation analysis are critical but difficult tasks, because the large number of alarms have their own characteristics. Therefore, dealing with these alarms flexibly and automatically is both necessary and practical. The preprocessing expert system proposed in this paper is based on the time window technology and the neural network technology. Unlike traditional expert systems, our system is a more flexible approach with higher operating efficiency. By using our system, we can quickly convert the original alarms into the form needed for association rules mining. In summary, this preprocessing expert system has the following improvements and innovations:

(1) In telecommunication networks, alarms are numerous and change dynamically. Preprocessing helps turn the alarms into a unified framework that is suitable for mining. There is a lot to do in this process, of which extracting the useful elements of alarms and selecting a proper time window for the original alarms are the most important.
(2) The weight determination can influence the mining efficiency. In this paper, we have described a feed-forward neural network model that determines the alarm weights according to the features of alarms and telecommunication network topologies. Based on some typical sample values of an actual network, the link weights of the neural network model settle to stable values after a certain amount of training. Applying the neural network to determine the alarm weights not only reflects the experience and knowledge of experts, but also makes full and accurate use of the alarm attributes. Meanwhile, changes of network topology can be accommodated rapidly by adjusting the weights, making the weighted association rules mining more scientific and effective.

Experimental results show that this preprocessing expert system plays a most important part in the whole mining process for finding association rules and locating the root cause of faults.

Acknowledgment

This work is supported by Natural Science Foundation of China (NSFC 60572091).

References

Agrawal, R., & Srikant, R. (1994). Fast algorithm for mining association rules. In Proceeding of the 20th VLDB conference (pp. 487–499).
Amani, N., Fathi, M., & Dehghan, M. (2005). A case-based reasoning method for alarm filtering and correlation in telecommunication networks. In Proceeding of the Canadian conference on electrical and computer engineering (pp. 2182–2186).
Bouloustas, A. T., Calo, S. B., & Finkel, A. (1994). Alarm correlation and fault identification in communication networks. IEEE Transactions on Communications, 42(234), 523–533.
Hatonen, K., Klemettinen, M., Mannila, H., Ronkainen, P., & Toivonen, H. (1996). TASA: Telecommunication alarm sequence analyzer or how to enjoy faults in your network. In Proceeding of IEEE network operations and management symposium, 1996 (pp. 520–529).
Hatonen, K., Klemettinen, M., Mannila, H., Ronkainen, P., & Toivonen, H. (1996). Knowledge discovery from telecommunication network alarm databases. In Proceeding of 12th international conference on data engineering (ICDE’96), New Orleans (pp. 115–122).
Hou, S. Z., & Zhang, X. F. (2008). Analysis and research for network management alarms correlation based on sequence clustering algorithm. In Proceeding of ICICTA’2008 (pp. 982–986).
Klemettinen, M., Mannila, H., & Toivonen, H. (1999). Rule discovery in telecommunication alarm data. Network and Systems Management Journal, 7(4), 395–423.
Li, T. Y., & Li, X. M. (2007). The study of the neural network applied to weighted association rules mining. In Proceedings of the international conference ICWAPR2007 (pp. 742–745).
Marilly, E., Aghasaryan, A., Betgé-Brezetz, S., O’Martinot, O., & Delègue, G. (2002). Alarm correlation for complex telecommunication networks using neural networks and signal processing. In IEEE workshop on IP operations and management (pp. 3–7).
Ng, R., Lakshmanan, L. V. S., Han, J., & Pang, A. (1998). Exploratory mining and pruning optimizations of constrained association rules. In Proceedings of the SIGMOD’98 (pp. 13–24).
Sarawagi, S., Thomas, S., & Agrawal, R. (1998). Integrating association rule mining with relational database system: Alternatives and implications. In Proceedings of the SIGMOD’98 (pp. 343–354).
Srikant, R., Vu, Q., & Agrawal, R. (1997). Mining association rules with item constraints. In Proceedings of the KDD’97 (pp. 67–73).
Tang, Y. N., Al-Shaer, E., & Boutaba, R. (2008). Efficient fault diagnosis using incremental alarm correlation and active investigation for internet and overlay networks. IEEE Transactions on Network and Service Management, 5(1), 36–49.
Weiss, G. M., & Hirsh, H. (1998). Learning to predict rare events in event sequences. In Proceedings of the 4th international conference on knowledge discovery and data mining (pp. 359–363). AAAI Press.