The document provides guidelines for assessing the accuracy of log message template identification techniques. It discusses issues with existing accuracy metrics and proposes new metrics like Template Accuracy that are not sensitive to message frequency. It also recommends performing oracle template correction as templates extracted without source code are often incorrect. Additionally, it suggests analyzing incorrectly identified templates to understand weaknesses and provide insights to improve techniques. The guidelines aim to help properly evaluate template identification techniques for different use cases.
Guidelines for Assessing the Accuracy of Log Message Template Identification Techniques
1. Guidelines for Assessing the Accuracy of Log Message Template Identification Techniques
Zanis Ali Khan, University of Luxembourg, Luxembourg
Donghwan Shin, University of Luxembourg, Luxembourg
Domenico Bianculli, University of Luxembourg, Luxembourg
Lionel C. Briand, University of Luxembourg, Luxembourg, and University of Ottawa, Canada
2. Software logs are everywhere
Example logs from HDFS:
INFO DataXceiver: Receiving block blk_119603041192418163 src: /10.251.38.197:37990 dest: /10.251.38.197:50010
INFO DataXceiver: Receiving block blk_5955217335243326590 src: /10.250.11.53:49019 dest: /10.250.11.53:50010
INFO DataXceiver: Receiving block blk_5311940628079058898 src: /10.251.107.98:38894 dest: /10.251.107.98:50010
INFO DataXceiver: Receiving block blk_697125257891768172 src: /10.251.110.160:59884 dest: /10.251.110.160:50010
INFO PacketResponder: PacketResponder 0 for block blk_-171369910704040079 terminating
INFO PacketResponder: Received block blk_-171369910704040079 of size 67108864 from /10.251.110.68
INFO BlockReceiver: Receiving empty packet for block blk_-3842070622043972712
3. Only data available that reflect the system behaviour
Used for various software engineering tasks:
• Anomaly Detection
• Testing
• Model Inference
10. Scope of the guidelines
• Accuracy metrics
• Oracle templates
• Incorrectly identified templates
11. Accuracy Metrics in Literature
• Grouping Accuracy (GA) [1] and Parsing Accuracy (PA) [2]: the ratio of "correctly parsed" log messages over the total number of log messages
• GA: A log message is correctly parsed when its template corresponds to the same group of log messages as the ground truth does
• PA: A log message is correctly parsed when all its static text and dynamic variables (i.e., fixed and variable parts) are correctly identified
[1] Zhu, Jieming, et al., "Tools and benchmarks for automated log parsing," 2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)
[2] Dai, Hetong, et al., "Logram: Efficient Log Parsing Using n-Gram Dictionaries," 2020 IEEE Transactions on Software Engineering (TSE)
To differentiate the two definitions, we use the term GA for the metric of [1], which its authors originally called Parsing Accuracy (PA)
12. Example: GA and PA
• GA is 100%, as the grouping induced by the identified event templates is identical to the ground truth grouping
• PA is 33% (1 of 3 messages), as only the "Group2" message is correctly parsed according to the PA definition, while the fixed and variable parts of the "Group1" messages are not correctly identified
Group | Log Message | Ground Truth Template | Identified Event Template
Group1 | cse:5071 open through proxy cse:5071 HTTPS | <*> open through proxy <*> HTTPS | <*>:<*> open through proxy <*>:<*> HTTPS
Group1 | cse:5072 open through proxy cse:5072 HTTPS | <*> open through proxy <*> HTTPS | <*>:<*> open through proxy <*>:<*> HTTPS
Group2 | sending request to node3 | Sending request to <*> | Sending request to <*>
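The GA and PA values in the example above can be reproduced with a short Python sketch. It assumes each log message is paired with its ground-truth template and its identified template, and it approximates the PA check by comparing the two templates token by token. The function names are illustrative and are not taken from any benchmark toolkit.

```python
from collections import defaultdict


def _groups(templates):
    """Map each template to the set of message indices assigned to it."""
    groups = defaultdict(set)
    for index, template in enumerate(templates):
        groups[template].add(index)
    return groups


def grouping_accuracy(oracle, identified):
    """Share of messages whose identified group equals their oracle group."""
    oracle_groups, identified_groups = _groups(oracle), _groups(identified)
    correct = sum(
        1
        for o, i in zip(oracle, identified)
        if oracle_groups[o] == identified_groups[i]
    )
    return correct / len(oracle)


def parsing_accuracy(oracle, identified):
    """Share of messages whose identified template matches the oracle token by token.

    Assumption: template equality stands in for "all fixed and variable parts
    are correctly identified".
    """
    correct = sum(1 for o, i in zip(oracle, identified) if o.split() == i.split())
    return correct / len(oracle)


# One entry per log message, in the order of the table above.
oracle = [
    "<*> open through proxy <*> HTTPS",
    "<*> open through proxy <*> HTTPS",
    "Sending request to <*>",
]
identified = [
    "<*>:<*> open through proxy <*>:<*> HTTPS",
    "<*>:<*> open through proxy <*>:<*> HTTPS",
    "Sending request to <*>",
]

ga = grouping_accuracy(oracle, identified)
pa = parsing_accuracy(oracle, identified)
print(f"GA = {ga:.2f}, PA = {pa:.2f}")  # GA = 1.00, PA = 0.33
```

The grouping check only compares which messages end up together, which is why the wrong Group1 template does not lower GA.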
13. Issue: Accuracy Metrics in Literature
• Both GA and PA are sensitive to the number of log messages
• This is problematic if log messages are repeated (e.g., heartbeat messages)
14. New Metric: Template Accuracy (TA)
TA: A template is "correctly identified" when the identified template is identical (token-by-token) to the oracle template
Precision-TA (PTA):
\[ \mathrm{PTA} = \frac{\text{\# of correctly identified templates}}{\text{total \# of identified templates}} \]
Recall-TA (RTA):
\[ \mathrm{RTA} = \frac{\text{\# of correctly identified templates}}{\text{total \# of oracle templates}} \]
TA is not sensitive to the number of messages
Oracle Template | Correct Identified Template | Incorrect Identified Template
<*> Served block <*> to <*> | <*> Served block <*> to <*> | <*> Served block blk_000 to <*>
15. Observed Issue No. 1
Results from different accuracy metrics are not properly compared
16. Guideline #1: Do Use Appropriate Metrics
Use a metric that adequately assesses the accuracy of template identification techniques in the context of the targeted use cases
17. Guideline #1: Do Use Appropriate Metrics
Two important criteria:
• Is message frequency important?
• Are variable tokens required?
18. Observed Issue No. 2
Oracle templates are not always correct, mainly when extracted manually without access to source code
Log Message | Incorrect Oracle Template | Correct Oracle Template
status is false | status is false | status is <*>
19. Guideline #2: Do Perform Oracle Template Correction
Use oracle template correction, as the oracle templates extracted without using source code are most likely incorrect
20. Oracle Template Correction Rules (Partial)
Rule | Description | Example
Double Spaces (DS) | Replace double spaces with a single space | Input:__<*> → Input:_<*> (_ marks a space)
Digit (DG) | Replace digit tokens with variables | euid=0 → euid=<*>
Boolean (BL) | Replace True/False with a variable | cancel=false → cancel=<*>
Path String (PS) | Replace a path-like token with a variable | /lib/tmp started → <*> started
The rules are based on the manual investigation of the oracle templates in LogHub* (apply when the source code is not available)
* LogHub (publicly available log benchmark): https://github.com/logpai/loghub
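The four rules listed above could be approximated with regular expressions, as in the following Python sketch. The exact patterns, the handling of `key=value` tokens, and the function name are assumptions made for illustration. The slide lists only part of the rule set and does not give the authors' implementation.

```python
import re

VAR = "<*>"
PATH_PATTERN = re.compile(r"(/[\w.\-]+)+/?")  # assumed shape of a path-like token


def correct_template(template: str) -> str:
    """Apply sketch versions of the DS, DG, BL, and PS rules to one template."""
    # DS: collapse runs of two or more spaces into a single space.
    template = re.sub(r" {2,}", " ", template)

    corrected = []
    for token in template.split(" "):
        # Assumption: for key=value tokens only the value part is inspected.
        key, sep, value = token.rpartition("=")
        if re.fullmatch(r"\d+", value):              # DG: digit token
            value = VAR
        elif value.lower() in ("true", "false"):     # BL: boolean token
            value = VAR
        elif PATH_PATTERN.fullmatch(value):          # PS: path-like token
            value = VAR
        corrected.append(key + sep + value)
    return " ".join(corrected)


for raw in ["Input:  <*>", "euid=0", "cancel=false", "/lib/tmp started"]:
    print(repr(raw), "->", repr(correct_template(raw)))
```

Applied to the examples in the table, the sketch yields `Input: <*>`, `euid=<*>`, `cancel=<*>`, and `<*> started`.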
21. Observed Issue No. 3
Additional information about the incorrectly identified templates is not available
22. Guideline #3: Do Perform Incorrect Template Analysis
• To understand in which way templates are incorrectly identified, we propose an analysis of incorrect templates
• The basic idea is to consider template identification as the process of generalizing log messages
• For example, two messages "retry 1" and "retry 2" can be generalized as a template "retry <*>"
23. Guideline #3: Do Perform Incorrect Template Analysis
• We can classify incorrect templates into three types: Over-Generalized (OG), Under-Generalized (UG), and MiXed (MX)
• The classification also helps identify the weaknesses of a given template identification technique
Log Messages | Oracle Template | Identified Template | Type
Send message1; Send message2 | Send <*> | <*> <*> | OG
Send message1; Send message2 | Send <*> | Send message1; Send message2 | UG
Send message1; Send message2 | Send <*> | <*> message1; <*> message2 | MX
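One way to make the three types concrete is a token-wise comparison between the oracle template and the identified template. The Python sketch below is a simplified reading of the slide's example, not the authors' definition. It assumes both templates have the same number of whitespace-separated tokens and that `<*>` marks a variable; the function name is illustrative.

```python
VAR = "<*>"


def classify(oracle: str, identified: str) -> str:
    """Classify an identified template as CORRECT, OG, UG, or MX (sketch).

    Assumption: both templates tokenize to the same length.
    """
    o_tokens, i_tokens = oracle.split(), identified.split()
    if len(o_tokens) != len(i_tokens):
        raise ValueError("this sketch only handles templates of equal length")
    if o_tokens == i_tokens:
        return "CORRECT"

    # Over-generalized position: a fixed oracle token was turned into a variable.
    over = any(i == VAR and o != VAR for o, i in zip(o_tokens, i_tokens))
    # Under-generalized position: a variable oracle token was kept as fixed text.
    under = any(i != VAR and o == VAR for o, i in zip(o_tokens, i_tokens))

    if over and under:
        return "MX"
    if over:
        return "OG"
    if under:
        return "UG"
    raise ValueError("fixed tokens differ; templates are not comparable")


print(classify("Send <*>", "<*> <*>"))        # OG
print(classify("Send <*>", "Send message1"))  # UG
print(classify("Send <*>", "<*> message1"))   # MX
```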
25. Research Questions
• RQ1: How does the ranking of techniques vary when using different accuracy metrics?
• RQ2: What is the impact of oracle template correction on different accuracy metrics?
• RQ3: Can the analysis of incorrect templates provide any insight to improve template identification techniques?
26. RQ1: How does the ranking of techniques vary when using different accuracy metrics?
• The rankings of techniques vary a lot depending on the choice of accuracy metrics
• Nevertheless, Drain outperforms other techniques in general
• In terms of PA, PTA, and RTA metrics, all template identification techniques achieve a low accuracy score (less than 31%)
Rankings per metric (column pairs, left to right: GA, PA, PTA, RTA)
Technique Score Technique Score Technique Score Technique Score
Drain 0.87 SLCT 0.30 Drain 0.27 Drain 0.29
Spell 0.79 Drain 0.29 AEL 0.25 AEL 0.27
AEL 0.79 AEL 0.26 SLCT 0.22 LenMa 0.23
LenMa 0.77 LogMine 0.20 SHISO 0.16 LogCluster 0.22
IPLoM 0.76 LFA 0.20 LenMa 0.16 LogMine 0.20
LogMine 0.74 Logram 0.19 IPLoM 0.16 Spell 0.17
SHISO 0.68 IPLoM 0.19 LogMine 0.16 SHISO 0.17
LogCluster 0.65 Spell 0.18 Spell 0.15 SLCT 0.16
LFA 0.64 LenMa 0.18 LFA 0.14 IPLoM 0.15
SLCT 0.63 LogCluster 0.15 Logram 0.13 Logram 0.14
MoLFI 0.62 SHISO 0.13 LKE 0.12 LFA 0.14
LKE 0.56 LogSig 0.13 LogCluster 0.10 MoLFI 0.11
Logram 0.55 MoLFI 0.09 MoLFI 0.09 LKE 0.09
LogSig 0.53 LKE 0.08 LogSig 0.08 LogSig 0.07
27. RQ1: Implications
• It is essential to choose an accuracy metric that fits the use case, so Guideline #1 should be followed
• Empirical studies using the GA metric were too optimistic, because GA scores template identification (TI) techniques by message-level grouping
28. RQ2: What is the impact of oracle template correction (OTC) on different accuracy metrics?
• The rankings of the techniques vary when oracle template correction is applied
• On average, 28.5% of the oracle templates across all datasets were incorrect and were corrected using the correction rules
Rankings per metric (column pairs, left to right: GA, GA with OTC, PA, PA with OTC, PTA, PTA with OTC)
Technique Score Technique Score Technique Score Technique Score Technique Score Technique Score
Drain 0.87 Drain 0.86 SLCT 0.30 Drain 0.34 Drain 0.27 Drain 0.29
Spell 0.79 AEL 0.80 Drain 0.29 AEL 0.28 AEL 0.25 AEL 0.24
AEL 0.79 Spell 0.79 AEL 0.26 SLCT 0.27 SLCT 0.22 SLCT 0.19
30. RQ2: Implications
• The oracle template correction is important for properly ranking template identification techniques based on their accuracy
• Using oracle template correction is essential, considering the large percentage of incorrect oracle templates when they are manually identified
31. RQ3: Can the analysis of incorrect templates provide any insight to improve template identification techniques?
• Different techniques have different percentages of OG, UG, and MX templates
• Based on these percentages, the weaknesses of a technique can be identified
Technique OG(%) UG(%) MX(%)
AEL 21.7 38.2 15.3
Drain 19.4 32.6 20.6
IPLoM 4.6 10.5 69.0
LFA 52.7 16.2 17.5
LKE 0.1 32.8 55.4
LenMa 5.5 44.5 33.8
LogCluster 0.0 72.9 17.0
LogMine 6.8 36.0 41.4
LogSig 0.1 14.2 77.7
Logram 27.3 26.4 33.2
MoLFI 0.0 8.4 82.5
SHISO 6.4 44.4 32.8
SLCT 17.5 28.9 31.7
Spell 7.7 13.1 64.2
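To read a weakness off this table, one simple heuristic is to look at the largest of the three percentages for each technique. The Python sketch below does this with the values copied from the table above. Treating the largest share as the "dominant" weakness is an illustrative assumption, not a procedure stated in the talk.

```python
# (OG %, UG %, MX %) per technique, copied from the table above.
results = {
    "AEL": (21.7, 38.2, 15.3),
    "Drain": (19.4, 32.6, 20.6),
    "IPLoM": (4.6, 10.5, 69.0),
    "LFA": (52.7, 16.2, 17.5),
    "LKE": (0.1, 32.8, 55.4),
    "LenMa": (5.5, 44.5, 33.8),
    "LogCluster": (0.0, 72.9, 17.0),
    "LogMine": (6.8, 36.0, 41.4),
    "LogSig": (0.1, 14.2, 77.7),
    "Logram": (27.3, 26.4, 33.2),
    "MoLFI": (0.0, 8.4, 82.5),
    "SHISO": (6.4, 44.4, 32.8),
    "SLCT": (17.5, 28.9, 31.7),
    "Spell": (7.7, 13.1, 64.2),
}
TYPES = ("OG", "UG", "MX")


def dominant_type(percentages):
    """Return the incorrect-template type with the largest percentage."""
    return max(zip(TYPES, percentages), key=lambda pair: pair[1])[0]


dominant = {name: dominant_type(values) for name, values in results.items()}
for name, kind in sorted(dominant.items()):
    print(f"{name}: {kind}")
```

For instance, the heuristic flags LogCluster as mostly under-generalizing and LFA as mostly over-generalizing.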
32. RQ3: Implications
• A technique with a high UG percentage (e.g., LogCluster) can be improved by making it more aggressive in converting fixed parts into variable parts
Oracle Template Identified Template (LogCluster)
['<*>:<*>:Got exception while serving <*> to <*>:'] 10.251.70.211:50010:Got exception while serving blk_424255210146453297 to /10.251.203.179:
['<*>:<*> Served block <*> to <*>'] 10.250.14.224:50010 Served block blk_666713934549639791 to /10.250.14.224
['<*>:<*> Served block <*> to <*>'] 10.251.30.179:50010 Served block blk_-2975629975082443857 to /10.251.30.179
['<*>:<*>:Got exception while serving <*> to <*>:'] 10.251.126.22:50010:Got exception while serving blk_1686195200514944346 to /10.250.6.223:
['<*>:<*> Served block <*> to <*>'] 10.250.9.207:50010 Served block blk_4355450627202483068 to /10.250.9.207
['<*>:<*> Served block <*> to <*>'] 10.250.6.191:50010 Served block blk_5952254363678329024 to /10.250.6.191
['<*>:<*>:Got exception while serving <*> to <*>:'] 10.251.70.211:50010:Got exception while serving blk_-667933171485085225 to /10.251.203.246:
['Verification succeeded for <*>'] Verification succeeded for blk_9188832735514090334
34. Take Away Message
• Manually extracted oracle templates (without source code) could be incorrect, and thus have to be corrected
• The evaluation results emphasize the importance of using adequate metrics to assess TI techniques in the context of target use cases
• To better understand in which way templates are incorrectly identified, one can analyze incorrect templates
35. Zanis Ali Khan, Donghwan Shin, Domenico Bianculli, and Lionel Briand
Guidelines for Assessing the Accuracy of Log Message Template Identification Techniques