SlideShare ist ein Scribd-Unternehmen logo
1 von 18
Downloaden Sie, um offline zu lesen
Making Runtime Data Useful for Incident
Diagnosis: An Experience Report
Wolfsburg, November 28, 2018
QuASD 2018
Florian
Lautenschlager
Marcus
Ciolkowski
Measuring Technical Debt is not novel.
But: Existing Approaches ignore dynamic behavior.
2
Current software (quality) measurement is based on static metrics
plenty of tools exist (e.g. SonarQube)
rather simple to measure
much experience in measurement and presentation
Dynamic aspects / KPIs are underrepresented
test coverage is most common exception
dynamic indicators influence, e.g.
customer satisfaction
infrastructure and operation costs
tricky cases in maintenance hide often here (e.g., leaks and locks)
Gap: Measure and evaluate dynamic aspects
QAware 3
One example of dynamic aspects: Runtime structure
runtime structure: actual call relationships at runtime
runtime structure is important
to gain understanding of system at runtime
to identify performance problems
Challenge: often difficult to detect statically
abstraction and inheritance (many false positives)
code injection
reflection
soft links
EventMgr
Queue
add remove
Object event = queue.remove();
Method act = event.getMethod("act");
act.invoke(event);
Which class is called?
EventMgr
Queue
Focus today: How to gather and use runtime data for
incident diagnosis
QAware 4
What is incident diagnosis
Problem report or ticket
task: find & fix cause quickly
How is it done today
ask someone from the other team
(those who know how to get and interpret runtime data)
Idea: bring together runtime data in a unified, simple access
so everyone can easily use runtime data
and needs to ask only with more profound information
Azure
SQL
Databases
BLOB
Storage
NoSQL
Databases
HAProxy
http
http
Kubernetes
<Pod>
Admin
Gateway
<Pod>
API
Gateway
Voice Services
<Pod>
Servic
e
API
Admin
https
https
External Services
IDM
CDN
Skills
Weather
Radio
Device Services
<Pod>
Service
API
Admin
5
Project Setting: Cloud With Distributed Teams.
Azure
SQL
Databases
BLOB
Storage
NoSQL
Databases
HAProxy
http
http
Kubernetes
<Pod>
Admin
Gateway
<Pod>
API
Gateway
Voice Services
<Pod>
Servic
e
API
Admin
https
https
External Services
IDM
CDN
Skills
Weather
Radio
Device Services
<Pod>
Service
API
Admin
6
Incident Diagnosis: Identify a Problem
Across Teams and Components
?
?
?
?
?
?
7
Probes collect runtime data event-based or sampling-
based.
Event-based on state change
• Event for every observed change
• Bad for high frequency changes (high overhead)
Sampling-based on timer interval
• State changes might be lost
• Good for high frequency changes (overhead is controllable)
Probe
Probe
Runtime
Data_1
Runtime
Data_1
Runtime
Data_2
Runtime
Data_1
Runtime
Data_2
Runtime
Data_4
Runtime
Data_3
Runtime
Data_4
Software System
Interval Interval Interval
Event Event Event
State
Observe
Observe
represents
change
Interval
Event
8
Runtime data does not have a predefined structure.
In practice, there are three types of runtime data.
Metric: Numeric value
Textual: Structured log message
Trace/Span: A {method|service} call
Part of a (distributed) trace
Trace: Runtime data that belong
together and is ordered by time
incoming_request_count telekom
Name Attribute Value
Name Value Name Value
Span
Span
Span
Span
Span Span
Time
Trace
Smart Voice Hub
Exploration
Probes,
Collection
and
Storage
Storage,
Exploration
Transport
Storage,
Exploration
Probes and
Collection
Metric
sampling-based
Textual
event-based
Trace/Span
event-based
9
Problem: Runtime data is technically separated, but
needs to be connected.
QAware 10
Solution: All runtime data have to contain the
diagnosability context.
1. Detected: Alert on login failures for tenant smarthub
Metric: request_error{ tenant smarthub method
2. Analyze: Dig into login failures traces (spans) and logs (textual) for the tenant
Trace:
Logs: {“message”:”Connection Timeout”, “user”:”xb27”, “tenant”:”smarthub”}
Login
Validate Token Load User
Get Roles Get Rights
Tags:{tenant:”smarthub”, user:”xb27”}
Problem: Typical team member is not an expert in the toolchains.
Exploration
Probes,
Collection
and
Storage
Storage,
Exploration
Transport
Storage,
Exploration
Probes and
Collection
Smart Voice Hub
https://kibana.svh.de?traceId=d0227acffbd8671
https://zipkin.svh.de?traceId=d0227acffbd8671https://prometheus.svh.de?
Chatbot
Solution: Build a unified access to the toolchains!
11
12
The chatbot is our unified access layer.
13
Allows to interact with Smart Voice Hub
Response contains always the traceId
(part of the diagnosability context).
In case of an error:
Zipkin Trace: Request Trace
Kibana Logs: Request Application log
The chatbot is our unified access layer.
Contextual Logging:
Search for logs using parts of the diagnosability context.
14
Lessons Learnt
Dynamic Analysis for Everyone Not Just For Experts
16
We first had to convince people, but afterwards they loved the analysis tools.
Established and accepted toolchain is a building block
It pays off:
Better (more qualified) cross-team communication
Higher motivation understanding software behavior
Lower overhead for analysis experts
Increased awareness of runtime characteristics
First level support is interested
QAware 17
Determine actual runtime architecture
from traces / spans
and log files
Construct model of runtime architecture
for visualization
abstract functional / logical abstraction layers
for compliance checks
Towards resilience
improve dynamic indicators of technical debt, e.g.
resource consumption (e.g., time spent)
component health (predict&react early)
Next Steps: Derive Runtime Architecture
for Compliance Checks
QAware GmbH München
Aschauer Straße 32
81549 München
Tel.: +49 (0) 89 23 23 15 0
Fax: +49 (0) 89 23 23 15 129 github.com/qaware
linkedin.com/qaware slideshare.net/qaware
twitter.com/qaware xing.com/qaware
youtube.com/qawaregmbh
Marcus Ciolkowski
marcus.ciolkowski@qaware.de
@M_Ciolkowski

Weitere ähnliche Inhalte

Was ist angesagt?

Everything You wanted to Know About Distributed Tracing
Everything You wanted to Know About Distributed TracingEverything You wanted to Know About Distributed Tracing
Everything You wanted to Know About Distributed TracingAmuhinda Hungai
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsC4Media
 
Observability – the good, the bad, and the ugly
Observability – the good, the bad, and the uglyObservability – the good, the bad, and the ugly
Observability – the good, the bad, and the uglyTimetrix
 
Transfer Learning: Repurposing ML Algorithms from Different Domains to Cloud ...
Transfer Learning: Repurposing ML Algorithms from Different Domains to Cloud ...Transfer Learning: Repurposing ML Algorithms from Different Domains to Cloud ...
Transfer Learning: Repurposing ML Algorithms from Different Domains to Cloud ...Priyanka Aash
 
Safe Peak Technical Ppt W Product Publish
Safe Peak Technical Ppt W Product   PublishSafe Peak Technical Ppt W Product   Publish
Safe Peak Technical Ppt W Product Publishsqlserver.co.il
 
What Questions Do Programmers Ask About Configuration as Code?
What Questions Do Programmers Ask About Configuration as Code?What Questions Do Programmers Ask About Configuration as Code?
What Questions Do Programmers Ask About Configuration as Code?Akond Rahman
 
Vulnerability Detection Based on Git History
Vulnerability Detection Based on Git HistoryVulnerability Detection Based on Git History
Vulnerability Detection Based on Git HistoryKenta Yamamoto
 
DockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability WorkshopDockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability WorkshopKevin Crawley
 
A Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOpsA Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOpsBigPanda
 
POD-Diagnosis: Error Detection and Diagnosis of Sporadic Operations on Cloud ...
POD-Diagnosis: Error Detection and Diagnosis of Sporadic Operations on Cloud ...POD-Diagnosis: Error Detection and Diagnosis of Sporadic Operations on Cloud ...
POD-Diagnosis: Error Detection and Diagnosis of Sporadic Operations on Cloud ...Liming Zhu
 
Shhh!: Secret Management Practices for Infrastructure as Code
Shhh!: Secret Management Practices for Infrastructure as Code Shhh!: Secret Management Practices for Infrastructure as Code
Shhh!: Secret Management Practices for Infrastructure as Code Akond Rahman
 
DockerCon SF 2019 - TDD is Dead
DockerCon SF 2019 - TDD is DeadDockerCon SF 2019 - TDD is Dead
DockerCon SF 2019 - TDD is DeadKevin Crawley
 
Red Team vs. Blue Team on AWS
Red Team vs. Blue Team on AWSRed Team vs. Blue Team on AWS
Red Team vs. Blue Team on AWSPriyanka Aash
 
OWASP AppSensor: Detecting Attacks in your Application
OWASP AppSensor:  Detecting Attacks in your ApplicationOWASP AppSensor:  Detecting Attacks in your Application
OWASP AppSensor: Detecting Attacks in your ApplicationQAware GmbH
 
Characteristics of Defective Infrastructure as Code Scripts in Continuous Dep...
Characteristics of Defective Infrastructure as Code Scripts in Continuous Dep...Characteristics of Defective Infrastructure as Code Scripts in Continuous Dep...
Characteristics of Defective Infrastructure as Code Scripts in Continuous Dep...Akond Rahman
 
Cloud security : Automate or die
Cloud security : Automate or dieCloud security : Automate or die
Cloud security : Automate or diePriyanka Aash
 
Log management principle and usage
Log management principle and usageLog management principle and usage
Log management principle and usageBikrant Gautam
 
Cloud Native Security: New Approach for a New Reality
Cloud Native Security: New Approach for a New RealityCloud Native Security: New Approach for a New Reality
Cloud Native Security: New Approach for a New RealityCarlos Andrés García
 
THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...
THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...
THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...DevOpsDays Tel Aviv
 

Was ist angesagt? (20)

Everything You wanted to Know About Distributed Tracing
Everything You wanted to Know About Distributed TracingEverything You wanted to Know About Distributed Tracing
Everything You wanted to Know About Distributed Tracing
 
Architectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep SystemsArchitectures That Scale Deep - Regaining Control in Deep Systems
Architectures That Scale Deep - Regaining Control in Deep Systems
 
Observability – the good, the bad, and the ugly
Observability – the good, the bad, and the uglyObservability – the good, the bad, and the ugly
Observability – the good, the bad, and the ugly
 
Transfer Learning: Repurposing ML Algorithms from Different Domains to Cloud ...
Transfer Learning: Repurposing ML Algorithms from Different Domains to Cloud ...Transfer Learning: Repurposing ML Algorithms from Different Domains to Cloud ...
Transfer Learning: Repurposing ML Algorithms from Different Domains to Cloud ...
 
Safe Peak Technical Ppt W Product Publish
Safe Peak Technical Ppt W Product   PublishSafe Peak Technical Ppt W Product   Publish
Safe Peak Technical Ppt W Product Publish
 
What Questions Do Programmers Ask About Configuration as Code?
What Questions Do Programmers Ask About Configuration as Code?What Questions Do Programmers Ask About Configuration as Code?
What Questions Do Programmers Ask About Configuration as Code?
 
Vulnerability Detection Based on Git History
Vulnerability Detection Based on Git HistoryVulnerability Detection Based on Git History
Vulnerability Detection Based on Git History
 
DockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability WorkshopDockerCon SF 2019 - Observability Workshop
DockerCon SF 2019 - Observability Workshop
 
A Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOpsA Practical Guide to Anomaly Detection for DevOps
A Practical Guide to Anomaly Detection for DevOps
 
POD-Diagnosis: Error Detection and Diagnosis of Sporadic Operations on Cloud ...
POD-Diagnosis: Error Detection and Diagnosis of Sporadic Operations on Cloud ...POD-Diagnosis: Error Detection and Diagnosis of Sporadic Operations on Cloud ...
POD-Diagnosis: Error Detection and Diagnosis of Sporadic Operations on Cloud ...
 
Shhh!: Secret Management Practices for Infrastructure as Code
Shhh!: Secret Management Practices for Infrastructure as Code Shhh!: Secret Management Practices for Infrastructure as Code
Shhh!: Secret Management Practices for Infrastructure as Code
 
DockerCon SF 2019 - TDD is Dead
DockerCon SF 2019 - TDD is DeadDockerCon SF 2019 - TDD is Dead
DockerCon SF 2019 - TDD is Dead
 
Red Team vs. Blue Team on AWS
Red Team vs. Blue Team on AWSRed Team vs. Blue Team on AWS
Red Team vs. Blue Team on AWS
 
OWASP AppSensor: Detecting Attacks in your Application
OWASP AppSensor:  Detecting Attacks in your ApplicationOWASP AppSensor:  Detecting Attacks in your Application
OWASP AppSensor: Detecting Attacks in your Application
 
Characteristics of Defective Infrastructure as Code Scripts in Continuous Dep...
Characteristics of Defective Infrastructure as Code Scripts in Continuous Dep...Characteristics of Defective Infrastructure as Code Scripts in Continuous Dep...
Characteristics of Defective Infrastructure as Code Scripts in Continuous Dep...
 
Observability
ObservabilityObservability
Observability
 
Cloud security : Automate or die
Cloud security : Automate or dieCloud security : Automate or die
Cloud security : Automate or die
 
Log management principle and usage
Log management principle and usageLog management principle and usage
Log management principle and usage
 
Cloud Native Security: New Approach for a New Reality
Cloud Native Security: New Approach for a New RealityCloud Native Security: New Approach for a New Reality
Cloud Native Security: New Approach for a New Reality
 
THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...
THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...
THE (IR)RATIONAL INCIDENT RESPONSE: HOW PSYCHOLOGICAL BIASES AFFECT INCIDENT ...
 

Ähnlich wie Making Runtime Data Useful for Incident Diagnosis: An Experience Report

Proactive ops for container orchestration environments
Proactive ops for container orchestration environmentsProactive ops for container orchestration environments
Proactive ops for container orchestration environmentsDocker, Inc.
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Sourceaspyker
 
Sukumar Nayak-Agile-DevOps-Cloud Management
Sukumar Nayak-Agile-DevOps-Cloud ManagementSukumar Nayak-Agile-DevOps-Cloud Management
Sukumar Nayak-Agile-DevOps-Cloud ManagementSukumar Nayak
 
Challenges in Assessing Technical Debt based on Dynamic Runtime Data
Challenges in Assessing Technical Debt based on Dynamic Runtime DataChallenges in Assessing Technical Debt based on Dynamic Runtime Data
Challenges in Assessing Technical Debt based on Dynamic Runtime DataQAware GmbH
 
Cloud-native application monitoring powered by Riverbed and Elasticsearch
Cloud-native application monitoring powered by Riverbed and ElasticsearchCloud-native application monitoring powered by Riverbed and Elasticsearch
Cloud-native application monitoring powered by Riverbed and ElasticsearchRichard Juknavorian
 
Monitoring as an entry point for collaboration
Monitoring as an entry point for collaborationMonitoring as an entry point for collaboration
Monitoring as an entry point for collaborationJulien Pivotto
 
Cerberus : Framework for Manual and Automated Testing (Web Application)
Cerberus : Framework for Manual and Automated Testing (Web Application)Cerberus : Framework for Manual and Automated Testing (Web Application)
Cerberus : Framework for Manual and Automated Testing (Web Application)CIVEL Benoit
 
Cerberus_Presentation1
Cerberus_Presentation1Cerberus_Presentation1
Cerberus_Presentation1CIVEL Benoit
 
Analysis of Database Issues using AHF and Machine Learning v2 - SOUG
Analysis of Database Issues using AHF and Machine Learning v2 -  SOUGAnalysis of Database Issues using AHF and Machine Learning v2 -  SOUG
Analysis of Database Issues using AHF and Machine Learning v2 - SOUGSandesh Rao
 
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
DEVNET-1140	InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...DEVNET-1140	InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...Cisco DevNet
 
2014-12-16 defense news - shutdown the hackers
2014-12-16  defense news - shutdown the hackers2014-12-16  defense news - shutdown the hackers
2014-12-16 defense news - shutdown the hackersShawn Wells
 
Software Analytics: Data Analytics for Software Engineering and Security
Software Analytics: Data Analytics for Software Engineering and SecuritySoftware Analytics: Data Analytics for Software Engineering and Security
Software Analytics: Data Analytics for Software Engineering and SecurityTao Xie
 
Muves3 Elastic Grid Java One2009 Final
Muves3 Elastic Grid Java One2009 FinalMuves3 Elastic Grid Java One2009 Final
Muves3 Elastic Grid Java One2009 FinalElastic Grid, LLC.
 
Natalie Godec - AirFlow and GCP: tomorrow's health service data platform
Natalie Godec - AirFlow and GCP: tomorrow's health service data platformNatalie Godec - AirFlow and GCP: tomorrow's health service data platform
Natalie Godec - AirFlow and GCP: tomorrow's health service data platformmatteo mazzeri
 
Combining Logs, Metrics, and Traces for Unified Observability
Combining Logs, Metrics, and Traces for Unified ObservabilityCombining Logs, Metrics, and Traces for Unified Observability
Combining Logs, Metrics, and Traces for Unified ObservabilityElasticsearch
 
Les logs, traces et indicateurs au service d'une observabilité unifiée
Les logs, traces et indicateurs au service d'une observabilité unifiéeLes logs, traces et indicateurs au service d'une observabilité unifiée
Les logs, traces et indicateurs au service d'une observabilité unifiéeElasticsearch
 
Analysis of Database Issues using AHF and Machine Learning v2 - AOUG2022
Analysis of Database Issues using AHF and Machine Learning v2 -  AOUG2022Analysis of Database Issues using AHF and Machine Learning v2 -  AOUG2022
Analysis of Database Issues using AHF and Machine Learning v2 - AOUG2022Sandesh Rao
 
Code Quality - Security
Code Quality - SecurityCode Quality - Security
Code Quality - Securitysedukull
 
Resistance is futile, resilience is crucial
Resistance is futile, resilience is crucialResistance is futile, resilience is crucial
Resistance is futile, resilience is crucialHristo Iliev
 

Ähnlich wie Making Runtime Data Useful for Incident Diagnosis: An Experience Report (20)

Proactive ops for container orchestration environments
Proactive ops for container orchestration environmentsProactive ops for container orchestration environments
Proactive ops for container orchestration environments
 
Netflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open SourceNetflix Cloud Architecture and Open Source
Netflix Cloud Architecture and Open Source
 
Sukumar Nayak-Agile-DevOps-Cloud Management
Sukumar Nayak-Agile-DevOps-Cloud ManagementSukumar Nayak-Agile-DevOps-Cloud Management
Sukumar Nayak-Agile-DevOps-Cloud Management
 
Challenges in Assessing Technical Debt based on Dynamic Runtime Data
Challenges in Assessing Technical Debt based on Dynamic Runtime DataChallenges in Assessing Technical Debt based on Dynamic Runtime Data
Challenges in Assessing Technical Debt based on Dynamic Runtime Data
 
Cloud-native application monitoring powered by Riverbed and Elasticsearch
Cloud-native application monitoring powered by Riverbed and ElasticsearchCloud-native application monitoring powered by Riverbed and Elasticsearch
Cloud-native application monitoring powered by Riverbed and Elasticsearch
 
Monitoring as an entry point for collaboration
Monitoring as an entry point for collaborationMonitoring as an entry point for collaboration
Monitoring as an entry point for collaboration
 
Cerberus : Framework for Manual and Automated Testing (Web Application)
Cerberus : Framework for Manual and Automated Testing (Web Application)Cerberus : Framework for Manual and Automated Testing (Web Application)
Cerberus : Framework for Manual and Automated Testing (Web Application)
 
Cerberus_Presentation1
Cerberus_Presentation1Cerberus_Presentation1
Cerberus_Presentation1
 
Analysis of Database Issues using AHF and Machine Learning v2 - SOUG
Analysis of Database Issues using AHF and Machine Learning v2 -  SOUGAnalysis of Database Issues using AHF and Machine Learning v2 -  SOUG
Analysis of Database Issues using AHF and Machine Learning v2 - SOUG
 
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
DEVNET-1140	InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...DEVNET-1140	InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
 
Continuous Integration & the Release Maturity Model
Continuous Integration & the Release Maturity Model Continuous Integration & the Release Maturity Model
Continuous Integration & the Release Maturity Model
 
2014-12-16 defense news - shutdown the hackers
2014-12-16  defense news - shutdown the hackers2014-12-16  defense news - shutdown the hackers
2014-12-16 defense news - shutdown the hackers
 
Software Analytics: Data Analytics for Software Engineering and Security
Software Analytics: Data Analytics for Software Engineering and SecuritySoftware Analytics: Data Analytics for Software Engineering and Security
Software Analytics: Data Analytics for Software Engineering and Security
 
Muves3 Elastic Grid Java One2009 Final
Muves3 Elastic Grid Java One2009 FinalMuves3 Elastic Grid Java One2009 Final
Muves3 Elastic Grid Java One2009 Final
 
Natalie Godec - AirFlow and GCP: tomorrow's health service data platform
Natalie Godec - AirFlow and GCP: tomorrow's health service data platformNatalie Godec - AirFlow and GCP: tomorrow's health service data platform
Natalie Godec - AirFlow and GCP: tomorrow's health service data platform
 
Combining Logs, Metrics, and Traces for Unified Observability
Combining Logs, Metrics, and Traces for Unified ObservabilityCombining Logs, Metrics, and Traces for Unified Observability
Combining Logs, Metrics, and Traces for Unified Observability
 
Les logs, traces et indicateurs au service d'une observabilité unifiée
Les logs, traces et indicateurs au service d'une observabilité unifiéeLes logs, traces et indicateurs au service d'une observabilité unifiée
Les logs, traces et indicateurs au service d'une observabilité unifiée
 
Analysis of Database Issues using AHF and Machine Learning v2 - AOUG2022
Analysis of Database Issues using AHF and Machine Learning v2 -  AOUG2022Analysis of Database Issues using AHF and Machine Learning v2 -  AOUG2022
Analysis of Database Issues using AHF and Machine Learning v2 - AOUG2022
 
Code Quality - Security
Code Quality - SecurityCode Quality - Security
Code Quality - Security
 
Resistance is futile, resilience is crucial
Resistance is futile, resilience is crucialResistance is futile, resilience is crucial
Resistance is futile, resilience is crucial
 

Mehr von QAware GmbH

50 Shades of K8s Autoscaling #JavaLand24.pdf
50 Shades of K8s Autoscaling #JavaLand24.pdf50 Shades of K8s Autoscaling #JavaLand24.pdf
50 Shades of K8s Autoscaling #JavaLand24.pdfQAware GmbH
 
Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...
Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...
Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...QAware GmbH
 
Fully-managed Cloud-native Databases: The path to indefinite scale @ CNN Mainz
Fully-managed Cloud-native Databases: The path to indefinite scale @ CNN MainzFully-managed Cloud-native Databases: The path to indefinite scale @ CNN Mainz
Fully-managed Cloud-native Databases: The path to indefinite scale @ CNN MainzQAware GmbH
 
Down the Ivory Tower towards Agile Architecture
Down the Ivory Tower towards Agile ArchitectureDown the Ivory Tower towards Agile Architecture
Down the Ivory Tower towards Agile ArchitectureQAware GmbH
 
"Mixed" Scrum-Teams – Die richtige Mischung macht's!
"Mixed" Scrum-Teams – Die richtige Mischung macht's!"Mixed" Scrum-Teams – Die richtige Mischung macht's!
"Mixed" Scrum-Teams – Die richtige Mischung macht's!QAware GmbH
 
Make Developers Fly: Principles for Platform Engineering
Make Developers Fly: Principles for Platform EngineeringMake Developers Fly: Principles for Platform Engineering
Make Developers Fly: Principles for Platform EngineeringQAware GmbH
 
Der Tod der Testpyramide? – Frontend-Testing mit Playwright
Der Tod der Testpyramide? – Frontend-Testing mit PlaywrightDer Tod der Testpyramide? – Frontend-Testing mit Playwright
Der Tod der Testpyramide? – Frontend-Testing mit PlaywrightQAware GmbH
 
Was kommt nach den SPAs
Was kommt nach den SPAsWas kommt nach den SPAs
Was kommt nach den SPAsQAware GmbH
 
Cloud Migration mit KI: der Turbo
Cloud Migration mit KI: der Turbo Cloud Migration mit KI: der Turbo
Cloud Migration mit KI: der Turbo QAware GmbH
 
Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
 Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See... Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...QAware GmbH
 
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster QAware GmbH
 
Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.
Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.
Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.QAware GmbH
 
Kubernetes with Cilium in AWS - Experience Report!
Kubernetes with Cilium in AWS - Experience Report!Kubernetes with Cilium in AWS - Experience Report!
Kubernetes with Cilium in AWS - Experience Report!QAware GmbH
 
50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling50 Shades of K8s Autoscaling
50 Shades of K8s AutoscalingQAware GmbH
 
Kontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAP
Kontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAPKontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAP
Kontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAPQAware GmbH
 
Service Mesh Pain & Gain. Experiences from a client project.
Service Mesh Pain & Gain. Experiences from a client project.Service Mesh Pain & Gain. Experiences from a client project.
Service Mesh Pain & Gain. Experiences from a client project.QAware GmbH
 
50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling50 Shades of K8s Autoscaling
50 Shades of K8s AutoscalingQAware GmbH
 
Blue turns green! Approaches and technologies for sustainable K8s clusters.
Blue turns green! Approaches and technologies for sustainable K8s clusters.Blue turns green! Approaches and technologies for sustainable K8s clusters.
Blue turns green! Approaches and technologies for sustainable K8s clusters.QAware GmbH
 
Per Anhalter zu Cloud Nativen API Gateways
Per Anhalter zu Cloud Nativen API GatewaysPer Anhalter zu Cloud Nativen API Gateways
Per Anhalter zu Cloud Nativen API GatewaysQAware GmbH
 
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster QAware GmbH
 

Mehr von QAware GmbH (20)

50 Shades of K8s Autoscaling #JavaLand24.pdf
50 Shades of K8s Autoscaling #JavaLand24.pdf50 Shades of K8s Autoscaling #JavaLand24.pdf
50 Shades of K8s Autoscaling #JavaLand24.pdf
 
Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...
Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...
Make Agile Great - PM-Erfahrungen aus zwei virtuellen internationalen SAFe-Pr...
 
Fully-managed Cloud-native Databases: The path to indefinite scale @ CNN Mainz
Fully-managed Cloud-native Databases: The path to indefinite scale @ CNN MainzFully-managed Cloud-native Databases: The path to indefinite scale @ CNN Mainz
Fully-managed Cloud-native Databases: The path to indefinite scale @ CNN Mainz
 
Down the Ivory Tower towards Agile Architecture
Down the Ivory Tower towards Agile ArchitectureDown the Ivory Tower towards Agile Architecture
Down the Ivory Tower towards Agile Architecture
 
"Mixed" Scrum-Teams – Die richtige Mischung macht's!
"Mixed" Scrum-Teams – Die richtige Mischung macht's!"Mixed" Scrum-Teams – Die richtige Mischung macht's!
"Mixed" Scrum-Teams – Die richtige Mischung macht's!
 
Make Developers Fly: Principles for Platform Engineering
Make Developers Fly: Principles for Platform EngineeringMake Developers Fly: Principles for Platform Engineering
Make Developers Fly: Principles for Platform Engineering
 
Der Tod der Testpyramide? – Frontend-Testing mit Playwright
Der Tod der Testpyramide? – Frontend-Testing mit PlaywrightDer Tod der Testpyramide? – Frontend-Testing mit Playwright
Der Tod der Testpyramide? – Frontend-Testing mit Playwright
 
Was kommt nach den SPAs
Was kommt nach den SPAsWas kommt nach den SPAs
Was kommt nach den SPAs
 
Cloud Migration mit KI: der Turbo
Cloud Migration mit KI: der Turbo Cloud Migration mit KI: der Turbo
Cloud Migration mit KI: der Turbo
 
Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
 Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See... Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
Migration von stark regulierten Anwendungen in die Cloud: Dem Teufel die See...
 
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
 
Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.
Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.
Endlich gute API Tests. Boldly Testing APIs Where No One Has Tested Before.
 
Kubernetes with Cilium in AWS - Experience Report!
Kubernetes with Cilium in AWS - Experience Report!Kubernetes with Cilium in AWS - Experience Report!
Kubernetes with Cilium in AWS - Experience Report!
 
50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling
 
Kontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAP
Kontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAPKontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAP
Kontinuierliche Sicherheitstests für APIs mit Testkube und OWASP ZAP
 
Service Mesh Pain & Gain. Experiences from a client project.
Service Mesh Pain & Gain. Experiences from a client project.Service Mesh Pain & Gain. Experiences from a client project.
Service Mesh Pain & Gain. Experiences from a client project.
 
50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling50 Shades of K8s Autoscaling
50 Shades of K8s Autoscaling
 
Blue turns green! Approaches and technologies for sustainable K8s clusters.
Blue turns green! Approaches and technologies for sustainable K8s clusters.Blue turns green! Approaches and technologies for sustainable K8s clusters.
Blue turns green! Approaches and technologies for sustainable K8s clusters.
 
Per Anhalter zu Cloud Nativen API Gateways
Per Anhalter zu Cloud Nativen API GatewaysPer Anhalter zu Cloud Nativen API Gateways
Per Anhalter zu Cloud Nativen API Gateways
 
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
Aus blau wird grün! Ansätze und Technologien für nachhaltige Kubernetes-Cluster
 

Kürzlich hochgeladen

Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...kalichargn70th171
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfVishalKumarJha10
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Steffen Staab
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️Delhi Call girls
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 

Kürzlich hochgeladen (20)

Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 

Making Runtime Data Useful for Incident Diagnosis: An Experience Report

  • 1. Making Runtime Data Useful for Incident Diagnosis: An Experience Report Wolfsburg, November 28, 2018 QuASD 2018 Florian Lautenschlager Marcus Ciolkowski
  • 2. Measuring Technical Debt is not novel. But: Existing Approaches ignore dynamic behavior. 2 Current software (quality) measurement is based on static metrics plenty of tools exist (e.g. SonarQube) rather simple to measure much experience in measurement and presentation Dynamic aspects / KPIs are underrepresented test coverage is most common exception dynamic indicators influence, e.g. customer satisfaction infrastructure and operation costs tricky cases in maintenance hide often here (e.g., leaks and locks)
  • 3. Gap: Measure and evaluate dynamic aspects QAware 3 One example of dynamic aspects: Runtime structure runtime structure: actual call relationships at runtime runtime structure is important to gain understanding of system at runtime to identify performance problems Challenge: often difficult to detect statically abstraction and inheritance (many false positives) code injection reflection soft links EventMgr Queue add remove Object event = queue.remove(); Method act = event.getMethod("act"); act.invoke(event); Which class is called? EventMgr Queue
  • 4. Focus today: How to gather and use runtime data for incident diagnosis QAware 4 What is incident diagnosis Problem report or ticket task: find & fix cause quickly How is it done today ask someone from the other team (those who know how to get and interpret runtime data) Idea: bring together runtime data in a unified, simple access so everyone can easily use runtime data and needs to ask only with more profound information
  • 7. 7 Probes collect runtime data event-based or sampling- based. Event-based on state change • Event for every observed change • Bad for high frequency changes (high overhead) Sampling-based on timer interval • State changes might be lost • Good for high frequency changes (overhead is controllable) Probe Probe Runtime Data_1 Runtime Data_1 Runtime Data_2 Runtime Data_1 Runtime Data_2 Runtime Data_4 Runtime Data_3 Runtime Data_4 Software System Interval Interval Interval Event Event Event State Observe Observe represents change Interval Event
  • 8. 8 Runtime data does not have a predefined structure. In practice, there are three types of runtime data. Metric: Numeric value Textual: Structured log message Trace/Span: A {method|service} call Part of a (distributed) trace Trace: Runtime data that belong together and is ordered by time incoming_request_count telekom Name Attribute Value Name Value Name Value Span Span Span Span Span Span Time Trace
  • 9. Smart Voice Hub Exploration Probes, Collection and Storage Storage, Exploration Transport Storage, Exploration Probes and Collection Metric sampling-based Textual event-based Trace/Span event-based 9 Problem: Runtime data is technically separated, but needs to be connected.
  • 10. QAware 10 Solution: All runtime data have to contain the diagnosability context. 1. Detected: Alert on login failures for tenant smarthub Metric: request_error{ tenant smarthub method 2. Analyze: Dig into login failures traces (spans) and logs (textual) for the tenant Trace: Logs: {“message”:”Connection Timeout”, “user”:”xb27”, “tenant”:”smarthub”} Login Validate Token Load User Get Roles Get Rights Tags:{tenant:”smarthub”, user:”xb27”}
  • 11. Problem: Typical team member is not an expert in the toolchains. Exploration Probes, Collection and Storage Storage, Exploration Transport Storage, Exploration Probes and Collection Smart Voice Hub https://kibana.svh.de?traceId=d0227acffbd8671 https://zipkin.svh.de?traceId=d0227acffbd8671https://prometheus.svh.de? Chatbot Solution: Build a unified access to the toolchains! 11
  • 12. 12 The chatbot is our unified access layer.
  • 13. 13 Allows to interact with Smart Voice Hub Response contains always the traceId (part of the diagnosability context). In case of an error: Zipkin Trace: Request Trace Kibana Logs: Request Application log The chatbot is our unified access layer.
  • 14. Contextual Logging: Search for logs using parts of the diagnosability context. 14
  • 16. Dynamic Analysis for Everyone Not Just For Experts 16 We first had to convince people, but afterwards they loved the analysis tools. Established and accepted toolchain is a building block It pays off: Better (more qualified) cross-team communication Higher motivation understanding software behavior Lower overhead for analysis experts Increased awareness of runtime characteristics First level support is interested
  • 17. QAware 17 Determine actual runtime architecture from traces / spans and log files Construct model of runtime architecture for visualization abstract functional / logical abstraction layers for compliance checks Towards resilience improve dynamic indicators of technical debt, e.g. resource consumption (e.g., time spent) component health (predict&react early) Next Steps: Derive Runtime Architecture for Compliance Checks
  • 18. QAware GmbH München Aschauer Straße 32 81549 München Tel.: +49 (0) 89 23 23 15 0 Fax: +49 (0) 89 23 23 15 129 github.com/qaware linkedin.com/qaware slideshare.net/qaware twitter.com/qaware xing.com/qaware youtube.com/qawaregmbh Marcus Ciolkowski marcus.ciolkowski@qaware.de @M_Ciolkowski