Of Pigs, Snakes & Paper Cuts
Performance Troubleshooting 101
Rainer Schuppe
AppDynamics GmbH
about me
• Customer Support
• System Support / Ops
• Consultant / Dev
• Solution Architect
• Sales Engineer
Oh no! Not again!

or: Why care about performance

Where to start? What to do? Who to blame?
Tooling
Symptoms
Diagnose
How Many Users Abandon Your Slow Website After 3 Seconds?*
These Leave And Find Your Competitor
43 %
57 %
These Stay And Suffer Through A Poor Experience
*PhoCusWright and Akamai study
And What About 18-24 Year Olds After Only 2 Seconds Of Waiting?*
The Future Of Your Business Just Left And Found Your Competition
35 %
65 %
*PhoCusWright and Akamai study
Complexity increases
[Diagram: business transactions (Login, Search Flight, View Flight Status, Make Reservation) running across many components, each on its own release cycle: Browser Logic / AJAX / Web Frameworks; Tomcat, Weblogic, JBoss, .NET; ESB (Mule, Tibco, AG); MQ; Memcached, Coherence; Oracle, SQL Server; Hadoop, Cassandra, MongoDB; ATG, Vignette, Sharepoint; VMWare, Amazon EC2, Windows Azure. Labelled trends: WEB 2.0, AGILE, SOA, CLOUD, BIG DATA.]
Generic Troubleshooting Process
Alert / Detection -> Triage -> Diagnosis -> Rootcause Detection -> Solution Finding -> Fix -> Move on with life
Data / Information feeds the steps of the process
Triage
• Determine who needs to fix it
• Starts with overview and comparison to “normal” performance
• First level task (Operators)
• First indication of problem type
• Needs transactional data
Business Transactions can help
• 46,463 Checkouts processed
  ◦ 482 returned an error, 1325 were slow, 576 were very slow and 111 stalled
• 3,956 Payments processed
  ◦ 12 returned an error, 242 were slow, 96 were very slow and 79 stalled
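Counts like these come from comparing every transaction with a "normal" baseline. The following is only a minimal sketch of that idea, not the method of any particular tool: the class name, the threshold factors and the stall cut-off are all assumptions chosen for illustration.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical triage helper: buckets transactions against a "normal" baseline. */
class TriageSketch {
    enum Health { NORMAL, SLOW, VERY_SLOW, STALLED, ERROR }

    // Assumed thresholds, expressed as multiples of the baseline response time.
    static final double SLOW_FACTOR = 2.0;
    static final double VERY_SLOW_FACTOR = 4.0;
    // Assumed cut-off after which a transaction counts as stalled.
    static final long STALL_MILLIS = 45_000;

    static Health classify(long responseMillis, boolean failed, double baselineMillis) {
        if (failed) return Health.ERROR;
        if (responseMillis >= STALL_MILLIS) return Health.STALLED;
        if (responseMillis > baselineMillis * VERY_SLOW_FACTOR) return Health.VERY_SLOW;
        if (responseMillis > baselineMillis * SLOW_FACTOR) return Health.SLOW;
        return Health.NORMAL;
    }

    /** Counts how many transactions fall into each bucket. */
    static Map<Health, Integer> summarize(long[] responseMillis, boolean[] failed,
                                          double baselineMillis) {
        Map<Health, Integer> counts = new EnumMap<>(Health.class);
        for (Health h : Health.values()) counts.put(h, 0);
        for (int i = 0; i < responseMillis.length; i++) {
            Health h = classify(responseMillis[i], failed[i], baselineMillis);
            counts.put(h, counts.get(h) + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] times = {120, 250, 900, 60_000, 130};
        boolean[] failed = {false, false, false, false, true};
        System.out.println(summarize(times, failed, 100.0));
    }
}
```

A summary per business transaction (Checkout, Payment) is what lets first-level operators see which transaction deviates from normal and who needs to look at it.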
[Diagram: the application landscape from the “Complexity increases” slide, annotated with per-component response times between 1 ms and 310 ms.]
[Diagram: the same application landscape with a “Problem” marker on it.]
Diagnose
• Determine the root of the problem
• Uses first level information to narrow scope
• Needs specialists
• Needs lots of data / information, both real-time and historical
• Usually needs iterations
• More than 1 tool used in the process
Root cause detection
• Confirm the root cause after you have diagnosed it
• Document it
• Recreate it in test if possible
• Needs the same data as diagnostics
Solution finding
• Find a solution for the problem
• Architect a workaround or a fix
• Again needs the diagnostic data
• Run some tests with different options and check them in real time
• Confirm the idea for the fix
• May be a different team than the troubleshooters
How to get the data?
• Intuition
• Experience
• Tools
• Logfiles
• Communication
Tooling

© val-j - sxc.hu
3 Key Things Impact Performance & Availability
• Concurrency
• Data Volume
• Resource
Why do things crash and slow down?
• Development: Concurrency, Data Volume, Resource
• QA/Test: Concurrency, Data Volume, Resource
• Production: Concurrency, Data Volume, Resource
Technologies
• Logging
• ARM
• Bytecode Instrumentation / Aspects
• Sampling
• JMX (Java Management Extensions)
• PMI (IBM WebSphere specific)
Environments: Dev, Test, Prod
Logfiles
Pros:
• Anything can be logged
• Easy to implement (if you have the source code)
Cons:
• Only what the developer thinks is needed
• I/O heavy
• No chance for change if you don't own the source code
• Lots of files - usually no TX context
• How to correlate in a distributed environment?
Environments: Dev, Test, Prod
Logfiles
[#|2013-04-16T16:04:44.319+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean|
_ThreadID=14;_ThreadName=pool-1-thread-9;|Starting to initialize the Top Summary Stats Data Store timer|#]
[#|2013-04-16T16:04:44.335+0200|INFO|sun-appserver2.1|com.appdynamics.TOP.SUMMARY.STATS.WRITE|
_ThreadID=14;_ThreadName=pool-1-thread-9;|START TIME for timer service(TopSummaryStatsWriterTimerTaskBean) will be: Tue
Apr 16 16:05:00 CEST 2013|#]
[#|2013-04-16T16:04:44.338+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean|
_ThreadID=14;_ThreadName=pool-1-thread-9;|Successfully initialized the Top Summary Stats Data Store timer|#]
[#|2013-04-16T16:04:44.338+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean|
_ThreadID=14;_ThreadName=pool-1-thread-9;|Starting to initialize the Top Summary Stats Data Purger timer|#]
[#|2013-04-16T16:04:44.369+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean|
_ThreadID=14;_ThreadName=pool-1-thread-9;|Successfully initialized the Top Summary Stats Data Purger timer|#]
[#|2013-04-16T16:04:44.369+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean|
_ThreadID=14;_ThreadName=pool-1-thread-9;|Starting to initialize the Top Summary Stats Detail String cache timer|#]
[#|2013-04-16T16:04:44.376+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean|
_ThreadID=14;_ThreadName=pool-1-thread-9;|Successfully initialized the Top Summary Stats Detail String cache timer|#]
[#|2013-04-16T16:04:44.376+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean|
_ThreadID=14;_ThreadName=pool-1-thread-9;|Starting to initialize the Top Summary Stats rollup timer|#]
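To correlate entries like these at all, each record first has to be split into fields. The sketch below does that for one record; the pipe-delimited layout is inferred from the excerpt itself, and the class and field names are hypothetical. Note that the thread name is the only context it recovers, which illustrates the "no TX context" point: nothing in the record ties it to a business transaction.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: parses one pipe-delimited record as seen in the excerpt above. */
class LogRecordSketch {
    // Layout inferred from the excerpt:
    // [#|timestamp|level|product|logger|key=value;...|message|#]
    private static final Pattern RECORD = Pattern.compile(
            "\\[#\\|([^|]*)\\|([^|]*)\\|([^|]*)\\|([^|]*)\\|([^|]*)\\|(.*)\\|#\\]",
            Pattern.DOTALL);
    private static final Pattern THREAD_NAME = Pattern.compile("_ThreadName=([^;]*);");

    final String timestamp;
    final String level;
    final String logger;
    final String threadName;
    final String message;

    private LogRecordSketch(String timestamp, String level, String logger,
                            String threadName, String message) {
        this.timestamp = timestamp;
        this.level = level;
        this.logger = logger;
        this.threadName = threadName;
        this.message = message;
    }

    /** Returns the parsed record, or null if the text does not match the assumed layout. */
    static LogRecordSketch parse(String raw) {
        Matcher m = RECORD.matcher(raw.trim());
        if (!m.matches()) return null;
        Matcher t = THREAD_NAME.matcher(m.group(5));
        String thread = t.find() ? t.group(1) : "";
        return new LogRecordSketch(m.group(1), m.group(2), m.group(4), thread, m.group(6));
    }

    public static void main(String[] args) {
        String raw = "[#|2013-04-16T16:04:44.319+0200|INFO|sun-appserver2.1|"
                + "com.singularity.ee.controller.beans.ControllerManagerBean|"
                + "_ThreadID=14;_ThreadName=pool-1-thread-9;|"
                + "Starting to initialize the Top Summary Stats Data Store timer|#]";
        LogRecordSketch r = parse(raw);
        System.out.println(r.timestamp + " " + r.threadName + " " + r.message);
    }
}
```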
Profiler
Pros:
• No config needed
• Lots of data - lots of detail
Cons:
• Lots of data - not suitable for production
• Needs experience
• No transactional concept / context
Environments: Dev, Test
Profiler
JMX (and similar)
Pros:
• Built into most application servers
• JConsole is part of the JDK
• Easy to implement MBeans
Cons:
• No transaction context
• Not available for 3rd party
• No historical data
• Usually one JVM only
Environments: Dev, Test, Prod
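As a rough illustration of what JMX offers out of the box, the sketch below reads a few values from the platform MXBeans in `java.lang.management`. The class and method names are hypothetical. It also shows the limitations listed above: the values describe the current JVM only, at this instant, with no link to any transaction.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Sketch: a point-in-time snapshot of this JVM via the platform MXBeans. */
class JmxSnapshotSketch {
    /** Bytes of heap currently in use. */
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Number of live threads, daemon and non-daemon. */
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    /** Accumulated collection time over all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) total += millis;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("heap used (bytes): " + heapUsedBytes());
        System.out.println("live threads:      " + liveThreadCount());
        System.out.println("GC time (ms):      " + totalGcMillis());
    }
}
```

Keeping history means sampling these values on a schedule and storing them somewhere, which is work JMX itself does not do.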
JMX (and similar)
APM tools (free)
Pros:
• They are free
• Transaction context (most of them)
• Quick setup (the commercial ones)
Cons:
• Usually functionally constrained (commercial)
• Hard to configure (open source)
• Usually no history
Environments: Dev, Test, Prod
APM tools (commercial)
Pros:
• Transactions, historical data
• Distributed monitoring
• Deep dive diagnostics
• Production fit
Cons:
• Costly
• Choose the right one
Environments: Dev, Test, Prod
Link Tip
http://java.dzone.com/articles/java-performance-troubleshooti-0
Diagnosis
There are just 2 sorts of issues
codecentric AG
© NLTeddy - sxc.hu
© ross666 - sxc.hu
50 shades of slow (approx.)
• Constantly slow (Turtle)
• Slowly, but constantly slower
• Exponentially slower
• Suddenly slower
• Sporadically slow
• Spontaneous crash
The wonderful world of errors
• Sudden outage
• Always erroneous
• Sporadic error messages
• Silent death / Bleed to death
• Increasing error rates
• Wrong / meaningless error messages
Diagnosis – Rough Flow
• Look at symptoms
• Eliminate definite non-causes
• Prioritize the suspicions
• Confirm suspicion / Eliminate suspicion
  ◦ Compare with “normal”
  ◦ Gather more information
  ◦ Define root cause and confirm it
  ◦ Redo from start
Possible Causes
(in no particular order)
• Bad Coding
• Too much load
• Backend not reachable / slow
• Conflicting resources
• Memory Leak
• Resource Leak
• Network / Hardware Problem
Possible Symptoms
• Consistent slowness
• Slower and slower against some variable
  ◦ Time / Load
• Sporadic hangs / random errors
• Foreseeable lockups
• “Sudden chaos”
• High utilization of resources (CPU, memory, network, etc.)
The Causes
Linear Memory Leak
Symptoms:
• OOM (Out of memory error)
• Slow over time with spikes
• Hockeystick graph
Causes:
• Objects added to linear structures without being removed (e.g., linked lists)
• Other API misuse (addListener() without corresponding removeListener(), etc.)
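The addListener() case can be sketched as follows. `EventBus`, `Listener` and the two request handlers are hypothetical names, not taken from any real framework; the point is only that a long-lived object keeps every registered listener reachable until it is explicitly removed.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a listener leak and its fix; all types here are hypothetical. */
class ListenerLeakSketch {
    interface Listener { void onEvent(String event); }

    /** Long-lived publisher: every registered listener stays reachable from here. */
    static class EventBus {
        private final List<Listener> listeners = new ArrayList<>();
        void addListener(Listener l) { listeners.add(l); }
        void removeListener(Listener l) { listeners.remove(l); }
        int listenerCount() { return listeners.size(); }
    }

    /** Leaky: registers one listener per request and never removes it. */
    static void handleRequestLeaky(EventBus bus) {
        final byte[] requestState = new byte[1024]; // kept alive by the listener
        Listener l = event -> requestState[0]++;
        bus.addListener(l);
        // ... request work ...
    }

    /** Fixed: the listener is removed on every exit path. */
    static void handleRequestFixed(EventBus bus) {
        final byte[] requestState = new byte[1024];
        Listener l = event -> requestState[0]++;
        bus.addListener(l);
        try {
            // ... request work ...
        } finally {
            bus.removeListener(l);
        }
    }

    public static void main(String[] args) {
        EventBus leaky = new EventBus();
        EventBus fixed = new EventBus();
        for (int i = 0; i < 1000; i++) {
            handleRequestLeaky(leaky);
            handleRequestFixed(fixed);
        }
        System.out.println("leaky bus holds " + leaky.listenerCount() + " listeners");
        System.out.println("fixed bus holds " + fixed.listenerCount() + " listeners");
    }
}
```

Each leaked listener is small, which is why this kind of leak grows linearly with the request count and shows up only after some time under load.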
Linear Memory Leak
Aggregate detection:
• Linear growth in heap utilization
• GC time growth
Specific detection:
• Figure out object types being leaked
• Verbose GC
• Find related APIs and search code for misuse
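"Linear growth in heap utilization" can be checked numerically once heap usage has been sampled, ideally right after each full GC so that garbage does not distort the trend. The sketch below fits a least-squares line through such samples. The class name, the sampling scheme and the threshold are assumptions; this is one simple way to quantify the trend, not how any particular monitoring tool does it.

```java
/** Sketch: estimates the growth trend of heap-after-GC samples. */
class HeapTrendSketch {
    /** Least-squares slope of y over the sample index 0..n-1 (units of y per sample). */
    static double slope(double[] y) {
        int n = y.length;
        if (n < 2) throw new IllegalArgumentException("need at least two samples");
        double meanX = (n - 1) / 2.0;
        double meanY = 0;
        for (double v : y) meanY += v;
        meanY /= n;
        double num = 0;
        double den = 0;
        for (int i = 0; i < n; i++) {
            num += (i - meanX) * (y[i] - meanY);
            den += (i - meanX) * (i - meanX);
        }
        return num / den;
    }

    /** Flags a suspected leak when the heap grows faster than the given MB per sample. */
    static boolean looksLikeLinearLeak(double[] heapAfterGcMb, double minGrowthMbPerSample) {
        return slope(heapAfterGcMb) > minGrowthMbPerSample;
    }

    public static void main(String[] args) {
        double[] growing = {100, 110, 120, 130};
        double[] stable = {100, 101, 99, 100};
        System.out.println("growing: " + looksLikeLinearLeak(growing, 1.0));
        System.out.println("stable:  " + looksLikeLinearLeak(stable, 1.0));
    }
}
```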
Linear Memory Leak
Challenges:
• References - many small objects are referenced in one collection
• Death by 1000 cuts (paper cuts)
Specific detection:
• Figure out object types being leaked
• Verbose GC
• Find related APIs and search code for misuse
Specific detection
• Heap Dump Comparison
  ◦ Needs at least 2 dumps
  ◦ Stops the JVM
  ◦ Can take several minutes each
  ◦ Creates tons of data
  ◦ Finds the object, not the code responsible for the leak
• Profiler
  ◦ High overhead - not for production
  ◦ Lots of data
• APM Solution
  ◦ Collection based algorithm – finds only collection leaks
  ◦ Instance counting
  ◦ Trade off between low overhead and usefulness of data
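The comparison step itself is simple once two snapshots exist. The sketch below assumes the per-class instance counts have already been extracted into maps (for example from two class histograms); the class name and the sample class names are made up. It ranks classes by how much their instance count grew, which answers "which object type is leaking" but, as noted above, not which code is responsible.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: compares instance counts per class between two snapshots. */
class HistogramDiffSketch {
    /** Returns classes whose instance count grew, largest growth first. */
    static List<Map.Entry<String, Long>> topGrowers(Map<String, Long> before,
                                                    Map<String, Long> after) {
        List<Map.Entry<String, Long>> growth = new ArrayList<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long old = before.getOrDefault(e.getKey(), 0L);
            long delta = e.getValue() - old;
            if (delta > 0) growth.add(new AbstractMap.SimpleEntry<>(e.getKey(), delta));
        }
        growth.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return growth;
    }

    public static void main(String[] args) {
        // Hypothetical counts taken some time apart under steady load.
        Map<String, Long> before = new HashMap<>();
        before.put("com.example.SessionData", 1_000L);
        before.put("java.lang.String", 50_000L);
        Map<String, Long> after = new HashMap<>();
        after.put("com.example.SessionData", 91_000L);
        after.put("java.lang.String", 52_000L);
        for (Map.Entry<String, Long> e : topGrowers(before, after)) {
            System.out.println(e.getKey() + " +" + e.getValue());
        }
    }
}
```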
Exponential Memory Leak
Causes:
• Objects added to most data structures without being removed (e.g., vectors, hashtables)
• Other API misuse (as Linear Leak)
Aggregate detection:
• Exponential growth in heap
Specific detection:
• Same as Linear Leak
Resource Leak
Causes:
• API misuse of Java objects with resource-style lifecycle (create->use->destroy)
Aggregate detection:
• Slow over time
• Growth in heap (if you’re lucky)
Specific detection:
• Audit code for API misuses
• Object instance tracking
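A typical create->use->destroy misuse is a destroy step that gets skipped when the use step throws. The sketch below contrasts that with try-with-resources; `Connection` is a hypothetical stand-in resource, and the `OPEN` counter exists only to make the leak visible, in the spirit of object instance tracking.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: a resource leak on the error path, and the try-with-resources fix. */
class ResourceLifecycleSketch {
    /** Number of currently open connections (instance tracking for the demo). */
    static final AtomicInteger OPEN = new AtomicInteger();

    /** Hypothetical resource with a create -> use -> destroy lifecycle. */
    static class Connection implements AutoCloseable {
        Connection() { OPEN.incrementAndGet(); }

        String query(String sql) {
            if (sql.isEmpty()) throw new IllegalArgumentException("empty query");
            return "result of " + sql;
        }

        @Override
        public void close() { OPEN.decrementAndGet(); }
    }

    /** Leaky: an exception in query() skips close(). */
    static String leaky(String sql) {
        Connection c = new Connection();
        String result = c.query(sql);
        c.close();
        return result;
    }

    /** Safe: try-with-resources closes the connection on every path. */
    static String safe(String sql) {
        try (Connection c = new Connection()) {
            return c.query(sql);
        }
    }

    public static void main(String[] args) {
        try { safe(""); } catch (IllegalArgumentException expected) { }
        System.out.println("open after safe():  " + OPEN.get());
        try { leaky(""); } catch (IllegalArgumentException expected) { }
        System.out.println("open after leaky(): " + OPEN.get());
    }
}
```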
Resource conflict / blocking
Causes:
• Overcautious data integrity strategy
• “Synchronising is always good”
Aggregate detection:
• Stalled threads
• High thread usage - low CPU usage
Specific detection:
• Thread dumps as needed
• Stack traces / graphs
• CPU block / wait timing measurement
Resource conflict (block / wait)
Production ground to a halt for 2 hours. And again the next day.
[Chart series: Trx/min, Avg RT, Pool Limit, Pool Usage, Trx Stalls]
Bad Coding: Infinite Loop
Causes:
• Infinite loop in code
Aggregate detection:
• Stalled threads
• Permanently high usage of CPU / threads
Specific detection:
• Thread dumps as needed
• Stack traces / graphs
Bad Coding: CPU-Bound Component
Causes:
• Idiot with a “Learn Java in 24 Hours” book
Aggregate Detection:
• Response time measurement
• Aggregate CPU utilization
Specific Detection:
• Detailed CPU utilization
Typical Cure:
• Cache of data or of performed calculations
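The "cache of performed calculations" cure can be as small as the sketch below. `expensive()` is a hypothetical stand-in for the CPU-bound work, and the `COMPUTATIONS` counter is only there to show how often it really runs. A production cache would also need a size bound or eviction policy, otherwise the cure turns into a memory leak.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: memoizing a CPU-bound calculation. */
class CalculationCacheSketch {
    /** Counts how often the expensive calculation actually ran. */
    static final AtomicInteger COMPUTATIONS = new AtomicInteger();
    private static final Map<Integer, Long> CACHE = new ConcurrentHashMap<>();

    /** Stand-in for an expensive calculation: sum of squares 1..n (hypothetical). */
    static long expensive(int n) {
        COMPUTATIONS.incrementAndGet();
        long sum = 0;
        for (int i = 1; i <= n; i++) sum += (long) i * i;
        return sum;
    }

    /** Computes each distinct input once and serves repeats from the cache. */
    static long cached(int n) {
        return CACHE.computeIfAbsent(n, CalculationCacheSketch::expensive);
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) cached(50_000);
        System.out.println("calls: 1000, computations: " + COMPUTATIONS.get());
    }
}
```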
Layer-itis
Causes:
• Poorly implemented data bridge layer, or simply too many of them
• DB -> XML -> XSLT -> More XML -> “Custom Data Management Layer” -> Consumer
Aggregate Detection:
• Response time measurements
Specific Detection:
• Call graphs - Call trace (stack trace not enough)
• Ask for a design or architecture document
O/R Mapper misuse
Causes:
• “Hibernate fixes everything”
• Massive SQL statements (length and number)
• Wrong data strategy
Aggregate Detection:
• Response time measurements
• DB time measurements
Specific Detection:
• Call stacks / snapshots
• Caching issues
The Unending Retry
Causes:
• Continual attempts to call backend + unavailable backend
Aggregate Detection / Specific Detection:
• Response time measurement
• Backend detection - measurement (time & # of calls)
• Stalled TX count
• Exceptions
• Busy thread count
• Don’t forget about thrown exceptions
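The opposite of an unending retry is a bounded one. The sketch below gives up after a fixed number of attempts, backs off between them, and rethrows the last exception so that it stays visible as diagnostic data. The class name, the attempt limit and the doubling backoff are illustrative assumptions.

```java
import java.util.concurrent.Callable;

/** Sketch: a bounded retry with exponential backoff. */
class BoundedRetrySketch {
    /**
     * Calls the backend at most maxAttempts times, doubling the delay after
     * each failure, and rethrows the last exception if every attempt fails.
     */
    static <T> T callWithRetry(Callable<T> backendCall, int maxAttempts,
                               long initialDelayMillis) throws Exception {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return backendCall.call();
            } catch (Exception e) {
                last = e; // keep the thrown exception, it is diagnostic data
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2; // back off instead of hammering the backend
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) throw new java.io.IOException("backend unavailable");
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```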
Threading: Deadlock / Livelock
Causes:
• Fundamental error in threading / lock acquisition strategy
Aggregate Detection:
• Stalled threads / permanently high concurrent usage
Specific Detection:
• Deadlock detection in JVM
• Thread dumps
• Busy thread count
Threading: Deadlock
Found one Java-level deadlock:
=============================
"Thread-2":
  waiting to lock monitor 102054308 (object 7f3113800, a java.lang.Object),
  which is held by "Thread-1"
"Thread-1":
  waiting to lock monitor 1020348b8 (object 7f3113810, a java.lang.Object),
  which is held by "Thread-2"
 
Java stack information for the threads listed above:
===================================================
"Thread-2":
    at DeadlockTest$2.run(DeadlockTest.java:42)
    - waiting to lock <7f3113800> (a java.lang.Object)
    - locked <7f3113810> (a java.lang.Object)
    at java.lang.Thread.run(Thread.java:680)
"Thread-1":
    at DeadlockTest$1.run(DeadlockTest.java:26)
    - waiting to lock <7f3113810> (a java.lang.Object)
    - locked <7f3113800> (a java.lang.Object)
    at java.lang.Thread.run(Thread.java:680)
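The same information can also be queried from inside the JVM. The sketch below uses `ThreadMXBean.findDeadlockedThreads()` from `java.lang.management`; the class and method names around it are hypothetical. In a healthy JVM the call returns null and the report is empty.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Sketch: asks the JVM for deadlocked threads programmatically. */
class DeadlockProbeSketch {
    /** Returns one line per deadlocked thread, or an empty list if none are found. */
    static List<String> describeDeadlocks() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when there is no deadlock
        List<String> report = new ArrayList<>();
        if (ids == null) return report;
        for (ThreadInfo info : threads.getThreadInfo(ids)) {
            if (info == null) continue; // thread ended in the meantime
            report.add(info.getThreadName() + " waits for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
        return report;
    }

    public static void main(String[] args) {
        List<String> report = describeDeadlocks();
        if (report.isEmpty()) {
            System.out.println("no deadlock detected");
        } else {
            for (String line : report) System.out.println(line);
        }
    }
}
```

The dump above shows the classic cause: Thread-1 and Thread-2 take the same two monitors in opposite order. Acquiring locks in one agreed order everywhere removes that cycle.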
 
Threading: Chokepoint
Causes:
• Many threads bottlenecked waiting for one lock
Aggregate Detection:
• Stalled threads / high concurrent usage
• Exponential slowness
• Low CPU usage
Specific Detection:
• Request response time monitoring
• CPU block / wait timing
Threading: Chokepoint
Internal Resource Bottleneck
Causes:
• Overusage of internal resource (threads, database connections, etc.)
• Underallocation of same
Aggregate Detection:
• Stalled threads / high concurrent usage
• Call rate and average response time of internal resource
Specific Detection:
• Also compare with methods from Resource Leak, External Bottleneck, and Overusage of External System
External Bottleneck
Causes:
• External system (database, authentication server) is slow
• Compare with Overusage of External System
Aggregate Detection:
• Response time on backend calls
• Exceptions
Specific Detection:
• Call graphs
• Specific monitoring on those backends
Commit happy
Production ground to a halt for 2 hours. And again the next day.
[Chart series: Trx/min, Avg RT, Pool Limit, Pool Usage, Trx Stalls]
Overusage of External System
Causes:
• Poor design or tuning of interaction with backend system (e.g., join between two million-row tables for each user logon)
• O/R mapper misconfiguration
Aggregate Detection:
• Response time measurement
Specific Detection:
• Timing on backend systems
• Also need tools for those backend systems
excessive database access
query too much data
• One interesting problem occurs when the size of transactions with backend systems needs to be tuned
• Can be intertwined with / exacerbated by Layer-itis and Overusage of External System
Many small requests: System constantly wastes resources dispatching / unmarshalling many xactions and results. “Death by a thousand cuts”
“Just Right”
One HUGE request: System periodically slows to a crawl as many resources get thrown at large chunk of work. “Pig in a Python”
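Tuning the transaction size usually means picking a batch size between the two extremes. The sketch below only shows the mechanics of splitting work into batches; the class name is hypothetical and the right batch size is something to measure, not something this code can tell you.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: splits work into moderately sized batches. */
class BatchingSketch {
    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be >= 1");
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) ids.add(i);
        // batchSize 1 is the "paper cuts" extreme, batchSize 10_000 the "pig in a python".
        System.out.println("batch size 1:      " + partition(ids, 1).size() + " backend calls");
        System.out.println("batch size 500:    " + partition(ids, 500).size() + " backend calls");
        System.out.println("batch size 10000:  " + partition(ids, 10_000).size() + " backend call");
    }
}
```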
Questions?

Weitere ähnliche Inhalte

Was ist angesagt?

Http to Https Get your WordPress website Compliant!
Http to Https Get your WordPress website Compliant!Http to Https Get your WordPress website Compliant!
Http to Https Get your WordPress website Compliant!Lynn Dye
 
BSides Lisbon 2013 - All your sites belong to Burp
BSides Lisbon 2013 - All your sites belong to BurpBSides Lisbon 2013 - All your sites belong to Burp
BSides Lisbon 2013 - All your sites belong to BurpTiago Mendo
 
Automating OWASP ZAP - DevCSecCon talk
Automating OWASP ZAP - DevCSecCon talk Automating OWASP ZAP - DevCSecCon talk
Automating OWASP ZAP - DevCSecCon talk Simon Bennetts
 
The moment my site got hacked
The moment my site got hackedThe moment my site got hacked
The moment my site got hackedMarko Heijnen
 
Building a social network in under 4 weeks with Serverless and GraphQL
Building a social network in under 4 weeks with Serverless and GraphQLBuilding a social network in under 4 weeks with Serverless and GraphQL
Building a social network in under 4 weeks with Serverless and GraphQLYan Cui
 
OWASP 2013 APPSEC USA Talk - OWASP ZAP
OWASP 2013 APPSEC USA Talk - OWASP ZAPOWASP 2013 APPSEC USA Talk - OWASP ZAP
OWASP 2013 APPSEC USA Talk - OWASP ZAPSimon Bennetts
 
2014 ZAP Workshop 1: Getting Started
2014 ZAP Workshop 1: Getting Started2014 ZAP Workshop 1: Getting Started
2014 ZAP Workshop 1: Getting StartedSimon Bennetts
 
4.2. Web analyst fiddler
4.2. Web analyst fiddler4.2. Web analyst fiddler
4.2. Web analyst fiddlerdefconmoscow
 
DevOops Redux Ken Johnson Chris Gates - AppSec USA 2016
DevOops Redux Ken Johnson Chris Gates  - AppSec USA 2016DevOops Redux Ken Johnson Chris Gates  - AppSec USA 2016
DevOops Redux Ken Johnson Chris Gates - AppSec USA 2016Chris Gates
 
2014 ZAP Workshop 2: Contexts and Fuzzing
2014 ZAP Workshop 2: Contexts and Fuzzing2014 ZAP Workshop 2: Contexts and Fuzzing
2014 ZAP Workshop 2: Contexts and FuzzingSimon Bennetts
 
OWASP 2013 APPSEC USA ZAP Hackathon
OWASP 2013 APPSEC USA ZAP HackathonOWASP 2013 APPSEC USA ZAP Hackathon
OWASP 2013 APPSEC USA ZAP HackathonSimon Bennetts
 
CNIT 128 8. Identifying and Exploiting Android Implementation Issues (Part 1)
CNIT 128 8. Identifying and Exploiting Android Implementation Issues (Part 1)CNIT 128 8. Identifying and Exploiting Android Implementation Issues (Part 1)
CNIT 128 8. Identifying and Exploiting Android Implementation Issues (Part 1)Sam Bowne
 
How to fix 504 Gateway Timeout Error on your WordPress Website?
How to fix 504 Gateway Timeout Error on your WordPress Website?How to fix 504 Gateway Timeout Error on your WordPress Website?
How to fix 504 Gateway Timeout Error on your WordPress Website?Anny Rathore
 
eCommerce performance, what is it costing you and what can you do about it?
eCommerce performance, what is it costing you and what can you do about it?eCommerce performance, what is it costing you and what can you do about it?
eCommerce performance, what is it costing you and what can you do about it?Peter Holditch
 
OWASP 2014 AppSec EU ZAP Advanced Features
OWASP 2014 AppSec EU ZAP Advanced FeaturesOWASP 2014 AppSec EU ZAP Advanced Features
OWASP 2014 AppSec EU ZAP Advanced FeaturesSimon Bennetts
 
Vo ip guide
Vo ip guideVo ip guide
Vo ip guideACP
 
Migrating Your WordPress Site to HTTPS - Getting it right the first time Word...
Migrating Your WordPress Site to HTTPS - Getting it right the first time Word...Migrating Your WordPress Site to HTTPS - Getting it right the first time Word...
Migrating Your WordPress Site to HTTPS - Getting it right the first time Word...Paul Thompson
 
N Different Strategies to Automate OWASP ZAP - OWASP APPSec BUCHAREST - Oct 1...
N Different Strategies to Automate OWASP ZAP - OWASP APPSec BUCHAREST - Oct 1...N Different Strategies to Automate OWASP ZAP - OWASP APPSec BUCHAREST - Oct 1...
N Different Strategies to Automate OWASP ZAP - OWASP APPSec BUCHAREST - Oct 1...gmaran23
 

Was ist angesagt? (20)

Http to Https Get your WordPress website Compliant!
Http to Https Get your WordPress website Compliant!Http to Https Get your WordPress website Compliant!
Http to Https Get your WordPress website Compliant!
 
BSides Lisbon 2013 - All your sites belong to Burp
BSides Lisbon 2013 - All your sites belong to BurpBSides Lisbon 2013 - All your sites belong to Burp
BSides Lisbon 2013 - All your sites belong to Burp
 
Automating OWASP ZAP - DevCSecCon talk
Automating OWASP ZAP - DevCSecCon talk Automating OWASP ZAP - DevCSecCon talk
Automating OWASP ZAP - DevCSecCon talk
 
The moment my site got hacked
The moment my site got hackedThe moment my site got hacked
The moment my site got hacked
 
Building a social network in under 4 weeks with Serverless and GraphQL
Building a social network in under 4 weeks with Serverless and GraphQLBuilding a social network in under 4 weeks with Serverless and GraphQL
Building a social network in under 4 weeks with Serverless and GraphQL
 
OWASP 2013 APPSEC USA Talk - OWASP ZAP
OWASP 2013 APPSEC USA Talk - OWASP ZAPOWASP 2013 APPSEC USA Talk - OWASP ZAP
OWASP 2013 APPSEC USA Talk - OWASP ZAP
 
2014 ZAP Workshop 1: Getting Started
2014 ZAP Workshop 1: Getting Started2014 ZAP Workshop 1: Getting Started
2014 ZAP Workshop 1: Getting Started
 
4.2. Web analyst fiddler
4.2. Web analyst fiddler4.2. Web analyst fiddler
4.2. Web analyst fiddler
 
DevOops Redux Ken Johnson Chris Gates - AppSec USA 2016
DevOops Redux Ken Johnson Chris Gates  - AppSec USA 2016DevOops Redux Ken Johnson Chris Gates  - AppSec USA 2016
DevOops Redux Ken Johnson Chris Gates - AppSec USA 2016
 
2014 ZAP Workshop 2: Contexts and Fuzzing
2014 ZAP Workshop 2: Contexts and Fuzzing2014 ZAP Workshop 2: Contexts and Fuzzing
2014 ZAP Workshop 2: Contexts and Fuzzing
 
OWASP 2013 APPSEC USA ZAP Hackathon
OWASP 2013 APPSEC USA ZAP HackathonOWASP 2013 APPSEC USA ZAP Hackathon
OWASP 2013 APPSEC USA ZAP Hackathon
 
CNIT 128 8. Identifying and Exploiting Android Implementation Issues (Part 1)
CNIT 128 8. Identifying and Exploiting Android Implementation Issues (Part 1)CNIT 128 8. Identifying and Exploiting Android Implementation Issues (Part 1)
CNIT 128 8. Identifying and Exploiting Android Implementation Issues (Part 1)
 
How to fix 504 Gateway Timeout Error on your WordPress Website?
How to fix 504 Gateway Timeout Error on your WordPress Website?How to fix 504 Gateway Timeout Error on your WordPress Website?
How to fix 504 Gateway Timeout Error on your WordPress Website?
 
eCommerce performance, what is it costing you and what can you do about it?
eCommerce performance, what is it costing you and what can you do about it?eCommerce performance, what is it costing you and what can you do about it?
eCommerce performance, what is it costing you and what can you do about it?
 
Testing Automaton - CFSummit 2016
Testing Automaton - CFSummit 2016Testing Automaton - CFSummit 2016
Testing Automaton - CFSummit 2016
 
OWASP 2014 AppSec EU ZAP Advanced Features
OWASP 2014 AppSec EU ZAP Advanced FeaturesOWASP 2014 AppSec EU ZAP Advanced Features
OWASP 2014 AppSec EU ZAP Advanced Features
 
Vo ip guide
Vo ip guideVo ip guide
Vo ip guide
 
Migrating Your WordPress Site to HTTPS - Getting it right the first time Word...
Migrating Your WordPress Site to HTTPS - Getting it right the first time Word...Migrating Your WordPress Site to HTTPS - Getting it right the first time Word...
Migrating Your WordPress Site to HTTPS - Getting it right the first time Word...
 
10 common cf server challenges
10 common cf server challenges10 common cf server challenges
10 common cf server challenges
 
N Different Strategies to Automate OWASP ZAP - OWASP APPSec BUCHAREST - Oct 1...
N Different Strategies to Automate OWASP ZAP - OWASP APPSec BUCHAREST - Oct 1...N Different Strategies to Automate OWASP ZAP - OWASP APPSec BUCHAREST - Oct 1...
N Different Strategies to Automate OWASP ZAP - OWASP APPSec BUCHAREST - Oct 1...
 

Andere mochten auch

Activty based research design for User Experience
Activty based research design for User ExperienceActivty based research design for User Experience
Activty based research design for User Experienceinnogy Innovation GmbH
 
Web Globalization balanced by User Experience (Mensch und Computer 2008)
Web Globalization balanced by User Experience (Mensch und Computer 2008)Web Globalization balanced by User Experience (Mensch und Computer 2008)
Web Globalization balanced by User Experience (Mensch und Computer 2008)Rainer Gibbert
 
InDAgo -- Prototyping Smart Mobility Assistants
InDAgo -- Prototyping Smart Mobility AssistantsInDAgo -- Prototyping Smart Mobility Assistants
InDAgo -- Prototyping Smart Mobility AssistantsUID GmbH
 
User Experience Optimierung
User Experience Optimierung User Experience Optimierung
User Experience Optimierung Connected-Blog
 
USECON & Microsoft: Grundlagen des User Experience Designs fuer Windows Store...
USECON & Microsoft: Grundlagen des User Experience Designs fuer Windows Store...USECON & Microsoft: Grundlagen des User Experience Designs fuer Windows Store...
USECON & Microsoft: Grundlagen des User Experience Designs fuer Windows Store...USECON
 
eparo – User Experience Design und Usability. Niemand sagt mehr "Konzeption" ...
eparo – User Experience Design und Usability. Niemand sagt mehr "Konzeption" ...eparo – User Experience Design und Usability. Niemand sagt mehr "Konzeption" ...
eparo – User Experience Design und Usability. Niemand sagt mehr "Konzeption" ...eparo GmbH
 

Andere mochten auch (6)

Activty based research design for User Experience
Activty based research design for User ExperienceActivty based research design for User Experience
Activty based research design for User Experience
 
Web Globalization balanced by User Experience (Mensch und Computer 2008)
Web Globalization balanced by User Experience (Mensch und Computer 2008)Web Globalization balanced by User Experience (Mensch und Computer 2008)
Web Globalization balanced by User Experience (Mensch und Computer 2008)
 
InDAgo -- Prototyping Smart Mobility Assistants
InDAgo -- Prototyping Smart Mobility AssistantsInDAgo -- Prototyping Smart Mobility Assistants
InDAgo -- Prototyping Smart Mobility Assistants
 
User Experience Optimierung
User Experience Optimierung User Experience Optimierung
User Experience Optimierung
 
USECON & Microsoft: Grundlagen des User Experience Designs fuer Windows Store...
USECON & Microsoft: Grundlagen des User Experience Designs fuer Windows Store...USECON & Microsoft: Grundlagen des User Experience Designs fuer Windows Store...
USECON & Microsoft: Grundlagen des User Experience Designs fuer Windows Store...
 
eparo – User Experience Design und Usability. Niemand sagt mehr "Konzeption" ...
eparo – User Experience Design und Usability. Niemand sagt mehr "Konzeption" ...eparo – User Experience Design und Usability. Niemand sagt mehr "Konzeption" ...
eparo – User Experience Design und Usability. Niemand sagt mehr "Konzeption" ...
 

Ähnlich wie Application Performance Troubleshooting 1x1 - Von Schweinen, Schlangen und Papierschnitten

Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelSilicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel
Silicon Valley Code Camp 2015 - Advanced MongoDB - The SequelDaniel Coupal
 
Framework and Application Benchmarking
Framework and Application BenchmarkingFramework and Application Benchmarking
Framework and Application BenchmarkingPaul Jones
 
(WEB301) Operational Web Log Analysis | AWS re:Invent 2014
(WEB301) Operational Web Log Analysis | AWS re:Invent 2014(WEB301) Operational Web Log Analysis | AWS re:Invent 2014
(WEB301) Operational Web Log Analysis | AWS re:Invent 2014Amazon Web Services
 
CIRCUIT 2015 - Monitoring AEM
CIRCUIT 2015 - Monitoring AEMCIRCUIT 2015 - Monitoring AEM
CIRCUIT 2015 - Monitoring AEMICF CIRCUIT
 
End-to-end Troubleshooting Checklist for Microsoft SQL Server
End-to-end Troubleshooting Checklist for Microsoft SQL ServerEnd-to-end Troubleshooting Checklist for Microsoft SQL Server
End-to-end Troubleshooting Checklist for Microsoft SQL ServerKevin Kline
 
Building Real World Applications using Windows Azure - Scott Guthrie, 2nd Dec...
Building Real World Applications using Windows Azure - Scott Guthrie, 2nd Dec...Building Real World Applications using Windows Azure - Scott Guthrie, 2nd Dec...
Building Real World Applications using Windows Azure - Scott Guthrie, 2nd Dec...Vikas Sahni
 
Building azure applications ireland
Building azure applications irelandBuilding azure applications ireland
Building azure applications irelandMichael Meagher
 
SharePoint 2013 Performance Analysis - Robi Vončina
SharePoint 2013 Performance Analysis - Robi VončinaSharePoint 2013 Performance Analysis - Robi Vončina
SharePoint 2013 Performance Analysis - Robi VončinaSPC Adriatics
 
ExpressionEngine - Simple Steps to Performance and Security (EECI 2014)
ExpressionEngine - Simple Steps to Performance and Security (EECI 2014)ExpressionEngine - Simple Steps to Performance and Security (EECI 2014)
ExpressionEngine - Simple Steps to Performance and Security (EECI 2014)Nexcess.net LLC
 
SharePoint Performance Monitoring with Sean P. McDonough
SharePoint Performance Monitoring with Sean P. McDonoughSharePoint Performance Monitoring with Sean P. McDonough
SharePoint Performance Monitoring with Sean P. McDonoughGabrijela Orsag
 
Application Logging Good Bad Ugly ... Beautiful?
Application Logging Good Bad Ugly ... Beautiful?Application Logging Good Bad Ugly ... Beautiful?
Application Logging Good Bad Ugly ... Beautiful?Anton Chuvakin
 
Observability with Spring-based distributed systems
Observability with Spring-based distributed systemsObservability with Spring-based distributed systems
Observability with Spring-based distributed systemsRakuten Group, Inc.
 
Natural Laws of Software Performance
Natural Laws of Software PerformanceNatural Laws of Software Performance
Natural Laws of Software PerformanceGibraltar Software
 
Observability in real time at scale
Observability in real time at scaleObservability in real time at scale
Observability in real time at scaleBalvinder Hira
 
MeetBSD2014 Performance Analysis
MeetBSD2014 Performance AnalysisMeetBSD2014 Performance Analysis
MeetBSD2014 Performance AnalysisBrendan Gregg
 
Open source: Top issues in the top enterprise packages
Open source: Top issues in the top enterprise packagesOpen source: Top issues in the top enterprise packages
Open source: Top issues in the top enterprise packagesRogue Wave Software
 
KoprowskiT - SQLBITS X - 2am a disaster just began
KoprowskiT - SQLBITS X - 2am a disaster just beganKoprowskiT - SQLBITS X - 2am a disaster just began
KoprowskiT - SQLBITS X - 2am a disaster just beganTobias Koprowski
 
Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek PROIDEA
 

Application Performance Troubleshooting 1x1 - Von Schweinen, Schlangen und Papierschnitten

  • 1. Of Pigs, Snakes & Paper Cuts (Von Schweinen, Schlangen & Papierschnitten): The Basics of Performance Troubleshooting. Rainer Schuppe, AppDynamics GmbH
  • 2. About me
      ◦ Customer Support
      ◦ System Support / Ops
      ◦ Consultant / Dev
      ◦ Solution Architect
      ◦ Sales Engineer
  • 3. Oh no! Not again! Or: why care about performance. Where to start? What to do? Who to blame? Tooling. Symptoms. Diagnosis.
  • 4. How many users abandon your slow website after 3 seconds?* Chart segments: "These leave and find your competitor" and "These stay and suffer through a poor experience" (chart values: 43 % and 57 %). *PhoCusWright and Akamai study
  • 5. And what about 18-24 year olds after only 2 seconds of waiting?* Chart segment: "The future of your business just left and found your competition" (chart values: 35 % and 65 %). *PhoCusWright and Akamai study
  • 6. Complexity increases [Diagram: the business transactions Login, Search Flight, View Flight Status and Make Reservation run across Tomcat, .NET, JBoss, Weblogic, an ESB (Mule, Tibco, AG), MQ, Memcached, Coherence, Oracle, SQL Server, Hadoop, Cassandra, MongoDB, ATG / Vignette / Sharepoint, VMWare, Amazon EC2 and Windows Azure, each component with its own release series; trend labels: CLOUD, WEB 2.0, SOA, AGILE, BIG DATA]
  • 7. Generic Troubleshooting Process: Alert / Detection -> Triage -> Diagnosis -> Root cause detection -> Solution finding -> Fix -> Move on with life. Every step draws on Data / Information.
  • 8. Triage
      ◦ Determine who needs to fix it
      ◦ Starts with an overview and a comparison to "normal" performance
      ◦ First-level task (operators)
      ◦ First indication of problem type
      ◦ Needs transactional data
  • 9. Business Transactions can help
      ◦ 46,463 Checkouts processed: 482 returned an error, 1,325 were slow, 576 were very slow and 111 stalled
      ◦ 3,956 Payments processed: 12 returned an error, 242 were slow, 96 were very slow and 79 stalled
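As a worked example of turning those counts into a triage signal, the sketch below computes the share of transactions that were not normal. It assumes the four categories (error, slow, very slow, stalled) do not overlap, which the slide does not state; the class and method names are illustrative only.

```java
final class TransactionHealth {
    /**
     * Percentage of transactions that fell into any problem category.
     * Assumption: the categories are mutually exclusive.
     */
    static double percentAffected(long total, long errors, long slow,
                                  long verySlow, long stalled) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        long affected = errors + slow + verySlow + stalled;
        return 100.0 * affected / total;
    }
}
```

With the figures from the slide this gives roughly 5.4 % for Checkouts (2,494 of 46,463) and roughly 10.8 % for Payments (429 of 3,956), so Payments would be the first place to look.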
  • 10. [Diagram: the architecture from slide 6, annotated with response times per call, ranging from 1 ms to 310 ms]
  • 11. [Diagram: the architecture from slide 6 with a "Problem" marker on one component]
  • 13. Diagnose
      ◦ Determine the root of the problem
      ◦ Uses first-level information to narrow the scope
      ◦ Needs specialists
      ◦ Needs lots of data / information, both real-time and historical
      ◦ Usually needs iterations
      ◦ More than one tool is used in the process
  • 14. Root cause detection
      ◦ Confirm the root cause after you have diagnosed it
      ◦ Document it
      ◦ Recreate it in test if possible
      ◦ Needs the same data as diagnostics
  • 15. Solution finding
      ◦ Find a solution for the problem
      ◦ Architect a workaround or a fix
      ◦ Again needs the diagnostic data
      ◦ Run tests with different options and check them in real time
      ◦ Confirm the idea for the fix
      ◦ May be a different team than the troubleshooters
  • 16. How to get the data?
      ◦ Intuition
      ◦ Experience
      ◦ Tools
      ◦ Logfiles
      ◦ Communication
  • 18. 3 Key Things Impact Performance & Availability: Concurrency, Data Volume, Resource
  • 19. Why do things crash and slow down? [Diagram comparing Concurrency, Data Volume and Resource in Development, QA/Test and Production]
  • 20. Technologies
      ◦ Logging
      ◦ ARM
      ◦ Bytecode Instrumentation / Aspects
      ◦ Sampling
      ◦ JMX (Java Management Extensions)
      ◦ PMI (IBM WebSphere specific)
      ◦ Stages: Dev, Test, Prod
  • 21. Logfiles
      ◦ Stages: Dev, Test, Prod
      ◦ Pros: anything can be logged; easy to implement (if you have the source code)
      ◦ Cons: only what the developer thinks is needed; I/O heavy; no chance for change if you don't own the source code; lots of files, usually without transaction (TX) context; how to correlate in a distributed environment?
  • 22. Logfiles
      [#|2013-04-16T16:04:44.319+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean| _ThreadID=14;_ThreadName=pool-1-thread-9;|Starting to initialize the Top Summary Stats Data Store timer|#]
      [#|2013-04-16T16:04:44.335+0200|INFO|sun-appserver2.1|com.appdynamics.TOP.SUMMARY.STATS.WRITE| _ThreadID=14;_ThreadName=pool-1-thread-9;|START TIME for timer service(TopSummaryStatsWriterTimerTaskBean) will be: Tue Apr 16 16:05:00 CEST 2013|#]
      [#|2013-04-16T16:04:44.338+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean| _ThreadID=14;_ThreadName=pool-1-thread-9;|Successfully initialized the Top Summary Stats Data Store timer|#]
      [#|2013-04-16T16:04:44.338+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean| _ThreadID=14;_ThreadName=pool-1-thread-9;|Starting to initialize the Top Summary Stats Data Purger timer|#]
      [#|2013-04-16T16:04:44.369+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean| _ThreadID=14;_ThreadName=pool-1-thread-9;|Successfully initialized the Top Summary Stats Data Purger timer|#]
      [#|2013-04-16T16:04:44.369+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean| _ThreadID=14;_ThreadName=pool-1-thread-9;|Starting to initialize the Top Summary Stats Detail String cache timer|#]
      [#|2013-04-16T16:04:44.376+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean| _ThreadID=14;_ThreadName=pool-1-thread-9;|Successfully initialized the Top Summary Stats Detail String cache timer|#]
      [#|2013-04-16T16:04:44.376+0200|INFO|sun-appserver2.1|com.singularity.ee.controller.beans.ControllerManagerBean| _ThreadID=14;_ThreadName=pool-1-thread-9;|Starting to initialize the Top Summary Stats rollup timer|#]
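The records on this slide appear to follow a `[#|timestamp|level|product|logger|key=value pairs|message|#]` layout. A minimal sketch of extracting the fields, so that entries can at least be grouped by thread, might look like this. The layout is inferred from the sample only, and `LogRecordParser` is a hypothetical name, not part of any product.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogRecordParser {
    // Assumed layout: [#|timestamp|level|product|logger|k=v;k=v;|message|#]
    private static final Pattern RECORD = Pattern.compile(
        "\\[#\\|([^|]*)\\|([^|]*)\\|([^|]*)\\|([^|]*)\\|([^|]*)\\|(.*?)\\|#\\]",
        Pattern.DOTALL);

    /** Splits one record into named fields; key=value pairs become their own entries. */
    static Map<String, String> parse(String record) {
        Matcher m = RECORD.matcher(record);
        if (!m.find()) {
            throw new IllegalArgumentException("not a recognised record: " + record);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("timestamp", m.group(1));
        fields.put("level", m.group(2));
        fields.put("product", m.group(3));
        fields.put("logger", m.group(4));
        // The fifth section holds pairs such as _ThreadID=14;_ThreadName=...;
        for (String pair : m.group(5).split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                fields.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        fields.put("message", m.group(6));
        return fields;
    }
}
```

Grouping by `_ThreadID` only approximates a transaction inside one JVM. Correlating across processes needs an identifier that the application itself writes into every record, which is the limitation the previous slide points out.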
  • 23. Profiler
      ◦ Stages: Dev, Test
      ◦ Pros: no config needed; lots of data - lots of detail
      ◦ Cons: lots of data - not suitable for production; needs experience; no transactional concept / context
  • 25. JMX (and similar)
      ◦ Stages: Dev, Test, Prod
      ◦ Pros: built into most application servers; JConsole is part of the JDK; MBeans are easy to implement
      ◦ Cons: no transaction context; not available for 3rd party code; no historical data; usually one JVM only
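To illustrate the "easy to implement" point, here is a minimal sketch of a standard MBean registered with the platform MBean server. The names `JmxSketch`, `RequestStats` and the object name used in the usage note are made up for the example. The only convention relied on is the standard one: a public management interface named after the implementing class plus the suffix `MBean`.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class JmxSketch {
    /** Management interface; the name must be the implementation name + "MBean". */
    public interface RequestStatsMBean {
        long getRequestCount(); // exposed as the read-only attribute "RequestCount"
        void reset();           // exposed as an operation
    }

    public static final class RequestStats implements RequestStatsMBean {
        private final AtomicLong requests = new AtomicLong();

        /** Called by application code, not visible through JMX. */
        public void recordRequest() { requests.incrementAndGet(); }

        @Override public long getRequestCount() { return requests.get(); }
        @Override public void reset() { requests.set(0); }
    }

    /** Registers the bean so that JConsole or another JMX client can read it. */
    static ObjectName register(RequestStats stats, String name) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName objectName = new ObjectName(name);
        server.registerMBean(stats, objectName);
        return objectName;
    }
}
```

Calling `JmxSketch.register(stats, "com.example.sketch:type=RequestStats")` makes the counter visible under that name. The cons from the slide still apply: the value has no transaction context and no history unless something polls and stores it.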
  • 27. APM tools (free)
      ◦ Stages: Dev, Test, Prod
      ◦ Pros: they are free; transaction context (most of them); quick setup (the commercial ones)
      ◦ Cons: usually functionally constrained (commercial); hard to configure (open source); usually no history
  • 28. APM tools (commercial)
      ◦ Stages: Dev, Test, Prod
      ◦ Pros: transactions, historical data; distributed monitoring; deep dive diagnostics; production fit
      ◦ Cons: costly; you have to choose the right one
  • 30. Diagnosis There are just 2 sorts of issues
  • 33. 50 shades of slow (approx.)
      ◦ Constantly slow (turtle)
      ◦ Slowly but constantly slower
      ◦ Exponentially slower
      ◦ Suddenly slower
      ◦ Sporadically slow
      ◦ Spontaneous crash
  • 34. The wonderful world of errors
      ◦ Sudden outage
      ◦ Always erroneous
      ◦ Sporadic error messages
      ◦ Silent death / bleed to death
      ◦ Increasing error rates
      ◦ Wrong / meaningless error messages
  • 35. Diagnosis – Rough Flow
      ◦ Look at symptoms
      ◦ Eliminate definite non-causes
      ◦ Prioritize the suspicions
      ◦ Confirm suspicion / eliminate suspicion
      ◦ Compare with "normal"
      ◦ Gather more information
      ◦ Define root cause and confirm it
      ◦ Redo from start
  • 36. Possible Causes (in no particular order)
      ◦ Bad coding
      ◦ Too much load
      ◦ Backend not reachable / slow
      ◦ Conflicting resources
      ◦ Memory leak
      ◦ Resource leak
      ◦ Network / hardware problem
  • 37. Possible Symptoms
      ◦ Consistent slowness
      ◦ Slower and slower against some variable (time / load)
      ◦ Sporadic hangs / random errors
      ◦ Foreseeable lockups
      ◦ "Sudden chaos"
      ◦ High utilization of resources (CPU, memory, network, etc.)
  • 39. Linear Memory Leak
      ◦ Symptoms: OOM (out of memory error); slow over time with spikes; hockey-stick graph
      ◦ Causes: objects added to linear structures without being removed (e.g., linked lists); other API misuse (addListener() without corresponding removeListener(), etc.)
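The addListener() / removeListener() cause can be sketched as follows. `EventSource`, `Listener` and the two handler methods are hypothetical stand-ins for whatever long-lived object and per-request code an application has. The point is only that the leaky variant grows the list by one entry per request.

```java
import java.util.ArrayList;
import java.util.List;

final class ListenerLeakSketch {
    interface Listener { void onEvent(String event); }

    /** Long-lived object: whatever it references stays reachable. */
    static final class EventSource {
        private final List<Listener> listeners = new ArrayList<>();
        void addListener(Listener l) { listeners.add(l); }
        void removeListener(Listener l) { listeners.remove(l); }
        int listenerCount() { return listeners.size(); }
    }

    /** Leaky: each request registers a listener and never removes it. */
    static void handleRequestLeaky(EventSource source, String requestId) {
        Listener l = event -> System.out.println(requestId + ": " + event);
        source.addListener(l);
        // ... request work would happen here ...
    }

    /** Fixed: registration and removal are paired, even if the work throws. */
    static void handleRequestFixed(EventSource source, String requestId) {
        Listener l = event -> System.out.println(requestId + ": " + event);
        source.addListener(l);
        try {
            // ... request work would happen here ...
        } finally {
            source.removeListener(l);
        }
    }
}
```

Each leaked listener is tiny, which is why this kind of leak shows up as slow, linear heap growth rather than a sudden failure.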
  • 40. Linear Memory Leak
      ◦ Aggregate detection: linear growth in heap utilization; GC time growth
      ◦ Specific detection: figure out object types being leaked; verbose GC; find related APIs and search code for misuse
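"Linear growth in heap utilization" can be checked numerically by fitting a straight line through heap samples. The sketch below uses an ordinary least-squares slope. It assumes the samples are taken at comparable points (for example right after a full GC), otherwise normal allocation noise dominates. The class name is illustrative.

```java
final class HeapTrend {
    /** Least-squares slope of y over x, in y-units per x-unit. */
    static double slope(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two paired samples");
        }
        double meanX = 0, meanY = 0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0) {
            throw new IllegalArgumentException("all x values are identical");
        }
        return sxy / sxx;
    }

    /** Current used heap in bytes, as seen by the Runtime API. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }
}
```

A slope that stays positive over hours of steady load is a hint, not proof. It tells you that something grows, and the specific detection steps are still needed to find out what.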
  • 41. Linear Memory Leak
      ◦ Challenges: references - many small objects are referenced in one collection; death by 1000 cuts (paper cuts)
      ◦ Specific detection: figure out object types being leaked; verbose GC; find related APIs and search code for misuse
  • 42. Specific detection
      ◦ Heap dump comparison: needs at least 2 dumps; stops the JVM; can take several minutes each; creates tons of data; finds the object, not the code responsible for the leak
      ◦ Profiler: high overhead - not for production; lots of data
      ◦ APM solution: collection-based algorithm (finds only collection leaks); instance counting; trade-off between low overhead and usefulness of data
  • 47. Exponential Memory Leak
      ◦ Causes: objects added to most data structures without being removed (e.g., vectors, hashtables); other API misuse (as Linear Leak)
      ◦ Aggregate detection: exponential growth in heap
      ◦ Specific detection: same as Linear Leak
  • 48. Resource Leak
      ◦ Causes: API misuse of Java objects with a resource-style lifecycle (create -> use -> destroy)
      ◦ Aggregate detection: slow over time; growth in heap (if you're lucky)
      ◦ Specific detection: audit code for API misuses; object instance tracking
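The create -> use -> destroy misuse usually comes down to a destroy step that is skipped when the use step throws. A minimal sketch, with a hypothetical `Handle` standing in for a connection, stream or similar, and an open-handle counter playing the role of "object instance tracking":

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ResourceLifecycleSketch {
    /** Number of handles created but not yet closed. */
    static final AtomicInteger OPEN = new AtomicInteger();

    /** Hypothetical resource with a create -> use -> destroy lifecycle. */
    static final class Handle implements AutoCloseable {
        Handle() { OPEN.incrementAndGet(); }

        int use(int input) {
            if (input < 0) {
                throw new IllegalArgumentException("negative input");
            }
            return input * 2;
        }

        @Override public void close() { OPEN.decrementAndGet(); }
    }

    /** Leaky: close() is never reached when use() throws. */
    static int leaky(int input) {
        Handle h = new Handle();
        int result = h.use(input);
        h.close();
        return result;
    }

    /** Safe: try-with-resources closes the handle on every path. */
    static int safe(int input) {
        try (Handle h = new Handle()) {
            return h.use(input);
        }
    }
}
```

The leaky variant only leaks on the error path, which is one reason such leaks tend to surface in production rather than in test.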
  • 49. Resource conflict / blocking
      ◦ Causes: overcautious data integrity strategy; "synchronising is always good"
      ◦ Aggregate detection: stalled threads; high thread usage - low CPU usage
      ◦ Specific detection: thread dumps as needed; stack traces / graphs; CPU block / wait timing measurement
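A cheap first look at "stalled threads" is a count of live threads per state: many BLOCKED or WAITING threads combined with low CPU usage fits the pattern on this slide. The sketch below uses only `Thread.getAllStackTraces()`. The class name is made up, and a real thread dump is still needed to see which lock the threads are queuing on.

```java
import java.util.EnumMap;
import java.util.Map;

final class ThreadStateSummary {
    /** Counts the live threads of this JVM by their current state. */
    static Map<Thread.State, Integer> countByState() {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(t.getState(), 1, Integer::sum);
        }
        return counts;
    }
}
```

Sampling this a few times in a row is more telling than a single snapshot, since a thread that is blocked in every sample is a better suspect than one caught waiting once.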
  • 51. Production ground to a halt for 2 hours, and again the next day [Chart series: Trx/min, Avg RT, Pool Limit, Pool Usage, Trx Stalls]
  • 52. Bad Coding: Infinite Loop
      ◦ Causes: infinite loop in code
      ◦ Aggregate detection: stalled threads; permanently high usage of CPU / threads
      ◦ Specific detection: thread dumps as needed; stack traces / graphs
  • 53. Bad Coding: CPU-Bound Component
      ◦ Causes: idiot with a “Learn Java in 24 Hours” book
      ◦ Aggregate detection: response time measurement; aggregate CPU utilization
      ◦ Specific detection: detailed CPU utilization
      ◦ Typical cure: cache of data or of performed calculations
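The "typical cure" of caching performed calculations can be sketched with a small memoizing wrapper around an expensive function. `MemoizingCache` is an illustrative name. The cache is deliberately unbounded to keep the sketch short, so without an eviction policy it would itself become the kind of linear leak described on slide 39.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

final class MemoizingCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> compute;
    private final AtomicInteger misses = new AtomicInteger();

    MemoizingCache(Function<K, V> compute) {
        this.compute = compute;
    }

    /** Returns the cached value, computing it at most once per key. */
    V get(K key) {
        return cache.computeIfAbsent(key, k -> {
            misses.incrementAndGet(); // counts how often the expensive path ran
            return compute.apply(k);
        });
    }

    int misses() {
        return misses.get();
    }
}
```

Comparing `misses()` with the number of calls gives the hit rate, which shows whether the cache actually takes load off the CPU-bound component.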
  • 54. Layer-itis
      ◦ Causes: poorly implemented data bridge layer, or simply too many of them, e.g. DB -> XML -> XSLT -> more XML -> “Custom Data Management Layer” -> consumer
      ◦ Aggregate detection: response time measurements
      ◦ Specific detection: call graphs / call traces (a stack trace is not enough); ask for a design or architecture document
  • 55. O/R Mapper misuse
      ◦ Causes: "Hibernate fixes everything"; massive SQL statements (in length and in number); wrong data strategy
      ◦ Aggregate detection: response time measurements; DB time measurements
      ◦ Specific detection: call stacks / snapshots
  • 57. The Unending Retry
      ◦ Causes: continual attempts to call backend + unavailable backend
      ◦ Aggregate detection / specific detection: response time measurement; backend detection - measurement (time & # of calls); stalled TX count; exceptions; busy thread count
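One common way to avoid the unending retry (a general pattern, not something the slides prescribe) is to cap the number of attempts and back off between them, so an unavailable backend produces a visible error instead of permanently busy threads. A minimal sketch with illustrative names and parameters:

```java
import java.util.concurrent.Callable;

final class BoundedRetry {
    /**
     * Calls the backend at most maxAttempts times, doubling the pause after
     * each failure, and rethrows the last failure once attempts are used up.
     */
    static <T> T call(Callable<T> backend, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return backend.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay); // back off before the next attempt
                    delay *= 2;
                }
            }
        }
        throw last;
    }
}
```

The rethrown exception matters for detection: it feeds the exception and stalled-transaction counts listed on the slide, where a silent endless loop would only show up as busy threads.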
  • 58. Don’t forget about thrown exceptions
  • 59. Threading: Deadlock / Livelock
      ◦ Causes: fundamental error in threading / lock acquisition strategy
      ◦ Aggregate detection: stalled threads / permanently high concurrent usage
      ◦ Specific detection: deadlock detection in JVM; thread dumps; busy thread count
  • 60. Threading: Deadlock
      Found one Java-level deadlock:
      =============================
      "Thread-2":
        waiting to lock monitor 102054308 (object 7f3113800, a java.lang.Object),
        which is held by "Thread-1"
      "Thread-1":
        waiting to lock monitor 1020348b8 (object 7f3113810, a java.lang.Object),
        which is held by "Thread-2"

      Java stack information for the threads listed above:
      ===================================================
      "Thread-2":
          at DeadlockTest$2.run(DeadlockTest.java:42)
          - waiting to lock <7f3113800> (a java.lang.Object)
          - locked <7f3113810> (a java.lang.Object)
          at java.lang.Thread.run(Thread.java:680)
      "Thread-1":
          at DeadlockTest$1.run(DeadlockTest.java:26)
          - waiting to lock <7f3113810> (a java.lang.Object)
          - locked <7f3113800> (a java.lang.Object)
          at java.lang.Thread.run(Thread.java:680)
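The same deadlock check that produces a report like the one on this slide is also available programmatically through `ThreadMXBean`. A minimal sketch that returns the names of deadlocked threads follows. `DeadlockProbe` is a made-up name, and how often to poll is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class DeadlockProbe {
    /** Names of threads currently in a deadlock; empty if there is none. */
    static String[] deadlockedThreadNames() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock exists
        if (ids == null) {
            return new String[0];
        }
        ThreadInfo[] infos = threads.getThreadInfo(ids);
        String[] names = new String[infos.length];
        for (int i = 0; i < infos.length; i++) {
            // getThreadInfo yields null for threads that ended in the meantime
            names[i] = infos[i] == null ? "<terminated>" : infos[i].getThreadName();
        }
        return names;
    }
}
```

This only finds true deadlocks. A livelock or a chokepoint keeps threads technically alive and has to be spotted through stalled-thread and busy-thread counts instead.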
  • 61. Threading: Chokepoint
      ◦ Causes: many threads bottlenecked waiting for one lock
      ◦ Aggregate detection: stalled threads / high concurrent usage; exponential slowness; low CPU usage
      ◦ Specific detection: request response time monitoring; CPU block / wait timing
  • 63. Internal Resource Bottleneck
      ◦ Causes: overusage of internal resource (threads, database connections, etc.); underallocation of same
      ◦ Aggregate detection: stalled threads / high concurrent usage; call rate and average response time of internal resource
      ◦ Specific detection: also compare with methods from Resource Leak, External Bottleneck, and Overusage of External System
  • 64. External Bottleneck
      ◦ Causes: external system (database, authentication server) is slow; compare with Overusage of External System
      ◦ Aggregate detection: response time on backend calls; exceptions
      ◦ Specific detection: call graphs; specific monitoring on those backends
  • 66. Production ground to a halt for 2 hours, and again the next day [Chart series: Trx/min, Avg RT, Pool Limit, Pool Usage, Trx Stalls]
  • 67. Overusage of External System
      ◦ Causes: poor design or tuning of interaction with backend system (e.g., join between two million-row tables for each user logon); O/R mapper misconfiguration
      ◦ Aggregate detection: response time measurement
      ◦ Specific detection: timing on backend systems; also need tools for those backend systems
  • 70. One interesting problem occurs when the size of transactions with backend systems needs to be tuned. It can be intertwined with, or exacerbated by, Layer-itis and Overusage of External System.
      ◦ Many small requests: the system constantly wastes resources dispatching / unmarshalling many transactions and results (“Death by a thousand cuts”)
      ◦ “Just right”
      ◦ One HUGE request: the system periodically slows to a crawl as many resources get thrown at a large chunk of work (“Pig in a Python”)
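Tuning towards "just right" often means grouping work into batches of a configurable size, somewhere between one call per item and one call for everything. The sketch below shows only the partitioning step. `Batcher` is an illustrative name, and the right batch size is an assumption that has to be found by measurement.

```java
import java.util.ArrayList;
import java.util.List;

final class Batcher {
    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // copy so each batch is independent of the source list
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }
}
```

A batch size of 1 reproduces the thousand cuts and a batch size equal to the list length reproduces the pig in the python, so the same code can be used to measure both extremes and the points in between.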