© Copyright SELA Software & Education Labs Ltd. | 14-18 Baruch Hirsch St Bnei Brak, 51202 Israel | www.selagroup.com
SELA DEVELOPER PRACTICE
May 23-25, 2017
Ido Flatow
Production Debugging War Stories
THE STORIES YOU ARE ABOUT TO HEAR
ARE BASED ON ACTUAL CASES.
LOCATIONS, TIMELINES, AND NAMES
HAVE BEEN CHANGED FOR DRAMATIC
PURPOSES AND TO PROTECT THOSE
INDIVIDUALS WHO ARE STILL CODING.
For the Next 50 Minutes…
Introduction
Service hangs
Unexplained exceptions
High memory consumption
Why Are You Here?
You are going to hear about
Bugs in web applications
Tips for better coding
Debugging tools, and when to use them
You will not leave here as expert debuggers! Sorry
But… You will leave with a good starting point
And probably anxious to check your code
How Are we Going to Do This?
What did the client report?
Which steps did we use to troubleshoot the issue?
What did we find?
How did we fix it?
What were those tools we used?
The Tired WCF Service
Client
Local bank
Reported
WCF service works fine for a few hours, then stops handling requests
Clients call the service, wait, then time out
Server CPU is high
Workaround
Restart IIS Application pool
Troubleshooting
Configured WCF to output performance counters
Used Performance Monitor to
watch WCF’s counters, specifically
Instances
Percent Of Max Concurrent Calls
Troubleshooting - cntd
Waited for the service to hang
Inspected counter values
Value was at 100% (101.563% to be exact)
At this point, no clients were active!
Reminder - WCF throttles concurrent calls (16 x #Cores)
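The 101.563% reading is consistent with exactly one call above the limit. As a rough check, assuming a 4-core server (the slides do not state the core count) and assuming the counter is simply calls in progress divided by the throttle limit, 65 calls against a limit of 64 gives 101.5625%. `ThrottleMath` is an illustrative name, not a WCF API.

```csharp
using System;

public static class ThrottleMath
{
    // Default limit from the reminder above: 16 concurrent calls per core.
    public static int MaxConcurrentCalls(int cores) => 16 * cores;

    // Share of the limit used by the calls currently in progress, in percent.
    public static double PercentOfMax(int callsInProgress, int cores) =>
        100.0 * callsInProgress / MaxConcurrentCalls(cores);

    public static void Main()
    {
        int cores = 4; // assumption: the core count is not given in the slides
        Console.WriteLine(MaxConcurrentCalls(cores)); // limit for 4 cores
        Console.WriteLine(PercentOfMax(65, cores));   // one call over the limit
    }
}
```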
Troubleshooting - cntd
Watched w3wp thread stacks
with Process Explorer
Noticed many .NET threads in sleep loop
Issue found - Requests hung in the service, causing it to
throttle new requests
Fixed the code to stop the endless loop – problem solved!
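The slides do not show the actual fix. One common way to keep a polling loop from holding a throttled call slot forever is to give it a deadline. The following is a minimal sketch under that assumption; `BoundedWait.WaitUntil` and its parameters are hypothetical names, not the client's code.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class BoundedWait
{
    // Polls isReady until it returns true or the timeout elapses.
    // Returns false on timeout so the caller can fail the request
    // instead of sleeping in a loop indefinitely.
    public static bool WaitUntil(Func<bool> isReady, TimeSpan timeout, TimeSpan pollInterval)
    {
        var clock = Stopwatch.StartNew();
        while (!isReady())
        {
            if (clock.Elapsed >= timeout)
                return false;
            Thread.Sleep(pollInterval);
        }
        return true;
    }

    public static void Main()
    {
        // Hypothetical condition that never becomes true.
        bool ok = WaitUntil(() => false,
                            TimeSpan.FromMilliseconds(200),
                            TimeSpan.FromMilliseconds(20));
        Console.WriteLine(ok ? "ready" : "timed out"); // prints "timed out"
    }
}
```

Returning a failure after the deadline frees the call slot, so one stuck dependency cannot consume the whole concurrent-call quota.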
The Tools in Use
Performance Monitor (perfmon.exe)
View counters that show the state of various application aspects
Most people use it to check CPU, memory, disk, and network state
.NET CLR has useful counters for memory, GC, JIT, locks, threads, exceptions,
etc.
Other useful counters: WCF, ASP.NET, IIS, and database providers
Sysinternals Process Explorer
Alternative to Task Manager
Select a process and view its managed and native threads and stacks
Examine each thread’s CPU utilization
View .NET CLR performance counters per process
https://download.sysinternals.com/files/ProcessExplorer.zip
Why We Do Volume Tests
Client
QA team. Government collaboration app
Reported
MVC web application works in regular day-to-day use
Application succeeded under load tests
Under volume tests, application throws unexplained errors
Returns HTTP 500, with no specific error message
Application logs are not showing any relevant information
Workaround
None. Failed under volume tests
Troubleshooting
Checked Event Viewer for errors, found nothing
Used Fiddler to view the HTTP 500 response
Error text was too general, not very useful
Troubleshooting - cntd
Decided to use IIS Failed Request Tracing
Luckily, the MVC app had an exception filter that used tracing
Created a Failed Request Tracing rule for HTTP 500
Added the System.Web.IisTraceListener to the web.config
Waited for the test to reach its breaking point…
Troubleshooting - cntd
Opened the newly created trace file in IE
Found an error! Exception in JSON serialization - string too big
Stack Overflow to the rescue…
Troubleshooting - cntd
Ran the test again – failed again!
Checked the JavaScriptSerializer serialization code
Where is MaxJsonLength set?
Inspected MVC’s JsonResult code
Found the code that inits the serializer
Troubleshooting – almost done
Code fix was quite easy
But how big was our JSON string? 5MB? 1GB?
Time to grab a memory dump…
Before:
return Json(data);
After:
return new JsonResult {
    Data = data,
    MaxJsonLength = …
};
Troubleshooting – just one more thing
Quickest way to dump on an exception - DebugDiag
Troubleshooting – final piece of the puzzle
Tricky part, using WinDbg to find the values
Troubleshooting – final piece of the puzzle
Which thread had the exception - !Threads
Troubleshooting – final piece of the puzzle
Get the thread’s call stack - !ClrStack
JavaScriptSerializer.Serialize takes a StringBuilder …
Troubleshooting – final piece of the puzzle
List objects in the stack - !DumpStackObjects (!dso)
Troubleshooting – final piece of the puzzle
Get the object’s fields and values - !DumpObj (!do)
The Tools in Use
Fiddler
HTTP(S) proxy and web debugger
Inspect, create, and manipulate HTTP(S) traffic
View message content according to its type, such as image, XML/JSON, and JS
Record traffic, save for later inspection, or export as web tests
http://www.fiddlertool.com
IIS Failed Request Tracing
Troubleshoot request/response processing failures
Collects traces from IIS modules, ASP.NET pipeline, and your own trace
messages
Writes each HTTP context’s trace messages to a separate file
Create trace file on: status code, execution time, event severity
http://www.iis.net/learn/troubleshoot/using-failed-request-tracing
http://www.iis.net/downloads/community/2008/03/iis-70-trace-viewer
The Tools in Use
Decompilers
Browse content of .NET assemblies (.dll and .exe)
Decompile IL to C# or VB
Find usage of a field/method/property
Some tools support extensions and Visual Studio integration
http://ilspy.net
https://www.jetbrains.com/decompiler
http://www.telerik.com/products/decompiler.aspx
The Tools in Use
DebugDiag
Memory dump collector and analyzer
Can generate stack trees, mini dumps, and full dumps
Automatic dump on crash, hung requests, perf. counter triggers, etc.
Contains an analysis tool that scans dump files for known issues
https://www.microsoft.com/en-us/download/details.aspx?id=49924
WinDbg
Managed and native debugger, for processes and memory dumps
Shows lists of threads, stack trees, and stack memory
Query the managed heap(s), object content, and GC roots
Various extensions to view HTTP requests, detect deadlocks, etc.
https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk
Leaking Memory In .NET – It Is Possible!
Client
Local insurance company
Reported
Worker process memory usage increases over time
Not sure if it’s a managed or a native issue
Workaround
Increase application pool recycle to twice a day
Troubleshooting
First, need to know if the leak is native or managed
Checked process memory with Sysinternals VMMap
Looking at multiple snapshots, seems to be managed (.NET) related
Troubleshooting - cntd
Time to get some memory dumps
Need several dumps, so we can compare them
Very simple to do, using Windows Task Manager
Next, open them and compare memory heaps
Troubleshooting - cntd
Compared the dumps with Visual Studio 2015
(Requires the Enterprise edition)
Troubleshooting - cntd
Didn’t take long to notice the culprit and reason
Hundreds of DimutFile objects, each containing large byte arrays
Troubleshooting - cntd
These objects were not “leaked”, they were cached!
Recommended fix included
Do not cache many large objects
Cache with an expiration date (sliding / fixed)
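To illustrate what sliding expiration means, here is a minimal self-contained sketch. It is not the ASP.NET cache and not the client's code: `SlidingCache` is a hypothetical class, it is not thread-safe, and it only evicts on access, whereas a production cache would normally also scavenge expired entries in the background. The clock is injected so expiry can be shown without waiting.

```csharp
using System;
using System.Collections.Generic;

// Minimal sliding-expiration cache (illustration only, not thread-safe).
public sealed class SlidingCache<TKey, TValue>
{
    private sealed class Entry
    {
        public TValue Value;
        public DateTime LastAccess;
    }

    private readonly Dictionary<TKey, Entry> _items = new Dictionary<TKey, Entry>();
    private readonly TimeSpan _window;
    private readonly Func<DateTime> _now; // injectable clock

    public SlidingCache(TimeSpan window, Func<DateTime> now)
    {
        _window = window;
        _now = now;
    }

    public int Count => _items.Count;

    public void Set(TKey key, TValue value) =>
        _items[key] = new Entry { Value = value, LastAccess = _now() };

    public bool TryGet(TKey key, out TValue value)
    {
        Entry entry;
        if (_items.TryGetValue(key, out entry))
        {
            if (_now() - entry.LastAccess <= _window)
            {
                entry.LastAccess = _now(); // a hit slides the expiration window
                value = entry.Value;
                return true;
            }
            _items.Remove(key); // expired: drop the reference so the GC can reclaim it
        }
        value = default(TValue);
        return false;
    }
}

public static class SlidingCacheDemo
{
    public static void Main()
    {
        var now = new DateTime(2017, 5, 23, 9, 0, 0);
        var cache = new SlidingCache<string, byte[]>(TimeSpan.FromMinutes(10), () => now);
        cache.Set("report", new byte[1024]);
        byte[] data;
        now = now.AddMinutes(5);
        Console.WriteLine(cache.TryGet("report", out data)); // True: read within 10 minutes
        now = now.AddMinutes(11);
        Console.WriteLine(cache.TryGet("report", out data)); // False: idle for 11 minutes
        Console.WriteLine(cache.Count);                      // 0: the entry was evicted
    }
}
```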
Troubleshooting – wait a second…
The memory diff. had another suspicious leak
Why are we leaking the HomeController?
Troubleshooting - cntd
Checked roots
Controller is also cached, why?
Referenced by the CacheItemRemovedCallback event
Troubleshooting - cntd
Checked the code again
CacheItemRemoved is registered to the event, but it is an instance
method
Note - registering an instance method on a global event may leak the
instance AND ALL of the objects it references
The fix - change the callback method to static
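The effect can be reproduced in isolation. In the sketch below, `GlobalCache.ItemRemoved` is a hypothetical stand-in for any long-lived event (it is not the ASP.NET cache API), and `HomeController` is a plain class used only for illustration. A delegate to an instance method holds a reference to its target object, so the target stays reachable as long as the event does; a delegate to a static method has no target.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class GlobalCache
{
    // Stand-in for an application-wide event that lives as long as the process.
    public static event Action ItemRemoved;
    public static void Clear() => ItemRemoved = null;
}

public sealed class HomeController
{
    private readonly byte[] _state = new byte[1024 * 1024];
    public int StateSize => _state.Length;
    public void OnRemovedInstance() { }
    public static void OnRemovedStatic() { }
}

public static class LeakDemo
{
    // Kept in a separate, non-inlined method so no local variable
    // keeps the controller alive after it returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static WeakReference Subscribe(bool useInstanceMethod)
    {
        var controller = new HomeController();
        if (useInstanceMethod)
            GlobalCache.ItemRemoved += controller.OnRemovedInstance; // delegate references controller
        else
            GlobalCache.ItemRemoved += HomeController.OnRemovedStatic; // no target object
        return new WeakReference(controller);
    }

    public static bool SurvivesGc(bool useInstanceMethod)
    {
        GlobalCache.Clear();
        WeakReference weak = Subscribe(useInstanceMethod);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return weak.IsAlive;
    }

    public static void Main()
    {
        Console.WriteLine(SurvivesGc(true));  // instance handler: controller stays reachable
        Console.WriteLine(SurvivesGc(false)); // static handler: controller can be collected
    }
}
```

On a typical runtime this prints True for the instance handler and False for the static one. Garbage collection timing is not strictly guaranteed, so treat the second result as expected rather than certain.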
The Tools in Use
Sysinternals VMMap
Helps in understanding and optimizing memory usage
Shows a breakdown of the process memory types
Displays virtual and physical memory
Can show a detailed memory map of address spaces and usage
https://technet.microsoft.com/en-us/sysinternals/vmmap.aspx
Visual Studio managed memory debug (Enterprise)
Part of Visual Studio’s dump debugger
Displays list of object types and their inclusive/exclusive sizes
Tracks each object’s root paths
Compare memory heaps between dump files
https://msdn.microsoft.com/en-us/library/dn342825.aspx
Sometimes It Is Simpler Than It Seems
Client
Local insurance company
Reported
Local service for downloading files responds poorly when under load
A single request takes ~3s, but multiple concurrent requests take ~10s
Asked to fine-tune their IIS server
Workaround
Deploy more servers to handle the load
Troubleshooting
Started by asking questions about the service
File-download service in ASP.NET Web Services (asmx)
File is copied from a share, processed, then downloaded as BASE64
Standard tested file size – 10MB each + 10 Concurrent downloads
Analyzed what can break:
File copying is throttled by local network/disk
Processing (convert to PDF) is CPU-bound
Part of the code has contention over a resource
IIS cannot handle the load of requests (unlikely)
Too many options, need to think where to start…
Troubleshooting - cntd
Started by loading the system in a controlled environment
Directed load test at a specific server
Pulled that server out of the load balancer (to minimize “noise”)
Checked stats under load:
CPU – at 20%
Network and disk – low usage
Troubleshooting - cntd
Opened IIS Request Monitoring to check request pipelines
Responses are hanging due to a network issue!!
Troubleshooting – cntd
Issue is with the network, but the server’s network is just fine
Maybe it’s the client’s network?
Network utilization is at 99%, huh?
Local NIC is 100Mbps, what is this, the 90s?
Troubleshooting – moment of clarity
Checked NIC model – it’s an Intel NIC, 1Gbps
Checked with IT department and got the answer – IP Phone
Machine’s Ethernet is connected to an IP Phone
Phone is connected to the wall
The old phone is 100Mbps
Let’s test it
Connected machine directly to the wall socket
Opened Task Manager – NIC is 1Gbps
Re-ran the load test – all 10 files now download concurrently in ~3s
Note – always run load tests from a neutral server
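A back-of-the-envelope check shows why a 100Mbps link is a plausible bottleneck for this test. The sketch assumes 1 MB = 10^6 bytes and standard Base64 expansion (4 output characters per 3 input bytes), and it ignores protocol overhead, latency, and server-side processing. `TransferMath` is an illustrative name, not part of the service.

```csharp
using System;

public static class TransferMath
{
    // Base64 encodes every 3 bytes as 4 characters (with padding).
    public static long Base64Length(long rawBytes) => 4 * ((rawBytes + 2) / 3);

    // Ideal transfer time in seconds for a payload on a link of the given speed.
    public static double Seconds(long payloadBytes, double linkMegabitsPerSecond) =>
        payloadBytes * 8 / (linkMegabitsPerSecond * 1_000_000);

    public static void Main()
    {
        long perFile = Base64Length(10_000_000); // one 10MB file after Base64 encoding
        long total = 10 * perFile;               // 10 concurrent downloads
        Console.WriteLine(Seconds(total, 100));  // roughly 10.7 seconds at 100Mbps
        Console.WriteLine(Seconds(total, 1000)); // roughly 1.07 seconds at 1Gbps
    }
}
```

Ten encoded files need roughly 10.7s of pure transfer time at 100Mbps, which is in line with the reported ~10s, and only about 1.07s at 1Gbps.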
The Tools in Use
IIS Realtime Request Monitoring
A.K.A. Runtime Status and Control API (RSCA)
Shows currently executing requests in each application pool
Assist in understanding where requests are hanging and for how long
Accessible via the IIS Admin or AppCmd
%windir%\system32\inetsrv\appcmd list requests
Task Manager
Everyone knows how to use Task Manager, no?
Additional Tools (for next time…)
Process monitoring
Sysinternals Process Monitor
Tracing and logs
PerfView (CLR/ASP.NET/IIS ETW tracing), IIS/HTTP.sys logs, IIS Advanced
Logging, Log Parser Studio
Dumps
Sysinternals ProcDump, DebugDiag Analysis
Network sniffers
Wireshark
Microsoft Message Analyzer
How to Start?
Understand what is happening
Be able to reproduce the problem "on demand"
Choose the right tool for the task
When in doubt – get a memory dump!
Resources
You had them throughout the slides
My Info
@IdoFlatow // idof@sela.co.il //
http://www.idoflatow.net/downloads

Weitere ähnliche Inhalte

Was ist angesagt?

Developing in the Cloud
Developing in the CloudDeveloping in the Cloud
Developing in the CloudRyan Cuprak
 
Operating Docker
Operating DockerOperating Docker
Operating DockerJen Andre
 
Performance Analysis of Idle Programs
Performance Analysis of Idle ProgramsPerformance Analysis of Idle Programs
Performance Analysis of Idle Programsgreenwop
 
Java script nirvana in netbeans [con5679]
Java script nirvana in netbeans [con5679]Java script nirvana in netbeans [con5679]
Java script nirvana in netbeans [con5679]Ryan Cuprak
 
4.2. Web analyst fiddler
4.2. Web analyst fiddler4.2. Web analyst fiddler
4.2. Web analyst fiddlerdefconmoscow
 
Security in serverless world (get.net)
Security in serverless world (get.net)Security in serverless world (get.net)
Security in serverless world (get.net)Yan Cui
 
Lares from LOW to PWNED
Lares from LOW to PWNEDLares from LOW to PWNED
Lares from LOW to PWNEDChris Gates
 
Delphix for DBAs by Jonathan Lewis
Delphix for DBAs by Jonathan LewisDelphix for DBAs by Jonathan Lewis
Delphix for DBAs by Jonathan LewisKyle Hailey
 
How it's made - MyGet (CloudBurst)
How it's made - MyGet (CloudBurst)How it's made - MyGet (CloudBurst)
How it's made - MyGet (CloudBurst)Maarten Balliauw
 
Rihards Olups - Zabbix at Nokia - Case Study
Rihards Olups - Zabbix at Nokia - Case StudyRihards Olups - Zabbix at Nokia - Case Study
Rihards Olups - Zabbix at Nokia - Case StudyZabbix
 
Caching 101: Caching on the JVM (and beyond)
Caching 101: Caching on the JVM (and beyond)Caching 101: Caching on the JVM (and beyond)
Caching 101: Caching on the JVM (and beyond)Louis Jacomet
 
What is cool with Domino V10, Proton and Node.JS, and why would I use it in ...
What is cool with Domino V10, Proton and Node.JS, and why would I use it in ...What is cool with Domino V10, Proton and Node.JS, and why would I use it in ...
What is cool with Domino V10, Proton and Node.JS, and why would I use it in ...Heiko Voigt
 
Attack-driven defense
Attack-driven defenseAttack-driven defense
Attack-driven defenseZane Lackey
 
Need forbuildspeed agile2012
Need forbuildspeed agile2012Need forbuildspeed agile2012
Need forbuildspeed agile2012drewz lin
 

Was ist angesagt? (20)

Delphix
DelphixDelphix
Delphix
 
Developing in the Cloud
Developing in the CloudDeveloping in the Cloud
Developing in the Cloud
 
Operating Docker
Operating DockerOperating Docker
Operating Docker
 
Why Play Framework is fast
Why Play Framework is fastWhy Play Framework is fast
Why Play Framework is fast
 
Performance Analysis of Idle Programs
Performance Analysis of Idle ProgramsPerformance Analysis of Idle Programs
Performance Analysis of Idle Programs
 
Java script nirvana in netbeans [con5679]
Java script nirvana in netbeans [con5679]Java script nirvana in netbeans [con5679]
Java script nirvana in netbeans [con5679]
 
What is Delphix
What is DelphixWhat is Delphix
What is Delphix
 
Delphix
DelphixDelphix
Delphix
 
4.2. Web analyst fiddler
4.2. Web analyst fiddler4.2. Web analyst fiddler
4.2. Web analyst fiddler
 
Security in serverless world (get.net)
Security in serverless world (get.net)Security in serverless world (get.net)
Security in serverless world (get.net)
 
Lares from LOW to PWNED
Lares from LOW to PWNEDLares from LOW to PWNED
Lares from LOW to PWNED
 
Delphix for DBAs by Jonathan Lewis
Delphix for DBAs by Jonathan LewisDelphix for DBAs by Jonathan Lewis
Delphix for DBAs by Jonathan Lewis
 
How it's made - MyGet (CloudBurst)
How it's made - MyGet (CloudBurst)How it's made - MyGet (CloudBurst)
How it's made - MyGet (CloudBurst)
 
Rihards Olups - Zabbix at Nokia - Case Study
Rihards Olups - Zabbix at Nokia - Case StudyRihards Olups - Zabbix at Nokia - Case Study
Rihards Olups - Zabbix at Nokia - Case Study
 
Caching 101: Caching on the JVM (and beyond)
Caching 101: Caching on the JVM (and beyond)Caching 101: Caching on the JVM (and beyond)
Caching 101: Caching on the JVM (and beyond)
 
20120306 dublin js
20120306 dublin js20120306 dublin js
20120306 dublin js
 
What is cool with Domino V10, Proton and Node.JS, and why would I use it in ...
What is cool with Domino V10, Proton and Node.JS, and why would I use it in ...What is cool with Domino V10, Proton and Node.JS, and why would I use it in ...
What is cool with Domino V10, Proton and Node.JS, and why would I use it in ...
 
Node.js vs Play Framework
Node.js vs Play FrameworkNode.js vs Play Framework
Node.js vs Play Framework
 
Attack-driven defense
Attack-driven defenseAttack-driven defense
Attack-driven defense
 
Need forbuildspeed agile2012
Need forbuildspeed agile2012Need forbuildspeed agile2012
Need forbuildspeed agile2012
 

Ähnlich wie Production Debugging War Stories

Production debugging web applications
Production debugging web applicationsProduction debugging web applications
Production debugging web applicationsIdo Flatow
 
Node.js: A Guided Tour
Node.js: A Guided TourNode.js: A Guided Tour
Node.js: A Guided Tourcacois
 
Debugging the Web with Fiddler
Debugging the Web with FiddlerDebugging the Web with Fiddler
Debugging the Web with FiddlerIdo Flatow
 
NodeJS ecosystem
NodeJS ecosystemNodeJS ecosystem
NodeJS ecosystemYukti Kaura
 
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010Bhupesh Bansal
 
Hadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedInHadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedInHadoop User Group
 
PHP Performance: Principles and tools
PHP Performance: Principles and toolsPHP Performance: Principles and tools
PHP Performance: Principles and tools10n Software, LLC
 
devworkshop-10_28_1015-amazon-conference-presentation
devworkshop-10_28_1015-amazon-conference-presentationdevworkshop-10_28_1015-amazon-conference-presentation
devworkshop-10_28_1015-amazon-conference-presentationAlex Wu
 
EQR Reporting: Rails + Amazon EC2
EQR Reporting:  Rails + Amazon EC2EQR Reporting:  Rails + Amazon EC2
EQR Reporting: Rails + Amazon EC2jeperkins4
 
PyConline AU 2021 - Things might go wrong in a data-intensive application
PyConline AU 2021 - Things might go wrong in a data-intensive applicationPyConline AU 2021 - Things might go wrong in a data-intensive application
PyConline AU 2021 - Things might go wrong in a data-intensive applicationHua Chu
 
Bootstrapping - Session 1 - Your First Week with Amazon EC2
Bootstrapping - Session 1 - Your First Week with Amazon EC2Bootstrapping - Session 1 - Your First Week with Amazon EC2
Bootstrapping - Session 1 - Your First Week with Amazon EC2Amazon Web Services
 
"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)
"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)
"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)Tech in Asia ID
 
АНДРІЙ ШУМАДА «To Cover Uncoverable» Online WDDay 2022 js
АНДРІЙ ШУМАДА «To Cover Uncoverable» Online WDDay 2022 jsАНДРІЙ ШУМАДА «To Cover Uncoverable» Online WDDay 2022 js
АНДРІЙ ШУМАДА «To Cover Uncoverable» Online WDDay 2022 jsWDDay
 
GlobalsDB: Its significance for Node.js Developers
GlobalsDB: Its significance for Node.js DevelopersGlobalsDB: Its significance for Node.js Developers
GlobalsDB: Its significance for Node.js DevelopersRob Tweed
 
Making it fast: Zotonic & Performance
Making it fast: Zotonic & PerformanceMaking it fast: Zotonic & Performance
Making it fast: Zotonic & PerformanceArjan
 
Real World Single Page App - A Knockout Case Study
Real World Single Page App - A Knockout Case StudyReal World Single Page App - A Knockout Case Study
Real World Single Page App - A Knockout Case Studyhousecor
 
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
DEVNET-1140	InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...DEVNET-1140	InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...Cisco DevNet
 
Sherlock Homepage (Maarten Balliauw)
Sherlock Homepage (Maarten Balliauw)Sherlock Homepage (Maarten Balliauw)
Sherlock Homepage (Maarten Balliauw)Visug
 

Ähnlich wie Production Debugging War Stories (20)

Production debugging web applications
Production debugging web applicationsProduction debugging web applications
Production debugging web applications
 
Node.js: A Guided Tour
Node.js: A Guided TourNode.js: A Guided Tour
Node.js: A Guided Tour
 
Debugging the Web with Fiddler
Debugging the Web with FiddlerDebugging the Web with Fiddler
Debugging the Web with Fiddler
 
NodeJS ecosystem
NodeJS ecosystemNodeJS ecosystem
NodeJS ecosystem
 
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
Voldemort & Hadoop @ Linkedin, Hadoop User Group Jan 2010
 
Hadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedInHadoop and Voldemort @ LinkedIn
Hadoop and Voldemort @ LinkedIn
 
PHP Performance: Principles and tools
PHP Performance: Principles and toolsPHP Performance: Principles and tools
PHP Performance: Principles and tools
 
devworkshop-10_28_1015-amazon-conference-presentation
devworkshop-10_28_1015-amazon-conference-presentationdevworkshop-10_28_1015-amazon-conference-presentation
devworkshop-10_28_1015-amazon-conference-presentation
 
EQR Reporting: Rails + Amazon EC2
EQR Reporting:  Rails + Amazon EC2EQR Reporting:  Rails + Amazon EC2
EQR Reporting: Rails + Amazon EC2
 
PyConline AU 2021 - Things might go wrong in a data-intensive application
PyConline AU 2021 - Things might go wrong in a data-intensive applicationPyConline AU 2021 - Things might go wrong in a data-intensive application
PyConline AU 2021 - Things might go wrong in a data-intensive application
 
Bootstrapping - Session 1 - Your First Week with Amazon EC2
Bootstrapping - Session 1 - Your First Week with Amazon EC2Bootstrapping - Session 1 - Your First Week with Amazon EC2
Bootstrapping - Session 1 - Your First Week with Amazon EC2
 
"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)
"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)
"You Don't Know NODE.JS" by Hengki Mardongan Sihombing (Urbanhire)
 
АНДРІЙ ШУМАДА «To Cover Uncoverable» Online WDDay 2022 js
АНДРІЙ ШУМАДА «To Cover Uncoverable» Online WDDay 2022 jsАНДРІЙ ШУМАДА «To Cover Uncoverable» Online WDDay 2022 js
АНДРІЙ ШУМАДА «To Cover Uncoverable» Online WDDay 2022 js
 
Scaling 101 test
Scaling 101 testScaling 101 test
Scaling 101 test
 
Scaling 101
Scaling 101Scaling 101
Scaling 101
 
GlobalsDB: Its significance for Node.js Developers
GlobalsDB: Its significance for Node.js DevelopersGlobalsDB: Its significance for Node.js Developers
GlobalsDB: Its significance for Node.js Developers
 
Making it fast: Zotonic & Performance
Making it fast: Zotonic & PerformanceMaking it fast: Zotonic & Performance
Making it fast: Zotonic & Performance
 
Real World Single Page App - A Knockout Case Study
Real World Single Page App - A Knockout Case StudyReal World Single Page App - A Knockout Case Study
Real World Single Page App - A Knockout Case Study
 
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
DEVNET-1140	InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...DEVNET-1140	InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
DEVNET-1140 InterCloud Mapreduce and Spark Workload Migration and Sharing: Fi...
 
Sherlock Homepage (Maarten Balliauw)
Sherlock Homepage (Maarten Balliauw)Sherlock Homepage (Maarten Balliauw)
Sherlock Homepage (Maarten Balliauw)
 

Mehr von Ido Flatow

Google Cloud IoT Core
Google Cloud IoT CoreGoogle Cloud IoT Core
Google Cloud IoT CoreIdo Flatow
 
Introduction to HTTP/2
Introduction to HTTP/2Introduction to HTTP/2
Introduction to HTTP/2Ido Flatow
 
Introduction to HTTP/2
Introduction to HTTP/2Introduction to HTTP/2
Introduction to HTTP/2Ido Flatow
 
From VMs to Containers: Introducing Docker Containers for Linux and Windows S...
From VMs to Containers: Introducing Docker Containers for Linux and Windows S...From VMs to Containers: Introducing Docker Containers for Linux and Windows S...
From VMs to Containers: Introducing Docker Containers for Linux and Windows S...Ido Flatow
 
Building IoT and Big Data Solutions on Azure
Building IoT and Big Data Solutions on AzureBuilding IoT and Big Data Solutions on Azure
Building IoT and Big Data Solutions on AzureIdo Flatow
 
Migrating Customers to Microsoft Azure: Lessons Learned From the Field
Migrating Customers to Microsoft Azure: Lessons Learned From the FieldMigrating Customers to Microsoft Azure: Lessons Learned From the Field
Migrating Customers to Microsoft Azure: Lessons Learned From the FieldIdo Flatow
 
The Essentials of Building Cloud-Based Web Apps with Azure
The Essentials of Building Cloud-Based Web Apps with AzureThe Essentials of Building Cloud-Based Web Apps with Azure
The Essentials of Building Cloud-Based Web Apps with AzureIdo Flatow
 
Introduction to HTTP/2
Introduction to HTTP/2Introduction to HTTP/2
Introduction to HTTP/2Ido Flatow
 
Debugging your Way through .NET with Visual Studio 2015
Debugging your Way through .NET with Visual Studio 2015Debugging your Way through .NET with Visual Studio 2015
Debugging your Way through .NET with Visual Studio 2015Ido Flatow
 
ASP.NET Core 1.0
ASP.NET Core 1.0ASP.NET Core 1.0
ASP.NET Core 1.0Ido Flatow
 
Introducing HTTP/2
Introducing HTTP/2Introducing HTTP/2
Introducing HTTP/2Ido Flatow
 
Learning ASP.NET 5 and MVC 6
Learning ASP.NET 5 and MVC 6Learning ASP.NET 5 and MVC 6
Learning ASP.NET 5 and MVC 6Ido Flatow
 
Powershell For Developers
Powershell For DevelopersPowershell For Developers
Powershell For DevelopersIdo Flatow
 
IaaS vs. PaaS: Windows Azure Compute Solutions
IaaS vs. PaaS: Windows Azure Compute SolutionsIaaS vs. PaaS: Windows Azure Compute Solutions
IaaS vs. PaaS: Windows Azure Compute SolutionsIdo Flatow
 
ASP.NET Web API and HTTP Fundamentals
ASP.NET Web API and HTTP FundamentalsASP.NET Web API and HTTP Fundamentals
ASP.NET Web API and HTTP FundamentalsIdo Flatow
 
Advanced WCF Workshop
Advanced WCF WorkshopAdvanced WCF Workshop
Advanced WCF WorkshopIdo Flatow
 
What's New in WCF 4.5
What's New in WCF 4.5What's New in WCF 4.5
What's New in WCF 4.5Ido Flatow
 
IIS for Developers
IIS for DevelopersIIS for Developers
IIS for DevelopersIdo Flatow
 
Debugging with Fiddler
Debugging with FiddlerDebugging with Fiddler
Debugging with FiddlerIdo Flatow
 

Mehr von Ido Flatow (20)

Google Cloud IoT Core
Google Cloud IoT CoreGoogle Cloud IoT Core
Google Cloud IoT Core
 
Introduction to HTTP/2
Introduction to HTTP/2Introduction to HTTP/2
Introduction to HTTP/2
 
Introduction to HTTP/2
Introduction to HTTP/2Introduction to HTTP/2
Introduction to HTTP/2
 
From VMs to Containers: Introducing Docker Containers for Linux and Windows S...
From VMs to Containers: Introducing Docker Containers for Linux and Windows S...From VMs to Containers: Introducing Docker Containers for Linux and Windows S...
From VMs to Containers: Introducing Docker Containers for Linux and Windows S...
 
Building IoT and Big Data Solutions on Azure
Building IoT and Big Data Solutions on AzureBuilding IoT and Big Data Solutions on Azure
Building IoT and Big Data Solutions on Azure
 
Migrating Customers to Microsoft Azure: Lessons Learned From the Field
Migrating Customers to Microsoft Azure: Lessons Learned From the FieldMigrating Customers to Microsoft Azure: Lessons Learned From the Field
Migrating Customers to Microsoft Azure: Lessons Learned From the Field
 
The Essentials of Building Cloud-Based Web Apps with Azure
The Essentials of Building Cloud-Based Web Apps with AzureThe Essentials of Building Cloud-Based Web Apps with Azure
The Essentials of Building Cloud-Based Web Apps with Azure
 
Introduction to HTTP/2
Introduction to HTTP/2Introduction to HTTP/2
Introduction to HTTP/2
 
Debugging your Way through .NET with Visual Studio 2015
Debugging your Way through .NET with Visual Studio 2015Debugging your Way through .NET with Visual Studio 2015
Debugging your Way through .NET with Visual Studio 2015
 
ASP.NET Core 1.0
ASP.NET Core 1.0ASP.NET Core 1.0
ASP.NET Core 1.0
 
EF Core (RC2)
EF Core (RC2)EF Core (RC2)
EF Core (RC2)
 
Introducing HTTP/2
Introducing HTTP/2Introducing HTTP/2
Introducing HTTP/2
 
Learning ASP.NET 5 and MVC 6
Learning ASP.NET 5 and MVC 6Learning ASP.NET 5 and MVC 6
Learning ASP.NET 5 and MVC 6
 
Powershell For Developers
Powershell For DevelopersPowershell For Developers
Powershell For Developers
 
IaaS vs. PaaS: Windows Azure Compute Solutions
IaaS vs. PaaS: Windows Azure Compute SolutionsIaaS vs. PaaS: Windows Azure Compute Solutions
IaaS vs. PaaS: Windows Azure Compute Solutions
 
ASP.NET Web API and HTTP Fundamentals
ASP.NET Web API and HTTP FundamentalsASP.NET Web API and HTTP Fundamentals
ASP.NET Web API and HTTP Fundamentals
 
Advanced WCF Workshop
Advanced WCF WorkshopAdvanced WCF Workshop
Advanced WCF Workshop
 
What's New in WCF 4.5
What's New in WCF 4.5What's New in WCF 4.5
What's New in WCF 4.5
 
IIS for Developers
IIS for DevelopersIIS for Developers
IIS for Developers
 
Debugging with Fiddler
Debugging with FiddlerDebugging with Fiddler
Debugging with Fiddler
 

Kürzlich hochgeladen

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
Production Debugging War Stories

  • 1. © Copyright SELA Software & Education Labs Ltd. | 14-18 Baruch Hirsch St Bnei Brak, 51202 Israel | www.selagroup.com SELA DEVELOPER PRACTICE May 23-25, 2017 Ido Flatow Production Debugging War Stories
  • 2. THE STORIES YOU ARE ABOUT TO HEAR ARE BASED ON ACTUAL CASES. LOCATIONS, TIMELINES, AND NAMES HAVE BEEN CHANGED FOR DRAMATIC PURPOSES AND TO PROTECT THOSE INDIVIDUALS WHO ARE STILL CODING.
  • 3. For the Next 50 Minutes… Introduction Service hangs Unexplained exceptions High memory consumption
  • 4. Why Are You Here? You are going to hear about Bugs in web applications Tips for better coding Debugging tools, and when to use them You will not leave here as expert debuggers! Sorry But… You will leave with a good starting point And probably anxious to check your code
  • 5. How Are We Going to Do This? What did the client report? Which steps did we use to troubleshoot the issue? What did we find? How did we fix it? What were those tools we used?
  • 6. The Tired WCF Service Client Local bank Reported WCF service works fine for a few hours, then stops handling requests Clients call the service, wait, then time out Server CPU is high Workaround Restart IIS Application pool
  • 7. Troubleshooting Configured WCF to output performance counters Used Performance Monitor to watch WCF’s counters, specifically Instances Percent Of Max Concurrent Calls
  • 8. Troubleshooting - cntd Waited for the service to hang Inspected counter values Value was at 100% (101.563% to be exact) At this point, no clients were active! Reminder - WCF throttles concurrent calls (16 x #Cores)
  • 9. Troubleshooting - cntd Watched w3wp thread stacks with Process Explorer Noticed many .NET threads in sleep loop Issue found - Requests hung in the service, causing it to throttle new requests Fixed code to stop endless loop – problem solved!
  • 10. The Tools in Use Performance Monitor (perfmon.exe) View counters that show the state of various application aspects Most people use it to check CPU, memory, disk, and network state .NET CLR has useful counters for memory, GC, JIT, locks, threads, exceptions, etc. Other useful counters: WCF, ASP.NET, IIS, and database providers Sysinternals Process Explorer Alternative to Task Manager Select a process and view its managed and native threads and stacks Examine each thread’s CPU utilization View .NET CLR performance counters per process https://download.sysinternals.com/files/ProcessExplorer.zip
  • 11. Why We Do Volume Tests Client QA team. Government collaboration app Reported MVC web application works in regular day-to-day use Application succeeded under load tests Under volume tests, application throws unexplained errors Returns HTTP 500, with no specific error message Application logs are not showing any relevant information Workaround None. Failed under volume tests
  • 12. Troubleshooting Checked Event Viewer for errors, found nothing Used Fiddler to view the HTTP 500 response Error text was too general, not very useful
  • 13. Troubleshooting - cntd Decided to use IIS Failed Request Tracing Luckily, the MVC app had an exception filter that used tracing Created a Failed Request Tracing rule for HTTP 500 Added the System.Web.IisTraceListener to the web.config Waited for the test to reach its breaking point…
  • 14. Troubleshooting - cntd Opened the newly created trace file in IE Found an error! Exception in JSON serialization - string too big Stack Overflow to the rescue…
  • 15. Troubleshooting - cntd Ran the test again – failed again! Checked the JavaScriptSerializer serialization code Where is MaxJsonLength set? Inspected MVC’s JsonResult code Found the code that inits the serializer
  • 16. Troubleshooting – almost done Code fix was quite easy: replace return Json(data); with return new JsonResult { Data = data, MaxJsonLength = }; But how big was our JSON string? 5MB? 1GB? Time to grab a memory dump…
  • 17. Troubleshooting – just one more thing Quickest way to dump on an exception - DebugDiag
  • 18. Troubleshooting – final piece of the puzzle Tricky part, using WinDbg to find the values
  • 19. Troubleshooting – final piece of the puzzle Which thread had the exception - !Threads
  • 20. Troubleshooting – final piece of the puzzle Get the thread’s call stack - !ClrStack JavaScriptSerializer.Serialize takes a StringBuilder …
  • 21. Troubleshooting – final piece of the puzzle List objects in the stack - !DumpStackObjects (!dso)
  • 22. Troubleshooting – final piece of the puzzle Get the object’s fields and values - !DumpObj (!do)
  • 23. The Tools in Use Fiddler HTTP(S) proxy and web debugger Inspect, create, and manipulate HTTP(S) traffic View message content according to its type, such as image, XML/JSON, and JS Record traffic, save for later inspection, or export as web tests http://www.fiddlertool.com IIS Failed Request Tracing Troubleshoot request/response processing failures Collects traces from IIS modules, ASP.NET pipeline, and your own trace messages Writes each HTTP context’s trace messages to a separate file Create trace file on: status code, execution time, event severity http://www.iis.net/learn/troubleshoot/using-failed-request-tracing http://www.iis.net/downloads/community/2008/03/iis-70-trace-viewer
  • 24. The Tools in Use Decompilers Browse content of .NET assemblies (.dll and .exe) Decompile IL to C# or VB Find usage of a field/method/property Some tools support extensions and Visual Studio integration http://ilspy.net https://www.jetbrains.com/decompiler http://www.telerik.com/products/decompiler.aspx
  • 25. The Tools in Use DebugDiag Memory dump collector and analyzer Can generate stack trees, mini dumps, and full dumps Automatic dump on crash, hung requests, perf. counter triggers, etc. Contains an analysis tool that scans dump files for known issues https://www.microsoft.com/en-us/download/details.aspx?id=49924 WinDbg Managed and native debugger, for processes and memory dumps Shows lists of threads, stack trees, and stack memory Query the managed heap(s), object content, and GC roots Various extensions to view HTTP requests, detect deadlocks, etc. https://developer.microsoft.com/en-us/windows/downloads/windows-10-sdk
  • 26. Leaking Memory In .NET – It Is Possible! Client Local insurance company Reported Worker process memory usage increases over time Not sure if it’s a managed or a native issue Workaround Increase application pool recycle to twice a day
  • 27. Troubleshooting First, need to know if the leak is native or managed Checked process memory with Sysinternals VMMap Looking at multiple snapshots, seems to be managed (.NET) related
  • 28. Troubleshooting - cntd Time to get some memory dumps Need several dumps, so we can compare them Very simple to do, using Windows Task Manager Next, open them and compare memory heaps
  • 29. Troubleshooting - cntd Compared the dumps with Visual Studio 2015 (Requires the Enterprise edition)
  • 30. Troubleshooting - cntd Didn’t take long to notice the culprit and reason Hundreds of DimutFile objects, each containing large byte arrays
  • 31. Troubleshooting - cntd These objects were not “leaked”, they were cached! Recommended fix included Do not cache many large objects Cache with an expiration date (sliding / fixed)
  • 32. Troubleshooting – wait a second… The memory diff. had another suspicious leak Why are we leaking the HomeController?
  • 33. Troubleshooting - cntd Checked roots Controller is also cached, why? Referenced by the CacheItemRemovedCallback event
  • 34. Troubleshooting - cntd Checked the code again CacheItemRemoved is registered to the event, but it is an instance method Note - adding an instance method to a global event may leak the instance object AND ALL of its referenced objects The fix - change the callback method to static
  • 35. The Tools in Use Sysinternals VMMap Helps in understanding and optimizing memory usage Shows a breakdown of the process memory types Displays virtual and physical memory Can show a detailed memory map of address spaces and usage https://technet.microsoft.com/en-us/sysinternals/vmmap.aspx Visual Studio managed memory debug (Enterprise) Part of Visual Studio’s dump debugger Displays list of object types and their inclusive/exclusive sizes Tracks each object’s root paths Compare memory heaps between dump files https://msdn.microsoft.com/en-us/library/dn342825.aspx
  • 36. Sometimes It Is Simpler Than It Seems Client Local insurance company Reported Local service for downloading files responds poorly when under load A single request takes ~3s, but multiple concurrent requests take ~10s Asked to fine-tune their IIS server Workaround Deploy more servers to handle the load
  • 37. Troubleshooting Started by asking questions about the service File-download service in ASP.NET Web Services (asmx) File is copied from a share, processed, then downloaded as BASE64 Standard tested file size – 10MB each + 10 Concurrent downloads Analyzed what can break: File copying is throttled by local network/disk Processing (convert to PDF) is CPU-bound Part of the code has contention over a resource IIS cannot handle the load of requests (unlikely) Too many options, need to think where to start…
  • 38. Troubleshooting - cntd Started by loading the system in a controlled environment Directed load test at a specific server Pulled that server out of the load balancer (to minimize “noise”) Checked stats under load: CPU – at 20% Network and disk – low usage
  • 39. Troubleshooting - cntd Opened IIS Request Monitoring to check request pipelines Responses are hanging due to a network issue!!
  • 40. Troubleshooting – cntd Issue is with the network, but the server’s network is just fine Maybe it’s the client’s network? Network utilization is at 99%, huh? Local NIC is 100Mbps, what is this, the 90s?
  • 41. Troubleshooting – moment of clarity Checked NIC model – it’s an Intel NIC, 1Gbps Checked with IT department and got the answer – IP Phone Machine’s Ethernet is connected to an IP Phone Phone is connected to the wall The old phone is 100Mbps Let’s test it Connected machine directly to the wall socket Opened Task Manager – NIC is 1Gbps Re-ran the load – takes ~3s for all files concurrently Note – always run load tests from a neutral server
  • 42. The Tools in Use IIS Realtime Request Monitoring A.K.A. Runtime Status and Control API (RSCA) Shows currently executing requests in each application pool Assists in understanding where requests are hanging and for how long Accessible via the IIS Admin or AppCmd %windir%\system32\inetsrv\appcmd list requests Task Manager Everyone knows how to use Task Manager, no?
  • 43. Additional Tools (for next time…) Process monitoring Sysinternals Process Monitor Tracing and logs PerfView (CLR/ASP.NET/IIS ETW tracing), IIS/HTTP.sys logs, IIS Advanced Logging, Log Parser Studio Dumps Sysinternals ProcDump, DebugDiag Analysis Network sniffers Wireshark Microsoft Message Analyzer
  • 44. How to Start? Understand what is happening Be able to reproduce the problem “on demand” Choose the right tool for the task When in doubt – get a memory dump!
  • 45. Resources You had them throughout the slides. My Info: @IdoFlatow // idof@sela.co.il // http://www.idoflatow.net/downloads
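
Slide 34's note, that an instance method registered on a global event keeps the instance and everything it references alive, can be sketched in a few lines of C#. This is a minimal illustration, not the client's code: GlobalCache and its ItemRemoved event are hypothetical stand-ins for the cache-removal callback, and this HomeController is a bare class rather than an MVC controller. It relies only on the fact that a .NET delegate created from an instance method stores that instance in its Target property, while a delegate created from a static method has a null Target.

```csharp
using System;

// Hypothetical stand-in for a process-wide (static) event, such as a
// cache-removal notification. Not an ASP.NET API.
static class GlobalCache
{
    public static event Action<string> ItemRemoved;

    // Lists every subscriber and the object its delegate keeps reachable.
    public static void DescribeSubscribers()
    {
        if (ItemRemoved == null) return;
        foreach (Delegate d in ItemRemoved.GetInvocationList())
        {
            // Target is the instance the delegate is bound to; null for static methods.
            Console.WriteLine("{0}: Target = {1}", d.Method.Name, d.Target ?? "null (static)");
        }
    }
}

// Hypothetical controller; the byte array stands for per-request state.
class HomeController
{
    private readonly byte[] _largeState = new byte[1024 * 1024];

    public int StateSize { get { return _largeState.Length; } }

    // Leaky: the static event now references this instance through the delegate.
    public void SubscribeInstance() { GlobalCache.ItemRemoved += OnItemRemoved; }

    // Safe: a static handler carries no instance reference.
    public static void SubscribeStatic() { GlobalCache.ItemRemoved += OnItemRemovedStatic; }

    private void OnItemRemoved(string key) { }
    private static void OnItemRemovedStatic(string key) { }
}

static class Program
{
    static void Main()
    {
        // The controller goes out of scope here, yet stays reachable from the event.
        new HomeController().SubscribeInstance();
        HomeController.SubscribeStatic();
        GlobalCache.DescribeSubscribers();
    }
}
```

The instance subscriber reports a HomeController as its target, which is why the garbage collector cannot reclaim it or its 1 MB array; the static subscriber reports no target. Unsubscribing with -= when the instance is finished would be an alternative to the static handler the slides recommend.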

Editor's notes

  1. .loadby sos clr; !threads; ~23s; !clrstack; !dso; !dumpobj [addr]
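
Two figures quoted in the slides can be checked with back-of-the-envelope arithmetic. The following is a rough sketch under stated assumptions, not a measurement. For slide 8, the core count is not given; 4 cores with one call counted above the limit is one combination that reproduces the reported counter value. For slides 36 to 41, the estimate assumes 1 MB = 10^6 bytes, counts only the BASE64 payload (4 output characters per 3 input bytes), and ignores protocol overhead.

$$
\text{MaxConcurrentCalls} = 16 \times 4 = 64, \qquad \frac{65}{64} \times 100\% = 101.5625\% \approx 101.563\%
$$

$$
\text{Payload} = 10 \times 10\ \text{MB} \times \frac{4}{3} \approx 133.3\ \text{MB} \approx 1066.7\ \text{Mbit}
$$

$$
t_{100\ \text{Mbps}} \approx \frac{1066.7}{100} \approx 10.7\ \text{s}, \qquad t_{1\ \text{Gbps}} \approx \frac{1066.7}{1000} \approx 1.1\ \text{s}
$$

The 100 Mbps figure lines up with the ~10 s reported for concurrent downloads, which is consistent with the link through the IP phone being the bottleneck. At 1 Gbps the transfer itself would take about a second, so the remaining ~3 s is presumably dominated by copying and processing the files.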