This blog continues the Troubleshooting series that I’m planning to keep writing. If you haven’t read the previous post, please read it before beginning this one. In the last blog, I spoke about a few questions to ask and about my first step in any troubleshooting: isolation. It helps me narrow down my search and gives me the opportunity to do concrete troubleshooting rather than trying a couple of random things like rebooting. Yeah, a reboot does fix a lot of problems, but will you understand why the problem really happened after doing it? In a lot of instances, possibly not. If your application is smart about logging, and the problem can easily be identified in the default logs such as the event logs, IIS logs, and httperr logs, you are golden. Otherwise, your boss might push you to do an RCA (root cause analysis) when you do not have sufficient data.
In this blog, I’m going to talk about one of the common issues, the crash, and what you can do when you run into it. Again, this is not a complete list of things you can do, but it will at least give you a head start. You know what a crash means, but will you be able to look at an end-user scenario and right away classify it as a crash? No, not all the time. You know that your code needs to run inside a process in Windows. I call it a “crash” when that process exits for unknown reasons, that is, when it terminates unexpectedly. If you are running your ASP.NET websites on IIS 7 or later, it is the w3wp.exe process that runs your code.
Understanding a Crash and Its Symptoms
What symptoms in your application tell you that you are dealing with a crash, and what can you see in the existing logs? I’ll also talk about a few common tools that will help you troubleshoot the crash.
First of all, let’s talk about a few common errors and issues that the end user notices:
- Session Loss – This is the most common symptom for people who store the ASP.NET Session InProc. When the ASP.NET Session mode is configured as InProc, your session variables (and their values) are stored inside the w3wp.exe process of the Application Pool that’s configured to run your application, so they are lost when that process dies. One of my colleagues has written briefly about a few questions to ask, and the possible reasons for this, in this article. Please read it; it is a valuable ‘session loss’ troubleshooting guide. People often call it ‘Session Loss’ when the application tells them that they have been logged out of the system and asks them to log in again. Some application developers write custom logic for this: they store a session variable when the session starts, check for that value on a few pages, and show the “logout” message if the variable doesn’t exist, forcing you to log in again, which initializes the session variables again.
- Webpage keeps spinning for a long time before it gives you the result – This is another very common scenario, but the end user usually reports it as slow performance. If the process exits, it takes down everything you have initialized, be it session variables, the ASP.NET cache, or anything else held in memory. The most common logic is to re-initialize this data when it is not available, so the first requests after a crash spend time rebuilding it: fetching from the database, reading from the file system, and so forth.
- You see a ‘Service Unavailable’ error in the page – This is also very common. It appears when the process serving the Application Pool has terminated unexpectedly x times within y minutes. With the default configuration in IIS, if your process crashes 5 times within 5 minutes, the Application Pool is disabled (stopped), and the Administrator has to start it manually to get the site back. You can always change this option, but I’d say it is better to leave it at the default, so that it at least prompts you to fix the problem. If you disable this option, your crashes will go unnoticed unless you pay attention to the user experience and to the event logs. You can configure this in the ‘Advanced Settings’ of an Application Pool, under Rapid-Fail Protection. You can also configure other options there, such as a custom executable to run when the AppPool gets disabled by this ‘Rapid-Fail Protection’ feature of IIS.
- Other custom error messages that your application might throw when the initialized data is no longer available in the process memory.
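To make the ‘5 crashes within 5 minutes’ rule from the Rapid-Fail Protection bullet concrete, here is a minimal Python sketch of that kind of check. It is only a toy model for reasoning about the setting, not how IIS implements it: the class name `RapidFailTracker`, the sliding-window bookkeeping, and the assumption that the pool is disabled on the 5th crash inside the window are all mine.

```python
from collections import deque
from datetime import datetime, timedelta


class RapidFailTracker:
    """Toy model: disable the pool after max_crashes crashes within interval."""

    def __init__(self, max_crashes=5, interval=timedelta(minutes=5)):
        # Defaults mirror the 5-crashes-in-5-minutes default described above.
        self.max_crashes = max_crashes
        self.interval = interval
        self.crash_times = deque()
        self.disabled = False

    def record_crash(self, when):
        """Record one worker process crash; return True once the pool is disabled."""
        self.crash_times.append(when)
        # Forget crashes that are older than the interval (assumed sliding window).
        while when - self.crash_times[0] > self.interval:
            self.crash_times.popleft()
        if len(self.crash_times) >= self.max_crashes:
            self.disabled = True
        return self.disabled


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0, 0)
    tracker = RapidFailTracker()
    for minute in range(5):
        tracker.record_crash(start + timedelta(minutes=minute))
    print("Pool disabled:", tracker.disabled)
```

In this model, crashes that are spread out (say, one every two minutes) never trip the limit, which is exactly why occasional crashes can go unnoticed until you look at the event logs.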
Event logs generated for a crash of the Application Pool
Here are a few event descriptions that you would see if a crash occurs:
Log Name: System
Date: [time stamp]
Event ID: 5011
A process serving application pool '[app pool name]' suffered a fatal communication error with the Windows Process Activation Service. The process id was '[PID]'. The data field contains the error number.
A process serving application pool 'DefaultAppPool' terminated unexpectedly. The process id was '[PID]'. The process exit code was 'exit code'.
Application pool '%1' is being automatically disabled due to a series of failures in the process(es) serving that application pool.
Event ID : 1000
Raw Event ID : 1000
Record Nr. : 15
Category : None
Source : .NET Runtime 2.0
Error Reporting Type : Error
Message : Faulting application w3wp.exe, version 6.0.3790.1830, stamp 42435be1, faulting module mscorwks.dll, version 2.0.50727.42, …
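If you have exported a lot of event log entries and want to sort the crash-related ones quickly, the message text above is enough to go on. The following is a minimal Python sketch that classifies a message by matching the phrases from these event descriptions. The labels and the function name `classify_event` are hypothetical, and the sketch assumes you already have each message as a plain string; it does not read the Windows event log itself.

```python
import re

# Phrases taken from the event descriptions above; the labels are made up here.
PATTERNS = [
    ("communication_error", re.compile(r"suffered a fatal communication error")),
    ("unexpected_exit", re.compile(r"terminated unexpectedly")),
    ("pool_disabled", re.compile(r"is being automatically disabled")),
    ("faulting_module", re.compile(r"Faulting application w3wp\.exe")),
]

# Pulls the pool name out of "application pool 'Name'" when it is present.
POOL_NAME = re.compile(r"[Aa]pplication pool '([^']*)'")


def classify_event(message):
    """Return (label, pool_name); both are None if the message is not crash-related."""
    for label, pattern in PATTERNS:
        if pattern.search(message):
            match = POOL_NAME.search(message)
            pool = match.group(1) if match else None
            return label, pool
    return None, None


if __name__ == "__main__":
    sample = ("A process serving application pool 'DefaultAppPool' "
              "terminated unexpectedly. The process id was '1234'.")
    print(classify_event(sample))
```

Counting the labels per Application Pool over time gives you a quick picture of which pool is crashing and how often, before you move on to memory dumps.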
Now you have seen how to define a crash, its possible symptoms, and the related event logs. But what next? You need to find the cause of the problem, right? You should first look for clues in the event logs to see whether anything else was logged by the IIS/ASP.NET components around the time of the issue, other than the events above. You might even be lucky enough to spot a 3rd party module that’s causing the crash, but not all the time. What can really help you is a memory dump of the process, captured just before the process dies. There are a few tools that can help you collect a memory dump of the crashing process, and a few that can analyze the dumps for you to the extent of showing you the crashing stack. For collection, Windows also has a built-in option that saves these memory dumps, called Windows Error Reporting (WER). The blogs below show the steps to collect the dumps for this scenario.
Using Windows Error Reporting
Using WER: Collecting User-Mode Dumps
How To: Collect a Crash dump of an IIS worker process on IIS 7.0 (and above)
How to use ADPlus to troubleshoot "hangs" and "crashes"
Using DebugDiag Tool
How to Use the Debug Diagnostic Tool v1.1 (DebugDiag) to Debug User Mode Processes
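For the Windows Error Reporting route, the setup comes down to a few registry values under the LocalDumps key (DumpFolder, DumpCount, and DumpType). As a rough illustration, this Python sketch only builds the `reg add` command lines for a per-process key and does not change anything on the machine. The function name `wer_localdumps_commands` and the folder `D:\Dumps` are hypothetical; please verify the values against the “Collecting User-Mode Dumps” documentation before running the commands on a server.

```python
# Per-process keys go under LocalDumps\<exe name>.
LOCALDUMPS_KEY = r"HKLM\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps"


def wer_localdumps_commands(exe_name, dump_folder, dump_count=10, full_dump=True):
    """Build (but do not run) reg add commands that enable WER dumps for one exe.

    dump_folder should not end with a backslash, or it would escape the
    closing quote on the command line.
    """
    key = LOCALDUMPS_KEY + "\\" + exe_name
    dump_type = 2 if full_dump else 1  # 2 = full dump, 1 = mini dump
    return [
        f'reg add "{key}" /v DumpFolder /t REG_EXPAND_SZ /d "{dump_folder}" /f',
        f'reg add "{key}" /v DumpCount /t REG_DWORD /d {dump_count} /f',
        f'reg add "{key}" /v DumpType /t REG_DWORD /d {dump_type} /f',
    ]


if __name__ == "__main__":
    # D:\Dumps is a made-up folder; pick a drive with enough free space.
    for command in wer_localdumps_commands("w3wp.exe", r"D:\Dumps"):
        print(command)
```

Full dumps of w3wp.exe can be large, so it is worth keeping the dump count low and cleaning up the folder once you have the dump you need.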
Other tools that can help you are ADPlus (which comes with the Debugging Tools for Windows) and ProcDump from Microsoft TechNet Sysinternals (with the -t option, which writes a dump when the process terminates). The DebugDiag tool also comes with a powerful analyzer: you can just double-click the dump file that was collected, and it will create a nice report containing the crashing call stack along with a possible explanation and next steps for the issue. If you are interested in debugging the collected dumps yourself, the links below could be super helpful! Tess is known for her blogs on dump analysis, and she is a great person to interact with!
.NET Debugging Demos Lab 2: Crash
Hanselminutes on 9 - Debugging Crash Dumps with Tess Ferrandez and VS2010
I’ll follow up with more posts on general troubleshooting.