If you do not know me, I work as an Escalation Engineer on the Microsoft IIS/ASP.NET Support team in Bangalore, primarily debugging our customers’ ASP.NET applications. That’s my day job: debugging other people’s code, along with any Microsoft component that’s involved. I’m planning to write a series of posts on general troubleshooting and the steps I typically use to diagnose a customer problem. You can use the same steps, and it is definitely not rocket science. If I can do this, you can too!
In this first post, I will not talk about anything technical, only a few things you might want to do before you start the real troubleshooting. Let’s take the scenario of a ‘slow running’ ASP.NET website and see how to approach the problem step by step. If you have worked with Microsoft Support before, you know we generally ask ‘many’ questions. All of those questions are meant to help us understand the problem better. For this slow request problem, here are the typical questions you would need to ask yourself when troubleshooting. A few of them apply to ‘any’ problem that you troubleshoot.
- The issue is slowness, but how slow is it? First, you need to understand how much delay you are talking about, so that you can pick the right tools to troubleshoot it quickly.
- It is slow, but how fast should it be? What response time do you expect from the page? If you do not have a benchmark, you are shooting in the dark. In practice, this number comes from your testing: you would know how long the page typically takes, depending on what it does. Of course, if it runs a lengthy database operation, the number will be larger. It is essential that you have something to compare against.
- Environment of the server. Next, you need to know where the problem is occurring. If you have multiple servers, on which of them does the slow performance occur? It is possible that the affected servers are simply slow ones running on old hardware. Understanding the details of the environment is very important.
- Environment of the client. You own the server, but the problem is probably reported by your end users. You should try to learn about their environment as well. If many users are affected, you might want to understand everything from their OS and browser versions to the network topology.
- When did the problem start? This is the most interesting question of all, but it is also the one that often does not get a ‘right’ answer. One of the many reasons is that too many changes may have been made around the same time. These range from deploying the application on a new server, installing a new service pack for the OS, or upgrading the application, to adding new users to the application. A clear understanding of this helps diagnose the problem better.
- What exactly is slow? This is another tough question to answer, since many pages might be slow and you may not know all of them. But it is essential that you list the name of the page and the operations on that page that show the problem. For example, a button click on the login page is slow to respond.
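The first two questions (how slow is it, and how fast should it be) come down to comparing measured response times against a baseline. Here is a minimal sketch of that comparison. It is written in Python purely for illustration, and the function name `summarize_latency`, its parameters, and the sample numbers are all my own assumptions, not part of any tool. It assumes you have already collected a handful of response times in milliseconds, from whatever source you trust.

```python
import statistics


def summarize_latency(samples_ms, baseline_ms):
    """Compare measured response times (ms) with an expected baseline (ms).

    Hypothetical helper for illustration only.
    """
    if not samples_ms:
        raise ValueError("need at least one sample")
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")

    ordered = sorted(samples_ms)
    median = statistics.median(ordered)

    # Nearest-rank 95th percentile, computed with integer arithmetic
    # (ceiling of 0.95 * n) to avoid floating point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    p95 = ordered[rank - 1]

    return {
        "median_ms": median,
        "p95_ms": p95,
        # How many times slower the typical request is than expected.
        "slowdown": median / baseline_ms,
    }


if __name__ == "__main__":
    # Made-up numbers: five measured requests against a 200 ms baseline.
    print(summarize_latency([200, 400, 600, 800, 1000], baseline_ms=200))
```

A slowdown close to 1 suggests the page is behaving as expected, while a large gap between the median and the 95th percentile hints that only some requests are slow, which is itself a useful clue.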
Again, these are a few important questions to ask, not the only ones. If I get a chance to talk to you while diagnosing your problem, I’ll perhaps ask 100 more questions, all definitely related to the problem :) Okay, what next? Once you have the answers to these questions, what is your next step?
My first step in troubleshooting is always ‘Isolation’. Try to narrow down your search. First, split your main problem into pieces that you can troubleshoot separately. For example, one button click might perform 10 different activities; try to isolate which of those 10 has the problem, so that you can concentrate only on the activity that is slow. Isolation also includes checking whether the problem is limited to only a few users or affects all of them. If it happens for only, say, 2 of your users from their workstations, you may be able to shift your attention away from the server; perhaps they have a slow network. This is just an example. Your problem could well be on the server even when just 2 users are affected, for instance because of custom code that runs only for them or because the query generated for them is different.
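To make the ‘one button click, 10 activities’ idea concrete, here is a rough sketch of timing each activity separately and picking out the slowest one. It is in Python only for illustration (in an ASP.NET application you would typically do the same thing with `System.Diagnostics.Stopwatch`), and the names `timed` and `slowest_step`, along with the step names in the usage part, are hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(step_name, timings):
    """Record how long the wrapped block takes, in seconds, under step_name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        # Accumulate, so a step that runs several times is counted in full.
        timings[step_name] = timings.get(step_name, 0.0) + elapsed


def slowest_step(timings):
    """Return the (step_name, seconds) pair with the largest total time."""
    return max(timings.items(), key=lambda item: item[1])


if __name__ == "__main__":
    timings = {}
    # Hypothetical activities behind a single login button click.
    with timed("validate_input", timings):
        time.sleep(0.01)
    with timed("query_database", timings):
        time.sleep(0.05)
    with timed("render_page", timings):
        time.sleep(0.02)
    print(slowest_step(timings))
```

Once you know which activity dominates the total time, you can ignore the rest and dig into just that one.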
Once you isolate the problem, the very important next step is to make sure you aren’t troubleshooting something that is already resolved in, say, the latest release of your website. Do not try to reinvent the wheel. Do not waste your CPU cycles (!) working on an issue that someone has already fixed. Search. Search support.microsoft.com, search Stack Overflow, search Bing and, most important, search your internal database, if you have one, for issues that are already fixed. If you are facing a critical problem, make sure you search enough before trying to dig deeper. Admittedly, this step doesn’t apply to problems that are isolated to just your application, which is the case most of the time, like the slowness we were talking about. But for issues like exceptions, runtime errors, etc., it is well worth doing.
Only after you have a clear understanding of the problem and the environment it is isolated to, and have made sure the issue is not a known one, should you proceed further. I could write more in this post, but I’ll reserve more general troubleshooting techniques like these for future posts. If you are curious, here is what I’m planning to write about next. I’m sure I’ll add more to this list, and perhaps I will update this post when I do.
- Defining common issues in ASP.NET applications – slow performance, hangs, crashes, high memory, etc.
- Built-in tools that help you troubleshoot a few of these issues.
- How much can existing logs such as IIS logs, HTTPERR logs, and Event logs tell you?
- Scenario #1: Troubleshooting a slow performance problem using various tools.
- More Scenarios, more tools, whenever I find time to write.
Follow along if you are interested.