Monday, July 20, 2009

A Tribute to the Bug Hunt

There are few experiences in software development quite as humbling, and at the same time as satisfying, as the Bug Hunt. The search for an elusive problem that causes systems to crash, data to be lost and bits to be unintentionally inverted is a wild chase in which one peels off layer after layer of proximate causes in order to get to the ultimate one.

What makes the Hunt interesting is that it usually opens with a preconception of what the problem is and where it resides. Often a hunt ends there. The interesting ones are those where the preconceptions are shattered several times, as they turn out to be mere proximate causes or, worse, false leads.

The humbling part, for me, is how I sometimes have an idée fixe about what the cause is, and how that assumption turns out to be untrue. I daresay this has made me more careful in my interactions with other people and keeps me from jumping to conclusions.

There are many out there who are happy to have found the proximate cause and stop there. The symptom is solved. For example, a warning light flickers in an irritating way, so take out that light. In my experience, symptom-fixing is also typical when an electrician comes to repair something in my house: they observe the symptom, declare it to be the cause and proceed to fix it. They don't ask the critical question: why does this light flicker?

It amazes me to see that this is often the prevailing mood in software development as well. If you are in a position to question someone who has just solved a problem, ask them why the problem occurred, and keep on asking why until you have heard a satisfying account of the ultimate cause. If this is not business as usual, I predict you will find the level of understanding to be lacking.

The lack of understanding will ultimately come back to bite you. Your dirty-fixed application will be plagued by problems that start cooperating in unimagined ways to cause serious mischief, and multiple root causes make the hunt even harder. Your application is then more likely to reach the end of its life prematurely.

The Bug Hunt consists of five parts:
- arrange a problem description
- reproduce the problem in a separate environment
- determine the root cause by isolating the problem
- propose a concept solution
- implement the fix

The only part that should be subject to economic choice is the fourth one, i.e. the type of solution. Here you can choose a proper solution, a quick fix or no solution at all, as the situation allows. Do not save costs on the hunt itself! You will be sorry if you do.

The end result of the hunt should be a narrative that explains exactly what happened. If you are using an issue tracking system, demand that the hunt be documented meticulously, so that knowledge is shared and valuable information on the patient is preserved.
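To make the five parts and the narrative a bit more concrete, here is a minimal sketch in Python of what such a hunt record could look like. Everything in it is an assumption made for illustration: the class `BugHuntRecord`, its field names and the answers in the flickering-light example are invented, and nothing here is tied to any particular issue tracking system.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BugHuntRecord:
    """Hypothetical record of one hunt; all names are illustrative."""

    description: str              # part 1: the problem description
    reproduction: str = ""        # part 2: how it was reproduced elsewhere
    why_chain: List[str] = field(default_factory=list)  # part 3: answers to "why?"
    proposed_solution: str = ""   # part 4: the concept solution
    implemented_fix: str = ""     # part 5: what was actually changed

    def ask_why(self, answer: str) -> None:
        """Record the answer to one more 'why?', one layer deeper."""
        self.why_chain.append(answer)

    def root_cause(self) -> Optional[str]:
        """The deepest answer so far, or None if nobody asked why yet."""
        return self.why_chain[-1] if self.why_chain else None

    def narrative(self) -> str:
        """Render the hunt as plain text, skipping parts not yet filled in."""
        lines = [f"Problem: {self.description}"]
        if self.reproduction:
            lines.append(f"Reproduced: {self.reproduction}")
        for depth, answer in enumerate(self.why_chain, start=1):
            lines.append(f"Why {depth}: {answer}")
        if self.proposed_solution:
            lines.append(f"Proposed: {self.proposed_solution}")
        if self.implemented_fix:
            lines.append(f"Fix: {self.implemented_fix}")
        return "\n".join(lines)


if __name__ == "__main__":
    # Invented example: the answers below are made up for illustration.
    hunt = BugHuntRecord(description="Warning light flickers")
    hunt.ask_why("The sensor reading hovers around the alarm threshold")
    hunt.ask_why("The threshold check has no hysteresis")
    print(hunt.narrative())
```

A record whose why-chain has only one entry is a hint that the hunt stopped at a proximate cause.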

Doing all this right will keep your team's knowledge of the application at a high level and the application itself in good shape. You will be better positioned to scale up, address new problems, refactor old code and add new functionality. The Bug Hunters are the heroes of your team; honor them!

Remember, the most powerful weapon of the Bug Hunter is that one little word: why.

Good Hunting!
