PHYSICS 580                                                                                                            Fall 2006

 

HINTS FOR DEBUGGING

 

A “bug” is an error in your program. (For an interesting history of the term “bug”, see http://www.jamesshuggins.com/h/tek1/first_computer_bug.htm )  Almost no one writes a bug-free program from scratch the first time. According to lore, the standard for the software industry is 10 lines of error-free code per programmer per day.

Broadly speaking, there are two classes of bugs: bugs that cause your program to crash, and bugs that give you wrong answers.

There exist debugging tools that can be very useful; they are, however, platform-dependent and so I will not discuss them. Such tools simplify the process discussed below. In general you have to compile the code with the right switch, such as -g, to make it usable with a debugger. Keep in mind that code compiled with the debug switch generally runs much more slowly.

 

Writing short routines and including error traps will help enormously in finding bugs.
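As a rough illustration of what I mean by a short routine with an error trap, here is a minimal sketch. It assumes free-form Fortran 90; the routine name safe_divide and the convention that errflag = 0 means success are my own choices for this example, not a standard.

```fortran
! Sketch only: safe_divide and the errflag convention are illustrative.
subroutine safe_divide(a, b, c, errflag)
  implicit none
  real, intent(in)     :: a, b
  real, intent(out)    :: c
  integer, intent(out) :: errflag   ! 0 = success, 1 = division by zero

  errflag = 0
  c = 0.0
  if (b == 0.0) then
     errflag = 1        ! trap the error instead of dividing
     return
  end if
  c = a / b
end subroutine safe_divide

program test_trap
  implicit none
  real    :: c
  integer :: errflag

  ! Deliberately trigger the trap to check that it works.
  call safe_divide(1.0, 0.0, c, errflag)
  if (errflag /= 0) write(*,*) 'safe_divide failed, errflag = ', errflag
end program test_trap
```

The caller checks errflag immediately after the call, so a failure is reported where it happens rather than showing up later as a mysterious wrong answer.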

 

Staring at code generally does not help you find bugs, except for elementary mistakes in syntax. This is true even for experts.

 

Common culprits that cause crashes and/or erroneous answers:

Arrays out-of-bounds. An array is dimensioned to, say, 100, but you write to element 101. This can lead to nasty consequences, but sometimes nothing at all happens. This is easily found with a bounds-checking switch while compiling (usually -CB for the Intel compiler, or -fbounds-check for the GNU compilers; newer versions of gfortran also accept -fcheck=bounds).

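Mismatched calls for subroutines/functions. This happens if you declare

subroutine multiply(n,a,b,c,errflag)

but then call it with a different number of arguments:

call multiply(n,a,b,c)

Without an explicit interface, many compilers will not catch this, and the program may crash or quietly corrupt memory.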

Mismatched declarations. In the example above, this happens if you declare the variable a to be real in one part of the code but double precision in the subroutine, or if the arrays have incongruent dimensions.

Failure to initialize variables. Some compilers will initialize a variable to zero, but not all. Thus the following loop

do i = 1,100

  if(i/2*2 .eq. i)nevens = nevens+1

enddo

could give a nonsensical answer, because you didn’t initialize nevens = 0.
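The culprits listed above can mostly be guarded against with a few habits. The following is a minimal sketch, assuming free-form Fortran 90; the module name array_tools and the routine scale_array are hypothetical names invented for this example.

```fortran
! Sketch only: array_tools and scale_array are hypothetical names.
module array_tools
  implicit none
contains
  ! A routine inside a module has an explicit interface, so a call
  ! with the wrong number or type of arguments fails to compile.
  subroutine scale_array(n, a, factor, errflag)
    integer, intent(in)  :: n
    real, intent(inout)  :: a(n)
    real, intent(in)     :: factor
    integer, intent(out) :: errflag
    integer :: i

    errflag = 0
    if (n < 1) then
       errflag = 1
       return
    end if
    do i = 1, n
       a(i) = factor * a(i)
    end do
  end subroutine scale_array
end module array_tools

program culprits
  use array_tools
  implicit none          ! forces every variable to be declared
  integer, parameter :: n = 100
  real    :: a(n)
  integer :: i, nevens, errflag

  nevens = 0             ! explicit initialization
  do i = 1, n
     a(i) = real(i)      ! loop bound matches the declared dimension
     if (i/2*2 == i) nevens = nevens + 1
  end do
  write(*,*) 'nevens = ', nevens

  call scale_array(n, a, 2.0, errflag)
  if (errflag /= 0) write(*,*) 'scale_array failed'
  write(*,*) 'a(n) = ', a(n)
end program culprits
```

Using the same parameter n for the array dimension and the loop bound keeps the two from drifting apart. While testing, it is still worth compiling with the bounds-checking switch.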

 

Beyond the above, you may have simply made a mistake in your algorithm or logic. There is no single route to finding a bug. Most broadly:

-- Build your code from the ground up. Test each subroutine as you create it. Don’t wait until the very end to test it; it will be harder to find where the bug is. For example, if you are writing a code to do multidimensional quadrature, test your 1D integration routine separately and first.

-- Think of simple test cases for which you know the correct answer. Test thoroughly and obsessively. Don’t do just one test and assume your code is correct; it probably isn’t.

-- If you compare to another person’s code, don’t just look at the code itself; in the context of this course, that would be cheating, and besides, this method doesn’t work for large, complex codes. Instead, look at some intermediate results where you can compare the inner workings of the two codes. For example, suppose you are writing a multidimensional quadrature code using Gauss-Hermite quadrature, and your friend has a working code that does multidimensional quadrature using Bode’s rule. You can compare 1-D or 2-D results. If you are not using a debugger, this generally means using lots of write statements. Get used to it.
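To make the advice above concrete, here is a minimal sketch of testing a 1D integration routine on a case with a known answer, using write statements to print intermediate results. It assumes free-form Fortran 90 and uses the simple trapezoid rule as a stand-in (not Gauss-Hermite or Bode’s rule); the names quad1d, f, and trapezoid are made up for this example.

```fortran
! Sketch only: quad1d, f, and trapezoid are illustrative names.
module quad1d
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Test integrand with a known integral: x**2 on [0,1] gives 1/3.
  function f(x) result(y)
    real(dp), intent(in) :: x
    real(dp) :: y
    y = x*x
  end function f

  ! Composite trapezoid rule with n equal subintervals.
  function trapezoid(a, b, n) result(s)
    real(dp), intent(in) :: a, b
    integer, intent(in)  :: n
    real(dp) :: s, h
    integer :: i

    h = (b - a) / n
    s = 0.5_dp * (f(a) + f(b))
    do i = 1, n - 1
       s = s + f(a + i*h)
    end do
    s = s * h
  end function trapezoid
end module quad1d

program test_quad
  use quad1d
  implicit none
  real(dp) :: exact, approx
  integer  :: n

  exact = 1.0_dp / 3.0_dp
  n = 10
  do while (n <= 1000)
     approx = trapezoid(0.0_dp, 1.0_dp, n)
     ! Print intermediate results so the convergence can be inspected.
     write(*,*) 'n =', n, ' result =', approx, ' error =', abs(approx - exact)
     n = n * 10
  end do
  if (abs(approx - exact) > 1.0e-6_dp) write(*,*) 'TEST FAILED'
end program test_quad
```

For the trapezoid rule the error should shrink by roughly a factor of 100 each time n increases tenfold. If it does not, something is wrong, even if the final number looks plausible.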

 

Again, I cannot emphasize enough the importance of validation through test cases. Try as many variations as practical. Assume something is wrong with your code, and try to find it.

(A good way, incidentally, is to hand over your code to a colleague and let them try it out. They won’t have the same unconscious assumptions as you and may very quickly find either bugs in your code, or something awkward in the way your code works.)