Debugging guide

1. Introduction

From time to time you will find a bug in a program and want to fix it. There are several ways to debug software, and this guide briefly covers some of them. It gives a short introduction to the main usage of each tool.

2. gdb

gdb is the GNU debugger, a command-line tool for debugging and inspecting programs. It lets you step through individual instructions, set breakpoints, and print information from the running program. Graphical front ends also exist, but it is handy to know the basics of the command-line interface.

2.1. Starting the debugger

Before using the debugger, make sure that the program you want to debug is compiled with debug symbols (gcc option -g). This is not strictly necessary, but it helps a lot in identifying source code lines, especially if you cannot read assembly code. It often also helps to disable compiler optimization, since some variables might be optimized away so that their current value cannot easily be printed. On the other hand, some bugs only occur in optimized programs, so in some cases the bug you want to find is not triggered in unoptimized builds.
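To make the effect of these build options concrete, here is a minimal sketch of a function to practice on. The function name sum_array and the local variable current are made up for this example and do not come from any particular program. In an unoptimized build with debug symbols, gdb can print current on every loop iteration. In an optimized build the compiler is free to drop the variable, and gdb may then report it as <optimized out>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function: returns the sum of the first n elements. */
long sum_array(const int *values, size_t n)
{
    /* A NULL array is only acceptable when there is nothing to read. */
    assert(values != NULL || n == 0);

    long total = 0;
    for (size_t i = 0; i < n; i++) {
        /* Intermediate local: visible in gdb at -O0, possibly
         * optimized away at -O2. */
        int current = values[i];
        total += current;
    }
    return total;
}
```

Called from a small main() of your own, the file can be built once with "gcc -g -O0" and once with "gcc -g -O2" to compare what "print current" shows at a breakpoint inside the loop.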

2.1.1. Start the program within gdb with optional arguments

To start the debugging session, you just execute gdb with the program to debug as an argument:
$ gdb <program>
In the gdb prompt, you can run the program with additional args as follows:
(gdb) run arg1 arg2
It is also possible to tell gdb the program arguments when starting gdb so you can execute the program multiple times without having to give the arguments again:
$ gdb --args program arg1 arg2
(gdb) run
This is also handy if you already ran the program with complex arguments when it crashed: just prepend "gdb --args" to the original command line.

2.1.2. Start the program from a core file

If you have a core dump file from a previous crash and want to analyze why it crashed, you can run gdb together with the program and the core file:
$ gdb <program> <core file>
gdb will start as usual, but you do not have to execute "run". Instead, gdb directly shows information about the point where the program crashed.

2.1.3. Attach the debugger to a running program

You can also attach the debugger to an already running program by giving the PID of the program as an argument:
$ gdb <program> <pid>
The program will be interrupted when gdb attaches, but you can continue execution by entering the "continue" command.

2.2. Typical gdb commands

2.3. Printing program information

2.4. Threads

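When debugging programs with multiple threads, you can switch between threads and print information from each thread's context. To list all threads and their IDs:
(gdb) info threads
To switch to the thread with a given ID:
(gdb) thread <id>
To print a backtrace of every thread:
(gdb) thread apply all bt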

2.5. Step-by-step debugging

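When you are at the gdb prompt (after interrupting the program with Ctrl-c or after a breakpoint has triggered), you can step through the program one source line at a time. To execute the next line and step into any function it calls:
(gdb) step
To execute the next line and step over function calls:
(gdb) next
To step by a single machine instruction instead of a source line:
(gdb) stepi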

2.6. Create a core file

You can create a core file from a running program for later inspection by executing:
(gdb) generate-core-file

2.7. Additional commands

2.8. Remote debugging

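It is also possible to debug a program running on another machine by using gdbserver. Basically it involves the following steps. On the target machine, start the program under gdbserver and give it a port to listen on (the port 1234 is only an example):
$ gdbserver :1234 <program> <args>
On the machine where you debug, start gdb with the same program binary and connect to the target:
$ gdb <program>
(gdb) target remote <target host>:1234
Additional information can be found in the gdb manual.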

2.9. Getting help

gdb comes with a built-in help system. You can access it with the "help" command, for example:
(gdb) help bt

3. valgrind

valgrind is a powerful tool for finding program errors that do not directly trigger a crash. For example, forgetting to initialize a variable often leads to unexpected behavior, and accessing memory outside an allocated buffer will often not crash the program directly, but will modify some other, unrelated data, which leads to a crash later in the code. valgrind executes the program on a simulated CPU, which allows it to check each individual instruction and memory access. It prints detailed information and lets you attach a debugger at the instruction where the invalid access happens.
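As an illustration of the kind of error described above, here is a minimal sketch of a function with a deliberate bug. The name write_squares and its parameters are invented for this example. The function is missing the check that count does not exceed capacity, so a call with a larger count writes past the end of the heap block. Such a write frequently does not crash on its own, but valgrind reports it as an invalid write.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates `capacity` ints, stores the squares
 * 0, 1, 4, ... in the first `count` slots and returns the last one.
 * DELIBERATE BUG: there is no check that count <= capacity. */
int write_squares(size_t capacity, size_t count)
{
    assert(capacity > 0);

    int *buf = malloc(capacity * sizeof *buf);
    if (buf == NULL)
        return -1;

    for (size_t i = 0; i < count; i++)
        buf[i] = (int)(i * i);      /* invalid write once i >= capacity */

    /* Also an invalid read when count > capacity. */
    int last = count > 0 ? buf[count - 1] : 0;

    free(buf);
    return last;
}
```

A call such as write_squares(4, 4) stays inside the buffer and is clean. A call such as write_squares(4, 5) goes one element too far, which is the case to run under valgrind.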

3.1. Starting valgrind

To execute a program within valgrind, just pass the program and its arguments to valgrind:
valgrind <program> <arg1>
The program will be executed, and for each error a complete report is printed, including the type of the error and the call stack. If the number of call levels printed is not enough, you can increase the depth of the printed call stack with the --num-callers option:
valgrind --num-callers=20 ...

3.2. Memory leaks

valgrind reports possible memory leaks when the analyzed program terminates. By default, valgrind only outputs the number of bytes not freed. To get more detailed information about each leak, execute valgrind with:
valgrind --leak-check=full --show-reachable=yes ...
Not every reported memory loss is actually a leak. In particular, "still reachable" indicates memory that is still referenced somewhere but was not freed at the end of the program. Also, some system libraries allocate buffers that they use permanently and never free.
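The difference between the two kinds of report can be sketched with a small function. The names make_label and cached_prefix are made up for this example. The static buffer is allocated once and never freed, but a pointer to it still exists when the program ends, so it should appear as "still reachable". The returned string belongs to the caller, and a caller that drops the pointer without calling free() produces a real leak.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocated on first use and never freed. The pointer is still valid
 * at exit, so valgrind lists this block as "still reachable". */
static char *cached_prefix;

/* Hypothetical helper: returns a newly allocated "item: <name>" string.
 * The caller owns the result and must free() it. */
char *make_label(const char *name)
{
    assert(name != NULL);

    if (cached_prefix == NULL) {
        cached_prefix = malloc(8);          /* "item: " plus NUL fits */
        if (cached_prefix == NULL)
            return NULL;
        strcpy(cached_prefix, "item: ");
    }

    size_t len = strlen(cached_prefix) + strlen(name) + 1;
    char *label = malloc(len);
    if (label == NULL)
        return NULL;

    strcpy(label, cached_prefix);
    strcat(label, name);
    return label;   /* not freeing this in the caller is a real leak */
}
```

With --leak-check=full, the call stack printed for a leaked label points at the malloc() inside make_label, which is usually enough to find the caller that forgot to free it.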

3.3. Debug errors

For every program error, valgrind prints a call stack and some information about the type of error. It is possible to use gdb to debug those errors further. To do so, start valgrind with two additional arguments:
valgrind --vgdb=yes --vgdb-error=0 ...
The --vgdb-error option tells valgrind how many errors to let pass before it stops the program and waits for the debugger. A value of 0 makes valgrind wait for the debugger before the program even starts executing, and then stop again at every error.

The actual debugging is done in a separate gdb process. valgrind uses gdb's remote debugging feature, so you have to start gdb in a separate terminal. valgrind will tell you what to execute within gdb, but basically it is just:

(gdb) target remote | /usr/bin/vgdb
As a shortcut, you can start gdb and issue the command in one step:
gdb -ex "target remote | /usr/bin/vgdb" <program>
This only works if there is only one valgrind process running. Otherwise you have to give the pid to vgdb by using the --pid argument:
(gdb) target remote | /usr/bin/vgdb --pid=1234
After starting gdb, you have to continue the program:
(gdb) c
On every error valgrind will break into the debugger, and you can use the normal debugging commands to investigate the problem. Then you can "continue" again until the next problem occurs.

Note: This used to be a little easier, without having to run two programs separately. In older versions of valgrind, you could just do:

valgrind --db-attach=yes ...
and valgrind would ask you on every error whether it should start the debugger.

3.4. Analyzing errors

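Here is a description of the most frequently seen errors:
"Invalid read of size N" or "Invalid write of size N": the program accessed memory outside an allocated block, or memory that was already freed.
"Conditional jump or move depends on uninitialised value(s)": a decision in the program was based on a variable or memory that was never initialized.
"Invalid free() / delete / delete[] / realloc()": the program freed a pointer that was not allocated, or freed it twice.
There are other errors possible which are described in the valgrind manual.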

3.5. helgrind

The valgrind tool helgrind can be used to debug multi-threaded programs by analyzing possible race conditions. It is called like this:
valgrind --tool=helgrind <program> <args> ....
Just like normal valgrind, it outputs possible problems while the program executes. There is no separate analysis tool for helgrind.
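A race condition of the kind helgrind looks for can be sketched as follows. All names here (run_counter, worker, ITERATIONS) are invented for the example. Two threads increment a shared counter. When the mutex is used, the result is exact and helgrind should stay quiet. Without it, the increments race, the result is unpredictable, and helgrind is expected to report a possible data race on the counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ITERATIONS 100000

static long counter;                 /* shared between both threads */
static int use_lock;                 /* set before the threads start */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        if (use_lock)
            pthread_mutex_lock(&counter_lock);
        counter++;                   /* unprotected when use_lock == 0 */
        if (use_lock)
            pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Hypothetical driver: runs two workers and returns the final count.
 * with_lock == 0 deliberately leaves the counter unprotected. */
long run_counter(int with_lock)
{
    pthread_t a, b;
    int rc;

    counter = 0;
    use_lock = with_lock;

    rc = pthread_create(&a, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, worker, NULL);
    assert(rc == 0);

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```

Build with "gcc -g -pthread" and call run_counter(0) from a small main() of your own to give helgrind something to report. Calling run_counter(1) instead shows the same program without the race.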

4. Further reading