1. Introduction

From time to time you might find a bug in a program and want to fix it. There are several ways to debug software, and this guide briefly covers some of them, giving a short introduction to the main usage of each tool.

2. gdb

gdb is the GNU debugger, a command-line tool to debug and inspect programs. It allows you to step through individual instructions, set breakpoints, and print any information from the running program. Graphical front ends exist as well, but it is handy to know the basics of the command-line interface.

2.1. Starting the debugger

Before using the debugger, make sure that the program you want to debug is compiled with debug symbols (gcc option -g). This is not strictly necessary, but it helps a lot in identifying source code lines, especially if you don't know assembly language. It often also helps to disable compiler optimization (gcc option -O0), since variables may be optimized away so that their current value cannot easily be printed. On the other hand, some bugs only occur in optimized programs, so the bug you are looking for may not be triggered in an unoptimized build.

2.1.1. Start the program within gdb with optional arguments

To start the debugging session, you just execute gdb with the program to debug as an argument:

$ gdb <program>

At the gdb prompt, you can run the program with additional arguments as follows:

(gdb) run arg1 arg2

It is also possible to tell gdb the program arguments when starting gdb so you can execute the program multiple times without having to give the arguments again:

$ gdb --args program arg1 arg2
(gdb) run

This is also handy if you have already run the program with complex arguments when experiencing the crash: just prepend "gdb --args" to the original command line.

2.1.2. Start the program from a core file

If you have a core dump file from a previous crash and want to analyze why it crashed, you can run gdb together with the program and the core file:

$ gdb <program> <core file>

gdb will start as usual, but you don't have to execute "run". Instead, gdb directly shows information about the point where the program crashed.

2.1.3. Attach the debugger to a running program

You can also attach the debugger to an already running program by giving the PID of the program as an argument:

$ gdb <program> <pid>

The program is interrupted when gdb attaches, but you can resume execution by entering the "continue" command.

2.2. Typical gdb commands

Interrupt execution
Once started, the program runs normally until it crashes, hits a breakpoint, or you interrupt it by pressing Ctrl-c.
run (r)
Starts (or restarts) the program.
continue (c)
Continues execution of the program until the next breakpoint or until you interrupt it again.

break
Sets a breakpoint, for example at a source code line:

(gdb) break <sourcefile.c>:<source code line>

or at any instruction address:

(gdb) break *0x4bff0000

quit
Exits the debugger. If the program is still running, gdb asks for confirmation and then kills it, or detaches from it if you attached to an already running process.

2.3. Printing program information

backtrace (bt)
Prints the call stack from the innermost to the outermost function. The option "full" also prints the values of the local variables.
up
Moves the focus to the calling function, one frame up the call stack, and shows the point from which the current function was called.
down
Moves back one frame toward the called function.
Both up and down accept an argument indicating how many frames to move. For example, up 3 moves three frames toward the outermost function.
frame <n>
Jumps directly to call stack frame <n>.
print <variable>
Shows the content of the variable. You can also access structure members, dereference pointers as in print *foo->bar, or even call functions.
list *<address>
Shows the source code for the given address.
x <address>
Examines the memory at the given address and, where possible, prints the enclosing symbol (such as a function name) together with the content.
disass <address1>,<address2>
Prints the disassembled program code between the two addresses. Current gdb versions expect a comma between the addresses. To disassemble around the current execution point, use disass $pc-32,$pc+32
info registers
Prints the content of the CPU registers.

2.4. Threads

When debugging programs with multiple threads, it is possible to switch between threads and print information from each thread context:

info threads
Shows an overview of all active threads and their numbers.
thread <thread number>
Switches to the given thread.
thread apply
Executes gdb commands for multiple threads. Examples:
- thread apply 1 3 bt prints the backtrace for threads 1 and 3.
- thread apply all bt prints the backtrace of all threads.

2.5. Step-by-step debugging

When you are at the gdb prompt (after interrupting the program with Ctrl-c or after a breakpoint has triggered), you can step through the program one source line at a time by using:

next (n)
Executes the next source line. If the line contains a function call, the whole function is executed before control returns to you.
step (s)
Executes the next source line, but enters the called function if the line contains a function call.

2.6. Create a core file

You can create a core file for later inspection from a running program by executing

(gdb) generate-core-file

2.7. Additional commands

directory
Sets the directory where gdb searches for source files. If you have compiled the program with debug information and the sources are still where they were at compile time, it should not be necessary to set this directory.
break main
Adds a breakpoint at the main function, where the program's own code starts. This is handy if you want to add breakpoints for a function whose symbols are not yet loaded by gdb, for example because it lives in a shared library. You can add the main breakpoint, run the program, and add the additional breakpoints after gdb stops.

2.8. Remote debugging

It is also possible to debug a program running on another machine by using gdbserver. Basically, it involves the following steps:

copy the program to both machines (the server running the program, the client doing the debugging).

start the server on the target machine:

$ gdbserver client_ip_or_name:port <program>

start the client debugger:

$ gdb <program>
(gdb) target remote server_ip_or_name:port

continue (or initially start) the program:
```
(gdb) continue
```
debug as usual

Additional information can be found in the gdb manual.

2.9. Getting help

gdb comes with a built-in help system. You can access it with the command help, for example:

(gdb) help bt

3. valgrind

valgrind is a powerful tool to find program errors that do not directly trigger a crash. For example, forgetting to initialize a variable often leads to unexpected behavior, and accessing memory outside an allocated buffer often does not crash the program directly but modifies some other, unrelated data, which leads to a crash later in the code. valgrind executes the program on a simulated CPU, which allows it to check each individual instruction and memory access. It prints detailed information and allows you to attach a debugger at the instruction where the invalid access happens.

3.1. Starting valgrind

To execute a program within valgrind, just give the program and its arguments to the tool valgrind:

valgrind <program> <arg1>

The program is executed, and for each error a complete report is printed, including the type of the error and the call stack. If the number of call levels printed is not enough, you can increase the depth of the printed call stack with the --num-callers option:

valgrind --num-callers=20 ...

3.2. Memory leaks

valgrind outputs possible memory leaks when the analyzed program terminates. By default, valgrind only outputs the number of bytes not freed. To get more detailed information about each leak, execute valgrind with:

valgrind --leak-check=full --show-reachable=yes ...

Not every reported memory loss is actually a leak. In particular, "still reachable" indicates memory that is still referenced somewhere but was not freed by the end of the program. Also, some system libraries allocate buffers that they use permanently and never free.

3.3. Debug errors

For every program error, valgrind prints a call stack and some information about the type of error. It is possible to use gdb to further debug those errors. To do so, start valgrind with two additional arguments:

valgrind --vgdb=yes --vgdb-error=0 ...

The --vgdb-error option tells valgrind how many errors to report before stopping and waiting for the debugger. With 0, valgrind already waits for the debugger at startup, before the program executes, and then stops again on every error.

The actual debugging is done in a separate gdb process. Valgrind uses gdb's remote debugging feature, so you have to start gdb in a separate terminal. Valgrind tells you what to execute within gdb, but basically it is just

(gdb) target remote | /usr/bin/vgdb

As a shortcut, you can start gdb and issue the command in one step:

gdb -ex "target remote | /usr/bin/vgdb" <program>

This only works if there is only one valgrind process running. Otherwise you have to give the pid to vgdb by using the --pid argument:

(gdb) target remote | /usr/bin/vgdb --pid=1234

After starting gdb, you have to continue the program:

(gdb) c

For every error, valgrind stops the program and hands control to the debugger, so you can use the normal debugging commands to investigate the problem. Then you can "continue" again until the next problem occurs.

Note: This used to be a little easier, without having to run two programs separately. In older versions of valgrind, you could just do:

valgrind --db-attach=yes ...

and valgrind would ask you on every error whether it should start the debugger.

3.4. Analyzing errors

Here is a description of the most frequently seen errors:

"Conditional jump or move depends on uninitialised value(s)"
This error means that some variable in a condition is uninitialized, so the resulting execution path is probably not the intended one. Since the debugging information in the object file is not detailed enough to identify the actual variable that is uninitialized, you have to add additional code (or split the condition into multiple lines) to determine the variable.
Keep in mind that a variable also becomes uninitialized when another uninitialized value is assigned to it.
This error may also appear when a variable is correctly initialized only in some control paths, but not in all of them (as in an if-statement).
"Use of uninitialised value of size x"
Similar to the previous error, this indicates that uninitialized memory of the given size is accessed. It might be an uninitialized buffer or a variable that is read.
Often this means that you forgot to initialize some memory, or that memory is accessed outside the expected boundaries.
"Invalid read of size x"
The address of a read operation is invalid. This is often due to completely wrong pointers or to accessing memory that has already been freed. valgrind shows additional information about the error, such as the address or where the memory was allocated. The program often does not crash directly, but it will show undefined behavior at some later point.
"Invalid write of size x"
The write operation targets an invalid address. This often results in an immediate crash, or it corrupts data or instruction memory, leading to completely undefined behavior. When this happens, you cannot trust any further debugging information, as the code being executed might no longer be the code that was compiled.

Other possible errors are described in the valgrind manual.

3.5. helgrind

The valgrind tool helgrind can be used to debug multi-threaded programs by analyzing possible race conditions. It's called like this:

valgrind --tool=helgrind <program> <args> ....

Just like the default valgrind tool, it reports possible problems during execution; there is no separate analysis tool for helgrind.
