Debugging Python Applications with the PDB Module

Introduction

In this tutorial, we are going to learn how to use Python's PDB module for debugging Python applications. Debugging refers to the process of removing software and hardware errors from a software application. PDB stands for "Python Debugger", and is a built-in interactive source code debugger with a wide range of features, like pausing a program, viewing variable values at specific instances, changing those values, etc.

In this article, we will be covering the most commonly used functionalities of the PDB module.

Background

Debugging is one of the most disliked activities in software development, and at the same time, it is one of the most important tasks in the software development life cycle. At some stage, every programmer has to debug his/her code, unless he is developing a very basic software application.

There are many different ways to debug a software application. A very commonly used method is using the "print" statements at different instances of your code to see what is happening during execution. However, this method has many problems, such as the addition of extra code that is used to print the variables' values, etc. While this approach might work for a small program, tracking these code changes in a large application with many lines of code, spread over different files, can become a huge problem. The debugger solves that problem for us. It helps us find the error sources in an application using external commands, hence no changes to the code.

Note: As mentioned above, PDB is a built in Python module, so there is no need to install it from an external source.

Key Commands

To understand the main commands or tools that we have at our disposal in PDB, let's consider a basic Python program, and then try to debug it using PDB commands. This way, we will see with an example what exactly each command does.

# Filename: calc.py

operators = ['+', '-', '*', '/']
numbers = [10, 20]

def calculator():
    print("Operators available: ")
    for op in operators:
        print(op)

    print("Numbers to be used: ")
    for num in numbers:
        print(num)

def main():
    calculator()

main()

Here is the output of the script above:

Operators available:
+
-
*
/
Numbers to be used:
10
20

I have not added any comments in the code above, as it is beginner friendly and involves no complex concepts or syntax at all. It's not important to try and understand the "task" that this code achieves, as its purpose was to include certain things so that all of PDB's commands could be tested on it. Alright then, let's start!

Using PDB requires use of the Command Line Interface (CLI), so you have to run your application from the terminal or the command prompt.

Run the command below in your CLI:

$ python -m pdb calc.py

In the command above, my file's name is "calc.py", so you'll need to insert your own file name here.

Note: The -m is a flag, and it notifies the Python executable that a module needs to be imported; this flag is followed by the name of the module, which in our case is pdb.

The output of the command looks like this:

> /Users/junaid/Desktop/calc.py(3)<module>()
-> operators = [ '+', '-', '*', '/' ]
(Pdb)

The output will always have the same structure. It will start with the directory path to our source code file. Then, in brackets, it will indicate the line number from that file that PDB is currently pointing at, which in our case is "(3)". The next line, starting with the "->" symbol, indicates the line currently being pointed to.

In order to close the PDB prompt, simply enter quit or exit in the PDB prompt.

A few other things to note, if your program accepts parameters as inputs, you can pass them through the command line as well. For instance, had our program above required three inputs from the user, then this is what our command would have looked like:

$ python -m pdb calc.py var1 var2 var3

Moving on, if you had earlier closed the PDB prompt through the quit or exit command, then rerun the code file through PDB. After that, run the following command in the PDB prompt:

(Pdb) list

The output looks like this:

  1     # Filename: calc.py
  2
  3  -> operators = ['+', '-', '*', '/']
  4     numbers = [10, 20]
  5
  6     def calculator():
  7         print("Operators available: ")
  8         for op in operators:
  9             print(op)
 10
 11         print("Numbers to be used: ")
(Pdb)

This will show the first 11 lines of your program to you, with the "->" pointing towards the current line being executed by the debugger. Next, try this command in the PDB prompt:

(Pdb) list 4,6

This command should display the selected lines only, which in this case are lines 4 to 6. Here is the output:

  4     numbers = [10, 20]
  5
  6     def calculator():
(Pdb)

Debugging with Break Points

The next important thing that we will learn about is the breakpoint. Breakpoints are usually used for larger programs, but to understand them better we will see how they function on our basic example. Break points are specific locations that we declare in our code. Our code runs up to that location and then pauses. These points are automatically assigned numbers by PDB.

We have the following different options to create break points:

  1. By line number
  2. By function declaration
  3. By a condition

To declare a break point by line number, run the following command in the PDB prompt:

(Pdb) break calc.py:8

This command inserts a breakpoint at the 8th line of code, which will pause the program once it hits that point. The output from this command is shown as:

Breakpoint 1 at /Users/junaid/Desktop/calc.py: 8
(Pdb)

To declare break points on a function, run the following command in the PDB prompt:

(Pdb) break calc.calculator

In order to insert a breakpoint in this way, you must declare it using the file name and then the function name. This outputs the following:

Breakpoint 2 at /Users/junaid/Desktop/calc.py:6

As you can see, this break point has been assigned number 2 automatically, and the line number i.e. 6 at which the function is declared, is also shown.

Break points can also be declared by a condition. In that case the program will run until the condition is false, and will pause when that condition becomes true. Run the following command in the PDB prompt:

(Pdb) break calc.py:8, op == "*"

This will track the value of the op variable throughout execution and only break when its value is "*" at line 8.

To see all the break points that we have declared in the form of a list, run the following command in the PDB prompt:

(Pdb) break

The output looks like this:

Num Type         Disp Enb   Where
1   breakpoint   keep yes   at /Users/junaid/Desktop/calc.py: 8
2   breakpoint   keep yes   at /Users/junaid/Desktop/calc.py: 6
    breakpoint already hit 1 time
3   breakpoint   keep yes   at /Users/junaid/Desktop/calc.py: 8
    stop only if op == "*"
(Pdb)

Lastly, let's see how we can disable, enable, and clear a specific break point at any instance. Run the following command in the PDB prompt:

(Pdb) disable 2

This will disable breakpoint 2, but will not remove it from our debugger instance.

In the output you will see the number of the disabled break point.

Disabled breakpoint 2 at /Users/junaid/Desktop/calc.py:6
(Pdb)

Lets print out the list of all break points again to see the "Enb" value for breakpoint 2:

(Pdb) break

Output:

Num Type         Disp Enb   Where
1   breakpoint   keep yes   at /Users/junaid/Desktop/calc.py:8
2   breakpoint   keep no    at /Users/junaid/Desktop/calc.py:4 # you can see here that the "ENB" column for #2 shows "no"
    breakpoint already hit 1 time
3   breakpoint   keep yes   at /Users/junaid/Desktop/calc.py:8
    stop only if op == "*"
(Pdb)

To re-enable break point 2, run the following command:

(Pdb) enable 2

And again, here is the output:

Enabled breakpoint 2 at /Users/junaid/Desktop/calc.py:6

Now, if you print the list of all break points again, the "Enb" column's value for breakpoint 2 should show a "yes" again.

Let's now clear breakpoint 1, which will remove it all together.

(Pdb) clear 1

The output is as follows:

Deleted breakpoint 1 at /Users/junaid/Desktop/calc.py:8
(Pdb)

If we re-print the list of breakpoints, it should now only show two breakpoint rows. Let's see the "break" command's output:

Num Type         Disp Enb   Where
2   breakpoint   keep yes   at /Users/junaid/Desktop/calc.py:4
    breakpoint already hit 1 time
3   breakpoint   keep yes   at /Users/junaid/Desktop/calc.py:8
    stop only if op == "*"

Exactly what we expected.

Before we move ahead from this section, I want to show you all what is displayed when we actually run the code until the specified breakpoint. To do that, let's clear all the previous breakpoints and declare another breakpoint through the PDB prompt:

1. Clear all breakpoints

(Pdb) clear

After that, type "y" and hit "Enter". You should see an output like this appear:

Deleted breakpoint 2 at /Users/junaid/Desktop/calc.py:6
Deleted breakpoint 3 at /Users/junaid/Desktop/calc.py:8

2. Declare a new breakpoint

What we wish to achieve is that the code should run up until the point that the value of num variable is greater than 10. So basically, the program should pause before the number "20" gets printed.

(Pdb) break calc.py:13, num > 10

3. Run the code until this breakpoint

To run the code, use the "continue" command, which will execute the code until it hits a breakpoint or finishes:

(Pdb) continue

You should see the following output:

Operators available:
+
-
*
/
Numbers to be used:
10
> /Users/junaid/Desktop/calc.py(13)calculator()
-> print(num)

This is exactly what we expected, the program runs until that point and then pauses, now it's up to us if we wish to change anything, inspect variables, or if we want to run the script until completion. To instruct it to run to completion, run the "continue" command again. The output should be the following:

20
The program finished and will be restarted
> /Users/junaid/Desktop/calc.py(3)<module>()
-> operators = [ '+', '-', '*', '/' ]

In the above output, it can be seen that the program continues from exactly where it left off, runs the remaining part, and then restarts to allow us to debug it further if we wish. Let's move to the next section now.

Important Note: Before moving forward, clear all the breakpoints by running the "clear" command, followed by typing in "y" in the PDB prompt.

Next and Step Functions

Last, but not the least, let's study the next and step functions; these will be very frequently used when you start debugging your applications, so let's learn what they do and how they can be implemented.

The step and next functions are used to iterate throughout our code line by line; there's a very little difference between the two. While iterating, if the step function encounters a function call, it will move to the first line of that function's definition and show us exactly what is happening inside the function; whereas, if the next function encounters a function call, it will run all lines of that function in a single go and pause at the next function call.

Confused? Let's see that in an example.

Re-run the program through PDB prompt using the following command:

$ python -m pdb calc.py

Now type in continue in the PDB prompt, and keep doing that until the program reaches the end. I'm going to show a section of the whole input and output sequence below, which is sufficient to explain the point. The full sequence is quite long, and would only confuse you more, so it will be omitted.

> /Users/junaid/Desktop/calc.py(1)<module>()
-> operators = [ '+', '-', '*', '/' ]
(Pdb) step
> /Users/junaid/Desktop/calc.py(2)<module>()
-> numbers = [ 10, 20 ]
.
.
.
.
> /Users/junaid/Desktop/calc.py(6)calculator()
-> print("Operators available: " )
(Pdb) step
Operators available:
> /Users/junaid/Desktop/calc.py(8)calculator()
-> for op in operators:
(Pdb) step
> /Users/junaid/Desktop/calc.py(10)calculator()
-> print(op)
(Pdb) step
+
> /Users/junaid/Desktop/calc.py(8)calculator()
-> for op in operators:
(Pdb) step
> /Users/junaid/Desktop/calc.py(10)calculator()
-> print(op)

.
.
.
.

Now, re-run the whole program, but this time, use the "next" command instead of "step". I have shown the input and output trace for that as well.

> /Users/junaid/Desktop/calc.py(3)<module>()
-> operators = ['+', '-', '*', '/']
(Pdb) next
> /Users/junaid/Desktop/calc.py(4)<module>()
-> numbers = [10, 20]
(Pdb) next
> /Users/junaid/Desktop/calc.py(6)<module>()
-> def calculator():
(Pdb) next
> /Users/junaid/Desktop/calc.py(15)<module>()
-> def main():
(Pdb) next
> /Users/junaid/Desktop/calc.py(18)<module>()
-> main()
(Pdb) next
Operators available:
+
-
*
/
Numbers to be used:
10
20
--Return--

Alright, now that we have output trace for both of these functions, let's see how they are different. For the step function, you can see that when the calculator function is called, it moves inside that function, and iterates through it in "steps", showing us exactly what is happening in each step.

However, if you see the output trace for the next function, when "main" function is called, it doesn't show us what happens inside that function (i.e. a subsequent call to the calculator function), and then directly prints out the end result in a single go/step.

These commands are useful if you are iterating through a program and want to step through certain functions, but not others, in which case you can utilize each command for its purposes.

Conclusion

In this tutorial, we learned about a sophisticated technique for debugging python applications using an built-in module named PDB. We dove into the different troubleshooting commands that PDB provides us with, including the next and step statements, breakpoints, etc. We also applied them to a basic program to see them in action.