Why does Python Code Run Faster in a Function?

Introduction

Python is not necessarily known for its speed, but there are certain things that can help you squeeze out a bit more performance from your code. Surprisingly, one of these practices is running code in a function rather than in the global scope. In this article, we'll see why Python code runs faster in a function and how Python code execution works.

Python Code Execution

To understand why Python code runs faster in a function, we first need to understand how Python executes code. Python is an interpreted language, but it doesn't interpret your source text directly: when Python executes a script, it first compiles it to bytecode, an intermediate language that's closer to machine code, and the Python interpreter then executes this bytecode.

def hello_world():
    print("Hello, World!")

import dis
dis.dis(hello_world)
  2           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('Hello, World!')
              4 CALL_FUNCTION            1
              6 POP_TOP
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE

The dis module in Python disassembles the function hello_world into bytecode, as seen above.

Note: The Python interpreter is a virtual machine that executes the bytecode. The default Python interpreter is CPython, which is written in C. There are other Python interpreters like Jython (written in Java), IronPython (for .NET), and PyPy (written in RPython, a restricted subset of Python), but CPython is the most commonly used.
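
If you're curious which implementation you're running, the standard library can tell you. Here's a quick check; the comment shows what CPython would typically print:

import platform
import sys

# Report which Python implementation and version is executing this code
print(platform.python_implementation())  # e.g. 'CPython', 'PyPy', 'Jython'
print(sys.version)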

Why Python Code Runs Faster in a Function

Consider a simplified example with a loop that iterates over a range of numbers:

def my_function():
    for i in range(100000000):
        pass

When this function is compiled, the bytecode might look something like this (the exact instructions vary between Python versions):

  SETUP_LOOP              20 (to 23)
  LOAD_GLOBAL             0 (range)
  LOAD_CONST              3 (100000000)
  CALL_FUNCTION           1
  GET_ITER            
  FOR_ITER                6 (to 22)
  STORE_FAST              0 (i)
  JUMP_ABSOLUTE           13
  POP_BLOCK           
  LOAD_CONST              0 (None)
  RETURN_VALUE

The key instruction here is STORE_FAST, which is used to store the loop variable i.

Now let's consider the bytecode if the loop is at the top level of a Python script:

  SETUP_LOOP              20 (to 23)
  LOAD_NAME               0 (range)
  LOAD_CONST              3 (100000000)
  CALL_FUNCTION           1
  GET_ITER            
  FOR_ITER                6 (to 22)
  STORE_NAME              1 (i)
  JUMP_ABSOLUTE           13
  POP_BLOCK           
  LOAD_CONST              2 (None)
  RETURN_VALUE

Notice the STORE_NAME instruction is used here, rather than STORE_FAST.

The STORE_FAST instruction is faster than STORE_NAME because inside a function, local variables are stored in a fixed-size array rather than a dictionary. That array is accessed directly by index, making variable storage and retrieval very quick. Essentially, it's just an index lookup into the array and an update of the PyObject's reference count, both of which are highly efficient operations.

On the other hand, global variables are stored in a dictionary. When you access a global variable, Python has to perform a hash table lookup, which involves calculating a hash and then retrieving the value associated with it. Though this is optimized, it's still inherently slower than an index-based lookup.
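
You can verify this yourself with the dis module: disassembling the same loop once as a function body and once as module-level code shows STORE_FAST in the first case and STORE_NAME in the second. A minimal sketch (the loop itself is arbitrary):

import dis

def local_loop():
    for i in range(10):
        pass

# Inside a function, the loop variable is stored with STORE_FAST
dis.dis(local_loop)

# The same loop compiled as module-level code uses STORE_NAME instead
dis.dis(compile("for i in range(10):\n    pass", "<module>", "exec"))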

Benchmarking and Profiling Python Code

Want to test this for yourself? Try benchmarking and profiling your code.

Benchmarking and profiling are important practices in performance optimization. They help you understand how your code behaves and where the bottlenecks are.

Benchmarking is where you time your code to see how long it takes to run. You can use Python's built-in time module, as we'll show later, or use more sophisticated tools like timeit.
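
For instance, a quick timing of a small snippet with timeit might look like this (the statement and iteration count are arbitrary choices):

import timeit

# Run the statement 10,000 times and report the total elapsed time
elapsed = timeit.timeit("sum(range(1000))", number=10000)
print(f"10,000 runs took {elapsed:.4f} seconds")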

Profiling, on the other hand, provides a more detailed view of your code's execution. It shows you where your code spends most of its time, which functions are called, and how often. Python's built-in profile or cProfile modules can be used for this.

Here's one way you can profile your Python code:

import cProfile

def loop():
    for i in range(10000000):
        pass

cProfile.run('loop()')

This will output a detailed report of all the function calls made during the execution of the loop function.

Note: Profiling can add quite a bit of overhead to your code execution, so the execution time shown by the profiler will likely be longer than the actual execution time.

Benchmarking Code in a Function vs. Global Scope

In Python, the speed of code execution can vary depending on where the code is executed - in a function or in the global scope. Let's compare the two using a simple example.

Consider the following code snippet that calculates the factorial of a number:

def factorial(n):
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result

Now let's run the same code but in the global scope:

n = 20
result = 1
for i in range(1, n + 1):
    result *= i

To benchmark these two pieces of code, we can use the timeit module in Python, which provides a simple way to time small bits of Python code.

import timeit

# Factorial function here...

def benchmark():
    start = timeit.default_timer()

    factorial(20)

    end = timeit.default_timer()
    print(end - start)

#
# Run benchmark on function code
#
benchmark()
# Prints: 3.541994374245405e-06

#
# Run benchmark on global scope code
#
start = timeit.default_timer()

n = 20
result = 1
for i in range(1, n + 1):
    result *= i

end = timeit.default_timer()
print(end - start) 
# Prints: 5.375011824071407e-06

You'll find that the function code executes faster than the global scope code, thanks to the faster local variable access we discussed earlier.

Note: If you run the benchmark() function and the global scope code in the same script, the global scope code may appear to run faster. This is because the benchmark() function call adds some overhead to its measurement, and the global code gets some internal optimizations. However, if you run them separately, you'll find that the function code does run faster.

Profiling Code in a Function vs. Global Scope

Python provides the built-in cProfile module for exactly this. Let's use it to profile a new function that computes a sum of squares, once with local variables and once with global variables.

import cProfile

def sum_of_squares():
    total = 0
    for i in range(1, 10000000):
        total += i * i

i = None
total = 0
def sum_of_squares_g():
    global i
    global total
    for i in range(1, 10000000):
        total += i * i
    
def profile(func):
    pr = cProfile.Profile()
    pr.enable()

    func()

    pr.disable()
    pr.print_stats()
#
# Profile function code
#
print("Function scope:")
profile(sum_of_squares)

#
# Profile global scope code
#
print("Global scope:")
profile(sum_of_squares_g)

From the profiling results, you'll see that the function code is more efficient in terms of execution time.

Function scope:
         2 function calls in 0.903 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.903    0.903    0.903    0.903 profiler.py:3(sum_of_squares)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}


Global scope:
         2 function calls in 1.358 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    1.358    1.358    1.358    1.358 profiler.py:10(sum_of_squares_g)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

We treat sum_of_squares_g() as the "global scope" version since it uses two global variables, i and total. As we saw earlier, it's the global variable lookups that slow down execution, which is why those variables were made global in this code.

Optimizing Python Function Performance

Given that Python functions tend to run faster than equivalent code in the global scope, it's worth looking into how we can further optimize our function performance.

Of course, because of what we saw earlier, one strategy is to use local variables instead of global variables. Here's an example:

import time

# Global variable
x = 5

def calculate_power_global():
    for i in range(10000000):
        y = x ** 2  # Accessing global variable

def calculate_power_local(x):
    for i in range(10000000):
        y = x ** 2  # Accessing local variable

start = time.time()
calculate_power_global()
end = time.time()

print(f"Execution time with global variable: {end - start} seconds")

start = time.time()
calculate_power_local(x)
end = time.time()

print(f"Execution time with local variable: {end - start} seconds")

In this example, calculate_power_local will typically run faster than calculate_power_global, because it's using a local variable instead of a global one.

Execution time with global variable: 1.9901456832885742 seconds
Execution time with local variable: 1.9626312255859375 seconds
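
A related trick that follows from the same reasoning is to bind a frequently used global or attribute to a local name before a hot loop, so each iteration only pays for a fast local lookup. Here's a rough sketch; the function names and loop size are just illustrative:

import math
import time

def use_global_sqrt(n):
    total = 0.0
    for i in range(n):
        total += math.sqrt(i)  # global + attribute lookup on every iteration
    return total

def use_local_sqrt(n):
    sqrt = math.sqrt  # bind the function to a local name once
    total = 0.0
    for i in range(n):
        total += sqrt(i)  # only a fast local lookup per iteration
    return total

for func in (use_global_sqrt, use_local_sqrt):
    start = time.time()
    func(5000000)
    print(f"{func.__name__}: {time.time() - start:.3f} seconds")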

Another optimization strategy is to use built-in functions and libraries whenever possible. Python's built-in functions are implemented in C, which is much faster than Python. Similarly, many Python libraries, such as NumPy and Pandas, are also implemented in C or C++, making them faster than equivalent Python code.

For example, consider the task of summing a list of numbers. You could write a function to do this:

def sum_numbers(numbers):
    total = 0
    for number in numbers:
        total += number
    return total

However, Python's built-in sum function will do the same thing, but faster:

numbers = [1, 2, 3, 4, 5]
total = sum(numbers)

Try timing these two code snippets yourself and figure out which one is faster!
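
If you want a starting point, here's one way such a comparison might look with timeit (the list size and repetition count are arbitrary):

import timeit

def sum_numbers(numbers):
    # The same manual loop defined above
    total = 0
    for number in numbers:
        total += number
    return total

numbers = list(range(10000))

loop_time = timeit.timeit(lambda: sum_numbers(numbers), number=1000)
builtin_time = timeit.timeit(lambda: sum(numbers), number=1000)

print(f"Manual loop: {loop_time:.4f}s, built-in sum(): {builtin_time:.4f}s")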

Conclusion

In this article, we've explored the interesting world of Python code execution, specifically focusing on why Python code tends to run faster when encapsulated in a function. We briefly looked into the concepts of benchmarking and profiling, providing practical examples of how these processes can be carried out in both a function and the global scope.

We also discussed a few ways to optimize your Python function performance. While these tips can certainly make your code run faster, you should use certain optimizations carefully as it's important to balance readability and maintainability with performance.

Last Updated: September 18th, 2023