Concurrency in Python


Computing has evolved over time, and more and more ways have emerged to make computers run faster. What if, instead of executing a single instruction at a time, we could execute several instructions at the same time? This would mean a significant increase in the performance of a system.

Concurrency lets us achieve this: our Python programs can handle more requests at a time, which can lead to impressive performance gains.

In this article, we will discuss concurrency in the context of Python programming and the various forms it comes in, and we will speed up a simple program to see the performance gains in practice.

What is Concurrency?

When two or more events are concurrent it means that they are happening at the same time. In real life, concurrency is common since a lot of things happen at the same time all the time. In computing, things are a bit different when it comes to concurrency.

In computing, concurrency is the execution of pieces of work or tasks by a computer at the same time, or at least during overlapping periods of time. Normally, a computer executes one piece of work while the others wait their turn; once it completes, its resources are freed and the next piece of work begins execution. This is not the case when concurrency is implemented, because a piece of work does not always have to wait for the others to complete. The computer makes progress on several of them in the same period.

Concurrency vs Parallelism

We have defined concurrency as the execution of tasks at the same time, but what is parallelism, and how do the two compare?

Parallelism is achieved when multiple computations or operations are carried out at the same time or in parallel with the goal of speeding up the computation process.

Both concurrency and parallelism involve working on multiple tasks in the same period of time. What sets them apart is that concurrent tasks can take turns on a single processor, with the computer switching between them, whereas parallelism requires multiple CPUs or cores so that tasks literally run at the same instant.
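
As a rough illustration of tasks overlapping in time, the sketch below runs the same simulated tasks one after another and then concurrently, using Python threads (covered in the next section of this article). The function names and the use of time.sleep() as a stand-in for waiting on a slow resource are assumptions made for this example.

```python
import threading
import time

def wait_task(delay):
    # Stand-in for a task that spends its time waiting (e.g. on a network)
    time.sleep(delay)

def run_sequential(count, delay):
    """Run the tasks one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        wait_task(delay)
    return time.perf_counter() - start

def run_concurrent(count, delay):
    """Run the tasks in threads so their waits overlap; return the elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=wait_task, args=(delay,)) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    print("sequential: {:.2f}s".format(run_sequential(3, 0.2)))
    print("concurrent: {:.2f}s".format(run_concurrent(3, 0.2)))
```

With three tasks of 0.2 seconds each, the sequential run needs at least 0.6 seconds, while the concurrent run should finish in a little over 0.2 seconds because the waits overlap.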

Thread vs Process vs Task

Generally speaking, threads, processes, and tasks all refer to pieces or units of work. In detail, however, they are not so similar.

A thread is the smallest unit of execution that can be performed on a computer. Threads exist as parts of a process and are usually not independent of each other, meaning they share data and memory with other threads within the same process. Threads are also sometimes referred to as lightweight processes.

For example, in a document processing application, one thread could be responsible for formatting the text and another handles the autosaving, while another is doing spell checks.
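
Because threads share memory, several threads can update the same object directly. The following sketch, with hypothetical names chosen for this example, has several threads increment one shared counter and uses a threading.Lock so that the updates do not interfere with each other.

```python
import threading

counter = {"value": 0}           # shared by every thread in this process
counter_lock = threading.Lock()

def increment(times):
    for _ in range(times):
        # The lock makes the read-modify-write step safe across threads
        with counter_lock:
            counter["value"] += 1

def run_threads(thread_count, times):
    """Start thread_count threads that each increment the shared counter."""
    counter["value"] = 0
    threads = [threading.Thread(target=increment, args=(times,)) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter["value"]

if __name__ == "__main__":
    print(run_threads(4, 1000))  # prints 4000
```

All four threads see and modify the same dictionary, which is exactly the sharing that separate processes would not have by default.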

A process is a job or an instance of a computer program that can be executed. When we write and execute code, a process is created to execute all the tasks that we have instructed the computer to do through our code. A process can have a single primary thread or several threads within it. Each thread has its own stack, registers, and program counter, but all the threads share the code, data, and memory of the process.

Some of the common differences between processes and threads are:

  • Processes work in isolation, while threads can access the data of other threads in the same process.
  • If a thread within a process is blocked, the other threads in that process can continue executing. On a modern operating system a blocked process does not normally hold up other processes either, since each process is scheduled independently.
  • Threads share memory with the other threads in their process, while each process has its own memory allocation.

A task is simply a set of program instructions that are loaded in memory.

Multithreading vs Multiprocessing vs Asyncio

Having explored threads and processes, let us now delve deeper into the various ways a computer can execute work concurrently.

Multithreading refers to the ability of a CPU to execute multiple threads concurrently. The idea here is to divide a process into various threads that can be executed in a parallel manner or at the same time. This division of duty enhances the speed of execution of the entire process. For example, in a word processor like MS Word, a lot of things are going on when in use.

Multithreading will allow the program to autosave the content being written, perform spell checks for the content, and also format the content. Through multithreading, all this can take place simultaneously and the user does not have to complete the document first for the saving to happen or the spell checks to take place.

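In the standard CPython interpreter, the global interpreter lock (GIL) lets only one thread run Python code at a time, so effectively only one processor is involved during multithreading. The operating system decides when to switch tasks on that processor, and those tasks may belong to processes or programs other than the one we are running.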

Multiprocessing, on the other hand, involves utilizing two or more processor units on a computer to achieve parallelism. Python implements multiprocessing by creating separate processes, each with its own instance of the Python interpreter and its own memory allocation to use during execution.

AsyncIO, or asynchronous IO, is a paradigm for writing concurrent code using the async/await syntax. The asyncio module joined the standard library in Python 3.4, and the async and await keywords followed in Python 3.5. It is best suited to IO-bound and high-level networking code.
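
To give a feel for the async/await syntax, here is a minimal sketch. The coroutine names are made up for this example, and asyncio.sleep() stands in for a real network call.

```python
import asyncio
import time

async def fake_request(name, delay):
    # asyncio.sleep() stands in for a network call; await hands control
    # back to the event loop while this coroutine is waiting
    await asyncio.sleep(delay)
    return "{} done".format(name)

async def run_requests():
    # gather() runs the coroutines concurrently and returns their
    # results in the order the coroutines were passed in
    return await asyncio.gather(
        fake_request("first", 0.2),
        fake_request("second", 0.1),
    )

if __name__ == "__main__":
    start = time.perf_counter()
    print(asyncio.run(run_requests()))
    print("elapsed: {:.2f}s".format(time.perf_counter() - start))
```

Both simulated requests wait at the same time, so the total should be close to the longer delay rather than the sum of the two.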

When to use Concurrency

The advantages of concurrency are best tapped into when solving CPU-bound or IO-bound problems.

CPU-bound problems involve programs that do a lot of computation without requiring networking or storage facilities and are only limited by capabilities of the CPU.

IO-bound problems involve programs that rely on input/output resources, such as the network or a disk, which are often much slower than the CPU and may already be in use. The program therefore spends much of its time waiting for the I/O resources to become available or to respond.

It is best to write concurrent code when the CPU or I/O resources are limited and you want to speed up your program.
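
The difference between the two kinds of problem can be sketched with two toy functions. Everything here is an assumption made for illustration: cpu_task() and io_task() are invented stand-ins, and pairing threads with the I/O-bound work and processes with the CPU-bound work is one common arrangement rather than the only one.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def cpu_task(n):
    # CPU-bound: pure computation that never waits on anything
    return sum(i * i for i in range(n))

def io_task(delay):
    # I/O-bound stand-in: almost all of the time is spent waiting
    time.sleep(delay)
    return delay

def run_io_in_threads(delays):
    # Threads suit I/O-bound work because the waits can overlap
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        return list(executor.map(io_task, delays))

def run_cpu_in_processes(sizes):
    # Processes suit CPU-bound work because each one can use its own core
    with ProcessPoolExecutor() as executor:
        return list(executor.map(cpu_task, sizes))
```

When trying this out, run_cpu_in_processes() should be called from under an if __name__ == "__main__": guard, since worker processes may re-import the module that started them.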

How to use Concurrency

In our demonstration example, we will solve a common I/O bound problem, which is downloading files over a network. We will write non-concurrent code and concurrent code and compare the time taken for each program to complete.

We will download images from Imgur through their API. First, we need to create an account and then register our demo application in order to access the API and download some images.

Once our application is set up on Imgur, we will receive a client identifier and client secret that we'll use to access the API. We'll save the credentials in a .env file since Pipenv automatically loads the variables from the .env file.
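
Before creating the client, it can help to check that both credentials were actually loaded. The sketch below uses a hypothetical helper, load_credentials(), and assumes the variables are named CLIENT_ID and CLIENT_SECRET, as they are in the scripts in this article.

```python
import os

def load_credentials():
    """Return (client_id, client_secret) from the environment, or raise if one is missing."""
    client_id = os.getenv("CLIENT_ID")
    client_secret = os.getenv("CLIENT_SECRET")
    pairs = (("CLIENT_ID", client_id), ("CLIENT_SECRET", client_secret))
    # Collect the names of any variables that are unset or empty
    missing = [name for name, value in pairs if not value]
    if missing:
        raise RuntimeError("Missing environment variables: {}".format(", ".join(missing)))
    return client_id, client_secret
```

Failing early like this gives a clearer error than letting the API client receive None for a credential.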

Synchronous Script

With those details, we can create our first script that will simply download a bunch of images to a downloads folder:

import os
import timeit
from urllib import request

from imgurpython import ImgurClient

client_secret = os.getenv("CLIENT_SECRET")
client_id = os.getenv("CLIENT_ID")

client = ImgurClient(client_id, client_secret)

def download_image(link):
    # For a link such as https://i.imgur.com/<name>.<ext>, piece 3 is "<name>.<ext>"
    filename = link.split('/')[3].split('.')[0]
    fileformat = link.split('/')[3].split('.')[1]
    request.urlretrieve(link, "downloads/{}.{}".format(filename, fileformat))
    print("{}.{} downloaded into downloads/ folder".format(filename, fileformat))

def main():
    images = client.get_album_images('PdA9Amq')
    for image in images:
        # Download the images one after another
        download_image(image.link)

if __name__ == "__main__":
    print("Time taken to download images synchronously: {}".format(timeit.Timer(main).timeit(number=1)))

In this script, we pass an Imgur album identifier to the function get_album_images(), which gives us a list of the images in that album. We then loop over the list and use our download_image() function to download each image and save it to a folder locally.
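
The parsing in download_image() assumes that the file name is always the fourth slash-separated piece of the link. A slightly more defensive sketch could lean on the standard library instead. The helper name split_image_link() is hypothetical, and the URL in the comments is a made-up example.

```python
import posixpath
from urllib.parse import urlparse

def split_image_link(link):
    """Return (filename, fileformat) taken from the last path segment of a URL."""
    # For https://i.imgur.com/abc123.jpg the URL path is "/abc123.jpg"
    basename = posixpath.basename(urlparse(link).path)
    # splitext() splits on the final dot only: ("abc123", ".jpg")
    filename, extension = posixpath.splitext(basename)
    return filename, extension.lstrip(".")
```

This version keeps working if the link has extra path segments or if the file name itself contains dots.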

This simple example gets the job done. We are able to download images from Imgur but it does not work concurrently. It only downloads one image at a time before moving on to the next image. On my machine, the script took 48 seconds to download the images.

Optimizing with Multithreading

Let us now make our code concurrent using Multithreading and see how it performs:

# previous imports from synchronous version are maintained
from concurrent.futures import ThreadPoolExecutor

# Imgur client setup remains the same as in the synchronous version

# download_image() function remains the same as in the synchronous version

def download_album(album_id):
    images = client.get_album_images(album_id)
    with ThreadPoolExecutor(max_workers=5) as executor:
        # Spread the downloads across the 5 worker threads
        executor.map(download_image, [image.link for image in images])

def main():
    download_album('PdA9Amq')

if __name__ == "__main__":
    print("Time taken to download images using multithreading: {}".format(timeit.Timer(main).timeit(number=1)))

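In the above example, we create a thread pool with 5 worker threads to download the images from our album. Remember that these threads share a single processor and take turns on it. That works well here because each thread spends most of its time waiting on the network.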

This version of our code takes 19 seconds. That is roughly 2.5 times faster than the synchronous version of the script.
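
One thing to be aware of is that executor.map() only raises a worker's exception when its results are iterated, so a failed download can go unnoticed if the results are never read. The sketch below shows one way to collect successes and failures using submit() and as_completed(). The fake_download() function and the placeholder links are stand-ins invented for this example, not part of the Imgur script.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def fake_download(link):
    # Stand-in for download_image(): waits briefly and fails on an empty link
    if not link:
        raise ValueError("empty link")
    time.sleep(0.05)
    return link

def download_all(links, max_workers=5):
    """Try every link in a thread pool and return (succeeded, failed) lists."""
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Remember which link each future belongs to
        futures = {executor.submit(fake_download, link): link for link in links}
        for future in as_completed(futures):
            try:
                succeeded.append(future.result())
            except ValueError:
                failed.append(futures[future])
    return succeeded, failed

if __name__ == "__main__":
    print(download_all(["a.jpg", "", "b.jpg"]))
```

Since as_completed() yields futures in the order they finish, the succeeded list is not guaranteed to match the input order.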

Optimizing with Multiprocessing

Now we will implement Multiprocessing over several CPUs for the same script to see how it performs:

# previous imports from synchronous version remain
import multiprocessing

# Imgur client setup remains the same as in the synchronous version

# download_image() function remains the same as in the synchronous version

def main():
    images = client.get_album_images('PdA9Amq')

    # One worker process per CPU core; the with block cleans up the pool afterwards
    with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
        result = pool.map(download_image, [image.link for image in images])

if __name__ == "__main__":
    print("Time taken to download images using multiprocessing: {}".format(timeit.Timer(main).timeit(number=1)))

In this version, we create a pool with as many worker processes as there are CPU cores on our machine and then map our function to download the images across the pool. This makes our code run in a parallel manner across our CPU cores. This multiprocessing version of our code takes an average of 14 seconds after multiple runs.

This is slightly faster than our version that utilizes threads and significantly faster than our non-concurrent version.

Optimizing with AsyncIO

Let us implement the same script using AsyncIO to see how it performs:

# previous imports from synchronous version remain
import asyncio
import aiohttp

# Imgur client setup remains the same as in the synchronous version

async def download_image(link, session):
    """
    Function to download an image from a link provided.
    """
    filename = link.split('/')[3].split('.')[0]
    fileformat = link.split('/')[3].split('.')[1]

    async with session.get(link) as response:
        with open("downloads/{}.{}".format(filename, fileformat), 'wb') as fd:
            # Write the response body to disk in 1024-byte chunks
            async for data in response.content.iter_chunked(1024):
                fd.write(data)

    print("{}.{} downloaded into downloads/ folder".format(filename, fileformat))

async def main():
    images = client.get_album_images('PdA9Amq')

    async with aiohttp.ClientSession() as session:
        tasks = [download_image(image.link, session) for image in images]

        return await asyncio.gather(*tasks)

if __name__ == "__main__":
    start_time = timeit.default_timer()

    # asyncio.run() (Python 3.7+) creates an event loop, runs main() to completion, then closes the loop
    results = asyncio.run(main())

    time_taken = timeit.default_timer() - start_time

    print("Time taken to download images using AsyncIO: {}".format(time_taken))

There are a few changes that stand out in our new script. First, we no longer use urllib.request to download our images, but instead we use aiohttp. The reason for this is that urllib.request does not work well with AsyncIO, since it is built on Python's http and socket modules and makes blocking calls.

Sockets are blocking by default, i.e. a thread waiting on one cannot pause that wait and get on with other work in the meantime. aiohttp uses non-blocking I/O instead and helps us achieve truly asynchronous code.

The keyword async indicates that our function is a coroutine (Co-operative Routine), which is a piece of code that can be paused and resumed. Coroutines multitask cooperatively, meaning they choose when to pause and let others execute.
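
The cooperative switching can be made visible with a small sketch. The worker names are invented for this example, and await asyncio.sleep(0) is used purely as a way of handing control back to the event loop.

```python
import asyncio

async def worker(name, steps, log):
    for step in range(steps):
        log.append("{}{}".format(name, step))
        # Pause here and let the event loop run another coroutine
        await asyncio.sleep(0)

async def run_workers():
    log = []
    # Both workers share the log; each one pauses after every entry
    await asyncio.gather(worker("A", 2, log), worker("B", 2, log))
    return log

if __name__ == "__main__":
    print(asyncio.run(run_workers()))  # prints ['A0', 'B0', 'A1', 'B1']
```

The entries alternate because each worker gives up control at its await, and a worker that never awaited would run all of its steps before the other one started.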

We build a list of coroutines, one for each image link we wish to download, and pass them to asyncio.gather() so that they run concurrently. Our main() coroutine is started by putting it in the event loop and executing it until completion.

After several runs of this script, the AsyncIO version takes 14 seconds on average to download the images in the album. This is significantly faster than the multithreaded and synchronous versions of the code, and pretty similar to the multiprocessing version.

Performance Comparison

Method             Time taken
Synchronous        48s
Multithreading     19s
Multiprocessing    14s
Asyncio            14s
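
To put these timings in relative terms, the speedup of each version over the synchronous one can be worked out with a few lines of Python. The speedups() helper is written just for this illustration, and the numbers are the timings reported in this article.

```python
timings = {
    "Synchronous": 48,
    "Multithreading": 19,
    "Multiprocessing": 14,
    "Asyncio": 14,
}

def speedups(times, baseline="Synchronous"):
    """Return how many times faster each method is than the baseline."""
    base = times[baseline]
    return {name: round(base / seconds, 2) for name, seconds in times.items()}

if __name__ == "__main__":
    for name, factor in speedups(timings).items():
        print("{:<16}{}x".format(name, factor))
```

By this measure the multithreaded version is about 2.5 times faster than the synchronous one, and the multiprocessing and AsyncIO versions are about 3.4 times faster.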


In this post, we have covered concurrency and how it compares to parallelism. We have also explored the various methods that we can use to implement concurrency in our Python code, including multithreading and multiprocessing, and also discussed their differences.

From the examples above, we can see how concurrency helps our code run faster than it would in a synchronous manner. As a rule of thumb, multiprocessing is best suited for CPU-bound tasks, while multithreading and AsyncIO are best for I/O-bound tasks.

The source code for this post is available on GitHub for reference.

About Robley Gori
Nairobi, Kenya