Python: Make Time Delay (Sleep) for Code Execution


Code delaying (also known as sleeping) is exactly what the name implies: delaying code execution for some amount of time. The most common reason to delay code is that we're waiting for some other process to finish, so that we can work with the result of that process. In multi-threaded systems, a thread might wait for another thread to finish an operation before continuing to work with that result.

Another example is lessening the strain on a server we're working with. For example, while web scraping (ethically, following the ToS of the website in question and abiding by its robots.txt file), you might very well want to delay each request so as not to overwhelm the resources of the server.

Many requests fired in rapid succession can, depending on the server in question, quickly take up all of the free connections and effectively become a DoS attack. To allow for breathing space, and to make sure we don't negatively impact either the users of the website or the website itself, we'd limit the request rate by delaying each one.

A student waiting for exam results might furiously refresh their school's website, waiting for news. Alternatively, they might write a script that periodically checks whether the website has anything new on it. In a sense, code delay can become code scheduling with a valid loop and termination condition - assuming that the delay mechanism in place isn't blocking.

In this article, we'll take a look at how to delay code execution in Python - also known as sleeping. This can be done in a few ways: with time.sleep(), with asyncio.sleep(), with the Timer class, and with the Event class.

Delaying Code with time.sleep()

One of the most common solutions to the problem is the sleep() function of the built-in time module. It accepts the number of seconds you'd like the thread to sleep for - unlike many other languages, which work in milliseconds:

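import datetime
import time

# Print the current time, pause for 5 seconds, then print it again
print(datetime.datetime.now().time())
time.sleep(5)
print(datetime.datetime.now().time())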


Running this prints two timestamps about 5 seconds apart.

The delay between the two print() statements is fairly precise - typically down to around the second decimal place. If you'd like to sleep for less than 1 second, you can pass a non-whole number as well, for example time.sleep(0.5).
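As a minimal sketch of sub-second sleeping, the snippet below measures how long a few fractional sleeps actually take. The helper name timed_sleep() is hypothetical (it is not part of the time module), and the exact numbers printed will vary from machine to machine:

```python
import time

def timed_sleep(seconds):
    """Sleep for `seconds` and return how long the call actually took."""
    start = time.perf_counter()
    time.sleep(seconds)
    return time.perf_counter() - start

# Request a few sub-second delays and compare them with the measured ones
for requested in (0.5, 0.25, 0.01):
    actual = timed_sleep(requested)
    print(f"requested {requested:.3f}s, slept {actual:.4f}s")
```

The measured value is usually slightly larger than the requested one, since waking the thread back up takes some time as well.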


Keep in mind, though, that the sleep duration won't be exact at this level of precision: the operating system's scheduler can add a small extra delay, and it's hard to test, given that the print() statements take some (variable) time to execute as well.

However, there's one major downside to the time.sleep() function, very noticeable in multi-threaded environments.

time.sleep() is blocking.

It seizes up the thread it's on and blocks it for the duration of the sleep. This makes it a poor fit for longer waiting times, as the thread can't do anything else during that period. Additionally, this makes it unfit for Asynchronous and Reactive Applications, which oftentimes require real-time data and feedback.

Another thing to note about time.sleep() is that you can't stop it. Once it starts, you can't cancel it externally, short of terminating the entire program or causing the sleep() call itself to raise an exception, which would halt it.
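One common workaround, sketched here under assumptions rather than taken from the time module, is to break a long sleep into short naps and check a stop condition between them. The function sleep_unless() and its parameters are hypothetical names used only for illustration:

```python
import time

def sleep_unless(should_stop, total_seconds, step=0.1):
    """Sleep for up to total_seconds, checking should_stop() between short naps.

    Returns True if the full duration elapsed, False if it stopped early.
    """
    deadline = time.monotonic() + total_seconds
    while True:
        if should_stop():
            return False
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return True
        # Never nap past the deadline
        time.sleep(min(step, remaining))

# Ask for a 2-second sleep, but request a stop after roughly 0.3 seconds
stop_at = time.monotonic() + 0.3
finished = sleep_unless(lambda: time.monotonic() >= stop_at, total_seconds=2)
print("Finished the full sleep:", finished)
```

Each individual nap is still blocking, so the thread only reacts to the stop condition once per step. A smaller step reacts faster at the cost of more wake-ups.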

Asynchronous and Reactive Programming

Asynchronous Programming revolves around concurrent execution - where a task can be executed and finish independently of the main flow.

In Synchronous Programming - if Function A calls Function B, it stops execution until Function B finishes, after which Function A can resume.

In Asynchronous Programming - if Function A calls Function B, both can make progress at the same time, regardless of whether A depends on the result of B, and if need be, one can wait for the other to finish in order to use its result.

Reactive Programming is a subset of Asynchronous Programming, which triggers code execution reactively, when data is presented, regardless of whether the function supposed to process it is already busy. Reactive Programming relies heavily on Message-Driven Architectures (where a message is typically an event or a command).

Both Asynchronous and Reactive applications are the ones that suffer greatly from blocking code - so using something like time.sleep() isn't a good fit for them. Let's take a look at some non-blocking code delay options.

Delaying Code with asyncio.sleep()

Asyncio is a Python library dedicated to writing concurrent code, and uses the async/await syntax, which might be familiar to developers who have used it in other languages.

The asyncio module has been part of the Python standard library since version 3.4, so there's nothing to install via pip. We can import it into our script and rewrite our function:

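import asyncio
import datetime

async def main():
    print(datetime.datetime.now().time())
    # Suspends main() for 5 seconds without blocking the event loop
    await asyncio.sleep(5)
    print(datetime.datetime.now().time())

asyncio.run(main())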

When working with asyncio, we mark functions that run asynchronously as async, and await the results of operations such as asyncio.sleep() that will be finished at some point in the future.

Similar to the previous example, this prints two timestamps, 5 seconds apart.


Though, this doesn't really illustrate the advantage of using asyncio.sleep(). Let's rewrite the example to run a few tasks concurrently, where the distinction is much clearer:

import asyncio
import datetime

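async def intense_task(id):
    await asyncio.sleep(5)
    print(id, 'Running some labor-intensive task at ', datetime.datetime.now().time())

async def main():
    # gather() schedules all three coroutines and waits for all of them
    await asyncio.gather(intense_task(1), intense_task(2), intense_task(3))

asyncio.run(main())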

Here, we've got an async function which simulates a labor-intensive task that takes 5 seconds to finish. Then, using asyncio, we create multiple tasks. The tasks only run concurrently if we schedule them together, though. If we were to await them one after another, they'd execute sequentially.

To run them concurrently, we use the gather() function, which, well, gathers the tasks and executes them:

1 Running some labor-intensive task at  17:35:21.068469
2 Running some labor-intensive task at  17:35:21.068469
3 Running some labor-intensive task at  17:35:21.068469

These are all executed at the same time, and the waiting time for the three of them isn't 15 seconds - it's 5.
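To check that claim without waiting 5 seconds, here is a small sketch that times three concurrent asyncio.sleep() calls with a shorter delay. The names short_task() and run_all() and the 0.2-second delay are illustrative assumptions, not part of the example above:

```python
import asyncio
import time

async def short_task(task_id, delay):
    # Stand-in for a task that spends its time waiting
    await asyncio.sleep(delay)
    return task_id

async def run_all(delay=0.2):
    start = time.perf_counter()
    # All three sleeps overlap, so the total is close to one delay, not three
    results = await asyncio.gather(*(short_task(i, delay) for i in (1, 2, 3)))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(run_all())
print(results, f"finished in {elapsed:.2f}s")
```

gather() returns the results in the order the awaitables were passed in, so results here is [1, 2, 3] regardless of which task finishes first.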

On the other hand, if we were to tweak this code to use time.sleep() instead:

import asyncio
import datetime
import time

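async def intense_task(id):
    # time.sleep() blocks the whole event loop, so no other task can run meanwhile
    time.sleep(5)
    print(id, 'Running some labor-intensive task at ', datetime.datetime.now().time())

async def main():
    await asyncio.gather(intense_task(1), intense_task(2), intense_task(3))

asyncio.run(main())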

We'd be waiting for 5 seconds between each print() statement:

1 Running some labor-intensive task at  17:39:00.766275
2 Running some labor-intensive task at  17:39:05.773471
3 Running some labor-intensive task at  17:39:10.784743
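If a blocking call can't be avoided inside asyncio code, one common workaround (not used in the examples above) is to hand it to a worker thread with the event loop's run_in_executor() method. The sketch below assumes a hypothetical blocking_work() function and a short 0.2-second delay:

```python
import asyncio
import time

def blocking_work(task_id, delay):
    # A regular blocking function; it only blocks the worker thread it runs on
    time.sleep(delay)
    return task_id

async def run_blocking(delay=0.2):
    loop = asyncio.get_running_loop()
    start = time.perf_counter()
    # Passing None uses the loop's default thread pool executor
    results = await asyncio.gather(
        *(loop.run_in_executor(None, blocking_work, i, delay) for i in (1, 2, 3))
    )
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(run_blocking())
print(results, f"finished in {elapsed:.2f}s")
```

Since each call sleeps on its own worker thread, the event loop stays free and the three delays overlap instead of adding up.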

Delaying Code with Timer

The Timer class is a subclass of Thread that runs a function only after a certain time period has passed. This behavior is exactly what we're looking for, though it's a bit of an overkill to use threads to delay code if you're not already working with a multi-threaded system.

A Timer has to be started with start(), and can be halted via cancel() while it's still waiting. Its constructor accepts the number of seconds to wait (an integer or a float), followed by the function to execute once that time has passed.

Let's make a function and execute it via a Timer:

from threading import Timer
import datetime

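def f():
    print("Code to be executed after a delay at:", datetime.datetime.now().time())

print("Code to be executed immediately at:", datetime.datetime.now().time())
# Run f() on a separate thread, 3 seconds from now
timer = Timer(3, f)
timer.start()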

This results in:

Code to be executed immediately at: 19:47:20.032525
Code to be executed after a delay at: 19:47:23.036206

The cancel() method comes in really handy if we have multiple functions running, and we'd like to cancel the execution of a function, based on the results of another, or on another condition.

Let's write a function f(), which calls on both f2() and f3(). f2() is called as-is - and returns a random integer between 1 and 10, simulating the time it took to run that function.

f3() is called through a Timer and if the result of f2() is greater than 5, f3() is cancelled, whereas if f2() runs in the "expected" time of 5 or less - f3() runs after the timer ends:

from threading import Timer
import datetime
import random

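def f():
    print("Executing f1 at", datetime.datetime.now().time())
    result = f2()
    # Schedule f3() to run in 5 seconds
    timer = Timer(5, f3)
    timer.start()
    if result > 5:
        print("Cancelling f3 since f2 resulted in", result)
        timer.cancel()

def f2():
    print("Executing f2 at", datetime.datetime.now().time())
    return random.randint(1, 10)

def f3():
    print("Executing f3 at", datetime.datetime.now().time())

f()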


Running this code multiple times would look something along the lines of:

Executing f1 at 20:29:10.709578
Executing f2 at 20:29:10.709578
Cancelling f3 since f2 resulted in 9

Executing f1 at 20:29:14.178362
Executing f2 at 20:29:14.178362
Executing f3 at 20:29:19.182505
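As a compact variation on the examples above, the following sketch starts two timers with a fractional interval, passes an argument to the callback through the args parameter, and cancels one of them before it fires. The record() function and the labels are illustrative assumptions:

```python
from threading import Timer

fired = []

def record(label):
    # Callback executed by a Timer once its interval has elapsed
    fired.append(label)

kept = Timer(0.2, record, args=["kept"])
dropped = Timer(0.2, record, args=["dropped"])
kept.start()
dropped.start()

dropped.cancel()  # cancelled while still waiting, so record("dropped") never runs
kept.join()       # Timer is a Thread, so join() waits until it has finished
dropped.join()

print(fired)  # ['kept']
```

Note that cancel() only has an effect while the timer is still waiting. Once the function has started running, it can't be stopped this way.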

Delaying Code with Event

The Event class lets threads signal one another. A single event can be "listened to" by multiple threads. The Event.wait() function blocks the thread it's on until the event's internal flag is set (which you can check with Event.is_set()) or an optional timeout expires. Once you set() an Event, all the threads that waited are awoken and Event.wait() returns immediately.

This can be used to synchronize threads - all of them pile up and wait() until a certain Event is set, after which, they can dictate their flow.
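The wait/set behavior described above can be sketched with two worker threads blocking on the same event. The names ready, worker, and results are illustrative, and the timeouts are arbitrary short values chosen for the demonstration:

```python
import threading

ready = threading.Event()
results = []
lock = threading.Lock()

def worker(name):
    # wait() returns True once the event is set, False if the timeout expires first
    signalled = ready.wait(timeout=2)
    with lock:
        results.append((name, signalled))

threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()

print(ready.wait(timeout=0.1))  # False: nobody has set the event yet
ready.set()                     # wakes up both waiting workers
for t in threads:
    t.join()

print(sorted(results))          # [('a', True), ('b', True)]
print(ready.wait(timeout=0.1))  # True: returns immediately once the event is set
```

The return value of wait() is what lets a thread tell a timeout apart from an actual signal.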

Let's create a waiter function and run it multiple times on different threads. Each waiter starts working at a certain time and checks every second whether they're still on the clock, right before they take an order, which takes a second to fulfill. They'll be working until the Event is set - or rather, until their working time is up.

Each waiter will have their own thread, while management resides in the main thread and decides when everyone can go home. Since they're feeling extra generous today, they'll cut the working time short and let the waiters go home after 4 seconds of work:

import threading
import time
import datetime

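def waiter(event, id):
    print(id, "Waiter started working at", datetime.datetime.now().time())
    # True only if the event is set during this first 1-second wait, False on timeout
    event_flag = event.wait(1)
    while not event.is_set():
        print(id, "Waiter is taking order at", datetime.datetime.now().time())
        # Taking the order takes a second, unless the event is set sooner
        event.wait(1)
    if event_flag:
        print(id, "Waiter is going home at", datetime.datetime.now().time())

end_of_work = threading.Event()

for id in range(1, 3):
    thread = threading.Thread(target=waiter, args=(end_of_work, id))
    thread.start()

# Management lets 4 seconds pass, then signals the end of work
time.sleep(4)
end_of_work.set()
print("Some time passes, management was nice and cut the working hours short. It is now", datetime.datetime.now().time())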

Running this code results in:

1 Waiter started working at 23:20:34.294844
2 Waiter started working at 23:20:34.295844
1 Waiter is taking order at 23:20:35.307072
2 Waiter is taking order at 23:20:35.307072
1 Waiter is taking order at 23:20:36.320314
2 Waiter is taking order at 23:20:36.320314
1 Waiter is taking order at 23:20:37.327528
2 Waiter is taking order at 23:20:37.327528
Some time passes, management was nice and cut the working hours short. It is now 23:20:38.310763

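The end_of_work event was used here to sync up the two threads and control when they work and when they stop, delaying the code execution by a set time between the checks. Note that event_flag only captures the result of the first wait(1) call, which times out in this run, so the "going home" message isn't printed in the output above.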


In this guide, we've taken a look at several ways to delay code execution in Python - each applicable to a different context and requirement.

The regular time.sleep() method is pretty useful for most applications, though it's blocking, isn't really optimal for long waiting times, and isn't commonly used for simple scheduling.

Using asyncio, we've got an asynchronous version of time.sleep() that we can await.

The Timer class delays code execution and can be cancelled if need be.

The Event class generates events that multiple threads can listen to and respond accordingly, delaying code execution until a certain event is set.