Guide to Sending HTTP Requests in Python with urllib3

Introduction

Resources on the Web are located under some kind of web-address (even if they're not accessible), oftentimes referred to as a URL (Uniform Resource Locator). These resources are, most of the time, manipulated by an end-user (retrieved, updated, deleted, etc.) using the HTTP protocol through respective HTTP Methods.

In this guide, we'll be taking a look at how to leverage the urllib3 library, which allows us to send HTTP Requests through Python, programmatically.

Note: The examples in this guide assume Python 3.x; urllib3 2.x no longer supports Python 2.

What is HTTP?

HTTP (HyperText Transfer Protocol) is a data transfer protocol used for, typically, transmitting hypermedia documents, such as HTML, but can also be used to transfer JSON, XML or similar formats. It's applied in the Application Layer of the OSI Model, alongside other protocols such as FTP (File Transfer Protocol) and SMTP (Simple Mail Transfer Protocol).

HTTP is the backbone of the World Wide Web as we know it today and its main task is to enable a communication channel between web browsers and web servers, through a lifecycle of HTTP Requests and HTTP Responses - the fundamental communication components of HTTP.

It's based on the client-server model where a client requests a resource, and the server responds with the resource - or a lack thereof.

A typical HTTP Request may look something like:

GET /tag/java/ HTTP/1.1
Host: stackabuse.com
Accept: */*
User-Agent: Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion

If the server finds the resource, the HTTP Response's header will contain data on how the request/response cycle fared:

HTTP/1.1 200 OK
Date: Thu, 22 Jul 2021 18:16:38 GMT
Content-Type: text/html; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
...

And the response body will contain the actual resource - which in this case is an HTML page:

<!DOCTYPE html>
<html lang="en">
   <head>
      <meta name="twitter:title" content="Stack Abuse"/>
      <meta name="twitter:description" content="Learn Python, Java, JavaScript/Node, Machine Learning, and Web Development through articles, code examples, and tutorials for developers of all skill levels."/>
      <meta name="twitter:url" content="https://stackabuse.com"/>
      <meta name="twitter:site" content="@StackAbuse"/>
      <meta name="next-head-count" content="16"/>
   </head>
...
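To make the structure of the response head shown above more concrete, here is a minimal sketch that splits it into a status line and headers. The parse_response_head() helper is hypothetical and deliberately simplified; real HTTP parsers handle many more edge cases, and urllib3 does all of this for you.

```python
# The response head from above, with the line endings HTTP uses on the wire.
raw_head = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Thu, 22 Jul 2021 18:16:38 GMT\r\n"
    "Content-Type: text/html; charset=utf-8\r\n"
    "Transfer-Encoding: chunked\r\n"
    "Connection: keep-alive\r\n"
)


def parse_response_head(head: str):
    """Split a raw response head into (version, status, reason, headers)."""
    lines = head.strip().split("\r\n")
    # The first line is the status line: protocol version, code and reason.
    version, status, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, since values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(status), reason, headers


version, status, reason, headers = parse_response_head(raw_head)
print(status, reason)
print(headers["content-type"])
```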

The urllib3 Module

The urllib3 module is a third-party HTTP client for Python. Despite its name, it is not part of the standard library and is not a direct successor to urllib2. It supports file uploads with multipart encoding, gzip, connection pooling and thread safety. It is often already present in a Python 3.x environment as a dependency of other packages, but if that's not the case for you, it can easily be installed with:

$ pip install urllib3

You can check your version of urllib3 by accessing the __version__ of the module:

import urllib3

# This tutorial is done with urllib3 version 1.25.8
print(urllib3.__version__)

Alternatively, you can use the Requests module, which is built on top of urllib3. It offers a higher-level, more human-centered API for the same kinds of HTTP requests. If you'd like to read more about it - read our Guide to the Requests Module in Python.

HTTP Status Codes

Whenever an HTTP request is sent - the response, other than the requested resource (if available and accessible), also contains an HTTP Status Code, signifying how the operation went. It is paramount that you know what the status code you got means, or at least what it broadly implies.

Is there a problem? If so, is it due to the request, the server or me?

There are five different groups of response codes:

  1. Informational codes (between 100 and 199)
  2. Successful codes (between 200 and 299) - 200 is the most common one
  3. Redirect codes (between 300 and 399)
  4. Client error codes (between 400 and 499) - 404 is the most common one
  5. Server error codes (between 500 and 599) - 500 is the most common one
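The five groups above map directly onto the first digit of the status code. As a minimal sketch, a hypothetical status_class() helper (not part of urllib3) could classify any code like this:

```python
def status_class(status: int) -> str:
    """Return the name of the response group an HTTP status code belongs to."""
    names = {
        1: "informational",
        2: "successful",
        3: "redirect",
        4: "client error",
        5: "server error",
    }
    if not 100 <= status <= 599:
        raise ValueError(f"not a valid HTTP status code: {status}")
    # The hundreds digit identifies the group.
    return names[status // 100]


for code in (200, 404, 500):
    print(code, status_class(code))
```

A helper like this can be handy when you only care about the broad outcome of a request and not the exact code.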

To send requests using urllib3, we use an instance of the PoolManager class, which takes care of the actual requests for us - covered shortly.

All responses to these requests are packed into an HTTPResponse instance, which, naturally, contains the status of that response:

import urllib3 

http = urllib3.PoolManager()

response = http.request("GET", "http://www.stackabuse.com")
print(response.status) # Prints 200

You can use these statuses to alter the logic of the code - if the result is 200 OK, not much probably needs to be done further. However, if the result is a 405 Method Not Allowed response - your request was probably badly constructed.

If a website responds with a 418 I'm a teapot status code, albeit rare - it's letting you know that you can't brew coffee with a teapot. In practice, this typically means that the server doesn't want to respond to the request, and never will. If it were a temporary halt for certain requests - a 503 Service Unavailable status code is much more fitting.

Note: The 418 I'm a teapot status code is a real but playful status code, added as an April Fools' joke.

The Pool Manager

A Connection Pool is a cache of open connections that can be reused by future requests, which avoids the cost of establishing a new connection every time. When urllib3 sends several requests to the same host, it draws connections from such a pool instead of opening a fresh one for each request.

urllib3 keeps track of requests and their connections through the ConnectionPool and HTTPConnection classes. Since making these by hand leads to a lot of boilerplate code - we can delegate the entirety of the logic to the PoolManager, which automatically creates connections and adds them to the pool. By adjusting the num_pools argument, we can set the number of pools it'll use:

import urllib3

http = urllib3.PoolManager(num_pools=3)

response1 = http.request("GET", "http://www.stackabuse.com")
response2 = http.request("GET", "http://www.google.com")

Through the PoolManager, we can send a request(), passing in the HTTP Verb and the address we're sending a request to. Different verbs signify different intents - whether you want to GET some content, POST it to a server, PATCH an existing resource or DELETE one.

How to Send HTTP Requests in Python with urllib3

Finally, let's take a look at how to send different request types via urllib3, and how to interpret the data that's returned.

Send HTTP GET Request

An HTTP GET request is used when a client requests to retrieve data from a server, without modifying it in any way, shape or form.

To send an HTTP GET request in Python, we use the request() method of the PoolManager instance, passing in the appropriate HTTP Verb and the resource we're sending a request for:

import urllib3

http = urllib3.PoolManager()

response = http.request("GET", "http://jsonplaceholder.typicode.com/posts/")

print(response.data.decode("utf-8"))

Here, we sent a GET request to {JSON} Placeholder. It's a website that generates dummy JSON data, sent back in the response's body. Typically, the website is used to test HTTP Requests on, stubbing the response.

The HTTPResponse instance, namely our response object, holds the body of the response. It can be accessed through the data property, which is a bytes object. Since we'll want to work with the body as a str, we decode() the bytes, interpreting them as UTF-8, so that we can coherently parse the data.

If you'd like to read more, read our guide to Converting Bytes to Strings in Python.

Finally, we print the response's body:

[
  {
    "userId": 1,
    "id": 1,
    "title": "sunt aut facere repellat provident occaecati excepturi optio reprehenderit",
    "body": "quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto"
  },
  {
    "userId": 1,
    "id": 2,
    "title": "qui est esse",
    "body": "est rerum tempore vitae\nsequi sint nihil reprehenderit dolor beatae ea dolores neque\nfugiat blanditiis voluptate porro vel nihil molestiae ut reiciendis\nqui aperiam non debitis possimus qui neque nisi nulla"
  },
...
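Since the body above is JSON text, decoding it is usually followed by parsing it. Here is a minimal sketch of those two steps, using a shortened, hard-coded stand-in for response.data so that it runs without a network connection:

```python
import json

# A shortened stand-in for response.data: the raw bytes of a JSON body.
raw_body = b'[{"userId": 1, "id": 1, "title": "qui est esse"}]'

text = raw_body.decode("utf-8")  # bytes -> str
posts = json.loads(text)         # str -> Python list of dictionaries

print(type(text).__name__, type(posts).__name__)
print(posts[0]["title"])
```

Once parsed, the posts can be handled like any other Python list of dictionaries.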

Send HTTP GET Request with Parameters

Rarely do we not add certain parameters to requests. Path variables and request parameters are very common and allow for dynamic linking structures and organizing resources. For instance - we may want to search for a specific comment on a certain post through an API - http://random.com/posts/get?id=1&commentId=1.

Naturally, urllib3 allows us to add parameters to GET requests, via the fields argument. It accepts a dictionary of the parameter names and their values:

import urllib3 

http = urllib3.PoolManager()

response = http.request("GET",
                        "http://jsonplaceholder.typicode.com/posts/", 
                        fields={"id": "1"})

print(response.data.decode("utf-8"))

This will return only one object, with an id of 1:

[
  {
    "userId": 1,
    "id": 1,
    "title": "sunt aut facere repellat provident occaecati excepturi optio reprehenderit",
    "body": "quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto"
  }
]
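For GET requests, urllib3 URL-encodes the fields and appends them to the address as a query string. As a rough sketch of the resulting URL, here is a hypothetical build_url() helper based on the standard library's urlencode(); it only illustrates the idea and isn't how you'd normally call urllib3:

```python
from urllib.parse import urlencode


def build_url(base: str, fields: dict) -> str:
    """Append fields to base as a query string (hypothetical helper)."""
    return f"{base}?{urlencode(fields)}" if fields else base


print(build_url("http://jsonplaceholder.typicode.com/posts/", {"id": "1"}))
print(build_url("http://random.com/posts/get", {"id": 1, "commentId": 1}))
```

Letting the library build the query string also takes care of escaping characters that aren't allowed in a URL.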

HTTP POST Request

An HTTP POST request is used for sending data from the client side to the server side. Its most common usage is with file-uploading or form-filling, but it can be used to send any data to a server, with a payload:

import urllib3

http = urllib3.PoolManager()
response = http.request("POST", "http://jsonplaceholder.typicode.com/posts", fields={"title": "Created Post", "body": "Lorem ipsum", "userId": 5})

print(response.data.decode("utf-8"))

Even though we're communicating with the same web address, because we're sending a POST request, the fields argument will now specify the data that'll be sent to the server, not retrieved.

We've sent the fields as form data (urllib3 encodes them as multipart/form-data by default), describing an object with a title, body and userId. The {JSON} Placeholder service also stubs the functionality to add entities, so it returns a response letting us know if we've been able to "add" it to the database, and returns the id of the "created" post:

{
  "id": 101
}
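When passed through fields, a POST payload is form-encoded. If a server expects an actual JSON body, one common approach is to serialize the payload yourself and pass it through the body argument along with a Content-Type header. Here is a minimal sketch; the build_json_request() helper is hypothetical, and the sending part is left as a comment so the snippet runs offline:

```python
import json


def build_json_request(payload: dict):
    """Return (body, headers) for a JSON request (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return body, headers


body, headers = build_json_request(
    {"title": "Created Post", "body": "Lorem ipsum", "userId": 5}
)
print(body)

# With a PoolManager instance named http, the request could then be sent as:
# response = http.request(
#     "POST",
#     "http://jsonplaceholder.typicode.com/posts",
#     body=body,
#     headers=headers,
# )
```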

HTTP DELETE Request

To send HTTP DELETE requests, we simply modify the verb to "DELETE" and target a specific post via its id. Let's delete the posts with the ids of 1 through 4:

import urllib3

http = urllib3.PoolManager()
for i in range(1, 5):
    # Each post is addressed by putting its id in the URL path
    response = http.request("DELETE", f"http://jsonplaceholder.typicode.com/posts/{i}")
    print(response.data.decode("utf-8"))

An empty JSON object is returned for each request, as the resources are deleted:

{}
{}
{}
{}

When creating a REST API - you'll probably want to give some status code and message to let the user know that a resource has been deleted successfully.

Send HTTP PATCH Requests

While we can use POST requests to update resources, it's considered good practice to keep POST requests for creating resources only. Instead, we can fire a PATCH request to update an existing resource.

Let's get the first post and then update it with a new title and body:

import urllib3

data = {
    'title': 'Updated title',
    'body': 'Updated body'
}

http = urllib3.PoolManager()

response = http.request("GET", "http://jsonplaceholder.typicode.com/posts/1")
print(response.data.decode('utf-8'))

response = http.request("PATCH", "https://jsonplaceholder.typicode.com/posts/1", fields=data)
print(response.data.decode('utf-8'))

This should result in:

{
  "userId": 1,
  "id": 1,
  "title": "sunt aut facere repellat provident occaecati excepturi optio reprehenderit",
  "body": "quia et suscipit\nsuscipit recusandae consequuntur expedita et cum\nreprehenderit molestiae ut ut quas totam\nnostrum rerum est autem sunt rem eveniet architecto"
}
{
  "userId": 1,
  "id": 1,
  "title": "Updated title",
  "body": "Updated body"
}
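Note how the second object keeps its userId and id, even though we only sent a title and body. That is the point of PATCH: only the supplied keys change. Here is a minimal sketch of that merge, with a hypothetical apply_patch() helper and a shortened stand-in for the original post:

```python
# A shortened stand-in for the post returned by the GET request.
original = {
    "userId": 1,
    "id": 1,
    "title": "Original title",
    "body": "Original body",
}


def apply_patch(resource: dict, changes: dict) -> dict:
    """Return a copy of resource with only the keys in changes replaced."""
    return {**resource, **changes}


patched = apply_patch(original, {"title": "Updated title", "body": "Updated body"})
print(patched)
```

How a real server applies a PATCH is up to its implementation; this only mirrors the behavior seen in the output above.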

Send Secure HTTPS Requests in Python with urllib3

The urllib3 module also provides client-side SSL verification for secure HTTP connections. We can achieve this with the help of another module, called certifi, which provides the standard Mozilla certificate bundle.

Its installation is pretty straightforward via pip:

$ pip install certifi

With certifi.where(), we get the path to the installed Certificate Authority (CA) bundle. A CA is an entity that issues digital certificates that can be trusted. The certifi module ships these trusted root certificates, and passing the bundle's path as ca_certs tells urllib3 what to verify server certificates against:

import urllib3
import certifi

http = urllib3.PoolManager(ca_certs=certifi.where())
response = http.request("GET", "https://httpbin.org/get")

print(response.status)

Now, we can send a secure request to the server.

Uploading Files with urllib3

Using urllib3, we can also upload files to a server. To upload files, we encode the data as multipart/form-data, and pass in the filename as well as its contents as a tuple of (file_name, file_data).

To read the contents of a file, we can open it and call the file object's read() method:

import urllib3
import json

# Read the file's contents
with open("file_name.txt") as f:
    file_data = f.read()

http = urllib3.PoolManager()

# Sending the request; the tuple is (file_name, file_data)
resp = http.request(
    "POST",
    "https://reqbin.com/post-online",
    fields={
        "file": ("file_name.txt", file_data),
    }
)

print(json.loads(resp.data.decode("utf-8"))["files"])

For the purpose of the example, let's create a file named file_name.txt and add some content:

Some file data
And some more

Now, when we run the script, it should print out:

{'file': 'Some file data\nAnd some more'}

When we send files this way, the JSON in the response's body contains a "files" attribute. Since resp.data.decode("utf-8") is just a str, we use the json module to parse it into a dictionary first, and then access the attribute through json.loads(resp.data.decode("utf-8"))["files"].

You can also supply a third argument to the tuple, which specifies the MIME type of the uploaded file:

... previous code
fields={
  "file": ("file_name.txt", file_data, "text/plain"),
}
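To get a feel for what multipart/form-data looks like on the wire, urllib3's encode_multipart_formdata() function can be called directly. This is a sketch for inspection only: the boundary string is an arbitrary value chosen for illustration, and request() normally performs this encoding for you.

```python
from urllib3.filepost import encode_multipart_formdata

file_data = "Some file data\nAnd some more"

# The same (file_name, file_data, mime_type) tuple as used with fields above.
fields = {
    "file": ("file_name.txt", file_data, "text/plain"),
}

# A fixed boundary keeps the output reproducible; by default one is generated.
body, content_type = encode_multipart_formdata(fields, boundary="boundary123")

print(content_type)
print(body.decode("utf-8"))
```

The returned content type is what ends up in the request's Content-Type header, so the server knows which boundary separates the parts.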

Conclusion

In this guide, we've taken a look at how to send HTTP Requests using urllib3, a powerful Python module for handling HTTP requests and responses.

We've also taken a look at what HTTP is, what status codes to expect and how to interpret them, as well as how to upload files and send secure requests with certifi.

Last Updated: July 23rd, 2021