Reading and Writing Lists to a File in Python

Introduction

Python programmers make heavy use of arrays, lists, and dictionaries as in-memory data structures. To keep such data beyond the lifetime of a program, it has to be stored persistently, for example in a file or a database.

In this article, we'll take a look at how to write a list to file, and how to read that list back into memory.

To write data to a file and to read it back, Python's file objects offer the standard methods write() and read(), which handle a single string at a time, as well as writelines() and readlines(), which handle multiple lines. Furthermore, both the pickle and the json modules offer convenient ways of serializing whole data structures.

Using the read() and write() Methods

For plain strings, the basic read() and write() methods work well. Saving a list of strings line by line into the file listfile.txt can be done as follows:

# Define a list of places
places = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

with open('listfile.txt', 'w') as filehandle:
    for listitem in places:
        filehandle.write(f'{listitem}\n')

Each listitem is extended by a line break "\n" and then written to the output file. Now we can take a look at how to read the entire list from the file listfile.txt back into memory:

# Define an empty list
places = []

# Open the file and read the content in a list
with open('listfile.txt', 'r') as filehandle:
    for line in filehandle:
        # Remove linebreak which is the last character of the string
        curr_place = line[:-1]
        # Add item to the list
        places.append(curr_place)

Keep in mind that you'll need to remove the line break from the end of each string. Here it helps that Python strings support slicing, just like lists: the slice line[:-1] keeps everything but the last character. That character is the line break "\n". In text mode, Python translates the platform's line ending to "\n" when reading, so the same code works on UNIX/Linux and on Windows.
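
The slice assumes that every line ends with a line break. If the last line of a file lacks one, the slice cuts off a real character instead. A minimal sketch of a more forgiving variant, using rstrip('\n'); the helper name read_places and the file name listfile_demo.txt are assumptions made for illustration:

```python
import os
import tempfile

def read_places(path):
    """Read one list item per line, removing only a trailing line break."""
    places = []
    with open(path, 'r') as filehandle:
        for line in filehandle:
            # rstrip('\n') removes a trailing line break if present
            # and leaves the string untouched otherwise
            places.append(line.rstrip('\n'))
    return places

# Demo file whose last line has no trailing line break (hypothetical name)
demo_path = os.path.join(tempfile.gettempdir(), 'listfile_demo.txt')
with open(demo_path, 'w') as filehandle:
    filehandle.write('Berlin\nCape Town\nSydney\nMoscow')

places = read_places(demo_path)
print(places)  # ['Berlin', 'Cape Town', 'Sydney', 'Moscow']
```

With line[:-1], the last item of this demo file would come back as 'Mosco'.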

Using the writelines() and readlines() Methods

As mentioned at the beginning of this article, Python file objects also provide the two methods writelines() and readlines() to write and read multiple lines in one step, respectively. Let's write the entire list to a file on disk:

# Define a list of places
places_list = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

with open('listfile.txt', 'w') as filehandle:
    # Pass one string per item, each ending with a line break
    filehandle.writelines(f"{place}\n" for place in places_list)
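
Despite its name, writelines() does not add line breaks on its own; it writes each string it receives as-is. The following sketch illustrates the difference, using an in-memory io.StringIO buffer as a stand-in for a file (an assumption made to keep the example self-contained):

```python
import io

places_list = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

# An in-memory text buffer stands in for a file opened with mode 'w'
without_breaks = io.StringIO()
without_breaks.writelines(places_list)

with_breaks = io.StringIO()
with_breaks.writelines(f'{place}\n' for place in places_list)

print(repr(without_breaks.getvalue()))  # 'BerlinCape TownSydneyMoscow'
print(repr(with_breaks.getvalue()))     # 'Berlin\nCape Town\nSydney\nMoscow\n'
```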

To read the entire list back from the file on disk, we can use readlines():

# Define an empty list
places = []

# Open the file and read the content in a list
with open('listfile.txt', 'r') as filehandle:
    filecontents = filehandle.readlines()
    for line in filecontents:
        # Remove linebreak which is the last character of the string
        curr_place = line[:-1]
        # Add item to the list
        places.append(curr_place)

The code above follows a more traditional approach borrowed from other programming languages. Let's write it in a more Pythonic way:

# Define an empty list
places = []

# Open the file and read the content in a list
with open('listfile.txt', 'r') as filehandle:
    places = [current_place.rstrip() for current_place in filehandle.readlines()]

Firstly, the file content is read via readlines(). Secondly, the list comprehension removes trailing whitespace, including the line break, from each line using the rstrip() method. Thirdly, each resulting string becomes an item of the new places list.

Compared with the previous listing, the code is much more compact, but it may be more difficult to read for beginner Python programmers.
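
Another compact option is to read the whole file with read() and split it with str.splitlines(), which drops the line breaks in the same step. A minimal round-trip sketch; the helper names write_list and read_list and the file name listfile_roundtrip.txt are assumptions made for illustration:

```python
import os
import tempfile

def write_list(path, items):
    """Write each item on its own line."""
    with open(path, 'w') as filehandle:
        filehandle.writelines(f'{item}\n' for item in items)

def read_list(path):
    """Read the file in one go and split it into lines without line breaks."""
    with open(path, 'r') as filehandle:
        return filehandle.read().splitlines()

# Hypothetical file name in the system's temporary directory
path = os.path.join(tempfile.gettempdir(), 'listfile_roundtrip.txt')
original = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

write_list(path, original)
restored = read_list(path)
print(restored == original)  # True
```

This only round-trips cleanly when no item contains a line break itself; for such data, a serialization format like pickle or JSON is the safer choice.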

Using the Joblib Module

The methods explained up to now store the list in a way that humans can still read it: quite literally a sequential list in a file. This is great for creating simple reports or export files that users work with further, such as CSV files. However, if your aim is just to serialize a list into a file that can be loaded later, there's no need to store it in a human-readable format.

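The joblib module provides an easy way to dump a Python object to a file. It is a third-party package rather than part of the standard library, so it has to be installed first (for example with pip install joblib):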

import joblib

places = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']
# Dumps into file
joblib.dump(places, 'places.sav')
# Loads from file
places = joblib.load('places.sav')
print(places) # ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

joblib is a simple and clean way to serialize objects in an efficient format and load them later. The file extension is largely a matter of convention: .sav, .data, and similar names all work. Extensions that name a compression format, such as .gz, are the exception, since joblib can use them to select a compressor.

Using the pickle Module

As an alternative to joblib, we can use pickle, which is part of Python's standard library. Its dump() method stores the list efficiently as a binary data stream. Firstly, the output file listfile.data is opened for binary writing ("wb"). Secondly, the list is stored in the opened file using the dump() method:

import pickle

places = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

with open('listfile.data', 'wb') as filehandle:
    # Store the data as a binary data stream
    pickle.dump(places, filehandle)

As the next step, we read the list back from the file. Firstly, the file listfile.data is opened for binary reading ("rb"). Secondly, the list of places is loaded from the file using the load() method:

import pickle

with open('listfile.data', 'rb') as filehandle:
    # Read the data as a binary data stream
    placesList = pickle.load(filehandle)

The two examples here use a list of strings. However, pickle works with many kinds of Python objects, such as strings, numbers, self-defined structures, and the built-in container types Python provides.
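
To illustrate this, here is a small sketch that pickles a list mixing several built-in types, including nested containers, and checks that it comes back unchanged. The variable names and the file name records_demo.data are assumptions made for illustration:

```python
import os
import pickle
import tempfile

# A list mixing several built-in types, including nested containers
records = [
    'Berlin',
    42,
    3.14,
    ('Cape Town', -33.92, 18.42),
    {'city': 'Sydney', 'tags': {'harbour', 'opera'}},
    None,
]

# Hypothetical file name in the system's temporary directory
path = os.path.join(tempfile.gettempdir(), 'records_demo.data')

with open(path, 'wb') as filehandle:
    pickle.dump(records, filehandle)

with open(path, 'rb') as filehandle:
    restored = pickle.load(filehandle)

print(restored == records)          # True
print(type(restored[3]).__name__)   # tuple
```

Unlike the text-based approaches, the types survive the round trip: the tuple is still a tuple and the set is still a set. Keep in mind that pickle.load() should only be used on files from a trusted source, since loading a manipulated pickle file can execute arbitrary code.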

Advice: For a detailed guide on pickling objects in general, read our "How to Pickle and Unpickle Objects in Python"!

Using the JSON Format

The binary data format pickle uses is specific to Python. To improve interoperability between different programs, the JavaScript Object Notation (JSON) provides an easy-to-use and human-readable text format, which has made it very popular for serializing data and sharing it over APIs.

The following example demonstrates how to write a list of mixed variable types to an output file using the json module. Having opened the output file for writing, the dump() method stores the basic list in the file using the JSON notation:

import json

# Define list with values
basic_list = [1, "Cape Town", 4.6]

# Open output file for writing
with open('listfile.txt', 'w') as filehandle:
    json.dump(basic_list, filehandle)

Reading the contents of the output file back into memory is as simple as writing the data. The corresponding method to dump() is named load():

import json

# Open output file for reading
with open('listfile.txt', 'r') as filehandle:
    basic_list = json.load(filehandle)
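
JSON only knows a handful of types, so not every Python list comes back exactly as it was written. The sketch below, which extends the example list with a tuple and a dictionary as an assumption for illustration, uses the string-based counterparts dumps() and loads() to show two common conversions:

```python
import json

# The tuple and the dictionary with an integer key are added for illustration
basic_list = [1, "Cape Town", 4.6, ("Berlin", "Moscow"), {1: "one"}]

# dumps()/loads() work like dump()/load(), but on strings instead of files
encoded = json.dumps(basic_list)
decoded = json.loads(encoded)

print(encoded)  # [1, "Cape Town", 4.6, ["Berlin", "Moscow"], {"1": "one"}]
print(decoded)  # [1, 'Cape Town', 4.6, ['Berlin', 'Moscow'], {'1': 'one'}]
print(decoded == basic_list)  # False
```

The tuple is read back as a list, and the integer dictionary key as a string. If exact Python types matter, pickle is the better fit; if other programs need to read the file, JSON is.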

Conclusion

The methods shown above range from simply writing and reading lines of text, through dumping and loading binary data with joblib and pickle, to exchanging data as text in the JSON format. Each of them makes it straightforward to store a list persistently and read it back into memory.

Last Updated: September 20th, 2022

Author: Frank Hofmann

IT developer, trainer, and author. Coauthor of the Debian Package Management Book (http://www.dpmb.org/).

© 2013-2024 Stack Abuse. All rights reserved.