Python programmers make heavy use of arrays, lists, and dictionaries as data structures. Storing these data structures persistently requires either a file or a database. This article describes how to write a list to a file, and how to read that list back into memory.
To write data to a file, and to read data from a file, the Python programming language offers the standard methods write() and read() for basic string output and input, as well as writelines() and readlines() for dealing with multiple lines. Furthermore, both the pickle and the json module offer convenient ways of dealing with serialized data sets.
Using the read and write Methods
For lists of strings, the basic methods work well. Saving such a list line by line to the file listfile.txt can be done as follows:
# define list of places
places = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

with open('listfile.txt', 'w') as filehandle:
    for listitem in places:
        filehandle.write('%s\n' % listitem)
Inside the loop, each listitem is extended by a linebreak "\n" and then written to the output file. The following Python code reads the entire list from the file listfile.txt back into memory:
# define an empty list
places = []

# open file and read the content in a list
with open('listfile.txt', 'r') as filehandle:
    for line in filehandle:
        # remove linebreak which is the last character of the string
        currentPlace = line[:-1]

        # add item to the list
        places.append(currentPlace)
Keep in mind that you'll need to remove the linebreak from the end of each string. It helps here that Python supports slicing on strings as well as on lists. In the code above, the slice line[:-1] keeps everything but the last character, which is the linebreak "\n". In text mode, Python translates the platform's line endings to "\n" when reading, so this works the same way on UNIX/Linux and on other systems.
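The slice line[:-1] assumes that every line ends with a linebreak. If the last line of the file lacks one, the slice cuts off a real character instead. The following is a minimal sketch of a more forgiving variant; the helper name read_list() is hypothetical, and a temporary file stands in for listfile.txt so the example does not touch existing data:

```python
import os
import tempfile


def read_list(path):
    """Read one list item per line, dropping only the trailing linebreak."""
    items = []
    with open(path, 'r') as filehandle:
        for line in filehandle:
            # rstrip('\n') removes the linebreak if present and nothing else
            items.append(line.rstrip('\n'))
    return items


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'listfile.txt')

    # write a file whose last line has no trailing linebreak
    with open(path, 'w') as filehandle:
        filehandle.write('Berlin\nCape Town\nSydney\nMoscow')

    places = read_list(path)

print(places)  # ['Berlin', 'Cape Town', 'Sydney', 'Moscow']
```

With line[:-1] the last item would come back as 'Mosco', which is why stripping the linebreak explicitly is usually the safer choice.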
Using the writelines and readlines Methods
As mentioned at the beginning of this article, Python also provides the two methods writelines() and readlines() to write and read multiple lines in one step, respectively. To write the entire list to a file on disk the Python code is as follows:
# define list of places
places_list = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

with open('listfile.txt', 'w') as filehandle:
    filehandle.writelines("%s\n" % place for place in places_list)
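The generator expression above appends "\n" to every item because writelines() does not add line separators on its own. A small sketch of the difference, using an in-memory io.StringIO object in place of a real file (an assumption made only to keep the example self-contained):

```python
import io

places_list = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

# without explicit linebreaks all items run together
joined = io.StringIO()
joined.writelines(places_list)

# with a linebreak appended to each item, one item ends up per line
separated = io.StringIO()
separated.writelines(place + '\n' for place in places_list)

print(repr(joined.getvalue()))     # 'BerlinCape TownSydneyMoscow'
print(repr(separated.getvalue()))  # 'Berlin\nCape Town\nSydney\nMoscow\n'
```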
To read the entire list from a file on disk the Python code is as follows:
# define empty list
places = []

# open file and read the content in a list
with open('listfile.txt', 'r') as filehandle:
    filecontents = filehandle.readlines()

    for line in filecontents:
        # remove linebreak which is the last character of the string
        current_place = line[:-1]

        # add item to the list
        places.append(current_place)
The listing above follows a more traditional approach borrowed from other programming languages. To write it in a more Pythonic way have a look at the code below:
# define empty list
places = []

# open file and read the content in a list
with open('listfile.txt', 'r') as filehandle:
    places = [current_place.rstrip() for current_place in filehandle.readlines()]
Once the file listfile.txt is open, the list is rebuilt in a single list comprehension. Firstly, the file content is read via readlines(). Secondly, the for clause walks over the lines and removes the linebreak character from each one using the rstrip() method. Thirdly, each string becomes a new item in the list of places. Compared with the previous listing the code is much more compact, but it may be harder to read for beginner Python programmers.
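Note that rstrip() without an argument removes all trailing whitespace, not just the linebreak. Where trailing spaces matter, reading the whole file and calling splitlines() is one alternative. The sketch below compares the two; the sample content with a trailing space after 'Cape Town' is made up for illustration, and a temporary file is used instead of listfile.txt:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'listfile.txt')

    # the second item deliberately ends with a space
    with open(path, 'w') as filehandle:
        filehandle.write('Berlin\nCape Town \nSydney\n')

    # rstrip() drops the linebreak and the trailing space
    with open(path, 'r') as filehandle:
        stripped = [line.rstrip() for line in filehandle]

    # splitlines() drops only the linebreaks
    with open(path, 'r') as filehandle:
        split = filehandle.read().splitlines()

print(stripped)  # ['Berlin', 'Cape Town', 'Sydney']
print(split)     # ['Berlin', 'Cape Town ', 'Sydney']
```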
Using the pickle Module
The methods explained so far store the list in a way that humans can still read it. If this is not needed, the pickle module may come in handy. Its dump() method stores the list efficiently as a binary data stream. Firstly, the output file listfile.data is opened for binary writing ("wb"). Secondly, the list is stored in the opened file using the dump() method:
# load additional module
import pickle

# define a list of places
placesList = ['Berlin', 'Cape Town', 'Sydney', 'Moscow']

with open('listfile.data', 'wb') as filehandle:
    # store the data as binary data stream
    pickle.dump(placesList, filehandle)
As the next step we read the list back from the file. Firstly, the file listfile.data is opened for binary reading ("rb"). Secondly, the list of places is loaded from the file using the load() method:
# load additional module
import pickle

with open('listfile.data', 'rb') as filehandle:
    # read the data as binary data stream
    placesList = pickle.load(filehandle)
The two examples here use a list of strings. However, pickle works with most kinds of Python objects, such as strings, numbers, self-defined structures, and the other built-in data structures Python provides.
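As a rough illustration of this, the sketch below pickles a nested structure that mixes several built-in types. It uses pickle.dumps() and pickle.loads(), the in-memory counterparts of dump() and load(), so no file is needed; the sample record is made up:

```python
import pickle

# a nested structure mixing several built-in types (sample data is made up)
record = {
    'places': ['Berlin', 'Cape Town'],
    'visits': (3, 7),
    'rating': 4.6,
    'tags': {'coast', 'capital'},
}

# serialize to a bytes object and restore it again
data = pickle.dumps(record)
restored = pickle.loads(data)

print(isinstance(data, bytes))    # True
print(restored == record)         # True
print(type(restored['visits']))   # <class 'tuple'>
```

Tuples and sets come back with their original types, which is one practical difference from text-based formats.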
Using the JSON Format
The binary data format that pickle uses is specific to Python. JSON, by contrast, is a text format that programs written in other languages can read as well. The following example demonstrates how to write a list of mixed variable types to an output file using the json module. After the basic list is defined and the output file is opened for writing, the dump() method stores the list in the file using JSON notation.
import json

# define list with values
basicList = [1, "Cape Town", 4.6]

# open output file for writing
with open('listfile.txt', 'w') as filehandle:
    json.dump(basicList, filehandle)
Reading the contents of the output file back into memory is as simple as writing the data. The counterpart to dump() is named load(), and works as follows:
import json

# open output file for reading
with open('listfile.txt', 'r') as filehandle:
    basicList = json.load(filehandle)
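One detail worth knowing is that JSON has fewer types than Python, so a round trip does not always return exactly what went in. The sketch below uses json.dumps() and json.loads(), the string-based counterparts of dump() and load(); the nested tuple is added here purely for illustration:

```python
import json

# list with mixed values, including a tuple (added for illustration)
basic_list = [1, "Cape Town", 4.6, ("Sydney", "Moscow")]

# serialize to a JSON string and parse it again
text = json.dumps(basic_list)
restored = json.loads(text)

print(text)      # [1, "Cape Town", 4.6, ["Sydney", "Moscow"]]
print(restored)  # [1, 'Cape Town', 4.6, ['Sydney', 'Moscow']]
```

The tuple comes back as a list because JSON only has arrays; the numbers and strings survive unchanged.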
The methods shown above range from simple writing and reading of text up to dumping and loading data using pickle, which produces a binary stream, and JSON, which produces text. This simplifies storing a list persistently and reading it back into memory.
The author would like to thank Zoleka Hatitongwe for her support while preparing the article.