Saving data to a file is one of the most common programming tasks you'll come across as a developer.
Generally, programs take some input and produce some output, and there are many cases in which we'd want to persist those results. Whether it's data scraped from webpages, dumps of tabular data for reports, machine learning training data, or logs produced at runtime, we rely on applications to write files rather than doing it by hand.
Python allows us to save files of various types without having to use third-party libraries. In this article, we'll dive into saving the most common file formats in Python.
Opening and Closing a File
Opening a File
The contents of a file can be accessed when it's opened, and it's no longer available for reading and writing after it's been closed.
Opening a file is simple in Python:
my_data_file = open('data.txt', 'w')
When opening a file you'll need the filename, a string that can be a relative or absolute path. The second argument is the mode, which determines what you can do with the open file.
Here are some of the commonly used ones:
- 'r' - (default mode) open the file for reading
- 'w' - open the file for writing, overwriting the content if the file already exists
- 'x' - create a new file, failing if it already exists
- 'a' - open the file for writing, appending new data at the end of the file's contents if it already exists
- 'b' - work with binary data instead of the default text data
- '+' - allow both reading and writing when combined with another mode
Let's say you wanted to write to a file and then read it back: your mode should be 'w+'. If you wanted to write to and then read from a file without deleting its previous contents, you'd use 'a+'.
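As a rough illustration of the two combined modes, here is a minimal sketch. The file name notes.txt and the temporary directory are assumptions made for the example so that nothing real gets overwritten. After writing, the file position sits at the end of the data, so the sketch calls seek(0) before reading:

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not from the article).
path = os.path.join(tempfile.mkdtemp(), 'notes.txt')

# 'w+' truncates (or creates) the file, then allows both writing and reading.
with open(path, 'w+') as f:
    f.write('first line\n')
    f.seek(0)  # move back to the start before reading
    after_write = f.read()

# 'a+' keeps the existing contents; new data is added at the end of the file.
with open(path, 'a+') as f:
    f.write('second line\n')
    f.seek(0)  # rewind again to read the whole file
    after_append = f.read()

print(repr(after_write))
print(repr(after_append))
```

Opening the same file with 'w+' a second time would have discarded the first line, which is the practical difference between the two modes.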
Closing a File
Closing a file is even easier in Python:
my_data_file.close()
You simply need to call the close() method on the file object. It's important to close the file once you're finished with it, and there are good reasons to do so:
- Open files take up space in RAM
- Closing a file lowers the chance of data corruption, as it can no longer be accessed
- Your OS limits how many files can be open at once
For small scripts, these aren't pressing concerns, and some Python implementations will close files for you automatically. In large programs, though, don't leave closing your files to chance: make sure you free up the resources you've used.
Using the "with" Keyword
Closing a file is easy to forget; we're human, after all. Luckily for us, Python has a mechanism to use a file and automatically close it when we're done.
To do this, we simply need to use the with keyword:
with open('data.txt', 'w') as my_data_file:
    # TODO: write data to the file
    pass  # placeholder so the block is valid Python
# After leaving the above block of code, the file is closed
The file will be open for all the code that's indented after using the with keyword, marked as the # TODO comment. Once that block of code is complete, the file will be automatically closed.
This is the recommended way to open and write to a file: you don't have to close it manually to free up resources, and the file is closed even if the code inside the block raises an exception, so you can keep your mind on the more important aspects of your program.
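To see the safety net in action, here is a small sketch in which an error is raised on purpose in the middle of a with block. The ValueError and the temporary file path are assumptions made for the demonstration. The closed attribute of the file object reports whether the file has been closed:

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not from the article).
path = os.path.join(tempfile.mkdtemp(), 'data.txt')

try:
    with open(path, 'w') as my_data_file:
        my_data_file.write('partial results\n')
        # Simulated failure in the middle of our work.
        raise ValueError('something went wrong mid-write')
except ValueError:
    pass  # the error is only here to demonstrate the cleanup

# The with statement closed the file on the way out, despite the exception.
closed_after_error = my_data_file.closed

# Closing also flushed the data written before the failure.
with open(path) as f:
    saved = f.read()

print(closed_after_error)
print(repr(saved))
```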
Saving a Text File
Now that we know the best way to access a file, let's get straight into writing data.
Fortunately, Python makes this straightforward as well:
with open('do_re_mi.txt', 'w') as f:
    f.write('Doe, a deer, a female deer\n')
    f.write('Ray, a drop of golden sun\n')
The write() method takes a string and writes it to the file stream. Although we don't store it, write() returns the number of characters it wrote, i.e. the length of the input string.
Note: Notice the inclusion of the newline character, \n. It moves the text that follows onto the next line of the file; without it, all the text would be written as a single line.
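If you're curious about the value that write() returns, the following sketch stores it and compares it with the length of the string. The temporary file path is an assumption made for the example:

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not from the article).
path = os.path.join(tempfile.mkdtemp(), 'do_re_mi.txt')

line = 'Doe, a deer, a female deer\n'

with open(path, 'w') as f:
    # write() hands back how many characters it wrote.
    written = f.write(line)

# The newline counts as a single character in text mode.
print(written == len(line))
```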
Saving Multiple Lines at Once
With the write() method we can take one string and put it into a file. What if we wanted to write multiple lines at once?
We can use the writelines() method to write the items of a sequence (like a list or tuple) to a file:
with open('browsers.txt', 'w') as f:
    web_browsers = ['Firefox\n', 'Chrome\n', 'Edge\n']
    f.writelines(web_browsers)
As before, if we want the data to appear in new lines we include the newline character at the end of each string.
If you'd like to skip the step of manually entering the newline character after each item in the list, it's easy to automate it:
with open('browsers.txt', 'w') as f:
    # No newline characters in the list; they're added while writing
    web_browsers = ['Firefox', 'Chrome', 'Edge']
    f.writelines("%s\n" % line for line in web_browsers)
Note: The input for writelines() must be a flat sequence of strings (or of bytes, if the file was opened in binary mode) - no numbers, objects or nested sequences like a list within a list are allowed.
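When the data isn't made of strings to begin with, one option is to convert each item on the way in. The sketch below assumes a hypothetical list of integers called daily_sales and a temporary file path, neither of which comes from the examples above:

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not from the article).
path = os.path.join(tempfile.mkdtemp(), 'daily_sales.txt')

daily_sales = [10, 8, 19, 12, 25]  # hypothetical numeric data

with open(path, 'w') as f:
    # Passing the integers directly would raise a TypeError, so each
    # number is turned into a string with a trailing newline first.
    f.writelines(f'{amount}\n' for amount in daily_sales)

# Read the file back to check what ended up on each line.
with open(path) as f:
    lines = f.read().splitlines()

print(lines)
```

A generator expression is used here so the converted strings are produced one at a time rather than being collected in a second list first.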
If you're interested in reading more about lists and tuples, we already have an article dedicated to them - Lists vs Tuples in Python.
Saving a CSV File
CSV (Comma Separated Values) files are commonly used for storing tabular data. Because the format is so popular, Python ships with a built-in csv module that makes writing files of that type easier:
import csv

weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
sales = ['10', '8', '19', '12', '25']

# newline='' lets the csv module handle line endings itself
with open('sales.csv', 'w', newline='') as csv_file:
    csv_writer = csv.writer(csv_file, delimiter=',')
    csv_writer.writerow(weekdays)
    csv_writer.writerow(sales)
We first need to import the csv library to get its helper functions. We open the file as we're accustomed to, but instead of writing content through the csv_file object directly, we create a new object called csv_writer.
This object provides us with the writerow() method, which allows us to put all of a row's data in the file in one go.
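When there are many rows, the writer object also offers writerows(), which takes a sequence of rows. The sketch below is only an illustration: the rows list and the temporary file path are made up for the example. It opens the file with newline='', as the csv module documentation recommends, and reads the result back with csv.reader to show that a value containing a comma survives the round trip:

```python
import csv
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not from the article).
path = os.path.join(tempfile.mkdtemp(), 'sales.csv')

# Hypothetical table: a header row followed by data rows.
rows = [
    ['day', 'sales'],
    ['Monday', 10],
    ['Tuesday', 8],
    ['Note, with comma', 19],
]

with open(path, 'w', newline='') as csv_file:
    csv_writer = csv.writer(csv_file)
    csv_writer.writerows(rows)  # writes every row in one call

with open(path, newline='') as csv_file:
    # Values come back as strings; the quoted comma stays in one field.
    read_back = list(csv.reader(csv_file))

print(read_back)
```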
If you'd like to learn more about using CSV files in Python in more detail, you can read more here: Reading and Writing CSV Files in Python.
Saving a JSON File
JSON is another popular format for storing data, and just like with CSVs, Python has made it dead simple to write your dictionary data into JSON files:
import json

my_details = {
    'name': 'John Doe',
    'age': 29
}

with open('personal.json', 'w') as json_file:
    json.dump(my_details, json_file)
We need to import the json library and open the file. To actually write the data to the file, we just call the dump() function, giving it our data dictionary and the file object.
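As a quick check that the data was saved correctly, the following sketch writes the dictionary and then loads it back with json.load(). The indent argument is optional and only makes the file easier for people to read; the temporary file path is an assumption made for the example:

```python
import json
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not from the article).
path = os.path.join(tempfile.mkdtemp(), 'personal.json')

my_details = {
    'name': 'John Doe',
    'age': 29
}

with open(path, 'w') as json_file:
    # indent=4 pretty-prints the output across several lines.
    json.dump(my_details, json_file, indent=4)

with open(path) as json_file:
    # json.load() parses the file back into Python objects.
    loaded = json.load(json_file)

print(loaded == my_details)
```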
If you'd like to know more about using JSON files in Python, you can read more in this article: Reading and Writing JSON to a File in Python.
Conclusion
Saving files can come in handy in many kinds of programs we write. To write a file in Python, we first need to open the file and make sure we close it later.
It's best to use the with keyword so files are automatically closed when we're done writing to them.
We can use the write() method to put the contents of a string into a file or use writelines() if we have a sequence of text to put into the file.
For CSV and JSON data, we can use special functions that Python provides to write data to a file once the file is open.