File manipulation is one of the most important skills to master in any programming language, and doing it correctly is of utmost importance. Making a mistake could cause an issue in your program, other programs running on the same system, and even the system itself.
Possible errors can occur due to the parent directory not existing, or by other programs changing files in the file system at the same time, creating something that is called a race condition.
A race condition (in this case called a data race) occurs when two or more programs want to create a file of the same name in the same place. If this type of bug occurs, it is very hard to find and fix since it is nondeterministic, or put simply, different things can happen depending on the exact timing of the two racers competing for the data.
In this article, we'll see how to create a subdirectory in Python the safe way, step by step. Everything henceforth will work on Mac, Linux, and Windows.
Safely Creating a Nested Directory with pathlib
There are plenty of ways to create a subdirectory, but perhaps the simplest is using the
pathlib module. The
pathlib module is primarily made to help abstract different operating system file systems and provide a uniform interface to work with most of them.
Thanks to it, your code should be platform-independent. Note that this only works on newer versions of Python (3.5 and up).
Let's say we have an absolute path of a directory given to us as a string, and we wish to create a subdirectory with a given name. Let's create a directory called
OuterDirectory, and place
InnerDirectory inside it.
Path from the
pathlib module, create a
Path object with the desired path for our new file, and use the
mkdir() method which has the following signature:
Path.mkdir(mode=0o777, parents=False, exist_ok=False)
The following code snippet does what we described above:
from pathlib import Path # Import the module path = Path("/home/kristina/OuterDirectory/InnerDirectory") # Create Path object path.mkdir() # Cake the directory
mkdir() doesn't succeed, no directory will be made and an error will be raised.
mkdir() Options and Errors
If you run the code without creating the
OuterDirectory, you'll see the following error:
Traceback (most recent call last): File "makesubdir.py", line 3, in <module> path.mkdir() File "/home/kristina/anaconda3/lib/python3.7/pathlib.py", line 1230, in mkdir self._accessor.mkdir(self, mode) FileNotFoundError: [Errno 2] No such file or directory: '/home/kristina/OuterDirectory/InnerDirectory'
InnerDirectory already exists:
Traceback (most recent call last): File "/home/kristina/Desktop/UNM/makesubdir.py", line 3, in <module> path.mkdir() File "/home/kristina/anaconda3/lib/python3.7/pathlib.py", line 1230, in mkdir self._accessor.mkdir(self, mode) FileExistsError: [Errno 17] File exists: '/home/kristina/OuterDirectory/InnerDirectory'
If a directory already exists, the raised error will be
FileExistsError, and if the parent doesn't exist, the
FileNotFoundError will be raised.
Since we don't want our program breaking whenever it encounters an error like this, we'll place this code in a try block:
from pathlib import Path path = Path("/home/kristina/OuterDirectory/InnerDir") try: path.mkdir() except OSError: print("Failed to make nested directory") else: print("Nested directory made")
When run, if the directory is successfully made, the output will be:
Nested directory made
If we run into errors, the following will be outputted:
Failed to make a nested directory
mkdir() method takes three parameters:
modeparameter, if given, combined with
umaskindicates which users have reading, writing, and executing privileges. By default, all users have all privileges which might not be what we want if security is an issue. We'll touch more on this later.
parentsindicates, in the case when the parent directory is missing, should the method:
- Create the missing parent directory itself (
- Or to raise an error, like in our second example (
- Create the missing parent directory itself (
exist_okspecifies if the
FileExistsErrorshould be raised if a directory of the same name already exists. Note that this error will still be raised if the file of the same name is not a directory.
Assigning Access Privileges
Let's make a directory called
SecondInnerDirectory where only the owner has all reading, writing and executing privileges, inside the non-existent
from pathlib import Path path = Path("/home/kristina/SecondOuterDirectory/SecondInnerDirectory") path.mkdir(mode = 0o007, parents= True, exist_ok= True)
This should execute without error. If we navigate to the
SecondOuterDirectory and check it's contents from the console like so:
We should get the output:
total 12 drwxrwxr-x 3 kristina kristina 4096 dec 10 01:26 . drwxr-xr-x 77 kristina kristina 4096 dec 10 01:26 .. d------r-x 2 kristina kristina 4096 dec 10 01:26 SecondInnerDirectory
Okay, so we can see that the parent directory was successfully made, but privileges aren't as expected. The owner lacks writing privilege.
Free eBook: Git Essentials
Check out our hands-on, practical guide to learning Git, with best-practices, industry-accepted standards, and included cheat sheet. Stop Googling Git commands and actually learn it!
The issue we have here is that
umask isn't letting us create the desired privileges. To get around this, we'll save
umask's original value, temporarily change it, and finally, return it to its original value using the
umask() method from the OS module.
umask() returns the old value of
Let's rewrite our code to test this out:
from pathlib import Path import os old_mask = os.umask(0) # Saving the old umask value and setting umask to 0 path = Path("/home/kristina/SecondOuterDirectory/SecondInnerDirectory") path.mkdir(mode = 0o007, parents= True, exist_ok= True) os.umask(old_mask) # Reverting umask value
Executing this code, and using the
ls -al command again will result in the following output:
total 12 drwxrwxrwx 3 kristina kristina 4096 dec 10 01:45 . drwxr-xr-x 77 kristina kristina 4096 dec 10 01:45 .. d------rwx 2 kristina kristina 4096 dec 10 01:45 SecondInnerDirectory
In order to safely manipulate files on many different systems, we need a robust way to handle errors such as data races. Python offers great support for this through the
Errors can always occur when working with file systems, and the best way to deal with this is by carefully setting up systems to catch all errors that can potentially crash our program or cause other issues. Writing clean code makes for durable programs.