Python: Safely Create Nested Directory

Python: Safely Create Nested Directory

Introduction

File manipulation is one of the most important skills to master in any programming language, and doing it correctly is of utmost importance. Making a mistake could cause an issue in your program, other programs running on the same system, and even the system itself.

Possible errors can occur due to the parent directory not existing, or by other programs changing files in the file system at the same time, creating something that is called a race condition.

A race condition (in this case called a data race) occurs when two or more programs want to create a file of the same name in the same place. If this type of bug occurs, it is very hard to find and fix since it is nondeterministic, or put simply, different things can happen depending on the exact timing of the two racers competing for the data.

In this article, we'll see how to create a subdirectory in Python the safe way, step by step. Everything henceforth will work on Mac, Linux, and Windows.

Safely Creating a Nested Directory with pathlib

There are plenty of ways to create a subdirectory, but perhaps the simplest is using the pathlib module. The pathlib module is primarily made to help abstract different operating system file systems and provide a uniform interface to work with most of them.

Thanks to it, your code should be platform-independent. Note that this only works on newer versions of Python (3.5 and up).

Let's say we have an absolute path of a directory given to us as a string, and we wish to create a subdirectory with a given name. Let's create a directory called OuterDirectory, and place InnerDirectory inside it.

We'll import Path from the pathlib module, create a Path object with the desired path for our new file, and use the mkdir() method which has the following signature:

Path.mkdir(mode=0o777, parents=False, exist_ok=False)

The following code snippet does what we described above:

from pathlib import Path # Import the module
path = Path("/home/kristina/OuterDirectory/InnerDirectory") # Create Path object
path.mkdir() # Cake the directory

If mkdir() doesn't succeed, no directory will be made and an error will be raised.

mkdir() Options and Errors

If you run the code without creating the OuterDirectory, you'll see the following error:

Traceback (most recent call last):
  File "makesubdir.py", line 3, in <module>
    path.mkdir()
  File "/home/kristina/anaconda3/lib/python3.7/pathlib.py", line 1230, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/kristina/OuterDirectory/InnerDirectory'

Or if InnerDirectory already exists:

Traceback (most recent call last):
  File "/home/kristina/Desktop/UNM/makesubdir.py", line 3, in <module>
    path.mkdir()
  File "/home/kristina/anaconda3/lib/python3.7/pathlib.py", line 1230, in mkdir
    self._accessor.mkdir(self, mode)
FileExistsError: [Errno 17] File exists: '/home/kristina/OuterDirectory/InnerDirectory'

If a directory already exists, the raised error will be FileExistsError, and if the parent doesn't exist, the FileNotFoundError will be raised.

Since we don't want our program breaking whenever it encounters an error like this, we'll place this code in a try block:

from pathlib import Path 
path = Path("/home/kristina/OuterDirectory/InnerDir") 
try:
    path.mkdir() 
except OSError:
    print("Failed to make nested directory")
else:
    print("Nested directory made")

When run, if the directory is successfully made, the output will be:

Nested directory made

If we run into errors, the following will be outputted:

Failed to make a nested directory

The mkdir() method takes three parameters: mode, parents, and exit_ok.

  • The mode parameter, if given, combined with umask indicates which users have reading, writing, and executing privileges. By default, all users have all privileges which might not be what we want if security is an issue. We'll touch more on this later.
  • parents indicates, in the case when the parent directory is missing, should the method:
    1. Create the missing parent directory itself (true)
    2. Or to raise an error, like in our second example (false)
  • exist_ok specifies if the FileExistsError should be raised if a directory of the same name already exists. Note that this error will still be raised if the file of the same name is not a directory.

Assigning Access Privileges

Let's make a directory called SecondInnerDirectory where only the owner has all reading, writing and executing privileges, inside the non-existent SecondOuterDirectory:

from pathlib import Path
path = Path("/home/kristina/SecondOuterDirectory/SecondInnerDirectory")
path.mkdir(mode = 0o007, parents= True, exist_ok= True)

This should execute without error. If we navigate to the SecondOuterDirectory and check it's contents from the console like so:

$ ls -al

We should get the output:

total 12
drwxrwxr-x  3 kristina kristina 4096 dec 10 01:26 .
drwxr-xr-x 77 kristina kristina 4096 dec 10 01:26 ..
d------r-x  2 kristina kristina 4096 dec 10 01:26 SecondInnerDirectory

Okay, so we can see that the parent directory was successfully made, but privileges aren't as expected. The owner lacks writing privilege.

Free eBook: Git Essentials

Check out our hands-on, practical guide to learning Git, with best-practices, industry-accepted standards, and included cheat sheet. Stop Googling Git commands and actually learn it!

The issue we have here is that umask isn't letting us create the desired privileges. To get around this, we'll save umask's original value, temporarily change it, and finally, return it to its original value using the umask() method from the OS module. umask() returns the old value of umask.

Let's rewrite our code to test this out:

from pathlib import Path
import os 

old_mask = os.umask(0) # Saving the old umask value and setting umask to 0

path = Path("/home/kristina/SecondOuterDirectory/SecondInnerDirectory")
path.mkdir(mode = 0o007, parents= True, exist_ok= True)

os.umask(old_mask) # Reverting umask value

Executing this code, and using the ls -al command again will result in the following output:

total 12
drwxrwxrwx  3 kristina kristina 4096 dec 10 01:45 . 
drwxr-xr-x 77 kristina kristina 4096 dec 10 01:45 ..
d------rwx  2 kristina kristina 4096 dec 10 01:45 SecondInnerDirectory

Conclusion

In order to safely manipulate files on many different systems, we need a robust way to handle errors such as data races. Python offers great support for this through the pathlib module.

Errors can always occur when working with file systems, and the best way to deal with this is by carefully setting up systems to catch all errors that can potentially crash our program or cause other issues. Writing clean code makes for durable programs.

Last Updated: January 4th, 2021
Was this article helpful?

Improve your dev skills!

Get tutorials, guides, and dev jobs in your inbox.

No spam ever. Unsubscribe at any time. Read our Privacy Policy.

Kristina PopovicAuthor

CS student with a passion for juggling and math.

Want a remote job?

    Prepping for an interview?

    • Improve your skills by solving one coding problem every day
    • Get the solutions the next morning via email
    • Practice on actual problems asked by top companies, like:
     
     
     

    Better understand your data with visualizations

    With over 330+ pages, you'll learn the ins and outs of visualizing data in Python with popular libraries like Matplotlib, Seaborn, Bokeh, and more.

    © 2013-2021 Stack Abuse. All rights reserved.