Python's standard library includes modules and functions for inspecting files and directories. In this tutorial, we'll learn how to check if a file or directory is empty in Python.
Distinguish Between a File and a Directory
Before checking whether a path is empty, we need to know if it's a file or a directory, since this determines which approach to use.
Let's say we have two placeholder variables, dirpath and filepath, identifying a local directory and file:
dirpath = '/mnt/f/code.books/articles/python'
filepath = '/mnt/f/code.books/articles/python/code/file_dir.py'
Python provides the os module, a standard package of functions, objects, and constants for working with the operating system.
os.path provides us with the isfile() and isdir() functions to easily distinguish between a file and a directory:
import os

dirpath = '/mnt/f/code.books/articles/python'
filepath = '/mnt/f/code.books/articles/python/code/file_dir.py'

os.path.isfile(dirpath)   # False
os.path.isdir(dirpath)    # True
os.path.isfile(filepath)  # True
os.path.isdir(filepath)   # False
Both of these functions return a Boolean value.
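Since both functions return False for a path that doesn't exist, it can help to handle that case explicitly. The following is a minimal sketch: classify_path() is a hypothetical helper name introduced here for illustration, and the temporary directory merely stands in for the placeholder paths above.

```python
import os
import tempfile


def classify_path(path):
    """Hypothetical helper: label a path as 'file', 'directory', or 'missing or other'."""
    if os.path.isfile(path):
        return 'file'
    if os.path.isdir(path):
        return 'directory'
    # Both checks return False when the path does not exist
    # (or is something else, such as a broken symlink).
    return 'missing or other'


# Demo on a throwaway directory so the example runs anywhere
with tempfile.TemporaryDirectory() as tmpdir:
    tmpfile = os.path.join(tmpdir, 'example.txt')
    open(tmpfile, 'w').close()  # create an empty file

    print(classify_path(tmpdir))                           # directory
    print(classify_path(tmpfile))                          # file
    print(classify_path(os.path.join(tmpdir, 'missing')))  # missing or other
```

Checking the type first lets the rest of the code pick the right emptiness test, or report a missing path instead of silently treating it as empty.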
Python 3.4 introduced the pathlib module, which provides an object-oriented interface for working with filesystems.
pathlib simplifies working with filesystems as compared to os or os.path. The Path class of the pathlib module accepts a path as its argument and returns a Path object that can be easily queried or chained further with methods and attributes:
from pathlib import Path

dirpath = '/mnt/f/code.books/articles/python'
filepath = '/mnt/f/code.books/articles/python/code/file_dir.py'

Path(dirpath).is_file()   # False
Path(dirpath).is_dir()    # True
Path(filepath).is_file()  # True
Path(filepath).is_dir()   # False
Here, we call methods on the Path object itself to check whether it is a file or a directory, instead of passing a path string to the os.path functions.
Check if a File is Empty
An empty file, or zero-byte file, is any file that contains no data or content. The file can be of any type. Certain files (such as music files) may have no content data but still contain metadata (such as the author). Such files can't be considered empty.
One can create an empty file quickly on Linux and macOS:
$ touch emptyfile
Or on Windows:
$ type nul > emptyfile
Let's now define two variables, emptyfile and nonemptyfile, pointing to an empty file of zero bytes and a non-empty file of one byte:
emptyfile = '/mnt/f/code.books/articles/python/emptyfile'
nonemptyfile = '/mnt/f/code.books/articles/python/onebytefile'
Let's take a look at the type and size of these files:
$ ls -l
-rwxrwxrwx 1 root root 0 Sep 10 18:06 emptyfile
-rwxrwxrwx 1 root root 1 Sep 10 18:08 onebytefile
$ file emptyfile
emptyfile: empty
$ file onebytefile
onebytefile: very short file (no magic)
Alternatively, we can use Python's os module to check this information as well. The os.stat() function returns a stat_result object. This object is a data structure that holds a collection of the file's properties:
import os

emptyfile = '/mnt/f/code.books/articles/python/emptyfile'
nonemptyfile = '/mnt/f/code.books/articles/python/onebytefile'

result = os.stat(nonemptyfile)
result.st_size  # 1

result = os.stat(emptyfile)
result.st_size  # 0
The os.path module makes it very easy to work with file paths. Apart from checking whether a path exists or distinguishing its type, we can also retrieve the size of a file specified as a string.
os.path.getsize() returns the size of a file specified as a path-like object and is much easier to use than os.stat():
import os

emptyfile = '/mnt/f/code.books/articles/python/emptyfile'
nonemptyfile = '/mnt/f/code.books/articles/python/onebytefile'

os.path.getsize(emptyfile)     # 0
os.path.getsize(nonemptyfile)  # 1
If we are working on Python 3.4 or above, we can use the pathlib module to retrieve the size of a file. This basically replaces the os module. Path.stat() returns the stat_result for a Path object, which is equivalent to the return value of os.stat():
from pathlib import Path

emptyfile = '/mnt/f/code.books/articles/python/emptyfile'
nonemptyfile = '/mnt/f/code.books/articles/python/onebytefile'

# f-strings are used because str + stat_result (or str + int) raises a TypeError
print(f'File stats: {Path(emptyfile).stat()}')
print(f'File size: {Path(emptyfile).stat().st_size} byte(s)')

print(f'File stats: {Path(nonemptyfile).stat()}')
print(f'File size: {Path(nonemptyfile).stat().st_size} byte(s)')
This results in:
File stats: os.stat_result(st_mode=33279, st_ino=14355223812249048, st_dev=17, st_nlink=1, st_uid=0, st_gid=0, st_size=0, st_atime=1600087010, st_mtime=1600087010, st_ctime=1600087010)
File size: 0 byte(s)
File stats: os.stat_result(st_mode=33279, st_ino=5629499534218713, st_dev=17, st_nlink=1, st_uid=0, st_gid=0, st_size=1, st_atime=1600088120, st_mtime=1600088072, st_ctime=1600088072)
File size: 1 byte(s)
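The size checks above can be wrapped into a small reusable function. This is only a sketch under stated assumptions: is_file_empty() is a hypothetical name, raising ValueError for anything that isn't a regular file is one possible design choice, and the files are created in a temporary directory rather than at the paths used in this article.

```python
import tempfile
from pathlib import Path


def is_file_empty(path):
    """Hypothetical helper: True if the regular file at `path` is zero bytes long."""
    path = Path(path)
    if not path.is_file():
        # Assumption: callers prefer an error over a silent False here
        raise ValueError(f'{path} is not an existing regular file')
    return path.stat().st_size == 0


with tempfile.TemporaryDirectory() as tmpdir:
    empty = Path(tmpdir) / 'emptyfile'
    onebyte = Path(tmpdir) / 'onebytefile'
    empty.touch()              # zero-byte file
    onebyte.write_bytes(b'x')  # one-byte file

    print(is_file_empty(empty))    # True
    print(is_file_empty(onebyte))  # False
```

Rejecting directories up front matters because st_size is also reported for directories, where it says nothing about whether the directory has entries.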
Check if a Directory is Empty
A directory that contains no files or sub-directories is an empty directory. However, every directory, even an empty one, contains the following two entries:
- . (pronounced dot) references the current directory and is useful in operations like finding something inside the current directory
- .. (pronounced double dot) references the parent directory of the current directory and is required to step back from the current directory
Let's define two variables, emptydirectory and nonemptydirectory, pointing to an empty and a non-empty directory:
emptydirectory = '/mnt/f/code.books/articles/python/markdown'
nonemptydirectory = '/mnt/f/code.books/articles/python/code'
The empty directory doesn't have any items in it:
$ pwd
/mnt/f/code.books/articles/python/markdown
$ ls -la
total 0
drwxrwxrwx 1 root root 512 Sep 11 11:52 .
drwxrwxrwx 1 root root 512 Sep 10 20:22 ..
The non-empty directory has a single file:
$ pwd
/mnt/f/code.books/articles/python/code
$ ls -la
total 0
drwxrwxrwx 1 root root 512 Sep 14 11:02 .
drwxrwxrwx 1 root root 512 Sep 14 18:22 ..
-rwxrwxrwx 1 root root 425 Sep 14 12:27 file_dir.py
os.listdir() returns a list that contains the names of all the items found in the directory path passed as the argument. It does not include the . and .. entries:
import os

os.listdir(emptydirectory)     # []
os.listdir(nonemptydirectory)  # ['file_dir.py']
Checking the length of the returned list tells us whether the directory is empty. For an empty directory, the list always has a length of zero:
import os

print(len(os.listdir(nonemptydirectory)))  # 1
print(len(os.listdir(emptydirectory)))     # 0
The os.listdir() function is useful when you need all of the entry names as a list for further processing. However, to check if there's at least a single entry, we don't need a list of all the files inside.
If a directory is huge, the os.listdir() function will take a long time to run, whereas our question is answered as soon as a single entry is found.
A function that comes to our aid is os.scandir(), which returns a lazy iterator.
Like generators, lazy iterators can be looped over just like normal iterables such as a list. But unlike a list, set or dictionary, they do not store all of their values in memory and instead produce a new value on request.
This approach is roughly 200 times faster on directories of about 1,000 files.
So instead of listing the whole directory, we can use os.scandir() to check if there is at least one entry found in the directory path:
import os

emptydirectory = '/mnt/f/code.books/articles/python/markdown'
nonemptydirectory = '/mnt/f/code.books/articles/python/code'

print(next(os.scandir(emptydirectory), None))     # None
print(next(os.scandir(nonemptydirectory), None))  # <DirEntry 'file_dir.py'>
We are using next(), a built-in function, to retrieve the next available item from the lazy iterator returned by os.scandir(). Since emptydirectory has no available items, it returns None, whereas for nonemptydirectory it returns an os.DirEntry object.
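One detail worth noting: an os.scandir() iterator that is neither exhausted nor closed can trigger a ResourceWarning. Since Python 3.6 it can be used as a context manager, which closes it for us. The sketch below assumes Python 3.6 or later; is_dir_empty() is a hypothetical helper name, and a temporary directory is used in place of the article's paths.

```python
import os
import tempfile


def is_dir_empty(path):
    """Hypothetical helper: True if the directory at `path` has no entries."""
    # The with-statement closes the iterator even though we stop after one entry
    with os.scandir(path) as entries:
        return next(entries, None) is None


with tempfile.TemporaryDirectory() as tmpdir:
    print(is_dir_empty(tmpdir))  # True

    # Add a single file and check again
    open(os.path.join(tmpdir, 'file_dir.py'), 'w').close()
    print(is_dir_empty(tmpdir))  # False
```

The function still stops at the first entry, so it keeps the speed advantage over building a full list with os.listdir().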
A preferred alternative to the os module is the pathlib module. We'll use pathlib.Path.iterdir(), which is not only simpler but also much easier to use than os.listdir() or os.scandir().
It returns a lazy iterable, or generator object, much like os.scandir(), that iterates over the entries of the directory the Path object points to:
from pathlib import Path

print(Path(emptydirectory).iterdir())  # <generator object Path.iterdir at 0x7f2cf6f584a0>
Using next(), we are trying to fetch the next available item. With None as the default return item, next() won't raise a StopIteration exception in case there is no item in the collection:
print(next(Path(emptydirectory).iterdir(), None))     # None
print(next(Path(nonemptydirectory).iterdir(), None))  # /mnt/f/code.books/articles/python/code/file_dir.py
Many built-in Python functions work with iterables, including the any() function, which returns True if the iterable has at least one element that evaluates to True:
from pathlib import Path

print(any(Path(emptydirectory).iterdir()))     # False
print(any(Path(nonemptydirectory).iterdir()))  # True
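Putting the pieces together, a single function can pick the right check depending on the path type. This is a minimal sketch, not a definitive implementation: is_empty() is a hypothetical name, raising FileNotFoundError for paths that are neither a file nor a directory is an assumption about the desired behavior, and the demo runs in a temporary directory.

```python
import tempfile
from pathlib import Path


def is_empty(path):
    """Hypothetical helper: True if `path` is a zero-byte file or an empty directory."""
    p = Path(path)
    if p.is_file():
        return p.stat().st_size == 0
    if p.is_dir():
        # any() stops at the first entry, so large directories stay cheap
        return not any(p.iterdir())
    # Assumption: a missing (or special) path is treated as an error
    raise FileNotFoundError(f'{p} is neither a file nor a directory')


with tempfile.TemporaryDirectory() as tmpdir:
    root = Path(tmpdir)
    print(is_empty(root))             # True

    (root / 'emptyfile').touch()
    print(is_empty(root))             # False
    print(is_empty(root / 'emptyfile'))  # True
```

Path objects are always truthy, which is why any() works here regardless of the entries' names.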
In this tutorial, we've gone over how to distinguish between files and directories, after which we've checked for their emptiness.
This can be done via the os or pathlib modules and their convenience functions and classes.