Python: Check if File or Directory is Empty

Introduction

Python has a set of built-in library objects and functions to help us with this task. In this tutorial, we'll learn how to check if a file or directory is empty in Python.

Distinguish Between a File and a Directory

When we'd like to check if a path is empty or not, we'll want to know if it's a file or directory since this affects the approach we'll want to use.

Let's say we have two placeholder variables dirpath and filepath identifying a local directory and file:

dirpath = '/mnt/f/code.books/articles/python'
filepath = '/mnt/f/code.books/articles/python/code/file_dir.py'

Using os.path

Python provides the os module which is a standard Python package of functions, objects, and constants to work with the operating system.

os.path provides us with the isfile() and isdir() functions to easily distinguish between a file and a directory:

import os

dirpath = '/mnt/f/code.books/articles/python'
filepath = '/mnt/f/code.books/articles/python/code/file_dir.py'

os.path.isfile(dirpath) # False
os.path.isdir(dirpath) # True
os.path.isfile(filepath) # True
os.path.isdir(filepath) # False

Both of these functions return a Boolean value.

Using pathlib

Python 3.4 introduced the pathlib module, that provides an Object-oriented interface to work with the filesystems.

pathlib simplifies working with filesystems as compared to os or os.path.

The Path class of the pathlib module accepts a path as its argument and returns a Path object, that can be easily queried or chained further with methods and attributes:

from pathlib import Path

dirpath = '/mnt/f/code.books/articles/python'
filepath = '/mnt/f/code.books/articles/python/code/file_dir.py'

Path(dirpath).is_file() # False
Path(dirpath).is_dir() # True
Path(filepath).is_file() # True
Path(dirpath).is_file() # False

Here, we're checking if the Path object is a file or directory instead.

Check if a File is Empty

An empty file or a zero-byte file is any file that contains no data or content. The file can be any file type. Certain files (such as music files) may have no data but still contain metadata (such as the author). Such files can't be considered as an empty file.

One can create an empty file quickly on Linux and MacOS:

$ touch emptyfile

Or on Windows:

$ type nul > emptyfile

Let's define variables now - emptyfile and nonemptyfile pointing to an empty file having zero bytes and a non-empty file having the size of one byte:

emptyfile = '/mnt/f/code.books/articles/python/emptyfile'
nonemptyfile = '/mnt/f/code.books/articles/python/onebytefile'

Let's take a look at the type and size of these files:

$ ls -l
-rwxrwxrwx 1 root root   0 Sep 10 18:06 emptyfile
-rwxrwxrwx 1 root root   1 Sep 10 18:08 onebytefile
$ file emptyfile
emptyfile: empty
$ file onebytefile
onebytefile: very short file (no magic)

Using os.stat

Alternatively, we can use Python's os module to check this information as well. The os.stat() function returns a stat_result object. This object is basically a data structure that is collection of the file's properties:

import os

emptyfile = '/mnt/f/code.books/articles/python/emptyfile'
nonemptyfile = '/mnt/f/code.books/articles/python/onebytefile'

result = os.stat(nonemptyfile)
result.st_size # 1

result = os.stat(emptyfile)
result.st_size # 0

Using os.path

Python's os.path module makes it very easy to work with file paths. Apart from checking existence of a path or distinguishing their type we can also retrieve the size of a file specified as a string.

os.path.getsize() returns size of a file specified as a path-like-object and is much easier to use than os.stat():

import os

emptyfile = '/mnt/f/code.books/articles/python/emptyfile'
nonemptyfile = '/mnt/f/code.books/articles/python/onebytefile'

os.path.getsize(emptyfile) # 0

os.path.getsize(nonemptyfile) # 1

Using pathlib

If we are working on Python 3.4 or above we can use the pathlib module to retrieve size of a file. This basically replaces the os module. Path.stat() returns the stat_result property of a Path object which is equivalent to return value of os.stat():

from pathlib import Path

emptyfile = '/mnt/f/code.books/articles/python/emptyfile'
nonemptyfile = '/mnt/f/code.books/articles/python/onebytefile'

print('File stats: ' + Path(emptyfile).stat())

print('File size: ' + Path(emptyfile).stat().st_size + ' byte(s)')

print('File stats: ' + Path(nonemptyfile).stat())

print('File size: ' + Path(nonemptyfile).stat().st_size + ' byte(s)')

This results in:

File stats: os.stat_result(st_mode=33279, st_ino=14355223812249048, st_dev=17, st_nlink=1, st_uid=0, st_gid=0, st_size=0, st_atime=1600087010, st_mtime=1600087010, st_ctime=1600087010)
File size: 0 byte(s)

File stats: os.stat_result(st_mode=33279, st_ino=5629499534218713, st_dev=17, st_nlink=1, st_uid=0, st_gid=0, st_size=1, st_atime=1600088120, st_mtime=1600088072, st_ctime=1600088072)
File size: 1 byte(s)

Check if a Directory is Empty

A directory that contains no other files or sub-directories is an empty directory. However, every directory (even empty ones) do contain the following 2 entries:

  • . (pronounced dot) references current directory and is useful in operations like finding something inside the current directory
  • .. (pronounced double dot) references parent directory of the current directory, is required to step back from the current directory

Let's define two variables - emptydirectory and nonemptydirectory pointing to an empty and a non-empty directory:

emptydirectory = '/mnt/f/code.books/articles/python/markdown'
nonemptydirectory = '/mnt/f/code.books/articles/python/code'

The empty directory doesn't have any items in it:

$ pwd
/mnt/f/code.books/articles/python/markdown
$ ls -la
total 0
drwxrwxrwx 1 root root 512 Sep 11 11:52 .
drwxrwxrwx 1 root root 512 Sep 10 20:22 ..

The non-empty directory has a single file:

$ pwd
/mnt/f/code.books/articles/python/code
$ ls -la
total 0
drwxrwxrwx 1 root root 512 Sep 14 11:02 .
drwxrwxrwx 1 root root 512 Sep 14 18:22 ..
-rwxrwxrwx 1 root root 425 Sep 14 12:27 file_dir.py

Using os.listdir()

The os.listdir() returns a sequence that contains the name of all the items found in the directory path passed as the argument. It does not include the . and .. entries:

import os

os.listdir(emptydirectory) # []
os.listdir(nonemptydirectory) # ['file_dir.py']

Calculating the length of the returned list easily determines if the directory is empty or not. An empty directory always has a length of zero:

import os

print(len(os.listdir(nonemptydirectory))) # 1
print(len(os.listdir(emptydirectory))) # 0

Using os.scandir()

The os.listdir() function is useful when you need a whole bunch of entries name as a list for further processing. However, to check if there's at least a single entry, we don't need a list of all the files inside.

If a directory is huge, the os.listdir() function will take a long time to run, whereas, as long as there's more than 0 entries, our question is answered.

A function that comes to aid is os.scandir() which returns a lazy iterable or generator.

Generators return iterators that can be looped over like normal iterables such as a list. But unlike a list, set or dictionary, they do not store a whole bunch of values in memory and instead return a new value on request.

This approach is approximately ~200 times faster on directories of ~1000 files.

So instead of looping over the whole directory structure, we can use os.scandir() to check if there is at least one entry found in the directory path:

import os

emptydirectory = '/mnt/f/code.books/articles/python/markdown'
nonemptydirectory = '/mnt/f/code.books/articles/python/code'

print(next(os.scandir(emptydirectory), None))
print(next(os.scandir(nonemptydirectory), None)) # <DirEntry 'file_dir.py'>

We are using next() which is a built-in function to retrieve the next available item from the lazy iterator returned by os.scandir(). Since emptydirectory has no available items - it is returning None whereas for nonemptydirectory it is returning an os.DirEntry object.

Using pathlib

A preferred approach to the os module is the pathlib module. We'll use pathlib.Path.iterdir(), which is not only simpler but also much easier to use than os.listdir() or os.scandir().

It returns back a lazy iterable or generator object much like os.scandir(), that iterates over the files in directory path passed as argument:

from pathlib import Path
print(Path(emptydirectory).iterdir()) # <generator object Path.iterdir at 0x7f2cf6f584a0>

Using next(), we are trying to fetch next available item. With None as the default return item, next() won't raise a StopIteration exception in case there is no item in the collection:

print(next(Path(emptydirectory).iterdir(), None)) # None
print(next(Path(nonemptydirectory).iterdir(), None)) # /mnt/f/code.books/articles/python/code/file_dir.py

Most of the built-in Python functions work with iterables, including the any() function that returns back True if the iterable has at least one element that can be evaluated as True:

from pathlib import Path
print(any(Path(emptydirectory).iterdir()) # False
print(any(nonemptydirectory).iterdir()) # True

Conclusion

In this tutorial, we've gone over how to distinguish between files and directories, after which we've checked for their emptiness.

This can be done via the os or pathlib modules and their convenience functions and classes.

Author image
About Tapan Pandey
India Twitter