Introduction
Python has a set of built-in library objects and functions for inspecting the filesystem. In this tutorial, we'll learn how to use them to check if a file or directory is empty.
Distinguish Between a File and a Directory
When we'd like to check if a path is empty or not, we'll want to know if it's a file or directory since this affects the approach we'll want to use.
Let's say we have two placeholder variables, dirpath and filepath, identifying a local directory and file:
dirpath = '/mnt/f/code.books/articles/python'
filepath = '/mnt/f/code.books/articles/python/code/file_dir.py'
Using os.path
Python provides the os module, a standard Python package of functions, objects, and constants for working with the operating system.
os.path provides us with the isfile() and isdir() functions to easily distinguish between a file and a directory:
import os
dirpath = '/mnt/f/code.books/articles/python'
filepath = '/mnt/f/code.books/articles/python/code/file_dir.py'
os.path.isfile(dirpath) # False
os.path.isdir(dirpath) # True
os.path.isfile(filepath) # True
os.path.isdir(filepath) # False
Both of these functions return a Boolean value, and both return False if the path doesn't exist.
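As a minimal sketch of how these two checks can be combined, the snippet below wraps them in a hypothetical helper named describe_path() (the name and the returned labels are assumptions made for illustration). It uses a temporary directory so that it runs on any machine:

```python
import os
import tempfile


def describe_path(path):
    """Return 'file', 'directory', or 'missing or other' for the given path."""
    if os.path.isfile(path):
        return 'file'
    if os.path.isdir(path):
        return 'directory'
    # Both checks return False for paths that don't exist
    return 'missing or other'


with tempfile.TemporaryDirectory() as tmpdir:
    filepath = os.path.join(tmpdir, 'example.txt')
    with open(filepath, 'w') as f:
        f.write('hello')

    print(describe_path(tmpdir))                        # directory
    print(describe_path(filepath))                      # file
    print(describe_path(os.path.join(tmpdir, 'nope')))  # missing or other
```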
Using pathlib
Python 3.4 introduced the pathlib module, which provides an object-oriented interface for working with filesystems.
pathlib simplifies working with filesystems compared to os or os.path.
The Path class of the pathlib module accepts a path as its argument and returns a Path object that can be easily queried or chained further with methods and attributes:
from pathlib import Path
dirpath = '/mnt/f/code.books/articles/python'
filepath = '/mnt/f/code.books/articles/python/code/file_dir.py'
Path(dirpath).is_file() # False
Path(dirpath).is_dir() # True
Path(filepath).is_file() # True
Path(filepath).is_dir() # False
Here, we're calling methods on the Path object itself to check whether it's a file or a directory, instead of passing a string to a module-level function.
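To illustrate what chaining with methods and attributes can look like, here is a short sketch that reuses the placeholder directory from above. The path doesn't need to exist, since these operations only manipulate the path itself and never touch the disk:

```python
from pathlib import Path

# Placeholder path from the examples above; it doesn't have to exist
dirpath = Path('/mnt/f/code.books/articles/python')

# The / operator joins path segments and returns a new Path object
filepath = dirpath / 'code' / 'file_dir.py'

print(filepath.name)    # file_dir.py
print(filepath.suffix)  # .py
print(filepath.stem)    # file_dir
print(filepath.parent == dirpath / 'code')  # True
```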
Check if a File is Empty
An empty file, or zero-byte file, is any file that contains no data or content. The file can be of any type. Certain files (such as music files) may have no data but still contain metadata (such as the author). Such files can't be considered empty.
We can create an empty file quickly on Linux and macOS:
$ touch emptyfile
Or on Windows:
$ type nul > emptyfile
Let's now define two variables, emptyfile and nonemptyfile, pointing to an empty file of zero bytes and a non-empty file of one byte:
emptyfile = '/mnt/f/code.books/articles/python/emptyfile'
nonemptyfile = '/mnt/f/code.books/articles/python/onebytefile'
Let's take a look at the type and size of these files:
$ ls -l
-rwxrwxrwx 1 root root 0 Sep 10 18:06 emptyfile
-rwxrwxrwx 1 root root 1 Sep 10 18:08 onebytefile
$ file emptyfile
emptyfile: empty
$ file onebytefile
onebytefile: very short file (no magic)
Using os.stat
Alternatively, we can use Python's os module to check this information as well. The os.stat() function returns a stat_result object, a data structure that holds a collection of the file's properties:
import os
emptyfile = '/mnt/f/code.books/articles/python/emptyfile'
nonemptyfile = '/mnt/f/code.books/articles/python/onebytefile'
result = os.stat(nonemptyfile)
result.st_size # 1
result = os.stat(emptyfile)
result.st_size # 0
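The check above can be wrapped in a small function. The following is only a sketch: the helper name is_file_empty_stat() is hypothetical, and the two files are created in a temporary directory so the example doesn't depend on the paths used in this article:

```python
import os
import tempfile


def is_file_empty_stat(path):
    """Return True if the file at `path` has a size of zero bytes."""
    return os.stat(path).st_size == 0


with tempfile.TemporaryDirectory() as tmpdir:
    emptyfile = os.path.join(tmpdir, 'emptyfile')
    nonemptyfile = os.path.join(tmpdir, 'onebytefile')

    # Opening in 'wb' mode and closing immediately leaves a zero-byte file
    open(emptyfile, 'wb').close()
    with open(nonemptyfile, 'wb') as f:
        f.write(b'x')  # exactly one byte

    print(is_file_empty_stat(emptyfile))     # True
    print(is_file_empty_stat(nonemptyfile))  # False
```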
Using os.path
Python's os.path module makes it very easy to work with file paths. Apart from checking the existence of a path or distinguishing its type, we can also retrieve the size of a file.
os.path.getsize() returns the size, in bytes, of a file specified as a string or path-like object, and is easier to use than os.stat():
import os
emptyfile = '/mnt/f/code.books/articles/python/emptyfile'
nonemptyfile = '/mnt/f/code.books/articles/python/onebytefile'
os.path.getsize(emptyfile) # 0
os.path.getsize(nonemptyfile) # 1
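One thing to keep in mind is that os.path.getsize() raises an OSError if the file doesn't exist or is inaccessible. A possible way to guard against that is sketched below. The helper name is_file_empty() and the choice to return None for anything that isn't a regular file are assumptions made for this example:

```python
import os
import tempfile


def is_file_empty(path):
    """Return True/False for a regular file, or None if `path` isn't one."""
    if not os.path.isfile(path):
        # Missing paths and directories end up here
        return None
    return os.path.getsize(path) == 0


with tempfile.TemporaryDirectory() as tmpdir:
    emptyfile = os.path.join(tmpdir, 'emptyfile')
    nonemptyfile = os.path.join(tmpdir, 'onebytefile')
    open(emptyfile, 'wb').close()
    with open(nonemptyfile, 'wb') as f:
        f.write(b'x')

    print(is_file_empty(emptyfile))     # True
    print(is_file_empty(nonemptyfile))  # False
    print(is_file_empty(tmpdir))        # None
```

Since the file could be removed between the two calls, code that must be robust against that may prefer to catch OSError around getsize() instead.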
Using pathlib
If we are working on Python 3.4 or above, we can use the pathlib module to retrieve the size of a file, as an alternative to the os module. Path.stat() returns a stat_result object for the path, equivalent to the return value of os.stat():
from pathlib import Path
emptyfile = '/mnt/f/code.books/articles/python/emptyfile'
nonemptyfile = '/mnt/f/code.books/articles/python/onebytefile'
# f-strings convert the stat_result object and the integer size to text for us
print(f'File stats: {Path(emptyfile).stat()}')
print(f'File size: {Path(emptyfile).stat().st_size} byte(s)')
print(f'File stats: {Path(nonemptyfile).stat()}')
print(f'File size: {Path(nonemptyfile).stat().st_size} byte(s)')
This results in:
File stats: os.stat_result(st_mode=33279, st_ino=14355223812249048, st_dev=17, st_nlink=1, st_uid=0, st_gid=0, st_size=0, st_atime=1600087010, st_mtime=1600087010, st_ctime=1600087010)
File size: 0 byte(s)
File stats: os.stat_result(st_mode=33279, st_ino=5629499534218713, st_dev=17, st_nlink=1, st_uid=0, st_gid=0, st_size=1, st_atime=1600088120, st_mtime=1600088072, st_ctime=1600088072)
File size: 1 byte(s)
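For a version that runs anywhere, the sketch below creates both files with pathlib itself, using Path.touch() and Path.write_bytes(), inside a temporary directory. The helper name is_file_empty_pathlib() is hypothetical:

```python
import tempfile
from pathlib import Path


def is_file_empty_pathlib(path):
    """Return True if the file at `path` has a size of zero bytes."""
    return Path(path).stat().st_size == 0


with tempfile.TemporaryDirectory() as tmpdir:
    emptyfile = Path(tmpdir) / 'emptyfile'
    nonemptyfile = Path(tmpdir) / 'onebytefile'

    emptyfile.touch()               # creates a zero-byte file
    nonemptyfile.write_bytes(b'x')  # writes exactly one byte

    print(f'File size: {emptyfile.stat().st_size} byte(s)')     # File size: 0 byte(s)
    print(f'File size: {nonemptyfile.stat().st_size} byte(s)')  # File size: 1 byte(s)
    print(is_file_empty_pathlib(emptyfile))     # True
    print(is_file_empty_pathlib(nonemptyfile))  # False
```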
Check if a Directory is Empty
A directory that contains no files or sub-directories is an empty directory. However, every directory (even an empty one) contains the following two entries:
- . (pronounced dot) references the current directory and is useful in operations like finding something inside the current directory
- .. (pronounced double dot) references the parent directory of the current directory and is required to step back from the current directory
Let's define two variables, emptydirectory and nonemptydirectory, pointing to an empty and a non-empty directory:
emptydirectory = '/mnt/f/code.books/articles/python/markdown'
nonemptydirectory = '/mnt/f/code.books/articles/python/code'
The empty directory doesn't have any items in it:
$ pwd
/mnt/f/code.books/articles/python/markdown
$ ls -la
total 0
drwxrwxrwx 1 root root 512 Sep 11 11:52 .
drwxrwxrwx 1 root root 512 Sep 10 20:22 ..
The non-empty directory has a single file:
$ pwd
/mnt/f/code.books/articles/python/code
$ ls -la
total 0
drwxrwxrwx 1 root root 512 Sep 14 11:02 .
drwxrwxrwx 1 root root 512 Sep 14 18:22 ..
-rwxrwxrwx 1 root root 425 Sep 14 12:27 file_dir.py
Using os.listdir()
The os.listdir() function returns a list containing the names of all the entries found in the directory path passed as the argument. It does not include the . and .. entries:
import os
os.listdir(emptydirectory) # []
os.listdir(nonemptydirectory) # ['file_dir.py']
Checking the length of the returned list tells us whether the directory is empty. For an empty directory, the list always has a length of zero:
import os
print(len(os.listdir(nonemptydirectory))) # 1
print(len(os.listdir(emptydirectory))) # 0
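The length check can be folded into a helper function. This is a minimal sketch that assumes a hypothetical helper name, is_dir_empty_listdir(), and uses a temporary directory in place of the directories from this article:

```python
import os
import tempfile


def is_dir_empty_listdir(path):
    """Return True if the directory at `path` has no entries."""
    return len(os.listdir(path)) == 0


with tempfile.TemporaryDirectory() as tmpdir:
    print(is_dir_empty_listdir(tmpdir))  # True

    # Add a single file, making the directory non-empty
    with open(os.path.join(tmpdir, 'file_dir.py'), 'w') as f:
        f.write('# placeholder')

    print(os.listdir(tmpdir))            # ['file_dir.py']
    print(is_dir_empty_listdir(tmpdir))  # False
```

Since an empty list is falsy in Python, `not os.listdir(path)` is an equivalent and slightly shorter way to write the same check.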
Using os.scandir()
The os.listdir() function is useful when you need all of the entry names as a list for further processing. However, to check if there's at least a single entry, we don't need a list of all the files inside.
If a directory is huge, the os.listdir() function will take a long time to run, whereas as soon as we find a single entry, our question is answered.
A function that comes to our aid is os.scandir(), which returns a lazy iterator.
Lazy iterators, like generators, can be looped over like normal iterables such as a list. But unlike a list, set or dictionary, they do not store all of their values in memory and instead produce a new value on request.
This approach is roughly 200 times faster on directories of about 1000 files.
So instead of reading the whole directory listing, we can use os.scandir() to check if there is at least one entry found in the directory path:
import os
emptydirectory = '/mnt/f/code.books/articles/python/markdown'
nonemptydirectory = '/mnt/f/code.books/articles/python/code'
print(next(os.scandir(emptydirectory), None)) # None
print(next(os.scandir(nonemptydirectory), None)) # <DirEntry 'file_dir.py'>
We are using next(), a built-in function, to retrieve the next available item from the lazy iterator returned by os.scandir(). Since emptydirectory has no available items, it returns None, whereas for nonemptydirectory it returns an os.DirEntry object.
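Since Python 3.6, the iterator returned by os.scandir() can also be used as a context manager, which closes it and frees its resources as soon as we're done. The sketch below wraps the next() check in that form. The helper name is_dir_empty_scandir() is hypothetical, and a temporary directory stands in for the directories used above:

```python
import os
import tempfile


def is_dir_empty_scandir(path):
    """Return True if the directory at `path` has no entries."""
    # The with block closes the iterator once the first entry is checked
    with os.scandir(path) as entries:
        return next(entries, None) is None


with tempfile.TemporaryDirectory() as tmpdir:
    print(is_dir_empty_scandir(tmpdir))  # True

    with open(os.path.join(tmpdir, 'file_dir.py'), 'w') as f:
        f.write('# placeholder')

    print(is_dir_empty_scandir(tmpdir))  # False
```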
Using pathlib
A preferred alternative to the os module is the pathlib module. We'll use pathlib.Path.iterdir(), which offers a simpler, object-oriented interface than os.listdir() or os.scandir().
It returns a generator object that, much like os.scandir(), yields the entries in the directory path it was called on:
from pathlib import Path
print(Path(emptydirectory).iterdir()) # <generator object Path.iterdir at 0x7f2cf6f584a0>
Using next(), we try to fetch the next available item. With None as the default return value, next() won't raise a StopIteration exception if there are no items left:
print(next(Path(emptydirectory).iterdir(), None)) # None
print(next(Path(nonemptydirectory).iterdir(), None)) # /mnt/f/code.books/articles/python/code/file_dir.py
Many of Python's built-in functions work with iterables, including any(), which returns True if the iterable has at least one element that evaluates as True. Since Path objects are always truthy, any() returns True as soon as the directory yields a single entry:
from pathlib import Path
print(any(Path(emptydirectory).iterdir())) # False
print(any(Path(nonemptydirectory).iterdir())) # True
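Putting the pieces together, one possible way to handle both files and directories in a single function is sketched below. The helper name is_empty() and the decision to raise a ValueError for paths that are neither a file nor a directory are assumptions made for this example, not something pathlib provides:

```python
import tempfile
from pathlib import Path


def is_empty(path):
    """Return True if `path` is a zero-byte file or a directory with no entries."""
    path = Path(path)
    if path.is_file():
        return path.stat().st_size == 0
    if path.is_dir():
        # any() stops at the first entry the directory yields
        return not any(path.iterdir())
    raise ValueError(f'{path} is neither a file nor a directory')


with tempfile.TemporaryDirectory() as tmpdir:
    directory = Path(tmpdir)
    print(is_empty(directory))  # True

    emptyfile = directory / 'emptyfile'
    emptyfile.touch()
    print(is_empty(emptyfile))  # True
    print(is_empty(directory))  # False

    emptyfile.write_text('x')
    print(is_empty(emptyfile))  # False
```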
Conclusion
In this tutorial, we've gone over how to distinguish between files and directories, after which we've checked whether they're empty.
This can be done via the os or pathlib modules and their convenience functions and classes.