How to Write a Makefile - Automating Python Setup, Compilation, and Testing

Introduction

When you want to run a project that has multiple sources, resources, etc., you need to make sure that all of the code is recompiled before the main program is compiled or run.

For example, imagine our software looks something like this:

main_program.source -> uses the libraries `math.source` and `draw.source`
math.source -> uses the libraries `floating_point_calc.source` and `integer_calc.source`
draw.source -> uses the library `opengl.source`

So if we make a change in opengl.source for example, we need to recompile both draw.source and main_program.source because we want our project to be up-to-date on all ends.

This is a very tedious and time-consuming process. And because all good things in the software world come from some engineer being too lazy to type in a few extra commands, Makefile was born.

Makefile uses the make utility, and if we're to be completely accurate, Makefile is just a file that houses the code that the make utility uses. However, the name Makefile is much more recognizable.

A Makefile essentially keeps your project up to date by rebuilding only the parts that are out of date, that is, the targets that are older than at least one of their children. It can also automate compilation, builds and testing.

In this context, a child is a library or a chunk of code which is essential for its parent's code to run.
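
To make the "out of date" idea more concrete, here is a rough bash sketch of the kind of timestamp comparison make performs. The needs_rebuild helper and the draw.out file name are made up for illustration, and real make does considerably more (it walks the whole dependency graph), so treat this as a simplified model rather than how make is implemented:

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit status 0) when the target is missing
# or when any of the listed children is newer than the target.
needs_rebuild() {
    local target=$1
    shift
    [ -e "$target" ] || return 0
    local dep
    for dep in "$@"; do
        # -nt means "newer than" (it compares modification times)
        [ "$dep" -nt "$target" ] && return 0
    done
    return 1
}

# Demo in a throwaway directory, using explicit timestamps
workdir=$(mktemp -d)
touch -t 202001010000 "$workdir/opengl.source"
touch -t 202001020000 "$workdir/draw.out"       # built after its source
needs_rebuild "$workdir/draw.out" "$workdir/opengl.source" \
    && first_check="rebuild" || first_check="up to date"

touch -t 202001030000 "$workdir/opengl.source"  # the source changes later
needs_rebuild "$workdir/draw.out" "$workdir/opengl.source" \
    && second_check="rebuild" || second_check="up to date"

echo "before edit: $first_check, after edit: $second_check"
rm -r "$workdir"
```

The point of the sketch is only that the decision is based on file modification times, which is why touching a child is enough to trigger a rebuild of its parent.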

This concept is very useful and is commonly used with compiled programming languages. Now, you may be asking yourself:

Isn't Python an interpreted language?

Well, Python is technically both an interpreted and a compiled language: before it can execute your code, it compiles it into bytecode, which isn't tied to a specific CPU and can be cached (as .pyc files) and run later.

A more detailed, yet concise explanation can be found on Ned Batchelder's blog. Also, if you need a refresher on how Programming Language Processors work, we've got you covered.

Concept Breakdown

Because Makefile is just an amalgamation of multiple concepts, there are a few things you'll need to know in order to write a Makefile:

  1. Bash Scripting
  2. Regular Expressions
  3. Target Notation
  4. Understanding your project's file structure

With these in hand, you'll be able to write instructions for the make utility and automate your compilation.

Bash is a command language (it's also a Unix shell but that's not really relevant right now), which we will be using to write actual commands or automate file generation.

For example, if we want to echo all the library names to the user:

DIRS=project/libs
for file in "$DIRS"/*; do
    echo "$file"
done

Target notation is a way of writing which files are dependent on other files. For example, if we want to represent the dependencies from the illustrative example above in proper target notation, we'd write:

main_program.cpp: math.cpp draw.cpp
math.cpp: floating_point_calc.cpp integer_calc.cpp
draw.cpp: opengl.cpp

As far as file structure goes, it depends on your programming language and environment. Some IDEs automatically generate some sort of Makefile as well, and you won't need to write it from scratch. However, it's very useful to understand the syntax if you want to tweak it.

Sometimes modifying the default Makefile is even mandatory, like when you want to make OpenGL and CLion play nice together.

Bash Scripting

Bash is mostly used for automation on Linux distributions, and is essential to becoming an all-powerful Linux "wizard". It's also an imperative scripting language, so a script reads as a sequence of commands executed from top to bottom, which makes it easy to follow. Note that you can run bash on Windows systems, but it's not really a common use case.

First let's go over a simple "Hello World" program in Bash:

#!/bin/bash
# The line above (the "shebang") indicates that we'll be using bash for this script
# It has to be the very first line of the file; the exact syntax is: #!<path to interpreter>

# Comments in bash look like this
echo "Hello world!"

When creating a script, depending on your current umask, the script itself might not be executable. You can change this by running the following line of code in your terminal:

chmod +x name_of_script.sh

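This adds execute permission to the target file. However, if you want to set the permissions more precisely, you can use the octal notation instead, for example:

chmod 755 name_of_script.sh

With 755, the owner can read, write and execute the script, while everyone else can only read and execute it. Be careful with chmod 777, since it lets every user on the system modify the script.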

More information on chmod can be found on this link.
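
As a small illustration of the steps above, the following sketch creates a throwaway script, checks whether it is executable, adds the execute permission and then runs it directly. The file name hello.sh and the temporary directory are assumptions made for the example, and it presumes the temporary directory allows executing files:

```shell
#!/bin/bash
# Create a throwaway script in a temporary directory
workdir=$(mktemp -d)
script="$workdir/hello.sh"
printf '%s\n' '#!/bin/bash' 'echo "Hello world!"' > "$script"

# -x checks whether the current user may execute the file
[ -x "$script" ] && before="executable" || before="not executable"
chmod +x "$script"
[ -x "$script" ] && after="executable" || after="not executable"

# Now the script can be run by its path, without typing `bash` in front
output=$("$script")
echo "before: $before, after: $after, output: $output"
rm -r "$workdir"
```

Without the execute permission you could still run the script with bash name_of_script.sh; the permission only matters when you invoke the file directly.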

Next, let's quickly go over some basics utilizing simple if-statements and variables:

#!/bin/bash

echo "What's the answer to the ultimate question of life, the universe, and everything?"
read -p "Answer: " number
# We dereference variables using the $ operator
echo "Your answer: $number computing..."
# if statement
# The double parentheses (( )) are necessary whenever we want to evaluate an arithmetic expression or subexpression; imagine you have selective double vision.
# Putting `then` on its own line, as we do here, is easy to read but rarely used
if (( number == 42 ))
then
	echo "Correct!"
# Putting `; then` on the same line as the condition is the more common approach
elif (( number == 41 || number == 43 )); then
	echo "So close!"
else
	echo "Incorrect, you will have to wait 7 and a half million years for the answer!"
fi

Now, there is an alternative way of writing flow control which is very common in shell scripts. Boolean operators short-circuit, so they can be used purely for their side effects, something like:

++a && b++

This means that we first increment a, and then check whether the result is truthy (in C-like languages, any non-zero integer counts as True). Only if it is do we evaluate the right-hand side and increment b. Keep in mind that for shell commands the convention is reversed: an exit status of 0 means success (true), and any other status means failure (false).

This concept is called conditional execution and is used very commonly in bash scripting, for example:

#!/bin/bash

# Regular if notation
echo "Checking if project is generated..."
# Very important note, the whitespace between `[` and `-d` is absolutely essential
# If you remove it, bash will look for a command called `[-d` and fail with a "command not found" error
if [ -d project_dir ]
then
	echo "Dir already generated."
else
	echo "No directory found, generating..."
	mkdir project_dir
fi

This can be rewritten using a conditional execution:

echo "Checking if project is generated..."
[ -d project_dir ] || mkdir project_dir 

Or, we can take it even further with nested expressions:

echo "Checking if project is generated..."
[ -d project_dir ] || (echo "No directory found, generating..." && mkdir project_dir)

Then again, nesting expressions can lead down a rabbit hole and can become extremely convoluted and unreadable, so it's not advised to nest more than two expressions at most.
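
One detail worth knowing about the nested version: the parentheses run the grouped commands in a subshell. A common alternative is to group with curly braces, which run in the current shell. The sketch below shows that variant; the ensure_dir helper is a made-up name for the example, and it separates the two commands with a semicolon instead of &&:

```shell
#!/bin/bash
# Hypothetical helper: create the directory only when it is missing.
ensure_dir() {
    # Braces group the two commands without starting a subshell.
    # The space after { and the semicolon before } are required.
    [ -d "$1" ] || { echo "No directory found, generating..."; mkdir "$1"; }
}

workdir=$(mktemp -d)
first_run=$(ensure_dir "$workdir/project_dir")    # directory is missing
second_run=$(ensure_dir "$workdir/project_dir")   # directory now exists

echo "first run printed:  '$first_run'"
echo "second run printed: '$second_run'"
rm -r "$workdir"
```

On the second call the test succeeds, so everything after || is skipped and nothing is printed.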

You might be confused by the weird [ -d ] notation used in the code snippet above, and you're not alone.

The reasoning behind this is that conditional expressions in the shell were originally written using the test EXPRESSION command. The bracket notation came later as another name for that same command: [ is literally a command (a builtin in Bash) that behaves like test, except that it requires a closing ] as its last argument. This is also why the whitespace around the brackets matters.

Because of this, the command test -d FILENAME, which checks if the provided file exists and is a directory, can also be written as [ -d FILENAME ].
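
A quick way to convince yourself that the two spellings are interchangeable is to run both against the same path. This is only a minimal sketch; the variable names and the temporary directory are assumptions for the example:

```shell
#!/bin/bash
workdir=$(mktemp -d)

# Both forms run the same check and report the result via their exit status
if test -d "$workdir"; then with_test="yes"; else with_test="no"; fi
if [ -d "$workdir" ]; then with_bracket="yes"; else with_bracket="no"; fi

# A path that doesn't exist is not a directory
if [ -d "$workdir/missing" ]; then missing="yes"; else missing="no"; fi

echo "test: $with_test, [ ]: $with_bracket, missing path: $missing"
rmdir "$workdir"
```

In an interactive bash session you can also run type test and type [ to see how your shell resolves the two names.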

Regular Expressions

Regular expressions (regex for short) give us an easy way to generalize our code, or rather to repeat an action for a specific subset of files that meet certain criteria. We'll cover some regex basics and a few examples in the code snippet below.

Note: When we say that an expression catches ( -> ) a word, it means that the specified word is in the subset of words that the regular expression defines:

# Literal characters just signify those same characters
StackAbuse -> StackAbuse
sTACKaBUSE -> sTACKaBUSE

# The or (|) operator is used to signify that something can be either one string or the other
Stack|Abuse -> Stack
            -> Abuse
Stack(Abuse|Overflow) -> StackAbuse
                      -> StackOverflow

# The optional (?) operator is used to signify that a character or group may or may not occur
The answer to life the universe and everything is( 42)?...
    -> The answer to life the universe and everything is...
    -> The answer to life the universe and everything is 42...

# The * and + operators tell us how many times a character or group can occur
# * indicates that the preceding character or group can occur 0 or more times
# + indicates that the preceding character or group can occur 1 or more times
He is my( great)+ uncle Brian. -> He is my great uncle Brian.
                               -> He is my great great uncle Brian.
# The example above can also be written like this:
He is my great( great)* uncle Brian.

This is just the bare minimum you need for the immediate future with Makefile. In the long term, though, learning regular expressions is a really good idea.
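
If you'd like to try these patterns out, one option is grep with extended regular expressions. The sketch below is only an illustration: the matches helper is a made-up name, and the final period is written as \. because an unescaped dot in a regex matches any character, not just a literal period:

```shell
#!/bin/bash
# Hypothetical helper: prints "match" when the whole string fits the pattern.
matches() {
    # -E: extended regex syntax, -x: the whole line must match, -q: quiet
    if printf '%s\n' "$2" | grep -Eqx -- "$1"; then
        echo "match"
    else
        echo "no match"
    fi
}

alt_ok=$(matches 'Stack(Abuse|Overflow)' 'StackOverflow')
alt_bad=$(matches 'Stack(Abuse|Overflow)' 'StackExchange')
plus_ok=$(matches 'He is my( great)+ uncle Brian\.' 'He is my great great uncle Brian.')
plus_bad=$(matches 'He is my( great)+ uncle Brian\.' 'He is my uncle Brian.')
star_ok=$(matches 'He is my great( great)* uncle Brian\.' 'He is my great uncle Brian.')

echo "alternation: $alt_ok / $alt_bad"
echo "plus:        $plus_ok / $plus_bad"
echo "star:        $star_ok"
```

The fourth check fails to match because + requires at least one occurrence of the group, which is exactly the difference between + and *.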

Target Notation

After all of this, now we can finally get into the meat of the Makefile syntax. Target notation is just a way of representing all the dependencies that exist between our source files.

Let's look at an example that has the same file structure as the example from the beginning of the article:

# First of all, all pyc (compiled .py files) are dependent on their source code counterparts
# A pattern rule states this once for every file; $< is the first prerequisite (the .py file)
# compile.py stands for a script of your own that compiles the file it's given
%.pyc: %.py
	python compile.py $<

# Then we can implement our custom dependencies
# These rules have no recipe of their own, because giving one target two recipes
# makes make print a warning and use only the last one
main_program.pyc: math.pyc draw.pyc
math.pyc: floating_point_calc.py integer_calc.py
draw.pyc: opengl.py

Keep in mind that the above is just for the sake of clarifying how the target notation works. It's very rarely used in Python projects like this, because the difference in performance is in most cases negligible.

More often than not, Makefiles are used to set up a project, clean it up, maybe provide some help and test your modules. The following is an example of a much more realistic Python project Makefile:

# Signifies our desired python version
# Makefile macros (or variables) are defined a little bit differently than in traditional bash. Keep in mind that a Makefile has its own top-level syntax, while the indented recipe lines under each target are shell script syntax (run with /bin/sh by default).
PYTHON = python3

# .PHONY marks targets that are not dependent on any specific file
# This is most often used for targets that act like functions
# Note that .PHONY is a special target, so it's followed by a colon, not an equals sign
.PHONY: help setup test run clean

# Defining a list variable (make has no arrays, just whitespace-separated words)
FILES = input output

# Defines the default target that `make` will try to make, or in the case of a phony target, execute the specified commands
# This target is executed whenever we just type `make`
.DEFAULT_GOAL = help

# The @ makes sure that the command itself isn't echoed in the terminal
help:
	@echo "---------------HELP-----------------"
	@echo "To setup the project type make setup"
	@echo "To test the project type make test"
	@echo "To run the project type make run"
	@echo "------------------------------------"

# This generates the desired project file structure
# A very important thing to note is that macros (or makefile variables) are referenced in the target's code with a single dollar sign ${}, but all script variables are referenced with two dollar signs $${}
setup:
	@echo "Checking if project files are generated..."
	[ -d project_files.project ] || (echo "No directory found, generating..." && mkdir project_files.project)
	for FILE in ${FILES}; do \
		touch "project_files.project/$${FILE}.txt"; \
	done

# In make, ${VAR} and $(VAR) are interchangeable ways of referencing a variable; don't confuse them with bash's $(command), which is command substitution
# This target uses pytest to test our source files
test:
	${PYTHON} -m pytest
	
run:
	${PYTHON} our_app.py

# In this context, *.project is a shell wildcard (glob) pattern that means "anything that has the .project extension"
clean:
	rm -r *.project
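
The single versus double dollar sign rule is the part that trips people up most often, so here is a small experiment you can run from bash. It writes a throwaway Makefile with one target and runs it. The target name list and the temporary directory are made up for the example, and the sketch assumes make is installed:

```shell
#!/bin/bash
# Write a tiny throwaway Makefile and run it (assumes `make` is installed).
workdir=$(mktemp -d)

# printf turns \n into newlines and \t into the leading tab that make
# requires on recipe lines. The single quotes keep the shell from touching
# ${FILES} and $${FILE}, so they reach the Makefile unchanged.
printf 'FILES = input output\n.PHONY: list\nlist:\n\t@for FILE in ${FILES}; do echo "file: $${FILE}.txt"; done\n' \
    > "$workdir/Makefile"

# make expands ${FILES} itself and hands $${FILE} to the shell as ${FILE}
make_output=$(make -s -f "$workdir/Makefile" list)
echo "$make_output"
rm -r "$workdir"
```

The recipe should print one line per word in FILES. If you change $${FILE} to ${FILE}, make expands it as its own (undefined, therefore empty) variable before the shell ever sees the loop, and the file names disappear from the output.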

With that in mind, let's open up the terminal and run the Makefile to help us out with generating and compiling a Python project:

[Image: running make with the makefile]

Conclusion

Makefile and make can make your life much easier, and can be used with almost any technology or language.

It can automate most of your building and testing, and much more. And as can be seen from the example above, it can be used with both interpreted and compiled languages.
