Formatting Strings with the Python Template Class

Introduction

Python Templates are used to substitute data into strings. With Templates, we gain a heavily customizable interface for string substitution (or string interpolation).

Python already offers many ways to substitute strings, including f-strings (introduced in Python 3.6). Templates are used less often, but their strength lies in how far we can customize the string formatting rules.

In this article, we'll format strings with Python's Template class. We'll then have a look at how we can change the way our Templates can substitute data into strings.

To get the most out of these topics, you'll need some basic knowledge of how to work with classes and regular expressions.

Understanding the Python Template Class

The Python Template class was added to the string module in Python 2.4. This class is intended to be used as an alternative to the built-in substitution options (mainly to %) for creating complex string-based templates and for handling them in a user-friendly way.

The class's implementation uses regular expressions to match a general pattern of valid template strings. A valid template string, or placeholder, consists of two parts:

  • The $ symbol
  • A valid Python identifier. For Template, an identifier is any sequence of upper and lower case ASCII letters A to Z, underscores (_), and digits 0 to 9 that does not begin with a digit. Python keywords can't be passed to substitute() as keyword arguments, so they are best avoided as placeholder names.

In a template string, $name and $age would be considered valid placeholders.

To use the Python Template class in our code, we need to:

  1. Import Template from the string module
  2. Create a valid template string
  3. Instantiate Template using the template string as an argument
  4. Perform the substitution using a substitution method

Here's a basic example of how we can use the Python Template class in our code:

>>> from string import Template
>>> temp_str = 'Hi $name, welcome to $site'
>>> temp_obj = Template(temp_str)
>>> temp_obj.substitute(name='John Doe', site='StackAbuse.com')
'Hi John Doe, welcome to StackAbuse.com'

We notice that when we build the template string temp_str, we use two placeholders: $name and $site. The $ sign marks where a substitution takes place, and the identifiers (name and site) map the placeholders to the concrete objects that we need to insert into the template string.

The magic is completed when we use the substitute() method to perform the substitution and build the desired string. Think of substitute() as telling Python: go through this string and, if you find $name, replace it with John Doe. Keep searching through the string and, if you find the placeholder $site, replace it with StackAbuse.com.

The names of the arguments that we pass to .substitute() need to match with the identifiers that we used in the placeholders of our template string.

A notable difference between Template and the rest of the string substitution tools available in Python is that Template has no format specifiers, so the type of the argument is not taken into account. We can pass in any type of object that can be converted into a valid Python string. The Template class will automatically convert these objects into strings and then insert them into the final string.
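As a small illustration of this automatic conversion, the sketch below passes an integer, a float, and a list to substitute(). The template text and the placeholder names are made up for this example.

```python
from string import Template

# Hypothetical report line; the placeholder names are chosen for illustration
report = Template('$count items, average $avg, tags: $tags')

# None of the values are strings; Template calls str() on each one
line = report.substitute(count=3, avg=2.5, tags=['a', 'b'])
print(line)  # 3 items, average 2.5, tags: ['a', 'b']
```

Because there is no format specification, a value such as 2.5 has to be formatted beforehand if we need a specific number of decimals.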

Now that we know the basics on how to use the Python Template class, let's dive into the details of its implementation to get a better understanding of how the class works internally. With this knowledge at hand, we'll be able to effectively use the class in our code.

The Template String

The template string is a regular Python string that includes special placeholders. As we've seen before, these placeholders are created using a $ sign, along with a valid Python identifier. Once we have a valid template string, the placeholders can be replaced by our own values to create a more elaborate string.

According to PEP 292 -- Simpler String Substitutions, the following rules apply for the use of the $ sign in placeholders:

  1. $$ is an escape; it is replaced with a single $
  2. $identifier names a substitution placeholder matching a mapping key of "identifier". By default, "identifier" must spell a Python identifier as defined in http://docs.python.org/reference/lexical_analysis.html#identifiers-and-keywords. The first non-identifier character after the $ character terminates this placeholder specification.
  3. ${identifier} is equivalent to $identifier. It is required when valid identifier characters follow the placeholder but are not part of the placeholder, e.g. "${noun}ification". (Source)

Let's code some examples to better understand how these rules work.

We'll start with an example of how we can escape the $ sign. Suppose we're dealing with currencies and we need to have the dollar sign in our resulting strings. We can double the $ sign to escape itself in the template string as follows:

>>> budget = Template('The $time budget for investment is $$$amount')
>>> budget.substitute(time='monthly', amount='1,000.00')
'The monthly budget for investment is $1,000.00'

Note that there is no need to add an extra space between the escaped sign and the next placeholder, as $$$amount shows. Templates are smart enough to escape the $ sign correctly.

The second rule states the basics for building a valid placeholder in our template strings. Every placeholder needs to be built using the $ character followed by a valid Python identifier. Take a look at the following example:

>>> template = Template('$what, $who!')
>>> template.substitute(what='Hello', who='World')
'Hello, World!'

Here, both placeholders are formed using valid Python identifiers (what and who). Also notice that, as stated in the second rule, the first non-identifier character terminates the placeholder as you can see in $who! where the character ! isn't part of the placeholder, but of the final string.

There could be situations where we need to partially substitute a word in a string. That's the reason we have a second option to build a placeholder. The third rule states that ${identifier} is equivalent to $identifier and should be used when valid identifier characters follow the placeholder but are not part of the placeholder itself.

Let's suppose that we need to automate the creation of files containing commercial information about our company's products. The files are named following a pattern that includes the product code, name, and production batch, all of them separated by an underscore (_) character. Consider the following example:

>>> filename_temp = Template('$code_$product_$batch.xlsx')
>>> filename_temp.substitute(code='001', product='Apple_Juice', batch='zx.001.2020')
Traceback (most recent call last):
  ...
KeyError: 'code_'

Since _ is a valid Python identifier character, our template string doesn't work as expected and Template raises a KeyError. To correct this problem, we can use the braced notation (${identifier}) and build our placeholders as follows:

>>> filename_temp = Template('${code}_${product}_$batch.xlsx')
>>> filename_temp.substitute(code='001', product='Apple_Juice', batch='zx.001.2020')
'001_Apple_Juice_zx.001.2020.xlsx'

Now the template works correctly! That's because the braces properly separate our identifiers from the _ character. It's worth noting that we only need to use the braced notation for code and product and not for batch because the . character that follows batch isn't a valid identifier character in Python.

Finally, the template string is stored in the template attribute of the instance. Let's revisit the Hello, World! example, but this time we're going to modify template a little bit:

>>> template = Template('$what, $who!')  # Original template
>>> template.template = 'My $what, $who template'  # Modified template
>>> template.template
'My $what, $who template'
>>> template.substitute(what='Hello', who='World')
'My Hello, World template'

Since Python doesn't restrict the access to instance attributes, we can modify our template string to meet our needs whenever we want. However, this is not a common practice when using the Python Template class.

It's best to create new instances of Template for every different template string we use in our code. This way, we'll avoid some subtle and hard-to-find bugs related to the use of uncertain template strings.

The substitute() Method

So far, we've been using the substitute() method on a Template instance to perform string substitution. This method replaces the placeholders in a template string using keyword arguments or using a mapping containing identifier-value pairs.

The keyword arguments or the identifiers in the mapping must agree with the identifiers used to define the placeholders in the template string. The values can be any Python type that successfully converts to a string.

Since we've covered the use of keyword arguments in previous examples, let's now concentrate on using dictionaries. Here's an example:

>>> template = Template('Hi $name, welcome to $site')
>>> mapping = {'name': 'John Doe', 'site': 'StackAbuse.com'}
>>> template.substitute(**mapping)
'Hi John Doe, welcome to StackAbuse.com'

In this example, we pass the dictionary to substitute() with the dictionary unpacking operator: **. This operator unpacks the key-value pairs into keyword arguments that are used to substitute the matching placeholders in the template string. Unpacking is optional, though: substitute() also accepts the mapping itself as its first positional argument, as in template.substitute(mapping).
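For comparison, here is a short sketch of the other calling styles that substitute() supports: passing the mapping as a positional argument, and combining a mapping with keyword arguments. The variable names direct and mixed are only used for this illustration.

```python
from string import Template

template = Template('Hi $name, welcome to $site')
mapping = {'name': 'John Doe', 'site': 'StackAbuse.com'}

# The mapping can be passed directly as the first positional argument
direct = template.substitute(mapping)
print(direct)  # Hi John Doe, welcome to StackAbuse.com

# Keyword arguments can be combined with a mapping; on a name clash,
# the keyword argument takes precedence over the mapping entry
mixed = template.substitute(mapping, name='Jane Doe')
print(mixed)  # Hi Jane Doe, welcome to StackAbuse.com
```

Passing the mapping positionally is also the only option when a key is not a valid keyword argument name.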

Common Template Errors

There are some common errors that we can inadvertently introduce when using the Python Template class.

For example, a KeyError is raised whenever we supply an incomplete set of arguments to substitute(). Consider the following code, which uses an incomplete set of arguments:

>>> template = Template('Hi $name, welcome to $site')
>>> template.substitute(name='Jane Doe')
Traceback (most recent call last):
  ...
KeyError: 'site'

If we call substitute() with a set of arguments that doesn't match all the placeholders in our template string, then we'll get a KeyError.

If we use an invalid Python identifier in some of our placeholders, then we'll get a ValueError telling us that the placeholder is incorrect.

Take this example where we use an invalid identifier, $0name as a placeholder instead of $name.

>>> template = Template('Hi $0name, welcome to $site')
>>> template.substitute(name='Jane Doe', site='StackAbuse.com')
Traceback (most recent call last):
  ...
ValueError: Invalid placeholder in string: line 1, col 4

The Template object discovers the invalid identifier only when it reads the template string to perform the substitution. At that point, it immediately raises a ValueError. Note that 0name isn't a valid Python identifier or name because it starts with a digit.
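If we'd rather report these problems than let the exceptions propagate, we can catch both of them. The following is a minimal sketch; render() is a hypothetical helper written for this example, not part of the string module.

```python
from string import Template

def render(template_str, **values):
    """Hypothetical helper: return the rendered string or an error message."""
    try:
        return Template(template_str).substitute(**values)
    except KeyError as err:
        # A placeholder had no matching argument; err.args[0] is its name
        return f'missing value for placeholder: {err.args[0]}'
    except ValueError as err:
        # The template string itself contains an ill-formed placeholder
        return f'bad template: {err}'

missing = render('Hi $name, welcome to $site', name='Jane Doe')
invalid = render('Hi $0name, welcome to $site',
                 name='Jane Doe', site='StackAbuse.com')
print(missing)  # missing value for placeholder: site
print(invalid)
```

A KeyError points at the arguments we supplied, while a ValueError points at the template string itself, so it's usually worth keeping the two cases apart.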

The safe_substitute() Method

The Python Template class has a second method that we can use to perform string substitution. The method is called safe_substitute(). It works similarly to substitute(), but when we use an incomplete or non-matching set of arguments, the method doesn't raise a KeyError.

In this case, the missing or non-matching placeholder appears unchanged in the final string.

Here's how safe_substitute() works using an incomplete set of arguments (site will be missing):

>>> template = Template('Hi $name, welcome to $site')
>>> template.safe_substitute(name='John Doe')
'Hi John Doe, welcome to $site'

Here, we call safe_substitute() using an incomplete set of arguments. The resulting string contains the original placeholder $site, but no KeyError is raised.
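safe_substitute() is equally forgiving with ill-formed placeholders. The sketch below, which uses a made-up template string, mixes an invalid placeholder ($0name), a regular one ($site), and an escaped delimiter ($$):

```python
from string import Template

template = Template('Hi $0name, welcome to $site for $$5')

# $0name is ill-formed and is left as-is; $site is substituted; $$ becomes $
result = template.safe_substitute(site='StackAbuse.com')
print(result)  # Hi $0name, welcome to StackAbuse.com for $5
```

The flip side of this tolerance is that typos in a template can go unnoticed, so safe_substitute() is probably best reserved for cases where partial output is acceptable.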

Customizing the Python Template Class

The Python Template class is designed for subclassing and customization. This allows us to modify the regular expression patterns and other attributes of the class to meet our specific needs.

In this section, we'll cover how to customize some of the most important attributes of the class and how this impacts the general behavior of our Template objects. Let's start with the class attribute .delimiter.

Using a Different Delimiter

The class attribute delimiter holds the character used as the placeholder's starting character. As we've seen so far, its default value is $.

Since the Python Template class is designed for inheritance, we can subclass Template and change the default value of delimiter by overriding it. Take a look at the following example where we override the delimiter to use # instead of $:

from string import Template

class MyTemplate(Template):
    delimiter = '#'

template = MyTemplate('Hi #name, welcome to #site')
print(template.substitute(name='Jane Doe', site='StackAbuse.com'))

# Output:
# Hi Jane Doe, welcome to StackAbuse.com

# Escape operations also work
tag = MyTemplate('This is a Twitter hashtag: ###hashtag')
print(tag.substitute(hashtag='Python'))

# Output:
# This is a Twitter hashtag: #Python

We can use our MyTemplate class just like we use the regular Python Template class. However, we must now use # instead of $ to build our placeholders. This can be handy when we're working with strings that handle a lot of dollar signs, for example, when we're dealing with currencies.

Note: Do not use a regular expression as the delimiter. The Template class automatically escapes the delimiter, so any regular expression syntax in it is matched literally, and our custom Template will most likely not work as intended.
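To see what this escaping means in practice, here is a small sketch with a delimiter that happens to be a regular expression metacharacter. StarTemplate is a hypothetical class name chosen for this example.

```python
from string import Template

class StarTemplate(Template):
    # '*' is a regex metacharacter, but Template escapes it for us,
    # so it is matched as a literal asterisk
    delimiter = '*'

greeting = StarTemplate('Hi *name, welcome to *{site}!')
result = greeting.substitute(name='Jane Doe', site='StackAbuse.com')
print(result)  # Hi Jane Doe, welcome to StackAbuse.com!
```

The delimiter is always treated as plain text, which is why a regular expression can't be smuggled in through this attribute.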

Changing What Qualifies as an Identifier

The idpattern class attribute holds a regular expression that is used to validate the second half of a placeholder in a template string. In other words, idpattern validates that the identifiers we use in our placeholders are valid Python identifiers. The exact default value varies slightly between Python versions; in the version used for this article, it is r'(?-i:[_a-zA-Z][_a-zA-Z0-9]*)'.

We can subclass Template and use our own regular expression pattern for idpattern. Suppose that we need to restrict the identifiers to names that contain neither underscores (_) nor digits ([0-9]). To do this, we can override idpattern and remove these characters from the pattern as follows:

from string import Template
class MyTemplate(Template):
    idpattern = r'(?-i:[a-zA-Z][a-zA-Z]*)'

# Underscores are not allowed
template = MyTemplate('$name_underscore not allowed')
print(template.substitute(name_underscore='Jane Doe'))

If we run this code we will get this error:

Traceback (most recent call last):
    ...
KeyError: 'name'

We can confirm that digits are not allowed as well:

template = MyTemplate('$python3 digits not allowed')
print(template.substitute(python3='Python version 3.x'))

The error will be:

Traceback (most recent call last):
    ...
KeyError: 'python'

Since underscores and digits are not included in our custom idpattern, the Template object applies the second rule and breaks the placeholder at the first non-identifier character after $. That's why we get a KeyError in each case.
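idpattern can also be loosened rather than tightened. As a sketch, assuming we want to allow hyphens inside identifiers, we could write something like the following. HyphenTemplate and its pattern are hypothetical and only serve this example.

```python
from string import Template

class HyphenTemplate(Template):
    # Hypothetical pattern: a letter first, then letters, digits, or hyphens.
    # Template compiles its pattern with re.IGNORECASE by default, so
    # upper-case letters match as well.
    idpattern = r'[a-z][a-z0-9-]*'

template = HyphenTemplate('Hi ${first-name}, your id is $user-id.')

# Hyphenated names are not valid keyword arguments, so we pass a mapping
result = template.substitute({'first-name': 'Jane', 'user-id': 42})
print(result)  # Hi Jane, your id is 42.
```

Note that the trailing period still ends the second placeholder, because it isn't part of the custom pattern.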

Building Advanced Template Subclasses

There could be situations where we need to modify the behavior of the Python Template class, but overriding delimiter, idpattern, or both is not enough. In these cases, we can go further and override the pattern class attribute to define an entirely new regular expression for our custom Template subclasses.

If you decide to use a whole new regular expression for pattern, then you need to provide a regular expression with four named groups:

  1. escaped matches the escape sequence for the delimiter, like in $$
  2. named matches the delimiter and a valid Python identifier, like in $identifier
  3. braced matches the delimiter and a valid Python identifier using braces, like in ${identifier}
  4. invalid matches other ill-formed delimiters, like in $0site

The pattern property holds a compiled regular expression object. However, it's possible to inspect the original regular expression string by accessing the pattern attribute of the pattern property. Check out the following code:

>>> template = Template('$name')
>>> print(template.pattern.pattern)
\$(?:
    (?P<escaped>\$) |   # Escape sequence of two delimiters
    (?P<named>(?-i:[_a-zA-Z][_a-zA-Z0-9]*))      |   # delimiter and a Python identifier
    {(?P<braced>(?-i:[_a-zA-Z][_a-zA-Z0-9]*))}   |   # delimiter and a braced identifier
    (?P<invalid>)              # Other ill-formed delimiter exprs
  )

This code outputs the default string used to compile the pattern class attribute. In this case, we can clearly see the four named groups that make up the default regular expression. As stated before, if we need to deeply customize the behavior of Template, then we should provide these same four named groups along with specific regular expressions for each group.
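To make this more concrete, here is a minimal sketch of a subclass that overrides pattern to use a {{name}} syntax instead of $name. DoubleBraceTemplate and its syntax are assumptions made for this illustration; the only requirement taken from Template itself is that the four named groups are present.

```python
from string import Template

class DoubleBraceTemplate(Template):
    # Hypothetical syntax for this sketch: {{name}} instead of $name
    delimiter = '{{'
    # Template compiles this with re.VERBOSE, so whitespace and comments
    # inside the pattern are ignored
    pattern = r'''
    \{\{(?:
        (?P<escaped>\{\{)                  |  # four braces give a literal pair
        (?P<named>[_a-z][_a-z0-9]*)\}\}    |  # identifier plus closing braces
        (?P<braced>[_a-z][_a-z0-9]*)\}\}   |  # required group, same form here
        (?P<invalid>)                         # anything else after the delimiter
    )
    '''

template = DoubleBraceTemplate('Hi {{name}}, welcome to {{site}}')
result = template.substitute(name='Jane Doe', site='StackAbuse.com')
print(result)  # Hi Jane Doe, welcome to StackAbuse.com
```

In this sketch the named and braced groups share one form, since the custom syntax has no separate braced variant; the group still has to exist because substitute() looks it up.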

Running Code with eval() and exec()

Note: The built-in functions eval() and exec() can have important security implications when used with malicious input. Use with caution!

This last section is intended to open your eyes to how powerful the Python Template class can be if we use it along with some Python built-in functions like eval() and exec().

The eval() function evaluates a single Python expression and returns its result. The exec() function executes arbitrary Python code, including statements such as assignments and class definitions, and always returns None. You normally use exec() when you're only interested in the side effects of the code, like a changed variable value.

The examples that we're going to cover may seem somewhat unconventional, but we're sure that you can find some interesting use cases for this powerful combination of Python tools. They give insight into how tools that generate Python code work!

For the first example, we're going to use a Template along with eval() to dynamically create lists via a list comprehension:

>>> template = Template('[$exp for item in $coll]')
>>> eval(template.substitute(exp='item ** 2', coll='[1, 2, 3, 4]'))
[1, 4, 9, 16]
>>> eval(template.substitute(exp='2 ** item', coll='[3, 4, 5, 6, 7, 8]'))
[8, 16, 32, 64, 128, 256]
>>> import math
>>> eval(template.substitute(exp='math.sqrt(item)', coll='[9, 16, 25]'))
[3.0, 4.0, 5.0]

Our template object in this example holds the basic syntax of a list comprehension. Beginning with this template, we can dynamically create lists by substituting the placeholders with valid expressions (exp) and collections (coll). As a final step, we run the comprehension using eval().
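eval() also accepts an optional dictionary of global names. As a sketch, we can use it to make explicit which names the generated expression is allowed to see. This narrows the namespace, but it should not be mistaken for a sandbox that makes untrusted input safe.

```python
import math
from string import Template

template = Template('[$exp for item in $coll]')
code = template.substitute(exp='math.sqrt(item)', coll='[9, 16, 25]')

# Pass an explicit globals dict so the generated expression sees only
# the names we choose. This is NOT a sandbox for untrusted input.
roots = eval(code, {'math': math})
print(roots)  # [3.0, 4.0, 5.0]
```

Building the code string first and evaluating it in a separate step also makes it easy to log or inspect what is about to run.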

Since there is no limit on how complex our template strings can be, it's possible to create template strings that hold any piece of Python code. Let's consider the following example of how to use a Template object for creating an entire class:

from string import Template

_class_template = """
class ${klass}:
    def __init__(self, name):
        self.name = name

    def ${method}(self):
        print('Hi', self.name + ',', 'welcome to', '$site')
"""

template = Template(_class_template)
exec(template.substitute(klass='MyClass',
                         method='greet',
                         site='StackAbuse.com'))

obj = MyClass("John Doe")
obj.greet()

Here, we create a template string to hold a fully-functional Python class. We can later use this template for creating different classes but using different names according to our needs.

In this case, exec() creates the real class and brings it into our current namespace. From this point on, we can freely use the class as we would any regular Python class.
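If we'd rather not have exec() add names to the current namespace, we can hand it a dictionary to work in. The following is a variation on the example above, written as a sketch: the class name Greeter is made up, and the method returns its message instead of printing it so the result is easier to check.

```python
from string import Template

class_template = Template("""
class ${klass}:
    def __init__(self, name):
        self.name = name

    def ${method}(self):
        return 'Hi ' + self.name + ', welcome to $site'
""")

source = class_template.substitute(klass='Greeter', method='greet',
                                   site='StackAbuse.com')

# exec() stores the generated class in this dict instead of the
# current namespace, so we control exactly where it ends up
namespace = {}
exec(source, namespace)

Greeter = namespace['Greeter']
message = Greeter('John Doe').greet()
print(message)  # Hi John Doe, welcome to StackAbuse.com
```

Keeping generated code in its own dictionary makes it clearer which names came from the template and avoids accidentally overwriting existing ones.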

Even though these examples are fairly basic, they show how powerful the Python Template class can be and how we can take advantage of it to solve complex programming problems in Python.

Conclusion

The Python Template class is intended to be used for string substitution or string interpolation. The class works using regular expressions and provides a user-friendly and powerful interface. It's a viable alternative to the built-in string substitution options when it comes to creating complex string-based templates.

In this article, we've learned how the Python Template class works. We also learned about the more common errors that we can introduce when using Template and how to work around them. Finally, we covered how to customize the class through subclassing and how to use it to run Python code.

With this knowledge at hand, we're better equipped to use the Python Template class effectively to perform string interpolation or substitution in our code.