Python is a simple and compact scripting language. Compared to other languages, it has a relatively small number of keywords to internalize before you can write proper Python code. Python also prides itself on the simplicity and readability of its code. To achieve both goals, it helps to follow the language's specific guidelines.
This article focuses on the guidelines mentioned above to write valid code that represents a more Pythonic way of programming. It is a selection of guidelines that focuses on practical usage, and further guidelines can be read in The Hitchhiker's Guide to Python and the PEP8 Style Guide.
Tim Peters, an American Python developer, humorously summarizes the principles of the language in the Zen of Python. These rules capture the main goals and style of the language, and hopefully they help you orient yourself as a developer.
Zen of Python
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
--Tim Peters
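The interpreter ships this text as an easter egg: importing the standard library module this prints the Zen. The sketch below shows one way to get at the text programmatically. Reading this.s (the ROT13-encoded text) relies on an undocumented CPython implementation detail, so treat that part as an assumption rather than a stable API.

```python
import codecs
import this  # importing the module prints the Zen of Python (first import only)

# Assumption: this.s holds the ROT13-encoded text (undocumented CPython detail).
zen_text = codecs.decode(this.s, "rot13")
zen_lines = [line for line in zen_text.splitlines() if line]

# Show how many non-empty lines the text has, including the title line.
print(len(zen_lines))
```

In an interactive session, typing import this on its own is usually all you need.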
General Programming Guidelines
Following the Zen of Python, readability of code counts. To ensure properly formatted code, PEP8 outlines a number of programming guidelines, for example consistent indentation, a maximum line length, writing only one statement per line, and formulating code explicitly rather than implicitly. We will explain these rules below step by step.
Indentation
Indentation is required for the bodies of classes, functions (or methods), loops, and conditions, and it is the convention for the continuation lines of multi-line lists. You may use either tabs or spaces, but you must not combine both of them in the same script. In Python 3, spaces are the preferred indentation method, and PEP8 recommends four spaces per indentation level. As an example, PEP8 recommends defining a multi-line list in one of these two ways:
Writing lists
# version 1
numbers = [
    1, 2, 3,
    4, 5, 6
    ]
# version 2
numbers = [
    1, 2, 3,
    4, 5, 6
]
As pointed out in PEP8, the closing bracket can either be lined-up under the first non-whitespace character of the last line of the list, as in "version 1", or under the first character of the line that starts the list as in "version 2".
Using spaces requires us to work with the same number of spaces per indentation level. The next example shows how not to write your code: it mixes tabs with a different number of spaces from line to line.
Bad example
def draw_point(x, y):
  """draws a point at position x,y"""
  if (x > 0):
        set_point(x, y)
  return
To indent code blocks properly, the next example consistently uses four spaces per indentation level:
Good example
def draw_point(x, y):
    """draws a point at position x,y"""
    if (x > 0):
        set_point(x, y)
    return
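The rule against combining tabs and spaces is enforced by the interpreter itself. As a minimal sketch, the hypothetical snippet below compiles a source string whose block is indented once with a tab and once with eight spaces. Python 3 cannot tell whether both lines belong to the same level and reports a TabError.

```python
# Hypothetical snippet: line 2 is indented with a tab, line 3 with eight spaces.
mixed_source = "if True:\n\tx = 1\n        y = 2\n"

try:
    compile(mixed_source, "<mixed-indentation>", "exec")
    outcome = "compiled"
except TabError as error:
    # TabError is a subclass of IndentationError (and thus of SyntaxError).
    outcome = f"TabError: {error.msg}"

print(outcome)
```

Configuring your editor to insert four spaces when you press the tab key avoids this class of error.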
One Statement per Line
The example above follows another important rule: use only one statement per line. The Python language does allow you to write several statements on one line, separated by semicolons, or to put the body of a block on the same line as its header, as follows:
Bad
print ("Berlin"); print ("Cape Town")
if x == 1: print ("Amsterdam")
For better clarity, write the code like this instead:
Good
print("Berlin")
print("Cape Town")
if x == 1:
    print("Amsterdam")
The same applies to importing Python modules. Many programming examples import two or more modules on a single line as follows:
Bad practice
import sys, os
It is much better to import one module per line, instead:
Good practice
import sys
import os
Place the import statements at the beginning of the file, after the copyright information and the module docstring. Furthermore, it is common to group the import statements into standard library modules, related third-party modules, and finally local application- or library-specific imports. Separating the groups with a blank line and adding comments improves readability and makes the code easier to understand.
Importing External Modules
# use operating-system specific routines
import os
# use regular expressions routines
import re
# use SAX XML library/parser
from xml.sax import make_parser, handler
...
Line Length
A single line of code should not exceed 79 characters, and a docstring or comment line should be no longer than 72 characters. Long lines can be wrapped by using a backslash (\) as follows:
Code with a Line-Break
with open('/path/to/some/file/you/want/to/read') as file_1, \
     open('/path/to/some/file/being/written', 'w') as file_2:
    file_2.write(file_1.read())
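Besides the backslash, PEP8 generally prefers implicit line continuation inside parentheses, brackets, and braces wherever an expression can be wrapped that way. The following sketch illustrates the idea. The function total_price() and its parameters are hypothetical names chosen for this example.

```python
def total_price(net_price, tax_rate, shipping_cost, discount):
    """Hypothetical helper: the long expression is wrapped in parentheses."""
    # No backslash is needed: an open parenthesis lets the expression
    # continue on the following lines.
    return (net_price
            + net_price * tax_rate
            + shipping_cost
            - discount)


print(total_price(100, 0.5, 10, 5))
```

Breaking before the binary operators keeps each operand aligned with its operator, which PEP8 suggests for new code.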
Explicit vs Implicit Code
As a scripting language, Python is flexible enough to allow "tricks" throughout your code. Keep in mind, though, that your code is often read by other developers. To improve readability, it is better to write explicit code instead of relying on implicit assumptions, clever one-liners, or "tricks".
In the example below, the function calculation() hides the two values x and y in a single parameter named args. The signature suggests that callers may pass more or fewer than two values, but the unpacking in the function body raises a ValueError unless exactly two are passed, which is not obvious at first sight.
Bad
def calculation(*args):
    """calculation of the total"""
    x, y = args
    return (x+y)
print(calculation(3, 4))
For higher clarity it is recommended to write it like this, instead:
Good
def calculation(x, y):
    """calculation of the total"""
    total = x + y
    return total
print(calculation(3, 4))
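The difference between the two versions becomes visible when a caller passes the wrong number of arguments. The sketch below uses renamed copies of both functions plus a hypothetical helper, describe_failure(), to show which exception each version raises.

```python
def calculation_implicit(*args):
    """Implicit version: the expected argument count is hidden in the body."""
    x, y = args
    return x + y


def calculation_explicit(x, y):
    """Explicit version: the signature documents the two required values."""
    return x + y


def describe_failure(function, *values):
    """Call function with values and report the exception type it raises."""
    try:
        function(*values)
    except (TypeError, ValueError) as error:
        return type(error).__name__
    return "no error"


# Three values instead of two: the implicit version fails while unpacking,
# the explicit version is rejected at the call itself.
print(describe_failure(calculation_implicit, 1, 2, 3))  # ValueError
print(describe_failure(calculation_explicit, 1, 2, 3))  # TypeError
```

With the explicit signature, the error message names the function and the number of expected arguments, and editors and help() can display the parameters without reading the body.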
Naming Conventions
There exist quite a few variations to name modules, classes, methods/functions, and variables. This includes the usage of lowercase and uppercase letters with or without underscores, capitalized words, and mixed styles. Due to the huge diversity of developers you will find all these styles, and there is little consistency among the modules.
Naming Style Variations
shoppingcart = [] # lowercase
shopping_cart = [] # lowercase with underscores
SHOPPINGCART = [] # uppercase
SHOPPING_CART = [] # uppercase with underscores
ShoppingCart = [] # capitalized words
shoppingCart = [] # mixed style
Which of the styles you use is up to you. Again, be consistent, and use the same style in your entire code. According to PEP8, the following main rules apply:
- Names of identifiers should be ASCII compatible (PEP8 requires this for the standard library)
- Modules should have short, all-lowercase names
- Classes follow the capitalized words convention
- Exceptions follow the capitalized words convention, and are expected to have the Error suffix if they refer to errors
- Constants are written in uppercase letters
For more details have a look at the PEP8 standard.
We should also point out that it is considered more "Pythonic" to use the "lowercase with underscores" approach when naming variables in Python, although any approach is allowed.
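Put together, the conventions might look like the following sketch. All names (MAX_ITEMS, CartFullError, ShoppingCart, add_item) are hypothetical and only serve to illustrate the naming styles.

```python
# Constant: uppercase letters with underscores.
MAX_ITEMS = 3


class CartFullError(Exception):
    """Exception: capitalized words with the Error suffix."""


class ShoppingCart:
    """Class: capitalized words."""

    def __init__(self):
        self.items = []

    # Methods, parameters, and variables: lowercase with underscores.
    def add_item(self, item_name):
        if len(self.items) >= MAX_ITEMS:
            raise CartFullError(f"cart already holds {MAX_ITEMS} items")
        self.items.append(item_name)


shopping_cart = ShoppingCart()
for item_name in ("apples", "bread", "cheese"):
    shopping_cart.add_item(item_name)
print(shopping_cart.items)
```

Because each kind of name has its own style, a reader can tell from the spelling alone whether a name refers to a constant, a class, or a variable.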
Code Style Validation
Guidelines are great to achieve code that follows certain conditions. As a programmer you want to make sure that you follow them as much as possible. Automated tools are great to help you to validate your code.
As mentioned above, the guidelines are described in PEP8. Consequently, there is a corresponding command line tool that helps you check your code against the guidelines. Originally known as pep8, this code checker was renamed to pycodestyle in 2016. It is maintained by the Python Code Quality Authority, and belongs to a family of tools that includes the source code analyzers pylint and pyflakes, the complexity checker mccabe, and the docstring checker pydocstyle.
pycodestyle analyzes your Python code and reports violations such as indentation errors, unnecessary blank lines, and the use of tabs instead of spaces. The following example contains a sample output with some typical errors and warnings:
$ pycodestyle --first stack.py
stack.py:3:1: E265 block comment should start with '# '
stack.py:12:1: E302 expected 2 blank lines, found 1
stack.py:13:1: W191 indentation contains tabs
In Debian GNU/Linux, the tool is available as the packages python-pycodestyle (for Python 2.x), and python3-pycodestyle (for Python 3.x). Both of them come with a number of useful parameters, for example:
--first: Show first occurrence of each error (as seen above). The output shows the file the error was detected in as well as the line number and the column.
--show-source: Show source code for each error
$ pycodestyle --show-source stack.py
stack.py:3:1: E265 block comment should start with '# '
#o
^
stack.py:12:1: E302 expected 2 blank lines, found 1
class Stack:
^
stack.py:13:1: W191 indentation contains tabs
def __init__(self):
^
...
--statistics: Count errors and warnings. In the following example, pycodestyle detected two errors - E265 and E302 - as well as 30 warnings (W191).
$ pycodestyle --statistics stack.py
...
1 E265 block comment should start with '# '
1 E302 expected 2 blank lines, found 1
30 W191 indentation contains tabs
The same tool is also available online. Just copy and paste your code into the tool, and see the validation result.
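To make the three reported codes more concrete, here is a sketch of what a cleaned-up file could look like. The contents of stack.py are not shown in this article, so the class below is a hypothetical stand-in. It is laid out so that it should not trigger E265, E302, or W191.

```python
"""Hypothetical stack.py, laid out to avoid E265, E302, and W191."""

# block comments start with "# " (avoids E265)


class Stack:  # two blank lines precede the class (avoids E302)
    """A minimal last-in, first-out container."""

    def __init__(self):  # four spaces per level, no tabs (avoids W191)
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()

    def is_empty(self):
        return not self.items


stack = Stack()
stack.push("Berlin")
stack.push("Cape Town")
print(stack.pop())
```

Running pycodestyle on a file like this is the only reliable way to confirm that it is clean, since the tool checks many more rules than the three discussed here.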
Conclusion
Writing proper Python code is not always easy. But luckily there are guidelines that help, as well as command line tools to ensure that your code meets these guidelines. With the various resources available it can be very easy :)
Acknowledgements
The author would like to thank Zoleka Hatitongwe for her support while preparing the article.