Python's os and subprocess Popen Commands

Introduction

Python offers several options to run external processes and interact with the operating system. However, the methods differ between Python 2 and 3. Python 2 has several methods in the os module that are now deprecated and have been replaced by the subprocess module, which is the preferred option in Python 3.

Throughout this article we'll talk about the various os and subprocess methods, how to use them, how they're different from each other, on what version of Python they should be used, and even how to convert the older commands to the newer ones.

Hopefully by the end of this article you'll have a better understanding of how to call external commands from Python code and which method you should use to do it.

First up are the older os.popen* methods.

The os.popen* Methods

The os module offers four different methods that allow us to interact with the operating system (just like you would with the command line) and create a pipe to other commands. These methods are popen, popen2, popen3, and popen4, all of which are described in the following sections.

The goal of each of these methods is to be able to call other programs from your Python code. This could be calling another executable, like your own compiled C++ program, or a shell command like ls or mkdir.

os.popen

The os.popen method opens a pipe to or from a command. Depending on the mode, the pipe lets our program read the command's output or write to its input. The return value is an open file object connected to that pipe.

The syntax is as follows:

os.popen(command[, mode[, bufsize]])  

Here the command parameter is what you'll be executing, and its output will be available via an open file. The argument mode defines whether this file is readable ('r') or writable ('w'). In Python 2, appending a 'b' to the mode will open the file in binary mode. Thus, for example, "rb" will produce a readable binary file object. Note that the Python 3 version of os.popen only accepts 'r' or 'w'.

In order to retrieve the exit code of the command executed, you must use the close() method of the file object.
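As a minimal sketch of reading the exit status this way (written for Python 3, and assuming a child Python interpreter started via sys.executable as a stand-in for a real command), we can read the output first and then inspect what close() returns:

```python
import os
import sys

# Assumption: a child Python interpreter stands in for an arbitrary command.
# It prints one line and then exits with status 3.
command = '"{}" -c "print(\'hello\'); raise SystemExit(3)"'.format(sys.executable)

pipe = os.popen(command)
output = pipe.read()    # everything the command wrote to its stdout
status = pipe.close()   # None if the command succeeded, otherwise its exit status

print(output.strip())
print(status)
```

On Unix the value returned by close() uses the same encoding as os.wait(), so it is not the plain exit code. On Python 3.9 and later, os.waitstatus_to_exitcode(status) should decode it.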

The bufsize parameter tells popen how much data to buffer, and can assume one of the following values:

  • 0 = unbuffered
  • 1 = line buffered
  • N = approximate buffer size, when N > 1
  • negative N = system default buffering (this is also what is used when bufsize is omitted)

This method is available for Unix and Windows platforms, and has been deprecated since Python version 2.6. If you're currently using this method and want to switch to the subprocess module, here are the equivalent calls:

Method:
    pipe = os.popen('cmd', 'r', bufsize)
Replaced by:
    pipe = Popen('cmd', shell=True, bufsize=bufsize, stdout=PIPE).stdout

Method:
    pipe = os.popen('cmd', 'w', bufsize)
Replaced by:
    pipe = Popen('cmd', shell=True, bufsize=bufsize, stdin=PIPE).stdin

The code below shows an example of how to use the os.popen method:

import os

p = os.popen('ls -la')  
print(p.read())  

The code above will ask the operating system to list all files in the current directory. The output of our method, which is stored in p, is an open file, which is read and printed in the last line of the code. The result of this code (in the context of my current directory) is as follows:

$ python popen_test.py 
total 32  
drwxr-xr-x   7 scott  staff  238 Nov  9 09:13 .  
drwxr-xr-x  29 scott  staff  986 Nov  9 09:08 ..  
-rw-r--r--   1 scott  staff   52 Nov  9 09:13 popen2_test.py
-rw-r--r--   1 scott  staff   55 Nov  9 09:14 popen3_test.py
-rw-r--r--   1 scott  staff   53 Nov  9 09:14 popen4_test.py
-rw-r--r--   1 scott  staff   49 Nov  9 09:13 popen_test.py
-rw-r--r--   1 scott  staff    0 Nov  9 09:13 subprocess_popen_test.py
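For comparison, here is a rough sketch of the subprocess replacement listed earlier for the 'r' mode. It assumes a simple echo command instead of ls -la so that the output is predictable, and it uses universal_newlines=True to get text rather than bytes in Python 3:

```python
from subprocess import Popen, PIPE

# Equivalent of os.popen('echo hello', 'r'): run the command through the
# shell and expose its stdout as a readable file object.
proc = Popen('echo hello', shell=True, stdout=PIPE, universal_newlines=True)
pipe = proc.stdout

text = pipe.read()
pipe.close()
exit_code = proc.wait()  # plain exit code of the command

print(text.strip())
print(exit_code)
```

Unlike os.popen, the Popen object stays available, so the exit code can be read directly from wait() instead of from close().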

os.popen2

This method is very similar to the previous one. The main difference is what the method outputs. In this case it returns two file objects, one for the stdin and another file for the stdout.

The syntax is as follows:

os.popen2(cmd[, mode[, bufsize]])

These arguments have the same meaning as in the previous method, os.popen.

The popen2 method is available for both the Unix and Windows platforms. However, it is found only in Python 2. Again, if you want to use the subprocess version (shown in more detail below), use the following:

Method Replaced by
Method:
    (child_stdin, child_stdout) = os.popen2('cmd', mode, bufsize)
Replaced by:
    p = Popen('cmd', shell=True, bufsize=bufsize,
              stdin=PIPE, stdout=PIPE, close_fds=True)
    (child_stdin, child_stdout) = (p.stdin, p.stdout)

The code below shows an example on how to use this method:

import os

# "in" is a reserved word in Python, so we use descriptive names instead
child_stdin, child_stdout = os.popen2('ls -la')
print(child_stdout.read())

This code will produce the same results as shown in the first code output above. The difference here is that popen2 returns two file objects, so the second line of code unpacks them into two variables: child_stdin and child_stdout. In the last line, we read the output file child_stdout and print it to the console.
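Since popen2 no longer exists in Python 3, the two-pipe pattern can be sketched with subprocess instead. The example below is only an illustration: the child is a small Python one-liner (an assumption made for portability) that upper-cases whatever it receives on stdin.

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child program: echo stdin back in upper case.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"

p = Popen([sys.executable, "-c", child_code],
          stdin=PIPE, stdout=PIPE, universal_newlines=True)
child_stdin, child_stdout = p.stdin, p.stdout

child_stdin.write("hello from the parent\n")
child_stdin.close()            # closing stdin signals end of input to the child

result = child_stdout.read()   # read until the child closes its stdout
child_stdout.close()
p.wait()

print(result)
```

Writing and reading the pipes by hand like this is fine for small amounts of data. For larger payloads the communicate method is the safer choice, because it avoids deadlocks when a pipe buffer fills up.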

os.popen3

This method is very similar to the previous ones. However, the difference is that the output of the command is a set of three files: stdin, stdout, and stderr.

The syntax is:

os.popen3(cmd[, mode[, bufsize]])  

where the arguments cmd, mode, and bufsize have the same specifications as in the previous methods. The method is available for Unix and Windows platforms.

Note that this method has been deprecated and the Python documentation advises us to replace the popen3 method as follows:

Method:
    (child_stdin,
     child_stdout,
     child_stderr) = os.popen3('cmd', mode, bufsize)
Replaced by:
    p = Popen('cmd', shell=True, bufsize=bufsize,
              stdin=PIPE, stdout=PIPE, stderr=PIPE, close_fds=True)
    (child_stdin,
     child_stdout,
     child_stderr) = (p.stdin, p.stdout, p.stderr)

As in the previous examples, the code below will produce the same result as seen in our first example.

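import os

# "in" is a reserved word in Python, so we use descriptive names instead
child_stdin, child_stdout, child_stderr = os.popen3('ls -la')
print(child_stdout.read())

However, in this case, the method returns three files: stdin, stdout, and stderr. The list of files from our ls -la command is read from the child_stdout file.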
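To see the separate stdout and stderr streams in action on Python 3, a sketch along the following lines may help. The child here is an assumed Python one-liner that writes one line to each stream:

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child program: one line on stdout, one line on stderr.
child_code = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"

p = Popen([sys.executable, "-c", child_code],
          stdin=PIPE, stdout=PIPE, stderr=PIPE, universal_newlines=True)

# communicate() closes stdin, reads both pipes to the end and waits for exit.
out, err = p.communicate()

print("stdout:", out.strip())
print("stderr:", err.strip())
```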

os.popen4

As you probably guessed, the os.popen4 method is similar to the previous methods. However, in this case, it returns only two files, one for the stdin, and another one for the stdout and the stderr.

This method is available for the Unix and Windows platforms and (surprise!) has also been deprecated since version 2.6. To replace it with the corresponding subprocess Popen call, do the following:

Method Replaced by
Method:
    (child_stdin, child_stdout_and_stderr) = os.popen4('cmd', mode, bufsize)
Replaced by:
    p = Popen('cmd', shell=True, bufsize=bufsize,
              stdin=PIPE, stdout=PIPE, stderr=STDOUT, close_fds=True)
    (child_stdin, child_stdout_and_stderr) = (p.stdin, p.stdout)

The following code will produce the same result as in the previous examples, which is shown in the first code output above.

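import os

# "in" is a reserved word in Python, so we use descriptive names instead
child_stdin, child_stdout_and_stderr = os.popen4('ls -la')
print(child_stdout_and_stderr.read())

As we can see from the code above, the method looks very similar to popen2. However, the child_stdout_and_stderr file in the program will show the combined results of both the stdout and the stderr streams.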
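The merged stream can be reproduced in Python 3 with stderr=STDOUT. The following is a sketch under the same assumption as before, namely a small Python one-liner as the child process:

```python
import sys
from subprocess import Popen, PIPE, STDOUT

# Hypothetical child program: one line on stdout, one line on stderr.
child_code = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"

# stderr=STDOUT redirects the child's stderr into the same pipe as stdout.
p = Popen([sys.executable, "-c", child_code],
          stdin=PIPE, stdout=PIPE, stderr=STDOUT, universal_newlines=True)

combined, err = p.communicate()   # err is None, everything arrives in combined

print(combined)
```

Because the two streams are buffered differently inside the child, the order of the lines in the combined output is not guaranteed.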

Summary of differences

The differences between the different popen* commands all have to do with their output, which is summarized in the table below:

Method   File objects returned
popen    stdout (or stdin, in 'w' mode)
popen2   stdin, stdout
popen3   stdin, stdout, stderr
popen4   stdin, combined stdout and stderr

In addition, popen2, popen3, and popen4 are only available in Python 2, not in Python 3. Python 3 still provides the popen method, but it is recommended to use the subprocess module instead, which we'll describe in more detail in the following section.

The subprocess.Popen Method

The subprocess module was created with the intention of replacing several older functions, such as os.system, os.spawn*, and os.popen*, with a single, more consistent interface. Within this module, we find the Popen class.

The Python documentation recommends the use of Popen in advanced cases, when other methods such as subprocess.call cannot fulfill our needs. Popen allows for the execution of a program as a child process. Because this is executed by the operating system as a separate process, the results are platform dependent.
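For the simple case, subprocess.call runs a command, waits for it to finish, and returns its exit code. Here is a minimal sketch, assuming a child Python interpreter that deliberately exits with status 2:

```python
import subprocess
import sys

# Hypothetical command: a child interpreter that exits with status 2.
return_code = subprocess.call([sys.executable, "-c", "raise SystemExit(2)"])

print(return_code)
```

When we need more than the exit code, such as access to the pipes while the process is still running, that is where Popen comes in.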

The available parameters are as follows:

subprocess.Popen(args, bufsize=0, executable=None, stdin=None, stdout=None, stderr=None, preexec_fn=None, close_fds=False, shell=False, cwd=None, env=None, universal_newlines=False, startupinfo=None, creationflags=0)  

One main difference of Popen is that it is a class and not just a method. Thus, when we call subprocess.Popen, we're actually calling the constructor of the class Popen.

There are quite a few arguments in the constructor. The most important to understand is args, which contains the command for the process we want to run. It can be specified as a sequence of parameters (via a list) or as a single command string.

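The second argument that is important to understand is shell, which defaults to False. On Unix, when the command is passed as a single string that includes arguments, like ls -la, or relies on shell features such as pipes, wildcards, or built-in commands, we need to set shell=True. With shell=False, the command and its arguments should be passed as a sequence instead, for example ['ls', '-la'].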

For example, the following code will call the Unix command ls -la via a shell.

import subprocess  
subprocess.Popen('ls -la', shell=True)  

The results can be seen in the output below:

$ python subprocess_popen_test.py 
total 40  
drwxr-xr-x   7 scott  staff  238 Nov  9 09:13 .  
drwxr-xr-x  29 scott  staff  986 Nov  9 09:08 ..  
-rw-r--r--   1 scott  staff   52 Nov  9 09:13 popen2_test.py
-rw-r--r--   1 scott  staff   55 Nov  9 09:14 popen3_test.py
-rw-r--r--   1 scott  staff   53 Nov  9 09:14 popen4_test.py
-rw-r--r--   1 scott  staff   49 Nov  9 09:13 popen_test.py
-rw-r--r--   1 scott  staff   56 Nov  9 09:16 subprocess_popen_test.py
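The two ways of passing args can be sketched side by side. In this example, shlex.split shows how a command string maps to the list form, while the commands that are actually run (a child Python interpreter and echo) are assumptions chosen for portability:

```python
import shlex
import subprocess
import sys

# shlex.split breaks a command string into the list form used with shell=False.
args = shlex.split('ls -la')
print(args)  # ['ls', '-la']

# List form, no shell involved: the first item is the program to run.
no_shell = subprocess.call([sys.executable, '-c', 'print("no shell")'])

# String form, interpreted by the shell because of shell=True.
with_shell = subprocess.call('echo via shell', shell=True)
```

As a rule of thumb, the list form is the safer choice whenever part of the command comes from user input, since no shell ever gets to interpret it.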

Using the following example from a Windows machine, we can see the effect of the shell parameter more easily. Here we're opening Microsoft Excel either from the shell or as an executable program. From the shell, it is just as if we were opening Excel from a command window.

The following code will open Excel from the shell (note that we have to specify shell=True):

import subprocess  
subprocess.Popen("start excel", shell=True)  

However, we can get the same results by calling the Excel executable. In this case we are not using the shell, so we leave it with its default value (False); but we have to specify the full path to the executable.

import subprocess
# Raw string, so the backslashes in the path are not treated as escape sequences
subprocess.Popen(r"C:\Program Files (x86)\Microsoft Office\Office15\excel.exe")

In addition, when we instantiate the Popen class, we have access to several useful methods:

Method                 Description
Popen.poll()           Checks whether the child process has terminated.
Popen.wait()           Waits for the child process to terminate.
Popen.communicate()    Sends input to the process and reads its output.
Popen.send_signal()    Sends a signal to the child process.
Popen.terminate()      Asks the child process to stop.
Popen.kill()           Kills the child process.
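A short sketch of poll, terminate, and wait working together is shown below. It assumes a child Python interpreter that simply sleeps, standing in for any long-running program:

```python
import subprocess
import sys

# Hypothetical long-running child: sleeps for 30 seconds.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])

# poll() returns None while the child is still running.
still_running = proc.poll() is None

proc.terminate()                       # ask the child to stop
return_code = proc.wait(timeout=10)    # block until it has actually exited

print(still_running)
print(return_code)  # non-zero; on Unix, a negative signal number
```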

The full list can be found at the subprocess documentation. The most commonly used method here is communicate.

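The communicate method allows us to send data to the standard input of the process and to read data from its standard output and standard error until the end of each stream is reached. It then waits for the process to terminate and returns a tuple defined as (stdoutdata, stderrdata). A stream is only captured if it was opened with PIPE; otherwise the corresponding item is None.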
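As a minimal illustration of communicate, the sketch below sends a few words to a child process and reads back the result. The child is an assumed Python one-liner that sorts the words it receives:

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child program: print the sorted list of words read from stdin.
child_code = "import sys; print(sorted(sys.stdin.read().split()))"

proc = Popen([sys.executable, "-c", child_code],
             stdin=PIPE, stdout=PIPE, stderr=PIPE, universal_newlines=True)

# Send the input, close stdin, collect both output streams, wait for exit.
out, err = proc.communicate(input="pear apple fig")

print(out)  # ['apple', 'fig', 'pear']
```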

For example, the following code will combine the Windows dir and sort commands.

import subprocess

p1 = subprocess.Popen('dir', shell=True, stdin=None, stdout=subprocess.PIPE, stderr=subprocess.PIPE)  
p2 = subprocess.Popen('sort /R', shell=True, stdin=p1.stdout)

p1.stdout.close()  
out, err = p2.communicate()  

In order to combine both commands, we create two subprocesses, one for the dir command and another for the sort command. Since we want to sort in reverse order, we add the /R option to the sort call.

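We define the stdout of process 1 as PIPE, which allows us to use the output of process 1 as the input for process 2. Then we close our own copy of the stdout of process 1 in the parent, so that process 1 is notified (with a broken pipe) if process 2 exits first. Finally, the communicate method waits for process 2 to finish. Since process 2 was started without stdout=PIPE, its output goes straight to the console, and both out and err are None.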

Running this from a Windows command shell produces the following:

> python subprocess_pipe_test.py
11/09/2017  08:52 PM                  234 subprocess_pipe_test.py  
11/09/2017  07:13 PM                   99 subprocess_pipe_test2.py  
11/09/2017  07:08 PM                   66 subprocess_pipe_test3.py  
11/09/2017  07:01 PM                   56 subprocess_pipe_test4.py  
11/09/2017  06:48 PM     <DIR>            ..  
11/09/2017  06:48 PM     <DIR>            .  
 Volume Serial Number is 2E4E-56A3
 Volume in drive D is ECA
 Directory of D:\MyPopen
               4 File(s)            455 bytes
               2 Dir(s)  18,634,326,016 bytes free
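The same pipeline idea can be sketched in a platform-independent way. The version below is only an illustration: it replaces dir and sort with two assumed Python one-liners and captures the final output with stdout=PIPE instead of letting it go to the console.

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical producer: prints three lines in unsorted order.
producer = "print('b'); print('a'); print('c')"
# Hypothetical consumer: sorts the lines from stdin in reverse order.
consumer = "import sys; sys.stdout.writelines(sorted(sys.stdin, reverse=True))"

p1 = Popen([sys.executable, "-c", producer], stdout=PIPE)
p2 = Popen([sys.executable, "-c", consumer],
           stdin=p1.stdout, stdout=PIPE, universal_newlines=True)

p1.stdout.close()          # the parent no longer needs its copy of the pipe
out, _ = p2.communicate()  # read the sorted output and wait for p2
p1.wait()

print(out)
```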

Wrapping up

The os methods presented a good option in the past. At present, however, the subprocess module has several methods which are more powerful and efficient to use. Among the tools available is the Popen class, which can be used in more complex cases. This class also contains the communicate method, which helps us pipe together different commands for more complex functionality.

What do you use the popen* methods for, and which do you prefer? Let us know in the comments!