Python Subprocess Module - Running external programs
Python’s subprocess module provides several methods of running external programs. It’s easy to use, but robust usage with proper error checking requires a few more details.
Content
- method summary
- shell parameter summary
- Basic Usage - primary function calls
- Error Checking
- Advanced Usage
- things NOT to do
methods summary
Method | Returns | On errors | Recommended Use |
---|---|---|---|
call | exit code | returns non-zero exit code | programs which are expected to fail (or return non-zero code) |
check_call | exit code (always zero) | raises CalledProcessError | programs which should not fail (failure is the exception, if you’ll excuse the pun) |
check_output | program's output (stdout) | raises CalledProcessError | programs which should not fail |
Popen | Popen Object | requires additional error handling code | Fine-control of program’s execution |
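As a rough side-by-side sketch of the four entry points in the table: the snippet below uses the Python interpreter itself (sys.executable) as the external program, so it does not depend on any particular tool being installed. The two child commands and all variable names are illustrative assumptions, not part of the original examples.

```python
import sys
from subprocess import call, check_call, check_output, Popen, PIPE, CalledProcessError

# Two tiny child programs: one succeeds and prints, one exits with code 3.
OK_CMD = [sys.executable, "-c", "print('hello')"]
FAIL_CMD = [sys.executable, "-c", "import sys; sys.exit(3)"]

# call: returns the exit code, whatever it is.
rc_call = call(FAIL_CMD)

# check_call: returns 0 on success, raises CalledProcessError otherwise.
try:
    check_call(FAIL_CMD)
    rc_check = 0
except CalledProcessError as e:
    rc_check = e.returncode

# check_output: returns the child's STDOUT (bytes in Python 3).
output = check_output(OK_CMD)

# Popen: nothing is checked automatically; inspect returncode yourself.
p = Popen(FAIL_CMD, stdout=PIPE, stderr=PIPE)
out, err = p.communicate()
rc_popen = p.returncode
```

All three failing runs end up with the same exit code (3); the difference is only in how each function reports it.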
shell parameter summary
 | shell=True | shell=False (the default) |
---|---|---|
cmd type | string: "ls -l /etc/passwd" | list: ["ls","-l","/etc/passwd"] |
When program not found | returns non-zero exit code (usually 127). With check_call/check_output will raise CalledProcessError | raises OSError |
Advantages | Simple to use; Allows shell expansion (e.g. ls /tmp/*.txt, $HOME) and complex pipe commands (seq 10 \| rev \| tac 2>/dev/null) | Safer, easier to use with problematic file names (e.g. with spaces or non-English characters); Avoids potential shell-related security issues |
Disadvantages | potential security issues (see things NOT to do) | shell-functionality (redirection, pipes, globbing, env-vars) requires extra python code |
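The "program not found" row can be checked with a small sketch. It assumes a POSIX system where the shell is /bin/sh, and the program name below is a made-up one that is assumed not to exist on the machine.

```python
import os
from subprocess import call

MISSING = "no-such-program-xyz"  # hypothetical name, assumed not to exist

# shell=True: the shell starts fine, fails to find the program,
# and reports that through its own exit code (127 for POSIX shells).
with open(os.devnull, "w") as devnull:
    rc_shell = call(MISSING, shell=True, stderr=devnull)

# shell=False: there is no shell in between, so Python itself fails to
# start the program and raises OSError (FileNotFoundError in Python 3).
try:
    call([MISSING])
    raised = False
except OSError:
    raised = True
```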
Basic Usage
- The examples below show typical usage, with contrived examples (there is rarely a real need to run ls -l from within python - better use os.path module for such things).
- The examples start with minimal error checking, then progress to complete code with proper error checking.
check_call
Run the program (ls); its STDOUT/STDERR are shared with the script’s (e.g. the output will be printed to the terminal):
from subprocess import check_call
check_call("ls -l /etc/passwd /dev/null",shell=True)
Similarly, without shell interpolation:
from subprocess import check_call
check_call(["ls","-l","/etc/passwd","/dev/null"])
check_output
Run the program (seq), STDOUT is returned as a variable:
from subprocess import check_output
numbers = check_output("seq 10",shell=True)
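One detail worth a short sketch: under Python 3 the value returned by check_output is a bytes object, not a str. The snippet below (which assumes seq is installed, as the examples here do) decodes the output and turns it into a list of integers; the variable names are only illustrative.

```python
from subprocess import check_output

raw = check_output(["seq", "3"])       # bytes in Python 3
text = raw.decode("ascii")             # seq prints plain ASCII digits
numbers = [int(line) for line in text.splitlines()]
```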
call
Run the program, get its exit code.
grep returns 0 if the regular-expression matched any lines, 1 if no lines matched, or another value if an error occurred.
Because a regular expression can contain characters with special shell-meaning, it is better to run it with shell=False and avoid variable-expansion:
from subprocess import call
rc = call(["grep","-Eq","(z|k|tc)sh","/etc/passwd"])
if rc == 0:
    print ("someone is using zsh/ksh/tcsh")
elif rc == 1:
    print ("zsh/ksh/tcsh not used")
else:
    print ("an error occurred, grep returned %d" % rc)
If any error occurs, grep will print a message to STDERR, which is shared with the script’s STDERR (e.g. the screen), so the user will see it.
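The three-way exit code of grep can be wrapped in a small helper. This is only a sketch: file_matches is a hypothetical name, and the throwaway input file is there so the example does not depend on the content of /etc/passwd.

```python
import os
import tempfile
from subprocess import call

def file_matches(pattern, path):
    """Return True/False for match/no match; raise on a grep error."""
    rc = call(["grep", "-Eq", pattern, path])
    if rc == 0:
        return True
    if rc == 1:
        return False
    raise RuntimeError("grep failed with exit code %d" % rc)

# Usage with a throwaway file, so the sketch does not depend on /etc/passwd.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alice:/bin/zsh\nbob:/bin/bash\n")
found_zsh = file_matches("(z|k|tc)sh", tmp.name)
found_fish = file_matches("fish", tmp.name)
os.remove(tmp.name)
```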
Popen
Popen enables fine control over the external process - getting its STDOUT, STDERR and return code.
In the example below out will contain the content of /etc/passwd, and err will contain an error message about /foo/bar not being found. The return code will be 1:
from subprocess import Popen,PIPE
p = Popen(["cat","/etc/passwd","/foo/bar"], stdout=PIPE,stderr=PIPE)
(out,err) = p.communicate()
print ("cat returned code = %d" % p.returncode)
print ("cat output:\n\n%s\n\n" % out)
print ("cat errors:\n\n%s\n\n" % err)
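Under Python 3, out and err come back as bytes. A minimal sketch of the same cat call with explicit decoding follows; the temporary file and the ".missing" path are assumptions made so the example is self-contained.

```python
import os
import tempfile
from subprocess import Popen, PIPE

# A throwaway input file plus a path that is assumed not to exist.
with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("first line\n")
missing = tmp.name + ".missing"

p = Popen(["cat", tmp.name, missing], stdout=PIPE, stderr=PIPE)
out, err = p.communicate()
os.remove(tmp.name)

# In Python 3 both values are bytes; decode before treating them as text.
out_text = out.decode("utf-8", "replace")
err_text = err.decode("utf-8", "replace")
```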
Error Checking
check_output error checking (without shell expansion)
The check_* functions will raise CalledProcessError if a program fails (runs, but returns non-zero exit code), or OSError if the program was not found.
The following contrived example will randomly run seq 10, seqXX 10, seq foo, seqXX foo - thus generating different types of errors:
from subprocess import check_output, CalledProcessError
from random import choice
import sys
try:
    cmd = choice( ["seq", "seqXX" ] )
    param = choice( ["foo", "10" ] )
    numbers = check_output([cmd,param])
    print ("'%s %s' succeeded, result=%s" % (cmd,param,str(numbers)))
except CalledProcessError as e:
    sys.exit("'%s %s' failed, returned code %d" % (cmd,param,e.returncode))
except OSError as e:
    sys.exit("failed to execute program '%s': '%s'" % (cmd, str(e)))
( download check-output.py )
check_output error checking (with shell expansion)
The check_* functions will raise CalledProcessError if a program fails (it runs, but returns non-zero exit code), or if the program is not found (and the return-code will be 127). OSError is unlikely but can still happen and should be accounted for.
The following contrived example will randomly run seq 10, seqXX 10, seq foo, seqXX foo - thus generating different types of errors:
from subprocess import check_output, CalledProcessError
from random import choice
import sys
try:
    prog = choice( ["seq", "seqXX" ] )
    param = choice( ["foo", "10" ] )
    # WARNING: constructing shell commands without input validation
    # could lead to security issues!
    cmd = "%s %s" % (prog, param)
    numbers = check_output(cmd, shell=True)
    print ("command '%s' succeeded, returned: %s" % (cmd,str(numbers)))
except CalledProcessError as e:
    if e.returncode==127:
        sys.exit("program '%s' not found" % (prog))
    elif e.returncode<=125:
        sys.exit("'%s' failed, returned code %d" % (cmd,e.returncode))
    else:
        # Things get hairy and unportable - different shells return
        # different values for coredumps, signals, etc.
        sys.exit("'%s' likely crashed, shell returned code %d" % (cmd,e.returncode))
except OSError as e:
    # unlikely, but still possible: the system failed to execute the shell
    # itself (out-of-memory, out-of-file-descriptors, and other extreme cases).
    sys.exit("failed to run shell: '%s'" % (str(e)))
( download check-output-shell.py )
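The exit-code ranges used above can be collected into one helper. The sketch below is an assumption-laden summary rather than a portable rule: classify_shell_status is a hypothetical name, the 126 case (found but not executable) follows the POSIX shell convention and is not handled separately in the example above, and a negative value is how Popen reports that the shell itself was killed by a signal.

```python
def classify_shell_status(rc):
    """Map an exit status seen with shell=True to a coarse category."""
    if rc == 0:
        return "success"
    if rc < 0:
        # Popen convention: -N means the process was killed by signal N.
        return "killed by signal"
    if rc == 127:
        return "not found"
    if rc == 126:
        # POSIX shell convention: found, but could not be executed.
        return "not executable"
    if rc <= 125:
        return "failed"
    # 128 and above: shell-specific encoding of signals, core dumps, etc.
    return "likely crashed"
```

For example, classify_shell_status(e.returncode) could replace the if/elif chain inside the CalledProcessError handler.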
Popen error checking (without shell)
Popen (without shell) will run the program and return its STDOUT, STDERR and return code.
OSError will be raised if the program is not found.
The following contrived example will randomly run seq 10, seqXX 10, seq foo, seqXX foo - thus generating different types of errors:
from subprocess import Popen,PIPE
from random import choice
import sys
try:
    prog = choice( ["seq", "seqXX" ] )
    param = choice( ["foo", "10" ] )
    p = Popen([prog,param],stdout=PIPE,stderr=PIPE)
    (out,err) = p.communicate()
    if p.returncode == 0:
        print ("command '%s %s' succeeded, returned: %s" \
               % (prog, param, str(out)))
    else:
        print ("command '%s %s' failed, exit-code=%d error = %s" \
               % (prog, param, p.returncode, str(err)))
except OSError as e:
    sys.exit("failed to execute program '%s': %s" % (prog, str(e)))
( download popen.py )
Popen error checking (with shell)
Popen (with shell) will run the program and return its STDOUT, STDERR and return code.
A return code of 127 indicates the program was not found by the shell.
OSError will be raised in extreme cases where the system failed to run a new shell.
The following contrived example will randomly run seq 10, seqXX 10, seq foo, seqXX foo - thus generating different types of errors:
from subprocess import Popen,PIPE
from random import choice
import sys
try:
    prog = choice( ["seq", "seqXX" ] )
    param = choice( ["foo", "10" ] )
    # WARNING: constructing shell commands without input validation
    # could lead to security issues!
    cmd = "%s %s" % (prog, param)
    p = Popen(cmd,shell=True,stdout=PIPE,stderr=PIPE)
    (out,err) = p.communicate()
    if p.returncode == 0:
        print ("command '%s %s' succeeded, returned: %s" \
               % (prog, param, str(out)))
    elif p.returncode <= 125:
        print ("command '%s %s' failed, exit-code=%d error = %s" \
               % (prog, param, p.returncode, str(err)))
    elif p.returncode == 127:
        print ("program '%s' not found: %s" % (prog, str(err)))
    else:
        # Things get hairy and unportable - different shells return
        # different values for coredumps, signals, etc.
        # (no exception object exists here, so use p.returncode)
        sys.exit("'%s' likely crashed, shell returned code %d" % (cmd,p.returncode))
except OSError as e:
    # unlikely, but still possible: the system failed to execute the shell
    # itself (out-of-memory, out-of-file-descriptors, and other extreme cases).
    sys.exit("failed to run shell: '%s'" % (str(e)))
( download popen-shell.py )
Advanced Usage
Merging STDERR into STDOUT
When using check_output, output is returned as a python variable while STDERR is shared with the script’s STDERR (e.g. printed to the terminal). By merging STDERR into STDOUT, the external program’s errors can be hidden from the user (while still using the simple check_output method).
This method is also useful if a program writes information other than error messages to STDERR (e.g. progress information).
import sys
from subprocess import check_output, CalledProcessError, STDOUT
try:
    numbers = check_output(["seq","foo"], stderr=STDOUT)
except CalledProcessError as e:
    sys.exit("'seq foo' failed, returned code %d" % e.returncode )
except OSError as e:
    sys.exit("failed to execute seq: %s" % (str(e)))
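In the above example, the error message (content of STDERR) is not shown to the user: it is merged into the captured output, which the example discards (on failure it is still available as the output attribute of the CalledProcessError exception). The program’s failure is detected by its non-zero exit-code (leading to a ‘CalledProcessError’ exception).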
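If the merged text is needed after a failure, it can be read from the exception object. The sketch below uses a small Python child program as a stand-in for a failing tool (an assumption made to keep the example self-contained); the output attribute of CalledProcessError is the only API relied on.

```python
import sys
from subprocess import check_output, CalledProcessError, STDOUT

# Child program that complains on STDERR and exits with code 2.
CMD = [sys.executable, "-c",
       "import sys; sys.stderr.write('boom\\n'); sys.exit(2)"]

try:
    check_output(CMD, stderr=STDOUT)
    captured, rc = b"", 0
except CalledProcessError as e:
    # e.output holds everything the child wrote before failing,
    # including STDERR because it was merged into STDOUT.
    captured, rc = e.output, e.returncode
```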
There is little reason to use this method with call/check_call, since these functions do not return the output in any form.
There is little reason to use this method with Popen, since it returns the content of STDERR in a separate variable.
Redirecting output to a file
Using a file object with the stdout parameter will redirect the program’s STDOUT to a file:
# shell equivalent:
# ls -l /etc > files.txt
from subprocess import check_call
fout = open('files.txt','w')
check_call ("ls -l /etc/", shell=True, stdout=fout)
STDERR can be similarly redirected:
# shell equivalent:
# ls -l /etc /foo/bar >files.txt 2>errors.txt
from subprocess import check_call
fout = open('files.txt','w')
ferr = open('errors.txt','w')
check_call ("ls -l /etc/ /foo/bar", shell=True, stdout=fout, stderr=ferr)
For properly robust code, I/O errors must be checked as well:
import sys
from subprocess import check_call, CalledProcessError
try:
    fout = open('files.txt','w')
    ferr = open('errors.txt','w')
    check_call ("ls -l /etc/", shell=True, stdout=fout, stderr=ferr)
    fout.close()
    ferr.close()
except IOError as e:
    sys.exit("I/O error on '%s': %s" % (e.filename, e.strerror))
except CalledProcessError as e:
    sys.exit("'ls' failed, returned code %d (check 'errors.txt')" \
             % (e.returncode))
except OSError as e:
    sys.exit("failed to run shell: %s" % (str(e)))
( download redirect-to-a-file.py )
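A variation worth considering, sketched under the assumption that a temporary directory is an acceptable place for the output: opening the files in a with statement closes them even when check_call raises, which the explicit close() calls do not guarantee. The directory and path names below are illustrative.

```python
import os
import tempfile
from subprocess import check_call

workdir = tempfile.mkdtemp()
out_path = os.path.join(workdir, "files.txt")
err_path = os.path.join(workdir, "errors.txt")

# Both files are closed when the block exits, even if check_call raises.
with open(out_path, "w") as fout, open(err_path, "w") as ferr:
    check_call(["ls", "-l", workdir], stdout=fout, stderr=ferr)

with open(out_path) as f:
    listing = f.read()
```

The same try/except structure for IOError, CalledProcessError and OSError can be wrapped around the with block.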
Redirecting STDIN from a file
The following contrived example runs base64 with STDIN redirected from a file (in real-world applications it is recommended to use python’s built-in base64 module):
# shell equivalent:
# b=$(base64 < /etc/passwd) || echo base64 failed
from subprocess import check_output, CalledProcessError
import sys
try:
    fin=open('/etc/passwd','r')
    b=check_output(["base64"], stdin=fin)
    fin.close()
    print ("encoded passwd: %s" % str(b))
except IOError as e:
    sys.exit("I/O error on '%s': %s" % (e.filename, e.strerror))
except CalledProcessError as e:
    sys.exit("base64 failed: %s" % (str(e)))
except OSError as e:
    sys.exit("failed to run 'base64': %s" % (str(e)))
( download redirect-stdin-from-file.py )
Redirecting BOTH STDIN and STDOUT
# Shell Equivalent:
# base64 < /etc/passwd > encoded-passwd.txt \
# || echo base64 failed
import sys
from subprocess import check_call, CalledProcessError
try:
    fin=open('/etc/passwd','r')
    fout=open('encoded-passwd.txt','w')
    check_call(["base64"], stdin=fin, stdout=fout)
    fout.close()
    fin.close()
except IOError as e:
    sys.exit("I/O error on '%s': %s" % (e.filename, e.strerror))
except CalledProcessError as e:
    sys.exit("base64 failed: %s" % (str(e)))
except OSError as e:
    sys.exit("failed to run 'base64': %s" % (str(e)))
( download redirect-stdin-stdout-to-files.py )
STDIN,STDOUT from/to a python variable
NOTE: when using this method, ensure the input is small. Sadly there are no strict guidelines as to ‘how small’; the Python documentation says: “Note The data read is buffered in memory, so do not use this method if the data size is large or unlimited.” A few lines of text should be fine. To send a large amount of data, use file redirection or other methods. This issue is not python-specific. For a detailed discussion of pipes and subprocesses, see Advanced Programming in the Unix Environment, 3rd Ed., Chapter 15 (“Interprocess Communication”) section 15.2 (“Pipes”).
# shell (almost) equivalent:
# out=$(printf "Hello World" | base64 2>err) || echo base64 failed
import sys
from subprocess import Popen,PIPE
try:
    p = Popen(["base64"], stdin=PIPE, stdout=PIPE,stderr=PIPE)
    # communicate() expects bytes under Python 3 (b"..." is also valid in Python 2.6+)
    data = b"Hello World"
    (out,err) = p.communicate(data)
    if p.returncode==0:
        print ("encoded value = %s" % (str(out)))
    else:
        print ("failed to encode, exit-code=%d, error=%s" % (p.returncode,str(err)))
except OSError as e:
    sys.exit("failed to run 'base64': %s" % (str(e)))
( download redirect-stdin-stdout-to-vars.py )
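When the data is text rather than raw bytes, the pipes can be switched to text mode instead of encoding and decoding by hand. The following is a minimal sketch using the universal_newlines parameter of Popen; it assumes the base64 program is installed, and cross-checks the result against Python's own base64 module only for illustration.

```python
import base64
from subprocess import Popen, PIPE

# universal_newlines=True switches the pipes to text mode, so
# communicate() accepts and returns str instead of bytes.
p = Popen(["base64"], stdin=PIPE, stdout=PIPE, stderr=PIPE,
          universal_newlines=True)
out, err = p.communicate("Hello World")
encoded = out.strip()

# Cross-check against Python's built-in base64 module.
expected = base64.b64encode(b"Hello World").decode("ascii")
```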
Detecting error by STDERR
Most ‘well-behaved’ unix programs will terminate with exit code 0 upon success and a non-zero exit-code upon error. Shell syntax and python’s check_call/check_output operate under this assumption.
Some programs, however, exit with exit-code zero EVEN if there was an error, and write error information to STDERR. Detecting errors for such programs requires a bit more code.
The example below runs openssl with an incorrect option (sha1X).
openssl exits with code 0 even if wrong parameters are given. Detection of errors is done by examining the content of the returned STDERR string.
Shell Equivalent:
A=$(openssl sha1X < /etc/passwd 2>tmp.err) \
|| echo "openssl failed (should not happen)"
if test -s tmp.err ; then
    echo "Openssl failed, error = "
    cat tmp.err
    exit 1
fi
echo "OpenSSL succeeded, output = $A"
Python code:
import sys
from subprocess import Popen, PIPE
try:
    fin=open('/etc/passwd','r')
    p = Popen(["openssl","sha1X"], stdin=fin, stdout=PIPE,stderr=PIPE)
    (out,err) = p.communicate()
    fin.close()
    if p.returncode!=0:
        ## NOTE: This will never happen (in this example), as openssl
        ## does not return non-zero on errors.
        ## We'll have to detect errors in a different way.
        sys.exit("openssl failed, exit code=%d, error=%s" \
                 % (p.returncode, str(err)))
    # OpenSSL will write error messages to STDERR. If it's not empty,
    # there was an error.
    err = err.strip()
    if len(err) != 0:
        sys.exit("openssl failed, error = %s" % (str(err)))
    out = out.strip()
    print ("OpenSSL succeeded, output = %s" % (str(out)))
except IOError as e:
    sys.exit("I/O error on '%s': %s" % (e.filename, e.strerror))
except OSError as e:
    sys.exit("failed to execute command 'openssl': %s" % (str(e)))
( download detect-errors-by-stderr.py )
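The same idea can be packaged as a reusable helper. This is a sketch only: run_strict is a hypothetical name, and a small Python child program stands in for a tool that reports an error on STDERR yet exits with code 0 (so the example does not depend on a particular openssl version).

```python
import sys
from subprocess import Popen, PIPE

def run_strict(argv):
    """Run argv; treat a non-zero exit code OR any STDERR text as failure."""
    p = Popen(argv, stdout=PIPE, stderr=PIPE)
    out, err = p.communicate()
    if p.returncode != 0 or err.strip():
        raise RuntimeError("%r failed (exit code %d): %r"
                           % (argv, p.returncode, err.strip()))
    return out

# A stand-in for a misbehaving tool: reports an error but exits with 0.
SNEAKY = [sys.executable, "-c",
          "import sys; sys.stderr.write('bad option\\n')"]
try:
    run_strict(SNEAKY)
    detected = False
except RuntimeError:
    detected = True
```

Note that this approach gives false alarms for programs that write progress or warnings to STDERR, so it should be used only for tools known to keep STDERR empty on success.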
Re-enabling SIGPIPE
Python’s subprocess module disables SIGPIPE by default (SIGPIPE is set to be ignored). Most unix programs expect to run with SIGPIPE enabled.
When running a single external program (e.g. with shell=False) this is usually not a problem. It does become a problem when running shell-pipes, or when the executed program runs sub-programs on its own.
Example: the following shell command will print one line and terminate immediately. This happens because when ‘head’ terminates, ‘seq’ will receive a SIGPIPE signal, causing it to terminate without counting all the way to 99999999:
seq 1 0.00001 99999999 | head -n1
Yet the following python code will run for a long time, because SIGPIPEs are silenced by default:
#!/usr/bin/env python
from subprocess import check_output
x = check_output("seq 1 0.0001 99999999 | head -n1",shell=True)
( download sigpipe-issue.py )
NOTE: future versions of GNU coreutils ‘seq’ will detect and address this issue regardless of python’s SIGPIPE settings, but the problem will still exist for other programs.
The solution is to re-enable the default SIGPIPE behaviour.
Use the following method for Python2:
#!/usr/bin/env python
import signal
from subprocess import check_output
def sigpipe_fix():
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)
x = check_output("seq 1 0.0001 99999999 | head -n1",
                 preexec_fn=sigpipe_fix,shell=True)
print "x = ", x
( download sigpipe-fix2.py )
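Use the following method for Python3 (restore_signals is accepted by Popen from Python 3.2 on, where it defaults to True, so SIGPIPE is normally already restored in the child; passing it explicitly makes the intent visible):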
#!/usr/bin/env python3
import signal
from subprocess import check_output
x = check_output("seq 1 0.0001 99999999 | head -n1",
restore_signals=True,shell=True)
print ("x = ", x)
( download sigpipe-fix3.py )
See further details and discussion:
http://www.chiark.greenend.org.uk/~cjwatson/blog/python-sigpipe.html,
http://bugs.python.org/issue1652,
http://www.pixelbeat.org/programming/sigpipe_handling.html.
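To see which SIGPIPE setting a child process actually starts with, one can inspect the child's ignored-signal mask. The sketch below is Linux-specific (it reads the SigIgn line of /proc/self/status through cat) and requires Python 3 for restore_signals; child_ignores_sigpipe is a hypothetical helper name, and it returns None where /proc is not available.

```python
import os
import signal
from subprocess import check_output

def child_ignores_sigpipe(restore_signals):
    """Return True if a child process starts with SIGPIPE ignored.

    Linux-only sketch: returns None where /proc/self/status is unavailable.
    """
    if not os.path.exists("/proc/self/status"):
        return None
    # 'cat' reports its own status, i.e. the state the child starts with.
    status = check_output(["cat", "/proc/self/status"],
                          restore_signals=restore_signals)
    for line in status.decode("ascii", "replace").splitlines():
        if line.startswith("SigIgn:"):
            mask = int(line.split()[1], 16)
            # Signal N is represented by bit N-1 of the mask.
            return bool(mask & (1 << (signal.SIGPIPE - 1)))
    return None
```

Calling child_ignores_sigpipe(True) and child_ignores_sigpipe(False) side by side shows the effect of the parameter on a given system.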
things NOT to do
Some techniques and methods should be avoided (despite being demonstrated in various locations).
using stdin.write / stdout.read
The Popen object has stdin, stdout, stderr attributes. DO NOT use them. Quoting from the manual:
Warning Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.
Examples might look like so:
## !! BAD EXAMPLE - DO NOT USE stdin.write/stdout.read !!
p = Popen([...], stdout=PIPE, stdin=PIPE, stderr=STDOUT, shell=False)
p.stdin.write("input to program")
result = p.stdout.readline()
p.wait()
p.kill()
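For contrast, a minimal sketch of the communicate()-based way to feed input and collect output. It uses sort as the external program (assumed to be installed); the input data is made up for the example.

```python
from subprocess import Popen, PIPE

p = Popen(["sort"], stdin=PIPE, stdout=PIPE, stderr=PIPE)
# communicate() writes the input, closes stdin, reads both output
# pipes to the end and waits for the process, all without deadlocking.
out, err = p.communicate(b"pear\napple\n")
lines = out.decode("ascii").splitlines()
```

After communicate() returns, p.returncode is set, so no separate wait() or kill() call is needed.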
using PIPE with check_call/check_output/call
DO NOT set stdout=PIPE or stderr=PIPE in check_call/check_output/call methods.
Only use Popen and communicate() with the PIPE options.
using Popen.wait() with PIPEs
DO NOT use Popen.wait() when using PIPEs for stdout/stderr - this might lead to a deadlock.
Use Popen and communicate() instead.
Pass unvalidated input as a string with ‘shell=True’.
Do not pass unvalidated input (e.g. entered by the user or read from a file) directly to a command-line string with shell=True.
This could lead to a security problem.
See Subprocess FAQ for a detailed example.
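Two ways to keep such input harmless can be sketched as follows. The function names and the hostile string are hypothetical; printf is assumed to be available both as a program and as a shell builtin, and shlex.quote requires Python 3.3 or later.

```python
import shlex
from subprocess import check_output

def echo_via_list(text):
    # Preferred: no shell involved, text is passed as one literal argument.
    return check_output(["printf", "%s", text])

def echo_via_shell(text):
    # Only if a shell is really needed: quote the value first.
    # shlex.quote is available from Python 3.3.
    return check_output("printf %s " + shlex.quote(text), shell=True)

hostile = "x; echo INJECTED"  # hypothetical malicious input
```

In both cases the text after the semicolon is printed as data and never runs as a second command; without the quoting, the shell variant would execute it.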