A secure version of Python

Andrew KUCHLING (fnord@cs.mcgill.ca)
11 Aug 1994 15:16:10 GMT

Last week, one article discussed using the HTTP protocol for fetching
and executing Python code from remote machines, and proposed a secure
version of Python for executing untrusted code. (I forget who the
original poster was, and I can't find the printout I made--it may have
been Steve Majewski.)

I've been considering the issue because I'm thinking of experimenting with
distributed computing protocols written in Python, and a secure Python would
not only guard against crackers, but would also help protect me from my own
stupidity.

Some hazards can be avoided by using the resource limits that Unix
provides; for example, limiting the number of processes that can be created
will protect against fork bombs, and would let you leave the posix.fork()
function in the secure version. You could keep the file I/O functions by
only letting untrusted programs manipulate files in a temporary directory or
RAM disk. However, I've tried to be conservative: I've assumed that such
protection is not available, that untrusted programs will be fairly limited
in scope, and that they won't need to carry out file I/O or multiprocess
computation.
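
For anyone who does want to lean on the Unix limits, here is a rough sketch
of the process limit mentioned above. It assumes a present-day Python with
the standard resource module on a system that defines RLIMIT_NPROC (Linux
and the BSDs do); the helper names capped() and limit_processes() are made
up for illustration.

```python
import resource

def capped(limit_name, ceiling):
    """Return a (soft, hard) pair with the soft limit lowered to at most ceiling."""
    soft, hard = resource.getrlimit(limit_name)
    # RLIM_INFINITY is a sentinel, not a large number, so compare explicitly.
    if hard != resource.RLIM_INFINITY:
        ceiling = min(ceiling, hard)
    if soft != resource.RLIM_INFINITY:
        ceiling = min(ceiling, soft)
    return ceiling, hard

def limit_processes(max_procs):
    """Lower the soft process limit before running untrusted code."""
    # Note: on most systems this limit counts processes per user ID,
    # not per parent process, so it is a blunt instrument.
    resource.setrlimit(resource.RLIMIT_NPROC,
                       capped(resource.RLIMIT_NPROC, max_procs))
```

Only the soft limit is lowered here, so the process could raise it again up
to the hard limit; lowering the hard limit as well would make the change
irreversible for that process and its children.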

Most obviously, several built-in modules will have to be removed, because
the functions they contain are dangerous. The socket module has to go
completely; you don't want untrusted programs to be able to leak information
outside your system, or to access other local systems.
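
Short of building a separate interpreter without these modules, one way to
experiment with the idea is to refuse the import. The sketch below assumes
a present-day Python 3, where sys.meta_path and importlib.abc.MetaPathFinder
exist; BlockedModuleFinder and install_blocklist() are hypothetical names.
It is an illustration of the policy, not a real security boundary.

```python
import importlib.abc
import sys

class BlockedModuleFinder(importlib.abc.MetaPathFinder):
    """Refuse to import any module whose top-level name is blocked."""

    def __init__(self, blocked):
        self.blocked = frozenset(blocked)

    def find_spec(self, fullname, path=None, target=None):
        top_level = fullname.partition(".")[0]
        if top_level in self.blocked:
            raise ImportError(
                f"module {fullname!r} is not available to untrusted code")
        return None  # not blocked: let the normal finders handle it

def install_blocklist(blocked):
    """Install the finder first on sys.meta_path and forget cached modules."""
    finder = BlockedModuleFinder(blocked)
    # Drop cached copies so a later import statement reaches the finder.
    # Code that already holds a reference to the module keeps it.
    for name in list(sys.modules):
        if name.partition(".")[0] in finder.blocked:
            del sys.modules[name]
    sys.meta_path.insert(0, finder)
    return finder
```

Calling install_blocklist({"socket"}) before running untrusted code would
make a later "import socket" fail, but anything that can reach sys.meta_path
or an existing module object can undo this, which is why removing the module
from the interpreter outright is the safer design.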

Most of the POSIX module would have to be removed; untrusted programs almost
certainly shouldn't need to create or read files, make links, fork or
execute other programs. With the socket module gone as well, forbidding file
I/O leaves untrusted Python programs no obvious way to leak information.
About the only functions in the POSIX module that don't present hazards are
the get{cwd,egid,euid,...} functions, nice(), times(), umask(), uname(),
wait(), and waitpid().
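
The list of harmless functions can be written down as an allowlist. This is
a minimal sketch assuming a present-day Python, where the POSIX functions
are reached through the os module; SAFE_POSIX_NAMES and
make_restricted_posix() are names invented for the example.

```python
import os
import types

# Only the names spelled out in the text. The "..." in
# get{cwd,egid,euid,...} is deliberately left unexpanded.
SAFE_POSIX_NAMES = (
    "getcwd", "getegid", "geteuid",
    "nice", "times", "umask", "uname", "wait", "waitpid",
)

def make_restricted_posix(names=SAFE_POSIX_NAMES):
    """Build a namespace exposing only the listed functions from os."""
    # Skip names the platform does not provide instead of failing.
    exported = {name: getattr(os, name) for name in names if hasattr(os, name)}
    return types.SimpleNamespace(**exported)
```

A namespace like this only helps if untrusted code cannot import the real
module by some other route; by itself it just documents the policy.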

If you decide that you *must* allow untrusted programs to read and write
files, a solution might be to restrict file creation to a single directory
by building this precaution into the OS modules; it would be a sort of
implicit chroot() from the point of view of an untrusted Python program. It
would still be possible to try to use up all the i-nodes or the disk space
on the filesystem; I can't think of any way to guard against that. You may
or may not want to let untrusted programs read files outside of the safe
directory.
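
The implicit chroot() could be approximated by checking every path before
it reaches the real file functions. The following is a sketch under the
assumption of a present-day Python on a POSIX system; confine() and
PathEscapeError are hypothetical names, not part of any library.

```python
import os

class PathEscapeError(Exception):
    """Raised when a requested path would leave the safe directory."""

def confine(safe_dir, requested):
    """Return the absolute path for requested, or raise if it escapes safe_dir."""
    root = os.path.realpath(safe_dir)
    # join() discards root when requested is absolute, and realpath()
    # resolves ".." components and symbolic links, so both kinds of
    # escape attempt end up outside root and are caught below.
    candidate = os.path.realpath(os.path.join(root, requested))
    if os.path.commonpath([root, candidate]) != root:
        raise PathEscapeError(requested)
    return candidate
```

This check is still open to a race: a symbolic link created between the
check and the actual open() would slip through, so a real implementation
would need help from the operating system.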

I don't know if it's possible to write a thread bomb, by analogy with
a fork bomb; I assume that it's possible to fill up some sort of
operating system data structure by continually creating new threads.
In any case, threads are only available on a few architectures and
shouldn't be part of a standard secure Python. So, the thread module
has to go, too. (I hope someone who uses threads will clear this
question up.)

Modules that I'm not sure of: dbm, gdbm, grp, posixpath, pwd. You may not
want untrusted programs to look at your password or group databases, or to
check for the existence of certain files on your machine. If you prohibit
file manipulation, the DBM and GDBM modules are of no use, and can be
removed.

Standard language features: File objects present the same hazard as the
POSIX module does, and most of the above comments apply, so open() should
be removed.

sys.path should not be assignable. We will probably allow untrusted programs
to call standard library modules like cmd.py, and a cracker could change
sys.path so that his own versions of library modules are found first. For
example, urllib.py imports ftplib.py, so a substitute ftplib.py placed
earlier on the path would be loaded and run by the trusted urllib code.

The interpreter should monitor its size, and terminate itself if it gets too
large. This would prevent running code from filling up all available memory
by building gigantic lists or dictionaries, or by executing an infinitely
recursive function.
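
Something close to this can be tried without changing the interpreter. The
sketch assumes a present-day Python on a Unix system that supports
RLIMIT_AS; cap_address_space() and call_with_depth_limit() are hypothetical
helpers. Note that it does not terminate the interpreter as proposed above:
an oversized allocation fails with MemoryError and runaway recursion raises
RecursionError instead.

```python
import resource
import sys

def cap_address_space(max_bytes):
    """Set the soft address-space limit; larger allocations then fail."""
    soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    # The soft limit may not exceed the hard limit.
    if hard != resource.RLIM_INFINITY:
        max_bytes = min(max_bytes, hard)
    resource.setrlimit(resource.RLIMIT_AS, (max_bytes, hard))

def call_with_depth_limit(func, max_depth=500):
    """Call func() under a temporary recursion limit.

    Returns func()'s result, or None if it recursed too deeply.
    """
    previous = sys.getrecursionlimit()
    sys.setrecursionlimit(max_depth)
    try:
        return func()
    except RecursionError:
        return None
    finally:
        sys.setrecursionlimit(previous)
```

Untrusted code that can call sys.setrecursionlimit() or catch MemoryError
itself can work around both limits, so these are a stopgap rather than the
self-monitoring interpreter described above.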

Various library modules would become irrelevant, and could be
deleted. For example, if you don't allow reading/writing files, many
library modules (such as rfc822.py) won't be able to run at all.
However, I don't think it's possible for library modules to be unsafe
if the language itself is secure.

That covers all the attacks I could think of. Can anyone think of any
potential attacks I've missed?


Andrew Kuchling
fnord@binkley.cs.mcgill.ca