Python, WWW, Literate Programming and Collaborative Computing

Steven D. Majewski (sdm7g@elvis.med.virginia.edu)
Wed, 3 Aug 1994 01:26:57 -0400

Besides trying to write Python CGI scripts for my *REAL* work,
I've been thinking more about some dropped threads about
embedding help & documentation in python modules and how to
support an interactive, non-batch oriented form of Literate Programming.

[ D.E.Knuth, <<Literate Programming>> CSLI lecture notes ; no 27. 1992 ]
[ Literate Programming Library <http://info.desy.de/www/LitProg.html> ]

Michael J. McLay at NIST <mclay@eeel.nist.gov> has done a
prototype "netimport" module, which I found when I was fetching
the latest version of his cgi.py module from
URL:<http://www.eeel.nist.gov/python/>

netimport <http://www.eeel.nist.gov/python/netimport.html>
allows import of a remote module via http.
[ His disclaimer said not to "publish" the URLs, but since he has
posted the first to comp.lang.python himself, I'll assume he
didn't mean "publish" to include posting it here.
BTW Michael: It didn't work for me, but I haven't poked
around enough to figure out why. I end up with an EMPTY
test.py in my current directory. I assume ( particularly
from the dangling code in site.py ) this is an "in-progress"
snapshot. ]

I had already been thinking of several methods to link
Python together with the collaborative nature of WWW and
the ideas of Literate Programming, and had started to
sketch out various alternatives when I came across that
module. ( Similar to some of the notions I had been considering.)

I was going to wait until I had some sort of prototype before
I brought up the subject, but seeing that there are already
others thinking along these lines, I thought I would at least
transcribe some of my notes and post them for comment.

Sorry that they are so crude, but I really DO have other
work to get back to! :->

[ The other inspiration was feeling that NOBODY is *ever* going
to have the time to properly maintain the Python library and
all of the contributed code, so maybe we need a better mechanism
for collaboration. ]

Goals:

* Use WWW to organize the Python Library and contributed code.

* Enable collaborative Python programming projects.

* Use WWW browsers as class/documentation/help browsers from Python.

* Allow Literate Programming in an interactive HyperText style,
rather than the batch/Book oriented WEB/TeX approach.
URL:<http://info.desy.de/www/LitProg/HTML.html>

-----

file format:

* Embed HTML tags in Python comments.
An ML example in http://info.desy.de/www/LitProg/HTML-ml.html
does this. They wrap the code in commented <PRE> ... </PRE>'s.
<CODE> would be better and more specific, but the example may
have predated introduction of that tag.
PROBLEM: "<", ">" have to be escaped.

* Embed Python code in a HTML document.
Strip out and concatenate everything between <CODE> ... </CODE>
and unescape "&lt;", "&gt;", etc.

* Use yet another format with cgi-filters to/from Python and HTML.
( a WEB like tangle & weave )
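The second option above (Python code embedded in an HTML document) can be sketched in a few lines. This is only an illustration, assuming a present-day Python 3 standard library; the function name tangle and the sample page are made up for the example, and a regular expression is a shortcut that will not cope with nested or malformed tags.

```python
import html
import re

# Matches <CODE> ... </CODE>, case-insensitively, across line breaks.
CODE_BLOCK = re.compile(r"<code\b[^>]*>(.*?)</code>", re.IGNORECASE | re.DOTALL)

def tangle(html_text):
    """Return the Python source embedded in the <CODE> elements of html_text."""
    chunks = []
    for match in CODE_BLOCK.finditer(html_text):
        # Undo the escaping: "&lt;" -> "<", "&gt;" -> ">", "&amp;" -> "&".
        chunks.append(html.unescape(match.group(1)).strip("\n"))
    return "\n".join(chunks) + "\n"

# A made-up literate document: prose in <P>, code in <CODE>.
page = """<H1>Clamp</H1>
<P>Limit a value to a ceiling.</P>
<CODE>
def clamp(x, top):
    if x &gt; top:
        return top
    return x
</CODE>
<P>A quick check:</P>
<CODE>
assert clamp(5, 3) == 3
</CODE>
"""

source = tangle(page)
namespace = {}
exec(source, namespace)   # run the concatenated code-only view
```

The escaped "&gt;" in the document comes back as ">" in the extracted source, which is the unescaping step the bullet asks for.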

Note: <!--PYTHON--><CODE> or something to designate the language
would be even better. Mosaic ( other WWW browsers? ) seem to
happily ignore tags they don't know, so "<CODE><PYTHON>" or
something similar might be OK. Maybe <CODE> ought to be optionally
parameterized: <CODE LANG="Python"> ? (That is a point to bring
up on html mailing list.)
But indicating the language would allow other language dependent
processing of the stream: e.g. keywords rendered bold, generate
links to function/class definitions, etc.
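To make the <CODE LANG="Python"> idea concrete, here is a rough sketch of a filter that sorts code blocks by their declared language. The LANG attribute is the hypothetical extension floated above, not standard HTML; the class name CodeByLanguage and the "unknown" default are likewise invented for the example. It leans on html.parser from the Python 3 standard library.

```python
from html.parser import HTMLParser

class CodeByLanguage(HTMLParser):
    """Collect the text of <CODE LANG="..."> elements, grouped by language."""

    def __init__(self):
        super().__init__(convert_charrefs=True)   # "&lt;" arrives as "<"
        self.blocks = {}     # lowercased language name -> list of code strings
        self._lang = None    # language of the <CODE> being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "code":
            # Hypothetical LANG attribute; untagged blocks go under "unknown".
            self._lang = (dict(attrs).get("lang") or "unknown").lower()
            self._text = []

    def handle_data(self, data):
        if self._lang is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "code" and self._lang is not None:
            self.blocks.setdefault(self._lang, []).append("".join(self._text))
            self._lang = None

page = (
    '<P>Intro</P>\n'
    '<CODE LANG="Python">print(1 &lt; 2)</CODE>\n'
    '<CODE LANG="ML">val x = 1</CODE>\n'
)
parser = CodeByLanguage()
parser.feed(page)
parser.close()
python_source = "\n".join(parser.blocks.get("python", []))
```

With the language known per block, a later pass could pick only the Python blocks for import, or hand each block to a language-specific renderer.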

We might also want other types of <CODE> strings. Perhaps not
all <CODE> is displayed in the document view ? ( But it *does*
get included for CODE-ONLY import. )

---

Tangle/Weave: Separating the code-only from the documentation.

( "Client" here means either a WWW browser like Mosaic, lynx, Dancer, or a WWW client *FUNCTION* that implements a remote import capability. )

Client function ( importhtml( url ) ) gets the text/html and strips and concatenates the <CODE> ... </CODE>, execs that string, and possibly enters the raw HTML string in the module namespace ( __HTML__ ? )
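A minimal sketch of such a client function, assuming Python 3 and its standard library. Only the names importhtml and __HTML__ come from the notes above; module_from_html, the default module name "remote" and the sample document are invented for illustration. Note that it execs whatever it is given, so it ignores every security question.

```python
import html
import re
import types
import urllib.request

CODE_BLOCK = re.compile(r"<code\b[^>]*>(.*?)</code>", re.IGNORECASE | re.DOTALL)

def module_from_html(name, html_text):
    """Build a module object from the <CODE> elements of an HTML string."""
    source = "\n".join(
        html.unescape(m.group(1)).strip("\n")
        for m in CODE_BLOCK.finditer(html_text)
    )
    module = types.ModuleType(name)
    module.__HTML__ = html_text   # keep the raw document for help browsers
    exec(compile(source, name + ".html", "exec"), module.__dict__)
    return module

def importhtml(url, name="remote"):
    """Fetch url and import the code it contains. Runs untrusted code as-is."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8", errors="replace")
    return module_from_html(name, text)

# Offline demonstration with a made-up document instead of a real URL.
doc = """<P>Doubles a number.</P>
<CODE>
def double(x):
    return 2 * x
</CODE>
"""
demo = module_from_html("demo", doc)
result = demo.double(21)
```

Keeping the fetch separate from the module construction means the stripping and exec step can be tried without a network connection.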

Server side parsing: A server script gets the URL with the module name as a fragment. By default, the server sends text/html, but if #module is followed by "#CODE", then it sends Python code. The text/html might include a FORMS button to request CODE-ONLY.
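The decision the server script has to make could look roughly like this. It is a sketch under a strong assumption: that the script is handed the whole URL string, "#" parts included. The function name requested_form and the example host are invented; text/x-python-code is the type name suggested in these notes.

```python
from urllib.parse import urlsplit

def requested_form(url):
    """Return (module_name, content_type) for a URL such as '...#spam#CODE'."""
    fragment = urlsplit(url).fragment        # everything after the first "#"
    module, _, suffix = fragment.partition("#")
    if suffix.upper() == "CODE":
        return module, "text/x-python-code"  # code-only view was asked for
    return module, "text/html"               # default: the document view

code_view = requested_form("http://www.example.org/lib.html#spam#CODE")
doc_view = requested_form("http://www.example.org/lib.html#spam")
```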

Client and Server negotiate via "Accept:" on what should be sent: text/html or text/x-python-code (or something else). Browsers will typically get text/html. The client import function will specifically ask for text/x-python-code.
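Both halves of that negotiation can be sketched with the Python 3 standard library. The names code_request and choose_type are invented here, text/x-python-code is the unregistered type name suggested above, and the server half deliberately ignores q-values, so treat it as an outline rather than a full Accept parser.

```python
import urllib.request

PYTHON_CODE = "text/x-python-code"   # type name from the notes, not a registered one

def code_request(url):
    """Client side: build a request that asks for code rather than text/html."""
    return urllib.request.Request(url, headers={"Accept": PYTHON_CODE})

def choose_type(accept_header):
    """Server side: pick what to send from the client's Accept header."""
    offered = [
        item.split(";")[0].strip().lower()   # drop parameters such as ";q=0.5"
        for item in accept_header.split(",")
    ]
    return PYTHON_CODE if PYTHON_CODE in offered else "text/html"

request = code_request("http://www.example.org/spam.html")
browser_gets = choose_type("text/html, image/gif")
importer_gets = choose_type(request.get_header("Accept"))
```

An ordinary browser never mentions the code type, so it falls through to text/html; only the import function, which asks for it by name, receives code.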

----

Security concerns: Lots of them. NIST prototype uses a list of trusted sources, but they also consider the need for a "Safe-Python" ( Has anyone tried to figure out what needs to be excluded from Python to make it safe ? Do we have to restrict os.system, os.popen and most file open's, or can we just spawn another python running as Nobody ? Look at safe-tcl. )
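One way a trusted-source check might look, as a sketch only. How the NIST prototype actually does it is not described here; the names TRUSTED_HOSTS and is_trusted are invented, and the single host in the list is simply the one from the URLs earlier in this post.

```python
from urllib.parse import urlsplit

# Illustrative list only; a real one would be site configuration.
TRUSTED_HOSTS = {"www.eeel.nist.gov"}

def is_trusted(url, trusted=TRUSTED_HOSTS):
    """True if url is an http URL whose host is on the trusted list."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return parts.scheme == "http" and host in trusted

ok = is_trusted("http://www.eeel.nist.gov/python/netimport.html")
refused = is_trusted("http://unknown.example.org/spam.html")
```

A host list only says where the code came from; it does nothing to limit what the code can do once it runs, which is the Safe-Python question.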

Caching: How to maintain a local cache and determine when to fetch a new copy from the source.

* Forget about importing directly from remote source, and make the browser+server support a "save code-only to local file" . ( This also punts the Security concerns. )

* Use "Expires:" to advise on the estimated volatility of this module, and adjust caching strategy accordingly.

* Leave this to be handled by URN's, URC's and future protocols.
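For the "Expires:" option, the client-side test could be as small as the sketch below, assuming Python 3 and that the cache stores the header value next to each saved module. The function name is_fresh and the policy of refetching when the header is missing or unreadable are choices made for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def is_fresh(expires_header, now=None):
    """True if a cached copy with this Expires: value can still be used."""
    if not expires_header:
        return False                  # no advice from the server: refetch
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        return False                  # unparseable date: treat as stale
    if expires.tzinfo is None:
        expires = expires.replace(tzinfo=timezone.utc)   # assume GMT
    now = now or datetime.now(timezone.utc)
    return now < expires

stamp = "Thu, 04 Aug 1994 00:00:00 GMT"
before = is_fresh(stamp, now=datetime(1994, 8, 3, tzinfo=timezone.utc))
after = is_fresh(stamp, now=datetime(1994, 8, 5, tzinfo=timezone.utc))
```

A volatile module would be served with a near-term date and fetched often; a stable library module could carry a date months out and be read from the local cache.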

-- Steve Majewski (804-982-0831) <sdm7g@Virginia.EDU> --
-- UVA Department of Molecular Physiology and Biological Physics --
-- Box 449 Health Science Center Charlottesville,VA 22908 --