Re: quickie filters

Guido van Rossum (Guido.van.Rossum@cwi.nl)
Wed, 08 Jul 1992 10:53:50 +0200

Lou Kates writes:
>This example seems sufficiently useful, in general, that it (or
>possibly a slightly enhanced version of it) should be added to
>the standard distributed Python library. It is useful not only
>for itself but also as a learning example of how you do this sort
>of thing in Python.

We are in almost complete agreement! Here's a (much enhanced) version
that supports more Perl options and parses the script only once. I
added the -F option in analogy to awk; how do you set the field
separator in Perl?
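
As a rough illustration of what -a and -F do to each input line, here is a
minimal sketch in present-day Python 3 (an assumption: the script below
uses the older string.split and string.splitfields calls instead). The
helper name split_fields is hypothetical and does not appear in pp.py.

```python
def split_fields(line, fs=''):
    """Split a line into fields roughly the way the -a option does.

    With no field separator, split on runs of whitespace; with one,
    split on every occurrence of that separator (empty fields are kept).
    """
    if fs:
        return line.split(fs)
    return line.split()


# Hypothetical sample lines, for illustration only.
print(split_fields('root:x:0:0', ':'))
print(split_fields('  a   b  c '))
```

Note that an explicit separator preserves empty fields, while the default
whitespace split does not.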

I've called it "pp.py" and (some version of) it will appear under this
name as a demo script in the next release (whenever that is). Adding
it to the library won't do much good since the Python library consists
of modules that you can import from other Python programs, while this
is a "script" that normally runs as a main program.

Nasty bug warning: calling it with the option string "-na" will cause
weird behavior if your Python interpreter has stdwin compiled in :-(.
This is because the X11 option processing will eat the "-na" (which it
thinks is an abbreviation for "-name") and the following argument.
Using "-n -a" will work fine.
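
For comparison, getopt itself treats the clustered and the separate
spellings the same way, so the difference described above comes only from
the X11 option processing seeing the arguments first. A small sketch,
assuming a present-day Python 3 getopt module and a hypothetical file name
input.txt:

```python
import getopt

OPTSTRING = 'ace:F:np'  # the same option string that pp.py passes to getopt

# 'input.txt' is a made-up file name used only for this illustration.
clustered, rest1 = getopt.getopt(['-na', 'input.txt'], OPTSTRING)
separate, rest2 = getopt.getopt(['-n', '-a', 'input.txt'], OPTSTRING)

print(clustered)
print(clustered == separate and rest1 == rest2)
```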

--Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"Now, let's get one thing *quite* clear. I most definitely told you!"

----------------------------- pp.py ------------------------------------
#! /usr/local/python

# Emulate some Perl command line options.
# Usage: pp [-a] [-c] [-e scriptline] [-F fieldsep] [-n] [-p] [file] ...
# Where the options mean the following:
# -a : together with -n or -p, splits each line into list F
# -c : check syntax only, do not execute any code
# -e scriptline : gives one line of the Python script; may be repeated
# -F fieldsep : sets the field separator for the -a option [not in Perl]
# -n : runs the script for each line of input
# -p : prints the line after the script has run
# When no script lines have been passed, the first file argument
# contains the script. With -n or -p, the remaining arguments are
# read as input to the script, line by line. If a file is '-'
# or missing, standard input is read.

import sys
import string
import getopt

FS = ''
SCRIPT = []
AFLAG = 0
CFLAG = 0
NFLAG = 0
PFLAG = 0

try:
    optlist, ARGS = getopt.getopt(sys.argv[1:], 'ace:F:np')
except getopt.error, msg:
    sys.stderr.write(sys.argv[0] + ': ' + msg + '\n')
    sys.exit(2)

for option, optarg in optlist:
    if option == '-a':
        AFLAG = 1
    elif option == '-c':
        CFLAG = 1
    elif option == '-e':
        for line in string.splitfields(optarg, '\n'):
            SCRIPT.append(line)
    elif option == '-F':
        FS = optarg
    elif option == '-n':
        NFLAG = 1
        PFLAG = 0
    elif option == '-p':
        NFLAG = 1
        PFLAG = 1
    else:
        print option, 'not recognized???'

if not ARGS: ARGS.append('-')

if not SCRIPT:
    if ARGS[0] == '-':
        fp = sys.stdin
    else:
        fp = open(ARGS[0], 'r')
    while 1:
        line = fp.readline()
        if not line: break
        SCRIPT.append(line[:-1])
    del fp
    del ARGS[0]
    if not ARGS: ARGS.append('-')

if CFLAG:
    prologue = ['if 0:']
    epilogue = []
elif NFLAG:
    # Note that it is on purpose that AFLAG and PFLAG are
    # tested dynamically each time through the loop
    prologue = [ \
        'LINECOUNT = 0', \
        'for FILE in ARGS:', \
        '\tif FILE == \'-\':', \
        '\t\tFP = sys.stdin', \
        '\telse:', \
        '\t\tFP = open(FILE, \'r\')', \
        '\tLINENO = 0', \
        '\twhile 1:', \
        '\t\tLINE = FP.readline()', \
        '\t\tif not LINE: break', \
        '\t\tLINENO = LINENO + 1', \
        '\t\tLINECOUNT = LINECOUNT + 1', \
        '\t\tL = LINE[:-1]', \
        '\t\taflag = AFLAG', \
        '\t\tif aflag:', \
        '\t\t\tif FS: F = string.splitfields(L, FS)', \
        '\t\t\telse: F = string.split(L)' \
        ]
    epilogue = [ \
        '\t\tif not PFLAG: continue', \
        '\t\tif aflag:', \
        '\t\t\tif FS: print string.joinfields(F, FS)', \
        '\t\t\telse: print string.join(F)', \
        '\t\telse: print L', \
        ]
else:
    prologue = ['if 1:']
    epilogue = []

# Note that we indent using tabs only, so that any indentation style
# used in 'command' will come out right after re-indentation.

program = string.joinfields(prologue, '\n') + '\n'
for line in SCRIPT:
    program = program + ('\t\t' + line + '\n')
program = program + (string.joinfields(epilogue, '\n') + '\n')

import tempfile
tfn = tempfile.mktemp()
try:
    fp = open(tfn, 'w')
    fp.write(program)
    fp.close()
    execfile(tfn)
finally:
    import os
    try:
        os.unlink(tfn)
    except:
        pass
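
The core trick in pp.py (wrapping the user's script lines in a generated
loop, then parsing the result once) can be sketched in present-day Python 3
roughly as follows. This is a simplified illustration under stated
assumptions, not the script above: it uses compile() and exec() on a string
instead of a temporary file and execfile, it handles only the per-line loop
(no -a, -c, -F or -p), and the names build_program, run and OUT are
hypothetical.

```python
import io


def build_program(script_lines):
    """Wrap user script lines in a generated per-line loop.

    The generated lines are indented with tabs only, so whatever
    indentation the user's own lines carry is simply appended after it.
    """
    prologue = [
        'for LINE in FP:',
        '\tL = LINE.rstrip("\\n")',
    ]
    body = ['\t' + line for line in script_lines]
    return '\n'.join(prologue + body) + '\n'


def run(script_lines, text):
    """Run the script lines once per line of text; return what they collected."""
    out = []
    # FP stands in for the input file; OUT is a hypothetical result list.
    namespace = {'FP': io.StringIO(text), 'OUT': out}
    code = compile(build_program(script_lines), '<pp>', 'exec')  # parsed once
    exec(code, namespace)
    return out


print(run(['OUT.append(L.upper())'], 'one\ntwo\n'))
```

Because the whole wrapped program is compiled a single time, the cost of
parsing the user's lines does not grow with the number of input lines.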