Re: Ideas about enhancements to fileobjects

Consultant (amrit@xvt.com)
Tue, 23 Nov 1993 13:18:12 -0700 (MST)

>
> I seem to be using python to do a lot of file scanning.
> Anyway, I want my file scanning programs to be fast and able to produce
> useful error messages, so I whipped up a pseudo-subclass of fileobject
> (this means I'm prototyping in python) that gives me line-oriented input
> useful for line-by-line file scanning.
>
> I added two attributes: a current line number and the file name, that make
> it much easier for parsers to produce useful error messages (yeah, you can
> write a single parser that keeps track of line numbers, but wait until you
> try to write a parser class), and I added a number of methods for
> looking ahead at lines in the file.
>
> f.peekline() - peek the next unpeeked line
> f.peeklinen() - peek the nth unread line
> f.peekreset() - make all unread lines unpeeked
>

The simplest thing to do may be to read the file in all at once and
maintain a map of line numbers and file lines. (This is only practical,
of course, if the file is not too big.)

This is a fragment of code from a Turing machine simulator I wrote, which
I now cut and paste into other file-processing programs:

def load(input):
    ....
    lines = input.readlines()
    for lineno, line in bagof(lambda pair: len(pair[1]) > 0,
                              map(lambda x,y: (x+1, strip(y)),
                                  range(len(lines)), lines)):

        if re.match(line) == -1:
            raise ValueError, `lineno` + ': Bad syntax for state transition'

        state, char, new_state, operation = re.group(1, 2, 3, 4)
        program[(eval(state), char)] = (eval(new_state), operation)

(The code here uses functions that will be in Python 1.0)
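
For anyone reading this with a current Python 3, the same loop might be
sketched roughly as below. The pattern used in the fragment above is not
shown, so TRANSITION is a hypothetical stand-in (state, character, new state,
operation, separated by whitespace), and int() is assumed in place of eval()
on the guess that states are plain integers. enumerate() and a blank-line
check take over from the map()/bagof() pair.

```python
import re

# Hypothetical pattern: the one used in the fragment above is not shown.
TRANSITION = re.compile(r"(\d+)\s+(\S)\s+(\d+)\s+(\S+)$")

def load(input_file):
    """Return {(state, char): (new_state, operation)} from an open text file."""
    program = {}
    for lineno, raw in enumerate(input_file.readlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip empty lines; they still count toward lineno
        m = TRANSITION.match(line)
        if m is None:
            raise ValueError("%d: Bad syntax for state transition" % lineno)
        state, char, new_state, operation = m.groups()
        # Assumption: states are plain integers, so int() stands in for eval().
        program[(int(state), char)] = (int(new_state), operation)
    return program
```

Because empty lines are skipped inside the loop rather than removed
beforehand, the number reported in the error message is still the line's
position in the original file.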

If the variable lines contains ['a', ' ', 'c'], then the subexpression
'map(lambda x,y: (x+1, strip(y)) ...)' turns it into a list of
(line number, line content) tuples: [(1, 'a'), (2, ''), (3, 'c')].
The bagof() operation then removes the empty lines, so the final list
is [(1, 'a'), (3, 'c')].
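
The two steps can be reproduced in a current Python 3 as a small sketch.
It assumes enumerate() for the numbering and a list comprehension in the
role that bagof() plays above; the names numbered and nonempty are only
for illustration.

```python
lines = ['a', ' ', 'c']

# Step 1: pair each stripped line with its 1-based line number.
numbered = [(lineno, line.strip())
            for lineno, line in enumerate(lines, start=1)]

# Step 2: keep only the pairs whose text is non-empty.
nonempty = [pair for pair in numbered if len(pair[1]) > 0]

print(numbered)   # [(1, 'a'), (2, ''), (3, 'c')]
print(nonempty)   # [(1, 'a'), (3, 'c')]
```

Numbering before filtering is what lets line 'c' keep the number 3 even
though line 2 is dropped.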

Here a list of tuples is appropriate since we just use these for iteration;
for random access to lines in a file it would be more appropriate to maintain
a mapping of line numbers to text strings.
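
A minimal sketch of that mapping, again assuming a current Python 3; the
helper name index_lines is hypothetical and not part of any library.

```python
def index_lines(lines):
    """Map 1-based line numbers to stripped, non-empty line text."""
    return {lineno: text.strip()
            for lineno, text in enumerate(lines, start=1)
            if text.strip()}

by_number = index_lines(['a\n', ' \n', 'c\n'])
print(by_number[3])       # look up a line directly by its number
print(by_number.get(2))   # None: line 2 was empty and is not stored
```

A dictionary is used rather than a list because the surviving line numbers
have gaps once the empty lines are removed.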

(I realize you don't have bagof(), map(), etc. Send mail to me or Guido if
you are interested in this approach.)