Re: Ideas about enhancements to fileobjects

jredford@lehman.com
Tue, 23 Nov 93 09:37:16 -0500

Some comments on the comments.

>> filename: good. It's saved already so should be accessible.

There is no guarantee that this file has not been moved, removed,
replaced with another file or link, or otherwise modified.

This is data you had when you made the file. If you want it around,
you should keep it from then on. Pass around a tuple with the file
name & the file object; don't try to put the name & other application
data into the object. Next someone will want the address & port of a
socket to be part of the object.
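
A minimal sketch of the tuple approach, written against present-day
Python 3 rather than the interpreter current as of this message. The
helper names open_named and first_line_with_name are hypothetical,
made up for illustration only.

```python
def open_named(path, mode="r"):
    """Open path and return (name, file object) as a plain tuple.

    Hypothetical helper: the caller keeps the name it used at open
    time; nothing is stored on the file object itself.
    """
    return path, open(path, mode)


def first_line_with_name(named_file):
    """Example consumer: unpack the tuple and label the first line."""
    name, f = named_file
    return "%s: %s" % (name, f.readline().rstrip("\n"))
```

The name in the tuple is only what was passed to open(); it says
nothing about what the path points to later.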

>> lineno: useful, but has one problem: it can't always be correct.
>> Keeping it up-to-date after read() is possible but may slow read() of
>> large files down a bit; keeping it up-to-date after seek() is
>> (realistically speaking) impossible. And a minor detail: should it
>> represent the number of lines read so far or the number of the next
>> line?
>>
>> Suggestion: make it a writable attribute, initialized to 0; set to -1
>> by seek(); if it's -1, it's left unchanged by read() and readline();
>> if >= 0, readline() bumps it by 1, read() bumps it by the number of \n
>> characters in the string read. Well, why not do the same for
>> writeline() and write()... Finally, initialize it to -1 when the file
>> is opened with mode 'rb' or 'wb'. I suggest that the filename be made
>> a writable attribute as well -- might be useful to cheat etc.

This is completely untrustworthy. If you want to count the number of
'\n's you have read, that's fine, but that doesn't prevent someone from
inserting more into the top of the file. If you want a number that
equals the number of times you call readline(), that's easy enough to
keep on your own.
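
Keeping that count yourself might look roughly like the sketch below
(present-day Python 3; the class name CountingReader and its lineno
attribute are hypothetical). As an assumption of this sketch, it
counts only readline() calls that actually returned data, so hitting
end of file does not bump the number.

```python
class CountingReader:
    """Wrap a file-like object and count the lines handed out."""

    def __init__(self, f):
        self.f = f
        self.lineno = 0  # lines returned so far by readline()

    def readline(self):
        line = self.f.readline()
        if line:  # an empty string means end of file: do not count it
            self.lineno += 1
        return line
```

The count lives in the application, and the file object is left
alone.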

>> peek functions: I'm less convinced that this is worth the additional
>> complexity -- and I've a feeling that it might encourage bad style (oh
>> there he goes again I hear some of you thinking :-). On the other
>> hand it might be a good idea. I've a suggestion for a slightly
>> different style of interface: f.peekline() would return the next
>> unpeeked line and f.peekline(n) would return the n'th line (counting
>> from 0, obviously). I don't see when f.peekreset() would be necessary
>> -- for definiteness, code should always use f.peekline(n) if there may
>> be different pieces of code peeking in the same file. Maybe
>> f.peekline() should mean f.peekline(n+1) when called after
>> f.peekline(n) if I understand correctly how you would use this most of
>> the time.

I don't think this has any redeeming aspect. 'peek' semantics are not
guaranteed past 1 character. Peeking a regular file makes no sense. If
you want to read the next 2 lines and then seek back to where you were,
do that. Or open 2 file descriptors & use one for read-ahead. Using
these peek functions on a file that represented a socket would be a
minor nightmare, as it would break any other dup'd readers of the
socket.

Oh, and this would definitely encourage bad style. _Using_ it is bad
style.
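
For a regular, seekable file, the read-then-seek-back alternative
mentioned above can be sketched as follows (present-day Python 3; the
function name read_ahead is hypothetical). It relies only on tell(),
readline() and seek(), and it will not work on a socket or pipe.

```python
def read_ahead(f, n):
    """Return up to n upcoming lines, then restore the file position.

    Assumes f is seekable (a regular file or an in-memory buffer).
    """
    pos = f.tell()          # remember where we are
    lines = []
    for _ in range(n):
        line = f.readline()
        if not line:        # end of file reached early
            break
        lines.append(line)
    f.seek(pos)             # go back to the remembered position
    return lines
```

After the call, the next readline() returns the same line it would
have returned had read_ahead never been called.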

This mostly looks like cruft that would slow down files just to make
some applications slightly easier. Parsers aren't really the kind of
thing one expects to write more than once, if that, and it isn't
supposed to be trivial even then.

Speaking of such things, is there, or has anyone considered adding,
an M3-style Sx module?

--
John Redford (AKA GArrow) | 3,600 hours of tape.
jredford@lehman.com       | 5 cans of Scotchguard.