Actually, for most file formats I've encountered it's pretty easy for
the conversion program to infer the byte order used by a file from the
first few bytes of data (e.g. a magic number) -- presuming the
conversion program knows what the data is supposed to look like...
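
A minimal sketch of inferring byte order from a magic number, using TIFF as the example format (my choice for illustration, not one named above). A TIFF file starts with the two bytes "II" for little-endian or "MM" for big-endian data. The function name tiff_byte_order is hypothetical.

```python
import io

def tiff_byte_order(fp):
    """Return 'little' or 'big' based on the first two bytes of a TIFF file."""
    magic = fp.read(2)
    if magic == b"II":      # "Intel" order: least significant byte first
        return "little"
    if magic == b"MM":      # "Motorola" order: most significant byte first
        return "big"
    raise ValueError("not a TIFF header: %r" % magic)

# Usage with in-memory data standing in for real files.
little = tiff_byte_order(io.BytesIO(b"II*\x00"))
big = tiff_byte_order(io.BytesIO(b"MM\x00*"))
```

The user never has to say which byte order the file uses; the program only has to know which format it is looking at.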
In general users needn't know what byte order a file is in -- the hard
part is knowing what the format is in the first place. Maybe "magic"
(a replacement for file(1) written in Perl just posted on the net) can
help them with that -- although for me it just said
``Undefined subroutine "main'BufMatch" called at magic.pl line 429.''
>Actually, you have a point. I don't expect them to know what byte order
>is, but it is reasonable for them to know that what they are trying to
>do is convert IBM-PC files to Sparc/Unix files of some type. So I reject
>the idea of an option like 'dd's "conv=swab", but something like
>"-from pc" is reasonable. But I still want to keep the "low-level" code
>portable.
Given some table mapping architectures to byte orders you could add a
"-to sparc" option. But of course the "dd" example is naive: no file
format consists of arrays of words that can be byte-swapped like
that; in reality you will have to write code that understands the data
and knows when to swap words or longs or nothing.
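
A rough sketch of both ideas, in current Python and using the standard struct module: a table mapping architecture names to byte orders, and a converter that knows the layout of the data. The table contents, the record layout (a 16-bit id, a 32-bit count, a 4-byte tag) and the name convert_record are all assumptions made up for illustration.

```python
import struct

# Hypothetical table: architecture name -> struct byte-order prefix.
BYTE_ORDER = {"pc": "<", "sparc": ">"}

# Assumed record layout: 16-bit word, 32-bit long, 4 raw bytes.
RECORD_FORMAT = "HI4s"

def convert_record(data, src, dst):
    """Re-encode one record from the src architecture's order to dst's."""
    fields = struct.unpack(BYTE_ORDER[src] + RECORD_FORMAT, data)
    return struct.pack(BYTE_ORDER[dst] + RECORD_FORMAT, *fields)

pc_record = struct.pack("<HI4s", 1, 2, b"abcd")
sparc_record = convert_record(pc_record, "pc", "sparc")
```

The word and the long are swapped with their own widths and the 4-byte tag is left alone, which a blind "conv=swab" pass could not do.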
>Also, I WILL admit that *I* don't know the byte order of various machines
>off the top of my head. But I *do* know that there is a "network byte
>order" and conversions to/from that and native byte order.
Reading a word in network byte order is easy in Python:
    def rwordn(fp):
        s = fp.read(2)
        if len(s) < 2: raise EOFError
        hibyte, lobyte = ord(s[0]), ord(s[1])
        return hibyte<<8 | lobyte
Writing is of course just as easy. There's no need to know the byte order
used by Python here. This assumes network byte order is big-endian
(hibyte first), which is what the Internet specifies -- but who knows
what other networks use?
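
A possible writer, sketched in current Python and assuming a file object opened in binary mode. The name wwordn is hypothetical, chosen to mirror rwordn().

```python
import io

def wwordn(fp, value):
    """Write a 16-bit unsigned word in network (big-endian) byte order."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("word out of range")
    # High byte first, then low byte, regardless of the host's byte order.
    fp.write(bytes([value >> 8, value & 0xFF]))

buf = io.BytesIO()
wwordn(buf, 0x1234)
```

As with reading, only shifts and masks are used, so the result does not depend on the machine running the code.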
You can write similar routines rwordb() and rwordl() to read words in
big and little endian order -- rwordb() will of course be identical to
rwordn().
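
For illustration, the two routines might look like this in current Python, where indexing a bytes object already yields an integer, so no ord() call is needed. This is a sketch under that assumption, not the only way to write them.

```python
import io

def rwordb(fp):
    """Read a 16-bit word stored big-endian (high byte first)."""
    s = fp.read(2)
    if len(s) < 2:
        raise EOFError
    return s[0] << 8 | s[1]

def rwordl(fp):
    """Read a 16-bit word stored little-endian (low byte first)."""
    s = fp.read(2)
    if len(s) < 2:
        raise EOFError
    return s[1] << 8 | s[0]

big = rwordb(io.BytesIO(b"\x12\x34"))
little = rwordl(io.BytesIO(b"\x12\x34"))
```

The same two bytes give 0x1234 when read big-endian and 0x3412 when read little-endian.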
>Where pathnames need to be written into a program, then it is not
>unreasonable to expect the programmer to explicitly use a
>"path.to_native" conversion function, to indicate that he doesn't
>actually MIND if the pathname gets munged up ( truncated | character
>translated | etc. ) as long as there is a determinate mapping.
>( for practical purposes one-to-one )
I don't understand this. Pathnames hardcoded into programs are almost
always things like /etc/termcap, /usr/tmp or the equivalent of
$HOME/.mh_profile. How would you expect a path.to_native for Mac or
MS-DOS to translate these? You can *never* expect to be able to move
a program containing hardcoded pathnames to such a system without
having to edit them.
Filenames (i.e. no slash on UNIX) are a different matter, but even
there the choice of names usually has to be revised by a human being
when moving to a different O.S., e.g. names beginning with a . are
not usable on MS-DOS.
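
To make the MS-DOS case concrete, here is a deliberately simplified check of whether a name fits the 8.3 pattern (a base of 1 to 8 characters plus an optional extension of up to 3). It looks only at lengths and ignores the rules about which characters are allowed. The function name fits_dos_8_3 is hypothetical.

```python
def fits_dos_8_3(name):
    """Rough check: does name fit the MS-DOS 8.3 length pattern?"""
    base, _dot, ext = name.partition(".")
    if "." in ext:
        return False            # more than one dot in the name
    return 1 <= len(base) <= 8 and len(ext) <= 3

results = {n: fits_dos_8_3(n) for n in ("termcap", ".mh_profile", "README.TXT")}
```

A name like .mh_profile fails because the part before the dot is empty, so a human still has to pick a replacement name.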
>I think we are in agreement on:
>(1) There will be ( for example ) a 'path' module for unix|dos|mac|etc.
> that will attempt to hide or at least isolate machine differences.
>(2) That portable code should only need to import 'path', and not
> need to figure out which 'specific' module ( unixpath|macpath|dospath )
> it needs to load. [ I don't care what mechanics we choose to do this,
> as long as we can hide the machinery! ]
>(3) But searching for module dependencies and renaming files is not
> the preferred solution to the above. [ The machinery here is
> painfully visible, even if only visible to ONE person ( the site
> maintainer/installer of Python ). ]
Yes. Yes! YES!!!
I will try to put a lot of this into the next release, or at least make a
decent attempt (after all I routinely move Python code between a Mac
and UNIX so I have ample opportunity to test it in two totally
different environments).
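
One way the hidden machinery could look, sketched with the posixpath and ntpath modules from today's standard library standing in for the unixpath|macpath|dospath modules discussed above. The function pick_path_module is hypothetical; it only illustrates choosing the specific module in one place so that portable code never names it.

```python
import os
import ntpath
import posixpath

def pick_path_module(os_name=os.name):
    """Return the path-handling module that matches the given OS name."""
    if os_name == "nt":
        return ntpath
    return posixpath

# Portable code uses 'path' and never mentions the specific module.
path = pick_path_module()
joined = path.join("lib", "python")
```

Both modules offer the same functions (join, split, basename and so on), so callers stay identical across systems while only the selector knows about the difference.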
>[ BTW: I have managed to get my packet drivers working on my 486 PC,
> So I can telnet the python sources onto it. I have the Gnu C compiler
> PC port installed, So I hope to start porting Python *REAL SOON* ]
That's good news!
--Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"This is an ex-parrot"