Re: Reading binary data from a file.

Guido.van.Rossum@cwi.nl
Wed, 03 Nov 1993 10:45:55 +0100

> Just found myself wanting to do this:
>
> Integer = f.read(4)
>
> Yes. I have an integer (in binary) on a file and I know that my
> machine representation of this integer is 4 bytes. So I want to read
> these 4 bytes and convert them into a python integer. Is there some
> support for this or do I have to do a conversion function?

There are two possible ways, each using a different (optional, but
highly recommended) built-in module:

(1) Using module struct, you can convert almost any binary data (in
your host's format and byte order) to Python. In this case it would
be:

    import struct
    (Integer,) = struct.unpack('l', f.read(4))

This is the appropriate method if you are reading something like a
header structure; the format string can specify a list of types
describing the header, e.g. 'hhl' would be two shorts and a long,
converted to a tuple of three Python integers.
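
As a rough sketch of the header case, the following reads an 'hhl'
header in one call. The field names (version, flags, length) and the
read_header() helper are made up for illustration, and io.BytesIO
stands in for a real file opened in binary mode. It uses
struct.calcsize() to find out how many bytes the format occupies on
the host, since native alignment may add padding between the fields.

```python
import io
import struct

# Hypothetical header layout: two shorts followed by a long, native format.
HEADER_FORMAT = 'hhl'
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # includes any native padding


def read_header(f):
    """Read one header from binary file object f; return a tuple of 3 ints."""
    data = f.read(HEADER_SIZE)
    if len(data) != HEADER_SIZE:
        raise EOFError('short read while reading header')
    return struct.unpack(HEADER_FORMAT, data)


# Stand-in for a real file: pack a header into an in-memory buffer.
f = io.BytesIO(struct.pack(HEADER_FORMAT, 1, -2, 100000))
version, flags, length = read_header(f)
```

The three names on the left-hand side receive the three items of the
tuple in format order.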

Notes:

(a) struct.unpack() always returns a tuple, hence the funny left-hand
side of the assignment

(b) the choice of a format letter is machine dependent -- 'l' is a
machine 'long' (usually 4 bytes, but it can be 8, e.g. on a DEC Alpha
or HP snake); 'i' is a machine 'int' (usually 4 bytes, but it can be
2, e.g. on a Mac or PC)

(c) there is currently no way to specify an alternate byte order
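
To make notes (a) and (b) concrete, here is a small sketch that asks
the struct module for the host's sizes instead of assuming 4 bytes.
The variable names are only illustrative, and the packed string
stands in for bytes read from a file.

```python
import struct

# Sizes come from the host's C compiler, so they differ between machines.
int_size = struct.calcsize('i')
long_size = struct.calcsize('l')

# Stand-in for f.read(long_size): one native long in binary form.
data = struct.pack('l', 42)

# unpack() returns a tuple even for a single item ...
result = struct.unpack('l', data)

# ... so a one-element tuple target pulls the integer out.
(value,) = result
```

If long_size turns out not to be 4 on your machine, 'l' is the wrong
letter for a 4-byte integer on file, and 'i' may be the one to try.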

(2) Using module array, you can efficiently read arrays of simple data
types:

    import array
    a = array.array('l')
    a.read(f, 1)
    Integer = a[0]

This is the appropriate method if you want to read many items of the
same type, e.g. to read 1000 integers in one fell swoop, you would
use a.read(f, 1000). The same remark about the choice of format
letter applies; but there is a method to byte swap an entire array
(a.byteswap()) -- currently undocumented.
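
A sketch of the bulk read and the byte swap follows. It assumes a
Python 3 interpreter, where the a.read(f, n) method is spelled
a.fromfile(f, n); io.BytesIO again stands in for a real binary file,
and the variable names are only illustrative.

```python
import array
import io

# Stand-in for a binary file holding 1000 native longs.
source = array.array('l', range(1000))
f = io.BytesIO(source.tobytes())

# Read all 1000 items in one call.
a = array.array('l')
a.fromfile(f, 1000)      # Python 3 spelling of a.read(f, 1000)
first, last = a[0], a[-1]

# byteswap() reverses the bytes of every item in place. Work on a copy
# here so that 'a' keeps the values as read.
swapped = array.array('l', a)
swapped.byteswap()
```

You would call byteswap() on the array itself when the file was
written on a machine with the opposite byte order.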

--Guido van Rossum, CWI, Amsterdam <Guido.van.Rossum@cwi.nl>