RE: dbm

Michael McLay (mclay@eeel.nist.gov)
Thu, 23 Mar 95 10:34:50 EST

Matthew Jones writes:
> Help please!
>
> I'm trying to create a simple database using the python binding to
> dbm. It fails to enter records after a certain point. The error I get
> is:
>
> Traceback (innermost last):
> File "./createdb.py", line 29, in ?
> db[`entry_key`] = `entry`
> dbm.error: Cannot add item to database
>
> and the database file size is enormous. The database seems to be
> putting lots of null byte padding between records. I tried using the
> gdbm module but couldn't find the gdbm.h file.
>
> Why is the database growing at an alarming rate, is the inter-record
> padding an error or typical?

You've hit a couple of dbm "features". Here's the relevant extract from
the dbm man page:

BUGS
The .pag file will contain holes so that its apparent size
is about four times its actual content. Older versions of
the UNIX operating system may create real file blocks for
these holes when touched. These files cannot be copied by
normal means (cp(1), cat(1V), tar(1), ar(1V)) without
filling in the holes.

The sum of the sizes of a key/content pair must not exceed
the internal block size (currently 1024 bytes). Moreover
all key/content pairs that hash together must fit on a
single block. store() will return an error in the event that a
disk block fills with inseparable data.
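
To check whether the "enormous" size is mostly holes, one can compare
the apparent length of the .pag file with the space actually allocated
for it. A minimal sketch in present-day Python, assuming a Unix-like
system where os.stat() reports st_blocks in 512-byte units. The
function name is made up for illustration:

```python
import os

def apparent_and_allocated(path):
    """Return (apparent size, bytes actually allocated) for a file.

    Assumes a Unix-like os.stat() that provides st_blocks,
    counted in 512-byte units.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512
```

If the second number is much smaller than the first for the .pag file,
the file is sparse and the padding is the hole behaviour described in
the man page rather than real data.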

The dbm.error is probably the result of exceeding the internal block
size: either a single key/content pair is too large, or too many pairs
hash to the same block. The remedy is to use either the gdbm module or
the db library from the BSD release. Both libraries remove the internal
block size limit. The BSD library has an interface that is backwards
compatible with dbm, but it differs in that it uses only one file to
store the database information. The BSD library is located at:

ftp://ftp.cs.berkeley.edu/ucb/4bsd/db.tar.Z
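
Until the script is moved to one of those libraries, oversized records
can at least be caught before the store fails. A rough sketch, assuming
the 1024-byte figure quoted from the man page; the usable space per
block may be somewhat smaller in practice, so treat it as an upper
bound. The names DBM_BLOCK_SIZE and fits_in_block are hypothetical:

```python
DBM_BLOCK_SIZE = 1024  # bytes; taken from the man page excerpt, not measured

def fits_in_block(key, value, block_size=DBM_BLOCK_SIZE):
    """True if the key/content pair is no larger than one dbm block.

    This only checks the first limit from the man page. It cannot
    detect the second one (several pairs hashing to the same block).
    """
    return len(key) + len(value) <= block_size
```

A record for which this returns false will certainly be rejected by
classic dbm. A record for which it returns true can still fail if its
block is already full of pairs that hash together.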

Some additional information on using the BSD library with Python is
available in ftp://ftp.eeel.nist.gov/pub/python/README.
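
For reference, a sketch of preferring gdbm when it is available. It
uses the module names from present-day Python, where the gdbm binding
is dbm.gnu; in the Python discussed in this thread the modules are
plain gdbm and dbm. The helper name open_db and the fallback to
dbm.dumb are choices made for illustration only:

```python
import dbm.dumb

try:
    import dbm.gnu as gdbm_module  # needs the gdbm library at build time
except ImportError:
    gdbm_module = None

def open_db(path):
    """Open a database for read/write, creating it if needed.

    Prefers gdbm, which does not have the classic dbm block limit.
    """
    if gdbm_module is not None:
        return gdbm_module.open(path, "c")
    # Fallback: pure-Python dbm.dumb, slow but always present.
    return dbm.dumb.open(path, "c")
```

With either backend a value of several kilobytes can be stored under
one key, which is the case that fails with classic dbm.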

-- 
Michael J. McLay
National Institute of Standards and Technology
Bld 220 Rm A357 (office), Bld 220 Rm B344 (mail)
Gaithersburg, Maryland 20899, (301)975-4099