Re: C header -> Python Module?

guido@CNRI.Reston.VA.US
Fri, 21 Apr 95 17:05:34 -0400

>Let's say I have a simple database library written in C and I would like
>to use it from Python. Is there a program to automate the glue code
>writing?
>
>I took a look at modulator.py which does something similar, but not
>quite what I want in this case.
>
>Actually, I made a stab at writing h2pymodule.py myself, but if
>somebody has already finished, I'd like to hear about it.
>
>I realize not everything can be made fully automatic, but much
>handwork could be avoided in all cases, and simple function calls
>could be 'glued' automatically.

I have been playing with this idea for a while now, and managed to
generate fairly complete modules for a number of Macintosh toolkits
(e.g. Windows, QuickDraw, Menus). The results of all this are
(I believe) in the full Python 1.2 distribution under Tools/bgen.

Unfortunately, it is by no means easy to do this fully automatically.

The bgen package contains a number of classes that generate boilerplate
code for functions and methods, as well as for other things. At the
heart of it all is a class 'FunctionGenerator' which can generate the various
pieces of C code that you need to implement a function interface.
It is instantiated with the function's return type, its name, and
a list of descriptors for the arguments. For instance, to interface
to the function

    int spam(const char *filename, int mode);

you would instantiate

    spam = FunctionGenerator(int, 'spam',
                             (stringptr, 'filename', InMode),
                             (int, 'mode', InMode))

and call something like

    spam.generate()

This would output the following C fragment:

    static PyObject *XXX_spam(_self, _args)
        PyObject *_self;
        PyObject *_args;
    {
        PyObject *_res = NULL;
        int _rv;
        char* filename;
        int mode;
        if (!PyArg_ParseTuple(_args, "si",
                              &filename,
                              &mode))
            return NULL;
        _rv = spam(filename,
                   mode);
        _res = Py_BuildValue("i",
                             _rv);
        return _res;
    }

Perhaps not the most beautiful code, but correct.

Note that the type names (int, stringptr) are objects too -- in fact
a type has a lot of freedom to specify exactly what code is generated
for variables of that type. You can add your own types too, and you
can subclass from the FunctionGenerator class to add quirks of your own.
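
To make the "types are objects" idea concrete, here is a minimal sketch in Python. It is not bgen's actual code: the names ToyType and ToyFunctionGenerator are hypothetical, it handles input-only arguments, and it emits an ANSI-style function header instead of the old-style one shown above. Each type object only knows how to declare a C variable and which format code to use, and that is already enough to produce a wrapper for spam().

```python
# Toy illustration only: these classes are hypothetical, NOT bgen's real code.

class ToyType:
    """Describes how one C type is declared and which format code it uses."""

    def __init__(self, c_decl, fmt):
        self.c_decl = c_decl  # C declaration prefix, e.g. "const char *"
        self.fmt = fmt        # format code for PyArg_ParseTuple/Py_BuildValue

    def declare(self, name):
        return "%s%s;" % (self.c_decl, name)


int_t = ToyType("int ", "i")
stringptr_t = ToyType("const char *", "s")


class ToyFunctionGenerator:
    def __init__(self, rettype, name, *args):
        self.rettype = rettype
        self.name = name
        self.args = args  # (type, name) pairs; all treated as input-only

    def generate(self, prefix="XXX"):
        names = [n for _, n in self.args]
        fmt = "".join(t.fmt for t, _ in self.args)
        refs = "".join(", &%s" % n for n in names)
        lines = [
            "static PyObject *%s_%s(PyObject *_self, PyObject *_args)"
            % (prefix, self.name),
            "{",
            "    " + self.rettype.declare("_rv"),
        ]
        lines += ["    " + t.declare(n) for t, n in self.args]
        lines += [
            '    if (!PyArg_ParseTuple(_args, "%s"%s))' % (fmt, refs),
            "        return NULL;",
            "    _rv = %s(%s);" % (self.name, ", ".join(names)),
            '    return Py_BuildValue("%s", _rv);' % self.rettype.fmt,
            "}",
        ]
        return "\n".join(lines)


spam = ToyFunctionGenerator(int_t, "spam",
                            (stringptr_t, "filename"), (int_t, "mode"))
wrapper_source = spam.generate()
print(wrapper_source)
```

Adding a type in this sketch is just another instance, for example ToyType("double ", "d"). A type that needs different parsing or cleanup code would override more than declare(), which is where most of the real work lies.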

There's also a module that scans a header file for function
prototypes and creates a list of generator calls -- potentially automating
the whole process. This is less likely to be perfect, since not all
information is available. Take for instance a function that specifies
a parameter of type int*. This could be an input or an output parameter
(or both), and it could be pointing to a single integer or to an array.
Often there are hints hidden in the prototype that help (e.g. the
parameter names) but these are not standardized. Therefore it is necessary
to customize this module for each header (or at least for each style)
that you want to translate.
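
As a rough sketch of what such a scanner has to do (this is not the bgen scanner; the regular expressions, the KNOWN_TYPES table and the scan() function are assumptions made up for illustration), the Python below reads one prototype per line, emits the text of a generator call when every type is unambiguous, and sets aside functions with parameters such as int* for a human to look at.

```python
import re

# Deliberately naive: one prototype per line; no function pointers,
# macros, or comments. A real header needs a scanner tuned to its style.
PROTOTYPE = re.compile(r"^\s*(.+?)\b(\w+)\s*\(([^()]*)\)\s*;\s*$")
PARAM = re.compile(r"^(.*?)(\w+)$")

# Hypothetical mapping from C type spellings to descriptor names;
# anything not listed here is left for a human to decide.
KNOWN_TYPES = {"int": "int", "const char *": "stringptr"}


def normalize(ctype):
    """Make 'const char*' and 'const  char *' compare equal."""
    return " ".join(ctype.replace("*", " * ").split())


def scan(header_text):
    """Return (generator_calls, needs_review) for the prototypes found."""
    calls, needs_review = [], []
    for line in header_text.splitlines():
        match = PROTOTYPE.match(line)
        if not match:
            continue
        rettype, name, params = match.groups()
        rettype = normalize(rettype)
        ok = rettype in KNOWN_TYPES
        args = []
        for param in params.split(","):
            param = param.strip()
            if param in ("", "void"):
                continue
            pmatch = PARAM.match(param)
            if pmatch is None:
                ok = False  # unnamed pointer, "...", and similar
                continue
            ctype, pname = normalize(pmatch.group(1)), pmatch.group(2)
            if ctype in KNOWN_TYPES:
                args.append("(%s, %r, InMode)" % (KNOWN_TYPES[ctype], pname))
            else:
                ok = False  # e.g. int *: input, output, or array?
        if ok:
            parts = [KNOWN_TYPES[rettype], repr(name)] + args
            calls.append("FunctionGenerator(%s)" % ", ".join(parts))
        else:
            needs_review.append(name)
    return calls, needs_review


HEADER = """\
int spam(const char *filename, int mode);
int count_rows(int *count);
"""
calls, needs_review = scan(HEADER)
print(calls)
print(needs_review)
```

Here spam() is translated automatically, while count_rows() lands in needs_review because nothing in the prototype says whether count is an input, an output, or an array. Per-header customization would mostly consist of replacing that "give up" branch with rules based on the naming conventions of the library at hand.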

Finally, note that this is not finished work. I have tried it on a couple
of Mac modules only, which have a fairly consistent style, and even there
I found that I needed to change the base class to add more flexibility
almost with each new module I tried to attack. Yet, if you need to
interface to a large library whose interface is given as a bunch of .h
files, it's probably more cost-effective to spend some time on
customizing a scanner for its header file style than it would be to
manually write all those calls to FunctionGenerator, let alone writing
all the C code by hand!

--Guido van Rossum <guido@CNRI.Reston.VA.US>
URL: http://www.cwi.nl/~guido/