PROPOSAL: A Generic Python Object Interface for Python C Modules

Jim Fulton (jfulton@dsjfqvarsa.er.usgs.GOV)
Mon, 27 Feb 1995 18:21:43 GMT

Problem

Python modules written in C that must access Python objects must do
so through routines whose interfaces are described by a set of
include files. Unfortunately, these routines vary according to the
object accessed. To use these routines, the C programmer must check
the type of the object being used and must call a routine based on
the object type. For example, to access an element of a sequence,
the programmer must determine whether the sequence is a list or a
tuple:

if(is_tupleobject(o))
  e=gettupleitem(o,i);
else if(is_listobject(o))
  e=getlistitem(o,i);

If the programmer wants to get an item from another type of object
that provides sequence behavior, there is no clear way to do it
correctly.

The persistent programmer may peruse object.h and find that the
_typeobject structure provides a means of invoking up to (currently
about) 41 special operators. So, for example, a routine can get an
item from any object that provides sequence behavior. However, to
use this mechanism, the programmer must make their code dependent on
the current Python implementation.

Also, certain semantics, especially memory management semantics, may
differ by the type of object being used. Unfortunately, these
semantics are not clearly described in the current include files.
An abstract interface providing more consistent semantics is needed.
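
To make the contrast concrete, here is a minimal sketch of type-independent
item access. It does not use the real object.h structures: demo_object,
demo_type, the sq_item slot and demo_sequence_getitem are hypothetical
stand-ins invented for illustration. The point is only that a generic routine
can dispatch through a per-type table instead of testing for each concrete
type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the interpreter's structures. */
typedef struct demo_object demo_object;

typedef struct {
    const char *name;
    /* Sequence slot: NULL when the type has no sequence behavior. */
    demo_object *(*sq_item)(demo_object *o, int i);
} demo_type;

struct demo_object {
    const demo_type *type;
    int nitems;
    demo_object **items;
};

/* One item routine shared by the two demo sequence types. */
static demo_object *demo_array_item(demo_object *o, int i)
{
    if (i < 0 || i >= o->nitems)
        return NULL;            /* index out of range */
    return o->items[i];
}

static const demo_type demo_tuple_type = { "tuple", demo_array_item };
static const demo_type demo_list_type  = { "list",  demo_array_item };
static const demo_type demo_int_type   = { "int",   NULL };

/* Generic accessor: no per-type tests are needed in the caller. */
demo_object *demo_sequence_getitem(demo_object *o, int i)
{
    assert(o != NULL);
    if (o->type->sq_item == NULL)
        return NULL;            /* operation not provided by this type */
    return o->type->sq_item(o, i);
}
```

A new sequence-like type only has to fill in its own sq_item slot; callers of
demo_sequence_getitem do not change.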

Proposal

I propose the creation of a standard interface (with an associated
library of routines and/or macros) for generically obtaining the
services of Python objects. This proposal can be viewed as one
component of a Python C interface consisting of several components.

From the viewpoint of C access to Python services, we have (as
suggested by Guido in off-line discussions):

- "Very high level layer": two or three functions that let you exec or
eval arbitrary Python code given as a string in a module whose name is
given, passing C values in and getting C values out using
mkvalue/getargs style format strings. This does not require the user
to declare any variables of type "PyObject *". This should be enough
to write a simple application that gets Python code from the user,
execs it, and returns the output or errors. (Error handling must also
be part of this API.)

- "Abstract objects layer": which is the subject of this proposal.
It has many functions operating on objects, and lets you do many
things from C that you can also write in Python, without going
through the Python parser.

- "Concrete objects layer": This is the public type-dependent
interface provided by the standard built-in types, such as floats,
strings, and lists. This interface exists and is currently
documented by the collection of include files provided with the
Python distributions.

From the point of view of Python accessing services provided by C
modules:

- "Python module interface": this interface consists of the basic
routines used to define modules and their members. Most of the
current extensions-writing guide deals with this interface.

- "Built-in object interface": this is the interface that a new
built-in type must provide and the mechanisms and rules that a
developer of a new built-in type must use and follow.

This proposal is a "first-cut" that is intended to spur
discussion. See especially the lists of notes.

The Python C object interface will provide four protocols: object,
numeric, sequence, and mapping. Each protocol consists of a
collection of related operations. If an operation that is not
provided by a particular type is invoked, then a standard exception,
NotImplementedError, is raised with an operation name as an argument.
In addition, for convenience this interface defines a set of
constructors for building objects of built-in types. This is needed
so new objects can be returned from C functions that otherwise treat
objects generically.

Memory Management

For all of the functions described in this proposal, if a function
retains a reference to a Python object passed as an argument, then the
function will increase the reference count of the object. It is
unnecessary for the caller to increase the reference count of an
argument in anticipation of the object's retention.

All Python objects returned from functions should be treated as new
objects. Functions that return objects assume that the caller will
retain a reference and the reference count of the object has already
been incremented to account for this fact. A caller that does not
retain a reference to an object that is returned from a function
must decrement the reference count of the object (using
DECREF(object)) to prevent memory leaks.
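
The two rules above can be sketched with a toy reference-counted object.
Everything here (demo_object, DEMO_INCREF, DEMO_DECREF, demo_cell and the
functions) is hypothetical and stands in for the real interpreter types and
macros; it is only meant to show which side adjusts the count in each case.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal object: a reference count and a payload. */
typedef struct {
    int refcnt;
    long value;
} demo_object;

static int demo_live_objects = 0;   /* objects allocated and not yet freed */

#define DEMO_INCREF(o) ((o)->refcnt++)
#define DEMO_DECREF(o) \
    do { if (--(o)->refcnt == 0) { free(o); demo_live_objects--; } } while (0)

/* Constructor: the result is a new reference owned by the caller. */
demo_object *demo_int_fromlong(long v)
{
    demo_object *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->refcnt = 1;
    o->value = v;
    demo_live_objects++;
    return o;
}

/* A container that retains its argument takes its own reference. */
typedef struct { demo_object *slot; } demo_cell;

void demo_cell_set(demo_cell *c, demo_object *v)
{
    assert(c != NULL && v != NULL);
    DEMO_INCREF(v);                 /* the callee accounts for retention */
    if (c->slot != NULL)
        DEMO_DECREF(c->slot);       /* release the value being replaced */
    c->slot = v;
}

/* Caller side: create, hand over, then drop the caller's own reference. */
int demo_store_and_release(demo_cell *c, long v)
{
    demo_object *o = demo_int_fromlong(v);
    if (o == NULL)
        return 0;
    demo_cell_set(c, o);
    DEMO_DECREF(o);                 /* caller does not keep a reference */
    return 1;
}
```

After demo_store_and_release returns, the only remaining reference is the one
held by the cell, so nothing leaks and nothing is freed early.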

Note that the behavior mentioned here is different from the current
behavior for some objects (e.g. lists and tuples) when certain
type-specific routines are called directly (e.g. setlistitem). The
proposed abstraction layer will provide a consistent memory
management interface, correcting for inconsistent behavior for some
built-in types.

See note 7.

Protocols

Object Protocol:
1
int PyObject_Print(PyObject *o, FILE *fp, int flags)

Print an object, o, on file, fp. Returns 1 on success, 0 on
error. The flags argument is used to enable certain printing
options. The only option currently supported is Py_Print_RAW.

(What should be said about Py_Print_RAW?)

int PyObject_HasAttrString(PyObject *o, char *attr_name)

Returns 1 if o has the attribute attr_name, and 0 otherwise.
This is equivalent to the Python expression:
hasattr(o,attr_name).

This function always succeeds.

PyObject* PyObject_AttrString(PyObject *o, char *attr_name)

Retrieve an attribute named attr_name from object o.
Returns the attribute value on success, or NULL on failure.
This is the equivalent of the Python expression: o.attr_name.

int PyObject_HasAttr(PyObject *o, PyObject *attr_name)

Returns 1 if o has the attribute attr_name, and 0 otherwise.
This is equivalent to the Python expression:
hasattr(o,attr_name).

This function always succeeds.

PyObject* PyObject_Attr(PyObject *o, PyObject *attr_name)

Retrieve an attribute named attr_name from object o.
Returns the attribute value on success, or NULL on failure.
This is the equivalent of the Python expression: o.attr_name.

1
int PyObject_SetAttrString(PyObject *o, char *attr_name, PyObject *v)

Set the value of the attribute named attr_name, for object o,
to the value, v. Returns 1 on success, 0 on failure. This is
the equivalent of the Python statement: o.attr_name=v.

1
int PyObject_SetAttr(PyObject *o, PyObject *attr_name, PyObject *v)

Set the value of the attribute named attr_name, for object o,
to the value, v. Returns 1 on success, 0 on failure. This is
the equivalent of the Python statement: o.attr_name=v.

1
int PyObject_DelAttrString(PyObject *o, char *attr_name)

Delete attribute named attr_name, for object o. Returns 1 on
success, 0 on failure. This is the equivalent of the Python
statement: del o.attr_name.

1
int PyObject_DelAttr(PyObject *o, PyObject *attr_name)

Delete attribute named attr_name, for object o. Returns 1 on
success, 0 on failure. This is the equivalent of the Python
statement: del o.attr_name.

2
int PyObject_Cmp(PyObject *o1, PyObject *o2, int *result)

Compare the values of o1 and o2 using a routine provided by
o1, if one exists, otherwise with a routine provided by o2.
The result of the comparison is returned in result. Returns
1 on success, 0 on failure. This is the equivalent of the Python
statement: result=cmp(o1,o2).

4
int PyObject_Compare(PyObject *o1, PyObject *o2)

Compare the values of o1 and o2 using a routine provided by
o1, if one exists, otherwise with a routine provided by o2.
Returns the result of the comparison on success. On error,
the value returned is undefined. This is equivalent to the
Python expression: cmp(o1,o2).

PyObject *PyObject_Repr(PyObject *o)

Compute the string representation of object, o. Returns the
string representation on success, NULL on failure. This is
the equivalent of the Python expression: repr(o).

Called by the repr() built-in function and by reverse quotes.

PyObject *PyObject_Str(PyObject *o)

Compute the string representation of object, o. Returns the
string representation on success, NULL on failure. This is
the equivalent of the Python expression: str(o).

Called by the str() built-in function and by the print
statement.

int PyCallable_Check(PyObject *o)

Determine if the object, o, is callable. Return 1 if the
object is callable and 0 otherwise.

This function always succeeds.

PyObject *PyObject_CallObject(PyObject *callable_object, PyObject *args)

Call a callable Python object, callable_object, with
arguments given by the tuple, args. If no arguments are
needed, then args may be NULL. Returns the result of the
call on success, or NULL on failure. This is the equivalent
of the Python expression: apply(callable_object,args).

2
int PyObject_Spam(PyObject *o, long *hash_value)

Compute the hash, hash_value, of an object, o. Returns 1 on
success, 0 on error. This function is similar to
PyObject_Hash, but smells better. This is the equivalent of
the Python statement: hash_value=hash(o).

4
long PyObject_Hash(PyObject *o)

Compute and return the hash, hash_value, of an object, o. On
failure, return -1. This is the equivalent of the Python
expression: hash(o).

4
int PyObject_IsTrue(PyObject *o)

Returns 1 if the object, o, is considered to be true, and
0 otherwise. This is equivalent to the Python expression:
not not o

This function always succeeds.

PyObject *PyObject_Type(PyObject *o)

On success, returns a type object corresponding to the object
type of object o. On failure, returns NULL. This is
equivalent to the Python expression: type(o).

int PyObject_Length(PyObject *o)

Return the length of object o. If the object, o, provides
both sequence and mapping protocols, the sequence length is
returned. On error, -1 is returned. This is equivalent
to the Python expression: len(o).

PyObject *PyObject_GetItem(PyObject *o, PyObject *key)

Return element of o corresponding to the object, key, or NULL
on failure. This is the equivalent of the Python expression:
o[key].

1
int PyObject_SetItem(PyObject *o, PyObject *key, PyObject *v)

Map the object, key, to the value, v. The reference count of
v is unchanged, however, the reference count of the object
that v replaces is decremented. (Does key replace an object?)
Returns 1 on success, 0 on failure. This is the equivalent
of the Python statement: o[key]=v.

Number Protocol:

int PyNumber_Check(PyObject *o)

Returns 1 if the object, o, provides numeric protocols, and
0 otherwise.

This function always succeeds.

3
PyObject *PyNumber_Add(PyObject *o1, PyObject *o2)

Returns the result of adding o1 and o2, or NULL on failure.
This is the equivalent of the Python expression: o1+o2.

3
PyObject *PyNumber_Subtract(PyObject *o1, PyObject *o2)

Returns the result of subtracting o2 from o1, or NULL on
failure. This is the equivalent of the Python expression:
o1-o2.

3
PyObject *PyNumber_Multiply(PyObject *o1, PyObject *o2)

Returns the result of multiplying o1 and o2, or NULL on
failure. This is the equivalent of the Python expression:
o1*o2.

3
PyObject *PyNumber_Divide(PyObject *o1, PyObject *o2)

Returns the result of dividing o1 by o2, or NULL on failure.
This is the equivalent of the Python expression: o1/o2.

3
PyObject *PyNumber_Remainder(PyObject *o1, PyObject *o2)

Returns the remainder of dividing o1 by o2, or NULL on
failure. This is the equivalent of the Python expression:
o1%o2.

3
PyObject *PyNumber_Divmod(PyObject *o1, PyObject *o2)

See the built-in function divmod. Returns NULL on failure.
This is the equivalent of the Python expression:
divmod(o1,o2).

3
PyObject *PyNumber_Power(PyObject *o1, PyObject *o2, PyObject *o3)

See the built-in function pow. Returns NULL on failure.
This is the equivalent of the Python expression:
pow(o1,o2,o3), where o3 is optional.

PyObject *PyNumber_Negative(PyObject *o)

Returns the negation of o on success, or NULL on failure.
This is the equivalent of the Python expression: -o.

PyObject *PyNumber_Positive(PyObject *o)

Returns the (what?) of o on success, or NULL on failure.
This is the equivalent of the Python expression: +o.

PyObject *PyNumber_Absolute(PyObject *o)

Returns the absolute value of o, or NULL on failure. This is
the equivalent of the Python expression: abs(o).

PyObject *PyNumber_Invert(PyObject *o)

Returns the bitwise negation of o on success, or NULL on
failure. This is the equivalent of the Python expression:
~o.

3
PyObject *PyNumber_Lshift(PyObject *o1, PyObject *o2)

Returns the result of left shifting o1 by o2 on success, or
NULL on failure. This is the equivalent of the Python
expression: o1 << o2.

3
PyObject *PyNumber_Rshift(PyObject *o1, PyObject *o2)

Returns the result of right shifting o1 by o2 on success, or
NULL on failure. This is the equivalent of the Python
expression: o1 >> o2.

PyObject *PyNumber_And(PyObject *o1, PyObject *o2)

Returns the bitwise and of o1 and o2 on success, or NULL
on failure. This is the equivalent of the Python
expression: o1&o2.

3
PyObject *PyNumber_Xor(PyObject *o1, PyObject *o2)

Returns the bitwise exclusive or of o1 and o2 on success, or
NULL on failure. This is the equivalent of the Python
expression: o1^o2.

3
PyObject *PyNumber_Or(PyObject *o1, PyObject *o2)

Returns the bitwise or of o1 and o2 on success, or NULL on
failure. This is the equivalent of the Python expression:
o1|o2.

PyObject *PyNumber_Coerce(PyObject *o1, PyObject *o2)

On success, returns a tuple containing o1 and o2 converted to
a common numeric type, or None if no conversion is possible.
Returns NULL on failure. This is equivalent to the Python
expression: coerce(o1,o2).

PyObject *PyNumber_Int(PyObject *o)

Returns o converted to an integer object on success, or
NULL on failure. This is the equivalent of the Python
expression: int(o).

PyObject *PyNumber_Long(PyObject *o)

Returns o converted to a long integer object on success,
or NULL on failure. This is the equivalent of the Python
expression: long(o).

PyObject *PyNumber_Float(PyObject *o)

Returns o converted to a float object on success, or NULL
on failure. This is the equivalent of the Python expression:
float(o).

Sequence protocol:

int PySequence_Check(PyObject *o)

Return 1 if the object provides sequence protocol, and zero
otherwise.

This function always succeeds.

PyObject *PySequence_Concat(PyObject *o1, PyObject *o2)

Return the concatenation of o1 and o2 on success, and NULL on
failure. This is the equivalent of the Python
expression: o1+o2.

PyObject *PySequence_Repeat(PyObject *o, int count)

Return the result of repeating sequence object o count times,
or NULL on failure. This is the equivalent of the Python
expression: o*count.

PyObject *PySequence_GetItem(PyObject *o, int i)

Return the ith element of o, or NULL on failure. This is the
equivalent of the Python expression: o[i].

PyObject *PySequence_GetSlice(PyObject *o, int i1, int i2)

Return the slice of sequence object o between i1 and i2, or
NULL on failure. This is the equivalent of the Python
expression, o[i1:i2].

1
int PySequence_SetItem(PyObject *o, int i, PyObject *v)

Assign object v to the ith element of o. The reference
count of v is unchanged, however, the reference count of the
object that v replaces is decremented. Returns 1 on success,
0 on failure. This is the equivalent of the Python
statement, o[i]=v.

1
int PySequence_SetSlice(PyObject *o, int i1, int i2, PyObject *v)

Assign the sequence object, v, to the slice in sequence
object, o, from i1 to i2. Returns 1 on success, 0 on failure.
This is the equivalent of the Python statement, o[i1:i2]=v.

PyObject *PySequence_Tuple(PyObject *o)

Returns o as a tuple on success, and NULL on failure.
This is equivalent to the Python expression: tuple(o)

int PySequence_Count(PyObject *o, PyObject *value)

Return the number of occurrences of value in o, that is,
return the number of keys for which o[key]==value. On
failure, return -1. This is equivalent to the Python
expression: o.count(value).

int PySequence_In(PyObject *o, PyObject *value)

Determine if o contains value. If an item in o is equal to
value, return 1, otherwise return 0. On error, return -1. This
is equivalent to the Python expression: value in o.

int PySequence_Index(PyObject *o, PyObject *value)

Return the first index i for which o[i]==value. On error,
return -1. This is equivalent to the Python
expression: o.index(value).

Mapping protocol:

int PyMapping_Check(PyObject *o)

Return 1 if the object provides mapping protocol, and zero
otherwise.

This function always succeeds.

2
int PyMapping_Length(PyObject *o, int *l)

Returns the number of keys in object o on success, and -1 on
failure. For objects that do not provide sequence protocol,
this is equivalent to the Python expression: len(o).

int PyMapping_DelItemString(PyObject *o, char *key)

Remove the mapping for object, key, from the object o.
Return 1 on success, and 0 on failure. This is equivalent to
the Python statement: del o[key].

int PyMapping_DelItem(PyObject *o, PyObject *key)

Remove the mapping for object, key, from the object o.
Return 1 on success, and 0 on failure. This is equivalent to
the Python statement: del o[key].

int PyMapping_HasKeyString(PyObject *o, char *key)

Return 1 if the mapping object has the key, key,
and 0 otherwise. This is equivalent to the Python expression:
o.has_key(key).

This function always succeeds.

int PyMapping_HasKey(PyObject *o, PyObject *key)

Return 1 if the mapping object has the key, key,
and 0 otherwise. This is equivalent to the Python expression:
o.has_key(key).

This function always succeeds.

PyObject *PyMapping_Keys(PyObject *o)

On success, return a list of the keys in object o. On
failure, return NULL. This is equivalent to the Python
expression: o.keys().

PyObject *PyMapping_Values(PyObject *o)

On success, return a list of the values in object o. On
failure, return NULL. This is equivalent to the Python
expression: o.values().

PyObject *PyMapping_Items(PyObject *o)

On success, return a list of the items in object o, where
each item is a tuple containing a key-value pair. On
failure, return NULL. This is equivalent to the Python
expression: o.items().

1
int PyMapping_Clear(PyObject *o)

Make object o empty. Returns 1 on success and 0 on failure.
This is equivalent to the Python statement:
for key in o.keys(): del o[key]

PyObject *PyMapping_GetItemString(PyObject *o, char *key)

Return element of o corresponding to the object, key, or NULL
on failure. This is the equivalent of the Python expression:
o[key].

int PyMapping_SetItemString(PyObject *o, char *key, PyObject *v)

Map the object, key, to the value, v. The reference count of
v is unchanged, however, the reference count of the object
that v replaces is decremented. (Does key replace an object?)
Returns 1 on success, 0 on failure. This is the equivalent
of the Python statement, o[key]=v.

Constructors:

PyObject *PyFile_FromString(char *file_name, char *mode)

On success, returns a new file object that is opened on the
file given by file_name, with a file mode given by mode,
where mode has the same semantics as the standard C routine,
fopen. On failure, return NULL.

PyObject *PyFile_FromFile(FILE *fp,
char *file_name, char *mode,
int close_on_del)

Return a new file object for an already opened standard C
file pointer, fp. A file name, file_name, and open mode,
mode, must be provided as well as a flag, close_on_del, that
indicates whether the file is to be closed when the file
object is destroyed. On failure, return NULL.

PyObject *PyFloat_FromDouble(double v)

Returns a new float object with the value, v, on success, and
NULL on failure.

PyObject *PyInt_FromLong(long v)

Returns a new int object with the value, v, on success, and
NULL on failure.

PyObject *PyList_New(int l)

Returns a new list of length, l, on success, and NULL on
failure.

PyObject *PyLong_FromLong(long v)

Returns a new long object with the value, v, on success, and
NULL on failure.

PyObject *PyLong_FromDouble(double v)

Returns a new long object with the value, v, on success, and
NULL on failure.

PyObject *PyDict_New()

Returns a new empty dictionary on success, and NULL on
failure.

PyObject *PyString_FromString(char *v)

Returns a new string object with the value, v, on success, and
NULL on failure.

PyObject *PyString_FromStringAndSize(char *v, int l)

Returns a new string object with the value, v, and length, l,
on success, and NULL on failure.

PyObject *PyTuple_New(int l)

Returns a new tuple of length, l, on success, and NULL on
failure.

Notes:

0
The proposal is based on (ripped directly off from) the special
operations currently supported by the _typeobject structure.

1
Some special routines currently return 0 on success and -1 on
failure. In the proposal above, I return 1 on success and 0 on
failure. This makes error checking more consistent. For example, I
find the following macro very useful:

#define TRY(E) if(! (E)) return NULL;

This allows things like:

TRY(spam=PySequence_GetItem(o,i));

which I find more readable (especially if there are a lot of these)
than:

if(! (spam=PySequence_GetItem(o,i)))
  return NULL;

or:

spam=PySequence_GetItem(o,i);
if(! spam)
  return NULL;

The proposed change in return semantics could be implemented by an
abstraction layer without changing existing code.
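
As a rough, self-contained illustration of why one "zero means failure"
convention helps, the sketch below uses the same TRY idea on both a
pointer-returning routine and a status-returning routine. The demo_ functions
are hypothetical stand-ins, not part of the proposal, and the macro is a
variant wrapped in do/while (my assumption, so that it behaves as a single
statement inside if/else).

```c
#include <stddef.h>

/* Variant of the TRY macro from the note, wrapped in do/while. */
#define TRY(E) do { if (!(E)) return NULL; } while (0)

static int demo_table[3] = { 10, 20, 30 };

/* Hypothetical routine that returns NULL on failure. */
static int *demo_getitem(int i)
{
    if (i < 0 || i >= 3)
        return NULL;
    return &demo_table[i];
}

/* Hypothetical routine that returns 1 on success, 0 on failure. */
static int demo_setitem(int i, int v)
{
    if (i < 0 || i >= 3)
        return 0;
    demo_table[i] = v;
    return 1;
}

/* Copy element src to element dst; returns the destination or NULL. */
int *demo_copy_item(int src, int dst)
{
    int *item;
    TRY(item = demo_getitem(src));
    TRY(demo_setitem(dst, *item));
    return demo_getitem(dst);
}
```

With 0 on success and -1 on failure, the second TRY would need a different
test from the first, which is the inconsistency the note is about.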

2
Some special routines return values that may be 0 for a normal
return. For these routines, I have made the return value an output
argument so as to retain the ability to check for errors by checking
whether the return value is zero.
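
A small sketch of this output-argument style follows. demo_hash is a
hypothetical string hash written only for illustration (it is not the hash
the interpreter uses); what matters is that a legitimate result of 0 can
still be told apart from a failure.

```c
#include <limits.h>
#include <stddef.h>

/* Status is the return value; the hash goes through an output argument. */
int demo_hash(const char *s, long *hash_value)
{
    unsigned long h = 0;

    if (s == NULL || hash_value == NULL)
        return 0;                               /* failure */
    for (; *s != '\0'; s++)
        h = h * 31UL + (unsigned char)*s;       /* unsigned: wraps safely */
    *hash_value = (long)(h & (unsigned long)LONG_MAX);
    return 1;                                   /* success, even if hash is 0 */
}
```

Here the empty string hashes to 0 and the call still reports success, which
would be ambiguous if the hash itself were the return value.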

3
What, if any, coercion takes place? Is the same model supported for
both classes and built-in types? What needs to be said here?
I don't see any hooks at the C level for providing "right" versions
of these.

4
Some of the proposed routines already exist and provide no way to
separately return an error result and a normal result. In this
case, I have not included the normal result in the argument list.

5
Note that this will work for either mapping or sequence types;
however, for sequence types, key should be an int object (or
otherwise convertible to an int). For types that provide both
sequence and mapping protocols, if the key is an int object, then
the sequence operations will be used, otherwise the mapping
operations will be used.
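
The selection rule in this note might look roughly like the following. The
types and slots (demo_key, demo_object, sq_item, mp_subscript) are
hypothetical simplifications made up for the sketch, and conversion of
non-int keys to int is left out.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DEMO_KEY_INT, DEMO_KEY_STRING } demo_key_kind;

typedef struct {
    demo_key_kind kind;
    int i;              /* valid when kind == DEMO_KEY_INT */
    const char *s;      /* valid when kind == DEMO_KEY_STRING */
} demo_key;

typedef struct demo_object demo_object;
struct demo_object {
    /* Either slot may be NULL when the protocol is not provided. */
    const char *(*sq_item)(demo_object *o, int i);
    const char *(*mp_subscript)(demo_object *o, const demo_key *key);
};

static const char demo_seq_tag[] = "sequence";
static const char demo_map_tag[] = "mapping";

/* Demo slots that just report which protocol handled the request. */
static const char *demo_by_index(demo_object *o, int i)
{
    (void)o;
    return i == 0 ? demo_seq_tag : NULL;
}

static const char *demo_by_key(demo_object *o, const demo_key *key)
{
    (void)o;
    (void)key;
    return demo_map_tag;
}

/* Int keys prefer the sequence slot; everything else uses the mapping slot. */
const char *demo_object_getitem(demo_object *o, const demo_key *key)
{
    assert(o != NULL && key != NULL);
    if (key->kind == DEMO_KEY_INT && o->sq_item != NULL)
        return o->sq_item(o, key->i);
    if (o->mp_subscript != NULL)
        return o->mp_subscript(o, key);
    return NULL;        /* neither protocol can handle this key */
}
```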

7
I wonder if it might not be a good idea, in Python 1.2, to fix the
inconsistent memory management behavior of built-in types. In 1.2,
the new Python naming scheme is being introduced. In most cases,
new names are introduced (or equivalently old names are retained)
through preprocessor macros. Perhaps where we desire to change the
memory management semantics, the new names should have different
semantics than the old names. For example, setlistitem could retain
the current semantics and PyList_SetItem could provide the semantics
described here.

$Id: Python-C-Interface,v 1.2 1995/02/27 18:07:09 jfulton Exp $

--
-- Jim Fulton      jfulton@mailqvarsa.er.usgs.gov    (703) 648-5622
                   U.S. Geological Survey, Reston VA  22092 
This message is being posted to obtain or provide technical information
relating to my duties at the U.S. Geological Survey.