Abstract
A lightweight mechanism has been developed for making Python extension types more class-like. Classes can be developed in an extension language, such as C or C++, and these classes can be treated like other Python classes:
They can be subclassed in Python,
They provide access to method documentation strings, and
They can be used to directly create new instances.
Extension classes provide support for extended method binding protocols to support additional method types and additional method call semantics.
An example class shows how extension classes are implemented and how they differ from extension types.
Extension classes illustrate how the Python class mechanism can be extended and may provide a basis for improved or specialized class models.
Problem
Currently, Python provides two ways of defining new kinds of objects:
Python classes
Extension types
Each approach has its strengths. Extension types provide much greater control to the programmer and, generally, better performance. Because extension types are written in C, the programmer has greater access to external resources. (Note that Python's use of the term type has little to do with the notion of type as a formal specification.)
Classes provide a higher level of abstraction and are generally much easier to develop. Classes provide full inheritance support, while support for inheritance when developing extension types is very limited. Classes provide run-time meta-data, such as method doc strings, that are useful for documentation and discovery. Classes act as factories for creating instances, while separate functions must be provided to create instances of types.
It would be useful to combine the features of the two approaches. It would be useful to have better support for inheritance for types, or to be able to subclass from types in Python. It would also be useful to have class-like meta-data support for types and the ability to construct instances directly from types.
We have need, in a number of projects, for semantics that are slightly different than the usual class semantics, yet we want to do most of our development in C. For example, we have developed a persistence mechanism [1] that redefines __getattr__ and __setattr__ to take storage-related actions when object state is accessed or modified. We want to be able to take certain actions on every attribute reference, but for Python class instances, __getattr__ is only called when attribute lookup fails by normal means.
As another example, we would like to have greater control over how methods are bound. Currently, when accessing a class instance attribute, the attribute value is bound together with the instance in a method object if and only if the attribute value is a Python function. For some applications, we might also want to be able to bind extension functions, or other types of callable objects, such as HTML document templates [2]. Furthermore, we might want to have greater control over how objects are bound. For example, we might want to bind instances and callable objects with special method objects that assure that no more than one thread accesses the object or method at one time.
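The binding rule can be seen in a short sketch: a plain function stored on a class is bound to the instance, while a callable object that is not a function is returned unchanged. The names `greet`, `Template`, and `Page` are hypothetical, and the code uses present-day Python syntax.

```python
def greet(self):
    return "hello from %s" % type(self).__name__


class Template:
    """A callable object that is not a Python function."""

    def __call__(self, *args):
        return args


class Page:
    hello = greet          # plain function: bound on instance access
    render = Template()    # callable instance: returned unchanged


p = Page()
bound = p.hello                 # a method object holding both greet and p
unbound_result = p.render("x")  # p is NOT passed to Template.__call__
```

Here `bound()` receives `p` as its first argument, whereas `p.render("x")` only sees `"x"`, which is why a template object stored on a class cannot learn which instance it was accessed through.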
We can provide these special semantics in extension types, but we wish to provide them for classes developed in Python.
Background
At the first Python Workshop, Don Beaudry presented work [3] done at V.I. Corp to integrate Python with C++ frameworks. This system provided a number of important features, including:
Definition of extension types that provide class-like meta-data and that can be called to create instances.
Ability to subclass in Python from C types.
Ability to define classes in Python whose data are stored as C structures rather than dictionaries to better interface to C and C++ libraries, and for better performance.
Less dynamic data structures. In particular, the data structure for a class is described declaratively during class definition.
Support for enumeration types.
This work was not released, initially.
Shortly after the workshop, changes were made to Python to support the subclassing features described in [3]. These changes were not documented until recently [4].
At the third Python workshop, I presented some work I had done on generating module documentation for extension types. Based on the discussion at this workshop, I developed a meta-type proposal [5]. This meta-type proposal was for an object that simply stored meta-information for a type, for the purpose of generating module documentation.
In the summer of 1996, Don Beaudry released the system described in [3] under the name MESS [6]. MESS addresses a number of needs but has a few drawbacks:
Only single inheritance is supported.
The mechanism for defining MESS extension types is very different from and more complicated than the standard Python type creation mechanism.
Defining MESS types requires the use of an extensive C application programming interface. This presents problems for configuring dynamically-loaded extension modules unless the MESS library is linked into the Python interpreter.
Because the system tries to do a number of different things, it is fairly large, about 15,000 lines.
There is very little documentation, especially for the C programming interface.
The system is a work in progress, with a number of outstanding bugs.
As MESS matures, we expect most of these problems to be addressed.
Extension Classes
To meet short-term needs for a C-based persistence mechanism [1], an extension class module was developed using the mechanism described in [4] and building on ideas from MESS [6]. The extension class module recasts extension types as "extension classes" by seeking to eliminate, or at least reduce, semantic differences between types and classes. The module was designed to meet the following goals:
Provide class-like behavior for extension types, including interfaces for meta information and for constructing instances.
Support subclassing in Python from extension classes, with support for multiple inheritance.
Provide a small hardened implementation that can be used for current products.
Provide a mechanism that requires minimal modification to existing extension types.
Provide a basis for research on alternative semantics for classes and inheritance.
Base extension classes and extension subclasses
Base extension classes are implemented in C. Extension subclasses are implemented in Python and inherit, directly or indirectly, from one or more base extension classes. An extension subclass may inherit from base extension classes, extension subclasses, and ordinary Python classes. The usual inheritance order rules apply. Currently, extension subclasses must conform to the following two rules:
The first super class listed in the class statement defining an extension subclass must be either a base extension class or an extension subclass.
At most one base extension direct or indirect super class may define C data members. If an extension subclass inherits from multiple base extension classes, then all but one must be mix-in classes that provide extension methods but no data.
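A loose analogy for the second rule exists in present-day CPython, where classes with non-empty `__slots__` have a fixed instance layout. The sketch below is not the extension class mechanism itself; the class names are hypothetical, and `__slots__` merely stands in for C data members.

```python
class Data:
    __slots__ = ("value",)      # stands in for a base that defines C data


class OtherData:
    __slots__ = ("other",)      # a second base with its own fixed layout


class MixIn:
    __slots__ = ()              # methods only, no per-instance storage

    def describe(self):
        return "value=%r" % (self.value,)


class Good(Data, MixIn):        # one data-bearing base plus a mix-in: accepted
    pass


try:
    class Bad(Data, OtherData):  # two data-bearing bases: rejected by CPython
        pass
    conflict = None
except TypeError as exc:
    conflict = str(exc)

g = Good()
g.value = 3
```

The failure for `Bad` has the same cause as the rule above: both bases expect their data at the same position in the instance structure.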
Meta Information
Like standard python classes, extension classes have the following attributes containing meta-data:
__doc__: a documentation string for the class,
__name__: the class name,
__bases__: a sequence of base classes,
__dict__: a class dictionary.
The class dictionary provides access to unbound methods and their documentation strings, including extension methods and special methods, such as methods that implement sequence and numeric protocols. Unbound methods can be called with an instance as the first argument.
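For comparison, the same meta-data can be inspected on an ordinary Python class. This is a minimal sketch in present-day Python, where the class dictionary holds plain functions rather than unbound method objects; the `Stack` class is hypothetical.

```python
class Stack:
    """A minimal stack."""

    def push(self, item):
        """Add an item to the top of the stack."""
        self.__dict__.setdefault("items", []).append(item)
        return len(self.items)


s = Stack()

# Fetch the method from the class dictionary and call it with the
# instance as the explicit first argument.
push = Stack.__dict__["push"]
count = push(s, "a")
```

An extension class is meant to answer the same queries (`__doc__`, `__name__`, `__bases__`, `__dict__`) for classes whose methods are written in C.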
Subclass instance data
Extension subclass instances have instance dictionaries, just like Python class instances do. When fetching attribute values, extension class instances will first try to obtain data from the base extension class data structure, then from the instance dictionary, then from the class dictionary, and finally from base classes. When setting attributes, extension classes first attempt to use extension base class attribute setting operations, and if these fail, then data are placed in the instance dictionary.
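The lookup and assignment order described above can be approximated in pure Python. In this sketch a dictionary named `_c_data` stands in for the base extension class data structure; that name, the `size` field, and both classes are assumptions made for illustration only.

```python
class CDataBase:
    """Simulates a base extension class whose data live in a C structure."""

    kind = "base"   # a class attribute, found through the base class

    def __init__(self):
        # The simulated C structure, with one fixed field.
        object.__setattr__(self, "_c_data", {"size": 0})

    def __getattribute__(self, name):
        c_data = object.__getattribute__(self, "_c_data")
        if name in c_data:
            # 1. Data from the base extension class structure.
            return c_data[name]
        # 2-4. Instance dictionary, class dictionary, then base classes.
        return object.__getattribute__(self, name)

    def __setattr__(self, name, value):
        c_data = object.__getattribute__(self, "_c_data")
        if name in c_data:
            # The base class attribute-setting operation succeeds.
            c_data[name] = value
        else:
            # Otherwise the value is placed in the instance dictionary.
            object.__setattr__(self, name, value)


class Sub(CDataBase):
    label = "class-level"


obj = Sub()
obj.size = 10        # handled by the simulated C data
obj.label = "mine"   # stored in the instance dictionary
```

After these assignments, `size` never appears in the instance dictionary, while `label` in the instance dictionary shadows the class attribute of the same name.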
Implementing base extension classes
A base extension class is implemented in much the same way that an extension type is implemented, except:
The include file, ExtensionClass.h, must be included.
The type structure is declared to be of type PyExtensionClass, rather than of type PyTypeObject.
The type structure has an additional member that must be defined after the documentation string. This extra member is a method chain (PyMethodChain) containing a linked list of method definition (PyMethodDef) lists. Method chains can be used to implement method inheritance in C. Most extensions don't use method chains, but simply define method lists, which are null-terminated arrays of method definitions. A macro, METHOD_CHAIN, is defined in ExtensionClass.h that converts a method list to a method chain. (See the example below.)
Module functions that create new instances must be replaced by an __init__ method that initializes, but does not create storage for, instances.
The extension class must be initialized and exported to the module with::
PyExtensionClass_Export(d,"name",type);
where name is the module name and type is the extension class type object.
Attribute lookup
Attribute lookup is performed by calling the base extension class getattr operation for the base extension class that includes C data, or for the first base extension class, if none of the base extension classes include C data. ExtensionClass.h defines a macro Py_FindAttrString that can be used to find an object's attributes that are stored in the object's instance dictionary or in the object's class or base classes:
In addition, a macro is provided that replaces Py_FindMethod calls with logic to perform the same sort of lookup that is provided by Py_FindAttrString.
Linking
The extension class mechanism was designed to be useful with dynamically linked extension modules. Modules that implement extension classes do not have to be linked against an extension class library. The macro PyExtensionClass_Export imports the ExtensionClass module and uses objects imported from this module to initialize an extension class with necessary behavior.
Example: MultiMapping objects
As an example, consider an extension class that implements a "MultiMapping". A multi-mapping is an object that encapsulates zero or more mapping objects. When an attempt is made to look up an object, the encapsulated mapping objects are searched until an object is found.
Consider an implementation of a MultiMapping extension type, without use of the extension class mechanism:
This module defines an extension type, MultiMapping, and exports a module function, MultiMapping, that creates MultiMapping instances. The type provides two methods, push and pop, for adding and removing mapping objects to and from the multi-mapping. The type provides mapping behavior, implementing mapping length and subscript operators but not a mapping subscript assignment operator.
Now consider an extension class implementation of the MultiMapping objects:
This version includes ExtensionClass.h. The two declarations of MMtype have been changed from PyTypeObject to PyExtensionClass. The METHOD_CHAIN macro has been used to add methods to the end of the definition for MMtype. The module function, newMMobject, has been replaced by the MMtype method, MM__init__. Note that this method does not create or return a new object. Finally, the lines:
have been added both to initialize the extension class and to export it in the module dictionary.
To use this module, compile, link, and import it as with any other extension module. The following Python code illustrates the module's use:
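Because the compiled module may not be at hand, the sketch below uses a pure-Python stand-in with the same interface (push, pop, length, and subscript). The stand-in class is an assumption of this sketch, not the C implementation; in particular, the search order (most recently pushed mapping first) and the meaning of length (total entries across all mappings) are guesses labeled in the comments.

```python
class MultiMapping:
    """Pure-Python stand-in for the MultiMapping extension class."""

    def __init__(self):
        # Initializes an existing instance; does not create one.
        self._maps = []

    def push(self, mapping):
        """Add a mapping object to the multi-mapping."""
        self._maps.append(mapping)

    def pop(self):
        """Remove and return the most recently pushed mapping object."""
        return self._maps.pop()

    def __getitem__(self, key):
        # Assumed search order: most recently pushed mapping first.
        for mapping in reversed(self._maps):
            if key in mapping:
                return mapping[key]
        raise KeyError(key)

    def __len__(self):
        # Assumed meaning: total number of entries across all mappings.
        return sum(len(m) for m in self._maps)


m = MultiMapping()                  # step 1: create an empty multi-mapping
m.push({"spam": 1, "eggs": 2})      # step 2: add a mapping object
m.push({"spam": 3, "ham": 4})       # step 3: add another mapping object

spam = m["spam"]    # found in the second mapping
eggs = m["eggs"]    # falls through to the first mapping
```

With the real extension class, the instance is created the same way, by calling the class directly rather than a separate module function.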
Creating the MultiMapping object took three steps, one to create an empty MultiMapping, and two to add mapping objects to it. We might wish to simplify the process of creating MultiMapping objects by providing a constructor that takes source mapping objects as parameters. We can do this by subclassing MultiMapping in Python:
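A minimal sketch of such a subclass follows. The subclass name `ExtendedMultiMapping` is hypothetical, and the `MultiMapping` base here is a reduced pure-Python stand-in (an assumption) so that the example runs without the compiled module; with the extension class available, only the subclass definition would be needed.

```python
class MultiMapping:
    """Reduced pure-Python stand-in for the extension class (an assumption)."""

    def __init__(self):
        self._maps = []

    def push(self, mapping):
        self._maps.append(mapping)

    def __getitem__(self, key):
        # Assumed search order: most recently pushed mapping first.
        for mapping in reversed(self._maps):
            if key in mapping:
                return mapping[key]
        raise KeyError(key)


class ExtendedMultiMapping(MultiMapping):
    """Adds a constructor that accepts the source mappings directly."""

    def __init__(self, *mappings):
        MultiMapping.__init__(self)     # initialize the base class state
        for mapping in mappings:
            self.push(mapping)


# One step instead of three.
m = ExtendedMultiMapping({"spam": 1}, {"spam": 2, "ham": 3})
```

The subclass relies only on the inherited `__init__` and `push` methods, which is the kind of reuse that subclassing from an extension class is meant to allow.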
Bindable objects
Python classes bind Python function attributes into methods. When a class has a function attribute that is accessed as an instance attribute, a method object is created and returned that contains references to the original function and instance. When the method is called, the original function is called with the instance as the first argument followed by any arguments passed to the method.
Extension classes provide a similar mechanism for attributes that are Python functions or inherited extension functions. In addition, if an extension class attribute is an instance of an extension class that defines __call__ and __bind_to_object__ methods, then when the attribute is accessed through an instance, its __bind_to_object__ method will be called to create a bound method. Consider the following example:
Note that ExtensionClass.Base is a base extension class that provides no function other than creating extension subclasses. It is used here to allow extension classes to be defined entirely in Python to take advantage of the binding mechanism.
When run, this program outputs: 'called
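The binding protocol can also be imitated in present-day Python without the ExtensionClass module. In the sketch below, the local class `Base` is a hypothetical stand-in for ExtensionClass.Base that applies the `__bind_to_object__` hook during attribute lookup; `BoundCall`, `Bindable`, and `Holder` are likewise names invented for this illustration.

```python
class Base:
    """Stand-in for ExtensionClass.Base: applies the binding hook on lookup."""

    def __getattribute__(self, name):
        value = object.__getattribute__(self, name)
        if name.startswith("__"):
            return value
        in_instance = name in object.__getattribute__(self, "__dict__")
        binder = getattr(type(value), "__bind_to_object__", None)
        # Bind only class-level attributes that define both hooks.
        if not in_instance and binder is not None and callable(value):
            return binder(value, self)
        return value


class BoundCall:
    """Pairs a callable object with the instance it was accessed through."""

    def __init__(self, target, instance):
        self.target = target
        self.instance = instance

    def __call__(self, *args):
        return self.target(self.instance, *args)


class Bindable(Base):
    def __bind_to_object__(self, instance):
        return BoundCall(self, instance)

    def __call__(self, instance, *args):
        return "%s called" % instance.name


class Holder(Base):
    action = Bindable()     # a callable object, not a function

    def __init__(self, name):
        self.name = name


h = Holder("a")
result = h.action()         # the Bindable receives h as its instance
```

Accessing `action` through the class still yields the raw `Bindable` object; only access through an instance triggers binding.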
Status
The current release of the extension class module is 0.3 [Download]. The implementation is about two thousand lines in size, including comments. This release will work with Python version 1.3 or 1.4, but does not take advantage of attribute access optimizations available in Python 1.4.
Installation
Dynamic linking installation
Installation is in two steps. First, run make to build the extension class module, and then run make with a target of install to install the ExtensionClass module and the ExtensionClass.h header file in the standard Python directories. For Python revision 1.4 and higher, use the default make file:
For Python 1.3, use the make file, 1.3-Makefile:
Note that the make files can also be used to build the sample extension class module, MultiMapping:
Static linking installation
To statically link the extension class module into the Python interpreter:
copy the files ExtensionClass.c and ExtensionClass.h to the Modules directory in the Python source tree,
add the following line to the Setup file in the Modules directory::
ExtensionClass ExtensionClass.c
rebuild Python, and
copy ExtensionClass.h to the Python run-time include directory.
Issues
There are a number of issues that came up in the course of this work and that deserve mention.
Currently, the class extension mechanism described in [4] requires that the first superclass in a list of superclasses must be of the extended class type. This may not be convenient if mix-in behavior is desired. If a list of base classes starts with a standard Python class, but includes an extension class, then an error is raised. It would be more useful if, when a list of base classes contains one or more objects that are not Python classes, the first such object was used to control the extended class definition. To get around this, the ExtensionClass module exports a base extension class, Base, that can be used as the first base class in a list of base classes to assure that an extension subclass is created.
Currently, only one base extension class can define any data in C. The data layout of subclass instances is the same as for the base class that defines data in C, except that the data structure is extended to hold an instance dictionary. The data structure begins with a standard Python header, and extension methods expect the C instance data to occur immediately after the object header. If two or more base classes defined C data, the methods for the different base classes would expect their data to be in the same location. A solution might be to allocate base class instances and store pointers to these instances in the subclass data structure. The method binding mechanism would have to be more complicated to make sure that methods were bound to the correct base data structure.
There is currently no support for subclassing in C, beyond that provided by method chains.
Rules for mixed-type arithmetic are different for Python class instances than they are for extension type instances. Python classes can define right and left versions of numeric binary operators, or they can define a coercion operator for converting binary operator operands to a common type. For extension types, only the latter, coercion-based, approach is supported. The coercion-based approach does not work well for many data types for which coercion rules depend on the operator. Because extension classes are based on extension types, they are currently limited to the coercion-based approach. It would be straightforward to extend the extension class implementation to allow both types of mixed-type arithmetic control.
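The following sketch shows why operator-dependent rules favor right and left operator methods. The `Vector` class is hypothetical: multiplying a vector by a plain number is meaningful in either operand order, while adding a plain number to a vector is not, so no single conversion to a common type would serve both operators.

```python
class Vector:
    def __init__(self, *components):
        self.components = tuple(components)

    def __add__(self, other):
        # Addition is defined only between vectors of equal length.
        if isinstance(other, Vector) and len(other.components) == len(self.components):
            return Vector(*(a + b for a, b in zip(self.components, other.components)))
        return NotImplemented

    def __mul__(self, other):
        # Left version: Vector * number scales each component.
        if isinstance(other, (int, float)):
            return Vector(*(a * other for a in self.components))
        return NotImplemented

    __rmul__ = __mul__   # right version: number * Vector


v = Vector(1, 2)
scaled_left = v * 2
scaled_right = 2 * v
total = v + Vector(3, 4)

try:
    2 + v                # no sensible common type for this operator
    add_rejected = False
except TypeError:
    add_rejected = True
```

Each operator decides for itself which operand types it accepts, which is the control that a coercion-only scheme cannot express.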
I considered making extension classes immutable, meaning that class attributes could not be set after class creation. I also considered making extension subclasses cache inherited attributes. These two features are related, and both are attractive for some applications; however, I decided that it would be better to retain standard class instance semantics and provide these features as options at a later time.
It would be useful to be able to specify parameters that control class creation, but that would otherwise not appear in the class dictionary. For example, it would be useful to provide parameters to control mutability of classes (not class instances), or to turn on caching of inherited class attributes.
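As an illustration of what such creation parameters might look like, present-day Python allows keyword arguments in a class statement to be passed to a metaclass. The sketch below is a modern analogue only, not part of the extension class module; `OptionsMeta`, the `mutable` keyword, and the example classes are all hypothetical.

```python
class OptionsMeta(type):
    """Keeps class-creation options outside the class dictionary."""

    _options = {}   # maps each class to the options it was created with

    def __new__(mcls, name, bases, namespace, mutable=True):
        cls = super().__new__(mcls, name, bases, namespace)
        mcls._options[cls] = {"mutable": mutable}
        return cls

    def __init__(cls, name, bases, namespace, mutable=True):
        super().__init__(name, bases, namespace)

    def __setattr__(cls, name, value):
        # Controls mutability of the class, not of its instances.
        if not OptionsMeta._options[cls]["mutable"]:
            raise AttributeError("class %s is immutable" % cls.__name__)
        super().__setattr__(name, value)


class Frozen(metaclass=OptionsMeta, mutable=False):
    x = 1


class Open(metaclass=OptionsMeta):
    x = 1


Open.x = 2               # allowed: the class is mutable
try:
    Frozen.x = 2         # rejected: the class was created immutable
    frozen_rejected = False
except AttributeError:
    frozen_rejected = True
```

The `mutable` parameter influences how the class behaves but never appears as an entry in the class dictionary.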
The extension class module defines new method types to bind C and Python methods to extension class instances. It would be useful for these method objects to provide access to function call information, such as the number and names of arguments and the number of defaults, by parsing extension function documentation strings.
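A minimal sketch of such parsing is shown below. It assumes a convention in which a documentation string begins with a signature such as `push(mapping_object) -- Add a data source`; that convention, the function name `call_info`, and the shape of the returned dictionary are assumptions made for illustration. Default values that themselves contain commas or parentheses are not handled.

```python
import re

# Matches a leading "name(arg, arg=default, ...)" in a documentation string.
_SIGNATURE = re.compile(r"^\s*(\w+)\s*\(([^)]*)\)")


def call_info(doc):
    """Return call information parsed from a doc string, or None."""
    match = _SIGNATURE.match(doc or "")
    if match is None:
        return None
    names = []
    defaults = 0
    for part in match.group(2).split(","):
        part = part.strip()
        if not part:
            continue
        name, has_default, _value = part.partition("=")
        names.append(name.strip())
        if has_default:
            defaults += 1
    return {"name": match.group(1), "args": names, "defaults": defaults}


info = call_info("push(mapping_object) -- Add a data source")
with_default = call_info("find(key, default=None) -- Look up a key")
```

A method object could expose the result of such a function as attributes, giving extension functions the kind of introspection that Python functions already offer.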
Applications
Aside from test and demonstration applications, the extension class mechanism has been used to provide an extension-based implementation of the persistence mechanism described in [1]. We plan to develop this further to provide features such as automatic deactivation of objects not used after some period of time and to provide more efficient persistent-object cache management.
Future projects include creation of Java-like synchronized objects and implementation of acquisition [7], an inheritance-like mechanism that provides attribute sharing between container and component objects.
Summary
The extension-class mechanism described here provides a way to add class services to extension types. It allows:
Subclassing extension classes in Python,
Construction of extension class instances by calling extension classes,
Extension classes to provide meta-data, such as unbound methods and their documentation strings.
In addition, the extension class module provides a relatively concise example of the use of mechanisms that were added to Python to support MESS [6], and that were described at the fourth Python Workshop [4]. It is hoped that this will spur research in improved and specialized models for class implementation in Python.
References
[1] Fulton, J., Providing Persistence for World-Wide-Web Applications, Proceedings of the 5th Python Workshop. http://www.digicool.com/papers/Persistence.html
[2] Page, R. and Cropper, S., Document Template, Proceedings of the 5th Python Workshop. http://www.digicool.com/papers/DocumentTemplate.html
[3] Beaudry, D., Deriving Built-In Classes in Python, Proceedings of the First International Python Workshop. http://www.python.org/workshops/1994-11/BuiltInClasses/BuiltInClasses_1.html
[4] van Rossum, G., Don Beaudry Hack - MESS, presented in the Developer's Future Enhancements session of the 4th Python Workshop. http://www.python.org/workshops/1996-06/notes/thursday.html
[5] Fulton, J., Meta-Type Object. This is a small proposal, the text of which is contained in a sample implementation source file, http://www.digicool.com/jim/MetaType.c.
[6] Beaudry, D., and Ascher, D., The Meta-Extension Set, http://maigret.cog.brown.edu/pyutil/
[7] Gil, J., Lorenz, D., Environmental Acquisition--A New Inheritance-Like Abstraction Mechanism, OOPSLA '96 Proceedings, ACM SIG-PLAN, October, 1996 http://www.bell-labs.com/people/cope/oopsla/Oopsla96TechnicalProgramAbstracts.html#GilLorenz
[Download] ftp://ftp.digicool.com/pub/releases/ExtensionClass-0.3.tar.gz