Re: Overloading function calls

Tim Peters (tim@ksr.com)
Sat, 04 Jun 94 01:31:32 -0400

WRT coercions and letting RHS objects grab control, following is a
specific strategy that, so far, seems at least close. I'm not sure I
wouldn't like it <wink> -- but I deny that it's possible to really
understand a scheme before wrestling with it in practice, so can't swear
that I _would_ like it. Play with your own code "as if" these were the
rules, and see what you think! It helps that it achieves quite a bit
of backward compatibility, and is much more a refinement (albeit a non-
trivial one) of the current scheme than a fundamentally new scheme.

In evaluating the current scheme, I think there's a tendency to focus on
its weaknesses to the exclusion of its strengths; at least I've been
guilty of that. Some observations after looking over lots of code:

+ __add__ and __mul__ must be made to work like the others; few code
these methods correctly today.

+ "The others" must work like the others too <0.6 grin>; when and where
coercion occurs, and special methods get invoked, must be predictable.

+ __coerce__ methods are valuable! Uniform promotion to a common type is
often semantically correct, regardless of efficiency, and a magic
method to do that under the covers makes prototyping fast and easy (or
rather it _could_ be if the magic method were uniformly applied ...).
Ease of prototyping should win; the scheme below allows favoring
efficiency instead at the price of more typing.

OTOH, sometimes __coerce__-style promotion to a common type is plain
wrong. E.g., if Matrix is an instance of a square array class, then,
to get the natural meanings, Matrix+3 must promote to an array filled
with 3's, but Matrix*3 must promote to a zero matrix with 3's only on
the main diagonal. __coerce__ isn't smart enough to know the
difference; nor should it be <wink>.

My old Date class is a simpler example, where Date-365 means something
quite different from Date1-Date2.

+ Full-blown genuine numeric types (FBGNTs) are rare and tricky to write.
This has much more to do with their nature than with Python, though.
I'd like to see Python have _enough_ support for writing them, but it
probably shouldn't worry too much about them.

In particular, while FBGNTs absolutely need to let a RHS object grab
control, I'm surprised to find that few classes other than FBGNTs
seem to want to do that. Does this match others' observations? E.g.,
the way C++ overloads "<<" for output seems typical: non-FBGNT's like
to overload a "numeric" infix operator just as a shorthand for writing

    object.method(argument1).method(argument2).method(argument3)...

so usage seems always to be of the form

    object binop argument1 binop argument2 binop argument3 ...

and there's no need for a RHS object to grab control. The most recent
example of this phenomenon posted to the list may have been my Pipe
class; same thing happens over & over.
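
For anyone who wants to see the left-to-right pattern concretely, here is a minimal sketch in modern Python 3 syntax (not the Python this post was written against). The `Pipe` class below is a hypothetical stand-in, not the Pipe class posted to the list, and the stage functions `double`, `increment` and `negate` are invented for illustration. The point is only that the overloading object is always the left operand, so nothing on the right ever has to grab control.

```python
class Pipe:
    """Hypothetical stand-in: collects stages and applies them in order."""

    def __init__(self, stages=()):
        self.stages = tuple(stages)

    def __or__(self, func):
        # The Pipe is always the LEFT operand of "|", so only this
        # left-hand method is needed; no RHS dispatch is involved.
        return Pipe(self.stages + (func,))

    def run(self, value):
        # Feed the value through each stage, left to right.
        for stage in self.stages:
            value = stage(value)
        return value


def double(n):
    return n * 2


def increment(n):
    return n + 1


def negate(n):
    return -n


# object binop argument1 binop argument2 binop argument3
pipeline = Pipe() | double | increment | negate
result = pipeline.run(20)
```

Each `|` returns a new `Pipe`, so the chain reads like `object.method(a1).method(a2).method(a3)` with less punctuation.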

If that's generally true, _most_ applications neither need nor want
coercions, or accommodation for RHS dispatch; and it's then reasonable
to make the rarer FBGNTs say more to get them (while most people won't
have to worry about coercions at all).

With all that in mind, here's the alternative scheme:

1) Allow but not *require* today's __coerce__ methods.

2) Allow but not require new __rcoerce__ (or __coerce_rev__; whatever)
methods.

Note that #1 and #2 follow from point #4.

3) When Python sees

    x binop y

where at least one of {x,y} is an instance, code to evaluate it looks
like:

    arg_tuple = user_binop_coerce(x, y) # defined in point #4
    return apply( arg_tuple[0].__binop__, arg_tuple[1:] )

I'd write this in C except the intent is so much clearer in Python; in
particular, arg_tuple[0].__binop__ may raise AttributeError etc.

4)
    def user_binop_coerce(x, y):
        result, source = None, ''

        if hasattr(x, '__coerce__'):
            source = '__coerce__'
            result = x.__coerce__(y)

        if result is None:
            # x.__coerce__ doesn't exist, or does but gave up
            if hasattr(y, '__rcoerce__'):
                source = '__rcoerce__'
                result = y.__rcoerce__(x)

        if source == '':
            # user didn't define any relevant coercions: fine!
            return x, y

        # user did try to coerce: ensure they succeeded
        if result is None:
            raise TypeError, source + ' returned None'

        if type(result) is not type(()):
            raise TypeError, source + ' must return None or tuple'

        if len(result) >= 2:
            return result

        raise ValueError, 'tuple returned by ' + source + ' too short'
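
To play with points 3 and 4 "as if" they were the rules, one option is to transcribe them into modern Python 3, where nothing calls `__coerce__` automatically and so the dispatch has to be invoked by hand. The sketch below is that transcription under stated assumptions: `binop` is a hypothetical helper that takes the method name as a string, and the `Cents` class is invented purely to exercise both the `__coerce__` and the `__rcoerce__` path. It is not a claim about how any interpreter does or would implement this.

```python
def user_binop_coerce(x, y):
    """Python 3 transcription of point 4."""
    result, source = None, ''

    if hasattr(x, '__coerce__'):
        source = '__coerce__'
        result = x.__coerce__(y)

    # x.__coerce__ doesn't exist, or does but gave up: let y try.
    if result is None and hasattr(y, '__rcoerce__'):
        source = '__rcoerce__'
        result = y.__rcoerce__(x)

    if source == '':
        return x, y  # no relevant coercions defined

    if result is None:
        raise TypeError(source + ' returned None')
    if not isinstance(result, tuple):
        raise TypeError(source + ' must return None or tuple')
    if len(result) < 2:
        raise ValueError('tuple returned by ' + source + ' too short')
    return result


def binop(name, x, y):
    """Point 3: coerce, then call the method on the first tuple element."""
    args = user_binop_coerce(x, y)
    return getattr(args[0], name)(*args[1:])


class Cents:
    """Hypothetical class that always promotes ints to a common type."""

    def __init__(self, cents):
        self.cents = cents

    def __coerce__(self, other):
        # Cents binop other
        if isinstance(other, int):
            return self, Cents(other)
        return None  # give up

    def __rcoerce__(self, other):
        # other binop Cents: keep the operands in their original order
        if isinstance(other, int):
            return Cents(other), self
        return None

    def __sub__(self, other):
        return Cents(self.cents - other.cents)


left = binop('__sub__', Cents(500), 200)    # found via __coerce__
right = binop('__sub__', 700, Cents(200))   # found via __rcoerce__
```

Here both calls end up in `Cents.__sub__` with two `Cents` arguments, so the operator method never has to undo or redo a coercion. Passing an operand neither method accepts (a string, say) raises `TypeError`, since a coercion was attempted and failed.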

Notes:

+ The scheme is subtler than it looks; you'll probably need to read this
over twice, & try some realistic examples, before it all clicks.

+ If you never define __[r]coerce__, coercions will never happen on your
objects, and you don't have to worry about it. In that case, though, a
RHS object can't gain control automatically.

+ Non-FBGNT classes that overload "numeric" operators today will, for the
most part, continue to work unchanged (although in most cases you could
simplify them by throwing away their __coerce__ method and changing the
operator methods to stop undoing the coercion ...).

+ FBGNTs that work today have a good chance of continuing to work _if_
the line

    __rcoerce__ = __coerce__

is added after their "def __coerce__" method. The same trick is
appropriate for any FBGNT that needs RHS dispatch, does want coercion,
and is happy to always promote to a common type.

+ FBGNTs (or other classes) that want RHS objects to gain control, and
need to know which order the arguments came in, are intended to do it
like so:

A) Define __rcoerce__ to either coerce or not, depending on the class's
needs, but in any case to return a 3-tuple with 3rd component equal
to 1.

B) Define binop methods with a third argument defaulting to 0.

E.g., for a Matrix class that wants to see its raw arguments, and also
needs to capture LHS scalars:

    class Matrix:
        def __rcoerce__(x,y):  # __coerce__ isn't needed
            if type(y) in (type(1), type(1L), type(1.0)) or \
               type(y) is type(x) and y.__class__ is Matrix:
                return x,y,1
            # presuming that other types are too scary and
            # so we fail (return None)

        def __add__(x, y, swapped=0):
            # Matrix + 3 doesn't find any coercions to apply,
            # so invokes __add__(Matrix, 3)

            # 3 + Matrix invokes __rcoerce__(Matrix,3),
            # which returns Matrix,3,1, and so
            # __add__(Matrix, 3, 1) is invoked

            # 'swapped' happens to be irrelevant to __add__

            if type(y) in (type(1), type(1L), type(1.0)):
                # copy x to new matrix, adding y to
                # each element, and return that
            else:
                # whatever

        def __mul__(x, y, swapped=0):
            # very much like __add__
            if type(y) in (type(1), type(1L), type(1.0)):
                # copy x to new matrix, multiplying each
                # element by y, and return that
            else:
                # whatever

        def __sub__(x, y, swapped=0):
            # 'swapped' is important here

            xsign = 1

            if type(y) in (type(1), type(1L), type(1.0)):
                if swapped: xsign, y = -1, -y
                # copy xsign*x to new matrix, subtracting y from
                # each element, and return that
            else:
                # whatever

etc. I haven't yet tried recoding enough classes under this specific
scheme to be sure, but the ease with which (currently) tough nuts like
the Matrix class get cracked is encouraging.

+ The scheme *allows* for __[r]coerce__ to return tuples of any length
(>= 2). It *needs* to allow for 3-tuples in order to sneak argument-
ordering info into methods of classes that want RHS dispatch. There
may or may not be a good use for tuples bigger than that; I'm not sure
yet.
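
For a runnable feel of the 3-tuple trick, here is a small sketch in modern Python 3 syntax. Everything in it is an assumption made for illustration: `dispatch` is a condensed, hypothetical version of points 3 and 4 (one generic error instead of the detailed checks), and this toy `Matrix` stores nested lists and handles scalar operands only. It writes the swapped case of `__sub__` directly as `other - e` rather than using the `xsign` trick; the result should be the same.

```python
def dispatch(name, x, y):
    """Condensed, hypothetical version of points 3 and 4."""
    tried, args = False, None
    if hasattr(x, '__coerce__'):
        tried, args = True, x.__coerce__(y)
    if args is None and hasattr(y, '__rcoerce__'):
        tried, args = True, y.__rcoerce__(x)
    if not tried:
        args = (x, y)  # no coercions defined at all
    elif args is None:
        raise TypeError('coercion attempted but failed')
    return getattr(args[0], name)(*args[1:])


class Matrix:
    """Toy matrix over nested lists; scalar operands only."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __rcoerce__(self, other):
        # scalar binop Matrix: return the raw arguments, Matrix first,
        # plus a third component of 1 to record that they were swapped.
        if isinstance(other, (int, float)):
            return self, other, 1
        return None

    def _map(self, func):
        return Matrix([[func(e) for e in row] for row in self.rows])

    def __add__(self, other, swapped=0):
        # 'swapped' is irrelevant: scalar addition commutes.
        if isinstance(other, (int, float)):
            return self._map(lambda e: e + other)
        raise TypeError('only scalar operands in this sketch')

    def __sub__(self, other, swapped=0):
        # 'swapped' matters: Matrix - 3 differs from 3 - Matrix.
        if isinstance(other, (int, float)):
            if swapped:
                return self._map(lambda e: other - e)
            return self._map(lambda e: e - other)
        raise TypeError('only scalar operands in this sketch')


m = Matrix([[1, 2], [3, 4]])
a = dispatch('__sub__', m, 1)    # no coercion found: m.__sub__(1)
b = dispatch('__sub__', 10, m)   # __rcoerce__ gives (m, 10, 1)
c = dispatch('__add__', 3, m)    # swapped flag passed but ignored
```

The third tuple component arrives as the `swapped` argument only when the Matrix was on the right, which is all `__sub__` needs in order to tell the two cases apart.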

more-like-python-than-python<grin>-ly y'rs - tim

Tim Peters tim@ksr.com
not speaking for Kendall Square Research Corp