Easy "switch statements" and "function overloading" in python

Jim Roskind (jar@infoseek.com)
Sun, 3 Apr 1994 13:27:52 +0800

SIAK, but this message explains how to (IMHO) do fast, efficient,
readable case statements and function overloading using the features
*already* present in Python. I think the resulting code is more
readable and maintainable, and it does *not* require a language
extension. I've been using it in a variety of contexts now, and it
performs admirably :-).

The technique I'll describe is based on one of the standard
approaches to compiling switch statements, but plays on the strengths
of Python to "get the job done." I'll start with the case statement
example, because it provides the underpinnings for my function
overloading technique. C coders who are very familiar with function
pointers may guess where I'm going, and jump to the chase (re:
examples in the appendices).

The typical implementation of a "switch" action uses
if-elif-... constructs (as suggested in the tutorial and in the FAQ):

def process(event):
    if event == 'call':
        block_of_code...
    elif event == 'return':
        block_of_code...
    elif event == 'except':
        block_of_code...
    else:
        raise UnknownEvent, event
    return result

The bad thing about the above pseudo switch is that it:

1) uses a sequential test for each possibility (which is slow
when there are lots of choices);

2) indents all the "block_of_code..." fragments an extra
level (which disrupts readability, unless you make them function calls
;-) );

3) makes it hard, in a large switch statement, to see what all the
covered options are.

The following is my technique for doing the switch. The first
thing I fix is the modularity, by defining functions for each case:

def process_call():
    return result

def process_return():
    return result

def process_except():
    return result

Many folks would agree that the above functional modularization makes
code a bit more readable, but would (at first) be bothered by the
additional cost of an extra function call. All this vanishes when the
rest of the implementation of the switch statement is shown:

def process(event):
    return process_func_table[event]()

process_func_table = {
    'call'  : process_call,
    'return': process_return,
    'except': process_except,
    }

And that's it! This is a fine (though perchance, technical)
application of Python's associative arrays (and the speed with which
they are evaluated) to reduce what *was* a linear search of options
:-). Notice that you can add extra entries in the switch statement
*very* easily (just extend the table). If you are concerned with
providing safe and correct default case handling, you can change the
definition of process() to:

def process(event):
    if process_func_table.has_key(event):
        return process_func_table[event]()
    return process_default()

If you wanted to exactly replicate the original code (with the
exception it raised), then you could write:

def process(event):
    if process_func_table.has_key(event):
        return process_func_table[event]()
    raise UnknownEvent, event

I'll give a "complex" example of this in Appendix A of this
email note. Note that this technique is *only* used in compilers
when the number of case entries exceeds some threshold. The same is
clearly true here in Python (i.e., for cases with 2 or 3 entries, it is
often more work than it is worth to transition to this method).
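To make the pattern concrete, here is a minimal, self-contained sketch
in the same spirit (the handler bodies and their 'handled ...' return
values are invented purely for illustration):

```python
# Dispatch-table "switch": each case is a small function, and a
# dictionary maps event names to those functions.  The handler
# bodies below are placeholders invented for illustration.

def process_call():
    return 'handled call'

def process_return():
    return 'handled return'

def process_except():
    return 'handled except'

def process_default():
    return 'handled default'

process_func_table = {
    'call'  : process_call,
    'return': process_return,
    'except': process_except,
    }

def process(event):
    # One hashed lookup replaces the linear if-elif scan; an unknown
    # event falls back to the default handler.
    return process_func_table.get(event, process_default)()
```

Extending the switch is just adding a key/value pair to the table; the
wrapper never changes.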

If I haven't lost you so far, the big coup is about to appear. The
trick to overloading functions is to realize that a typical
implementation (in most languages) uses some criterion to switch on the
type of the argument in order to select a function! The good news is
that Python gives us pretty much exactly what we need in the built-in
function "type()."

A typical "overloaded" function call is simply given a small
dispatching wrapper:

def my_print(val):
    return my_print_func_table[type(val)](val)

The same sort of caveats and adjustments are needed if you want a
default function, or wish to change the exception when a "surprising"
type is brought in. You should also note that instances of *all* user
defined classes are "<type 'instance'>". As a result, if you want
different overloading for each user defined class, you'd have to have
one more dispatch level:

def my_print_instance(val):
    return my_print_instance_func_table[val.__class__](val)

A more complete example is given in Appendix B, including a fairly
large table.
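As a concrete sketch (with the classes Point and Span invented purely
for illustration): when each user defined class is its own distinct
type, the class objects themselves can key a single table, collapsing
the two dispatch levels into one:

```python
# Overloaded my_print() selected by argument type.  The Point and
# Span classes and the formatted return strings are illustrative
# stand-ins, not part of the original note.

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

class Span:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

def my_print_int(val):
    return 'int: %d' % val

def my_print_str(val):
    return 'str: %s' % val

def my_print_point(val):
    return 'point: (%d, %d)' % (val.x, val.y)

def my_print_span(val):
    return 'span: %d..%d' % (val.lo, val.hi)

my_print_func_table = {
    int  : my_print_int,
    str  : my_print_str,
    Point: my_print_point,
    Span : my_print_span,
    }

def my_print(val):
    # type(val) keys the table; each type gets its own handler.
    return my_print_func_table[type(val)](val)
```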

For those of you who might argue: "This is a terrible hack, and
requires the programmer to do all this stuff when a compiler should,"
I have an answer (history will decide the quality of my answer). I've
seen other languages (example: C++ ) *try* to have nice concise rules
for how to resolve overloading on multiple arguments. The results
were neither nice, nor concise. I like the fact that with this
technique you get to control the selection of the overloaded function,
and that you do not have to rely on an arbitrary set of rules (which
are too numerous to remember). You can overload in any way you want
by building an appropriate dispatch table in advance. Note that my use
of the word "building" in the last sentence can include either "defining"
a table by hand or writing a program that generates one! ;-)

Most importantly, with the above two techniques, function overloading
and switch statements are here *today* in Python. IMHO, the code you
write with these techniques will be readable, and it will rely on
Python features (mostly associative arrays) to get the job done.

We now return you to your regularly scheduled mailing list :-).


Jim Roskind

APPENDIX A: Switch Statement Example

The following is an advanced example of using this switch technique in
my Python profiler. It is one of a variety of techniques that makes
the event processing time of my profiler very low.

cut here--------------------------------
class Profiler:
    def trace_dispatch(self, frame, event, arg):
        delta_time = ...
        dispatch[event](self, frame, delta_time)

    def trace_dispatch_exception(self, frame, t):
        ...

    def trace_dispatch_call(self, frame, t):
        ...

    def trace_dispatch_return(self, frame, t):
        ...

dispatch = {
    'call'     : Profiler.trace_dispatch_call,
    'return'   : Profiler.trace_dispatch_return,
    'exception': Profiler.trace_dispatch_exception,
    }

cut here---------------------------------------

In my eternal search for speed in my profiler, I even took the above
code a few steps further. I noted that it was wasteful to
time-and-again bind the entry in the dispatch function to "self." The
following is the optimized version. This optimization was *ONLY* done
because dispatching is *THE* time critical element of profiling
overhead. I give the example here because it also shows how bound
methods can be entered into tables.

cut here--------------------------------
class Profiler:
    def __init__(self, *arg):
        self.dispatch = {
            'call'     : self.trace_dispatch_call,
            'return'   : self.trace_dispatch_return,
            'exception': self.trace_dispatch_exception,
            }

    def trace_dispatch(self, frame, event, arg):
        delta_time = ...
        self.dispatch[event](frame, delta_time)

cut here---------------------------------------

Note that in the above example an instance is created very rarely
(once per profile), but the trace_dispatch happens *very* often. As a
result, I chose to use bound methods during the setup of the dispatch
table (re: "self.trace_dispatch_return" rather than just
"trace_dispatch_return"). This decision has potential consequences
(early binding of methods) that I'll note exist, but won't discuss
here. I will note that this change did reduce the overhead in my
profiler.
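The same caching idea can be shown in a self-contained sketch (the
class, event names, and handler bodies here are invented for
illustration):

```python
# A per-instance dispatch table holding *bound* methods: the binding
# of each handler to self happens once, in __init__, rather than on
# every dispatch.  All names here are illustrative stand-ins.

class Dispatcher:
    def __init__(self):
        # self.on_call and self.on_return are bound methods, so the
        # table entries can be called without passing self again.
        self.dispatch = {
            'call'  : self.on_call,
            'return': self.on_return,
            }

    def handle(self, event, arg):
        return self.dispatch[event](arg)

    def on_call(self, arg):
        return ('call', arg)

    def on_return(self, arg):
        return ('return', arg)
```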

APPENDIX B: Function Overload Example

The following example is taken from my code that implements a
persistent object class. The simple (human-readable object store)
implementation creates a string, such that eval'ing that string will
"reconstitute" the original object (or provide ref to the memory
resident incarnation of the object). The repr() function does this
nicely for internal scalar types, but does not handle recursive
references properly, and does not isolate distinct persistent objects
(in order to prevent two separate incarnations from being formed).

The bottom line is that I needed a function that looks like repr(), but
"does the right thing." The following is the implementation of
Prepr(), which is everything that repr() wanted to be. It was critical
that I have a different Prepr() handler for each Python type, so
function overloading was a natural here.

def Prepr ( Value ):
    global Prepr_func_table
    try:
        prepr_func = Prepr_func_table[type(Value)]
    except KeyError:
        raise EncodingError, "Type '" + `type(Value)` \
              + "' cannot be saved persistently"
    return prepr_func(Value)

Prepr_func_table = {
    type( None )              : repr,
    type( "string" )          : repr,
    type( 1 )                 : repr,
    type( 1L )                : repr,
    type( 3.14 )              : repr,
    type( [] )                : Prepr_list,
    type( {} )                : Prepr_dict,
    type( () )                : Prepr_tuple,
    type( sys )               : Prepr_module,
    type( PersistentObject )  : Prepr_class,
    type( PersistentObject() ): Prepr_instance,
    }

Note that in the above table, the same function 'repr' is provided in
several of the entries. This is another *big* advantage of defining
your own overload resolution table. In C++, for example, I would have
had to write separate functions for each of these entries! I *hate*
repeated code, and such rewriting would be a great travesty (of Python
justice), because a **slow** user defined function would have to be
used as a wrapper around a **fast** builtin function (re: repr).

I'll have to leave it to the reader to work out the rest of
the code in my class. Note that the name of the class is
PersistentObject, and hence the last two entries in the table are for
a "user defined class" and for an "instance of a user defined class,"
respectively. Without function overloading, as shown in this note, it
would be very painful to write/read all the actual underlying code :-).
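For readers who want something they can run, here is a stripped-down,
self-contained sketch of the same shape. The EncodingError class and
the list handler are illustrative stand-ins, and only a few types
appear in the table:

```python
# A Prepr-style table: fast builtin repr serves the scalar types
# directly, and a custom handler covers lists by recursing through
# the same dispatcher.

class EncodingError(Exception):
    pass

def prepr_list(val):
    # Encode each element through the dispatcher; brackets mirror
    # repr's own list syntax.
    return '[' + ', '.join(map(prepr, val)) + ']'

prepr_func_table = {
    type(None): repr,
    str       : repr,
    int       : repr,
    float     : repr,
    list      : prepr_list,
    }

def prepr(val):
    try:
        func = prepr_func_table[type(val)]
    except KeyError:
        raise EncodingError("Type %r cannot be saved persistently"
                            % type(val))
    return func(val)
```

An unlisted type (a tuple, say) falls out of the table lookup and
raises EncodingError, matching the default-case handling described
above.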