Re: Jim Fulton : Extensible compound statements -- new exec flavor

Jim Fulton (jfulton@disqvarsa.er.usgs.GOV)
Wed, 1 Feb 1995 13:34:32 GMT

>>>>> "Guido" == Guido van Rossum <Guido.van.Rossum@cwi.nl> writes:
In article <9501281121.AA16443=guido@guppie.cwi.nl>
Guido.van.Rossum@cwi.nl writes:

> Jim Fulton asked my opinion about his proposal for a "transaction"
> statement. I'd like to see this discussion back in public (I sort
> of missed the first round) so I'm cc'ing my reply to the list, but I'm
> not quoting Jim's material -- you can look it up in the mailing list
> archives if you need to <URL:http://www.cwi.nl/~guido/hypermail/>.
> (If Jim thinks parts are unclear he can post his own message to the
> list or add it to his reply to this one.)

I could not find my original post at this URL. I also sent Guido a
longer and, I hope, clearer message on this subject. I'll include it
here for reference and respond to Guido's post in a separate message.

On Fri, 27 Jan 1995 17:24:03 +0100
Guido.van.Rossum@cwi.nl said:
> Jim -- I got totally confused about what was proposed. For some
> reason I received some of the replies before the original proposal, so
> then I waited for the proposal, and either it never came or it came at
> a time when I didn't have the time to read it and then I refiled it or
> forgot about it. So I don't have the proposal handy.
>
> I *think* it was mainly syntactic sugar for
>
>     StartTransaction()
>     ok = 0
>     try:
>         ...some code...
>         ok = 1
>     finally:
>         if ok:
>             CommitTransaction()
>         else:
>             RollbackTransaction()
>
> is that right?

More or less, except that it could also be syntactic sugar for
other things, such as:

    some_lock.acquire()
    try:
        ...some code...
    finally:
        some_lock.release()

> If you can jog my memory about the exact syntax that you were
> proposing I'll have another look. (This normally doesn't happen, but
> I'm banned from my office due to renovations, and although I can work
> from home, my email fluency has dropped dramatically...)

No problem.

I suggested something that would make transactions much cleaner, but
would make other things easier as well. I'm *not* suggesting that you
add transaction support directly; rather, I'm suggesting a way that
people can create limited execution control extensions.

I will first give my proposed extension as briefly and clearly as I
can and then give a more long-winded explanation of why this would help
my research with transaction models. I will also briefly show some
other applications.

Proposed new exec compound statement:
-------------------------------------


Add a compound exec statement, cexec_stmt:

    cexec_stmt: "exec" callable ":" suite
                ["else" ":" suite]

(The description above is an attempt to follow the convention for
statement syntax description used in chapter 7 of the Python
Reference.)

This form of exec evaluates the expression callable, which must
yield a callable object. The callable object is called with four
arguments:

- A compiled code object for the first suite,
- The local dictionary,
- The global dictionary, and
- A compiled code object for the second suite or None if no else
suite was provided.
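
The calling convention above can be emulated in present-day Python,
where exec is a built-in function rather than a statement. This is only
a sketch: the helper run_cexec and the callable run_plain are
hypothetical names, the suites are passed as source strings because the
proposed syntax does not exist, and running the else suite after the
first suite is just one choice the callable could make.

```python
def run_cexec(handler, suite_src, local_dict, global_dict, else_src=None):
    """Emulate the proposed statement: compile the suites, then hand
    them to the callable together with the two namespaces."""
    suite = compile(suite_src, "<exec suite>", "exec")
    else_suite = None
    if else_src is not None:
        else_suite = compile(else_src, "<else suite>", "exec")
    # The four arguments, in the order listed in the proposal.
    return handler(suite, local_dict, global_dict, else_suite)


def run_plain(suite, local_dict, global_dict, else_suite=None):
    """A trivial callable: run the suite, then the else suite if any."""
    exec(suite, global_dict, local_dict)
    if else_suite is not None:
        exec(else_suite, global_dict, local_dict)


namespace = {"x": 1}
run_cexec(run_plain, "x = x + 1", namespace, globals(), "y = x * 10")
```

Both suites run against the same local dictionary, so names assigned in
the first suite are visible in the second.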

If a break or a continue statement is executed in either of the
suites, then an associated exception is raised, which may be
handled by the callable object in some fashion.

Note that no new keyword is necessary and the change is backward
compatible. The existing exec syntax will still work, I believe.

Advantages:

- New control constructs, such as transactions, can be modelled.
- Fully backward compatible.

Disadvantages:

- Guido has to modify the language. (Slightly?)

There is one complication:

What if an exec suite contains a return statement? For example:

    def spam():
        ...
        exec eggs:
            ...
            if scrambled: return fried

One might want to have the return return from spam, even though the code
that contains the return is executed in eggs or something called by
eggs.

One possibility is to disallow returns in exec suites.

Another possibility is to cause returns from exec suites to raise
an exception, such as ExecReturn, with the returned value as the
exception value. Then it would be cool if somehow an unhandled
(or re-raised) ExecReturn caused a return from the function that
defined the exec suite. This actually gets rather ugly in a
number of ways, so I could definitely live with not allowing
returns from exec suites.
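
A rough sketch of the ExecReturn idea in present-day Python follows.
Everything here is an assumption made for illustration: the suite
raises ExecReturn explicitly (a stand-alone compiled suite cannot
contain a real return), and the enclosing function converts the
exception into a return by hand, which is the step the language would
have to perform implicitly.

```python
class ExecReturn(Exception):
    """Hypothetical exception carrying the value of a return in a suite."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def eggs(suite, local_dict, global_dict, else_suite=None):
    # The callable does not catch ExecReturn, so it propagates.
    exec(suite, global_dict, local_dict)


def spam(scrambled):
    # Stand-in for "if scrambled: return fried" inside the exec suite.
    suite = compile("if scrambled: raise ExecReturn('fried')",
                    "<suite>", "exec")
    try:
        eggs(suite, {"scrambled": scrambled}, globals())
    except ExecReturn as stop:
        return stop.value   # the unhandled ExecReturn becomes spam's return
    return "not fried"
```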

An interesting diversion:

    def foo():
        try:
            return 1
        finally:
            return 2

    foo() # returns 2

Alternatives:

- Extend the try statement in a similar fashion. This would
  involve allowing the try statement to have an expression that
  evaluates to a callable:

      try callable:
          ...

  Advantages:

    - Makes use of the existing compound statement try.
    - Tightly integrates with exception handling (this seems
      particularly appealing for nested transactions, as pointed
      out in the newsgroup).

  Disadvantages:

    - A more complicated interface to the callable object is
      needed. For example, to preserve try semantics, I think
      that code objects would have to be passed for each of the
      except conditions and each of the except suites.
    - Tightly integrates with exception handling. (This may not
      make sense for non-transaction applications.)

- Invent some altogether new statement for this purpose.

  Advantages:

    - Complete freedom to design whatever syntax one might want.

  Disadvantages:

    - New keywords are necessary.

So how would this help with modeling transactions and other things in
Python?

Transactions can be modelled as objects, and I have gone pretty far
with an implementation of nested transactions based on transaction
objects. The question is, how are transactions used? Should the
transaction objects be visible directly to the programmer? My original
model was that a Transaction module would define a transaction class
and would keep track of the current transaction in each thread. The
programmer would do things like:


    Transaction.Begin()
    #some code
    Transaction.End()

Now consider the nested case:

    Transaction.Begin()
    # Now in transaction T1.
    try:
        Transaction.Begin()
        # Now in transaction T2
        # some code
        Transaction.End()
    except Transaction.Abort:
        # recover from failure of T2
    except e1:
        # handle some other exception
    except e2:
        # handle some other exception
    except e3:
        # handle some other exception
    ... more except blocks

    # Do some more stuff
    Transaction.End()

Now, the problem is: what if a non-abort exception occurs while
executing T2? Transaction T2 has not completed. There is no way for
the transaction class to know that the exception has occurred. Unless
the transaction is explicitly aborted in all of the exception suites
for e1, e2, e3, ..., the transaction will not be properly
aborted. In fact, the last Transaction.End() will incorrectly commit
transaction T2 and fail to commit T1. Things are really much worse
than this. In non-trivial applications, flow of control may be much
more complicated and it may be difficult to ensure that
Transaction.Begin()s and Transaction.End()s are properly matched.
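
The mismatch can be reproduced with a toy model in present-day Python.
The Transactions class below is a hypothetical stand-in for the
Transaction module, reduced to a stack of open transaction names, and
KeyError plays the part of "some other exception".

```python
class Transactions:
    """Toy stand-in for the Transaction module: a stack of open names."""

    def __init__(self):
        self.open = []
        self.committed = []

    def begin(self, name):
        self.open.append(name)

    def end(self):
        # End() can only assume the innermost open transaction is meant.
        self.committed.append(self.open.pop())


def some_code():
    raise KeyError("some other exception")


tx = Transactions()
tx.begin("T1")
try:
    tx.begin("T2")
    some_code()      # raises, so the End() for T2 is skipped
    tx.end()
except KeyError:
    pass             # the handler forgets to abort T2

tx.end()             # intended to end T1
```

After the final call, T2 has been committed and T1 is still open, which
is the failure described above.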

I see two solutions. First, I could require the programmer to
specify the transaction that is being committed or aborted, as in:

    t1 = Transaction.Begin()
    # Now in transaction T1.
    try:
        t2 = Transaction.Begin()
        # Now in transaction T2
        # some code
        t2.End()
    except Transaction.Abort:
        # recover from failure of T2
    except e1:
        # handle some other exception
    except e2:
        # handle some other exception
    except e3:
        # handle some other exception
    ... more except blocks

    # Do some more stuff
    t1.End()

Now we have two extra variables t1 and t2 to worry about. What if
there are no aborts but we do t1.End() before doing t2.End()? What
should the result be? Do we commit t2? Or abort both? What if we
create a separate thread to run t2? Should a commit of t1 before a
commit of t2 block until t2 has committed? What if execution branched
around the commit of t2 so that t2 never commits? We can define
these semantics, but I think that whatever we define will lead to
confusion.

Another solution is to require a function call to execute a
transaction. For example:

    def t1():
        # Now in transaction T1.

        def t2():
            # Now in transaction T2

        try:
            Transaction.Do(t2)
        except Transaction.Abort:
            # recover from failure of T2
        except e1:
            # handle some other exception
        except e2:
            # handle some other exception
        except e3:
            # handle some other exception
        ... more except blocks

    Transaction.Do(t1)


Where Transaction.Do is something like:


    def Do(f,*args):

        Transaction.Begin()
        try:
            apply(f,args)
            Transaction.End()
        except Transaction.Abort:
            raise sys.exc_type,sys.exc_value
        except:
            Transaction.Abort()
            raise sys.exc_type,sys.exc_value


Do is able to detect exceptions and make sure that the transaction
completes before returning.

The second solution is much cleaner than the first in my opinion, but
in the user code, the user has to create two extraneous functions, and
these must have different local dictionaries and cannot access the
surrounding local namespace if they appear in functions.

By far, the cleanest approach IMHO is:

    exec Transaction.Code:
        # Now in transaction T1.
        try:
            exec Transaction.Code:
                # Now in transaction T2
                # some code
        except Transaction.Abort:
            # recover from failure of T2
        except e1:
            # handle some other exception
        except e2:
            # handle some other exception
        except e3:
            # handle some other exception
        ... more except blocks

Note that I don't need any extraneous variables or functions. The
transactions can share the same name space if I want them to. I don't
have to worry about forgetting to commit a transaction, as the commits
are implied by the block structure.

In this last version, Transaction.Code is a callable object. It could be
something like:

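    # Arguments follow the order given in the proposal: suite, local
    # dictionary, global dictionary, optional else suite. The names
    # avoid the reserved words global and else.
    def Code(suite, local_dict, global_dict, else_suite=None):

        Transaction.Begin()
        try:
            exec suite in global_dict, local_dict
            Transaction.End()
        except Transaction.Abort:
            raise sys.exc_type, sys.exc_value
        except:
            Transaction.Abort()
            raise sys.exc_type, sys.exc_value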
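
For readers who want to run something, here is a minimal sketch of the
same callable in present-day Python, where exec is a function. The
Abort class and the events list are hypothetical stand-ins for the
Transaction module; the toy only records begin, commit, and abort and
does not roll back any state.

```python
class Abort(Exception):
    """Hypothetical stand-in for Transaction.Abort."""


events = []   # records what the toy transaction manager did


def code(suite, local_dict, global_dict, else_suite=None):
    """Run suite as a transaction: commit on success, abort on error."""
    events.append("begin")
    try:
        exec(suite, global_dict, local_dict)
        events.append("commit")
    except Abort:
        raise                  # already aborted; just propagate
    except Exception:
        events.append("abort")
        raise


shared = {"balance": 100}
code(compile("balance = balance - 30", "<T1>", "exec"), shared, {})
try:
    code(compile("balance = balance / 0", "<T2>", "exec"), shared, {})
except ZeroDivisionError:
    pass
```

Both suites use the same dictionary as their local namespace, which is
the sharing that the function-based Do approach cannot offer.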

Some other applications:

My proposal allows a much cleaner transaction syntax to be used *and*
it makes other useful things possible.

Critical regions:

As someone else pointed out, it can be used to implement critical
regions. Given a lock, you could have something like:

    exec Critical_Region(some_lock):
        # Some code
        ...

where the lock is acquired before entering the code block and is
released when leaving the code block, whether leaving the code
block at the end, through an exception, or with a break or
continue.
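
A sketch of such a callable in present-day Python, assuming the
four-argument protocol described earlier. The class name CriticalRegion
is hypothetical, and the suite is compiled from a string because the
proposed syntax does not exist.

```python
import threading


class CriticalRegion:
    """Callable that runs a suite while holding a lock (a sketch)."""

    def __init__(self, lock):
        self.lock = lock

    def __call__(self, suite, local_dict, global_dict, else_suite=None):
        self.lock.acquire()
        try:
            exec(suite, global_dict, local_dict)
        finally:
            self.lock.release()   # runs on normal exit and on exceptions


some_lock = threading.Lock()
region = CriticalRegion(some_lock)
state = {"count": 0}
region(compile("count = count + 1", "<critical>", "exec"), state, {})
```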

Threads as blocks:

    exec Thread:
        ... some code that runs as a separate thread ...
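
One way such a callable might look in present-day Python, again
assuming the four-argument protocol. The name thread_block is
hypothetical, and returning the Thread object so the caller can join it
is an assumption of this sketch; the proposal does not say what happens
to the callable's return value.

```python
import threading


def thread_block(suite, local_dict, global_dict, else_suite=None):
    """Start the suite in a new thread and return the Thread object."""
    worker = threading.Thread(target=exec,
                              args=(suite, global_dict, local_dict))
    worker.start()
    return worker


results = {}
worker = thread_block(compile("answer = 6 * 7", "<thread>", "exec"),
                      results, {})
worker.join()   # wait for the suite to finish before reading results
```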

Event handlers as blocks:

    exec some_widget.addCallback('someCallbackName'):
        ... The callback code ...
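
Here the callable would store the suite instead of running it
immediately. A sketch in present-day Python, assuming the four-argument
protocol; the Widget class and its fire method are hypothetical and
stand in for whatever toolkit supplies some_widget.

```python
class Widget:
    """Hypothetical widget that stores suites as named callbacks."""

    def __init__(self):
        self.callbacks = {}

    def addCallback(self, name):
        def register(suite, local_dict, global_dict, else_suite=None):
            # Do not run the suite now; keep it for when the event fires.
            self.callbacks[name] = (suite, local_dict, global_dict)
        return register

    def fire(self, name):
        suite, local_dict, global_dict = self.callbacks[name]
        exec(suite, global_dict, local_dict)


some_widget = Widget()
env = {"clicks": 0}
register = some_widget.addCallback("someCallbackName")
register(compile("clicks = clicks + 1", "<callback>", "exec"), env, {})
some_widget.fire("someCallbackName")
some_widget.fire("someCallbackName")
```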

--
-- Jim Fulton      jfulton@mailqvarsa.er.usgs.gov    (703) 648-5622
                   U.S. Geological Survey, Reston VA  22092 
This message is being posted to obtain or provide technical information
relating to my duties at the U.S. Geological Survey.