New Python Syntax -- Guido's Reply

Guido.van.Rossum@cwi.nl
Thu, 26 May 1994 15:58:24 +0200

Next message: spoon: "Re: [RFC] Draft Proposal for New Python Syntax"
Previous message: Michael Lee Keller: "Re: [RFC] Draft Proposal for New Python Syntax"
Next in thread: Chris Hoffmann: "New Python Syntax -- Guido's Reply"

First, I would like to thank Thomas Kofler for sticking his neck out
and writing a proposal. I would also like to thank everyone else who
responded for making their voices heard. I hope we can keep this
discussion relatively flame free!

My own response to the CURRENT proposal is: no way (but see below).

My reasons for rejecting this proposal are a mix of intuition about
elegance, and backward compatibility concerns. I'll spare you my
intuition since it is hard to discuss rationally -- several people
have already posted their strong support for the current, elegant
syntax. So, on to backward compatibility.

The proposal breaks every Python program in the world in a major way!

While existing code could in principle be fixed automatically, the
fixer would have to be a fairly complicated program (probably
containing a full parser for Python) that would still make mistakes in
certain cases. It might also mess up the alignment of comments in
certain cases. It would mean a major hassle for all Python
programmers in the world to find all their programs and fix them all
-- I expect I'm not the only one who has Python code in dozens of
different directories all over my directory tree. Not all my Python
scripts are recognizable by their .py extension either! I expect that
users on PCs and Macs would have an even harder time, since the tools
for finding files and applying a tool to them all are more primitive
than on UNIX.

Even if the problem of fixing all existing Python code could be solved
perfectly by automation, two significant conversion problems remain
that CAN'T be automated: documentation and tools that manipulate
Python source code. I'm sure that many people have written local
guides to Python (e.g. the U. of Virginia people, who have been using
Python in their VR system for years now) as well as tools for
manipulating Python source code (e.g. Tim Peters' python-mode.el for
Emacs, and my own scripts for fixing various minor syntax
incompatibilities in the past). All this documentation would have to
be updated, and all these tools would have to be rewritten. I just
can't do that to the Python community at this stage.

If you still think that breaking all Python code in the world is only
a minor problem, don't read on -- you don't need to reply, there is
nothing more I can say about this issue...

Next, I am a bit worried about remarks that some people make along the
lines of "we think Python is great except for its use of indentation
for grouping, and if you don't fix this problem we won't be able to
use it" (e.g. Richard Golding, Joerg Wittenberger). Unless you
explain WHAT is wrong with the current syntax for your application,
there's little chance that a solution will emerge that will please
you. (A challenge for you is at the end of this message.)

On towards a more constructive path: let me formulate some
requirements for a new proposal. There are at least two types of
requirements: (1) backwards compatibility; (2) solve the right
problem.

(1) Backwards Compatibility
---------------------------

Existing working Python code must continue to work, without warnings,
except possibly if a new reserved word (such as 'do', 'begin' or
'end') is introduced, for code where those happen to be used as
identifiers (that shouldn't be much code anyway). In particular, this
means that in general whitespace remains significant (at least at the
start of lines). It also means that any use of braces for grouping is
out -- as several have pointed out, braces are needed for dictionary
displays. Even if it weren't syntactically ambiguous, I don't think
it would be a good idea to overload braces for another purpose.
(This doesn't rule out compound tokens involving braces, like '{|'
'|}'.)

Solutions that leave old code working but encourage the migration to a
different style will also be frowned upon -- I quite like the current
style and from the responses it looks like I'm not the only one. For
short programs it will remain the preferred style. In other words,
any proposal should describe an optional new style.

Programs that manipulate Python source code may have to be adapted to
deal with code that uses the new style, but not in big ways. Maybe
there could be a tool to convert the new style to the old style so
unchanged tools might still be applied.

Documentation should not need to be adapted except by some short
paragraphs explaining the new style and how it maps to the documented
/ preferred style.

One final remark: we can't have a global switch that affects the
syntax -- it would make it impossible to mix modules written in the
two styles. A switch (e.g. 'pragma') per file would be acceptable
though.

(2) Solve the Right Problem
---------------------------

There are numerous different versions of what the problem is.

Thomas' proposal states as a requirement the need to eliminate all
syntactic meaning of white space. Even if we forget about the meaning
of whitespace to separate keywords and identifiers, his proposal
leaves one use of whitespace unchanged: a newline ends a comment.
(Well... come to think of it, Ada does the same.) I think this is a
little extreme. It is also in violation of the backwards
compatibility requirement. So I suggest we try to find a less extreme
problem statement first.

Donald Beaudry once proposed (on 14 March) an extension to allow ':('
and ':)' -- the "frowning Guido" and "happy Guido" tokens -- as
alternative block delimiters in lambda forms only. This solves a
totally different problem: the desire to have multi-line lambda's.
(Why stop at lambda's? If lambda's can contain arbitrary blocks, we
could introduce "valof" blocks as in BCPL. Seriously, if you need a
multi-line lambda, you can always introduce a local name for it and
use a local function definition instead.)
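
The workaround in the last sentence can be sketched as follows. This is
only an illustration written for current Python 3 (where a nested
function can see the variables of the enclosing function); the names
make_clamper, clamp and clamp_percent are made up for the example.

```python
def make_clamper(low, high):
    # A local function definition stands in for a multi-line lambda:
    # the body may contain any number of statements.
    def clamp(value):
        if value < low:
            return low
        if value > high:
            return high
        return value
    # The local name can be returned or passed around like a lambda.
    return clamp


clamp_percent = make_clamper(0, 100)
results = list(map(clamp_percent, [-5, 42, 250]))
```

The local name is used exactly where the lambda would have appeared,
here as the first argument to map().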

As an extension, Donald proposed (actually he already proposed this on
10 March, but with a buggier implementation) to allow the same tokens
as general block delimiters that effectively switch off the meaning of
INDENT, DEDENT and NEWLINE. The latter has the unfortunate effect
that you have to use semicolons after each statement. I also think
that there might be some problems with parsing of the short forms of
if...else... and similar statements, since it's not clear how you end
the else-suite if you don't use a block. For example:

if a==b: :(
# The indentation in this block is NOT significant
if b==c: print 'a==b==c';
else: print 'a==b!=c';
print 'Yo!';
:)

I think the "print 'Yo!'" would be seen by the parser (but not by most
human readers!) as part of the else clause. Maybe such details can be
fixed by revising the grammar somewhat more. (Note that by making his
delimiters out of two existing tokens, the lexer AUTOMATICALLY shuts
off the meaning of whitespace! Very clever, Donald...)
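
The remark about the lexer can be illustrated with the tokenizer that
ships with current Python. This is an assumption-laden sketch: it uses
the modern tokenize module, not the 1994 lexer, and it presumes that the
effect comes from ':(' containing an open parenthesis. Inside brackets,
line breaks are reported as NL rather than NEWLINE, and no INDENT or
DEDENT tokens are produced. The helper name layout_tokens is made up
for this example.

```python
import io
import tokenize


def layout_tokens(source):
    """Return the names of the layout-related tokens found in source."""
    layout = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT, tokenize.DEDENT}
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tokenize.tok_name[tok.type] for tok in tokens if tok.type in layout]


# Ordinary block: indentation and newlines are significant.
plain = "if a:\n    b = 1\n"

# Same kind of ragged layout, but inside parentheses.
bracketed = "x = (\n        1 +\n    2\n)\n"

plain_layout = layout_tokens(plain)
bracketed_layout = layout_tokens(bracketed)
```

Comparing the two lists shows INDENT and DEDENT only for the plain
block; the bracketed text yields a single NEWLINE, after the closing
parenthesis.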

Anyway, I can only guess what problem it was exactly that Donald
wanted to solve with this hack. My best guess is that he wanted an
alternative syntax so that IN THOSE PLACES WHERE HE WANTED he could
write code without having to worry about indents and newlines.

A few other solutions have been proposed that seem to be designed to
solve a weaker problem. They have in common that they still require
newlines and proper indentation -- all they do is add redundancy so an
editor or other tool can fix the indentation, at least if their
options are used consistently.

- (George Reynolds, Marc Wachowitz) You can replace the colon by 'do'
and then the following suite must end with 'end' (or 'end <keyword>').

- (Marc Wachowitz, 2nd try) A block may start with 'begin' and end with
'end' (indented at the same level as the statements in the block).

- (John Redford) Place a comment of the form "# END" at the end of a
suite, indented at the same level as the suite. It requires no
changes to the Python interpreter.

- (myself) Add an optional "end if" clause to the end of an
if..elif..elif..else... group, and "end try" to a
try...except... group, etc. For additional security, "end class" and
"end def" can carry the name of the class/function as well.

One person who (I hope) would be happy with this sort of solution is
Ray Johnson, who complained that he wants to give end users simple
editing tools (like Motif text input boxes) to write Python code. I
remember several others also making a similar point. All they need to
do is (a) tell their users to consistently use the "verbose end"
style, and (b) run all code (after editing, but before handing it to
Python) through a reformatting filter. (Note that I have already
written such a filter for my own proposal.)
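
A reformatting filter of the kind described above might look roughly
like this. It is a minimal sketch, not the filter mentioned in the
message. It assumes current Python 3, writes the closers as comments
('# end if', '# end def') so that an unmodified interpreter accepts the
output, and assumes that every compound statement header ends its line
with a colon and that every group is closed. One-line forms such as
"if x: y", multi-line strings and comments ending in a colon are not
handled. The name reindent is hypothetical.

```python
MIDDLE_KEYWORDS = ('elif', 'else', 'except', 'finally')


def reindent(source, indent='    '):
    """Rebuild indentation from block openers and '# end ...' closers."""
    depth = 0
    result = []
    for raw in source.splitlines():
        line = raw.strip()  # discard whatever indentation the user typed
        if not line:
            result.append('')
            continue
        first_word = line.split()[0].rstrip(':')
        if line.startswith('# end'):
            # A closer ends the whole group and lines up with its opener.
            depth = max(depth - 1, 0)
            result.append(indent * depth + line)
        elif first_word in MIDDLE_KEYWORDS:
            # Middle clauses line up with the opener; the body depth stays.
            result.append(indent * max(depth - 1, 0) + line)
        else:
            result.append(indent * depth + line)
            if line.endswith(':'):
                depth += 1  # the following lines form the new suite
    return '\n'.join(result) + '\n'


messy = '\n'.join([
    'def sign(n):',
    '      if n < 0:',
    'return -1',
    '  elif n == 0:',
    'return 0',
    'else:',
    '         return 1',
    '# end if',
    '# end def',
])
tidy = reindent(messy)
```

Here tidy holds the same statements with four-space indentation
reconstructed from the closers alone, so the text typed into a simple
input box does not need to be indented correctly.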

What Next?
----------

Before crafting a new proposal we need to figure out what problem we
need to solve to make the largest number of people happy, within the
constraints of backward compatibility.

I THINK that the most urgent, and possibly only, problem that really
needs solving is to have a way to tell where a block ends, so a
reformatting tool can reconstruct the indentation.

A stronger version of the problem statement would be to have an option
to write blocks without having to worry about where to put newlines
and indentation at all.

I can imagine an intermediate version where newlines would still be
significant but indentation would not be -- though it's not clear how
much this buys us. (One advantage might be that a statement following
an else clause would not be ambiguous.)

(Note that Thomas' proposal REQUIRES us to ignore newlines and
indentation, thus is even stronger.)

IF solving the weakest version is enough, I propose to go with my own
solution (optional "end <keyword>" closing statement groups). It is
less verbose than requiring an end keyword on each suite (i.e. before
'else', 'except' etc.). It also requires the smallest number of
changes to the parser, but this argument shouldn't necessarily be used
against the others unless the decision ends in a tie.

If this version of the problem statement is too weak for you, please
let us know -- and explain under which circumstances you would still
run into trouble. Remember that we could add a pragma to the start of a
file telling the parser to require block closers.

--Guido van Rossum, CWI, Amsterdam <Guido.van.Rossum@cwi.nl>
URL: <http://www.cwi.nl/cwi/people/Guido.van.Rossum.html>
