Re: Multi-line string extension

Guido van Rossum (Guido.van.Rossum@cwi.nl)
Sun, 17 Apr 1994 20:33:54 GMT

Ease of implementation wins (for now): I've implemented string literal
concatenation with a one-character change to Grammar/Grammar plus
about 10 lines in Python/compile.c. It will become part of the imminent
1.0.2 release (as will default parameter values).
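
For readers unfamiliar with the feature, here is a minimal sketch of what
string literal concatenation looks like in practice. It is written for a
present-day Python interpreter (an assumption; the message itself concerns
release 1.0.2), and the variable names are purely illustrative.

```python
# Adjacent string literals are joined by the compiler into one string;
# no "+" operator is evaluated at run time.
greeting = "Hello, " "world"

# Inside parentheses the literals may sit on separate lines.
message = ("This is a message\n"
           "containing a newline")

print(greeting)
print(message)
```

The second assignment shows why the feature helps with long strings: each
source line holds its own complete, properly quoted literal.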

All other suggestions would have required much more work, even Tim's
suggestion of simply allowing <newline> as a legal character in string
literals (because the scanner has a rather unfortunate fixation on
tokenizing a line at a time).

A compile-time optimizer that folds constant expressions in general
would be a valuable addition, but it would probably require a major
rewrite of the parse tree allocation code (about the oldest code in
the Python system).
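
To illustrate what "folding constant expressions" means, here is a rough
sketch that folds the sum of two string constants into a single constant.
It uses the ast module of a present-day Python rather than the parse tree
code discussed above, and the class name FoldStringAdd is hypothetical,
chosen only for this example.

```python
import ast


class FoldStringAdd(ast.NodeTransformer):
    """Fold `"a" + "b"` into the single constant `"ab"`."""

    def visit_BinOp(self, node):
        # Fold the operands first so nested sums collapse bottom-up.
        self.generic_visit(node)
        if (isinstance(node.op, ast.Add)
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, str)
                and isinstance(node.right.value, str)):
            folded = ast.Constant(node.left.value + node.right.value)
            return ast.copy_location(folded, node)
        return node


tree = ast.parse('"spam, " + "eggs" + "!"', mode="eval")
tree = ast.fix_missing_locations(FoldStringAdd().visit(tree))
print(ast.dump(tree.body))
```

The sketch handles only string addition. A general folder would have to
cover every operator and operand type.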

Unfortunately, the quest for more convenient multi-line string
literals is not over yet. If I have followed the discussion so far
correctly (and it's not the first time this has been discussed!), the
candidates are:

1. allow <backslash><newline> in string literals as in C

2. allow unadorned <newline> in string literals as Perl and sh do

3. add a new type of string quote that allows strings to span lines

4. add one or more variants of Perl-style "here" documents (some with
variable substitution?)

To me, 1 seems the least controversial. The problem with 2 is that
users, beginners especially, can be confused by the diagnostics for a
missing quote. Number 3 will require us to invent new quotes and will
still have the missing quote problem. And should the open and close
quotes be different or not? Number 4 is actually not too hard to
implement (by bypassing the tokenizer entirely) but << is already
taken (and the games that Perl plays are off-limits for Python :) so
it would also require invention. In any case, 3 and 4 do more to make
the language "bigger" than 1 or 2.

Since Python inherits most of its lexical conventions from C anyway, I
would be most happy with choice number 1. This implies that a
multi-line string with embedded newlines should be written like this:

print "This is a message\n\
containing a newline"

instead of

print "This is a message
containing a newline"

If you want the indentation to line up, you will be able to write

print "This is a message\n" \
      "containing a newline"

Personally, I will continue to write

print "This is a message"
print "containing a newline"

which most pleases my own sense of esthetics :-)
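
As a rough check of the two multi-line spellings discussed above, the
following sketch builds the same message both ways and compares the
results. It assumes a present-day Python interpreter, where both
backslash-newline inside a literal and literal concatenation are accepted.
The names inside and joined are illustrative only.

```python
# Choice 1: backslash-newline inside the literal. The backslash and the
# line break are dropped, so the second line must start in column 0 or
# its leading spaces become part of the string.
inside = "This is a message\n\
containing a newline"

# Literal concatenation: the backslash sits outside the quotes and joins
# the two source lines, so the second literal can be indented freely.
joined = "This is a message\n" \
         "containing a newline"

print(inside == joined)
```

The difference is only in the source layout: in the first form the
continuation line is part of the literal, in the second it is a separate
literal that the compiler joins to the first.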

--Guido van Rossum, CWI, Amsterdam <Guido.van.Rossum@cwi.nl>
URL: <http://www.cwi.nl/cwi/people/Guido.van.Rossum.html>