Re: Multi-line string extension

Guido.van.Rossum@cwi.nl
Mon, 18 Apr 1994 10:22:30 +0200

> Sorry, but nobody's gonna convince me I care how long it takes to
> catenate string literals. Don correctly (IMO) identified that as a
> side "bonus" at the start (or as a cup of soup <wink>).

Same here.

> Hey, if pointing out problems is rejecting, I'd never get out of bed in
> the morning <wink>.

Someone should write a Python program to scan comp.lang.python for
Petersisms like these :)

> As I understand [Jim's] proposal, it only goes so far as changing that to
>
> err('Each non-empty non-comment line in a substitution file must\n'
> 'contain exactly two words: an identifier and its replacement.\n'
> 'Comments start with a # character and end at end of line.\n'
> 'If an identifier is preceded with a *, it is not substituted\n'
> 'inside a comment even when -c is specified.\n' )
>
> And that leaves it just as irritating & error-prone to create, and
> especially to _modify_, as before. Multi-line informative msgs
> frequently need to be reformatted as programs change (to add, remove,
> delete, or rephrase information), and all the quotes, and backslashes (if
> it weren't in an unclosed paren structure like the above is), and escaped
> newlines make that a pain in the butt even for a measly 5-line example.
> Even tools _designed_ for reformatting (like the Emacs fill-region) can't
> cope with all the syntactic noise, in any of the 3 variations above.
>
> I would like something better than that, & I believe this kind of thing
> was the actual thrust of Don's original proposal too. Ugly as it is, the
> following would be a major improvement (as would be Don's suggestion):
>
> err(
> "Each non-empty non-comment line in a substitution file must
> contain exactly two words: an identifier and its replacement.
> Comments start with a # character and end at end of line.
> If an identifier is preceded with a *, it is not substituted
> inside a comment even when -c is specified.
> " )

Now you've done it. You've convinced me that we need a way to write
long strings without any embedded syntax for newlines. I think the
reference to Emacs fill-region (which I use all the time) did it.

> Entering & reformatting text blocks written in this style is pleasant;
> _reading_ them is pleasant in the interior, but icky at the boundaries.

And so is the Perl-style "here" document. The advantage of that is
that it's less likely that a missing end quote confuses you. One
(ugly?) alternative that comes to my own mind is triple quotes, e.g.:

err(
"""Each non-empty non-comment line in a substitution file must
contain exactly two words: an identifier and its replacement.
Comments start with a # character and end at end of line.
If an identifier is preceded with a *, it is not substituted
inside a comment even when -c is specified.
""")
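
As a rough check of the comparison above, here is a minimal sketch that assumes present-day Python 3, where triple-quoted strings exist. The variable names are made up for illustration. It builds the same message both ways and shows that the two spellings give an identical string:

```python
# Form 1: adjacent string literals, each ending in an explicit \n escape.
concatenated = (
    'Each non-empty non-comment line in a substitution file must\n'
    'contain exactly two words: an identifier and its replacement.\n'
    'Comments start with a # character and end at end of line.\n'
    'If an identifier is preceded with a *, it is not substituted\n'
    'inside a comment even when -c is specified.\n'
)

# Form 2: one triple-quoted literal; the line breaks are literal newlines.
triple_quoted = """Each non-empty non-comment line in a substitution file must
contain exactly two words: an identifier and its replacement.
Comments start with a # character and end at end of line.
If an identifier is preceded with a *, it is not substituted
inside a comment even when -c is specified.
"""

# Both spellings denote the same string value.
same = concatenated == triple_quoted
print(same)
```

Only the second form can be refilled by an editor without touching quotes or escapes, which is the point being argued.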

> [...] over the long run, you'll be happier if you leave initial
> codegen as stupid as possible [...]

I'm happy with that, since I can leave writing the optimizer to
someone else :-)

> BTW, constant-folders in general don't buy much, unless [...]

Thanks for the warning again.

My own defense against premature optimization is simply that I don't
care if something runs 30% slower, but there is a lot of pressure in
this group from people who disagree!

> print <<PLEA unless $match;
> for '$name', please eyeball the original defn following, to
> make sure it's compatible with its deduced size $size:
> $origdef
> PLEA

Actually, I don't see why this is so much better than

if not match():
    print "for '%s', please eyeball the original defn following, to" % name
    print " make sure it's compatible with its deduced size %d:" % size
    print origdef

(and the same for your next example).

> If you agree that prototyping is a strong natural use for Python, I'd
> like to suggest that (a) prototyping often involves producing structured
> output, the content and structure of which often changes rapidly and/or
> massively as the prototype evolves, and (b) pasting together structured
> output via catenating strings mixed with backticking variables, hand-
> counting embedded blanks to get things to "line up", & explicitly
> inserting escaped newlines, is doing it at a level no higher than C's.
>
> In most (all?) other respects, Python is a wonderful language for
> prototyping already. I do think it falls short in this specific area,
> though, and don't think it _wants_ to.

I agree that output formatting is not one of Python's strongest
points. Personally, I haven't missed it very much -- most of the code
I write has either graphical (or audio!) output or small bits of
unstructured output characteristic of debugging code... But I have a
feeling that if I ignore you I will get email about this until the end
of time (which is defined as the day the last Python user dies:).

My problem with Perl-style 'here' documents remains that it is a very
un-Python-like piece of syntax (how's that for an emotional argument:)
and that I can't think of an "intuitive" operator, since << is already
taken for left shift.

So what do you think of the triple quote convention? To be specific,
I'm thinking of the following rules. Either single or double quotes
can be tripled to start a different kind of quoted string. Inside
such strings, backslash escapes still work, and <backslash><newline>
is ignored, but unescaped <newline> is kept in the string (rather than
being an error). Sequences of 1 or 2 quotes do not terminate the
string. A sequence of three quotes can be included in the string by
quoting at least one of them with a backslash. There is no variable substitution
but of course you can use a triple-quoted string as format string or
concatenate it with a back-ticked expression.
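
The rules above can be sketched one by one. This is only an illustration, assuming present-day Python 3, whose triple-quoted strings behave this way. All variable names and sample text are hypothetical:

```python
# Unescaped newlines are kept in the string.
kept = """line one
line two"""

# <backslash><newline> is ignored, so the two source lines join.
joined = """line one \
line two"""

# Sequences of one or two quotes do not terminate the string.
quotes = """a "quoted" word and an empty "" pair"""

# Three quotes in a row are included by escaping at least one of them.
triple = """three quotes: \""" inside"""

# No variable substitution, but the string works as a format string.
name = "spam"   # hypothetical values
size = 3
message = """for '%s', check the original definition
against its deduced size %d""" % (name, size)

print(repr(kept))
print(repr(joined))
print(quotes)
print(triple)
print(message)
```

Single quotes can be tripled the same way (`'''...'''`), which is handy when the text itself contains many double quotes.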

"""--Guido van Rossum, CWI, Amsterdam <Guido.van.Rossum@cwi.nl>
URL: <http://www.cwi.nl/cwi/people/Guido.van.Rossum.html>"""