PEP 308 -- Conditional Expressions

PEP:	308
Title:	Conditional Expressions
Version:	ee4f22d14190
Last-Modified:	2007-06-28 19:44:20 +0000 (Thu, 28 Jun 2007)
Author:	Guido van Rossum, Raymond Hettinger
Status:	Final
Type:	Standards Track
Content-Type:	text/plain
Created:	7-Feb-2003
Post-History:	7-Feb-2003, 11-Feb-2003

Adding a conditional expression

    On 9/29/2005, Guido decided to add conditional expressions in the
    form of "X if C else Y". [1]

    The motivating use case was the prevalance of error-prone attempts
    to achieve the same effect using "and" and "or". [2]
    
    Previous community efforts to add a conditional expression were
    stymied by a lack of consensus on the best syntax.  That issue was
    resolved by simply deferring to a BDFL best judgment call.

    The decision was validated by reviewing how the syntax fared when
    applied throughout the standard library (this review approximates a
    sampling of real-world use cases, across a variety of applications,
    written by a number of programmers with diverse backgrounds). [3]

    The following change will be made to the grammar.  (The or_test
    symbols is new, the others are modified.)

        test: or_test ['if' or_test 'else' test] | lambdef
        or_test: and_test ('or' and_test)*
        ...
        testlist_safe: or_test [(',' or_test)+ [',']]
        ...
        gen_for: 'for' exprlist 'in' or_test [gen_iter]

    The new syntax nearly introduced a minor syntactical backwards
    incompatibility.  In previous Python versions, the following is
    legal:

        [f for f in lambda x: x, lambda x: x**2 if f(1) == 1]

    (I.e. a list comprehension where the sequence following 'in' is an
    unparenthesized series of lambdas -- or just one lambda, even.)

    In Python 3.0, the series of lambdas will have to be
    parenthesized, e.g.:

        [f for f in (lambda x: x, lambda x: x**2) if f(1) == 1]

    This is because lambda binds less tight than the if-else
    expression, but in this context, the lambda could already be
    followed by an 'if' keyword that binds less tightly still (for
    details, consider the grammar changes shown above).

    However, in Python 2.5, a slightly different grammar is used that
    is more backwards compatible, but constrains the grammar of a
    lambda used in this position by forbidding the lambda's body to
    contain an unparenthesized condition expression.  Examples:

        [f for f in (1, lambda x: x if x >= 0 else -1)]    # OK
        [f for f in 1, (lambda x: x if x >= 0 else -1)]    # OK
        [f for f in 1, lambda x: (x if x >= 0 else -1)]    # OK
        [f for f in 1, lambda x: x if x >= 0 else -1]      # INVALID

References

    [1] Pronouncement
        http://mail.python.org/pipermail/python-dev/2005-September/056846.html

    [2] Motivating use case:
        http://mail.python.org/pipermail/python-dev/2005-September/056546.html
        http://mail.python.org/pipermail/python-dev/2005-September/056510.html        

    [3] Review in the context of real-world code fragments:
        http://mail.python.org/pipermail/python-dev/2005-September/056803.html

Introduction to earlier draft of the PEP (kept for historical purposes)

    Requests for an if-then-else ("ternary") expression keep coming up
    on comp.lang.python.  This PEP contains a concrete proposal of a
    fairly Pythonic syntax.  This is the community's one chance: if
    this PEP is approved with a clear majority, it will be implemented
    in Python 2.4.  If not, the PEP will be augmented with a summary
    of the reasons for rejection and the subject better not come up
    again.  While the BDFL is co-author of this PEP, he is neither in
    favor nor against this proposal; it is up to the community to
    decide.  If the community can't decide, the BDFL will reject the
    PEP.

    After unprecedented community response (very good arguments were
    made both pro and con) this PEP has been revised with the help of
    Raymond Hettinger.  Without going through a complete revision
    history, the main changes are a different proposed syntax, an
    overview of proposed alternatives, the state of the curent
    discussion, and a discussion of short-circuit behavior.

    Following the discussion, a vote was held.  While there was an overall
    interest in having some form of if-then-else expressions, no one
    format was able to draw majority support.  Accordingly, the PEP was
    rejected due to the lack of an overwhelming majority for change.
    Also, a Python design principle has been to prefer the status quo
    whenever there are doubts about which path to take.

Proposal

    The proposed syntax is as follows:

	(if <condition>: <expression1> else: <expression2>) 

    Note that the enclosing parentheses are not optional.
    
    The resulting expression is evaluated like this:

    - First, <condition> is evaluated.

    - If <condition> is true, <expression1> is evaluated and is the
      result of the whole thing.

    - If <condition> is false, <expression2> is evaluated and is the
      result of the whole thing.

    A natural extension of this syntax is to allow one or more 'elif'
    parts:

      (if <cond1>: <expr1> elif <cond2>: <expr2> ... else: <exprN>)

    This will be implemented if the proposal is accepted.

    The downsides to the proposal are:

    * the required parentheses
    * confusability with statement syntax
    * additional semantic loading of colons

    Note that at most one of <expression1> and <expression2> is
    evaluated.  This is called a "short-circuit expression"; it is
    similar to the way the second operand of 'and' / 'or' is only
    evaluated if the first operand is true / false.

    A common way to emulate an if-then-else expression is:

        <condition> and <expression1> or <expression2>

    However, this doesn't work the same way: it returns <expression2>
    when <expression1> is false!  See FAQ 4.16 for alternatives that
    work -- however, they are pretty ugly and require much more effort
    to understand.

Alternatives

    Holger Krekel proposed a new, minimally invasive variant:

        <condition> and <expression1> else <expression2>

    The concept behind it is that a nearly complete ternary operator
    already exists with and/or and this proposal is the least invasive
    change that makes it complete.  Many respondants on the
    newsgroup found this to be the most pleasing alternative.
    However, a couple of respondants were able to post examples
    that were mentally difficult to parse.  Later it was pointed
    out that this construct works by having the "else" change the
    existing meaning of "and".

    As a result, there is increasing support for Christian Tismer's
    proposed variant of the same idea:

        <condition> then <expression1> else <expression2>

    The advantages are simple visual parsing, no required parenthesis,
    no change in the semantics of existing keywords, not as likely
    as the proposal to be confused with statement syntax, and does
    not further overload the colon.  The disadvantage is the
    implementation costs of introducing a new keyword.  However,
    unlike other new keywords, the word "then" seems unlikely to
    have been used as a name in existing programs.

    ---

    Many C-derived languages use this syntax:

        <condition> ? <expression1> : <expression2>

    Eric Raymond even implemented this.  The BDFL rejected this for
    several reasons: the colon already has many uses in Python (even
    though it would actually not be ambiguous, because the question
    mark requires a matching colon); for people not used to C-derived
    language, it is hard to understand.

    ---

    The original version of this PEP proposed the following syntax:

        <expression1> if <condition> else <expression2>

    The out-of-order arrangement was found to be too uncomfortable
    for many of participants in the discussion; especially when
    <expression1> is long, it's easy to miss the conditional while
    skimming.

    ---

    Some have suggested adding a new builtin instead of extending the
    syntax of the language.  For example:

        cond(<condition>, <expression1>, <expression2>)

    This won't work the way a syntax extension will because both
    expression1 and expression2 must be evaluated before the function
    is called.  There's no way to short-circuit the expression
    evaluation.  It could work if 'cond' (or some other name) were
    made a keyword, but that has all the disadvantages of adding a new
    keyword, plus confusing syntax: it *looks* like a function call so
    a casual reader might expect both <expression1> and <expression2>
    to be evaluated.

Summary of the Current State of the Discussion

    Groups are falling into one of three camps:

    1.  Adopt a ternary operator built using punctuation characters:

            <condition> ? <expression1> : <expression2>

    2.  Adopt a ternary operator built using new or existing keywords.
        The leading examples are:

            <condition> then <expression1> else <expression2>
            (if <condition>: <expression1> else: <expression2>) 

    3.  Do nothing.

    The first two positions are relatively similar.

    Some find that any form of punctuation makes the language more
    cryptic.  Others find that punctuation style is appropriate for
    expressions rather than statements and helps avoid a COBOL style:
    3 plus 4 times 5.

    Adapting existing keywords attempts to improve on punctuation
    through explicit meaning and a more tidy appearance.  The downside
    is some loss of the economy-of-expression provided by punctuation
    operators.  The other downside is that it creates some degree of
    confusion between the two meanings and two usages of the keywords.

    Those difficulties are overcome by options which introduce new
    keywords which take more effort to implement.

    The last position is doing nothing.  Arguments in favor include
    keeping the language simple and concise; maintaining backwards
    compatibility; and that any every use case can already be already
    expressed in terms of "if" and "else".  Lambda expressions are an
    exception as they require the conditional to be factored out into
    a separate function definition.

    The arguments against doing nothing are that the other choices
    allow greater economy of expression and that current practices
    show a propensity for erroneous uses of "and", "or", or one their
    more complex, less visually unappealing workarounds.

Short-Circuit Behavior

    The principal difference between the ternary operator and the
    cond() function is that the latter provides an expression form but
    does not provide short-circuit evaluation.

    Short-circuit evaluation is desirable on three occasions:
                                                         
    1. When an expression has side-effects
    2. When one or both of the expressions are resource intensive
    3. When the condition serves as a guard for the validity of the
      expression.

    #  Example where all three reasons apply
    data = isinstance(source, file)  ?  source.readlines()
                                     :  source.split()

    1. readlines() moves the file pointer
    2. for long sources, both alternatives take time
    3. split() is only valid for strings and readlines() is only
       valid for file objects.

    Supporters of a cond() function point out that the need for
    short-circuit evaluation is rare.  Scanning through existing code
    directories, they found that if/else did not occur often; and of
    those only a few contained expressions that could be helped by
    cond() or a ternary operator; and that most of those had no need
    for short-circuit evaluation.  Hence, cond() would suffice for
    most needs and would spare efforts to alter the syntax of the
    language.

    More supporting evidence comes from scans of C code bases which
    show that its ternary operator used very rarely (as a percentage
    of lines of code).

    A counter point to that analysis is that the availability of a
    ternary operator helped the programmer in every case because it
    spared the need to search for side-effects.  Further, it would
    preclude errors arising from distant modifications which introduce
    side-effects.  The latter case has become more of a reality with
    the advent of properties where even attribute access can be given
    side-effects.

    The BDFL's position is that short-circuit behavior is essential
    for an if-then-else construct to be added to the language.

Detailed Results of Voting

    Votes rejecting all options:  82
    Votes with rank ordering:     436
                                  ---
    Total votes received:         518


            ACCEPT                  REJECT                  TOTAL
            ---------------------   ---------------------   -----
            Rank1   Rank2   Rank3   Rank1   Rank2   Rank3
    Letter  
    A       51      33      19      18      20      20      161
    B       45      46      21      9       24      23      168
    C       94      54      29      20      20      18      235
    D       71      40      31      5       28      31      206
    E       7       7       10              3       5       32
    F       14      19      10              7       17      67
    G       7       6       10      1       2       4       30
    H       20      22      17      4       10      25      98
    I       16      20      9       5       5       20      75
    J       6       17      5       1               10      39
    K       1               6               4       13      24
    L               1       2               3       3       9
    M       7       3       4       2       5       11      32
    N               2       3               4       2       11
    O       1       6       5       1       4       9       26
    P       5       3       6       1       5       7       27
    Q       18      7       15      6       5       11      62
    Z                                               1       1
            ---     ---     ---     ---     ---     ---     ----
    Total   363     286     202     73      149     230     1303
    RejectAll                       82      82      82      246
            ---     ---     ---     ---     ---     ---     ----
    Total   363     286     202     155     231     312     1549


    CHOICE KEY
    ----------
    A.  x if C else y
    B.  if C then x else y
    C.  (if C: x else: y)
    D.  C ? x : y
    E.  C ? x ! y
    F.  cond(C, x, y)
    G.  C ?? x || y
    H.  C then x else y
    I.  x when C else y
    J.  C ? x else y
    K.  C -> x else y
    L.  C -> (x, y)
    M.  [x if C else y]
    N.  ifelse C: x else y
    O.  <if C then x else y>
    P.  C and x else y
    Q.  any write-in vote


    Detail for write-in votes and their ranking:
    --------------------------------------------
    3:  Q reject y x C elsethenif
    2:  Q accept (C ? x ! y)
    3:  Q reject ...
    3:  Q accept  ? C : x : y
    3:  Q accept (x if C, y otherwise)
    3:  Q reject ...
    3:  Q reject NONE
    1:  Q accept   select : (<c1> : <val1>; [<cx> : <valx>; ]* elseval)
    2:  Q reject if C: t else: f
    3:  Q accept C selects x else y
    2:  Q accept iff(C, x, y)    # "if-function"
    1:  Q accept (y, x)[C]
    1:  Q accept          C true: x false: y
    3:  Q accept          C then: x else: y
    3:  Q reject
    3:  Q accept (if C: x elif C2: y else: z)
    3:  Q accept C -> x : y
    1:  Q accept  x (if C), y
    1:  Q accept if c: x else: y
    3:  Q accept (c).{True:1, False:2}
    2:  Q accept if c: x else: y
    3:  Q accept (c).{True:1, False:2}
    3:  Q accept if C: x else y
    1:  Q accept  (x if C else y)
    1:  Q accept ifelse(C, x, y)
    2:  Q reject x or y <- C
    1:  Q accept (C ? x : y) required parens
    1:  Q accept  iif(C, x, y)
    1:  Q accept ?(C, x, y)
    1:  Q accept switch-case
    2:  Q accept multi-line if/else
    1:  Q accept C: x else: y
    2:  Q accept (C): x else: y
    3:  Q accept if C: x else: y
    1:  Q accept     x if C, else y
    1:  Q reject choice: c1->a; c2->b; ...; z
    3:  Q accept [if C then x else y]
    3:  Q reject no other choice has x as the first element
    1:  Q accept (x,y) ? C
    3:  Q accept x if C else y (The "else y" being optional)
    1:  Q accept (C ? x , y)
    1:  Q accept  any outcome (i.e form or plain rejection) from a usability study
    1:  Q reject (x if C else y)
    1:  Q accept  (x if C else y)
    2:  Q reject   NONE
    3:  Q reject   NONE
    3:  Q accept  (C ? x else y)
    3:  Q accept  x when C else y
    2:  Q accept  (x if C else y)
    2:  Q accept cond(C1, x1, C2, x2, C3, x3,...)
    1:  Q accept  (if C1: x elif C2: y else: z)
    1:  Q reject cond(C, :x, :y)
    3:  Q accept  (C and [x] or [y])[0]
    2:  Q reject
    3:  Q reject
    3:  Q reject all else
    1:  Q reject no-change
    3:  Q reject deliberately omitted as I have no interest in any other proposal
    2:  Q reject (C then x else Y)
    1:  Q accept       if C: x else: y
    1:  Q reject (if C then x else y)
    3:  Q reject C?(x, y)

Copyright

    This document has been placed in the public domain.

Python Wiki

Python Insider Blog

Python 2 or 3?

Help Fund Python

Non-English Resources

Adding a conditional expression

References

Introduction to earlier draft of the PEP (kept for historical purposes)

Proposal

Alternatives

Summary of the Current State of the Discussion

Short-Circuit Behavior

Detailed Results of Voting

Copyright