PEP 211 -- Adding A New Outer Product Operator

PEP:	211
Title:	Adding A New Outer Product Operator
Version:	270b464879a6
Last-Modified:	2016-03-31 17:45:40 +0300 (Thu, 31 Mar 2016)
Author:	Greg Wilson <gvwilson at ddj.com>
Status:	Deferred
Type:	Standards Track
Created:	15-Jul-2000
Python-Version:	2.1
Post-History:

Introduction

    This PEP describes a proposal to define "@" (pronounced "across")
    as a new outer product operator in Python 2.2.  When applied to
    sequences (or other iterable objects), this operator will combine
    their iterators, so that:

        for (i, j) in S @ T:
            pass

    will be equivalent to:

        for i in S:
            for j in T:
                pass

    Classes will be able to overload this operator using the special
    methods "__across__", "__racross__", and "__iacross__".  In
    particular, the new Numeric module (PEP 209) will overload this
    operator for multi-dimensional arrays to implement matrix
    multiplication.
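
    The overloading rule above can be sketched as a plain function,
    assuming the proposed methods would be looked up the way Python
    looks up other binary operator methods.  The helper name
    "dispatch_across" and the class "Pairs" are hypothetical, and the
    lookup is simplified (no special handling of subclasses); no
    Python release implements "__across__" or "__racross__".

```python
# Hypothetical sketch: "__across__" and "__racross__" are the names proposed
# in this PEP. The dispatch order below is an assumption modelled on how
# existing binary operators fall back to their reflected methods.

def dispatch_across(left, right):
    """Emulate "left @ right" under the proposed protocol."""
    method = getattr(type(left), "__across__", None)
    if method is not None:
        result = method(left, right)
        if result is not NotImplemented:
            return result
    # Fall back to the reflected method on the right-hand operand.
    reflected = getattr(type(right), "__racross__", None)
    if reflected is not None:
        result = reflected(right, left)
        if result is not NotImplemented:
            return result
    raise TypeError("unsupported operand types for across: %r and %r"
                    % (type(left).__name__, type(right).__name__))


class Pairs(object):
    """Illustrative container that overloads the proposed operator."""

    def __init__(self, items):
        self.items = list(items)

    def __across__(self, other):
        if not isinstance(other, Pairs):
            return NotImplemented
        return [(i, j) for i in self.items for j in other.items]

    def __racross__(self, other):
        # Reached when the left operand does not handle the operation.
        return [(i, j) for i in other for j in self.items]
```

    With these definitions, dispatch_across([1, 2], Pairs([3])) falls
    through to Pairs.__racross__, because lists define no
    "__across__" method.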

Background

    Number-crunching is now just a small part of computing, but many
    programmers --- including many Python users --- still need to
    express complex mathematical operations in code.  Most numerical
    languages, such as APL, Fortran-90, MATLAB, IDL, and Mathematica,
    therefore provide two forms of the common arithmetic operators.
    One form works element-by-element, e.g. multiplies corresponding
    elements of its matrix arguments.  The other implements the
    "mathematical" definition of that operation, e.g. performs
    row-column matrix multiplication.
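
    To make the distinction concrete, here is a minimal sketch of the
    two forms of multiplication using nested lists as matrices.  The
    function names are illustrative only and are not part of this
    proposal or of any numerical library.

```python
def elementwise_multiply(a, b):
    """Multiply corresponding elements of two equally shaped matrices."""
    return [[x * y for x, y in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def matrix_multiply(a, b):
    """Row-column product: entry (i, j) is the dot product of row i of a
    and column j of b."""
    columns_of_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns_of_b]
            for row in a]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

elementwise = elementwise_multiply(A, B)   # [[5, 12], [21, 32]]
row_column = matrix_multiply(A, B)         # [[19, 22], [43, 50]]
```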

    Zhu and Lielens have proposed doubling up Python's operators in
    this way [1].  Their proposal would create six new binary infix
    operators, and six new in-place operators.

    The original version of this proposal was much more conservative.
    The author consulted the developers of GNU Octave [2], an open
    source clone of MATLAB.  Its developers agreed that providing an
    infix operator for matrix multiplication was important: numerical
    programmers really do care whether they have to write "mmul(A,B)"
    instead of "A op B".

    On the other hand, when asked how important it was to have infix
    operators for matrix solution and other operations, Prof. James
    Rawlings replied [3]:

        I DON'T think it's a must have, and I do a lot of matrix
        inversion. I cannot remember if its A\b or b\A so I always
        write inv(A)*b instead. I recommend dropping \.

    Based on this discussion, and feedback from students at the US
    national laboratories and elsewhere, we recommended adding only
    one new operator, for matrix multiplication, to Python.

Iterators

    The planned addition of iterators to Python 2.2 opens up a broader
    scope for this proposal.  As part of the discussion of PEP 201,
    Lockstep Iteration [4], the author of this proposal conducted an
    informal usability experiment [5].  The results showed that users
    are psychologically receptive to "cross-product" loop syntax.  For
    example, most users expected:

        S = [10, 20, 30]
        T = [1, 2, 3]
        for x in S; y in T:
            print x+y,

    to print "11 12 13 21 22 23 31 32 33".  We believe that users will
    have the same reaction to:

        for (x, y) in S @ T:
            print x+y

    i.e. that they will naturally interpret this as a tidy way to
    write loop nests.

    This is where iterators come in.  Actually constructing the
    cross-product of two (or more) sequences before executing the loop
    would be very expensive.  On the other hand, "@" could be defined
    to get its arguments' iterators, and then create an outer iterator
    which returns tuples of the values returned by the inner
    iterators.
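
    The outer iterator described above might look roughly like the
    following generator.  This is a sketch written as a named
    function rather than an operator; the name "across" is borrowed
    from the proposed pronunciation and is not an existing built-in.
    It assumes the second argument can be iterated more than once.

```python
def across(outer, inner):
    """Lazily yield (i, j) for every i in outer and every j in inner."""
    for i in outer:
        # Each pass asks "inner" for a fresh iterator, so "inner" must be
        # restartable (a list, tuple, or string, for example).
        for j in inner:
            yield (i, j)


S = [10, 20, 30]
T = [1, 2, 3]

# No list of pairs is built up front; each tuple is produced on demand.
sums = [x + y for (x, y) in across(S, T)]
# sums == [11, 12, 13, 21, 22, 23, 31, 32, 33]
```

    The standard library's itertools.product yields the same tuples,
    but it consumes and stores all of its inputs before producing the
    first result.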

Discussion

    1. Adding a named function "across" would have less impact on
       Python than a new infix operator.  However, this would not make
       Python more appealing to numerical programmers, who really do
       care whether they can write matrix multiplication using an
       operator, or whether they have to write it as a function call.

    2. "@" would have to be chainable in the same way as comparison
       operators, i.e.:

        (1, 2) @ (3, 4) @ (5, 6)

       would have to return (1, 3, 5) ... (2, 4, 6), and *not*
       ((1, 3), 5) ... ((2, 4), 6).  This should not require special
       support from the parser, as the outer iterator created by the
       first "@" could easily be taught how to combine itself with
       ordinary iterators.

    3. There would have to be some way to distinguish restartable
       iterators from ones that couldn't be restarted.  For example,
       if S is an input stream (e.g. a file), and L is a list, then "S
       @ L" is straightforward, but "L @ S" is not, since iteration
       through the stream cannot be repeated.  This could be treated
       as an error, or handled by having the outer iterator detect
       non-restartable inner iterators and cache their values.

    4. Whiteboard testing of this proposal in front of three novice
       Python users (all of them experienced programmers) indicates
       that users will expect:

        "ab" @ "cd"

       to return four two-character strings, not four tuples each
       holding a pair of characters.  Opinion was divided on what:

        ("a", "b") @ "cd"

       ought to return...
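
    Points 2 and 3 can be illustrated together.  The sketch below
    assumes a hypothetical wrapper class "Across" and uses the
    "__matmul__" method that later Python versions (3.5 and up)
    attach to "@" as a stand-in for the proposed "__across__".  It
    takes the caching option from point 3 rather than raising an
    error; that choice is an assumption, not a decision made by this
    PEP.

```python
class Across(object):
    """Outer-product iterable; chaining with "@" extends the tuples."""

    def __init__(self, *operands):
        self.operands = operands

    def __matmul__(self, other):
        # Flatten: Across(S, T) @ U becomes Across(S, T, U), so results are
        # (s, t, u) rather than ((s, t), u).
        return Across(*(self.operands + (other,)))

    def __iter__(self):
        return self._product(self.operands)

    @staticmethod
    def _restartable(operand):
        # An iterator returns itself from iter() and cannot be restarted,
        # so its values are cached in a list before looping.
        if iter(operand) is operand:
            return list(operand)
        return operand

    def _product(self, operands):
        if not operands:
            yield ()
            return
        first = operands[0]
        # Only the inner operands are traversed repeatedly.
        rest = [self._restartable(op) for op in operands[1:]]
        for value in first:
            for tail in self._product(rest):
                yield (value,) + tail


chained = list(Across((1, 2)) @ (3, 4) @ (5, 6))
# chained[0] == (1, 3, 5) and chained[-1] == (2, 4, 6)

stream = iter("ab")                      # stands in for a file-like stream
with_stream = list(Across([1, 2]) @ stream)
```

    Because the cache is filled while iterating, looping over the
    same Across object a second time would find a non-restartable
    operand already exhausted; a fuller design would have to decide
    where the cached values live.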

Alternatives

    1. Do nothing --- keep Python simple.

       This is always the default choice.

    2. Add a named function instead of an operator.

       Python is not primarily a numerical language; it may not be
       worth complexifying it for this special case.  However, support
       for real matrix multiplication *is* frequently requested, and
       the proposed semantics for "@" for built-in sequence types
       would simplify expression of a very common idiom (nested
       loops).

    3. Introduce prefixed forms of all existing operators, such as
       "~*" and "~+", as proposed in PEP 225 [1].

       Our objections to this are that there isn't enough demand to
       justify the additional complexity (see Rawlings' comments [3]),
       and that the proposed syntax fails the "low toner" readability
       test.

Acknowledgments

    I am grateful to Huaiyu Zhu for initiating this discussion, and to
    James Rawlings and students in various Python courses for their
    discussions of what numerical programmers really care about.

References

    [1] PEP 225, Elementwise/Objectwise Operators, Zhu, Lielens
        http://www.python.org/dev/peps/pep-0225/

    [2] http://bevo.che.wisc.edu/octave/

    [3] http://www.egroups.com/message/python-numeric/4

    [4] PEP 201, Lockstep Iteration, Warsaw
        http://www.python.org/dev/peps/pep-0201/

    [5] http://mail.python.org/pipermail/python-dev/2000-July/006427.html
