PEP: | 3113 |
---|---|
Title: | Removal of Tuple Parameter Unpacking |
Version: | 4682ccc5e7f4 |
Last-Modified: | 2008-02-24 04:25:44 +0000 (Sun, 24 Feb 2008) |
Author: | Brett Cannon <brett at python.org> |
Status: | Final |
Type: | Standards Track |
Content-Type: | text/x-rst |
Created: | 02-Mar-2007 |
Python-Version: | 3.0 |
Post-History: |
Contents
Abstract
Tuple parameter unpacking is the use of a tuple as a parameter in a function signature so as to have a sequence argument automatically unpacked. An example is:
def fxn(a, (b, c), d): pass
The use of (b, c) in the signature requires that the second argument to the function be a sequence of length two (e.g., [42, -13]). When such a sequence is passed it is unpacked and has its values assigned to the parameters, just as if the statement b, c = [42, -13] had been executed in the parameter.
Unfortunately this feature of Python's rich function signature abilities, while handy in some situations, causes more issues than they are worth. Thus this PEP proposes their removal from the language in Python 3.0.
Why They Should Go
Introspection Issues
Python has very powerful introspection capabilities. These extend to function signatures. There are no hidden details as to what a function's call signature is. In general it is fairly easy to figure out various details about a function's signature by viewing the function object and various attributes on it (including the function's func_code attribute).
But there is great difficulty when it comes to tuple parameters. The existence of a tuple parameter is denoted by its name being made of a . and a number in the co_varnames attribute of the function's code object. This allows the tuple argument to be bound to a name that only the bytecode is aware of and cannot be typed in Python source. But this does not specify the format of the tuple: its length, whether there are nested tuples, etc.
In order to get all of the details about the tuple from the function one must analyse the bytecode of the function. This is because the first bytecode in the function literally translates into the tuple argument being unpacked. Assuming the tuple parameter is named .1 and is expected to unpack to variables spam and monty (meaning it is the tuple (spam, monty)), the first bytecode in the function will be for the statement spam, monty = .1. This means that to know all of the details of the tuple parameter one must look at the initial bytecode of the function to detect tuple unpacking for parameters formatted as \.\d+ and deduce any and all information about the expected argument. Bytecode analysis is how the inspect.getargspec function is able to provide information on tuple parameters. This is not easy to do and is burdensome on introspection tools as they must know how Python bytecode works (an otherwise unneeded burden as all other types of parameters do not require knowledge of Python bytecode).
The difficulty of analysing bytecode not withstanding, there is another issue with the dependency on using Python bytecode. IronPython [3] does not use Python's bytecode. Because it is based on the .NET framework it instead stores MSIL [4] in func_code.co_code attribute of the function. This fact prevents the inspect.getargspec function from working when run under IronPython. It is unknown whether other Python implementations are affected but is reasonable to assume if the implementation is not just a re-implementation of the Python virtual machine.
No Loss of Abilities If Removed
As mentioned in Introspection Issues, to handle tuple parameters the function's bytecode starts with the bytecode required to unpack the argument into the proper parameter names. This means that there is no special support required to implement tuple parameters and thus there is no loss of abilities if they were to be removed, only a possible convenience (which is addressed in Why They Should (Supposedly) Stay).
The example function at the beginning of this PEP could easily be rewritten as:
def fxn(a, b_c, d): b, c = b_c pass
and in no way lose functionality.
Exception To The Rule
When looking at the various types of parameters that a Python function can have, one will notice that tuple parameters tend to be an exception rather than the rule.
Consider PEP 3102 (keyword-only arguments) and PEP 3107 (function annotations) [5] [6]. Both PEPs have been accepted and introduce new functionality within a function's signature. And yet for both PEPs the new feature cannot be applied to tuple parameters as a whole. PEP 3102 has no support for tuple parameters at all (which makes sense as there is no way to reference a tuple parameter by name). PEP 3107 allows annotations for each item within the tuple (e.g., (x:int, y:int)), but not the whole tuple (e.g., (x, y):int).
The existence of tuple parameters also places sequence objects separately from mapping objects in a function signature. There is no way to pass in a mapping object (e.g., a dict) as a parameter and have it unpack in the same fashion as a sequence does into a tuple parameter.
Uninformative Error Messages
Consider the following function:
def fxn((a, b), (c, d)): pass
If called as fxn(1, (2, 3)) one is given the error message TypeError: unpack non-sequence. This error message in no way tells you which tuple was not unpacked properly. There is also no indication that this was a result that occurred because of the arguments. Other error messages regarding arguments to functions explicitly state its relation to the signature: TypeError: fxn() takes exactly 2 arguments (0 given), etc.
Little Usage
While an informal poll of the handful of Python programmers I know personally and from the PyCon 2007 sprint indicates a huge majority of people do not know of this feature and the rest just do not use it, some hard numbers is needed to back up the claim that the feature is not heavily used.
Iterating over every line in Python's code repository in the Lib/ directory using the regular expression ^\s*def\s*\w+\s*\( to detect function and method definitions there were 22,252 matches in the trunk.
Tacking on .*,\s*\( to find def statements that contained a tuple parameter, only 41 matches were found. This means that for def statements, only 0.18% of them seem to use a tuple parameter.
Why They Should (Supposedly) Stay
Practical Use
In certain instances tuple parameters can be useful. A common example is code that expects a two-item tuple that represents a Cartesian point. While true it is nice to be able to have the unpacking of the x and y coordinates for you, the argument is that this small amount of practical usefulness is heavily outweighed by other issues pertaining to tuple parameters. And as shown in No Loss Of Abilities If Removed, their use is purely practical and in no way provide a unique ability that cannot be handled in other ways very easily.
Self-Documentation For Parameters
It has been argued that tuple parameters provide a way of self-documentation for parameters that are expected to be of a certain sequence format. Using our Cartesian point example from Practical Use, seeing (x, y) as a parameter in a function makes it obvious that a tuple of length two is expected as an argument for that parameter.
But Python provides several other ways to document what parameters are for. Documentation strings are meant to provide enough information needed to explain what arguments are expected. Tuple parameters might tell you the expected length of a sequence argument, it does not tell you what that data will be used for. One must also read the docstring to know what other arguments are expected if not all parameters are tuple parameters.
Function annotations (which do not work with tuple parameters) can also supply documentation. Because annotations can be of any form, what was once a tuple parameter can be a single argument parameter with an annotation of tuple, tuple(2), Cartesian point, (x, y), etc. Annotations provide great flexibility for documenting what an argument is expected to be for a parameter, including being a sequence of a certain length.
Transition Plan
To transition Python 2.x code to 3.x where tuple parameters are removed, two steps are suggested. First, the proper warning is to be emitted when Python's compiler comes across a tuple parameter in Python 2.6. This will be treated like any other syntactic change that is to occur in Python 3.0 compared to Python 2.6.
Second, the 2to3 refactoring tool [1] will gain a fixer [2] for translating tuple parameters to being a single parameter that is unpacked as the first statement in the function. The name of the new parameter will be changed. The new parameter will then be unpacked into the names originally used in the tuple parameter. This means that the following function:
def fxn((a, (b, c))): pass
will be translated into:
def fxn(a_b_c): (a, (b, c)) = a_b_c pass
As tuple parameters are used by lambdas because of the single expression limitation, they must also be supported. This is done by having the expected sequence argument bound to a single parameter and then indexing on that parameter:
lambda (x, y): x + y
will be translated into:
lambda x_y: x_y[0] + x_y[1]
References
[1] | 2to3 refactoring tool (http://svn.python.org/view/sandbox/trunk/2to3/) |
[2] | 2to3 fixer (http://svn.python.org/view/sandbox/trunk/2to3/fixes/fix_tuple_params.py) |
[3] | IronPython (http://www.codeplex.com/Wiki/View.aspx?ProjectName=IronPython) |
[4] | Microsoft Intermediate Language (http://msdn.microsoft.com/library/en-us/cpguide/html/cpconmicrosoftintermediatelanguagemsil.asp?frame=true) |
[5] | PEP 3102 (Keyword-Only Arguments) (http://www.python.org/dev/peps/pep-3102/) |
[6] | PEP 3107 (Function Annotations) (http://www.python.org/dev/peps/pep-3107/) |
Copyright
This document has been placed in the public domain.