str() and repr() of sequence-like classes, esp. 'array' [RFC]

Steven D. Majewski (sdm7g@elvis.med.virginia.edu)
Wed, 3 Nov 1993 12:28:35 -0500

I was surprised when I first tried using the array module and
typed:

>>>array( 'f' )

I got back:

array( 'f' )

Not what I was *expecting* to see.

My puzzlement faded when I next typed:

>>>array( 'f', [ 1,2,3,4,5 ] )

and got:

array('f', [1.0, 2.0, 3.0, 4.0, 5.0])

Neat!
The print representation is able to reproduce the object:

>>> ( eval(str(array('f',[1,2,3,4,5]))) == array('f',[1,2,3,4,5]) )
1

An admirable goal.
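
For anyone who wants to try that round trip in a current Python 3
interpreter, here is a minimal sketch. It assumes Python 3 syntax
(print as a function) and uses repr() rather than str(). The restricted
namespace passed to eval() is my own addition, not something the array
module requires.

```python
from array import array

# Build a small float array, as in the example above.
a = array('f', [1, 2, 3, 4, 5])

# repr() yields text that reads like a constructor call.
text = repr(a)

# Evaluate that text with only 'array' visible, then compare.
rebuilt = eval(text, {"array": array})

print(text)
print(rebuilt == a)
```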

However - in some of the example sequence-like classes I have thrown
together, I have tended towards making the new classes behave
generally like the objects they are trying to replace.
Well - except for that ") (" , "] [" business for reverse sequences.
Maybe I haven't been consistent with that rule, but I've played
around with the technique enough to see that it, too, is a useful
feature. And even there, though I was willing to accept ') 3 ,2 ,1 ('
as the reverse of '( 1,2,3 )', I was trying at the same time to
get "Hello World!" to be "!dlroW olleH". I think I was starting
to grapple with the "correct" way to distinguish the behaviour of
str() and repr(). [ And 'array' brings up some of the same questions. ]

If a mutable string is going to be "drop replaceable" in a lot of
procedures that expect a string, then:

>>>print mutable-string-obj

ought to do pretty much the same thing as:

>>>print string

With 'array', I have to say:

>>> print array( 'c' , 'Hello World!' ).tostring()
Hello World!

because:

>>> print array( 'c' , 'Hello World!' )

gives me:

array('c', 'Hello World!')

The need for this "drop replaceable" behaviour is more obvious for
string-like objects than for other sequence types, but I can think
of other cases where we would like our replacement classes to do
a good job of impersonating their antecedents. But clearly, at other
times, we WANT to either get a self-reproducing string ( like array
returns ) or a bracketed type representation ( sort of like the way
files are represented : "<open file '<stdin>', mode 'r' at 20083d98>" )

I assert that this sort of problem is just the place to distinguish
between the strings returned by 'str()' and 'repr()' and I propose
that we agree ( and document ) some conventions on how they are
distinguished.

My feeling is that the repr() of an object is the thing that should
always be its "true" representation, but the str() of an object
should be its most useful and convenient representation - i.e. the
one that minimizes (and hides) what may be non-essential differences
in classes and objects.

Thus:

>>>repr( array( 'f', [ 1,2,3,4,5 ] ) )

would be:

array('f', [1.0, 2.0, 3.0, 4.0, 5.0])

but:

>>>str( array( 'f', [ 1,2,3,4,5 ] ) )

would be:

[1.0, 2.0, 3.0, 4.0, 5.0]

And:
>>>repr(array( 'c', 'Hello World' ))

would be:
array( 'c', 'Hello World' )

but:
>>>str(array( 'c', 'Hello World' ))

would be, simply:

'Hello World'
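
To make the proposed split concrete, here is a rough sketch of a
user-written class that follows it. The class name CharSeq and its
'chars' attribute are hypothetical, invented only for illustration. It
is not the array module and not a proposed implementation, and it is
written in Python 3 syntax. In this sketch str() returns the bare text,
so that printing the object looks the same as printing a plain string.

```python
class CharSeq:
    """Hypothetical mutable, string-like sequence (illustration only)."""

    def __init__(self, text):
        # Keep the characters in a list so they can be changed in place.
        self.chars = list(text)

    def __repr__(self):
        # The "true" representation: names the class and can be eval()'d.
        return "CharSeq(%r)" % "".join(self.chars)

    def __str__(self):
        # The convenient representation: hides the class entirely.
        return "".join(self.chars)


s = CharSeq("Hello World")
print(repr(s))   # shows the constructor-style form
print(str(s))    # shows only the text
print(s)         # print uses str(), so this matches a plain string
```

With this arrangement, code that only prints or formats the object
need not know it is not a real string, while repr() still asserts the
details.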

I suspect it might also be useful in other (non-sequence) classes to
distinguish between the representation that hides differences in
implementation, and the representation that asserts those details.

I think this proposal is in accord with current usage and how
'print' works, and how 'str()' and 'repr()' currently distinguish
some objects. ( i.e. I *Don't* think anyone is going to propose that
it be the other way around entirely! )

Does anyone see any problems with it ?
Would anyone expect this sort of change to 'array' to produce
much broken code ?

- Steve Majewski (804-982-0831) <sdm7g@Virginia.EDU>
- UVA Department of Molecular Physiology and Biological Physics