Re: deep copy

Guido van Rossum (Guido.van.Rossum@cwi.nl)
Mon, 29 Jun 1992 11:46:39 +0200

>I was working on an application which requires assigning (i.e.
>deep copying) a two dimensional array (i.e. a list of lists). As
>an aside, it also requires adding a one dimensional array
>elementwise to a one dimensional slice of a two dimensional
>array.
>
>In general, my mindset is APL-ish here and I would like to assign
>(i.e. deep copy) multi-dimensional arrays. (Actually I would like
>to do other APL style operations as well including elementwise
>operations such as addition and subtraction on pairs of
>multidimensional, or at least one and two dimensional arrays.)
>
>At least some of these items (such as assigning multidimensional
>arrays in a deep copy sense) seem as fundamental to me as strings
>and dictionaries.
>
>I assume that the applications you refer to above were primarily
>systems software oriented whereas I am referring more to
>mathematical, statistical and data processing sorts of
>applications.

Yes, I'm afraid that's the general mindset from which Python
originated. If you really need a lot of 2-or-more dimensional array
operations you may be better off adding a new type to the language.
The type could use the interface style used by dictionaries (the
general term being "mappings") and it could define additional methods
for slice operations. Buzz, buzz, whirr, whirr (thinking aloud :-)
you might use it as follows:

>>> import matrix
>>> m = matrix.new(10, 10, 0.0) # Create a 10x10 matrix initialized with zeros
>>> print m.dim()
(10, 10)
>>> print m.keys()
[(0, 0), (0, 1), ..., (0, 9), (1, 0), (1, 1), ..., (9, 9)]
>>> for i in range(m.dim()[0]): m[i, i] = 1.0
>>> for row, col in m.keys(): print row, col, m[row, col]
0 0 1.0
0 1 0.0
.
.
.
9 9 1.0
>>> m2 = m # Usual pointer copy -- m2 is the same object as m
>>> m3 = m.copy() # creates a new object -- elements are still shared
>>> m4 = m.slice((0, 10), (5, 10)) # creates a new object
>>> # containing rows [0, 10), cols [5, 10)
>>> print m4.dim()
(10, 5)
>>> print m4.keys()
[(0, 0), (0, 1), ..., (0, 4), (1, 0), (1, 1), ..., (9, 4)]
>>> # Note how the indexes always start at 0 (point of discussion though)
>>> m.setslice((0, 10), (0, 5), m4) # Copy right half (m4) to left half
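
For illustration only, here is a minimal sketch of how such a type might be written in Python itself, using present-day syntax. Everything in it is an assumption made for the example: the class name Matrix, the row-major flat list used for storage, and the decision to raise KeyError for out-of-range indexes. No such module exists; the method names simply follow the session above.

```python
# Hypothetical sketch of the "matrix" type from the session above.
# Class name, storage layout and error handling are assumptions.

class Matrix:
    def __init__(self, nrows, ncols, fill):
        self._nrows = nrows
        self._ncols = ncols
        # Row-major flat storage: element (r, c) lives at r * ncols + c.
        self._data = [fill] * (nrows * ncols)

    def dim(self):
        return (self._nrows, self._ncols)

    def keys(self):
        # Mapping-style interface: every valid (row, col) pair, row by row.
        return [(r, c) for r in range(self._nrows) for c in range(self._ncols)]

    def _offset(self, key):
        row, col = key
        if not (0 <= row < self._nrows and 0 <= col < self._ncols):
            raise KeyError(key)
        return row * self._ncols + col

    def __getitem__(self, key):
        # m[i, j] arrives here with key == (i, j).
        return self._data[self._offset(key)]

    def __setitem__(self, key, value):
        self._data[self._offset(key)] = value

    def copy(self):
        # New matrix object; the element objects themselves are shared.
        result = Matrix(self._nrows, self._ncols, None)
        result._data = self._data[:]
        return result

    def slice(self, rows, cols):
        # rows and cols are half-open (start, stop) pairs.
        (r0, r1), (c0, c1) = rows, cols
        result = Matrix(r1 - r0, c1 - c0, None)
        for r in range(r0, r1):
            for c in range(c0, c1):
                result[r - r0, c - c0] = self[r, c]  # indexes restart at 0
        return result

    def setslice(self, rows, cols, source):
        (r0, r1), (c0, c1) = rows, cols
        if source.dim() != (r1 - r0, c1 - c0):
            raise ValueError("source shape does not match the target slice")
        for r in range(r0, r1):
            for c in range(c0, c1):
                self[r, c] = source[r - r0, c - c0]


def new(nrows, ncols, fill):
    # Stand-in for matrix.new() in the session above.
    return Matrix(nrows, ncols, fill)
```

With this sketch, new(10, 10, 0.0) followed by the same calls as in the session gives the dimensions shown there.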

Vectors extracted from a matrix are lists:

>>> v1 = m.row(0)
>>> print v1
[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
>>> v2 = m.col(1)
>>> v3 = m.vec(0, 2) # same as m.row(2)
>>> v4 = m.vec(1, 3) # same as m.col(3)
>>> m.setrow(4, v3)
>>> m.setcol(5, v4)
>>> m.setvec(0, 6, v1)
>>> m.setvec(1, 7, v2)
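
A rough sketch of the vec/setvec idea, written in present-day Python as plain functions over a list of lists rather than as methods of the hypothetical matrix type. The function signatures and the length checks are assumptions for the example; axis 0 selects a row and axis 1 a column, as in the calls above.

```python
# Hypothetical helpers mirroring m.vec() and m.setvec() on a list of lists.

def vec(rows, axis, index):
    """Return a new list: row `index` if axis == 0, column `index` if axis == 1."""
    if axis == 0:
        return list(rows[index])
    if axis == 1:
        return [row[index] for row in rows]
    raise ValueError("axis must be 0 or 1")


def setvec(rows, axis, index, values):
    """Overwrite row or column `index` in place with `values`."""
    if axis == 0:
        if len(values) != len(rows[index]):
            raise ValueError("wrong number of values for a row")
        rows[index][:] = values
    elif axis == 1:
        if len(values) != len(rows):
            raise ValueError("wrong number of values for a column")
        for row, value in zip(rows, values):
            row[index] = value
    else:
        raise ValueError("axis must be 0 or 1")
```

Because vec returns a fresh list, changing the extracted vector does not change the matrix until it is written back with setvec.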

Dot and other products could simply be added as methods.

I'm not sure what needs to be generalized to make this work for higher
dimensional arrays. Sparse arrays are another matter. One
interface would make references to non-existing elements illegal, and
require the user to use keys() or has_key() to figure out whether an
element exists; another interface would keep the sparseness internal
to the implementation and return a default value.
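
The two sparse interfaces might be sketched as follows, in present-day Python. The class names StrictSparse and DefaultSparse and the dictionary used for storage are assumptions made for the example; only keys() and has_key() come from the text above.

```python
# Hypothetical sketch of the two sparse-array interfaces described above.

class StrictSparse:
    """References to non-existing elements are illegal (KeyError)."""

    def __init__(self):
        self._cells = {}  # maps (row, col) to the stored value

    def __setitem__(self, key, value):
        self._cells[key] = value

    def __getitem__(self, key):
        return self._cells[key]  # raises KeyError for a missing element

    def keys(self):
        return sorted(self._cells)

    def has_key(self, key):
        return key in self._cells


class DefaultSparse(StrictSparse):
    """Sparseness stays internal: missing elements read as a default value."""

    def __init__(self, default):
        StrictSparse.__init__(self)
        self._default = default

    def __getitem__(self, key):
        return self._cells.get(key, self._default)
```

With the first class the caller checks has_key() before reading; with the second, reading an element that was never stored just returns the default and stores nothing.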

Note that there is no reason why elements of matrixes should be
numbers, except when they are used by numerical operations.

The language currently makes it hard to support element-wise addition
etc. using the standard notation (m1+m2 etc.) but it might evolve --
although + is already defined for lists to mean concatenation.
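
To make the clash concrete, here is a small sketch in present-day Python: + on two lists concatenates them, so element-wise addition has to be spelled as an explicit function. The name add_elementwise and the length check are assumptions for the example.

```python
# "+" on lists means concatenation, so element-wise addition needs a helper.

def add_elementwise(a, b):
    """Return a new list holding a[i] + b[i] for every index i."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return [x + y for x, y in zip(a, b)]


concatenated = [1, 2, 3] + [10, 20, 30]             # six elements
summed = add_elementwise([1, 2, 3], [10, 20, 30])   # three elements
```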

Other ideas?

--Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"It's not much of a cheese shop really, is it?"