Removing elements from a list while iterating...

Skip Montanaro (skip@dolphin.automatrix.com)
07 Mar 1995 20:38:57 GMT

We all know this is unsafe:

for m in list:
if m matches some criterion:
list.remove(m)

since the list is mutating out from under the internal pointers that the for
loop is using.

I've seen Guido recommend:

l = list[:]
for m in l:
if m matches some criterion:
list.remove(m)

which separates the list being modified from the list being iterated over.

Unfortunately, I have a big list that is expensive to duplicate (~ 100K
elements) and I run over it several times looking for elements that match
different criteria and culling them. I came up with the following way out
of my dilemma.

Instead of creating the list with my data directly as elements of the list,
I create elements that are themselves lists, with the first component the
real data and the second a marker number:

list = []
for l in foo.readlines():
list.append([l, 0])

Then I can iterate over them and simply set the marker when the real data
satisfies the criteria:

for elt in list:
if elt[1] == 1: continue
if elt[0] matches some criterion:
# do something with elt[0]
elt[1] == 1

I got a tremendous performance increase in my little application --
analyzing my Web logs -- which hadn't been a problem until we were tagged as
cool site of the day (http://www.infi.net/cool.html) last Saturday and had
60,000 accesses. Our poor little server. Thank goodness we didn't get
picked mid-week...

-- 
Skip Montanaro		skip@automatrix.com			  (518)372-5583
Musi-Cal: http://www.calendar.com/concerts/ -or- concerts@calendar.com
Internet Conference Calendar: http://www.calendar.com/conferences/