Re: Substitute multiple chars.

Guido.van.Rossum@cwi.nl
Mon, 05 Sep 1994 14:48:25 +0200

Magnus asks for tr(1)-like functionality.

This isn't directly provided in Python, but you could code it in
several ways. Suppose we have a dictionary "trans" containing the
desired translations, e.g.

trans = {'a': 'A', 'b': 'B'}

Then we could define a function tr(str, trans) which does the
translation, as follows:

def tr(str, trans):
    res = ''
    for c in str:
        if trans.has_key(c): res = res + trans[c]
        else: res = res + c
    return res

Example run:

>>> tr('abcd', trans)
'ABcd'
>>>

And especially for Ulf, here's a one-liner using lambda and map (which
requires import string):

string.joinfields(map(lambda c, trans=trans: \
                      trans.has_key(c) and trans[c] or c, str), \
                  '')

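The lambda version is probably much slower, and has a bug if you want
to use this for deleting characters: an empty replacement string is
false, so the and/or expression falls through to c and the character
is kept (the for loop version can delete by setting e.g.
trans['x'] = '').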
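
A minimal sketch of that bug, assuming a current Python rather than
the one this message was written for (so "c in trans" stands in for
trans.has_key(c), and ''.join for string.joinfields). The names
tr_loop and tr_and_or are made up for illustration:

```python
# Translation table with one deletion: 'x' maps to the empty string.
trans = {'a': 'A', 'b': 'B', 'x': ''}

def tr_loop(s, trans):
    # Explicit membership test, as in the for loop version.
    res = ''
    for c in s:
        if c in trans:
            res = res + trans[c]
        else:
            res = res + c
    return res

def tr_and_or(s, trans):
    # and/or idiom, as in the one-liner: '' is false, so "or c" wins.
    return ''.join(map(lambda c: c in trans and trans[c] or c, s))

print(tr_loop('abxcd', trans))    # ABcd   ('x' deleted)
print(tr_and_or('abxcd', trans))  # ABxcd  ('x' survives)
```

Only the empty replacement is affected; any non-empty replacement
string is true and goes through the and/or expression unchanged.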

An optimization for a situation where the same translation must be
applied to many input strings would be to augment the trans dictionary
to contain mappings for all characters, so the has_key() test can be
skipped:

def tr(str, trans):
    augment(trans)
    return ftr(str, trans)

def augment(trans):
    for i in range(256):
        c = chr(i)
        if not trans.has_key(c): trans[c] = c

def ftr(str, trans):
    res = ''
    for c in str:
        res = res + trans[c]
    return res
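
To get the benefit with many input strings, one would presumably call
augment() once and then ftr() directly for each string. A sketch of
that usage, again assuming a current Python ("not in" instead of
has_key); the sample list "lines" is made up:

```python
def augment(trans):
    # Fill in identity mappings for character codes 0..255.
    # Note that this modifies the caller's dictionary in place.
    for i in range(256):
        c = chr(i)
        if c not in trans:
            trans[c] = c

def ftr(s, trans):
    # No membership test: assumes every character of s is a key.
    res = ''
    for c in s:
        res = res + trans[c]
    return res

trans = {'a': 'A', 'b': 'B'}
augment(trans)                      # done once, up front

lines = ['abcd', 'baab', 'xyz']     # hypothetical input strings
results = [ftr(line, trans) for line in lines]
print(results)                      # ['ABcd', 'BAAB', 'xyz']
```

One caveat under this assumption: in a Python where strings can hold
characters with codes above 255, ftr() raises KeyError for any such
character, since augment() only covers range(256).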

Also note that the "res = res + trans[c]" operation is actually
building the result string a character at a time, copying the whole
result so far on each step; for very long strings a more efficient
solution might be to break the input up into blocks of 100 or 1000
characters (Python's string copying is blindingly fast, so using too
small a block size loses most of the performance gain to the extra
overhead). If you have to do a whole file, doing it per line probably
pays -- but then it might be faster to run tr(1) in a subprocess :-)
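
One possible reading of the block idea, as an unmeasured sketch for a
current Python: translate each block into a short string of its own,
and only append whole blocks to the long result. The function name
tr_blocks and the blocksize parameter are made up for illustration:

```python
def tr_blocks(s, trans, blocksize=1000):
    res = ''
    for start in range(0, len(s), blocksize):
        # Build a short string per block, a character at a time.
        block = ''
        for c in s[start:start + blocksize]:
            if c in trans:
                block = block + trans[c]
            else:
                block = block + c
        # The long result is only extended once per block.
        res = res + block
    return res

trans = {'a': 'A', 'b': 'B', 'x': ''}
print(tr_blocks('abcdabx', trans, blocksize=3))   # ABcdAB
```

The output is the same as for the plain for loop version; only the
number of times the long result string gets copied differs.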

I haven't measured anything about this.

--Guido van Rossum, CWI, Amsterdam <Guido.van.Rossum@cwi.nl>
<URL:http://www.cwi.nl/cwi/people/Guido.van.Rossum.html>