Re: Unique string storage (was Re: Why don't stringobjects have methods?)

Mark Lutz (lutz@KaPRE.COM)
Mon, 4 Apr 94 16:32:32 MDT

Ahem; I typed (without checking...):

> Example use---
> x = Symbol('abc')
> y = Symbol('a' + 'bc')
> x is y -> yes: same dictionary/property-list

of course this is wrong, since x and y are different class
instance objects. You'd need to do:

x.data is y.data -> yes: same dictionary/property-list

unless you store the Symbol instance itself in the string dictionary:

Module Symbol2.py---

class Symbol:
    def __init__(self, str):
        self.data = {'name': str}       # property list, seeded with the name
    def __repr__(self):
        return self.getprop('name')
    def getprop(self, key):
        try:
            return self.data[key]
        except KeyError:                # missing property
            return None
    def putprop(self, key, value):
        self.data[key] = value

_tab = {}                               # string -> its one Symbol instance

def intern(str):
    try:
        return _tab[str]
    except KeyError:                    # first time this string is seen
        _tab[str] = Symbol(str)
        return _tab[str]

New-and-improved example use---

x = intern('abc')
y = intern('a' + 'bc')
x is y -> yes: same class instance
print y -> 'abc'
y.getprop('name') -> 'abc'
y.putprop('value', 99)
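
For anyone who wants to run the example above, here is a minimal sketch of
the same module in present-day Python 3. It is an illustration only: the
parameter name `name` (instead of `str`, which would shadow the built-in
type) and the use of dict.get() in getprop are substitutions made for this
sketch, not part of the module as posted.

```python
class Symbol:
    def __init__(self, name):
        self.data = {'name': name}      # property list, seeded with the name

    def __repr__(self):
        return self.getprop('name')

    def getprop(self, key):
        return self.data.get(key)       # None when the property is missing

    def putprop(self, key, value):
        self.data[key] = value


_tab = {}                               # string -> its one shared Symbol


def intern(name):
    try:
        return _tab[name]
    except KeyError:                    # first time this string is seen
        _tab[name] = Symbol(name)
        return _tab[name]


x = intern('abc')
y = intern('a' + 'bc')
y.putprop('value', 99)

print(x is y)                  # True: both names refer to one instance
print(x.getprop('value'))      # 99: the property set through y is seen via x
print(x.getprop('missing'))    # None
```

Because both lookups return the one instance stored in _tab, a property
set through y is visible through x.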

Now, since this makes 'Symbol' just a wrapper on top of a dictionary,
and since a class instance is a form of dictionary itself, you can use the
getattr() and setattr() built-in functions, and get rid of the dictionary
altogether, saving space:

Module Symbol3.py---

class Symbol:
    def __init__(self, str):
        self.name = str
    def __repr__(self):
        return self.name
    def getprop(self, key):
        try:
            return getattr(self, key)
        except AttributeError:          # getattr raises this, not KeyError
            return None
    def putprop(self, key, value):
        setattr(self, key, value)

<rest stays the same>

or you could get rid of getprop, putprop as well:

Module Symbol4.py---

class Symbol:
    def __init__(self, str):
        self.name = str
    def __repr__(self):
        return self.name

_tab = {}                               # string -> its one Symbol instance

def intern(str):
    try:
        return _tab[str]
    except KeyError:                    # first time this string is seen
        _tab[str] = Symbol(str)
        return _tab[str]

New-and-re-improved example use---

x = intern('abc')
y = intern('a' + 'bc')
x is y -> yes: same class instance
print y -> 'abc'
getattr(y, 'name') -> 'abc'
setattr(y, 'value', 99)

which is very simple code, but will still run much slower than
built-in symbol table support.
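
As a rough check of the attribute-based version, here is a sketch in
present-day Python 3 showing that the instance's own attribute dictionary
ends up playing the role of the property list. The parameter name `name`
and the calls to vars() and three-argument getattr() are additions made for
this illustration, not part of the module as posted.

```python
class Symbol:
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


_tab = {}                               # string -> its one shared Symbol


def intern(name):
    try:
        return _tab[name]
    except KeyError:                    # first time this string is seen
        _tab[name] = Symbol(name)
        return _tab[name]


x = intern('abc')
y = intern('a' + 'bc')
setattr(y, 'value', 99)                 # same effect as: y.value = 99

print(x is y)                           # True
print(getattr(x, 'value'))              # 99
print(getattr(x, 'missing', None))      # None: default instead of AttributeError
print(vars(x) == {'name': 'abc', 'value': 99})   # True
```

The last line shows that the properties live directly in the instance's
attribute dictionary, so no separate 'data' dictionary is needed.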

Mark Lutz