global vs builtin (was Re: lambda construction)

Tim Peters (tim@ksr.com)
Thu, 16 Jun 94 01:30:44 -0400

> [guido]
> ...
> I am thinking about optimizing the distinction between globals and
> built-ins in the same manner as that between locals and globals -- scan
> the entire module for assignments, including global statements in its
> functions and methods; anything that isn't obviously global must be
> built-in. This may stop some code from working but that would've been
> walking on thin ice anyway.

I can think of very little real code this would break, assuming that
"from ModuleName import *" at module level turned off the optimization.
Am I correct in believing that these cover the other broken cases?

1) A module vrbl binding is created via exec, e.g.

    exec 'vrbl = 3'

2) A module vrbl binding is created via direct attribute assignment in a
different module, e.g.

    # in module B
    import A
    A.vrbl = 3

3) A module vrbl binding is created via direct attribute assignment in
the same module, e.g.

    # in module A
    import A        # the module has to import itself for the name A to exist
    A.vrbl = 3

4) A module vrbl binding is created via direct assignment to the module's
__dict__, e.g.

    # in module B (or mutatis mutandis in module A)
    import A
    A.__dict__['vrbl'] = 3

5) A module vrbl binding is created via setattr, e.g.

    # in module B (or mutatis mutandis in module A)
    import A
    setattr(A, 'vrbl', 3)

6) A module vrbl binding is created via assignment to the f_globals dict
attribute of a frame object, e.g.

    frame.f_back.f_globals['vrbl'] = 3
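
For illustration, the six forms can be reproduced in present-day Python 3,
where exec is a function rather than a statement. This is only a sketch
under stated assumptions: the module object A is a hypothetical stand-in
built with types.ModuleType, the names v1 .. v6 are made up, and case 6
relies on the CPython-specific sys._getframe.

```python
import sys
import types

# Hypothetical stand-in for module A; a real one would come from `import A`.
A = types.ModuleType("A")

# Case 1: exec run against the module's global namespace.
exec("v1 = 1", A.__dict__)

# Cases 2 and 3: direct attribute assignment on the module object
# (the same statement whether it runs in module B or in A itself).
A.v2 = 2

# Case 4: direct assignment into the module's __dict__.
A.__dict__["v4"] = 4

# Case 5: setattr on the module object.
setattr(A, "v5", 5)

# Case 6: a callee reaches back into its caller's frame and binds a
# name in the caller's globals.
def poke_caller():
    sys._getframe(1).f_globals["v6"] = 6

# Run the call with A's namespace as globals; `poke` is passed as a local
# so that the call itself adds nothing to A.
exec("poke()", A.__dict__, {"poke": poke_caller})

bound_names = sorted(name for name in vars(A) if name.startswith("v"))
print(bound_names)
```

None of these bindings appears as an assignment statement in A's own
source text, which is why a scan of that text cannot see them.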

For #1, 'exec' at module level could also turn off the optimization. For
#2, #4 & #5, I think you'd be doing us a _favor_ if you "broke" them <0.4
grin>. Nobody ever does #3. Steve M's recently reposted ImportFrom
package does fiddle the NS of the module from which it's called via #6,
but I haven't seen that form of abuse <wink> elsewhere.
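
The scan under discussion, with the two opt-outs suggested above (a
module-level "import *" or exec), can be roughed out with the ast module of
current Python 3. This is a sketch, not the interpreter's method: the
function names and the sample source are hypothetical, and it ignores
comprehension scopes and other fine points of the binding rules.

```python
import ast

_SCOPES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef, ast.Lambda)


def _module_level_nodes(tree):
    """Yield nodes reachable from the module without entering def/class/lambda."""
    stack = list(tree.body)
    while stack:
        node = stack.pop()
        yield node
        if not isinstance(node, _SCOPES):
            stack.extend(ast.iter_child_nodes(node))


def visible_module_bindings(source):
    """Return the names the module visibly binds, or None if the scan gives up."""
    tree = ast.parse(source)
    bound = set()
    for node in _module_level_nodes(tree):
        # Opt-out: "from M import *" can bind any name at all.
        if isinstance(node, ast.ImportFrom) and any(a.name == "*" for a in node.names):
            return None
        # Opt-out: exec at module level can bind any name at all.
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id == "exec"):
            return None
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound.add((alias.asname or alias.name).split(".")[0])
    # "global" statements anywhere, including inside functions and methods.
    for node in ast.walk(tree):
        if isinstance(node, ast.Global):
            bound.update(node.names)
    return bound


SAMPLE = """
import os
limit = 10
def f(xs):
    global counter
    counter = len(xs)
    return range(limit)
"""

bound = visible_module_bindings(SAMPLE)
print(sorted(bound))
```

Under this scheme a non-local name that is referenced but absent from the
returned set, such as len or range in the sample, would be presumed
built-in; a None result means the optimization is turned off for the module.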

BTW, as recently as March 7th _you_ wrote, about #2:

> The same kind of optimization cannot be performed for modules, since
> a module can have global variables added dynamically from other
> modules -- e.g.
>
> import foo
> foo.bar = 12
>
> adds the variable 'bar' to module 'foo'. This is a feature.

But since I never saw that as "a feature", I won't muddy the issues by
pointing that out <grin>.

> ...
> Given the relative frequency with which built-ins like len, range and
> str are called this may be a worthwhile optimization.

Can it be pushed another level? I.e., skipping the global NS search is
worth something, but skipping a search altogether would be worth a lot
more. Sketch:

1) Any given release of Python has a fixed set of names it _knows_ are
built-in.

2) The built-in NS could be implemented as a vector of objects
corresponding to the known names, + a (initially empty) dict on the
side for "built-in names" that may be added dynamically.

3) When Python compiles code for a module & determines that a name must
be a built-in, then:

   A) If the name is a known built-in name, it compiles to a new opcode
      that simply indexes into the vector of built-in objects. I.e.,
      much the way a function's local NS is implemented today -- the
      point is that no search or external call would be required to find
      or alter the current binding for a builtin name.

   B) If the name is not in the vector of known built-in names, compile
      to an opcode that searches/changes the dict of unknown-at-compile-
      time builtin names.

How to keep it all consistent in the presence of setattr/hasattr/getattr/
del etc applied to __builtin__ isn't clear to me, except that it's
clearly doable given enough icky special-casing. A cheap prototype that
ignores the icky cases would be interesting, to see how much time it
might actually save.
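
As a toy model of the vector-plus-dict layout sketched above, ignoring the
icky cases entirely: the class name BuiltinTable and the opcode names
LOAD_BUILTIN_FAST and LOAD_BUILTIN_NAME are invented for illustration and
do not correspond to anything in the interpreter. It is written for current
Python 3, where the module is spelled builtins rather than __builtin__.

```python
import builtins


class BuiltinTable:
    """Fixed vector for names known at compile time, plus a side dict."""

    def __init__(self, known_names):
        # Name -> slot number; consulted only while "compiling".
        self.index = {name: i for i, name in enumerate(known_names)}
        # The vector of built-in objects; consulted at run time.
        self.slots = [getattr(builtins, name) for name in known_names]
        # Initially empty dict for built-in names added dynamically.
        self.extra = {}

    def compile_load(self, name):
        """Pick a (hypothetical) opcode and operand for loading `name`."""
        if name in self.index:
            return ("LOAD_BUILTIN_FAST", self.index[name])   # case A
        return ("LOAD_BUILTIN_NAME", name)                   # case B

    def execute(self, opcode, operand):
        """Run one of the two load opcodes."""
        if opcode == "LOAD_BUILTIN_FAST":
            return self.slots[operand]      # plain index, no search
        if opcode == "LOAD_BUILTIN_NAME":
            return self.extra[operand]      # dict search
        raise ValueError(opcode)

    def store(self, name, value):
        """Rebind a built-in name, keeping vector and dict consistent."""
        if name in self.index:
            self.slots[self.index[name]] = value
        else:
            self.extra[name] = value


table = BuiltinTable(["len", "range", "str"])

fast = table.compile_load("len")
slow = table.compile_load("frobnicate")      # not a known built-in name
table.store("frobnicate", lambda x: x + 1)   # added dynamically

print(fast, table.execute(*fast)([1, 2, 3]))
print(slow, table.execute(*slow)(41))
```

The index dict is needed only at compile time; at run time the fast opcode
touches nothing but the list, which is the whole point of the scheme.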

life-in-the-thought-to-be-fast-lane-ly y'rs - tim

Tim Peters tim@ksr.com
not speaking for Kendall Square Research Corp