Re: Is this a regex bug or just me?

Guido.van.Rossum@cwi.nl
Tue, 07 Feb 1995 14:06:12 +0100

[...]
> so I built the following pattern:
>
> %{smonth}/%{sday}%{? - %{?%{emonth}/}%{eday}}
>
> which generated
>
> \([0-9]+\)/\([0-9]+\)\([ ]*-[ ]*\(\([0-9]+\)/\)?\([0-9]+\)\)?
>
> and
>
> {'smonth': 1, 'sday': 2, 'emonth': 5, 'eday': 6}
>
> just as I expected. It works fine for date strings of the form 1/25 or
> 4/30-5/1, but returns incorrect results for dates of the form 1/25-26. The
> return value of group(5) is '26' instead of None. This is especially
> perplexing since group(4), which encloses group(5), correctly returns None.
>
> (For those with acute regexp-itis group(4) and group(6) are nested inside
> group(3). group(5) is nested inside group(4). Both group(3) and group(4)
> are optional. I saw nothing in the Emacs regexp syntax info page that would
> suggest optional regexps should not be nested within one another.)
>
> I noticed that the version of Tatu Ylonen's regexpr.c code used in Python
> seemed to not be the most recent, so I fetched the version that was posted
> to comp.sources.misc (in volume 27) and the one patch for it I found (in
> volume 29), merged Guido's changes into them and rebuilt Python (1.1.1) but
> saw no improvement.
>
> Can anybody steer me in the right direction? Have I
>
> a. overstepped the bounds of regular expressions (nesting multiple
> optional regexps, perhaps)?
> b. failed in my understanding of how they work?
> c. generated a faulty regular expression?
> d. found a bug in regexpr.c?
> e. some, all or none of the above? :-)

It may be a bug in Tatu Ylonen's code. I tried this in Emacs 19 and
it seems to correctly make group 5 empty. I haven't looked at the
code but can imagine that the register values (indicating where the
parentheses match) are filled in while making partial matches and that
nested register values are not erased when backtracking at a higher
level. I'm not sufficiently versed in the code to be able to find a
fix. Tatu?
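
For comparison, here is a minimal sketch of the same test. It assumes
Python's current re module rather than the regexpr.c-based module under
discussion, so the pattern is transliterated from \( \) grouping to bare
parentheses. The helper name parse_date_range is hypothetical; the
name-to-group map is the one from the quoted message.

```python
import re

# The generated pattern, transliterated from Emacs-style \( \) grouping
# to the bare-parenthesis syntax of Python's current re module.
PATTERN = re.compile(r"([0-9]+)/([0-9]+)([ ]*-[ ]*(([0-9]+)/)?([0-9]+))?")

# Name-to-group-number map, as given in the quoted message.
GROUPS = {"smonth": 1, "sday": 2, "emonth": 5, "eday": 6}


def parse_date_range(text):
    """Return a dict of the named fields, or None if text does not match."""
    m = PATTERN.match(text)
    if m is None:
        return None
    # A group that did not take part in the match is reported as None.
    return {name: m.group(index) for name, index in GROUPS.items()}


if __name__ == "__main__":
    for s in ("1/25", "4/30-5/1", "1/25-26"):
        print(s, parse_date_range(s))
```

If the backtracking explanation above is right, a matcher that resets
nested registers should report emonth as None for 1/25-26, because the
enclosing group 4 takes no part in the final match.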

--Guido van Rossum, CWI, Amsterdam <mailto:Guido.van.Rossum@cwi.nl>
<http://www.cwi.nl/cwi/people/Guido.van.Rossum.html>