Re: [Q] How to search the back slash?

Guido.van.Rossum@cwi.nl
Sat, 28 Jan 1995 11:04:58 +0100

> Hi. Can someone please tell me how to search the string
> '\\' (one back slash after metachar substitution)? I got
> the following result:
>
> tomkwong-bash$ python
> Python 1.1 (Jan 5 1995)
> Copyright 1991-1994 Stichting Mathematisch Centrum, Amsterdam
> >>> import regex
> >>> regex.compile('\\')
> Traceback (innermost last):
> File "<stdin>", line 1, in ?
> regex.error: Regular expression ends prematurely <=== oops
> >>> regex.compile('\n')
> <regex object at 418f44> <=== ok

Short answer: use four backslashes: regex.compile('\\\\') matches a
single backslash.

Long answer: when you do regex.compile('...'), the string literal is
subject to *two* substitution passes: one where the string literal is
translated (this translates \n, \t etc. to their ASCII equivalents, as
well as \000 and \x00); and one where the regular expression
metacharacters are interpreted: this translates things like '.', 'c*'
and '[a-z]'. Now the string literal translation leaves unrecognized
backslash escapes in, so that e.g. '\[' really is a string of two
characters, and the regex package will interpret the backslash to mean
"the next character is not a regex metacharacter". Now the only
character that's special for both translation passes is backslash. If
you want a regex that matches a single backslash, you'll have to pass
regex.compile() a string object containing two backslashes. And since
you must double backslashes in string literals, the way to write one
is four backslashes.
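
A minimal sketch of the two passes. It assumes the later standard
"re" module as a stand-in for the "regex" module discussed above
(the four-backslash rule is the same there), and the subject string
'C:\\temp' is only an illustration:

```python
import re  # assumption: stand-in for the old "regex" module

# Pass 1 (string literal): four backslashes in the source text
# become a string object holding two backslash characters.
pattern_text = '\\\\'
assert len(pattern_text) == 2

# Pass 2 (regular expression): the two characters are read as
# "escaped backslash", i.e. match one literal backslash.
pattern = re.compile(pattern_text)
match = pattern.search('C:\\temp')  # the subject string is C:\temp
assert match is not None and match.group() == '\\'
backslash_index = match.start()

# A string object holding only one backslash leaves the pattern with
# a dangling escape, so compiling it fails, as in the question.
try:
    re.compile('\\')
    lone_backslash_error = None
except re.error as exc:
    lone_backslash_error = exc

# Raw string literals (added to Python after this thread) skip
# pass 1, so two backslashes in the source are enough there.
assert r'\\' == pattern_text
```

Here backslash_index is the position of the backslash in the subject
string, and lone_backslash_error holds the exception raised for the
one-backslash pattern.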

Hope this helps,

--Guido van Rossum, CWI, Amsterdam <mailto:Guido.van.Rossum@cwi.nl>
<http://www.cwi.nl/cwi/people/Guido.van.Rossum.html>