python – raw strings in Python3, regular expressions
According to the rules of interpolation:
"\n" becomes the ASCII byte 0x0A; this applies to your first string to match.
r"\n" becomes the literal \n, that is, a backslash followed by n; this applies to the second string to match.
"\\n" becomes the literal \n; this applies to your first pattern string.
r"\\n" becomes the literal \\n; this applies to the second pattern string.
When you perform the matching, there is another round of interpolation done on patterns by the re module:
\n turns into the ASCII byte 0x0A (first pattern)
\\n turns into the literal \n (second pattern)
So in the end your first string matches the first pattern, as both contain ASCII 0x0A, and the second string matches the second pattern, as both contain the literal \n.
That's it, no mystery here.
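The two rounds can be checked directly. This is a minimal sketch, not the asker's original code: the variable names are hypothetical stand-ins for the two strings and two patterns described above.

```python
import re

# Hypothetical names standing in for the question's strings and patterns.
first_string = "\n"      # Python stores one character: 0x0A
second_string = r"\n"    # raw literal: backslash, then n
first_pattern = "\\n"    # Python stores backslash + n; re reads that as a newline
second_pattern = r"\\n"  # Python stores backslash + backslash + n; re reads a literal backslash, then n

# Round one: what Python stores for each literal.
print(len(first_string), len(second_string))    # 1 2
print(len(first_pattern), len(second_pattern))  # 2 3

# Round two: what the re module makes of each pattern.
print(bool(re.search(first_pattern, first_string)))    # True
print(bool(re.search(second_pattern, second_string)))  # True
print(bool(re.search(first_pattern, second_string)))   # False
print(bool(re.search(second_pattern, first_string)))   # False
```

The cross pairs fail because one side holds a real newline while the other holds a backslash followed by n.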
A raw string essentially tells the system to read the backslashes in the following string as what they are: backslashes. So, r"\n" is a backslash followed by n.
However, the system treats backslashes in non-raw strings as a way to escape the following character. Hence, "\n" in non-raw strings becomes a newline.
In your code, pattern contains a string with a newline, not a backslash and n. Were you to use pattern = r"\n", pattern would contain a backslash and n, but not a newline.
Hence, searching for \\n in the string essentially tells the system to escape a \ (thus, it searches for a backslash) followed by n.
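A small sketch of that difference, assuming a made-up sample string (not taken from the question) that contains a backslash followed by n and no newline:

```python
import re

newline_pattern = "\n"  # non-raw: a single newline character
raw_pattern = r"\n"     # raw: a backslash followed by n
print(raw_pattern == "\\n")  # True

# Hypothetical sample text: holds backslash + n, but no newline.
sample = r"first\nsecond"

# A pattern containing a real newline finds nothing here.
print(re.search(newline_pattern, sample))  # None

# The regex \\n means: an escaped backslash, then n.
match = re.search(r"\\n", sample)
print(match.span())  # (5, 7)
```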
First of all, let's clarify: c contains a newline, and d contains \n, literally. This can be verified by printing the strings.
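As a rough illustration of that check: the question's actual values of c and d are not shown here, so the contents below are assumed for the sake of the example.

```python
# Assumed contents; only the newline vs. literal backslash-n distinction matters.
c = "one\ntwo"   # non-raw literal: contains a real newline
d = r"one\ntwo"  # raw literal: contains a backslash followed by n

print(c)  # printed across two lines
print(d)  # printed on one line, showing the backslash and n

# repr() makes the difference visible without relying on line breaks.
print(repr(c))  # 'one\ntwo'
print(repr(d))  # 'one\\ntwo'
```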
When you search for "\\n", the regex pattern searches for a newline. So, d does not match.
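If pattern = "\n", then the pattern itself contains a newline, so c matches and d does not.
If pattern = r"\\n", then the regex searches for a backslash followed by n, so d matches and c does not.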