Matching patterns across multiple lines
You are trying to match a block of text with a regular expression, and the match needs to span multiple lines.
This problem typically arises when you use the dot (.) to match any character and forget that the dot does not match newlines. For example, suppose you are trying to match C-style comments:
>>> import re
>>> comment = re.compile(r'/\*(.*?)\*/')
>>> text1 = '/* this is a comment */'
>>> text2 = '''/* this is a
...  multiline comment */
... '''
>>> comment.findall(text1)
[' this is a comment ']
>>> comment.findall(text2)
[]
To fix the problem, you can modify the pattern string to add support for newlines. For example:
>>> comment = re.compile(r'/\*((?:.|\n)*?)\*/')
>>> comment.findall(text2)
[' this is a\n multiline comment ']
In this pattern, (?:.|\n) specifies a non-capturing group (that is, it defines a group for the purposes of matching, but the group is not captured separately or numbered).
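To see what the non-capturing group changes, the sketch below compares it with a capturing version of the same inner group. The variable names (text, noncapturing, capturing) are illustrative only and are not part of the original recipe.

```python
import re

text = '/* this is a\n multiline comment */\n'

# Inner group is non-capturing: only the outer group (the comment body) is captured.
noncapturing = re.compile(r'/\*((?:.|\n)*?)\*/')

# Inner group is capturing (for contrast): the pattern now has two groups,
# so findall() returns a tuple per match instead of a plain string.
capturing = re.compile(r'/\*((.|\n)*?)\*/')

print(noncapturing.findall(text))   # list of strings
print(capturing.findall(text))      # list of 2-tuples
```

With the capturing version, the second element of each tuple is just the last single character matched by the inner group, which is rarely useful. That is why the recipe prefers (?:...) here.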
The re.compile() function accepts a flag, re.DOTALL, that is useful here. It makes the dot (.) in a regular expression match every character, including newlines. For example:
>>> comment = re.compile(r'/\*(.*?)\*/', re.DOTALL)
>>> comment.findall(text2)
[' this is a\n multiline comment ']
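As a minimal sketch of what the flag does in isolation, the following compares the default dot with re.DOTALL on a two-character gap containing a newline. It also uses the inline form (?s), which the re module treats as equivalent to passing re.DOTALL. The pattern and the sample string are made up for illustration.

```python
import re

sample = 'a\nb'

# Default: the dot matches any character except a newline.
default_dot = re.compile(r'a.b')

# With re.DOTALL the dot also matches a newline.
dotall_dot = re.compile(r'a.b', re.DOTALL)

# The inline flag (?s) at the start of the pattern has the same effect.
inline_dot = re.compile(r'(?s)a.b')

print(default_dot.search(sample))                 # no match object
print(repr(dotall_dot.search(sample).group()))
print(repr(inline_dot.search(sample).group()))
```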
Using the re.DOTALL flag works well for simple cases. However, if the pattern is very complex, or if you are combining several separate patterns to tokenize a string, the flag can cause problems. Given the choice, it is usually better to write the pattern so that it works correctly without needing extra flags.
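One way the flag can cause trouble is sketched below, under the assumption that a line-comment pattern and a block-comment pattern are joined into a single alternation. The names LINE_COMMENT, BLOCK_COMMENT_DOT, BLOCK_COMMENT_SAFE and the sample source string are hypothetical. The line-comment pattern relies on the dot stopping at the newline, so applying re.DOTALL to the combined pattern changes its meaning.

```python
import re

source = 'x = 1; // set x\n/* block\n comment */\ny = 2;\n'

LINE_COMMENT = r'//.*'                     # relies on '.' stopping at the newline
BLOCK_COMMENT_DOT = r'/\*.*?\*/'           # needs re.DOTALL to span lines
BLOCK_COMMENT_SAFE = r'/\*(?:.|\n)*?\*/'   # spans lines without any flag

# The flag applies to the whole combined pattern, including LINE_COMMENT.
with_flag = re.compile('|'.join([LINE_COMMENT, BLOCK_COMMENT_DOT]), re.DOTALL)

# No flag needed: each sub-pattern handles newlines itself.
without_flag = re.compile('|'.join([LINE_COMMENT, BLOCK_COMMENT_SAFE]))

print(with_flag.findall(source))      # the line comment swallows the rest of the text
print(without_flag.findall(source))   # two separate comments
```

In this sketch the flagged version returns a single match running from // to the end of the text, while the flag-free version returns the line comment and the block comment as two separate matches.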