1, What is a regular expression?
A regular expression (regex) is an expression that concisely describes a set of strings.
2, What is it used for?
① Expressing the characteristics of a type of text. ② Finding or replacing a set of strings at once. ③ Matching all or part of a string.
3, Common operators:
Operator | Description | Example |
. | Any single character (except a newline by default) | |
[] | Character set: gives the allowed values for a single character | [abc] matches a, b, or c; [a-z] matches a single character from a to z |
[^] | Negated character set: gives the excluded values for a single character | [^abc] matches any single character other than a, b, or c |
* | Previous character repeated zero or more times | abc* matches ab, abc, abcc, abccc, and so on |
+ | Previous character repeated one or more times | abc+ matches abc, abcc, abccc, and so on |
? | Previous character appears zero or one time | abc? matches ab, abc |
| | Either the expression on the left or the one on the right | abc|def matches abc, def |
{m} | Previous character repeated exactly m times | ab{4}c matches abbbbc |
{m,n} | Previous character repeated m to n times, inclusive | ab{1,2}c matches abc, abbc |
^ | Beginning of the string | ^abc matches abc only at the beginning of a string |
$ | End of the string | abc$ matches abc only at the end of a string |
() | Grouping marker; inside a group, | separates alternatives | (abc) matches abc; (abc|def) matches abc, def |
\d | A digit, equivalent to [0-9] for ASCII text | |
\w | A word character, equivalent to [A-Za-z0-9_] for ASCII text | |
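A quick way to check the operators in the table is to try them with Python's re module. The sketch below uses re.fullmatch so that a pattern has to cover the whole candidate string; the helper name matches and the sample strings are illustrative assumptions, not part of the original notes.

```python
import re

def matches(pattern, text):
    """Return True if the whole of text matches pattern (hypothetical helper)."""
    return re.fullmatch(pattern, text) is not None

# * : previous character zero or more times
print(matches(r"abc*", "ab"), matches(r"abc*", "abccc"))            # True True
# + : previous character one or more times, so "ab" no longer matches
print(matches(r"abc+", "ab"), matches(r"abc+", "abcc"))             # False True
# ? : previous character is optional, but may not repeat
print(matches(r"abc?", "ab"), matches(r"abc?", "abcc"))             # True False
# {m,n} : previous character m to n times
print(matches(r"ab{1,2}c", "abbc"), matches(r"ab{1,2}c", "abbbc"))  # True False
# [^...] : any single character except the listed ones
print(matches(r"[^abc]", "d"), matches(r"[^abc]", "a"))             # True False
```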
4, Some syntax examples of regular expressions
Regular expression | Matching strings |
P(Y|YT|YTH|YTHO)?N | "PN", "PYN", "PYTN", "PYTHN", "PYTHON" |
PYTHON+ | "PYTHON", "PYTHONN", "PYTHONNN", ... |
PY[TH]ON | "PYTON", "PYHON" |
PY[^TH]?ON | "PYON", "PYAON", "PYBON", "PYCON", ... |
PY{0,3}N | "PN", "PYN", "PYYN", "PYYYN" |
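The rows above can be checked mechanically. The following is a minimal sketch that pairs each pattern with a few strings it should accept and one it should reject; the rejected strings and the name cases are assumptions added for illustration, and the last pattern is spelled PY{0,3}N, which is how Python's re module writes "zero to three Y".

```python
import re

# Each pattern from the table: (strings that should match, strings that should not).
cases = {
    r"P(Y|YT|YTH|YTHO)?N": (["PN", "PYN", "PYTHON"], ["PYTHOON"]),
    r"PYTHON+": (["PYTHON", "PYTHONNN"], ["PYTHO"]),
    r"PY[TH]ON": (["PYTON", "PYHON"], ["PYTHON"]),
    r"PY[^TH]?ON": (["PYON", "PYAON"], ["PYTON"]),
    r"PY{0,3}N": (["PN", "PYYYN"], ["PYYYYN"]),
}

for pattern, (should_match, should_not_match) in cases.items():
    accepted = all(re.fullmatch(pattern, s) for s in should_match)
    rejected = not any(re.fullmatch(pattern, s) for s in should_not_match)
    # Prints True for every row when the pattern behaves as the table says.
    print(pattern, accepted and rejected)
```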
5, Classic examples of regular expressions
^[A-Za-z]+$ | A string made up of the 26 letters |
^[A-Za-z0-9]+$ | A string made up of the 26 letters and digits |
^-?\d+$ | A string in integer form |
^[0-9]*[1-9][0-9]*$ | A string in positive-integer form |
[1-9]\d{5} | Chinese postal code (6 digits) |
[\u4e00-\u9fa5] | A Chinese character |
\d{3}-\d{8}|\d{4}-\d{7} | Phone number in China, for example 010-12345678 |
[1-9]?\d | 0-99 |
1\d{2} | 100-199 |
2[0-4]\d | 200-249 |
25[0-5] | 250-255 |
(([1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])\.){3}([1-9]?\d|1\d{2}|2[0-4]\d|25[0-5]) | IP address (the dot is escaped so that it matches only a literal ".") |
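To see how the four octet ranges combine into the IP address pattern, here is a small sketch. The names OCTET, IP_PATTERN, and is_ipv4 are hypothetical, and the dot between octets is written as \. so that it matches only a literal dot; an unescaped . would also accept strings such as 192x168x0x1.

```python
import re

# One octet: 0-99, 100-199, 200-249, or 250-255.
OCTET = r"([1-9]?\d|1\d{2}|2[0-4]\d|25[0-5])"
# Three octets each followed by a literal dot, then a final octet.
IP_PATTERN = re.compile(r"(" + OCTET + r"\.){3}" + OCTET)

def is_ipv4(text):
    """Return True if text is a dotted-decimal IPv4 address (hypothetical helper)."""
    return IP_PATTERN.fullmatch(text) is not None

print(is_ipv4("192.168.0.1"))   # True
print(is_ipv4("256.1.1.1"))     # False: 256 is outside 0-255
print(is_ipv4("192x168x0x1"))   # False: the separators must be dots
```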
6, Basic use of the re library
re.search() | Searches a string for the first location that matches the regular expression and returns a match object |
re.match() | Matches the regular expression at the beginning of a string and returns a match object |
re.findall() | Searches a string and returns all matching substrings as a list |
re.split() | Splits a string at the places matched by the regular expression and returns a list |
re.finditer() | Searches a string and returns an iterator over the matches; each element is a match object |
re.sub() | Replaces every substring that matches the regular expression and returns the resulting string |
①search(pattern, string, flags=0)
pattern: the regular expression, written as a string or a raw string
string: the string to search
flags: flags that control how the regular expression is applied

    import re

    # Find the first six-digit postcode anywhere in the string.
    match = re.search(r"[1-9]\d{5}", "haha 723300")
    if match:
        print(match.group())

Output:

    723300
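The flags parameter is not exercised in the example above, so here is a minimal sketch of one common flag, re.IGNORECASE. The sample text is made up for illustration.

```python
import re

text = "Learning PYTHON is fun"

# Without a flag the match is case-sensitive, so nothing is found.
print(re.search(r"python", text))                        # None

# re.IGNORECASE (short form re.I) makes the match case-insensitive.
match = re.search(r"python", text, flags=re.IGNORECASE)
print(match.group())                                     # PYTHON
```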
②match(pattern, string, flags=0)
Note that match only tries to match at the beginning of the string. If the pattern does not match there, it does not search any further. It returns a match object when it succeeds and None when it fails.

    import re

    # The string starts with "haha", so match() returns None.
    match = re.match(r"[1-9]\d{5}", "haha 723300")
    print(type(match))

    # Here the postcode is at the very start, so match() succeeds.
    match = re.match(r"[1-9]\d{5}", "723300 haha")
    if match:
        print(match.group())

Output:

    <class 'NoneType'>
    723300

So the difference between search and match is:
match requires the matching substring to be at the beginning of the string, otherwise nothing is found; search has no such requirement.
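One way to make the difference concrete is to run both functions on the same input, and then anchor the search pattern with ^. This is only a sketch; the variable names are illustrative.

```python
import re

text = "haha 723300"
pattern = r"[1-9]\d{5}"

# match() only looks at the start of the text, which is "haha".
print(re.match(pattern, text))              # None
# search() scans forward and finds the postcode later in the text.
print(re.search(pattern, text).group())     # 723300
# Anchoring the pattern with ^ makes search() behave like match() here.
print(re.search(r"^" + pattern, text))      # None
```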
③findall(pattern, string, flags=0)

    import re

    # Collect every six-digit postcode in the string as a list of strings.
    c = re.findall(r"[1-9]\d{5}", "haha723300 xixi612203")
    print(type(c))
    print(c)

Output:

    <class 'list'>
    ['723300', '612203']
④split(pattern, string, maxsplit=0, flags=0)
maxsplit: the maximum number of splits; the remainder of the string is returned as the last element

    import re

    # Split at every postcode; the match at the very end leaves an empty string.
    a = re.split(r"[1-9]\d{5}", "haha723300 xixi612203")
    print(type(a))
    print(a)

    # Split only once; the rest of the string is kept as the last element.
    a = re.split(r"[1-9]\d{5}", "haha723300 xixi612203", maxsplit=1)
    print(a)

    # Split on either a colon or a comma.
    str1 = "name: hpl, age: 18"
    b = re.split(r'\:|\,', str1)
    print(b)

Output:

    <class 'list'>
    ['haha', ' xixi', '']
    ['haha', ' xixi612203']
    ['name', ' hpl', ' age', ' 18']
⑤finditer(pattern, string, flags=0)

    import re

    # finditer() yields one match object per match, so no extra check is needed.
    for m in re.finditer(r"[1-9]\d{5}", "haha723300 xixi612203"):
        print(m.group())

Output:

    723300
    612203
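Because each element produced by finditer is a match object, the position of every match is available along with its text. A minimal sketch, with the list name found chosen only for illustration:

```python
import re

text = "haha723300 xixi612203"

# Collect the matched text together with its (start, end) position.
found = [(m.group(), m.span()) for m in re.finditer(r"[1-9]\d{5}", text)]
print(found)   # [('723300', (4, 10)), ('612203', (15, 21))]
```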
⑥sub(pattern, repl, string, count=0, flags=0)
repl: the string that replaces each match
count: the maximum number of replacements; 0 means replace every match

    import re

    # Replace every six-digit postcode with "love".
    m = re.sub(r"[1-9]\d{5}", "love", "haha723300 xixi612203")
    print(m)

Output:

    hahalove xixilove
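The count parameter is easiest to understand by limiting the replacement to one match. The sketch below also shows that repl may be a function taking the match object, which goes slightly beyond what the notes cover; the masking format is an arbitrary choice for illustration.

```python
import re

text = "haha723300 xixi612203"

# count=1 replaces only the first match; the second postcode is left untouched.
first_only = re.sub(r"[1-9]\d{5}", "love", text, count=1)
print(first_only)   # hahalove xixi612203

# repl can also be a function that receives each match object.
masked = re.sub(r"[1-9]\d{5}", lambda m: m.group()[:2] + "****", text)
print(masked)       # haha72**** xixi61****
```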
7, The match object of the re library
Attributes:
string: the text that was searched
re: the pattern object (regular expression) used for the match
pos: the position in the text where the search started
endpos: the position in the text where the search ended
Methods:
group(): returns the matched string
start(): returns the start position of the match in the original string
end(): returns the end position of the match in the original string
span(): returns the tuple (start(), end())

    import re

    match = re.search(r"[1-9]\d{5}", "haha723300 xixi612203")
    # Attributes of the match object
    print(match.string)
    print(match.re)
    print(match.pos)
    print(match.endpos)
    # Methods of the match object
    print(match.group())
    print(match.start())
    print(match.end())
    print(match.span())

Output:

    haha723300 xixi612203
    re.compile('[1-9]\\d{5}')
    0
    21
    723300
    4
    10
    (4, 10)
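The positions reported by span() are ordinary string indices, so they can be used to slice the original text. A short sketch under that reading:

```python
import re

text = "haha723300 xixi612203"
match = re.search(r"[1-9]\d{5}", text)

start, end = match.span()
# Slicing the original string with the span gives the same text as group().
print(text[start:end] == match.group())   # True
# The text before and after the match.
print(repr(text[:start]))                 # 'haha'
print(repr(text[end:]))                   # ' xixi612203'
```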
8, Greedy matching and minimal matching in the re library
① The re library uses greedy matching by default, that is, it returns the longest matching substring.

    import re

    # .* is greedy, so the match runs to the last N in the string.
    match = re.search(r'PY.*N', 'PYANBNCNDN')
    print(match.group())

Output:

    PYANBNCNDN
② To get a minimal match, add ? after the repetition operator.
Operator | Description |
*? | Previous character repeated zero or more times, minimal match |
+? | Previous character repeated one or more times, minimal match |
?? | Previous character repeated zero or one time, minimal match |
{m,n}? | Previous character repeated m to n times (inclusive), minimal match |

    import re

    # .*? is minimal, so the match stops at the first N.
    match = re.search(r'PY.*?N', 'PYANBNCNDN')
    print(match.group())

Output:

    PYAN
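The other minimal-match operators in the table behave the same way. The sketch below runs the greedy and minimal form of each one against the same text; the patterns are illustrative choices, not taken from the original notes.

```python
import re

text = "PYANBNCNDN"

# + versus +?
print(re.search(r"PY.+N", text).group())        # PYANBNCNDN
print(re.search(r"PY.+?N", text).group())       # PYAN

# ? versus ??
print(re.search(r"PYA?", text).group())         # PYA
print(re.search(r"PYA??", text).group())        # PY

# {m,n} versus {m,n}?
print(re.search(r"PY.{1,5}N", text).group())    # PYANBNCN
print(re.search(r"PY.{1,5}?N", text).group())   # PYAN
```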