9527NNN 2021-11-25 10:09:47

The parsing libraries are all used in much the same way.

The biggest advantage of bs4 is that its selector expressions are concise and extraction is simple. The disadvantage is that the extracted text needs further processing. With re and lxml, by contrast, you can pull out exactly the text you need directly, with no redundant markup.
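A minimal sketch of that trade-off, assuming a made-up HTML fragment (the markup and variable names below are hypothetical and not taken from the spider): the bs4 selector is short but returns tags that still carry their markup, while a regular expression with a capture group returns the inner text directly.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, made up for this illustration.
html = '<div id="content"><p>first line</p><p>second line</p></div>'

# bs4: the CSS selector is short, but each result is a Tag
# whose string form still includes the <p>...</p> markup.
soup = BeautifulSoup(html, 'html.parser')
bs4_result = [str(tag) for tag in soup.select('#content p')]

# re: the capture group returns only the text between the tags.
re_result = re.findall(r'<p>(.*?)</p>', html)

print(bs4_result)  # ['<p>first line</p>', '<p>second line</p>']
print(re_result)   # ['first line', 'second line']
```

The regular expression here only works because the sample paragraphs are simple and sit on one line; on real pages it usually needs more care than the selector does.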

An experienced author has already written up the specific usage in great detail.

The official documentation also explains how to use the bs4 library in detail, and its summary is very comprehensive.

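import requests
from bs4 import BeautifulSoup

class TiebaSpider(object):
    def __init__(self):
        pass  # the body is not shown in this excerpt

    def get_html(self, url):
        # Send a browser-like User-Agent so the site serves the normal page
        res = requests.get(url=url, headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36'})
        html = res.text  # assumed: the excerpt never assigns html
        return html

    def parse_html(self, html):
        # Select every <p> inside the element with id="content"
        parse_html = BeautifulSoup(html, 'html.parser')
        text = parse_html.select('#content p')
        return text

    def save_html(self, filename, html):
        # UTF-8 is set explicitly so non-ASCII text can be written on Windows
        with open('D:/request/' + filename, 'w', encoding='utf-8') as f:
            for i in html:
                j = str(i)
                j = j[3:-4]  # strip the surrounding <p> and </p>
                f.write(j + '\n')  # assumed: the excerpt stops before the write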

Part of the source code
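The slice `j[3:-4]` in `save_html` is the "reprocessing" step, and it only gives clean text when each element is a bare `<p>...</p>`. As a hedged sketch (the sample markup is hypothetical), bs4's `get_text()` is one sturdier way to do the same cleanup when a paragraph has attributes or nested tags:

```python
from bs4 import BeautifulSoup

# Hypothetical paragraph with an attribute and a nested tag.
html = '<div id="content"><p class="note">Hello <b>world</b></p></div>'
tag = BeautifulSoup(html, 'html.parser').select('#content p')[0]

# Slicing assumes the string starts with exactly "<p>" and ends with "</p>",
# so the attribute and the inner <b> markup leak into the result.
sliced = str(tag)[3:-4]

# get_text() drops all markup and keeps only the text nodes.
clean = tag.get_text()

print(sliced)  # class="note">Hello <b>world</b>
print(clean)   # Hello world
```

Whether this matters depends on the pages being scraped; for paragraphs without attributes or child tags the original slice produces the same text.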

