You use Python crawlers to scrape pictures of girls? We are ashamed to be associated with you!

Five packs of spicy strips! 2021-09-15 12:36:36


Hello everyone, I'm Spicy.

Today I'm sharing a job a fan asked me to help with: scraping a website of pictures of girls. The images on that site are fairly explicit, so I won't leave a link. The task is relatively simple, but it is still a useful learning exercise, and this is definitely not a filler post! After all, we should share what we have learned.

 

 

Result preview

Scraping target

Website: not provided (to stay safe and avoid getting banned)

Tools used

Development tool: PyCharm

Development environment: Python 3.7, Windows 10

Libraries used: requests, lxml

Key learning points

  • Using the requests library

  • Extracting web page data with XPath

Project analysis

Fetch the home page by requesting the web page data with requests. The current web page data is dynamically loaded.

Modify the page parameter in the URL to request other pages, then use XPath to extract the URLs that lead to the detail pages. The detail pages hold the more interesting content.
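
The page-number idea above could be sketched roughly as follows. This is only an illustration: the `build_page_urls` helper is hypothetical, and the `/page/<n>/` pattern is an assumption based on the placeholder URL in the source code, not the real site's layout.

```python
def build_page_urls(base_url, first_page, last_page):
    """Return list-page URLs for pages first_page..last_page (inclusive).

    Assumes the site paginates as <base_url>/page/<n>/ (hypothetical pattern).
    """
    base = base_url.rstrip("/")
    return [f"{base}/page/{n}/" for n in range(first_page, last_page + 1)]


# Placeholder domain, as in the article; the real address is not provided.
page_urls = build_page_urls("https://www.xxxx.com", 1, 3)
for page_url in page_urls:
    print(page_url)
```

Each URL in `page_urls` can then be requested in turn instead of hard-coding a single page.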


Extract the a tags that lead to the detail pages and request the web page data again to get the detail page data. Then use XPath again to get the image tags and the image names. Note that the address in the img tag is actually just an animated image, not the real picture; the data we need is the data-src attribute of the div tag.
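
To make the difference between the img address and data-src concrete, here is a minimal sketch. The HTML snippet and the `img.example.com` addresses are made up for illustration and only mimic the structure described above; the real page may well differ.

```python
from lxml import etree

# Made-up HTML that mimics the structure described above (an assumption,
# not the real page): the <img> src is a placeholder, the real address
# sits in the parent <div>'s data-src attribute.
SAMPLE_HTML = """
<html><body>
  <div data-fancybox="gallery" data-src="https://img.example.com/a1.jpg">
    <img src="https://img.example.com/loading.gif" alt="photo one">
  </div>
  <div data-fancybox="gallery" data-src="https://img.example.com/a2.jpg">
    <img src="https://img.example.com/loading.gif" alt="photo two">
  </div>
</body></html>
"""

html_data = etree.HTML(SAMPLE_HTML)
img_src_list = html_data.xpath('//div[@data-fancybox="gallery"]/img/@src')
img_url_list = html_data.xpath('//div[@data-fancybox="gallery"]/@data-src')
img_name_list = html_data.xpath('//div[@data-fancybox="gallery"]/img/@alt')

print(img_src_list)   # placeholder addresses only
print(img_url_list)   # real image addresses
print(img_name_list)  # names taken from the alt attribute
```

In this sketch the names are read from the img tags inside the gallery divs, so each name lines up with its address; that pairing is a design choice of the example, not something the original site guarantees.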


Get the corresponding image tags and save the image data. Done!
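
The saving step could look something like the sketch below. Both helpers (`safe_file_name`, `save_image`) and the `picture` folder name are hypothetical; the point is only that the target folder has to exist and that an image name taken from a web page may contain characters Windows does not allow in file names.

```python
import re
from pathlib import Path


def safe_file_name(name, default="image"):
    """Replace characters that are not allowed in Windows file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", name).strip()
    return cleaned or default


def save_image(data, name, directory="picture"):
    """Write raw image bytes to <directory>/<name>.jpg and return the path."""
    folder = Path(directory)
    folder.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    path = folder / (safe_file_name(name) + ".jpg")
    path.write_bytes(data)
    return path
```

With requests, a call might then read `save_image(requests.get(img_url).content, img_name)`.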

If you need the website address, like, favorite and follow, then message me privately. [For learning and exchange only. Be sure to save this post, otherwise it is easy to lose track of it.]

Simple source code

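import os

import requests
from lxml import etree

# Placeholder address; the real site is not provided.
url = 'https://www.xxxx.com/page/4/'
save_dir = 'picture'
os.makedirs(save_dir, exist_ok=True)  # open() fails if the folder is missing

# Request the list page and extract the links to the detail pages.
response = requests.get(url)
html = etree.HTML(response.text)
href_list = html.xpath('//div[@class="item-title"]/a/@href')
for href in href_list:
    # Request each detail page.
    res = requests.get(href)
    html_data = etree.HTML(res.text)
    # The real image address is in the div's data-src, not in the img tag.
    img_url_list = html_data.xpath('//div[@data-fancybox="gallery"]/@data-src')
    img_name_list = html_data.xpath('//img/@alt')
    print(img_url_list)
    for img_url, img_name in zip(img_url_list, img_name_list):
        result = requests.get(img_url).content
        with open(os.path.join(save_dir, img_name + '.jpg'), 'wb') as f:
            f.write(result)
        print('Downloading:', img_name)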

For learning and exchange only! If this infringes on your rights, contact me and it will be removed.

Copyright notice
This article was written by [Five packs of spicy strips!]. Please include a link to the original when reposting. Thank you.
https://pythonmana.com/2021/09/20210909161448741z.html
