Hello everyone, I'm Spicy.
Today I'm sharing a job a fan asked me to help with: scraping a website of pictures of girls. The images on the site are fairly risqué, so I won't post the link. The task is relatively simple, but it is still well worth learning from, and this is definitely not a filler post! And of course, what we learn should be shared.
Website: not provided (playing it safe so I don't get banned)
Development tool: PyCharm
Development environment: Python 3.7, Windows 10
Libraries used: requests, lxml
Using the requests library
Extracting web page data with XPath
Get the home page by requesting the page data with requests. The current page data is loaded dynamically.
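Here is a minimal sketch of the request step. The helper name fetch_html, the User-Agent value, and the 10-second timeout are my own assumptions for illustration and are not part of the original script; some sites reject the default requests User-Agent, so sending a browser-like one is a common precaution.

```python
import requests

# Hypothetical browser-like header (an assumption, not taken from the article).
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}


def fetch_html(url, session=None, timeout=10):
    """Return the decoded body of url, raising on HTTP error statuses."""
    # Use the given session if there is one, otherwise the module-level API.
    client = session if session is not None else requests
    response = client.get(url, headers=DEFAULT_HEADERS, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return response.text
```

Passing a requests.Session() as session reuses the connection across the many detail-page requests, which is usually a little faster than calling requests.get each time.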
Change the page number in the URL to request other pages, then use XPath to extract the URL of each details page. The details pages are where the more interesting content is.
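The paging and link extraction can be sketched like this. The /page/N/ pattern and the item-title XPath come from the script in this post; the helper names, the sample HTML, and the handling of relative links with urljoin are assumptions added for illustration.

```python
from urllib.parse import urljoin

from lxml import etree

BASE_URL = "https://www.xxxx.com"  # placeholder host used in this post, not a real site


def page_url(page_number):
    """Build the listing URL for a page, following the /page/N/ pattern."""
    return f"{BASE_URL}/page/{page_number}/"


def extract_detail_links(listing_html, base_url=BASE_URL):
    """Return the details-page URLs found under div.item-title."""
    tree = etree.HTML(listing_html)
    hrefs = tree.xpath('//div[@class="item-title"]/a/@href')
    # urljoin leaves absolute links alone and resolves relative ones.
    return [urljoin(base_url + "/", str(href)) for href in hrefs]


# Made-up listing markup, only to exercise the XPath without touching the network.
SAMPLE_LISTING = """
<html><body>
  <div class="item-title"><a href="https://www.xxxx.com/post/1/">First</a></div>
  <div class="item-title"><a href="/post/2/">Second</a></div>
  <div class="other"><a href="/about/">About</a></div>
</body></html>
"""

print([page_url(n) for n in range(1, 4)])
print(extract_detail_links(SAMPLE_LISTING))
```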
Extract the a tag that links to the details page, request that page, and get its data. Then use XPath again to get the image tags and the image names. Note that the image address in the img tag is actually an animated image rather than the real picture; the data we need is the data-src attribute on the div tag.
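To make the data-src point concrete, here is a small sketch that runs against made-up HTML. It assumes each img sits inside its gallery div, which the original page may or may not do; under that assumption, reading the alt text from inside each div keeps every name paired with the right URL even when the page has other img tags.

```python
from lxml import etree

# Made-up details-page markup (an assumption about the structure, not the real site).
SAMPLE_DETAIL = """
<html><body>
  <div data-fancybox="gallery" data-src="https://img.example.com/full/001.jpg">
    <img src="https://img.example.com/loading.gif" alt="photo 001">
  </div>
  <div data-fancybox="gallery" data-src="https://img.example.com/full/002.jpg">
    <img src="https://img.example.com/loading.gif" alt="photo 002">
  </div>
  <img src="https://img.example.com/logo.png" alt="site logo">
</body></html>
"""


def extract_images(detail_html):
    """Return (image_url, image_name) pairs, one per gallery div."""
    tree = etree.HTML(detail_html)
    pairs = []
    for div in tree.xpath('//div[@data-fancybox="gallery"]'):
        url = div.get("data-src")        # the real image address
        alts = div.xpath(".//img/@alt")  # the name, read from the nested img
        name = str(alts[0]) if alts else ""
        if url:
            pairs.append((url, name))
    return pairs


print(extract_images(SAMPLE_DETAIL))
```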
Get the corresponding image address and save the image data. Done!
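The saving step is where things tend to go wrong on Windows, because an alt text can contain characters that are not allowed in file names. The sketch below is one possible way to handle that; safe_filename and save_image are hypothetical helpers, not functions from the original script.

```python
import re
from pathlib import Path


def safe_filename(name, default="image"):
    """Replace characters that Windows forbids in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", name).strip()
    return cleaned or default  # fall back when nothing usable is left


def save_image(data, name, folder="picture"):
    """Write raw image bytes to folder/name.jpg and return the path."""
    target_dir = Path(folder)
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    path = target_dir / f"{safe_filename(name)}.jpg"
    path.write_bytes(data)
    return path
```

In the download loop you would call save_image(requests.get(img_url).content, img_name) in place of the open/write pair.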
If you need the website address, like, favorite, and follow, then message me privately. [For learning and exchange only. Be sure to bookmark this post, otherwise it's easy to lose track of it.]
import os

import requests
from lxml import etree

# Listing page; change the page number in the URL to request other pages.
url = 'https://www.xxxx.com/page/4/'
response = requests.get(url)
html = etree.HTML(response.text)
# Links to the details pages.
href_list = html.xpath('//div[@class="item-title"]/a/@href')

# Make sure the output folder exists before writing into it.
os.makedirs('picture', exist_ok=True)

for href in href_list:
    res = requests.get(href)
    html_data = etree.HTML(res.text)
    # The real image address is in data-src on the div, not in the img tag.
    img_url_list = html_data.xpath('//div[@data-fancybox="gallery"]/@data-src')
    img_name_list = html_data.xpath('//img/@alt')
    print(img_url_list)
    for img_url, img_name in zip(img_url_list, img_name_list):
        result = requests.get(img_url).content
        with open('picture/' + img_name + ".jpg", "wb") as f:
            f.write(result)
        print("Downloading:", img_name)