Python Crawler in Action – Downloading All League of Legends Skins

SunriseCai 2020-11-13 11:28:47


This blog is just where I record articles in my spare time, published for readers only. If anything here infringes on your rights, please let me know and I will remove it.
This article is entirely my own work, with nothing referenced or copied from other people's articles. Original content only!!

Preface

Hello. This is part of the series Python Crawlers: From Getting Started to Giving Up. I am SunriseCai.

It reads even better alongside the companion video!

[Python crawler basics] Download all League of Legends skins: https://www.bilibili.com/video/BV1nQ4y1T7k2


This article shows how to use a crawler to download the skins of every League of Legends hero.

League of Legends hero library: https://lol.qq.com/data/info-heros.shtml

1. Approach

Take a look at the League of Legends website, as shown in the figures below:

  • Home page (first-level page)
     [screenshot]
  • Skin page (second-level page)
     [screenshot]
  • Image (third-level page)
     [screenshot]

As you can see from the pictures above, the site is structured like a Russian nesting doll!!!

  1. Visit the home page (first-level page) to get the links to all heroes (second-level pages).
  2. Visit each hero link (second-level page) to get the image links (third-level pages).
  3. Visit each image link (third-level page) and save the image.

Next, let's turn these steps into code.
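As a rough skeleton (the function names here are placeholders only; the real endpoints and parsing are worked out in the sections below):

def get_hero_links():
    """Level 1: home page -> links to every hero's page."""
    ...

def get_image_links(hero_link):
    """Level 2: hero page -> links to that hero's skin images."""
    ...

def save_image(image_link):
    """Level 3: image link -> picture saved to disk."""
    ...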

2. Requesting and parsing the pages

As mentioned above, the home page for this article is https://lol.qq.com/data/info-heros.shtml.

2.1 Requesting the home page

Open the home page in a browser, press F12 to open the developer tools, and inspect the page structure. The links to the secondary pages are inside the <li> tags. Perfect!!!

 [screenshot]

  • Careful readers will have noticed the pattern of the secondary-page urls: https://lol.qq.com/data/info-defail.shtml?id= followed by a number. Where does that number come from? That is explained below.

Home-page request code:

import requests

url = 'https://lol.qq.com/data/info-heros.shtml'
headers = {
    'User-Agent': 'Mozilla/5.0'
}

def get_hero_list():
    res = requests.get(url, headers=headers)
    if res.status_code == 200:
        print(res.text)
    else:
        print('request failed')

get_hero_list()

!!!
After running the code above, I found that the response contains no <li> content at all. What is going on here? The <li> content is most likely loaded asynchronously via XHR, so let's capture the network requests and take a look!!

  • Requesting the home page again, I found a hero_list.js file in the XHR tab, i.e. the list of heroes.

  • The url of hero_list.js is: https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js
     [screenshot]

  • Clicking on it shows that this is exactly what we need!!!

  • Note the heroId field marked by the red box; this is the Id at the end of the secondary-page url mentioned above.
     [screenshot]

  • Open the hero_list.js url in the browser, as shown below:
     [screenshot]
    Very nice. The request code is just as simple: replace the url in the code above with https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js and you are done.
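For example, a minimal sketch of that swap (assuming, as the rest of this article does, that the endpoint returns plain JSON with a top-level hero list):

import json
import requests

# Request the XHR endpoint directly instead of the HTML home page
url = 'https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js'
headers = {'User-Agent': 'Mozilla/5.0'}

res = requests.get(url, headers=headers)
if res.status_code == 200:
    data = json.loads(res.text)            # the .js file is just JSON text
    print(len(data['hero']), 'heroes in the list')
else:
    print('request failed')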

2.2 Requesting the secondary page (skin page)

Here we use the Dark Child, Annie, as an example. Note that Annie has 13 skins in total.
 [screenshot]

  • Capturing the requests reveals a 1.js file whose data corresponds exactly to Annie's 13 skins.
     [screenshot]

  • The url of the 1.js file is: https://game.gtimg.cn/images/lol/act/img/js/hero/1.js

  • The url of the 2.js file is: https://game.gtimg.cn/images/lol/act/img/js/hero/2.js

  • Of course, the Id at the end of these urls is each hero's heroId.
     [screenshot]

  • Open the url of the 1.js file in a browser, as shown below:
     [screenshot]

Note that the 1.js file contains several image urls; the image dimensions they correspond to are as follows:

Field        Dimensions
mainImg      980x500
iconImg      60x60
loadingImg   308x560
videoImg     130x75
sourceImg    1920x470

This article uses mainImg for the download demo.

At this point, let's get the plan straight:

  1. Request https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js to get each hero's name and heroId.
  2. Use the heroId to build that hero's skin url, https://game.gtimg.cn/images/lol/act/img/js/hero/1.js, replacing the number 1 with the heroId.
  3. That request returns the image urls, so just request each image url and save it (see the sketch below).
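Here is a compact sketch that chains steps 1 and 2 and prints the image urls from step 3 (it uses the same JSON fields, heroId, name, skins, and mainImg, that the code in the next section relies on):

import json
import requests

headers = {'User-Agent': 'Mozilla/5.0'}
hero_list_url = 'https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js'
skin_url = 'https://game.gtimg.cn/images/lol/act/img/js/hero/{}.js'

# Step 1: hero list -> heroId of every hero
heroes = json.loads(requests.get(hero_list_url, headers=headers).text)['hero']
for hero in heroes[:3]:  # only the first few heroes, as a demo
    # Step 2: heroId -> that hero's skin JSON
    skins = json.loads(requests.get(skin_url.format(hero['heroId']), headers=headers).text)['skins']
    for skin in skins:
        # Step 3 would download skin['mainImg']; here we just print it
        print(hero['name'], skin['name'], skin['mainImg'])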

3. Code section

3.1 Code: home page

  • Request the home page and get the secondary pages (the links to the skin pages)
import json
import requests

url = 'https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js'
headers = {
    'User-Agent': 'Mozilla/5.0'
}

def get_hero_list():
    """
    :return: Get the hero name and heroId
    """
    res = requests.get(url, headers=headers)
    if res.status_code == 200:
        data = json.loads(res.text)
        for item in data['hero']:
            id = item['heroId']
            name = item['name']
            title = item['title']
            print(id, name, title)
    else:
        print('request failed')

get_hero_list()
  • The result is as follows:
     [screenshot]

3.2 Code: secondary page (skin page)

import json
import requests

skinUrl = 'https://game.gtimg.cn/images/lol/act/img/js/hero/{}.js'
headers = {
    'User-Agent': 'Mozilla/5.0'
}

def get_skin_url(Id):
    """
    :param Id: hero ID, used to build the url
    :return:
    """
    res = requests.get(skinUrl.format(Id), headers=headers)
    if res.status_code == 200:
        data = json.loads(res.text)
        for item in data['skins']:
            url = item['mainImg']
            name = item['name'].replace('/', '')
            print(url, name)
    else:
        print('request failed')

get_skin_url(1)  # 1 is Annie's heroId
  • The result is as follows:
  • Note that if a skin has several color variants (chromas), it may not have a mainImg url.
     [screenshot]
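A minimal guard for that case, mirroring the empty-url check in the complete code below, simply skips skins whose mainImg is empty (Annie's 1.js is used here as the example):

import json
import requests

headers = {'User-Agent': 'Mozilla/5.0'}
res = requests.get('https://game.gtimg.cn/images/lol/act/img/js/hero/1.js', headers=headers)
for item in json.loads(res.text)['skins']:
    if not item['mainImg']:          # chroma variants may come back with an empty mainImg
        continue                     # skip them instead of saving a broken file
    print(item['name'].replace('/', ''), item['mainImg'])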

3.3 Complete code

  • Copy, paste, and run
# -*- coding: utf-8 -*-
# @Time : 2020/1/28 21:12
# @Author : SunriseCai
# @File : YXLMSpider.py
# @Software: PyCharm

import os
import json
import time
import requests

"""League of Legends skin crawler"""

class YingXLMSpider(object):
    def __init__(self):
        self.onePageUrl = 'https://game.gtimg.cn/images/lol/act/img/js/heroList/hero_list.js'
        self.skinUrl = 'https://game.gtimg.cn/images/lol/act/img/js/hero/{}.js'
        self.headers = {
            'User-Agent': 'Mozilla/5.0'
        }

    def get_heroList(self):
        """
        :return: Get each hero's heroId and name
        """
        res = requests.get(url=self.onePageUrl, headers=self.headers)
        if res.status_code == 200:
            data = json.loads(res.text)
            for item in data['hero']:
                Id = item['heroId']
                title = item['title']
                self.get_skin_url(Id, title)
        else:
            print('request failed')

    def get_skin_url(self, Id, folder):
        """
        :param Id: hero ID, used to build the url
        :param folder: a folder named after the hero
        :return:
        """
        url = self.skinUrl.format(Id)
        res = requests.get(url, headers=self.headers)
        if res.status_code == 200:
            data = json.loads(res.text)
            for item in data['skins']:
                url = item['mainImg']
                name = item['name'].replace('/', '')
                self.download_picture(url, name, folder)
        else:
            print('request failed')

    def download_picture(self, url, name, folder):
        """
        :param url: skin image url
        :param name: skin name
        :param folder: target folder
        :return:
        """
        # Create the folder if it does not exist
        if not os.path.exists(folder):
            os.makedirs(folder)
        # Only download if the url is not empty and the image does not already exist
        # (this check is what allows the crawler to resume after an interruption)
        if not url == '' and not os.path.exists('%s/%s.jpg' % (folder, name)):
            time.sleep(1)
            res = requests.get(url, headers=self.headers)
            with open('%s/%s.jpg' % (folder, name), 'wb') as f:
                f.write(res.content)
                print('%s.jpg' % name, 'downloaded successfully')

    def main(self):
        self.get_heroList()

if __name__ == '__main__':
    spider = YingXLMSpider()
    spider.main()

Let's see the results:
 [screenshots]
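If you only want a single hero rather than the whole roster, the class above can also be driven directly. A small usage sketch (it assumes the complete code is saved as YXLMSpider.py, per its header comment, and that heroId 1 is Annie, as seen in section 2.2; the folder name is just an example):

from YXLMSpider import YingXLMSpider

spider = YingXLMSpider()
# Download only Annie's skins into a folder named 'Annie'
spider.get_skin_url(1, 'Annie')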

  • This article is a bit thin, but the general idea is sound. I suggest you copy, paste, and run the code yourself, and try to work through any problems on your own first; blindly trusting a book is worse than having no book at all.
  • If you really cannot solve something, we can figure it out together.

Finally, a summary of this chapter:

  1. Introduced the approach for crawling every hero skin from the League of Legends website
  2. Showed the code
  3. Nothing else

sunrisecai

  • Thank you for reading patiently; follow me so you don't get lost.
  • To make it easier to help each other out, you are welcome to join the QQ group: 648696280

The next article will be "Python Crawlers: From Getting Started to Giving Up 09 | Python Crawler in Action – Downloading NetEase Cloud Music".

Copyright notice
This article was created by [SunriseCai]. Please include a link to the original when reprinting. Thank you.
