## Crawling every hero skin in Honor of Kings with 20 lines of Python

~wangweijun 2020-11-13 07:21:32

### Introduction

Almost everyone has played Honor of Kings, and those who have not have at least heard of it: it is the hottest mobile MOBA game in the world. But I digress. Today's goal is to crawl the skin images of every hero in Honor of Kings, and to do it in only about 20 lines of Python.

### Preparation

Downloading the skins is not hard in itself; the hard part is the analysis. We first need the URL of each skin picture. Without further ado, let's go to the official Honor of Kings website:

Click through to the hero profiles, pick any hero, then press F12 to open the developer tools and find the image address of the hero's original skin:

Next, switch the hero's skin. The image address barely changes; only the last number is different. Compare the addresses of the two skin pictures:

``````
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/523/523-bigskin-1.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/523/523-bigskin-2.jpg
``````

We can guess that the skin image addresses of a single hero differ only in the last number. To test this conjecture, look at the full set of skin images for a hero with more skins. I chose Sun Shangxiang; here are all of her skin image addresses side by side:

``````
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-1.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-2.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-3.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-4.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-5.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-6.jpg
http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-7.jpg
``````
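The pattern in these addresses can be checked mechanically. The following is a minimal sketch, not part of the original program: the helper name `parse_skin_url` and the regular expression are my own, and they assume every skin address has the `hero-info/<number>/<number>-bigskin-<index>.jpg` shape seen above.

```python
import re

# Assumed shape: .../hero-info/<hero number>/<hero number>-bigskin-<skin index>.jpg
SKIN_URL = re.compile(r'/hero-info/(\d+)/(\d+)-bigskin-(\d+)\.jpg$')


def parse_skin_url(url):
    """Return (hero_number, skin_index), or None if the address does not fit the pattern."""
    match = SKIN_URL.search(url)
    if match is None or match.group(1) != match.group(2):
        return None
    return int(match.group(1)), int(match.group(3))


urls = [
    'http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/523/523-bigskin-1.jpg',
    'http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/523/523-bigskin-2.jpg',
    'http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/111/111-bigskin-7.jpg',
]
for url in urls:
    print(parse_skin_url(url))
```

Every address splits into a hero number and a skin index, which is what the conjecture predicts.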

So we conclude that the skin images of one hero are numbered consecutively, starting from 1. Next, let's see how different heroes are told apart. No matter which skin is shown, the address in the browser's address bar stays the same, so let's compare the page URLs of two different heroes:

``````
https://pvp.qq.com/web201605/herodetail/523.shtml
https://pvp.qq.com/web201605/herodetail/111.shtml
``````

At first glance there seems to be no pattern, but the last number clearly determines which hero is shown, so we can take it to be the hero's number. Unfortunately, the hero numbers themselves do not appear to follow any rule. No need to worry: let's look for clues on the official website.

On the hero profile page, open the developer tools with F12 and capture the network requests. A few files turn up:

Click the Network tab, then XHR, to see these files. As the file names suggest, they store the hero list information. Click one to take a look:

Sure enough, this is the hero information, including each hero's name, number, and other fields. We can check that it is accurate. For example, Xiao Qiao's `ename`, that is, her hero number, is 106, so by the reasoning above her detail page should be https://pvp.qq.com/web201605/herodetail/106.shtml
Trying it confirms that this is correct.
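The link between a hero number and its detail page can be written down in a couple of lines. This is only a sketch: `detail_url` is a hypothetical helper, and `sample_heroes` is a hand-written sample containing just the two fields discussed here, not the real contents of the hero list file.

```python
DETAIL_PAGE = 'https://pvp.qq.com/web201605/herodetail/{ename}.shtml'


def detail_url(ename):
    """Return the hero detail page address for a hero number (ename)."""
    return DETAIL_PAGE.format(ename=ename)


# Hand-written sample shaped like the hero list entries (assumption:
# only the 'ename' and 'cname' fields are shown).
sample_heroes = [
    {'ename': 106, 'cname': '小乔'},
    {'ename': 111, 'cname': '孙尚香'},
]

for hero in sample_heroes:
    print(hero['ename'], detail_url(hero['ename']))
```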

That completes the preparation. In fact, at this point the project is half done; what remains is writing the code.

### Code implementation

First create a Python file and import the `os` and `requests` modules.
Following the steps above, we first need the hero list, that is, the herolist.json file. Its address is https://pvp.qq.com/web201605/js/herolist.json, which can be found in the developer tools.
So we request the JSON data from this address, parse it, and extract the useful fields:

``````
import requests

url = 'https://pvp.qq.com/web201605/js/herolist.json'
herolist = requests.get(url)  # fetch the hero list JSON file
herolist_json = herolist.json()  # parse the response body as JSON (once)
hero_name = [hero['cname'] for hero in herolist_json]  # extract the hero names
hero_number = [hero['ename'] for hero in herolist_json]  # extract the hero numbers
``````
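If the request fails, `.json()` will raise a confusing error, so it can help to check the response first. The sketch below is one possible way to do that, not the article's code: the function names `fetch_hero_list` and `name_to_number` and the 10-second timeout are my own choices, and the assumption is that each entry carries `cname` and `ename` keys as described above.

```python
def fetch_hero_list(url='https://pvp.qq.com/web201605/js/herolist.json', timeout=10):
    """Download and parse the hero list, raising on network or HTTP errors."""
    import requests  # imported here so name_to_number works without requests installed

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx responses into exceptions
    return response.json()


def name_to_number(records):
    """Map each hero name (cname) to its number (ename), skipping incomplete entries."""
    return {
        record['cname']: record['ename']
        for record in records
        if 'cname' in record and 'ename' in record
    }
```

With network access, `name_to_number(fetch_hero_list())` would give a dictionary for looking up a hero's number by name.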

This gives us the hero names and numbers; you can print them to check the output:
With the hero numbers in hand, the rest is easy. Just splice together the URL:
`'http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/' + hero_number + '/' + hero_number + '-bigskin-1.jpg'`, where `hero_number` stands for a single hero's number as a string. This gives the skin pictures of every hero, but there is one problem: the number of skins varies. Some heroes have only two skins, others six or seven, so we do not know the largest image number. Here I take a cruder approach: let a variable run from 1 to 10, splice each value into the image address, and simply skip any picture that does not exist. Since no hero had more than 10 skins at the time of writing, this gets every picture. Here is the code:

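``````
# Download the pictures
# Assumption: the folder-handling lines are missing from this listing, so a
# relative 'wzry' folder stands in for the hard-coded path; adjust as needed.
base_dir = 'wzry'
for name, number in zip(hero_name, hero_number):
    # Create a folder for this hero
    hero_dir = os.path.join(base_dir, name)
    os.makedirs(hero_dir, exist_ok=True)
    for k in range(1, 11):  # skin numbers start at 1, so try 1 to 10
        # Splice the URL
        onehero_link = ('http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/'
                        + str(number) + '/' + str(number) + '-bigskin-' + str(k) + '.jpg')
        im = requests.get(onehero_link)  # request the URL
        if im.status_code == 200:
            # Write the image into the hero's folder
            with open(os.path.join(hero_dir, str(k) + '.jpg'), 'wb') as f:
                f.write(im.content)
``````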
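The "try 1 to 10 and skip what is missing" idea can be isolated and tried without touching the network. This is a sketch under stated assumptions: `skin_url`, `probe_skins` and `fake_exists` are hypothetical names, and `fake_exists` merely pretends that hero 111 has skins 1 to 7, standing in for a real status-code check.

```python
BASE = 'http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/'


def skin_url(number, index):
    """Build the image address for one skin of one hero."""
    return f'{BASE}{number}/{number}-bigskin-{index}.jpg'


def probe_skins(number, exists, limit=10):
    """Return the skin indices in 1..limit for which exists(url) is true."""
    return [k for k in range(1, limit + 1) if exists(skin_url(number, k))]


def fake_exists(url):
    """Stand-in for an HTTP check: pretend hero 111 has skins 1 to 7."""
    return any(url.endswith(f'111-bigskin-{k}.jpg') for k in range(1, 8))


print(probe_skins(111, fake_exists))
```

In a real run, `exists` could be a function that requests the address and reports whether the status code is 200. Passing the check in as a parameter keeps the probing logic easy to test.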

The implementation is very simple, and the comments explain each step. Putting the two pieces together, the complete program is as follows:

``````
import os
import requests

url = 'https://pvp.qq.com/web201605/js/herolist.json'
herolist = requests.get(url)  # fetch the hero list JSON file
herolist_json = herolist.json()  # parse the response body as JSON (once)
hero_name = [hero['cname'] for hero in herolist_json]  # extract the hero names
hero_number = [hero['ename'] for hero in herolist_json]  # extract the hero numbers

# Assumption: the folder-handling lines are missing from this listing, so a
# relative 'wzry' folder stands in for the hard-coded path; adjust as needed.
base_dir = 'wzry'
for name, number in zip(hero_name, hero_number):
    # Create a folder for this hero
    hero_dir = os.path.join(base_dir, name)
    os.makedirs(hero_dir, exist_ok=True)
    for k in range(1, 11):  # skin numbers start at 1, so try 1 to 10
        # Splice the URL
        onehero_link = ('http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/'
                        + str(number) + '/' + str(number) + '-bigskin-' + str(k) + '.jpg')
        im = requests.get(onehero_link)  # request the URL
        if im.status_code == 200:
            # Write the image into the hero's folder
            with open(os.path.join(hero_dir, str(k) + '.jpg'), 'wb') as f:
                f.write(im.content)
``````

With the comments removed, it takes about 20 lines of code to crawl every hero skin in Honor of Kings. Simple, isn't it? To test the program, first create a folder named wzry. I put mine on the desktop, which is the path hard-coded in my version; change the path in the code if you prefer another location. Then run the program and wait a moment, and all the pictures are downloaded.

The JSON parsing can also be done with the `jsonpath` module, which gets at the fields we want more concisely. The parsing looks like this:

``````
import jsonpath

# herolist_json is the parsed hero list obtained earlier
hero_name = jsonpath.jsonpath(herolist_json, "$..cname")
hero_number = jsonpath.jsonpath(herolist_json, "$..ename")
``````

`jsonpath.jsonpath` takes the parsed JSON data and a path expression. `$..cname` means: starting from the root, find the value of every `cname` key at any depth. The matching values are returned as a list.
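To make the "any depth" part concrete, here is a plain-Python sketch of roughly what a `$..key` search does. It is an illustration only, not how the `jsonpath` module is implemented; `find_all` is a hypothetical helper and `sample` is hand-written data.

```python
def find_all(obj, key):
    """Collect every value stored under `key`, at any depth, in document order."""
    found = []
    if isinstance(obj, dict):
        for k, value in obj.items():
            if k == key:
                found.append(value)
            found.extend(find_all(value, key))  # keep descending into the value
    elif isinstance(obj, list):
        for item in obj:
            found.extend(find_all(item, key))
    return found


# Hand-written sample shaped like the hero list entries (assumption).
sample = [
    {'ename': 106, 'cname': '小乔'},
    {'ename': 111, 'cname': '孙尚香'},
]
print(find_all(sample, 'ename'))  # [106, 111]
```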

### Conclusion

Crawlers are fun to write: the results are immediate and visual, and there is a real sense of achievement. Crawlers are powerful, but do not use them to collect private information.