Homo sapiens 2021-01-22 16:49:39
This post shows how to use a Python crawler to fetch the Top 100 of the Qidian Chinese Network (起点中文网) popularity ranking. I hope you get a feel for the charm of web crawlers as you learn! Let's start with the website: go to the home page of Qidian Chinese Network.

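Based on the URL pattern and the number of pages we need to fetch, we can first write a list comprehension that builds the page URLs: ['{}'.format(str(i)) for i in range(1,6)]. The page number fills the {} placeholder, and range(1,6) yields pages 1 through 5, which together cover the Top 100.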
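
To make the list comprehension above concrete, here is a minimal sketch. The base URL is a placeholder (example.com), not the real ranking address, which is not shown in this post, and the helper name build_page_urls is hypothetical.

```python
# Hypothetical base URL: a stand-in for the real ranking address,
# which is not shown in this post.
BASE_URL = 'https://example.com/rank?page={}'


def build_page_urls(base_url, first_page=1, last_page=5):
    """Return one URL per page by filling the page number into '{}'."""
    return [base_url.format(page) for page in range(first_page, last_page + 1)]


urls = build_page_urls(BASE_URL)
for url in urls:
    print(url)
```

Note that format() converts the integer for us, so the str(i) call in the original expression is harmless but not required.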

The complete code is shown below:

# @File    : Get the Qidian Chinese Network popularity ranking
# @Time    : 2019/10/21 22:31
# @Author  : Fengming Jingjun
# @Software: PyCharm
# Please credit the original author when reprinting.
# Writing this takes effort; shared for learning only.

# Import the required libraries
import xlwt
import requests
from lxml import etree
import time

# List that collects the scraped rows
all_info_list = []


def get_info(url):
    html = requests.get(url)
    selector = etree.HTML(html.text)
    # Locate the outer list, then loop over the entry for each novel on the page
    infos = selector.xpath('//ul[@class="all-img-list cf"]/li')
    # Walk the entries and extract the details of each novel
    for info in infos:
        # Title
        title = info.xpath('div[2]/h4/a/text()')[0]
        # Author
        author = info.xpath('div[2]/p[1]/a[1]/text()')[0]
        # Genre, part 1
        style1 = info.xpath('div[2]/p[1]/a[2]/text()')[0]
        # Genre, part 2
        style2 = info.xpath('div[2]/p[1]/a[3]/text()')[0]
        # Combined genre
        style = style1 + style2
        # Completion status
        complete = info.xpath('div[2]/p[1]/span/text()')[0]
        # Introduction to the novel
        introduce = info.xpath('div[2]/p[2]/text()')[0].strip()
        info_list = [title, author, style, complete, introduce]
        # Append the row to the shared list
        all_info_list.append(info_list)
    # Sleep between pages (the original value is missing here; 1 second is assumed)
    time.sleep(1)


# Program entry point
if __name__ == '__main__':
    # The base URL in front of '{}' is missing from this copy of the post;
    # add it back before running.
    urls = ['{}'.format(str(i)) for i in range(1,6)]
    for url in urls:
        get_info(url)
    # Define the header row
    header = ['title', 'author', 'style', 'complete', 'introduce']
    # Create the workbook
    book = xlwt.Workbook(encoding='utf_8')
    # Create the sheet
    sheet = book.add_sheet('Sheet1')
    # range() produces a sequence of integers and is typically used in for loops.
    # len() returns the length (number of items) of an object such as a string, list, or tuple.
    for h in range(len(header)):
        # Write the header row
        sheet.write(0, h, header[h])
    i = 1
    # Loop over the collected rows and write them into the xls sheet
    for row in all_info_list:
        j = 0
        for data in row:
            sheet.write(i, j, data)
            j += 1
        i += 1
    # Done writing: save the workbook to a local path
    book.save('qidianxiaoshuo.xls')
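
The XPath expressions in the script are easier to follow when run against a small static page. The sketch below uses made-up HTML that only imitates the structure those expressions expect (it is not the real page markup), and a hypothetical helper, first_text, that returns a default instead of raising IndexError when a node is missing, which the plain [0] indexing in the script would do.

```python
from lxml import etree

# Made-up markup that only imitates the structure the XPath expressions expect.
SAMPLE_HTML = '''
<ul class="all-img-list cf">
  <li>
    <div>cover</div>
    <div>
      <h4><a>Sample Title</a></h4>
      <p><a>Sample Author</a><a>Fantasy</a><a>Epic</a><span>Ongoing</span></p>
      <p>  A short introduction.  </p>
    </div>
  </li>
  <li>
    <div>cover</div>
    <div>
      <h4><a>Second Title</a></h4>
      <p><a>Other Author</a><a>Urban</a><span>Completed</span></p>
      <p>Another introduction.</p>
    </div>
  </li>
</ul>
'''


def first_text(node, path, default=''):
    """Return the first text match of a relative XPath, or a default if none."""
    matches = node.xpath(path)
    return matches[0].strip() if matches else default


def parse_entries(html_text):
    """Extract [title, author, style, complete, introduce] for every <li>."""
    selector = etree.HTML(html_text)
    rows = []
    for info in selector.xpath('//ul[@class="all-img-list cf"]/li'):
        title = first_text(info, 'div[2]/h4/a/text()')
        author = first_text(info, 'div[2]/p[1]/a[1]/text()')
        # The second entry has no third <a>, so its second genre part is ''.
        style = (first_text(info, 'div[2]/p[1]/a[2]/text()')
                 + first_text(info, 'div[2]/p[1]/a[3]/text()'))
        complete = first_text(info, 'div[2]/p[1]/span/text()')
        introduce = first_text(info, 'div[2]/p[2]/text()')
        rows.append([title, author, style, complete, introduce])
    return rows


rows = parse_entries(SAMPLE_HTML)
for row in rows:
    print(row)
```

The paths inside the loop are relative (no leading slash), so each one is evaluated from the current li element rather than from the document root.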

Verifying the result:

Because we saved the final results to a local xls file, just open it and you can see them!
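
If you would rather check the file from Python than open it by hand, one option is to read it back. The sketch below assumes the xlrd package is installed (it is not used in the original script), writes one made-up row to a temporary directory, and reads it back; write_rows and read_rows are hypothetical helper names.

```python
import os
import tempfile

import xlrd  # assumption: installed separately; not part of the original script
import xlwt


def write_rows(path, header, rows):
    """Write a header row followed by data rows to an .xls file."""
    book = xlwt.Workbook(encoding='utf-8')
    sheet = book.add_sheet('Sheet1')
    for r, row in enumerate([header] + rows):
        for c, value in enumerate(row):
            sheet.write(r, c, value)
    book.save(path)


def read_rows(path):
    """Read every row of the first sheet back as a list of lists."""
    sheet = xlrd.open_workbook(path).sheet_by_index(0)
    return [sheet.row_values(r) for r in range(sheet.nrows)]


header = ['title', 'author', 'style', 'complete', 'introduce']
# Made-up sample row, not real ranking data.
sample_rows = [['Sample Title', 'Sample Author', 'FantasyEpic', 'Ongoing', 'Intro']]

path = os.path.join(tempfile.mkdtemp(), 'qidianxiaoshuo.xls')
write_rows(path, header, sample_rows)
read_back = read_rows(path)
print(read_back)
```

Using enumerate here replaces the manual i and j counters from the script; the cells written are the same.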

(Screenshot: the qidianxiaoshuo.xls file)

If your file looks like the one above, congratulations, it worked! Fun, isn't it? That's all for this post. Don't forget to follow along; Xiaojun will keep publishing more simple and fun techniques ٩(๑>◡<๑)۶


This article was written by [Homo sapiens]. Please include a link to the original when reprinting. Thank you.

