Learn Python on the sly, then stun everyone (Day 10)

Python a meow 2020-11-10 17:23:16


 

Contents

  • Preface
  • Welcome to our circle
  • The little crawler kicked a steel plate
  • Task requirements
  • Requirements analysis
  • 1.
  • 2.
  • 3.
  • Option 1:
  • Option 2:
  • Data types
  • Regular expressions
  • What is a regular expression?
  • Basic syntax
  • Ordinary characters
  • Demonstration
  • Quantifiers
  • Anchors
  • Alternation
  • Regular expressions in Python
  • Solving the problem

Preface

Previous installment: You have to learn Python (Day 9)

The usual notes still apply:

 This series assumes you have some C or C++ background, because I picked up Python after learning only the basics of C++.
 This series assumes you know how to search Baidu: when you see a word like "module", go and study that module. I also suggest you have your own editor and compiler; the previous article already made a recommendation.
 As for the table of contents of this series, to be honest I prefer the two Primer Plus books, so I follow their structure.
 This series also focuses on building your hands-on skills. After all, I can't teach you everything, so being able to solve your own problems is especially important. The holes I leave in the articles are not traps; they are exercises I left for you. Show what you can do and work them out yourself.

What are we doing today? Think I'm going to write about cookies? Not so. I kicked a steel plate the day before yesterday: my crawler brought back a pile of messy code, so I consulted my seniors, and the answer was to use regular expressions.

Often it's not that you are incompetent; it's that you lack the exposure and the perspective.
So we need to reach into different fields and ask experienced classmates, teachers, and seniors for advice.

So here I want to mention our study group again.

 

Welcome to our circle

If you run into difficulties while learning and are looking for a Python learning community, you can join our Python circle, QQ group number 947618024, and pick up Python learning materials. It will save you a lot of time and spare you a lot of problems.


The little crawler kicked a steel plate

Here's what happened: yesterday my little crawler came crawling back pitifully, looking as if it had been wronged. How could I stand for that? So I went to see what it had run into.

Task requirements

Crawl the lyrics of a Lin Zhixuan song. Which song? You're asking me which song? A small matter like that you can decide for yourself.

Requirements analysis

1.

First, assume the lyrics cannot be captured directly from the page's HTML, so we open the browser's Network panel. Why can't they? Because I tried it and failed.

2.

There are two pages that show the lyrics.
One is the song page, before the song is played: https://y.qq.com/n/yqq/song/001PGGQ81Xxw9l.html
The other is the player page, where the lyrics are shown during playback: https://y.qq.com/portal/player.html

I tried the second page and it works less well than the first. You can try grabbing it as an exercise.
So we choose the first page:

 

3.

I'll skip the search steps here; see Day 9 for the details.

Straight to the result:

import json

import requests
from bs4 import BeautifulSoup

# Disguise the request headers so the request looks like it comes from a browser
headers = {
    # Origin of the request; not needed in this case, included only to demonstrate
    'origin': 'https://y.qq.com',
    # Referring page; carries more information than "origin". Not needed in this case either, included only to demonstrate
    'referer': 'https://y.qq.com/n/yqq/song/004Z8Ihr0JIu5s.html',
    # Which device and which browser the request comes from
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36',
}
# url = 'https://y.qq.com/n/yqq/song/001PGGQ81Xxw9l.html'
url = 'https://c.y.qq.com/lyric/fcgi-bin/fcg_query_lyric_yqq.fcg?nobase64=1&musicid=106678944&-=jsonp1&g_tk_new_20200303=5381&g_tk=5381&loginUin=0&hostUin=0&format=json&inCharset=utf8&outCharset=utf-8&notice=0&platform=yqq.json&needNewCode=0'
res_song = requests.get(url, headers=headers)

# Option 1: treat the response text as HTML
soup = BeautifulSoup(res_song.text, 'html.parser')
print(soup)

# Option 2: parse the response text as JSON
# json_res = json.loads(res_song.text)
# print(json_res['lyric'])
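The long query string in the request URL above is easier to read when it is written as a dictionary. Here is a minimal sketch using only the standard library. The parameter names and values are copied from that URL; what each one means to the server is something I have not verified. The names base, params, and rebuilt are made up for illustration.

```python
from urllib.parse import urlencode

# Endpoint and parameters copied from the request URL above
base = 'https://c.y.qq.com/lyric/fcgi-bin/fcg_query_lyric_yqq.fcg'
params = {
    'nobase64': '1',
    'musicid': '106678944',
    '-': 'jsonp1',
    'g_tk_new_20200303': '5381',
    'g_tk': '5381',
    'loginUin': '0',
    'hostUin': '0',
    'format': 'json',
    'inCharset': 'utf8',
    'outCharset': 'utf-8',
    'notice': '0',
    'platform': 'yqq.json',
    'needNewCode': '0',
}

# urlencode joins the key=value pairs with "&" in insertion order
rebuilt = base + '?' + urlencode(params)
print(rebuilt)
```

With requests, the same dictionary can be passed directly as requests.get(base, params=params, headers=headers) instead of building the string by hand.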

Excellent. Let's take a look at the results of the two options:

Option 1:

 

Option 2:

 


What do we do with that? Fortunately, both results can be processed as strings, and that means regular expressions.

Data types
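One way to confirm what each option hands back is to ask Python for the types. This is a minimal offline sketch: the payload below is made up for illustration and only mimics the shape the code above relies on (a JSON object with a 'lyric' key). It is not real output from the API.

```python
import json

# Made-up stand-in for res_song.text; the real response is only assumed to contain a 'lyric' key
sample_text = '{"lyric": "[00:00.00]sample line one\\n[00:05.00]sample line two"}'

json_res = json.loads(sample_text)
print(type(sample_text))         # the raw response text is a str
print(type(json_res))            # json.loads turns a JSON object into a dict
print(type(json_res['lyric']))   # the value under 'lyric' is a str again
```

In Option 1, soup.text is likewise a str, so either way the next step is string processing.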

 

 

 

Regular expressions

Good. Now let's look at regular expressions.

What is a regular expression?

A regular expression is a text pattern made up of ordinary characters (for example, the letters a through z) and special characters (called "metacharacters").
A regular expression uses a single string to describe and match a set of strings that follow a syntactic rule.
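As a first taste, here is a minimal sketch with Python's re module. The pattern and the sample string are made up for illustration: the ordinary characters py match themselves, while the metacharacter sequence \d+ means "one or more digits".

```python
import re

text = 'I started with py2 and moved to py38 last year'

# 'py' is ordinary text; \d+ stands for one or more digits
pattern = r'py\d+'

found = re.findall(pattern, text)
print(found)
```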

Basic syntax

Ordinary characters

Ordinary characters include all printable and nonprintable characters that are not explicitly designated as metacharacters. This includes all uppercase and lowercase letters, all digits, all punctuation, and some other symbols.

Character   Description
[ABC]       Matches any character listed in the brackets. For example, [aeiou] matches every e, o, u, and a in the string "google runoob taobao".
[^ABC]      Matches any character except those listed in the brackets. For example, [^aeiou] matches every character in "google runoob taobao" other than e, o, u, and a.
[A-Z]       A range. [A-Z] matches any uppercase letter; [a-z] matches any lowercase letter.
.           Matches any single character except line breaks (\n, \r); equivalent to [^\n\r]. In Python's re module, . excludes only \n by default.
[\s\S]      Matches everything. \s matches any whitespace character, including line breaks; \S matches any non-whitespace character, which excludes line breaks.
\w          Matches letters, digits, and the underscore; equivalent to [A-Za-z0-9_]. In Python 3, \w on ordinary strings also matches non-ASCII word characters such as Chinese characters unless the re.ASCII flag is used.

Demonstration
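The character classes from the table can be tried directly in Python. A minimal sketch: the string "google runoob taobao" is the one used in the table, while the other sample strings and all variable names are made up for illustration.

```python
import re

s = 'google runoob taobao'

vowels = re.findall(r'[aeiou]', s)         # every vowel, one per match
non_vowels = re.findall(r'[^aeiou]', s)    # everything else, spaces included
lower_runs = re.findall(r'[a-z]+', s)      # runs of lowercase letters
anything = re.findall(r'[\s\S]', 'a\nb')   # [\s\S] also matches the line break
dot_only = re.findall(r'.', 'a\nb')        # . skips the line break
word_chars = re.findall(r'\w+', 'snake_case 42!')

print(''.join(vowels))
print(''.join(non_vowels))
print(lower_runs)
print(anything)
print(dot_only)
print(word_chars)
```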

 

 

 

 

 

 


Quantifiers

Quantifiers specify how many times a given component of a regular expression must occur for a match to succeed. There are six of them: *, +, ?, {n}, {n,}, and {n,m}.

The quantifiers of regular expressions are:

Character   Description
*           Matches the preceding subexpression zero or more times. For example, zo* matches "z" as well as "zoo". * is equivalent to {0,}.
+           Matches the preceding subexpression one or more times. For example, zo+ matches "zo" and "zoo" but not "z". + is equivalent to {1,}.
?           Matches the preceding subexpression zero or one time. For example, do(es)? matches "do", the "does" in "does", and the "do" in "doxy". ? is equivalent to {0,1}.
{n}         n is a non-negative integer. Matches exactly n times. For example, o{2} cannot match the "o" in "Bob" but matches the two o's in "food".
{n,}        n is a non-negative integer. Matches at least n times. For example, o{2,} cannot match the "o" in "Bob" but matches all the o's in "foooood". o{1,} is equivalent to o+, and o{0,} is equivalent to o*.
{n,m}       m and n are non-negative integers with n <= m. Matches at least n times and at most m times. For example, o{1,3} matches the first three o's in "fooooood". o{0,1} is equivalent to o?. Note that there must be no spaces between the comma and the two numbers.
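The examples in the table can be checked with a few lines of Python. This is a minimal sketch: the patterns and test words come from the table, and the variable names are made up for illustration.

```python
import re

# * : zero or more of the preceding item
star = [bool(re.fullmatch(r'zo*', w)) for w in ('z', 'zo', 'zoo')]
# + : one or more of the preceding item
plus = [bool(re.fullmatch(r'zo+', w)) for w in ('z', 'zo', 'zoo')]
# ? : zero or one; search returns the leftmost match
optional = [re.search(r'do(es)?', w).group() for w in ('do', 'does', 'doxy')]
# {n} : exactly n times
exactly_two = (re.findall(r'o{2}', 'Bob'), re.findall(r'o{2}', 'food'))
# {n,} : at least n times
at_least_two = re.findall(r'o{2,}', 'foooood')
# {n,m} : between n and m times
one_to_three = re.search(r'o{1,3}', 'fooooood').group()

print(star, plus, optional)
print(exactly_two, at_least_two, one_to_three)
```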

 

 

 

 

 

 

 

 

 

Anchors

Anchors let you pin a regular expression to the beginning or end of a line. They also let you create regular expressions that match inside a word, at the beginning of a word, or at the end of a word.

Anchors describe the boundaries of strings or words: ^ and $ refer to the beginning and end of a string, \b describes the leading or trailing boundary of a word, and \B denotes a non-word boundary.

The anchors of regular expressions are:

Character   Description
^           Matches the position where the input string starts. If the RegExp object's Multiline property is set, ^ also matches the position after \n or \r. (Python's equivalent is the re.MULTILINE flag, which treats \n as the line break.)
$           Matches the position where the input string ends. If the RegExp object's Multiline property is set, $ also matches the position before \n or \r.
\b          Matches a word boundary, that is, the position between a word and a space.
\B          Matches a non-word boundary.

Note: quantifiers cannot be applied to anchors. Because there cannot be more than one position immediately before or after a line break or word boundary, expressions such as ^* are not allowed.
To match text at the beginning of a line, put the ^ character at the start of the regular expression. Do not confuse this use of ^ with its use inside a bracket expression.
To match text at the end of a line, put the $ character at the end of the regular expression.
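A minimal sketch of the anchors in Python. The sample strings and variable names are made up for illustration; re.MULTILINE is the flag that makes ^ and $ work line by line.

```python
import re

text = 'first line\nsecond line'

start_default = re.findall(r'^\w+', text)                # only the start of the string
start_multi = re.findall(r'^\w+', text, re.MULTILINE)    # the start of every line
end_default = re.findall(r'\w+$', text)                  # only the end of the string
end_multi = re.findall(r'\w+$', text, re.MULTILINE)      # the end of every line

whole_word = re.findall(r'\bcat\b', 'cat concat cats cat.')   # "cat" as a whole word
inside_word = re.findall(r'\Bcat', 'cat concat')              # "cat" not at a word start

print(start_default, start_multi)
print(end_default, end_multi)
print(whole_word, inside_word)
```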

Alternation

Enclose all the alternatives in parentheses () and separate adjacent alternatives with |.
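A minimal sketch of alternation in Python, with a made-up sample string. One thing worth knowing: with plain parentheses, re.findall returns what the group captured rather than the whole match, so the first pattern uses the non-capturing form (?:...).

```python
import re

text = 'gray grey griy'

# (?:a|e) groups the alternatives without capturing, so findall returns whole matches
whole = re.findall(r'gr(?:a|e)y', text)
# (a|e) captures, so findall returns only the captured letter
captured = re.findall(r'gr(a|e)y', text)

print(whole)
print(captured)
```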

Regular expressions in Python

A regular expression is a special sequence of characters that helps you easily check whether a string matches a given pattern.
Python has included the re module since version 1.5; it provides Perl-style regular expression patterns.
The re module gives Python full regular expression functionality.
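The most commonly used functions of the re module are compile, match, search, findall, and sub. Here is a minimal sketch of each; the sample line is made up for illustration and only imitates a timestamped lyric line.

```python
import re

line = '[00:12.50]sample lyric line'   # made-up line in a timestamped style

# compile builds a reusable pattern object
pat = re.compile(r'\[(\d+):(\d+\.\d+)\]')

# match only succeeds at the start of the string
m = pat.match(line)
minutes, seconds = m.group(1), m.group(2)

# search scans the whole string and returns the first match
first_word = re.search(r'[a-z]+', line).group()

# findall returns every non-overlapping match as a list
words = re.findall(r'[a-z]+', line)

# sub replaces every match, here deleting the timestamp
stripped = pat.sub('', line)

print(minutes, seconds, first_word, words, stripped)
```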



I'll come back and practice some more; I'm not very fluent with it yet...


Solving the problem

I have a headache today, so I'll just post the code. It has a small flaw: for example, '词' (lyrics by) and the lyricist's name should be separated by ':', and the same goes for '曲' (music by).

import re

import requests
from bs4 import BeautifulSoup

# Disguise the request headers so the request looks like it comes from a browser
headers = {
    # Origin of the request; not needed in this case, included only to demonstrate
    'origin': 'https://y.qq.com',
    # Referring page; carries more information than "origin". Not needed in this case either, included only to demonstrate
    'referer': 'https://y.qq.com/n/yqq/song/004Z8Ihr0JIu5s.html',
    # Which device and which browser the request comes from
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36',
}
url = 'https://c.y.qq.com/lyric/fcgi-bin/fcg_query_lyric_yqq.fcg?nobase64=1&musicid=106678944&-=jsonp1&g_tk_new_20200303=5381&g_tk=5381&loginUin=0&hostUin=0&format=json&inCharset=utf8&outCharset=utf-8&notice=0&platform=yqq.json&needNewCode=0'
res_song = requests.get(url, headers=headers)

# Option 1: treat the response text as HTML
soup = BeautifulSoup(res_song.text, 'html.parser')
# print(soup.text)

# Keep only runs of Chinese characters
pat = re.compile(r'[\u4e00-\u9fa5]+')
result = pat.findall(soup.text)
# Skip the first five matches and print the rest, one per line
result = '\n'.join(result[5:])
print(result)
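One possible way to patch the flaw mentioned above is to let the pattern keep a colon when it sits between two runs of Chinese characters. This is only a sketch under assumptions: the sample text below is made up (the names are placeholders), and I am assuming the credits appear in the form '词:name' and '曲:name', with either a full-width or an ASCII colon.

```python
import re

# Made-up sample standing in for soup.text; the names are placeholders
sample = '[00:00.00]词:张三\n[00:01.00]曲:李四\n[00:02.00]一行歌词'

# A run of Chinese characters, optionally followed by a colon
# (full-width or ASCII) and another run of Chinese characters
pat = re.compile(r'[\u4e00-\u9fa5]+(?:[::][\u4e00-\u9fa5]+)?')

result = pat.findall(sample)
print('\n'.join(result))
```

The colons inside the timestamps are left out because they are not preceded by a Chinese character.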

One more thing before I go: if you want to learn Python, contact the editor. I have put together my own set of Python learning materials and a learning route, and anyone who wants them can join QQ group 947618024 to get them.

The material in this article comes from the Internet. If there is any infringement, please contact me to have it removed.

Copyright notice
This article was written by [Python a meow]. Please include a link to the original when reposting. Thank you.
