Hooked on short videos and can't stop? An advanced Python crawler walkthrough for short videos

Five packs of spicy strips! 2021-09-15 12:47:57


Hello everyone, I'm Spicy.

Short videos are everywhere these days: people scroll through them at dinner, during breaks, and lying in bed. Today's topic is an advanced Python crawler exercise: analyzing how Meipai encrypts its video addresses.

Crawl target

Target website: Meipai (https://www.meipai.com)

Tools

Development environment: Windows 10, Python 3.7
Development tools: PyCharm, Chrome
Libraries: requests, lxml (for XPath), base64

Key learning content

How to analyze the data a crawler needs to collect
JS debugging techniques
Reverse-engineering the JS parsing code
Porting the JS code to Python

Project approach

Go to the home page of the website
Pick the categories you are interested in
From the home page, collect the hyperlink addresses that lead to each video's detail page
On the detail page, find the encrypted video playback address
This data is part of the static page source; the site decodes it with JS code
Find the corresponding parsing code
First locate the actual video playback address
Then find the JS file that decrypts the video address
Clicking play triggers this file
The data looks roughly like Base64-encoded data
Search for keywords in that JS file
Find the JS method that handles the encryption
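Before digging into the JS, a quick check can confirm that the attribute value only looks like Base64 and is not plain Base64. The sketch below is just an illustration: the helper name is_plain_base64 is hypothetical, and the sample value is the one decoded later in this article. A failed strict decode is only a hint that extra characters were mixed in, not proof of any particular scheme.

```python
import base64
import binascii


def is_plain_base64(text):
    """Return True if text decodes as strict Base64 (hypothetical helper)."""
    try:
        base64.b64decode(text, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False


# Sample data-video value taken from this article.
raw = ("e121Ly9tBrI84RdnZpZGVvMTAubWVpdHVkYXRhLmNvbS82MGJjZDcwNTE3NGZieXBueG5u"
       "dnRwMTA5N19IMjY0XzFfNWY3YThmM2U0MTEwNy5tc2JVjAu3EDQ=")

# The raw value fails a strict decode, so something was added to it.
print(is_plain_base64(raw))

# A clean Base64 fragment from inside the same value decodes without trouble.
print(is_plain_base64("Ly9tdnZpZGVv"), base64.b64decode("Ly9tdnZpZGVv"))
```

If the first check fails while fragments of the value decode cleanly, the page's JS is probably stripping characters before decoding, which is what the next steps look for.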
Some of the JS functions used

# replace() replaces some characters in a string with others
# parseInt converts the data to the corresponding integer
# base64.atob decodes a Base64-encoded string
# substr extracts a specified number of characters from a string, starting at the index start
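For readers who know Python better than JS, here is a rough mapping of those JS functions to Python. It is a minimal sketch on a made-up string ("abcXYZdef" is not data from the site), and the variable names are chosen only for this illustration.

```python
import base64

text = "abcXYZdef"

# JS: text.replace("XYZ", "") with a string pattern replaces only the first
# match, so the Python equivalent needs count=1.
replaced = text.replace("XYZ", "", 1)

# JS: parseInt("121e", 16)
number = int("121e", 16)

# JS: atob("Ly9t")
decoded = base64.b64decode("Ly9t")

# JS: text.substr(3, 3) takes a start index and a length.
start, length = 3, 3
piece = text[start:start + length]

# JS: text.substring(0, 3) takes a start index and an end index.
head = text[0:3]

print(replaced, number, decoded, piece, head)
```

The one easy mistake is replace: Python's str.replace removes every match unless a count is given, while JS replace with a string pattern stops after the first.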

Converting the JS code to Python

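import base64


def decode(data):
    """Decode an obfuscated Meipai video address (ported from the site's JS)."""

    def getHex(a):
        # The first 4 characters, reversed, form a hex number; the rest is the payload.
        return {
            'str': a[4:],
            'hex': ''.join(list(a[:4])[::-1]),
        }

    def getDec(a):
        # Hex to decimal digits: the first two describe the front junk run,
        # the remaining digits describe the back one.
        b = str(int(a, 16))
        return {
            'pre': list(b[:2]),
            'tail': list(b[2:]),
        }

    def substr(a, b):
        # Remove the run of b[1] junk characters that starts at index b[0].
        c = a[0: int(b[0])]
        d = a[int(b[0]): int(b[0]) + int(b[1])]
        # count=1: like JS replace with a string pattern, remove only the first match.
        return c + a[int(b[0]):].replace(d, "", 1)

    def getPos(a, b):
        # The back position is counted from the end of the string.
        b[0] = len(a) - int(b[0]) - int(b[1])
        return b

    b = getHex(data)
    c = getDec(b['hex'])
    d = substr(b['str'], c['pre'])
    return base64.b64decode(substr(d, getPos(d, c['tail'])))


print(decode("e121Ly9tBrI84RdnZpZGVvMTAubWVpdHVkYXRhLmNvbS82MGJjZDcwNTE3NGZieXBueG5udnRwMTA5N19IMjY0XzFfNWY3YThmM2U0MTEwNy5tc2JVjAu3EDQ="))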

Get the final video playback address
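To see how the sample string turns into that address, the decoding can be traced one step at a time. This is only a sketch of the same logic written with plain slicing; names such as junk_front and junk_back are illustrative labels, not names from the site's JS.

```python
import base64

sample = ("e121Ly9tBrI84RdnZpZGVvMTAubWVpdHVkYXRhLmNvbS82MGJjZDcwNTE3NGZieXBueG5u"
          "dnRwMTA5N19IMjY0XzFfNWY3YThmM2U0MTEwNy5tc2JVjAu3EDQ=")

# Step 1: split off the 4-character header and read it backwards as hex.
header, body = sample[:4], sample[4:]
number = str(int(header[::-1], 16))      # "e121" -> "121e" -> "4638"
pre, tail = number[:2], number[2:]       # "46" and "38"

# Step 2: the front junk run starts at index pre[0] and is pre[1] characters long.
start, length = int(pre[0]), int(pre[1])
junk_front = body[start:start + length]  # "BrI84R"
body = body[:start] + body[start + length:]

# Step 3: the back junk run is tail[1] characters long and is followed by
# tail[0] real characters, so its position is counted from the end.
length = int(tail[1])
start = len(body) - int(tail[0]) - length
junk_back = body[start:start + length]   # "2JVjAu3E"
body = body[:start] + body[start + length:]

# Step 4: what remains is ordinary Base64.
url = base64.b64decode(body).decode("utf-8")
print(number, junk_front, junk_back)
print(url)
```

The result is a protocol-relative address (it starts with "//"), which is why the full script below prepends "http:" before downloading.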

Full source code

import base64

import requests
from lxml import etree


def decode_mp4(data):
    """Decode an obfuscated Meipai video address (ported from the site's JS)."""

    def getHex(a):
        return {
            'str': a[4:],
            'hex': ''.join(list(a[:4])[::-1]),
        }

    def getDec(a):
        b = str(int(a, 16))
        return {
            'pre': list(b[:2]),
            'tail': list(b[2:]),
        }

    def substr(a, b):
        c = a[0: int(b[0])]
        d = a[int(b[0]): int(b[0]) + int(b[1])]
        # count=1: like JS replace with a string pattern, remove only the first match.
        return c + a[int(b[0]):].replace(d, "", 1)

    def getPos(a, b):
        b[0] = len(a) - int(b[0]) - int(b[1])
        return b

    b = getHex(data)
    c = getDec(b['hex'])
    d = substr(b['str'], c['pre'])
    return base64.b64decode(substr(d, getPos(d, c['tail'])))


# Main function
def main():
    url = 'https://www.meipai.com'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36',
    }
    response = requests.get(url=url, headers=headers)
    html_data = etree.HTML(response.text)
    # Hyperlinks on the home page that lead to detail pages
    href_list = html_data.xpath('//div/a/@href')
    # print(href_list)
    for href in href_list:
        res = requests.get('https://www.meipai.com' + href, headers=headers)
        html = etree.HTML(res.text)
        names = html.xpath('//div[@id="detailVideo"]/img/@alt')
        videos = html.xpath('//div[@id="detailVideo"]/@data-video')
        if not names or not videos:
            # Not a video detail page, so there is nothing to decode
            continue
        name, mp4_data = names[0], videos[0]
        # print(name, mp4_data)
        mp4_url = decode_mp4(mp4_data).decode('utf-8')
        print(mp4_url)
        # The decoded address is protocol-relative ("//host/path")
        result = requests.get("http:" + mp4_url)
        # The with block closes the file, so no explicit close() is needed
        with open(name + ".mp4", 'wb') as f:
            f.write(result.content)


if __name__ == '__main__':
    main()
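Two details in the script are easy to trip over: the decoded address has no scheme, and a video title can contain characters that are not allowed in file names. The helpers below are a small sketch of one way to handle both. The names absolute_url and safe_filename are hypothetical, the character list assumes Windows naming rules (the environment used in this article), and cdn.example.com is a placeholder host.

```python
import re


def absolute_url(url, scheme="http"):
    """Add a scheme to a protocol-relative URL such as '//host/path'."""
    if url.startswith("//"):
        return scheme + ":" + url
    return url


def safe_filename(title, fallback="video", max_len=100):
    """Replace characters that Windows forbids in file names."""
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "_", title).strip(" .")
    return (cleaned or fallback)[:max_len]


print(absolute_url("//cdn.example.com/clip.mp4"))
print(safe_filename('Cute cat: part 1/2?') + ".mp4")
```

In the script above, these could wrap mp4_url before the download request and name before open(), so that one odd title does not stop the whole loop.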

Feel free to discuss technical questions in the comments, and remember the one-click triple (like, favorite, and comment). I wish you all happiness!

Copyright notice
This article was written by [Five packs of spicy strips!]. Please include a link to the original when reposting. Thank you.
https://pythonmana.com/2021/09/20210909161448846B.html
