Kill the captcha! Using Python image processing to solve slider captchas

User 2966292 2021-02-23 16:46:32


Preface

Captchas are a frequent stumbling block on the crawler's road, and they come in endless varieties: image captchas, slider captchas, interactive captchas, behavioral verification, and so on. As OCR technology has matured, image captchas have gradually faded from the mainstream, while slider captchas are showing up more and more. "So powerful, what does this thing look like?" That's right, it looks like this:

The solution is also intuitive: first find the position of the gap (usually only its X coordinate is needed), then drag the slider there. Today kimol will show you how to use Python to locate the gap in a slider captcha.

Part One: Gap identification

To identify the gap in the picture, we mainly use cv2, the Python binding of the OpenCV image processing library. It is installed as follows:

pip install opencv-python

Note: the package name is opencv-python, not cv2, so "pip install cv2" will not work~

1. Read the picture

A slider captcha consists of two images. One is the background image:

The other is the gap image:

Read both of them with the imread function:

import cv2

# Read the background image and the gap image
bg_img = cv2.imread('bg.jpg')  # background image
tp_img = cv2.imread('tp.png')  # gap image

2. Detect the image edges

To match the gap against the background more reliably, we first detect the edges in both images:

# Identify the edge of the image
bg_edge = cv2.Canny(bg_img, 100, 200)
tp_edge = cv2.Canny(tp_img, 100, 200)

This step is crucial! Without it, the gap matching will not be accurate.

This gives us a grayscale edge map for each image, which we then convert to RGB format:

# Change the image format
bg_pic = cv2.cvtColor(bg_edge, cv2.COLOR_GRAY2RGB)
tp_pic = cv2.cvtColor(tp_edge, cv2.COLOR_GRAY2RGB)

The converted background image:

The converted gap image:

3. Gap matching

Using the matchTemplate function in cv2, we can search the background image for the gap. The code is as follows:

# Gap matching
res = cv2.matchTemplate(bg_pic, tp_pic, cv2.TM_CCOEFF_NORMED)
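
To get a feel for what cv2.TM_CCOEFF_NORMED computes, here is a plain-Python sketch of the same idea on small 2-D lists: for every placement of the template, subtract the mean from both the template and the window under it, then take their normalized correlation. This is only an illustration (the function name ccoeff_normed is made up, it handles a single channel, and it is far slower than OpenCV), not OpenCV's actual implementation.

```python
import math


def ccoeff_normed(image, template):
    """Score every placement of `template` inside `image` (2-D lists of numbers).

    Returns a 2-D list indexed as result[y][x], where (x, y) is the
    upper-left corner of the placement.
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    n = th * tw

    # Mean-subtracted template and its squared norm
    t_mean = sum(map(sum, template)) / n
    t_dev = [[v - t_mean for v in row] for row in template]
    t_norm = sum(d * d for row in t_dev for d in row)

    result = []
    for y in range(ih - th + 1):
        row_scores = []
        for x in range(iw - tw + 1):
            window = [image[y + dy][x:x + tw] for dy in range(th)]
            w_mean = sum(map(sum, window)) / n
            num = 0.0
            w_norm = 0.0
            for dy in range(th):
                for dx in range(tw):
                    w_d = window[dy][dx] - w_mean
                    num += w_d * t_dev[dy][dx]
                    w_norm += w_d * w_d
            denom = math.sqrt(t_norm * w_norm)
            # A flat window has zero variance; score it as 0
            row_scores.append(num / denom if denom > 0 else 0.0)
        result.append(row_scores)
    return result
```

For an image of size W x H and a template of size w x h, the result has (W - w + 1) x (H - h + 1) entries, the same size as the array returned by matchTemplate.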

res holds the match score for every candidate position. With cv2.TM_CCOEFF_NORMED each score is a normalized correlation coefficient between -1 and 1, which can be read loosely as a matching probability. The point with the highest score is where the gap matches:

min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)  # find the best match

min_val, max_val, min_loc and max_loc are, respectively, the minimum match score, the maximum match score, the position of the minimum and the position of the maximum. P.S. Of course, you could write your own loop here, but if there is a ready-made function, why not use it?
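
For the curious, the hand-written loop could look roughly like this. It is a plain-Python sketch (the name min_max_loc is made up for this illustration) that mirrors the return order of cv2.minMaxLoc, with locations given as (x, y).

```python
def min_max_loc(scores):
    """Return (min_val, max_val, min_loc, max_loc) for a 2-D grid of scores.

    Locations are (x, y) tuples, i.e. (column, row).
    """
    min_val = max_val = scores[0][0]
    min_loc = max_loc = (0, 0)
    for y, row in enumerate(scores):
        for x, value in enumerate(row):
            if value < min_val:
                min_val, min_loc = value, (x, y)
            if value > max_val:
                max_val, max_loc = value, (x, y)
    return min_val, max_val, min_loc, max_loc


scores = [[0.1, -0.3, 0.2],
          [0.0, 0.9, 0.4]]
print(min_max_loc(scores))  # (-0.3, 0.9, (1, 0), (1, 1))
```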

With that, we have the location of the gap. Its X coordinate is:

X = max_loc[0]

To show the location of the gap more intuitively, we mark it with a rectangle:

# Draw a box
th, tw = tp_pic.shape[:2]
tl = max_loc  # coordinates of the upper-left corner
br = (tl[0] + tw, tl[1] + th)  # coordinates of the lower-right corner
cv2.rectangle(bg_img, tl, br, (0, 0, 255), 2)  # draw a red rectangle (color is in BGR order)
cv2.imwrite('out.jpg', bg_img)  # save locally

The result looks like this:

Perfect~ Time to call it a day!!!

Part Two: Complete code

To make this easier to use in practice, we wrap the code in a function:

import cv2

def identify_gap(bg, tp, out):
    '''
    bg: path to the background image
    tp: path to the gap image
    out: path for the output image
    '''
    # Read the background image and the gap image
    bg_img = cv2.imread(bg)  # background image
    tp_img = cv2.imread(tp)  # gap image
    # Detect the image edges
    bg_edge = cv2.Canny(bg_img, 100, 200)
    tp_edge = cv2.Canny(tp_img, 100, 200)
    # Change the image format
    bg_pic = cv2.cvtColor(bg_edge, cv2.COLOR_GRAY2RGB)
    tp_pic = cv2.cvtColor(tp_edge, cv2.COLOR_GRAY2RGB)
    # Gap matching
    res = cv2.matchTemplate(bg_pic, tp_pic, cv2.TM_CCOEFF_NORMED)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)  # find the best match
    # Draw a box
    th, tw = tp_pic.shape[:2]
    tl = max_loc  # coordinates of the upper-left corner
    br = (tl[0] + tw, tl[1] + th)  # coordinates of the lower-right corner
    cv2.rectangle(bg_img, tl, br, (0, 0, 255), 2)  # draw a rectangle
    cv2.imwrite(out, bg_img)  # save locally
    # Return the X coordinate of the gap
    return tl[0]

The function above reads image files from local disk, which is not very convenient during crawling. If you are interested, you can adapt it yourself to accept an image stream as input instead.

Final words

That's all for this article. If you need the complete source code or have any suggestions, feel free to send me a private message. Finally, thank you for your patience, and see you next time~

This article is from the WeChat official account CoXie Take you to programming (Pythoni521).

Originally published: 2021-01-27

Copyright notice
This article was written by [User 2966292]. Please include a link to the original when reprinting. Thank you.
https://pythonmana.com/2021/02/20210223164213500x.html
