Why can subclassing Python's built-in types go wrong?!

Under the cat pea 2020-11-15 11:44:38

This article is part of the “Why Python” series; please check out all the articles.

Not long ago, Python Cat recommended the book Fluent Python to you. That article was full of lavish praise and, in hindsight, seems rather vague...

However, Fluent Python is a book worth reading again and again, and every rereading teaches something new. Recently I came across a curious point in the book, so I'm going to talk about it here: subclassing built-in types can be problematic?!

1、 What are the built-in types?

Before we formally begin, a quick primer: which types count as Python's built-in types?

The official documentation classifies the built-in types (Built-in Types) in detail: https://docs.python.org/3/library/stdtypes.html

Among them are the well-known numeric types, sequence types, text type, mapping types and so on, and of course the Boolean type and the ... object that we have introduced before.

Out of all this content, this article focuses only on the built-in types that act as callable objects (callable), that is, the ones that look superficially like built-in functions (built-in function): int, str, list, tuple, range, set, dict...

These types (type) can be loosely understood as classes (class) in other languages, but Python does not name them in the customary CamelCase style, so they are easy to misunderstand.

Since Python 2.2, most of these built-in types can be subclassed (subclassing), that is, inherited from (range is one exception that cannot be used as a base class).
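A quick way to convince yourself that these names are classes rather than functions is to inspect them in the interpreter. The snippet below is only a small sketch; the subclass name MyInt is hypothetical and exists purely for illustration.

```python
# Names such as int and list look like built-in functions but are classes.
builtin_types = [int, str, list, tuple, range, set, dict]

for t in builtin_types:
    # Each one can be called like a function, yet it is an instance of type.
    assert callable(t) and isinstance(t, type)

# len, by contrast, is a genuine built-in function, not a class.
len_is_class = isinstance(len, type)  # False

# Because int is a class, it can serve as a base class.
class MyInt(int):  # hypothetical name, for illustration only
    pass

is_subclass = issubclass(MyInt, int)  # True
```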

2、 Subclassing built-in types

As everyone knows, for an ordinary object x, Python uses the public built-in function len(x) to get its length. This is unlike object-oriented languages such as Java, where an object usually has its own x.length() method. (PS: for an analysis of these two design styles, see this article.)

Now suppose we want to define a list class that has its own length() method while keeping all the features of a normal list.

The experimental code is as follows (for demonstration only):

# Define a subclass of list
class MyList(list):
    def length(self):
        return len(self)

We make the custom class MyList inherit from list and define a new length() method. As a result, MyList has append(), pop() and the other list methods, and it also has length().

# Add two elements
ss = MyList()
ss.append("Python")
ss.append("cat")
print(ss.length())  # Output: 2

The other built-in types mentioned earlier can be subclassed in the same way; it shouldn't be hard to understand.

By the way, what are the benefits or usage scenarios of subclassing built-in types?

Here is an intuitive example. Suppose a custom class needs to work with a list object frequently (adding or removing elements, passing it around as a whole...). If the class inherits from list, you can write self.append() and self.pop() directly, or pass self around as the list object, so there is no need to define an extra list attribute and the code is more concise.
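As a rough sketch of that scenario, here is a made-up Playlist class (the class and method names are hypothetical, not from any library) that is itself the list it manages:

```python
# Hypothetical example: a playlist that *is* a list, so list methods work on self.
class Playlist(list):
    def add_song(self, title):
        # No separate self.songs attribute is needed; self is the list.
        self.append(title)

    def skip(self):
        # Remove and return the first song, or None if the playlist is empty.
        return self.pop(0) if self else None


playlist = Playlist()
playlist.add_song("Song A")
playlist.add_song("Song B")
first = playlist.skip()
# The object can be passed to anything that expects a list or an iterable.
remaining = sorted(playlist)
```

Note that this only adds new methods and does not override any existing list method, which is the safe case discussed later in this article.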

Are there other benefits or use cases? You are welcome to leave a comment to discuss ~~

3、 The “problem” with subclassing built-in types

Finally, it's time to get to the main topic of this article :)

Generally, as the textbooks teach us, a method in a subclass overrides the method of the same name in the parent class. In other words, the subclass's method has a higher lookup priority than the parent's.

Here's an example with a parent class Cat and a subclass PythonCat. There is a say() method whose job is to speak the current object's inner_voice:

# A Python cat is a cat
class Cat():
    def say(self):
        return self.inner_voice()
    def inner_voice(self):
        return "meow"
class PythonCat(Cat):
    def inner_voice(self):
        return "Meow meow"

When we create an object of the subclass PythonCat, its say() method gives priority to the inner_voice() method defined on the subclass, rather than the inner_voice() method of the parent class Cat:

my_cat = PythonCat()
# The following results are as expected
print(my_cat.inner_voice())  # Output: Meow meow
print(my_cat.say())  # Output: Meow meow

This is a convention across programming languages and a basic principle; anyone who has learned the basics of object-oriented programming should know it.

However, when Python implements inheritance, it does not seem to follow the rule above completely. There are two cases:

  • In line with common sense: classes implemented in Python follow the “subclass before parent” principle
  • Contrary to common sense: for classes that are actually implemented in C (that is, built-in types such as str, list and dict), an explicit call to a subclass method follows the “subclass before parent” principle; but when the call is implicit, they seem to follow “parent before subclass”, so the usual inheritance rule appears to fail here

In terms of the PythonCat example, it is as if calling my_cat.inner_voice() directly returns the correct “Meow meow”, while calling my_cat.say() returns an unexpected “meow”.

Here is the example given in Fluent Python (Section 12.1):

class DoppelDict(dict):
    def __setitem__(self, key, value):
        super().__setitem__(key, [value] * 2)
dd = DoppelDict(one=1)  # {'one': 1}
dd['two'] = 2  # {'one': 1, 'two': [2, 2]}
dd.update(three=3)  # {'one': 1, 'two': [2, 2], 'three': 3}

In this example, dd['two'] directly calls the subclass's __setitem__() method, so the result is as expected. If the other operations also behaved as expected, the final result would be {'one': [1, 1], 'two': [2, 2], 'three': [3, 3]}.

However, initialization and update() directly call the __init__() and update() methods inherited from the parent class, and these then appear to call __setitem__() implicitly. At that point the subclass's method is not called; the parent class's behavior is used instead, so the result is not what we expected!
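To see exactly which operations reach the overridden method, one can record every call. The following is a minimal sketch, assuming CPython (other implementations may behave differently); TracingDict and calls are hypothetical names introduced only for this demonstration.

```python
calls = []  # records every key that reaches the overridden __setitem__


class TracingDict(dict):  # hypothetical name
    def __setitem__(self, key, value):
        calls.append(key)
        super().__setitem__(key, [value] * 2)


td = TracingDict(one=1)  # handled by dict.__init__
td['two'] = 2            # handled by TracingDict.__setitem__
td.update(three=3)       # handled by dict.update

print(calls)  # ['two'] on CPython
print(td)     # {'one': 1, 'two': [2, 2], 'three': 3} on CPython
```

Only the explicit subscript assignment shows up in calls; construction and update() never touch the override.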

This double standard in official Python is somewhat counterintuitive. If you are not careful, it is easy to fall into the pit.

So why does this exception exist?

4、 The true face of built-in type methods

We now know that built-in types do not implicitly call methods overridden by subclasses. Next, Python Cat will get to the bottom of it: why not?

Fluent Python does not pursue the question further, but I will venture a guess (which should be verified against the source code): the methods of built-in types are all implemented in C, and they do not actually call each other, so the question of lookup priority never arises during the call.

In other words, the earlier statement that “__init__() and update() implicitly call the __setitem__() method” is not accurate!

These methods are actually independent of each other! __init__() has its own item-setting implementation. It does not call the parent class's __setitem__(), and of course it has nothing to do with the subclass's __setitem__() either.

Logically speaking, the dictionary's __init__() method includes the functionality of __setitem__(), so we assume the former calls the latter. That is habitual thinking; the actual call relationship may look like this:

The method on the left passes through the door of the language interface into the world on the right and fulfills its entire mission there. It does not go back to the original interface to look up the next instruction (that is, the red-line path in the figure does not exist). The reason it doesn't is simple: calls between pieces of C code are more efficient, the path is shorter, and the implementation is simpler.

Similarly, the get() method of dict has no call relationship with __getitem__(). If a subclass overrides only __getitem__(), then calling get() on the subclass actually uses the parent class's get() method. (PS: on this point, I find the descriptions in Fluent Python and the PyPy documentation inaccurate; they mistakenly assume that get() would call __getitem__().)
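The independence of get() and __getitem__() is easy to check. This is a small sketch, assuming CPython; AnswerDict is just an illustrative class name.

```python
class AnswerDict(dict):  # illustrative name
    def __getitem__(self, key):
        return 42  # every subscript lookup returns 42


ad = AnswerDict(a='foo')
via_subscript = ad['a']  # 42: the subscript uses the override
via_get = ad.get('a')    # 'foo': dict.get does not go through the override
```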

In other words, the methods of Python's built-in types have no call relationship among themselves, even though the underlying C implementation may share common logic or reusable functions.

This reminds me of a question the “Why Python” series has already analyzed: “Why can Python support truth-value testing of arbitrary objects?” When we write if xxx, it seems to implicitly call the __bool__() and __len__() magic methods. In fact, for built-in types, the program goes straight into pure C logic via the POP_JUMP_IF_FALSE instruction, without a Python-level call to these magic methods!

Therefore, once we realize that special methods implemented in C are independent of each other, we can look at subclassing built-in types again and make a new discovery:

The parent class's __init__() magic method passes through the language interface to accomplish its own mission, but there is no pathway between it and the subclass's __setitem__(); that is, the red-line path in the figure is unreachable.

Special methods each go their own way. Thus we arrive at a conclusion different from the earlier one: Python actually follows the inheritance principle that “subclass methods take precedence over parent methods” strictly, and does not break common sense!

Last but not least, __missing__() is a special case. Fluent Python mentions it only in one brief and vague sentence, without elaborating.

After some preliminary experiments, I found that when a subclass defines this method, reading a nonexistent key with get() returns None as usual; but reading a nonexistent key with __getitem__() or dd['xxx'] is handled by the __missing__() defined on the subclass.
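The experiment can be reproduced with a few lines. The sketch below uses a hypothetical DefaultingDict class. As far as I can tell, this matches the documented behavior of dict: the d[key] operation calls __missing__() on subclasses that define it when the key is absent, while no other operation or method invokes it.

```python
class DefaultingDict(dict):  # hypothetical name
    def __missing__(self, key):
        return f"<no {key}>"


d = DefaultingDict(a=1)
by_subscript = d['xxx']           # '<no xxx>': handled by __missing__
by_dunder = d.__getitem__('xxx')  # '<no xxx>': same code path as d['xxx']
by_get = d.get('xxx')             # None: get() never consults __missing__
contains = 'xxx' in d             # False: __missing__ did not insert the key
```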

I haven't had time to analyze this further; please leave me a message if you know the answer.

5、 Best practices for subclassing built-in types

To sum up, there is nothing wrong with subclassing built-in types as such. It is only when we fail to recognize the true face of the special methods (methods implemented in C) that the results deviate from our expectations.

This raises a new question: if you have to inherit from a built-in type, what is the best practice?

First, if you inherit from a built-in type and do not override (override) its special methods, subclassing causes no problems.

Second, if you do want to override special methods after inheriting, remember to override every method you want to change. For example, if you want to change get(), override get(); if you want to change __getitem__(), override that as well...
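Applied to the earlier dictionary example, “override every method you want to change” might look like the sketch below. The class name DoppelDict2 is hypothetical, and this is only one possible way to route construction and update() through the overridden __setitem__().

```python
class DoppelDict2(dict):  # hypothetical name
    def __init__(self, *args, **kwargs):
        super().__init__()
        # Route construction through our own update() instead of dict.__init__.
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(key, [value] * 2)

    def update(self, *args, **kwargs):
        # Build a plain dict from the arguments, then store the items one by
        # one so that each of them goes through the overridden __setitem__.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


dd = DoppelDict2(one=1)
dd['two'] = 2
dd.update(three=3)
print(dd)  # {'one': [1, 1], 'two': [2, 2], 'three': [3, 3]}
```

Other methods that store values, such as setdefault(), still bypass the override here and would need the same treatment, which shows how tedious this approach becomes.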

However, what if we only want to change one piece of logic (the part implemented in C) and have every special method that relies on that logic change with it? For example, we override the logic of __setitem__() and want initialization, update() and similar operations to change accordingly. What should we do then?

We know that the special methods do not reuse one another, so simply defining a new __setitem__() is not enough. How, then, can we affect several methods at once?

PyPy, an unofficial Python implementation, noticed this problem. Its approach is to make the special methods of built-in types call one another, establishing a connection between them.

Official Python is of course aware of the problem too, but instead of changing the nature of the built-in types, it offers a new solution: UserString, UserList, UserDict...

Apart from their names, they behave essentially like the corresponding built-in types.

The basic logic of these classes is implemented in Python. This is equivalent to moving part of the logic described above from the C interface to the Python interface and building the call chain on the left side, which solves the reuse problem for some of the special methods.
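The call chain on the Python side can be observed directly. In the sketch below, LoggingDict and seen are hypothetical names; UserDict and MutableMapping are the real classes from the standard library.

```python
from collections import UserDict
from collections.abc import MutableMapping

seen = []  # keys that passed through the overridden __setitem__


class LoggingDict(UserDict):  # hypothetical name
    def __setitem__(self, key, value):
        seen.append(key)
        super().__setitem__(key, value)


ld = LoggingDict(a=1)  # UserDict.__init__ calls self.update(...)
ld.update(b=2)         # update() stores each item via self[key] = value

print(seen)                            # ['a', 'b']
print(ld.data)                         # {'a': 1, 'b': 2}
print(isinstance(ld, dict))            # False
print(isinstance(ld, MutableMapping))  # True
```

One caveat worth keeping in mind: a UserDict is not a subclass of dict. It wraps a real dict in its data attribute, so code that checks isinstance(obj, dict) will not accept it.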

Compared with the earlier example, once we adopt the new way of inheriting, the result is as expected:

from collections import UserDict
class DoppelDict(UserDict):
    def __setitem__(self, key, value):
        super().__setitem__(key, [value] * 2)
dd = DoppelDict(one=1)  # {'one': [1, 1]}
dd['two'] = 2  # {'one': [1, 1], 'two': [2, 2]}
dd.update(three=3)  # {'one': [1, 1], 'two': [2, 2], 'three': [3, 3]}

Clearly, if you want to inherit from str/list/dict, the best practice is to inherit from the corresponding classes provided by the collections library.

6、 Summary

I've written this much, so it's time to wrap up ~~

In the previous article in this series, Python Cat analyzed “why built-in functions and built-in types are not omnipotent” from the two angles of lookup order and speed. This article follows in the same vein and reveals some mysterious, seemingly flawed behavior of the built-in types.

Although this article was inspired by Fluent Python, we looked beyond the surface of the language and asked one more “why”, and thus analyzed the principle behind the phenomenon.

In short, the special methods of built-in types are implemented independently in C, and there is no call relationship between them at the Python level. So when a built-in type is subclassed, an overridden special method affects only that method itself and does not change the behavior of the other special methods.

If we misunderstand the relationship between the special methods, we might conclude that Python breaks the basic inheritance principle that “subclass methods take precedence over parent methods”. (Unfortunately, Fluent Python and PyPy both seem to share this misunderstanding.)

To meet the usual expectations for built-in types, Python provides the extension classes UserString, UserList and UserDict in the standard library, which make it convenient for programmers to inherit from these basic data types.

A final note: this article belongs to the “Why Python” series (produced by Python Cat). The series focuses on Python's syntax, design and development, and takes one “why” question after another as its starting point to try to show Python's charm. If you have other topics of interest, you are welcome to fill in the “100,000 Whys of Python” questionnaire.

