This article is part of the “Python Why?” series; see the series index for all articles.
Not long ago, Python Cat recommended the book Fluent Python to readers (see that earlier post). That article was full of praise, and it may have come across as rather vague...
However, Fluent Python is a book worth reading again and again, and each pass turns up something new. Recently I came across a curious point in the book, so I want to talk about it here: can subclassing built-in types cause problems?!
1. What are the built-in types?
Before we start, some background: which types are Python's built-in types?
The official documentation groups the built-in types (Built-in Types) into a number of categories.
Detailed documentation: https://docs.python.org/3/lib...
Among them are the familiar numeric, sequence, text, and mapping types, and of course the Boolean type and the ... object that we have covered before, among others.
Out of all this, the article focuses only on the built-in types that are callable objects, that is, the ones that look like built-in functions on the surface: int, str, list, tuple, range, set, dict...
These types can be loosely understood as what other languages call classes, but Python does not give them the customary CamelCase names, so they are easy to mistake for functions.
Since Python 2.2, these built-in types can be subclassed, that is, inherited from (with a few exceptions such as range, which cannot be used as a base class).
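A quick check makes this concrete. The sketch below uses only standard built-ins; the subclass name Celsius and its method are hypothetical, chosen just for illustration.

```python
# The callable built-ins listed above are classes (instances of type),
# not plain functions.
builtin_types = (int, str, list, tuple, range, set, dict)
print(all(isinstance(t, type) and callable(t) for t in builtin_types))  # True


# Hypothetical int subclass, used only to show that inheritance works.
class Celsius(int):
    def to_fahrenheit(self):
        return self * 9 / 5 + 32


temp = Celsius(100)
print(isinstance(temp, int))   # True
print(temp.to_fahrenheit())    # 212.0
```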
2. Subclassing built-in types
As everyone knows, to get the length of an ordinary object x, Python uses the public built-in function len(x). This is unlike object-oriented languages such as Java, where an object usually has its own x.length() method. (PS: for an analysis of these two design styles, see the article recommended earlier in this series.)
Now suppose we want to define a list class that has its own length() method while keeping every feature of a normal list.
The experimental code is as follows (for demonstration only):
# Define a subclass of list
class MyList(list):
    def length(self):
        return len(self)
We let the custom class MyList inherit from list and define a new length() method on it. As a result, MyList has append(), pop() and the other list methods, plus length().
# Add two elements
ss = MyList()
ss.append("Python")
ss.append("cat")
print(ss.length())  # Output: 2
The other built-in types mentioned earlier can be subclassed in the same way, which should not be hard to understand.
By the way, what are the benefits and use cases of subclassing built-in types?
One intuitive example: when a custom class needs to use a list object constantly (adding or removing elements, passing it around as a whole, and so on), letting the class inherit from list means we can write self.append() and self.pop() directly, or pass self itself as the list. There is no need to define an extra list attribute, so the code is more concise.
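As a minimal sketch of that use case (the class name TaskQueue and its methods are hypothetical, not from the original post), a container class can simply be a list:

```python
class TaskQueue(list):
    """Hypothetical task container that *is* a list."""

    def add_task(self, name):
        # No separate self._items attribute is needed: self is the list.
        self.append(name)

    def next_task(self):
        # Take the oldest task; return None when the queue is empty.
        return self.pop(0) if self else None


queue = TaskQueue()
queue.add_task("write")
queue.add_task("review")

print(len(queue))          # 2
print(sorted(queue))       # ['review', 'write'], self is passed as a list
print(queue.next_task())   # write
```

The trade-off is that the class also exposes every other list method, which may or may not be what you want.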
Are there other benefits or use cases? You are welcome to leave a comment to discuss.
3. The “problem” with subclassing built-in types
Finally, it is time for the real subject of this article :)
Generally, as the textbooks teach us, a method in a subclass overrides the method with the same name in the parent class. In other words, subclass methods are looked up before parent methods.
Here is an example with a parent class Cat and a subclass PythonCat. The say() method speaks the current object's inner_voice:
# A Python cat is still a cat
class Cat:
    def say(self):
        return self.inner_voice()

    def inner_voice(self):
        return "meow"

class PythonCat(Cat):
    def inner_voice(self):
        return "Meow meow"
When we create an object of the subclass PythonCat, its say() method uses the inner_voice() method defined by the subclass itself, not the inner_voice() of the parent class Cat:
my_cat = PythonCat()
# Both results are as expected
print(my_cat.inner_voice())  # Output: Meow meow
print(my_cat.say())          # Output: Meow meow
This is a convention across programming languages and a basic principle that anyone who has learned the basics of object-oriented programming should know.
However, Python's implementation of inheritance does not seem to follow this rule completely. There are two cases:
- In line with common sense: classes implemented in Python follow the “subclass before parent” principle.
- Contrary to common sense: classes actually implemented in C (that is, built-in types such as str, list and dict) follow the “subclass before parent” principle when a subclass method is called explicitly; but when the call is implicit, they seem to follow a “parent before subclass” principle, so the usual inheritance rule appears to fail.
In terms of the PythonCat example, it is as if calling my_cat.inner_voice() directly gave the correct “Meow meow”, while calling my_cat.say() gave the unexpected “meow”.
Here is the example given in Fluent Python (Section 12.1):
class DoppelDict(dict):
    def __setitem__(self, key, value):
        # Store every value twice, as a two-element list
        super().__setitem__(key, [value] * 2)

dd = DoppelDict(one=1)  # {'one': 1}
dd['two'] = 2           # {'one': 1, 'two': [2, 2]}
dd.update(three=3)      # {'one': 1, 'two': [2, 2], 'three': 3}
Here, dd['two'] calls the subclass's __setitem__() directly, so the result is as expected. If the other operations also behaved as expected, the final result would be {'one': [1, 1], 'two': [2, 2], 'three': [3, 3]}.
However, initialization and update() directly invoke __init__() and update() inherited from the parent class, which then appear to call __setitem__() implicitly. At that point the subclass's method is not called; the parent's logic runs instead, so the result is not what we expected!
This double standard in official Python goes somewhat against common sense, and if you are not careful it is an easy trap to fall into.
So why is there such an exception?
4. The true nature of built-in type methods
We now know that built-in types do not implicitly call the overriding methods of a subclass. Next, Python Cat wants to get to the bottom of it: why not?
Fluent Python does not pursue the question further. I will venture a guess (which should be verified against the source code): the methods of built-in types are all implemented in C, and they do not actually call each other, so the question of lookup priority never arises.
In other words, the earlier statement that “__init__() and update() implicitly call __setitem__()” is not accurate!
These magic methods are in fact independent of each other! __init__() has its own implementation of the set-item logic. It does not call the parent's __setitem__(), and it certainly has nothing to do with the subclass's __setitem__().
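One way to observe this without reading the C source is to record every call that reaches an overridden __setitem__(). The sketch below is an illustration only: the class name TracingDict and its calls attribute are hypothetical, and the output shown assumes CPython (other implementations such as PyPy may behave differently).

```python
class TracingDict(dict):
    """Hypothetical dict subclass that records each __setitem__ call."""

    def __init__(self, *args, **kwargs):
        self.calls = []                      # keys seen by our override
        super().__init__(*args, **kwargs)

    def __setitem__(self, key, value):
        self.calls.append(key)
        super().__setitem__(key, value)


td = TracingDict(one=1)   # dict.__init__ stores the item on its own
td.update(two=2)          # so does dict.update
td["three"] = 3           # only this statement reaches our override

print(td.calls)           # ['three'] on CPython
print(sorted(td))         # ['one', 'three', 'two']
```

All three keys end up in the dictionary, but only one of them passed through the subclass method.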
Logically, the dictionary's __init__() contains the functionality of __setitem__(), so we assume the former calls the latter. That is habitual thinking at work. However, the actual call relationship may be as follows:
The method on the left passes through the door of the language interface into the world on the right and completes its whole mission there. It does not come back to the original interface to look for the next instruction (that is, the red-line path in the figure does not exist). The reason is simple: calls within C code are more efficient, the path is shorter, and the implementation is simpler.
Similarly, the get() method of dict has no call relationship with __getitem__(). If a subclass overrides only __getitem__(), then calling get() on the subclass actually uses the parent's get(). (PS: on this point, the descriptions in Fluent Python and in the PyPy documentation are not accurate; they mistakenly assume that get() calls __getitem__().)
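The get() case is easy to try out. This is a minimal sketch with a hypothetical class name, ShoutingDict; the second result assumes CPython.

```python
class ShoutingDict(dict):
    """Hypothetical subclass that overrides only __getitem__."""

    def __getitem__(self, key):
        return str(super().__getitem__(key)).upper()


sd = ShoutingDict(name="python cat")

print(sd["name"])       # PYTHON CAT: our override runs
print(sd.get("name"))   # python cat: dict.get does not go through it
```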
In other words, the methods of Python's built-in types have no call relationship among themselves, even though their underlying C implementations may share common logic or reusable functions.
This reminds me of a question analyzed earlier in the “Python Why?” series: “Why does Python support truth-value testing of arbitrary objects?”. When we write if xxx, it seems to implicitly call the __bool__() and __len__() magic methods. In fact, for built-in objects the program follows the POP_JUMP_IF_FALSE instruction straight into pure C logic, without a Python-level call to these magic methods!
So, once we realize that special methods implemented in C are independent of each other, we can look at subclassing built-in types again and make a new discovery:
The parent's __init__() magic method passes through the language interface to complete its own mission, but there is no path from it to the subclass's __setitem__(). That is, the red-line path in the figure is unreachable.
Each special method goes its own way. This leads us to a conclusion different from the earlier one: Python actually does strictly follow the inheritance principle that subclass methods take precedence over parent methods, and common sense is not violated!
Last but not least, __missing__() is a special case. Fluent Python mentions it in only one brief, vague sentence and does not elaborate.
From some preliminary experiments, I found that when a subclass defines this method, get() returns None as usual when reading a nonexistent key, whereas __getitem__() and dd['xxx'] hand a nonexistent key over to the subclass's __missing__().
I have not had time to analyze this in depth. Please leave me a comment if you know the answer.
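For what it is worth, this matches the behavior described in the official documentation of dict: if a subclass defines __missing__(), the d[key] operation calls it for an absent key, and no other operation or method does. A small sketch reproduces the experiment (the class name DefaultingDict is hypothetical):

```python
class DefaultingDict(dict):
    """Hypothetical subclass that defines only __missing__."""

    def __missing__(self, key):
        return f"<no {key}>"


dd = DefaultingDict(a=1)

print(dd["xxx"])               # <no xxx>
print(dd.__getitem__("xxx"))   # <no xxx>
print(dd.get("xxx"))           # None
print("xxx" in dd)             # False: __missing__ does not insert the key
```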
5. Best practices for subclassing built-in types
To sum up, subclassing built-in types is not a problem in itself. It is only when we fail to recognize the true nature of special methods (methods implemented in C) that the results deviate from what we expect.
This raises a new question: if you do have to inherit from a built-in type, what is the best practice?
First, if you inherit from a built-in type without overriding any of its special methods, subclassing causes no problem.
Second, if you do want to override special methods after inheriting, remember to override every method whose behavior you want to change. For example, if you want to change get(), override get(); if you want to change __getitem__(), override __getitem__()...
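Applied to the doubling dictionary from Fluent Python, “override every method you want to change” could look like the sketch below. It is one possible arrangement rather than the book's code, and the class name DoubledDict is hypothetical.

```python
class DoubledDict(dict):
    """Hypothetical dict subclass that stores every value as [value, value]."""

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)      # route construction through update()

    def __setitem__(self, key, value):
        super().__setitem__(key, [value] * 2)

    def update(self, *args, **kwargs):
        # dict(...) normalises mappings, iterables of pairs and keyword args.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value             # goes through our __setitem__


doubled = DoubledDict(one=1)
doubled["two"] = 2
doubled.update(three=3)
print(doubled)   # {'one': [1, 1], 'two': [2, 2], 'three': [3, 3]}
```

Note that other writers, such as setdefault(), would still bypass __setitem__() unless they are overridden too, which is exactly why this approach gets tedious.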
However, what if we want to change a piece of underlying logic (the part written in C) so that every special method relying on that logic changes with it? For example, what if we rewrite the logic of __setitem__() and want initialization, update() and similar operations to change as well?
We know that special methods do not reuse one another, so simply defining a new __setitem__() is not enough. How, then, can we affect several methods at once?
PyPy, an unofficial Python implementation, noticed this problem. Its approach is to have the special methods of built-in types call one another, establishing connections between them.
Official Python is of course aware of the problem too. It has not changed the behavior of the built-in types, but offers a different solution: UserString, UserList, UserDict...
Apart from their names, these classes behave almost the same as the corresponding built-in types.
Their basic logic is implemented in Python. This amounts to moving part of the logic from the C side of the language interface to the Python side and building the call chain on the left-hand side, which solves the reuse problem for some of the special methods.
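A small sketch shows what “implemented in Python” means in practice. UserDict keeps its items in an ordinary dict stored in the data attribute, and methods such as get() are Python code that goes through self[key]. The class name ShoutingUserDict is hypothetical.

```python
from collections import UserDict


class ShoutingUserDict(UserDict):
    """Hypothetical subclass that overrides only __getitem__."""

    def __getitem__(self, key):
        # self.data is the plain dict that UserDict wraps.
        return str(self.data[key]).upper()


ud = ShoutingUserDict(name="python cat")

print(type(ud.data))          # <class 'dict'>
print(ud["name"])             # PYTHON CAT
print(ud.get("name"))         # PYTHON CAT: get() now sees the override
print(isinstance(ud, dict))   # False: UserDict is not a dict subclass
```

The last line is worth keeping in mind: code that checks isinstance(obj, dict) will not accept a UserDict.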
Compare this with the earlier example. With the new base class, the results are as expected:
from collections import UserDict

class DoppelDict(UserDict):
    def __setitem__(self, key, value):
        super().__setitem__(key, [value] * 2)

dd = DoppelDict(one=1)  # {'one': [1, 1]}
dd['two'] = 2           # {'one': [1, 1], 'two': [2, 2]}
dd.update(three=3)      # {'one': [1, 1], 'two': [2, 2], 'three': [3, 3]}
Clearly, if you want to inherit from str, list or dict, the best practice is to inherit from the corresponding classes provided by the collections module.
6. Summary
That was a lot of writing, so it is time to wrap up.
In the previous article in this series, Python Cat analyzed “why built-in functions and built-in types are not a cure-all” from two angles, lookup order and speed. This article follows in the same vein and reveals some mysterious, seemingly flawed behavior of built-in types.
Although this article was inspired by Fluent Python, we went beyond the surface of the language and asked one more “why?”, which let us analyze the principle behind the phenomenon.
In short, the special methods of built-in types are implemented independently in C, and there is no call relationship between them at the Python level. So when you subclass a built-in type, an overridden special method affects only that method itself and not the behavior of other special methods.
If we misunderstand the relationship between special methods, we might think Python violates the basic inheritance principle that subclass methods take precedence over parent methods. (Unfortunately, both Fluent Python and the PyPy documentation seem to share this misconception.)
To meet the usual expectations of built-in types, Python provides extension classes such as UserString, UserList and UserDict in the standard library, so that programmers can conveniently inherit from these basic data types.
A closing note: this article belongs to the “Python Why?” series (produced by Python Cat). The series focuses on Python's syntax, design and evolution, taking one “why?” after another as its starting point to show Python's charm. If there are other topics you are interested in, you are welcome to fill in the “100,000 Whys of Python” questionnaire.