Still | 《 base 》
Before we start talking about weak references (weakref), let's first look at what a weak reference is and what exactly it does.
Suppose we have a multithreaded program that processes application data concurrently:
# Takes up a lot of resources; expensive to create and destroy
class Data:
    def __init__(self, key):
        pass
Each piece of application data, Data, is uniquely identified by a key, and the same data may be accessed by multiple threads at the same time. Because Data takes up a lot of system resources, it is expensive to create and destroy. We want the program to keep only one copy of each Data, and we do not want to create it again even when multiple threads access it at the same time.
So we try to design a caching middleware, Cacher:
import threading

# Data cache
class Cacher:
    def __init__(self):
        self.pool = {}
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data:
                return data
            self.pool[key] = data = Data(key)
            return data
Cacher uses a dict object internally to cache the Data copies it has created, and provides a get method for obtaining application data. The get method first looks up the cache dictionary. If the data already exists, it is returned directly; if not, a new one is created and saved in the dictionary. So once the data has been created for the first time it lives in the cache dictionary, and any other thread that accesses it at the same time uses the same cached copy.
It feels great! But the catch is: Cacher risks leaking resources!
Once a Data object is created, it is stored in the cache dictionary and is never released! In other words, the program's resources, such as memory, keep growing and are likely to blow up eventually. So we want each piece of data to be released automatically once no thread is accessing it any more.
We could maintain a reference count for each piece of data in Cacher, and have the get method increment it automatically. We would also provide a new remove method for releasing data: it first decrements the count, and deletes the data from the cache dictionary when the count drops to zero.
A thread calls get to obtain the data, and must call remove to release it when it is done. That amounts to Cacher implementing reference counting all over again, which is far too much trouble! Doesn't Python have a built-in garbage collection mechanism? Why should the application have to implement its own?
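To see how clumsy this is, here is a rough sketch of the manual counting scheme just described. The class name RefCountingCacher and the counts dictionary are hypothetical names chosen for illustration, and Data is a minimal stand-in for the expensive object:

```python
import threading


class Data:
    """Stand-in for the expensive application data."""

    def __init__(self, key):
        self.key = key


class RefCountingCacher:
    """Cache that counts outstanding users of each Data object by hand."""

    def __init__(self):
        self.pool = {}    # key -> Data
        self.counts = {}  # key -> number of get() calls not yet released
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data is None:
                data = self.pool[key] = Data(key)
                self.counts[key] = 0
            self.counts[key] += 1
            return data

    def remove(self, key):
        with self.lock:
            self.counts[key] -= 1
            if self.counts[key] == 0:
                # The last user is done: drop the cached copy.
                del self.counts[key]
                del self.pool[key]


cacher = RefCountingCacher()
a = cacher.get("k")
b = cacher.get("k")                   # second user shares the same copy
cacher.remove("k")                    # one user left, still cached
still_cached = "k" in cacher.pool
cacher.remove("k")                    # no users left, evicted
evicted = "k" not in cacher.pool
```

Every caller has to pair each get with exactly one remove; a single forgotten remove leaks the data, and an extra one evicts it while it is still in use.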
The crux of the problem is Cacher's cache dictionary: as middleware, it does not use the data objects itself, so in theory it should not hold references to them. But what kind of black magic can find the target object without holding a reference to it? After all, we know that assignment creates a reference!
This is where weak references (weakref) make their grand entrance! A weak reference is a special kind of object that can be associated with a target object without increasing the target's reference count.
# Create a piece of data
>>> d = Data('fasionchan.com')
>>> d
<__main__.Data object at 0x1018571f0>
# Create a weak reference to the data
>>> import weakref
>>> r = weakref.ref(d)
# Calling the weak reference object returns the object it points to
>>> r()
<__main__.Data object at 0x1018571f0>
>>> r() is d
True
# Delete the variable d; the Data object has no other references, so it is reclaimed
>>> del d
# Call the weak reference again: the target Data object is gone (returns None)
>>> r()
So all we need to do is change Cacher's cache dictionary to hold weak references, and the problem is solved!
import threading
import weakref

# Data cache
class Cacher:
    def __init__(self):
        self.pool = {}
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            r = self.pool.get(key)
            if r:
                data = r()
                if data:
                    return data
            data = Data(key)
            self.pool[key] = weakref.ref(data)
            return data
Because the cache dictionary only holds weak references to Data objects, Cacher does not affect their reference counts. When all threads have finished with a piece of data, its reference count drops to zero and it is released.
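We can check this claim directly. The sketch below assumes CPython, where sys.getrefcount reports an object's reference count; Data is again a minimal stand-in class:

```python
import sys
import weakref


class Data:
    """Stand-in for the expensive application data."""

    def __init__(self, key):
        self.key = key


d = Data("k")
before = sys.getrefcount(d)

r = weakref.ref(d)                 # weak reference: the count should not change
after_weak = sys.getrefcount(d)

alias = d                          # ordinary reference: the count goes up by one
after_strong = sys.getrefcount(d)
```

The absolute numbers are an implementation detail; what matters is that creating the weak reference leaves the count unchanged, while an ordinary assignment increases it.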
In fact, caching data objects in a dictionary is so common that the weakref module also provides two dictionary classes that store only weak references: weakref.WeakKeyDictionary, which holds its keys weakly, and weakref.WeakValueDictionary, which holds its values weakly.
So our data cache dictionary can be implemented with weakref.WeakValueDictionary, whose interface is the same as that of an ordinary dictionary. This way we no longer have to maintain the weak reference objects ourselves, and the code logic is more concise and clear:
import threading
import weakref

# Data cache
class Cacher:
    def __init__(self):
        self.pool = weakref.WeakValueDictionary()
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data:
                return data
            self.pool[key] = data = Data(key)
            return data
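A short usage sketch shows the cache emptying itself. It assumes CPython, where an object is reclaimed as soon as its last reference disappears, and uses a minimal stand-in for Data:

```python
import threading
import weakref


class Data:
    """Stand-in for the expensive application data."""

    def __init__(self, key):
        self.key = key


class Cacher:
    def __init__(self):
        self.pool = weakref.WeakValueDictionary()
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data is None:
                data = self.pool[key] = Data(key)
            return data


cacher = Cacher()
a = cacher.get("k")
b = cacher.get("k")
shared = a is b                           # both callers get the same copy
cached_while_used = len(cacher.pool)      # entry exists while someone holds it

del a, b                                  # the last users let go
cached_after_release = len(cacher.pool)   # entry has disappeared by itself
```

No remove method is needed: the entry vanishes from the dictionary once the last user drops the data.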
The weakref module also has many other useful utility classes and functions. Please refer to the official documentation for details; we will not go into them here.
So what exactly is a weak reference, and why does it have such magic? Next, let's lift its veil together and see the truth!
>>> d = Data('fasionchan.com')
# weakref.ref is a built-in type object
>>> from weakref import ref
>>> ref
<class 'weakref'>
# Calling the weakref.ref type object creates a weak reference instance object
>>> r = ref(d)
>>> r
<weakref at 0x1008d5b80; to 'Data' at 0x100873d60>
After the previous chapters, we are already familiar with reading the source code of built-in objects. The relevant source files are Include/weakrefobject.h and Objects/weakrefobject.c.
Let's first look at the field structure of the weak reference object, defined on lines 10-41 of the Include/weakrefobject.h header file:
typedef struct _PyWeakReference PyWeakReference;

/* PyWeakReference is the base struct for the Python ReferenceType, ProxyType,
 * and CallableProxyType.
 */
#ifndef Py_LIMITED_API
struct _PyWeakReference {
    PyObject_HEAD

    /* The object to which this is a weak reference, or Py_None if none.
     * Note that this is a stealth reference: wr_object's refcount is
     * not incremented to reflect this pointer.
     */
    PyObject *wr_object;

    /* A callable to invoke when wr_object dies, or NULL if none. */
    PyObject *wr_callback;

    /* A cache for wr_object's hash code. As usual for hashes, this is -1
     * if the hash code isn't known yet.
     */
    Py_hash_t hash;

    /* If wr_object is weakly referenced, wr_object has a doubly-linked NULL-
     * terminated list of weak references to it. These are the list pointers.
     * If wr_object goes away, wr_object is set to Py_None, and these pointers
     * have no meaning then.
     */
    PyWeakReference *wr_prev;
    PyWeakReference *wr_next;
};
#endif
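As we can see, the PyWeakReference struct is the body of a weak reference object. It is a fixed-length object that, besides the fixed header, has 5 fields: wr_object, wr_callback, hash, wr_prev and wr_next.
Combined with the comments in the code, we know the following. wr_object points to the referenced object without incrementing its reference count, and is set to None once the object is destroyed. wr_callback is an optional callable to invoke when the referenced object dies. hash caches the referenced object's hash code. wr_prev and wr_next link all weak references to the same object into a doubly linked list.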
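Several of these fields can be observed from Python. The sketch below uses only documented weakref functions and attributes (getweakrefcount, getweakrefs, __callback__); the Data class and the on_dead callback are placeholders for illustration:

```python
import weakref


class Data:
    """Stand-in for the expensive application data."""

    def __init__(self, key):
        self.key = key


def on_dead(ref):
    """Placeholder callback; it does nothing in this sketch."""


d = Data("k")
r1 = weakref.ref(d)             # no callback
r2 = weakref.ref(d, on_dead)    # with a callback

# The per-object list that wr_prev / wr_next thread together:
count = weakref.getweakrefcount(d)
refs = weakref.getweakrefs(d)

# wr_callback is visible as the read-only __callback__ attribute.
callback = r2.__callback__

# A live weak reference hashes like its referent (the value cached in hash).
same_hash = hash(r1) == hash(d)
```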
So the working principle of weak references is really just the Observer design pattern. When an object is destroyed, all of its weak reference objects are notified and handle the event appropriately.
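The observer behavior is easy to watch in action. In this sketch (assuming CPython, where del reclaims the object immediately), make_observer is a hypothetical helper that builds a named callback, and Data is a stand-in class:

```python
import weakref


class Data:
    """Stand-in for the expensive application data."""

    def __init__(self, key):
        self.key = key


notified = []


def make_observer(name):
    """Build a callback that records that the observer `name` was notified."""
    def callback(ref):
        notified.append(name)
    return callback


d = Data("k")
# Keep the weak reference objects alive; a weak reference that has itself
# been destroyed can no longer be notified.
observers = [weakref.ref(d, make_observer(name)) for name in ("a", "b", "c")]

del d   # destroying the subject notifies every registered observer
```

After the del, each of the three callbacks has run once and every weak reference returns None.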
Mastering the basic principle of weak references is enough for us to use them well. If you are interested in the source code, you can also dig into some of the implementation details.
As mentioned earlier, all weak references to the same object are organized into a doubly linked list, and the head of the list is stored in the object itself. Because many different object types can be weakly referenced, it is hard to represent them all with one fixed structure. Python therefore provides a field in the type object, tp_weaklistoffset, which records the offset of the weak reference list head pointer within the instance object.
As a result, for any object o, we only need to follow its ob_type field to find its type object t, and then use the tp_weaklistoffset field in t to locate the head of o's weak reference list.
Python provides two macro definitions in the Include/objimpl.h header file:
/* Test if a type supports weak references */
#define PyType_SUPPORTS_WEAKREFS(t) ((t)->tp_weaklistoffset > 0)
#define PyObject_GET_WEAKREFS_LISTPTR(o) \
((PyObject **) (((char *) (o)) + Py_TYPE(o)->tp_weaklistoffset))
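The tp_weaklistoffset field is exposed to Python code as the __weakrefoffset__ attribute of a type, so we can check which types support weak references. The sketch below only tests whether the value is nonzero, since the exact number is a CPython implementation detail that varies between versions; the Slotted and SlottedWeak classes and the supports_weakref helper are made up for illustration:

```python
import weakref


class Data:
    """Ordinary class: instances get a weak reference list slot."""

    def __init__(self, key):
        self.key = key


class Slotted:
    __slots__ = ("key",)                 # no "__weakref__" slot


class SlottedWeak:
    __slots__ = ("key", "__weakref__")   # explicitly asks for the slot


offsets = {
    cls.__name__: cls.__weakrefoffset__
    for cls in (Data, Slotted, SlottedWeak, int)
}


def supports_weakref(obj):
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True
```

Types whose offset is zero, such as int or a __slots__ class without a __weakref__ slot, reject weakref.ref with a TypeError.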
When we create a weak reference, we call the weak reference type object weakref and pass in the referenced object d as an argument. The weak reference type object weakref is the type of all weak reference instance objects. It is a globally unique type object, defined in Objects/weakrefobject.c as _PyWeakref_RefType (line 350).
From what we learned about the object model, when Python calls an object it executes the tp_call function of that object's type. So when we call the weak reference type object weakref, what runs is the tp_call function of weakref's own type, that is, of type. That tp_call function in turn calls weakref's tp_new and tp_init functions: tp_new allocates memory for the instance object, and tp_init is responsible for initializing it.
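The same two-step sequence can be seen with any Python class, where tp_new and tp_init correspond to __new__ and __init__. Tracked is a throwaway class used only to record the call order:

```python
calls = []


class Tracked:
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")       # step 1: allocate the instance
        return super().__new__(cls)

    def __init__(self, key):
        calls.append("__init__")      # step 2: initialize the instance
        self.key = key


# Calling the class runs type's tp_call, which invokes __new__ and then __init__.
t = Tracked("k")
```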
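Going back to the Objects/weakrefobject.c source file, we can see that the tp_new field of _PyWeakref_RefType is initialized to weakref___new__ (line 276). In essence, this function parses the arguments, locates the target object's weak reference list through tp_weaklistoffset, creates the weak reference object, and inserts it into that list.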
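One visible side effect of this function, as far as CPython's current behavior goes, is that a plain weak reference without a callback is reused rather than created again, while references with callbacks are always new objects. The sketch below relies on that implementation detail; Data is a stand-in class:

```python
import weakref


class Data:
    """Stand-in for the expensive application data."""

    def __init__(self, key):
        self.key = key


d = Data("k")

r1 = weakref.ref(d)
r2 = weakref.ref(d)
reused = r1 is r2                  # the existing callback-free reference is returned

r3 = weakref.ref(d, lambda ref: None)
r4 = weakref.ref(d, lambda ref: None)
distinct = r3 is not r4 and r3 is not r1

count = weakref.getweakrefcount(d)   # r1 (same object as r2), r3 and r4
```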
When an object is reclaimed, its tp_dealloc function calls PyObject_ClearWeakRefs to clean up its weak references. This function takes the object's weak reference list and walks through it one by one, clearing each wr_object field and executing the wr_callback callback function (if there is one). We will not expand on the details here; if you are interested, you can read the source code yourself in Objects/weakrefobject.c, at line 880.
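The order of these two steps is visible from Python: by the time a callback runs, the weak reference passed to it has already been cleared. A small sketch, assuming CPython's immediate reclamation on del, with Data and on_dead as illustrative names:

```python
import weakref


class Data:
    """Stand-in for the expensive application data."""

    def __init__(self, key):
        self.key = key


seen = []


def on_dead(ref):
    # The callback receives the weak reference itself; its target is
    # already gone, so calling it returns None.
    seen.append(ref() is None)


d = Data("k")
r = weakref.ref(d, on_dead)
count_before = weakref.getweakrefcount(d)

del d   # triggers the cleanup: clear the reference, then run the callback
```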
Okay, after this section we have a thorough grasp of weak references. Weak references let us manage target objects without adding to their reference counts, and are commonly used in frameworks and middleware. They look like magic, but the design principle is just the very simple observer pattern. Once a weak reference object is created, it is inserted into a linked list maintained by the target object, where it observes (subscribes to) the object's destruction event.