The magic and inner workings of weak references in Python

Python cat 2021-10-28 17:59:04


Before diving into weak references (weakref), let's first look at what a weak reference is and what exactly it does.

Suppose we have a multithreaded program that processes application data concurrently:

# Takes up a lot of resources; expensive to create and destroy
class Data:
    def __init__(self, key):
        pass

Each piece of application data, Data, is uniquely identified by a key, and the same data may be accessed by several threads at once. Because Data takes up a lot of system resources and is expensive to create and destroy, we want the program to keep only one copy of each Data, and we do not want to create it again even when several threads access it at the same time.

So we try to design a caching middleware, Cacher:

import threading
#  Data caching
class Cacher:
    def __init__(self):
        self.pool = {}
        self.lock = threading.Lock()
    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data:
                return data
            self.pool[key] = data = Data(key)
            return data

Cacher uses a dict object internally to cache the Data copies it has created, and provides a get method for obtaining application data. The get method first looks the key up in the cache dictionary. If the data already exists, it is returned directly; if not, a new one is created and saved in the dictionary. So once the data has been created for the first time it lives in the cache dictionary, and any other thread that asks for it at the same time gets the same cached copy.

It feels great! But there is a fly in the ointment: Cacher can leak resources!

Once a Data is created it is stored in the cache dictionary and is never released! In other words, the program's resources, such as memory, keep growing and are likely to blow up eventually. What we want is for a piece of data to be released automatically once no thread is using it any more.
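The leak is easy to observe. The following is only a small sketch: the Data body (storing the key) and the key names are made up for illustration, and Cacher is the dict-based version shown earlier.

```python
import threading


class Data:
    def __init__(self, key):
        self.key = key  # illustrative body; the article leaves it unspecified


class Cacher:
    def __init__(self):
        self.pool = {}
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data:
                return data
            self.pool[key] = data = Data(key)
            return data


cacher = Cacher()
for i in range(1000):
    data = cacher.get(f"key-{i}")  # use the data briefly ...
del data                           # ... then drop our last reference

# Nobody outside the cache uses these objects any more,
# yet every one of them is still held by the cache dictionary.
leaked = len(cacher.pool)
print(leaked)
```

Every object that was ever requested stays in cacher.pool, because the dictionary itself holds a strong reference to it.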

We could maintain a reference count for each piece of data inside Cacher, with the get method incrementing the count automatically. We would also provide a new remove method for releasing data: it first decrements the count, and deletes the data from the cache dictionary when the count drops to zero.

A thread calls get to obtain the data and must call remove to release it when it is done. Cacher would in effect be reimplementing reference counting all over again, which is far too much trouble! Doesn't Python already have a built-in garbage collection mechanism? Why should the application have to implement its own?
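To make the cost of that approach concrete, here is a rough sketch of what the manual bookkeeping might look like. The class name CountingCacher, the [data, count] entry layout and the Data body are assumptions made for illustration; the article itself does not show this code.

```python
import threading


class Data:
    def __init__(self, key):
        self.key = key  # illustrative body


class CountingCacher:
    """Hypothetical Cacher variant that counts users by hand."""

    def __init__(self):
        self.pool = {}  # key -> [data, number of outstanding get() calls]
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            entry = self.pool.get(key)
            if entry is None:
                entry = self.pool[key] = [Data(key), 0]
            entry[1] += 1  # one more user of this data
            return entry[0]

    def remove(self, key):
        with self.lock:
            entry = self.pool[key]
            entry[1] -= 1  # one user is done
            if entry[1] == 0:
                del self.pool[key]  # last user gone: drop the cached copy


cacher = CountingCacher()
a = cacher.get("k")
b = cacher.get("k")
cacher.remove("k")              # one user left, entry stays
still_cached = "k" in cacher.pool
cacher.remove("k")              # no users left, entry is deleted
gone = "k" not in cacher.pool
```

Every caller now has to pair each get with exactly one remove; a single forgotten remove leaks the entry for good.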

The crux of the conflict is Cacher's cache dictionary. As middleware, Cacher does not use the data objects itself, so in theory it should not hold a reference to them. But what kind of black magic can find a target object without holding a reference to it? After all, assignment always creates a reference!

Features and usage

This is where weak references (weakref) make their grand entrance! A weak reference is a special kind of object that can be associated with a target object without creating a reference to it.

# Create a piece of data
>>> d = Data('fasionchan.com')
>>> d
<__main__.Data object at 0x1018571f0>

# Create a weak reference to the data
>>> import weakref
>>> r = weakref.ref(d)

# Calling the weak reference object returns the object it points to
>>> r()
<__main__.Data object at 0x1018571f0>
>>> r() is d
True

# Delete the temporary variable d; the Data object has no other references, so it is reclaimed
>>> del d
# Call the weak reference again: the target Data object is gone (None is returned)
>>> r()

So all we need to do is change Cacher's cache dictionary to hold weak references, and the problem is solved!

import threading
import weakref
#  Data caching
class Cacher:
    def __init__(self):
        self.pool = {}
        self.lock = threading.Lock()
    def get(self, key):
        with self.lock:
            r = self.pool.get(key)
            if r:
                data = r()
                if data:
                    return data
            data = Data(key)
            self.pool[key] = weakref.ref(data)
            return data

Because the cache dictionary holds only weak references to Data objects, Cacher does not affect their reference counts. Once every thread has finished with a piece of data, its reference count drops to zero and it is released.
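The claim about reference counts can be checked with sys.getrefcount. This is a sketch that assumes CPython, where that function reports the reference count; the Data body is illustrative, and only the relative change matters, not the absolute numbers.

```python
import sys
import weakref


class Data:
    def __init__(self, key):
        self.key = key  # illustrative body


d = Data("fasionchan.com")

before = sys.getrefcount(d)
r = weakref.ref(d)             # weak reference: count should not change
after_weak = sys.getrefcount(d)

alias = d                      # ordinary assignment: count goes up by one
after_strong = sys.getrefcount(d)

print(before, after_weak, after_strong)
```

Creating the weak reference leaves the count untouched, while the plain assignment raises it by one.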

In fact, caching data objects in a dictionary is so common that the weakref module also provides two dictionary classes that hold only weak references:

  • weakref.WeakKeyDictionary, a mapping class whose keys are held only by weak reference (once a key has no strong references left, its key-value entry disappears automatically);
  • weakref.WeakValueDictionary, a mapping class whose values are held only by weak reference (once a value has no strong references left, its key-value entry disappears automatically);
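A short sketch of how the two mappings behave. The Data body and the dictionary contents are made up for illustration. The gc.collect() calls are not needed on CPython, where objects are freed as soon as their reference count reaches zero, but they are harmless and make the example less dependent on the interpreter.

```python
import gc
import weakref


class Data:
    def __init__(self, key):
        self.key = key  # illustrative body


# Keys held weakly: handy for attaching extra information to an object.
notes = weakref.WeakKeyDictionary()
k = Data("as-key")
notes[k] = "some note about k"
notes_while_alive = len(notes)
del k
gc.collect()
notes_after_del = len(notes)

# Values held weakly: handy for caches.
cache = weakref.WeakValueDictionary()
v = Data("as-value")
cache["as-value"] = v
cache_while_alive = len(cache)
del v
gc.collect()
cache_after_del = len(cache)

print(notes_while_alive, notes_after_del, cache_while_alive, cache_after_del)
```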

Our data cache dictionary can therefore be implemented with weakref.WeakValueDictionary, whose interface is the same as an ordinary dictionary's. This way we no longer have to maintain the weak reference objects ourselves, and the code is more concise and clear:

import threading
import weakref
#  Data caching
class Cacher:
    def __init__(self):
        self.pool = weakref.WeakValueDictionary()
        self.lock = threading.Lock()
    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data:
                return data
            self.pool[key] = data = Data(key)
            return data
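As a quick usage sketch of this version, the snippet below lets several threads fetch the same key and then drops all of their references. The worker function, the seen list and the Data body are assumptions added for the demonstration; the Cacher class repeats the one above so that the snippet runs on its own.

```python
import gc
import threading
import weakref


class Data:
    def __init__(self, key):
        self.key = key  # illustrative body


class Cacher:
    def __init__(self):
        self.pool = weakref.WeakValueDictionary()
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            data = self.pool.get(key)
            if data:
                return data
            self.pool[key] = data = Data(key)
            return data


cacher = Cacher()
seen = []  # strong references held on behalf of the worker threads


def worker():
    seen.append(cacher.get("shared"))


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

all_same = all(obj is seen[0] for obj in seen)  # every thread got one copy
cached_while_used = len(cacher.pool)

seen.clear()   # the "threads" are done with the data
gc.collect()   # not required on CPython, harmless elsewhere
cached_after_release = len(cacher.pool)

print(all_same, cached_while_used, cached_after_release)
```

No remove method is needed: the entry disappears from the pool by itself once the last user lets go of the data.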

The weakref module also offers many other useful utility classes and functions. Please refer to the official documentation for details; we won't go into them here.

How it works

So what exactly is a weak reference, and where does its magic come from? Let's lift the veil together and see the truth!

>>> d = Data('fasionchan.com')

# weakref.ref is a built-in type object
>>> from weakref import ref
>>> ref
<class 'weakref'>

# Calling the weakref.ref type object creates a weak reference instance
>>> r = ref(d)
>>> r
<weakref at 0x1008d5b80; to 'Data' at 0x100873d60>

Having worked through the previous chapters, we are already used to reading the source code of built-in objects. The relevant source files are:

  • Include/weakrefobject.h, the header file containing the object structure and some macro definitions;
  • Objects/weakrefobject.c, the source file containing the weak reference type objects and their method definitions;

Let's start with the field layout of the weak reference object, defined on lines 10-41 of the Include/weakrefobject.h header file:

typedef struct _PyWeakReference PyWeakReference;

/* PyWeakReference is the base struct for the Python ReferenceType, ProxyType,
 * and CallableProxyType.
 */
#ifndef Py_LIMITED_API
struct _PyWeakReference {
    PyObject_HEAD

    /* The object to which this is a weak reference, or Py_None if none.
     * Note that this is a stealth reference:  wr_object's refcount is
     * not incremented to reflect this pointer.
     */
    PyObject *wr_object;

    /* A callable to invoke when wr_object dies, or NULL if none. */
    PyObject *wr_callback;

    /* A cache for wr_object's hash code.  As usual for hashes, this is -1
     * if the hash code isn't known yet.
     */
    Py_hash_t hash;

    /* If wr_object is weakly referenced, wr_object has a doubly-linked NULL-
     * terminated list of weak references to it.  These are the list pointers.
     * If wr_object goes away, wr_object is set to Py_None, and these pointers
     * have no meaning then.
     */
    PyWeakReference *wr_prev;
    PyWeakReference *wr_next;
};
#endif

As we can see, the PyWeakReference struct is the body of a weak reference object. It is a fixed-length object with 5 fields in addition to the fixed object header:

  • wr_object, an object pointer to the referenced object; the weak reference finds the referenced object through this field without creating a reference to it;
  • wr_callback, which points to a callable object that is invoked when the referenced object is destroyed;
  • hash, which caches the hash value of the referenced object;
  • wr_prev and wr_next, the backward and forward pointers used to organize weak reference objects into a doubly linked list;

Combined with the comments in the code, we know that:

  • A weak reference object is associated with the referenced object through its wr_object field, shown as the dotted arrow in the figure above;
  • An object can be associated with several weak reference objects at the same time; in the figure, the Data instance is associated with two weak reference objects;
  • All weak references associated with the same object are organized into a doubly linked list whose head is stored in the referenced object, shown as the solid arrow in the figure above;
  • When an object is destroyed, Python walks its weak reference list and processes the entries one by one:
    • it sets the wr_object field to None, so that calling the weak reference afterwards returns None and the caller knows the object has been destroyed;
    • it executes the callback wr_callback (if there is one);

So the working principle of weak references is really the Observer pattern from design patterns. When an object is destroyed, all of its weak reference objects are notified and handled appropriately.
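The observer behaviour can be seen from Python by registering callbacks. The sketch below uses an illustrative Data body and a made-up events list; weakref.ref(obj, callback) and weakref.getweakrefcount are part of the standard weakref module. The order in which the callbacks run is deliberately not relied on here.

```python
import gc
import weakref


class Data:
    def __init__(self, key):
        self.key = key  # illustrative body


events = []  # records which observers were notified

d = Data("fasionchan.com")
# Two weak references to the same object, each with its own callback.
# A callback receives the (now dead) weak reference object as its argument.
r1 = weakref.ref(d, lambda ref: events.append("r1 notified"))
r2 = weakref.ref(d, lambda ref: events.append("r2 notified"))

observers = weakref.getweakrefcount(d)  # weak references attached to d

del d          # destroy the observed object
gc.collect()   # not required on CPython, harmless elsewhere

both_cleared = r1() is None and r2() is None
print(observers, both_cleared, sorted(events))
```

Both weak references are cleared and both callbacks fire once the Data object is destroyed, which matches the two steps in the list above.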

Implementation details

Understanding the basic principle of weak references is enough to use them well. If you are interested in the source code, you can also dig into some of the implementation details.

As mentioned earlier, all weak references to the same object are organized into a doubly linked list whose head is stored in the object. Because many different types of object can be weakly referenced, a single fixed structure cannot describe where that head lives. Python therefore provides a tp_weaklistoffset field in the type object, which records the offset of the weak reference list head pointer within the instance object.

As a result, for any object o, we only need to follow its ob_type field to find its type object t, and then use the tp_weaklistoffset field of t to locate the weak reference list head of o.

Python provides two macro definitions in the Include/objimpl.h header file:

/* Test if a type supports weak references */
#define PyType_SUPPORTS_WEAKREFS(t) ((t)->tp_weaklistoffset > 0)

#define PyObject_GET_WEAKREFS_LISTPTR(o) \
    ((PyObject **) (((char *) (o)) + Py_TYPE(o)->tp_weaklistoffset))

  • PyType_SUPPORTS_WEAKREFS determines whether a type object supports weak references. Weak references are supported only when tp_weaklistoffset is greater than zero; built-in objects such as list do not support them;
  • PyObject_GET_WEAKREFS_LISTPTR fetches the weak reference list head of an object. It first finds the type object t through the Py_TYPE macro, then reads the offset from the tp_weaklistoffset field, and finally adds the offset to the object's address to obtain the address of the list head field;
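From Python, the practical effect of this check is that weakref.ref raises TypeError for objects whose type does not support weak references. The sketch below assumes CPython; the helper supports_weakref and the SlottedData class are made up for illustration. CPython also exposes the offset as the __weakrefoffset__ attribute of type objects, which is 0 for list; its exact value for other classes differs between versions, so it is only printed, not relied on.

```python
import weakref


class Data:
    def __init__(self, key):
        self.key = key  # illustrative body


class SlottedData:
    # __slots__ without "__weakref__" leaves no room for the list head
    __slots__ = ("key",)

    def __init__(self, key):
        self.key = key


def supports_weakref(obj):
    """Hypothetical helper: True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
    except TypeError:
        return False
    return True


results = {
    "Data": supports_weakref(Data("a")),
    "SlottedData": supports_weakref(SlottedData("a")),
    "list": supports_weakref([]),
}
print(results)
print(list.__weakrefoffset__, Data.__weakrefoffset__)
```

Adding "__weakref__" to __slots__ is the usual way to make a slotted class weakly referenceable again.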

When we create a weak reference, we call the weak reference type object weakref and pass in the referenced object d as an argument. The weak reference type object weakref is the type of every weak reference instance and is globally unique. It is defined in Objects/weakrefobject.c as _PyWeakref_RefType (line 350).

From what we learned about the object model, when Python calls an object it executes the tp_call function of that object's type. So when we call the weak reference type object weakref, what runs is the tp_call function of weakref's own type, which is type. That tp_call function in turn calls weakref's tp_new and tp_init functions: tp_new allocates memory for the instance object, and tp_init initializes it.

Back in the Objects/weakrefobject.c source file, we can see that the tp_new field of _PyWeakref_RefType is initialized to weakref___new__ (line 276). The main logic of this function is as follows:

  1. Parse the arguments to obtain the referenced object (line 282);
  2. Call the PyType_SUPPORTS_WEAKREFS macro to check whether the referenced object supports weak references, and raise an exception if it does not (line 286);
  3. Call the GET_WEAKREFS_LISTPTR macro to fetch the object's weak reference list head field; to make insertion easier, a pointer to the pointer is returned (line 294);
  4. Call get_basic_refs to fetch the basic weak reference object at the front of the list, that is, the one whose callback is empty (if there is one, line 295);
  5. If callback is empty and the object already has a basic weak reference with an empty callback, reuse that instance and return it directly (line 296);
  6. If nothing can be reused, call the tp_alloc function to allocate memory, initialize the fields, and insert the new object into the object's weak reference list (line 309);
    • If callback is empty, insert it at the very front of the list so that it can be reused later (see step 4);
    • If callback is not empty, insert it after the basic weak reference object (if there is one), so that the basic weak reference stays at the head of the list and is easy to find;
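The reuse and ordering rules in steps 4 to 6 can be observed from Python. This is a sketch of CPython behaviour, which is an implementation detail rather than a language guarantee; the Data body and the on_dead callback are made up for illustration.

```python
import weakref


class Data:
    def __init__(self, key):
        self.key = key  # illustrative body


def on_dead(ref):
    pass  # placeholder callback


d = Data("fasionchan.com")

with_cb = weakref.ref(d, on_dead)   # has a callback: never reused
basic_1 = weakref.ref(d)            # basic weak reference (no callback)
basic_2 = weakref.ref(d)            # same target, no callback: reused
other_cb = weakref.ref(d, on_dead)  # has a callback: a new object again

reused = basic_1 is basic_2
distinct = with_cb is not other_cb
total = weakref.getweakrefcount(d)

# getweakrefs walks the linked list from its head; the basic weak
# reference sits at the front even though with_cb was created first.
head_is_basic = weakref.getweakrefs(d)[0] is basic_1

print(reused, distinct, total, head_is_basic)
```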

When an object is reclaimed, its tp_dealloc function calls PyObject_ClearWeakRefs to clean up its weak references. This function takes the object's weak reference list and walks it entry by entry, clearing the wr_object field and executing the wr_callback callback (if there is one). We won't go into the details here; if you are interested, see the source code in Objects/weakrefobject.c, at line 880.

That wraps up this section, and we now have a thorough grasp of weak references. A weak reference lets us manage a target object without adding to its reference count, which makes it common in frameworks and middleware. Weak references look magical, but the design behind them is simply the Observer pattern: once created, a weak reference object is inserted into a linked list maintained by the target object, and observes (subscribes to) the object's destruction event.

This article was written by [Python cat]. Please include a link to the original when reposting. Thank you.
