Python's small data pool and code-block caching mechanism

The end of the world 2021-04-06 19:54:44


Preface

Apart from the Summary, the rest of this article records how I worked things out, and reading it is not recommended. Tested on Python 3.7.5. I have not been able to find official documentation for this behaviour; if you know where it is, please leave a comment. Thank you!

Summary:

Within the same code block, the code-block caching mechanism applies (this part is easy to understand);
Across different code blocks, the small data pool (interning) mechanism applies (it can be thought of as an ancestor shared by all code blocks?);
Note that in interactive mode, each command is its own code block;

Interning is simple to implement: the interpreter maintains a string pool, which is a dictionary. At compile time, if a string already exists in the pool, no new string is created and the existing object is returned;
if it has not been added to the pool yet, a string object is constructed first and then added to the pool, ready for next time;

Strings of length 0 and 1 are always interned;
String interning happens at compile time;
Only strings made up of ASCII letters, digits and underscores are interned automatically;
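The last rule above can be written down as a small check. This is only a sketch: looks_like_identifier is a hypothetical helper, not a CPython API, and it mirrors the rule as stated here rather than the interpreter's actual code path.

```python
import re

# Hypothetical helper: True when every character is an ASCII letter,
# digit, or underscore (the empty string also passes).
_NAME_CHARS = re.compile(r"[A-Za-z0-9_]*")


def looks_like_identifier(s: str) -> bool:
    return _NAME_CHARS.fullmatch(s) is not None


for candidate in ["a_b_c", "hello", "a b_c", "some_thing!", ""]:
    print(repr(candidate), looks_like_identifier(candidate))
```

Strings for which the helper returns False, such as 'a b_c', are the ones that compare as different objects in the interactive examples later in this article.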

1. Code-block caching mechanism

A Python program is constructed from code blocks. A block is a piece of Python program text that is executed as a unit.
Code block: a module, a function body, a class definition, and a script file are each a code block;
Interactive mode: after entering the Python interpreter from cmd, each command you type is a code block;

When Python compiles a code block, it checks whether a constant with the same value already exists in that block and, if so, reuses it instead of creating a new object;
Objects covered by the code-block cache exist only once in memory, i.e. they have the same id;
The code-block cache applies to: int (and float), str, bool;

int (float): any number can be reused within the same code block;
bool: True and False behave like 1 and 0 when used as dictionary keys, and are always reused;
str: within the same code block, only one string with a given value exists in memory:

# Run as a file, so everything below is one code block
s1 = 'janes@!#*ewq'
s2 = 'janes@!#*ewq'
print(s1 is s2) # True
a1 = 'janes45613256132!@#$%#^%@$%' * 1
b1 = 'janes45613256132!@#$%#^%@$%' * 1
print(a1 is b1) # True
s1 = 'hah_' * 6
s2 = 'hah_' * 6
print(s1 is s2) # True
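To see the effect of code-block boundaries without switching between a file and the interactive prompt, the sketch below compiles source text explicitly with the built-in compile() and exec(). The helper names same_block_shares and separate_blocks_share are made up for this illustration, and the results describe CPython behaviour, not a language guarantee.

```python
def same_block_shares(literal: str) -> bool:
    """Compile two assignments of the same literal as ONE code block."""
    ns = {}
    exec(compile(f"x = {literal}\ny = {literal}\n", "<one-block>", "exec"), ns)
    return ns["x"] is ns["y"]


def separate_blocks_share(literal: str) -> bool:
    """Compile each assignment on its own, like two interactive commands."""
    ns = {}
    exec(compile(f"x = {literal}\n", "<block-1>", "exec"), ns)
    exec(compile(f"y = {literal}\n", "<block-2>", "exec"), ns)
    return ns["x"] is ns["y"]


for lit in ["1000", "3.14", "'janes@!#*ewq'"]:
    print(lit, same_block_shares(lit), separate_blocks_share(lit))
```

On CPython the first result is expected to be True for all three literals. The second result depends on the literal and the interpreter version (for 1000 and 3.14 it is typically False).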

2. Small data pools

Python caches the integers from -5 to 256 automatically. When you assign one of these integers to a variable, no new object is created; the cached object is used instead;
Strings that satisfy certain rules are interned in the string pool when first created. When you assign such a string to a variable, no new object is created; the object already in the intern pool is used;
bool has only the values True and False; no matter how many variables point to True or False, there is only one of each in memory;
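The bool singletons, and the earlier remark that True and False behave like 1 and 0 as dictionary keys, can be checked directly. A short sketch with illustrative variable names:

```python
# True and False are singletons: every expression that yields a bool
# returns one of these two objects.
flags = [bool(1), 2 > 1, not 0, bool("x")]
print(all(flag is True for flag in flags))

# bool is a subclass of int, so True == 1 and False == 0 and they hash
# alike; as dictionary keys, True and 1 address the same entry.
d = {True: "yes", 1: "one", False: "no", 0: "zero"}
print(len(d), d[True], d[False])
```

The dictionary ends up with two entries because the later keys 1 and 0 overwrite the values stored under True and False.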

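The small data pool applies to int, str and bool; floats are not pooled across code blocks, as the interactive float example later in this article shows;
The small data pool is the caching mechanism that works across different code blocks;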

# cmd (Python 3.7.5): the small integers -5~256 are not in the same code block here, but the small data pool still applies to them
>>> a = 245
>>> b = 245
>>> a is b  # True
# Strings of length 0 and 1 are always interned;
# String interning happens at compile time;
# Only strings made up of ASCII letters, digits and underscores are interned automatically;
>>> s1 = '@'
>>> s2 = '@'
>>> s1 is s2  # True
>>> s1 = ''
>>> s2 = ''
>>> s1 is s2  # True
>>> s1 = 'a_b_c'
>>> s2 = 'a_b_c'
>>> s1 is s2  # True
>>> s1 = 'a b_c'
>>> s2 = 'a b_c'
>>> s1 is s2  # False
>>> s1 = 'a_b_c' * 1
>>> s2 = 'a_b_c' * 1
>>> s1 is s2  # True
>>> s1 = 'abd_d23' * 3
>>> s2 = 'abd_d23' * 3
>>> s1 is s2  # True
>>> a, b = "some_thing!", "some_thing!"
>>> a is b  # False
>>> a, b = "some_thing", "some_thing"
>>> a is b  # True
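# Run as a file (Python 3.7.5): the module and each class body are code blocks
a1 = 1000
b1 = 1000
print(a1 is b1)  # True
class C1(object):
    a = 100
    b = 100
    c = 1000
    d = 1000
class C2(object):
    a = 100
    b = 1000
print(C1.a is C1.b)  # True
print(C1.a is C2.a)  # True
print(C1.c is C1.d)  # True
print(C1.c is C2.b)  # False on 3.7.5 (separate class bodies); newer CPython releases may share constants across the code blocks of one file and print True

3. Advantages and disadvantages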

Advantage: identical strings (such as identifiers) are taken directly from the pool, which avoids frequent creation and destruction, improves efficiency, and saves memory;

Disadvantage: string concatenation and string modification carry a performance cost;
Because strings are immutable, modifying one is not an in-place operation; a new object has to be created. This is also why join() is recommended over + when concatenating many strings;
join() first computes the total length of all the strings, then copies them one by one, so only one new object is created;
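The two approaches side by side, as a small sketch with made-up data. No timing is shown, only that both produce the same result:

```python
parts = ["small", "data", "pool"]

# Repeated +: conceptually every step produces a new string that holds
# everything joined so far. (CPython can sometimes resize in place, but
# that is an optimisation and should not be relied on.)
joined_with_plus = ""
for part in parts:
    joined_with_plus = joined_with_plus + part

# join(): the result is sized once and the pieces are copied into it.
joined_with_join = "".join(parts)

print(joined_with_plus == joined_with_join)
```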

Small integer object pool

To avoid repeatedly allocating and freeing memory for integers, Python uses a small integer object pool. Python defines small integers as the range [-5, 256]; these integer objects are created in advance and are never garbage-collected;
Within one Python program, wherever such an integer appears in the LEGB scopes, all integers in this range use the same object;

# Python 3.7.5, IPython 7.18.1
a = -5
b = -5
a is b # True
a = -6
b = -6
a is b # False
a = 256
b = 256
a is b # True
a = 257
b = 257
a is b # False
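Because literals inside one code block can share an object for other reasons, a cleaner probe of the small integer pool builds the integers at run time. A minimal sketch, assuming CPython; is_cached is a hypothetical helper name:

```python
def is_cached(n: int) -> bool:
    """Build n twice at run time and report whether both are one object."""
    first = int(str(n))
    second = int(str(n))
    return first is second


# Values around both ends of the documented [-5, 256] range.
print([n for n in (-6, -5, 0, 256, 257) if is_cached(n)])
```

On the 3.7.5 setup used in this article, only the values inside [-5, 256] are expected to appear in the printed list.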

Large integer object pool

In a cmd terminal every command is its own code block, so each assignment of a large integer creates a new object. In PyCharm, each run loads the whole file into memory as one unit, so equal large integers within one code block are the same object. This is the code-block cache described in section 1, not a pool of pre-built objects like the small integer pool;
c and d are in the same code block, while C1.c and C2.b each sit in their own code block (every class body is one), so they are not the same object;

# cmd terminal
a = 1000
b = 1000
a is b # False
--------------------
class C1(object): 
   a = 100
   b = 100
   c = 1000
   d = 1000
class C2(object):
   a = 100
   b = 1000
print(C1.a is C1.b)  # True
print(C1.a is C2.a)  # True
print(C1.c is C1.d)  # True: even in cmd, a class body is a single code block, so c and d share one constant
print(C1.c is C2.b)  # False: C2 was entered as a separate command, so it is a separate code block
# PyCharm and other editors (the whole file is run at once)
a = 1000
b = 1000
print(a is b)  # True
--------------------
class C1(object): 
   a = 100
   b = 100
   c = 1000
   d = 1000
class C2(object):
   a = 100
   b = 1000
print(C1.a is C1.b)  # True
print(C1.a is C2.a)  # True
print(C1.c is C1.d)  # True
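print(C1.c is C2.b)  # False on 3.7.5; newer CPython releases may share constants across the code blocks of one file and print True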
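One way to look at the code blocks directly is to compile the source and inspect the constants of each class body. This is a sketch that assumes CPython's code-object attributes (co_consts, co_name); SOURCE and count_1000 are names invented for the example.

```python
import types

SOURCE = """
class C1:
    c = 1000
    d = 1000

class C2:
    b = 1000
"""

module_code = compile(SOURCE, "<demo>", "exec")

# Each class body is compiled to its own code object, stored among the
# module's constants.
class_bodies = {
    const.co_name: const
    for const in module_code.co_consts
    if isinstance(const, types.CodeType)
}


def count_1000(code: types.CodeType) -> int:
    """How many int constants equal to 1000 does this code block hold?"""
    return sum(1 for c in code.co_consts if type(c) is int and c == 1000)


for name, body in class_bodies.items():
    print(name, count_1000(body))
```

The body of C1 holds a single 1000 constant even though the literal appears twice, which is why C1.c is C1.d is True. Whether C1 and C2 end up sharing that object is a separate question that depends on the interpreter version.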

String interning mechanism

To improve the efficiency and performance of string handling, the Python interpreter uses interning (string residency) at compile time. What is interning? Only one copy of a string object with a given value is kept, in a string pool, and that copy is shared. Naturally a shared object must not be modified, which is why strings have to be immutable objects (integers are immutable too). Floating-point numbers are not pooled;

How it works, in brief:

Interning is simple to implement: the interpreter maintains a string pool, which is a dictionary. At compile time, if a string already exists in the pool, no new string is created and the existing object is returned. If it has not been added to the pool yet, a string object is constructed first and then added to the pool, ready for next time;
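The dictionary-based pool described above can be imitated in a few lines. This toy InternPool is an illustration only: CPython's real pool is implemented in C, and this class is not part of any library.

```python
class InternPool:
    """Toy string pool: one stored object per distinct value."""

    def __init__(self):
        self._pool = {}

    def intern(self, s: str) -> str:
        # setdefault returns the pooled object if an equal string is
        # already present; otherwise it stores s and returns it.
        return self._pool.setdefault(s, s)


pool = InternPool()
a = "".join(["hell", " o"])  # built at run time
b = "".join(["hell", " o"])  # equal value, built separately
print(a == b, pool.intern(a) is pool.intern(b))
```

Looking up by value and handing back the stored object is the whole trick; everything else is policy about which strings get pooled.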
However, the interpreter is selective about when it applies interning: in some scenarios it happens automatically, and in others it has to be triggered manually. Some common scenarios follow:

# cmd: floating-point numbers are not cached
a = 1.0
b = 1.0
a is b # False
# cmd: not every string is interned; only strings made up of underscores, digits and letters (identifier-like strings) are
s1="hello"
s2="hello"
s1 is s2 # True
# A string that contains a space is not interned by default
s1="hell o"
s2="hell o"
s1 is s2 # False
s1 = "hell!*o"
s2 = "hell!*o"
print(s1 is s2) # False
# Online sources say a string longer than 20 characters is not interned. Up to 20 is indeed True, but on my 3.7/8.5 installs there seems to be no limit at all. A likely reason: the 20-character cap belonged to the old compile-time constant folder, and since Python 3.7 folding is done by the AST optimizer with a much larger cap
s1 = "a" * 20
s2 = "a" * 20
s1 is s2 # True
s1 = "a" * 21
s2 = "a" * 21
s1 is s2 # True
s1 = "ab" * 10
s2 = "ab" * 10
s1 is s2 # True
s1 = "ab" * 11
s2 = "ab" * 11
s1 is s2 # True
# 'kz' + 'c' is folded into 'kzc' at compile time; in s1 + 'c', s1 is a variable, so the concatenation happens at run time and the result is not interned
'kz' + 'c' is 'kzc' # True
s1 = 'kz'
s2 = 'kzc'
s1+'c' is 'kzc' # False
# PyCharm and other editors: as long as the strings are equal the result is True; they need not consist only of underscores, digits and letters
s1 = "hell o"
s2 = "hell o"
print(s1 is s2) # True
s1 = "hell!*o"
s2 = "hell!*o"
print(s1 is s2) # True
s1 = "a" * 20
s2 = "a" * 20
print(s1 is s2) # True
s1 = "a" * 21
s2 = "a" * 21
print(s1 is s2) # True
s1 = "ab" * 10
s2 = "ab" * 10
print(s1 is s2) # True
s1 = "ab" * 11
s2 = "ab" * 11
print(s1 is s2) # True
'kz' + 'c' is 'kzc' # True
s1 = 'kz'
s2 = 'kzc'
s1+'c' is 'kzc' # False
# In an editor, floats are cached too
a = 1.0
b = 1.0
print(a is b)  # True
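For strings built at run time, such as s1 + 'c' above, interning can be requested explicitly with sys.intern from the standard library. A small sketch (the variable names are illustrative):

```python
import sys

prefix = "kz"
runtime_built = prefix + "c"        # concatenated at run time
other = "".join(["k", "z", "c"])    # another run-time 'kzc'

# sys.intern returns the pooled object for this value, adding it first
# if the pool does not yet contain an equal string.
x = sys.intern(runtime_built)
y = sys.intern(other)
print(x == "kzc", x is y)
```

After interning, equal strings can be compared by identity, which is the point of the mechanism; for ordinary code, == remains the right way to compare strings.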

References:

https://www.zhihu.com/question/29945705 Strange problems in Python
https://www.pianshen.com/article/9128116263/ Python: small data pools and an in-depth analysis of code blocks
https://www.dazhuanlan.com/2020/01/16/5e1f70e908538/ String interning in CPython

Copyright notice
This article was written by [The end of the world]. Please include a link to the original when reposting. Thank you.
https://pythonmana.com/2021/04/20210406195343806y.html
