描述符是什么?
A descriptor is what we call any object that defines __get__()
, __set__()
, or __delete__()
.
只要某个对象有如上三个特殊方法之一,就是一个描述符对象。
描述符的目的
The main motivation for descriptors is to provide a hook allowing objects stored in class variables to control what happens during attribute lookup.
只有类属性时才有起作用
Descriptors only work when used as class variables. When put in instances, they have no effect.
__set_name__(self, owner, name)
Optionally, descriptors can have a __set_name__()
method. This is only used in cases where a descriptor needs to know either the class where it was created or the name of class variable it was assigned to.
(This method, if present, is called even if the class is not a descriptor.)
__set_name__()
有两个目的
- 一是告诉 [描述符] 对象它在哪个类里面被实例化了
- 二是 [描述符] 对象被分配到哪个类变量名中
调用
通过成员运算符会触发描述符相关方法的调用
Descriptors get invoked by the dot operator during attribute lookup. If a descriptor is accessed indirectly with varssome_class, the descriptor instance is returned without invoking it.
Descriptors are used throughout the language. It is how functions turn into bound methods. Common tools like classmethod(), staticmethod(), property(), and functools.cached_property() are all implemented as descriptors.
Demos
simple example
1 | class Ten: |
Dynamic lookups
1 | import os |
Managed attributes
A popular use for descriptors is managing access to instance data. The descriptor is assigned to a public attribute in the class dictionary while the actual data is stored as a private attribute in the instance dictionary. The descriptor’s __get__()
and __set__()
methods are triggered when the public attribute is accessed.
1 | import logging |
Customized names
When a class uses descriptors, it can inform each descriptor about which variable name was used. In this example, the Person class has two descriptor instances, name and age. When the Person class is defined, it makes a callback to __set_name__()
in LoggedAccess so that the field names can be recorded, giving each descriptor its own public_name and private_name
1 | import logging |
完整的字段校验的例子
1 | from abc import ABC, abstractmethod |
技术细节 ⭐
类属性中某个属性值定义了 descriptor protocol 中某个方法,这个属性就是个描述符对象。
The default behavior for attribute access is to get, set, or delete the attribute from an object’s dictionary.
For instance, a.x
has a lookup chain starting with a.__dict__['x']
, then type(a).__dict__['x']
, and continuing through the method resolution order of type(a)
. If the looked-up value (应该是类属性) is an object defining one of the descriptor methods, then Python may override the default behavior and invoke the descriptor method instead.
Where this occurs in the precedence chain depends on which descriptor methods were defined.
a.x
lookup chain a.__dict__['x']
==> type(a).__dict__['x']
==> mro(a).__dict__['x']
protocol
descr.__get__(self, obj, type=None) -> value
descr.__set__(self, obj, value) -> None
descr.__delete__(self, obj) -> None
data descriptor define __set__
or __delete__
non-data descriptor only define __get__
区别:DIFFS
Data and non-data descriptors differ in how overrides are calculated with respect to entries in an instance’s dictionary.
- If an instance’s dictionary has an entry with the same name as a data descriptor, the data descriptor takes precedence.
- If an instance’s dictionary has an entry with the same name as a non-data descriptor, the dictionary entry takes precedence.
To make a read-only data descriptor, define both __get__()
and __set__()
with the __set__()
raising an AttributeError
when called. Defining the __set__()
method with an exception raising placeholder is enough to make it a data descriptor.
如何初始化值呢?
- 提供一个方法直接操作
__dict__
;
描述符的调用 ⭐
obj.x 关于描述符的调用顺序取决 obj 是什么,object, class, or instance of super?
invocation from an object ⭐
instance.x 表达式会发生如下操作,数据描述符描述符具有最高优先级
- data descriptors
- instance variables
- non-data descriptors
- class variables
__getattr__()
The logic for a dotted lookup is in object.__getattribute__()
注意:__getattribute__
中没有 __getattr__()
hook
That is why calling __getattribute__()
directly or with super().__getattribute__
will bypass __getattr__()
entirely.
dot operator 和 getattr() 函数负责当 __getarribute__()
抛出 AttributeError 时调用 __getattr__
. 这些逻辑封装在一个helper function 中。
__getattribute__
几乎每次都会触发,只有在该函数跑出 AttributeError 时,__getattr__
才会被调用(最后一步尝试)。
描述符的逻辑在 __getattribute__
函数内,最好谨慎覆盖该函数,一般都用 __getattr__
。
⭐
1 | def find_name_in_mro(cls, name, default): |
invocation from a class
desc.__get__(None, A)
invocation from super
B.__dict__['m'].__get__(obj, A)
总结 ⭐
描述符的机制嵌入在 object、type、super()的 __getattribute__
方法中
- Descriptors are invoked by the
__getattribute__()
method. - Classes inherit this machinery from object, type, or super().
- Overriding
__getattribute__()
prevents automatic descriptor calls because all the descriptor logic is in that method. object.__getattribute__()
andtype.__getattribute__()
make different calls to__get__()
. The first includes the
instance and may include the class. The second puts in None for the instance and always includes the class.- Data descriptors always override instance dictionaries.
- Non-data descriptors may be overridden by instance dictionaries.
Automatic name notification
发生在类定义的时候由元类调用 __set_name__()
,如果在类定义后手动添加 descriptor,则需要手动调用 __set_name__
.
Sometimes it is desirable for a descriptor to know what class variable name it was assigned to.
When a new class is created, the type metaclass scans the dictionary of the new class. If any of the entries are descriptors and if they define __set_name__()
, that method is called with two arguments. The owner is the class where the descriptor is used, and the name is the class variable the descriptor was assigned to.
Since the update logic is in type.__new__()
, notifications only take place at the time of class creation.
If descriptors are added to the class afterwards, __set_name__()
will need to be called manually.
python 中使用描述符的例子
The descriptor protocol is simple and offers exciting possibilities. Several use cases are so common that they have been prepackaged into built-in tools. Properties, bound methods, static methods, class methods, and __slots__
are all based on the descriptor protocol.
properties
Calling property() is a succinct(简洁) way of building a data descriptor that triggers a function call upon access to an attribute. Its signature is:
property(fget=None, fset=None, fdel=None, doc=None) -> property
Functions and methods
Python’s object oriented features are built upon a function based environment. Using non-data descriptors, the two are merged seamlessly.
Functions stored in class dictionaries get turned into methods when invoked. Methods only differ from regular functions in that the object instance is prepended to the other arguments. By convention, the instance is called self but could be called this or any other variable name.Static method
不需要访问实例的属性
Class method
仅仅需要访问类的属性
1 | class MethodType: |
__slots__
当一个 class 定义了 __slots__
属性后,它会用一个定长的数组替代对象的字典来保存数据,从用户的角度看有几点影响:
- 快速定位属性名拼写错误, because only attribute names specified in
__slots__
are allowd - Helps create immutable objects where descriptors manage access to private attributes stored in
__slots__
- Saves memory On a 64-bit Linux build, an instance with two attributes takes 48 bytes with
__slots__
and 152 bytes without. - Improves speed
- Block tools like functools.cached_property() which require an instance dictionary to function correctly 拦截某些工具
我们不可能使用 python 来实现 __slots__
的功能,因为需要涉及到 c 的数据结构和内存分配,但是可以使用描述符模拟这种行为。代码略