2. Why this
‣ Python is great for carrying out research experiments.
This should lay the foundation for why I discuss Python.
‣ Life is short. You need Python.
This should lay the foundation for why people like Python.
Life is neither a short nor a long, just a (signed) int, 31 bits at most, I say.
3. Takeaway
‣ Understand the relationship among {.py, .pyc, .c} files.
‣ Understand that everything in Python is an object.
‣ Understand how functions on TypeObject affect InstanceObject.
4. CPython Overview
‣ First implemented in Dec. 1989 by GvR, the BDFL.
‣ Serves as the reference implementation; other implementations include:
‣ IronPython (clr)
‣ Jython (jvm)
‣ Brython (v8, spider) [no kidding]
‣ Written in ANSI C.
‣ flexible language binding
‣ embedding (libpython), e.g., openwrt, etc.
5. CPython Overview
‣ Code is maintained in Mercurial.
‣ source: https://hg.python.org/cpython/
‣ Build toolchain is autoconf (on *nix)
./configure --with-pydebug && make -j2
9. Object
object: a block of memory holding a C structure with a common header
PyListObject
PyDictObject
PyTupleObject
PySyntaxErrorObject
PyImportErrorObject
…
takeaway: everything is an object
PyObject: ob_refcnt, ob_type
PyVarObject: ob_refcnt, ob_type, ob_size
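A minimal sketch of the common header as seen from Python, assuming CPython: whatever C struct backs a value, type() reflects its ob_type and sys.getrefcount() reflects its ob_refcnt. The sample values are arbitrary and the exact counts are an implementation detail.

```python
import sys

# Arbitrary sample values, each backed by a different C struct in CPython.
samples = [42, "spam", [1, 2], {"k": "v"}, (1, 2), None, len, int, SyntaxError("oops")]

header_view = []
for obj in samples:
    # type(obj) reflects ob_type; getrefcount(obj) reflects ob_refcnt.
    # The exact count varies between versions, so only check it is positive.
    header_view.append((type(obj).__name__, sys.getrefcount(obj) >= 1))

for type_name, has_refcount in header_view:
    print(type_name, has_refcount)
```

Even the built-in function len and the type int itself show up with a type and a reference count.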
10. Object Structure
Will PyLongObject overflow?
The answer: no, the value is stored chunk by chunk in an array of digits
PyLongObject: ob_refcnt, ob_type, ob_size, digit ob_digit[1], digit[2], digit[3], …, digit[n]
typedef PY_UINT32_T digit;
result = PyObject_MALLOC(offsetof(PyLongObject, ob_digit) +
size*sizeof(digit));
n = 2 ** 64 # more bits than a word
assert type(n) is int and n > 0
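The digit array can be observed indirectly from Python. This is a rough sketch, assuming CPython: sys.int_info reports the bits stored per digit, and sys.getsizeof grows as more digits are needed. The helper digits_needed is hypothetical, written only for this illustration.

```python
import sys

bits = sys.int_info.bits_per_digit   # commonly 30, sometimes 15

def digits_needed(n):
    """Estimate how many internal digits hold abs(n), at least one."""
    return max(1, -(-n.bit_length() // bits))  # ceiling division

small = 1
big = 2 ** (bits * 10)               # one bit more than ten full digits

print(bits, digits_needed(small), digits_needed(big))
print(sys.getsizeof(small), sys.getsizeof(big))
```

The allocation grows with the value instead of wrapping around, which is why the assert above never trips.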
12. Object Structure
what is the ob_type?
The answer: flexible type system
class Machine(type): pass
# Toy = Machine(foo, bar, hoge)
class Toy(metaclass=Machine): pass
toy = Toy()
# Toy, Machine, type, type
print(type(toy), type(Toy), type(Machine), type(type))
toy (ob_refcnt, ob_type, …): ob_type → Toy
Toy (ob_refcnt, ob_type, …): ob_type → Machine
Machine (ob_refcnt, ob_type, …): ob_type → type
type (ob_refcnt, ob_type, …): ob_type → type
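The ob_type chain can be walked from Python by calling type() repeatedly. A small sketch, reusing the Machine and Toy names from this slide; type_chain is a hypothetical helper, not part of any library.

```python
class Machine(type):
    pass

class Toy(metaclass=Machine):
    pass

def type_chain(obj):
    """Follow type() links until reaching a type that is its own type."""
    chain = []
    current = type(obj)
    while True:
        chain.append(current)
        if type(current) is current:
            break
        current = type(current)
    return chain

chain = type_chain(Toy())
print([t.__name__ for t in chain])
```

The walk stops at type because type(type) is type, so the chain always terminates there.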
13. Object Structure
what is the ob_type?
# ob_type2
# 10fd69490 - 10fd69490 - 10fd69490
print("%x - %x - %x" % (id(42 .__class__), id(233 .__class__), id(int)))
assert dict().__class__ is dict
# dynamically create a class named "MagicKlass"
klass = "MagicKlass"
klass = type(klass, (object,), {"quack": lambda _: print("quack")})
duck = klass()
# quack
duck.quack()
assert duck.__class__ is klass
14. Object Structure
what is the ob_type?
PyObject: ob_refcnt, ob_type, …
ob_type → PyTypeObject: ob_refcnt, ob_type, …, tp_print, …, tp_getattr, …, *tp_as_number, *tp_as_sequence, *tp_as_mapping, …
*tp_as_number → nb_add, …, nb_subtract, …
17. AOL
‣ Abstract Object Layer
When I see a bird that walks like a duck and swims like a duck and
quacks like a duck, I call that bird a duck.
Object Protocol
int PyObject_Print(PyObject *o, FILE *fp, int flags)
int PyObject_HasAttr(PyObject *o, PyObject *attr_name)
int PyObject_DelAttr(PyObject *o, PyObject *attr_name)
…
Number Protocol
PyObject* PyNumber_Add(PyObject *o1, PyObject *o2)
PyObject* PyNumber_Multiply(PyObject *o1, PyObject *o2)
PyObject* PyNumber_FloorDivide(PyObject *o1, PyObject *o2)
…
Sequence Protocol
PyObject* PySequence_Concat(PyObject *o1, PyObject *o2)
PyObject* PySequence_Repeat(PyObject *o, Py_ssize_t count)
PyObject* PySequence_GetItem(PyObject *o, Py_ssize_t i)
…
Iterator Protocol
int PyIter_Check(PyObject *o)
PyObject* PyIter_Next(PyObject *o)
Buffer Protocol
int PyObject_GetBuffer(PyObject *exporter, Py_buffer *view, int flags)
void PyBuffer_Release(Py_buffer *view)
int PyBuffer_IsContiguous(Py_buffer *view, char order)
…
Mapping Protocol
int PyMapping_HasKey(PyObject *o, PyObject *key)
PyObject* PyMapping_GetItemString(PyObject *o, const char *key)
int PyMapping_SetItemString(PyObject *o, const char *key, PyObject *v)
…
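The protocols above are what make duck typing work: callers ask for a behaviour, not a concrete type. A sketch from the Python side, assuming that operator.add, operator.getitem, len() and iteration roughly correspond to the number, sequence and iterator protocols; Countdown is a hypothetical class invented for this example.

```python
import operator

class Countdown:
    """Hypothetical class that quacks like a number, a sequence and an iterable."""

    def __init__(self, start):
        self.start = start

    def __add__(self, other):       # number protocol
        return Countdown(self.start + other)

    def __len__(self):              # sequence protocol
        return self.start

    def __getitem__(self, index):   # sequence protocol
        if not 0 <= index < self.start:
            raise IndexError(index)
        return self.start - index

    def __iter__(self):             # iterator protocol
        return iter(range(self.start, 0, -1))

c = operator.add(Countdown(2), 1)   # same as Countdown(2) + 1
items = list(c)
first = operator.getitem(c, 0)      # same as c[0]
size = len(c)
print(items, first, size)
```

None of the callers check for a specific class; each one only needs the matching method to exist on the type.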
18. Example
‣ Number Protocol (PyNumber_Add)
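// v + w?
PyObject *
PyNumber_Add(PyObject *v, PyObject *w)
{
    // this is just a simplified sketch, not the real implementation!
    PyObject *result;
    // try the slot of v's type first
    result = v->ob_type->tp_as_number->nb_add(v, w);
    // if that fails (the real code tries w first when w->ob_type is a
    // subclass of v->ob_type), try the slot of w's type, same argument order
    if (result == Py_NotImplemented)
        result = w->ob_type->tp_as_number->nb_add(v, w);
    return result;
}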
takeaway: the type object stores meta information
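The same dispatch is visible from Python through __add__ and __radd__. A minimal sketch, assuming the standard data model rules; Meters and Loud are hypothetical classes made up for this illustration.

```python
class Meters:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented           # let Python try the other operand

    def __radd__(self, other):
        if isinstance(other, (int, float)):
            return Meters(other + self.value)
        return NotImplemented

class Loud(Meters):
    def __radd__(self, other):          # overridden in a subclass
        return "Loud.__radd__ ran first"

same = Meters(1) + Meters(2)            # left operand handles it
mixed = 5 + Meters(2)                   # int gives up, Meters.__radd__ takes over
priority = Meters(1) + Loud(2)          # subclass on the right is tried first

try:
    Meters(1) + "x"
    outcome = "no error"
except TypeError:
    outcome = "TypeError"

print(same.value, mixed.value, priority, outcome)
```

When both operands give up by returning NotImplemented, the interpreter raises TypeError.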
19. More Examples
Why can we multiply a list? Is it slow?
arr = [None] * 3
# [None, None, None]
Exercise:
arr = [None] + [None]
# [None, None]
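On the "is it slow?" question, a hedged observation: in CPython, repeating a list copies references to the existing elements, not the elements themselves. That keeps it cheap, and it also produces the aliasing shown in this sketch. The variable names are arbitrary.

```python
row = [0] * 3
row[0] = 1                            # rebinds one slot, the others are untouched

grid = [[0] * 3] * 2                  # both slots refer to ONE inner list
grid[0][0] = 9

fresh = [[0] * 3 for _ in range(2)]   # builds two independent inner lists
fresh[0][0] = 9

print(row, grid, fresh)
```

The comprehension is the usual way to get independent rows when the elements are mutable.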
20. Magic Methods
access slots of tp_as_number, and its friends
Note: tp_as_mapping->mp_length and tp_as_sequence->sq_length both map to the same magic method, __len__.
If your C-based MyType implements both, what are MyType.__len__ and len(MyType())?
# access magic method of dict and list
dict.__getitem__ # tp_as_mapping->mp_subscript
dict.__len__ # tp_as_mapping->mp_length
list.__getitem__ # tp_as_sequence->sq_item
list.__len__ # tp_as_sequence->sq_length
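These attributes can be called like ordinary functions by passing the instance explicitly. A small sketch, assuming CPython, where slot-backed methods such as list.__len__ and int.__add__ are exposed as "slot wrapper" objects; the type name checked below is a CPython implementation detail.

```python
d = {"a": 1}
numbers = [1, 2, 3]

# Calling through the type, instance passed explicitly.
dict_len = dict.__len__(d)
list_len = list.__len__(numbers)
total = int.__add__(2, 3)

# On CPython these objects wrap a C-level slot.
wrapper_kind = type(list.__len__).__name__

print(dict_len, list_len, total, wrapper_kind)
```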
21. Magic Methods
backfill as_number and its friends
class A():
    def __len__(self):
        return 42
class B(): pass
# 42
print(len(A()))
# TypeError: object of type 'B' has no len()
print(len(B()))
Py_ssize_t
PyObject_Size(PyObject *o)
{
    PySequenceMethods *m;
    if (o == NULL) {
        null_error();
        return -1;
    }
    m = o->ob_type->tp_as_sequence;
    if (m && m->sq_length)
        return m->sq_length(o);
    return PyMapping_Size(o);
}
Which field does A.__len__ fill?
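The backfill can be watched from Python: len() consults the type, so assigning __len__ on the class changes existing instances, while assigning it on a single instance does not. A minimal sketch; the values 42 and 7 are arbitrary.

```python
class B:
    pass

b = B()

try:
    len(b)
    before = "has len"
except TypeError:
    before = "no len"

B.__len__ = lambda self: 42    # set on the type object
after_type = len(b)

b.__len__ = lambda: 7          # set on the instance only
after_instance = len(b)        # still answered by the type

print(before, after_type, after_instance)
```

This is the sense in which functions on the type object affect its instances.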
22. Next: Heterogeneous
Have you ever felt insecure about negative indexing of PyListObject?
The answer: RTFSC
words = "the quick brown fox jumps over the old lazy dog".split()
assert words[-1] == "dog"
words.insert(-100, "hayabusa")
assert words[-100] == ??
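For readers who want a head start before reading the source, here is a sketch of the observable behaviour: list.insert clamps an out-of-range position, while indexing does not. The helper clamp_insert_index is hypothetical and only mirrors what can be observed from Python.

```python
def clamp_insert_index(where, length):
    """Hypothetical helper mirroring how list.insert normalises its index."""
    if where < 0:
        where += length
        if where < 0:
            where = 0
    if where > length:
        where = length
    return where

words = "the quick brown fox jumps over the old lazy dog".split()
position = clamp_insert_index(-100, len(words))
words.insert(-100, "hayabusa")

try:
    words[-100]
    lookup = "ok"
except IndexError:
    lookup = "IndexError"

print(position, words[0], lookup)
```

Insertion is forgiving about out-of-range positions; indexing is not.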