SlideShare ist ein Scribd-Unternehmen logo
1 von 30
Cleanup and new optimizations
             in WPython 1.1

                  Cesare Di Mauro
                     A-Tono s.r.l.
       PyCon Quattro 2010 – Firenze (Florence)
                    May 9, 2010



Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1   May 9, 2010   1 / 30
What’s new in WPython 1.1
    • Removed WPython 1.0 alpha PyTypeObject “hack”

    • Fixed tracing for optimized loops

    • Removed redundant checks and opcodes

    • New opcode_stack_effect()

    • Specialized opcodes for common patterns

    • Constant grouping and marshalling on slices

    • Peepholer moved to compiler.c

    • Experimental “INTEGER” opcodes family
Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1   May 9, 2010   2 / 30
The PyTypeObject "hack”
  typedef struct _typeobject {
  [...]
      hashfunc tp_relaxedhash; /* Works for lists and dicts too. */
      cmpfunc tp_equal; /* Checks if two objects are equal (0.0 != -0.0) */
  } PyTypeObject;

  long PyDict_GetHashFromKey(PyObject *key) {
      long hash;
      if (!PyString_CheckExact(key) ||
            (hash = ((PyStringObject *) key)->ob_shash) == -1) {
          hash = key->ob_type->tp_relaxedhash(key);
          if (hash == -1) PyErr_Clear();                                          No conditional
      }                                                                           branch(es):
      return hash;
                                                                                  just get it!
  }
Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1   May 9, 2010   3 / 30
Removed it in Wpython 1.0/1.1
  long _Py_object_relaxed_hash(PyObject *o) {
      if (PyTuple_CheckExact(o))
        return _Py_tuple_relaxed_hash((PyTupleObject *) o);                            Conditional
      else if (PyList_CheckExact(o))
        return _Py_list_relaxed_hash((PyListObject *) o);
                                                                                       branches
      else if (PyDict_CheckExact(o))                                                   kill the
        return _Py_dict_relaxed_hash((PyDictObject *) o);
                                                                                       processor
      else if (PySlice_Check(o))
        return _Py_slice_relaxed_hash((PySliceObject *) o);                            pipeline!
      else
        return PyObject_Hash(o);
  }

               _Py_object_strict_equal is even worse!

Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1   May 9, 2010   4 / 30
LOST: performance!
    “Hack” used only on compile.c for consts dict…
                                                               …but very important for speed!
     Tests made on Windows 7 x64, Athlon64 2800+ socket 754, 2GB DDR 400Mhz, 32-bits Python

                                            PyStone (sec.)          PyBench Min (ms.) PyBench Avg (ms.)
              Python 2.6.4                        38901                      10444*             10864*
         WPython 1.0 alpha                        50712                      8727**             9082**
              WPython 1.0                         47914                       9022               9541
              WPython 1.1                         47648                       8785               9033
      WPython 1.1 (INT “opc.”)                    45655                       9125               9486

     * Python 2.6.4 missed UnicodeProperties. Added from 1.0.
     ** WPython 1.0 alpha missed NestedListComprehensions and SimpleListComprehensions. Added from 1.0.


                         Being polite doesn’t pay. “Crime” does!
Cesare Di Mauro – PyCon Quattro 2010             Cleanup and new optimizations in WPython 1.1    May 9, 2010   5 / 30
Tracing on for loops was broken
 Optimized for loops don’t have SETUP_LOOPs & POP_BLOCKs
 Tracing makes crash on jumping in / out of them!

                                        def f():                          Cannot jump in
  2    0 SETUP_LOOP       19 (to 22)       for x in a:
       3 LOAD_GLOBAL      0 (a)               print x                     Cannot jump out
       6 GET_ITER
  >>   7 FOR_ITER         11 (to 21)                            2    0 LOAD_GLOBAL     0 (a)

       10 STORE_FAST      0 (x)                                      1 GET_ITER

  3    13 LOAD_FAST       0 (x)                                   >> 2 FOR_ITER        5 (to 8)

       16 PRINT_ITEM                                                 3 STORE_FAST      0 (x)

       17 PRINT_NEWLINE                                         3    4 LOAD_FAST       0 (x)

       18 JUMP_ABSOLUTE 7                                            5 PRINT_ITEM

  >> 21 POP_BLOCK                                                    6 PRINT_NEWLINE

  >> 22 LOAD_CONST        0 (None)                                   7 JUMP_ABSOLUTE   2

       25 RETURN_VALUE                                            >> 8 RETURN_CONST    0 (None)


Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1    May 9, 2010   6 / 30
Now fixed, but was a nightmare!
  case SETUP_LOOP:                                                            case POP_BLOCK:
  case SETUP_EXCEPT:                                                          case POP_FOR_BLOCK:
  case SETUP_FINALLY:                                                         case END_FINALLY:
                                   1) Save loop target
  case FOR_ITER:                                                              case WITH_CLEANUP:
                                   2) Check loop target
                                   3) Remove loop target

      /* Checks if we have found a "virtual" POP_BLOCK/POP_TOP
           for the current FOR instruction (without SETUP_LOOP). */
      if (addr == temp_last_for_addr) {
          temp_last_for_addr = temp_for_addr_stack[--temp_for_addr_top];
          temp_block_type_top--;
      }
Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1   May 9, 2010   7 / 30
CPython has “extra” checks…
  referentsvisit(PyObject *obj, PyObject *list)                                         Must be
  {
                                                                                        a List
             return PyList_Append(list, obj) < 0;
  }

                                                                                         Value is
  int
  PyList_Append(PyObject *op, PyObject *newitem)                                         needed
  {
             if (PyList_Check(op) && (newitem != NULL))
                        return app1((PyListObject *)op, newitem);
             PyErr_BadInternalCall();
             return -1;                                                               It does the
  }
                                                                                      real work!
Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1    May 9, 2010   8 / 30
…remove them!
                                                                                 Absolutely sure:
  referentsvisit(PyObject *obj, PyObject *list)
                                                                                 a list and an
  {
             return _Py_list_append(list, obj) < 0;                              object!
  }


  int _Py_list_append(PyObject *self, PyObject *v)
  {
             Py_ssize_t n = PyList_GET_SIZE(self);
                                                                                      Direct call
             assert (v != NULL);                                                      No checks!
  [...]
             PyList_SET_ITEM(self, n, v);
             return 0;
  }                                                app1 is now _Py_list_append
Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1      May 9, 2010   9 / 30
But don’t remove too much!
    A too much aggressive unreachable code remover

     def test_constructor_with_iterable_argument(self):
        b = array.array(self.typecode, self.example)
        # pass through errors raised in next()
          def B():
             raise UnicodeError
             yield None
        self.assertRaises(UnicodeError, array.array, self.typecode, B())


    Function B was compiled as if it was:
     def B():
        raise UnicodeError


    Not a generator! Reverted back to old (working) code…
Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1   May 9, 2010   10 / 30
The old opcode_stack_effect()…
   static int opcode_stack_effect(int opcode, int oparg)
   {
                                                                                    Several checks
       switch (opcode) {
         case POP_TOP:
           return -1;                  /* POPs one element from stack
         case ROT_THREE:
           return 0;                   /* Stack unchanged */
         case DUP_TOP:
           return 1;                   /* PUSHs one element */
         case LIST_APPEND:
           return -2;                  /* Removes two elements */
         case WITH_CLEANUP:
           return -1;                  /* POPs one element. Sometimes more */
         case BUILD_LIST:
           return 1-oparg;             /* Consumes oparg elements, pushes one */


Cesare Di Mauro – PyCon Quattro 2010      Cleanup and new optimizations in WPython 1.1   May 9, 2010   11 / 30
… and the new one!
static int opcode_stack_effect(struct compiler *c, struct instr*i, int*stack_retire)
{
    static signed char opcode_stack[TOTAL_OPCODES] = {
      /* LOAD_CONST = 1 */
      /* LOAD_FAST = 1 */
      /* STORE_FAST = -1 */
      0,     0,   0,   0,   0,   0,    1,   1, -1,    0, /*      0 ..    9 */
[...]
    int opcode = i->i_opcode;          int oparg = i->i_oparg;           *stack_retire = 0;
    switch (opcode & 0xff) {
      case BUILD_LIST:                                            Only special cases checked
           return 1 - oparg;
[...]
      default:                                                       Regular cases unchecked
           return opcode_stack[opcode & 0xff];
                                                                     Just get the value!
}
Cesare Di Mauro – PyCon Quattro 2010        Cleanup and new optimizations in WPython 1.1   May 9, 2010   12 / 30
Removed unneeded copy on for
                                   def f():
                                       for x in[1, 2, 3]:
                                         pass
                                                                          List of “pure” constants

     Python 2.6.4                      WPython 1.0 alpha                          WPython 1.1
2   0 SETUP_LOOP      23 (to 26)   2   0 LOAD_CONST       1 ([1, 2, 3])    2   0 LOAD_CONST      1 ([1, 2, 3])
    3 LOAD_CONST      1 (1)            1 LIST_DEEP_COPY                        1 GET_ITER
    6 LOAD_CONST      2 (2)            2 GET_ITER                          >> 2 FOR_ITER         2 (to 5)
    9 LOAD_CONST      3 (3)        >> 3 FOR_ITER          2 (to 6)             3 STORE_FAST      0 (x)
    12 BUILD_LIST     3                4 STORE_FAST       0 (x)                4 JUMP_ABSOLUTE 2
    15 GET_ITER                        5 JUMP_ABSOLUTE 3                   >> 5 RETURN_CONST     0 (None)
>> 16 FOR_ITER 6 (to 25)           >> 6 RETURN_CONST      0 (None)
    19 STORE_FAST     0 (x)
    22 JUMP_ABSOLUTE      16
>> 25 POP_BLOCK
>> 26 LOAD_CONST      0 (None)
                                                    Makes a (deep) copy, before use
    29 RETURN_VALUE

Cesare Di Mauro – PyCon Quattro 2010      Cleanup and new optimizations in WPython 1.1      May 9, 2010   13 / 30
Improved class creation
                                       def f():
                                         class c(): pass


 2   0 LOAD_CONST        1 ('c')                              2    0 LOAD_CONST          1 ('c')
     3 LOAD_CONST        3 (())                                    1 BUILD_TUPLE         0
     6 LOAD_CONST        2 (<code>)                                2 LOAD_CONST          2 (<code>)
     9 MAKE_FUNCTION     0                                         3 MAKE_FUNCTION       0
     12 CALL_FUNCTION    0                                         4 BUILD_CLASS
     15 BUILD_CLASS                                                5 STORE_FAST          0 (c)
     16 STORE_FAST       0 (c)                                     6 RETURN_CONST        0 (None)
     19 LOAD_CONST       0 (None)
     22 RETURN_VALUE                                                                         Internal fast
                                                                                             Function call
 2   0 LOAD_NAME         0 (__name__)
     3 STORE_NAME        1 (__module__)                       2   0 LOAD_NAME            0 (__name__)
     6 LOAD_LOCALS                                                1 STORE_NAME           1 (__module__)
     7 RETURN_VALUE                                               2 RETURN_LOCALS

Cesare Di Mauro – PyCon Quattro 2010      Cleanup and new optimizations in WPython 1.1        May 9, 2010   14 / 30
Optimized try on except (last)
2      0 SETUP_EXCEPT    8 (to 11)                                    2     0 SETUP_EXCEPT    4 (to 5)
3      3 LOAD_FAST       0 (x)                                        3     1 LOAD_FAST       0 (x)
       6 POP_TOP                                                            2 POP_TOP
       7 POP_BLOCK                                                          3 POP_BLOCK
       8 JUMP_FORWARD    11 (to 22)       def f(x, y):                      4 JUMP_FORWARD    5 (to 10)
4 >> 11 POP_TOP                                                       4 >> 5 POP_TOP
                                             try:
      12 POP_TOP                                                            6 POP_TOP
                                                x
      13 POP_TOP                                                            7 POP_TOP
                                             except:
5     14 LOAD_FAST       1 (y)                                        5     8 LOAD_FAST       1 (y)
                                                y
      17 POP_TOP                                                            9 POP_TOP
      18 JUMP_FORWARD    1 (to 22)                                     >> 10 RETURN_CONST     0 (None)
      21 END_FINALLY
    >> 22 LOAD_CONST     0 (None)
      25 RETURN_VALUE
                                          Unneeded on generic expect clause
                                          (if it was the last one)
Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1     May 9, 2010   15 / 30
Opcodes for multiple comparisons
                                     def f(x, y, z):
                                       return x < y < z


    2    0 LOAD_FAST         0 (x)                   2      0 LOAD_FAST                 0 (x)
         3 LOAD_FAST         1 (y)                          1 LOAD_FAST                 1 (y)
         6 DUP_TOP                                          2 DUP_TOP_ROT_THREE
         7 ROT_THREE                                        3 CMP_LT                    28 (<)
         8 COMPARE_OP        0 (<)                          4 JUMP_IF_FALSE_ELSE_POP    3 (to 8)
        11 JUMP_IF_FALSE     8 (to 22)                      5 FAST_BINOP                2 (z) 28 (<)
        14 POP_TOP                                          7 RETURN_VALUE
        15 LOAD_FAST         2 (z)                     >>   8 ROT_TWO_POP_TOP
        18 COMPARE_OP        0 (<)                          9 RETURN_VALUE
        21 RETURN_VALUE
     >> 22 ROT_TWO
        23 POP_TOP
                                                            Not a simple replacement:
        24 RETURN_VALUE
                                                            Short and Optimized versions!
Cesare Di Mauro – PyCon Quattro 2010     Cleanup and new optimizations in WPython 1.1   May 9, 2010    16 / 30
Specialized generator operator
                                                       def f(*Args):
     2    0 LOAD_CONST       1 (<code> <genexpr>)
                                                          return (int(Arg) for Arg in Args[1 : ])
          3 MAKE_FUNCTION 0
          6 LOAD_FAST        0 (Args)
                                                      2   0 LOAD_CONST 1 (<code> <genexpr>)
          9 LOAD_CONST       2 (1)
                                                          1 MAKE_FUNCTION 0
         12 SLICE+1
                                                          2 FAST_BINOP_CONST 0 (Args) 2 (1) 39 (slice_1)
         13 GET_ITER
                                                          4 GET_GENERATOR
         14 CALL_FUNCTION 1
                                                          5 RETURN_VALUE
                                                                                           Internal fast
         17 RETURN_VALUE

                                                          def f(*Args):
                                                                                           Function call
 2       0 LOAD_GLOBAL     0 (sum)                            return sum(int(Arg) for Arg in Args)
         3 LOAD_CONST      1 (<code> <genexpr>)
         6 MAKE_FUNCTION 0                                2   0 LOAD_GLOBAL 0 (sum)

         9 LOAD_FAST       0 (Args)                           1 LOAD_CONST 1 (<code> <genexpr>)

     12 GET_ITER                                              2 MAKE_FUNCTION 0

     13 CALL_FUNCTION 1                                       3 FAST_BINOP 0 (Args) 42 (get_generator)

     16 CALL_FUNCTION 1                                       5 QUICK_CALL_FUNCTION 1 (1 0)

     19 RETURN_VALUE                                          6 RETURN_VALUE

Cesare Di Mauro – PyCon Quattro 2010        Cleanup and new optimizations in WPython 1.1    May 9, 2010   17 / 30
String joins are… binary operators!
                                                        def f(*Args):
    2    0 LOAD_CONST       1 ('n')
         3 LOAD_ATTR        0 (join)
                                                             return 'n'.join(Args)

         6 LOAD_FAST        0 (Args)                2       0 CONST_BINOP_FAST 1 ('n') 0 (Args) 45 (join)
         9 CALL_FUNCTION 1                                  2 RETURN_VALUE
        12 RETURN_VALUE


                                                 def f(*Args):
                                                   return u'n'.join(str(Arg) for Arg in Args)
2       0 LOAD_CONST      1 (u'n')                     2    0 LOAD_CONST     1 (u'n')
        3 LOAD_ATTR       0 (join)                           1 LOAD_CONST     2 (<code> <genexpr>)
        6 LOAD_CONST      2 (<code> <genexpr>)               2 MAKE_FUNCTION 0
        9 MAKE_FUNCTION 0                                    3 FAST_BINOP     0 (Args) 42 (get_generator)
    12 LOAD_FAST          0 (Args)                           5 UNICODE_JOIN
    15 GET_ITER                                              6 RETURN_VALUE
    16 CALL_FUNCTION 1                                                               Direct call to
    19 CALL_FUNCTION 1
    22 RETURN_VALUE
                                                                                     PyUnicode_Join
Cesare Di Mauro – PyCon Quattro 2010       Cleanup and new optimizations in WPython 1.1    May 9, 2010   18 / 30
Specialized string modulo
                                def f(x, y):
                                  return '%s and %s' % (x, y)

    2   0 LOAD_CONST    1 ('%s and %s')         2   0 LOAD_CONST     1 ('%s and %s')
        3 LOAD_FAST     0 (x)                       1 LOAD_FAST      0 (x)
        6 LOAD_FAST     1 (y)                       2 LOAD_FAST      1 (y)
        9 BUILD_TUPLE 2                             3 BUILD_TUPLE 2
        12 BINARY_MODULO                            4 STRING_MODULO
        13 RETURN_VALUE                             5 RETURN_VALUE
                                                                                  Direct call to
                                                                                  PyString_Format
                                def f(a):
                                  return u'<b>%s</b>' % a
                                                                                  No checks needed

2   0 LOAD_CONST      1 (u'<b>%s</b>')
                                                    2   0 CONST_BINOP_FAST 1 (u'<b>%s</b>') 0 (a) 44 (%)
    3 LOAD_FAST       0 (a)
                                                        2 RETURN_VALUE
    6 BINARY_MODULO
    7 RETURN_VALUE


Cesare Di Mauro – PyCon Quattro 2010        Cleanup and new optimizations in WPython 1.1   May 9, 2010   19 / 30
Improved with cleanup
                                       def f(name):
2   0 LOAD_GLOBAL     0 (open)           with open(name):
    3 LOAD_FAST       0 (name)             pass
    6 CALL_FUNCTION 1                                      Can be done better
    9 DUP_TOP
    10 LOAD_ATTR      1 (__exit__)
                                                           (specialized opcode)
    13 ROT_TWO                            2    0 LOAD_GLOB_FAST_CALL_FUNC 0 (open) 0 (name) 1 (1 0)
    14 LOAD_ATTR      2 (__enter__)            2 DUP_TOP
    17 CALL_FUNCTION 0                         3 LOAD_ATTR                     1 (__exit__)
    20 POP_TOP                                 4 ROT_TWO
    21 SETUP_FINALLY 4 (to 28)                 5 LOAD_ATTR                     2 (__enter__)
3   24 POP_BLOCK                               6 QUICK_CALL_PROCEDURE          0 (0 0)
    25 LOAD_CONST     0 (None)                 7 SETUP_FINALLY                 2 (to 10)
>> 28 WITH_CLEANUP                        3    8 POP_BLOCK
    29 END_FINALLY                             9 LOAD_CONST                    0 (None)
    30 LOAD_CONST     0 (None)            >> 10 WITH_CLEANUP
    33 RETURN_VALUE                           11 RETURN_CONST                  0 (None)


Cesare Di Mauro – PyCon Quattro 2010    Cleanup and new optimizations in WPython 1.1      May 9, 2010   20 / 30
Constants grouping on slice
                                       def f(a):
                                         return a[1 : -1]

  2   0 LOAD_FAST    0 (a)
                                                                      2   0 LOAD_FAST       0 (a)
      3 LOAD_CONST   1 (1)
                                                                          1 LOAD_CONSTS     1 ((1, -1))
      6 LOAD_CONST   2 (-1)
                                                                          2 SLICE_3
      9 SLICE+3
                                                                          3 RETURN_VALUE
      10 RETURN_VALUE


                                   def f(a, n):
                                        return a[n : : 2]

  2   0 LOAD_FAST    0 (a)                                            2   0 LOAD_FAST         0 (a)
      3 LOAD_FAST    1 (n)                                                1 LOAD_FAST         1 (n)
      6 LOAD_CONST   0 (None)                                             2 LOAD_CONSTS       1 ((None, 2))
      9 LOAD_CONST   1 (2)                                                3 BUILD_SLICE_3
      12 BUILD_SLICE 3                                                    4 BINARY_SUBSCR
      15 BINARY_SUBSCR                                                    5 RETURN_VALUE
      16 RETURN_VALUE

Cesare Di Mauro – PyCon Quattro 2010      Cleanup and new optimizations in WPython 1.1      May 9, 2010   21 / 30
The Marshal Matters
  2   0 LOAD_FAST       0 (a)                      def f(a):
      3 LOAD_CONST      1 (1)                        return a[1 : 10 : -1]
      6 LOAD_CONST      2 (10)
                                               2    0 FAST_BINOP_CONST 0 (a) 1 (slice(1, 10, -1)) 7 ([])
      9 LOAD_CONST      3 (-1)
                                                    2 RETURN_VALUE
      12 BUILD_SLICE    3
      15 BINARY_SUBSCR
      16 RETURN_VALUE
                                                                            Constant Slice object
  2   0 LOAD_FAST           0 (a)
      3 LOAD_CONST          1 (1)          def f(a):                        Now marshaled
      6 LOAD_CONST          2 (10)            x = a[1 : 10 : -1]
      9 LOAD_CONST          3 (-1)            return x
      12 BUILD_SLICE        3
      15 BINARY_SUBSCR                 2   0 FAST_SUBSCR_CONST_TO_FAST 0 (a) 1 (slice(1, 10, -1)) 1 (x)

      16 STORE_FAST         1 (x)      3   2 LOAD_FAST                      1 (x)

  3   19 LOAD_FAST          1 (x)          3 RETURN_VALUE

      22 RETURN_VALUE

Cesare Di Mauro – PyCon Quattro 2010       Cleanup and new optimizations in WPython 1.1   May 9, 2010   22 / 30
Peepholer moved in compile.c
   Pros:
   • Simpler opcode handling (using instr structure) on average
   • Very easy jump manipulation
   • No memory allocation (for bytecode and jump buffers)
   • Works only on blocks (no boundary calculations)
   • No 8 & 16 bits values distinct optimizations
   • Always applied (even on 32 bits values code)
   Cons:
   • Works only on blocks (no global code “vision”)
   • Worse unreachable code removing
   • Some jump optimizations missing (will be fixed! ;)
Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1   May 9, 2010   23 / 30
Experimental INTEGER opcodes
     • Disabled by default (uncomment wpython.h /
     WPY_SMALLINT_SUPER_INSTRUCTIONS)
     • BINARY operations only
     • One operand must be an integer (0 .. 255)
     • No reference counting
     • Less constants usage (to be done)
     • Optimized integer code
     • Direct integer functions call
     • Fast integer “fallback”
     • Explicit long or float “fallback”
Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1   May 9, 2010   24 / 30
2   0 LOAD_FAST          0 (n)
    3 LOAD_CONST         1 (1)
    6 COMPARE_OP
    9 JUMP_IF_FALSE
                         1 (<=)
                         5 (to 17)
                                        def fib(n):                  An example
    12 POP_TOP                             if n <= 1:
3   13 LOAD_CONST        1 (1)                return 1
    16 RETURN_VALUE                        else:
>> 17 POP_TOP                                 return fib(n - 2) + fib(n – 1)
5   18 LOAD_GLOBAL       0 (fib)
    21 LOAD_FAST         0 (n)                       2     0 FAST_BINOP_INT           0 (n) 1 29 (<=)

    24 LOAD_CONST        2 (2)                             2 JUMP_IF_FALSE            2 (to 5)

    27 BINARY_SUBTRACT                               3     3 RETURN_CONST             1 (1)

    28 CALL_FUNCTION     1                                 4 JUMP_FORWARD             10 (to 15)

    31 LOAD_GLOBAL       0 (fib)                     5 >> 5 LOAD_GLOBAL               0 (fib)

    34 LOAD_FAST         0 (n)                             6 FAST_BINOP_INT           0 (n) 2 6 (-)

    37 LOAD_CONST        1 (1)                             8 QUICK_CALL_FUNCTION      1 (1 0)

    40 BINARY_SUBTRACT                                     9 LOAD_GLOBAL              0 (fib)

    41 CALL_FUNCTION     1                                10 FAST_BINOP_INT           0 (n) 1 6 (-)

    44 BINARY_ADD                                         12 QUICK_CALL_FUNCTION      1 (1 0)

    45 RETURN_VALUE                                       13 BINARY_ADD

    46 LOAD_CONST        0 (None)                         14 RETURN_VALUE

    49 RETURN_VALUE                                   >> 15 RETURN_CONST              0 (None)

Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1    May 9, 2010    25 / 30
An INTEGER opcode dissected
    FAST_BINOP_INT                     0 (n) 1 29 (<=)


     156     0       1     29




                                         Binary operator “sub-opcode”. 29 -> CMP_LE -> “<=“


                                         Short integer value. No “index” on constants table


                                         Index on local variables table. 0 -> “n”

                                         FAST_BINOP_INT “main” opcode


Cesare Di Mauro – PyCon Quattro 2010    Cleanup and new optimizations in WPython 1.1   May 9, 2010   26 / 30
A look inside                                             #define _Py_Int_FromByteNoRef(byte) 
                                                                ((PyObject *) _Py_Int_small_ints[ 
case FAST_BINOP_INT:                                            _Py_Int_NSMALLNEGINTS + (byte)])
    x = GETLOCAL(oparg);
    if (x != NULL) {
        NEXTARG16(oparg);
        if (PyInt_CheckExact(x))                                              Extracts operator
               x = INT_BINARY_OPS_Table[EXTRACTARG(oparg)](
                          PyInt_AS_LONG(x), EXTRACTOP(oparg));                 Extracts integer
        else
               x = BINARY_OPS_Table[EXTRACTARG(oparg)](
                                                                              Converts PyInt
                          x, _Py_Int_FromByteNoRef(EXTRACTOP(oparg)));
        if (x != NULL) {                                                      to integer
               PUSH(x);
               continue;
        }                                                                      Converts integer
        break;
    }
                                                                               to PyInt
    PyRaise_UnboundLocalError(co, oparg);
    break;

Cesare Di Mauro – PyCon Quattro 2010       Cleanup and new optimizations in WPython 1.1   May 9, 2010   27 / 30
Optimized integer code
                                                                     #define _Py_Int_FallBackOperation(
  static PyObject *
                                                                         newtype, operation) { 
  Py_INT_BINARY_OR(register long a, register long b)
                                                                          PyObject *v, *w, *u; 
  {
                                                                          v = newtype(a); 
      return PyInt_FromLong(a | b);
                                                                          if (v == NULL) 
  }
                                                                                return NULL; 
                                                                          w = newtype(b); 
  static PyObject *
                                                                          if (w == NULL) { 
  Py_INT_BINARY_ADD2(register long a, register long b)
                                                                                Py_DECREF(v); 
  {
                                                                                return NULL; 
      register long x = a + b;
                                                                          } 
      if ((x^a) >= 0 || (x^b) >= 0)
                                                                          u = operation; 
           return PyInt_FromLong(x);
                                                                          Py_DECREF(v); 
      _Py_Int_FallBackOperation(PyLong_FromLong,
                                                                          Py_DECREF(w); 
           PyLong_Type.tp_as_number->nb_add(v, w));
                                                                          return u; 
  }
                                                                     }




Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1     May 9, 2010   28 / 30
Direct integer functions call
 static PyObject *
 (*INT_BINARY_OPS_Table[])(register long, register
 long) = {                                                              Defined in intobject.h
     Py_INT_BINARY_POWER,     /* BINARY_POWER */
     _Py_int_mul,     /* BINARY_MULTIPLY */
     Py_INT_BINARY_DIVIDE,     /* BINARY_DIVIDE */



 PyObject *                                                    Implemented in intobject.c
 _Py_int_mul(register long a, register long b)
 {
     long longprod;                /* a*b in native long arithmetic */
     double doubled_longprod;      /* (double)longprod */
     double doubleprod;            /* (double)a * (double)b */


     longprod = a * b;
     doubleprod = (double)a * (double)b;
     doubled_longprod = (double)longprod;


Cesare Di Mauro – PyCon Quattro 2010    Cleanup and new optimizations in WPython 1.1   May 9, 2010   29 / 30
WPython future
   • Stopped as wordcodes “proof-of-concept”: no more releases!

   • No Python 2.7 porting (2.x is at a dead end)

   • Python 3.2+ reimplementation, if community asks

   • Reintroducing PyObject “hack”, and extending to other cases

   • CPython “limits” proposal (e.g. max 255 local variables)

   • Define a “protected” interface to identifiers for VM usage

   • Rethinking something (code blocks, stack usage, jumps)

   • Tons optimizations waiting…
Cesare Di Mauro – PyCon Quattro 2010   Cleanup and new optimizations in WPython 1.1   May 9, 2010   30 / 30

Weitere ähnliche Inhalte

Was ist angesagt?

Easy Going Groovy 2nd season on DevLOVE
Easy Going Groovy 2nd season on DevLOVEEasy Going Groovy 2nd season on DevLOVE
Easy Going Groovy 2nd season on DevLOVEUehara Junji
 
Apache Commons - Don\'t re-invent the wheel
Apache Commons - Don\'t re-invent the wheelApache Commons - Don\'t re-invent the wheel
Apache Commons - Don\'t re-invent the wheeltcurdt
 
Commit ускоривший python 2.7.11 на 30% и новое в python 3.5
Commit ускоривший python 2.7.11 на 30% и новое в python 3.5Commit ускоривший python 2.7.11 на 30% и новое в python 3.5
Commit ускоривший python 2.7.11 на 30% и новое в python 3.5PyNSK
 
Евгений Крутько, Многопоточные вычисления, современный подход.
Евгений Крутько, Многопоточные вычисления, современный подход.Евгений Крутько, Многопоточные вычисления, современный подход.
Евгений Крутько, Многопоточные вычисления, современный подход.Platonov Sergey
 
エンタープライズ・クラウドと 並列・分散・非同期処理
エンタープライズ・クラウドと 並列・分散・非同期処理エンタープライズ・クラウドと 並列・分散・非同期処理
エンタープライズ・クラウドと 並列・分散・非同期処理maruyama097
 
The Ring programming language version 1.6 book - Part 15 of 189
The Ring programming language version 1.6 book - Part 15 of 189The Ring programming language version 1.6 book - Part 15 of 189
The Ring programming language version 1.6 book - Part 15 of 189Mahmoud Samir Fayed
 
The Ring programming language version 1.2 book - Part 57 of 84
The Ring programming language version 1.2 book - Part 57 of 84The Ring programming language version 1.2 book - Part 57 of 84
The Ring programming language version 1.2 book - Part 57 of 84Mahmoud Samir Fayed
 
The Ring programming language version 1.7 book - Part 16 of 196
The Ring programming language version 1.7 book - Part 16 of 196The Ring programming language version 1.7 book - Part 16 of 196
The Ring programming language version 1.7 book - Part 16 of 196Mahmoud Samir Fayed
 
Groovy 1.8の新機能について
Groovy 1.8の新機能についてGroovy 1.8の新機能について
Groovy 1.8の新機能についてUehara Junji
 
The Ring programming language version 1.7 book - Part 83 of 196
The Ring programming language version 1.7 book - Part 83 of 196The Ring programming language version 1.7 book - Part 83 of 196
The Ring programming language version 1.7 book - Part 83 of 196Mahmoud Samir Fayed
 
The Ring programming language version 1.5.2 book - Part 76 of 181
The Ring programming language version 1.5.2 book - Part 76 of 181The Ring programming language version 1.5.2 book - Part 76 of 181
The Ring programming language version 1.5.2 book - Part 76 of 181Mahmoud Samir Fayed
 
NSOperation objective-c
NSOperation objective-cNSOperation objective-c
NSOperation objective-cPavel Albitsky
 
The Evolution of Async-Programming on .NET Platform (TUP, Full)
The Evolution of Async-Programming on .NET Platform (TUP, Full)The Evolution of Async-Programming on .NET Platform (TUP, Full)
The Evolution of Async-Programming on .NET Platform (TUP, Full)jeffz
 
响应式编程及框架
响应式编程及框架响应式编程及框架
响应式编程及框架jeffz
 
Jscex: Write Sexy JavaScript
Jscex: Write Sexy JavaScriptJscex: Write Sexy JavaScript
Jscex: Write Sexy JavaScriptjeffz
 
The Ring programming language version 1.10 book - Part 94 of 212
The Ring programming language version 1.10 book - Part 94 of 212The Ring programming language version 1.10 book - Part 94 of 212
The Ring programming language version 1.10 book - Part 94 of 212Mahmoud Samir Fayed
 

Was ist angesagt? (20)

Easy Going Groovy 2nd season on DevLOVE
Easy Going Groovy 2nd season on DevLOVEEasy Going Groovy 2nd season on DevLOVE
Easy Going Groovy 2nd season on DevLOVE
 
Apache Commons - Don\'t re-invent the wheel
Apache Commons - Don\'t re-invent the wheelApache Commons - Don\'t re-invent the wheel
Apache Commons - Don\'t re-invent the wheel
 
Python tour
Python tourPython tour
Python tour
 
Commit ускоривший python 2.7.11 на 30% и новое в python 3.5
Commit ускоривший python 2.7.11 на 30% и новое в python 3.5Commit ускоривший python 2.7.11 на 30% и новое в python 3.5
Commit ускоривший python 2.7.11 на 30% и новое в python 3.5
 
Python Async IO Horizon
Python Async IO HorizonPython Async IO Horizon
Python Async IO Horizon
 
Евгений Крутько, Многопоточные вычисления, современный подход.
Евгений Крутько, Многопоточные вычисления, современный подход.Евгений Крутько, Многопоточные вычисления, современный подход.
Евгений Крутько, Многопоточные вычисления, современный подход.
 
asyncio internals
asyncio internalsasyncio internals
asyncio internals
 
エンタープライズ・クラウドと 並列・分散・非同期処理
エンタープライズ・クラウドと 並列・分散・非同期処理エンタープライズ・クラウドと 並列・分散・非同期処理
エンタープライズ・クラウドと 並列・分散・非同期処理
 
The Ring programming language version 1.6 book - Part 15 of 189
The Ring programming language version 1.6 book - Part 15 of 189The Ring programming language version 1.6 book - Part 15 of 189
The Ring programming language version 1.6 book - Part 15 of 189
 
The Ring programming language version 1.2 book - Part 57 of 84
The Ring programming language version 1.2 book - Part 57 of 84The Ring programming language version 1.2 book - Part 57 of 84
The Ring programming language version 1.2 book - Part 57 of 84
 
The Ring programming language version 1.7 book - Part 16 of 196
The Ring programming language version 1.7 book - Part 16 of 196The Ring programming language version 1.7 book - Part 16 of 196
The Ring programming language version 1.7 book - Part 16 of 196
 
Groovy 1.8の新機能について
Groovy 1.8の新機能についてGroovy 1.8の新機能について
Groovy 1.8の新機能について
 
The Ring programming language version 1.7 book - Part 83 of 196
The Ring programming language version 1.7 book - Part 83 of 196The Ring programming language version 1.7 book - Part 83 of 196
The Ring programming language version 1.7 book - Part 83 of 196
 
The Ring programming language version 1.5.2 book - Part 76 of 181
The Ring programming language version 1.5.2 book - Part 76 of 181The Ring programming language version 1.5.2 book - Part 76 of 181
The Ring programming language version 1.5.2 book - Part 76 of 181
 
NSOperation objective-c
NSOperation objective-cNSOperation objective-c
NSOperation objective-c
 
The Evolution of Async-Programming on .NET Platform (TUP, Full)
The Evolution of Async-Programming on .NET Platform (TUP, Full)The Evolution of Async-Programming on .NET Platform (TUP, Full)
The Evolution of Async-Programming on .NET Platform (TUP, Full)
 
响应式编程及框架
响应式编程及框架响应式编程及框架
响应式编程及框架
 
Jscex: Write Sexy JavaScript
Jscex: Write Sexy JavaScriptJscex: Write Sexy JavaScript
Jscex: Write Sexy JavaScript
 
The Ring programming language version 1.10 book - Part 94 of 212
The Ring programming language version 1.10 book - Part 94 of 212The Ring programming language version 1.10 book - Part 94 of 212
The Ring programming language version 1.10 book - Part 94 of 212
 
Mattbrenner
MattbrennerMattbrenner
Mattbrenner
 

Andere mochten auch

Python: ottimizzazione numerica algoritmi genetici
Python: ottimizzazione numerica algoritmi geneticiPython: ottimizzazione numerica algoritmi genetici
Python: ottimizzazione numerica algoritmi geneticiPyCon Italia
 
Undici anni di lavoro con Python
Undici anni di lavoro con PythonUndici anni di lavoro con Python
Undici anni di lavoro con PythonPyCon Italia
 
socket e SocketServer: il framework per i server Internet in Python
socket e SocketServer: il framework per i server Internet in Pythonsocket e SocketServer: il framework per i server Internet in Python
socket e SocketServer: il framework per i server Internet in PythonPyCon Italia
 
Qt mobile PySide bindings
Qt mobile PySide bindingsQt mobile PySide bindings
Qt mobile PySide bindingsPyCon Italia
 
Feed back report 2010
Feed back report 2010Feed back report 2010
Feed back report 2010PyCon Italia
 
Spyppolare o non spyppolare
Spyppolare o non spyppolareSpyppolare o non spyppolare
Spyppolare o non spyppolarePyCon Italia
 
Why is a[1] fast than a.get(1)
Why is a[1]  fast than a.get(1)Why is a[1]  fast than a.get(1)
Why is a[1] fast than a.get(1)kao kuo-tung
 

Andere mochten auch (8)

Python: ottimizzazione numerica algoritmi genetici
Python: ottimizzazione numerica algoritmi geneticiPython: ottimizzazione numerica algoritmi genetici
Python: ottimizzazione numerica algoritmi genetici
 
Python idiomatico
Python idiomaticoPython idiomatico
Python idiomatico
 
Undici anni di lavoro con Python
Undici anni di lavoro con PythonUndici anni di lavoro con Python
Undici anni di lavoro con Python
 
socket e SocketServer: il framework per i server Internet in Python
socket e SocketServer: il framework per i server Internet in Pythonsocket e SocketServer: il framework per i server Internet in Python
socket e SocketServer: il framework per i server Internet in Python
 
Qt mobile PySide bindings
Qt mobile PySide bindingsQt mobile PySide bindings
Qt mobile PySide bindings
 
Feed back report 2010
Feed back report 2010Feed back report 2010
Feed back report 2010
 
Spyppolare o non spyppolare
Spyppolare o non spyppolareSpyppolare o non spyppolare
Spyppolare o non spyppolare
 
Why is a[1] fast than a.get(1)
Why is a[1]  fast than a.get(1)Why is a[1]  fast than a.get(1)
Why is a[1] fast than a.get(1)
 

Ähnlich wie Cleanup and new optimizations in WPython 1.1

Cluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CCluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CSteffen Wenz
 
Python GTK (Hacking Camp)
Python GTK (Hacking Camp)Python GTK (Hacking Camp)
Python GTK (Hacking Camp)Yuren Ju
 
Python 如何執行
Python 如何執行Python 如何執行
Python 如何執行kao kuo-tung
 
Pysmbc Python C Modules are Easy
Pysmbc Python C Modules are EasyPysmbc Python C Modules are Easy
Pysmbc Python C Modules are EasyRoberto Polli
 
Notes about moving from python to c++ py contw 2020
Notes about moving from python to c++ py contw 2020Notes about moving from python to c++ py contw 2020
Notes about moving from python to c++ py contw 2020Yung-Yu Chen
 
Diving into byte code optimization in python
Diving into byte code optimization in python Diving into byte code optimization in python
Diving into byte code optimization in python Chetan Giridhar
 
Threads and Callbacks for Embedded Python
Threads and Callbacks for Embedded PythonThreads and Callbacks for Embedded Python
Threads and Callbacks for Embedded PythonYi-Lung Tsai
 
Java 7 at SoftShake 2011
Java 7 at SoftShake 2011Java 7 at SoftShake 2011
Java 7 at SoftShake 2011julien.ponge
 
Intro python-object-protocol
Intro python-object-protocolIntro python-object-protocol
Intro python-object-protocolShiyao Ma
 
Java 7 JUG Summer Camp
Java 7 JUG Summer CampJava 7 JUG Summer Camp
Java 7 JUG Summer Campjulien.ponge
 
Taipei.py 2018 - Control device via ioctl from Python
Taipei.py 2018 - Control device via ioctl from Python Taipei.py 2018 - Control device via ioctl from Python
Taipei.py 2018 - Control device via ioctl from Python Hua Chu
 
Python 3000
Python 3000Python 3000
Python 3000Bob Chao
 
soft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coin
soft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coinsoft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coin
soft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coinsoft-shake.ch
 
Python For Scientists
Python For ScientistsPython For Scientists
Python For Scientistsaeberspaecher
 
Python Performance 101
Python Performance 101Python Performance 101
Python Performance 101Ankur Gupta
 
In Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docx
In Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docxIn Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docx
In Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docxbradburgess22840
 

Ähnlich wie Cleanup and new optimizations in WPython 1.1 (20)

Cluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CCluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in C
 
Python GTK (Hacking Camp)
Python GTK (Hacking Camp)Python GTK (Hacking Camp)
Python GTK (Hacking Camp)
 
Python 如何執行
Python 如何執行Python 如何執行
Python 如何執行
 
Pysmbc Python C Modules are Easy
Pysmbc Python C Modules are EasyPysmbc Python C Modules are Easy
Pysmbc Python C Modules are Easy
 
Notes about moving from python to c++ py contw 2020
Notes about moving from python to c++ py contw 2020Notes about moving from python to c++ py contw 2020
Notes about moving from python to c++ py contw 2020
 
Diving into byte code optimization in python
Diving into byte code optimization in python Diving into byte code optimization in python
Diving into byte code optimization in python
 
Threads and Callbacks for Embedded Python
Threads and Callbacks for Embedded PythonThreads and Callbacks for Embedded Python
Threads and Callbacks for Embedded Python
 
C pythontalk
C pythontalkC pythontalk
C pythontalk
 
Java 7 at SoftShake 2011
Java 7 at SoftShake 2011Java 7 at SoftShake 2011
Java 7 at SoftShake 2011
 
Intro python-object-protocol
Intro python-object-protocolIntro python-object-protocol
Intro python-object-protocol
 
Java 7 JUG Summer Camp
Java 7 JUG Summer CampJava 7 JUG Summer Camp
Java 7 JUG Summer Camp
 
JAVA SE 7
JAVA SE 7JAVA SE 7
JAVA SE 7
 
Taipei.py 2018 - Control device via ioctl from Python
Taipei.py 2018 - Control device via ioctl from Python Taipei.py 2018 - Control device via ioctl from Python
Taipei.py 2018 - Control device via ioctl from Python
 
Java 7 LavaJUG
Java 7 LavaJUGJava 7 LavaJUG
Java 7 LavaJUG
 
Python 3000
Python 3000Python 3000
Python 3000
 
soft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coin
soft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coinsoft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coin
soft-shake.ch - Java SE 7: The Fork/Join Framework and Project Coin
 
Python For Scientists
Python For ScientistsPython For Scientists
Python For Scientists
 
Faster Python, FOSDEM
Faster Python, FOSDEMFaster Python, FOSDEM
Faster Python, FOSDEM
 
Python Performance 101
Python Performance 101Python Performance 101
Python Performance 101
 
In Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docx
In Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docxIn Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docx
In Class AssignmetzCST280W13a-1.pdfCST 280 In-Class Pract.docx
 

Mehr von PyCon Italia

zc.buildout: "Un modo estremamente civile per sviluppare un'applicazione"
zc.buildout: "Un modo estremamente civile per sviluppare un'applicazione"zc.buildout: "Un modo estremamente civile per sviluppare un'applicazione"
zc.buildout: "Un modo estremamente civile per sviluppare un'applicazione"PyCon Italia
 
Python in the browser
Python in the browserPython in the browser
Python in the browserPyCon Italia
 
PyPy 1.2: snakes never crawled so fast
PyPy 1.2: snakes never crawled so fastPyPy 1.2: snakes never crawled so fast
PyPy 1.2: snakes never crawled so fastPyCon Italia
 
PyCuda: Come sfruttare la potenza delle schede video nelle applicazioni python
PyCuda: Come sfruttare la potenza delle schede video nelle applicazioni pythonPyCuda: Come sfruttare la potenza delle schede video nelle applicazioni python
PyCuda: Come sfruttare la potenza delle schede video nelle applicazioni pythonPyCon Italia
 
OpenERP e l'arte della gestione aziendale con Python
OpenERP e l'arte della gestione aziendale con PythonOpenERP e l'arte della gestione aziendale con Python
OpenERP e l'arte della gestione aziendale con PythonPyCon Italia
 
New and improved: Coming changes to the unittest module
 	 New and improved: Coming changes to the unittest module 	 New and improved: Coming changes to the unittest module
New and improved: Coming changes to the unittest modulePyCon Italia
 
Monitoraggio del Traffico di Rete Usando Python ed ntop
Monitoraggio del Traffico di Rete Usando Python ed ntopMonitoraggio del Traffico di Rete Usando Python ed ntop
Monitoraggio del Traffico di Rete Usando Python ed ntopPyCon Italia
 
Jython for embedded software validation
Jython for embedded software validationJython for embedded software validation
Jython for embedded software validationPyCon Italia
 
Foxgame introduzione all'apprendimento automatico
Foxgame introduzione all'apprendimento automaticoFoxgame introduzione all'apprendimento automatico
Foxgame introduzione all'apprendimento automaticoPyCon Italia
 
Django è pronto per l'Enterprise
Django è pronto per l'EnterpriseDjango è pronto per l'Enterprise
Django è pronto per l'EnterprisePyCon Italia
 
Crogioli, alambicchi e beute: dove mettere i vostri dati.
Crogioli, alambicchi e beute: dove mettere i vostri dati.Crogioli, alambicchi e beute: dove mettere i vostri dati.
Crogioli, alambicchi e beute: dove mettere i vostri dati.PyCon Italia
 
Comet web applications with Python, Django & Orbited
Comet web applications with Python, Django & OrbitedComet web applications with Python, Django & Orbited
Comet web applications with Python, Django & OrbitedPyCon Italia
 

Mehr von PyCon Italia (13)

zc.buildout: "Un modo estremamente civile per sviluppare un'applicazione"
zc.buildout: "Un modo estremamente civile per sviluppare un'applicazione"zc.buildout: "Un modo estremamente civile per sviluppare un'applicazione"
zc.buildout: "Un modo estremamente civile per sviluppare un'applicazione"
 
Python in the browser
Python in the browserPython in the browser
Python in the browser
 
PyPy 1.2: snakes never crawled so fast
PyPy 1.2: snakes never crawled so fastPyPy 1.2: snakes never crawled so fast
PyPy 1.2: snakes never crawled so fast
 
PyCuda: Come sfruttare la potenza delle schede video nelle applicazioni python
PyCuda: Come sfruttare la potenza delle schede video nelle applicazioni pythonPyCuda: Come sfruttare la potenza delle schede video nelle applicazioni python
PyCuda: Come sfruttare la potenza delle schede video nelle applicazioni python
 
OpenERP e l'arte della gestione aziendale con Python
OpenERP e l'arte della gestione aziendale con PythonOpenERP e l'arte della gestione aziendale con Python
OpenERP e l'arte della gestione aziendale con Python
 
New and improved: Coming changes to the unittest module
 	 New and improved: Coming changes to the unittest module 	 New and improved: Coming changes to the unittest module
New and improved: Coming changes to the unittest module
 
Monitoraggio del Traffico di Rete Usando Python ed ntop
Monitoraggio del Traffico di Rete Usando Python ed ntopMonitoraggio del Traffico di Rete Usando Python ed ntop
Monitoraggio del Traffico di Rete Usando Python ed ntop
 
Jython for embedded software validation
Jython for embedded software validationJython for embedded software validation
Jython for embedded software validation
 
Foxgame introduzione all'apprendimento automatico
Foxgame introduzione all'apprendimento automaticoFoxgame introduzione all'apprendimento automatico
Foxgame introduzione all'apprendimento automatico
 
Effective EC2
Effective EC2Effective EC2
Effective EC2
 
Django è pronto per l'Enterprise
Django è pronto per l'EnterpriseDjango è pronto per l'Enterprise
Django è pronto per l'Enterprise
 
Crogioli, alambicchi e beute: dove mettere i vostri dati.
Crogioli, alambicchi e beute: dove mettere i vostri dati.Crogioli, alambicchi e beute: dove mettere i vostri dati.
Crogioli, alambicchi e beute: dove mettere i vostri dati.
 
Comet web applications with Python, Django & Orbited
Comet web applications with Python, Django & OrbitedComet web applications with Python, Django & Orbited
Comet web applications with Python, Django & Orbited
 

Kürzlich hochgeladen

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 

Kürzlich hochgeladen (20)

🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Cleanup and new optimizations in WPython 1.1

  • 1. Cleanup and new optimizations in WPython 1.1 Cesare Di Mauro A-Tono s.r.l. PyCon Quattro 2010 – Firenze (Florence) May 9, 2010 Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 1 / 30
  • 2. What’s new in WPython 1.1 • Removed WPython 1.0 alpha PyTypeObject “hack” • Fixed tracing for optimized loops • Removed redundant checks and opcodes • New opcode_stack_effect() • Specialized opcodes for common patterns • Constant grouping and marshalling on slices • Peepholer moved to compiler.c • Experimental “INTEGER” opcodes family Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 2 / 30
  • 3. The PyTypeObject "hack” typedef struct _typeobject { [...] hashfunc tp_relaxedhash; /* Works for lists and dicts too. */ cmpfunc tp_equal; /* Checks if two objects are equal (0.0 != -0.0) */ } PyTypeObject; long PyDict_GetHashFromKey(PyObject *key) { long hash; if (!PyString_CheckExact(key) || (hash = ((PyStringObject *) key)->ob_shash) == -1) { hash = key->ob_type->tp_relaxedhash(key); if (hash == -1) PyErr_Clear(); No conditional } branch(es): return hash; just get it! } Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 3 / 30
  • 4. Removed it in Wpython 1.0/1.1 long _Py_object_relaxed_hash(PyObject *o) { if (PyTuple_CheckExact(o)) return _Py_tuple_relaxed_hash((PyTupleObject *) o); Conditional else if (PyList_CheckExact(o)) return _Py_list_relaxed_hash((PyListObject *) o); branches else if (PyDict_CheckExact(o)) kill the return _Py_dict_relaxed_hash((PyDictObject *) o); processor else if (PySlice_Check(o)) return _Py_slice_relaxed_hash((PySliceObject *) o); pipeline! else return PyObject_Hash(o); } _Py_object_strict_equal is even worse! Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 4 / 30
  • 5. LOST: performance! “Hack” used only on compile.c for consts dict… …but very important for speed! Tests made on Windows 7 x64, Athlon64 2800+ socket 754, 2GB DDR 400Mhz, 32-bits Python PyStone (sec.) PyBench Min (ms.) PyBench Avg (ms.) Python 2.6.4 38901 10444* 10864* WPython 1.0 alpha 50712 8727** 9082** WPython 1.0 47914 9022 9541 WPython 1.1 47648 8785 9033 WPython 1.1 (INT “opc.”) 45655 9125 9486 * Python 2.6.4 missed UnicodeProperties. Added from 1.0. ** WPython 1.0 alpha missed NestedListComprehensions and SimpleListComprehensions. Added from 1.0. Being polite doesn’t pay. “Crime” does! Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 5 / 30
  • 6. Tracing on for loops was broken Optimized for loops don’t have SETUP_LOOPs & POP_BLOCKs Tracing makes crash on jumping in / out of them! def f(): Cannot jump in 2 0 SETUP_LOOP 19 (to 22) for x in a: 3 LOAD_GLOBAL 0 (a) print x Cannot jump out 6 GET_ITER >> 7 FOR_ITER 11 (to 21) 2 0 LOAD_GLOBAL 0 (a) 10 STORE_FAST 0 (x) 1 GET_ITER 3 13 LOAD_FAST 0 (x) >> 2 FOR_ITER 5 (to 8) 16 PRINT_ITEM 3 STORE_FAST 0 (x) 17 PRINT_NEWLINE 3 4 LOAD_FAST 0 (x) 18 JUMP_ABSOLUTE 7 5 PRINT_ITEM >> 21 POP_BLOCK 6 PRINT_NEWLINE >> 22 LOAD_CONST 0 (None) 7 JUMP_ABSOLUTE 2 25 RETURN_VALUE >> 8 RETURN_CONST 0 (None) Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 6 / 30
  • 7. Now fixed, but was a nightmare! case SETUP_LOOP: case POP_BLOCK: case SETUP_EXCEPT: case POP_FOR_BLOCK: case SETUP_FINALLY: case END_FINALLY: 1) Save loop target case FOR_ITER: case WITH_CLEANUP: 2) Check loop target 3) Remove loop target /* Checks if we have found a "virtual" POP_BLOCK/POP_TOP for the current FOR instruction (without SETUP_LOOP). */ if (addr == temp_last_for_addr) { temp_last_for_addr = temp_for_addr_stack[--temp_for_addr_top]; temp_block_type_top--; } Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 7 / 30
  • 8. CPython has “extra” checks… referentsvisit(PyObject *obj, PyObject *list) Must be { a List return PyList_Append(list, obj) < 0; } Value is int PyList_Append(PyObject *op, PyObject *newitem) needed { if (PyList_Check(op) && (newitem != NULL)) return app1((PyListObject *)op, newitem); PyErr_BadInternalCall(); return -1; It does the } real work! Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 8 / 30
  • 9. …remove them! Absolutely sure: referentsvisit(PyObject *obj, PyObject *list) a list and an { return _Py_list_append(list, obj) < 0; object! } int _Py_list_append(PyObject *self, PyObject *v) { Py_ssize_t n = PyList_GET_SIZE(self); Direct call assert (v != NULL); No checks! [...] PyList_SET_ITEM(self, n, v); return 0; } app1 is now _Py_list_append Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 9 / 30
  • 10. But don’t remove too much! A too much aggressive unreachable code remover def test_constructor_with_iterable_argument(self): b = array.array(self.typecode, self.example) # pass through errors raised in next() def B(): raise UnicodeError yield None self.assertRaises(UnicodeError, array.array, self.typecode, B()) Function B was compiled as if it was: def B(): raise UnicodeError Not a generator! Reverted back to old (working) code… Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 10 / 30
  • 11. The old opcode_stack_effect()… static int opcode_stack_effect(int opcode, int oparg) { Several checks switch (opcode) { case POP_TOP: return -1; /* POPs one element from stack case ROT_THREE: return 0; /* Stack unchanged */ case DUP_TOP: return 1; /* PUSHs one element */ case LIST_APPEND: return -2; /* Removes two elements */ case WITH_CLEANUP: return -1; /* POPs one element. Sometimes more */ case BUILD_LIST: return 1-oparg; /* Consumes oparg elements, pushes one */ Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 11 / 30
  • 12. … and the new one! static int opcode_stack_effect(struct compiler *c, struct instr*i, int*stack_retire) { static signed char opcode_stack[TOTAL_OPCODES] = { /* LOAD_CONST = 1 */ /* LOAD_FAST = 1 */ /* STORE_FAST = -1 */ 0, 0, 0, 0, 0, 0, 1, 1, -1, 0, /* 0 .. 9 */ [...] int opcode = i->i_opcode; int oparg = i->i_oparg; *stack_retire = 0; switch (opcode & 0xff) { case BUILD_LIST: Only special cases checked return 1 - oparg; [...] default: Regular cases unchecked return opcode_stack[opcode & 0xff]; Just get the value! } Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 12 / 30
  • 13. Removed unneeded copy on for def f(): for x in[1, 2, 3]: pass List of “pure” constants Python 2.6.4 WPython 1.0 alpha WPython 1.1 2 0 SETUP_LOOP 23 (to 26) 2 0 LOAD_CONST 1 ([1, 2, 3]) 2 0 LOAD_CONST 1 ([1, 2, 3]) 3 LOAD_CONST 1 (1) 1 LIST_DEEP_COPY 1 GET_ITER 6 LOAD_CONST 2 (2) 2 GET_ITER >> 2 FOR_ITER 2 (to 5) 9 LOAD_CONST 3 (3) >> 3 FOR_ITER 2 (to 6) 3 STORE_FAST 0 (x) 12 BUILD_LIST 3 4 STORE_FAST 0 (x) 4 JUMP_ABSOLUTE 2 15 GET_ITER 5 JUMP_ABSOLUTE 3 >> 5 RETURN_CONST 0 (None) >> 16 FOR_ITER 6 (to 25) >> 6 RETURN_CONST 0 (None) 19 STORE_FAST 0 (x) 22 JUMP_ABSOLUTE 16 >> 25 POP_BLOCK >> 26 LOAD_CONST 0 (None) Makes a (deep) copy, before use 29 RETURN_VALUE Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 13 / 30
  • 14. Improved class creation def f(): class c(): pass 2 0 LOAD_CONST 1 ('c') 2 0 LOAD_CONST 1 ('c') 3 LOAD_CONST 3 (()) 1 BUILD_TUPLE 0 6 LOAD_CONST 2 (<code>) 2 LOAD_CONST 2 (<code>) 9 MAKE_FUNCTION 0 3 MAKE_FUNCTION 0 12 CALL_FUNCTION 0 4 BUILD_CLASS 15 BUILD_CLASS 5 STORE_FAST 0 (c) 16 STORE_FAST 0 (c) 6 RETURN_CONST 0 (None) 19 LOAD_CONST 0 (None) 22 RETURN_VALUE Internal fast Function call 2 0 LOAD_NAME 0 (__name__) 3 STORE_NAME 1 (__module__) 2 0 LOAD_NAME 0 (__name__) 6 LOAD_LOCALS 1 STORE_NAME 1 (__module__) 7 RETURN_VALUE 2 RETURN_LOCALS Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 14 / 30
  • 15. Optimized try on except (last) 2 0 SETUP_EXCEPT 8 (to 11) 2 0 SETUP_EXCEPT 4 (to 5) 3 3 LOAD_FAST 0 (x) 3 1 LOAD_FAST 0 (x) 6 POP_TOP 2 POP_TOP 7 POP_BLOCK 3 POP_BLOCK 8 JUMP_FORWARD 11 (to 22) def f(x, y): 4 JUMP_FORWARD 5 (to 10) 4 >> 11 POP_TOP 4 >> 5 POP_TOP try: 12 POP_TOP 6 POP_TOP x 13 POP_TOP 7 POP_TOP except: 5 14 LOAD_FAST 1 (y) 5 8 LOAD_FAST 1 (y) y 17 POP_TOP 9 POP_TOP 18 JUMP_FORWARD 1 (to 22) >> 10 RETURN_CONST 0 (None) 21 END_FINALLY >> 22 LOAD_CONST 0 (None) 25 RETURN_VALUE Unneeded on generic expect clause (if it was the last one) Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 15 / 30
  • 16. Opcodes for multiple comparisons def f(x, y, z): return x < y < z 2 0 LOAD_FAST 0 (x) 2 0 LOAD_FAST 0 (x) 3 LOAD_FAST 1 (y) 1 LOAD_FAST 1 (y) 6 DUP_TOP 2 DUP_TOP_ROT_THREE 7 ROT_THREE 3 CMP_LT 28 (<) 8 COMPARE_OP 0 (<) 4 JUMP_IF_FALSE_ELSE_POP 3 (to 8) 11 JUMP_IF_FALSE 8 (to 22) 5 FAST_BINOP 2 (z) 28 (<) 14 POP_TOP 7 RETURN_VALUE 15 LOAD_FAST 2 (z) >> 8 ROT_TWO_POP_TOP 18 COMPARE_OP 0 (<) 9 RETURN_VALUE 21 RETURN_VALUE >> 22 ROT_TWO 23 POP_TOP Not a simple replacement: 24 RETURN_VALUE Short and Optimized versions! Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 16 / 30
  • 17. Specialized generator operator def f(*Args): 2 0 LOAD_CONST 1 (<code> <genexpr>) return (int(Arg) for Arg in Args[1 : ]) 3 MAKE_FUNCTION 0 6 LOAD_FAST 0 (Args) 2 0 LOAD_CONST 1 (<code> <genexpr>) 9 LOAD_CONST 2 (1) 1 MAKE_FUNCTION 0 12 SLICE+1 2 FAST_BINOP_CONST 0 (Args) 2 (1) 39 (slice_1) 13 GET_ITER 4 GET_GENERATOR 14 CALL_FUNCTION 1 5 RETURN_VALUE Internal fast 17 RETURN_VALUE def f(*Args): Function call 2 0 LOAD_GLOBAL 0 (sum) return sum(int(Arg) for Arg in Args) 3 LOAD_CONST 1 (<code> <genexpr>) 6 MAKE_FUNCTION 0 2 0 LOAD_GLOBAL 0 (sum) 9 LOAD_FAST 0 (Args) 1 LOAD_CONST 1 (<code> <genexpr>) 12 GET_ITER 2 MAKE_FUNCTION 0 13 CALL_FUNCTION 1 3 FAST_BINOP 0 (Args) 42 (get_generator) 16 CALL_FUNCTION 1 5 QUICK_CALL_FUNCTION 1 (1 0) 19 RETURN_VALUE 6 RETURN_VALUE Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 17 / 30
  • 18. String joins are… binary operators! def f(*Args): 2 0 LOAD_CONST 1 ('n') 3 LOAD_ATTR 0 (join) return 'n'.join(Args) 6 LOAD_FAST 0 (Args) 2 0 CONST_BINOP_FAST 1 ('n') 0 (Args) 45 (join) 9 CALL_FUNCTION 1 2 RETURN_VALUE 12 RETURN_VALUE def f(*Args): return u'n'.join(str(Arg) for Arg in Args) 2 0 LOAD_CONST 1 (u'n') 2 0 LOAD_CONST 1 (u'n') 3 LOAD_ATTR 0 (join) 1 LOAD_CONST 2 (<code> <genexpr>) 6 LOAD_CONST 2 (<code> <genexpr>) 2 MAKE_FUNCTION 0 9 MAKE_FUNCTION 0 3 FAST_BINOP 0 (Args) 42 (get_generator) 12 LOAD_FAST 0 (Args) 5 UNICODE_JOIN 15 GET_ITER 6 RETURN_VALUE 16 CALL_FUNCTION 1 Direct call to 19 CALL_FUNCTION 1 22 RETURN_VALUE PyUnicode_Join Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 18 / 30
  • 19. Specialized string modulo def f(x, y): return '%s and %s' % (x, y) 2 0 LOAD_CONST 1 ('%s and %s') 2 0 LOAD_CONST 1 ('%s and %s') 3 LOAD_FAST 0 (x) 1 LOAD_FAST 0 (x) 6 LOAD_FAST 1 (y) 2 LOAD_FAST 1 (y) 9 BUILD_TUPLE 2 3 BUILD_TUPLE 2 12 BINARY_MODULO 4 STRING_MODULO 13 RETURN_VALUE 5 RETURN_VALUE Direct call to PyString_Format def f(a): return u'<b>%s</b>' % a No checks needed 2 0 LOAD_CONST 1 (u'<b>%s</b>') 2 0 CONST_BINOP_FAST 1 (u'<b>%s</b>') 0 (a) 44 (%) 3 LOAD_FAST 0 (a) 2 RETURN_VALUE 6 BINARY_MODULO 7 RETURN_VALUE Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 19 / 30
  • 20. Improved with cleanup def f(name): 2 0 LOAD_GLOBAL 0 (open) with open(name): 3 LOAD_FAST 0 (name) pass 6 CALL_FUNCTION 1 Can be done better 9 DUP_TOP 10 LOAD_ATTR 1 (__exit__) (specialized opcode) 13 ROT_TWO 2 0 LOAD_GLOB_FAST_CALL_FUNC 0 (open) 0 (name) 1 (1 0) 14 LOAD_ATTR 2 (__enter__) 2 DUP_TOP 17 CALL_FUNCTION 0 3 LOAD_ATTR 1 (__exit__) 20 POP_TOP 4 ROT_TWO 21 SETUP_FINALLY 4 (to 28) 5 LOAD_ATTR 2 (__enter__) 3 24 POP_BLOCK 6 QUICK_CALL_PROCEDURE 0 (0 0) 25 LOAD_CONST 0 (None) 7 SETUP_FINALLY 2 (to 10) >> 28 WITH_CLEANUP 3 8 POP_BLOCK 29 END_FINALLY 9 LOAD_CONST 0 (None) 30 LOAD_CONST 0 (None) >> 10 WITH_CLEANUP 33 RETURN_VALUE 11 RETURN_CONST 0 (None) Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 20 / 30
  • 21. Constants grouping on slice def f(a): return a[1 : -1] 2 0 LOAD_FAST 0 (a) 2 0 LOAD_FAST 0 (a) 3 LOAD_CONST 1 (1) 1 LOAD_CONSTS 1 ((1, -1)) 6 LOAD_CONST 2 (-1) 2 SLICE_3 9 SLICE+3 3 RETURN_VALUE 10 RETURN_VALUE def f(a, n): return a[n : : 2] 2 0 LOAD_FAST 0 (a) 2 0 LOAD_FAST 0 (a) 3 LOAD_FAST 1 (n) 1 LOAD_FAST 1 (n) 6 LOAD_CONST 0 (None) 2 LOAD_CONSTS 1 ((None, 2)) 9 LOAD_CONST 1 (2) 3 BUILD_SLICE_3 12 BUILD_SLICE 3 4 BINARY_SUBSCR 15 BINARY_SUBSCR 5 RETURN_VALUE 16 RETURN_VALUE Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 21 / 30
  • 22. The Marshal Matters 2 0 LOAD_FAST 0 (a) def f(a): 3 LOAD_CONST 1 (1) return a[1 : 10 : -1] 6 LOAD_CONST 2 (10) 2 0 FAST_BINOP_CONST 0 (a) 1 (slice(1, 10, -1)) 7 ([]) 9 LOAD_CONST 3 (-1) 2 RETURN_VALUE 12 BUILD_SLICE 3 15 BINARY_SUBSCR 16 RETURN_VALUE Constant Slice object 2 0 LOAD_FAST 0 (a) 3 LOAD_CONST 1 (1) def f(a): Now marshaled 6 LOAD_CONST 2 (10) x = a[1 : 10 : -1] 9 LOAD_CONST 3 (-1) return x 12 BUILD_SLICE 3 15 BINARY_SUBSCR 2 0 FAST_SUBSCR_CONST_TO_FAST 0 (a) 1 (slice(1, 10, -1)) 1 (x) 16 STORE_FAST 1 (x) 3 2 LOAD_FAST 1 (x) 3 19 LOAD_FAST 1 (x) 3 RETURN_VALUE 22 RETURN_VALUE Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 22 / 30
  • 23. Peepholer moved in compile.c Pros: • Simpler opcode handling (using instr structure) on average • Very easy jump manipulation • No memory allocation (for bytecode and jump buffers) • Works only on blocks (no boundary calculations) • No 8 & 16 bits values distinct optimizations • Always applied (even on 32 bits values code) Cons: • Works only on blocks (no global code “vision”) • Worse unreachable code removing • Some jump optimizations missing (will be fixed! ;) Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 23 / 30
  • 24. Experimental INTEGER opcodes • Disabled by default (uncomment wpython.h / WPY_SMALLINT_SUPER_INSTRUCTIONS) • BINARY operations only • One operand must be an integer (0 .. 255) • No reference counting • Less constants usage (to be done) • Optimized integer code • Direct integer functions call • Fast integer “fallback” • Explicit long or float “fallback” Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 24 / 30
  • 25. 2 0 LOAD_FAST 0 (n) 3 LOAD_CONST 1 (1) 6 COMPARE_OP 9 JUMP_IF_FALSE 1 (<=) 5 (to 17) def fib(n): An example 12 POP_TOP if n <= 1: 3 13 LOAD_CONST 1 (1) return 1 16 RETURN_VALUE else: >> 17 POP_TOP return fib(n - 2) + fib(n – 1) 5 18 LOAD_GLOBAL 0 (fib) 21 LOAD_FAST 0 (n) 2 0 FAST_BINOP_INT 0 (n) 1 29 (<=) 24 LOAD_CONST 2 (2) 2 JUMP_IF_FALSE 2 (to 5) 27 BINARY_SUBTRACT 3 3 RETURN_CONST 1 (1) 28 CALL_FUNCTION 1 4 JUMP_FORWARD 10 (to 15) 31 LOAD_GLOBAL 0 (fib) 5 >> 5 LOAD_GLOBAL 0 (fib) 34 LOAD_FAST 0 (n) 6 FAST_BINOP_INT 0 (n) 2 6 (-) 37 LOAD_CONST 1 (1) 8 QUICK_CALL_FUNCTION 1 (1 0) 40 BINARY_SUBTRACT 9 LOAD_GLOBAL 0 (fib) 41 CALL_FUNCTION 1 10 FAST_BINOP_INT 0 (n) 1 6 (-) 44 BINARY_ADD 12 QUICK_CALL_FUNCTION 1 (1 0) 45 RETURN_VALUE 13 BINARY_ADD 46 LOAD_CONST 0 (None) 14 RETURN_VALUE 49 RETURN_VALUE >> 15 RETURN_CONST 0 (None) Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 25 / 30
  • 26. An INTEGER opcode dissected FAST_BINOP_INT 0 (n) 1 29 (<=) 156 0 1 29 Binary operator “sub-opcode”. 29 -> CMP_LE -> “<=“ Short integer value. No “index” on constants table Index on local variables table. 0 -> “n” FAST_BINOP_INT “main” opcode Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 26 / 30
  • 27. A look inside #define _Py_Int_FromByteNoRef(byte) ((PyObject *) _Py_Int_small_ints[ case FAST_BINOP_INT: _Py_Int_NSMALLNEGINTS + (byte)]) x = GETLOCAL(oparg); if (x != NULL) { NEXTARG16(oparg); if (PyInt_CheckExact(x)) Extracts operator x = INT_BINARY_OPS_Table[EXTRACTARG(oparg)]( PyInt_AS_LONG(x), EXTRACTOP(oparg)); Extracts integer else x = BINARY_OPS_Table[EXTRACTARG(oparg)]( Converts PyInt x, _Py_Int_FromByteNoRef(EXTRACTOP(oparg))); if (x != NULL) { to integer PUSH(x); continue; } Converts integer break; } to PyInt PyRaise_UnboundLocalError(co, oparg); break; Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 27 / 30
  • 28. Optimized integer code #define _Py_Int_FallBackOperation( static PyObject * newtype, operation) { Py_INT_BINARY_OR(register long a, register long b) PyObject *v, *w, *u; { v = newtype(a); return PyInt_FromLong(a | b); if (v == NULL) } return NULL; w = newtype(b); static PyObject * if (w == NULL) { Py_INT_BINARY_ADD2(register long a, register long b) Py_DECREF(v); { return NULL; register long x = a + b; } if ((x^a) >= 0 || (x^b) >= 0) u = operation; return PyInt_FromLong(x); Py_DECREF(v); _Py_Int_FallBackOperation(PyLong_FromLong, Py_DECREF(w); PyLong_Type.tp_as_number->nb_add(v, w)); return u; } } Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 28 / 30
  • 29. Direct integer functions call static PyObject * (*INT_BINARY_OPS_Table[])(register long, register long) = { Defined in intobject.h Py_INT_BINARY_POWER, /* BINARY_POWER */ _Py_int_mul, /* BINARY_MULTIPLY */ Py_INT_BINARY_DIVIDE, /* BINARY_DIVIDE */ PyObject * Implemented in intobject.c _Py_int_mul(register long a, register long b) { long longprod; /* a*b in native long arithmetic */ double doubled_longprod; /* (double)longprod */ double doubleprod; /* (double)a * (double)b */ longprod = a * b; doubleprod = (double)a * (double)b; doubled_longprod = (double)longprod; Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 29 / 30
  • 30. WPython future • Stopped as wordcodes “proof-of-concept”: no more releases! • No Python 2.7 porting (2.x is at a dead end) • Python 3.2+ reimplementation, if community asks • Reintroducing PyObject “hack”, and extending to other cases • CPython “limits” proposal (e.g. max 255 local variables) • Define a “protected” interface to identifiers for VM usage • Rethinking something (code blocks, stack usage, jumps) • Tons optimizations waiting… Cesare Di Mauro – PyCon Quattro 2010 Cleanup and new optimizations in WPython 1.1 May 9, 2010 30 / 30