2. Good morning
Julien PAULI
PHP programmer for many years
PHP internals source hacker
5.5 and 5.6 Release Manager
Writes tech articles and books
http://www.phpinternalsbook.com
http://jpauli.github.io
Working at SensioLabs in Paris
Mainly doing cool C stuff on PHP / Symfony2
@julienpauli - github.com/jpauli - jpauli@php.net
3. The road
Compile PHP and use debug mode
PHP extensions details and lifetime
PHP extensions globals management
Memory management
PHP variables : zvals
PHP INI settings
PHP functions, objects and classes
Overwriting existing behaviors
Changing deep Zend Engine behaviors
4. What you should bring
A laptop under Linux/Unix
Good C knowledge
Linux knowledge (your-OS knowledge)
Any C dev environment
These slides assume a Debian-based Linux
We won't bother with thread safety
You can just ignore those TSRM* things you'll meet
5. Compiling PHP, using debug
Grab the PHP source code from php.net or git
Install a C code compiling environment
You'll probably need some libs to compile PHP
:$> apt-get install build-essential autoconf
:$> apt-get install libxml2-dev
6. Compiling PHP, using debug
Compile a debug PHP and install it
Do not forget debug flag
Extensions compiled against debug PHP won't load
on "normal" PHP => need recompile
Always develop extension under debug mode
:$> ./configure --enable-debug --prefix=my/install/dir
:$> make && make install
:$> my/install/dir/bin/php -v
PHP 5.5.6-dev (cli) (built: Nov 5 2013 17:33:45) (DEBUG)
Copyright (c) 1997-2013 The PHP Group
Zend Engine v2.5.0, Copyright (c) 1998-2013 Zend Technologies
7. Create your first extension
There is a skeleton generator, let's use it
:$> cd phpsrc/ext
:$phpsrc/ext> ./ext_skel --extname=extworkshop
Creating directory extworkshop
Creating basic files: config.m4 config.w32 .svnignore extworkshop.c
php_extworkshop.h CREDITS EXPERIMENTAL
tests/001.phpt extworkshop.php
[done].
:~> cd extworkshop && tree
.
|-- config.m4
|-- config.w32
|-- CREDITS
|-- EXPERIMENTAL
|-- extworkshop.c
|-- extworkshop.php
|-- php_extworkshop.h
`-- tests
    `-- 001.phpt
8. Activate your first extension
config.m4 tells the build tools about your ext
Uncomment --enable if your extension is standalone
Uncomment --with if your extension depends on
other libraries
:~/extworkshop> vim config.m4
PHP_ARG_ENABLE(extworkshop, whether to enable extworkshop support,
[ --enable-extworkshop Enable extworkshop support])
PHP_NEW_EXTENSION(extworkshop, extworkshop.c, $ext_shared)
9. Compile and install your ext
phpize tool is under `php-install-dir`/bin
It's a shell script importing PHP sources into your ext dir
for it to get ready to compile
It performs some checks
It imports the configure script
This will make you compile a shared object
For static compilation, rebuild main configure using
buildconf script
Run phpize --clean to clean the env when finished
:~/extworkshop> phpize && ./configure --with-php-config=/path/to/php-config \
    && make install
10. API numbers
PHP Api Version is the version number of the internal API
ZendModule API is the API of the extension system
ZendExtension API is the API of the zend_extension system
ZEND_DEBUG and ZTS are about debug mode activation and
thread safety layer activation
These 5 criteria must match between PHP and your extension when you load it
Different PHP versions have different API numbers
Extensions may not work cross-PHP versions
Configuring for:
PHP Api Version: 20121113
Zend Module Api No: 20121212
Zend Extension Api No: 220121212
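The matching rule above can be sketched in plain C. This is only an illustration: `toy_build_id` and `toy_can_load()` are hypothetical names, and the real engine performs its checks against fields of zend_module_entry when the shared object is loaded.

```c
#include <stdio.h>

/* Hypothetical, simplified "build identity" (not a Zend structure). */
typedef struct {
    int zend_module_api_no; /* e.g. 20121212 */
    int zend_debug;         /* 1 if built with --enable-debug */
    int zts;                /* 1 if built with thread safety */
} toy_build_id;

/* Returns 1 when the extension may be loaded into this PHP, 0 otherwise. */
static int toy_can_load(const toy_build_id *php, const toy_build_id *ext)
{
    if (php->zend_module_api_no != ext->zend_module_api_no) {
        fprintf(stderr, "API mismatch: PHP=%d, extension=%d\n",
                php->zend_module_api_no, ext->zend_module_api_no);
        return 0;
    }
    if (php->zend_debug != ext->zend_debug || php->zts != ext->zts) {
        fprintf(stderr, "Build mismatch (debug or thread safety)\n");
        return 0;
    }
    return 1;
}
```

An extension built against a non-debug PHP would therefore be refused by a debug PHP even when the API numbers are identical, which is why a recompile is needed.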
13. What extensions can do
Extensions can :
Add new functions, classes, interfaces
Add and manage php.ini settings and phpinfo() output
Add new global variables or constants
Add new stream wrappers/filters, new resource types
Overwrite what other extensions defined
Hook by overwriting global function pointers
Extensions cannot :
Modify PHP syntax
Zend extensions :
Are able to hook into OPArrays (very advanced usage)
14. Why create an extension ?
Bundle an external library code into PHP
redis, curl, gd, zip ... so many of them
Optimize performance by adding features
C is way faster than PHP
C is used everywhere in Unix/Linux, including Kernel
Create your own C structures and manage them by
providing PHP functions
Create your own resource intensive algorithms
Example : https://github.com/phadej/igbinary
15. C vs PHP
Don't try to turn the world to C
Why you should use PHP over C :
C is way more difficult to develop than PHP
C is less maintainable
C can be really tricky to debug
C is platform dependent. Cross-platform support can turn into a PITA
Cross-PHP-Version is a pain
Why you should use C over PHP :
Bundle an external lib into PHP (can't be done in PHP)
Looking for very high speed and fast/efficient algos
Changing PHP behavior deeply, make it do what you want
21. Zend Memory Manager API
Request-lifetime heap memory should be
allocated and freed using the ZMM API
Infinite-lifetime memory can be allocated using the
ZMM "persistent" API, or direct libc calls
#define emalloc(size)
#define safe_emalloc(nmemb, size, offset)
#define efree(ptr)
#define ecalloc(nmemb, size)
#define erealloc(ptr, size)
#define safe_erealloc(ptr, nmemb, size, offset)
#define erealloc_recoverable(ptr, size)
#define estrdup(s)
#define estrndup(s, length)
#define zend_mem_block_size(ptr)
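The safe_*() variants take (nmemb, size, offset) because they guard the computation nmemb * size + offset against integer overflow. Here is a minimal standalone sketch of that idea; `toy_safe_size()` and `toy_safe_malloc()` are hypothetical helpers built on libc, not the ZMM implementation, and as far as I know the real macro stops the request with a fatal error instead of returning NULL.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Computes nmemb * size + offset into *total.
   Returns 0 on success, -1 if the result would overflow size_t. */
static int toy_safe_size(size_t nmemb, size_t size, size_t offset, size_t *total)
{
    if (size != 0 && nmemb > (SIZE_MAX - offset) / size) {
        return -1;
    }
    *total = nmemb * size + offset;
    return 0;
}

/* Allocates nmemb * size + offset bytes, or returns NULL on overflow. */
static void *toy_safe_malloc(size_t nmemb, size_t size, size_t offset)
{
    size_t total;

    if (toy_safe_size(nmemb, size, offset, &total) != 0) {
        return NULL;
    }
    return malloc(total);
}
```

Use this shape whenever the element count comes from user input, for example when allocating a buffer for n items plus a header.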
22. ZMM help
ZMM alloc functions track leaks for you
They help you find leaks and overwrites
If PHP is built with --enable-debug
If report_memleaks is On in php.ini (default)
Always use ZMM alloc functions
Don't hesitate to use valgrind to debug memory
USE_ZEND_ALLOC=0 env var disables ZendMM
24. Zval intro
Zval is the basis of every extension
typedef union _zvalue_value {
    long lval; /* long value */
    double dval; /* double value */
    struct {
        char *val; /* string value (binary safe) */
        int len; /* string length */
    } str;
    HashTable *ht; /* hash table value */
    zend_object_value obj; /* object value */
} zvalue_value;
struct _zval_struct {
zvalue_value value; /* value */
zend_uint refcount__gc; /* refcount */
zend_uchar type; /* active type */
zend_uchar is_ref__gc; /* is_ref flag */
};
typedef struct _zval_struct zval;
25. Zval steps
There are tons of macros to help you :
Allocate memory for a zval
Change zval type
Change zval real value
Deal with is_ref and refcount
Free zval
Copy zval
Return zval from PHP functions
...
27. Zval and pointers
You'll manipulate, basically :
zval : use MACRO()
zval* : use MACRO_P()
zval ** : use MACRO_PP()
Read the macro expansions
Use your IDE
Play with pointers
Z_ADDREF(myzval)
Z_ADDREF_P(myzval *)
Z_ADDREF_PP(myzval **)
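The naming convention can be illustrated with a toy structure. The sketch below uses hypothetical `TOY_ADDREF*` macros on a hypothetical `toy_cell` type; it only mirrors the MACRO / MACRO_P / MACRO_PP pattern and is not how the Zend headers define theirs.

```c
/* A stand-in for a zval: only the refcount matters here. */
typedef struct {
    unsigned int refcount;
} toy_cell;

/* Same operation, three levels of indirection. */
#define TOY_ADDREF(c)      (++(c).refcount)       /* toy_cell   */
#define TOY_ADDREF_P(pc)   TOY_ADDREF(*(pc))      /* toy_cell * */
#define TOY_ADDREF_PP(ppc) TOY_ADDREF(**(ppc))    /* toy_cell ** */
```

Picking the wrong suffix usually fails at compile time, which is a good reason to read the macro expansions when the compiler complains.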
29. Zval Types
long = 4 or 8 bytes (depends on OS and arch)
Use SIZEOF_LONG macro to know
double = 8 bytes IEEE754
strings are binary safe
They are NUL-terminated, but may contain embedded NULs
So they store their length as an int
length = number of bytes, not counting the terminating NUL
Many macros to take care of them
Bools are stored as a long (1 or 0)
Resources are stored as a long (ResourceId)
Arrays = HashTable type (more later)
Objects = lots of things involved (more later)
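To see why binary-safe strings must carry their length, here is a small standalone sketch. `toy_str`, `TOY_STR_LITERAL` and `toy_str_equals()` are hypothetical names used for illustration; they only imitate the val/len pair of the zval string.

```c
#include <stddef.h>
#include <string.h>

/* A pointer plus an explicit length, like the str member of a zval. */
typedef struct {
    const char *val;
    int len; /* bytes, not counting the terminating NUL */
} toy_str;

/* Builds a toy_str from a literal; sizeof() sees embedded NULs, strlen() does not. */
#define TOY_STR_LITERAL(s) { (s), (int)(sizeof(s) - 1) }

/* Length-aware comparison: compares every byte, including embedded NULs. */
static int toy_str_equals(const toy_str *a, const toy_str *b)
{
    return a->len == b->len && memcmp(a->val, b->val, (size_t)a->len) == 0;
}
```

With "foo\0bar" and "foo\0baz", strcmp() stops at the first NUL and reports equality, while the length-aware comparison tells them apart.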
30. Zval Types macros
When you want to read or write a Zval, you use
once again dedicated macros :
zval *myval; ALLOC_INIT_ZVAL(myval);
ZVAL_DOUBLE(myval, 16.3);
ZVAL_BOOL(myval, 1);
ZVAL_STRINGL(myval, "foo", sizeof("foo")-1, 1);
ZVAL_EMPTY_STRING(myval);
printf("%.*s", Z_STRLEN_P(myval), Z_STRVAL_P(myval));
printf("%ld", Z_LVAL_P(myval));
...
31. Zval type switching
You can ask PHP to switch from a type to
another, using its known internal rules
Those functions change the zval*, returning void
Use the _ex() API if separation is needed
convert_to_array()
convert_to_object()
convert_to_string()
convert_to_boolean()
convert_to_null()
convert_to_long()
32. Zval gc info
is_ref
Boolean value 0 or 1
1 means zval has been used as a reference
Engine behavior changed when is_ref = 1
Copy or not copy ? The answer is changed if is_ref=1
Refcount
Represents the number of different places where this
zval is used in PHP (in the entire process)
1 = you are the only one using it (hopefully)
>1 = the zval is used elsewhere
Take care not to modify it if not wanted (separate it before)
Bad refcount management leads to crashes or leaks
33. Creating and destroying zvals
Creation = allocation + gc initialization
Can be done at once (+ sets value to IS_NULL)
Destruction = decrement refcount and if refcount
reaches 0 : free
Tip : Don't try to use malloc() manually for zvals
Tip : Don't try to free your zval manually
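/* Two steps: allocate, then initialize the gc info */
zval *myval;
ALLOC_ZVAL(myval);
INIT_PZVAL(myval);
/* Or both at once (the value is also set to NULL) */
zval *myval;
ALLOC_INIT_ZVAL(myval);
/* Destruction: decrement the refcount, free if it reaches 0 */
zval_ptr_dtor(&myval);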
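The refcount rules (share by incrementing, separate before writing, free when the count reaches 0) can be tried out on a toy type. Everything below is a hypothetical sketch using malloc()/free(); real zvals must go through the Zend macros and the ZMM.

```c
#include <stdlib.h>

/* A toy refcounted value holding a long. */
typedef struct {
    long value;
    unsigned int refcount;
} toy_val;

/* Allocation + gc initialization in one step (refcount starts at 1). */
static toy_val *toy_val_new(long value)
{
    toy_val *v = malloc(sizeof(*v));
    if (v != NULL) {
        v->value = value;
        v->refcount = 1;
    }
    return v;
}

/* One more place is now using the value. */
static void toy_val_addref(toy_val *v)
{
    v->refcount++;
}

/* Decrements the refcount and frees at 0. Returns 1 if the value was freed. */
static int toy_val_dtor(toy_val *v)
{
    if (--v->refcount == 0) {
        free(v);
        return 1;
    }
    return 0;
}

/* Separation: if the value is shared, replace *vp with a private copy.
   Returns the writable value, or NULL if the copy could not be allocated. */
static toy_val *toy_val_separate(toy_val **vp)
{
    if ((*vp)->refcount > 1) {
        toy_val *copy = toy_val_new((*vp)->value);
        if (copy == NULL) {
            return NULL;
        }
        (*vp)->refcount--;
        *vp = copy;
    }
    return *vp;
}
```

Writing through a shared pointer without separating first would silently change the value seen by every other user, which is the kind of bug the refcount is there to prevent.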
38. Function exercise
Declare two new functions
celsius_to_fahrenheit
fahrenheit_to_celsius
They should just be empty for the moment
Confirm all works
39. Functions deeper
Recall the signature of a function :
ht: number of arguments passed to the function
return_value: zval to feed with your data
Allocation is already done by the engine
return_value_ptr: used for return-by-ref functions
this_ptr: pointer to $this for OO context
return_value_used: set to 0 if the return value is
not used (e.g. a call like foo();)
tsrm_ls: thread local storage slot
PHP_FE(function_name, function_arginfo)
void zif_my_function(int ht, zval *return_value, zval **return_value_ptr, zval *this_ptr,
int return_value_used , void ***tsrm_ls)
40. Functions: accepting arguments
A very nice API exists
Have a look at
phpsrc/README.PARAMETER_PARSING_API
zend_parse_parameters() converts arguments to
the type you ask
Follows PHP rules
zend_parse_parameters() short : "zpp"
zend_parse_parameters(int num_args_to_parse, char* arg_types, (va_arg args...))
41. Playing with zpp
zpp returns FAILURE or SUCCESS
On failure, you usually return, the engine takes care
of the PHP error message
You always use pointers to data in zpp
PHP_FUNCTION(foo)
{
long mylong;
if (zend_parse_parameters(ZEND_NUM_ARGS(), "l", &mylong) == FAILURE) {
return;
}
RETVAL_LONG(mylong);
}
42. zpp formats
a - array (zval*)
A - array or object (zval *)
b - boolean (zend_bool)
C - class (zend_class_entry*)
d - double (double)
f - function or array containing php method call info (returned as
zend_fcall_info and zend_fcall_info_cache)
h - array (returned as HashTable*)
H - array or HASH_OF(object) (returned as HashTable*)
l - long (long)
L - long, limits out-of-range numbers to LONG_MAX/LONG_MIN (long)
o - object of any type (zval*)
O - object of specific type given by class entry (zval*, zend_class_entry)
p - valid path (string without null bytes in the middle) and its length (char*, int)
r - resource (zval*)
s - string (with possible null bytes) and its length (char*, int)
z - the actual zval (zval*)
Z - the actual zval (zval**)
* - variable arguments list (0 or more)
+ - variable arguments list (1 or more)
43. zpp special formats
| - indicates that the remaining parameters are optional, they
should be initialized to default values by the extension since they
will not be touched by the parsing function if they are not
passed to it.
/ - use SEPARATE_ZVAL_IF_NOT_REF() on the parameter it follows
! - the parameter it follows can be of specified type or NULL. If NULL is
passed and the output for such type is a pointer, then the output
pointer is set to a native NULL pointer.
For 'b', 'l' and 'd', an extra argument of type zend_bool* must be
passed after the corresponding bool*, long* or double* arguments,
respectively. A non-zero value will be written to the zend_bool if and
only if a PHP NULL is passed.
44. zpp examples
char *name, *value = NULL, *path = NULL, *domain = NULL;
long expires = 0;
zend_bool secure = 0, httponly = 0;
int name_len, value_len = 0, path_len = 0, domain_len = 0;
if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s|slssbb", &name,
&name_len, &value, &value_len, &expires, &path,
&path_len, &domain, &domain_len, &secure, &httponly) == FAILURE) {
return;
}
/* Gets an object or null, and an array.
If null is passed for object, obj will be set to NULL. */
zval *obj;
zval *arr;
if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "o!a",
&obj, &arr) == FAILURE) {
return;
}
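To read a format string such as "s|slssbb" quickly, it helps to count how many PHP arguments are required and how many are accepted at most. The helper below is a hypothetical sketch written for these slides, not part of the engine; it deliberately ignores the variadic specifiers '*' and '+'.

```c
#include <stddef.h>

/* Counts the PHP arguments described by a zpp-like format string.
   Letters before '|' are required, letters after it are optional;
   '/' and '!' are modifiers and do not consume an argument.
   Returns 0 on success, -1 if '|' appears more than once. */
static int toy_count_spec_args(const char *spec, int *min_args, int *max_args)
{
    int count = 0;
    int min = -1;
    const char *p;

    for (p = spec; *p != '\0'; p++) {
        switch (*p) {
        case '|':
            if (min != -1) {
                return -1;
            }
            min = count;
            break;
        case '/':
        case '!':
            break;
        default:
            count++;
            break;
        }
    }
    *min_args = (min == -1) ? count : min;
    *max_args = count;
    return 0;
}
```

For "s|slssbb" this yields 1 required and 7 accepted PHP arguments, even though 11 C pointers are passed, because each 's' fills both a char* and an int.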
45. Practice zpp
Make our temperature functions accept an
argument and return a real result
Parse the argument
Check RETVAL_**() macros, they'll help
°C x 9/5 + 32 = °F
(°F - 32) x 5/9 = °C
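The two formulas translate directly to C, with one trap: 9/5 and 5/9 are integer divisions in C (1 and 0), so the constants must be doubles. The function names below are placeholders for the arithmetic only; wiring them into PHP_FUNCTION() and zpp remains the exercise.

```c
/* °C x 9/5 + 32 = °F */
static double toy_celsius_to_fahrenheit(double celsius)
{
    return celsius * 9.0 / 5.0 + 32.0;
}

/* (°F - 32) x 5/9 = °C */
static double toy_fahrenheit_to_celsius(double fahrenheit)
{
    return (fahrenheit - 32.0) * 5.0 / 9.0;
}
```

Parsing the argument with the "d" specifier and returning the result with RETVAL_DOUBLE() would keep the whole computation in floating point.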
46. Writing a test
PHP's got a framework for testing itself and its
extensions
Welcome "PHPT"
Learn more about it at
http://qa.php.net/write-test.php
49. Generating errors
Two kinds :
Errors
Exceptions
For errors :
php_error_docref() : Sends an error with a docref
php_error() / zend_error() : Sends an error
For exceptions :
zend_throw_exception() : throws an exception
php_error(E_WARNING, "The number %lu is too big", myulong);
zend_throw_exception_ex(zend_exception_get_default(), 0, "%lu too big", myulong);
50. Practice errors
Create a function
temperature_converter($value, $convert_type)
convert_type can only be 1 or 2
1 = °F to °C
2 = °C to °F
It should output an error if $convert_type is wrong
The function should return a string describing the
scenario run
Have a look at php_printf() function to help
echo temperature_converter(20, 2);
"20 degrees celsius give 68 degrees fahrenheit"
echo temperature_converter(20, 8);
Warning: convert_type not recognized
51. A quick word on string formats
Know your libc's printf() formats
http://www.cplusplus.com/reference/cstdio/printf/
Always use right formats with well sized buffers
Lots of PHP functions use an "extra", internal
implementation of libc's printf() called spprintf()
Error messages for example
Read spprintf.c and snprintf.h to know more
about PHP specific formats, such as "%Z"
Lots of nice comments in those sources
52. Function argument declaration
zpp is clever enough to compute needed args
zpp uses ZEND_NUM_ARGS()
it may return FAILURE if number is incorrect
PHP_FUNCTION(foo)
{
long mylong;
if (zend_parse_parameters(ZEND_NUM_ARGS(), "l", &mylong) == FAILURE) {
return;
}
}
<?php
foo();
Warning: foo() expects exactly 1 parameter, 0 given in /tmp/myext.php on line 3
53. Function args declaration
Try to use Reflection on your temperature
functions
$> php -dextension=extworkshop.so --rf temperature_converter
54. Function args declaration
You may help Reflection know about the
accepted parameters
For this, you need to declare them all to the engine
The engine can't compute them by itself
Welcome "arginfos"
59. HashTable quickly
A well-known C structure
Lots of ways to implement them in C
Most operations are O(1), with a worst case of O(n)
http://lxr.linux.no/linux+v3.12.5/include/linux/list.h#L560
Used everywhere, in every serious program
Implementation of PHP arrays
Keys can be numeric type or string type
Values can be any type
61. Zend HashTables
zend_hash.c / zend_hash.h
HashTable struct
big API
Doubled : one variant for numeric keys, one for string keys
Can store any data, but we'll mainly use them to
store zvals
A specific API exists to ease zval storing
HashTables are widely used in PHP
Not only PHP arrays, they are used internally
everywhere
gdb functions in .gdbinit to help debugging them
63. Zend HashTable API
Hash size is rounded up to the next power of two
If size is exceeded, HashTable will automatically be
resized, but at a (low) CPU cost
pHashFunction is not used anymore, give NULL
pDestructor is the destructor function
Will be called on each element when you remove it
from the Hash
This is used to manage memory : usually free it
If you use zvals in your Hash, use ZVAL_PTR_DTOR as
a destructor
int zend_hash_init(HashTable *ht, uint nSize, hash_func_t pHashFunction,
dtor_func_t pDestructor, zend_bool persistent);
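The rounding of nSize can be sketched as follows. This is a hypothetical standalone function; the minimum size of 8 and the 0x80000000 cap are assumptions based on my reading of the PHP 5 sources, so check zend_hash.c for the authoritative logic.

```c
/* Rounds a requested table size up to the next power of two.
   Assumed bounds: never smaller than 8, never larger than 2^31. */
static unsigned int toy_round_table_size(unsigned int requested)
{
    unsigned int size = 8;

    if (requested >= 0x80000000U) {
        return 0x80000000U;
    }
    while (size < requested) {
        size <<= 1;
    }
    return size;
}
```

Asking for 10 slots, as in a call like zend_hash_init(&myht, 10, ...), would therefore reserve 16 under these assumptions.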
64. Zend HashTable API example
HashTable myht = {0};
zend_hash_init(&myht, 10, NULL, ZVAL_PTR_DTOR, 0);
zval *myval = NULL;
ALLOC_INIT_ZVAL(myval);
ZVAL_STRINGL(myval, "Hello World", sizeof("Hello World") - 1, 1);
if (zend_hash_add(&myht, "myvalue", sizeof("myvalue"), &myval, sizeof(zval *), NULL)
== FAILURE) {
php_error(E_WARNING, "Could not add value to Hash");
} else {
php_printf("The hashTable contains %d elements", zend_hash_num_elements(&myht));
}
65. Zend HT common mistakes
A HashTable is not a zval
PHP_FUNCTIONs() mainly manipulate zvals (return_value)
Use, f.e. array_init() to create a zval containing a HT
Access the HT inside a zval using the Z_ARRVAL*() macros
Lots of HashTable functions return the SUCCESS or FAILURE macros (0 and -1), which
are not the same as true and false
Very common mistake is writing if (zend_hash_add(...)) { }
HashTables manipulate pointers to your data
If your data is a pointer (zval *), it will then store a zval **, and give you back a
zval **
You usually add to it a pointer address : &my_zval_pointer
A very common error is to forget one '*' or one '&'
If care is not taken, leaks may appear or worse : invalid memory access
and dangling pointers
String key length includes the terminating NUL, no "-1" needed
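Two of these mistakes are easy to reproduce outside PHP. In the sketch below, `TOY_SUCCESS` and `TOY_FAILURE` are hypothetical stand-ins that use the same values as the Zend macros (0 and -1), and `toy_add_once()` is an invented function that fails on its second call, like adding an already existing key.

```c
#include <string.h>

#define TOY_SUCCESS 0
#define TOY_FAILURE -1

/* Succeeds the first time, fails afterwards (the "key already exists" case). */
static int toy_add_once(int *slot_used)
{
    if (*slot_used) {
        return TOY_FAILURE;
    }
    *slot_used = 1;
    return TOY_SUCCESS;
}

/* Wrong: if (toy_add_once(&used)) { ... } runs the body on FAILURE,
   because -1 is true in C and 0 is false.
   Right: if (toy_add_once(&used) == TOY_SUCCESS) { ... } */

/* Key lengths: sizeof("myvalue") counts the terminating NUL (8),
   strlen("myvalue") does not (7). */
```

Always compare the return value explicitly against SUCCESS or FAILURE, and pass sizeof("key") where the API expects the length including the NUL.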
66. HashTable retrieve API
PHP_FUNCTION(foo)
{
HashTable *myht;
zval **data = NULL;
if (zend_parse_parameters(ZEND_NUM_ARGS(), "h", &myht) == FAILURE) {
return;
}
if (zend_hash_find(myht, "foo", sizeof("foo"), (void **)&data) == FAILURE) {
php_error(E_NOTICE, "Key 'foo' does not exist");
return;
}
RETVAL_ZVAL(*data, 1, 0);
}
67. HashTable exercise
Create a function that accepts any number of
temperature values in an array and converts
them to °C or °F
--TEST--
Test temperature converter array
--FILE--
<?php
$temps = array(68, 77, 78.8);
var_dump(multiple_fahrenheit_to_celsius($temps));
?>
--EXPECTF--
array(3) {
  [0]=>
  float(20)
  [1]=>
  float(25)
  [2]=>
  float(26)
}
68. References exercise
Turn multiple_fahrenheit_to_celsius() into an
accept-by-reference function
--TEST--
Test temperature converter array by-ref
--FILE--
<?php
$temps = array(68, 77, 78.8);
multiple_fahrenheit_to_celsius($temps);
var_dump($temps);
?>
--EXPECTF--
array(3) {
  [0]=>
  float(20)
  [1]=>
  float(25)
  [2]=>
  float(26)
}
69. References declaration
Any mismatch in arginfo<->arg passed leads to
separation (which is bad as it often dups memory)
zval passed to function   arginfo decl.   zval received in function   separated by engine?
is_ref=0, refcount = 1    pass_by_ref=0   is_ref=0, refcount = 2      NO
is_ref=1, refcount > 1    pass_by_ref=0   is_ref=1, refcount = 1      YES
is_ref=0, refcount > 1    pass_by_ref=0   is_ref=0, refcount > 1 ++   NO
is_ref=0, refcount = 1    pass_by_ref=1   is_ref=1, refcount = 2      YES
is_ref=1, refcount > 1    pass_by_ref=1   is_ref=1, refcount > 1 ++   NO
is_ref=0, refcount > 1    pass_by_ref=1   is_ref=1, refcount = 2      YES
70. Constants
Constants are really easy to use in the engine
You usually register yours in MINIT() phase, use
CONST_PERSISTENT (if not, const will be cleared at RSHUTDOWN)
You can read any constant with an easy API
PHP_MINIT_FUNCTION(extworkshop)
{
REGISTER_STRING_CONSTANT("fooconst", "foovalue",
CONST_CS | CONST_PERSISTENT);
return SUCCESS;
}
zend_module_entry extworkshop_module_entry = {
STANDARD_MODULE_HEADER,
"extworkshop",
extworkshop_functions, /* Function entries */
PHP_MINIT(extworkshop), /* Module init */
NULL, /* Module shutdown */
...
71. Reading constants
zend_get_constant() already duplicates the
value, no need to do it manually
PHP_FUNCTION(foo)
{
zval result_const, *result_const_ptr = &result_const;
if (zend_get_constant("fooconst", sizeof("fooconst") - 1, &result_const)) {
RETURN_ZVAL(result_const_ptr, 0, 0);
}
php_error(E_NOTICE, "Could not find 'fooconst' constant");
}
72. Practice constants
Create two constants for
temperature_converter() $mode argument
TEMP_CONVERTER_TO_CELSIUS
TEMP_CONVERTER_TO_FAHRENHEIT
73. Customizing phpinfo()
Extensions may provide information to the phpinfo
functionality
PHP_MINFO() used
php_info_*() functions
Beware HTML and non-HTML SAPI
zend_module_entry extworkshop_module_entry = {
/* ... */
PHP_MINFO(extworkshop),
/* ... */
};
PHP_MINFO_FUNCTION(extworkshop)
{
    php_info_print_table_start();
    php_info_print_table_header(2, "myext support", "enabled");
    php_info_print_table_end(); /* every table_start needs its table_end */
}
76. INI general concepts
Each extension may register as many INI settings as it wants
Remember INI entries may change during request lifetime
They store both their original value and their modified (if any) value
They store an access level to declare how their value can be altered
(PHP_INI_USER, PHP_INI_SYSTEM, etc...)
PHP's ini_set() modifies the entry value at runtime
PHP's ini_restore() restores the original value as current value
INI entries may be displayed (mainly using phpinfo()), they embed
a "displayer" function pointer
INI entries are attached to an extension
77. An INI entry in PHP
struct _zend_ini_entry {
int module_number;
int modifiable;
char *name;
uint name_length;
ZEND_INI_MH((*on_modify));
void *mh_arg1;
void *mh_arg2;
void *mh_arg3;
char *value;
uint value_length;
char *orig_value;
uint orig_value_length;
int orig_modifiable;
int modified;
void (*displayer)(zend_ini_entry *ini_entry, int type);
};
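The interplay between value, orig_value and modified can be modeled with a toy entry. This is a hypothetical simplification (libc strings, no access levels, no on_modify callback); it only sketches how a set followed by a restore could behave.

```c
#include <stdlib.h>
#include <string.h>

/* A toy INI entry: current value, saved original value, modified flag. */
typedef struct {
    const char *name;
    char *value;      /* heap-allocated current value */
    char *orig_value; /* original value, only set once modified */
    int modified;
} toy_ini_entry;

/* Portable strdup() replacement. */
static char *toy_strdup(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy != NULL) {
        memcpy(copy, s, n);
    }
    return copy;
}

/* Like ini_set(): the first modification saves the original value. */
static int toy_ini_set(toy_ini_entry *e, const char *new_value)
{
    char *copy = toy_strdup(new_value);
    if (copy == NULL) {
        return -1;
    }
    if (!e->modified) {
        e->orig_value = e->value;
        e->modified = 1;
    } else {
        free(e->value);
    }
    e->value = copy;
    return 0;
}

/* Like ini_restore(): the original value becomes the current one again. */
static void toy_ini_restore(toy_ini_entry *e)
{
    if (e->modified) {
        free(e->value);
        e->value = e->orig_value;
        e->orig_value = NULL;
        e->modified = 0;
    }
}
```

Setting the entry twice and then restoring it gives back the very first value, since only the first modification touches orig_value.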
78. INI entries main
Register at MINIT
Unregister at MSHUTDOWN
Display in phpinfo() (usually)
Many MACROS (once more)
Read the original or the modified value (your
choice) in your extension
Create your own modifier/displayer (if needed)
79. My first INI entry
This declares a zend_ini_entry vector
Register / Unregister it
Display it in phpinfo
PHP_INI_BEGIN()
PHP_INI_ENTRY("logger.default_file", LOGGER_DEFAULT_LOG_FILE, PHP_INI_ALL, NULL)
PHP_INI_END()
#define PHP_INI_ENTRY(name, default_value, modifiable, on_modify)
PHP_MINIT_FUNCTION(myext)
{
REGISTER_INI_ENTRIES();
... ...
}
PHP_MSHUTDOWN_FUNCTION(myext)
{
UNREGISTER_INI_ENTRIES();
... ...
}
PHP_MINFO_FUNCTION(myext)
{
DISPLAY_INI_ENTRIES();
... ...
}
80. Using an INI entry
To read your entry, use one of the MACROs
Same way to read the original value :
INI_STR(entry);
INI_FLT(entry);
INI_INT(entry);
INI_BOOL(entry);
INI_ORIG_STR(entry);
INI_ORIG_FLT(entry);
INI_ORIG_INT(entry);
INI_ORIG_BOOL(entry);
81. Modifying an INI entry
An INI entry may have a "modifier" attached
A function pointer used to check the new attached
value and to validate it
For example, for bools, users may only provide 1 or 0,
nothing else
Many modifiers/validators already exist :
OnUpdateBool
OnUpdateLong
OnUpdateLongGEZero
OnUpdateReal
OnUpdateString
OnUpdateStringUnempty
You may create your own modifier/validator
82. Using a modifier
The modifier should return FAILURE or SUCCESS
The engine takes care of everything
Access control, error message, writing to the entry...
PHP_INI_BEGIN()
PHP_INI_ENTRY("logger.default_file", LOGGER_DEFAULT_LOG_FILE,
PHP_INI_ALL, OnUpdateStringUnempty)
PHP_INI_END()
ZEND_API ZEND_INI_MH(OnUpdateStringUnempty)
{
char **p;
char *base = (char *) mh_arg2;
if (new_value && !new_value[0]) {
return FAILURE;
}
p = (char **) (base+(size_t) mh_arg1);
*p = new_value;
return SUCCESS;
}
83. Linking INI entry to a global
If you read your entry often, every access
triggers a hash lookup
This is not nice for performance
Why not have a global of yours change when
the INI entry is changed (by the PHP user
likely)?
Please, welcome "modifiers linkers"
84. Linking INI entry to a global
Declare a global struct, and tell the engine
which field it must update when your INI entry
gets updated
typedef struct myglobals {
char *my_path;
void *some_foo;
void *some_bar;
} myglobals;
static myglobals my_globals; /* should be thread protected */
PHP_INI_BEGIN()
STD_PHP_INI_ENTRY("logger.default_file", LOGGER_DEFAULT_LOG_FILE, PHP_INI_ALL,
OnUpdateStringUnempty, my_path, myglobals, my_globals)
PHP_INI_END()
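The "linker" relies on a field offset plus the base address of the globals struct, which is what the `base + (size_t) mh_arg1` arithmetic in OnUpdateStringUnempty does. The sketch below reproduces that mechanism with offsetof() on a hypothetical `toy_globals` struct; the function name and its plain int return codes are illustrative, not the Zend API.

```c
#include <stddef.h>

/* Same layout idea as the myglobals struct of the slide. */
typedef struct {
    char *my_path;
    void *some_foo;
    void *some_bar;
} toy_globals;

/* Writes new_value into the char* field located field_offset bytes
   after base. Rejects empty strings. Returns 0 on success, -1 on failure. */
static int toy_on_update_string_unempty(char *new_value, size_t field_offset,
                                        void *base)
{
    char **p;

    if (new_value && !new_value[0]) {
        return -1;
    }
    p = (char **) ((char *) base + field_offset);
    *p = new_value;
    return 0;
}
```

A call such as toy_on_update_string_unempty(path, offsetof(toy_globals, my_path), &g) updates g.my_path directly, so later reads are plain struct accesses with no hash lookup.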
86. Classes and objects
More complex than functions as more structures
are involved
zend_class_entry
Represents a class
zend_object
Represents an object
zend_object_handle (int)
Represents an object unique identifier to fetch the zend_object
back from the object store
zend_object_handlers
Function pointers to specific object actions (lots of them)
zend_object_store
Big global single object repository storing every known object
87. All starts with a class
A very big structure : zend_class_entry
Lots of macros to help managing classes
Internal classes need to be registered at MINIT()
Internal classes are not destroyed at the end of the
request (user classes are)
An interface is a (special) class
A trait is a (special) class
Once a class is registered into the engine, you may
create as many objects as you want with low memory
footprint
88. Registering a new class
zend_register_internal_class()
Takes a zend_class_entry* as model
Initialize internal class members
Registers the class into the engine
Returns a new pointer to this freshly added class
zend_class_entry *ce_Logger;
PHP_MINIT_FUNCTION(myext)
{
zend_class_entry ce;
INIT_CLASS_ENTRY(ce, "Logger", NULL);
ce_Logger = zend_register_internal_class(&ce TSRMLS_CC);
return SUCCESS;
}
89. Registering a new class
Usually the class pointer is shared into a global
variable
This one should be exported in a header file
This allows other extensions to use/redefine our class
zend_class_entry *ce_Logger;
PHP_MINIT_FUNCTION(myext)
{
zend_class_entry ce;
INIT_CLASS_ENTRY(ce, "Logger", NULL);
ce_Logger = zend_register_internal_class(&ce TSRMLS_CC);
return SUCCESS;
}
90. Other class noticeable items
zend_class_entry also manages
Static attributes (zvals)
Constants (zvals)
Functions (object methods and class static methods)
Interfaces (zend_class_entry as well)
Inheritance class tree
Used traits (zend_class_entry again)
Other stuff such as handlers
We'll see how to take care of such items later on
91. An example logger class
We will design something like that :
Now : register the Logger class
<?php
try {
$log = new Logger('/tmp/mylog.log');
} catch (LoggerException $e) {
printf("Woops, could not create object : %s", $e->getMessage());
}
$log->log(Logger::DEBUG, "My debug message");
92. class constants
zend_declare_class_constant_<type>()
Will use ce->constants_table HashTable
#define LOG_INFO 2
PHP_MINIT_FUNCTION(myext)
{
    zend_class_entry ce;
    INIT_CLASS_ENTRY(ce, "Logger", NULL);
    ce_Logger = zend_register_internal_class(&ce);
    zend_declare_class_constant_long(ce_Logger, ZEND_STRL("INFO"), LOG_INFO);
    return SUCCESS;
}
93. class/object attributes
zend_declare_property_<type>()
Can declare both static and non static attr.
Can declare any visibility, only type matters
PHP_MINIT_FUNCTION(myext)
{
    zend_class_entry ce;
    INIT_CLASS_ENTRY(ce, "Logger", NULL);
    ce_Logger = zend_register_internal_class(&ce);
    zend_declare_property_string(ce_Logger, ZEND_STRL("file"), "",
        ZEND_ACC_PROTECTED);
    return SUCCESS;
}
94. Practice creating a class
Create the logger class
With 3 constants : INFO, DEBUG, ERROR
With 2 properties
handle : private , null
file : protected , string
You may declare a namespaced class
Use INIT_NS_CLASS_ENTRY for this
95. Adding methods
Methods are just functions attached to a class
Very similar to PHP_FUNCTION
ZEND_BEGIN_ARG_INFO(arginfo_logger___construct, 0)
ZEND_ARG_INFO(0, value)
ZEND_END_ARG_INFO()
static zend_function_entry logger_class_functions[] = {
PHP_ME( Logger, __construct, arginfo_logger___construct,
ZEND_ACC_PUBLIC|ZEND_ACC_CTOR )
PHP_FE_END
};
PHP_METHOD( Logger, __construct ) { /* some code here */ }
PHP_MINIT_FUNCTION(myext)
{
zend_class_entry ce;
INIT_CLASS_ENTRY(ce, "Logger", logger_class_functions);
/* ... */
}
96. Visibility modifier
One may use
ZEND_ACC_PROTECTED
ZEND_ACC_PUBLIC
ZEND_ACC_PRIVATE
ZEND_ACC_FINAL
ZEND_ACC_ABSTRACT
ZEND_ACC_STATIC
Usually, the other flags (like ZEND_ACC_INTERFACE)
are set by the engine when you call proper functions
ZEND_ACC_CTOR/DTOR/CLONE are used by
reflection only
98. Designing and using interfaces
An interface is a zend_class_entry with special
flags and abstract methods only
zend_register_internal_interface() is used
It simply sets ZEND_ACC_INTERFACE on the
zend_class_entry structure
zend_class_implements() is then used to
implement the interface
99. Practice : add an interface
Detach the log() method into an interface and
implement it
100. Exceptions
Use zend_throw_exception() to throw an
Exception
Passing NULL as class_entry will use default Exception
You may want to register and use your own
Exceptions
Just create your exception class
Make it extend a base Exception class
Use zend_register_internal_class_ex() for that
zend_throw_exception(my_exception_ce, "An error occurred", 0);
INIT_CLASS_ENTRY(ce_exception, "LoggerException", NULL);
ce_Logger_ex = zend_register_internal_class_ex(&ce_exception,
zend_exception_get_default(), NULL);
101. Practice
Write real code for our Logger class
You may need to update properties
zend_update_property_<type>()
You may need to access $this in your methods
use getThis() macro or this_ptr from func args
You may need to use php streams
If so, try getting used to their API by yourself
103. Globals ?
They are sometimes needed
Every program needs some kind of global state
Try however to prevent their usage when
possible
Use reentrancy instead
104. PHP globals problem
If the environment is threaded, many (every?) global
access should be protected
Basically using some kind of locking mechanism
PHP can be compiled with ZendThreadSafety (ZTS) or not
(NZTS)
Global accesses in ZTS mode need to be protected
Global accesses in NZTS mode don't need such protection
So accessing globals will differ according to ZTS or not
Thus we'll use macros for such tasks
105. Declaring globals
Declare the structure (usually in .h)
Declare a variable holding the structure
ZEND_BEGIN_MODULE_GLOBALS(extworkshop)
char *my_string;
ZEND_END_MODULE_GLOBALS(extworkshop)
ZEND_DECLARE_MODULE_GLOBALS(extworkshop)
106. Initializing globals
Most likely, your globals will need to be
initialized (most likely, to 0).
There are two hooks for that
Right before MINIT : GINIT
Right after MSHUTDOWN : GSHUTDOWN
Remember that globals are global (...)
Nothing to do with request lifetime
109. Accessing globals
A macro is present in your .h for that
YOUR-EXTENSION-NAME_G(global_value_to_fetch)
#ifdef ZTS
#define EXTWORKSHOP_G(v) \
    TSRMG(extworkshop_globals_id, zend_extworkshop_globals *, v)
#else
#define EXTWORKSHOP_G(v) (extworkshop_globals.v)
#endif
if (EXTWORKSHOP_G(my_string)) {
...
}
110. Globals : practice
Move our resource_id to a protected global
Our extension is now Thread Safe !
113. Changing the engine behavior
There are many ways to change what exists
As soon as elements are pointers
They are usually fetchable from a global table
Many global tables in PHP
EG() and CG() are the most commonly used
CG() compiler_globals and EG() executor globals :
global function table
global class table
global/local variables table
...
115. Zend Virtual Machine
[Diagram. Zend VM Compiler: PHP script (bytes) -> Lexer -> tokens -> Parser -> nodes -> Compiler -> OPCodes. Zend VM Executor: OPCodes -> result (bytes).]
116. The BIG steps
The engine compiles a PHP file
This gives an OPArray as result
The engine launches the OPArray into the executor
The executor executes each instruction
If a user function is called
The engine executes an OPCode to launch the func
The user func has been compiled as an OPArray
The engine creates a new stack frame with this
function's OPArray
The engine now executes the new frame
When finished, the engine switches back to the previous frame
117. Zend Virtual Machine
The Zend VM Compiler is not hookable
You can't change PHP syntax from an extension
The Zend VM Executor is hookable
You can alter the instructions being run
You can alter the path the executor runs through
Zend ext have more power than PHP ext here
From now on, we will only talk about the Zend VM
Executor
118. OPCodes
Introducing OPCodes :
In computer science, an opcode (operation
code) is the portion of a machine language
instruction that specifies the operation to be
performed. (Wikipedia)
Opcodes can also be found in so called byte
codes and other representations intended for a
software interpreter rather than a hardware
device. (Wikipedia)
122. OPCodes
An OPCode is one VM instruction designed to do
just one little thing
It can use up to two operands
It defines an operation to do with them
It produces one result (which can be NULL)
It may make use of an additional
"extended_value"
Think about a calculator
Two operands, one operation, one result, one carry
124. OPArray
An OPArray is a structure holding a series of
OPCodes to be run
(and many more things)
Each PHP file is parsed and results in an
OPArray
Each PHP user function is parsed, and results
in an OPArray
When a user function is called, the engine pushes its
OPArray onto the executor for it to be run
125. Zend VM Executor loop
The executor is a giant loop that is given an
OPArray, extracts its OPCodes, and runs them
one after the other, infinitely
Fortunately, the compiler always generates a
ZEND_RETURN OPCode at the end, instructing
the executor loop to return
126. Zend VM Executor loop
...
execute_data = i_create_execute_data_from_op_array(EG(active_op_array), 1 TSRMLS_CC);
while (1) {
    int ret;
    /* Run the current OPCode's handler; 0 means "continue with the next one" */
    if ((ret = execute_data->opline->handler(execute_data TSRMLS_CC)) > 0) {
        switch (ret) {
            case 1: /* return from the executor */
                EG(in_execution) = original_in_execution;
                return;
            case 2: /* enter a new OPArray (user function call) */
                goto zend_vm_enter;
                break;
            case 3: /* leave the current frame, resume the calling one */
                execute_data = EG(current_execute_data);
                break;
            default:
                break;
        }
    }
}
zend_error_noreturn(E_ERROR, "Arrived at end of main loop which shouldn't happen");
127. Zend VM Executor loop
Every OPCode handler is responsible for stepping
the OPLine forward
OPLine is a pointer to the OPCode currently being run
in the OPArray currently loaded into the VM.
static int ZEND_FASTCALL ZEND_ADD_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    zend_op *opline = EX(opline);
    fast_add_function(&EX_T(opline->result.var).tmp_var,
        opline->op1.zv,
        opline->op2.zv TSRMLS_CC);
    CHECK_EXCEPTION();
    ZEND_VM_NEXT_OPCODE();
}
#define ZEND_VM_NEXT_OPCODE() \
    CHECK_SYMBOL_TABLES() \
    ZEND_VM_INC_OPCODE(); \
    ZEND_VM_CONTINUE()
#define ZEND_VM_INC_OPCODE() \
    execute_data->opline++
#define ZEND_VM_CONTINUE() return 0
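To see the loop and the "each handler steps the OPLine" rule working together, here is a minimal standalone sketch. It is a toy model under simplifying assumptions: toy_frame, toy_opline and the toy_* handlers are invented names, there is a single accumulator instead of zvals, and only the return codes 0 (continue) and 1 (return) are modelled.

```c
#include <assert.h>
#include <stddef.h>

struct toy_frame;
typedef int (*toy_handler)(struct toy_frame *frame);

/* One toy instruction: a handler plus one operand. */
typedef struct {
    toy_handler handler;
    long operand;
} toy_opline;

/* Toy counterpart of execute_data: current instruction and some state. */
typedef struct toy_frame {
    const toy_opline *opline; /* instruction currently being run */
    long acc;                 /* accumulator updated by the handlers */
} toy_frame;

/* Adds its operand, then steps to the next instruction itself. */
static int toy_add_handler(toy_frame *frame)
{
    frame->acc += frame->opline->operand;
    frame->opline++; /* same idea as ZEND_VM_INC_OPCODE() */
    return 0;        /* same idea as ZEND_VM_CONTINUE() */
}

/* Plays the role of the final return instruction: stops the loop. */
static int toy_return_handler(toy_frame *frame)
{
    (void)frame;
    return 1;
}

/* The loop itself never advances the opline; handlers do. */
static long toy_execute(const toy_opline *ops)
{
    toy_frame frame = { ops, 0 };

    assert(ops != NULL);
    while (1) {
        if (frame.opline->handler(&frame) > 0) {
            return frame.acc;
        }
    }
}
```

If a program lacked the final return instruction, this loop would run past the end of the array, which is why the compiler always emits one.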
128. Zend VM Executor loop
Some OPCodes however can alter the execution
flow, for example the ZEND_JMP family (ZEND_JMP, ZEND_JMPZ, ...),
generated for control structures (if, while, try, switch, etc.)
static int ZEND_FASTCALL ZEND_JMPZ_HANDLER(ZEND_OPCODE_HANDLER_ARGS)
{
    zend_op *opline = EX(opline);
    zval *val = opline->op1.zv;
    int ret;
    ret = i_zend_is_true(val);
    if (UNEXPECTED(EG(exception) != NULL)) {
        HANDLE_EXCEPTION();
    }
    if (!ret) {
        /* Condition is false: jump instead of stepping to the next OPCode */
        ZEND_VM_SET_OPCODE(opline->op2.jmp_addr);
        ZEND_VM_CONTINUE();
    }
    ZEND_VM_NEXT_OPCODE();
}
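The jump mechanism can be sketched standalone as well. This is a toy model with invented names (jmp_vm, jmp_op, jmp_*_handler); jump targets are array indexes here, whereas the handler above uses a direct address (jmp_addr). The sketch encodes a "while (counter) counter--;" loop.

```c
#include <assert.h>
#include <stddef.h>

struct jmp_vm;
typedef int (*jmp_handler)(struct jmp_vm *vm);

typedef struct {
    jmp_handler handler;
    size_t target; /* jump target index; only read by jump handlers */
} jmp_op;

typedef struct jmp_vm {
    const jmp_op *ops;    /* first instruction of the program */
    const jmp_op *opline; /* instruction currently being run */
    long counter;         /* the value tested by the conditional jump */
    long iterations;      /* how many times the loop body ran */
} jmp_vm;

/* Loop body: decrement the counter, then step forward. */
static int jmp_dec_handler(jmp_vm *vm)
{
    vm->counter--;
    vm->iterations++;
    vm->opline++;
    return 0;
}

/* Conditional jump: jump when the counter is zero, else fall through. */
static int jmp_jmpz_handler(jmp_vm *vm)
{
    if (vm->counter == 0) {
        vm->opline = vm->ops + vm->opline->target; /* set, do not increment */
        return 0;
    }
    vm->opline++;
    return 0;
}

/* Unconditional jump, here used to go back to the loop condition. */
static int jmp_jmp_handler(jmp_vm *vm)
{
    vm->opline = vm->ops + vm->opline->target;
    return 0;
}

static int jmp_return_handler(jmp_vm *vm)
{
    (void)vm;
    return 1;
}

/* Runs the program and reports how many times the body executed. */
static long jmp_run(const jmp_op *ops, long start)
{
    jmp_vm vm = { ops, ops, start, 0 };

    assert(ops != NULL && start >= 0);
    while (vm.opline->handler(&vm) == 0) {
        /* keep running until a handler asks to return */
    }
    return vm.iterations;
}
```

A loop needs both kinds of jump: a conditional one to leave it, and an unconditional one to go back to the condition.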
129. Zend VM Executor example
Let's play a demo together, stepping into the
executor
Let's use a very simple script
<?php
const SEPARATOR = "_";
if (isset($argv[1])) {
    echo $argv[1] . SEPARATOR . "bar";
} else {
    echo "I'm stuck :p";
}
130. Changing the engine behavior
You may overwrite the main executor routines
zend_execute() / zend_execute_internal()
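/* Saved original hooks. Signatures as in PHP <= 5.4; PHP 5.5 hooks zend_execute_ex() instead */
static void (*old_zend_execute)(zend_op_array *op_array TSRMLS_DC);
static void (*old_zend_execute_internal)(zend_execute_data *execute_data_ptr,
    int return_value_used TSRMLS_DC);
static void fooext_zend_execute(zend_op_array *op_array TSRMLS_DC)
{
    do_some_stuff(op_array);
    old_zend_execute(op_array TSRMLS_CC); /* run the original executor */
    do_some_more_stuff(op_array);
}
/* fooext_zend_execute_internal() wraps old_zend_execute_internal the same way (not shown) */
PHP_RINIT_FUNCTION(fooext)
{
    /* Save the original function pointers, then install ours */
    old_zend_execute = zend_execute;
    old_zend_execute_internal = zend_execute_internal;
    zend_execute = fooext_zend_execute;
    zend_execute_internal = fooext_zend_execute_internal;
    return SUCCESS;
}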
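The save / replace / call-original pattern used above can be tried without PHP at all. The following is a toy sketch under stated assumptions: toy_execute_hook stands in for the engine's hookable function pointer, and ext_install() / ext_uninstall() are hypothetical stand-ins for the extension's init and shutdown hooks.

```c
#include <assert.h>
#include <stddef.h>

/* The "engine": a default routine reachable through a global pointer. */
static int toy_execute_default(int script_id)
{
    return script_id * 2;
}
static int (*toy_execute_hook)(int script_id) = toy_execute_default;

/* The "extension": remembers the original pointer and counts calls. */
static int (*saved_execute)(int script_id) = NULL;
static int hook_calls = 0;

static int ext_execute(int script_id)
{
    assert(saved_execute != NULL);
    hook_calls++;                    /* stand-in for do_some_stuff() */
    return saved_execute(script_id); /* always delegate to the original */
}

/* Install once: saving twice would store our own hook as the "original". */
static void ext_install(void)
{
    if (saved_execute == NULL) {
        saved_execute = toy_execute_hook;
        toy_execute_hook = ext_execute;
    }
}

/* Put the original pointer back. */
static void ext_uninstall(void)
{
    if (saved_execute != NULL) {
        toy_execute_hook = saved_execute;
        saved_execute = NULL;
    }
}
```

In this toy model the guard in ext_install() matters: without it, a second install would make ext_execute() call itself forever. The same care applies to any hook installed from code that runs more than once.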
131. Changing an opcode handler
Since 5.1, PHP extensions may change opcode
handlers individually
With no/little performance penalty
You'll need to understand the original handler
before overwriting it
Changing opcode handlers changes PHP's
scripting engine behavior (executor)
Warning : handlers changed a lot across PHP
versions (especially 5.3, 5.4 and 5.5)
132. Logging each object creation
Let's change the ZEND_NEW handler
Let's log each object creation
Let's use one instance of our "Logger" for that
Strategy :
Declare our handler
Warning : handlers should only be declared BEFORE the compilation
pass, i.e. in RINIT
Execute our handler
Execute the original Zend executor's handler
Restore the original handler
Warning : handlers should be restored when the executor is shut
down, this can only be achieved in the post_deactivate() handler
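The strategy above (declare, run ours, run the original, restore) can be modelled in standalone C. This is only a toy: toy_user_handlers, TOY_DISPATCH and the toy_* functions are invented for the sketch, and a counter replaces the real "Logger". To my knowledge the engine exposes the real mechanism through zend_set_user_opcode_handler() and the ZEND_USER_OPCODE_DISPATCH return code; check the headers of your PHP version before relying on that.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_OP_NEW   0 /* the only opcode in this toy VM */
#define TOY_OP_COUNT 1
#define TOY_DISPATCH 2 /* user handler asks: "now run the original handler" */

typedef int (*toy_user_handler)(const char *class_name);

/* One optional user handler per opcode; NULL means "none installed". */
static toy_user_handler toy_user_handlers[TOY_OP_COUNT];

static int toy_objects_created = 0;
static int toy_log_lines = 0;

/* The engine's own handler: "creates" the object. */
static int toy_original_new(const char *class_name)
{
    (void)class_name;
    toy_objects_created++;
    return 0;
}

/* Our handler: log, then let the original handler do the real work. */
static int toy_logging_new(const char *class_name)
{
    assert(class_name != NULL);
    toy_log_lines++; /* stand-in for a call to the Logger instance */
    return TOY_DISPATCH;
}

/* What the VM does when it meets the opcode. */
static int toy_run_new(const char *class_name)
{
    toy_user_handler user = toy_user_handlers[TOY_OP_NEW];

    if (user != NULL && user(class_name) != TOY_DISPATCH) {
        return 0; /* the user handler took over completely */
    }
    return toy_original_new(class_name);
}
```

Installing is a single assignment into the table, and restoring is setting the slot back to NULL, which is what makes per-opcode overrides cheap.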
133. Need help ?
Read each other's source code / extensions
http://pecl.php.net
http://github.com
http://lxr.php.net
Find this workshop's source code
https://github.com/jpauli/PHP_Extension_Workshop
Read http://www.phpinternalsbook.com
Ask for help on irc #php.pecl
Open the source and study it by yourself
Attend conferences about internals