HTTP, JSON, JavaScript, Map&Reduce built-in to MySQL

HTTP, JSON, JavaScript, Map&Reduce built in to MySQL - make it happen, today. See how a MySQL Server plugin can be developed to build all this into MySQL. A new direct wire between MySQL and client-side JavaScript is created. MySQL speaks HTTP, replies JSON and offers server-side JavaScript. Server-side JavaScript gets access to MySQL data and does Map&Reduce of JSON documents stored in MySQL. Fast? 2-4x faster than proxying client-side JavaScript requests through PHP/Apache. Reasonable results...

Presentation Transcript

  • Ulf Wendel, Oracle: HTTP, JSON, JavaScript, Map and Reduce built-in to MySQL. Make it happen, today.
  • The speaker says... MySQL is more than SQL! What if... MySQL would talk HTTP and reply JSON... MySQL had built-in server-side JavaScript for "MyApps"... MySQL had poor man's Map&Reduce for JSON documents? We, you and I, make it happen. Today. You are watching the proof of concept.
  • Groundbreaking eye-openers: new client protocols; new access methods, additional data models; new output formats; MySQL as a storage framework. Mycached (2009, Cybozu); HandlerSocket (2010, DeNA); Drizzle HTTP JSON (2011, Stewart Smith); InnoDB Memcached (2012, Oracle); NDB/MySQL Cluster Memcached (2012, Oracle); JavaScript/HTTP Interface (today, You).
  • The speaker says... We thought pluggable storage was cool. Different storage backends for different purposes. We thought the dominant relational data model is the one and SQL the appropriate query language. We thought crash-safety, transactions and scale-out through replication count. You wanted maximum performance. You had CPU-bound in-memory workloads. You wanted the Key-Value model in addition to the relational one. Key-Value is fantastic for sharding. You wanted the lightweight Memcached protocol and lightweight JSON replies. You did not need a powerful query language such as SQL. Luckily, you saw MySQL as a storage framework!
  • MySQL Server daemon plugins. Like PHP extensions! But not so popular. Library, loaded into the MySQL process. Expose tables, variables, functions to the user. Can start network servers/listeners. Can access data, with and without SQL. [Diagram: MySQL on port 3306 (SQL, relational model); Memcached for InnoDB on port 11211 (Key/Value); Memcached for Cluster on port 11211 (Key/Value)]
  • The speaker says... MySQL daemon plugins can be compared with PHP extensions or Apache modules. A daemon plugin can be anything that extends the MySQL Server. The MySQL source has a blueprint, a daemon plugin skeleton: it contains as little as 300 lines of code. The code is easy to read. MySQL is written in C and portable C++. The books "MySQL 5.1 Plugin Development" (Sergei Golubchik, Andrew Hutchings) and "Understanding MySQL Internals" (Sasha Pachev) get you started with plugin development. Additional information, including examples, is in the MySQL Reference Manual.
  • Performance. Memcached Key/Value access to MySQL. MySQL benefits: crash safe, replication, ... A few, easy to understand client commands. Small footprint network protocol. Community reports 10,000+ queries/s single-threaded and 300,000+ queries/s with 100 clients.
  • The speaker says... I couldn't resist and will continue to show performance figures to make my point. From now on, the machine used for benchmarking is my notebook: Celeron Duo, 1.2 GHz, 32-bit virtual machine running SuSE 12.1 with 1.5 GB of RAM assigned, Win XP as a host. Small datasets ensure that all benchmarks run out of main memory. Don't let the benchmarks distract you. Dream of MySQL as a storage for many data models, many network protocols and even more programming languages. For example, think of client-side JavaScript developers.
  • MySQL for client-side JavaScript. The PoC creates a direct wire. Sandbox: HTTP/Websocket network protocols only. Needs proxying to access MySQL. Extra deployments for proxying: LAMP or node.js. Proxying adds latency and increases system load. [Diagram: Browser (JavaScript) -> port 80 -> Apache (PHP) -> port 3306 -> MySQL]
  • The speaker says... Client-side JavaScript runs in a sandbox. JavaScript developers do HTTP/Websocket background requests to fetch data from the web server. Because MySQL does not understand HTTP/Websocket, users need to set up and deploy a proxy for accessing MySQL. For example, one can use PHP for proxying. PHP interprets GET requests from JavaScript, connects to MySQL, executes some SQL, translates the result into JSON and returns it to JavaScript via HTTP. Let's give JavaScript a direct wire to MySQL!
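Seen from the client, the direct wire boils down to one background HTTP request. The following is a minimal sketch, assuming the proof of concept listens on 127.0.0.1:8080 as in the curl examples of this talk. `buildSqlUrl` and `parseResult` are hypothetical helper names, not part of the plugin.

```javascript
// Build the request URL for the PoC's HTTP interface.
// Assumption: the plugin answers GET /?sql=<statement> on the given base URL.
function buildSqlUrl(baseUrl, sql) {
  // encodeURIComponent percent-encodes spaces and other reserved characters.
  return baseUrl + "/?sql=" + encodeURIComponent(sql);
}

// The PoC replies with an array of rows; every column value is a string.
function parseResult(body) {
  return JSON.parse(body);
}

console.log(buildSqlUrl("http://127.0.0.1:8080", "SELECT 1"));
console.log(parseResult('[["1"]]')[0][0]);
```

In a browser, the reply body could be fetched with `fetch(buildSqlUrl(...))` and passed to `parseResult`; no PHP script sits in between.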
  • HTTP and JSON for MySQL. Like with PHP extensions! Copy daemon plugin example, add your magic. Glue libraries: libevent (BSD), libevhtp (BSD). Handle GET /?sql=<statement>, reply JSON. [Chart: requests/s (0 to 2000) over concurrency 1, 4, 8, 16, 32 (ab2 -c <n>); series: PHP proxy, Server plugin]
  • The speaker says... First, we add an HTTP server to MySQL. MySQL shall listen on port 8080, accept GET /?sql=SELECT%201, run the SQL and reply the result as JSON to the user. The HTTP server part is easy: we glue together existing, proven BSD libraries. Benchmark first to motivate you. The chart compares the resulting MySQL Server daemon plugin with a PHP script that accepts a GET parameter with a SQL statement, connects to MySQL, runs the SQL and returns JSON. System load reported by top is not shown. At a concurrency of 32, the load is 34 for PHP and 2.5 for the MySQL Server daemon plugin...
  • Mission HTTP. Don't look at extending MySQL network modules! Virtual I/O (vio) and Network (net) are fused. Start your own socket server in plugin init().
    /* Plugin initialization method called by MySQL */
    static int conn_js_plugin_init(void *p) {
      ...
      /* See libevent documentation */
      evthread_use_pthreads();
      base = event_base_new();
      /* Register generic callback to handle events */
      evhttp_set_gencb(http, conn_js_send_document_cb, docroot);
      handle = evhttp_bind_socket_with_handle(http, host, port);
      event_base_dispatch(base);
    }
  • The speaker says... Don't bother about using any network or I/O related code of the MySQL server. Everything is optimized for the MySQL Protocol. The way to go is setting up your own socket server when the plugin is started during MySQL startup. Plugins have init() and deinit() methods, very much like PHP extensions have M|RINIT and M|RSHUTDOWN hooks. You will easily find proper code examples on using libevent and libevhtp. I show pseudo-code derived from my working proof of concept.
  • Done with HTTP, for now. Request handling: see libevent examples.
    static void conn_js_send_document_cb(struct evhttp_request *req, void *arg) {
      /* ... */
      const char *uri = evhttp_request_get_uri(req);
      decoded = evhttp_uri_parse(uri);
      /* parsing is in the libevent examples */
      if (sql[0]) {
        query_in_thd(&json_result, sql);
        evb = evbuffer_new();
        evbuffer_add_printf(evb, "%s", json_result.c_ptr());
        evhttp_add_header(evhttp_request_get_output_headers(req), "Content-Type", "application/json");
        evhttp_send_reply(req, 200, "OK", evb);
      }
    }
  • The speaker says... You are making huge steps forward doing nothing but copying public libevent documentation examples and adapting them! The hardest part is to come: learning how to run a SQL statement and how to convert the result into JSON. query_in_thd() is about SQL execution. For JSON conversion we will need to create a new Protocol class.
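The request flow of the callback can be sketched outside of MySQL as well. This is an illustration in JavaScript rather than the plugin's C code: `handleRequest` is a hypothetical name, `runQuery` stands in for query_in_thd(), and the 400 reply for a missing statement is an assumption, since the slide does not show that branch.

```javascript
// Sketch of the request handling flow in JavaScript instead of C.
// Assumption: runQuery(sql) stands in for query_in_thd() and returns a JSON string.
function handleRequest(requestUri, runQuery) {
  // The base URL is only needed so that URL can parse a path-only request URI.
  const url = new URL(requestUri, "http://localhost");
  // searchParams.get() also undoes the percent-encoding (%20 becomes a space).
  const sql = url.searchParams.get("sql");
  if (!sql) {
    // Assumption: the slide does not show what happens without a statement.
    return { status: 400, headers: {}, body: "" };
  }
  return {
    status: 200,
    headers: { "Content-Type": "application/json" },
    body: runQuery(sql),
  };
}

const reply = handleRequest("/?sql=SELECT%201", (sql) => '[["1"]]');
console.log(reply.status, reply.body);
```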
  • Before (left) and after (right). [Diagram: left: Browser (JavaScript) -> HTTP -> Apache/PHP -> MySQL Protocol, binary -> MySQL; right: Browser (JavaScript) -> HTTP, JSON -> MySQL]
  • The speaker says... All over the presentation I take short breaks to reflect upon the work. The cliff-hangers take a step back to show the overall architecture and progress. Don't get lost in the source code. On the left you see today's proxying architecture, with Apache/PHP standing in for LAMP. On the right you see what has been created already.
  • Additional APIs would be cool. The new plugins come unexpected. How about a SQL service API for plugin developers? How about a handler service API? Plugin development would be even easier!
    /* NOTE: must have to get access to THD! */
    #define MYSQL_SERVER 1
    /* For parsing and executing a statement */
    #include "sql_class.h"    // struct THD
    #include "sql_parse.h"    // mysql_parse()
    #include "sql_acl.h"      // SUPER_ACL
    #include "transaction.h"  // trans_commit
  • The speaker says... The recommended books do a great job introducing you to core MySQL components. So does the MySQL documentation. You will quickly grasp what modules there are. There is plenty of information on writing storage engines, creating INFORMATION_SCHEMA tables, SQL variables and user-defined SQL functions, but executing SQL is a bit more difficult. The new class of server plugins needs comprehensive service APIs for plugin developers for accessing data, both using SQL and using the low-level handler storage interface.
  • #define MYSQL_SERVER 1. The story about THD (thread descriptor)... Every client request is handled by a thread. Our daemon needs THDs and the define...
    int query_in_thd() {
      /* ... */
      my_thread_init();
      thd = new THD(false);
      /* From event_scheduler.cc, pre_init_event_thread(THD* thd) */
      thd->client_capabilities = 0;
      thd->security_ctx->master_access = 0;
      thd->security_ctx->db_access = 0;
      thd->security_ctx->host_or_ip = (char*) CONN_JS_HOST;
      thd->security_ctx->set_user((char*) CONN_JS_USER);
      my_net_init(&thd->net, NULL);
      thd->net.read_timeout = slave_net_timeout;
  • The speaker says... MySQL uses one thread for every request/client connection. Additional system threads exist. To run a SQL statement we must create and set up a THD object. It is THE object passed all around during request execution. The event scheduler source is a good place to learn about setting up and tearing down a THD object. The event scheduler starts SQL threads for events, just like we start SQL threads to answer HTTP requests.
  • THD setup, setup, setup...
    /* MySQL's network abstraction: vio, virtual I/O */
    my_net_init(&thd->net, NULL);
    thd->net.read_timeout = slave_net_timeout;
    thd->slave_thread = 0;
    thd->variables.option_bits |= OPTION_AUTO_IS_NULL;
    thd->client_capabilities |= CLIENT_MULTI_RESULTS;
    /* MySQL THD housekeeping */
    mysql_mutex_lock(&LOCK_thread_count);
    thd->thread_id = thd->variables.pseudo_thread_id = thread_id++;
    mysql_mutex_unlock(&LOCK_thread_count);
    /* Guarantees that we will see the thread in SHOW PROCESSLIST though its vio is NULL. */
    thd->proc_info = "Initialized";
    thd->set_time();
    DBUG_PRINT("info", ("Thread %ld", thd->thread_id));
  • THD setup, setup, setup...
    /* From lib_sql.cc */
    thd->thread_stack = (char*) &thd;
    thd->store_globals();
    /* Start lexer and put THD to sleep */
    lex_start(thd);
    thd->set_command(COM_SLEEP);
    thd->init_for_queries();
    /* FIXME: ACL ignored, super user enforced */
    sctx = thd->security_ctx;
    sctx->master_access |= SUPER_ACL;
    sctx->db_access |= GLOBAL_ACLS;
    /* Make sure we are in autocommit mode */
    thd->server_status |= SERVER_STATUS_AUTOCOMMIT;
    /* Set default database */
    thd->db = my_strdup(CONN_JS_DB, MYF(0));
    thd->db_length = strlen(CONN_JS_DB);
  • The speaker says... The setup is done. Following the motto "make it happen, today", we ignore some nasty details such as access control, authorization or, in the following, bothering about charsets. It can be done, that's for sure. I leave it to the ones in the know, the MySQL server developers. Access control? With HTTP? With our client-side JavaScript code and all its secret passwords embedded in the clear text HTML document downloaded by the browser? Hacking is fun!
  • Executing SQL
    thd->set_query_id(get_query_id());
    inc_thread_running();
    /* From sql_parse.cc - do_command() */
    thd->clear_error();
    thd->get_stmt_da()->reset_diagnostics_area();
    /* From sql_parse.cc - dispatch command() */
    thd->server_status &= ~SERVER_STATUS_CLEAR_SET;
    /* Text protocol and plain question, no prepared statement */
    thd->set_command(COM_QUERY);
    /* To avoid confusing VIEW detectors */
    thd->lex->sql_command = SQLCOM_END;
    /* From sql_parse.cc - alloc query() = COM_QUERY package parsing */
    query = my_strdup(CONN_JS_QUERY, MYF(0));
    thd->set_query(query, strlen(query) + 1);
    /* Free here lest PS break */
    thd->rewritten_query.free();
    if (thd->is_error()) { return; }
  • Heck, where is the result?
    Parser_state parser_state;
    parser_state.init(thd, thd->query(), thd->query_length());
    /* From sql_parse.cc */
    mysql_parse(thd, thd->query(), thd->query_length(), &parser_state);
    /* NOTE: multi query is not handled */
    if (parser_state.m_lip.found_semicolon != NULL) { return; }
    if (thd->is_error()) { return; }
    thd->update_server_status();
    if (thd->killed) {
      thd->send_kill_message();
      return;
    }
    /* Flush output buffers, protocol is mostly about output format */
    thd->protocol->end_statement();
    /* Reset THD and put to sleep */
    thd->reset_query();
    thd->set_command(COM_SLEEP);
  • The speaker says... Our query has been executed. Unfortunately, the result is gone with the wind. MySQL has "streamed" the results during the query execution into the Protocol object of THD. Protocol in turn has converted the raw results from MySQL into MySQL (text) protocol binary packets and sent them out using the vio/net modules. The net module was set to NULL by us earlier. Results are lost. Let's hack a JSON Protocol class that returns a string to the caller. The result is stored in a string buffer.
  • We are here... [Diagram: left: Browser (JavaScript) -> Apache/PHP -> MySQL; right: Browser (JavaScript) -> GET /?sql=<mysql> -> MySQL]
  • The speaker says... Quick recap. MySQL now understands the GET /?sql=<mysql> request. <mysql> is used as a statement string. The statement has been executed. Next: return the result as JSON.
  • JSON Protocol
    class Protocol_json : public Protocol_text {
    private:
      String json_result;
    public:
      Protocol_json() {}
      Protocol_json(THD *thd_arg) : Protocol_text(thd_arg) {}
      void init(THD* thd_arg);
      virtual bool store_tiny(longlong from);
      /* ... */
      virtual bool json_get_result(String * buffer);
      virtual void json_begin_result_set_row();
      virtual bool json_add_result_set_column(uint field_pos, const uchar* s, uint32 s_length);
      virtual bool json_add_result_set_column(uint field_pos, String str);
      virtual bool json_add_result_set_column_cs(uint field_pos, const char* s, uint32 s_length, const CHARSET_INFO *fromcs, const CHARSET_INFO *tocs);
      /* ... */
    };
  • The speaker says... The proof-of-concept daemon plugin shall be simplistic. Thus, we derive a class from the old MySQL 4.1 style text protocol, used for calls like mysql_query(), mysqli_query() and so forth. Prepared statements use a different Protocol class. Method implementation is straightforward. We map every store_<type>() call to json_add_result_set_column(). Everything becomes a C/C++ string (char*, ...). Returning a numeric column type as a number of the JSON world is possible.
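The mapping of every store_<type>() call onto one column method can be illustrated with a small sketch. It is written in JavaScript, not C++, and `JsonProtocolSketch` with its method names is hypothetical. As a simplification, the field position is passed in explicitly here.

```javascript
// Sketch of the idea behind Protocol_json: every typed store call funnels
// into one method that appends the value as a string to the current row.
class JsonProtocolSketch {
  constructor() {
    this.rows = [];
  }
  // Mirrors json_add_result_set_column(): field position 0 starts a new row.
  addColumn(fieldPos, value) {
    if (fieldPos === 0) {
      this.rows.push([]);
    }
    this.rows[this.rows.length - 1].push(String(value));
  }
  // Stand-ins for the store_<type>() family; all delegate to addColumn().
  storeTiny(fieldPos, n) { this.addColumn(fieldPos, n); }
  storeString(fieldPos, s) { this.addColumn(fieldPos, s); }
  // Mirrors json_get_result(): hand the collected result to the caller.
  getResult() {
    return JSON.stringify(this.rows);
  }
}

const protocol = new JsonProtocolSketch();
protocol.storeTiny(0, 1);
console.log(protocol.getResult()); // prints [["1"]]
```

Because every value is turned into a string, the number 1 comes back as "1", which matches the reply of the proof of concept for SELECT 1.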
  • JSON Protocol method
    bool Protocol_json::json_add_result_set_column(uint field_pos, const uchar* s, uint32 s_length) {
      DBUG_ENTER("Protocol_json::json_add_result_set_column()");
      DBUG_PRINT("info", ("field_pos %u", field_pos));
      uint32 i, j;
      uchar * buffer;
      if (0 == field_pos) { json_begin_result_set_row(); }
      json_result.append("\"");
      /* TODO CHARSETs, KLUDGE type conversions, JSON escape incomplete! */
      buffer = (uchar*)my_malloc(s_length * 2 * sizeof(uchar), MYF(0));
      for (i = 0, j = 0; i < s_length; i++, j++) {
        switch (s[i]) {
          case '"': case '\\': case '/': case '\b':
          case '\f': case '\n': case '\r': case '\t':
            /* Prefix the character with a backslash */
            buffer[j] = '\\';
            j++;
            break;
        }
        buffer[j] = s[i];
      }
      /*...*/
  • The speaker says... It is plain vanilla C/C++ code one has to write. Please remember, I show proof of concept code. Production code from the MySQL Server team is of much higher quality. For example, can you explain the reasons for memcpy() in this code?
    func(uchar *pos) {
      ulong row_num;
      memcpy(&row_num, pos, sizeof(row_num));
      ...
    }
    Leave the riddle for later. JSON is not complex!
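The slide marks its JSON escaping as incomplete. For comparison, here is a minimal sketch of complete string escaping according to the JSON grammar, written in JavaScript rather than C++. `escapeJsonString` is a hypothetical name, and this is not the code of the proof of concept.

```javascript
// Characters that have a two-character escape sequence in JSON.
const SHORT_ESCAPES = new Map([
  ['"', '\\"'], ["\\", "\\\\"], ["\b", "\\b"], ["\f", "\\f"],
  ["\n", "\\n"], ["\r", "\\r"], ["\t", "\\t"],
]);

// Escape a string so it can be placed between double quotes in a JSON document.
function escapeJsonString(s) {
  let out = "";
  for (const ch of s) {
    const code = ch.codePointAt(0);
    if (SHORT_ESCAPES.has(ch)) {
      out += SHORT_ESCAPES.get(ch);
    } else if (code < 0x20) {
      // Remaining control characters need the \u00XX form.
      out += "\\u" + code.toString(16).padStart(4, "0");
    } else {
      out += ch;
    }
  }
  return out;
}

console.log(escapeJsonString('say "hi"\n'));
```

Note that a newline becomes the two characters backslash and n. Putting a backslash in front of the raw newline byte would not produce valid JSON. Escaping "/" is optional in JSON and is left out here.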
  • Use of JSON Protocol
    int query_in_thd(String * json_result) {
      /* ... */
      thd = new THD(false);
      /* JSON, replace protocol object of THD */
      protocol_json.init(thd);
      thd->protocol = &protocol_json;
      DBUG_PRINT("info", ("JSON protocol object installed"));
      /*... execute COM_QUERY SQL statement ...*/
      /* FIXME, THD will call Protocol::end_statement, the parent implementation.
         Thus, we cannot hook end_statement() but need an extra call in
         Protocol_json to fetch the result. */
      protocol_json.json_get_result(json_result);
      /* Calling should not be needed in our case */
      thd->protocol->end_statement();
      /*...*/
  • The speaker says... Straightforward: we install a different protocol object for THD and fetch the result after the query execution.
  • Proof: MySQL with HTTP, JSON
    nixnutz@linux-rn6m:~/> curl -v http://127.0.0.1:8080/?sql=SELECT%201
    * About to connect() to 127.0.0.1 port 8080 (#0)
    * Trying 127.0.0.1... connected
    > GET /?sql=SELECT%201 HTTP/1.1
    > User-Agent: curl/7.22.0 (i686-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.0e zlib/1.2.5 c-ares/1.7.5 libidn/1.22 libssh2/1.2.9
    > Host: 127.0.0.1:8080
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Content-Type: text/plain
    < Content-Length: 7
    <
    * Connection #0 to host 127.0.0.1 left intact
    * Closing connection #0
    [["1"]]
  • The speaker says... Metadata (types, column names) is missing from the reply: the Protocol_json method for sending it was not called, as extracting metadata proved too big a task for the PoC. We wrap up the HTTP task with a multi-threaded, libevhtp based HTTP interface. Once again, copy and adapt examples from the library documentation...
  • We are here... [Diagram: left: Browser (JavaScript) -> Apache/PHP -> MySQL; right: Browser (JavaScript) -> GET /?sql=<mysql> -> MySQL -> JSON reply]
  • The speaker says... Quick recap. MySQL understands a new GET /?sql=<mysql> command. <mysql> is used as a statement string. The statement has been executed. The result has been formatted as a JSON document. An HTTP reply has been sent. Next: from single-threaded to multi-threaded.
  • Multi-threaded HTTP server
    int conn_js_evhtp_init() {
      evthread_use_pthreads();
      base = event_base_new();
      htp = evhtp_new(base, NULL);
      evhtp_set_gencb(htp, conn_js_evhtp_send_document_cb, NULL);
      evhtp_use_threads(htp, conn_js_evhtp_init_thread, 16, NULL);
      evhtp_bind_socket(htp, CONN_JS_HTTP_HOST, port, 1024);
      event_base_loop(base, 0);
    }
    void conn_js_evhtp_init_thread(evhtp_t * htp, evthr_t * thread, void * arg) {
      struct worker_data * worker_data;
      /* Thread local storage (TLS) */
      worker_data = (struct worker_data *) calloc(1, sizeof(struct worker_data));
      worker_data->evbase = evthr_get_base(thread);
      evthr_set_aux(thread, worker_data);
    }
  • The speaker says... Multi-threading out of the box thanks to libevhtp. Libevhtp is a BSD library that aims to replace the HTTP functionality in libevent. Note the thread-local storage (TLS) of the HTTP worker threads. I have missed the opportunity of caching THD in TLS. Doing so may further improve performance. A web server needs a script language. Let's add server-side JavaScript! TLS will come in handy soon.
  • We are here... [Diagram: 32 concurrent JavaScript clients send GET /?sql=SELECT%201. Left: Apache/PHP in front of MySQL, 400 Req/s, Load 34. Right: MySQL alone, 1606 Req/s, Load 2.5]
  • The speaker says... MySQL understands a new GET /?sql=<mysql> command. <mysql> is used as a statement string. The statement has been executed. The result has been formatted as a JSON document. An HTTP reply has been sent. The MySQL HTTP Interface is multi-threaded. So is MySQL, ever since. The need for proxying (see the left) is gone. No extra deployment of a proxying solution any more. MySQL gets more system resources, resulting in a performance boost.
  • JavaScript for MySQL. Like with PHP extensions! Copy daemon plugin example, add your magic. Glue libraries: Google V8 JavaScript engine (BSD). Handle GET /?app=<name>. [Chart: requests/s (0 to 2500) over concurrency 1, 4, 8, 16, 32 (ab2 -c <n>); series: PHP, Server plugin]
  • The speaker says... The chart shows "Hello world" with Apache/PHP compared to MySQL/server-side JavaScript. I am caching the JavaScript source code once it's loaded from a database table. JavaScript is compiled and run upon each request. System load reported by top during ab2 -n 50000 -c 32 is 27 during the PHP test and 5 for MySQL/server-side JavaScript...
    mysql> select * from js_applications where name='internetsuperhero'\G
    *************************** 1. row ***************************
      name: internetsuperhero
    source: function main() { return "Hello world"; } main();
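The caching strategy described here, fetch the source once, compile and run on every request, can be tried out with the vm module of Node.js. This is only an analogy to the C++ embedding: `makeAppRunner` and `loadSource` are hypothetical names, `loadSource` stands in for the SQL lookup in js_applications, and for simplicity a fresh context is created per run.

```javascript
const vm = require("vm");

// Assumption: loadSource(name) stands in for reading js_applications via SQL.
function makeAppRunner(loadSource) {
  const sourceCache = new Map();
  return function runApp(name) {
    if (!sourceCache.has(name)) {
      sourceCache.set(name, loadSource(name)); // fetched once per application
    }
    // Compile and run on every request, as in the benchmarked PoC.
    const script = new vm.Script(sourceCache.get(name));
    return script.runInNewContext({});
  };
}

let loads = 0;
const runApp = makeAppRunner((name) => {
  loads++;
  return 'function main() { return "Hello world"; } main();';
});
console.log(runApp("internetsuperhero"), runApp("internetsuperhero"), loads);
```

The value of the last statement, the call to main(), becomes the result of the run, so the stored application needs no explicit output call.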
  • Embedding Google V8. Cache expensive operations. Keep v8::Context in thread local storage. Cache the script code after fetching from MySQL.
    #include <stdio.h>
    #include <v8.h>
    using namespace v8;
    int main(int argc, char* argv[]) {
      HandleScope handle_scope;
      Persistent<Context> context = Context::New();
      Context::Scope context_scope(context);
      Handle<String> source = String::New("'Hello' + ', World!'");
      Handle<Script> script = Script::Compile(source);
      Handle<Value> result = script->Run();
      context.Dispose();
      String::AsciiValue ascii(result);
      printf("%s\n", *ascii);
      return 0;
    }
  • The speaker says... Google V8 is the JavaScript engine used in Google Chrome, Google's open source browser. It is written in C++ and said to be a fast engine. It is used by node.js and some NoSQL databases. Armed with the previously developed function query_in_thd() to fetch the source of a "MySQLApp" stored in a table into a string, it's easy going. Learn the basic concepts of V8 and make it happen. Once done with the V8 documentation, study http://www.codeproject.com/Articles/29109/Using-V8-Google-s-Chrome-JavaScript-Virtual-Machin
  • Load the source
    int conn_js_v8_run_program(const char * name, ::String * script_result) {
      /* ... */
      buffer_len = sizeof(SELECT_JS_APPLICATIONS) + 1 + strlen(name) + 3;
      buffer = (char*)my_malloc(buffer_len, MYF(0));
      /* KLUDGE: escaping */
      my_snprintf(buffer, buffer_len, "%s%s", SELECT_JS_APPLICATIONS, name);
      query_in_thd(&json_result, buffer);
      my_free(buffer);
      /* ... */
      /* Copy the source out of the JSON result, skipping the leading [[" */
      buffer_len = json_result.length() - 6;
      buffer = (char*)my_malloc(buffer_len, MYF(0));
      for (i = 3; i < (buffer_len + 2); i++) {
        buffer[i - 3] = json_result.c_ptr()[i];
      }
      buffer[buffer_len - 1] = 0;
      conn_js_v8_run_code(buffer, script_result);
      my_free(buffer);
    }
  • The speaker says... Final code would store the source in a system table. The table would be accessed through the handler interface. NoSQL would be used, so to say. Many integrity checks would be done. However, you haven't learned yet how to use the handler interface. Thus, we use what we have: query_in_thd(). It is amazing how far we can get with only one function.
  • Make it run fast
    void conn_js_evhtp_init_thread(evhtp_t * htp, evthr_t * thread, void * arg) {
      /* ... */
      conn_js_v8_init_thread(&worker_data->v8_context);
      evthr_set_aux(thread, worker_data);
    }
    void conn_js_v8_init_thread(void ** tls) {
      v8_thread_context * context = (v8_thread_context *)my_malloc(...);
      /* ... */
      context->isolate = v8::Isolate::New();
      context->have_context = 0;
      *tls = context;
    }
    static void conn_js_v8_run_using_context(::String * res, void * tls) {
      v8_thread_context * context = (v8_thread_context *)tls;
      Isolate::Scope iscope(context->isolate);
      Locker l(context->isolate);
      HandleScope handle_scope;
      if (!context->have_context) {
        context->context = v8::Context::New();
        context->have_context = 1;
      }
      /*...*/
  • The speaker says... To boost the performance we cache v8::Context in the thread-local storage of our HTTP worker threads. The v8::Context is needed for compiling and running scripts. A v8::Context contains all built-in utility functions and objects. For fast multi-threaded V8, each HTTP worker gets its own v8::Isolate object. We want more than one global v8::Isolate to boost concurrency. Isolate? Think of it as a Mutex. Additionally, we cache the script source code in the TLS. Teach your HTTP server to call the functions. Done.
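The lazy creation of one context per worker can be mimicked with the vm module of Node.js. This is a sketch of the caching pattern only, not of V8 isolates or real threads. `createWorkerState` and `runUsingContext` are hypothetical names, and the plain object plays the role of the thread-local storage.

```javascript
const vm = require("vm");

// Hypothetical per-worker state, playing the role of the thread-local storage.
function createWorkerState() {
  return { context: null, contextsCreated: 0 };
}

// Create the context on first use only, like the have_context flag on the slide.
function runUsingContext(worker, source) {
  if (worker.context === null) {
    worker.context = vm.createContext({});
    worker.contextsCreated++;
  }
  return vm.runInContext(source, worker.context);
}

const worker = createWorkerState();
runUsingContext(worker, "var counter = 1;");
console.log(runUsingContext(worker, "counter + 1"), worker.contextsCreated);
```

The second run still sees the variable set by the first one. A cached context saves setup time, but global state survives from one request to the next on the same worker, which scripts have to take into account.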
  • Proof: Server-side JavaScript
    ~/> curl -v http://127.0.0.1:8080/?app=internetsuperhero
    * About to connect() to 127.0.0.1 port 8080 (#0)
    * Trying 127.0.0.1... connected
    > GET /?app=internetsuperhero HTTP/1.1
    > User-Agent: curl/7.22.0 (i686-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.0e zlib/1.2.5 c-ares/1.7.5 libidn/1.22 libssh2/1.2.9
    > Host: 127.0.0.1:8080
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Content-Type: text/plain
    < Content-Length: 11
    <
    * Connection #0 to host 127.0.0.1 left intact
    * Closing connection #0
    Hello world
  • The speaker says... Boring MySQLApp! JavaScript has no access to MySQL tables!
  • We are here... [Diagram: GET hello.php vs. GET /app=hello, 32 concurrent clients. Left: Apache/PHP, 1107 Req/s, Load 27. Right: MySQL/JavaScript, 2360 Req/s, Load 5]
  • The speaker says... Quick recap. MySQL has a built-in multi-threaded web server. Users can use JavaScript for server-side scripting. "Hello world" runs faster than Apache/PHP. The CPU load is lower. This is an intermediate step. This is not a new general purpose web server. Next: Server-side JavaScript gets SQL access.
  • Server-side JS does SELECT 1. Like with PHP extensions! Copy daemon plugin example, add your magic. Glue libraries: Google V8 JavaScript engine (BSD). Handle GET /?app=<name>. [Chart: requests/s (0 to 1600) over concurrency 1, 4, 8, 16, 32 (ab2 -c <n>); series: PHP, Server plugin]
  • The speaker says... The charts and the system load are what we all expect. The MySQL Server daemon plugin proof of concept remains in the top position. It is faster and uses less CPU. Here's the server-side JavaScript I have benchmarked. The PHP counterpart is using mysqli() to execute "SELECT 1" and converts the result into JSON.
    mysql> select * from js_applications where name='select'\G
    *************************** 1. row ***************************
      name: select
    source: function main() { return ulf("SELECT 1"); } main();
    1 row in set (0,00 sec)
  • ulf() for server-side JavaScript
    static void conn_js_v8_run_using_context(::String * script_result, void * tls) {
      /* ... */
      if (!context->have_context) {
        Handle<ObjectTemplate> global = ObjectTemplate::New();
        global->Set(v8::String::New("ulf"), FunctionTemplate::New(UlfCallback));
        context->context = v8::Context::New(NULL, global);
        context->have_context = 1;
      }
      /* ... */
    }
    static Handle<Value> UlfCallback(const Arguments& args) {
      if (args.Length() < 1) return v8::Undefined();
      ::String json_result;
      HandleScope scope;
      Handle<Value> sql = args[0];
      v8::String::AsciiValue value(sql);
      query_in_thd(&json_result, *value);
      return v8::String::New(json_result.c_ptr(), json_result.length());
    }
  • The speaker says... Sometimes programming means to glue pieces together. This time, query_in_thd() is connected with V8. Imagine, server-side JavaScript had access to more functions to fetch data. That would be fantastic for map & reduce, assuming you want it.
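The gluing can be reproduced in miniature with the vm module of Node.js: a host function is placed on the global object of the script context, much like ulf is set on the ObjectTemplate. This is a sketch under assumptions. `createAppContext` is a hypothetical name and `queryInThd` stands in for the plugin's query_in_thd().

```javascript
const vm = require("vm");

// Assumption: queryInThd(sql) stands in for the plugin's query_in_thd().
function createAppContext(queryInThd) {
  // The sandbox object becomes the global object of the script,
  // similar to the ObjectTemplate with "ulf" set on it.
  return vm.createContext({
    // Like UlfCallback: without an argument the result is undefined.
    ulf: (sql) => (sql === undefined ? undefined : queryInThd(String(sql))),
  });
}

const context = createAppContext((sql) => (sql === "SELECT 1" ? '[["1"]]' : "[]"));
const source = 'function main() { return ulf("SELECT 1"); } main();';
console.log(vm.runInContext(source, context)); // prints [["1"]]
```

Exposing further host functions works the same way: each additional property on the sandbox object becomes a global function of the server-side script.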
  • Proof: JS runs ulf(SELECT 1)
    > curl -v http://127.0.0.1:8080/?app=select
    * About to connect() to 127.0.0.1 port 8080 (#0)
    * Trying 127.0.0.1... connected
    > GET /?app=select HTTP/1.1
    > User-Agent: curl/7.22.0 (i686-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.0e zlib/1.2.5 c-ares/1.7.5 libidn/1.22 libssh2/1.2.9
    > Host: 127.0.0.1:8080
    > Accept: */*
    >
    < HTTP/1.1 200 OK
    < Content-Type: text/plain
    < Content-Length: 7
    <
    * Connection #0 to host 127.0.0.1 left intact
    * Closing connection #0
    [["1"]]
  • The speaker says... SELECT 1 is great to show the base performance of a technology. SQL runtime is as short as possible. SQL runtime is constant. SQL runtime contributes little to overall runtime. With long running SQL, the majority of the time is spent on SQL. The differences between proxying through Apache/PHP and server-side JavaScript diminish. But SELECT 1 is still boring. What if we put a BLOB into MySQL, store JSON documents in it and filter them at runtime using server-side JavaScript?
  • We are here... [Diagram: 32 concurrent JavaScript clients send GET /?app=select1. Left: Apache/PHP in front of MySQL, 448 Req/s, Load 34. Right: MySQL with server-side JavaScript, 1312 Req/s, Load 5.5]
  • The speaker says... This is a second intermediate step on the way to the main question that has driven the author: how could MySQL be more than SQL. MySQL is not only SQL. Next: poor man's document storage.
  • JS to filter JSON documents. Like with PHP extensions! Copy daemon plugin example, add magic. Glue libraries: Google V8 JavaScript engine (BSD). Handle GET /?app=<name>. [Chart: requests/s (0 to 800) over concurrency 1, 4, 8, 16, 32 (ab2 -c <n>); series: PHP, Server plugin]
  • The speaker says... MySQL with server-side JavaScript is still faster than PHP, but at 32 concurrent clients the system load reported by top is 9. Have we reached a dead end? Or, should I buy myself a new notebook? The subnotebook that runs all benchmarks in a VM is four years old. Please, take my absolute performance figures with a grain of salt. Benchmarking on modern commodity server hardware was no goal.
  • We are here... [Diagram: 32 concurrent JavaScript clients send GET /?map=greetings. Left: Apache/PHP in front of MySQL, 358 Req/s, Load 33.5. Right: MySQL with server-side JavaScript and JSON documents stored in a BLOB, 641 Req/s, Load 9]
  • The speaker says... The illustration shows the vision together with first benchmark impressions. This is how far I got linking three BSD software libraries to the MySQL Server using a MySQL Server daemon plugin. Allow me a personal note: this was the status a week before the presentation.
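Since the absolute figures come from an old notebook, the relative numbers are the more useful ones. The short calculation below puts the requests/s at 32 concurrent clients from the "We are here..." slides side by side. The test labels are shortened descriptions, not names used by the plugin.

```javascript
// Requests/s at 32 concurrent clients, copied from the "We are here..." slides.
const results = [
  { test: "GET /?sql=SELECT 1", php: 400, plugin: 1606 },
  { test: "Hello world", php: 1107, plugin: 2360 },
  { test: "JavaScript runs SELECT 1", php: 448, plugin: 1312 },
  { test: "Filter JSON documents", php: 358, plugin: 641 },
];

// Speedup of the server plugin over the Apache/PHP setup.
function speedup(row) {
  return row.plugin / row.php;
}

for (const row of results) {
  console.log(row.test + ": " + speedup(row).toFixed(1) + "x");
}
```

The factors range from roughly 1.8x for filtering JSON documents to roughly 4.0x for the plain SQL request, in line with the "2-4x faster" claim from the abstract.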
  • Mission JSON document mapping. Use the full potential of MySQL as storage. Stream results into the map() function? Handler interface instead of SQL. Cache the result: create a view.
  • The speaker says... Attention: you are leaving the relational model and entering the NoSQL section of this talk. We are no longer talking about relations. We talk about JSON documents. SQL as an access language can't be used anymore. We must map & reduce documents to filter out information, cache results and use triggers to maintain integrity between derived, cached documents and originals. If you want, you can have access to the API used inside MySQL to execute SQL: the handler interface.
  • Maybe the loops are slow? Too many iterations to filter the document? First loop inside the plugin to fetch rows. Storing all rows in a string eats memory. Second loop inside the server-side JavaScript.
    function filter_names() {
      var s = ulf("SELECT document FROM test_documents");
      var docs = JSON.parse(s);
      var res = [];
      for (var i = 0; i < docs.length; i++) {
        var doc = JSON.parse(docs[i]);
        if (doc.firstname !== undefined) {
          res[i] = "Hi " + doc.firstname;
        }
      }
      return JSON.stringify(res);
    }
  • The speaker says... The map() function API is not beautiful. First, we iterate over all rows in our plugin and create a string. Then, we pass the string to JavaScript and do the same loop again.
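The double loop can be run outside of MySQL with a stub in place of ulf(). This is a sketch under assumptions: the sample documents are made up, the stub returns rows in the array-of-arrays form seen in the curl output, and the function is renamed to `filterNames` with small changes (explicit column access, push() instead of indexed assignment).

```javascript
// Stub for the PoC's ulf(): returns rows as a JSON array of arrays of strings.
// The sample documents are made up for illustration.
function ulf(sql) {
  const rows = [
    ['{"firstname": "Ulf", "lastname": "Wendel"}'],
    ['{"lastname": "Nobody"}'],
    ['{"firstname": "Jane"}'],
  ];
  return JSON.stringify(rows);
}

function filterNames() {
  // The first loop already ran inside ulf(); this is the second one.
  const rows = JSON.parse(ulf("SELECT document FROM test_documents"));
  const res = [];
  for (const row of rows) {
    const doc = JSON.parse(row[0]); // first and only column holds the document
    if (doc.firstname !== undefined) {
      res.push("Hi " + doc.firstname); // push() avoids holes for skipped rows
    }
  }
  return JSON.stringify(res);
}

console.log(filterNames()); // prints ["Hi Ulf","Hi Jane"]
```

With indexed assignment, skipped rows would leave holes in the array, and JSON.stringify() would write them as null entries.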
  • Maybe this is faster? Use handler interface. Open table. For each row: populate C++ object with document. For each row: run map() function and access object.
    function map() {
      var res;
      var row = JSON.parse(doc.before);
      if (row.firstname !== undefined)
        res = "Hi " + row.firstname;
      doc.after = JSON.stringify(res);
    }
    map();
  • The speaker says... The user API, the JavaScript function, is still not nice but a step forward. A good one?
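The per-row contract, the plugin fills doc.before and the script fills doc.after, can be simulated in plain JavaScript. This is a sketch under assumptions: `mapRows` is a hypothetical stand-in for the handler loop in the plugin, and the exchange object is passed as a parameter, whereas on the slide doc is a global provided by the C++ side.

```javascript
// The per-row map() function from the slide, taking doc as a parameter.
function map(doc) {
  let res;
  const row = JSON.parse(doc.before);
  if (row.firstname !== undefined) {
    res = "Hi " + row.firstname;
  }
  // JSON.stringify(undefined) is undefined, so rows without a match
  // leave doc.after unset.
  doc.after = JSON.stringify(res);
}

// Stand-in for the plugin's handler loop: one exchange object per row.
function mapRows(documents) {
  return documents.map((before) => {
    const doc = { before: before, after: undefined };
    map(doc);
    return doc.after;
  });
}

console.log(mapRows(['{"firstname": "Ulf"}', '{"lastname": "Nobody"}']));
```

Only one document is held in memory at a time, which addresses the concern about storing all rows in one string.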
  • Using the handler interface
    int handler_copy_example(const char * db_name, const char * from_table_name, const char * to_table_name, String * result) {
      /* ... */
      TABLE_LIST tables[2];
      TABLE * table_from = NULL;
      TABLE * table_to = NULL;
      /* create and setup THD as in query_in_thd() */
      /* from sql_acl.cc */
      tables[0].init_one_table(db_name, strlen(db_name), from_table_name, strlen(from_table_name), from_table_name, TL_READ);
      tables[1].init_one_table(db_name, strlen(db_name), to_table_name, strlen(to_table_name), to_table_name, TL_WRITE);
      tables[0].next_local = tables[0].next_global = tables + 1;
      open_and_lock_tables(thd, tables, FALSE, MYSQL_LOCK_IGNORE_TIMEOUT);
      table_from = tables[0].table;
      table_from->use_all_columns();
      table_to = tables[1].table;
      table_from->file->ha_rnd_init(TRUE);
  • The speaker says... To demonstrate the handler interface I show a function that copies all rows from one table to another. It is assumed that the tables have identical structures. The loop has most of what is needed to create a "view" or read from a "view". Before you can use the handler interface you must create a THD object. Use the setup and tear down code from query_in_thd(). Once done, create a table list to be passed to open_and_lock_tables(), tell the handler that we will access all columns and announce our plan to start reading by calling ha_rnd_init().
  • Using the handler interface
      do {
        if ((err = table_from->file->ha_rnd_next(table_to->record[0]))) {
          switch (err) {
            case HA_ERR_RECORD_DELETED:
            case HA_ERR_END_OF_FILE:
              goto close;
              break;
            default:
              table_from->file->print_error(err, MYF(0));
              goto close;
          }
        } else {
          table_to->file->ha_write_row(table_to->record[0]);
        }
      } while (1);

    close:
      /* from sql_base.cc - open_and_lock_tables failure */
      table_from->file->ha_rnd_end();
      if (! thd->in_sub_stmt) {
        trans_commit_stmt(thd);
      }
      close_thread_tables(thd);
  • The speaker says... Read rows from one table into a buffer and write the buffer into the target table. Stop in case of an error or when all rows have been read. Such loops can be found all over the MySQL Server code. When done, close the table handles and tear down THD before exiting.
  • Extracting data for map()
    my_bitmap_map * old_map;
    my_ptrdiff_t offset;
    Field * field;
    ::String tmp, *val;
    /* ... */
    do { /* the handler loop */
      old_map = dbug_tmp_use_all_columns(table_from, table_from->read_set);
      offset = (my_ptrdiff_t)0;
      for (i = 0; i < table_from->s->fields; i++) {
        field = table_from->field[i];
        field->move_field_offset(offset);
        if (!field->is_null()) {
          /* document is the C++/JavaScript data exchange object */
          document->before = field->val_str(&tmp, &tmp);
          /* run map() function */
          result = v8::script->Run();
          /* store modified value */
          field->store(document->after.c_ptr(), document->after.length(),
                       system_charset_info);
        }
        /* undo the offset for every field, NULL or not */
        field->move_field_offset(-offset);
      }
      dbug_tmp_restore_column_map(table_from->read_set, old_map);
      /* ... */
    } while (1);
  • The speaker says... This code goes into the handler loop instead of the simple copy done with table_to->file->ha_write_row(table_to->record[0]); For reference it is shown how to loop over all columns of a row and extract the data. In case of the document mapping one needs to read only the data for the BLOB column and call the JavaScript map() function. A C++ object is used for data exchange with JavaScript. The object is populated before the map() function is run and inspected afterward.
  • Surprise: no major difference
    Are C++/V8-JS context switches expensive?
    Calling JS for every row is a bad idea?
    Using C++ object for data exchange does not fly?
    We should send rows in batches to reduce switches
    [Chart: Requests/s (0 to 800) by concurrency (ab2 -c <n>: 1, 4, 8, 16, 32); series: Server plugin (SQL), Server plugin (Handler interface)]
  • The speaker says... Calling the map function for every row reduces the performance a bit. Let's recap how good performance is. It is twice as fast as the PHP/Apache proxying approach. Detailed bottleneck analysis and further benchmarking is beyond the scope and interest of this proof of concept. It has been proven that mapping is possible, at very reasonable performance.
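The batching idea from the slide can be sketched as follows. This is an untested hypothesis in the talk, not something the plugin does: `mapBatch()` and `runInBatches()` are hypothetical names, and the call counter only stands in for C++/V8 context switches, which plain JavaScript cannot measure.

```javascript
// Per-document logic, same greeting filter as in the slides.
function mapOne(row) {
  return row.firstname !== undefined ? "Hi " + row.firstname : undefined;
}

// Hypothetical batch entry point: one call from the host handles many rows.
// Input and output are JSON strings, mirroring the string exchange in the talk.
function mapBatch(batchJson) {
  var rows = JSON.parse(batchJson);
  var out = [];
  for (var i = 0; i < rows.length; i++) {
    var res = mapOne(JSON.parse(rows[i]));
    if (res !== undefined) out.push(res);
  }
  return JSON.stringify(out);
}

// Hypothetical host side: groups rows into batches and counts how often it
// has to call into the script.
function runInBatches(rows, batchSize) {
  var calls = 0;
  var results = [];
  for (var start = 0; start < rows.length; start += batchSize) {
    var batch = rows.slice(start, start + batchSize);
    results = results.concat(JSON.parse(mapBatch(JSON.stringify(batch))));
    calls++;
  }
  return { calls: calls, results: results };
}

// Made-up sample data: ten documents, processed four at a time.
var rows = [];
for (var n = 0; n < 10; n++) rows.push(JSON.stringify({ firstname: "User" + n }));
var batched = runInBatches(rows, 4);
console.log(batched.calls, batched.results.length);
```

Ten rows in batches of four need three calls instead of ten. Whether that saves more than the extra serialization costs would have to be measured.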
  • Single threaded read
    8,300 documents mapped per second with V8
    8,700 docs/s if map() is an empty function
    11,500 docs/s if not calling map()
    12,800 docs/s is the base without V8 during read
    [Chart: Documents processed per second (0 to 14,000) at concurrency 1 (ab2 -c <n>); series: No V8 in loop, V8 but no script run, V8 with empty map function, V8 with filtering map function]
  • The speaker says... There is a simple way to get to the baseline of 12,800 documents read per second: we cache the result in a "view". The view is a SQL table that the plugin creates when the view is accessed for the first time. Triggers could be used to update a view whenever underlying data changes. Please note, the figure of 12,800 is extrapolated from ab2 -n 1000 -c 1 127.0.0.1:8080/?map=<name>, which repeatedly scans a small table with 522 rows (documents) using the handler interface.
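A minimal in-memory sketch of the "view" idea, under stated assumptions: the plugin would persist the view as a SQL table, while this sketch keeps it in an object. `readView()` and `insertDocument()` are hypothetical names, and the trigger stand-in simply drops cached views instead of updating them in place.

```javascript
// Made-up sample documents standing in for the BLOB column of the table.
var documents = ['{"firstname":"Ulf"}', '{"lastname":"Doe"}'];
var views = {};   // cached map results, keyed by view name
var mapRuns = 0;  // how often the full mapping pass had to run

function greeting(row) {
  return row.firstname !== undefined ? "Hi " + row.firstname : undefined;
}

// Build the view on first access, serve the cached copy afterwards.
function readView(name, mapFn) {
  if (!Object.prototype.hasOwnProperty.call(views, name)) {
    var rows = [];
    for (var i = 0; i < documents.length; i++) {
      var res = mapFn(JSON.parse(documents[i]));
      if (res !== undefined) rows.push(res);
    }
    views[name] = rows;
    mapRuns++;
  }
  return views[name];
}

// Stand-in for a trigger: a write to the underlying data invalidates all
// derived views so the next read rebuilds them.
function insertDocument(json) {
  documents.push(json);
  views = {};
}

var first = readView("greeting", greeting);
var second = readView("greeting", greeting);  // served from the cache
insertDocument('{"firstname":"Jane"}');
var third = readView("greeting", greeting);   // rebuilt after the write
console.log(mapRuns, first.length, third.length);
```

Repeated reads skip the script entirely, which is why a cached view can approach the read-only baseline.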
  • Map and reduce with MySQL
    [Diagram: 32 concurrent JavaScript clients send GET /map=greeting to MySQL, which holds the JSON documents and runs server-side JavaScript. Variants shown: SQL, Handler. Figures shown: 641 Req/s, Load 9; 571 Req/s, Load 9]
  • The speaker says... As the name says, Map&Reduce is a two stage process. Mapping is optionally followed by reducing. If you are new to map and reduce, think of reduce as the aggregation step in SELECT <column> FROM <table> GROUP BY <criteria>. It has been shown that map() is possible. Results can be persisted in a table. Reducing can be understood as a second mapping that works on the results of the map() function. Mapping has been proven to be possible, thus reducing is. Implementation was beyond the author's goals.
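Since the talk does not implement reduce, the following is only an illustration of the GROUP BY analogy. The sample documents, the `city` field and the `mapReduce()` helper are all assumptions. The map step emits a key/value pair per document, and the reduce step aggregates the values per key, roughly like SELECT city, COUNT(*) ... GROUP BY city.

```javascript
// Made-up sample documents.
var documents = [
  '{"firstname":"Ulf","city":"Kiel"}',
  '{"firstname":"Jane","city":"Berlin"}',
  '{"firstname":"John","city":"Kiel"}',
  '{"firstname":"Max"}'
];

// Map step: emit [key, value] for documents that have a city, skip the rest.
function map(row) {
  if (row.city === undefined) return undefined;
  return [row.city, 1];
}

// Reduce step: aggregate all values emitted for one key.
function reduce(key, values) {
  var sum = 0;
  for (var i = 0; i < values.length; i++) sum += values[i];
  return sum;
}

// Hypothetical driver: group the mapped pairs by key, then reduce each group.
function mapReduce(docs, mapFn, reduceFn) {
  var groups = {};
  for (var i = 0; i < docs.length; i++) {
    var pair = mapFn(JSON.parse(docs[i]));
    if (pair === undefined) continue;
    if (!Object.prototype.hasOwnProperty.call(groups, pair[0])) {
      groups[pair[0]] = [];
    }
    groups[pair[0]].push(pair[1]);
  }
  var result = {};
  for (var key in groups) {
    if (Object.prototype.hasOwnProperty.call(groups, key)) {
      result[key] = reduceFn(key, groups[key]);
    }
  }
  return result;
}

var counts = mapReduce(documents, map, reduce);
console.log(JSON.stringify(counts));
```

The grouping pass plays the role of the persisted map() results; reduce then works only on those results, as the speaker notes describe.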
  • Areas for future work
    Imagine someone created a BLOB optimized storage engine for MySQL. Storage engine development is covered in the books...
    Imagine WebSocket were used instead of HTTP. WebSocket is a raw "do as you like" connection with much less overhead. GET /?sql=SELECT%201 returns 71 bytes of which 7 are the payload...
    Imagine WebSocket were used: transactions, events, streaming, all this is within reach...
  • PS: This is a proof of concept. No less, no more. I have created it in my after-work office. For the next couple of weeks I plan to focus on nothing but my wedding. Otherwise the bride may decide that a 19-year-long evaluation is not enough. She might fear I could be coding during the ceremony... Would you create the MySQL HTTP Interface, today? I'm busy with the wedding. Happy Hacking!
  • THE END
    Contact: ulf.wendel@oracle.com