2. Who am I?
Name: Kazuho Oku (奥一穂)
Original Developer of Palmscape / Xiino
The oldest web browser for Palm OS
Worked at Cybozu Labs from 2005 to 2010
Research subsidiary of Cybozu, Inc. in Japan
Developed Japanize / Mylingual, Q4M, etc.
Now working at DeNA Co., Ltd.
with the developers of the HandlerSocket plugin
Mar 26 2011 Using Q4M 2
3. What is Q4M?
A message queue
runs as a storage plugin of MySQL 5.1
Why is it a MySQL plugin?
accessible by using existing MySQL clients
no need for a new client library
administrable by using SQL
friendly to DB admins
4. Design Goals of Q4M
Robust
Does not lose data on OS crash or power failure
necessary for Tokyo without nuclear power plants… orz
Fast
Transfer thousands of messages per second
Easy to Use
Use SQL for access / maintenance
Integration into MySQL
no more separate daemons to take care of
5. Users of Q4M
Many leading web services in Japan
DeNA Co., Ltd.
livedoor Co., Ltd.
mixi, Inc.
Zynga Japan (formerly Unoh, Inc.)
6. Agenda
What MQ is in General
Applications of Q4M
Brief Tutorial
7. What is a Message Queue?
8. What is a Message Queue?
Middleware for persistent asynchronous communication
communicate between fixed pairs (parties)
a.k.a. Message Oriented Middleware
MQ is intermediate storage
RDBMS is persistent storage
Senders / receivers may go down
9. Minimal Configuration of a MQ
Senders and receivers access a single queue
[Diagram: Sender → Queue → Receiver]
10. MQ and Relays
Separate queue for sender and receiver
Messages relayed between queues
[Diagram: Sender → Queue → Relay → Queue → Receiver]
11. Merits of Message Relays
Destination can be changed easily
Relays may transfer messages to different
locations depending on their headers
Robustness against network failure
no loss or duplicates when the relay fails
Logging and Multicasting, etc.
12. Message Brokers
Publish / subscribe model
Separation between components and their
integration
Components read / write to predefined queues
Integration is definition of routing rules between
the message queues
Messages are often transformed (filtered) within
the relay agent
13. What about Q4M?
Q4M itself is a message queue
Can connect Q4M instances to create a
message relay
Provides API for creating message relays
and brokers
14. Performance of Q4M
Over 7,000 messages/sec.
message size: avg. 512 bytes
syncing to disk
Exceeds what most applications need
if you need more, just scale out
Can coexist with other storage engines without
sacrificing their performance
see http://labs.cybozu.co.jp/blog/kazuhoatwork/2008/06/q4m_06_release_and_benchmarks.php
16. Asynchronous Updates
DeNA uses Q4M for sending notifications
to users asynchronously
http://engineer.dena.jp/2010/03/dena-technical-seminar-1-2.html
17. Delay Peak Demands
mixi (one of Japan's largest SNSs) uses Q4M to buffer writes to the DB, deferring peak demand
from http://alpha.mixi.co.jp/blog/?p=272
18. Connecting Distant Servers
Pathtraq uses Q4M to create a relay
between its database and content
analysis processes
[Diagram: Pathtraq DB ⇄ MySQL conn. over SSL, gzip ⇄ Content Analysis Processes]
→ Contents to be analyzed →
← Results of the analysis ←
19. Prefetch Data
livedoor Reader (web-based feed
aggregator) uses Q4M to prefetch data
from database to memcached
uses Q4M for scheduling web crawlers
as well
from http://d.hatena.ne.jp/mala/20081212/1229074359
20. Scheduling Web Crawlers
Web crawlers with retry-on-error
Sample code included in Q4M dist.
If a fetch fails, the URL is stored in the retry queue
[Diagram: Spiders, URL DB, Request Queue, Retry Queue, Re-scheduler; arrows labeled "Read URL" and "Store Result"]
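The retry flow above can be sketched in a few lines. This is a minimal in-memory model, not the sample code shipped with Q4M: the two queues are plain Python deques rather than Q4M tables, and the names `crawl`, `fetch`, and `max_attempts` are hypothetical, introduced only for illustration.

```python
from collections import deque


def crawl(urls, fetch, max_attempts=3):
    """Model of a crawler with a request queue and a retry queue.

    `fetch` is any callable that returns a result or raises on failure.
    Returns (results, failed): fetched results by URL, and URLs given up on.
    """
    request_q = deque((url, 0) for url in urls)  # (url, attempts so far)
    retry_q = deque()
    results, failed = {}, []
    while request_q or retry_q:
        if not request_q:
            # Re-scheduler: move retry entries back to the request queue.
            request_q.extend(retry_q)
            retry_q.clear()
        url, attempts = request_q.popleft()      # spider reads a URL
        try:
            results[url] = fetch(url)            # spider stores the result
        except Exception:
            if attempts + 1 < max_attempts:
                retry_q.append((url, attempts + 1))
            else:
                failed.append(url)
    return results, failed
```

In a real deployment the two deques would be Q4M tables, so the pending URLs would survive a crash of the crawler.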
21. Delayed Content Generation
Hatetter (an RSS feed-to-Twitter API gateway) uses Q4M to delay content generation
Source code: github.com/yappo/website-hatetter
23. Installing Q4M
Compatible with MySQL 5.1
Download from q4m.github.com
Binary releases available for some platforms
Installing from source:
requires source code of MySQL
./configure && make && make install
run support-files/install.sql
24. Configuration Options of Q4M
--with-sync=no|fsync|fdatasync|fcntl
Controls synchronization to disk
default: fdatasync on Linux
--enable-mmap
Mmap’ed reads lead to higher throughput
default: yes
--with-delete=pwrite|msync
msync recommended on Linux >= 2.6.20 if you need really high performance
26. The Model
Various publishers write to queue
Set of subscribers consume the entries in queue
[Diagram: Publisher, Publisher, Publisher → Q4M table → Subscribers]
27. Creating a Q4M Table
ENGINE=QUEUE creates a Q4M table
No primary keys or indexes
Sorted by insertion order (it’s a queue)
mysql> CREATE TABLE qt (
    ->   id int(10) unsigned NOT NULL,
    ->   message varchar(255) NOT NULL
    -> ) ENGINE=QUEUE;
Query OK, 0 rows affected (0.42 sec)
28. Modifying Data on a Q4M Table
No restrictions for INSERT and DELETE
No support for UPDATE
mysql> INSERT INTO qt (id,message)
    -> VALUES
    -> (1,'Hello'),
    -> (2,'Bonjour'),
    -> (3,'Hola');
Query OK, 3 rows affected (0.02 sec)
mysql> SELECT * FROM qt;
+----+---------+
| id | message |
+----+---------+
|  1 | Hello   |
|  2 | Bonjour |
|  3 | Hola    |
+----+---------+
3 rows in set (0.00 sec)
29. SELECT from a Q4M Table
Works the same as other storage engines
SELECT COUNT(*) is cached
mysql> SELECT * FROM qt;
+----+---------+
| id | message |
+----+---------+
|  1 | Hello   |
|  2 | Bonjour |
|  3 | Hola    |
+----+---------+
3 rows in set (0.00 sec)
mysql> SELECT COUNT(*) FROM qt;
+----------+
| COUNT(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)
How to subscribe to a queue?
30. Calling queue_wait()
After calling, only one row becomes visible from the connection
mysql> SELECT * FROM qt;
+----+---------+
| id | message |
+----+---------+
|  1 | Hello   |
|  2 | Bonjour |
|  3 | Hola    |
+----+---------+
3 rows in set (0.00 sec)
mysql> SELECT queue_wait('qt');
+------------------+
| queue_wait('qt') |
+------------------+
|                1 |
+------------------+
1 row in set (0.00 sec)
mysql> SELECT * FROM qt;
+----+---------+
| id | message |
+----+---------+
|  1 | Hello   |
+----+---------+
1 row in set (0.00 sec)
31. OWNER Mode and NON-OWNER Mode
In OWNER mode, only the OWNED row is visible
OWNED row becomes invisible to other connections
rows of other storage engines are visible
[Diagram: NON-OWNER mode (rows 1,'Hello' / 2,'Bonjour' / 3,'Hola' visible) → queue_wait() → OWNER mode (only 1,'Hello' visible) → queue_end() or queue_abort() → NON-OWNER mode]
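The visibility rules of the two modes can be illustrated with a toy in-memory model. This is only a sketch of the semantics described on these slides, not Q4M's implementation: the class `ModelQueue` and its method names are hypothetical, connections are represented by arbitrary hashable ids, and `wait` returns immediately instead of blocking until a timeout.

```python
import itertools


class ModelQueue:
    """Toy model of OWNER / NON-OWNER visibility for a single queue table."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.rows = {}    # internal row id -> row, kept in insertion order
        self.owned = {}   # connection -> owned row id (None: nothing available)

    def insert(self, row):
        self.rows[next(self._ids)] = row

    def wait(self, conn):
        """Model of queue_wait(): enter OWNER mode, owning at most one row."""
        self.end(conn)  # a call made in OWNER mode deletes the owned row first
        taken = set(self.owned.values())
        for row_id in self.rows:
            if row_id not in taken:
                self.owned[conn] = row_id
                return 1
        self.owned[conn] = None  # OWNER mode is entered even with no row
        return 0

    def end(self, conn):
        """Model of queue_end(): delete the owned row, leave OWNER mode."""
        row_id = self.owned.pop(conn, None)
        if row_id is not None:
            del self.rows[row_id]

    def abort(self, conn):
        """Model of queue_abort(): release the owned row, leave OWNER mode."""
        self.owned.pop(conn, None)

    def select(self, conn):
        """Rows visible to conn: the owned row in OWNER mode, else unowned rows."""
        if conn in self.owned:
            row_id = self.owned[conn]
            return [] if row_id is None else [self.rows[row_id]]
        taken = set(self.owned.values())
        return [row for row_id, row in self.rows.items() if row_id not in taken]
```

With rows (1,'Hello'), (2,'Bonjour'), (3,'Hola') inserted, a connection that calls `wait` sees only (1,'Hello'), while another connection sees the remaining two rows.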
32. Returning to NON-OWNER mode
By calling queue_abort, the connection returns to NON-OWNER mode
mysql> SELECT QUEUE_ABORT();
+---------------+
| QUEUE_ABORT() |
+---------------+
|             1 |
+---------------+
1 row in set (0.00 sec)
mysql> SELECT * FROM qt;
+----+---------+
| id | message |
+----+---------+
|  1 | Hello   |
|  2 | Bonjour |
|  3 | Hola    |
+----+---------+
3 rows in set (0.01 sec)
33. Consuming a Row
By calling queue_end, the OWNED row is deleted, and the connection returns to NON-OWNER mode
mysql> SELECT queue_wait('qt');
(snip)
mysql> SELECT * FROM qt;
+----+---------+
| id | message |
+----+---------+
|  1 | Hello   |
+----+---------+
1 row in set (0.01 sec)
mysql> SELECT queue_end();
+-------------+
| queue_end() |
+-------------+
|           1 |
+-------------+
1 row in set (0.01 sec)
mysql> SELECT * FROM qt;
+----+---------+
| id | message |
+----+---------+
|  2 | Bonjour |
|  3 | Hola    |
+----+---------+
2 rows in set (0.00 sec)
34. Writing a Subscriber
Call two functions: queue_wait, queue_end
Multiple subscribers can be run concurrently
each row in the queue is consumed only once
while (true) {
    SELECT queue_wait('qt');       # switch to owner mode
    rows := SELECT * FROM qt;      # obtain data
    if (count(rows) != 0)          # if we have any data, then
        handle_row(rows[0]);       # consume the row
    SELECT queue_end();            # erase the row from queue
}
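One iteration of the subscriber loop might look as follows in Python. This is a minimal sketch under assumptions: `cursor` is a cursor from any DB-API driver that uses the `%s` parameter style (as PyMySQL and mysqlclient do), the function name `consume_one` and the callback `handle_row` are hypothetical, and the table name comes from trusted code because identifiers cannot be passed as parameters. Calling queue_abort when the handler fails is a design choice of this sketch, so that the row is released rather than deleted.

```python
def consume_one(cursor, handle_row, table="qt"):
    """Run one iteration of the subscriber loop.

    Returns True if a row was handled, False if the wait timed out.
    """
    # Switch to OWNER mode; returns once a row is owned or the wait times out.
    cursor.execute("SELECT queue_wait(%s)", (table,))
    cursor.fetchall()
    # In OWNER mode only the owned row (if any) is visible.
    cursor.execute("SELECT * FROM " + table)
    rows = cursor.fetchall()
    try:
        if rows:
            handle_row(rows[0])
    except Exception:
        # Release the row instead of deleting it, then report the failure.
        cursor.execute("SELECT queue_abort()")
        cursor.fetchall()
        raise
    # Erase the owned row and return to NON-OWNER mode.
    cursor.execute("SELECT queue_end()")
    cursor.fetchall()
    return bool(rows)
```

A subscriber process would call `consume_one` in an endless loop; several such processes can run concurrently against the same table.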
35. Writing a Subscriber (cont'd)
Or call queue_wait as a condition
Warning: conflicts with trigger-based insertions
while (true) {
    rows := SELECT * FROM qt WHERE queue_wait('qt');
    if (count(rows) != 0)
        handle_row(rows[0]);
    SELECT queue_end();
}
36. The Model – with code
[Diagram: Publisher, Publisher, Publisher → Q4M table → Subscribers]
Each publisher runs:
INSERT INTO queue ...
Each subscriber runs:
while (true) {
    rows := SELECT * FROM qt
            WHERE queue_wait('qt');
    if (count(rows) != 0)
        handle_row(rows[0]);
    SELECT queue_end();
}
38. queue_wait(table)
Enters OWNER mode
0 or 1 row becomes OWNED
Enters OWNER mode even if no rows were
available
Default timeout: 60 seconds
Returns 1 if a row is OWNED (0 on timeout)
If called within OWNER mode, the
owned row is deleted
39. Revisiting Subscriber Code
Calls to queue_end just before queue_wait can be omitted
while (true) {
    rows := SELECT * FROM qt WHERE queue_wait('qt');
    if (count(rows) != 0)
        handle_row(rows[0]);
    # SELECT queue_end();   (can be omitted: the next queue_wait deletes the owned row)
}
40. Conditional queue_wait()
Consumes only rows that match a condition
Rows that do not match will be left untouched
Only numeric columns can be checked
Fast: the condition is tested once per row
examples:
SELECT queue_wait('table:(col_a*3)+col_b<col_c');
SELECT queue_wait('table:retry_count<5');
41. queue_wait(tbl_cond,[tbl_cond…,timeout])
Accepts multiple tables and timeout
Data searched from leftmost table to
right
Returns table index (the leftmost table is
1) of the newly owned row
Returns zero if no rows are being owned
example:
SELECT queue_wait('table_A','table_B',60);
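When the list of tables is built at run time, the queue_wait call and the interpretation of its return value can be wrapped in two small helpers. This is a sketch under assumptions: the helper names `queue_wait_sql` and `owned_table` are hypothetical, and the `%s` placeholders assume a DB-API driver with that parameter style (such as PyMySQL or mysqlclient).

```python
def queue_wait_sql(tables, timeout=None):
    """Build (sql, params) for a queue_wait call over one or more tables.

    Each entry of `tables` is either 'table' or 'table:condition'.
    """
    if not tables:
        raise ValueError("at least one table is required")
    params = list(tables)
    if timeout is not None:
        params.append(timeout)  # the timeout is the last argument
    placeholders = ", ".join(["%s"] * len(params))
    return "SELECT queue_wait(" + placeholders + ")", tuple(params)


def owned_table(tables, result):
    """Map queue_wait's return value to a table entry.

    The return value is a 1-based index into the argument list,
    or 0 when no row is owned.
    """
    return tables[result - 1] if result else None
```

For example, `queue_wait_sql(["table_A", "table_B"], 60)` yields the statement and parameters for the example above, and a return value of 2 from the server maps back to `"table_B"`.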
42. Functions for Exiting OWNER Mode
queue_end
Deletes the owned row and exits OWNER mode
queue_abort
Releases (instead of deleting) the owned row and
exits OWNER mode
Closing the MySQL connection does the same thing
44. The Model
Relay (or router) consists of more than 3 processes, 2 connections
No losses, no duplicates on crash or disconnection
[Diagram: Q4M Table (source) → Relay Program → Q4M Table (dest.)]
45. Internal Row ID
Every row has an internal row ID
invisible from Q4M table definition
monotonically increasing 64-bit integer
Used for detecting duplicates
Use two functions to skip duplicates
Data loss prevented by using queue_wait /
queue_end
46. queue_rowid()
Returns row ID of the OWNED row (if
any)
Returns NULL if no row is OWNED
Call when retrieving data from source
47. queue_set_srcid(src_tbl_id, mode, src_row_id)
Call before inserting a row to destination
table
Checks if the row is already inserted into the
table, and ignores next INSERT if true
Parameters:
src_tbl_id - id to determine source table (0 to 63)
mode - "a" to drop duplicates, "w" to reset
src_row_id - row ID obtained from source table
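The idea behind the duplicate check can be modeled in memory. This is a toy sketch, not Q4M's implementation: the class `DedupModel` is hypothetical, and the rule used here (an insert is dropped when its source row ID is not greater than the last one accepted from the same source) is an assumption derived from the statement that row IDs increase monotonically. Mode "w" is modeled as forgetting what was seen for that source.

```python
class DedupModel:
    """Toy model of a destination table that skips already-relayed rows."""

    SLOTS = 64  # source table ids 0..63

    def __init__(self):
        self.last_seen = [None] * self.SLOTS  # last accepted row id per source
        self.rows = []
        self._pending = None

    def set_srcid(self, src_tbl_id, mode, src_row_id):
        """Model of queue_set_srcid(): announce the origin of the next insert."""
        if not 0 <= src_tbl_id < self.SLOTS:
            raise ValueError("src_tbl_id must be in 0..63")
        if mode not in ("a", "w"):
            raise ValueError("mode must be 'a' or 'w'")
        if mode == "w":
            self.last_seen[src_tbl_id] = None  # reset the duplicate check
        self._pending = (src_tbl_id, src_row_id)

    def insert(self, row):
        """Insert a row; return False if it was ignored as a duplicate."""
        pending, self._pending = self._pending, None
        if pending is not None:
            slot, row_id = pending
            last = self.last_seen[slot]
            if last is not None and row_id <= last:
                return False  # already relayed: ignore this insert
            self.last_seen[slot] = row_id
        self.rows.append(row)
        return True
```

If a relay crashes after inserting a row but before removing it from the source, it will resend the same row ID after restarting, and the destination drops the second copy.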
48. Pseudo Code
Relays data from src_tbl to dest_tbl
while (true) {
    # wait for data
    SELECT queue_wait(src_tbl) => src_db;
    # read row and rowid
    row := (SELECT * FROM src_tbl => src_db);
    rowid := (SELECT queue_rowid() => src_db);
    # insert the row after setting srcid
    SELECT queue_set_srcid(src_tbl_id, 'a', rowid) => dest_db;
    INSERT INTO dest_tbl (row) => dest_db;
    # the next queue_wait deletes the owned row from src_tbl
}
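The relay step can also be written against two database cursors. This is a minimal sketch, not the q4m-forward script: the function name `relay_one` is hypothetical, both cursors are assumed to come from a DB-API driver using the `%s` parameter style (such as PyMySQL or mysqlclient), table names are assumed to come from trusted code, and the destination table is assumed to have the same columns as the source.

```python
def relay_one(src_cur, dest_cur, src_tbl, dest_tbl, src_tbl_id=0):
    """Forward at most one row from src_tbl to dest_tbl.

    Returns True if a row was forwarded, False if the wait timed out.
    """
    # Wait for data. When called again, queue_wait also deletes the row
    # owned by the previous call, so no explicit queue_end is needed.
    src_cur.execute("SELECT queue_wait(%s)", (src_tbl,))
    if not src_cur.fetchone()[0]:
        return False
    # Read the owned row and its internal row ID.
    src_cur.execute("SELECT * FROM " + src_tbl)
    row = src_cur.fetchone()
    src_cur.execute("SELECT queue_rowid()")
    row_id = src_cur.fetchone()[0]
    # Tell the destination where the next INSERT comes from, so that a
    # row that was already relayed is ignored.
    dest_cur.execute("SELECT queue_set_srcid(%s, %s, %s)",
                     (src_tbl_id, "a", row_id))
    dest_cur.fetchall()
    placeholders = ", ".join(["%s"] * len(row))
    dest_cur.execute(
        "INSERT INTO " + dest_tbl + " VALUES (" + placeholders + ")",
        tuple(row))
    return True
```

Because the source row is deleted only by the next queue_wait, a crash between the INSERT and that call causes the row to be sent again, and the destination is expected to drop it based on the row ID.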
49. q4m-forward
Simple forwarder script
installed into mysql-dir/bin
usage: q4m-forward [options] src_addr dest_addr
example:
% support-files/q4m-forward
"dbi:mysql:database=db1;table=tbl1;user=foo;password=XXX"
"dbi:mysql:database=db2;table=tbl2;host=bar;user=foo"
options:
--reset reset duplicate check info.
--sender=idx slot no. used for checking duplicates (0..63, default: 0)
--help
51. Things that Need to be Fixed
Table compaction is a blocking operation
runs when live data becomes <25% of log file
very bad, though not as bad as it seems
it's fast since it's a sequential write operation
Relays are slow
since transfer is done row-by-row
Binlog does not work
since MQ replication should be synchronous
52. Future of Q4M (maybe)
Support for MySQL 5.5
not requested yet by current users :-p
2-phase commit with other storage engines
queue consumption and InnoDB updates can become an atomic operation
53. Thank you
http://q4m.github.com/