2. Who are you?
Ritta Narita
(github:@naritta)
The University of Tokyo, Engineering M2
Researched physics simulation
I've worked at several companies.
5. What's Random Forest?
make many decision trees,
take a majority vote of their results
decision tree (play golf or not)
to get the result of a decision tree,
the feature values bound to it must be compared at each node.
[Figure: decision tree with the nodes "weather = sunny", "humidity > 30 %?" and "wind speed > 10 m/s ?", three "yes" branches, and the leaves "play golf" (once) and "don't play" (three times)]
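The majority decision can be sketched in a few lines of Ruby. This is only an illustration: the three toy "trees" are made-up rules, `predict_by_majority` is a hypothetical helper name, and the encoding (x = [weather, humidity, wind], weather 0 = sunny, 0 = play golf, 1 = don't play) follows the generated-code slide.

```ruby
# Toy stand-ins for trained trees: each lambda maps x to a class label.
# x = [weather, humidity, wind]; 0 = play golf, 1 = don't play.
# The three rules are invented for illustration only.
TOY_TREES = [
  ->(x) { x[0] == 0 && x[1] <= 30 ? 0 : 1 },
  ->(x) { x[2] > 10 ? 1 : 0 },
  ->(x) { x[0] == 0 ? 0 : 1 }
].freeze

# Ask every tree for its label and return the label with the most votes.
def predict_by_majority(trees, x)
  votes = Hash.new(0)
  trees.each { |tree| votes[tree.call(x)] += 1 }
  votes.max_by { |_label, count| count }.first
end
```

For example, `predict_by_majority(TOY_TREES, [0, 20, 15])` gets the votes 0, 1, 0 and therefore answers 0 (play golf).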
6. At present, to evaluate a decision tree:
generate JS code
→ execute it using eval
x = [weather, humidity, wind]
0 = play golf, 1 = don't play
if (x[0]==0){
  if (x[1]>30){
    return 1;
  }
  ・・・
}
else {
  return 1;
}
[Figure: the play-golf decision tree from slide 5]
7. Problem for JS
Because eval is used, any code can be executed.
For example, hostile JS code such as an infinite loop
→ a burden for TD
It's difficult to restrict JS code
→ a restricted environment is needed to evaluate decision trees
8. Then
generate custom op codes from the tree model
→ execute them on a custom VM
x = [weather, humidity, wind]
0 = play golf, 1 = don't play
if (x[0]==0){
  if (x[1]>30){
    return 1;
  }
  ・・・
}
else {
  return 1;
}
→
PUT x[1]
PUT 0
IFEQ 10
・
・
・
9. Whatâs the merit?
ă»can ïŹnd illegal code like inïŹnite loop easily
ă»only for comparator, so very restricted
ă»less op code, very fast
9
10. My work
op codes specialized for comparison
→ only PUSH, POP, GOTO, IF~
can find infinite loops
This code is not supposed to contain loops
→ never execute the same code twice
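The idea of a comparator-only VM with loop detection can be sketched as below. This is a minimal sketch under stated assumptions, not the implementation from the talk: the class `TinyTreeVM`, the opcode names (`:push_x`, `:push`, `:pop`, `:goto`, `:ifeq`, `:ifgt`), the convention that the result is whatever is left on top of the stack, and the shape of the example tree are all hypothetical.

```ruby
# Hypothetical minimal stack VM. Opcode names and semantics are assumptions
# for illustration, not the instruction set from the talk.
class TinyTreeVM
  class IllegalCodeError < StandardError; end

  # program: array of [opcode, operand] pairs
  def initialize(program)
    @program = program
  end

  # Runs the program against feature vector x and returns the value left on
  # top of the stack when execution reaches the end of the program.
  def run(x)
    stack = []
    executed = Array.new(@program.size, false)
    pc = 0
    while pc < @program.size
      # A tree never revisits a node, so a repeated instruction means a loop.
      raise IllegalCodeError, "instruction #{pc} executed twice" if executed[pc]
      executed[pc] = true
      op, arg = @program[pc]
      case op
      when :push_x            # push feature x[arg]
        stack.push(x[arg])
        pc += 1
      when :push              # push a constant
        stack.push(arg)
        pc += 1
      when :pop               # discard the top of the stack
        stack.pop
        pc += 1
      when :goto              # unconditional jump
        pc = arg
      when :ifeq, :ifgt       # pop b, then a; jump to arg when the test holds
        b = stack.pop
        a = stack.pop
        taken = (op == :ifeq) ? (a == b) : (a > b)
        pc = taken ? arg : pc + 1
      else
        raise IllegalCodeError, "unknown opcode #{op.inspect}"
      end
    end
    raise IllegalCodeError, "no result on the stack" if stack.empty?
    stack.pop
  end
end

# Assumed tree: sunny? then humidity > 30? then wind > 10?
# x = [weather, humidity, wind]; 0 = play golf, 1 = don't play.
GOLF_PROGRAM = [
  [:push_x, 0], [:push, 0],  [:ifeq, 4],   # 0-2: weather == 0 (sunny)? go to 4
  [:goto, 12],                             # 3: not sunny, so don't play
  [:push_x, 1], [:push, 30], [:ifgt, 12],  # 4-6: humidity > 30, so don't play
  [:push_x, 2], [:push, 10], [:ifgt, 12],  # 7-9: wind > 10, so don't play
  [:push, 0],   [:goto, 13],               # 10-11: play golf, jump to the end
  [:push, 1]                               # 12: don't play
].freeze
```

`TinyTreeVM.new(GOLF_PROGRAM).run([0, 20, 5])` walks the sunny, dry, calm path and returns 0, while a program such as `[[:push, 1], [:goto, 0]]` is rejected with `IllegalCodeError` as soon as instruction 0 is reached a second time.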
11. comparison with JS
hadoop version 2.6, Hive 1.2.0 (Tez 0.6.1)
hadoop cluster size: c3.2xlarge 8 nodes
randomforest
 number of test examples in test_rf: 18083
 number of trees: 500
compile num: 500
eval num: 500 * 18083
Javascript (Nashorn): 1062.04 s
VM: 106.84 s
10 times faster
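The "10 times faster" figure follows directly from the two timings. A small check, using only the numbers on the slide (the constant names are just labels chosen here):

```ruby
# Figures copied from the benchmark slide.
JS_SECONDS    = 1062.04   # Javascript (Nashorn)
VM_SECONDS    = 106.84    # VM
TREES         = 500
TEST_EXAMPLES = 18_083

TOTAL_EVALS = TREES * TEST_EXAMPLES    # tree evaluations per run
SPEEDUP     = JS_SECONDS / VM_SECONDS  # how many times faster the VM run was

puts format("tree evaluations: %d", TOTAL_EVALS)
puts format("speedup: %.2fx", SPEEDUP)
```

The ratio comes out at about 9.94, which the slide rounds to 10.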
13. Because of the number of class loads
For example, if every client makes 500 models…
↓
too many classes get loaded
Using one class with 500 methods
gives the same problem.
14. summary
ă»very restricted, can ïŹnd illegal code
!
ă»10 times faster
!
ă»future prospects:
can make it even faster by binary code
!
ă»merged in development branch
and will be released in v0.4
14
17. multiprocess at present
use the in_multiprocess plugin
the user has to use multiple sockets and assign each port by hand
[Figure: a supervisor with three workers, each listening on its own port (24224, 24225, 24226)]
21. can use multicore power fully without being conscious of it
the setting file becomes very simple
setting when using 2 cores

with in_multiprocess plugin
<source>
  type multiprocess
  <process>
    cmdline -c /etc/td-agent/td-agent-child1.conf
  </process>
  <process>
    cmdline -c /etc/td-agent/td-agent-child2.conf
  </process>
</source>

#/etc/td-agent/td-agent-child1.conf
<source>
  type forward
  port 24224
</source>

#/etc/td-agent/td-agent-child2.conf
<source>
  type forward
  port 24225
</source>

with SocketManager
<source>
  type forward
  port 24224
</source>
22. To implement Socket Manager, I used ServerEngine
ServerEngine is a framework for implementing
robust multiprocess servers like Unicorn.
[Figure: a ServerEngine supervisor managing three workers, with the features "live restart", "heartbeat via pipe" and "auto restart"]
25. Unix: very simple
Windows: a little complex
main differences
1. can't share a socket by FD
  in Windows, socket descriptor ≠ file descriptor
  it doesn't make sense to share the FD
  (have to use the Winsock2 API to share sockets)
2. have to lock accept
  in Unix, there is no need to consider the thundering herd
  → but in Windows there is.
26. Implementation (Windows)
Socket Manager server (in the supervisor, ServerEngine):
create a socket from the port and bind (WSASocket)
↓
duplicate the socket exclusively for each worker pid (WSADuplicateSocket)
↓
get the socket protocol info (WSAPROTOCOL_INFO)
Socket Manager client (in each worker):
from the WSAPROTOCOL_INFO, make a WSASocket
↓
convert the handle into an FD
↓
IO.for_fd(FD)
send this IO to Cool.io
[Figure: the supervisor (ServerEngine) runs the Socket Manager server; each of the three workers runs a Socket Manager client; server and clients are connected via DRb]
27. accept mutex
[Figure: two workers sharing one server socket through an accept mutex. Steps shown for each worker: get mutex, accept, detach, release mutex, read and send data to buffer/output, attach listening socket to cool.io loop]
post processing is handled in the same process as it is
other processes can listen
while this process is dealing with data
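The accept mutex pattern can be illustrated with a simplified sketch. Everything here is an assumption made for the demo: threads stand in for worker processes, a Ruby `Mutex` stands in for the cross-process accept mutex, a blocking `accept` replaces the cool.io attach/detach steps, and `run_accept_mutex_demo` is a hypothetical name.

```ruby
require "socket"

# Simplified demo: threads play the role of worker processes and a Mutex
# plays the role of the cross-process accept mutex.
def run_accept_mutex_demo(worker_count: 3, connection_count: 6)
  server = TCPServer.new("127.0.0.1", 0)  # port 0: let the OS pick a free port
  port = server.addr[1]
  accept_mutex = Mutex.new
  remaining = connection_count            # connections still to be accepted
  handled = Queue.new

  workers = worker_count.times.map do |id|
    Thread.new do
      loop do
        # Only the worker holding the mutex is allowed to call accept.
        conn = accept_mutex.synchronize do
          next nil if remaining.zero?
          remaining -= 1
          server.accept
        end
        break if conn.nil?
        # The mutex is already released here, so another worker can accept
        # while this one reads and handles the data.
        handled << [id, conn.gets.to_s.chomp]
        conn.close
      end
    end
  end

  # Client side: send one line per connection.
  connection_count.times do |i|
    TCPSocket.open("127.0.0.1", port) { |client| client.puts("msg#{i}") }
  end

  workers.each(&:join)
  server.close
  Array.new(handled.size) { handled.pop }  # [[worker_id, message], ...]
end
```

Calling `run_accept_mutex_demo` returns one `[worker_id, message]` pair per connection; which worker handles which message depends on scheduling.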
28. rotation in order
by accept mutex
→ ① 2376 → ② 3456 → ③ 2696 → ④ 3388
→ ① 2376 → ② 3456 →
29. As a result of testing,
the thundering herd doesn't occur on Windows.
For now I implemented it roughly with a mutex,
but I want to use IOCP like libuv in the future.
Patches are welcome from Windows specialists!
30. benchmark result (unix)
AWS ubuntu 14.04 m4.xlarge
in_http → out_forward
                        RPS             IO
conventional model      6798.69 /sec    1361.07 kb/s
new model (4 workers)   13743.02 /sec   2751.29 kb/s
31. benchmark result (windows)
AWS Microsoft Windows Server 2012 R2 m4.xlarge
in_http → out_forward
                        RPS             IO
conventional model      1834.01 /sec    385.07 kb/s
new model (4 workers)   3513.31 /sec    737.66 kb/s
32. Future work
ă»Buffering in multiprocess
ă»accept mutex based IOCPâŠetc
summary
ă»Implemented ïŹuentd Socket Manager with ServerEngine,
and will be faster without consciousness.
!
ă»There is details in ServerEngine Issue,
âyou can test my forked branch(ïŹuentd and ServerEngine)
âand Iâll send PR after this report.
32
36. Because of the memory problem
When the Random Forest model is big and many customers use it,
memory consumption becomes too high.
37. To implement Socket Manager, I used ServerEngine
ServerEngine is a framework for implementing
robust multiprocess servers like Unicorn.
38. how to use Socket Manager on the fluentd side

# get socket manager
socket_manager = ServerEngine::SocketManager.new_socket_manager

# get FD from socket manager
fd = socket_manager.get_tcp(bind, port)

# create listening socket from FD
lsock = TCPServer.for_fd(fd.to_i)

The fluentd side doesn't need to care about socket sharing;
ServerEngine deals with it internally.
39.
40. Benchmark Result
I'll add a multiprocess buffering function,
and after that I'll run a formal benchmark.
For now, here are the rough results.