2. We run a production service on Heroku
◦ Playframework 1.2.5
◦ Multiple dynos
Sometimes the following error is logged while a dyno is cycling:
play.exceptions.UnexpectedException: Application is not started
◦ This error occurs when Play receives a request after shutdown.
◦ It happens only rarely.
I discussed this problem with the Heroku support team.
This document explains the details.
3. At shutdown, Play does nothing to Netty.
After the application has shut down, Netty is still alive and can accept new requests (but it always returns 503).
If the shutdown process takes a few seconds, a new request may be accepted during that time and answered with 503.
Normally this window is nearly zero seconds, but on rare occasions it lasts a few seconds.
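The window described above can be illustrated with a toy model. This is only a sketch, not Play's or Netty's actual code: the class `ShutdownWindowModel` and its methods are hypothetical names, and the single flag stands in for Play's "application started" state while the listener is assumed to stay open the whole time.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model of the shutdown window (hypothetical names, not Play internals).
// The listener is never closed here, which mirrors the behaviour described
// above: requests are still accepted after the application has stopped.
final class ShutdownWindowModel {
    private final AtomicBoolean appStarted = new AtomicBoolean(true);

    /** Simulates the application being stopped on SIGTERM. */
    public void stopApplication() {
        appStarted.set(false);
    }

    /** Simulates a request reaching the still-open listener; returns an HTTP status. */
    public int handleRequest() {
        // Corresponds to "Application is not started" when the flag is false.
        return appStarted.get() ? 200 : 503;
    }

    public static void main(String[] args) {
        ShutdownWindowModel dyno = new ShutdownWindowModel();
        System.out.println("before shutdown: " + dyno.handleRequest());
        dyno.stopApplication();
        System.out.println("after shutdown:  " + dyno.handleRequest());
    }
}
```

Any request that arrives between `stopApplication()` and process exit gets the 503 path, which is the rare error seen in the logs.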
5. Playframework
◦ Play should stop Netty before shutting down the application.
◦ Application developers have no way to stop Netty themselves.
Heroku
◦ Heroku shouldn’t send requests to a dyno after SIGTERM.
◦ This is outside the application developer's control.
Although it may be possible to fix Playframework, it is better for Heroku to handle this problem (because other frameworks may have the same problem).
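The "stop the listener first" ordering can be sketched with the JDK's built-in `com.sun.net.httpserver.HttpServer` standing in for Netty. This is an illustration under assumptions, not a patch for Play: the class `ListenerFirstShutdown`, the `appStarted` flag, and the `demo()` helper are hypothetical, and the server binds to an ephemeral port on 127.0.0.1.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: the JDK HttpServer plays the role of Netty, and the flag
// plays the role of Play's "application started" state.
final class ListenerFirstShutdown {
    private final AtomicBoolean appStarted = new AtomicBoolean(true);
    private final HttpServer server;
    private final int port;

    ListenerFirstShutdown() throws IOException {
        server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            int status = appStarted.get() ? 200 : 503;
            exchange.sendResponseHeaders(status, -1); // -1 means no response body
            exchange.close();
        });
        server.start();
        port = server.getAddress().getPort(); // remember the port before stopping
    }

    /** Shuts down in the proposed order: listener first, application second. */
    void shutdown() {
        server.stop(0);        // 1. stop accepting connections
        appStarted.set(false); // 2. only then stop the application
    }

    /** Sends one GET request; returns the HTTP status, or -1 if the connection fails. */
    int request() {
        try {
            HttpURLConnection c = (HttpURLConnection)
                    URI.create("http://127.0.0.1:" + port + "/").toURL().openConnection();
            c.setRequestProperty("Connection", "close");
            c.setConnectTimeout(2000);
            c.setReadTimeout(2000);
            return c.getResponseCode();
        } catch (IOException e) {
            return -1;
        }
    }

    /** Returns {status before shutdown, status after shutdown}. */
    static int[] demo() {
        try {
            ListenerFirstShutdown dyno = new ListenerFirstShutdown();
            int before = dyno.request();
            dyno.shutdown();
            int after = dyno.request();
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] r = demo();
        System.out.println("before: " + r[0] + ", after: " + r[1]);
    }
}
```

With this ordering a late request should fail at the connection level (reported as -1 here) instead of being accepted and answered with 503 by a stopped application.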
6. Preboot is a great feature of Heroku labs.
◦ https://devcenter.heroku.com/articles/labs-preboot
Preboot changes Heroku's restart behavior:
◦ Before: Kill old dynos, then boot new dynos.
◦ After: Boot new dynos first.
7. Target
◦ Web dynos only
◦ At least two web dynos are required.
When
◦ git push
◦ heroku config:set
◦ heroku restart
It doesn’t apply to cycling…
8. Currently there is no way to avoid this problem completely…
Characteristics of our service:
◦ Mostly used in Japan.
◦ So around midnight (JST), traffic is low.
If I can move cycling to around midnight (JST), it may reduce the chance of errors.
◦ Although the error is already infrequent, I want to reduce the chance of it as much as possible.
10. Daily at 18:30 (3:30 JST)
◦ vendor/heroku-toolbelt/bin/heroku restart web.1 -a xxx
Daily at 19:00 (4:00 JST)
◦ vendor/heroku-toolbelt/bin/heroku restart web.2 -a xxx
Stagger the restarts so that the dynos do not restart at the same time.
This setting can reduce daytime cycling.
But it doesn't mean daytime cycling will never occur.
e.g. if Heroku detects a fault in the underlying hardware, the dyno will be cycled.
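The schedule times can be double-checked with a small conversion, assuming the scheduler times (18:30, 19:00) are in UTC, which the 9-hour offset to JST implies. The class `RestartSchedule` and the method `utcToJst` are hypothetical names used only for this sketch.

```java
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Sketch: convert a UTC wall-clock time to JST (Asia/Tokyo, UTC+9).
final class RestartSchedule {
    static LocalTime utcToJst(LocalTime utc) {
        // The date is arbitrary: neither UTC nor Asia/Tokyo observes DST.
        return ZonedDateTime.of(LocalDate.of(2013, 1, 1), utc, ZoneOffset.UTC)
                .withZoneSameInstant(ZoneId.of("Asia/Tokyo"))
                .toLocalTime();
    }

    public static void main(String[] args) {
        System.out.println("web.1: 18:30 UTC -> " + utcToJst(LocalTime.of(18, 30)) + " JST");
        System.out.println("web.2: 19:00 UTC -> " + utcToJst(LocalTime.of(19, 0)) + " JST");
    }
}
```

Both restarts land in the early morning in Japan, 30 minutes apart, so the two dynos are not down together.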
11. To solve this problem completely, I hope Heroku will do either of the following.
◦ Don’t send requests to a dyno after SIGTERM.
◦ Apply preboot at cycling.