2. Introduction to pixiv
● Illust Communication Site
○ http://www.pixiv.net
● Users
○ 5 million
● Monthly page views
○ 3.3 billion
● Network Traffic
○ Over 6 Gbps
3. About me
● Tatsuhiko Kubo (handle name: bokko)
● @cubicdaiya
● Infrastructure & Software Engineer @ pixiv Inc.
4. My Works @ pixiv Inc.
Responsible for
● Middle-Ware Development
● Technical Operation & Administration
● Datastore & Caching Strategy
● Image Upload & Transformation
● Illust & User Recommendation
● Notification (called Popboard at pixiv)
etc.
5. My Works @ private
■Software
● neoagent
○ Yet Another Memcached Protocol Proxy Server
● dtl
○ diff template library written in C++
● ngx_small_light
○ Dynamic Transformation Module for Nginx
■Writing
● Software Design 2009 Sep
○ How Are Differences Derived? Understanding How diff Works
○ http://gihyo.jp/dev/column/01/prog/2011/diff_sd200906
And More -> http://cccis.jp
7. Scale of pixiv according to thumbnails
● pixiv has about 30,000,000 illusts and comics
● Each illust has at least 12 thumbnails, and often far more
● 20,000 illusts and comics are uploaded every day
● Total volume is about 30TB
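As a rough check on these figures, the thumbnail counts can be worked out directly. This is only a lower-bound sketch: it takes 12 as the low end of thumbnails per work, and the constant names are made up for illustration.

```python
# Figures taken from the slide; 12 is treated as the minimum per work.
TOTAL_WORKS = 30_000_000        # illusts and comics stored
UPLOADS_PER_DAY = 20_000        # new illusts and comics per day
MIN_THUMBNAILS_PER_WORK = 12    # low end of "12 ~ too many"

# Lower bound on thumbnails already stored.
min_total_thumbnails = TOTAL_WORKS * MIN_THUMBNAILS_PER_WORK

# Lower bound on thumbnails that must be generated every day.
min_new_thumbnails_per_day = UPLOADS_PER_DAY * MIN_THUMBNAILS_PER_WORK

print(f"{min_total_thumbnails:,} thumbnails stored (at least)")
print(f"{min_new_thumbnails_per_day:,} thumbnails generated per day (at least)")
```

So even at the low end, this is hundreds of millions of stored thumbnails and a few hundred thousand new ones per day.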
9. Image Upload Detail 1
● Generate a lot of thumbnails
○ at least 12 per image, often far more
● Save the original image and its thumbnails to storage
○ not NFS
○ with an in-house WebDAV client (ImageClient)
25. semi-asynchronous upload mechanism
User Side Action
● File Selection View -> Input Information View -> Completed View
● create lock file, then poll until lock file is deleted
Server Side Action
● prefork server preforks multiple upload workers
● each upload worker: Generate Thumbnails -> delete lock file
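The lock-file handshake in this diagram can be sketched in a few lines of Python. This is a minimal illustration, not pixiv's implementation: threads and an in-memory queue stand in for the preforked worker processes and the job queue, thumbnail generation is a placeholder sleep, and all function names are hypothetical.

```python
import os
import queue
import tempfile
import threading
import time


def upload_worker(jobs):
    """Server side: take a job, 'generate thumbnails', then delete the lock file."""
    while True:
        lock_path = jobs.get()
        if lock_path is None:      # sentinel: stop this worker
            break
        time.sleep(0.05)           # placeholder for real thumbnail generation
        os.remove(lock_path)       # deleting the lock file signals completion


def wait_for_upload(lock_path, interval=0.01, timeout=5.0):
    """User side: poll until the lock file is deleted; False on timeout."""
    deadline = time.monotonic() + timeout
    while os.path.exists(lock_path):
        if time.monotonic() > deadline:
            return False
        time.sleep(interval)
    return True


def demo(num_workers=3, num_uploads=5):
    jobs = queue.Queue()           # stand-in for the upload job queue
    workers = [threading.Thread(target=upload_worker, args=(jobs,))
               for _ in range(num_workers)]
    for worker in workers:
        worker.start()

    with tempfile.TemporaryDirectory() as tmp:
        lock_paths = []
        for i in range(num_uploads):
            lock_path = os.path.join(tmp, f"upload-{i}.lock")
            open(lock_path, "w").close()   # user side: create lock file
            jobs.put(lock_path)            # hand the job to the workers
            lock_paths.append(lock_path)
        # The user keeps filling in the information form meanwhile;
        # the completed view only needs the lock file to be gone.
        results = [wait_for_upload(path) for path in lock_paths]

    for _ in workers:
        jobs.put(None)
    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    print(demo())
```

The point of the design is that thumbnail generation overlaps with the time the user spends entering information, so the wait at the end is usually short.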
26. Inside Image Upload
● User Side Action
○ Apache
○ PHP
■ ImageClient
● Server Side Action
○ daemontools
○ Python
■ python-prefork, python-q4m, python-worker
● python-q4m and python-worker are in-house libraries
○ Q4M
■ Used as the upload job queue
31. Detail of generating thumbnails
● pixiv uses ImageMagick
○ ImageMagick is not fast
○ But the quality of the generated images is good
○ Quality is more important than speed for us
○ Of course, optimization is important, too
36. benchmark of libjpeg and libjpeg-turbo
Processing a JPEG image with libjpeg and libjpeg-turbo
[Chart: ■libjpeg vs. ■libjpeg-turbo]
libjpeg-turbo is 10% faster than libjpeg on x86_64
38. Disable OpenMP
● Recent ImageMagick builds enable OpenMP by default
○ This is very slow in a multi-process environment
● How to disable OpenMP
○ Re-compile with '--disable-openmp'
○ Set the environment variable OMP_NUM_THREADS=1
● pixiv takes the latter
○ Re-compiling and re-packaging are complicated
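The environment-variable approach can be applied per process without touching the ImageMagick build. The sketch below is an assumption about how a worker might do this in Python; the function names are hypothetical, and it assumes ImageMagick's `convert` command is installed and on PATH.

```python
import os
import subprocess


def env_with_single_omp_thread(base_env=None):
    """Return a copy of the environment with OpenMP limited to one thread."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = "1"
    return env


def make_thumbnail(src, dst, size="100x100"):
    """Resize src into dst with ImageMagick, running it single-threaded.

    Assumes the `convert` command from ImageMagick is available.
    """
    subprocess.run(
        ["convert", src, "-resize", size, dst],
        check=True,
        env=env_with_single_omp_thread(),
    )
```

With many worker processes already running in parallel, one thread per process avoids oversubscribing the CPU cores.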
45. Why are dynamic thumbnails needed?
● Static thumbnails consume disk space
○ Dynamic thumbnails do not consume disk space
● Preparing new static thumbnails takes a long time
○ A dynamic thumbnail is ready in a second!
53. Resize image with mod_small_light
GET /tank.jpg -> original image
GET /small_light(dw=100,dh=100)/tank.jpg -> resized image
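The URL shape shown here is regular enough to build programmatically. The helper below is a hypothetical illustration of that shape only; `small_light_path` is not part of mod_small_light.

```python
def small_light_path(image_path, **options):
    """Build a request path of the form /small_light(k=v,...)/image.

    Hypothetical helper; it only mirrors the URL shape on the slide.
    """
    if not image_path.startswith("/"):
        raise ValueError("image_path must start with '/'")
    if not options:
        return image_path  # no options: request the original image
    params = ",".join(f"{key}={value}" for key, value in options.items())
    return f"/small_light({params}){image_path}"


print(small_light_path("/tank.jpg"))
print(small_light_path("/tank.jpg", dw=100, dh=100))
```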
54. Many Options
q     image quality (0~100)
of    output format (jpg, gif, png, tiff)
e     processing engine (imlib2, imagemagick)
cc    canvas color
p     pattern name defined with SmallLightPatternDefine
etc.  Too many options
56. Resize image with mod_small_light
GET /tank.jpg -> original image
GET /small_light(p=small)/tank.jpg -> resized image
or
GET /small_light(dw=100,dh=100,e=imagemagick,jpeghint=y)/tank.jpg -> resized image
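Conceptually, a pattern name is shorthand for a longer option string. The sketch below illustrates that idea only; it is not mod_small_light's code. The `PATTERNS` table and function names are hypothetical, and letting explicit options override the pattern is an assumption.

```python
# Hypothetical pattern table, standing in for SmallLightPatternDefine entries.
PATTERNS = {
    "small": "dw=100,dh=100,e=imagemagick,jpeghint=y",
}


def parse_options(text):
    """Parse 'k=v,k=v' into a dict."""
    return dict(item.split("=", 1) for item in text.split(",") if item)


def resolve_options(text, patterns=PATTERNS):
    """Expand p=<pattern> into the options the pattern stands for."""
    options = parse_options(text)
    name = options.pop("p", None)
    if name is None:
        return options
    resolved = parse_options(patterns[name])
    resolved.update(options)  # assumption: explicit options win over the pattern
    return resolved


print(resolve_options("p=small"))
```

Under this reading, the two resize requests on the slide resolve to the same set of options.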
57. Why not original mod_small_light?
● Some thumbnails are special
○ comic cover
○ various cropping algorithms depending on aspect ratio
● Default output format is JPEG
○ We want the output format to be the same as the input format
58. Why not original mod_small_light?
● No support for CMYK
○ pixiv must support CMYK
● Support for unusual aspect ratios was needed
○ da=l -> long-edge
○ da=s -> short-edge
○ da=p -> pixiv-edge (pixiv edition only)
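One plausible reading of long-edge and short-edge is which edge of the source image is scaled to the target size. The sketch below shows that reading for a single square target size; it is an assumption, not the module's actual logic, and `scaled_size` is a hypothetical name. The pixiv-specific `da=p` mode is not described on the slide, so it is left out.

```python
def scaled_size(width, height, target, da):
    """Scale (width, height) so the chosen edge becomes `target` pixels.

    da="l": base the scale on the long edge (result fits inside the target).
    da="s": base the scale on the short edge (result covers the target).
    This interpretation is an assumption for illustration.
    """
    if da == "l":
        scale = target / max(width, height)
    elif da == "s":
        scale = target / min(width, height)
    else:
        raise ValueError(f"unsupported da value: {da!r}")
    return (round(width * scale), round(height * scale))


print(scaled_size(200, 100, 100, "l"))
print(scaled_size(200, 100, 100, "s"))
```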
65. Special Thumbnail 3
● Special cropping algorithm
○ Crop the top of the image if the image is portrait
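A crop rule of this kind can be expressed as a small function that returns a crop box. The sketch below assumes that "crop top" means keeping the square region at the top of a portrait image, and that other images are cropped around the center; both points are assumptions, and `square_crop_box` is a hypothetical name.

```python
def square_crop_box(width, height):
    """Return (left, top, right, bottom) for a square crop.

    Portrait images keep the top square (assumed meaning of "crop top").
    Landscape and square images are center-cropped (an assumption).
    """
    side = min(width, height)
    if height > width:
        return (0, 0, side, side)          # portrait: top of the image
    left = (width - side) // 2
    return (left, 0, left + side, side)    # otherwise: horizontal center


print(square_crop_box(100, 200))
print(square_crop_box(200, 100))
```

Keeping the top of a portrait image tends to preserve faces and titles, which usually sit in the upper part of an illustration.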
66. Many Extended Options
rmprof       remove image profiles
crop_square  crop the image to a square
cover        add a manga cover
samec        conform dw & dh & cw & ch
extendl      extend the long edge
etc.         twenty new options in all
67. Summary
● Optimization of image processing is very important
○ Generating thumbnails takes a long time
○ Let's tune it and make it asynchronous
● pixiv has two types of thumbnails
○ Static Thumbnail
○ Dynamic Thumbnail
● Dynamic Thumbnail saves disk space and is flexible
○ Big image storage is expensive
○ Easy to adapt to changes in the application
■ New thumbnail types for new designs