1. Technology Consulting and Engineering
GPGPU Programming
on Android devices
Alten droidconNL 2012
ALTEN | 11/22/12
2. Welcome
• Alten PTS: a leading service provider in the field of technical
consultancy and engineering
• Eindhoven, Capelle aan de IJssel and Apeldoorn
• ir. Arjan Somers
Alten-droidconNL 2012 Slide 2
3. Goals
● What is GPGPU?
● How is it done on current Android devices?
● When is GPGPU programming useful?
5. What
● GPGPU programming is using the GPU to
perform general-purpose calculations
  ● Data manipulation using the graphics card
6. What
● A little bit of history
  ● GPUs and OpenGL
  ● Parallel vector-based operations
  ● Programmable (shaders)
Battlezone (1980) Crysis 3 (2013)
19. What
● A CPU program vs a GPU program
  CPU: int[] a;  ->  int[] b = F(a);  ->  int[] b;
  GPU: int[] a;  ->  texture  ->  draw()  ->  pixel buffer  ->  int[] b;
20. Why
● Additional computational power
  ● Can run in parallel with the CPU
● Greater computational power
  ● Galaxy S2
    ● CPU: 4 GFLOPS
    ● GPU: 10 GFLOPS
21. How
1) Parallelize code
2) Data packing
3) Implement OpenGL ES 2.0 Shaders
4) Drawing and Input/Output
22. How: Parallelize code
Pixel-shader parallel code:
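public int[] foo(int[] a, int[] b) {
    int[] c = new int[a.length];
    // Each c[i] depends only on a[i] and b[i], so all elements can be computed independently.
    for (int i = 0; i < a.length; i++) {
        c[i] = a[i] + 2 * b[i];
    }
    return c;
}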
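The loop above is pixel-shader parallel because every output element depends only on the inputs at the same index. A minimal sketch of that property in plain Java, using a parallel stream as a stand-in for the GPU running one shader invocation per pixel (the class and method names are hypothetical, not from the talk):

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class PerElementKernel {
    // The "kernel": computes one output element from the inputs at the same index.
    static int kernel(int[] a, int[] b, int i) {
        return a[i] + 2 * b[i];
    }

    // Runs the kernel once per index, in no particular order.
    static int[] run(int[] a, int[] b) {
        int[] c = new int[a.length];
        IntStream.range(0, a.length).parallel().forEach(i -> c[i] = kernel(a, b, i));
        return c;
    }

    public static void main(String[] args) {
        int[] c = run(new int[] {1, 2, 3}, new int[] {10, 20, 30});
        System.out.println(Arrays.toString(c)); // [21, 42, 63]
    }
}
```

Because no invocation reads or writes another invocation's element, the order of execution does not affect the result.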
23. How: Parallelize code
Not Pixel-shader parallel code:
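(Mobile GPUs have no geometry shaders)
public int[] bar(int[] a, int[] b, int[] c) {
    int[] d = new int[a.length];
    // The write index comes from the data (a scatter), and several
    // iterations may accumulate into the same element of d.
    for (int i = 0; i < a.length; i++) {
        d[b[i]] += a[i];
        d[c[i]] += a[i];
    }
    return d;
}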
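A fragment shader can only write to the pixel it was invoked for, so the data-dependent writes `d[b[i]]` and `d[c[i]]` in `bar` have no direct equivalent. One possible workaround, sketched here in plain Java under the assumption that all indices in `b` and `c` are valid positions in `d`, is to turn the scatter into a gather: each output element scans the whole input and collects the values addressed to it. The class and method names are hypothetical.

```java
import java.util.Arrays;

class ScatterAsGather {
    // Computes output element j on its own by scanning all inputs.
    static int gather(int[] a, int[] b, int[] c, int j) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            if (b[i] == j) sum += a[i];
            if (c[i] == j) sum += a[i];
        }
        return sum;
    }

    static int[] run(int[] a, int[] b, int[] c) {
        int[] d = new int[a.length];
        for (int j = 0; j < d.length; j++) {
            d[j] = gather(a, b, c, j); // iterations are independent of each other
        }
        return d;
    }

    public static void main(String[] args) {
        int[] d = run(new int[] {5, 7, 9}, new int[] {0, 0, 2}, new int[] {1, 2, 2});
        System.out.println(Arrays.toString(d)); // [12, 5, 25]
    }
}
```

Every output is now independent, but each one costs a full pass over the input, so this form may only pay off for small inputs.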
24. How: Parallelize code
Not Pixel-shader parallel code:
● Sequences of calculations (each step needs the result of the previous one)
25. How: Data packing
Current mobile GPUs only have
● 8-bit-per-channel buffers and textures
● a single render target
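With only 8 bits available per channel, a 32-bit value has to be spread over the four channels of one RGBA texel. A minimal sketch of such packing in plain Java; the byte order (most significant byte in the red channel) and the names are assumptions, not necessarily what the talk used:

```java
class RgbaPacking {
    // Splits a 32-bit int into four 8-bit channel values: R, G, B, A.
    static byte[] pack(int value) {
        return new byte[] {
            (byte) (value >>> 24),
            (byte) (value >>> 16),
            (byte) (value >>> 8),
            (byte) value
        };
    }

    // Rebuilds the int; & 0xFF undoes Java's sign extension of bytes.
    static int unpack(byte[] rgba) {
        return ((rgba[0] & 0xFF) << 24)
             | ((rgba[1] & 0xFF) << 16)
             | ((rgba[2] & 0xFF) << 8)
             |  (rgba[3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] texel = pack(0xCAFEBABE);
        System.out.println(Integer.toHexString(unpack(texel))); // cafebabe
    }
}
```

A shader working on packed data has to handle carries between channels itself, which is part of the cost of packing.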
27. How: Implement Shaders
● Will be shown later in detail
  ● Use OpenGL ES 2.0
  ● No CUDA, OpenCL or similar
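As a rough illustration of what such a shader can look like, the sketch below holds an OpenGL ES 2.0 fragment shader for the element-wise `c = a + 2*b` example as a Java string. The uniform and varying names (`uTexA`, `uTexB`, `vTexCoord`) and the class are hypothetical; on Android the source would be handed to `GLES20.glShaderSource` and `GLES20.glCompileShader`, which is not shown here. The `emulateChannel` helper is only a CPU approximation of the per-channel result, assuming 8-bit channels and ignoring floating-point rounding in the shader.

```java
class AddShaderSketch {
    // GLSL ES 1.00 fragment shader: one invocation per output pixel.
    static final String FRAGMENT_SHADER =
          "precision mediump float;\n"
        + "uniform sampler2D uTexA;\n"
        + "uniform sampler2D uTexB;\n"
        + "varying vec2 vTexCoord;\n"
        + "void main() {\n"
        + "    vec4 a = texture2D(uTexA, vTexCoord);\n"
        + "    vec4 b = texture2D(uTexB, vTexCoord);\n"
        + "    gl_FragColor = a + 2.0 * b;\n"
        + "}\n";

    // Approximates one channel: texel values lie in [0, 1], so the sum saturates at 255.
    static int emulateChannel(int a, int b) {
        return Math.min(255, a + 2 * b);
    }

    public static void main(String[] args) {
        System.out.println(FRAGMENT_SHADER);
        System.out.println(emulateChannel(10, 20));   // 50
        System.out.println(emulateChannel(200, 100)); // 255 (clamped)
    }
}
```

The clamping is the reason larger values need to be packed across several channels instead of being stored in one.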
28. How: Drawing and Input/Output
● Transfer is slow
  int[] a;  ->  texture  ->  draw()  ->  pixel buffer  ->  int[] b;
29. When: What works, what not
● Parallelism
  ● No geometry shaders
● Limited precision / single render target
● Limited data transfer
● Not yet as fast as a desktop
30. Example
● AES encryption on the GPU
“Hello droiconNL!”
Encryption
U2FsdGVkX18UAXwN1I7bomP0kuKNXwQ8h2NHb8lZ5sAG6uaLjZxzkn/ik9QPv8Pq
Decryption
“Hello droiconNL!”
32. Example
Encoding: not GPU-parallel
Decoding: GPU-parallel
35. Example
Are all parts implementable on the GPU?
Parallelizable?
Packing required?
36. Example
Implementing shader
Dec Hex
0 00
25 19
255 FF
38. Example
Implementing shader
● Parallelizable?
● Packing?
● Steps:
  ● Find row/column using hex digits
  ● Find new value in substitution table
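The two steps above can be sketched in plain Java. The first hex digit of the input byte selects the row and the second hex digit selects the column of a 16x16 substitution table, much like coordinates into a 16x16 lookup texture. To keep the example short, the table is computed from the AES S-box definition instead of being written out; how the table is stored in the talk's shader is not shown in the slides, and the names here are hypothetical.

```java
class SBoxLookup {
    static final int[][] TABLE = buildTable();

    // Multiplication in GF(2^8) with the AES reduction polynomial x^8 + x^4 + x^3 + x + 1.
    static int gmul(int a, int b) {
        int p = 0;
        for (int i = 0; i < 8; i++) {
            if ((b & 1) != 0) p ^= a;
            boolean carry = (a & 0x80) != 0;
            a = (a << 1) & 0xFF;
            if (carry) a ^= 0x1B;
            b >>= 1;
        }
        return p;
    }

    // Rotates an 8-bit value left by n bits.
    static int rotl8(int x, int n) {
        return ((x << n) | (x >>> (8 - n))) & 0xFF;
    }

    // S-box entry: multiplicative inverse (0 maps to 0) followed by the affine transform.
    static int[][] buildTable() {
        int[][] t = new int[16][16];
        for (int x = 0; x < 256; x++) {
            int inv = 0;
            for (int y = 1; y < 256 && x != 0; y++) {
                if (gmul(x, y) == 1) { inv = y; break; }
            }
            int s = inv ^ rotl8(inv, 1) ^ rotl8(inv, 2) ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63;
            t[x >> 4][x & 0xF] = s;
        }
        return t;
    }

    static int substitute(int value) {
        int row = (value >> 4) & 0xF; // first hex digit
        int col = value & 0xF;        // second hex digit
        return TABLE[row][col];
    }

    public static void main(String[] args) {
        System.out.printf("%02X%n", substitute(0x19)); // D4
    }
}
```

Each byte is substituted independently of all others, so this step maps naturally onto one lookup per output pixel.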
46. My experiences
● OpenGL ES is limited compared to desktop OpenGL
  ● No geometry shaders
  ● Limited buffer formats / no MRTs (multiple render targets)
● Sometimes difficult to debug
  ● Dithering
  ● NPOT (non-power-of-two) textures
● Complex algorithms are possible
  ● Computer vision implemented
● Large speed gains are possible
47. Conclusion
● How is GPGPU programming performed on
Android devices?
  ● Through the use of shaders and textures
● When is GPGPU a viable option?
  ● Calculations are consuming too much time
  ● Calculations are parallelizable
  ● Can be implemented using 32-bit buffers
  ● Limited GPU-CPU memory transfer required
48. Conclusion
● GPGPU programming has high potential
● Mobile GPUs are becoming faster
● GPGPU programming is fun