LCA14: LCA14-301: AArch64: Media, libs and GUI plans & status
1. Wed-5-Mar, 10:05am, Ragesh Radhakrishnan, James Yu and Tom Gall
LCA14-301: AArch64: Media, libs & GUI plans & status
2. 1) Pick important libraries that have existing ARMv7 (32-bit) NEON optimizations
2) Avoid creating more hand-coded NEON assembly; use NEON intrinsics instead
3) Set expectations
- We have to run in the model
- Model is not cycle accurate
4) Push results upstream to development versions of the libraries
5) Where appropriate, create versions against stable library versions for use in products, if requested
Porting Strategy for AARCH64
3. * libpng - James Yu
* libvpx - James Yu (VP8, VP9)
* libjpeg-turbo - Ragesh Radhakrishnan
* pixman - Ragesh Radhakrishnan
* xfce image - Tom Gall
* chromium browser - Tom Gall
Porting Strategy for AARCH64
4. Source Code - git://git.code.sf.net/p/libpng/code
Supported AArch64 from version 1.6.7, Nov. 2013.
Has been tested on iOS 7.
Benchmark result:
Version: 1.6.10beta01 [February 9, 2014]
Toolchain: gcc version 4.8.3 20140106 (prerelease) (crosstool-NG linaro-1.13.1-4.8-2014.01 - Linaro GCC 2013.11)
CPU: Cortex-A8 800 MHz, single core.
Test image: goldhill.png, 720x576
                   Total time*   Performance   vs. NEON assembly
None NEON          50.519 s      100.00%
NEON Assembly      42.899 s      117.76%       100.00%
NEON intrinsics    44.081 s      114.61%       97.32%
* Total time = decode 100 times.
Intrinsics show only a small performance loss compared with assembly.
libpng
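The two percentage columns in the libpng benchmark can be reproduced from the total times alone. The following is a minimal sketch in Python, assuming that "Performance" is the baseline time divided by the variant time and that the last column is the assembly time divided by the variant time. The helper name `speedup_percent` is hypothetical and used only for illustration.

```python
# Total decode times in seconds (100 decodes of goldhill.png), from the libpng slide.
times = {
    "None NEON": 50.519,
    "NEON Assembly": 42.899,
    "NEON intrinsics": 44.081,
}


def speedup_percent(reference_s, variant_s):
    """Speed of a variant relative to a reference, as a percentage.

    Times are "lower is better", so the ratio is reference / variant.
    """
    return 100.0 * reference_s / variant_s


for name, t in times.items():
    vs_baseline = speedup_percent(times["None NEON"], t)
    vs_assembly = speedup_percent(times["NEON Assembly"], t)
    print(f"{name:16s} {t:7.3f} s  {vs_baseline:7.2f}%  {vs_assembly:7.2f}%")
```

Because the inputs are elapsed times, the ratio is inverted compared with a frames-per-second benchmark, where the variant value goes in the numerator.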
5. 1. Part of the Google WebM project.
2. Source code - https://chromium.googlesource.com/webm/libvpx
3. Status -
* Completely rewrote NEON assembly as intrinsics.
* Optimized performance on ARMv7.
* Posted a total of 49 patches for VP8/VP9.
- VP8: review in progress.
- VP9: posted, waiting for review. (27/Feb/2014)
* Work to run on the ARMv8 architecture is in progress.
libvpx - VP8/VP9
6. Benchmark result:
Version: 1.3.0 [February 26, 2014]
Toolchain: gcc version 4.8.3 20140106 (prerelease)
(crosstool-NG linaro-1.13.1-4.8-2014.01 - Linaro GCC 2013.11)
CPU: Cortex-A8 800 MHz, single core.
Test video: Tears of Steel, 1080p, 12:15 mins, in both VP8 and VP9 formats.
                               FPS     Performance   vs. NEON assembly
VP8 Decode   None NEON         2.82    100.00%
             NEON Assembly     13.23   469.14%       100.00%
             NEON intrinsics   11.97   424.46%       90.48%
VP9 Decode   None NEON         2.22    100.00%
             NEON Assembly     8.37    377.03%       100.00%
             NEON intrinsics   7.56    340.54%       90.32%
About 9.5% performance loss from using intrinsics instead of assembly.
libvpx - VP8/VP9
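The "9.5% performance loss" figure for libvpx can be checked against the FPS numbers. The following is a minimal sketch in Python, assuming the loss is defined as 100% minus the ratio of intrinsics FPS to assembly FPS. The function names are hypothetical, and the FPS values are the rounded ones from the slide, so the last digit may differ slightly from the slide's percentages.

```python
# Decode rates in frames per second, from the libvpx benchmark slide.
fps = {
    "VP8": {"None NEON": 2.82, "NEON Assembly": 13.23, "NEON intrinsics": 11.97},
    "VP9": {"None NEON": 2.22, "NEON Assembly": 8.37, "NEON intrinsics": 7.56},
}


def relative_percent(variant_fps, reference_fps):
    """Performance of a variant relative to a reference (FPS: higher is better)."""
    return 100.0 * variant_fps / reference_fps


def intrinsics_loss_percent(codec):
    """Percentage of assembly performance lost by switching to intrinsics."""
    row = fps[codec]
    return 100.0 - relative_percent(row["NEON intrinsics"], row["NEON Assembly"])


for codec in fps:
    print(f"{codec}: intrinsics loss vs assembly = {intrinsics_loss_percent(codec):.2f}%")
```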
7. ARMv7 Android refresh:
Features from AOSP integrated into libjpeg-turbo version 1.3:
tile decode, color conversion (rgb565 and rgb8888), backing store (Ashmem).
Status: Upstreaming to libjpeg-turbo in progress.
Source: git://git.linaro.org/people/ragesh.radhakrishnan/libjpeg-turbo.git
jpeglib decompression benchmark on PandaBoard using tjbench:
Library                Image resolution   Performance (fps)   Throughput (MP/sec)
Linaro libjpeg-turbo   3008*2000          3.8                 22.8563
                       227*149            829.1839            28.04
AOSP jpeglib           3008*2000          1.454               8.474
                       227*149            285.302             9.64
libjpeg-turbo
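The throughput column relates to the frame rate through the image size. The following is a minimal sketch in Python, assuming throughput in MP/sec is width times height times fps divided by one million. The function name `throughput_mp_per_s` is hypothetical. The fps values on the slide are rounded, so small differences from the listed throughput are expected.

```python
def throughput_mp_per_s(width, height, fps):
    """Decoded megapixels per second for images of the given size."""
    return width * height * fps / 1e6


# Linaro libjpeg-turbo rows from the slide: (width, height, fps, listed MP/sec).
linaro_rows = [
    (3008, 2000, 3.8, 22.8563),
    (227, 149, 829.1839, 28.04),
]

for width, height, fps, listed in linaro_rows:
    computed = throughput_mp_per_s(width, height, fps)
    print(f"{width}*{height}: computed {computed:.4f} MP/sec, listed {listed} MP/sec")
```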
8. Armv8 Port:
JPEG decoder functions hand-coded for the ARMv8 port are listed below. This port was tested using the ARM RTSM.
Status : Decoder routines upstreamed to libjpeg-turbo
Source: git://git.linaro.org/people/ragesh.radhakrishnan/libjpeg-turbo.git
Branch: libjpeg-turbo-armv8
#   JPEG functions ported       Remarks
1   IDCT_Slow                   IDCT integer version
2   IDCT_Fast                   IDCT non-accurate version
3   IDCT_2x2                    IDCT 2x2 size reduction
4   IDCT_4x4                    IDCT 4x4 size reduction
5   Color conversion routines   YUV to RGB, YUV to BGR, YUV to grayscale, etc.
libjpeg-turbo
9. Pixman armv8 port: Rewriting armv7 functions to armv8.
Approach : Using Intrinsics
List of functionalities and progress
Status: rewriting of bilinear scanline function in progress
Test environment: ARMv8 xfce stack on ARM RTSM.
#   Main functions to be ported    Remarks
1   Bilinear scanline functions    80% ported
2   Pixman composite function      Pixel processing functions; not started
Pixman
11. Status
Chromium-24 src + 32 patches
binary built
tests built (most run without problem)
Model
Networking broken (VFP "upgrade")
2 Gig RAM limit
dual core slow
Chromium Porting to AARCH64
12. Plan
libv8 ToT enables ToT Chromium
Forward port
Push upstream to Chromium community
Chromium on AARCH64
13. Any input on next libraries?
Any libraries you'd like to see Linaro optimize?
Discussion
14. More about Linaro Connect: http://connect.linaro.org
More about Linaro: http://www.linaro.org/about/
More about Linaro engineering: http://www.linaro.org/engineering/
Linaro members: www.linaro.org/members