2. Raul Sena Ferreira
IT Coordinator at IPEA (Research Institute for Applied Economics)
Lecturer in Computer Science at UFRRJ
MSc Student in Data Engineering (Major in Machine Learning) at PESC-UFRJ
www.raulferreira.com.br
June 22, 2017
4. Graphics processing unit
GPU-accelerated computing is the use of a graphics processing unit (GPU)
together with a CPU to accelerate deep learning, analytics, and engineering
applications [1].
CUDA® is a parallel computing platform and programming model invented by
NVIDIA. It enables dramatic increases in computing performance by harnessing
the power of the graphics processing unit (GPU) [2].
8. RAIC - 2014
Parallelization of the Multivariate Non-Parametric Kernel
Density Estimation (KDE) Method Using GPU/CUDA
9. Research problem
Kernel Density Estimation (KDE): a non-parametric statistical algorithm for
estimating the probability of events within a population
Multidimensional KDE: O(n^2 k)
How can this method be parallelized?
Are there any real advantages?
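To make the O(n^2 k) cost concrete, here is a minimal serial sketch in pure Python. It assumes n is the number of sample points and k the number of dimensions, and it uses an isotropic Gaussian kernel with a single bandwidth; the slides do not state which kernel or bandwidth the original work used, so the function name and parameters are illustrative only.

```python
import math

def gaussian_kde(points, queries, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each query point.

    Assumption: isotropic Gaussian kernel with one shared bandwidth.
    """
    n = len(points)
    k = len(points[0])
    # Normalisation constant: n * h^k * (2*pi)^(k/2).
    norm = n * (bandwidth ** k) * (2.0 * math.pi) ** (k / 2.0)
    densities = []
    for q in queries:              # n iterations when queries == points
        total = 0.0
        for p in points:           # n iterations
            # k operations: squared scaled distance between q and p
            dist2 = sum(((qd - pd) / bandwidth) ** 2 for qd, pd in zip(q, p))
            total += math.exp(-0.5 * dist2)
        densities.append(total / norm)
    return densities

sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(gaussian_kde(sample, sample, bandwidth=0.5))
```

Each iteration of the outer loop is independent of the others, so one natural way to parallelize (not necessarily the scheme used in the original work) is to assign one query point to each GPU thread.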
10. Serial vs Parallel
● Processor: Intel i5 2500
● Graphics card: GeForce GTX 650 Ti Boost
● Number of cores: 4 (Hyperthread)
● Input: 30,000 points (x, y)
3024% faster than the serial version and about 6 times faster than parallelized MATLAB.
Implementation              Running time
KDE (serial)                31.08
Matlab KDE (parallelized)   6.35
KDE with CUDA               1.03
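As a quick check of the arithmetic, the speedup ratios can be recomputed from the three timings above. This is only a sketch: the values are copied from the slide, which does not state their units, and the dictionary keys are just labels.

```python
# Timings copied from the slide (units are not stated there).
timings = {
    "KDE (serial)": 31.08,
    "Matlab KDE (parallelized)": 6.35,
    "KDE with CUDA": 1.03,
}

def speedup(baseline, improved):
    """How many times faster `improved` is than `baseline`."""
    return baseline / improved

serial_vs_cuda = speedup(timings["KDE (serial)"], timings["KDE with CUDA"])
matlab_vs_cuda = speedup(timings["Matlab KDE (parallelized)"], timings["KDE with CUDA"])

print(f"CUDA vs serial: {serial_vs_cuda:.1f}x")           # CUDA vs serial: 30.2x
print(f"CUDA vs parallel MATLAB: {matlab_vs_cuda:.1f}x")  # CUDA vs parallel MATLAB: 6.2x
```

The tabulated values give a ratio of about 30.2, which is close to, but not identical to, the 3024% figure quoted on the slide.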
12. Research problem
How can events be statistically inferred from geographic coordinates?
How can an Information System be built with this approach?
How can this Information System benefit from GPU computing?
How can all of these pieces be “glued” together?
14. Integrating tools for GPU programming
PyCUDA: NVIDIA's CUDA parallel computation API from Python
● Maps all of CUDA into Python
● Enables run-time code generation (RTCG) for flexible, fast, automatically
tuned codes
● Added robustness: automatic management of object lifetimes, automatic error
checking
● Added convenience: comes with ready-made on-GPU linear algebra,
reduction, scan. Add-on packages for FFT and LAPACK available
● Fast. Near-zero wrapping overhead
● Complete, helpful documentation
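The idea behind run-time code generation can be sketched without a GPU: the CUDA C source is ordinary text, so Python can specialise it just before compilation. The example below fills the number of dimensions into a kernel template. The kernel body, the template, and the function name make_kde_kernel_source are illustrative assumptions, not code from the original work. On a machine with PyCUDA and a CUDA-capable GPU, a string like this would typically be passed to pycuda.compiler.SourceModule; that step is left out here so the sketch runs anywhere.

```python
from string import Template

# Hypothetical CUDA C kernel: one thread per point, unnormalised Gaussian sum.
KERNEL_TEMPLATE = Template(r"""
__global__ void kde(const float *points, float *density, int n, float h)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float acc = 0.0f;
    for (int j = 0; j < n; ++j) {
        float dist2 = 0.0f;
        for (int d = 0; d < $DIM; ++d) {
            float diff = (points[i * $DIM + d] - points[j * $DIM + d]) / h;
            dist2 += diff * diff;
        }
        acc += expf(-0.5f * dist2);
    }
    density[i] = acc;
}
""")

def make_kde_kernel_source(dim):
    """Return CUDA C source with the dimensionality baked in as a constant."""
    if dim < 1:
        raise ValueError("dim must be a positive integer")
    return KERNEL_TEMPLATE.substitute(DIM=dim)

source_2d = make_kde_kernel_source(2)
print(source_2d)
```

Baking the dimension in as a literal lets the CUDA compiler unroll the inner loop, which is one kind of tuning that run-time generation makes possible.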
15. Case study
Dataset: 14,027 records collected from UFRRJ undergraduate students between
2000 and 2013
Two-dimensional data: coordinates of the students' homes
Question: What is the probability that a person living in a given region studies at
UFRRJ? How are these probabilities distributed across the country?
17. ERAD - 2017
Performance Analysis of the Theano Library with
GPUs from the Perspective of the CUDA Profiler
18. ERAD-RJ 2017
How efficient are the libraries with respect to hardware architectures, especially when
using GPUs?
The literature often evaluates performance by measuring total processing time
The goal is to measure performance on different GPUs, using the CUDA Profiler as
the metric-collection tool
Repository: https://github.com/felipe-melo/Erad-Code
19. Motivation
Deep learning is a class of machine learning algorithms that use many layers
to process data:
● Non-linear transformations
● Feature extraction, image classification, etc.
(The return of) neural networks:
● CNN, RNN, Autoencoders ...
22. CUDA Profiler
Performance analysis tool from NVIDIA
Commands:
--print-gpu-trace -u s
--metrics flop_count_sp,flop_sp_efficiency,flop_count_dp,flop_dp_efficiency
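The options above can be assembled into full command lines from Python. This is a hedged sketch: it assumes the profiler executable is nvprof (the slide does not name the binary) and uses "python train.py" as a hypothetical program to profile. The metric names are joined with commas and no spaces so that they reach the profiler as a single argument.

```python
import shlex

# Metric names taken from the slide.
FLOP_METRICS = [
    "flop_count_sp",
    "flop_sp_efficiency",
    "flop_count_dp",
    "flop_dp_efficiency",
]

def build_profiler_commands(target, profiler="nvprof"):
    """Return two argv lists: a GPU trace run and a FLOP-metrics run.

    `profiler` defaults to nvprof, which is an assumption; `target` is the
    command line of the program to profile.
    """
    target_argv = shlex.split(target)
    trace_cmd = [profiler, "--print-gpu-trace", "-u", "s"] + target_argv
    metrics_cmd = [profiler, "--metrics", ",".join(FLOP_METRICS)] + target_argv
    return trace_cmd, metrics_cmd

# "python train.py" is a hypothetical target script.
trace_cmd, metrics_cmd = build_profiler_commands("python train.py")
print(" ".join(trace_cmd))
print(" ".join(metrics_cmd))
```

The lists are in the form expected by subprocess.run, so they can be executed directly on a machine where the profiler is installed.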
27. Paraphrase detection + deep learning + CUDA
Comparison of Recursive Autoencoder implementations in TensorFlow
and Theano
Datasets: Microsoft Research Paraphrase Corpus & Webis Crowd Paraphrase
Corpus 2011
Main contributions: speedup differences between the libraries; evaluating whether
the use of hypernyms improves the results
Peripheral contributions: comparing two widely used frameworks on an
important neural network problem
29. Bioinformatics: Sequencing protein processing using NVBIO [3]
Computational Finance: Monte Carlo simulation using NVIDIA Tesla GPUs [4]
Computational Fluid Dynamics: Navier-Stokes models and Lattice methods
with very large speedups using CUDA-enabled GPUs [5]
30. Media and Entertainment: GPUs to deliver high performance graphics and
parallelism for video editing, digital animation, rendering and media creation [7]
Data Science, Analytics and Databases: GPUs for big data analytics to make
better, real-time business decisions [6]
31. Electronic Design and Automation: Recent trends in HPC increasingly
exploit many-core GPUs to speed up computationally intensive
simulations, including Verilog simulation, signal integrity, and electromagnetics [8]
Weather and Climate: Weather Research and Forecasting model and tsunami
simulations. [12]
Defense and Intelligence: Deep Learning Tools for Defense, Safety, and Security
Applications. [9]
Imaging and Computer Vision: Applications can achieve interactive video
frame-rate performance. Libraries: GPU4Vision, OpenVIDIA, MinGPU, etc. [10]
33. Structural mechanics: CUDA-enabled GPUs accelerate ANSYS, Abaqus, MSC Nastran,
IMPETUS Afea, and additional structural mechanics applications [11]
Machine Learning: Many machine learning tools [14]:
● Caffe: Framework for convolutional neural network algorithms
● Theano: Python library to define, optimize, and evaluate mathematical
expressions
● cxxnet: Neural network toolkit
● cuBLAS: GPU-accelerated version of the complete standard BLAS library
34. Want to join in?
Ongoing research or new ideas. Send me an email :)
Requirements:
● Python programming
● English (basic reading and writing)
● Basic understanding of neural networks
● Eagerness to learn and resilience
35. Good places to start
NVIDIA Website
Intro to Parallel Programming Using CUDA to Harness the Power of GPUs
CUDA University Courses
Multicore and GPU Programming for Video Games
CUDA Tutorials | The Supercomputing Blog
36. "In the U.S. there are two types of hipsters: those
who know how to program and those who serve
coffee."
César Hidalgo