2019 HPCC Systems® Community Day
Challenge Yourself – Challenge the Status Quo
Expanding HPCC Systems Deep Neural Network Capabilities
Robert Kennedy, PhD Candidate at Florida Atlantic University
Taghi M. Khoshgoftaar, PhD | Advisor
Timothy Humphrey | LexisNexis Mentor
Overview
• Both topics covered here are a result of my Summer Internship
• Work is available on GitHub
• Tool for creating “Standard” HPCC Systems Platform Virtual Machines
• Hyper-V, AWS, Azure, VirtualBox, etc…
• https://github.com/xwang2713/cloud-image-build
• In addition, used for creating NVIDIA GPU Enabled VMs (AWS AMI)
• Started a GPU Enabled Deep Learning Bundle
• Demonstrating GPU accelerated Deep Learning on HPCC Systems
• https://github.com/hpcc-systems/GPU-Deep-Learning
GPU Accelerated HPCC Systems | Robert Kennedy 2
HPCC Systems on Hyper-V
• Used Packer.io to generate machine images
• To create a Hyper-V Image:
• https://github.com/xwang2713/cloud-image-build/tree/master/packer/hyper-v
• Hyper-V VMs can be used similarly to the VirtualBox VMs you might already be using
• Hyper-V images build locally, on a Hyper-V enabled machine
• The installed programs list can be easily modified in the .json config
• Running the HPCC Systems Platform on Hyper-V allows Docker Desktop (Windows) to be used
• Docker Desktop uses Hyper-V, and Hyper-V and VirtualBox can’t run concurrently
GPU Accelerated HPCC Systems | Robert Kennedy 3
Config File
• Packer.io uses a .json file as its config
• Defines the network (e.g. for VirtualBox)
• Defines the size of the machine (for cloud providers)
• The config defines which software gets installed, via standard Linux commands (see the sketch below)
GPU Accelerated HPCC Systems | Robert Kennedy 4
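As a rough illustration of editing the installed-software list, the Python sketch below loads a Packer template and appends one more install command to a shell provisioner. The file name, the added package, and the assumption that the template uses an inline shell provisioner are hypothetical; check the cloud-image-build repo for the actual layout.

```python
import json

# Load a Packer template (file name is hypothetical).
with open("hpcc-hyperv.json") as f:
    template = json.load(f)

# Packer shell provisioners carry the Linux install commands; add one more package.
for prov in template.get("provisioners", []):
    if prov.get("type") == "shell":
        prov.setdefault("inline", []).append("sudo apt-get install -y htop")

# Write the modified template back out.
with open("hpcc-hyperv.json", "w") as f:
    json.dump(template, f, indent=2)
```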
GPU Enabled Virtual Machines
• Using the same tool, GPU enabled VMs can be created
• Cloud images build in cloud, local images build locally
• This work supports the use of Python 3.6, CUDA 10.0, TensorFlow 1.14, and
PyTorch 1.1
• AWS GPU Instances:
• K80s, V100s
• Azure GPU Instances:
• K80s (12 GB VRAM)
• V100s (16 GB VRAM, with and without NVLink)
• P100s (16 GB VRAM)
GPU Accelerated HPCC Systems | Robert Kennedy 5
Bundle Implementation
HPCC Systems and GPU Accelerated Deep Learning
• The current HPCC Systems Platform is CPU only, and so are its DL runtimes
• My previous work was with distributed DL on HPCC Systems using only CPUs
• Traditional HPCC Systems use commodity computers connected via standard network protocols
• With respect to Deep Learning, this presents a large communication bottleneck, partly due to DL’s iterative nature
• Graphics Processing Units (GPUs) are used to decrease the computation time for Neural Networks
• Single or multiple GPUs are connected to the CPU (central node) via much faster hardware connections
• A new bundle was started to enable GPU-accelerated Deep Learning on the HPCC Systems Platform
GPU Accelerated HPCC Systems | Robert Kennedy 7
GPU Accelerated Deep Learning
• With this bundle, you can train NN models on the GPU
• Sprayed data is used as training data
• Bundle is in its infancy, but you can build, train, and use neural networks
• Using only ECL
• Using ECL and Python allows for more customized NN architectures and training routines
• A trained model (either in ECL or ECL+Python) can be used to predict on sprayed data
• It returns its predictions via records in a one-hot-encoded format
GPU Accelerated HPCC Systems | Robert Kennedy 8
Bundle Implementation Overview
• Current work uses only one Thor node
• Single Thor node still can use multiple GPUs
• ECL/HPCC Systems handles the data storage and execution of the NN runtimes
• The implementation uses data parallelism across one or more GPUs
• Currently limited to only a single physical computer
• The pyembed plugin allows for Python to run on HPCC Systems Platform
• We use Python 3, as Python 2 is nearing EOL
• Python code handles the NN training and interfaces with GPUs directly using NVIDIA’s CUDA language
GPU Accelerated HPCC Systems | Robert Kennedy 9
TensorFlow | Keras
• The Python code is written with TensorFlow and Keras
• TensorFlow
• Google’s popular Deep Learning library
• Keras
• Deep Learning library API – uses TensorFlow or another ‘backend’
• Much less code to produce the same model (see the sketch below)
10
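To show how compact the Keras API is, here is a minimal sketch (not taken from the bundle) that defines and compiles a small network in a handful of lines; the layer sizes and optimizer are illustrative assumptions.

```python
import tensorflow as tf

# A small fully connected network for 784-pixel inputs and 10 classes.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
# One call configures the optimizer, loss, and metrics...
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# ...and one call would train it (data preparation omitted here):
# model.fit(x_train, y_train, batch_size=128, epochs=20)
```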
Artificial Neural Networks
Biological Neuron
• Basis for artificial neural networks
• Such as the ones in deep learning
• Dendrites
• Input vector, from previous
neurons
• Weights
• Soma
• Summation Function
• Axon
• Activation Function
• A neuron ‘fires’ when there is enough of an input stimulus
GPU Accelerated HPCC Systems | Robert Kennedy 12
[Figure: biological neuron with dendrite, soma, and axon labeled]
Artificial Neuron
• First conceptualized in 1943
• The inputs of the neuron are the outputs of the previous layer’s neurons
• The weighted inputs are summed together with a bias
• Then passed into an activation function
• Activation functions are like the biological neuron ‘deciding’ to fire
• ReLU activation – outputs x if x > 0, and outputs 0 if x ≤ 0, where x is the input (see the sketch below)
GPU Accelerated HPCC Systems | Robert Kennedy 13
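As a small illustration of the bullets above, here is a NumPy sketch (not from the bundle) of a single artificial neuron: a weighted sum of the inputs plus a bias, passed through a ReLU activation.

```python
import numpy as np

def relu(x):
    # ReLU: pass x through when x > 0, output 0 otherwise.
    return np.maximum(0.0, x)

def neuron(inputs, weights, bias):
    # Weighted sum of the inputs plus a bias, then the activation function.
    return relu(np.dot(inputs, weights) + bias)

# Example: three inputs from the previous layer's neurons.
print(neuron(np.array([0.5, -1.0, 2.0]), np.array([0.1, 0.4, 0.3]), bias=0.05))  # ~0.3
```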
A Fully Connected Network
• Fully Connected Network
• Each neuron is connected to
every neuron in the subsequent
layer
• Neural Network Visualization
• 2 hidden layers, fully connected, 3
class classification output
• Multi-Layer Perceptron is an example
GPU Accelerated HPCC Systems | Robert Kennedy 14
Neural Network Training
• Forward propagation
• Backpropagation
• Optimize the model with respect to a loss function
• A quantification of how “right or wrong” the model is for any given datum
• Gradient Descent
• Stochastic Gradient Descent (SGD)
• Mini-batch SGD
• Right: visualization of gradient
descent over an example loss
function
GPU Accelerated HPCC Systems | Robert Kennedy 15
Gradient Descent In Action
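To make the mini-batch SGD idea concrete, here is a toy NumPy sketch (not from the bundle) that fits a linear model by repeatedly stepping against the gradient of the mean squared error computed on small batches.

```python
import numpy as np

# Synthetic data: y = X @ true_w plus a little noise.
rng = np.random.RandomState(0)
X = rng.randn(1000, 3)
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * rng.randn(1000)

w = np.zeros(3)                 # model parameters to learn
lr, batch_size = 0.1, 32        # learning rate and mini-batch size
for epoch in range(20):
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        b = order[start:start + batch_size]
        grad = 2.0 * X[b].T @ (X[b] @ w - y[b]) / len(b)  # gradient of MSE on this batch
        w -= lr * grad                                    # step downhill
print(w)  # close to [2.0, -1.0, 0.5]
```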
Where Exactly Do the GPUs Come Into Play?
• Training a NN model is the most time-consuming part; this is where the GPU is used to dramatically reduce computation time
• Two main training steps
• Forward pass – weights and errors
• Backward pass – gradients and weight updates
• Computationally expensive convolutions are offloaded onto GPUs
• These steps are done for each data point, multiple times
GPU Accelerated HPCC Systems | Robert Kennedy 16
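In the TensorFlow 1.14 stack used here, convolutions land on the GPU automatically when one is visible; the minimal sketch below only illustrates explicit device placement and is not taken from the bundle.

```python
import tensorflow as tf

# Pin a large matrix multiply to the first GPU; soft placement falls back to
# the CPU if no GPU is available.
with tf.device("/GPU:0"):
    a = tf.random.normal([4096, 4096])
    b = tf.random.normal([4096, 4096])
    c = tf.matmul(a, b)

config = tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)
with tf.Session(config=config) as sess:   # TF 1.x graph-mode style
    sess.run(c)
```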
Parallel Paradigms
• Data Parallelism
• Model Parallelism
• Synchronous and
Asynchronous
• Parallel SGD
GPU Accelerated HPCC Systems | Robert Kennedy 17
[Figure: data parallelism vs. model parallelism]
Model Parallelism
• Neural Network Model is split across
nodes
• For models larger than a GPU’s
memory
• Requires significantly higher
communication bandwidths between
nodes
• Not well suited for a cluster system
• However, this paradigm is feasible for a
multi-GPU system due to faster hardware
speeds
GPU Accelerated HPCC Systems | Robert Kennedy 18
Data Parallelism
• Data is partitioned and distributed to nodes
• A single NN model is replicated onto each node
• Only weight updates are communicated and aggregated
• As defined by the specific parallel training method
• Suitable for parallelizing across multiple nodes in an HPCC Systems cluster or across GPUs in a single system
• This is the paradigm used here (see the sketch below)
GPU Accelerated HPCC Systems | Robert Kennedy 19
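One way data parallelism looks in the TensorFlow 1.14 / Keras stack named earlier is the multi_gpu_model utility, which replicates a model across GPUs and splits each batch between them; whether the bundle uses this exact mechanism is not stated, so treat this as an illustrative sketch.

```python
import tensorflow as tf

def build_model():
    # Any single-GPU Keras model; layer sizes here are placeholders.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

model = build_model()
# Replicate onto 4 GPUs; each replica processes a slice of every batch and the
# per-GPU results are merged on the CPU by default.
parallel_model = tf.keras.utils.multi_gpu_model(model, gpus=4)  # assumes 4 visible GPUs
parallel_model.compile(optimizer="sgd", loss="categorical_crossentropy")
# parallel_model.fit(x_train, y_train, batch_size=512)  # 128 examples per GPU
```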
Not Your Average HPCC Systems
• Slightly different from traditional HPCC Systems topologies
• The whole figure represents a single physical computer and Thor node
• Parameter Server
• This is the CPU on the system
• Nodes (blue)
• Each node represents a single physical GPU
• Connections are high-speed hardware
• PCI Express 3.0 provides roughly 985 MB/s per lane (about 15.8 GB/s across 16 lanes)
• NVLINK is roughly 10x faster than PCIe Gen 3
GPU Accelerated HPCC Systems | Robert Kennedy 20
Workflow Example
Bundle Usage Example Architecture
• We will create a Convolutional Neural Network (CNN) and train it on the MNIST dataset
• MNIST is a 10-class image classification dataset of handwritten digits 0-9
• The CNN takes 784 pixels as input (each with a range of 0-255)
• Two convolutional layers
• One fully connected layer with 128 neurons
• 10 output neurons (one for each class)
• Total of 1,199,882 trainable parameters (see the sketch below)
• Processing through 720,000 MNIST images
GPU Accelerated HPCC Systems | Robert Kennedy 22
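A Keras sketch matching the layer sizes listed above follows; with two 3x3 convolutions (32 and 64 filters), a 2x2 max-pool, a 128-neuron dense layer, and 10 outputs, the trainable parameter count works out to exactly 1,199,882. The filter counts and pooling are assumptions inferred from that total, so the bundle's actual layers may differ in detail.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),  # 320 params
    layers.Conv2D(64, (3, 3), activation="relu"),                           # 18,496 params
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),                                   # 1,179,776 params
    layers.Dense(10, activation="softmax"),                                 # 1,290 params
])
model.summary()  # total trainable parameters: 1,199,882
```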
Spray MNIST Dataset
• MNIST included in bundle
• Test and train sets, 785-value fixed-length records (784 pixels plus one label)
• 60,000 28x28 grayscale training images
• 10,000 28x28 grayscale test images
• Both are labeled as one of 10 classes, 0-9
GPU Accelerated HPCC Systems | Robert Kennedy 23
Image Visualization
• Imported raw MNIST data
• Visualization of a single MNIST image in the “data” format
• Each pixel has a value between 0-255, represented as a 2-digit hex number
• Each pixel is a feature
GPU Accelerated HPCC Systems | Robert Kennedy 24
Preparing the Data
• Currently, the bundle demonstrates how to train on image data
• Includes an example NN and example datasets (MNIST and Fashion-MNIST)
• Training data and labels are molded into NumPy arrays with a specified shape before training (see the sketch below)
• Here, shape is the dimensions of the image
• i.e. the dimensions of the input features
• These get flattened to an array of 784 inputs for 784 input neurons
GPU Accelerated HPCC Systems | Robert Kennedy 25
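A minimal NumPy sketch of that molding step is below; the array here is a stand-in for the sprayed pixel data, and the 0-1 scaling is a common convention rather than something stated on the slide.

```python
import numpy as np

flat = np.zeros((60000, 784), dtype=np.uint8)   # stand-in for 60,000 sprayed records of 784 pixels
# Mold into the (samples, height, width, channels) shape a 2-D convolution expects,
# scaling the 0-255 pixel values to 0-1.
images = flat.reshape(60000, 28, 28, 1).astype("float32") / 255.0
# A dense input layer would instead take the flattened 784 features per image.
flattened = images.reshape(60000, 784)
print(images.shape, flattened.shape)  # (60000, 28, 28, 1) (60000, 784)
```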
Creating a CNN – model.add() method
• First, we define the optimizer and its
parameters
• Next, we define the training scheme
• Batch size = 128
• We’ll train for 20 epochs
GPU Accelerated HPCC Systems | Robert Kennedy 26
Creating a CNN – model.add() method
• Next, we define the NN architecture
• Input shape, 28x28x1 grayscale
images
• Initialize the model
• The “nnOutputLayer” is the final layer and, at this point, represents the entire NN model
GPU Accelerated HPCC Systems | Robert Kennedy 27
Train the CNN – model.train() method
• “nnOutputLayer” is passed into model.train() along with hyperparameters and training data (a rough Keras equivalent follows below)
GPU Accelerated HPCC Systems | Robert Kennedy 28
[Screenshots labeled GPU: and CPU:]
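The bundle's model.train() is ECL, so the sketch below is only a rough Keras equivalent of this step, reusing the CNN from the architecture sketch above and mirroring the stated hyperparameters (batch size 128, 20 epochs); the optimizer choice is an assumption.

```python
import tensorflow as tf

# Load MNIST and shape it like the preparation sketch earlier.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype("float32") / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype("float32") / 255.0
y_train = tf.keras.utils.to_categorical(y_train, 10)   # one-hot labels
y_test = tf.keras.utils.to_categorical(y_test, 10)

# "model" is the CNN defined in the earlier architecture sketch.
model.compile(optimizer="adam",                         # optimizer choice is an assumption
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=128, epochs=20,
          validation_data=(x_test, y_test))
```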
Create CNN – ECL and Python
GPU Accelerated HPCC Systems | Robert Kennedy 29
Example Input and Output
GPU Accelerated HPCC Systems | Robert Kennedy 30
Image Input
One-Hot-Encoded Output
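To read the one-hot output, the predicted class is simply the position of the 1 in each record; a tiny NumPy sketch with a hypothetical prediction row:

```python
import numpy as np

prediction = np.array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0])  # hypothetical one-hot output record
print(np.argmax(prediction))  # predicted digit: 7
```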
Performance
Performance Evaluation
• A case study was performed to measure the performance improvements
• 5 identical Convolutional Neural Networks are trained on the MNIST dataset
• 10 times each to provide statistical significance
• Measuring the required training time for the same model on same data using fixed
training parameters
• Faster training time is desired
• CPU Alone, 1, 2, 3, and 4 GPUs
• Older K80s are used
• Newer GPUs would only increase performance and efficiency
• Compared against each other and against the “optimal” speedup
• i.e. linear speedup
GPU Accelerated HPCC Systems | Robert Kennedy 32
Performance Boost: GPU vs. CPU
• Time, in seconds, to train a CNN on
MNIST dataset
• Training time speedup is 5.4x for a K80 GPU vs. a Xeon CPU
• The speedup is large, even for a simple model on small and simple data
• The measured time covers NN training only, not HPCC-specific computations, which would be the same whether on CPU or GPU
GPU Accelerated HPCC Systems | Robert Kennedy 33
Performance Boost: CPU vs. GPU vs Optimal Speedup
• Optimal speedup is linear
• i.e. twice the nodes is twice as fast
• Speedup is not expected to be linear due to communication overheads
• Results show that adding GPUs incurs minimal overhead
GPU Accelerated HPCC Systems | Robert Kennedy 34
Conclusion
• A tool was used to create HPCC Systems virtual machine images on various new platforms
• A good use case is creating GPU-enabled images
• Brief overview of Neural Networks and their optimization
• Demonstrated that GPU-accelerated deep learning is possible on the HPCC Systems Platform
• Demonstrated that GPUs provide a significant performance increase, even on a non-traditional cluster
GPU Accelerated HPCC Systems | Robert Kennedy 35
Future Work
• Implementing generalizable data loaders
• To allow training on data with less knowledge of NumPy (Python)
• Continue adding to the supported methods and ECL modeling functions
• Research and development on integrating model parallelism
• Research on NN training on multi-node clusters where each node can have one or more GPUs
GPU Accelerated HPCC Systems | Robert Kennedy 36
Links
• GitHub
• https://github.com/hpcc-systems/GPU-Deep-Learning
• https://github.com/xwang2713/cloud-image-build
• NVIDIA CUDA
• https://developer.nvidia.com/cuda-toolkit
• TensorFlow
• https://www.tensorflow.org/
• Keras
• https://keras.io/
• NumPy
• https://numpy.org/
GPU Accelerated HPCC Systems | Robert Kennedy 37
GPU Accelerated HPCC Systems | Robert Kennedy 38
Robert Kennedy
PhD Candidate, Florida Atlantic University
rkennedy@fau.edu
Questions?
GPU Accelerated HPCC Systems | Robert Kennedy 39
View this presentation on YouTube:
https://www.youtube.com/watch?v=GMt-_Io4Jys&list=PL-8MJMUpp8IKH5-d56az56t52YccleX5h&index=8&t=0s (4:02)