Training Neural Networks
VISION
Accelerate innovation by unifying data science, engineering and business
WHO WE ARE
• Founded by the original creators of Apache Spark
• Contributes 75% of the open source code, 10x more than any other company
• Trained 100k+ Spark users on the Databricks platform
PRODUCT
Unified Analytics Platform powered by Apache Spark™
About our speaker
Denny Lee
Technical Product Marketing Manager
Former:
• Senior Director of Data Sciences Engineering at SAP Concur
• Principal Program Manager at Microsoft
• Azure Cosmos DB Engineering Spark and Graph Initiatives
• Isotope Incubation Team (currently known as HDInsight)
• Bing’s Audience Insights Team
• Yahoo!’s 24TB Analysis Services cube
Deep Learning Fundamentals Series
This is a three-part series:
• Introduction to Neural Networks
• Training Neural Networks
• Applying your Convolutional Neural Network
This series will make use of Keras (TensorFlow backend) but as it is a
fundamentals series, we are focusing primarily on the concepts.
Previous Session: Introduction to Neural Networks
• What is Deep Learning?
• What can Deep Learning do for you?
• What are artificial neural networks?
• Let’s start with a perceptron…
• Understanding the effect of activation functions
Current Session: Training Neural Networks
• Tuning training
• Training Algorithms
• Optimization (including Adam)
• Convolutional Neural Networks
Upcoming Session: Applying Neural Networks
• Diving further into CNNs
• CNN Architectures
• Convolutions at Work!
Convolutional Neural Networks
[Figure: CNN architecture. Feature extraction (28 x 28, 28 x 28, 14 x 14): convolution with 32 filters, convolution with 64 filters, subsampling with stride (2,2), dropout. Classification: fully connected, dropout, outputs 0 … 9.]
Tuning Training
Hyperparameters
• Network
• How many layers?
• How many neurons in each layer?
• What activation functions to use?
• Learning algorithm
• What’s the best value of the learning rate?
• How quickly to decay the learning rate? Momentum?
• What type of loss function should I use?
• What batch size?
• How many iterations are enough?
Overfitting and underfitting
Hyperparameters: Network
Generally, the more layers and the more units in each layer:
• The greater the capacity of the artificial neural network
• The greater the risk of overfitting when your goal is to build a generalized model.
From a practical perspective, a good starting point is:
• The number of input units equals the dimension of features
• The number of output units equals the number of classes (e.g. in the MNIST dataset there
are 10 possible values representing the digits 0…9, hence there are 10 output units)
• Start with one hidden layer that is 2x the number of input units
• A good reference is Andrew Ng’s Coursera Machine Learning course.
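The starting point above can be sketched as a quick sizing calculation. This is only an illustration: the helper name `dense_param_count` and the assumption of flattened 28 x 28 MNIST inputs (784 features) are not from the slides.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the units per layer, input first, output last.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus one bias per unit
    return total


n_inputs = 28 * 28        # assumption: flattened 28 x 28 MNIST images
n_hidden = 2 * n_inputs   # rule of thumb from the slide: 2x the input units
n_outputs = 10            # one output unit per digit class (0..9)

sizes = [n_inputs, n_hidden, n_outputs]
print(sizes, dense_param_count(sizes))
```

Counting parameters this way gives a rough feel for capacity, which is what drives the overfitting risk mentioned above.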
Hyperparameters: Activation Functions?
• Good starting point: ReLU
• Note that many neural network samples use it: Keras MNIST, TensorFlow CIFAR10 Pruning, etc.
• Note that each activation function has its own strengths and weaknesses. A good quote
on activation functions from CS231N summarizes the choice well:
“What neuron type should I use?” Use the ReLU non-linearity, be careful with your learning rates
and possibly monitor the fraction of “dead” units in a network. If this concerns you, give Leaky
ReLU or Maxout a try. Never use sigmoid. Try tanh, but expect it to work worse than ReLU/Maxout.
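To make the quote concrete, here is a minimal sketch of the activation functions it names, plus one possible way to monitor "dead" ReLU units. The `dead_fraction` helper, its definition of "dead" (zero output for every example in a batch) and the sample batch are assumptions for illustration.

```python
import math

def relu(x):
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    # alpha is a small slope for negative inputs; 0.01 is a common choice
    return x if x > 0 else alpha * x

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dead_fraction(pre_activations):
    """Fraction of ReLU units whose output is zero for every example in the batch.

    pre_activations[i][j] is the pre-activation of unit j for example i.
    """
    n_units = len(pre_activations[0])
    dead = sum(
        all(row[j] <= 0 for row in pre_activations) for j in range(n_units)
    )
    return dead / n_units

batch = [[0.5, -1.0, -0.2], [1.5, -0.3, 0.4]]  # hypothetical pre-activations
print([relu(v) for v in batch[0]], dead_fraction(batch))
print(sigmoid(0.0), math.tanh(0.0), leaky_relu(-2.0))
```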
Neurons … Activate!
DEMO
Hyperparameters
Learning algorithm
• What’s the best value of the learning rate?
• How quickly to decay the learning rate? Momentum?
• What type of loss function should I use?
• What batch size?
• How many iterations are enough?
Training Algorithms
Cost function
Source: https://bit.ly/2IoAGzL
For this linear regression example, to determine the best $p$ (slope of the line) for
$y = x \cdot p$, we can calculate a cost function, such as Mean Square Error, Mean absolute error,
Mean bias error, SVM Loss, etc.
For this example, we’ll use the sum of squared absolute differences:
$$\text{cost} = \sum |t - y|^2$$
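The cost above can be evaluated directly for a few candidate slopes. The data points below are hypothetical (generated with a slope of 2); only the model y = x · p and the sum of squared differences come from the slide.

```python
def predict(x, p):
    return x * p  # y = x . p

def cost(xs, ts, p):
    """Sum of squared differences between targets t and predictions y."""
    return sum((t - predict(x, p)) ** 2 for x, t in zip(xs, ts))

xs = [1.0, 2.0, 3.0]  # hypothetical inputs
ts = [2.0, 4.0, 6.0]  # hypothetical targets, generated with p = 2

for p in (0.0, 1.0, 2.0, 3.0):
    print(p, cost(xs, ts, p))
```

The cost is lowest at the slope that generated the data and grows on either side, which is the bowl shape gradient descent walks down.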
Gradient Descent Optimization
Source: https://bit.ly/2IoAGzL
Small Learning Rate
Source: https://bit.ly/2IoAGzL
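Simplified Two-Layer ANN
[Figure: two inputs (1, 1), three hidden units and one output. Input-to-hidden weights: 0.8, 0.2, 0.7 from the first input and 0.6, 0.9, 0.1 from the second. Hidden-to-output weights: 0.2, 0.8, 0.5.]
$h_1 = \sigma(1 \times 0.8 + 1 \times 0.6) = 0.80$
$h_2 = \sigma(1 \times 0.2 + 1 \times 0.9) = 0.75$
$h_3 = \sigma(1 \times 0.7 + 1 \times 0.1) = 0.69$
$\text{out} = \sigma(0.2 \times 0.8 + 0.8 \times 0.75 + 0.5 \times 0.69) = \sigma(1.105) = 0.75$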
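The forward pass of this small network can be reproduced in a few lines. The weights and inputs are the ones on the slide; the variable names and the list layout are assumptions made for this sketch. It keeps full precision for the hidden values, whereas the slide plugs in the rounded ones, so the pre-activation of the output differs slightly from 1.105 while the rounded result is the same.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

inputs = [1.0, 1.0]
# w_hidden[j] holds the weights from both inputs into hidden unit j
w_hidden = [[0.8, 0.6], [0.2, 0.9], [0.7, 0.1]]
w_out = [0.2, 0.8, 0.5]

hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs))) for ws in w_hidden]
out = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))

print([round(h, 2) for h in hidden], round(out, 2))
```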
Backpropagation
[Figure: network with Input, Hidden and Output layers; values shown: 0.8, 0.2, 0.75, 0.85, 0.10]
• Backpropagation: calculates the gradient of the cost function in a neural network
• Used by the gradient descent optimization algorithm to adjust the weights of neurons
• Also known as backward propagation of errors, as the error is calculated and distributed back through the layers of the network
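As a rough sketch of how the error is distributed back, the code below runs one backpropagation step on the simplified two-layer network from the earlier slide. The target value, the learning rate and the squared-error loss (with a factor of 0.5) are assumptions for illustration; the slides do not state them.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w_hidden]
    out = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, out

def loss(x, target, w_hidden, w_out):
    _, out = forward(x, w_hidden, w_out)
    return 0.5 * (target - out) ** 2

def backward(x, target, w_hidden, w_out):
    """Return gradients of the loss w.r.t. the hidden and output weights."""
    hidden, out = forward(x, w_hidden, w_out)
    # Error signal at the output unit (derivative w.r.t. its pre-activation)
    delta_out = (out - target) * out * (1.0 - out)
    grad_out = [delta_out * h for h in hidden]
    # Distribute the error back through the output weights to each hidden unit
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    grad_hidden = [[d * xi for xi in x] for d in delta_hidden]
    return grad_hidden, grad_out

x = [1.0, 1.0]
w_hidden = [[0.8, 0.6], [0.2, 0.9], [0.7, 0.1]]
w_out = [0.2, 0.8, 0.5]
target = 1.0  # hypothetical target, not given on the slides
lr = 0.5      # hypothetical learning rate

grad_hidden, grad_out = backward(x, target, w_hidden, w_out)
new_w_out = [w - lr * g for w, g in zip(w_out, grad_out)]
new_w_hidden = [
    [w - lr * g for w, g in zip(ws, gs)] for ws, gs in zip(w_hidden, grad_hidden)
]

print(loss(x, target, w_hidden, w_out), loss(x, target, new_w_hidden, new_w_out))
```

The second printed loss is lower than the first: one gradient descent update using the backpropagated gradients moves the output closer to the target.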
Sigmoid function (continued)
Output is not zero-centered: if all the values coming into a neuron are positive, then during
backpropagation the gradients on its weights will be all positive or all negative, creating
zig-zagging dynamics in the weight updates.
Source: https://bit.ly/2IoAGzL
Learning Rate Callouts
• Too small: it may take too long to reach the minimum
• Too large: it may skip over the minimum altogether
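Both callouts can be seen on the linear regression example y = x · p. This is a minimal sketch with hypothetical data (true slope 2) and hypothetical learning rates chosen to show the three regimes.

```python
def gradient_descent(xs, ts, lr, steps, p=0.0):
    """Minimise sum((t - x*p)^2) over p with plain gradient descent."""
    for _ in range(steps):
        grad = sum(-2.0 * x * (t - x * p) for x, t in zip(xs, ts))
        p -= lr * grad
    return p

xs = [1.0, 2.0, 3.0]  # hypothetical data generated with p = 2
ts = [2.0, 4.0, 6.0]

for lr in (0.001, 0.03, 0.08):
    print(lr, gradient_descent(xs, ts, lr, steps=20))
```

With these values the smallest rate is still far from p = 2 after 20 steps, the middle one has converged, and the largest overshoots further on every step and diverges.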
Optimization
Optimization Overview
• After backpropagation, the parameters are updated based on the gradients
calculated
• There are several approaches in this area of active research; we will focus on:
• Stochastic Gradient Descent
• Momentum, NAG
• Per-parameter adaptive learning rate methods
Stochastic Gradient Descent
• (Batch) Gradient Descent is computed on the full dataset (not efficient for large
scale models and datasets).
• Stochastic Gradient Descent (SGD) instead updates on a single example or a small batch at a time
• It often converges faster because it performs updates more frequently
• But due to the frequent updates, this may complicate convergence to the exact
minimum
• For more information, refer to:
• Andrew Ng’s 2. Stochastic Gradient (https://goo.gl/bNrJbx)
• Types of Optimization Algorithms used in Neural Networks and Ways to
Optimize Gradient Descent (https://goo.gl/tB2e7S)
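A minimal sketch of mini-batch SGD on the y = x · p example follows. The data, batch size, learning rate and epoch count are hypothetical; the point is only that the parameter is updated after every mini-batch instead of once per pass over the full dataset.

```python
import random

def sgd(xs, ts, lr=0.01, batch_size=2, epochs=50, seed=0):
    """Fit y = x * p with mini-batch stochastic gradient descent."""
    rng = random.Random(seed)
    data = list(zip(xs, ts))
    p = 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit examples in a new order each epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the mean squared error on this mini-batch only
            grad = sum(-2.0 * x * (t - x * p) for x, t in batch) / len(batch)
            p -= lr * grad  # one update per mini-batch
    return p

xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]  # hypothetical data generated with p = 2
ts = [2.0 * x for x in xs]

print(sgd(xs, ts))
```

Because this toy data has no noise, the estimate settles at the true slope; with noisy data the frequent updates would keep it jittering around the minimum, as the last bullet warns.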
Gradient Descent
Source: https://goo.gl/VUX2ZS
Momentum and NAG
• Obtain faster convergence by helping the parameter vector build up velocity
• i.e. use the momentum of the gradient to converge faster
• Nesterov Accelerated Gradient (NAG): optimized version of Momentum
• Typically works better in practice than Momentum
Source: http://cs231n.github.io/neural-networks-3
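The two update rules can be sketched side by side in the velocity form used in the CS231n notes. The 1-D objective, learning rate, momentum coefficient `mu` and step count below are hypothetical choices for illustration.

```python
def grad(p):
    # Gradient of the hypothetical 1-D objective f(p) = (p - 3)^2
    return 2.0 * (p - 3.0)

def momentum_step(p, v, lr=0.1, mu=0.9):
    v = mu * v - lr * grad(p)  # velocity accumulates past gradients
    return p + v, v

def nesterov_step(p, v, lr=0.1, mu=0.9):
    lookahead = p + mu * v     # evaluate the gradient where momentum is heading
    v = mu * v - lr * grad(lookahead)
    return p + v, v

def run(step, steps=300):
    p, v = 0.0, 0.0
    for _ in range(steps):
        p, v = step(p, v)
    return p

print(run(momentum_step), run(nesterov_step))
```

The only difference is where the gradient is evaluated: NAG looks ahead to the position the velocity is about to carry the parameter to, which lets it correct course earlier.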
Annealing the learning rate
• i.e. slow down the learning rate to prevent it from bouncing around too much
• Referred to as the decay parameter (i.e., how much the learning rate decays over each
update) to reduce kinetic energy
• Note, this is different from rho (i.e. the exponentially weighted average or
exponentially weighted decay of past gradients), which smooths the descent path
trajectory
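The distinction between the two knobs can be sketched as follows. The time-based schedule shown is one common choice, not necessarily the one used in the talk, and the function names and sample values are assumptions.

```python
def decayed_lr(initial_lr, decay, step):
    """Time-based decay: the learning rate shrinks a little with every update."""
    return initial_lr / (1.0 + decay * step)

def ewma(values, rho):
    """Exponentially weighted average, the role rho plays in smoothing gradients."""
    avg, out = 0.0, []
    for v in values:
        avg = rho * avg + (1.0 - rho) * v
        out.append(avg)
    return out

print([decayed_lr(0.1, decay=0.01, step=s) for s in (0, 100, 1000)])
print(ewma([1.0, -1.0, 1.0, -1.0], rho=0.9))
```

`decay` changes how big each step is over time, while `rho` changes how much of the gradient history is blended into the direction of each step.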
Per-parameter adaptive learning rate methods
• Adaptively tune learning rates at the parameter level
• Popular methods include:
• Adaptive Gradient Algorithm (AdaGrad) improves performance on problems with sparse
gradients (e.g. natural language and computer vision problems).
• Root Mean Square Propagation (RMSProp) maintains per-parameter learning rates based
on the average of recent magnitudes of the gradients for the weight (e.g. how quickly it is
changing). This means the algorithm does well on online and non-stationary problems
(e.g. noisy).
• AdaDelta: a per-dimension learning rate method for gradient descent that has minimal
computational overhead, requires no manual tuning, and is quite robust
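A minimal sketch of the AdaGrad and RMSProp updates shows what "per-parameter" means in practice. The two-parameter objective and the hyperparameter defaults are hypothetical and chosen only to make the effect visible.

```python
import math

def adagrad_step(params, grads, cache, lr=0.1, eps=1e-8):
    """AdaGrad: divide each step by the root of all squared gradients seen so far."""
    for i, g in enumerate(grads):
        cache[i] += g * g
        params[i] -= lr * g / (math.sqrt(cache[i]) + eps)

def rmsprop_step(params, grads, cache, lr=0.01, rho=0.9, eps=1e-8):
    """RMSProp: same idea, but with a decaying average of recent squared gradients."""
    for i, g in enumerate(grads):
        cache[i] = rho * cache[i] + (1.0 - rho) * g * g
        params[i] -= lr * g / (math.sqrt(cache[i]) + eps)

def gradients(params):
    # Hypothetical objective f(a, b) = a^2 + 100 * b^2, badly scaled on purpose
    a, b = params
    return [2.0 * a, 200.0 * b]

params, cache = [1.0, 1.0], [0.0, 0.0]
adagrad_step(params, gradients(params), cache)
print(params)  # both parameters move by about the same amount despite very different gradients
```

Plain gradient descent would move the second parameter 100 times further than the first on this objective; dividing by the per-parameter gradient magnitude evens the steps out.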
Which Optimizer?
“In practice Adam is currently recommended as the default algorithm to use, and often works
slightly better than RMSProp. However, it is often also worth trying SGD+Nesterov Momentum as
an alternative.”
Andrej Karpathy, et al, CS231n
Source: https://goo.gl/2da4WY
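For reference, the Adam update combines a momentum-style running mean of gradients with an RMSProp-style running mean of squared gradients. The sketch below follows the update rule from the Adam paper for a single parameter; the objective, learning rate and step count are hypothetical.

```python
import math

def adam_minimize(grad_fn, p, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(p)
        m = beta1 * m + (1.0 - beta1) * g      # running mean of gradients
        v = beta2 * v + (1.0 - beta2) * g * g  # running mean of squared gradients
        m_hat = m / (1.0 - beta1 ** t)         # bias correction for the zero start
        v_hat = v / (1.0 - beta2 ** t)
        p -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return p

# Hypothetical objective f(p) = (p - 3)^2 with gradient 2 * (p - 3)
print(adam_minimize(lambda p: 2.0 * (p - 3.0), p=0.0))
```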
Comparison of Adam to Other Optimization Algorithms Training a Multilayer Perceptron

Taken from Adam: A Method for Stochastic Optimization, 2015.
Optimization on loss surface contours
Adaptive algorithms converge quickly and find the right direction for the parameters.
In comparison, SGD is slow. Momentum-based methods overshoot.
Source: http://cs231n.github.io/neural-networks-3/#hyper
Image credit: Alec Radford
Optimization on saddle point
Notice how SGD gets stuck near the top.
Meanwhile, adaptive techniques optimize the fastest.
Source: http://cs231n.github.io/neural-networks-3/#hyper
Image credit: Alec Radford
Good References
• Suki Lau's Learning Rate Schedules and Adaptive Learning Rate Methods for
Deep Learning
• CS231n Convolutional Neural Networks for Visual Recognition
• Fundamentals of Deep Learning
• ADADELTA: An Adaptive Learning Rate Method
• Gentle Introduction to the Adam Optimization Algorithm for Deep Learning
Convolutional Networks
Convolutional Neural Networks
• Similar to Artificial Neural Networks but CNNs (or ConvNets) make the explicit
assumption that the inputs are images
• Regular neural networks do not scale well to full images
• E.g. CIFAR-10 images are 32x32x3 (32 width, 32 height, 3 color channels), so a single
fully connected neuron needs 32*32*3 = 3072 weights, which is somewhat manageable
• A larger image of 200x200x3 needs 120,000 weights per neuron
• CNNs have neurons arranged in 3D: width, height, depth.
• Neurons in a layer will only be connected to a small region of the layer before
it, i.e. NOT all of the neurons in a fully-connected manner.
• Final output layer for CIFAR-10 is 1x1x10 as we will reduce the full image into
a single vector of class scores, arranged along the depth dimension
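The scaling argument above can be checked with a little arithmetic. The helper names and the 5x5, 32-filter convolutional layer used for comparison are assumptions for illustration.

```python
def fc_weights_per_neuron(width, height, channels):
    """Weights needed by ONE fully connected neuron that sees the whole image."""
    return width * height * channels

def conv_layer_params(filter_size, in_channels, n_filters):
    """Parameters of a conv layer: each filter covers a small region, plus one bias."""
    return (filter_size * filter_size * in_channels + 1) * n_filters

print(fc_weights_per_neuron(32, 32, 3))    # CIFAR-10 image
print(fc_weights_per_neuron(200, 200, 3))  # larger image
print(conv_layer_params(5, 3, 32))         # hypothetical 5x5 conv layer with 32 filters
```

The fully connected count grows with the image size, while the convolutional count depends only on the filter size, input depth and number of filters.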
CNNs / ConvNets
[Figure: a regular 3-layer neural network compared with a ConvNet]
• ConvNet arranges neurons in 3 dimensions
• 3D input results in 3D output
Source: https://cs231n.github.io/convolutional-networks/
Convolutional Neural Networks
[Figure: CNN architecture. Feature extraction (28 x 28, 28 x 28, 14 x 14): convolution with 32 filters, convolution with 64 filters, subsampling with stride (2,2), dropout. Classification: fully connected, dropout, outputs 0 … 9.]
Input
Pixel values of a 32x32x3 image: 32 width,
32 height, 3 color channels (RGB)
Convolutions
Compute the output of neurons connected to a small local region of the input
(a dot product between their weights and that region). If we use 32 filters,
then the output is 28x28x32 (using 5x5 filters)
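The 28x28x32 figure follows from the usual output-size formula for a convolution. The sketch below assumes a stride of 1 and no padding, which the slide does not state explicitly but which is what turns a 32-pixel side into 28 with a 5x5 filter.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Spatial output size of a convolution: (W - F + 2P) / S + 1."""
    span = input_size - filter_size + 2 * padding
    if span < 0 or span % stride != 0:
        raise ValueError("filter does not tile the input with this stride/padding")
    return span // stride + 1

side = conv_output_size(32, 5)  # 32x32 input, 5x5 filter, stride 1, no padding
print((side, side, 32))         # 32 filters give a depth of 32
```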
Pooling
Perform a downsampling operation along the spatial dimensions (w, h),
resulting in a reduced volume, e.g. 28x28x32 becomes 14x14x32.
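As an illustration of downsampling along the spatial dimensions, here is a minimal 2x2 max pooling sketch with stride 2. Max pooling is an assumption (the slides only say pooling or subsampling), and the 4x4 input is hypothetical.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 over a 2-D list with even height and width."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            row.append(max(
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            ))
        pooled.append(row)
    return pooled

fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]  # hypothetical 4x4 activations for one depth slice
print(max_pool_2x2(fmap))
```

Pooling is applied to each depth slice independently, so width and height are halved while the depth stays the same.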
Fully Connected
Neurons in a fully connected layer
have full connections to all
activations in the previous layer, as
seen in regular Neural Networks.
https://cs.stanford.edu/people/karpathy/convnetjs/demo/mnist.html
ConvNetJS MNIST Demo
Neurons … Activate!
DEMO
I’d like to thank…
Great References
• Andrej Karpathy’s ConvNetJS MNIST Demo
• What is back propagation in neural networks?
• CS231n: Convolutional Neural Networks for Visual Recognition
• Syllabus and Slides | Course Notes | YouTube
• With particular focus on CS231n: Lecture 7: Convolutional Neural Networks
• Neural Networks and Deep Learning
• TensorFlow
Great References (continued)
• Deep Visualization Toolbox
• Back Propagation with TensorFlow
• TensorFrames: Google TensorFlow with Apache Spark
• Integrating deep learning libraries with Apache Spark
• Build, Scale, and Deploy Deep Learning Pipelines with Ease
Attribution
Tomek Drabas
Brooke Wenig
Timothee Hunter
Cyrielle Simeone
Q&A
What’s next?
Applying your Convolutional Neural Network
October 25, 2018 | 10:00 PDT
https://dbricks.co/2O2c4BZ
State Of The Art Deep Learning On Apache Spark™
October 31, 2018 | 09:00 PDT
https://dbricks.co/2NqogIp

 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Kürzlich hochgeladen

Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangaloreamitlee9823
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsJoseMangaJr1
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...amitlee9823
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...amitlee9823
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Standamitlee9823
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...amitlee9823
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...amitlee9823
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 

Kürzlich hochgeladen (20)

Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
Call Girls Bannerghatta Road Just Call 👗 7737669865 👗 Top Class Call Girl Ser...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Thane West Call On 9920725232 With Body to body massage...
 
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
Vip Mumbai Call Girls Marol Naka Call On 9920725232 With Body to body massage...
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 

Training Neural Networks

  • 2. VISION: Accelerate innovation by unifying data science, engineering and business. WHO WE ARE: • Founded by the original creators of Apache Spark • Contributes 75% of the open source code, 10x more than any other company • Trained 100k+ Spark users on the Databricks platform. PRODUCT: Unified Analytics Platform powered by Apache Spark™
  • 3. About our speaker: Denny Lee, Technical Product Marketing Manager. Former: • Senior Director of Data Sciences Engineering at SAP Concur • Principal Program Manager at Microsoft • Azure Cosmos DB Engineering Spark and Graph Initiatives • Isotope Incubation Team (currently known as HDInsight) • Bing’s Audience Insights Team • Yahoo!’s 24TB Analysis Services cube
  • 4. Deep Learning Fundamentals Series. This is a three-part series: • Introduction to Neural Networks • Training Neural Networks • Applying your Convolutional Neural Network. This series will make use of Keras (TensorFlow backend), but as it is a fundamentals series, we are focusing primarily on the concepts.
  • 5. Previous Session: Introduction to Neural Networks • What is Deep Learning? • What can Deep Learning do for you? • What are artificial neural networks? • Let’s start with a perceptron… • Understanding the effect of activation functions
  • 6. Current Session: Training Neural Networks • Tuning training • Training Algorithms • Optimization (including Adam) • Convolutional Neural Networks
  • 7. Upcoming Session: Applying Neural Networks • Diving further into CNNs • CNN Architectures • Convolutions at Work!
  • 8. Convolutional Neural Networks [Figure: CNN architecture diagram. Labels: 28 x 28, 28 x 28, 14 x 14; Convolution (32 filters); Convolution (64 filters); Subsampling, stride (2,2); Dropout; Fully Connected; Dropout; outputs 0, 1, … 8, 9. Stages: Feature Extraction, Classification.]
  • 10. Hyperparameters • Network • How many layers? • How many neurons in each layer? • What activation functions to use? • Learning algorithm • What’s the best value of the learning rate? • How quickly to decay the learning rate? Momentum? • What type of loss function should I use? • What batch size? • How many iterations are enough?
  • 14. Hyperparameters: Network. Generally, the more layers and the more units in each layer: • The greater the capacity of the artificial neural network • The greater the risk of overfitting when your goal is to build a generalized model. From a practical perspective, a good starting point is: • The number of input units equals the dimension of the features • The number of output units equals the number of classes (e.g. in the MNIST dataset there are 10 possible values representing the digits 0…9, hence there are 10 output units) • Start with one hidden layer that has 2x the number of input units • A good reference is Andrew Ng’s Coursera Machine Learning course.
  • 15. Hyperparameters: Activation Functions? • Good starting point: ReLU • Note that many neural network samples use it: Keras MNIST, TensorFlow CIFAR10 Pruning, etc. • Note that each activation function has its own strengths and weaknesses. A quote on activation functions from CS231n summarizes the choice well: “What neuron type should I use?” Use the ReLU non-linearity, be careful with your learning rates and possibly monitor the fraction of “dead” units in a network. If this concerns you, give Leaky ReLU or Maxout a try. Never use sigmoid. Try tanh, but expect it to work worse than ReLU/Maxout.
  • 17. Hyperparameters: Learning algorithm • What’s the best value of the learning rate? • How quickly to decay the learning rate? Momentum? • What type of loss function should I use? • What batch size? • How many iterations are enough?
  • 19. Cost function. Source: https://bit.ly/2IoAGzL For this linear regression example, to determine the best $p$ (slope of the line) for $y = x \cdot p$, we can calculate a cost function such as mean squared error, mean absolute error, mean bias error, SVM loss, etc. For this example, we’ll use the sum of squared absolute differences, with $t$ the target value: $\text{cost} = \sum |t - y|^2$
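As a minimal sketch of the cost calculation on this slide: the function names and the sample points `xs` and `ts` are assumptions made up for illustration, not values from the deck.

```python
def predict(x, p):
    """Linear model from the slide: y = x * p."""
    return x * p


def cost(p, xs, ts):
    """Sum of squared differences between targets t and predictions y."""
    return sum((t - predict(x, p)) ** 2 for x, t in zip(xs, ts))


# Hypothetical data generated by a true slope of 2 (an assumption).
xs = [1.0, 2.0, 3.0]
ts = [2.0, 4.0, 6.0]

cost_at_true_slope = cost(2.0, xs, ts)   # predictions match the targets exactly
cost_at_wrong_slope = cost(1.0, xs, ts)  # every prediction is too small
```

The cost is zero only at the slope that reproduces every target, which is what gradient descent searches for.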
  • 20. Gradient Descent Optimization Source: https://bit.ly/2IoAGzL
  • 21. Small Learning Rate Source: https://bit.ly/2IoAGzL
  • 22. Small Learning Rate Source: https://bit.ly/2IoAGzL
  • 23. Small Learning Rate Source: https://bit.ly/2IoAGzL
  • 24. Small Learning Rate Source: https://bit.ly/2IoAGzL
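  • 25. Simplified Two-Layer ANN [Figure: inputs 1 and 1; weights 0.8, 0.2, 0.7 and 0.6, 0.9, 0.1; hidden activations 0.80, 0.75, 0.69] $h_1 = \sigma(1 \times 0.8 + 1 \times 0.6) = 0.80$, $h_2 = \sigma(1 \times 0.2 + 1 \times 0.9) = 0.75$, $h_3 = \sigma(1 \times 0.7 + 1 \times 0.1) = 0.69$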
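The hidden activations on this slide can be reproduced with a few lines of Python. This is a sketch that assumes σ is the logistic sigmoid and that the weights are assigned to the inputs as the three formulas imply; the variable names are illustrative.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


inputs = [1.0, 1.0]
# weights[i][j] connects input i to hidden unit j (values read from the slide).
weights = [[0.8, 0.2, 0.7],
           [0.6, 0.9, 0.1]]

# Each hidden unit applies the sigmoid to the weighted sum of its inputs.
hidden = [
    sigmoid(sum(inputs[i] * weights[i][j] for i in range(len(inputs))))
    for j in range(3)
]
rounded = [round(h, 2) for h in hidden]
```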
  • 28. Backpropagation [Figure: network with Input, Hidden and Output layers; values 0.85 and 0.10 shown] • Backpropagation: calculates the gradient of the cost function in a neural network • Used by the gradient descent optimization algorithm to adjust the weights of neurons • Also known as backward propagation of errors, as the error is calculated and distributed back through the network layers
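To make the idea of distributing the error backwards concrete, here is a small sketch of backpropagation for a network with one sigmoid hidden layer and one sigmoid output. The half squared error cost, the output weights `w_out` and the target `t` are assumptions chosen for illustration; they do not come from the slides.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Return the hidden activations and the network output."""
    hidden = [sigmoid(sum(x[i] * w_hidden[i][j] for i in range(len(x))))
              for j in range(len(w_out))]
    output = sigmoid(sum(h * w for h, w in zip(hidden, w_out)))
    return hidden, output


def cost(x, t, w_hidden, w_out):
    """Half squared error for a single example (an assumed cost)."""
    _, output = forward(x, w_hidden, w_out)
    return 0.5 * (t - output) ** 2


def backprop(x, t, w_hidden, w_out):
    """Gradients of the cost for both weight layers, via the chain rule."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output unit (derivative w.r.t. its pre-activation).
    delta_out = (output - t) * output * (1.0 - output)
    grad_out = [delta_out * h for h in hidden]
    # Distribute the error back to each hidden unit.
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    grad_hidden = [[x[i] * delta_hidden[j] for j in range(len(hidden))]
                   for i in range(len(x))]
    return grad_hidden, grad_out


x, t = [1.0, 1.0], 0.0                         # hypothetical input and target
w_hidden = [[0.8, 0.2, 0.7], [0.6, 0.9, 0.1]]  # hidden weights from slide 25
w_out = [0.3, 0.5, 0.9]                        # hypothetical output weights
grad_hidden, grad_out = backprop(x, t, w_hidden, w_out)
```

A gradient descent step would then subtract the learning rate times each gradient from the matching weight.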
  • 29. Sigmoid function (continued). Output is not zero-centered: if all the values flowing into a neuron are positive, then during backpropagation the gradients on its weights will be either all positive or all negative, creating zig-zagging dynamics in the gradient descent updates. Source: https://bit.ly/2IoAGzL
  • 30. Learning Rate Callouts • Too small: it may take too long to reach the minimum • Too large: it may skip over the minimum altogether
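Both callouts can be seen on the one-parameter cost from the linear regression example. The sketch below assumes the same made-up data as before (best slope 2) and three illustrative learning rates; none of these numbers come from the deck.

```python
def gradient(p, xs, ts):
    """Derivative of sum((t - x*p)**2) with respect to p."""
    return sum(-2.0 * x * (t - x * p) for x, t in zip(xs, ts))


def gradient_descent(learning_rate, steps, xs, ts, p=0.0):
    """Repeatedly step against the gradient, starting from slope p."""
    for _ in range(steps):
        p -= learning_rate * gradient(p, xs, ts)
    return p


xs = [1.0, 2.0, 3.0]
ts = [2.0, 4.0, 6.0]  # hypothetical data; the best slope is 2

p_small = gradient_descent(0.001, 20, xs, ts)  # still far from 2 after 20 steps
p_good = gradient_descent(0.03, 20, xs, ts)    # very close to 2
p_large = gradient_descent(0.1, 20, xs, ts)    # each step overshoots further
```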
  • 32. Optimization Overview • After backpropagation, the parameters are updated based on the gradients calculated • There are several approaches in this area of active research; we will focus on: • Stochastic Gradient Descent • Momentum, NAG • Per-parameter adaptive learning rate methods
  • 33. Stochastic Gradient Descent • (Batch) Gradient Descent is computed on the full dataset (not efficient for large-scale models and datasets) • Stochastic Gradient Descent (SGD) instead updates the parameters after each training example (or small batch), so it often converges faster because it performs updates more frequently • But because of these frequent, noisier updates, converging to the exact minimum may be harder • For more information, refer to: • Andrew Ng’s 2. Stochastic Gradient (https://goo.gl/bNrJbx) • Types of Optimization Algorithms used in Neural Networks and Ways to Optimize Gradient Descent (https://goo.gl/tB2e7S)
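The difference in update frequency can be sketched on the same linear model. The data, the learning rate and the choice to average the batch gradient are all assumptions made for illustration.

```python
import random


def batch_epoch(p, data, lr):
    """One update per pass, using the gradient averaged over the full dataset."""
    grad = sum(-2.0 * x * (t - x * p) for x, t in data) / len(data)
    return p - lr * grad, 1


def sgd_epoch(p, data, lr, rng):
    """One update per training example, visiting examples in random order."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    for x, t in shuffled:
        p -= lr * (-2.0 * x * (t - x * p))
    return p, len(shuffled)


# Hypothetical noise-free data with a true slope of 2.
data = [(float(x), 2.0 * x) for x in range(1, 6)]

p_batch, p_sgd = 0.0, 0.0
batch_updates, sgd_updates = 0, 0
rng = random.Random(0)
for _ in range(5):  # five passes over the data
    p_batch, n = batch_epoch(p_batch, data, lr=0.01)
    batch_updates += n
    p_sgd, n = sgd_epoch(p_sgd, data, lr=0.01, rng=rng)
    sgd_updates += n
```

With the same number of passes, SGD has made five times as many updates and ends closer to the true slope. On noisy data its path would jitter around the minimum rather than settle exactly.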
  • 35. Momentum and NAG • Obtain faster convergence by helping the parameter vector build up velocity • i.e. use the momentum of the gradient to converge faster • Nesterov Accelerated Gradient (NAG): optimized version of Momentum • Typically works better in practice than Momentum. Source: http://cs231n.github.io/neural-networks-3
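A sketch of the two update rules, following the form given in the CS231n notes cited on this slide. The toy cost f(x) = x², the starting point and the hyperparameter values are assumptions chosen for illustration.

```python
def grad(x):
    """Gradient of the toy cost f(x) = x**2 (an assumption for illustration)."""
    return 2.0 * x


def plain_step(x, lr):
    """Vanilla gradient descent update, for comparison."""
    return x - lr * grad(x)


def momentum_step(x, v, lr, mu):
    v = mu * v - lr * grad(x)        # velocity accumulates past gradients
    return x + v, v


def nesterov_step(x, v, lr, mu):
    x_ahead = x + mu * v             # look ahead along the current velocity
    v = mu * v - lr * grad(x_ahead)  # use the gradient at the look-ahead point
    return x + v, v


lr, mu, steps = 0.01, 0.9, 50
x_plain = x_mom = x_nag = 5.0
v_mom = v_nag = 0.0
for _ in range(steps):
    x_plain = plain_step(x_plain, lr)
    x_mom, v_mom = momentum_step(x_mom, v_mom, lr, mu)
    x_nag, v_nag = nesterov_step(x_nag, v_nag, lr, mu)
```

After the same number of steps, both velocity-based variants end much closer to the minimum at 0 than the plain update does.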
  • 36. Annealing the learning rate • i.e. gradually reduce the learning rate to prevent the updates from bouncing around too much • Referred to as the decay parameter (i.e. the learning rate decay over each update), which reduces kinetic energy • Note, this is different from rho (i.e. the exponentially weighted average or exponentially weighted decay of past gradients), which smooths the descent path trajectory
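One common annealing schedule is time-based decay, where the rate shrinks with the number of updates performed. The sketch below assumes that schedule; the function name and the numbers are illustrative, and other schedules (step decay, exponential decay) are equally valid choices.

```python
def decayed_learning_rate(initial_lr, decay, step):
    """Time-based decay: the rate shrinks as the update count grows."""
    return initial_lr / (1.0 + decay * step)


# Learning rate after 0, 100 and 900 updates, with hypothetical settings.
schedule = [decayed_learning_rate(0.1, 0.01, step) for step in (0, 100, 900)]
```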
  • 37. Per-parameter adaptive learning rate methods • Adaptively tune learning rates at the parameter level • Popular methods include: • Adaptive Gradient Algorithm (AdaGrad) improves performance on problems with sparse gradients (e.g. natural language and computer vision problems) • Root Mean Square Propagation (RMSProp) maintains per-parameter learning rates based on the average of recent magnitudes of the gradients for the weight (e.g. how quickly it is changing). This means the algorithm does well on online and non-stationary problems (e.g. noisy ones) • AdaDelta: a per-dimension learning rate method for gradient descent that has minimal computational overhead, requires no manual tuning, and is quite robust
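The per-parameter idea can be sketched with the AdaGrad and RMSProp update rules as they are usually written (for example in the CS231n notes). The gradient values and hyperparameters below are assumptions for illustration.

```python
import math


def adagrad_update(params, grads, cache, lr=0.1, eps=1e-8):
    """AdaGrad: scale each step by the root of all accumulated squared gradients."""
    for i, g in enumerate(grads):
        cache[i] += g * g
        params[i] -= lr * g / (math.sqrt(cache[i]) + eps)


def rmsprop_update(params, grads, cache, lr=0.1, decay_rate=0.9, eps=1e-8):
    """RMSProp: same idea, with a moving average of recent squared gradients."""
    for i, g in enumerate(grads):
        cache[i] = decay_rate * cache[i] + (1.0 - decay_rate) * g * g
        params[i] -= lr * g / (math.sqrt(cache[i]) + eps)


# Hypothetical gradients: one steep direction and one shallow direction.
grads = [10.0, 0.1]

adagrad_params, adagrad_cache = [0.0, 0.0], [0.0, 0.0]
adagrad_update(adagrad_params, grads, adagrad_cache)

rmsprop_params, rmsprop_cache = [0.0, 0.0], [0.0, 0.0]
rmsprop_update(rmsprop_params, grads, rmsprop_cache)
```

Although the two gradients differ by a factor of 100, each method moves both parameters by nearly the same amount on this first step, because every parameter's step is normalized by its own gradient history.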
  • 38. Which Optimizer? “In practice Adam is currently recommended as the default algorithm to use, and often works slightly better than RMSProp. However, it is often also worth trying SGD+Nesterov Momentum as an alternative.” Andrej Karpathy, et al, CS231n. Source: https://goo.gl/2da4WY [Figure: Comparison of Adam to Other Optimization Algorithms Training a Multilayer Perceptron. Taken from Adam: A Method for Stochastic Optimization, 2015.]
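For reference, a single-parameter sketch of the Adam update with bias correction, using the default hyperparameters suggested in the Adam paper. The toy cost f(x) = x², the starting point and the larger learning rate used in the loop are assumptions for illustration.

```python
import math


def adam_step(x, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single parameter; t is the 1-based step count."""
    m = beta1 * m + (1.0 - beta1) * g      # moving average of gradients
    v = beta2 * v + (1.0 - beta2) * g * g  # moving average of squared gradients
    m_hat = m / (1.0 - beta1 ** t)         # bias correction for the zero start
    v_hat = v / (1.0 - beta2 ** t)
    x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x, m, v


# Minimize the toy cost f(x) = x**2, whose gradient is 2x.
x, m, v = 1.0, 0.0, 0.0
trajectory = [x]
for t in range(1, 6):
    x, m, v = adam_step(x, 2.0 * x, m, v, t, lr=0.1)
    trajectory.append(x)
```

Adam combines the momentum-style average `m` with the RMSProp-style average `v`, so the first step has a size close to the learning rate regardless of the gradient's scale.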
  • 39. Optimization on loss surface contours. Adaptive algorithms converge quickly and find the right direction for the parameters. In comparison, SGD is slow and momentum-based methods overshoot. Source: http://cs231n.github.io/neural-networks-3/#hyper Image credit: Alec Radford
  • 40. Optimization on a saddle point. Notice how SGD gets stuck near the top, while the adaptive techniques optimize the fastest. Source: http://cs231n.github.io/neural-networks-3/#hyper Image credit: Alec Radford
  • 41. Good References • Suki Lau's Learning Rate Schedules and Adaptive Learning Rate Methods for Deep Learning • CS231n Convolutional Neural Networks for Visual Recognition • Fundamentals of Deep Learning • ADADELTA: An Adaptive Learning Rate Method • Gentle Introduction to the Adam Optimization Algorithm for Deep Learning
  • 43. Convolutional Neural Networks • Similar to regular artificial neural networks, but CNNs (or ConvNets) make the explicit assumption that the inputs are images • Regular neural networks do not scale well to images • E.g. CIFAR-10 images are 32x32x3 (32 wide, 32 high, 3 color channels), so a single fully connected neuron in the first hidden layer needs 32 x 32 x 3 = 3,072 weights, which is somewhat manageable • A larger image of 200x200x3 needs 120,000 weights per neuron • CNNs have neurons arranged in 3D: width, height, depth • Neurons in a layer are connected only to a small region of the layer before it, i.e. NOT to all of its neurons in a fully connected manner • The final output layer for CIFAR-10 is 1x1x10, as we reduce the full image to a single vector of class scores arranged along the depth dimension
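The weight counts on this slide are simple products, sketched here for a single neuron. The function name is illustrative, and the 5x5 local region is an assumption borrowed from the filter size mentioned later in the deck.

```python
def weights_per_neuron(width, height, channels):
    """Weights one neuron needs to see every value in a width x height x channels input."""
    return width * height * channels


cifar10_fully_connected = weights_per_neuron(32, 32, 3)   # sees the whole image
larger_fully_connected = weights_per_neuron(200, 200, 3)  # sees the whole image
local_region = weights_per_neuron(5, 5, 3)                # sees one 5x5 patch only
```

Connecting each neuron to a small patch keeps the weight count fixed no matter how large the image becomes.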
  • 44. CNNs/ConvNets [Figure: a regular 3-layer neural network next to a ConvNet] • A ConvNet arranges its neurons in 3 dimensions • A 3D input results in a 3D output. Source: https://cs231n.github.io/convolutional-networks/
  • 45. Convolutional Neural Networks [Figure: CNN architecture diagram, as on slide 8]
  • 46. Convolutional Neural Networks [Figure: CNN architecture diagram, as on slide 8] Input: pixel values of 32x32x3 (32 wide, 32 high, 3 color channels, RGB)
  • 47. Convolutional Neural Networks [Figure: CNN architecture diagram, as on slide 8] Convolutions: compute the output of neurons that are connected to a small local region of the input, each as a dot product between the neuron's weights and that region. If we use 32 filters, then the output is 28x28x32 (using a 5x5 filter)
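The 28x28x32 figure follows from the usual output-size formula, (W − F + 2P) / S + 1, for input width W, filter size F, padding P and stride S. The sketch below assumes a 32x32 input, stride 1 and no padding, which is consistent with the numbers on this slide; the function name is illustrative.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Spatial size of a convolution output: (W - F + 2P) / S + 1."""
    span = input_size - filter_size + 2 * padding
    if span % stride != 0:
        raise ValueError("filter does not tile the input evenly")
    return span // stride + 1


side = conv_output_size(32, 5)    # 5x5 filter over a 32x32 input
output_volume = (side, side, 32)  # one output slice per filter, 32 filters
```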
  • 48. Convolutional Neural Networks [Figure: CNN architecture diagram, as on slide 8] Pooling: perform a downsampling operation along the spatial dimensions (w, h), resulting in a reduced volume, e.g. 14x14x2.
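A minimal sketch of the downsampling step, assuming 2x2 max pooling with stride 2 (one common choice; the deck's diagram only says "Subsampling, stride (2,2)"). The small feature map is made up for illustration.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 over one 2D feature map (a list of rows)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, cols - 1, 2)]
        for r in range(0, rows - 1, 2)
    ]


# Hypothetical 4x4 feature map; each 2x2 block keeps only its largest value.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 2, 4, 6],
]
pooled = max_pool_2x2(feature_map)
```

Width and height are halved while the depth (number of feature maps) is left unchanged, since each map is pooled independently.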
  • 49. Convolutional Neural Networks [Figure: CNN architecture diagram, as on slide 8] Fully Connected: neurons in a fully connected layer have full connections to all activations in the previous layer, as seen in regular neural networks.
  • 52. I’d like to thank…
  • 53. Great References • Andrej Karpathy’s ConvNetJS MNIST Demo • What is back propagation in neural networks? • CS231n: Convolutional Neural Networks for Visual Recognition • Syllabus and Slides | Course Notes | YouTube • With particular focus on CS231n: Lecture 7: Convolutional Neural Networks • Neural Networks and Deep Learning • TensorFlow
  • 54. Great References • Deep Visualization Toolbox • Back Propagation with TensorFlow • TensorFrames: Google TensorFlow with Apache Spark • Integrating deep learning libraries with Apache Spark • Build, Scale, and Deploy Deep Learning Pipelines with Ease
  • 56. Q&A
  • 57. What’s next? • Applying your Convolutional Neural Network: October 25, 2018 | 10:00 PDT, https://dbricks.co/2O2c4BZ • State Of The Art Deep Learning On Apache Spark™: October 31, 2018 | 09:00 PDT, https://dbricks.co/2NqogIp