Pop-up Loft
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved
Deep Learning with MXNet workshop
Sunil Mallya
Solutions Architect, Deep Learning
smallya@amazon.com
@sunilmallya
Agenda
•  AWS AI Stack
•  Deep Learning motivation and basics
•  Apache MXNet overview
•  MXNet programming model
•  Fine tuning pre-trained models
•  MXNet serverless deployment
Amazon AI
Intelligent Services Powered By Deep Learning
BigDL on AWS
GitHub: github.com/intel-analytics/BigDL
http://software.intel.com/bigdl
§  BigDL, a distributed deep learning framework for Apache Spark*
§  Deploying BigDL on AWS is super easy!
§  Option 1: Install BigDL on Amazon EMR with a bootstrap action
s3://aws-bigdata-blog/artifacts/aws-blog-emr-jupyter/install-jupyter-emr5-latest.sh --bigdl
§  Option 2: Launch the public AMI on EC2 w/Xeon E5 v3 or v4
https://github.com/intel-analytics/BigDL/wiki/Running-on-EC2
https://aws.amazon.com/blogs/ai/running-bigdl-deep-learning-for-apache-spark-on-aws/
Deep Learning basics
Machine Learning 101
Shallow Learning
•  Extract clever features
(preprocessing)
•  Map into feature space
(kernel methods)
•  Set of rules
(decision tree)
•  Combine multiple estimates
(boosting)
Deep Learning
•  Many simple neurons
•  Specialized layers
(images, text, audio, …)
•  Stack layers
(hence deep learning)
•  Optimization is difficult
•  Backpropagation
•  Stochastic gradient descent
Shallow learning: usually simple to learn. Deep learning: better accuracy.
Anatomy of a Deep Learning Model
[Figure: building blocks of a network, each labeled with the MXNet call that creates it]
•  Convolution: mx.sym.Convolution(data, kernel=(5,5), num_filter=20)
•  Pooling: mx.sym.Pooling(data, pool_type="max", kernel=(2,2), stride=(2,2))
   Example 2x2 window [4 2; 2 0]: 4 = Max, 2 = Avg
•  Fully connected: mx.sym.FullyConnected(data, num_hidden=128)
•  Activation: mx.sym.Activation(data, act_type="xxxx"), with act_type one of "relu", "tanh", "sigmoid", "softrelu"
•  Embedding: mx.symbol.Embedding(data, input_dim, output_dim = k)
   Example: cos(w, queen) = cos(w, king) - cos(w, man) + cos(w, woman)
•  LSTM: lstm.lstm_unroll(num_lstm_layer, seq_len, len, num_hidden, num_embed)
•  Output and training: mx.sym.SoftmaxOutput, mx.model.FeedForward, model.fit
Deep Learning Models
•  Data types shown: Image, Video, Speech, Text, Events
•  Applications: Neural Art, Face Search, Image Segmentation, Image Caption, Image Labels, Machine Translation
•  Image Caption example: “People Riding Bikes”
•  Image Labels example: Bicycle, People, Road, Sport
•  Machine Translation example: “People Riding Bikes” → “Οι άνθρωποι ιππασίας ποδήλατα”
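The pooling figure on this slide shows a 2x2 window (4 2 / 2 0) reducing to 4 under max pooling and 2 under average pooling. A minimal sketch of that reduction in plain Python follows; pool2d is a hypothetical helper written for illustration, not MXNet's mx.sym.Pooling.

```python
def pool2d(window, pool_type="max"):
    """Reduce one pooling window (a list of rows) to a single value."""
    values = [v for row in window for v in row]   # flatten the window
    if pool_type == "max":
        return max(values)
    if pool_type == "avg":
        return sum(values) / len(values)
    raise ValueError("unsupported pool_type: " + pool_type)

# The 2x2 window from the slide
window = [[4, 2],
          [2, 0]]
max_out = pool2d(window, "max")
avg_out = pool2d(window, "avg")
print(max_out, avg_out)
```

With kernel=(2,2) and stride=(2,2), a pooling layer applies this reduction to each non-overlapping 2x2 block, halving the height and width of the input.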
Biological Neuron
slide from http://cs231n.stanford.edu/
Neural Network basics: http://cs231n.github.io/neural-networks-1/
Artificial Neuron
output
synaptic
weights
•  Input
Vector of training data x
•  Output
Linear function of inputs
•  Nonlinearity
Transform output into desired range
of values, e.g. for classification we
need probabilities [0, 1]
•  Training
Learn the weights w and bias b
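The four parts of the artificial neuron can be sketched in a few lines of plain Python. The input, weight, and bias values below are illustrative assumptions, and sigmoid is just one choice of nonlinearity that maps the output into (0, 1).

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    """One artificial neuron: linear function of the inputs, then a nonlinearity."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b   # linear part: w . x + b
    return sigmoid(z)                                   # nonlinearity

x = [1.0, 0.0, 1.0]     # input vector (illustrative values)
w = [0.2, -0.1, 0.7]    # weights, learned during training (illustrative values)
b = 0.0                 # bias, also learned during training
p = neuron(x, w, b)
print(round(p, 3))
```

Training then amounts to adjusting w and b so that p moves toward the desired label for each training example.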
Deep Neural Network
hidden layers
Rule of thumb: a good size for a
hidden layer (number of neurons) is
usually between the size of the
input layer and the size of the
output layer
Input layer
output
The “Learning” in Deep Learning
[Figure: an input X with its label is passed through the network weights (0.4, 0.3, 0.2, 0.9, ...) to produce an output X1. When X1 != X, backpropagation (gradient descent) adjusts each weight (0.4 ± 𝛿, 0.3 ± 𝛿, ...) to give new weights.]
Gradient Descent
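As a rough illustration of gradient descent, the sketch below fits a single weight and bias to a few points by repeatedly stepping against the gradient of the mean squared error. The data, learning rate, and epoch count are assumptions chosen for the example, and the gradient is written out by hand rather than obtained by backpropagation through a network.

```python
def train(samples, lr=0.1, epochs=500):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in samples:
            err = (w * x + b) - y          # prediction error on one sample
            grad_w += 2.0 * err * x / n    # d(MSE)/dw
            grad_b += 2.0 * err / n        # d(MSE)/db
        w -= lr * grad_w                   # step against the gradient
        b -= lr * grad_b
    return w, b

# Points generated from y = 2x + 1 (illustrative data)
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train(samples)
print(round(w, 3), round(b, 3))
```

A learning rate that is too large makes the updates overshoot and diverge, while one that is too small makes convergence slow.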
Hierarchical Feature Representation
Neural Net Simulation
http://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html
Apache MXNet
Apache MXNet
•  Programmable: simple syntax, multiple languages
•  Portable: highly efficient models for mobile and IoT
•  High Performance: near linear scaling across hundreds of GPUs
88% efficiency on 256 GPUs
ResNet 1024-layer network is ~4GB
Scaling with MXNet
[Figure: scaling curves for Inception v3, ResNet, and AlexNet against the ideal line; x-axis: No. of GPUs (1, 2, 4, 8, 16, 32, 64, 128, 256); annotated 88% Efficiency]
•  CloudFormation with Deep Learning AMI
•  16x P2.16xlarge, mounted on EFS
•  Inception and ResNet: batch size 32; AlexNet: batch size 512
•  ImageNet, 1.2M images, 1K classes
•  152-layer ResNet, 5.4d on 4x K80s (1.2h per epoch), 0.22 top-1 error
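To make the scaling figures concrete, the arithmetic below converts the quoted 88% efficiency on 256 GPUs into an effective speedup, and relates the 5.4-day training time to the 1.2 hours per epoch. It assumes efficiency is defined as speedup divided by the number of GPUs, and that the two ResNet timings describe the same run; neither is stated on the slide.

```python
def speedup_from_efficiency(efficiency, num_gpus):
    """Effective speedup over one GPU, assuming efficiency = speedup / num_gpus."""
    return efficiency * num_gpus

def epochs_from_duration(days, hours_per_epoch):
    """Number of epochs that fit in a training run of the given length."""
    return days * 24.0 / hours_per_epoch

speedup = speedup_from_efficiency(0.88, 256)   # about 225x instead of the ideal 256x
epochs = epochs_from_duration(5.4, 1.2)        # about 108 epochs
print(round(speedup, 2), round(epochs, 1))
```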
http://bit.ly/deepami
Deep Learning any way you want on AWS
Tool for data scientists and developers
Setting up a DL system takes (install) time & skill
Keep packages up to date and compiled (MXNet, TensorFlow, Caffe, Torch,
Theano, Keras)
Anaconda, Jupyter, Python 2 and 3
NVIDIA Drivers for G2 and P2 instances
Intel MKL Drivers for all other instances (C4, M4, …)
Deep Learning AMIs
MXNet Programming model
Imperative Programming
import numpy as np
a = np.ones(10)
b = np.ones(10) * 2
c = b * a
d = c + 1
Easy to tweak with Python code
PROS
•  Straightforward and flexible.
•  Takes advantage of language-native features (loop, condition, debugger)
CONS
•  Hard to optimize
Examples: NumPy, MATLAB, Torch, …
Declarative Programming
A = Variable('A')
B = Variable('B')
C = B * A
D = C + 1
f = compile(D)
d = f(A=np.ones(10), B=np.ones(10)*2)
[Figure: computation graph in which A and B feed a multiply node, whose result and the constant 1 feed an add node]
C can share memory with D because C is deleted later
PROS
•  More chances for optimization
•  Works across different languages
CONS
•  Less flexible
Examples: TensorFlow, Theano, Caffe
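The Variable and compile calls on this slide are pseudocode. A minimal sketch of the same idea in plain Python is shown below: building the expression only records a graph, and nothing is computed until the compiled function is called with concrete values. Node, Variable, and compile_graph are hypothetical names written for this illustration, and scalars stand in for the slide's NumPy arrays.

```python
class Node:
    """A node in a tiny expression graph; constructing one computes nothing."""
    def __init__(self, op, inputs=(), name=None, value=None):
        self.op, self.inputs, self.name, self.value = op, inputs, name, value

    @staticmethod
    def _wrap(other):
        # Turn plain numbers into constant nodes so they can sit in the graph
        return other if isinstance(other, Node) else Node("const", value=other)

    def __mul__(self, other):
        return Node("mul", (self, Node._wrap(other)))

    def __add__(self, other):
        return Node("add", (self, Node._wrap(other)))

def Variable(name):
    """Placeholder whose value is supplied when the compiled function runs."""
    return Node("var", name=name)

def compile_graph(root):
    """Return a function that evaluates the graph for given variable bindings."""
    def run(node, env):
        if node.op == "var":
            return env[node.name]
        if node.op == "const":
            return node.value
        left, right = (run(i, env) for i in node.inputs)
        return left * right if node.op == "mul" else left + right
    return lambda **env: run(root, env)

A = Variable("A")
B = Variable("B")
C = B * A            # records a multiply node, no arithmetic yet
D = C + 1            # records an add node
f = compile_graph(D)
d = f(A=1.0, B=2.0)  # evaluation happens here
print(d)
```

Because the whole graph is known before execution, a real framework can plan optimizations at compile time, such as the memory sharing between C and D that the slide mentions. This sketch does not attempt any of that.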
IMPERATIVE NDARRAY API
>>> import mxnet as mx
>>> a = mx.nd.zeros((100, 50))
>>> b = mx.nd.ones((100, 50))
>>> c = a + b
>>> c += 1
>>> print(c)
DECLARATIVE SYMBOLIC EXECUTOR
>>> import mxnet as mx
>>> net = mx.symbol.Variable('data')
>>> net = mx.symbol.FullyConnected(data=net, num_hidden=12)
>>> net = mx.symbol.SoftmaxOutput(data=net)
>>> texec = mx.module.Module(net)
>>> texec.forward(data=c)
>>> texec.backward()
NDArray can be set as input to the graph
MXNet: Mixed programming paradigm
Embed symbolic expressions into imperative programming
texec = mx.module.Module(net)
for batch in train_data:
    texec.forward(batch)
    texec.backward()
    for param, grad in zip(texec.get_params(), texec.get_grads()):
        param -= 0.2 * grad
MXNet: Mixed programming paradigm
Neural Nets and Deep Learning Glossary
Training, Validation Set and Overfitting
Best model
Batch, Epoch
Batch:
•  Number of samples propagated through the network at every iteration
•  Helps utilize the GPU compute power
Epoch:
An epoch is a complete pass through all the training data. A neural network
is trained until the error rate is acceptable, and this will often take multiple
passes through the complete data set.
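The relationship between batches and epochs can be sketched as follows: one epoch visits every sample once, split into batches of a fixed size. The helper names and the toy data are assumptions for illustration. The 1.2M-image and batch-size-32 figures come from the scaling slide earlier in the deck.

```python
import math

def iterations_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

def batches(data, batch_size):
    """Yield consecutive batches; the last one may be smaller."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

data = list(range(10))                    # 10 toy samples
epoch_batches = list(batches(data, 4))    # one epoch = one pass over all batches
print(len(epoch_batches), epoch_batches[-1])

# ImageNet-scale example: 1.2M images at batch size 32
print(iterations_per_epoch(1_200_000, 32))
```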
Loss Function
•  Objective function defines what success looks like when
an algorithm learns.
•  It is a measure of the difference between a neural net’s
guess and the ground truth; that is, the error.
•  Error resulting from the loss function is fed into
backpropagation in order to update the weights & biases
•  Common loss functions
•  Cross entropy
•  L1 (linear), L2 (quadratic)
•  Mean square error (MSE)
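The loss functions listed above can be written out directly. The sketch below uses plain Python lists, with predictions and targets chosen purely for illustration; the cross-entropy version assumes the network already outputs class probabilities and takes the index of the true class.

```python
import math

def l1_loss(pred, target):
    """Mean absolute error: grows linearly with the size of the error."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def mse_loss(pred, target):
    """Mean squared error: grows quadratically with the size of the error."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def cross_entropy(probs, true_class):
    """Negative log of the probability assigned to the correct class."""
    return -math.log(probs[true_class])

pred, target = [0.5, 2.0], [1.0, 1.0]
l1 = l1_loss(pred, target)
mse = mse_loss(pred, target)
ce = cross_entropy([0.7, 0.2, 0.1], 0)
print(l1, mse, round(ce, 4))
```

Cross entropy is the usual choice for classification, while L1 and squared-error losses are typical for regression.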
Activation Functions
Adds nonlinearity
ReLU is most
commonly
used today
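For reference, the four act_type names listed earlier in the deck ("relu", "tanh", "sigmoid", "softrelu") correspond to the scalar functions sketched below. These are plain-Python definitions for illustration, not MXNet code; softrelu is written as log(1 + exp(x)), commonly called softplus.

```python
import math

def relu(x):
    """Zero for negative inputs, identity for positive inputs."""
    return max(0.0, x)

def sigmoid(x):
    """Maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Maps any real number into (-1, 1)."""
    return math.tanh(x)

def softrelu(x):
    """Smooth approximation of ReLU: log(1 + exp(x))."""
    return math.log1p(math.exp(x))

for name, fn in [("relu", relu), ("sigmoid", sigmoid),
                 ("tanh", tanh), ("softrelu", softrelu)]:
    print(name, [round(fn(v), 3) for v in (-2.0, 0.0, 2.0)])
```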
Fully Connected Layer
Fully connected layer of a neural
network
If no activation is applied, you can
imagine this to be just a linear
regression on the input attributes.
Multilayer Perceptron (MLP)
Y = WX + b
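The formula Y = WX + b can be spelled out for a single input vector. In this sketch W holds one row of weights per output neuron, and all values are illustrative assumptions. With no activation applied, each output is simply a linear regression on the inputs.

```python
def fully_connected(x, W, b):
    """Compute y = W x + b for one input vector x.

    W has one row per output neuron; b has one bias per output neuron.
    """
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

x = [1.0, 2.0, 3.0]              # 3 input attributes
W = [[0.1, 0.2, 0.3],            # weights of output neuron 0
     [0.0, -1.0, 1.0]]           # weights of output neuron 1
b = [0.5, 0.0]
y = fully_connected(x, W, b)
print([round(v, 3) for v in y])
```

Stacking several such layers with a nonlinearity between them gives a multilayer perceptron.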
Dropout
Srivastava, Nitish, et al. “Dropout: A Simple Way to Prevent Neural Networks from
Overfitting”, JMLR 2014
Learning Rates and SGD Visualization
Source: https://twitter.com/alecrad
Convolutional Neural Network (CNN)
CNN Layers
Convolutional Layer
Pooling Layer
Activation
Fully-Connected Layer
Recurrent Neural Network (RNN) Examples
Image Caption, Sentiment Analysis, Machine Translation, Video Labeling, Image Labeling
MXNet Model Zoo
http://mxnet.io/model_zoo/
MXNet Lambda Deployment
import tempfile
import boto3
import mxnet as mx
import numpy as np
….
bucket = 'smallya-test'
s3 = boto3.resource('s3')
s3_client = boto3.client('s3')
# Outside the handler: this code runs once per container, so the loaded model is reused across invocations
mod = None
with tempfile.NamedTemporaryFile(delete=True) as f_params_file, tempfile.NamedTemporaryFile(delete=True) as f_symbol_file:
    # f_params, f_symbol, and load_model are presumably defined in the elided code above
    s3_client.download_file(bucket, f_params, f_params_file.name)
    f_params_file.flush()
    s3_client.download_file(bucket, f_symbol, f_symbol_file.name)
    f_symbol_file.flush()
    sym, arg_params, aux_params = load_model(f_symbol_file.name, f_params_file.name)
    mod = mx.mod.Module(symbol=sym)
    mod.bind(for_training=False, data_shapes=[('data', (1,3,224,224))])
    mod.set_params(arg_params, aux_params)

def lambda_handler(event, context):
    ….
    labels = predict(url, data_url)
    …
Distributed Deep Learning with MXNet
Thank You
smallya@amazon.com
sunilmallya

MXNet Workshop

  • 1. Pop-up Loft © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Deep Learning with MXNet workshop Sunil Mallya Solutions Architect, Deep Learning smallya@amazon.com @sunilmallya
  • 2. Agenda •  AWS AI Stack •  Deep Learning motivation and basics •  Apache MXNet overview •  MXNet programing model •  Fine tuning pre-trained models •  MXNet serverless deployment
  • 3. Amazon AI Intelligent Services Powered By Deep Learning
  • 4. BigDL on AWS Github: github.com/intel-analytics/BigDL http://software.intel.com/bigdl §  BigDL, A Distributed Deep learning framework for Apache Spark* §  Deploying BigDL on AWS is super easy! §  Option 1: Install BigDL on Amazon EMR with Bootstrap action s3://aws-bigdata-blog/artifacts/aws-blog-emr-jupyter/install-jupyter-emr5- latest.sh –bigdl §  Option 2: Launch Public AMI on EC2 w/Xeon E5 v3 or v4 https://github.com/intel-analytics/BigDL/wiki/Running-on-EC2 https://aws.amazon.com/blogs/ai/ running-bigdl-deep-learning-for-apache- spark-on-aws/
  • 5. Pop-up Loft © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Deep Learning basics
  • 6. Machine Learning 101 Shallow Learning •  Extract clever features (preprocessing) •  Map into feature space (kernel methods) •  Set of rules (decision tree) •  Combine multiple estimates (boosting) Deep Learning •  Many simple neurons •  Specialized layers (images, text, audio, …) •  Stack layers (hence deep learning) •  Optimization is difficult •  Backpropagation •  Stochastic gradient descent usually simple to learn better accuracy
  • 7. 0.2 -0.1 ... 0.7 Input Output 1 1 1 1 0 1 0 0 0 3 mx.sym.Pooling(data, pool_type="max", kernel=(2,2), stride=(2,2) lstm.lstm_unroll(num_lstm_layer, seq_len, len, num_hidden, num_embed) 4 2 2 0 4=Max 1 3 ... 4 0.2 -0.1 ... 0.7 mx.sym.FullyConnected(data, num_hidden=128) 2 mx.symbol.Embedding(data, input_dim, output_dim = k) Queen 4 2 2 0 2=Avg Input Weights cos(w, queen) = cos(w, king) - cos(w, man) + cos(w, woman) mx.sym.Activation(data, act_type="xxxx") "relu" "tanh" "sigmoid" "softrelu" Neural Art Face Search Image Segmentation Image Caption “People Riding Bikes” Bicycle, People, Road, Sport Image Labels Image Video Speech Text “People Riding Bikes” Machine Translation “Οι άνθρωποι ιππασίας ποδήλατα” Events mx.model.FeedForward model.fit mx.sym.SoftmaxOutput Anatomy of a Deep Learning Model mx.sym.Convolution(data, kernel=(5,5), num_filter=20) Deep Learning Models
  • 8. Biological Neuron slide from http://cs231n.stanford.edu/ Neural Network basics: http://cs231n.github.io/neural-networks-1/
  • 9. Artificial Neuron output synaptic weights •  Input Vector of training data x •  Output Linear function of inputs •  Nonlinearity Transform output into desired range of values, e.g. for classification we need probabilities [0, 1] •  Training Learn the weights w and bias b
  • 10. Deep Neural Network hidden layers The optimal size of the hidden layer (number of neurons) is usually between the size of the input and size of the output layers Input layer output
  • 11. The “Learning” in Deep Learning 0.4 0.3 0.2 0.9 ... back propogation (gradient descent) X1 != X 0.4 ± 𝛿 0.3 ± 𝛿 new weights new weights 0 1 0 1 1 . . -- X input label ... X1
  • 15. Pop-up Loft © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved Apache MXNet
  • 16. Apache MXNet Programmable Portable High Performance Near linear scaling across hundreds of GPUs Highly efficient models for mobile and IoT Simple syntax, multiple languages 88% efficiency on 256 GPUs Resnet 1024 layer network is ~4GB
  • 17. Ideal Inception v3 Resnet Alexnet 88% Efficiency 1! 2! 4! 8! 16! 32! 64! 128! 256! No. of GPUs •  Cloud formation with Deep Learning AMI •  16x P2.16xlarge. Mounted on EFS •  Inception and Resnet: batch size 32, Alex net: batch size 512 •  ImageNet, 1.2M images,1K classes •  152-layer ResNet, 5.4d on 4x K80s (1.2h per epoch), 0.22 top-1 error Scaling with MXNet
  • 18. http://bit.ly/deepami Deep Learning any way you want on AWS Tool for data scientists and developers Setting up a DL system takes (install) time & skill Keep packages up to date and compiled (MXNet, TensorFlow, Caffe, Torch, Theano, Keras) Anaconda, Jupyter, Python 2 and 3 NVIDIA Drivers for G2 and P2 instances Intel MKL Drivers for all other instances (C4, M4, …) Deep Learning AMIs
  • 19. Pop-up Loft © 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved MXNet Programing model
  • 20. Imperative Programming import numpy as np a = np.ones(10) b = np.ones(10) * 2 c = b * a d = c + 1 Easy to tweak with Python code PROS •  Straightforward and flexible. •  Take advantage of language native features (loop, condition, debugger) •  E.g. Numpy, Matlab, Torch, … CONS •  Hard to optimize
  • 21. Declarative Programming A = Variable('A') B = Variable('B') C = B * A D = C + 1 f = compile(D) d = f(A=np.ones(10), B=np.ones(10)*2) [Graph: A and B feed X, followed by + 1] C can share memory with D because C is deleted later PROS •  More chances for optimization •  Cross different languages •  E.g. TensorFlow, Theano, Caffe CONS •  Less flexible
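To make the declarative style concrete, here is a toy sketch of a deferred expression graph in plain Python. It is not how MXNet, TensorFlow, or Theano are implemented; the names `Node`, `Variable`, and `compile_graph` are hypothetical, and scalars stand in for the NumPy arrays on the slide to keep the example dependency-free.

```python
class Node:
    """A node in a toy expression graph; nothing is computed when it is built."""
    def __init__(self, op, inputs=(), name=None, value=None):
        self.op, self.inputs, self.name, self.value = op, inputs, name, value

    def __mul__(self, other):
        return Node("mul", (self, _wrap(other)))

    def __add__(self, other):
        return Node("add", (self, _wrap(other)))

def _wrap(v):
    # Turn plain numbers into constant nodes so they can sit in the graph.
    return v if isinstance(v, Node) else Node("const", value=v)

def Variable(name):
    return Node("var", name=name)

def compile_graph(out):
    """Return a function that evaluates the graph once inputs are bound."""
    def run(**feed):
        def ev(n):
            if n.op == "var":
                return feed[n.name]
            if n.op == "const":
                return n.value
            a, b = (ev(i) for i in n.inputs)
            return a * b if n.op == "mul" else a + b
        return ev(out)
    return run

A = Variable("A")
B = Variable("B")
C = B * A            # builds a graph node, no arithmetic happens here
D = C + 1
f = compile_graph(D)
d = f(A=1.0, B=2.0)  # values are supplied only at execution time
```

Because the whole graph is known before any value is computed, a real framework can optimize it (for example, reusing memory for intermediate results such as C), which is the trade-off against flexibility that the slide lists.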
  • 22. MXNet: Mixed programming paradigm IMPERATIVE NDARRAY API >>> import mxnet as mx >>> a = mx.nd.zeros((100, 50)) >>> b = mx.nd.ones((100, 50)) >>> c = a + b >>> c += 1 >>> print(c) DECLARATIVE SYMBOLIC EXECUTOR >>> import mxnet as mx >>> net = mx.symbol.Variable('data') >>> net = mx.symbol.FullyConnected(data=net, num_hidden=12 >>> net = mx.symbol.SoftmaxOutput(data=net) >>> texec = mx.module.Module(net) >>> texec.forward(data=c) >>> texec.backward() NDArray can be set as input to the graph
  • 23. MXNet: Mixed programming paradigm Embed symbolic expressions into imperative programming texec = mx.module.Module(net) for batch in train_data: texec.forward(batch) texec.backward() for param, grad in zip(texec.get_params(), texec.get_grads()): param -= 0.2 * grad
  • 24. Neural Nets and Deep Learning Glossary
  • 25. Training, Validation Set and Overfitting Best model
  • 26. Batch, Epoch Batch: •  Number of samples propagated through the network at every iteration •  Helps utilize the GPU compute power Epoch: An epoch is a complete pass through all the training data. A neural network is trained until the error rate is acceptable, and this will often take multiple passes through the complete data set.
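The relationship between batches, iterations, and epochs can be sketched as two nested loops. The helper name `iterate_batches` and the sizes below are illustrative assumptions; the training step itself is left out.

```python
def iterate_batches(samples, batch_size):
    """Yield consecutive slices of at most batch_size samples."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

samples = list(range(10))   # stand-in for 10 training samples
batch_size = 4
num_epochs = 2

iterations = 0
for epoch in range(num_epochs):                          # one epoch = one full pass
    for batch in iterate_batches(samples, batch_size):   # one iteration per batch
        iterations += 1                                  # a training step would go here
```

With 10 samples and a batch size of 4, each epoch takes three iterations (batches of 4, 4, and 2 samples).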
  • 27. Loss Function •  Objective function defines what success looks like when an algorithm learns. •  It is a measure of the difference between a neural net’s guess and the ground truth; that is, the error. •  Error resulting from the loss function is fed into backpropagation in order to update the weights & biases •  Common loss functions •  Cross entropy •  L1 (linear), L2 (quadratic) •  Mean square error (MSE)
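A small sketch of three of the loss functions listed on this slide, written in plain Python. The function names and sample values are illustrative; the cross-entropy version assumes a single integer class label and a list of predicted class probabilities.

```python
import math

def cross_entropy(probs, label):
    """Negative log of the probability assigned to the true class."""
    return -math.log(probs[label])

def l1_loss(pred, target):
    """Sum of absolute differences (linear penalty)."""
    return sum(abs(p - t) for p, t in zip(pred, target))

def mse(pred, target):
    """Mean of squared differences (quadratic penalty)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

confident_right = cross_entropy([0.1, 0.9], label=1)   # small loss
confident_wrong = cross_entropy([0.9, 0.1], label=1)   # large loss
regression_l1 = l1_loss([1.0, 2.0], [1.0, 4.0])
regression_mse = mse([1.0, 2.0], [1.0, 4.0])
```

In each case a smaller value means the guess is closer to the ground truth, which is the signal backpropagation uses to update the weights.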
  • 28. Activation Functions Add nonlinearity. ReLU is the most commonly used today
  • 29. Fully Connected Layer Fully connected layer of a neural network. If no activation is applied, you can imagine this as just a linear regression on the input attributes.
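For reference, a sketch of ReLU next to two other widely used activations. Sigmoid and tanh are included as assumptions for comparison; the slide text itself only names ReLU.

```python
import math

def relu(z):
    """Rectified linear unit: pass positive values through, clamp negatives to 0."""
    return max(0.0, z)

def sigmoid(z):
    """Squash any real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    """Squash any real value into the range (-1, 1)."""
    return math.tanh(z)

inputs = [-2.0, 0.0, 3.0]
relu_out = [relu(z) for z in inputs]
```

Without a nonlinearity like these between layers, a stack of layers would collapse into a single linear function of the input.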
  • 31. Dropout Srivastava, Nitish, et al. “Dropout: a simple way to prevent neural networks from overfitting”, JMLR 2014
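A minimal sketch of dropout on a list of activations. Note the assumption: this uses the "inverted" variant, which rescales the kept units during training, whereas the cited paper describes scaling the weights at test time. The function name and values are illustrative.

```python
import random

def dropout(activations, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p, rescale the rest.

    p must be less than 1. At inference time (training=False) the
    activations pass through unchanged.
    """
    if not training or p == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

acts = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(acts, p=0.5, rng=random.Random(0))  # seeded for repeatability
eval_out = dropout(acts, training=False)
```

Randomly silencing units during training keeps the network from relying on any single unit, which is how dropout reduces overfitting.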
  • 32. Learning Rates and SGD Visualization Source: https://twitter.com/alecrad
  • 33. Convolutional Neural Network (CNN) CNN Layers: Convolutional Layer, Pooling Layer, Activation, Fully-Connected Layer
  • 34. Recurrent Neural Network (RNN) Examples: Image Caption, Sentiment Analysis, Machine Translation, Video Labeling, Image Labeling
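Of the layer types listed here, pooling is the simplest to show in a few lines. The sketch below is a plain-Python 2x2 max pooling with stride 2, meant only as an illustration of the operation; the helper name and the input values are made up for the example, and it assumes even height and width.

```python
def max_pool_2x2(image):
    """2x2 max pooling with stride 2 over a 2D list (even height and width)."""
    rows, cols = len(image), len(image[0])
    return [
        [max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
         for c in range(0, cols - 1, 2)]
        for r in range(0, rows - 1, 2)
    ]

image = [
    [4, 2, 1, 0],
    [2, 0, 3, 5],
    [1, 1, 0, 2],
    [0, 6, 2, 2],
]
pooled = max_pool_2x2(image)   # each 2x2 window is reduced to its maximum
```

Each output cell summarizes one 2x2 window, so the feature map shrinks by half in each dimension while keeping the strongest responses.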
  • 36. MXNet Lambda Deployment import tempfile import boto3 import mxnet as mx import numpy as np …. bucket = 'smallya-test' s3 = boto3.resource('s3') s3_client = boto3.client('s3') mod = None with tempfile.NamedTemporaryFile(delete=True) as f_params_file, tempfile.NamedTemporaryFile(delete=True) as f_symbol_file: s3_client.download_file(bucket, f_params, f_params_file.name) ; f_params_file.flush() s3_client.download_file(bucket, f_symbol, f_symbol_file.name) ; f_symbol_file.flush() sym, arg_params, aux_params = load_model(f_symbol_file.name, f_params_file.name) mod = mx.mod.Module(symbol=sym) mod.bind(for_training=False, data_shapes=[('data', (1,3,224,224))]) mod.set_params(arg_params, aux_params) def lambda_handler(event, context): …. labels = predict(url, data_url) … Outside context handler
  • 38. Thank You smallya@amazon.com sunilmallya