SlideShare ist ein Scribd-Unternehmen logo
1 von 62
DIY DEEP LEARNING
WITH CAFFE
Kate Saenko
UMASS Lowell
O P E N
D A T A
S C I E N C E
C O N F E R E N C E_
BOSTON 2015
@opendatasci
Caffe: Open Source Deep Learning Library
Tutorials by Evan Shelhamer, Jon Long,
Sergio Guadarrama, Jeff Donahue, and Ross
Girshick and Yangqing Jia
caffe.berkeleyvision.org
github.com/BVLC/caffe
Based on tutorials by the Caffe creators at UC Berkeley
Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev
Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell
Caffe crew
...plus the
cold-brew
and over 100 open source contributors!
See also: upcoming Caffe Tutorial
at CVPR 2015 in Boston, June 7
http://tutorial.caffe.berkeleyvision.org/
Caffe Tutorial
Deep learning framework tutorial by
Evan Shelhamer, Jeff Donahue, Jon Long, Yangqing Jia and Ross Girshick
WHY DEEP LEARNING?
WHAT IS CAFFE?
APPLICATIONS OF CAFFE
NEURAL NETWORKS IN A SIP
CAFFE INTRO
STEP-BY-STEP EXAMPLES
Why Deep Learning?
End-to-End Learning for Many Tasks
Pinterest used deep learning-based visual search to find
pinned images of bags similar to the one on the left.
Image Credit: Pinterest
Why Deep Learning?
Compositional Models
Learned End-to-End
Hierarchy of Representations
- vision: pixel, motif, part, object
- text: character, word, clause, sentence
- speech: audio, band, phone, word
concrete abstract
learning
figure credit Yann LeCun, ICML ‘13 tutorial
Why Deep Learning?
Compositional Models
Learned End-to-End
figure credit Yann LeCun, ICML ‘13 tutorial
Back-propagation jointly learns
all of the model parameters to
optimize the output for the task.
http://code.flickr.net/2014/10/20/introducing-flickr-park-or-bird/
http://code.flickr.net/2014/10/20/introducing-flickr-park-or-bird/
All in a day’s work with Caffe
Why Deep Learning?
WHY DEEP LEARNING?
WHAT IS CAFFE?
APPLICATIONS OF CAFFE
NEURAL NETWORKS IN A SIP
CAFFE INTRO
STEP-BY-STEP EXAMPLES
What is Caffe?
Prototype Train Deploy
Open framework, models, and worked examples
for deep learning
- 1.5 years
- 300+ citations, 100+ contributors
- 2,000+ forks, >1 pull request / day average
- focus has been vision, but branching out:
sequences, reinforcement learning, speech + text
What is Caffe?
Prototype Train Deploy
Open framework, models, and worked examples
for deep learning
- Pure C++ / CUDA architecture for deep learning
- Command line, Python, MATLAB interfaces
- Fast, well-tested code
- Tools, reference models, demos, and recipes
- Seamless switch between CPU and GPU
Caffe is a Community project pulse
Caffe offers the
● model definitions
● optimization settings
● pre-trained weights
so you can start right away.
The BVLC models are
licensed for unrestricted use.
The community shares
models in our Model Zoo.
Reference Models
GoogLeNet: ILSVRC14 winner
Brewing by the Numbers...
● Speed with Krizhevsky's 2012 model:
o 2 ms / image on K40 GPU
o <1 ms inference with Caffe + cuDNN v2 on Titan X
o 72 million images / day with batched IO
o 8-core CPU: ~20 ms/image
● 9k lines of C++ code (20k with tests)
WHY DEEP LEARNING?
WHAT IS CAFFE?
APPLICATIONS OF CAFFE
NEURAL NETWORKS IN A SIP
CAFFE INTRO
STEP-BY-STEP EXAMPLES
Image Tagging:
Share a Sip of Brewed Models
demo.caffe.berkeleyvision.org
demo code open-source and bundled
Scene Recognition http://places.csail.mit.edu/
B. Zhou et al. NIPS 14
Visual Style Recognition
Other Styles:
Vintage
Long Exposure
Noir
Pastel
Macro
… and so on.
Karayev et al. Recognizing Image Style. BMVC14. Caffe fine-tuning example.
Demo online at http://demo.vislab.berkeleyvision.org/ (see Results Explorer).
[ Image-Style]
Object Detection
R-CNN: Regions with Convolutional Neural Networks
http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/detection.ipynb
Full R-CNN scripts available at
https://github.com/rbgirshick/rcnn
Ross Girshick et al.
Rich feature hierarchies for accurate
object detection and semantic
segmentation. CVPR14.
Recurrent Net RNN and Long Short Term Memory LSTM
are sequential models
- video
- language
- dynamics
learned by back-propagation through time.
Sequences
Jeff Donahue et al.
LRCN: Long-term Recurrent Convolutional Network
- activity recognition
- image captioning
- video captioning
arXiv and web page & PR
Youtube Captioning
Venugopalan, Xu, Donahue,
Rohrbach, Mooney, Saenko;
Translating Videos to Natural
Language Using Deep Recurrent
Neural Networks,
NAACL, 2015
Fully convolutional networks for pixel prediction
applied to semantic segmentation
- end-to-end learning
- efficiency in inference and learning
175 ms per-image prediction
- multi-modal, multi-task
Image Segmentation
Further applications
- depth estimation
- denoising
Jon Long* & Evan Shelhamer*,
Trevor Darrell
arXiv and pre-release
Deep Visuomotor Control
Sergey Levine* & Chelsea Finn*,
Trevor Darrell, and Pieter Abbeeltech report http://tinyurl.com/visuomotor
example experiments
Robotic manipulation
Embedded Caffe
Caffe on the NVIDIA Jetson TK1 mobile board
- 10 watts of power
- inference at 35 ms per image
- no need to change the code
WHY DEEP LEARNING?
WHAT IS CAFFE?
APPLICATIONS OF CAFFE
NEURAL NETWORKS IN A SIP
CAFFE INTRO
STEP-BY-STEP EXAMPLES
Neurons in the Brain
0
-2
+4
0
+2
Input
Multiply by
weights
Sum Threshold
Output
Artificial Neuron
+4
+2
0
+3
-2
0
-2
+4
0
+2
-8 0
Artificial Neuron: Activation
Input
Multiply by
weights
Sum Threshold
+4
+2
0
+3
-2
0
-2
+4
0
+2
-8 0
0
-2
0
+2
+2
0
-2
+4
0
+2
+8 +8
Artificial Neuron: Activation
Input
Multiply by
weights
Sum Threshold
Neurons learn
patterns!
Input
Output
Weights
Artificial Neural Network
Input
Output
Weights
Artificial Neural Network
Hidden LayerInput Layer Output Layer
Deep Network:
many layers!
10-1
10-1
10-1
Convolve with Threshold
Convolutional Neural Network
Input Layer
Output Layer
slide by Abi-Roozgard
w13w12w11
w23w22w21
w33w32w31
10-1
10-1
10-1
Convolve with Threshold
w13w12w11
w23w22w21
w33w32w31
Convolutional Neural Network
slide by Abi-Roozgard
Input Layer
Output Layer
Convolutional Network
Convolutional Networks: 1989
LeNet: a layered model composed of convolution and subsampling
operations followed by a holistic representation and ultimately a
classifier for handwritten digits. [ Yann LeCun; LeNet ]
Convolutional Nets: 2012
AlexNet: a layered model composed of convolution,
subsampling, and further operations followed by a holistic
representation and all-in-all a landmark classifier on
ILSVRC12. [ AlexNet ]
+ data
+ gpu
+ non-saturating nonlinearity
+ regularization
Convolutional Nets: 2014
ILSVRC14 Winners: ~6.6% Top-5 error
- GoogLeNet: composition of multi-scale dimension-
reduced modules (pictured)
- VGG: 16 layers of 3x3 convolution interleaved with
max pooling + 3 fully-connected layers
+ depth
+ data
+ dimensionality reduction
[Zeiler-Fergus]
1st layer filters
image patches that strongly activate 1st layer filters
Why Deep Learning?
The Unreasonable Effectiveness of Learning Deep Features
[Zeiler-Fergus]
Note the increase in
visual complexity of
feature activations as
we go from “shallow”
to “deep” layers.
[Zeiler-Fergus]
WHY DEEP LEARNING?
WHAT IS CAFFE?
APPLICATIONS OF CAFFE
NEURAL NETWORKS IN A SIP
CAFFE INTRO
STEP-BY-STEP EXAMPLES
Net
name: "dummy-net"
layers { name: "data" …}
layers { name: "conv" …}
layers { name: "pool" …}
… more layers …
layers { name: "loss" …}
● A network is a set of layers
connected as a DAG:
LogReg ↑
LeNet →
ImageNet, Krizhevsky 2012 →
● Caffe creates and checks the net from
the definition.
● Data and derivatives flow through the
net as blobs – a an array interface
Forward / Backward the essential Net computations
Caffe models are complete machine learning systems for inference and learning.
The computation follows from the model definition. Define the model and run.
Layer
name: "conv1"
type: CONVOLUTION
bottom: "data"
top: "conv1"
convolution_param {
num_output: 20
kernel_size: 5
stride: 1
weight_filler {
type: "xavier"
}
}
name, type, and the
connection structure
(input blobs and
output blobs)
layer-specific
parameters
* Nets + Layers are defined by protobuf schema
● Every layer type defines
- Setup
- Forward
- Backward
Data
Number x K Channel x Height x Width
256 x 3 x 227 x 227 for ImageNet train input
Blobs are 4-D arrays for storing and
communicating information.
● hold data, derivatives, and parameters
● lazily allocate memory
● shuttle between CPU and GPU
Blob
name: "conv1"
type: CONVOLUTION
bottom: "data"
top: "conv1"
… definition …
top
blob
bottom
blob
Parameter: Convolution Weight
N Output x K Input x Height x Width
96 x 3 x 11 x 11 for CaffeNet conv1
Parameter: Convolution Bias
96 x 1 x 1 x 1 for CaffeNet conv1
Blobs provide a unified memory interface.
Reshape(num, channel, height, width)
- declare dimensions
- make SyncedMem -- but only lazily
allocate
Blob
cpu_data(), mutable_cpu_data()
- host memory for CPU mode
gpu_data(), mutable_gpu_data()
- device memory for GPU mode
{cpu,gpu}_diff(), mutable_{cpu,gpu}_diff()
- derivative counterparts to data methods
- easy access to data + diff in forward / backward
SyncedMem
allocation + communication
Solving: Training a Net
Optimization like model definition is configuration.
train_net: "lenet_train.prototxt"
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
max_iter: 10000
snapshot_prefix: "lenet_snapshot"
All you need to run
things on the GPU.
> caffe train -solver lenet_solver.prototxt -gpu 0
Stochastic Gradient Descent (SGD) + momentum ·
Adaptive Gradient (ADAGRAD) · Nesterov’s Accelerated Gradient (NAG)
Setup: run once for initialization.
Forward: make output given input.
Backward: make gradient of output
- w.r.t. bottom
- w.r.t. parameters (if needed)
Reshape: set dimensions.
Layer Protocol
Layer Development Checklist
Compositional Modeling
The Net’s forward and backward passes
are composed of the layers’ steps.
Layer Protocol
== Class Interface
Define a class in C++ or Python to
extend Layer.
Include your new layer type in a
network and keep brewing.
layer {
type: ‘Python’
python_param {
module: ‘layers’
layer: ‘EuclideanLoss’
} }
Classification
SoftmaxWithLoss
HingeLoss
Linear Regression
EuclideanLoss
Attributes / Multiclassification
SigmoidCrossEntropyLoss
Others…
New Task
NewLoss
Loss
What kind of model is this?
Define the task by the loss.
loss (LOSS_TYPE)
Recipe for Brewing
● Convert the data to Caffe-format
o lmdb, leveldb, hdf5 / .mat, list of images, etc.
● Define the Net
● Configure the Solver
● caffe train -solver solver.prototxt -gpu 0
● Examples are your friends
o caffe/examples/mnist,cifar10,imagenet
o caffe/examples/*.ipynb
o caffe/models/*
WHY DEEP LEARNING?
WHAT IS CAFFE?
APPLICATIONS OF CAFFE
NEURAL NETWORKS IN A SIP
CAFFE INTRO
STEP-BY-STEP EXAMPLES
Classification
instant recognition the Caffe way
see notebook
Brewing Models
from logistic regression to non-linearity
see notebook
Latest Roast
Parallelism: open pull requests
- synchronous SGD #2114
- data parallelism
- ~linear speedup on multiple GPUs
for computation >> communication
- Flickr + NVIDIA collaboration on
distributed opt + asynchronous SGD
Python Net Specification #2086
New MATLAB Interface #2505
These will be bundled in the next release shortly.
Help Brewing
Documentation
- tutorial documentation
- worked examples
Modeling, Usage, and Install
- caffe-users group
- gitter.im chat
Caffe @ CVPR15
Convolutional Nets
- CS231n online convnet class
by Andrej Karpathy and Fei-
Fei Li.
- Deep Learning for Vision
tutorial from CVPR ‘14.
Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev
Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell
Thanks to the Caffe crew
...plus the
cold-brew
and open source contributors!
Acknowledgements
Thank you to the Berkeley Vision and Learning Center Sponsors.
Thank you to NVIDIA
for GPU donation and
collaboration on cuDNN.
Thank you to our 75+
open source contributors
and vibrant community.
Thank you to A9 and AWS
for a research grant for Caffe dev
and reproducible research.
References
[ DeCAF ] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. Decaf: A deep
convolutional activation feature for generic visual recognition. ICML, 2014.
[ R-CNN ] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection
and semantic segmentation. CVPR, 2014.
[ Zeiler-Fergus ] M. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. ECCV, 2014.
[ LeNet ] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document
recognition. IEEE, 1998.
[ AlexNet ] A. Krizhevsky, I. Sutskever, and G. Hinton. Imagenet classification with deep convolutional neural
networks. NIPS, 2012.
[ OverFeat ] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun. Overfeat: Integrated
recognition, localization and detection using convolutional networks. ICLR, 2014.
[ Image-Style ] S. Karayev, M. Trentacoste, H. Han, A. Agarwala, T. Darrell, A. Hertzmann, H. Winnemoeller.
Recognizing Image Style. BMVC, 2014.
[ Karpathy14 ] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video
classification with convolutional neural networks. CVPR, 2014.
[ Sutskever13 ] I. Sutskever. Training Recurrent Neural Networks.
PhD thesis, University of Toronto, 2013.
[ Chopra05 ] S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to
face verification. CVPR, 2005.

Weitere ähnliche Inhalte

Was ist angesagt?

Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language ProcessingCloudxLab
 
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityCilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityThomas Graf
 
Natural language processing
Natural language processingNatural language processing
Natural language processingKarenVacca
 
Generalized Pipeline Parallelism for DNN Training
Generalized Pipeline Parallelism for DNN TrainingGeneralized Pipeline Parallelism for DNN Training
Generalized Pipeline Parallelism for DNN TrainingDatabricks
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringScyllaDB
 
05_Reliable UDP 구현
05_Reliable UDP 구현05_Reliable UDP 구현
05_Reliable UDP 구현noerror
 
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object DetectionTaegyun Jeon
 
Deep neural networks cnn rnn_ae_some practical techniques
Deep neural networks cnn rnn_ae_some practical techniquesDeep neural networks cnn rnn_ae_some practical techniques
Deep neural networks cnn rnn_ae_some practical techniquesKang Pilsung
 
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...Edureka!
 
Natural Language Processing
Natural Language Processing Natural Language Processing
Natural Language Processing Adarsh Saxena
 
시니어가 들려주는 "내가 알고 있는 걸 당신도 알게 된다면"
시니어가 들려주는 "내가 알고 있는 걸 당신도 알게 된다면"시니어가 들려주는 "내가 알고 있는 걸 당신도 알게 된다면"
시니어가 들려주는 "내가 알고 있는 걸 당신도 알게 된다면"InfraEngineer
 
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)Odinot Stanislas
 
Alfresco DevCon 2019 - Alfresco Identity Services in Action
Alfresco DevCon 2019 - Alfresco Identity Services in ActionAlfresco DevCon 2019 - Alfresco Identity Services in Action
Alfresco DevCon 2019 - Alfresco Identity Services in ActionFrancesco Corti
 
Word Embeddings - Introduction
Word Embeddings - IntroductionWord Embeddings - Introduction
Word Embeddings - IntroductionChristian Perone
 
LockFree Algorithm
LockFree AlgorithmLockFree Algorithm
LockFree AlgorithmMerry Merry
 
Trends_of_MLOps_tech_in_business
Trends_of_MLOps_tech_in_businessTrends_of_MLOps_tech_in_business
Trends_of_MLOps_tech_in_businessSANG WON PARK
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareBrendan Gregg
 
New Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingScyllaDB
 

Was ist angesagt? (20)

Crunchy containers
Crunchy containersCrunchy containers
Crunchy containers
 
Natural Language Processing
Natural Language ProcessingNatural Language Processing
Natural Language Processing
 
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and SecurityCilium - Bringing the BPF Revolution to Kubernetes Networking and Security
Cilium - Bringing the BPF Revolution to Kubernetes Networking and Security
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Generalized Pipeline Parallelism for DNN Training
Generalized Pipeline Parallelism for DNN TrainingGeneralized Pipeline Parallelism for DNN Training
Generalized Pipeline Parallelism for DNN Training
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uring
 
05_Reliable UDP 구현
05_Reliable UDP 구현05_Reliable UDP 구현
05_Reliable UDP 구현
 
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
[PR12] You Only Look Once (YOLO): Unified Real-Time Object Detection
 
StarlingX - A Platform for the Distributed Edge | Ildiko Vancsa
StarlingX - A Platform for the Distributed Edge | Ildiko VancsaStarlingX - A Platform for the Distributed Edge | Ildiko Vancsa
StarlingX - A Platform for the Distributed Edge | Ildiko Vancsa
 
Deep neural networks cnn rnn_ae_some practical techniques
Deep neural networks cnn rnn_ae_some practical techniquesDeep neural networks cnn rnn_ae_some practical techniques
Deep neural networks cnn rnn_ae_some practical techniques
 
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
PyTorch Python Tutorial | Deep Learning Using PyTorch | Image Classifier Usin...
 
Natural Language Processing
Natural Language Processing Natural Language Processing
Natural Language Processing
 
시니어가 들려주는 "내가 알고 있는 걸 당신도 알게 된다면"
시니어가 들려주는 "내가 알고 있는 걸 당신도 알게 된다면"시니어가 들려주는 "내가 알고 있는 걸 당신도 알게 된다면"
시니어가 들려주는 "내가 알고 있는 걸 당신도 알게 된다면"
 
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
Virtualizing the Network to enable a Software Defined Infrastructure (SDI)
 
Alfresco DevCon 2019 - Alfresco Identity Services in Action
Alfresco DevCon 2019 - Alfresco Identity Services in ActionAlfresco DevCon 2019 - Alfresco Identity Services in Action
Alfresco DevCon 2019 - Alfresco Identity Services in Action
 
Word Embeddings - Introduction
Word Embeddings - IntroductionWord Embeddings - Introduction
Word Embeddings - Introduction
 
LockFree Algorithm
LockFree AlgorithmLockFree Algorithm
LockFree Algorithm
 
Trends_of_MLOps_tech_in_business
Trends_of_MLOps_tech_in_businessTrends_of_MLOps_tech_in_business
Trends_of_MLOps_tech_in_business
 
UM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
 
New Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using TracingNew Ways to Find Latency in Linux Using Tracing
New Ways to Find Latency in Linux Using Tracing
 

Andere mochten auch

Using Gradient Descent for Optimization and Learning
Using Gradient Descent for Optimization and LearningUsing Gradient Descent for Optimization and Learning
Using Gradient Descent for Optimization and LearningDr. Volkan OBAN
 
Semi fragile watermarking
Semi fragile watermarkingSemi fragile watermarking
Semi fragile watermarkingYash Diwakar
 
Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC
Face recognition and deep learning  โดย ดร. สรรพฤทธิ์ มฤคทัต NECTECFace recognition and deep learning  โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC
Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTECBAINIDA
 
[AI07] Revolutionizing Image Processing with Cognitive Toolkit
[AI07] Revolutionizing Image Processing with Cognitive Toolkit[AI07] Revolutionizing Image Processing with Cognitive Toolkit
[AI07] Revolutionizing Image Processing with Cognitive Toolkitde:code 2017
 
Optimization in deep learning
Optimization in deep learningOptimization in deep learning
Optimization in deep learningJeremy Nixon
 
Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...
Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...
Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...Joe Suzuki
 
Muzammil Abdulrahman PPT On Gabor Wavelet Transform (GWT) Based Facial Expres...
Muzammil Abdulrahman PPT On Gabor Wavelet Transform (GWT) Based Facial Expres...Muzammil Abdulrahman PPT On Gabor Wavelet Transform (GWT) Based Facial Expres...
Muzammil Abdulrahman PPT On Gabor Wavelet Transform (GWT) Based Facial Expres...Petroleum Training Institute
 
Caffe framework tutorial
Caffe framework tutorialCaffe framework tutorial
Caffe framework tutorialPark Chunduck
 
Pattern Recognition and Machine Learning : Graphical Models
Pattern Recognition and Machine Learning : Graphical ModelsPattern Recognition and Machine Learning : Graphical Models
Pattern Recognition and Machine Learning : Graphical Modelsbutest
 
Processor, Compiler and Python Programming Language
Processor, Compiler and Python Programming LanguageProcessor, Compiler and Python Programming Language
Processor, Compiler and Python Programming Languagearumdapta98
 
Caffe framework tutorial2
Caffe framework tutorial2Caffe framework tutorial2
Caffe framework tutorial2Park Chunduck
 
Center loss for Face Recognition
Center loss for Face RecognitionCenter loss for Face Recognition
Center loss for Face RecognitionJisung Kim
 
Computer vision, machine, and deep learning
Computer vision, machine, and deep learningComputer vision, machine, and deep learning
Computer vision, machine, and deep learningIgi Ardiyanto
 
Face Recognition Based on Deep Learning (Yurii Pashchenko Technology Stream)
Face Recognition Based on Deep Learning (Yurii Pashchenko Technology Stream) Face Recognition Based on Deep Learning (Yurii Pashchenko Technology Stream)
Face Recognition Based on Deep Learning (Yurii Pashchenko Technology Stream) IT Arena
 
Rattani - Ph.D. Defense Slides
Rattani - Ph.D. Defense SlidesRattani - Ph.D. Defense Slides
Rattani - Ph.D. Defense SlidesPluribus One
 
Pattern Recognition and Machine Learning: Section 3.3
Pattern Recognition and Machine Learning: Section 3.3Pattern Recognition and Machine Learning: Section 3.3
Pattern Recognition and Machine Learning: Section 3.3Yusuke Oda
 
怖くない誤差逆伝播法 Chainerを添えて
怖くない誤差逆伝播法 Chainerを添えて怖くない誤差逆伝播法 Chainerを添えて
怖くない誤差逆伝播法 Chainerを添えてmarujirou
 

Andere mochten auch (20)

Using Gradient Descent for Optimization and Learning
Using Gradient Descent for Optimization and LearningUsing Gradient Descent for Optimization and Learning
Using Gradient Descent for Optimization and Learning
 
Semi fragile watermarking
Semi fragile watermarkingSemi fragile watermarking
Semi fragile watermarking
 
Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC
Face recognition and deep learning  โดย ดร. สรรพฤทธิ์ มฤคทัต NECTECFace recognition and deep learning  โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC
Face recognition and deep learning โดย ดร. สรรพฤทธิ์ มฤคทัต NECTEC
 
портфоліо Бабич О.А.
портфоліо Бабич О.А.портфоліо Бабич О.А.
портфоліо Бабич О.А.
 
[AI07] Revolutionizing Image Processing with Cognitive Toolkit
[AI07] Revolutionizing Image Processing with Cognitive Toolkit[AI07] Revolutionizing Image Processing with Cognitive Toolkit
[AI07] Revolutionizing Image Processing with Cognitive Toolkit
 
Optimization in deep learning
Optimization in deep learningOptimization in deep learning
Optimization in deep learning
 
Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...
Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...
Structure Learning of Bayesian Networks with p Nodes from n Samples when n&lt...
 
Muzammil Abdulrahman PPT On Gabor Wavelet Transform (GWT) Based Facial Expres...
Muzammil Abdulrahman PPT On Gabor Wavelet Transform (GWT) Based Facial Expres...Muzammil Abdulrahman PPT On Gabor Wavelet Transform (GWT) Based Facial Expres...
Muzammil Abdulrahman PPT On Gabor Wavelet Transform (GWT) Based Facial Expres...
 
Caffe framework tutorial
Caffe framework tutorialCaffe framework tutorial
Caffe framework tutorial
 
Pattern Recognition and Machine Learning : Graphical Models
Pattern Recognition and Machine Learning : Graphical ModelsPattern Recognition and Machine Learning : Graphical Models
Pattern Recognition and Machine Learning : Graphical Models
 
Processor, Compiler and Python Programming Language
Processor, Compiler and Python Programming LanguageProcessor, Compiler and Python Programming Language
Processor, Compiler and Python Programming Language
 
Caffe framework tutorial2
Caffe framework tutorial2Caffe framework tutorial2
Caffe framework tutorial2
 
Center loss for Face Recognition
Center loss for Face RecognitionCenter loss for Face Recognition
Center loss for Face Recognition
 
Facebook Deep face
Facebook Deep faceFacebook Deep face
Facebook Deep face
 
Computer vision, machine, and deep learning
Computer vision, machine, and deep learningComputer vision, machine, and deep learning
Computer vision, machine, and deep learning
 
Face Recognition Based on Deep Learning (Yurii Pashchenko Technology Stream)
Face Recognition Based on Deep Learning (Yurii Pashchenko Technology Stream) Face Recognition Based on Deep Learning (Yurii Pashchenko Technology Stream)
Face Recognition Based on Deep Learning (Yurii Pashchenko Technology Stream)
 
Rattani - Ph.D. Defense Slides
Rattani - Ph.D. Defense SlidesRattani - Ph.D. Defense Slides
Rattani - Ph.D. Defense Slides
 
Pattern Recognition and Machine Learning: Section 3.3
Pattern Recognition and Machine Learning: Section 3.3Pattern Recognition and Machine Learning: Section 3.3
Pattern Recognition and Machine Learning: Section 3.3
 
怖くない誤差逆伝播法 Chainerを添えて
怖くない誤差逆伝播法 Chainerを添えて怖くない誤差逆伝播法 Chainerを添えて
怖くない誤差逆伝播法 Chainerを添えて
 
Deep Learning for Computer Vision: Face Recognition (UPC 2016)
Deep Learning for Computer Vision: Face Recognition (UPC 2016)Deep Learning for Computer Vision: Face Recognition (UPC 2016)
Deep Learning for Computer Vision: Face Recognition (UPC 2016)
 

Ähnlich wie DIY Deep Learning with Caffe Workshop

Urs Köster - Convolutional and Recurrent Neural Networks
Urs Köster - Convolutional and Recurrent Neural NetworksUrs Köster - Convolutional and Recurrent Neural Networks
Urs Köster - Convolutional and Recurrent Neural NetworksIntel Nervana
 
Squeezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile PhonesSqueezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile PhonesAnirudh Koul
 
"Semantic Segmentation for Scene Understanding: Algorithms and Implementation...
"Semantic Segmentation for Scene Understanding: Algorithms and Implementation..."Semantic Segmentation for Scene Understanding: Algorithms and Implementation...
"Semantic Segmentation for Scene Understanding: Algorithms and Implementation...Edge AI and Vision Alliance
 
Cognitive Engine: Boosting Scientific Discovery
Cognitive Engine:  Boosting Scientific DiscoveryCognitive Engine:  Boosting Scientific Discovery
Cognitive Engine: Boosting Scientific Discoverydiannepatricia
 
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...MLconf
 
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...Chris Fregly
 
HPE and NVIDIA empowering AI and IoT
HPE and NVIDIA empowering AI and IoTHPE and NVIDIA empowering AI and IoT
HPE and NVIDIA empowering AI and IoTRenee Yao
 
GPU and Deep learning best practices
GPU and Deep learning best practicesGPU and Deep learning best practices
GPU and Deep learning best practicesLior Sidi
 
Nvidia GPU Tech Conference - Optimizing, Profiling, and Deploying TensorFlow...
Nvidia GPU Tech Conference -  Optimizing, Profiling, and Deploying TensorFlow...Nvidia GPU Tech Conference -  Optimizing, Profiling, and Deploying TensorFlow...
Nvidia GPU Tech Conference - Optimizing, Profiling, and Deploying TensorFlow...Chris Fregly
 
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_ReportSaptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_ReportSitakanta Mishra
 
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...Chris Fregly
 
Ceph Day Berlin: Community Update
Ceph Day Berlin:  Community UpdateCeph Day Berlin:  Community Update
Ceph Day Berlin: Community UpdateCeph Community
 
Anomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETAnomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETMarco Parenzan
 
Deep Learning with Microsoft R Open
Deep Learning with Microsoft R OpenDeep Learning with Microsoft R Open
Deep Learning with Microsoft R OpenPoo Kuan Hoong
 
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNetFrom Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNetEric Haibin Lin
 
TECHNICAL OVERVIEW NVIDIA DEEP LEARNING PLATFORM Giant Leaps in Performance ...
TECHNICAL OVERVIEW NVIDIA DEEP  LEARNING PLATFORM Giant Leaps in Performance ...TECHNICAL OVERVIEW NVIDIA DEEP  LEARNING PLATFORM Giant Leaps in Performance ...
TECHNICAL OVERVIEW NVIDIA DEEP LEARNING PLATFORM Giant Leaps in Performance ...Willy Marroquin (WillyDevNET)
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLNordic APIs
 
Handwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RHandwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RPoo Kuan Hoong
 

Ähnlich wie DIY Deep Learning with Caffe Workshop (20)

Urs Köster - Convolutional and Recurrent Neural Networks
Urs Köster - Convolutional and Recurrent Neural NetworksUrs Köster - Convolutional and Recurrent Neural Networks
Urs Köster - Convolutional and Recurrent Neural Networks
 
Squeezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile PhonesSqueezing Deep Learning Into Mobile Phones
Squeezing Deep Learning Into Mobile Phones
 
R.E.M.O.T.E. SACNAS Poster
R.E.M.O.T.E. SACNAS PosterR.E.M.O.T.E. SACNAS Poster
R.E.M.O.T.E. SACNAS Poster
 
"Semantic Segmentation for Scene Understanding: Algorithms and Implementation...
"Semantic Segmentation for Scene Understanding: Algorithms and Implementation..."Semantic Segmentation for Scene Understanding: Algorithms and Implementation...
"Semantic Segmentation for Scene Understanding: Algorithms and Implementation...
 
Cognitive Engine: Boosting Scientific Discovery
Cognitive Engine:  Boosting Scientific DiscoveryCognitive Engine:  Boosting Scientific Discovery
Cognitive Engine: Boosting Scientific Discovery
 
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
 
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
 
HPE and NVIDIA empowering AI and IoT
HPE and NVIDIA empowering AI and IoTHPE and NVIDIA empowering AI and IoT
HPE and NVIDIA empowering AI and IoT
 
GPU and Deep learning best practices
GPU and Deep learning best practicesGPU and Deep learning best practices
GPU and Deep learning best practices
 
Nvidia GPU Tech Conference - Optimizing, Profiling, and Deploying TensorFlow...
Nvidia GPU Tech Conference -  Optimizing, Profiling, and Deploying TensorFlow...Nvidia GPU Tech Conference -  Optimizing, Profiling, and Deploying TensorFlow...
Nvidia GPU Tech Conference - Optimizing, Profiling, and Deploying TensorFlow...
 
Open power ddl and lms
Open power ddl and lmsOpen power ddl and lms
Open power ddl and lms
 
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_ReportSaptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
Saptashwa_Mitra_Sitakanta_Mishra_Final_Project_Report
 
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
 
Ceph Day Berlin: Community Update
Ceph Day Berlin:  Community UpdateCeph Day Berlin:  Community Update
Ceph Day Berlin: Community Update
 
Anomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NETAnomaly Detection with Azure and .NET
Anomaly Detection with Azure and .NET
 
Deep Learning with Microsoft R Open
Deep Learning with Microsoft R OpenDeep Learning with Microsoft R Open
Deep Learning with Microsoft R Open
 
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNetFrom Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
From Hours to Minutes: The Journey of Optimizing Mask-RCNN and BERT Using MXNet
 
TECHNICAL OVERVIEW NVIDIA DEEP LEARNING PLATFORM Giant Leaps in Performance ...
TECHNICAL OVERVIEW NVIDIA DEEP  LEARNING PLATFORM Giant Leaps in Performance ...TECHNICAL OVERVIEW NVIDIA DEEP  LEARNING PLATFORM Giant Leaps in Performance ...
TECHNICAL OVERVIEW NVIDIA DEEP LEARNING PLATFORM Giant Leaps in Performance ...
 
OS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of MLOS for AI: Elastic Microservices & the Next Gen of ML
OS for AI: Elastic Microservices & the Next Gen of ML
 
Handwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with RHandwritten Recognition using Deep Learning with R
Handwritten Recognition using Deep Learning with R
 

Mehr von odsc

Understanding the Chief Data Officer
Understanding the Chief Data Officer Understanding the Chief Data Officer
Understanding the Chief Data Officer odsc
 
Machine-In-The-Loop for Knowledge Discovery
Machine-In-The-Loop for Knowledge DiscoveryMachine-In-The-Loop for Knowledge Discovery
Machine-In-The-Loop for Knowledge Discoveryodsc
 
API Driven Development
API Driven Development API Driven Development
API Driven Development odsc
 
Mobile technology Usage by Humanitarian Programs: A Metadata Analysis
Mobile technology Usage by Humanitarian Programs: A Metadata AnalysisMobile technology Usage by Humanitarian Programs: A Metadata Analysis
Mobile technology Usage by Humanitarian Programs: A Metadata Analysisodsc
 
Productionizing Deep Learning From the Ground Up
Productionizing Deep Learning From the Ground UpProductionizing Deep Learning From the Ground Up
Productionizing Deep Learning From the Ground Upodsc
 
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and HiveBig Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hiveodsc
 
Think Breadth, Not Depth
Think Breadth, Not DepthThink Breadth, Not Depth
Think Breadth, Not Depthodsc
 
Data Science at Dow Jones: Monetizing Data, News and Information
Data Science at Dow Jones: Monetizing Data, News and InformationData Science at Dow Jones: Monetizing Data, News and Information
Data Science at Dow Jones: Monetizing Data, News and Informationodsc
 
Spark, Python and Parquet
Spark, Python and Parquet Spark, Python and Parquet
Spark, Python and Parquet odsc
 
Building a Predictive Analytics Solution with Azure ML
Building a Predictive Analytics Solution with Azure MLBuilding a Predictive Analytics Solution with Azure ML
Building a Predictive Analytics Solution with Azure MLodsc
 
Beyond Names
Beyond NamesBeyond Names
Beyond Namesodsc
 
How Woman are Conquering the S&P 500
How Woman are Conquering the S&P 500How Woman are Conquering the S&P 500
How Woman are Conquering the S&P 500odsc
 
Domain Expertise and Unstructured Data
Domain Expertise and Unstructured DataDomain Expertise and Unstructured Data
Domain Expertise and Unstructured Dataodsc
 
Kaggle The Home of Data Science
Kaggle The Home of Data ScienceKaggle The Home of Data Science
Kaggle The Home of Data Scienceodsc
 
Open Source Tools & Data Science Competitions
Open Source Tools & Data Science Competitions Open Source Tools & Data Science Competitions
Open Source Tools & Data Science Competitions odsc
 
Machine Learning with scikit-learn
Machine Learning with scikit-learnMachine Learning with scikit-learn
Machine Learning with scikit-learnodsc
 
Bridging the Gap Between Data and Insight using Open-Source Tools
Bridging the Gap Between Data and Insight using Open-Source ToolsBridging the Gap Between Data and Insight using Open-Source Tools
Bridging the Gap Between Data and Insight using Open-Source Toolsodsc
 
Top 10 Signs of the Textpocalypse
Top 10 Signs of the TextpocalypseTop 10 Signs of the Textpocalypse
Top 10 Signs of the Textpocalypseodsc
 
The Art of Data Science
The Art of Data Science The Art of Data Science
The Art of Data Science odsc
 
Frontiers of Open Data Science Research
Frontiers of Open Data Science ResearchFrontiers of Open Data Science Research
Frontiers of Open Data Science Researchodsc
 

Mehr von odsc (20)

Understanding the Chief Data Officer
Understanding the Chief Data Officer Understanding the Chief Data Officer
Understanding the Chief Data Officer
 
Machine-In-The-Loop for Knowledge Discovery
Machine-In-The-Loop for Knowledge DiscoveryMachine-In-The-Loop for Knowledge Discovery
Machine-In-The-Loop for Knowledge Discovery
 
API Driven Development
API Driven Development API Driven Development
API Driven Development
 
Mobile technology Usage by Humanitarian Programs: A Metadata Analysis
Mobile technology Usage by Humanitarian Programs: A Metadata AnalysisMobile technology Usage by Humanitarian Programs: A Metadata Analysis
Mobile technology Usage by Humanitarian Programs: A Metadata Analysis
 
Productionizing Deep Learning From the Ground Up
Productionizing Deep Learning From the Ground UpProductionizing Deep Learning From the Ground Up
Productionizing Deep Learning From the Ground Up
 
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and HiveBig Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
 
Think Breadth, Not Depth
Think Breadth, Not DepthThink Breadth, Not Depth
Think Breadth, Not Depth
 
Data Science at Dow Jones: Monetizing Data, News and Information
Data Science at Dow Jones: Monetizing Data, News and InformationData Science at Dow Jones: Monetizing Data, News and Information
Data Science at Dow Jones: Monetizing Data, News and Information
 
Spark, Python and Parquet
Spark, Python and Parquet Spark, Python and Parquet
Spark, Python and Parquet
 
Building a Predictive Analytics Solution with Azure ML
Building a Predictive Analytics Solution with Azure MLBuilding a Predictive Analytics Solution with Azure ML
Building a Predictive Analytics Solution with Azure ML
 
Beyond Names
Beyond NamesBeyond Names
Beyond Names
 
How Woman are Conquering the S&P 500
How Woman are Conquering the S&P 500How Woman are Conquering the S&P 500
How Woman are Conquering the S&P 500
 
Domain Expertise and Unstructured Data
Domain Expertise and Unstructured DataDomain Expertise and Unstructured Data
Domain Expertise and Unstructured Data
 
Kaggle The Home of Data Science
Kaggle The Home of Data ScienceKaggle The Home of Data Science
Kaggle The Home of Data Science
 
Open Source Tools & Data Science Competitions
Open Source Tools & Data Science Competitions Open Source Tools & Data Science Competitions
Open Source Tools & Data Science Competitions
 
Machine Learning with scikit-learn
Machine Learning with scikit-learnMachine Learning with scikit-learn
Machine Learning with scikit-learn
 
Bridging the Gap Between Data and Insight using Open-Source Tools
Bridging the Gap Between Data and Insight using Open-Source ToolsBridging the Gap Between Data and Insight using Open-Source Tools
Bridging the Gap Between Data and Insight using Open-Source Tools
 
Top 10 Signs of the Textpocalypse
Top 10 Signs of the TextpocalypseTop 10 Signs of the Textpocalypse
Top 10 Signs of the Textpocalypse
 
The Art of Data Science
The Art of Data Science The Art of Data Science
The Art of Data Science
 
Frontiers of Open Data Science Research
Frontiers of Open Data Science ResearchFrontiers of Open Data Science Research
Frontiers of Open Data Science Research
 

Kürzlich hochgeladen

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 

Kürzlich hochgeladen (20)

08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

DIY Deep Learning with Caffe Workshop

  • 1. DIY DEEP LEARNING WITH CAFFE Kate Saenko UMASS Lowell O P E N D A T A S C I E N C E C O N F E R E N C E_ BOSTON 2015 @opendatasci
  • 2. Caffe: Open Source Deep Learning Library Tutorials by Evan Shelhamer, Jon Long, Sergio Guadarrama, Jeff Donahue, and Ross Girshick and Yangqing Jia caffe.berkeleyvision.org github.com/BVLC/caffe Based on tutorials by the Caffe creators at UC Berkeley
  • 3. Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell Caffe crew ...plus the cold-brew and over 100 open source contributors!
  • 4. See also: upcoming Caffe Tutorial at CVPR 2015 in Boston, June 7 http://tutorial.caffe.berkeleyvision.org/ Caffe Tutorial Deep learning framework tutorial by Evan Shelhamer, Jeff Donahue, Jon Long, Yangqing Jia and Ross Girshick
  • 5. WHY DEEP LEARNING? WHAT IS CAFFE? APPLICATIONS OF CAFFE NEURAL NETWORKS IN A SIP CAFFE INTRO STEP-BY-STEP EXAMPLES
  • 6. Why Deep Learning? End-to-End Learning for Many Tasks Pinterest used deep learning-based visual search to find pinned images of bags similar to the one on the left. Image Credit: Pinterest
  • 7. Why Deep Learning? Compositional Models Learned End-to-End Hierarchy of Representations - vision: pixel, motif, part, object - text: character, word, clause, sentence - speech: audio, band, phone, word concrete abstract learning figure credit Yann LeCun, ICML ‘13 tutorial
  • 8. Why Deep Learning? Compositional Models Learned End-to-End figure credit Yann LeCun, ICML ‘13 tutorial Back-propagation jointly learns all of the model parameters to optimize the output for the task.
  • 9.
  • 12. WHY DEEP LEARNING? WHAT IS CAFFE? APPLICATIONS OF CAFFE NEURAL NETWORKS IN A SIP CAFFE INTRO STEP-BY-STEP EXAMPLES
  • 13. What is Caffe? Prototype Train Deploy Open framework, models, and worked examples for deep learning - 1.5 years - 300+ citations, 100+ contributors - 2,000+ forks, >1 pull request / day average - focus has been vision, but branching out: sequences, reinforcement learning, speech + text
  • 14. What is Caffe? Prototype Train Deploy Open framework, models, and worked examples for deep learning - Pure C++ / CUDA architecture for deep learning - Command line, Python, MATLAB interfaces - Fast, well-tested code - Tools, reference models, demos, and recipes - Seamless switch between CPU and GPU
  • 15. Caffe is a Community project pulse
  • 16. Caffe offers the ● model definitions ● optimization settings ● pre-trained weights so you can start right away. The BVLC models are licensed for unrestricted use. The community shares models in our Model Zoo. Reference Models GoogLeNet: ILSVRC14 winner
  • 17. Brewing by the Numbers... ● Speed with Krizhevsky's 2012 model: o 2 ms / image on K40 GPU o <1 ms inference with Caffe + cuDNN v2 on Titan X o 72 million images / day with batched IO o 8-core CPU: ~20 ms/image ● 9k lines of C++ code (20k with tests)
  • 18. WHY DEEP LEARNING? WHAT IS CAFFE? APPLICATIONS OF CAFFE NEURAL NETWORKS IN A SIP CAFFE INTRO STEP-BY-STEP EXAMPLES
  • 19. Image Tagging: Share a Sip of Brewed Models demo.caffe.berkeleyvision.org demo code open-source and bundled
  • 21. Visual Style Recognition Other Styles: Vintage Long Exposure Noir Pastel Macro … and so on. Karayev et al. Recognizing Image Style. BMVC14. Caffe fine-tuning example. Demo online at http://demo.vislab.berkeleyvision.org/ (see Results Explorer). [ Image-Style]
  • 22. Object Detection R-CNN: Regions with Convolutional Neural Networks http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/detection.ipynb Full R-CNN scripts available at https://github.com/rbgirshick/rcnn Ross Girshick et al. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR14.
  • 23. Recurrent Net RNN and Long Short Term Memory LSTM are sequential models - video - language - dynamics learned by back-propagation through time. Sequences Jeff Donahue et al. LRCN: Long-term Recurrent Convolutional Network - activity recognition - image captioning - video captioning arXiv and web page & PR
  • 24. Youtube Captioning Venugopalan, Xu, Donahue, Rohrbach, Mooney, Saenko; Translating Videos to Natural Language Using Deep Recurrent Neural Networks, NAACL, 2015
  • 25. Fully convolutional networks for pixel prediction applied to semantic segmentation - end-to-end learning - efficiency in inference and learning 175 ms per-image prediction - multi-modal, multi-task Image Segmentation Further applications - depth estimation - denoising Jon Long* & Evan Shelhamer*, Trevor Darrell arXiv and pre-release
  • 26. Deep Visuomotor Control Sergey Levine* & Chelsea Finn*, Trevor Darrell, and Pieter Abbeeltech report http://tinyurl.com/visuomotor example experiments Robotic manipulation
  • 27. Embedded Caffe Caffe on the NVIDIA Jetson TK1 mobile board - 10 watts of power - inference at 35 ms per image - no need to change the code
  • 28. WHY DEEP LEARNING? WHAT IS CAFFE? APPLICATIONS OF CAFFE NEURAL NETWORKS IN A SIP CAFFE INTRO STEP-BY-STEP EXAMPLES
  • 29. Neurons in the Brain
  • 31. +4 +2 0 +3 -2 0 -2 +4 0 +2 -8 0 Artificial Neuron: Activation Input Multiply by weights Sum Threshold
  • 32. +4 +2 0 +3 -2 0 -2 +4 0 +2 -8 0 0 -2 0 +2 +2 0 -2 +4 0 +2 +8 +8 Artificial Neuron: Activation Input Multiply by weights Sum Threshold Neurons learn patterns!
  • 34. Input Output Weights Artificial Neural Network Hidden LayerInput Layer Output Layer Deep Network: many layers!
  • 35. 10-1 10-1 10-1 Convolve with Threshold Convolutional Neural Network Input Layer Output Layer slide by Abi-Roozgard
  • 38. Convolutional Networks: 1989 LeNet: a layered model composed of convolution and subsampling operations followed by a holistic representation and ultimately a classifier for handwritten digits. [ Yann LeCun; LeNet ]
  • 39. Convolutional Nets: 2012 AlexNet: a layered model composed of convolution, subsampling, and further operations followed by a holistic representation and all-in-all a landmark classifier on ILSVRC12. [ AlexNet ] + data + gpu + non-saturating nonlinearity + regularization
  • 40. Convolutional Nets: 2014 ILSVRC14 Winners: ~6.6% Top-5 error - GoogLeNet: composition of multi-scale dimension- reduced modules (pictured) - VGG: 16 layers of 3x3 convolution interleaved with max pooling + 3 fully-connected layers + depth + data + dimensionality reduction
  • 41. [Zeiler-Fergus] 1st layer filters image patches that strongly activate 1st layer filters Why Deep Learning? The Unreasonable Effectiveness of Learning Deep Features
  • 42. [Zeiler-Fergus] Note the increase in visual complexity of feature activations as we go from “shallow” to “deep” layers.
  • 44. WHY DEEP LEARNING? WHAT IS CAFFE? APPLICATIONS OF CAFFE NEURAL NETWORKS IN A SIP CAFFE INTRO STEP-BY-STEP EXAMPLES
  • 45. Net name: "dummy-net" layers { name: "data" …} layers { name: "conv" …} layers { name: "pool" …} … more layers … layers { name: "loss" …} ● A network is a set of layers connected as a DAG: LogReg ↑ LeNet → ImageNet, Krizhevsky 2012 → ● Caffe creates and checks the net from the definition. ● Data and derivatives flow through the net as blobs – a an array interface
  • 46. Forward / Backward the essential Net computations Caffe models are complete machine learning systems for inference and learning. The computation follows from the model definition. Define the model and run.
  • 47. Layer name: "conv1" type: CONVOLUTION bottom: "data" top: "conv1" convolution_param { num_output: 20 kernel_size: 5 stride: 1 weight_filler { type: "xavier" } } name, type, and the connection structure (input blobs and output blobs) layer-specific parameters * Nets + Layers are defined by protobuf schema ● Every layer type defines - Setup - Forward - Backward
  • 48. Data Number x K Channel x Height x Width 256 x 3 x 227 x 227 for ImageNet train input Blobs are 4-D arrays for storing and communicating information. ● hold data, derivatives, and parameters ● lazily allocate memory ● shuttle between CPU and GPU Blob name: "conv1" type: CONVOLUTION bottom: "data" top: "conv1" … definition … top blob bottom blob Parameter: Convolution Weight N Output x K Input x Height x Width 96 x 3 x 11 x 11 for CaffeNet conv1 Parameter: Convolution Bias 96 x 1 x 1 x 1 for CaffeNet conv1
  • 49. Blobs provide a unified memory interface. Reshape(num, channel, height, width) - declare dimensions - make SyncedMem -- but only lazily allocate Blob cpu_data(), mutable_cpu_data() - host memory for CPU mode gpu_data(), mutable_gpu_data() - device memory for GPU mode {cpu,gpu}_diff(), mutable_{cpu,gpu}_diff() - derivative counterparts to data methods - easy access to data + diff in forward / backward SyncedMem allocation + communication
  • 50. Solving: Training a Net Optimization like model definition is configuration. train_net: "lenet_train.prototxt" base_lr: 0.01 momentum: 0.9 weight_decay: 0.0005 max_iter: 10000 snapshot_prefix: "lenet_snapshot" All you need to run things on the GPU. > caffe train -solver lenet_solver.prototxt -gpu 0 Stochastic Gradient Descent (SGD) + momentum · Adaptive Gradient (ADAGRAD) · Nesterov’s Accelerated Gradient (NAG)
  • 51. Setup: run once for initialization. Forward: make output given input. Backward: make gradient of output - w.r.t. bottom - w.r.t. parameters (if needed) Reshape: set dimensions. Layer Protocol Layer Development Checklist Compositional Modeling The Net’s forward and backward passes are composed of the layers’ steps.
  • 52. Layer Protocol == Class Interface Define a class in C++ or Python to extend Layer. Include your new layer type in a network and keep brewing. layer { type: ‘Python’ python_param { module: ‘layers’ layer: ‘EuclideanLoss’ } }
  • 53. Classification SoftmaxWithLoss HingeLoss Linear Regression EuclideanLoss Attributes / Multiclassification SigmoidCrossEntropyLoss Others… New Task NewLoss Loss What kind of model is this? Define the task by the loss. loss (LOSS_TYPE)
  • 54. Recipe for Brewing ● Convert the data to Caffe-format o lmdb, leveldb, hdf5 / .mat, list of images, etc. ● Define the Net ● Configure the Solver ● caffe train -solver solver.prototxt -gpu 0 ● Examples are your friends o caffe/examples/mnist,cifar10,imagenet o caffe/examples/*.ipynb o caffe/models/*
  • 55. WHY DEEP LEARNING? WHAT IS CAFFE? APPLICATIONS OF CAFFE NEURAL NETWORKS IN A SIP CAFFE INTRO STEP-BY-STEP EXAMPLES
  • 56. Classification instant recognition the Caffe way see notebook
  • 57. Brewing Models from logistic regression to non-linearity see notebook
  • 58. Latest Roast Parallelism: open pull requests - synchronous SGD #2114 - data parallelism - ~linear speedup on multiple GPUs for computation >> communication - Flickr + NVIDIA collaboration on distributed opt + asynchronous SGD Python Net Specification #2086 New MATLAB Interface #2505 These will be bundled in the next release shortly.
  • 59. Help Brewing Documentation - tutorial documentation - worked examples Modeling, Usage, and Install - caffe-users group - gitter.im chat Caffe @ CVPR15 Convolutional Nets - CS231n online convnet class by Andrej Karpathy and Fei- Fei Li. - Deep Learning for Vision tutorial from CVPR ‘14.
  • 60. Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell Thanks to the Caffe crew ...plus the cold-brew and open source contributors!
  • 61. Acknowledgements Thank you to the Berkeley Vision and Learning Center Sponsors. Thank you to NVIDIA for GPU donation and collaboration on cuDNN. Thank you to our 75+ open source contributors and vibrant community. Thank you to A9 and AWS for a research grant for Caffe dev and reproducible research.
  • 62. References [ DeCAF ] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. Decaf: A deep convolutional activation feature for generic visual recognition. ICML, 2014. [ R-CNN ] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR, 2014. [ Zeiler-Fergus ] M. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. ECCV, 2014. [ LeNet ] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. IEEE, 1998. [ AlexNet ] A. Krizhevsky, I. Sutskever, and G. Hinton. Imagenet classification with deep convolutional neural networks. NIPS, 2012. [ OverFeat ] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun. Overfeat: Integrated recognition, localization and detection using convolutional networks. ICLR, 2014. [ Image-Style ] S. Karayev, M. Trentacoste, H. Han, A. Agarwala, T. Darrell, A. Hertzmann, H. Winnemoeller. Recognizing Image Style. BMVC, 2014. [ Karpathy14 ] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. CVPR, 2014. [ Sutskever13 ] I. Sutskever. Training Recurrent Neural Networks. PhD thesis, University of Toronto, 2013. [ Chopra05 ] S. Chopra, R. Hadsell, and Y. LeCun. Learning a similarity metric discriminatively, with application to face verification. CVPR, 2005.

Hinweis der Redaktion

  1. Evan
  2. over 100 open source contributors
  3. Yangqing
  4. Rich set of tasks that can be done through learning Can do them better than before without individual solutions
  5. Two essential parts: compositional, learn hierarchical representations (pixels->textures->parts->objects) etc. through learning learn layers of abstractions, e.g. recognizing your friend compositionality helps to break down the complex task
  6. Need to learn layers of representations End-to-end learning jointly learns all parameters for the task Given current model’s predictions, can measure error and adjust the model to improve it Over the course of many iterations the models improves At the output we know if it’s right or wrong, at the lower layer we propagate the error from the top
  7. this comics looks at why AI problems are hard, what is the issue? If we want to know where the photo is taken, it is a simple problem, look up But if we want to know what is in it, it si hard Perception is complicated, the answer is not just a look up of individual pixel values
  8. Yet Flickr did it in a weekend! Tells you where photo is taken and if it is a bird A fwe years ago it would be a whole research project, now it’s a side project we can do for fun (point out basic steps, each step leads to next steps, finally get answer)
  9. Yangqing
  10. Caffe is not only a library for deep learning but also test, recipes, and models Can get started from the work of others Run it wherever you want, start on laptop, send to high end GPU, cluster or cloud Model is independent of the processor
  11. Changes are contributed more than once a day on average used in academic research labs, also in industry and start ups, enthusiasts Get to take advantage of community expertise
  12. The key part of Caffe framework is the reference models We offer model definitions, setting for optimize th models, and pretrained weights so you can start right away In addition to official BVLC models the community shares in model zoo Good way to see latest research results coming out
  13. The time from being given an image to producing a classification output is less than ms This translates into 72 images /day On CPU, still fast
  14. Yangqing
  15. We have an online demo that you could easily deploy
  16. Researchrs at MIT made a scene recognizer
  17. Can even recognize how the image is portrayed All these applications are using the same model, just trained on different data We will only look at applications in vision but networks can be used for many other tasks (speech, text, robotics)
  18. Learning models end-to-end not only for vision but also for control Robot learns to do tasks in real time by going directly from camera inputs and joint angles
  19. Instead of being tied to a workstation GPU there are also embedded solutions Caffe runs out of the box
  20. Yangqing
  21. The human brain is composed of billions of cells called neurons Neurons are connected in a complex network, each receives electrical signals on ‘input wires’ and sends out signal on ‘output wires’
  22. Simplify representation (omit weights, combine sum+threshold)
  23. Convolutional networks work by applying filters For each patch we can look at strongest response and summarize it The whole model is trained by measuring errors on suprevised data
  24. In 2012 we have the same fundamental steps but more data, deeper, more computation and some technical improvements
  25. structure is more complicated but have same fundamental pieces we can define experiment and deploy all these models in caffe because theyr’e made of the same pieces
  26. Yangqing
  27. Evan
  28. What does a layer do? Foreard used to make output Backward used during learning Reshape determines the layer dimensions given input dimensions, filter size, stride etc., checks for mistakes you can make your own layers! include a checklist need to define
  29. example code for layer defined in python
  30. what defines the models you’re learning is the loss if we take a simple linear model and add a softmax we get a classifier if we add euclidean loss we get a regression same architecvture can learn different tasks can extend to new tasks by defining new losses
  31. General outline for brewing a model in Caffe We include many examples for defining models and optimization include pretrained weights/optimizations settings
  32. Yangqing
  33. Take the model powering the online demo, look at an example of running it for inference on an example image This demo is more than 1ms per image but can get better efficiency by batching up images Take a tour of features, show existing structure of the model
  34. define network in python but look at generated prototext caffe models are defined by text which has pros and cons instead of writing the verbose model format in the schema we have a succinct way to define it in python layer by layer look at prototext to point out layers the model is just a set of layers and their interconections also need settings for optimization, models are sensitive to all these settings
  35. Start with classic logistic regression and extend it to a nonlinear model Classic logistic regression predicts a binary output We will learn a classifier for vectors generate synthetic data with 2 usefl features and 2 noise use scikit to learn what is the point? we can extend the model and make it more advanced by adding a layer parameters that learns a nonlinear model even logistic regression can be seen as a network but we can extend it to be a layered network we add a hidden layer of dimension 40 if you had done good old logistic regression there is not next step with caffe you have a way to improve on your results
  36. Finetuning rules since most people don’t have an imagnet worth of data this is what you will most likely do a strength of deep models is they can learn incrementally and adapt to new tasks we can take lots of data, learn a model on it, and use it to initialize for a new task Example; a student from S Korea took 10 minutes to place in top 10 in a cats vs dogs using a pretrained caffe model for object recognition
  37. we need a new output layer, imagenet had a 1000 layer of outputs in this task we have 20 labels so we will make a new layer give it a new name to signal to caffe that this is a new layer to be learned
  38. start with a 1000 object classifier network and learn to recognize image style e.g. romantic the learning plos show green (from scratch) and blue (fine-tuned) the blue does remarkably better underscore how successful finetuning has been even across modalities important in practice because we don’t have infinite data
  39. important changes on the way for merging into official Caffe these will all come out shortly in the next release
  40. Join the community for help! The Caffe crew are attending CVPR ‘15 and giving a tutorial.
  41. over 100 open source contributors