Deep learning

Introduction to Deep Learning
RATNAKAR PANDEY
Are Artificial Intelligence, Machine Learning and Deep Learning the same thing? What about Data Science?
Source: https://www.linkedin.com/pulse/artificial-intelligence-machine-learning-deep-same-thing-pandey/
Artificial Intelligence
AI is any technique, code or algorithm that enables machines to develop, demonstrate and mimic human cognitive behavior or intelligence, hence the name "Artificial Intelligence"
• AI doesn't mean machines will do everything; AI is better represented as "Augmented Intelligence", i.e. Man + Machine, to solve business problems better and faster
• AI won't replace managers, but managers who use AI will replace those who don't
• Some of the most successful applications of AI around us can be seen in Robotics, Computer Vision, Virtual Reality, Speech Recognition, Automation, Gaming and so on…
Machine Learning
• Machine Learning is the subfield of AI that gives machines the ability to improve their performance over time without explicit human intervention
• In this approach, machines are shown thousands or millions of examples and trained to solve a problem correctly
• Most current applications of Machine Learning leverage supervised learning
• Other uses of ML can be broadly classified into unsupervised learning and reinforcement learning
Source: https://hbr.org/cover-story/2017/07/the-business-of-artificial-intelligence
Data Science
• Data Science is a field that intersects AI, Machine Learning and Deep Learning and enables statistically driven decision making
• Data Science is the art and science of drawing actionable insights from data
• Data Science + Business Knowledge = Impact/Value Creation for the Business
• Generally speaking, Data Scientists and Analytics Professionals try to answer the following questions through their analysis:
• Descriptive Analytics (What has happened?)
• Diagnostic Analytics (Why did it happen?)
• Predictive Analytics (What may happen in the future?)
• Prescriptive Analytics (What course of action should we follow?)
Deep Learning
• Deep Learning is a subfield of Machine Learning that tries to closely mimic the working of the human brain using neurons
• These techniques focus on building Artificial Neural Networks (ANNs) with several hidden layers
• There are a variety of deep learning networks, such as the Multilayer Perceptron (MLP), Autoencoders (AE), Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN)
Source: https://www.quora.com/What-are-the-types-of-deep-neural-networks-and-how-can-one-categorize-them-and-their-related-algorithms-as-either-shallow-or-deep/answer/Ratnakar-Pandey-RP
Why Deep Learning is Growing
• The processing power needed for deep learning is readily available through GPUs, distributed computing and powerful CPUs
• Moreover, as the amount of data grows, Deep Learning models seem to outperform Machine Learning models
• Explosion of features and datasets
• Focus on customization and real-time decisioning
Why Deep Learning is Growing
• Uncover patterns that are hard to detect with traditional techniques when the incidence rate is low
• Find latent features (super variables) without significant manual feature engineering
• Real-time fraud detection and self-learning models using streaming data (Kafka, MapR)
• Ensure consistent customer experience and regulatory compliance
• Higher operational efficiency
[Figure: 10,000+ features spanning unstructured, transactional, social, device & IP, third-party and bureau data]
Challenges with Deep Learning
• Works better with large amounts of data
• Some models are very hard to train and may take weeks or months
• Overfitting
• Black box, and hence may face regulatory challenges, particularly in BFSI (banking, financial services and insurance)
Source: http://www.npr.org/sections/thesalt/2016/03/11/470084215/canine-or-cuisine-this-photo-meme-is-fetching
Deep Learning Building Blocks
Multilayer Perceptron (MLP)
• These are the most basic networks; they feed the inputs forward to create an output. They consist of an input layer, an output layer, and many interconnected hidden layers and neurons between the input and the output layers.
• They generally use a non-linear activation function such as ReLU or tanh, and compute a loss (the difference between the true output and the computed output) such as Mean Squared Error (MSE) or log loss.
• This loss is propagated backward to adjust the weights during training, minimizing the loss and making the model more accurate (see the sketch below).
[Figure: a single neuron combining weighted inputs (w1, w2, …, wn) and a bias, passed through an activation function]
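To make this concrete, here is a minimal MLP sketch in Keras; the layer sizes, activations and toy data below are illustrative assumptions, not an architecture taken from these slides.

```python
# A minimal MLP sketch: input layer, two hidden layers, one output layer.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy data (assumption): 1,000 samples with 20 features, binary labels.
X = np.random.rand(1000, 20)
y = np.random.randint(0, 2, size=(1000,))

model = keras.Sequential([
    layers.Input(shape=(20,)),              # input layer: one neuron per feature
    layers.Dense(64, activation="relu"),    # hidden layer with ReLU activation
    layers.Dense(64, activation="tanh"),    # hidden layer with tanh activation
    layers.Dense(1, activation="sigmoid"),  # output layer for a binary output
])

# Log loss (binary cross-entropy) is minimized via backpropagation.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
```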
Key Components and Hyperparameters
• Number of Layers- input layer, output layer and hidden layers. The more layers, the deeper the network.
• Number of Neurons- how many neurons in each layer. Input layer neurons depend on the number of features, output layer neurons on the number of outputs, and hidden layer neurons need to be optimized.
• Weights- the importance given to each factor in computing the output. Typically chosen randomly in the first run and optimized using backpropagation.
• Activation Function- the function applied to the weighted sum of inputs plus a bias to generate each neuron's output.
• Forward Propagation- weights for each input are initialized to make predictions and compute the error. The output from each layer is fed forward to the next layer (see the sketch below).
• Loss Function- computes the error between actual and predicted values and measures the model's performance. Hyperparameters are fine-tuned to minimize the loss function. Some common loss functions are Mean Squared Error, log loss and cross-entropy.
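A minimal NumPy sketch of forward propagation through one layer and an MSE loss; all shapes and values here are illustrative.

```python
# Forward propagation for a single layer: activation of (inputs x weights + bias).
import numpy as np

def relu(z):
    return np.maximum(0, z)

x = np.array([0.5, -1.2, 3.0])      # inputs: one observation, 3 features
W = np.random.randn(3, 4) * 0.1     # weights: chosen randomly on the first run
b = np.zeros(4)                     # bias

a = relu(x @ W + b)                 # layer output, fed forward to the next layer

y_true = np.array([1.0, 0.0, 0.0, 1.0])
mse = np.mean((y_true - a) ** 2)    # Mean Squared Error between actual and predicted
print(mse)
```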
Popular Activation Functions
Most activation functions are non-linear because most real-world problems are non-linear.
Source: https://en.wikipedia.org/wiki/Activation_function
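For reference, a few common activation functions written out in NumPy; this is an illustrative sketch, not tied to any particular entry in the linked table.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # squashes values into (0, 1)

def tanh(z):
    return np.tanh(z)                 # squashes values into (-1, 1)

def relu(z):
    return np.maximum(0, z)           # zero for negative inputs, identity otherwise

z = np.linspace(-3, 3, 7)
print(sigmoid(z), tanh(z), relu(z), sep="\n")
```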
Key Components and Hyperparameters
• Backpropagation- propagate the error backward (starting from the output layer) to the previous layers and update the weights
• Gradient Descent and Optimization Algorithms- used to optimize the weights based on the backpropagated error signal and the chain rule
• Epochs- one complete pass of the entire training data through feedforward and backpropagation
• Batch Size- the number of input observations processed before the weights are updated
• Dropout- x% of nodes are dropped out to regularize the weights and reduce overfitting, leveraging the community effect of neurons rather than dependence on a few players
• Optimizer and Learning Rate- optimizers control how the weights are updated, e.g. via Stochastic Gradient Descent (SGD), to find the best solution. If the network learns too fast, it may find suboptimal solutions; if it learns too slowly, it will take very long to train. Common optimizers are Adam, SGD, RMSprop etc. (see the sketch below)
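A hedged Keras sketch showing where these hyperparameters typically appear in code; the specific values (learning rate, dropout rate, epochs, batch size) are arbitrary assumptions, not recommendations.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy data (assumption): 500 samples, 20 features, binary labels.
X = np.random.rand(500, 20)
y = np.random.randint(0, 2, size=(500,))

model = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.2),                    # dropout: 20% of nodes dropped during training
    layers.Dense(1, activation="sigmoid"),
])

# Optimizer and learning rate: too high risks suboptimal solutions,
# too low makes training very slow.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss="binary_crossentropy")

# epochs = full passes over the data; batch_size = observations per weight update.
model.fit(X, y, epochs=10, batch_size=32, verbose=0)
```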
Autoencoders
• Autoencoders follow "Representation Learning"
• The concept of the AE is quite simple: input vectors are used to compute the output vectors, and the output vectors are trained to be the same as the input vectors.
• The reconstruction error is computed, and data points with higher reconstruction error are treated as outliers
• AEs are used for unsupervised learning, feature reduction, and speech and image recognition (a minimal sketch follows).
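A minimal autoencoder sketch in Keras under stated assumptions (toy data, arbitrary layer sizes); the target output is the input itself, and reconstruction error flags candidate outliers.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy data (assumption): 1,000 points with 30 features.
X = np.random.rand(1000, 30)

autoencoder = keras.Sequential([
    layers.Input(shape=(30,)),
    layers.Dense(8, activation="relu"),      # encoder: compressed representation
    layers.Dense(30, activation="sigmoid"),  # decoder: reconstruct the input
])
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=10, batch_size=32, verbose=0)  # note: input == target

# Points with the highest reconstruction error are candidate outliers.
reconstruction_error = np.mean((X - autoencoder.predict(X)) ** 2, axis=1)
outliers = np.argsort(reconstruction_error)[-10:]
```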
Convolution Neural Network (CNN)
• Convolutional Neural Networks (CNNs) significantly enhance the capabilities of feedforward networks such as the MLP by inserting convolution layers.
• They are particularly suitable for spatial data, object recognition and image analysis, using multidimensional neuron structures.
• CNNs use convolutions (a linear operation) rather than the matrix multiplication used in MLPs
• Typically a CNN has three stages: a convolution stage, a detector layer (non-linear activation) and a pooling layer
Convolution Neural Network (CNN)
• Convolution Layer- the most important component of the CNN. The layer has kernels (learnable filters), and the x and y dimensions of the input are convolved with them (a dot product) to generate feature maps
• Detector Layer- the feature maps are passed through a non-linear activation function, such as ReLU, to accentuate the non-linear components of the feature maps
• Pooling Layer- a pooling layer such as "max pooling" summarizes (sub-samples) the responses from several inputs of the previous layer and serves to reduce the size of the spatial representation, allowing the next layer to look at a bigger region (see the sketch below)
Source: Deep Learning book (MIT Press)
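A Keras sketch of the three stages named above (convolution, detector, pooling); the input shape, filter counts and class count are illustrative assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),         # e.g. grayscale images (assumption)
    layers.Conv2D(32, kernel_size=(3, 3)),   # convolution stage: learnable kernels
    layers.Activation("relu"),               # detector stage: non-linear activation
    layers.MaxPooling2D(pool_size=(2, 2)),   # pooling stage: sub-sample feature maps
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),  # e.g. 10-class output (assumption)
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```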
Recurrent Neural Network(RNN)
• RNNs are also feedforward networks, but with recurrent memory loops that take input from previous and/or the same layers or states.
• This gives them a unique capability to model along the time dimension and over arbitrary sequences of events and inputs.
• RNNs are used for sequence data analysis such as time series, sentiment analysis, NLP, language translation, speech recognition, image captioning and script recognition, among other things.
• These are also called networks with memory, as previous inputs or states may persist (be stored) in the model for sequential analysis; these memories become inputs as well (see the sketch below)
Recurrent Neural Network(RNN)
• Long Short-Term Memory (LSTM) is one of the most frequently used RNN models
• These models help us overcome NLP challenges that can't be solved by "bag of words" analysis, e.g. distinguishing:
"The flight was good, not bad at all"
vs
"The flight was bad, not good at all"
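A hedged Keras sketch of an LSTM for sentiment-style classification; the vocabulary size, sequence length and layer sizes are assumptions. Unlike a bag-of-words model, the LSTM sees word order, so the two sentences above would produce different hidden states.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(20,), dtype="int32"),          # sequences of 20 word indices
    layers.Embedding(input_dim=5000, output_dim=32),   # learned word vectors
    layers.LSTM(64),                                   # order-aware sequence encoder
    layers.Dense(1, activation="sigmoid"),             # positive/negative sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```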