SlideShare a Scribd company logo
1 of 47
Download to read offline
Deep learning basics for NLP
Arvind Devaraj (Grouply.org)
Lead Data Scientist, Embibe
Agenda
Classical ML vs Deep Learning
What is Deep learning for NLP
Learning Resources
How Machine translation (NMT) works
ML Tools
Scikit-learn
Jupyter
Pandas
TensorFlow
Google colab
Nltk/spacy
ML Courses
MLcourse.Ai
Cs231n (stanford)
Fast.Ai
Deeplearning.Ai
Coursera(nlp)
Analyticsvidya
Edureka
Simplilearn
Siraj Raval
Youtube channels and Books
KrisNaik(deployment)
StatQuest
DeepLearning.TV
CodeEmporium
3blue1brown
Bob Trenwith
Data School
Cognitive Class
The Hundred-page Machine Learning Book
Hands on ML with scikit learn, keras and
tensorflow
Deep learning (Ian Goodfellow)
Refer Grouply.org
ML Types
ML Algorithms
1. Linear Regression
2. Logistic Regression
3. Decision Trees
4. Random Forest
5. Clustering (K means, Hierarchical)
Machine learning skills
1. Math skills
Probability and statistics
Linear Algebra
Calculus
2. Programming in Python/R
3. Data Engineer skills
1. Ability to work with large amounts of data
2. Data preprocessing
3. Knowledge of SQL and NoSQL
4. ETL (Extract Transform and Load) operations
5. Data Analysis and Visualization skills
Project ideas
NLP
Sentiment analysis
Spam classifier
Text classifier (Tagging documents)
Smartbook
Kaggle
Classical machine learning
More info
Deep learning for NLP
Introduction to Deep learning
Difference between classical ML and DL
Architectures - MLP, CNN, RNN
Frameworks - Tensorflow, Keras, Pytorch
Model 1
Neural Networks
Neural Network
Neural Networks act as a ‘black box’ that takes inputs and predicts an output.
Neural Network
Neural Network Training
Basic Neural Network
Neural Network
Neural Network
‘Black box’ that takes inputs and predicts an output.
Trained using known (input,output) , approximates the function and maps new inputs
Learns the function mapping inputs to output by adjusting the internal parameters (weights)
Word2vec using neural networks
Word2vec using neural networks
Word2vec using neural networks
Types of Neural Nets
Model 2
RNN
RNN
RNNs are used when the inputs have some state information
Examples include time series and word sequences
Can captures the essence of the sequence in its hidden state (context)
https://towardsdatascience.com/illustrated-guide-to-recurrent-neural-networks-79e5eb8049c9
RNN - unrolled in time
RNN
RNN can learn sentence representations
RNNs can predict
RNN can learn a probability function p( word | previous-words )
RNN - Summary
RNN can encode sentence meaning
Can predict next word given a sequence of words.
Can be trained to learn a language model
Translation
Two RNNs jointly trained (one using English, another with German)
Training using Parallel corpora
Models
1. Neural Networks
2. Recurrent Neural Networks
3. Encoder-Decoder
4. Attention Models
5. Transformer
Model 3
Encoder Decoder
Encoder Decoder
http://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/
Encoder Decoder
Encoder-Decoder in Translation
Encoder - Decoder summary
2014 -Google replaced their Statistical model with NMT
Due to its flexibility it is the goto framework for NLG with different models taking
roles of encoder and decoder
The decoder can not only be conditioned on a sequence but on arbitrary
representation enabling lot of use cases (like generating caption from image)
Translation using Encoder-Decoder
Issues
● Meaning is crammed
● Long term dependencies
● Word alignment
● Word interdependencies
Translation issues (long term dependencies)
The model cannot remember enough
difficult to retain >30 words.
"Born in France..went to EPFL Switzerland..I speak fluent ..."
Pay selective attention
Pay selective attention
First introduced in Image Recognition
Translation with Attention
Translation with transformer
Machine translation
NN can encode words
RNN can encode sentences
Long sentences need changes to RNN architecture
Two RNNs can act as encoder and decoder (of any representation)
Encoding everything into single context loses information
Selectively pay attention to the inputs we need.
Get rid of RNN and use only attention mechanism. Make it parallelizable
Richer representation of inputs using Self-attention.
Use Encoder-Decoder attention as usual for translation
Word2vec to BERT

More Related Content

What's hot

What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
Simplilearn
 

What's hot (20)

Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
 
Attention Is All You Need
Attention Is All You NeedAttention Is All You Need
Attention Is All You Need
 
Transformers in 2021
Transformers in 2021Transformers in 2021
Transformers in 2021
 
Attention is All You Need (Transformer)
Attention is All You Need (Transformer)Attention is All You Need (Transformer)
Attention is All You Need (Transformer)
 
BERT
BERTBERT
BERT
 
Transformers In Vision From Zero to Hero (DLI).pptx
Transformers In Vision From Zero to Hero (DLI).pptxTransformers In Vision From Zero to Hero (DLI).pptx
Transformers In Vision From Zero to Hero (DLI).pptx
 
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
 
Introduction to Named Entity Recognition
Introduction to Named Entity RecognitionIntroduction to Named Entity Recognition
Introduction to Named Entity Recognition
 
[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need[Paper Reading] Attention is All You Need
[Paper Reading] Attention is All You Need
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language UnderstandingBERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
 
rnn BASICS
rnn BASICSrnn BASICS
rnn BASICS
 
Diffusion models beat gans on image synthesis
Diffusion models beat gans on image synthesisDiffusion models beat gans on image synthesis
Diffusion models beat gans on image synthesis
 
Neural Language Generation Head to Toe
Neural Language Generation Head to Toe Neural Language Generation Head to Toe
Neural Language Generation Head to Toe
 
Transformer Seq2Sqe Models: Concepts, Trends & Limitations (DLI)
Transformer Seq2Sqe Models: Concepts, Trends & Limitations (DLI)Transformer Seq2Sqe Models: Concepts, Trends & Limitations (DLI)
Transformer Seq2Sqe Models: Concepts, Trends & Limitations (DLI)
 
Bert
BertBert
Bert
 
BERT Finetuning Webinar Presentation
BERT Finetuning Webinar PresentationBERT Finetuning Webinar Presentation
BERT Finetuning Webinar Presentation
 
Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)Introduction to Natural Language Processing (NLP)
Introduction to Natural Language Processing (NLP)
 
Introduction to Deep learning
Introduction to Deep learningIntroduction to Deep learning
Introduction to Deep learning
 
Deep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word EmbeddingsDeep Learning for Natural Language Processing: Word Embeddings
Deep Learning for Natural Language Processing: Word Embeddings
 

Similar to Deep learning for NLP and Transformer

Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningDeep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
BigDataCloud
 

Similar to Deep learning for NLP and Transformer (20)

Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Deep Learning Enabled Question Answering System to Automate Corporate HelpdeskDeep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk
 
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningDeep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
 
Distributed deep learning_over_spark_20_nov_2014_ver_2.8
Distributed deep learning_over_spark_20_nov_2014_ver_2.8Distributed deep learning_over_spark_20_nov_2014_ver_2.8
Distributed deep learning_over_spark_20_nov_2014_ver_2.8
 
Recurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text AnalysisRecurrent Neural Networks for Text Analysis
Recurrent Neural Networks for Text Analysis
 
Sjug #26 ml is in java but is dl too - ver1.04 - tomasz sikora 2018-03-23
Sjug #26   ml is in java but is dl too - ver1.04 - tomasz sikora 2018-03-23Sjug #26   ml is in java but is dl too - ver1.04 - tomasz sikora 2018-03-23
Sjug #26 ml is in java but is dl too - ver1.04 - tomasz sikora 2018-03-23
 
Natural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application TrendsNatural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application Trends
 
Understanding deep learning
Understanding deep learningUnderstanding deep learning
Understanding deep learning
 
Deep Learning and Watson Studio
Deep Learning and Watson StudioDeep Learning and Watson Studio
Deep Learning and Watson Studio
 
AI and Deep Learning
AI and Deep Learning AI and Deep Learning
AI and Deep Learning
 
Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016Kaz Sato, Evangelist, Google at MLconf ATL 2016
Kaz Sato, Evangelist, Google at MLconf ATL 2016
 
IRJET- Survey on Generating Suggestions for Erroneous Part in a Sentence
IRJET- Survey on Generating Suggestions for Erroneous Part in a SentenceIRJET- Survey on Generating Suggestions for Erroneous Part in a Sentence
IRJET- Survey on Generating Suggestions for Erroneous Part in a Sentence
 
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
Jeremy Nixon, Machine Learning Engineer, Spark Technology Center at MLconf AT...
 
BigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for SparkBigDL webinar - Deep Learning Library for Spark
BigDL webinar - Deep Learning Library for Spark
 
Image Recognition on AWS with Apache Spark and BigDL
Image Recognition on AWS with Apache Spark and BigDLImage Recognition on AWS with Apache Spark and BigDL
Image Recognition on AWS with Apache Spark and BigDL
 
The State of ML for iOS: On the Advent of WWDC 2018 🕯
The State of ML for iOS: On the Advent of WWDC 2018 🕯The State of ML for iOS: On the Advent of WWDC 2018 🕯
The State of ML for iOS: On the Advent of WWDC 2018 🕯
 
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
Suggestion Generation for Specific Erroneous Part in a Sentence using Deep Le...
 
The State of #NLProc
The State of #NLProcThe State of #NLProc
The State of #NLProc
 
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning Big Data Intelligence: from Correlation Discovery to Causal Reasoning
Big Data Intelligence: from Correlation Discovery to Causal Reasoning
 
Thinking in parallel ab tuladev
Thinking in parallel ab tuladevThinking in parallel ab tuladev
Thinking in parallel ab tuladev
 
Deeplearning on Hadoop @OSCON 2014
Deeplearning on Hadoop @OSCON 2014Deeplearning on Hadoop @OSCON 2014
Deeplearning on Hadoop @OSCON 2014
 

More from Arvind Devaraj

Yourstory Android Workshop
Yourstory Android WorkshopYourstory Android Workshop
Yourstory Android Workshop
Arvind Devaraj
 
Android High performance in GPU using opengles and renderscript
Android High performance in GPU using opengles and renderscriptAndroid High performance in GPU using opengles and renderscript
Android High performance in GPU using opengles and renderscript
Arvind Devaraj
 
Sorting (introduction)
 Sorting (introduction) Sorting (introduction)
Sorting (introduction)
Arvind Devaraj
 
Data structures (introduction)
 Data structures (introduction) Data structures (introduction)
Data structures (introduction)
Arvind Devaraj
 
Signal Processing Introduction using Fourier Transforms
Signal Processing Introduction using Fourier TransformsSignal Processing Introduction using Fourier Transforms
Signal Processing Introduction using Fourier Transforms
Arvind Devaraj
 

More from Arvind Devaraj (20)

Nodejs presentation
Nodejs presentationNodejs presentation
Nodejs presentation
 
Career hunt pitch
Career hunt pitchCareer hunt pitch
Career hunt pitch
 
Career options for CS and IT students
Career options for CS and IT studentsCareer options for CS and IT students
Career options for CS and IT students
 
Careerhunt ebook
Careerhunt ebookCareerhunt ebook
Careerhunt ebook
 
Static Analysis of Computer programs
Static Analysis of Computer programs Static Analysis of Computer programs
Static Analysis of Computer programs
 
Hyperbook
HyperbookHyperbook
Hyperbook
 
Yourstory Android Workshop
Yourstory Android WorkshopYourstory Android Workshop
Yourstory Android Workshop
 
Android High performance in GPU using opengles and renderscript
Android High performance in GPU using opengles and renderscriptAndroid High performance in GPU using opengles and renderscript
Android High performance in GPU using opengles and renderscript
 
OpenGLES Android Graphics
OpenGLES Android GraphicsOpenGLES Android Graphics
OpenGLES Android Graphics
 
Broadcast Receiver
Broadcast ReceiverBroadcast Receiver
Broadcast Receiver
 
AIDL - Android Interface Definition Language
AIDL  - Android Interface Definition LanguageAIDL  - Android Interface Definition Language
AIDL - Android Interface Definition Language
 
NDK Programming in Android
NDK Programming in AndroidNDK Programming in Android
NDK Programming in Android
 
Google Cloud Messaging
Google Cloud MessagingGoogle Cloud Messaging
Google Cloud Messaging
 
OpenGLES - Graphics Programming in Android
OpenGLES - Graphics Programming in Android OpenGLES - Graphics Programming in Android
OpenGLES - Graphics Programming in Android
 
Operating system
Operating systemOperating system
Operating system
 
Sorting (introduction)
 Sorting (introduction) Sorting (introduction)
Sorting (introduction)
 
Data structures (introduction)
 Data structures (introduction) Data structures (introduction)
Data structures (introduction)
 
Graphics programming in open gl
Graphics programming in open glGraphics programming in open gl
Graphics programming in open gl
 
Signal Processing Introduction using Fourier Transforms
Signal Processing Introduction using Fourier TransformsSignal Processing Introduction using Fourier Transforms
Signal Processing Introduction using Fourier Transforms
 
Thesis: Slicing of Java Programs using the Soot Framework (2006)
Thesis:  Slicing of Java Programs using the Soot Framework (2006) Thesis:  Slicing of Java Programs using the Soot Framework (2006)
Thesis: Slicing of Java Programs using the Soot Framework (2006)
 

Recently uploaded

Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 
An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdf
SanaAli374401
 

Recently uploaded (20)

Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
SECOND SEMESTER TOPIC COVERAGE SY 2023-2024 Trends, Networks, and Critical Th...
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
APM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across SectorsAPM Welcome, APM North West Network Conference, Synergies Across Sectors
APM Welcome, APM North West Network Conference, Synergies Across Sectors
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
Ecological Succession. ( ECOSYSTEM, B. Pharmacy, 1st Year, Sem-II, Environmen...
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
An Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdfAn Overview of Mutual Funds Bcom Project.pdf
An Overview of Mutual Funds Bcom Project.pdf
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024Mehran University Newsletter Vol-X, Issue-I, 2024
Mehran University Newsletter Vol-X, Issue-I, 2024
 
ICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptxICT Role in 21st Century Education & its Challenges.pptx
ICT Role in 21st Century Education & its Challenges.pptx
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 

Deep learning for NLP and Transformer