History of neural networks
In 1943, neurophysiologist Warren McCulloch and mathematician Walter Pitts wrote a paper
on how neurons might work. To describe how neurons in the brain might operate, they
modelled a simple neural network using electrical circuits.
In 1949, Donald Hebb wrote The Organization of Behavior, a work which pointed out that
neural pathways are strengthened each time they are used, a concept fundamental to the ways
in which humans learn. If two nerves fire at the same time, he argued, the connection between
them is enhanced.
As computers became more advanced in the 1950s, it finally became possible to simulate a
hypothetical neural network. The first step towards this was made by Nathaniel Rochester
from the IBM research laboratories. Unfortunately for him, the first attempt to do so failed.
In 1959, Bernard Widrow and Marcian Hoff of Stanford developed models called
"ADALINE" and "MADALINE." In a typical display of Stanford's love for acronyms, the
names come from their use of Multiple Adaptive Linear Elements. ADALINE was developed
to recognize binary patterns so that if it was reading streaming bits from a phone line, it could
predict the next bit. MADALINE was the first neural network applied to a real-world
problem, using an adaptive filter that eliminates echoes on phone lines. Although the system
is about as old as air traffic control systems, like those systems it is still in commercial use.
In 1962, Widrow & Hoff developed a learning procedure that examines the value on a line
before the weight adjusts it (i.e. 0 or 1) according to the rule: Weight Change = (pre-weight
line value) * (error / number of inputs). It is based on the idea that while one active
perceptron may have a large error, the weight values can be adjusted to distribute that error
across the network, or at least to adjacent perceptrons. Applying this rule still leaves an error
if the line before the weight is 0, although this will eventually correct itself. If the error is
conserved so that all of it is distributed to all of the weights, then the error is eliminated.
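A minimal sketch of this delta-rule (least-mean-squares) update, assuming a single ADALINE-style linear unit; the function and variable names are illustrative, not taken from Widrow and Hoff's original formulation:

```python
import numpy as np

def delta_rule_step(weights, inputs, target):
    """One Widrow-Hoff style update: spread the output error across the
    weights in proportion to each input line's value."""
    output = np.dot(weights, inputs)           # linear combination of the input lines
    error = target - output                    # how far the unit is from the desired value
    # Weight change = (pre-weight line value) * (error / number of inputs)
    weights = weights + inputs * (error / len(inputs))
    return weights, error

# Hypothetical usage: two binary input lines, desired output 1.0.
w = np.zeros(2)
for _ in range(30):
    w, e = delta_rule_step(w, np.array([1.0, 0.0]), target=1.0)
print(w, e)  # the weight on the active line converges toward 1.0, the error toward 0
```

Note that a weight whose input line is 0 is never changed by this rule, which is exactly the residual-error case described above.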
Despite the later success of the neural network, traditional von Neumann architecture took
over the computing scene, and neural research was left behind. Ironically, John von Neumann
himself suggested the imitation of neural functions by using telegraph relays or vacuum
tubes.
In the same time period, an influential paper (Minsky and Papert's Perceptrons, 1969)
suggested that the single-layer neural network could not be extended to a multi-layer one. In
addition, many people in the field were using a learning function that was fundamentally
flawed because it was not differentiable across the entire line. As a result, research and
funding dropped drastically.
This was coupled with the fact that the early successes of some neural networks led to an
exaggeration of the potential of neural networks, especially considering the practical
technology at the time. Promises went unfulfilled, and at times greater philosophical
questions led to fear. Writers pondered the effect that the so-called "thinking machines"
would have on humans, ideas which are still around today.
The idea of a computer which programs itself is very appealing. If Microsoft's Windows 2000
could reprogram itself, it might be able to repair the thousands of bugs that the programming
staff introduced. Such ideas were appealing but very difficult to implement. In addition, von
Neumann architecture was gaining in popularity. There were a few advances in the field, but
for the most part research efforts were few and far between.
In 1972, Kohonen and Anderson independently developed similar networks, which we will
discuss more later. They both used matrix mathematics to describe their ideas but did not
realize that what they were doing was creating an array of analog ADALINE circuits. The
neurons are supposed to activate a set of outputs instead of just one. The first multilayered
network, an unsupervised network, was developed in 1975.
In 1982, interest in the field was renewed. John Hopfield of Caltech presented a paper to the
National Academy of Sciences. His approach was to create more useful machines by using
bidirectional lines. Previously, the connections between neurons had been one-way only.
That same year, Reilly and Cooper used a "Hybrid network" with multiple layers, each layer
using a different problem-solving strategy.
Also in 1982, there was a joint US-Japan conference on Cooperative/Competitive Neural
Networks. Japan announced a new Fifth Generation effort on neural networks, and US papers
generated worry that the US could be left behind in the field. (Fifth-generation computing
involves artificial intelligence. The first generation used switches and wires, the second
generation used the transistor, the third used solid-state technology such as integrated circuits
and higher-level programming languages, and the fourth generation is code generators.) As a
result, there was more funding and thus more research in the field.
In 1986, with multi-layer neural networks in the news, the problem was how to extend the
Widrow-Hoff rule to multiple layers. Three independent groups of researchers, one of which
included David Rumelhart, a former member of Stanford's psychology department, came up
with similar ideas, now called back-propagation networks because they distribute
pattern-recognition errors throughout the network. Hybrid networks used just two layers;
these back-propagation networks use many. The result is that back-propagation networks are
"slow learners," possibly needing thousands of iterations to learn.
In 1989 Yann LeCun uses backpropagation to train a convolutional neural network to
recognize handwritten digits. This is a breakthrough moment, as it lays the foundation of
modern computer vision using deep learning. In the same year, George Cybenko publishes the
earliest version of the Universal Approximation Theorem in his paper “Approximation by
superpositions of a sigmoidal function“. He proves that a feed-forward neural network with a
single hidden layer containing a finite number of neurons can approximate any continuous
function. This further adds credibility to deep learning.
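Cybenko's result can be written informally as follows (a paraphrase, not the paper's exact statement): for any continuous target function on the unit cube and any tolerance, a finite sum of sigmoidal units comes uniformly within that tolerance.

```latex
% Informal statement of Cybenko (1989): for any continuous f on [0,1]^n and any
% \varepsilon > 0, there exist N, \alpha_i, w_i, b_i such that the one-hidden-layer network
G(x) \;=\; \sum_{i=1}^{N} \alpha_i \,\sigma\!\left(w_i^{\top} x + b_i\right)
\qquad \text{satisfies} \qquad
\sup_{x \in [0,1]^n} \bigl| G(x) - f(x) \bigr| \;<\; \varepsilon .
```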
In 1997 Sepp Hochreiter and Jürgen Schmidhuber publish a milestone paper on “Long
Short-Term Memory” (LSTM). It is a type of recurrent neural network architecture which
will go on to revolutionize deep learning in the decades to come.
Another milestone comes in 2006, when Geoffrey Hinton, Simon Osindero and Yee-Whye
Teh publish the paper “A fast learning algorithm for deep belief nets”, in which they stack
multiple restricted Boltzmann machines (RBMs) in layers and call the result a Deep Belief
Network. The training process is much more efficient for large amounts of data.
In 2008 Andrew Ng’s group at Stanford starts advocating the use of GPUs for training
deep neural networks to speed up training time many times over. This brings practicality to
the field of deep learning for training efficiently on huge volumes of data.
Finding enough labeled data has always been a challenge for the deep learning community. In
2009 Fei-Fei Li, a professor at Stanford, launches ImageNet, a database of 14 million
labeled images. It serves as a benchmark for the deep learning researchers who participate in
the ImageNet competition (ILSVRC) every year.
In 2011 Yoshua Bengio, Antoine Bordes and Xavier Glorot, in their paper “Deep Sparse
Rectifier Neural Networks”, show that the ReLU activation function can avoid the vanishing
gradient problem. This means that, apart from GPUs, the deep learning community now has
another tool to avoid the long and impractical training times of deep neural networks.
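A small, made-up numerical illustration of the point: the sigmoid's derivative is at most 0.25 and shrinks toward zero for large pre-activations, so a gradient multiplied through many layers vanishes, whereas ReLU's derivative is exactly 1 on the active path.

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
d_sigmoid = lambda z: sigmoid(z) * (1.0 - sigmoid(z))   # at most 0.25, shrinks for large |z|
d_relu = lambda z: (z > 0).astype(float)                # exactly 1 for any positive input

z = np.array([-6.0, -2.0, 0.0, 2.0, 6.0])               # hypothetical pre-activation values
print(d_sigmoid(z))   # roughly [0.0025, 0.105, 0.25, 0.105, 0.0025]
print(d_relu(z))      # [0, 0, 0, 1, 1]

# Through 20 stacked layers, the gradient is scaled by a product of such factors:
print(0.25 ** 20)     # sigmoid's best case: ~9.1e-13 (vanishes)
print(1.0 ** 20)      # ReLU on the active path: 1.0 (gradient preserved)
```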
In 2012 AlexNet, a GPU-implemented CNN model designed by Alex Krizhevsky, wins
ImageNet’s image classification contest with an accuracy of 84%. It is a huge jump over the
75% accuracy that earlier models had achieved. This win triggers a new deep learning boom
globally.
In 2014 the Generative Adversarial Network, also known as the GAN, is created by Ian
Goodfellow. GANs open a whole new door for applications of deep learning in fashion, art
and science due to their ability to synthesize data that looks real.
In 2016 DeepMind’s deep reinforcement learning model beats the human champion in the
complex game of Go. The game is much more complex than chess, so this feat captures the
imagination of everyone and takes the promise of deep learning to a whole new level.
In 2019 Yoshua Bengio, Geoffrey Hinton, and Yann LeCun win the 2018 Turing Award for
their immense contributions to advancements in deep learning and artificial intelligence. This
is a defining moment for those who had worked relentlessly on neural networks when the
entire machine learning community had moved away from them in the 1970s.
Now, neural networks are used in several applications, some of which we will describe later
in our presentation. The fundamental idea behind neural networks is that if it works in nature,
it must be able to work in computers. The future of neural networks, though, lies in the
development of hardware. Much like advanced chess-playing machines such as Deep Blue,
fast, efficient neural networks depend on hardware designed for their eventual use.
