True Artificial Intelligence Will Change Everything


We are delighted to republish the slides presented by Professor Jürgen Schmidhuber at the GTC Conference in Amsterdam.

The presentation covers the evolution of deep learning, from the early work of A. G. Ivakhnenko to the most recent achievements in curiosity-driven skill acquisition.
The slides contain a wealth of useful information, including references and links.

Published in: Technology

True Artificial Intelligence Will Change Everything

  1. Jürgen Schmidhuber, The Swiss AI Lab IDSIA, Univ. Lugano & SUPSI, http://www.idsia.ch/~juergen. True Artificial Intelligence Will Change Everything. NNAISENSE
  2. Jürgen Schmidhuber (pronounced "You_again Shmidhoobuh")
  3. According to Nature's 1999 Millennium Issue: Fritz Haber, Nobel 1918; Carl Bosch, ½ Nobel 1931. The Haber-Bosch process extracts nitrogen from thin air to make fertilizer; without it, 1 in 2 people (soon 2 in 3) wouldn't exist. Nature: it detonated the population explosion. Most influential invention of the 20th century?
  4. The AI explosion of the 21st century will be much bigger still.
  5. Konrad Zuse, 1941: first working general computer. Every 5 years, computing gets 10 times cheaper; 75 years means 15 tenfold steps, a factor of roughly 10^15.
  6. R-learn & improve the learning algorithm itself, and also the meta-learning algorithm, etc. My diploma thesis (1987): first concrete design of recursively self-improving AI. http://people.idsia.ch/~juergen/metalearner.html
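
To give a minimal flavor of learning the learner (a drastically simplified toy illustration, not the thesis's recursively self-improving design; all names and constants below are ours): a learner that adapts not only its weight but also its own learning rate, i.e., a parameter of the learning algorithm itself.

```python
# Toy flavor of meta-learning: the learner tunes a parameter of its own
# learning algorithm (the learning rate) while learning the base task.
w, lr = 0.0, 0.01
target = 3.0
prev_loss = (w - target) ** 2
for _ in range(100):
    w -= lr * 2 * (w - target)              # base learning step
    loss = (w - target) ** 2
    lr *= 1.1 if loss < prev_loss else 0.5  # meta-step: adapt the learner
    prev_loss = loss
```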
  7. THE SWISS AI LAB IDSIA, USI & SUPSI. JÜRGEN SCHMIDHUBER, SINCE 1991. NNAISENSE
  8. Father of Deep Learning: A. G. Ivakhnenko, since 1965. Deep multilayer perceptrons with polynomial activation functions; incremental layer-wise training by regression analysis; learn the numbers of layers and of units per layer; prune superfluous units. 8 layers already back in 1971; still used in the 2000s.
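
A rough sketch of the layer-wise scheme described above, under simplifying assumptions (our own toy target, unit-selection rule, and fixed depth; the historical GMDH method has more elaborate selection and stopping criteria): each new layer's units are quadratic polynomials of pairs of earlier outputs, fitted by least squares, and only the best few units survive.

```python
import numpy as np

# GMDH-style incremental layer-wise training (simplified sketch).
rng = np.random.default_rng(8)
X = rng.normal(size=(200, 4))
y = X[:, 0] * X[:, 1] + np.sin(X[:, 2])          # toy regression target

def fit_unit(a, b, y):
    # one polynomial unit: features 1, a, b, a*b, a^2, b^2, fit by least squares
    F = np.column_stack([np.ones_like(a), a, b, a * b, a ** 2, b ** 2])
    coef, *_ = np.linalg.lstsq(F, y, rcond=None)
    pred = F @ coef
    return pred, float(((pred - y) ** 2).mean())

layer = [X[:, j] for j in range(X.shape[1])]
for depth in range(3):                           # grow three layers
    candidates = []
    for i in range(len(layer)):
        for j in range(i + 1, len(layer)):
            pred, err = fit_unit(layer[i], layer[j], y)
            candidates.append((err, pred))
    candidates.sort(key=lambda c: c[0])
    layer = [pred for _, pred in candidates[:4]] # prune: keep best 4 units
print(candidates[0][0])                          # best unit's training MSE
```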
  9. Supervised Backpropagation (BP). Continuous BP in Euler-Lagrange calculus + dynamic programming: Bryson 1961, Kelley 1960. BP through the chain rule only: Dreyfus 1962. "Modern BP" in sparse, discrete, NN-like nets: Linnainmaa 1970. Weight changes: Dreyfus 1973. Automatic differentiation: Speelpenning 1980. BP applied to NNs: Werbos 1982. Experiments & internal representations: Rumelhart et al. 1986. RNNs: e.g., Williams, Werbos, Robinson, 1980s...
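
Since the slide compresses the whole BP lineage into one line, here is a minimal sketch of the chain-rule computation itself (a generic two-layer example of our own, not any particular historical system):

```python
import numpy as np

# Minimal backprop via the chain rule for a two-layer net (illustration).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))           # 4 samples, 3 inputs
y = rng.normal(size=(4, 1))           # regression targets
W1, W2 = rng.normal(size=(3, 5)), rng.normal(size=(5, 1))

for step in range(100):
    # forward pass
    h = np.tanh(x @ W1)               # hidden layer
    y_hat = h @ W2                    # linear output
    loss = ((y_hat - y) ** 2).mean()
    # backward pass: apply the chain rule layer by layer
    d_yhat = 2 * (y_hat - y) / len(x)
    dW2 = h.T @ d_yhat
    d_h = d_yhat @ W2.T
    dW1 = x.T @ (d_h * (1 - h ** 2))  # tanh' = 1 - tanh^2
    W1 -= 0.1 * dW1
    W2 -= 0.1 * dW2
```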
  10. RNNaissance of the deepest NNs: RNNs are general computers. Learn program = weight matrix. http://www.idsia.ch/~juergen/rnn.html
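
A minimal sketch of the "program = weight matrix" view (names are illustrative): the same weights are applied at every time step, so the weight matrix determines how the hidden state, the machine's memory, evolves under the input stream.

```python
import numpy as np

# One recurrent step: the weight matrices are the learned "program".
def rnn_step(h, x, W_hh, W_xh):
    return np.tanh(h @ W_hh + x @ W_xh)

rng = np.random.default_rng(1)
W_hh, W_xh = rng.normal(size=(8, 8)) * 0.5, rng.normal(size=(2, 8))
h = np.zeros(8)
for x in rng.normal(size=(10, 2)):    # a length-10 input sequence
    h = rnn_step(h, x, W_hh, W_xh)    # hidden state = the machine's memory
```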
  11. Schmidhuber 1991: first very deep learner. Unsupervised pretraining for Hierarchical Temporal Memory: stack of RNNs -> history compression -> speed up supervised learning. Compare the feedforward NN case: AutoEncoder stacks (Ballard 1987) and Deep Belief NNs (Hinton et al. 2006). http://www.idsia.ch/~juergen/firstdeeplearner.html
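
A schematic of the 1991 history-compression principle, with a stand-in predictor (the real system uses a learned RNN predictor at each level): the lower level tries to predict the next input, and only unpredicted inputs are passed up, so the higher level sees a much shorter, compressed sequence.

```python
# Only "surprises" (failed predictions) reach the higher level.
def compress(sequence, predict):
    higher_level_inputs = []
    prev = None
    for sym in sequence:
        if predict(prev) != sym:          # surprise: prediction failed
            higher_level_inputs.append(sym)
        prev = sym
    return higher_level_inputs

# toy predictor: expects each symbol to repeat the previous one
seq = list("aaaabbbcaaa")
print(compress(seq, lambda prev: prev))   # -> ['a', 'b', 'c', 'a']
```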
  12. LSTM: with Hochreiter (1997), Gers (2000), Graves, Fernandez, Gomez, Bayer... 1997-2009. Since 2015 on your phone! Google, Microsoft, IBM, Apple all use LSTM now. http://www.idsia.ch/~juergen/rnn.html
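
A minimal forward pass of a standard LSTM cell with forget gates, in the spirit of Hochreiter & Schmidhuber (1997) and Gers et al. (2000); the fused weight layout and omitted biases are our simplifications.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One LSTM step: gated cell state c is the long short-term memory.
def lstm_step(x, h, c, W):
    z = np.concatenate([x, h]) @ W          # one fused projection
    i, f, o, g = np.split(z, 4)
    i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
    c = f * c + i * g                       # forget gate f protects memory
    h = o * np.tanh(c)                      # gated output
    return h, c

rng = np.random.default_rng(2)
n_in, n_hid = 3, 5
W = rng.normal(size=(n_in + n_hid, 4 * n_hid)) * 0.1
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(20, n_in)):       # run over a sequence
    h, c = lstm_step(x, h, c, W)
```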
  13. Examples of recent benchmark records with LSTM RNNs / CTC, often at major IT companies:
      1. Large vocabulary speech recognition (Sak et al., Google, Interspeech 2014)
      2. English-to-French translation (Sutskever et al., Google, NIPS 2014)
      3. CTC RNNs break Switchboard record (Hannun et al., Baidu, 2014)
      4. Text-to-speech synthesis (Fan et al., Microsoft, 2014; Zen et al., Google, 2015)
      5. Prosody contour prediction (Fernandez et al., IBM, Interspeech 2014)
      6. Google Voice improved by 49% (Sak et al., 2015, now for billions of users)
      7. Syntactic parsing for NLP (Vinyals et al., Google, 2014-15)
      8. Photo-real talking heads (Soong and Wang, Microsoft, ICASSP 2015)
      9. QuickType for iOS (Apple, 2016)
      10. Image caption generation (Vinyals et al., Google, 2014)
      11. Keyword spotting (Chen et al., Google, ICASSP 2015)
      12. Video-to-text description (Donahue et al., 2014; Li Yao et al., 2015)
      http://www.idsia.ch/~juergen/rnn.html
  14. World's most valuable public companies are massively using LSTM for best speech recognition, machine translation, image captions, chatbots, etc. 2015-2016: our LSTM / CTC is now available to billions of users on smartphones, PCs, ...
  15. Our deep GPU-based max-pooling CNN (IJCAI 2011), e.g., http://www.idsia.ch/~juergen/deeplearning.html. Alternating convolutional and subsampling layers (CNN): Fukushima 1979. Backprop for shift-invariant 1D CNN or TDNN: Waibel et al. 1987; for 2D CNN: LeCun et al. 1989. Max-pooling (MP): Weng 1993. Backprop for MPCNNs: Ranzato et al. 2007, Scherer et al. 2010. GPU-MPCNN: Ciresan et al., IDSIA, 2011.
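
The two alternating operations the slide names, in minimal form (a single "valid" 2D convolution plus 2x2 max-pooling; real GPU-MPCNNs stack many such layers with learned filter banks):

```python
import numpy as np

def conv2d(img, kernel):
    # naive "valid" 2D convolution (cross-correlation, as in most CNNs)
    kh, kw = kernel.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (img[i:i+kh, j:j+kw] * kernel).sum()
    return out

def maxpool2x2(fmap):
    # keep the strongest response in each 2x2 block
    h, w = fmap.shape[0] // 2 * 2, fmap.shape[1] // 2 * 2
    return fmap[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

rng = np.random.default_rng(3)
img = rng.normal(size=(28, 28))
fmap = np.maximum(conv2d(img, rng.normal(size=(5, 5))), 0)  # conv + ReLU
pooled = maxpool2x2(fmap)            # 24x24 feature map -> 12x12
```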
  16. Traffic Sign Contest, Silicon Valley, 2011: our GPU-MPCNN came FIRST, twice as good as humans, 3 times better than the closest artificial competitor, 6 times better than the best non-neural method.
  17. Robot cars: Ernst Dickmanns, the robot car pioneer, Munich, 1980s. 1995: Munich to Denmark and back on public Autobahns, up to 180 km/h, no GPS, passing other cars. 2014: 20-year anniversary of self-driving cars in highway traffic.
  18. Our Deep Learner won the ISBI 2012 Brain Image Segmentation Contest: first feedforward Deep Learner to win an image segmentation competition (but compare deep recurrent LSTM, 2009: segmentation & classification).
  19. Our Deep Learner won the ICPR 2012 Contest on Mitosis Detection: first pure Deep Learner to win a contest on object detection (in large images). Very fast MPCNN scans: Masci, Giusti, Ciresan, Gambardella, Schmidhuber, ICIP 2013.
  20. Thanks to Dan Ciresan & Alessandro Giusti. http://www.idsia.ch/~juergen/deeplearningwinsMICCAIgrandchallenge.html
  21. Image caption generation with LSTM RNNs translating internal representations of CNNs (Vinyals, Toshev, Bengio, Erhan, Google, 2014).
  22. Highway Networks: "feedforward LSTM" (each layer computes a function of x plus x) with forget gates, to train NNs with hundreds of layers: Srivastava, Greff, Schmidhuber, May 2015, also NIPS 2015. The very similar later ResNet = feedforward LSTM (each layer a function of x plus x) without gates; won ImageNet (Microsoft, 150 layers).
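
The highway layer itself is compact enough to write down. A sketch following the published formulation y = H(x)·T(x) + x·(1 - T(x)) with a sigmoid transform gate T; the weight shapes and the toy depth-100 loop are ours, while the negative gate bias (so early layers mostly carry x through) follows the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One highway layer: a gated mix of transformed and untransformed input.
# Dropping the gate entirely gives y = H(x) + x, the ResNet form above.
def highway_layer(x, W_H, W_T, b_T):
    H = np.tanh(x @ W_H)                 # candidate transformation
    T = sigmoid(x @ W_T + b_T)           # transform gate (forget-gate idea)
    return H * T + x * (1.0 - T)

rng = np.random.default_rng(4)
d = 16
x = rng.normal(size=d)
for _ in range(100):                     # an untrained 100-layer stack:
    x = highway_layer(x, rng.normal(size=(d, d)) * 0.1,
                      rng.normal(size=(d, d)) * 0.1,
                      np.full(d, -2.0))  # negative bias: mostly carry x,
                                         # so the signal survives depth
```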
  23. LSTM concepts keep invading CNN territory. http://people.idsia.ch/~juergen/microsoft-wins-imagenet-through-feedforward-LSTM-without-gates.html
  24. Best segmentation with PyraMiD-LSTM (NIPS 2015): Stollenga, Byeon, Liwicki, Schmidhuber.
  25. Brainstorm: open-source neural networks library by my PhD students K. Greff and R. Srivastava. http://people.idsia.ch/~juergen/brainstorm.html
  26. LSTM learns knot-tying tasklets: Mayr, Gomez, Wierstra, Nagy, Knoll, Schmidhuber, IROS'06.
  27. Some of Our Deep Learning "Firsts":
      • First very deep learner (1991-1993) - tasks with >1000 computational stages
      • First neural learner of sequential attention (hard: 1991, soft: 1993)
      • First self-referential RNNs that run their own learning algorithm (1993)
      • First very deep supervised learner (LSTM, 1990s-2006 and beyond)
      • First recurrent NN to win international contests (2009)
      • First NN to win connected handwriting contests (2009)
      • First outperformance of humans in a computer vision contest (2011)
      • First deep NN to win a Chinese handwriting contest (2011)
      • European handwriting (MNIST): old error record almost halved (2011)
      • First deep NN to win an image segmentation contest (May 2012)
      • First deep NN to win an object detection contest (2012)
      • First deep NN to win a medical imaging contest (2012)
      • First feedforward NNs with >100 layers (Highway NNs, May 2015)
      • First RNN controller that reinforcement-learns from raw video (2013)
  28. Board games: non-human world champions. 1994: Backgammon, TD-Gammon (learns), IBM. 1997: Chess, Deep Blue (does not learn), IBM. 2016: Go, AlphaGo (learns), Google DeepMind.
  29. Google bought DeepMind for 600 M to do Machine Learning (ML) & AI. First DeepMinders with PhDs in ML & AI: my lab's ex-PhD students Legg (co-founder) & Wierstra (#4). Background of the other co-founders: neurobiology & video games (Hassabis), business (Suleyman). DeepMind hired 2 more PhD students of mine: Graves (e.g., CTC & NTM) & Schaul (RL, on our 2010 Atari-Go paper).
  30. Reinforcement learning in partially observable worlds: finds complex neural controllers with a million weights, from RAW VIDEO INPUT! Faustino Gomez, Jan Koutnik, Giuseppe Cuccu, J. Schmidhuber, GECCO 2013.
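
The GECCO 2013 method (compressed network search) is too involved for a slide note; here is only a generic evolutionary hill-climbing sketch of the basic idea of searching weight space by fitness alone. The fitness function and all constants are toy stand-ins of ours.

```python
import numpy as np

# Generic (1+lambda)-style evolutionary search over a weight vector.
def evolve(fitness, n_weights, n_gens=100, pop=20, sigma=0.1, seed=5):
    rng = np.random.default_rng(seed)
    best = rng.normal(size=n_weights)
    best_fit = fitness(best)
    for _ in range(n_gens):
        for _ in range(pop):                     # mutate, keep improvements
            cand = best + sigma * rng.normal(size=n_weights)
            f = fitness(cand)
            if f > best_fit:
                best, best_fit = cand, f
    return best, best_fit

# toy fitness: prefer weights close to an arbitrary target vector
target = np.linspace(-1, 1, 50)
w, f = evolve(lambda w: -((w - target) ** 2).sum(), n_weights=50)
```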
  31. Learning and planning with recurrent networks. J.S.: IJCNN 1990, NIPS 1991: reinforcement learning with a recurrent controller & a recurrent world model.
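
A heavily simplified, feedforward stand-in for the controller/world-model idea (the 1990-91 systems used recurrent networks for both; the environment, model class, and planner below are toy choices of ours): learn a model of the dynamics from interaction, then choose actions by querying the model instead of the world.

```python
import numpy as np

rng = np.random.default_rng(6)

def env_step(s, a):                         # true dynamics, unknown to agent
    s2 = 0.9 * s + a
    return s2, -abs(s2)                     # reward: keep the state near 0

# 1) learn a linear world model s' ~ w1*s + w2*a from random interaction
S, A, S2 = [], [], []
s = 1.0
for _ in range(200):
    a = rng.uniform(-1, 1)
    s2, _ = env_step(s, a)
    S.append(s); A.append(a); S2.append(s2)
    s = s2
X = np.column_stack([S, A])
w = np.linalg.lstsq(X, np.array(S2), rcond=None)[0]   # fitted model

# 2) plan with the model: pick the action whose predicted next state is
#    best, without touching the real environment at all
def plan(s, candidates=np.linspace(-1, 1, 21)):
    pred = w[0] * s + w[1] * candidates
    return candidates[np.argmin(np.abs(pred))]

a = plan(1.0)                               # model-based action choice
```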
  32. IJNS 1991: R-learning of visual attention, on computers 100,000 times slower than today's. http://people.idsia.ch/~juergen/attentive.html
  33. 1991: current goal = extra fixed input. 2015: all of this is coming back!
  34. RoboCup World Champion 2004, fastest league, 5 m/s. Alex @ IDSIA led FU Berlin's RoboCup World Champion Team 2004. Lookahead expectation & planning with neural networks (Schmidhuber, IEEE INNS 1990): successfully used for RoboCup by Alexander Gloye-Förster (who went to IDSIA). http://www.idsia.ch/~juergen/learningrobots.html
  35. RNNAIssance 2014-2015. On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning RNN-based Controllers (RNNAIs) and Recurrent Neural World Models. http://arxiv.org/abs/1511.09249
  36. Formal theory of fun & novelty & surprise & attention & creativity & curiosity & art & science & humor. Maximize future Fun(Data X, O(t)) ~ ∂CompResources(X, O(t))/∂t. E.g., Connection Science 18(2):173-187, 2006; IEEE Transactions AMD 2(3):230-247, 2010. http://www.idsia.ch/~juergen/creativity.html
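
A toy rendering of the formula's core quantity (our stand-ins throughout): intrinsic reward as the first derivative of compression progress, here approximated by how much one learning step reduces prediction error on the observed data stream.

```python
import numpy as np

# Curiosity reward = learning/compression progress on the data stream.
x = np.linspace(0, 20, 200)
data = np.sin(x)                        # a compressible (predictable) stream

w = 0.0                                 # toy predictor: x[t+1] ~ w * x[t]
def error(w):
    return float(((data[1:] - w * data[:-1]) ** 2).mean())

for step in range(50):
    before = error(w)
    grad = -2 * ((data[1:] - w * data[:-1]) * data[:-1]).mean()
    w -= 0.1 * grad                     # one learning step
    intrinsic_reward = before - error(w)   # compression progress = "fun"
```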
  37. Continual curiosity-driven skill acquisition from high-dimensional video inputs for humanoid robots. Kompella, Stollenga, Luciw, Schmidhuber, Artificial Intelligence, 2015. https://www.youtube.com/watch?v=OTqdXbTEZpE
  38. PowerPlay not only solves but also continually invents problems at the borderline between what's known and unknown - training an increasingly general problem solver by continually searching for the simplest still unsolvable problem.
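
A schematic of that loop under toy assumptions of ours (tasks are integers, a "solver" is just the set of tasks it can solve, candidates are proposed simplest-first); the real framework searches (task, solver-modification) pairs ordered by simplicity and formally verifies the whole repertoire.

```python
# PowerPlay-style loop: accept the simplest new task the current solver
# cannot solve, provided the modified solver solves it AND everything
# learned before (no forgetting).
def powerplay(solver, propose_pairs, solves, n_rounds):
    solved = []                                      # growing repertoire
    for _ in range(n_rounds):
        for task, new_solver in propose_pairs(solver):   # simplest first
            if (not solves(solver, task)
                    and solves(new_solver, task)
                    and all(solves(new_solver, t) for t in solved)):
                solver = new_solver                  # accept modification
                solved.append(task)
                break
    return solver, solved

def propose_pairs(solver):              # toy candidate generator
    for task in range(100):
        yield task, solver | {task}

def solves(solver, task):
    return task in solver

solver, solved = powerplay(set(), propose_pairs, solves, n_rounds=5)
print(solved)   # -> [0, 1, 2, 3, 4]: always the simplest unsolved task
```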
  39. Jürgen Schmidhuber, The Swiss AI Lab IDSIA, Univ. Lugano & SUPSI, http://www.idsia.ch/~juergen. True Artificial Intelligence Will Change Everything. NNAISENSE
  40. Next: build small animal-like AI that learns to think and plan hierarchically like a crow or a capuchin monkey. Evolution needed billions of years for this, then only a few more millions for humans.
  41. Neural networks-based artificial intelligence; now talking to investors.
  42. Finally: my beautiful, simple pattern (discovered in 2014) of exponential acceleration of the most important events in the history of the universe, from a human perspective. http://www.reddit.com/r/MachineLearning/comments/2xcyrl/i_am_j%C3%BCrgen_schmidhuber_ama/cozd9ju
  43. Ω = 2050 or so.
      Ω - 13.8 B years: Big Bang
      Ω - 3.5 B years: first life on Earth
      Ω - 0.9 B years: first animal-like mobile life
      Ω - 220 M years: first mammals (your ancestors)
      Ω - 55 M years: first primates (your ancestors)
      Ω - 13 M years: first hominids (your ancestors)
      Ω - 3.5 M years: first stone tools (dawn of technology)
      Ω - 850 K years: controlled fire (next big tech breakthrough)
      Ω - 210 K years: first anatomically modern man (your ancestors)
      Ω - 50 K years: behaviorally modern man colonizing earth
      Ω - 13 K years: neolithic revolution, agriculture, domestic animals
      Ω - 3.3 K years: iron age, 1st population explosion starts
      Ω - 800 years: first guns & rockets (in China)
      Ω - 200 years: industrial revolution, 2nd population explosion starts
      Ω - 50 years: digital nervous system, WWW, cell phones for all
      Ω - 12 years: cheap small computers with 1 brain power?
      Ω - 3 years: ??  Ω - 9 months: ????  Ω - 2 months: ????????  Ω - 2 weeks: ???????????????? ...
      (Each interval to Ω is roughly 1/4 of the previous one; the slide overlays "Ω - 1/4 of this time:" next to every step.)
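
A quick numeric check of the claimed pattern, using the intervals listed above (Ω = 2050 is the slide's own assumption): each gap back from Ω is roughly a quarter of the previous one.

```python
# Ratios of consecutive "years before Omega" from the slide's timeline.
years_before_omega = [13.8e9, 3.5e9, 0.9e9, 220e6, 55e6, 13e6, 3.5e6,
                      850e3, 210e3, 50e3, 13e3, 3.3e3, 800, 200, 50, 12]
ratios = [a / b for a, b in zip(years_before_omega, years_before_omega[1:])]
print([round(r, 1) for r in ratios])   # every ratio is close to 4
```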
  44. MANY OF MY WEB PAGES AND TALK SLIDES ARE GRAPHICALLY STRUCTURED THROUGH FIBONACCI NUMBER RATIOS RELATED TO THE "GOLDEN" OR "DIVINE" PROPORTION. MANY ARTISTS CLAIM THE HUMAN EYE PREFERS THIS RATIO OVER OTHERS. THE RATIOS OF SUBSEQUENT FIBONACCI NUMBERS 1/1, 1/2, 2/3, 3/5, 5/8, 8/13, 13/21, 21/34 ... CONVERGE TO THE HARMONIC PROPORTION (1/2)(SQUARE ROOT OF 5 - 1) = 0.618034..., DIVIDING THE UNIT INTERVAL INTO SEGMENTS OF LENGTHS A AND B SUCH THAT A/B = B/(A+B) = B. BTW, MOST OF MY SLIDES ARE DESIGNED THROUGH RECURSIVE "GOLDEN" HARMONIC PROPORTIONS.
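
The convergence claimed above is easy to verify numerically:

```python
# Ratios of successive Fibonacci numbers approach (sqrt(5) - 1) / 2.
a, b = 1, 1
for _ in range(30):
    a, b = b, a + b
print(a / b)                  # 0.6180339887...
print((5 ** 0.5 - 1) / 2)     # the same harmonic proportion
```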
