
Philosophy of Deep Learning



Deep learning is not merely an AI technique or a software program, but a new class of smart network information technology that is changing the concept of the modern technology project by offering real-time engagement with reality
Deep learning is a data automation method that replaces hard-coded software with a capacity, in the form of a learning network that is trained to perform a task


  1. 1. Melanie Swan Philosophy Department, Purdue University melanie@BlockchainStudies.org Deep Learning Explained The future of Smart Networks Waterfront Conference Center Indianapolis IN, January 26, 2019 Slides: http://slideshare.net/LaBlogga Image credit: NVIDIA
  2. 2. 26 Jan 2019 Deep Learning 1 Melanie Swan, Technology Theorist  Philosophy Department, Purdue University, Indiana, USA  Founder, Institute for Blockchain Studies  Singularity University Instructor; Institute for Ethics and Emerging Technology Affiliate Scholar; EDGE Essayist; FQXi Advisor Traditional Markets Background Economics and Financial Theory Leadership New Economies research group Source: http://www.melanieswan.com, http://blockchainstudies.org/NSNE.pdf, http://blockchainstudies.org/Metaphilosophy_CFP.pdf https://www.facebook.com/groups/NewEconomies
  3. 3. 26 Jan 2019 Deep Learning Technophysics Research Program: Application of physics principles to technology 2 Econophysics Biophysics • Disease causality: role of cellular dysfunction and environmental degradation • Concentration limits in short and long range inter-cellular signaling • Boltzmann distribution and diffusion limits in RNAi and SiRNA delivery • Path integrals extend point calculations in dynamical systems • General (not only specialized) Schrödinger for Black Scholes option pricing • Quantum game theory (greater than fixed sum options), Quantum finance Smart Networks (intelligent self-operating networks) Technologies Tools • Smart network field theory • Optimal control theory • Blockchain • Deep Learning • UAV, HFT, RTB, IoT • Satellite, nanorobot Scientific Paradigms: Mechanics (16-17c), Steam (18-19c), Light and Electromagnetics (20c), Information (21c) Computational Complexity, Black Holes, and Quantum Gravity (Aaronson, Susskind, Zenil) General Topics Quantum Computation • Apply renormalization group to system criticality and phase transition detection (Aygun, Goldenfeld) and extend tensor network renormalization (Evenbly, Vidal) • Unifying principles: same probability functions used for spin glasses (statistical physics), error-correcting (LDPC) codes (information theory), and randomized algorithms (computer science) (Mézard) • Define relationships between statistical physics and information theory: generalized temperature and Fisher information, partition functions and free energy, and Gibbs’ inequality and entropy (Merhav) • Apply complexity theory to blockchain and deep learning (dos Santos) • Apply spin glass models to blockchain and deep learning (LeCun, Auffinger, Stein) • Apply deep learning to particle physics (Radovic) Research Topics Data Science Method: Science Modules Technophysics: The application of physics principles to the study of technology (particularly statistical physics and information theory for the control of complex networks)
  4. 4. 26 Jan 2019 Deep Learning Deep Learning Smart Network Thesis 3 Deep learning is a smart network: global computational infrastructure that operates autonomously Source: Swan, M., and dos Santos, R.P. In prep. Smart Network Field Theory: The Technophysics of Blockchain and Deep Learning. https://www.researchgate.net/publication/328051668_Smart_Network_Field_Theory_The_Technophysics_of_Blockchain_and_Deep_Learning Other smart networks: UAVs, blockchain economic networks, satellites, smart city IoT landscapes, real-time bidding markets for advertising, and high-frequency trading platforms
  5. 5. 26 Jan 2019 Deep Learning Identity crisis? 4 Source: http://www.robotandhwang.com/attorneys  Redefining human identity in the context of the machine age  Human and machines in partnership  Computers excel at?  Humans excel at?
  6. 6. 26 Jan 2019 Deep Learning Agenda  Deep Learning  Definition  Technical details  Applications  Deep Qualia: Deep Learning and the Brain  Smart Network Convergence Theory  Conclusion 5 Image Source: http://www.opennn.net
  7. 7. 26 Jan 2019 Deep Learning Why is Deep Learning important?  IDC estimates that worldwide spending on cognitive and artificial intelligence systems will reach $77.6 billion in 2022  Gartner projects that the global business value derived from artificial intelligence will be $1.2 trillion in 2018 and $3.9 trillion in 2022  Data science and machine learning are among LinkedIn’s fastest-growing jobs 6 Sources: Columbus L. LinkedIn's Fastest-Growing Jobs Today are in Data Science and Machine Learning. Forbes. 2017. IDC. Worldwide Spending on Cognitive and Artificial Intelligence Systems Forecast to Reach $77.6 Billion in 2022. 2018; Gartner. Gartner Says Global Artificial Intelligence Business Value to Reach $1.2 Trillion in 2018.
  8. 8. 26 Jan 2019 Deep Learning What is Artificial Intelligence?  Artificial intelligence (AI) is using computers to do cognitive work (physical or mental) that usually requires a human 7 Source: Swan, M. (Submitted). Philosophy of Deep Learning Networks: Reality Automation Modules. Ke Jie vs. AlphaGo AI Go player, Future of Go Summit, Wuzhen China, May 2017
  9. 9. 26 Jan 2019 Deep Learning How are AI and Deep Learning related? 8 Source: Machine Learning Guide, 9. Deep Learning  Broader context of Computer Science  Within the Computer Science discipline, in the field of Artificial Intelligence, Deep Learning is a class of Machine Learning algorithms, that are in the form of a Neural Network Deep Learning Neural Nets Machine Learning Artificial Intelligence Computer Science
  10. 10. 26 Jan 2019 Deep Learning Deep Learning vocabulary What do these terms mean?  Deep Learning, Machine Learning, Artificial Intelligence  Perceptron, Artificial Neuron, Logit  Deep Belief Net, Artificial Neural Net, Boltzmann Machine  Google DeepDream, Google Brain, Google DeepMind  Supervised and Unsupervised Learning  Convolutional Neural Nets  Recurrent NN & LSTM (Long Short Term Memory)  Activation Function ReLU (Rectified Linear Unit)  Deep Learning libraries and frameworks  TensorFlow, Caffe, Theano, Torch, DL4J  Backpropagation, gradient descent, loss function 9
  11. 11. 26 Jan 2019 Deep Learning 10 Conceptual Definition: Deep learning is a computer program that can identify what something is Technical Definition: Deep learning is a class of machine learning algorithms in the form of a neural network that uses a cascade of layers of processing units to model high-level abstractions in data and extract features from data sets in order to make predictive guesses about new data Source: Extending Yann LeCun, http://spectrum.ieee.org/automaton/robotics/artificial-intelligence/facebook-ai-director-yann-lecun-on-deep-learning
  12. 12. 26 Jan 2019 Deep Learning Deep Learning Theory  System is “dumb” (i.e. mechanistic)  “Learns” by having big data (lots of input examples), and making trial-and-error guesses to adjust weights to find key features  Creates a predictive system to identify new examples  Usual AI argument: big enough data is what makes a difference (“simple” algorithms run over large data sets) 11 Input: Big Data (e.g., many examples) Method: Trial-and-error guesses to adjust node weights Output: system identifies new examples
  13. 13. 26 Jan 2019 Deep Learning Sample task: is that a Car?  Create an image recognition system that determines which features are relevant (at increasingly higher levels of abstraction) and correctly identifies new examples 12 Source: Yann LeCun, http://www.pamitc.org/cvpr15/files/lecun-20150610-cvpr-keynote.pdf
  14. 14. 26 Jan 2019 Deep Learning Statistical Mechanics Deep Learning is inspired by Physics 13  Sigmoid function suggested as a model for neurons, per statistical mechanical behavior (Cowan, 1972)  Stationary solutions for dynamic models (asymmetric weights create an oscillator to model neuron signaling)  Hopfield Neural Network: content-addressable memory system with binary threshold nodes, converges to a local minimum (Hopfield, 1982)  Can use an Ising model (of ferromagnetism) for neurons  Restricted Boltzmann Machine (Hinton, 1983)  Studied in theoretical physics, condensed matter field theory; Statistical Mechanics concepts: Renormalization, Boltzmann Distribution, Free Energies, Gibbs Sampling; stochastic processing units with binary output Source: https://www.quora.com/Is-deep-learning-related-to-statistical-physics-particularly-network-science
  15. 15. 26 Jan 2019 Deep Learning What is a Neural Net? 14  Motivation: create an Artificial Neural Network to solve problems in the same way as the human brain
  16. 16. 26 Jan 2019 Deep Learning What is a Neural Net? 15  Structure: input-processing-output  Mimic neuronal signal firing structure of brain with computational processing units Source: https://www.slideshare.net/ThomasDaSilvaPaula/an-introduction-to-machine-learning-and-a-little-bit-of-deep-learning, http://cs231n.github.io/convolutional-networks/
  17. 17. 26 Jan 2019 Deep Learning Why is it called Deep Learning?  Deep: Hidden layers (cascading tiers) of processing  “Deep” networks (3+ layers) versus “shallow” (1-2 layers)  Learning: Algorithms “learn” from data by modeling features and updating probability weights assigned to feature nodes in testing how relevant specific features are in determining the general type of item 16 Deep: Hidden processing layers Learning: Updating probability weights re: feature importance
  18. 18. 26 Jan 2019 Deep Learning Supervised and Unsupervised Learning  Supervised (classify labeled data)  Unsupervised (find patterns in unlabeled data) 17 Source: https://www.slideshare.net/ThomasDaSilvaPaula/an-introduction-to-machine-learning-and-a-little-bit-of-deep-learning
  19. 19. 26 Jan 2019 Deep Learning Early success in Supervised Learning (2011)  YouTube: user-classified data perfect for Supervised Learning 18 Source: Google Brain: Le, QV, Dean, Jeff, Ng, Andrew, et al. 2012. Building high-level features using large scale unsupervised learning. https://arxiv.org/abs/1112.6209
  20. 20. 26 Jan 2019 Deep Learning 2 main kinds of Deep Learning neural nets 19 Source: Yann LeCun, CVPR 2015 keynote (Computer Vision ), "What's wrong with Deep Learning" http://t.co/nPFlPZzMEJ  Convolutional Neural Nets  Image recognition  Convolve: roll up to higher levels of abstraction in feature sets  Recurrent Neural Nets  Speech, text, audio recognition  Recur: iterate over sequential inputs with a memory function  LSTM (Long Short-Term Memory) remembers sequences and avoids gradient vanishing
  21. 21. 26 Jan 2019 Deep Learning Image Recognition and Computer Vision 20 Source: Quoc Le, https://arxiv.org/abs/1112.6209; Yann LeCun, NIPS 2016, https://drive.google.com/file/d/0BxKBnD5y2M8NREZod0tVdW5FLTQ/view Marvin Minsky, 1966 “summer project” Jeff Hawkins, 2004, Hierarchical Temporal Memory (HTM) Quoc Le, 2011, Google Brain cat recognition Convolutional net for autonomous driving, http://cs231n.github.io/convolutional-networks History Current state of the art - 2017
  22. 22. 26 Jan 2019 Deep Learning Progression in AI Deep Learning machines 21 Single-purpose AI: Hard-coded rules Multi-purpose AI: Algorithm detects rules, reusable template Question-answering AI: Natural-language processing Hard-coded AI machine (Deep Blue, 1997), Deep Learning prototype (Watson, 2011), Deep Learning machine (AlphaGo, 2016)
  23. 23. 26 Jan 2019 Deep Learning Why do we need Deep Learning? 22  Big data is not smart data or thick data (e.g. usable)  A data science method to keep up with the growth in data, older learning algorithms no longer performing Source: http://blog.algorithmia.com/introduction-to-deep-learning-2016
  24. 24. 26 Jan 2019 Deep Learning Agenda  Deep Learning  Definition  Technical details  Applications  Deep Qualia: Deep Learning and the Brain  Smart Network Convergence Theory  Conclusion 23 Image Source: http://www.opennn.net
  25. 25. 26 Jan 2019 Deep Learning 3 Key Technical Principles of Deep Learning 24 Sigmoid Function: squash values into a sigmoidal S-curve (binary values Y/N 0/1; probability values 0 to 1; Tanh values (-1) to 1); why: non-linear formulation as a logistic regression problem allows greater mathematical manipulation. Perceptron Structure: core processing unit (input-processing-output) with levers of weights and bias; why: the “dumb” system learns by adjusting parameters and checking against the outcome. Loss Function: reduce combinatoric dimensionality; why: the loss function optimizes the efficiency of the solution.
  26. 26. 26 Jan 2019 Deep Learning Linear Regression 25 House price vs. Size (square feet) y=mx+b House price Size (square feet) Source: https://www.statcrunch.com/5.0/viewreport.php?reportid=5647
  27. 27. 26 Jan 2019 Deep Learning Logistic Regression 26 Source: http://www.simafore.com/blog/bid/99443/Understand-3-critical-steps-in-developing-logistic-regression-models
  28. 28. 26 Jan 2019 Deep Learning Logistic Regression 27  Higher-order mathematical formulation  Sigmoid function  S-shaped and bounded  Maps the whole real axis into a finite interval (0-1)  Non-linear  Can fit probability  Can apply optimization techniques  Deep Learning classification predictions are in the form of a probability value Source: https://www.quora.com/Logistic-Regression-Why-sigmoid-function Sigmoid Function Unit Step Function
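For concreteness, here is a minimal numpy sketch of the sigmoid squashing behavior described above; the sample inputs are arbitrary illustrations, not slide content.

```python
import numpy as np

def sigmoid(z):
    """Squash any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# The whole real axis maps into (0, 1): large negative inputs approach 0,
# large positive inputs approach 1, and 0 maps to 0.5.
for z in [-10.0, -1.0, 0.0, 1.0, 10.0]:
    print(f"sigmoid({z:+.1f}) = {sigmoid(z):.4f}")

# tanh is the bounded (-1, 1) alternative mentioned on the slide.
print(np.tanh(1.0))
```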
  29. 29. 26 Jan 2019 Deep Learning Sigmoid function: Taleb 28 Source: Swan, M. (2019). Blockchain Theory of Programmable Risk: Black Swan Smart Contracts. In Blockchain Economics: Implications of Distributed Ledgers - Markets, communications networks, and algorithmic reality. London: World Scientific.  Thesis: mapping a phenomenon to an s-curve (“convexifying” it) means its risk may be controlled  Antifragility = convexity = risk-manageable  Fragility = concavity  Non-linear dose response in medicine suggests treatment optimality  U-shaped, j-shaped curves implicated in hormesis (biphasic response); Bell’s theorem
  30. 30. 26 Jan 2019 Deep Learning Regression  Logistic regression  Predict binary outcomes:  Perceptron (0 or 1)  Predict probabilities:  Sigmoid Neuron (values 0-1)  Tanh Hyperbolic Tangent Neuron (values (-1)-1) 29 Logistic Regression (Sigmoid function) (0-1) or Tanh ((-1)-1) Linear Regression  Linear regression  Predict continuous set of values (house prices)
  31. 31. 26 Jan 2019 Deep Learning Deep Learning Architecture 30 Source: Michael A. Nielsen, Neural Networks and Deep Learning Modular Processing Units
  32. 32. 26 Jan 2019 Deep Learning Modular Processing Units 31 Source: http://deeplearning.stanford.edu/tutorial 1. Input 2. Hidden layers 3. Output X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X  Unit: processing unit, logit (logistic regression unit), perceptron, artificial neuron
  33. 33. 26 Jan 2019 Deep Learning Example: Image recognition 1. Obtain training data set MNIST (60,000-item database) 2. Digitize pixels (convert images to numbers)  Divide image into 28x28 grid, assign a value (0-255) to each square based on brightness 3. Read into vector (array; list of numbers)  (28x28 = 784 elements per image) 32 Source: Quoc V. Le, A Tutorial on Deep Learning, Part 1: Nonlinear Classifiers and The Backpropagation Algorithm, 2015, Google Brain, https://cs.stanford.edu/~quocle/tutorial1.pdf
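A minimal sketch of the digitization step; the 28x28 brightness grid here is a random stand-in rather than an actual MNIST image.

```python
import numpy as np

# Stand-in for one MNIST digit: a 28x28 grid of brightness values 0-255.
# (A real pipeline would load these values from the MNIST files.)
image = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)

# Read the grid into a vector: 28 x 28 = 784 elements per image.
vector = image.reshape(784).astype(np.float32)

# Optionally scale brightness into the 0-1 range the sigmoid units expect.
vector /= 255.0

print(vector.shape)   # (784,)
```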
  34. 34. 26 Jan 2019 Deep Learning Deep Learning Architecture 4. Load spreadsheet of vectors into deep learning system  Each row of spreadsheet (784-element array) is an input 33 Source: http://deeplearning.stanford.edu/tutorial; MNIST dataset: http://yann.lecun.com/exdb/mnist 1. Input 2. Hidden layers 3. Output X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X X Vector data 784-element array Image #1 Image #2 Image #3
  35. 35. 26 Jan 2019 Deep Learning What happens in the Hidden Layers? 34 Source: Michael A. Nielsen, Neural Networks and Deep Learning  First layer learns primitive features (line, edge, tiniest unit of sound) by finding combinations of the input vector data that occur more frequently than by chance  A logistic regression is performed at each processing node (Y/N (0-1)), does this example have this feature?  System feeds basic features to next layer, which identifies slightly more complicated features (jaw line, corner, combination of speech sounds)  Features pushed to subsequent layers at higher levels of abstraction until full objects can be recognized
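A sketch of one feed-forward pass through cascading layers, where each unit performs the weighted-sum-plus-sigmoid step described above; the layer sizes and random weights are illustrative, not trained values.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

x = rng.random(784)   # one flattened 28x28 image (values 0-1)

# Layer sizes and random weights are placeholders, not trained values.
W1, b1 = rng.normal(0, 0.05, size=(128, 784)), np.zeros(128)
W2, b2 = rng.normal(0, 0.05, size=(64, 128)), np.zeros(64)
W3, b3 = rng.normal(0, 0.05, size=(10, 64)), np.zeros(10)

h1 = sigmoid(W1 @ x + b1)    # first layer: primitive features (edges, lines)
h2 = sigmoid(W2 @ h1 + b2)   # next layer: combinations of features
out = sigmoid(W3 @ h2 + b3)  # output: one score per digit class 0-9

print(out.round(3))
```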
  36. 36. 26 Jan 2019 Deep Learning Image Recognition Higher Abstractions of Feature Recognition 35 Source: Yann LeCun, http://www.pamitc.org/cvpr15/files/lecun-20150610-cvpr-keynote.pdf Edges Object Parts (combinations of edges) Object Models
  37. 37. 26 Jan 2019 Deep Learning Image Recognition Higher Abstractions of Feature Recognition 36 Source: https://adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html
  38. 38. 26 Jan 2019 Deep Learning Speech, Text, Audio Recognition Sequence-to-sequence Recognition + LSTM 37 Source: Andrew Ng  LSTM: Long Short Term Memory  Technophysics technique: each subsequent layer remembers data for twice as long (fractal-type model)  The “grocery store” not the “grocery church”
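A hedged sketch of a sequence model of this kind, assuming TensorFlow/Keras is available; the vocabulary size, sequence length, and layer dimensions are placeholders. After training on text, such a model would assign a higher probability to "store" than to "church" following "grocery".

```python
import tensorflow as tf

# Illustrative sizes; a real text model would set these from its corpus.
vocab_size, embed_dim, seq_len = 10_000, 64, 20

model = tf.keras.Sequential([
    tf.keras.Input(shape=(seq_len,)),                # sequence of word indices
    tf.keras.layers.Embedding(vocab_size, embed_dim),
    tf.keras.layers.LSTM(128),                        # remembers sequence context
    tf.keras.layers.Dense(vocab_size, activation="softmax"),  # next-word guess
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```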
  39. 39. 26 Jan 2019 Deep Learning Example: NVIDIA Facial Recognition 38 Source: NVIDIA  First hidden layer extracts all possible low-level features from data (lines, edges, contours); next layers abstract into more complex features of possible relevance
  40. 40. 26 Jan 2019 Deep Learning Deep Learning 39 Source: Quoc V. Le et al, Building high-level features using large scale unsupervised learning, 2011, https://arxiv.org/abs/1112.6209
  41. 41. 26 Jan 2019 Deep Learning Deep Learning Architecture 40 Source: Michael A. Nielsen, Neural Networks and Deep Learning 1. Input 2. Hidden layers 3. Output (0,1)
  42. 42. 26 Jan 2019 Deep Learning Mathematical methods update weights 41 1. Input 2. Hidden layers 3. Output X X X X X X X X X X X X X X X Source: http://deeplearning.stanford.edu/tutorial; MNIST dataset: http://yann.lecun.com/exdb/mnist  Linear algebra: matrix multiplications of input vectors  Statistics: logistic regression units (Y/N (0,1)), probability weighting and updating, inference for outcome prediction  Calculus: optimization (minimization), gradient descent in back-propagation to avoid local minima with saddle points Feed-forward pass (0,1) 1.5 Backward pass to update probabilities per correct guess .5.5 .5.5.5 1 10 .75 .25 Inference Guess Actual
  43. 43. 26 Jan 2019 Deep Learning More complicated in actual use  Convolutional neural net scale-up for number recognition  Example data: MNIST dataset  http://yann.lecun.com/exdb/mnist 42 Source: http://www.kdnuggets.com/2016/04/deep-learning-vs-svm-random-forest.html
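A sketch of a small convolutional net for 28x28 grayscale digit images in Keras, assuming TensorFlow is installed; the layer counts and sizes are illustrative choices, not the specific architecture referenced above.

```python
import tensorflow as tf

# A small convolutional net for MNIST-style 28x28 grayscale digit images.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Conv2D(64, kernel_size=3, activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),   # ten digit classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# Training would then call model.fit(...) on the MNIST image and label arrays.
```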
  44. 44. 26 Jan 2019 Deep Learning Node Structure: Computation Graph 43 Edge (input value) Architecture Node (operation) Edge (input value) Edge (output value) Example 1 3 4 Add ?? Example 2 3 4 Multiply ??
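The two example nodes as a small Python sketch, with the "??" outputs computed: each node takes the values on its input edges, applies one operation, and emits the result on its output edge.

```python
def add_node(x, y):
    """Computation-graph node whose operation is addition."""
    return x + y

def multiply_node(x, y):
    """Computation-graph node whose operation is multiplication."""
    return x * y

print(add_node(3, 4))       # Example 1: 3 + 4 = 7
print(multiply_node(3, 4))  # Example 2: 3 * 4 = 12
```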
  45. 45. 26 Jan 2019 Deep Learning Basic node with Weights and Bias 44 Edge Input value = 4 Edge Input value = 16 Edge Output value = 20 Node Operation = Add Input Values have Weights w Nodes have a Bias bw1* x1 w2*x2 N+b .25*4=1 .75*16=12 13+2 15 Input Processing Output Variable Weights and Biases  Basic node structure is fixed: input-processing-output  Weight and bias are variable parameters that are adjusted as the system iterates and “learns” Source: http://neuralnetworksanddeeplearning.com/chap1.html Mimics NAND gate Basic Node Structure (fixed) Basic Node with Weights and Bias (variable)
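The slide's worked example as code: the fixed structure is input-processing-output, while the weights (0.25, 0.75) and the bias (2) are the variable parameters the system adjusts as it learns.

```python
# Basic node (fixed structure): just add the two input edges.
x1, x2 = 4, 16
print(x1 + x2)                   # 20

# Node with weights and bias (variable parameters the system "learns").
w1, w2, b = 0.25, 0.75, 2
print(w1 * x1 + w2 * x2 + b)     # 0.25*4 + 0.75*16 + 2 = 1 + 12 + 2 = 15
```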
  46. 46. 26 Jan 2019 Deep Learning Actual: same structure, more complicated 45
  47. 47. 26 Jan 2019 Deep Learning 46 Source: https://medium.com/@karpathy/software-2-0-a64152b37c35 Same structure, more complicated values
  48. 48. 26 Jan 2019 Deep Learning Neural net: massive scale-up of nodes 47 Source: http://neuralnetworksanddeeplearning.com/chap1.html
  49. 49. 26 Jan 2019 Deep Learning Same Structure 48
  50. 50. 26 Jan 2019 Deep Learning How does the neural net actually “learn”?  Vary the weights and biases to see if a better outcome is obtained  Repeat until the net correctly classifies the data 49 Source: http://neuralnetworksanddeeplearning.com/chap2.html  Structural system based on cascading layers of neurons with variable parameters: weight and bias
  51. 51. 26 Jan 2019 Deep Learning Backpropagation  Problem: Combinatorial complexity  Inefficient to test all possible parameter variations  Solution: Backpropagation (1986 Nature paper)  Optimization method used to calculate the error contribution of each neuron after a batch of data is processed 50 Source: http://neuralnetworksanddeeplearning.com/chap2.html
  52. 52. 26 Jan 2019 Deep Learning Backpropagation of errors 1. Calculate the total error 2. Calculate the contribution to the error at each step going backwards  Variety of Error Calculation methods: Mean Square Error (MSE), sum of squared errors of prediction (SSE), Cross- Entropy (Softmax), Softplus  Goal: identify which feature solutions have a higher power of potential accuracy 51
  53. 53. 26 Jan 2019 Deep Learning Backpropagation  Heart of Deep Learning  Backpropagation: algorithm dynamically calculates the gradient (derivative) of the loss function with respect to the weights in a network to find the minimum and optimize the function from there  Algorithms optimize the performance of the network by adjusting the weights, e.g.; in the gradient descent algorithm  Error and gradient are computed for each node  Intermediate errors transmitted backwards through the network (backpropagation)  Objective: optimize the weights so the network can learn how to correctly map arbitrary inputs to outputs 52 Source: http://briandolhansky.com/blog/2013/9/27/artificial-neural-networks-backpropagation-part-4, https://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example/
  54. 54. 26 Jan 2019 Deep Learning Gradient Descent  Gradient: derivative to find the minimum of a function  Gradient descent: optimization algorithm to find the biggest errors (minima) most quickly  Error = MSE, log loss, cross-entropy; e.g.; least correct predictions to correctly identify data  Technophysics methods: spin glass, simulated annealing 53 Source: http://briandolhansky.com/blog/2013/9/27/artificial-neural-networks-backpropagation-part-4
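A minimal numpy sketch combining backpropagation and gradient descent on a toy two-layer network (XOR-style data, MSE loss, arbitrary learning rate); it illustrates the feed-forward pass, the backward error pass, and the weight updates described on the last two slides.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)

# Tiny training set: inputs x and target outputs y (XOR-like toy data).
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # hidden layer
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # output layer
lr = 1.0                                        # learning rate (arbitrary)

for step in range(5000):
    # Feed-forward pass.
    h = sigmoid(x @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Mean-squared-error loss and its gradient at the output units.
    loss = np.mean((out - y) ** 2)
    d_out = 2 * (out - y) / len(x) * out * (1 - out)

    # Backward pass: propagate the error to each layer's weights.
    dW2 = h.T @ d_out
    db2 = d_out.sum(axis=0)
    d_h = d_out @ W2.T * h * (1 - h)
    dW1 = x.T @ d_h
    db1 = d_h.sum(axis=0)

    # Gradient descent: step each parameter against its gradient.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"final loss {loss:.4f}")
print(out.round(2))   # typically approaches [0, 1, 1, 0]
```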
  55. 55. 26 Jan 2019 Deep Learning  Optimization Technique  Mathematical tool used in statistics, finance, decision theory, biological modeling, computational neuroscience  State as non-linear equation to optimize  Minimize loss or cost  Maximize reward, utility, profit, or fitness  Loss function links instance of an event to its cost  Accident (event) means $1,000 damage on average (cost)  5 cm height (event) confers 5% fitness advantage (reward)  Deep learning: system feedback loop  Apply cost penalty for incorrect classifications in training  Methods: CNN (classification): cross-entropy; RNN (regression): MSE Loss Function 54 Laplace
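Two of the named loss functions on a toy prediction, as a sketch; the labels and probability guesses are made up for illustration.

```python
import numpy as np

y_true = np.array([1.0, 0.0, 1.0])   # actual labels (events)
y_pred = np.array([0.9, 0.2, 0.6])   # network's probability guesses

# Mean squared error: typical for regression-style outputs.
mse = np.mean((y_pred - y_true) ** 2)

# Cross-entropy: typical for classification; penalizes confident wrong guesses.
eps = 1e-12   # avoid log(0)
cross_entropy = -np.mean(y_true * np.log(y_pred + eps)
                         + (1 - y_true) * np.log(1 - y_pred + eps))

print(f"MSE = {mse:.3f}, cross-entropy = {cross_entropy:.3f}")
```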
  56. 56. 26 Jan 2019 Deep Learning Known problems: Overfitting  Regularization  Introduce additional information such as a lambda parameter in the cost function (to update the theta parameters in the gradient descent algorithm)  Dropout: prevent complex adaptations on training data by dropping out units (both hidden and visible)  Test new datasets 55
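Illustrative numpy sketches of the two remedies named above, an L2 (lambda) penalty on the cost and inverted dropout; the lambda value and keep probability are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# L2 regularization: add a lambda-weighted penalty on the weights to the cost,
# so gradient descent also shrinks the weights.
def l2_penalized_cost(base_cost, weights, lam=0.01):
    return base_cost + lam * np.sum(weights ** 2)

# Dropout: during training, randomly zero a fraction of a layer's activations
# so the network cannot rely on any single co-adapted unit.
def dropout(activations, keep_prob=0.8):
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob   # rescale to keep the expected value

h = rng.random(10)
print(dropout(h))
print(l2_penalized_cost(base_cost=0.35, weights=rng.normal(size=20)))
```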
  57. 57. 26 Jan 2019 Deep Learning Research Topics  Layer depth vs. height (1x9, 3x3, etc.); L1/2 slow-downs  Backpropagation, gradient descent, loss function  Saddle-free optimization, vanishing gradients  Composition of non-linearities  Non-parametric manifold learning, auto-encoders  Activation maximization (ReLU)  Synthesizing preferred inputs for neurons 56 Source: http://cs231n.github.io/convolutional-networks, https://arxiv.org/abs/1605.09304, https://www.iro.umontreal.ca/~bengioy/talks/LondonParisMeetup_15April2015.pdf
  58. 58. 26 Jan 2019 Deep Learning Advanced Deep Learning Architectures 57 Source: http://prog3.com/sbdm/blog/zouxy09/article/details/8781396  Deep Belief Network  Connections between layers, not units  Establish weighting guesses for processing units before running the deep learning system  Used to pre-train systems to assign initial probability weights (more efficient)  Deep Boltzmann Machine  Stochastic recurrent neural network  Runs learning on internal representations  Represent and solve combinatoric problems Deep Boltzmann Machine Deep Belief Network
  59. 59. 26 Jan 2019 Deep Learning Research Topics  Layer depth vs. height: (1x9, 3x3, etc.); L1/2 slow-downs  Dark knowledge: data compression, compress dark (unseen) knowledge into a single summary model  Adversarial networks: two networks, adversary network generates false data and discriminator network identifies  Reinforcement networks: goal-oriented algorithm for system to attain a complex objective over many steps 58 Source: http://cs231n.github.io/convolutional-networks, https://arxiv.org/abs/1605.09304, https://www.iro.umontreal.ca/~bengioy/talks/LondonParisMeetup_15April2015.pdf
  60. 60. 26 Jan 2019 Deep Learning Convolutional net: Image Enhancement  Google DeepDream: Convolutional neural network enhances (potential) patterns in images; deliberately over-processing images 59 Source: Georges Seurat, Un dimanche après-midi à l'Île de la Grande Jatte, 1884-1886; http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722; Google DeepDream uses algorithmic pareidolia (seeing an image when none is present) to create a dream-like hallucinogenic appearance
  61. 61. 26 Jan 2019 Deep Learning Hardware and Software Tools 60
  62. 62. 26 Jan 2019 Deep Learning Deep Learning Hardware  Advance in chip design  GPU chips (graphics processing unit): 3D graphics cards for fast matrix multiplication  Google TPU chip (tensor processing unit): flow through matrix multiplications without storing interim values in memory (AlphaGo)  Google Cloud TPUs: ML accelerators for TensorFlow; TPU 3.0 pod (8x more powerful, up to 100 petaflops (2018))  NVIDIA DGX-1 integrated deep learning system (Eight Tesla P100 GPU accelerators) 61 Google TPU chip (Tensor Processing Unit), 2016 Source: http://www.techradar.com/news/computing-components/processors/google-s-tensor-processing-unit-explained-this-is-what- the-future-of-computing-looks-like-1326915 NVIDIA DGX-1 Deep Learning System
  63. 63. 26 Jan 2019 Deep Learning USB and Browser-based Machine Learning  Intel: Movidius Visual Processing Unit (VPU): USB ML for IOT  Security cameras, industrial equipment, robots, drones  Apple: ML acquisition Turi (Dato)  Browser-based Deep Learning  ConvNetJS; TensorFire  Javascript library to run Deep Learning (Neural Networks) in a browser  Smart Network in a browser  JavaScript Deep Learning  Blockchain EtherWallets 62 Source: http://cs.stanford.edu/people/karpathy/convnetjs/, http://www.infoworld.com/article/3212884/machine-learning/machine-learning- comes-to-your-browser-via-javascript.html
  64. 64. 26 Jan 2019 Deep Learning Deep Learning frameworks and libraries 63 Source: http://www.infoworld.com/article/3163525/analytics/review-the-best-frameworks-for-machine-learning-and-deep- learning.html#tk.ifw-ifwsb
  65. 65. 26 Jan 2019 Deep Learning 64 Key software tool: TensorFlow
  66. 66. 26 Jan 2019 Deep Learning What is TensorFlow? 65 Source: https://www.youtube.com/watch?v=uHaKOFPpphU Python code invoking TensorFlow; TensorBoard (TensorFlow) visualization; computation graph design in TensorFlow  “Tensor” = multidimensional arrays used in NN operations  “Flow” directly through tensor operations (matrix multiplications) without needing to store intermediate values in memory Google’s open-source machine learning library
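A minimal example of tensors flowing through a matrix multiplication, assuming TensorFlow 2.x with eager execution installed; the values are arbitrary.

```python
import tensorflow as tf

# "Tensors" are multidimensional arrays; the computation graph "flows" them
# through operations such as matrix multiplication.
a = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])
b = tf.constant([[5.0, 6.0],
                 [7.0, 8.0]])

c = tf.matmul(a, b)   # matrix multiply, the core neural-net operation
print(c.numpy())
```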
  67. 67. 26 Jan 2019 Deep Learning How big are Deep Learning neural nets?  Google Brain cat recognition, 2011  1 billion connections, 10 million images (200x200 pixel), 1,000 machines (16,000 cores), 3 days, each instantiation of the network spanned 170 servers, and 20,000 object categories  State of the art, 2016-2019  NVIDIA facial recognition, 100 million images, 10 layers, 1 bn parameters, 30 exaflops, 30 GPU days  Google, 11.2-billion parameter system  Lawrence Livermore Lab, 15-billion parameter system  Digital Reasoning, cognitive computing (Nashville TN), 160 billion parameters, trained on three multi-core computers overnight 66 Parameters: variables that determine the network structure Source: https://futurism.com/biggest-neural-network-ever-pushes-ai-deep-learning, Digital Reasoning paper: https://arxiv.org/pdf/1506.02338v3.pdf
  68. 68. 26 Jan 2019 Deep Learning Agenda  Deep Learning  Definition  Technical details  Applications  Deep Qualia: Deep Learning and the Brain  Smart Network Convergence Theory  Conclusion 67 Image Source: http://www.opennn.net
  69. 69. 26 Jan 2019 Deep Learning Applications: Cats to Cancer to Cognition 68 Source: Yann LeCun, CVPR 2015 keynote (Computer Vision ), "What's wrong with Deep Learning" http://t.co/nPFlPZzMEJ Computational imaging: Machine learning for 3D microscopy https://www.nature.com/nature/journal/v523/n7561/full/523416a.html
  70. 70. 26 Jan 2019 Deep Learning Tumor Image Recognition 69 Source: https://www.nature.com/articles/srep24454  Computer-Aided Diagnosis with Deep Learning Architecture  Breast tissue lesions in images and pulmonary nodules in CT Scans
  71. 71. 26 Jan 2019 Deep Learning Melanoma Image Recognition 70 Source: Nature, volume 542, pages 115–118 (02 February 2017), http://www.nature.com/nature/journal/v542/n7639/full/nature21056.html 2017
  72. 72. 26 Jan 2019 Deep Learning Melanoma Image Recognition 71 Source: https://www.techemergence.com/machine-learning-medical-diagnostics-4-current-applications/  Diagnose skin cancer using deep learning CNNs  Algorithm trained to detect skin cancer (melanoma) using 130,000 images of skin lesions representing over 2,000 different diseases
  73. 73. 26 Jan 2019 Deep Learning DIY Image Recognition: use Contrast 72 Source: https://developer.clarifai.com/models How many orange pixels? Apple or Orange? Melanoma risk or healthy skin? Degree of contrast in photo colors?
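A crude sketch of such a hand-built color/contrast heuristic in numpy; the "orange" thresholds are arbitrary guesses rather than a calibrated color model, and the photo is a random stand-in array.

```python
import numpy as np

# Stand-in photo: a random H x W x 3 array of RGB values (0-255).
rng = np.random.default_rng(0)
photo = rng.integers(0, 256, size=(100, 100, 3))

r, g, b = photo[..., 0], photo[..., 1], photo[..., 2]

# Crude "orange" test: red high, green moderate, blue low.
orange = (r > 150) & (g > 60) & (g < 180) & (b < 100)
orange_fraction = orange.mean()

# Crude contrast measure: spread of the grayscale brightness values.
gray = photo.mean(axis=-1)
contrast = gray.std()

print(f"orange pixels: {orange_fraction:.1%}, contrast (std): {contrast:.1f}")
```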
  74. 74. 26 Jan 2019 Deep Learning Deep Learning and Genomics  Large classes of hypothesized but unknown correlations  Genotype-phenotype disease linkage unknown  Computer-identifiable patterns in genomic data  RNN: textual analysis; CNN: genome symmetry 73 Source: http://ieeexplore.ieee.org/document/7347331
  75. 75. 26 Jan 2019 Deep Learning Deep Learning and the Brain 74
  76. 76. 26 Jan 2019 Deep Learning  Deep learning neural networks are inspired by the structure of the cerebral cortex  The processing unit, perceptron, artificial neuron is the mathematical representation of a biological neuron  In the cerebral cortex, there can be several layers of interconnected perceptrons 75 Deep Qualia machine? General purpose AI Mutual inspiration of neurological and computing research
  77. 77. 26 Jan 2019 Deep Learning Deep Qualia machine?  Visual cortex is hierarchical with intermediate layers  The ventral (recognition) pathway in the visual cortex has multiple stages: Retina - LGN - V1 - V2 - V4 - PIT – AIT  Human brain simulation projects  Swiss Blue Brain project, European Human Brain Project 76 Source: Yann LeCun, http://www.pamitc.org/cvpr15/files/lecun-20150610-cvpr-keynote.pdf
  78. 78. 26 Jan 2019 Deep Learning Social Impact of Deep Learning  WHO estimates 400 million people without access to essential health services  6% in extreme poverty due to healthcare costs  Next leapfrog technology: Deep Learning  Last-mile build out of brick-and-mortar clinics does not make sense in era of digital medicine  Medical diagnosis via image recognition, natural language processing symptoms description  Convergence Solution: Digital Health Wallet  Deep Learning medical diagnosis + Blockchain- based EMRs (electronic medical records)  Empowerment Effect: Deep learning = “tool I use,” not hierarchically “doctor-administered” 77 Source: http://www.who.int/mediacentre/news/releases/2015/uhc-report/en/ Digital Health Wallet: Deep Learning diagnosis Blockchain-based EMRs
  79. 79. 26 Jan 2019 Deep Learning Agenda  Deep Learning  Definition  Technical details  Applications  Deep Qualia: Deep Learning and the Brain  Smart Network Convergence Theory  Conclusion 78 Image Source: http://www.opennn.net
  80. 80. 26 Jan 2019 Deep Learning 79 Progression of a New Technology: 1.0 “Better horse”, 2.0 “Horseless carriage”, 3.0 “Car”. Deep Learning: Predictive Simulation, Data Automation: Supply Chain; Matching: Buyer-Seller, Invoice-PO; Optimization, Pattern Recognition: Autonomous Transportation, Medical Diagnostics, Time Series Forecasting; Object Identification (IDtech), Facial Recognition, Language Translation. Source: Swan, M. (Submitted). Philosophy of Deep Learning Networks: Reality Automation Modules
  81. 81. 26 Jan 2019 Deep Learning Deep Learning Smart Network Thesis 80 Deep learning is a smart network: global computational infrastructure that operates autonomously Source: Swan, M., and dos Santos, R.P. In prep. Smart Network Field Theory: The Technophysics of Blockchain and Deep Learning. https://www.researchgate.net/publication/328051668_Smart_Network_Field_Theory_The_Technophysics_of_Blockchain_and_Deep_Learning Other smart networks: UAVs, blockchain economic networks, satellites, smart city IoT landscapes, real-time bidding markets for advertising, and high-frequency trading platforms
  82. 82. 26 Jan 2019 Deep Learning 81 Smart Network Convergence Theory: Smart networks are computing networks with intelligence built in, such that identification and transfer are performed by the network itself through protocols that automatically identify (deep learning), and validate, confirm, and route transactions (blockchain) within the network
  83. 83. 26 Jan 2019 Deep Learning Smart Network Convergence Theory  Network intelligence “baked in” to smart networks  Deep Learning algorithms for predictive identification  Blockchains to transfer value, confirm authenticity 82 Source: Expanded from Mark Sigal, http://radar.oreilly.com/2011/10/post-pc-revolution.html Two Fundamental Eras of Network Computing
  84. 84. 26 Jan 2019 Deep Learning Next Phase  Put Deep Learning systems on the Internet  Deep Learning Blockchain Networks  Combine Deep Learning and Blockchain Technology  Blockchain offers secure audit ledger of activity  Advanced computational infrastructure to tackle larger-scale problems  Genomic disease, protein modeling, energy storage, global financial risk assessment, voting, astronomical data 83
  85. 85. 26 Jan 2019 Deep Learning Example: Autonomous Driving  Requires the smart network functionality of deep learning and blockchain  Deep Learning: identify what things are  Convolutional neural nets core element of machine vision system  Blockchain: secure automation technology  Track arbitrarily-many fleet units  Legal accountability  Software upgrades  Remuneration 84
  86. 86. 26 Jan 2019 Deep Learning The Future Learning optimizes Quantum Computing (QC) 85  QC: assign an amplitude (not a probability) for possible states of the world  Amplitudes can interfere destructively and cancel out, be complex numbers, not sum to 1  Feynman: “QM boils down to the minus signs”  QC: a device that maintains a state that is a superposition for every configuration of bits  Turn amplitude into probabilities (event probability is the squared absolute value of its amplitude)  Challenge: obtain speed advantage by exploiting amplitudes, need to choreograph a pattern of interference (not measure random configurations)  New field: Quantum Machine Learning Sources: Scott Aaronson; and Biamonte, Lloyd, et al. (2017). Quantum machine learning. Nature. 549:195–202.
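A small numeric illustration of amplitudes versus probabilities, following the rule stated above that an event's probability is the squared absolute value of its amplitude; the amplitude values are arbitrary.

```python
import numpy as np

# Amplitudes for two configurations; complex-valued and allowed to interfere,
# unlike classical probabilities.
amplitudes = np.array([1 / np.sqrt(2), -1 / np.sqrt(2)], dtype=complex)

# Event probability is the squared absolute value of its amplitude.
probabilities = np.abs(amplitudes) ** 2
print(probabilities)              # [0.5 0.5] -- sums to 1

# Interference: adding two paths' amplitudes before squaring can cancel out.
path_a, path_b = 1 / np.sqrt(2), -1 / np.sqrt(2)
print(abs(path_a + path_b) ** 2)  # 0.0 -- destructive interference
```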
  87. 87. 26 Jan 2019 Deep Learning The Very Small Blockchain Deep Learning nets in Cells  On-board pacemaker data security, software updates, patient monitoring  Medical nanorobotics for cell repair  Deep Learning: identify what things are (diagnosis)  Blockchain: secure automation technology  Bio-cryptoeconomics: secure automation of medical nanorobotics for cell repair  Medical nanorobotics as coming-onboard repair platform for the human body  High number of agents and “transactions”  Identification and automation is obvious 86 Sources: Swan, M. Blockchain Thinking: The Brain as a DAC (Decentralized Autonomous Corporation)., IEEE 2015; 34(4): 41-52 , Swan, M. Forthcoming. Technophysics, Smart Health Networks, and the Bio-cryptoeconomy: Quantized Fungible Global Health Care Equivalency Units for Health and Well-being. In Boehm, F. Ed., Nanotechnology, Nanomedicine, and AI. Boca Raton FL: CRC Press
  88. 88. 26 Jan 2019 Deep Learning The Very Large Blockchain Deep Learning nets in Space  Satellite networks  Automated space construction bots/agents  Deep Learning: identify what things are (classification)  Blockchain: secure automation technology  Applications: asteroid mining, terraforming, radiation-monitoring, space-based solar power, debris tracking net 87
  89. 89. 26 Jan 2019 Deep Learning Agenda  Deep Learning  Definition  Technical details  Applications  Deep Qualia: Deep Learning and the Brain  Smart Network Convergence Theory  Conclusion 88 Image Source: http://www.opennn.net
  90. 90. 26 Jan 2019 Deep Learning Risks and Limitations of Deep Learning 89  Complicated solution  Conceptually and technically; requires skilled workforce  Limited solution  So far, restricted to a specific range of applications (supervised learning for image and text recognition)  Plateau: cheap hardware and already-labeled data sets; need to model complex network science relationships between data  Non-generalizable intelligence  AlphaGo learns each arcade game from scratch  How does the “black box” system work?  Claim: no “learning,” just a clever mapping of the input data vector space to output solution vector space Source: Battaglia et al. 2018. Relational inductive biases, deep learning, and graph networks. arXiv:1806.01261.
  91. 91. 26 Jan 2019 Deep Learning 90 Conceptual Definition: Deep learning is a computer program that can identify what something is Technical Definition: Deep learning is a class of machine learning algorithms in the form of a neural network that uses a cascade of layers of processing units to model high-level abstractions in data and extract features from data in order to make predictive guesses about new data Source: Extending Yann LeCun, http://spectrum.ieee.org/automaton/robotics/artificial-intelligence/facebook-ai-director-yann-lecun-on-deep-learning
  92. 92. 26 Jan 2019 Deep Learning Deep Learning Theory  System is “dumb” (i.e. mechanistic)  “Learns” by having big data (lots of input examples), and making trial-and-error guesses to adjust weights to find key features  Creates a predictive system to identify new examples  Same AI argument: big enough data is what makes a difference (“simple” algorithms run over large data sets) 91 Input: Big Data (e.g., many examples) Method: Trial-and-error guesses to adjust node weights Output: system identifies new examples
  93. 93. 26 Jan 2019 Deep Learning 3 Key Technical Principles of Deep Learning 92 Sigmoid Function: squash values into a probability function (Sigmoid (0-1); Tanh ((-1)-1)); why: formulate as a logistic regression problem for greater mathematical manipulation. Perceptron Structure: core processing unit (input-processing-output) with levers of weights and bias; why: the “dumb” system learns by adjusting parameters and checking against the outcome. Loss Function: reduce combinatoric dimensionality; why: the loss function optimizes the efficiency of the solution.
  94. 94. 26 Jan 2019 Deep Learning Our human future 93  Are we doomed?  Redefining human identity  What do computers excel at?  What do humans excel at?
  95. 95. 26 Jan 2019 Deep Learning Human-machine collaboration 94  Team-members excel at different tasks  Differently-abled agents in society Source: Swan, M. (2017). Is Technological Unemployment Real? In: Surviving the Machine Age. http://www.springer.com/us/book/9783319511641
  96. 96. 26 Jan 2019 Deep Learning Conclusion  Deep learning is not merely an AI technique or a software program, but a new class of smart network information technology that is changing the concept of the modern technology project by offering real-time engagement with reality  Deep learning is a data automation method that replaces hard-coded software with a capacity, in the form of a learning network that is trained to perform a task 95
  97. 97. 26 Jan 2019 Deep Learning Resources 96  Neural Networks and Deep Learning, Michael Nielsen, http://neuralnetworksanddeeplearning.com/  Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville, http://www.deeplearningbook.org/ (machine learning and deep neural nets)  Machine Learning Guide podcast, Tyler Renelle, http://ocdevel.com/podcasts/machine-learning  notMNIST dataset http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html  Metacademy; Fast.ai; Keras.io  Distill (visual ML journal) http://distill.pub Source: http://cs231n.stanford.edu https://www.deeplearning.ai/
  98. 98. Source: https://www.nvidia.com/en-us/deep-learning-ai/industries Thank You! Questions?
  99. 99. Melanie Swan Philosophy Department, Purdue University melanie@BlockchainStudies.org Deep Learning Explained The future of Smart Networks Waterfront Conference Center Indianapolis IN, January 26, 2019 Slides: http://slideshare.net/LaBlogga Image credit: NVIDIA Thank You! Questions?
