Leading-edge AI applications have always been resource-intensive, stretching the limits of conventional (von Neumann architecture) computer performance. Specialized hardware, purpose-built to optimize AI applications, is not new. In fact, the very first .com internet domain was registered in 1985 to Symbolics, a company that built the Lisp Machine, a dedicated AI workstation. In the three decades since, the performance of conventional computers has improved dramatically as advances in chip density (Moore’s Law) led to faster processors, faster memory, and massively parallel architectures. And yet some applications, such as machine vision for real-time video analysis and deep machine learning, always need more power.
Participants in this webinar will learn the fundamentals of the three hardware approaches that are attracting significant investment and showing real promise for AI applications:
- neuromorphic/neurosynaptic architectures (brain-inspired hardware)
- GPUs (graphics processing units, optimized for AI algorithms), and
- quantum computers (based on principles and properties of quantum mechanics rather than binary logic).
Note - This webinar requires no previous knowledge of hardware or computer architectures.
Smart Data Slides: Emerging Hardware Choices for Modern AI Data Management
1. November 10, 2016
Adrian Bowles, PhD
Founder, STORM Insights, Inc.
info@storminsights.com
Emerging Hardware Choices for #ModernAI
2. Copyright (c) 2016 by STORM Insights Inc. All Rights reserved.
Hardware - The Final Frontier for Workload Optimization
Performance Challenges for #ModernAI
Optimizing Workloads Through Parallel Execution
Three Architectural Paths
Neuromorphic
GPU/Advanced Memory
Quantum
Market Overview & Recommendations
Agenda
3.
Value Migrates to Hardware
Optimize
Commoditize
Standardize
5.
Emerging AI Hardware Trends and Options
A Role for Hardware Optimization
Cognitive
Machine Learning
Reasoning
Understanding
Planning
Human Input
Language
Vision
Aural
Human-Oriented Output
Machine Input
IoT
Machine-Oriented Output
6. Human
Machine
Input Output
Narrative Generation
Data Mgmt
Learn Model
Reason
Understand
Plan
Taste
Smell
Touch
Hear
See
Gestures
Emotions
Language
Visualization
Reports
Haptics
IoT IoT
Cognitive Systems: Communication & Control
Sensors
Systems
Controls
7.
Hearing (audioception)
~12,000 outer hair cells/ear
~3,500 inner hair cells
Vision (ophthalmoception)
Photoreceptors - Per Eye
~120,000,000 rod cells
(triggered by single photon)
~6,000,000 cone cells
(require more photons to trigger)
~ 60,000 photosensitive ganglion cells
Touch (tactioception)
Thermoreceptors, mechanoreceptors,
chemoreceptors and nociceptors for touch, pressure, pain,
temperature, vibration
Smell (olfacoception)
Chemoreception
Taste (gustaoception)
Chemoreception
Neurosynaptic Problem Solving Scope
8.
Human Cognition
~100,000,000,000 (100B) Neurons
~100-500,000,000,000,000 (100-500T) Synapses
Neurosynaptic Problem Solving Scope
Learn
Model
Reason
Understand
Plan
9.
deep
learning
Deep learning refers to a biologically inspired approach to machine
learning that leverages a collection of simple processing units - analogous
to neurosynaptic elements - that collaborate to solve complex problems at
multiple levels of abstraction.
These modern neural networks can support supervised, reinforcement, or
unsupervised learning systems.
In general, deep learning solutions require a high degree of parallelism,
which may be implemented in hardware and/or software.
Deep Learning is Inherently Parallel
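The parallelism claim can be made concrete with a minimal sketch, assuming a single fully connected layer. The weights, biases, and inputs below are made-up illustrative values, and the function names are hypothetical; none of them come from the slides. Each simple processing unit computes a weighted sum and a nonlinearity that depends only on the shared input, never on another unit in the same layer, so the units can be evaluated in any order or all at once.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def unit_output(weights, bias, inputs):
    """One simple processing unit: weighted sum followed by a sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Hypothetical layer: 4 units, each with 3 input weights and a bias.
LAYER = [
    ([0.2, -0.5, 0.1], 0.0),
    ([0.7, 0.3, -0.2], 0.1),
    ([-0.4, 0.9, 0.5], -0.3),
    ([0.0, 0.6, -0.8], 0.2),
]
INPUTS = [1.0, 0.5, -1.5]

def layer_sequential(layer, inputs):
    """Evaluate the units one after another."""
    return [unit_output(w, b, inputs) for w, b in layer]

def layer_parallel(layer, inputs, workers=4):
    """Evaluate the units as independent tasks; they share no state."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(unit_output, w, b, inputs) for w, b in layer]
        return [f.result() for f in futures]

sequential = layer_sequential(LAYER, INPUTS)
parallel = layer_parallel(LAYER, INPUTS)
```

The thread pool here only demonstrates that the units are independent; it is not meant to show a speedup. Real gains come from hardware that executes many such units simultaneously.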
10.
Memory
(Instructions & Data)
Central Processing Unit
(CPU)
Control Unit
Arithmetic/Logic Unit
(ALU)
Input
Device(s)
Output
Device(s)
Operating System
The von Neumann Architecture
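As a rough illustration of the diagram, here is a toy stored-program machine. The instruction set (LOAD, ADD, STORE, OUT, HALT) and the memory layout are hypothetical and chosen only for this sketch. The point to notice is that instructions and data live in the same memory, and the control loop fetches from it one item at a time.

```python
def run(memory):
    """Fetch-decode-execute loop over a single shared memory."""
    pc = 0        # program counter, owned by the control unit
    acc = 0       # accumulator, the ALU's working register
    output = []   # stands in for the output device
    while True:
        op, arg = memory[pc]          # fetch an instruction from memory
        pc += 1
        if op == "LOAD":
            acc = memory[arg]         # fetch data from the same memory
        elif op == "ADD":
            acc = acc + memory[arg]   # the ALU does the arithmetic
        elif op == "STORE":
            memory[arg] = acc
        elif op == "OUT":
            output.append(memory[arg])
        elif op == "HALT":
            return output
        else:
            raise ValueError(f"unknown opcode {op!r}")

# Addresses 0-4 hold instructions, addresses 5-7 hold data.
PROGRAM = [
    ("LOAD", 5),
    ("ADD", 6),
    ("STORE", 7),
    ("OUT", 7),
    ("HALT", None),
    2,   # address 5: first operand
    3,   # address 6: second operand
    0,   # address 7: result slot
]

result = run(list(PROGRAM))
```

Every instruction and every operand crosses the same path between memory and CPU, one access per step, which is why that path tends to limit throughput in this design.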
11.
Memory
(Instructions & Data)
Central Processing Unit
(CPU)
Control Unit
Arithmetic/Logic Unit
(ALU)
Input
Device(s)
Output
Device(s)
Operating System
“Speed”/Throughput Constraints
12.
Memory
(Instructions & Data)
Central Processing Unit
(CPU)
Control Unit
Arithmetic/Logic Unit
(ALU)
Input
Device(s)
Output
Device(s)
Operating System
Control Unit
Arithmetic/Logic Unit
(ALU)
Parallelism With Multi-Cores
13. 9/28/2011
IBM Power 750
90 servers, 32 cores/server,
2,880 cores in 10 racks
16 TB RAM
~80 teraFLOPS
(80,000,000,000,000 FLOPS)
IBM Watson - Parallelism for Deep QA
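The figures on this slide can be cross-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it treats the ~80 teraFLOPS figure as an aggregate spread evenly across all cores, which is an assumption rather than something the slide states.

```python
SERVERS = 90
CORES_PER_SERVER = 32
AGGREGATE_FLOPS = 80e12   # ~80 teraFLOPS, as quoted on the slide

total_cores = SERVERS * CORES_PER_SERVER
flops_per_server = AGGREGATE_FLOPS / SERVERS
flops_per_core = AGGREGATE_FLOPS / total_cores   # assumes an even split

print(f"total cores:      {total_cores:,}")
print(f"per-server share: {flops_per_server / 1e9:,.0f} GFLOPS")
print(f"per-core share:   {flops_per_core / 1e9:,.1f} GFLOPS")
```

The core count matches the slide, and the per-core share works out to a little under 28 GFLOPS. The aggregate number comes from running many ordinary cores side by side, not from any single fast processor.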
14. Source: https://www.top500.org/system/177999
Amdahl’s Law: The theoretical performance improvement resulting from
a resource improvement for a fixed workload is limited by that part of the
workload that cannot benefit from the resource improvement.
Limits to Parallelism
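Amdahl's Law is usually written as speedup = 1 / ((1 - p) + p / s), where p is the fraction of the workload that benefits from the improved resource and s is the factor by which that resource improves. The sketch below evaluates it. The function names are hypothetical, and the 95% parallel fraction is an assumed value for illustration; the 2,880 figure is the core count from the Watson slide.

```python
import math

def amdahl_speedup(parallel_fraction, resource_speedup):
    """Overall speedup when only part of a fixed workload is accelerated."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if resource_speedup <= 0:
        raise ValueError("resource_speedup must be positive")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / resource_speedup)

def amdahl_limit(parallel_fraction):
    """Upper bound on speedup as the resource improvement grows without limit."""
    serial_fraction = 1.0 - parallel_fraction
    return math.inf if serial_fraction == 0 else 1.0 / serial_fraction

# Assumed workload: 95% of the work parallelizes perfectly.
for cores in (2, 32, 2880):
    print(f"{cores:>5} cores -> {amdahl_speedup(0.95, cores):.2f}x")
print(f"limit       -> {amdahl_limit(0.95):.2f}x")
```

Under that assumption, going from 32 cores to 2,880 cores moves the speedup only from about 12.5x to just under 20x, because the 5% serial portion dominates once the parallel portion has been made fast.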
15.
Research Examples:
The European Commission FACETS (Fast Analog Computing with Emergent Transient States)
and BrainScaleS (Brain-inspired multi scale computation in neuromorphic hybrid systems)
UK SpiNNaker (Spiking Neural Network Architecture)
DARPA - SyNAPSE (Systems of Neuromorphic Adaptive Plastic Scalable Electronics)
Computer-, device-, or component-level systems modeled after biological
systems or components, such as neurons and synapses. These may be
implemented in analog, digital, or hybrid hardware. They are typically designed
to learn by experience over time, rather than by programming.
Neuromorphic Architectures (“Brain-Inspired”)
Massively interconnected networks of very simple processors.
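To give a feel for what a "very simple processor" in such a network might do, here is a minimal sketch of a leaky integrate-and-fire neuron, a common textbook model for spiking systems. The slides do not say which neuron model any of these projects use, so the model choice, the function name, and every parameter value below are assumptions for illustration.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, reset=0.0):
    """Return the time steps at which a leaky integrate-and-fire neuron spikes.

    At each step the membrane potential decays by the leak factor, the
    incoming current is added, and a spike is emitted (followed by a reset)
    once the potential reaches the threshold.
    """
    potential = reset
    spikes = []
    for step, current in enumerate(input_current):
        potential = potential * leak + current   # leak, then integrate
        if potential >= threshold:
            spikes.append(step)                  # fire
            potential = reset
    return spikes

strong_input = simulate_lif([0.3] * 20)   # steady drive above the leak
weak_input = simulate_lif([0.05] * 100)   # too weak to ever reach threshold
```

With the steady 0.3 input the neuron fires every fourth step, while the 0.05 input leaks away before the potential can reach the threshold. The unit only does work when spikes occur, which is one reason this style of hardware is associated with low power.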
16.
SyNAPSE 16-chip board
Neuromorphic Architectures
IBM - SyNAPSE board
“TrueNorth chips can be seamlessly tiled to create vast, scalable neuromorphic systems.”
Already demonstrated 16 million neurons and 4 billion synapses.
Goal is to integrate 4,096 chips in a single rack with 4 billion neurons
and 1 trillion synapses while consuming ~4kW of power.
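The scaling goal can be sanity-checked from the numbers on this slide. The sketch assumes that the demonstrated figures refer to the 16-chip board and that capacity scales linearly with chip count; both are assumptions, not statements from the slide.

```python
DEMO_CHIPS = 16                  # the 16-chip board (assumed basis of the demo)
DEMO_NEURONS = 16_000_000
DEMO_SYNAPSES = 4_000_000_000

TARGET_CHIPS = 4096
TARGET_POWER_WATTS = 4000        # ~4 kW for the whole rack

neurons_per_chip = DEMO_NEURONS // DEMO_CHIPS
synapses_per_chip = DEMO_SYNAPSES // DEMO_CHIPS

# Linear scaling assumption: each added chip contributes the same capacity.
target_neurons = neurons_per_chip * TARGET_CHIPS
target_synapses = synapses_per_chip * TARGET_CHIPS
watts_per_chip = TARGET_POWER_WATTS / TARGET_CHIPS

print(f"neurons per chip:  {neurons_per_chip:,}")
print(f"synapses per chip: {synapses_per_chip:,}")
print(f"rack neurons:      {target_neurons:,}")
print(f"rack synapses:     {target_synapses:,}")
print(f"power per chip:    {watts_per_chip:.2f} W")
```

The results line up with the stated goal of roughly 4 billion neurons and 1 trillion synapses, and the rack power budget works out to just under 1 W per chip slot, including whatever supporting electronics share that budget.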
17. Source: Qualcomm
Neuromorphic Architectures
MAY 2, 2016: Qualcomm Incorporated (NASDAQ: QCOM) today announced at the Embedded Vision Summit in Santa Clara, Calif., that its subsidiary,
Qualcomm Technologies, Inc., is offering the first deep learning software development kit (SDK) for devices powered by Qualcomm® Snapdragon™ 820
processors. The SDK, called the Qualcomm Snapdragon Neural Processing Engine, is powered by the Qualcomm® Zeroth™ Machine Intelligence
Platform
18.
The Nvidia M40 processor for training neural networks.
Nvidia
NVIDIA Maxwell™ architecture
Up to 7 Teraflops of single-precision performance with NVIDIA GPU Boost™
3072 NVIDIA CUDA® cores
24 GB of GDDR5 memory
288 GB/sec memory bandwidth
Qualified to deliver maximum uptime in the datacenter
GPU/Advanced Memory Architectures
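The M40 figures above can be related to each other with some simple arithmetic. This is a rough sketch: it assumes peak throughput is counted as one fused multiply-add (2 floating-point operations) per CUDA core per clock cycle, which is a common convention but is not stated on the slide.

```python
PEAK_FLOPS = 7e12                 # up to 7 teraflops, single precision
CUDA_CORES = 3072
MEMORY_BANDWIDTH = 288e9          # bytes per second
FLOPS_PER_CORE_PER_CYCLE = 2      # assumption: one fused multiply-add per cycle

flops_per_core = PEAK_FLOPS / CUDA_CORES
implied_clock_hz = flops_per_core / FLOPS_PER_CORE_PER_CYCLE
flops_per_byte = PEAK_FLOPS / MEMORY_BANDWIDTH

print(f"per-core throughput: {flops_per_core / 1e9:.2f} GFLOPS")
print(f"implied clock:       {implied_clock_hz / 1e9:.2f} GHz")
print(f"compute per byte:    {flops_per_byte:.1f} FLOPs per byte of memory traffic")
```

Each core is modest on its own, at a little over 2 GFLOPS; the headline number comes from running thousands of them together. The last ratio suggests that a workload must perform on the order of 24 operations for every byte it reads from memory to keep the cores busy, which dense matrix arithmetic in neural network training can often do.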
19.
GPU/Advanced Memory Architectures
20.
Server racks with TPUs used in the
AlphaGo matches with Lee Sedol
GPU/Advanced Memory Architectures
21.
At Facebook, we've made great progress thus far with off-the-shelf infrastructure components
and design. We've developed software that can read stories, answer questions about
scenes, play games and even learn unspecified tasks through observing some examples.
But we realized that truly tackling these problems at scale would require us to design our own
systems. Today, we're unveiling our next-generation GPU-based systems for training neural
networks, which we've code-named “Big Sur.”
• FAIR is more than tripling its investment in GPU hardware as we focus even more on
research and enable other teams across the company to use neural networks in our
products and services.
• As part of our ongoing commitment to open source and open standards, we plan to
contribute our innovations in GPU hardware to the Open Compute Project so others
can benefit from them.
Facebook Open-source AI hardware design
https://code.facebook.com/posts/1687861518126048/facebook-to-open-source-ai-hardware-design/
GPU/Advanced Memory Architectures
22.
Source: https://www.micron.com/about/emerging-technologies/automata-processing
GPU/Advanced Memory Architectures
23.
GPU/Advanced Memory Architectures
24.
http://www.research.ibm.com/quantum/
Quantum Architectures
25.
Source: https://arxiv.org/abs/1608.00263
Quantum Architectures
26.
Probabilistic Architecture?
27.
Market/Technology Positions & Maturity
Neuromorphic: Promising - Ready Now At Handset Level
+ Natural behavioral process model
+ Lower power requirements
- Requires new software model & skills
GPU/Memory Acceleration: Ready Now, Much More in the Pipeline
+ Proven approach for parallelism
+ Easy interoperability with conventional systems
Quantum: Promising - Watch But Don’t Wait
+ Incredible compute power potential
- Requires new software model & skills
- Requires interface to conventional system for pre-processing
- Requires extremely cold (big, expensive) environment
28.
IBM
Qualcomm
Brain Corporation
(hosted by Qualcomm)
Knupath
Tenstorrent
Cirrascale
Neurogrid (Stanford)
Tensilica - Cadence
1026 Labs
Cerebras
Artificial Learning
HRL Laboratories
Isocline
Nvidia
Intel
AMD
Facebook (FAIR)
Nervana Systems/Intel
Movidius - Intel (Vision processing)
Google TPU
IBM
D-Wave
Google
Neuromorphic
GPU/
Memory Acceleration
Quantum
Ones to Watch
On the Horizon
Ready Now
Much More in the Pipeline
Promising -
Ready Now At Handset Level
Promising -
Watch But Don’t Wait
29.
adrian@storminsights.com
Twitter @ajbowles
Skype ajbowles
Upcoming Webinar Dates & Topics
December 8 Leverage the IoT to Build a Smart Data Ecosystem
January #ModernAI and Cognitive Computing: Boundaries and Opportunities
February Artificial General Intelligence: When Can I Get It?
March Data Science and Business Analysis: A Look at Best Practices for Roles, Skills, and Processes
April Machine Learning: Moving Beyond Discovery to Understanding
May Streaming Analytics for Agile IoT-Oriented Applications
June Machine Learning Case Studies
July Advances in Natural Language Processing I: Understanding
August Organizing Data and Knowledge: The Role of Taxonomies and Ontologies
September Advances in Natural Language Processing II: NL Generation
October Choosing the Right Data Management Architecture for Cognitive Computing
November See Me, Feel Me, Touch Me, Heal Me: The Rise of the Cognitive Interface
December The Road to Autonomous Applications
For More Information…
30.
Basilar membrane. (2016, October 28). In Wikipedia, The Free Encyclopedia. Retrieved 01:58, October 28, 2016, from https://en.wikipedia.org/w/index.php?title=Basilar_membrane&oldid=746543229
Somatosensory system. (2016, October 9). In Wikipedia, The Free Encyclopedia. Retrieved 04:59, October 9, 2016, from https://en.wikipedia.org/w/index.php?title=Somatosensory_system&oldid=743336883
Photoreceptor cell. (2016, September 19). In Wikipedia, The Free Encyclopedia. Retrieved 03:07, September 19, 2016, from https://en.wikipedia.org/w/index.php?title=Photoreceptor_cell&oldid=740108113
31.
Hardware - The Final Frontier for Workload Optimization
#ModernAI Defined
Performance Challenges
Optimizing Workloads Through Parallel Execution
Three Architecture Paths
Neuromorphic
GPU/Advanced Memory
Quantum
Agenda