SlideShare ist ein Scribd-Unternehmen logo
1 von 24
Downloaden Sie, um offline zu lesen
NovuTensor: Hardware Acceleration of Deep Convolutional
Neural Networks for AI
User Friendly
NovuTensor
AI Chip
NovuStar
Supercomputer
Neural
Network
Models
Training NovuBrain
AI Module
Get Data
Training ability
Model accuracy
Full-StackAI
Solutions
NovuMind
Industry Corporation
Customer
&Alliance
• The core of CNN is 3-D Convolution Computation
• Deeper CNN needs more 3-D Convolution Computation
• AlexNet -> VGG -> ResNet -> …
• DCNN is a network topology composed of basic computation elements
(e.g., 3x3 3-D Convolutions )
•
•
•
•
•
TPU with 2D Systolic Array
1080Ti Throughput:
Peak: 11TOPS/s
Effective: 2.4TOPS/s
(ResNet18)
Efficiency: 22%
TPU Throughput:
Peak: 96TOPS/s
Average: 23TOPS/s
Efficiency: 24%
2D Accelerators for Deep Learning
Mesh Structure and Systolic Array
• Overhead to prepare the 3D data into 2D data
• Overhead to feed data
• TPU uses 256x256 systolic array->needs 65536 data to fill
• Systolic Array got 2D parallel acceleration – only when data is loaded
1D, 2D, 3D – An Ice Cream Example
How Do You Eat Ice Cream?
• Data Movement:
• Need a scoop,then the art of scooping
• Need a spoon or cone that is mouth size
• Arithmetic:
• Need a mouth to consume the ice cream
• Unnecessary Assumptions with Mesh:
• dqual distance; deighboring data dependency
• Collective Streaming, and CNN 3D Data Processing: Data independency b/w PEs:
• no data flow
• no need to be rectangular, PEs can be flexible to have dozens to thousands
of MACs
•
•
•
The Scooping: Tile Based Operation (patented)
• Efficient Data Movement
• Reduce unnecessary data movement off chip and on chip
• Fully exploit data sharing, re-using:
• IFM sharing between OFMs
• Weight sharing within IFM (X-Y)
• Weight sharing within batch
• Efficient Arithmetic Logic
• Reduce unnecessary computation
PEP0
PEP1
PEP2
PEP3
IFM
broadcast
OFM[0]
OFM[1]
OFM[2]
OFM[3]
concatenate
OFM[3:0]
•
•
•
NT-BB
FPGA
Server
GPU
Perform
ance
(FPS)
301 666
Power
(W)
12 250
1/20 Power Consumption
1/2 Performance
ResNet18
ImageNet Classification
20
Name
Power
Consumption (w)
Performance
(Effective TOPS,
VGG16)
Performance Power
Ratio (TOPS/W)
#1 Embedded 1.5 (500MHz) 0.2 0.13
#2 Smartphone NA (use as IP) 0.19 NA
#3 Smartphone NA (use as IP) 0.3 NA
NovuMind NT-BB
(“BlackBear”)
12~18 2.5 0.21 (FPGA)
NovuMind NT-GZ
(“Grizzly”)
5~10 11 (peak 14) 2.25 (ASIC)
Embedded GPU 10 0.4 0.04
Data Center AI ASIC 40 23 0.58
Server GPU 250 6.6 0.026
Low Power
Low Performance
Cannot meet the
requirements of most
AI computing needs.
High Power
High Performance
Cannot meet the
requirements of most
edge application
scenarios.
NovuTensor: Server performance within embedded power
The most power-efficient chip on the market!
Major AI Chips in Market
using AI super resolution technology to upscale low-resolution
video into 4k or 8k UHD video in real time
The world's 1st in
➢ Intelligently fill the
details to ensure ultra-
high quality
➢ Solve the problem of
media content lacking
4K / 8K resolutions for
online video and air
broadcasting
Super Resolution at CES2018: Beauty Made by
Muscle
( ) ( )
NovuTensor Kestrel: When Grizzly Flies
Call to Action
➢ CES January 2018:
http://share.gmw.cn/tech/2018-
01/11/content_27322512.htm?from=message&isappinstalled=0
➢ HP Enterprise + NovuMind at GTC October 2017
“Improving Deep Learning Scalability on HPE Servers With NovuMind: GPU RDMA Made
Easy”
https://www.leiphone.com/news/201710/GG9umC93Gtav2Eac.html
➢ Real Time Endoscopic Application at West China Hospital, July~August 2017
SCTV Video: http://www.iqiyi.com/v_19rrefrw1g.html
People’s Daily News: http://health.people.com.cn/n1/2017/0801/c14739-29440444.html
CCTV: http://jiankang.cctv.com/2017/07/28/ARTIE26PkbwL9t34pFS8jbIW170728.shtml
➢ AI Start-up NovuMind Helps European City Improve Their Green Life
Resources

Weitere ähnliche Inhalte

Was ist angesagt?

Expectations for optical network from the viewpoint of system software research
Expectations for optical network from the viewpoint of system software researchExpectations for optical network from the viewpoint of system software research
Expectations for optical network from the viewpoint of system software researchRyousei Takano
 
Flow-centric Computing - A Datacenter Architecture in the Post Moore Era
Flow-centric Computing - A Datacenter Architecture in the Post Moore EraFlow-centric Computing - A Datacenter Architecture in the Post Moore Era
Flow-centric Computing - A Datacenter Architecture in the Post Moore EraRyousei Takano
 
Boyang gao gpu k-means_gmm_final_v1
Boyang gao gpu k-means_gmm_final_v1Boyang gao gpu k-means_gmm_final_v1
Boyang gao gpu k-means_gmm_final_v1Gao Boyang
 
Managing Large Datasets in LabVIEW
Managing Large Datasets in LabVIEWManaging Large Datasets in LabVIEW
Managing Large Datasets in LabVIEWJames McNally
 
GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)Fatima Qayyum
 
Memory Requirements for Convolutional Neural Network Hardware Accelerators
Memory Requirements for Convolutional Neural Network Hardware AcceleratorsMemory Requirements for Convolutional Neural Network Hardware Accelerators
Memory Requirements for Convolutional Neural Network Hardware AcceleratorsSepidehShirkhanzadeh
 
Effective machine learning_with_tpu
Effective machine learning_with_tpuEffective machine learning_with_tpu
Effective machine learning_with_tpuAthul Suresh
 
Large_Data_Block_Size_NSIC898
Large_Data_Block_Size_NSIC898Large_Data_Block_Size_NSIC898
Large_Data_Block_Size_NSIC898Martin Hassner
 
Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...
Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...
Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...Tokyo Institute of Technology
 
Gossip-based resource allocation for green computing in large clouds
Gossip-based resource allocation for green computing in large cloudsGossip-based resource allocation for green computing in large clouds
Gossip-based resource allocation for green computing in large cloudsRerngvit Yanggratoke
 
Memory Architecture Exploration for Power-Efficient 2D-Discrete Wavelet Trans...
Memory Architecture Exploration for Power-Efficient 2D-Discrete Wavelet Trans...Memory Architecture Exploration for Power-Efficient 2D-Discrete Wavelet Trans...
Memory Architecture Exploration for Power-Efficient 2D-Discrete Wavelet Trans...Rahul Jain
 
Evolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server SolutionEvolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server SolutionNVIDIA Taiwan
 

Was ist angesagt? (17)

Expectations for optical network from the viewpoint of system software research
Expectations for optical network from the viewpoint of system software researchExpectations for optical network from the viewpoint of system software research
Expectations for optical network from the viewpoint of system software research
 
Flow-centric Computing - A Datacenter Architecture in the Post Moore Era
Flow-centric Computing - A Datacenter Architecture in the Post Moore EraFlow-centric Computing - A Datacenter Architecture in the Post Moore Era
Flow-centric Computing - A Datacenter Architecture in the Post Moore Era
 
cnsm2011_slide
cnsm2011_slidecnsm2011_slide
cnsm2011_slide
 
Boyang gao gpu k-means_gmm_final_v1
Boyang gao gpu k-means_gmm_final_v1Boyang gao gpu k-means_gmm_final_v1
Boyang gao gpu k-means_gmm_final_v1
 
Managing Large Datasets in LabVIEW
Managing Large Datasets in LabVIEWManaging Large Datasets in LabVIEW
Managing Large Datasets in LabVIEW
 
GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)
 
Memory Requirements for Convolutional Neural Network Hardware Accelerators
Memory Requirements for Convolutional Neural Network Hardware AcceleratorsMemory Requirements for Convolutional Neural Network Hardware Accelerators
Memory Requirements for Convolutional Neural Network Hardware Accelerators
 
Effective machine learning_with_tpu
Effective machine learning_with_tpuEffective machine learning_with_tpu
Effective machine learning_with_tpu
 
Large_Data_Block_Size_NSIC898
Large_Data_Block_Size_NSIC898Large_Data_Block_Size_NSIC898
Large_Data_Block_Size_NSIC898
 
Google warehouse scale computer
Google warehouse scale computerGoogle warehouse scale computer
Google warehouse scale computer
 
GPU Programming
GPU ProgrammingGPU Programming
GPU Programming
 
Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...
Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...
Applying Recursive Temporal Blocking for Stencil Computations to Deeper Memor...
 
Gossip-based resource allocation for green computing in large clouds
Gossip-based resource allocation for green computing in large cloudsGossip-based resource allocation for green computing in large clouds
Gossip-based resource allocation for green computing in large clouds
 
Memory Architecture Exploration for Power-Efficient 2D-Discrete Wavelet Trans...
Memory Architecture Exploration for Power-Efficient 2D-Discrete Wavelet Trans...Memory Architecture Exploration for Power-Efficient 2D-Discrete Wavelet Trans...
Memory Architecture Exploration for Power-Efficient 2D-Discrete Wavelet Trans...
 
GPU - Basic Working
GPU - Basic WorkingGPU - Basic Working
GPU - Basic Working
 
Ultra Fast SOM using CUDA
Ultra Fast SOM using CUDAUltra Fast SOM using CUDA
Ultra Fast SOM using CUDA
 
Evolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server SolutionEvolution of Supermicro GPU Server Solution
Evolution of Supermicro GPU Server Solution
 

Ähnlich wie "NovuTensor: Hardware Acceleration of Deep Convolutional Neural Networks for AI," a Presentation from NovuMind

In datacenter performance analysis of a tensor processing unit
In datacenter performance analysis of a tensor processing unitIn datacenter performance analysis of a tensor processing unit
In datacenter performance analysis of a tensor processing unitJinwon Lee
 
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptxQ1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptxMemory Fabric Forum
 
AI Accelerators for Cloud Datacenters
AI Accelerators for Cloud DatacentersAI Accelerators for Cloud Datacenters
AI Accelerators for Cloud DatacentersCastLabKAIST
 
Lightweight DNN Processor Design (based on NVDLA)
Lightweight DNN Processor Design (based on NVDLA)Lightweight DNN Processor Design (based on NVDLA)
Lightweight DNN Processor Design (based on NVDLA)Shien-Chun Luo
 
Large-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC WorkloadsLarge-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC Workloadsinside-BigData.com
 
APSys Presentation Final copy2
APSys Presentation Final copy2APSys Presentation Final copy2
APSys Presentation Final copy2Junli Gu
 
Hardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLHardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLinside-BigData.com
 
Fast datastacks - fast and flexible nfv solution stacks leveraging fd.io
Fast datastacks - fast and flexible nfv solution stacks leveraging fd.ioFast datastacks - fast and flexible nfv solution stacks leveraging fd.io
Fast datastacks - fast and flexible nfv solution stacks leveraging fd.ioOPNFV
 
Introduction to National Supercomputer center in Tianjin TH-1A Supercomputer
Introduction to National Supercomputer center in Tianjin TH-1A SupercomputerIntroduction to National Supercomputer center in Tianjin TH-1A Supercomputer
Introduction to National Supercomputer center in Tianjin TH-1A SupercomputerFörderverein Technische Fakultät
 
Infrastructure optimization for seismic processing (eng)
Infrastructure optimization for seismic processing (eng)Infrastructure optimization for seismic processing (eng)
Infrastructure optimization for seismic processing (eng)Vsevolod Shabad
 
Gpu with cuda architecture
Gpu with cuda architectureGpu with cuda architecture
Gpu with cuda architectureDhaval Kaneria
 
Hardware architecture of Summit Supercomputer
 Hardware architecture of Summit Supercomputer Hardware architecture of Summit Supercomputer
Hardware architecture of Summit SupercomputerVigneshwarRamaswamy
 
Exploring Parallel Merging In GPU Based Systems Using CUDA C.
Exploring Parallel Merging In GPU Based Systems Using CUDA C.Exploring Parallel Merging In GPU Based Systems Using CUDA C.
Exploring Parallel Merging In GPU Based Systems Using CUDA C.Rakib Hossain
 
Oow 2008 yahoo_pie-db
Oow 2008 yahoo_pie-dbOow 2008 yahoo_pie-db
Oow 2008 yahoo_pie-dbbohanchen
 
ESPM2 2018 - Automatic Generation of High-Order Finite-Difference Code with T...
ESPM2 2018 - Automatic Generation of High-Order Finite-Difference Code with T...ESPM2 2018 - Automatic Generation of High-Order Finite-Difference Code with T...
ESPM2 2018 - Automatic Generation of High-Order Finite-Difference Code with T...Hideyuki Tanaka
 
Performance analysis of 3D Finite Difference computational stencils on Seamic...
Performance analysis of 3D Finite Difference computational stencils on Seamic...Performance analysis of 3D Finite Difference computational stencils on Seamic...
Performance analysis of 3D Finite Difference computational stencils on Seamic...Joshua Mora
 

Ähnlich wie "NovuTensor: Hardware Acceleration of Deep Convolutional Neural Networks for AI," a Presentation from NovuMind (20)

In datacenter performance analysis of a tensor processing unit
In datacenter performance analysis of a tensor processing unitIn datacenter performance analysis of a tensor processing unit
In datacenter performance analysis of a tensor processing unit
 
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptxQ1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
Q1 Memory Fabric Forum: Using CXL with AI Applications - Steve Scargall.pptx
 
AI Accelerators for Cloud Datacenters
AI Accelerators for Cloud DatacentersAI Accelerators for Cloud Datacenters
AI Accelerators for Cloud Datacenters
 
Lightweight DNN Processor Design (based on NVDLA)
Lightweight DNN Processor Design (based on NVDLA)Lightweight DNN Processor Design (based on NVDLA)
Lightweight DNN Processor Design (based on NVDLA)
 
Large-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC WorkloadsLarge-Scale Optimization Strategies for Typical HPC Workloads
Large-Scale Optimization Strategies for Typical HPC Workloads
 
Tensor Processing Unit (TPU)
Tensor Processing Unit (TPU)Tensor Processing Unit (TPU)
Tensor Processing Unit (TPU)
 
26_Fan.pdf
26_Fan.pdf26_Fan.pdf
26_Fan.pdf
 
APSys Presentation Final copy2
APSys Presentation Final copy2APSys Presentation Final copy2
APSys Presentation Final copy2
 
Hardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and MLHardware & Software Platforms for HPC, AI and ML
Hardware & Software Platforms for HPC, AI and ML
 
Fast datastacks - fast and flexible nfv solution stacks leveraging fd.io
Fast datastacks - fast and flexible nfv solution stacks leveraging fd.ioFast datastacks - fast and flexible nfv solution stacks leveraging fd.io
Fast datastacks - fast and flexible nfv solution stacks leveraging fd.io
 
Introduction to National Supercomputer center in Tianjin TH-1A Supercomputer
Introduction to National Supercomputer center in Tianjin TH-1A SupercomputerIntroduction to National Supercomputer center in Tianjin TH-1A Supercomputer
Introduction to National Supercomputer center in Tianjin TH-1A Supercomputer
 
Infrastructure optimization for seismic processing (eng)
Infrastructure optimization for seismic processing (eng)Infrastructure optimization for seismic processing (eng)
Infrastructure optimization for seismic processing (eng)
 
Gpu with cuda architecture
Gpu with cuda architectureGpu with cuda architecture
Gpu with cuda architecture
 
Brkdct 3101
Brkdct 3101Brkdct 3101
Brkdct 3101
 
Exascale Capabl
Exascale CapablExascale Capabl
Exascale Capabl
 
Hardware architecture of Summit Supercomputer
 Hardware architecture of Summit Supercomputer Hardware architecture of Summit Supercomputer
Hardware architecture of Summit Supercomputer
 
Exploring Parallel Merging In GPU Based Systems Using CUDA C.
Exploring Parallel Merging In GPU Based Systems Using CUDA C.Exploring Parallel Merging In GPU Based Systems Using CUDA C.
Exploring Parallel Merging In GPU Based Systems Using CUDA C.
 
Oow 2008 yahoo_pie-db
Oow 2008 yahoo_pie-dbOow 2008 yahoo_pie-db
Oow 2008 yahoo_pie-db
 
ESPM2 2018 - Automatic Generation of High-Order Finite-Difference Code with T...
ESPM2 2018 - Automatic Generation of High-Order Finite-Difference Code with T...ESPM2 2018 - Automatic Generation of High-Order Finite-Difference Code with T...
ESPM2 2018 - Automatic Generation of High-Order Finite-Difference Code with T...
 
Performance analysis of 3D Finite Difference computational stencils on Seamic...
Performance analysis of 3D Finite Difference computational stencils on Seamic...Performance analysis of 3D Finite Difference computational stencils on Seamic...
Performance analysis of 3D Finite Difference computational stencils on Seamic...
 

Mehr von Edge AI and Vision Alliance

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...Edge AI and Vision Alliance
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...Edge AI and Vision Alliance
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...Edge AI and Vision Alliance
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...Edge AI and Vision Alliance
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...Edge AI and Vision Alliance
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...Edge AI and Vision Alliance
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...Edge AI and Vision Alliance
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsightsEdge AI and Vision Alliance
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...Edge AI and Vision Alliance
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...Edge AI and Vision Alliance
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...Edge AI and Vision Alliance
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...Edge AI and Vision Alliance
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...Edge AI and Vision Alliance
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...Edge AI and Vision Alliance
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...Edge AI and Vision Alliance
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from SamsaraEdge AI and Vision Alliance
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...Edge AI and Vision Alliance
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...Edge AI and Vision Alliance
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...Edge AI and Vision Alliance
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...Edge AI and Vision Alliance
 

Mehr von Edge AI and Vision Alliance (20)

“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
“Learning Compact DNN Models for Embedded Vision,” a Presentation from the Un...
 
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
“Introduction to Computer Vision with CNNs,” a Presentation from Mohammad Hag...
 
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
“Selecting Tools for Developing, Monitoring and Maintaining ML Models,” a Pre...
 
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
“Building Accelerated GStreamer Applications for Video and Audio AI,” a Prese...
 
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
“Understanding, Selecting and Optimizing Object Detectors for Edge Applicatio...
 
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
“Introduction to Modern LiDAR for Machine Perception,” a Presentation from th...
 
“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...“Vision-language Representations for Robotics,” a Presentation from the Unive...
“Vision-language Representations for Robotics,” a Presentation from the Unive...
 
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
“ADAS and AV Sensors: What’s Winning and Why?,” a Presentation from TechInsights
 
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
“Computer Vision in Sports: Scalable Solutions for Downmarkets,” a Presentati...
 
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
“Detecting Data Drift in Image Classification Neural Networks,” a Presentatio...
 
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
“Deep Neural Network Training: Diagnosing Problems and Implementing Solutions...
 
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
“AI Start-ups: The Perils of Fishing for Whales (War Stories from the Entrepr...
 
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
“A Computer Vision System for Autonomous Satellite Maneuvering,” a Presentati...
 
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
“Bias in Computer Vision—It’s Bigger Than Facial Recognition!,” a Presentatio...
 
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
“Sensor Fusion Techniques for Accurate Perception of Objects in the Environme...
 
“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara“Updating the Edge ML Development Process,” a Presentation from Samsara
“Updating the Edge ML Development Process,” a Presentation from Samsara
 
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
“Combating Bias in Production Computer Vision Systems,” a Presentation from R...
 
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
“Developing an Embedded Vision AI-powered Fitness System,” a Presentation fro...
 
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
“Navigating the Evolving Venture Capital Landscape for Edge AI Start-ups,” a ...
 
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
“Advanced Presence Sensing: What It Means for the Smart Home,” a Presentation...
 

Kürzlich hochgeladen

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfSeasiaInfotech2
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 

Kürzlich hochgeladen (20)

CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 

"NovuTensor: Hardware Acceleration of Deep Convolutional Neural Networks for AI," a Presentation from NovuMind

  • 1. NovuTensor: Hardware Acceleration of Deep Convolutional Neural Networks for AI
  • 2. User Friendly NovuTensor AI Chip NovuStar Supercomputer Neural Network Models Training NovuBrain AI Module Get Data Training ability Model accuracy Full-StackAI Solutions NovuMind Industry Corporation Customer &Alliance
  • 3. • The core of CNN is 3-D Convolution Computation • Deeper CNN needs more 3-D Convolution Computation • AlexNet -> VGG -> ResNet -> … • DCNN is a network topology composed of basic computation elements (e.g., 3x3 3-D Convolutions )
  • 5. TPU with 2D Systolic Array
  • 6.
  • 7. 1080Ti Throughput: Peak: 11TOPS/s Effective: 2.4TOPS/s (ResNet18) Efficiency: 22% TPU Throughput: Peak: 96TOPS/s Average: 23TOPS/s Efficiency: 24%
  • 8. 2D Accelerators for Deep Learning Mesh Structure and Systolic Array • Overhead to prepare the 3D data into 2D data • Overhead to feed data • TPU uses 256x256 systolic array->needs 65536 data to fill • Systolic Array got 2D parallel acceleration – only when data is loaded
  • 9. 1D, 2D, 3D – An Ice Cream Example
  • 10. How Do You Eat Ice Cream? • Data Movement: • Need a scoop,then the art of scooping • Need a spoon or cone that is mouth size • Arithmetic: • Need a mouth to consume the ice cream
  • 11. • Unnecessary Assumptions with Mesh: • dqual distance; deighboring data dependency • Collective Streaming, and CNN 3D Data Processing: Data independency b/w PEs: • no data flow • no need to be rectangular, PEs can be flexible to have dozens to thousands of MACs
  • 12. • • • The Scooping: Tile Based Operation (patented)
  • 13. • Efficient Data Movement • Reduce unnecessary data movement off chip and on chip • Fully exploit data sharing, re-using: • IFM sharing between OFMs • Weight sharing within IFM (X-Y) • Weight sharing within batch • Efficient Arithmetic Logic • Reduce unnecessary computation
  • 16.
  • 17.
  • 18.
  • 19. NT-BB FPGA Server GPU Perform ance (FPS) 301 666 Power (W) 12 250 1/20 Power Consumption 1/2 Performance ResNet18 ImageNet Classification
  • 20. 20 Name Power Consumption (w) Performance (Effective TOPS, VGG16) Performance Power Ratio (TOPS/W) #1 Embedded 1.5 (500MHz) 0.2 0.13 #2 Smartphone NA (use as IP) 0.19 NA #3 Smartphone NA (use as IP) 0.3 NA NovuMind NT-BB (“BlackBear”) 12~18 2.5 0.21 (FPGA) NovuMind NT-GZ (“Grizzly”) 5~10 11 (peak 14) 2.25 (ASIC) Embedded GPU 10 0.4 0.04 Data Center AI ASIC 40 23 0.58 Server GPU 250 6.6 0.026 Low Power Low Performance Cannot meet the requirements of most AI computing needs. High Power High Performance Cannot meet the requirements of most edge application scenarios. NovuTensor: Server performance within embedded power The most power-efficient chip on the market! Major AI Chips in Market
  • 21. using AI super resolution technology to upscale low-resolution video into 4k or 8k UHD video in real time The world's 1st in ➢ Intelligently fill the details to ensure ultra- high quality ➢ Solve the problem of media content lacking 4K / 8K resolutions for online video and air broadcasting Super Resolution at CES2018: Beauty Made by Muscle
  • 22. ( ) ( ) NovuTensor Kestrel: When Grizzly Flies
  • 24. ➢ CES January 2018: http://share.gmw.cn/tech/2018- 01/11/content_27322512.htm?from=message&isappinstalled=0 ➢ HP Enterprise + NovuMind at GTC October 2017 “Improving Deep Learning Scalability on HPE Servers With NovuMind: GPU RDMA Made Easy” https://www.leiphone.com/news/201710/GG9umC93Gtav2Eac.html ➢ Real Time Endoscopic Application at West China Hospital, July~August 2017 SCTV Video: http://www.iqiyi.com/v_19rrefrw1g.html People’s Daily News: http://health.people.com.cn/n1/2017/0801/c14739-29440444.html CCTV: http://jiankang.cctv.com/2017/07/28/ARTIE26PkbwL9t34pFS8jbIW170728.shtml ➢ AI Start-up NovuMind Helps European City Improve Their Green Life Resources