2. Introduction to AI System
• AI ~ Deep Neural Network (DNN)-based Machine Learning
• AI system: a system with analysis and synthesis capabilities powered by DNN-based machine learning
– Autonomous driving vehicle, drone, robot, personal virtual assistant, etc.
• Machine learning: a universal algorithm for building a functional mapping
between sample inputs and associated outputs
– A new paradigm of software development
– Learn from many ordinary people vs. design by a few gifted experts
• From AI Winter to AI Everywhere
– Algorithmic breakthroughs that enable training of deep neural networks
– Large, high-quality training data sets, e.g., ImageNet
– Availability of high-performance GPUs
3. Overarching Strategies
• Bringing AI to industry (產業AI化): apply modern AI techniques to improve the value and efficiency of existing industry segments
– As common a tool as MATLAB
– Medicine: diagnostics, nursing care
– Manufacturing: defect detection, equipment maintenance, robotic manipulation
– Finance: credit assessment, personal investment, trading algorithm
– Commerce: advertisement, retail analytics, logistics planning
• Industrializing AI (AI產業化): convert modern AI techniques into new systems and products that enable applications of AI
– High-performance DNN training
– Real-time low-power DNN inference
– DNN-based systems: autonomous driving vehicle, autonomous drone, robot,
personal virtual assistant
4. Machine Learning Basics
• Supervised Learning: learning from sample input-output pairs
– Labeling a training data set → knowledge acquisition
– Training to get a functional model → knowledge transfer & abstraction
– Applying a learned model → knowledge application
– Ask the right question: set up a proper optimization objective function: “Like this?”
– Training corresponds to multi-variable non-linear optimization
• Universal Approximation Theorem
• Gradient descent-based search
• Unsupervised Learning
– Clustering
– Factor analysis
– Autoencoding
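The link between training and gradient descent-based search can be illustrated with a minimal sketch. It fits a one-variable linear model to sample input-output pairs by minimizing mean squared error. The data, learning rate, step count, and the function name `fit_line` are assumptions made for illustration and do not come from the slides. DNN training follows the same loop, but with a far larger non-linear model.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y ≈ w*x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction errors for the current parameters.
        errs = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(err^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errs, xs))
        grad_b = (2.0 / n) * sum(errs)
        # Step against the gradient to reduce the objective.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical labeled samples generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_line(xs, ys)
```

With these samples the learned parameters approach w = 2 and b = 1, the mapping that generated the labels.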
5. Training and Inference of Neural Network
Training: Forward/Backward Propagation
Inference: Forward Propagation
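The difference between the two modes can be sketched on a deliberately tiny network with one hidden unit and one output. The architecture, loss, learning rate, and all function names below are assumptions chosen for illustration. Training runs a forward pass followed by a backward pass that computes gradients, while inference runs the forward pass only.

```python
import math

def forward(x, w1, w2):
    """Forward propagation: input -> hidden activation -> output."""
    h = math.tanh(w1 * x)   # hidden layer
    y = w2 * h              # output layer
    return h, y

def backward(x, t, w1, w2):
    """Backward propagation: gradients of L = 0.5*(y - t)^2 w.r.t. w1, w2."""
    h, y = forward(x, w1, w2)
    dy = y - t                      # dL/dy
    dw2 = dy * h                    # dL/dw2
    dh = dy * w2                    # dL/dh
    dw1 = dh * (1.0 - h * h) * x    # dL/dw1, using tanh'(a) = 1 - tanh(a)^2
    return dw1, dw2

def train(x, t, w1=0.5, w2=0.5, lr=0.1, steps=500):
    """Training: repeated forward + backward passes with weight updates."""
    for _ in range(steps):
        dw1, dw2 = backward(x, t, w1, w2)
        w1 -= lr * dw1
        w2 -= lr * dw2
    return w1, w2

def infer(x, w1, w2):
    """Inference: forward pass only, with fixed weights."""
    return forward(x, w1, w2)[1]

# Hypothetical single training sample: input 1.0, target 0.8.
w1, w2 = train(x=1.0, t=0.8)
prediction = infer(1.0, w1, w2)
```

Because inference skips the backward pass and the weight update, it needs less computation and memory per sample than training.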
6. Key Technical Challenges in DNN
• Training of DNN models
– Quality: how to acquire a high-quality training data set
• Label correctness and diversity
• Semi-automatic training data collection and labeling
– Speed:
• Reduce the number of rounds required in the training process
– Round Epoch Batch
• Reduce the computation overhead associated with each training round
• Speed and power consumption of applying a DNN model (inference)
– Real-time: autonomous driving
– Embedded system: low power and low cost
• Explainability of learned DNN models
• Broadening the scope of DNN applications: from analysis to synthesis
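The two training-speed levers above (fewer rounds, cheaper rounds) can be made concrete with a small arithmetic sketch. The slides do not spell out how round, epoch, and batch relate. The code assumes the common convention that an epoch is one pass over the data set and that each batch triggers one weight update. The function name and the numbers are hypothetical.

```python
import math

def training_iterations(num_samples, batch_size, epochs):
    """Number of weight-update iterations for mini-batch training."""
    # The last batch of an epoch may be partial, hence the ceiling.
    iters_per_epoch = math.ceil(num_samples / batch_size)
    return iters_per_epoch * epochs

# Hypothetical workload: 50,000 samples, batches of 128, 10 epochs.
total = training_iterations(num_samples=50_000, batch_size=128, epochs=10)
# Total training time is roughly total * (time per iteration), so it shrinks
# by reducing either the iteration count or the cost of each iteration.
```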
7. Overview of DNN Systems Research
• Computational challenges brought by DNN
– Training (off-line)
• DNN Integrated Development Environment (IDE): reduce the number
of iterations required in a training process
• DNN Training Appliance: reduce the amount of time taken by each
training iteration, which involves multiple passes through the training
data set, e.g., Nvidia’s DGX-1 and Intel/Nervana’s Lake Crest
– Inference (on-line)
• Cloud-based DNN inference engine (high performance), e.g., Google’s Tensor Processing Unit (TPU)
• Embedded DNN inference engine (low power and real-time) for smartphones and autonomous vehicles, e.g., Nvidia’s Jetson TX2, Intel’s Movidius (Neural Compute Stick), and Mobileye’s EyeQ
8. DNN Training Appliance
• DNN training has emerged as a crucial class of workloads in future data centers
• DNN appliance: a system that integrates (I) DNN IDE, (II) DNN model
optimization, (III) DNN training computation mapping and scheduling, and (IV)
GPU-based compute cluster to minimize the end-to-end training time
• Nvidia’s DGX-1
– Deep learning supercomputer
– NTD $4M + $1M
– P100 GPU + NVLink
– 170 TFLOPS of FP16
– Effective performance equal to 250 Intel x86 CPU-based servers
– HGX-1: Hyperscale GPU Accelerator
9. ITRI DNN Training Appliance
• Objective: Enable Taiwan to become a major player in DNN training appliances
• Hardware enhancements
– Processor
• Nvidia’s Tesla P100 and V100 (12 GB, 4.7 TFLOPS, $5,899)
• Nvidia’s GeForce GTX 1080 Ti (11 GB, 11.3 TFLOPS of FP32, $699)
• AMD’s Radeon RX-500 and RX Vega
• Intel’s Knights Mill (KNM)
– System Interconnect
• NVLink (Gen-Z)
• Meshed PCIe network
– Cooling
• Software Optimizations
– Minimize the performance impact of lacking NVLink
– DNN training integrated development environment
• Leverage MOST’s AI computation platform as a reference case
Graphics driver API:
• CUDA
• OpenCL
10. DNN Inference Engine
• Track 1: Sensor data processing platform for autonomous driving
– Distributed architecture: edge processing and leaner network
– Centralized architecture: fatter network and more efficient processing
resource utilization
• Track 2: Customized DNN inference processor design
– Digital hardware approach
• Make data access as efficient as possible
• Decrease the total amount of computation for every possible input
• Reduce the amount of computation for easy or already seen inputs
– Analog hardware approach
• Programmable and persistent resistors
• I (output) = V (input) × 1/R can be used to implement multiplication
• Wired-OR implements (current) addition
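The analog approach can be sketched as a software simulation of an idealized resistive crossbar. Each programmed resistor multiplies its input voltage by the conductance 1/R, and the currents flowing into a shared column wire add up, so each column produces one dot product. The function name, the array layout, and the numeric values are assumptions for illustration. Real devices also face noise, wire resistance, and limited precision, which this sketch ignores.

```python
def crossbar_output_currents(voltages, resistances):
    """Simulate an idealized resistive crossbar.

    voltages[i]       : input voltage on row i (volts)
    resistances[i][j] : programmed resistance between row i and column j (ohms)
    Returns the current summed on each column wire (amperes).
    """
    num_cols = len(resistances[0])
    currents = [0.0] * num_cols
    for i, v in enumerate(voltages):
        for j in range(num_cols):
            # Ohm's law gives the per-resistor current I = V * (1/R);
            # accumulating it models current addition on the column wire.
            currents[j] += v / resistances[i][j]
    return currents

# Hypothetical values: the conductances 1/R play the role of weights.
voltages = [1.0, 2.0]
resistances = [[1.0, 2.0],
               [4.0, 0.5]]
currents = crossbar_output_currents(voltages, resistances)
```

Each entry of `currents` equals the dot product of the input voltages with one column of conductances, which is the multiply-accumulate operation at the core of DNN inference.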
25. High-Voltage Transmission Tower Insulator Cleaning
Copyright 2017. Restricted material: do not copy, reproduce, or distribute.
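• Taipower spends more than NT$8 billion per year on high-voltage tower maintenance.
• Needs include: preventing power outages caused by cable connectors fracturing under high temperature; handling leakage current from contaminated insulator surfaces and cleaning them; monitoring the distance between towers and nearby foreign objects; and detecting rust on tower surfaces and repainting.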
• Man-hour costs
– Manpower: 12.5 thousand/hour (annual cost about 7.5 billion)
– Helicopters: 100 thousand/hour (annual cost about 0.5 billion)
– UAV: 10 thousand/hour
• Working area
– Manpower: city / reachable places
– Helicopters: remote / rural areas (height limit in recent years)
– UAV: no limit
• Safety
– Manpower: electric shock / falls
– Helicopters: mechanical failure / turbulence (accidents have occurred in the past three years)
– UAV: no casualties
• Cleaning method
– Manpower: short distance, cloth wiping / water column
– Helicopters: long distance, high-pressure water column cleaning
– UAV: short distance, dry ice cleaning, real-time surveillance system
• Operation
– Manpower: power outage / live line
– Helicopters: live line
– UAV: live line
Figure captions: Manpower: high-pressure water column cleaning; Manpower: cloth wiping; UAV: dry ice cleaning; Helicopters: high-pressure water column cleaning
27. Common Cyber Attack Scenarios
• From the Internet
– Scanning public IP addresses
– Fingerprinting the OS and applications
– Applying attacks to exploit known vulnerabilities
– Increasingly difficult with multi-level defense around servers
• From the intranet
– Social engineering via email or social network sites
– Drive-by download
– Stepping stone to attack enterprises
– Increasingly common with dirtier endpoints
29. Programmatic Defense Mechanisms
• Stop injection of shell code (Step 1)
– Eliminate security loopholes in applications
– Buffer overflow prevention: CASH: a fast bounds checking compiler
• Stop injected shell code from performing unauthorized sensitive
operations (Step 2)
– Address space or library randomization
– System call monitoring: PAID: an accurate system call monitoring tool
• Stop when malicious binaries are downloaded (Step 3)
– Scanning of malware during transit in firewalls
• Stop an injected malware at invocation time (Step 4)
– Black-listing: traditional anti-virus software
– White-listing
– Run-time behavior monitoring
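The contrast between black-listing and white-listing in Step 4 can be sketched as two invocation-time checks. The sketch assumes binaries are identified by their SHA-256 digest, which is a simplification: real anti-virus products use richer signatures and heuristics. All function names and sample byte strings are hypothetical.

```python
import hashlib

def digest(binary: bytes) -> str:
    """Identify a binary by its SHA-256 digest (simplifying assumption)."""
    return hashlib.sha256(binary).hexdigest()

def allowed_by_blacklist(binary: bytes, known_bad: set) -> bool:
    """Black-listing: run anything that is not known to be malicious."""
    return digest(binary) not in known_bad

def allowed_by_whitelist(binary: bytes, known_good: set) -> bool:
    """White-listing: run only what is known to be trusted."""
    return digest(binary) in known_good

# Hypothetical binaries standing in for real executables.
trusted = b"trusted-editor"
old_malware = b"known-malware"
new_malware = b"never-seen-before-malware"

known_good = {digest(trusted)}
known_bad = {digest(old_malware)}

# A previously unseen malicious binary passes the blacklist check
# but is stopped by the whitelist check.
blacklist_verdict = allowed_by_blacklist(new_malware, known_bad)
whitelist_verdict = allowed_by_whitelist(new_malware, known_good)
```

The trade-off is that white-listing blocks unknown malware by default but requires the trusted set to be kept up to date with every legitimate program.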
30. DARPA’s Cyber Grand Challenge
• Problem: Once a bug is announced, a race begins between attackers who aim to exploit the bug and defenders who aim to patch it, and the attackers typically win.
• Goal: DARPA seeks to create
automatic defensive systems that are
capable of reasoning about flaws,
formulating patches and deploying
them on a network in real time.
• Result: Seven teams competed in
August 2016, and the champion
team, ForAllSecure, led by David
Brumley from CMU, took home a $2
million award.
31. Automated Cyber Defense/Offense
Program analysis-based software security
– Input: Adobe Flash Player has a bug
– Outputs:
• Where is the bug? Is it a vulnerability?
• How to exploit it?
• How to patch it?
• How to develop an intrusion detection signature for it?
– Exploration approaches
• Fuzzing: random mutation, and anomaly- or corner case-driven stressing
• Intelligent fuzzing: exploiting knowledge of inputs
• Symbolic execution: source code/binary code
– Full automation: Bug → Vulnerability → Patch → Signature → Attack
– Targets: DARPA CGC and NSA/CIA’s leaked attack toolkit
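Random-mutation fuzzing, the simplest of the exploration approaches above, can be sketched against a toy target. The parser below is a hypothetical program with a deliberately planted bounds-check bug. The mutation strategy (overwrite one random byte), the trial count, and all names are assumptions for illustration, not the tooling used in the Cyber Grand Challenge.

```python
import random

def toy_parser(data: bytes) -> bytes:
    """Hypothetical buggy target: trusts the length byte without a bounds check."""
    n = data[0]
    last = data[n]          # BUG: IndexError when n >= len(data)
    return data[1:n] + bytes([last])

def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Random mutation: overwrite one randomly chosen byte."""
    buf = bytearray(seed)
    pos = rng.randrange(len(buf))
    buf[pos] = rng.randrange(256)
    return bytes(buf)

def fuzz(seed: bytes, trials: int = 1000, rng_seed: int = 0):
    """Return the mutated inputs that crash the target."""
    rng = random.Random(rng_seed)
    crashes = []
    for _ in range(trials):
        candidate = mutate(seed, rng)
        try:
            toy_parser(candidate)
        except IndexError:
            # A crash marks a bug; deciding whether it is exploitable
            # (a vulnerability) takes further analysis.
            crashes.append(candidate)
    return crashes

# A valid seed input: length byte 3 followed by three payload bytes.
crashes = fuzz(b"\x03abc")
```

Every crashing input found here has a corrupted length byte, which points directly at the missing bounds check. Intelligent fuzzing would reach that field faster by using knowledge of the input format.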
32. Summary
• DNN is expected to become the focal point of ICT research in the
next couple of years
– Myriad applications are possible and promising
• Analysis and Synthesis
• A more promising bet is on business/enterprise applications using natural language understanding, e.g., legal technology, regulatory technology, patent analysis, etc.
– Systems support for DNN training and inference is a promising area
• DNN training appliance and DNN inference processor
• Competition in DNN inference processor is like that in GPU design 15 years ago.
• DNN-based systems
– Perception subsystem for autonomous driving
– Tele-operated drone-based application services
– Automated cyber defense and offense
34. Training Data Collection and Labeling
• ImageNet is the most successful example so far
• How to reduce the effort and improve the quality of training data collection
and labeling?
– Crowdsourcing for raw data collection
– Human-based computation: image labeling as a multi-user game
– Computer-aided labeling: object/label tracking across video frames
– Generation of meaningful new training data from example data set
– Unsupervised learning
• Corner-case training data collection and labeling
– Crowdsourcing-based collection
– Model-driven synthesis (for ADV)
• Augmented reality: videos augmented with graphics objects
• Photo-realistic graphics rendering
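Generating new training data from an example data set can be sketched with simple label-preserving transforms. The choice of transforms (horizontal flip, one-pixel shift), the image representation as a list of rows, and the function names are assumptions for illustration. Which transforms actually preserve a label depends on the task.

```python
def hflip(image):
    """Mirror an image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]

def shift_right(image, fill=0):
    """Shift an image one pixel to the right, padding with `fill`."""
    return [[fill] + row[:-1] for row in image]

def augment(image, label):
    """Return the original sample plus transformed copies with the same label."""
    return [
        (image, label),
        (hflip(image), label),
        (shift_right(image), label),
    ]

# Hypothetical 2x3 grayscale image with its label.
image = [[1, 2, 3],
         [4, 5, 6]]
samples = augment(image, "car")
```

One labeled example becomes three here without extra labeling effort, because the label is copied rather than re-annotated.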
35. Analysis Applications of DNN
• Perception subsystem for Autonomous Driving
– The relative coordinates, relative speed, and future trajectory of every driving-related object within a certain distance
– Real-time video object localization, recognition, and tracking: YOLO performs object detection and classification in one shot
– Driving event prediction
– Multi-sensor data fusion and analysis: RGB-D
• Analytics for New Retail
– Trajectory of every shopper, the merchandise they touch, and their preferences
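The perception outputs listed for autonomous driving (relative coordinates, relative speed, future trajectory) can be related through a minimal sketch. It assumes a constant-velocity motion model and two consecutive position measurements. The slides do not say which prediction method is used, and a DNN-based predictor would replace this simple extrapolation. All names and numbers are hypothetical.

```python
def relative_velocity(p_prev, p_curr, dt):
    """Estimate relative velocity from two relative positions taken dt seconds apart."""
    return tuple((c - p) / dt for p, c in zip(p_prev, p_curr))

def predict_trajectory(p_curr, velocity, horizon_steps, dt):
    """Extrapolate future relative positions under a constant-velocity assumption."""
    return [
        tuple(p + v * dt * k for p, v in zip(p_curr, velocity))
        for k in range(1, horizon_steps + 1)
    ]

# Hypothetical tracked object: 10 m ahead, then 9 m ahead half a second later,
# staying 2 m to the side (coordinates relative to the ego vehicle).
velocity = relative_velocity((10.0, 2.0), (9.0, 2.0), dt=0.5)
trajectory = predict_trajectory((9.0, 2.0), velocity, horizon_steps=3, dt=0.5)
```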
36. Synthesis Applications of DNN
• Movie critic → Movie director
• Generative Adversarial Network (GAN): a game-theoretic approach to converting an analysis model into a synthesis model
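The game-theoretic framing can be written as the standard GAN minimax objective. The symbols are not in the slides and are introduced here for illustration: $D$ is the discriminator (the "critic", an analysis model), $G$ is the generator (the "director", a synthesis model), $p_{\text{data}}$ is the distribution of real samples, and $p_z$ is the distribution of the generator's noise input.

```latex
\min_{G} \max_{D} \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator is trained to raise $V$ by telling real samples from generated ones, while the generator is trained to lower it by producing samples the discriminator accepts as real.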