Deep Learning Techniques for
Object Detection and Recognition
Chu-Song Chen
Outline
● Computer Vision
● Image Classification and Object Detection
● Crowdsourcing + Machine Learning
o ImageNet + ILSVRC Challenge
o Deep Convolutional Nets
● Recent Advances and Results
Computer Vision
● Research on the methods for acquiring, processing,
analyzing, and understanding images and, in general, high-
dimensional data from the real world in order to produce
numerical or symbolic information, e.g., in the forms of
decisions.
Object Detection & Recognition
● Object recognition is one of the main tasks in
computer vision.
Semantic segmentation Object detection
What is object detection?
● Image classification
● Object localization
● Object detection
● Segmentation
(listed in order of increasing difficulty)
Why is object detection important?
● Perception is one of the biggest bottlenecks of
○ Robotics
○ Self-driving cars
○ Surveillance
Applications
● Image classification
○ image search (Google, Baidu, Bing)
● Object detection
○ face
■ smartphones/cameras
■ detecting duplicate votes in elections
■ CCTV
■ border control
■ casinos
■ visa processing
■ crime solving
■ prosopagnosia (face blindness)
○ objects
■ license plates
■ pedestrian detection (Daimler, Mobileye):
● warning and automatic braking, reducing accidents and their severity
■ vehicle detection for forward collision warning (Mobileye)
■ traffic sign detection (Mobileye)
● E-commerce
● Machine inspection
Machine Learning & Computer
Vision
● How to achieve object recognition?
o Typically through machine learning in computer vision.
● Training stage:
o Collect training sample images.
o Learn an object detector.
● Inference stage: Employ the learned detector for detection.
o Take pedestrian detection as an example:
Pedestrian detection: training phase
(traditional approach)
● Collecting training data
o Extracting features (or casting data into feature space).
■ color, edge, gradient, silhouette, dimension reduction, etc.
o Learning an object detector (classifier).
■ Many learning methods, e.g., neural networks, SVM, boosting, cascaded AdaBoost, random forests.
Positive training data Negative training data
Pedestrian detection: testing phase
(traditional approach)
● After learning a human detector
o A detection window can be used to scan the testing image
along x and y directions for human detection.
Pedestrian detection: inference phase
● Human detection
o Detection windows with different sizes are used to detect
humans with different scales.
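The scanning procedure on the two slides above can be sketched as a generator over window positions and sizes. This is a minimal illustration, not the detector used in the slides: `sliding_windows`, `detect`, the stride, and the toy scoring function are all hypothetical names and values standing in for a learned classifier.

```python
def sliding_windows(image_w, image_h, window_sizes, stride):
    """Yield (x, y, w, h) for every window position at every window size."""
    for w, h in window_sizes:
        # Scan along y, then along x, keeping the window inside the image.
        for y in range(0, image_h - h + 1, stride):
            for x in range(0, image_w - w + 1, stride):
                yield (x, y, w, h)


def detect(image_w, image_h, window_sizes, stride, score_fn, threshold):
    """Keep the windows whose classifier score reaches the threshold."""
    return [box
            for box in sliding_windows(image_w, image_h, window_sizes, stride)
            if score_fn(box) >= threshold]


def toy_score(box):
    """Hypothetical stand-in for a learned pedestrian classifier."""
    return 1.0 if box == (8, 0, 8, 16) else 0.0


# Two window sizes handle two object scales (assumed values).
hits = detect(32, 32, [(8, 16), (16, 32)], stride=8,
              score_fn=toy_score, threshold=0.5)
print(hits)
```

In a real system the score function would evaluate features extracted from the image patch under each window, which is why the number of windows directly drives the inference cost.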
Difficulties for object recognition
● Object recognition
To a human: an image and an image block
To a machine: an array of real numbers
Past breakthroughs in object detection research
o Face detection: Haar features + AdaBoost learning (2000).
● Every mobile phone is equipped with this function now.
o SIFT and HOG: local discriminative features (2004), combined with SVM for object detection.
● A key component of RGB vision-based positioning and localization.
Examples of several breakthroughs in object detection research
● Deformable part models (2008):
o HOG feature
o Latent SVM + stochastic gradient descent (SGD) training
o Training scale of the above: 5K ~ 20K training images.
General object recognition
o The above methods contribute many ingredients to applications.
o However, it is still difficult for them to achieve general object detection/recognition.
● Recent big breakthroughs in object detection come from crowdsourcing + machine learning:
o More labeled training data are gathered through Amazon Mechanical Turk.
o More suitable machine learning techniques: deep convolutional neural networks (CNNs).
Artificial neural networks and deep
learning
● Why deep learning?
o A limitation of traditional methods: feature extraction and classifier training are two independent processes.
o One motivation for deep learning is to join feature extraction and classification in a single framework.
o This results in a large number of parameters. However, when the number of training images is huge, the issue of over-fitting is lessened.
o Deep learning: end-to-end learning. That is, feature extraction + classification in a single step.
slide: R. Fergus
slide: Honglak Lee
Artificial neural networks and deep
learning
● Deep learning stems from artificial neural networks.
● There are many deep learning architectures.
● Among them, deep convolutional networks (CNN)
perform the best on the recognition tasks.
● In the following, we will review convolutional neural
networks (CNN) for
o image classification
o object detection
Convolutional Neural Networks
● CNN: a neural network consisting of
o fully-connected layer
o convolution layer
o max-pooling
o nonlinear activation (ReLU or sigmoid)
o ………
Fully-connected layers
● If the input is an image, the fully connected layer will have a huge number of links between layers:
● All of these weights must be learned.
Convolution layer
● Instead of full connection, slide a k × k window over the image and compute an inner product at every site.
● That is, apply a k × k FIR filter (convolution) to the image.
● The filter coefficients must be learned.
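The sliding inner product described above can be written out directly. This is a minimal pure-Python sketch on nested lists; `conv2d` and the example values are illustrative assumptions. Like most deep learning frameworks, it does not flip the kernel (strictly speaking it computes a cross-correlation), and it only visits positions where the window fits entirely inside the image.

```python
def conv2d(image, kernel):
    """Slide a k x k kernel over a 2-D image and take the inner product
    at every site where the window fits entirely inside the image."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


image = [[1, 2, 3, 0],
         [4, 5, 6, 0],
         [7, 8, 9, 0],
         [0, 0, 0, 0]]
kernel = [[1, 0],
          [0, -1]]  # the same 4 coefficients are reused at every site

feature_map = conv2d(image, kernel)
print(feature_map)  # a 3 x 3 feature map from a 4 x 4 input
```

In a CNN the kernel entries are not hand-chosen as here; they are the learned coefficients.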
Convolution vs. fully-connection
● Convolutional layer:
o Shared weights
o Shift invariance
o Local
Convolutional layer Fully connected layer
Multiple FIR filters in a convolutional layer
● A convolutional layer often contains multiple FIR filters.
● The filters' outputs serve as the inputs of the next layer.
● So, if the number of filters used in a convolutional layer is c_l, the output of this layer forms an n_l × n_l × c_l volume.
Multiple “volume” FIR filters
● So, the output of the convolutional layer has c_l channels, forming an n_l × n_l × c_l volume.
● Actually, the FIR filters applied to this volume are of size k × k × c_l (though we usually abbreviate this as k × k for simplicity); each is indeed a “volume” FIR filter.
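The shape bookkeeping for volume filters can be made concrete with a little arithmetic. The sketch below assumes stride 1, no padding, one bias per filter, and a 5 × 5 kernel; none of these are stated in the slides. The 32 × 32 RGB input with 5 filters follows the example on the next slide. The helper names are hypothetical.

```python
def conv_layer_shape(n_in, c_in, k, num_filters, stride=1, padding=0):
    """Return the output volume (n_out, n_out, c_out) and the number of
    learnable parameters of a layer with k x k x c_in volume filters."""
    n_out = (n_in + 2 * padding - k) // stride + 1
    weights = k * k * c_in * num_filters  # each filter spans all input channels
    biases = num_filters                  # assumption: one bias per filter
    return (n_out, n_out, num_filters), weights + biases


def fully_connected_params(n_in, c_in, num_outputs):
    """Parameters needed if every input value linked to every output."""
    return n_in * n_in * c_in * num_outputs + num_outputs


# 32 x 32 RGB input, 5 filters; the 5 x 5 kernel size is an assumption.
shape, conv_params = conv_layer_shape(n_in=32, c_in=3, k=5, num_filters=5)
fc_params = fully_connected_params(32, 3, shape[0] * shape[1] * shape[2])
print(shape, conv_params, fc_params)
```

Under these assumptions the convolutional layer needs a few hundred parameters, while a fully connected layer producing the same number of outputs would need millions, which is the practical payoff of shared, local weights.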
Input: an RGB (3-channel) image of size N × N
● E.g., N = 32, input to the first convolutional layer having 5 filters
● E.g., N = 40, input to a cascade of convolutional layers, a fully connected layer, and the final output layer (the entire network)
A single neuron
o Activation function examples:
■ sigmoid
■ ReLU
Nonlinear activation function
● A nonlinear activation is needed between layers; otherwise, if the layers are cascaded linearly, they can be replaced by a single equivalent layer.
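The collapse of cascaded linear layers can be checked numerically: applying W1 and then W2 gives the same result as applying the single matrix W2·W1, whereas inserting a ReLU between them breaks that equivalence. The matrices and helper names below are arbitrary illustrative choices.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Multiply matrix A by matrix B."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def relu(v):
    """Element-wise ReLU: max(0, v)."""
    return [max(0.0, vi) for vi in v]


W1 = [[1.0, -2.0], [0.5, 1.0]]
W2 = [[2.0, 1.0], [-1.0, 3.0]]
x = [1.0, 2.0]

two_linear = matvec(W2, matvec(W1, x))       # two cascaded linear layers
one_linear = matvec(matmul(W2, W1), x)       # one equivalent layer
with_relu = matvec(W2, relu(matvec(W1, x)))  # nonlinearity in between

print(two_linear, one_linear, with_relu)
```

The first two results coincide, so the extra linear layer adds no expressive power; the third differs because the ReLU zeroes the negative intermediate value.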
Pooling for dimension (size)
reduction
or the weights will still be.
● Summarizes the input
● E.g., max pooling
Max pooling layer (cont.)
● After max pooling, the size (i.e., dimension) of the feature map is reduced.
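A small sketch of max pooling shows the size reduction. It assumes non-overlapping 2 × 2 windows (the window size and stride are not given in the slides), and `max_pool2d` is a hypothetical helper name.

```python
def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling: keep the largest value in each
    size x size block, shrinking height and width by a factor of size."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[i * size + di][j * size + dj]
                 for di in range(size)
                 for dj in range(size))
             for j in range(cols)]
            for i in range(rows)]


fm = [[1, 3, 2, 4],
      [5, 6, 1, 0],
      [7, 2, 9, 8],
      [3, 4, 6, 5]]

pooled = max_pool2d(fm)
print(pooled)  # a 4 x 4 map becomes 2 x 2
```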
Why are ConvNets good for detection?
● Sharing parameters is good
○ taking advantage of local coherence to learn a more efficient representation:
■ no redundancy
■ translation invariance
■ slight rotation invariance with pooling
● Efficient for detection:
○ all computations are shared
○ can handle varying input sizes (no need to relearn weights for new sizes)
● ConvNets are convolutional all the way up, including fully connected layers
slide: Pierre Sermanet
Big-data training images from Internet
● ILSVRC competition (ImageNet Challenge)
o ImageNet: collecting images according to the WordNet tree.
o ILSVRC: choosing words in different tree branches.
ILSVRC Image classification
challenge
Fine tuning
● ILSVRC (ImageNet challenge) is a large
dataset with diverse object classes.
● Using the pre-trained weights on ILSVRC for
fine-tuning is a popular strategy.
Winner of ILSVRC 2012 image classification: AlexNet
• 5 convolutional layers, 3 fully-connected layers
• The number of neurons in each layer is 253440, 186624, 64896, 64896, 43264, 4096, 4096, 1000.
● This was made possible by:
○ fast hardware: GPU-optimized code
○ big dataset: 1.2 million images vs thousands before
○ better regularization: dropout
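As a quick arithmetic check, the neuron counts listed for the second through fifth convolutional layers equal height × width × channels of each layer's output. The output shapes below are assumptions taken from the commonly cited description of AlexNet, not from these slides, and the first count in the list is not checked here.

```python
# Assumed output shapes (height, width, channels) of conv layers 2 to 5.
assumed_shapes = {
    "conv2": (27, 27, 256),
    "conv3": (13, 13, 384),
    "conv4": (13, 13, 384),
    "conv5": (13, 13, 256),
}


def neuron_count(shape):
    """Number of output values (neurons) in a height x width x channels volume."""
    h, w, c = shape
    return h * w * c


counts = [neuron_count(s) for s in assumed_shapes.values()]
print(counts)
```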
Winner of ILSVRC 2014 image classification: GoogLeNet
● Inception: the basic building block of GoogLeNet
● GoogLeNet: many versions followed. (Here, 7 inception modules)
a single inception module
ILSVRC 2014 best-performing single network – VGG network (11–19 layers)
Design criteria:
● Use 3 × 3 filters (to find small details in every layer)
● Max-pooling (halves the height and width of the feature map) + doubling the number of feature maps by doubling the number of filters
ILSVRC 2015 winner – Residual network (50–152 layers)
Design criteria:
● Add shortcut links
● Replace the fully connected layers with average pooling
● Use batch normalization
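A shortcut link adds a block's input to its output, so the stacked layers only have to learn a residual on top of the identity. The sketch below works on plain Python lists; `residual_block` and the toy transforms are hypothetical stand-ins, not the network's actual layers.

```python
def residual_block(x, transform):
    """Return transform(x) + x, i.e. the shortcut link adds the input back."""
    fx = transform(x)
    return [f + xi for f, xi in zip(fx, x)]


def zero_transform(v):
    """Toy transform that has learned nothing: outputs all zeros."""
    return [0.0 for _ in v]


def halve(v):
    """Toy transform standing in for a few stacked layers."""
    return [0.5 * vi for vi in v]


x = [1.0, -2.0, 4.0]
identity_out = residual_block(x, zero_transform)  # falls back to the input
residual_out = residual_block(x, halve)

print(identity_out, residual_out)
```

When the transform outputs zeros the block passes its input through unchanged, which is one common explanation for why very deep residual networks remain trainable.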
From image classification to object
detection
● The above CNNs are designed for image classification (i.e.,
assume only one concept is contained in the input image).
● However, they serve as important building blocks for
feature extraction, and can be migrated to a new architecture
for object detection.
Image classification task
Object detection task
Object detection CNNs
● R-CNN – Fast R-CNN – Faster R-CNN
● R-FCN
● SSD
● PVANet
● YOLO v2
● ……
R-CNN
● R-CNN: Regions with CNN features
Koen E. A. van de Sande, Jasper R. R. Uijlings, Theo Gevers, Arnold W. M. Smeulders, "Segmentation as Selective Search for Object Recognition," in ICCV 2011
● Scan the input image for possible objects using an algorithm called Selective
Search, generating ~2000 region proposals
● Run a convolutional neural net (CNN) on top of each of these region proposals. The CNN is pre-trained on ImageNet and fine-tuned here.
● Take the output of each CNN and feed it into a) an SVM to classify the region and b) a regressor to tighten the bounding box of the object, if such an object exists.
● Bounding box regression: output the center and size of a tight bounding box around the object.
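Bounding box regression is usually expressed as offsets relative to the proposal rather than as absolute coordinates. The sketch below uses the parameterization from the R-CNN paper (center shifts scaled by the proposal size, log-scale width and height); the function name and example numbers are illustrative assumptions.

```python
import math


def apply_box_deltas(proposal, deltas):
    """Refine a proposal (cx, cy, w, h) with regressed offsets (dx, dy, dw, dh).

    The center moves by a fraction of the proposal size, and the width and
    height are rescaled by exp(dw) and exp(dh).
    """
    px, py, pw, ph = proposal
    dx, dy, dw, dh = deltas
    return (px + pw * dx,
            py + ph * dy,
            pw * math.exp(dw),
            ph * math.exp(dh))


proposal = (50.0, 40.0, 20.0, 10.0)  # center x, center y, width, height

unchanged = apply_box_deltas(proposal, (0.0, 0.0, 0.0, 0.0))
shifted = apply_box_deltas(proposal, (0.5, -0.5, 0.0, 0.0))
widened = apply_box_deltas(proposal, (0.0, 0.0, math.log(2.0), 0.0))

print(unchanged, shifted, widened)
```

Zero offsets leave the proposal untouched, so the regressor only needs to learn small corrections to proposals that are already roughly right.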
Fast R-CNN
● Extract the features of each region proposal from the last feature map of the network, not from the original image itself. As a result, just one CNN is run on the entire image rather than one per proposal.
● The CNN is fine-tuned from the image classification network pre-trained on ImageNet.
● However, selective search on the original image is still needed.
● No SVMs: the SVMs are replaced with the CNN output.
Faster R-CNN: region proposal network
● At the last layer of an initial CNN, a 3×3 sliding window moves across the feature map and maps it to a lower dimension (e.g., 256-d).
● For each sliding-window location, it generates multiple possible regions based on k fixed-ratio anchor boxes (default bounding boxes).
● Each region proposal consists of a) an “objectness” score for that region and b) 4 coordinates representing the bounding box of the region.
● The main insight of Faster R-CNN was to replace the slow selective search algorithm
with a fast neural net. Specifically, it introduced the region proposal network (RPN).
● Faster R-CNN = RPN + Fast R-CNN
● In other words, look at each location in our last feature map and consider 𝑘𝑘 boxes
centered around it: a tall, a wide, and a large box, etc. For each of those boxes, output
whether or not we think it contains an object, and what the coordinates for that box are.
● Feed the proposal into what is essentially a Fast R-CNN.
● Share the bottom CNN layers between the region proposal network of Faster R-CNN and the bounding-box regression/object classification of Fast R-CNN.
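The k boxes considered at each feature-map location can be generated from a few scales and aspect ratios. The sketch below is an illustration only: the 3 scales × 3 ratios (k = 9), the stride of 16, and the function names are assumptions rather than values given in the slides.

```python
import math


def make_anchors(cx, cy, scales, ratios):
    """Return len(scales) * len(ratios) boxes (x1, y1, x2, y2) centered on
    (cx, cy). Each box has area scale**2 and height/width equal to ratio."""
    anchors = []
    for s in scales:
        for r in ratios:
            w = s / math.sqrt(r)
            h = s * math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def anchors_for_feature_map(fm_h, fm_w, stride, scales, ratios):
    """Place the same set of anchors at the image position of every cell."""
    out = []
    for i in range(fm_h):
        for j in range(fm_w):
            cx = j * stride + stride / 2
            cy = i * stride + stride / 2
            out.extend(make_anchors(cx, cy, scales, ratios))
    return out


per_location = make_anchors(0.0, 0.0, [128, 256, 512], [0.5, 1.0, 2.0])
all_anchors = anchors_for_feature_map(2, 3, 16, [128], [0.5, 1.0, 2.0])
print(len(per_location), len(all_anchors))
```

The network then predicts, for every anchor, an objectness score and four coordinate offsets, so the anchors act as fixed reference boxes rather than as final detections.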
Results of Faster RCNN
SSD
● Region proposal and classification are trained simultaneously, unlike Faster R-CNN, where they are trained alternately.
● Early convolutional layers are also used. Early layers correspond to smaller objects, and later layers correspond to larger objects.
● Faster than Faster R-CNN, with even better performance.
YOLO v2 (CVPR 2017)
● Modified from Faster R-CNN and YOLO
o use batch normalization; remove dropout.
o higher-resolution pretrained CNN classifier: from 224 × 224 to 448 × 448
o use 9000 classes in ImageNet for pre-training, instead of 1000.
o direct location prediction: solves the instability in the bounding box regression of Faster R-CNN.
● State of the art on standard detection tasks such as the PASCAL VOC and COCO datasets.
o At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like Faster R-CNN with ResNet and SSD while still running significantly faster.
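Direct location prediction can be sketched as follows. The formulas follow the YOLOv2 paper: the predicted center is squashed by a sigmoid so that it stays inside its grid cell, and the width and height rescale a prior (anchor) box. The function names and example numbers are illustrative assumptions.

```python
import math


def sigmoid(t):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def decode_yolo_v2(t, cell, prior):
    """Decode raw outputs t = (tx, ty, tw, th) for the grid cell whose
    top-left corner is cell = (cx, cy), given a prior box (pw, ph).
    All quantities are in grid-cell units."""
    tx, ty, tw, th = t
    cx, cy = cell
    pw, ph = prior
    bx = cx + sigmoid(tx)  # center stays within the cell
    by = cy + sigmoid(ty)
    bw = pw * math.exp(tw)
    bh = ph * math.exp(th)
    return (bx, by, bw, bh)


centered = decode_yolo_v2((0.0, 0.0, 0.0, 0.0), cell=(3, 4), prior=(2.0, 1.5))
pushed = decode_yolo_v2((3.0, -3.0, 0.0, 0.0), cell=(3, 4), prior=(2.0, 1.5))
print(centered, pushed)
```

Because the sigmoid bounds the center offset, a single bad prediction cannot move the box to an arbitrary place in the image, which is the instability the slide refers to.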
Applications: Faster R-CNN for clothing detection
Dataset
● Total 9,667 images:
o 1,964 images annotated by ourselves
o 7,703 images with bounding-box annotations from a public dataset (ATR)
Dataset (Cont’d)
● 9 categories:
o bag, belt, dress, footwear, glasses, hat, pants, skirt, upperclothes
● # bounding boxes per category
LabelMe Annotation Tool
● A web-based tool to
create bounding boxes
and assign labels.
Our annotated data | ATR
Detection Results
Approach: Faster R-CNN
Quantitative Results
● Metric: mAP (mean Average Precision)
● A detection is considered correct if its IoU (intersection over union) with the ground truth is ≥ 0.5 and its label is correct.
● Detection performance
❏ Performs better on larger items, e.g., upperclothes, dress, pants.
❏ Belts are very difficult to detect.
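The correctness criterion used for the metric can be written out as a short check. This is a minimal sketch: boxes are assumed to be (x1, y1, x2, y2) corner coordinates, and `iou` and `is_correct` are hypothetical helper names.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_correct(det_box, det_label, gt_box, gt_label, threshold=0.5):
    """A detection counts as correct if the label matches and IoU >= threshold."""
    return det_label == gt_label and iou(det_box, gt_box) >= threshold


gt = (0, 0, 10, 10)
print(iou(gt, (0, 0, 10, 10)))              # identical boxes
print(iou(gt, (5, 0, 15, 10)))              # half-overlapping boxes
print(is_correct((0, 0, 10, 8), "hat", gt, "hat"))
print(is_correct((0, 0, 10, 8), "belt", gt, "hat"))
```

Small, thin objects are penalized heavily by this criterion, since a shift of a few pixels removes a large fraction of the overlap.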
Summary
● The clothing item detector trained with bounding box annotations can produce satisfactory results, even when only a small set of training data is used.
● Training data is an issue: it is time-consuming to obtain ground-truth bounding boxes.
Face detection
● Face-detection CNN: it is trained on a large-
scale face image dataset following similar ideas.
● We show that the face detector can be realized on a CPU-based machine, Zenbo.
Deep CNN face detection/alignment
on Zenbo
● Zenbo specifications
o CPU: Intel Atom x5-Z8550, 2.4 GHz
o OS: Android 6.0.1
o RAM: 4 GB
o no GPU used
● Frames per second
o 2.5 FPS at 640×480 resolution
● Code optimizations
o C++ and the OpenBLAS library
o Multi-threaded computation
o no deep learning framework such as TensorFlow or PyTorch is used
Stingray detection with ocean aerial drones
● Chien-Hung Chen's master's thesis (Dept. of Mech. & Elec. Mach. Eng., NSYSU);
● advisor: Prof. Keng-Hao Liu
A difficult problem: humans may fail to track all the stingrays successfully.
● Using Faster R-CNN for training and detection
o base net: ZF or VGG
o Detection is based on video; consecutive frames are used to refine the results.
Demo (close range)
ZF model VGG model
ZF model with time information VGG model with time information
Demo (distant range)
ZF model VGG model
ZF model with time information VGG model with time information
Demo (hard case)
ZF model VGG model
ZF model with time information VGG model with time information
Quantitative Results
● In the ground truth, some stingray sequences detected by our method were not marked by the human annotators.
● After these cases were re-investigated with human experts, they were re-marked as ground truth.
Results of some video
Applications of deep CNN detector
● Deep CNN object detection techniques have
grown very fast in recent years. Several
promising models have been developed.
● The methods can be used for machine
inspection.
● Preparing data (with ground-truth regions) can be an issue.
o Make the data diverse.
o If only a small amount of data with labeled regions can be collected, augmenting the data with perturbations (e.g., flipping, rotation, cropping, lighting changes, blurring, sharpening, JPEG compression) is a useful technique for training.
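For detection data, geometric augmentations have to transform the labeled regions together with the pixels. The sketch below shows horizontal flipping and cropping on a nested-list image; boxes are assumed to be (x1, y1, x2, y2) in pixel-edge coordinates with x2 and y2 exclusive, and the function names are hypothetical.

```python
def hflip(image, boxes):
    """Mirror the image left-right and move each box accordingly."""
    width = len(image[0])
    flipped = [row[::-1] for row in image]
    new_boxes = [(width - x2, y1, width - x1, y2) for x1, y1, x2, y2 in boxes]
    return flipped, new_boxes


def crop(image, boxes, left, top, w, h):
    """Cut out a w x h window and clip the boxes to it, dropping any box
    that no longer overlaps the window."""
    cropped = [row[left:left + w] for row in image[top:top + h]]
    new_boxes = []
    for x1, y1, x2, y2 in boxes:
        cx1, cy1 = max(x1 - left, 0), max(y1 - top, 0)
        cx2, cy2 = min(x2 - left, w), min(y2 - top, h)
        if cx2 > cx1 and cy2 > cy1:
            new_boxes.append((cx1, cy1, cx2, cy2))
    return cropped, new_boxes


image = [[1, 2, 3, 4],
         [5, 6, 7, 8]]
boxes = [(0, 0, 1, 2)]  # covers the first column

flipped_image, flipped_boxes = hflip(image, boxes)
cropped_image, cropped_boxes = crop(image, [(1, 0, 3, 2)], left=1, top=0, w=3, h=2)
print(flipped_image, flipped_boxes)
print(cropped_image, cropped_boxes)
```

Photometric changes such as lighting, blurring, or JPEG compression leave the boxes untouched, so only the pixel values need to be altered for those.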
Acknowledgement
Some of the slides are from the CVPR 2014 tutorial, Deep Learning for Computer Vision.

Weitere ähnliche Inhalte

Was ist angesagt?

Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesFellowship at Vodafone FutureLab
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksUsman Qayyum
 
Fast methods for deep learning based object detection
Fast methods for deep learning based object detectionFast methods for deep learning based object detection
Fast methods for deep learning based object detectionBrodmann17
 
Advanced deep learning based object detection methods
Advanced deep learning based object detection methodsAdvanced deep learning based object detection methods
Advanced deep learning based object detection methodsBrodmann17
 
Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learningSushant Shrivastava
 
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic SegmentationSemantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation岳華 杜
 
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)Universitat Politècnica de Catalunya
 
Scene classification using Convolutional Neural Networks - Jayani Withanawasam
Scene classification using Convolutional Neural Networks - Jayani WithanawasamScene classification using Convolutional Neural Networks - Jayani Withanawasam
Scene classification using Convolutional Neural Networks - Jayani WithanawasamWithTheBest
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Universitat Politècnica de Catalunya
 
Codetecon #KRK 3 - Object detection with Deep Learning
Codetecon #KRK 3 - Object detection with Deep LearningCodetecon #KRK 3 - Object detection with Deep Learning
Codetecon #KRK 3 - Object detection with Deep LearningMatthew Opala
 
160205 NeuralArt - Understanding Neural Representation
160205 NeuralArt - Understanding Neural Representation160205 NeuralArt - Understanding Neural Representation
160205 NeuralArt - Understanding Neural RepresentationJunho Cho
 
Transformer in Vision
Transformer in VisionTransformer in Vision
Transformer in VisionSangmin Woo
 
Deep Learning behind Prisma
Deep Learning behind PrismaDeep Learning behind Prisma
Deep Learning behind Prismalostleaves
 
Visual Object Tracking: review
Visual Object Tracking: reviewVisual Object Tracking: review
Visual Object Tracking: reviewDmytro Mishkin
 
Recent Breakthroughs in AI + Learning Visual-Linguistic Representation in the...
Recent Breakthroughs in AI + Learning Visual-Linguistic Representation in the...Recent Breakthroughs in AI + Learning Visual-Linguistic Representation in the...
Recent Breakthroughs in AI + Learning Visual-Linguistic Representation in the...Sangmin Woo
 

Was ist angesagt? (20)

Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
 
Semantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network ApproachesSemantic segmentation with Convolutional Neural Network Approaches
Semantic segmentation with Convolutional Neural Network Approaches
 
Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017
Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017
Deep Visual Saliency - Kevin McGuinness - UPC Barcelona 2017
 
Object Detection using Deep Neural Networks
Object Detection using Deep Neural NetworksObject Detection using Deep Neural Networks
Object Detection using Deep Neural Networks
 
Fast methods for deep learning based object detection
Fast methods for deep learning based object detectionFast methods for deep learning based object detection
Fast methods for deep learning based object detection
 
Advanced deep learning based object detection methods
Advanced deep learning based object detection methodsAdvanced deep learning based object detection methods
Advanced deep learning based object detection methods
 
Object detection with deep learning
Object detection with deep learningObject detection with deep learning
Object detection with deep learning
 
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic SegmentationSemantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
Semantic Segmentation - Fully Convolutional Networks for Semantic Segmentation
 
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
Convolutional Neural Networks (D1L3 2017 UPC Deep Learning for Computer Vision)
 
Scene classification using Convolutional Neural Networks - Jayani Withanawasam
Scene classification using Convolutional Neural Networks - Jayani WithanawasamScene classification using Convolutional Neural Networks - Jayani Withanawasam
Scene classification using Convolutional Neural Networks - Jayani Withanawasam
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 
Codetecon #KRK 3 - Object detection with Deep Learning
Codetecon #KRK 3 - Object detection with Deep LearningCodetecon #KRK 3 - Object detection with Deep Learning
Codetecon #KRK 3 - Object detection with Deep Learning
 
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
Deep 3D Visual Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2017
 
160205 NeuralArt - Understanding Neural Representation
160205 NeuralArt - Understanding Neural Representation160205 NeuralArt - Understanding Neural Representation
160205 NeuralArt - Understanding Neural Representation
 
Transformer in Vision
Transformer in VisionTransformer in Vision
Transformer in Vision
 
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
 
Deep Learning behind Prisma
Deep Learning behind PrismaDeep Learning behind Prisma
Deep Learning behind Prisma
 
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
 
Visual Object Tracking: review
Visual Object Tracking: reviewVisual Object Tracking: review
Visual Object Tracking: review
 
Recent Breakthroughs in AI + Learning Visual-Linguistic Representation in the...
Recent Breakthroughs in AI + Learning Visual-Linguistic Representation in the...Recent Breakthroughs in AI + Learning Visual-Linguistic Representation in the...
Recent Breakthroughs in AI + Learning Visual-Linguistic Representation in the...
 

Ähnlich wie 物件偵測與辨識技術

Deep Neural Networks Presentation
Deep Neural Networks PresentationDeep Neural Networks Presentation
Deep Neural Networks PresentationBohdan Klimenko
 
NMO IE-2 Activity Presentation.pptx
NMO IE-2 Activity Presentation.pptxNMO IE-2 Activity Presentation.pptx
NMO IE-2 Activity Presentation.pptxLEGENDARYTECHNICAL
 
Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Suraj Aavula
 
Deep learning based object detection
Deep learning based object detectionDeep learning based object detection
Deep learning based object detectionMonicaDommaraju
 
Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides Brodmann17
 
NMO IE-2 Activity Presentation.pptx
NMO IE-2 Activity Presentation.pptxNMO IE-2 Activity Presentation.pptx
NMO IE-2 Activity Presentation.pptxLEGENDARYTECHNICAL
 
Cahall Final Intern Presentation
Cahall Final Intern PresentationCahall Final Intern Presentation
Cahall Final Intern PresentationDaniel Cahall
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...Big Data Spain
 
intro-to-cnn-April_2020.pptx
intro-to-cnn-April_2020.pptxintro-to-cnn-April_2020.pptx
intro-to-cnn-April_2020.pptxssuser3aa461
 
最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - Hiroshi Fukui
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningSujit Pal
 
Introduction to computer vision with Convoluted Neural Networks
Introduction to computer vision with Convoluted Neural NetworksIntroduction to computer vision with Convoluted Neural Networks
Introduction to computer vision with Convoluted Neural NetworksMarcinJedyk
 
FINAL_Team_4.pptx
FINAL_Team_4.pptxFINAL_Team_4.pptx
FINAL_Team_4.pptxnitin571047
 

Ähnlich wie 物件偵測與辨識技術 (20)

Deep Neural Networks Presentation
Deep Neural Networks PresentationDeep Neural Networks Presentation
Deep Neural Networks Presentation
 
NMO IE-2 Activity Presentation.pptx
NMO IE-2 Activity Presentation.pptxNMO IE-2 Activity Presentation.pptx
NMO IE-2 Activity Presentation.pptx
 
Convolution Neural Network (CNN)
Convolution Neural Network (CNN)Convolution Neural Network (CNN)
Convolution Neural Network (CNN)
 
Deep learning based object detection
Deep learning based object detectionDeep learning based object detection
Deep learning based object detection
 
object-detection.pptx
object-detection.pptxobject-detection.pptx
object-detection.pptx
 
Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides Brodmann17 CVPR 2017 review - meetup slides
Brodmann17 CVPR 2017 review - meetup slides
 
NMO IE-2 Activity Presentation.pptx
NMO IE-2 Activity Presentation.pptxNMO IE-2 Activity Presentation.pptx
NMO IE-2 Activity Presentation.pptx
 
Cahall Final Intern Presentation
Cahall Final Intern PresentationCahall Final Intern Presentation
Cahall Final Intern Presentation
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...
 
Scene understanding
Scene understandingScene understanding
Scene understanding
 
Mnist report
Mnist reportMnist report
Mnist report
 
Mnist report ppt
Mnist report pptMnist report ppt
Mnist report ppt
 
intro-to-cnn-April_2020.pptx
intro-to-cnn-April_2020.pptxintro-to-cnn-April_2020.pptx
intro-to-cnn-April_2020.pptx
 
最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に - 最近の研究情勢についていくために - Deep Learningを中心に -
最近の研究情勢についていくために - Deep Learningを中心に -
 
Deep learning
Deep learning Deep learning
Deep learning
 
Convolutional neural networks
Convolutional neural  networksConvolutional neural  networks
Convolutional neural networks
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep Learning
 
Introduction to computer vision with Convoluted Neural Networks
Introduction to computer vision with Convoluted Neural NetworksIntroduction to computer vision with Convoluted Neural Networks
Introduction to computer vision with Convoluted Neural Networks
 
FINAL_Team_4.pptx
FINAL_Team_4.pptxFINAL_Team_4.pptx
FINAL_Team_4.pptx
 
Visual Transformers
Visual TransformersVisual Transformers
Visual Transformers
 

Mehr von CHENHuiMei

小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵
小數據如何實現電腦視覺,微軟AI研究首席剖析關鍵CHENHuiMei
 
QIF對AOI設備業之衝擊與機會
QIF對AOI設備業之衝擊與機會QIF對AOI設備業之衝擊與機會
QIF對AOI設備業之衝擊與機會CHENHuiMei
 
產研融合推手-台大AOI設備研發聯盟_台大陳亮嘉
產研融合推手-台大AOI設備研發聯盟_台大陳亮嘉產研融合推手-台大AOI設備研發聯盟_台大陳亮嘉
產研融合推手-台大AOI設備研發聯盟_台大陳亮嘉CHENHuiMei
 
基於少樣本深度學習之橡膠墊片檢測系統
基於少樣本深度學習之橡膠墊片檢測系統基於少樣本深度學習之橡膠墊片檢測系統
基於少樣本深度學習之橡膠墊片檢測系統CHENHuiMei
 
AOI智慧升級─AI訓練師在地養成計畫_台灣人工智慧學校
AOI智慧升級─AI訓練師在地養成計畫_台灣人工智慧學校AOI智慧升級─AI訓練師在地養成計畫_台灣人工智慧學校
AOI智慧升級─AI訓練師在地養成計畫_台灣人工智慧學校CHENHuiMei
 
使用人工智慧檢測三維錫球瑕疵_台大傅楸善
使用人工智慧檢測三維錫球瑕疵_台大傅楸善使用人工智慧檢測三維錫球瑕疵_台大傅楸善
使用人工智慧檢測三維錫球瑕疵_台大傅楸善CHENHuiMei
 
IIoT發展趨勢及設備業者因應之_微軟葉怡君
IIoT發展趨勢及設備業者因應之_微軟葉怡君IIoT發展趨勢及設備業者因應之_微軟葉怡君
IIoT發展趨勢及設備業者因應之_微軟葉怡君CHENHuiMei
 
精密機械的空間軌跡精度光學檢測法_台大范光照
精密機械的空間軌跡精度光學檢測法_台大范光照精密機械的空間軌跡精度光學檢測法_台大范光照
精密機械的空間軌跡精度光學檢測法_台大范光照CHENHuiMei
 
When AOI meets AI
When AOI meets AIWhen AOI meets AI
When AOI meets AICHENHuiMei
 
2018AOI論壇_基於生成對抗網路之非監督式AOI技術_工研院蔡雅惠
2018AOI論壇_基於生成對抗網路之非監督式AOI技術_工研院蔡雅惠2018AOI論壇_基於生成對抗網路之非監督式AOI技術_工研院蔡雅惠
2018AOI論壇_基於生成對抗網路之非監督式AOI技術_工研院蔡雅惠CHENHuiMei
 
2018AOIEA論壇Keynote_眺望趨勢 量測設備未來10年發展重點_致茂曾一士
2018AOIEA論壇Keynote_眺望趨勢 量測設備未來10年發展重點_致茂曾一士2018AOIEA論壇Keynote_眺望趨勢 量測設備未來10年發展重點_致茂曾一士
2018AOIEA論壇Keynote_眺望趨勢 量測設備未來10年發展重點_致茂曾一士CHENHuiMei
 
2018AOI論壇Keynote_AI入魂製造領域現況與趨勢_工研院熊治民
2018AOI論壇Keynote_AI入魂製造領域現況與趨勢_工研院熊治民2018AOI論壇Keynote_AI入魂製造領域現況與趨勢_工研院熊治民
2018AOI論壇Keynote_AI入魂製造領域現況與趨勢_工研院熊治民CHENHuiMei
 
2018AOI論壇_AOI and IoT產線應用_工研院周森益
2018AOI論壇_AOI and IoT產線應用_工研院周森益2018AOI論壇_AOI and IoT產線應用_工研院周森益
2018AOI論壇_AOI and IoT產線應用_工研院周森益CHENHuiMei
 
2018AOI論壇_AOI參與整廠協作之實務建議_達明機器人黃鐘賢
2018AOI論壇_AOI參與整廠協作之實務建議_達明機器人黃鐘賢2018AOI論壇_AOI參與整廠協作之實務建議_達明機器人黃鐘賢
2018AOI論壇_AOI參與整廠協作之實務建議_達明機器人黃鐘賢CHENHuiMei
 
2018AOI論壇_深度學習在電腦視覺應用上的疑問_中央大學曾定章
2018AOI論壇_深度學習在電腦視覺應用上的疑問_中央大學曾定章2018AOI論壇_深度學習在電腦視覺應用上的疑問_中央大學曾定章
2018AOI論壇_深度學習在電腦視覺應用上的疑問_中央大學曾定章CHENHuiMei
 
2018AOI論壇_深度學習於表面瑕疪檢測_元智大學蔡篤銘
2018AOI論壇_深度學習於表面瑕疪檢測_元智大學蔡篤銘2018AOI論壇_深度學習於表面瑕疪檢測_元智大學蔡篤銘
2018AOI論壇_深度學習於表面瑕疪檢測_元智大學蔡篤銘CHENHuiMei
 
2018AOI論壇_時機已到 AOI導入邊緣運算_SAS林育宏
2018AOI論壇_時機已到 AOI導入邊緣運算_SAS林育宏2018AOI論壇_時機已到 AOI導入邊緣運算_SAS林育宏
2018AOI論壇_時機已到 AOI導入邊緣運算_SAS林育宏CHENHuiMei
 
2018AOI論壇_如何導入深度學習來提升工業瑕疵檢測技術_工研院賴璟皓
2018AOI論壇_如何導入深度學習來提升工業瑕疵檢測技術_工研院賴璟皓2018AOI論壇_如何導入深度學習來提升工業瑕疵檢測技術_工研院賴璟皓
2018AOI論壇_如何導入深度學習來提升工業瑕疵檢測技術_工研院賴璟皓CHENHuiMei
 

Object Detection and Recognition Techniques

  • 1. Deep Learning Techniques for Object Detection and Recognition Chu-Song Chen
  • 2. Outline ● Computer Vision ● Image Classification and Object Detection ● Crowdsourcing + Machine Learning o Image Net + ILSVRC Challenge o Deep Convolution Nets ● Recent Advances and Results
  • 3. Computer Vision ● Research on the methods for acquiring, processing, analyzing, and understanding images and, in general, high- dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the forms of decisions.
  • 4. Object Detection & Recognition ● Object recognition is one of the main tasks in computer vision. Semantic segmentation Object detection
  • 5. What is object detection? ● Image classification ● object localization ● object detection ● segmentation difficulty
  • 6. Why is object detection important? ● Perception is one of the biggest bottlenecks of ○ Robotics ○ Self-driving cars ○ Surveillance
  • 7. Applications ● Image classification ○ image search (Google, Baidu, Bing) ● Object detection ○ face ■ smart phone/cameras ■ election duplicate votes ■ CCTV ■ border control ■ casinos ■ visa processing ■ crime solving ■ prosopagnosia (face blindness) ○ objects ■ license plates ■ pedestrian detection (Daimler, MobileEye): ● warning and automatic braking reducing accidents and severity ■ vehicle detection for forward collision warning (MobileEye) ■ traffic sign detection (MobileEye)  E-commerce  machine inspection
  • 8. Machine Learning & Computer Vision ● How to achieve object recognition? o Typically through machine learning in computer vision. ● Training stage: o Collect training sample images. o Learn an object detector. ● Inference stage: Employ the learned detector for detection. o Take pedestrian detection as an example:
  • 9. Pedestrian detection: training phase (traditional approach) ● Collecting training data o Extracting features (or casting data into feature space). ■ color, edge, gradient, silhouette, dimension reduction, etc. o Learning an object detector classifier ■ Many learning methods: e.g., neural networks, SVM, boosting, cascaded AdaBoost, random forest. (figure labels: positive training data, negative training data)
  • 10. Pedestrian detection: testing phase (traditional approach) ● After learning a human detector o A detection window can be used to scan the test image along the x and y directions for human detection.
  • 11. Pedestrian detection: inference phase ● Human detection o Detection windows of different sizes are used to detect humans at different scales.
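The multi-scale window scan described in slides 10 and 11 can be sketched as follows. This is a minimal illustration, not the detector from the slides: the image size, window sizes, and stride are assumed values, and the classifier that would score each window is left out.

```python
def sliding_windows(img_w, img_h, win_sizes, stride):
    """Yield (x, y, w, h) for every window position at every window size."""
    for w, h in win_sizes:
        # Scan along y, then x, keeping the window fully inside the image.
        for y in range(0, img_h - h + 1, stride):
            for x in range(0, img_w - w + 1, stride):
                yield (x, y, w, h)


# Assumed values: a 640x480 image, two pedestrian-shaped window sizes.
windows = list(sliding_windows(640, 480, [(64, 128), (128, 256)], stride=32))
print(len(windows), windows[0])
```

Each yielded window would be cropped, converted to features, and passed to the learned classifier; the number of windows grows quickly with finer strides and more scales, which is one reason later slides move to region proposals.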
  • 12. Difficulties for object recognition ● Object recognition o To a human: an image and an image block o To a machine: a data array of real numbers
  • 13. Past breakthroughs in object detection research o Face detection: Haar features + AdaBoost learning (2000). ● Every mobile phone is equipped with this function now. o SIFT and HOG: local discriminative features (2004), + SVM for object detection. ● A key component of RGB vision-based positioning and localization.
  • 14. Examples of several breakthroughs in object detection research ● Deformable part models (2008): o HOG feature o Latent SVM + stochastic gradient descent (SGD) training o Training scale of the above: 5K ~ 20K training images.
  • 15. General object recognition o The above methods contribute many ingredients to applications. o However, it is still difficult for them to achieve general object detection/recognition. ● Recent big breakthroughs in object detection come from crowdsourcing + machine learning: o More labeled training data are gathered through Mechanical Turk. o More suitable machine learning techniques: deep convolutional neural networks (CNNs).
  • 16. Artificial neural networks and deep learning ● Why deep learning? o A limitation of traditional methods: feature extraction and classifier training are two separate, independent processes. o One motivation of deep learning is to join feature extraction and classification into a single framework. o This leads to a large number of parameters. However, when the number of training images is huge, the issue of over-fitting is lessened. o Deep learning: end-to-end learning, that is, feature extraction + classification in a single step.
  • 22. Artificial neural networks and deep learning ● Deep learning stems from artificial neural networks. ● There are many deep learning architectures. ● Among them, deep convolutional networks (CNNs) perform the best on recognition tasks. ● In the following, we will review convolutional neural networks (CNNs) for o image classification o object detection
  • 23. Convolutional Neural Networks ● CNN: a neural network consisting of o fully-connected layers o convolution layers o max-pooling o nonlinear activation (ReLU or sigmoid) o …
  • 24. Fully-connected layers ● If the input is an image, the fully connected layer will have a huge number of links between layers. ● The weights must be learned.
  • 25. Convolution layer ● Instead of full connection, a k × k window slides over the image and an inner product is computed at every site. ● That is, a k × k FIR filter (convolution) is applied to the image. ● The coefficients must be learned.
  • 26. Convolution vs. full connection ● Convolutional layer: o Shared weights o Shift invariance o Local connectivity (figure: convolutional layer vs. fully connected layer)
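To make the effect of shared weights concrete, the weight counts of the two layer types can be compared with a little arithmetic. The sizes below (a 32 × 32 RGB input, 1000 fully connected units, five 5 × 5 filters) are assumed for illustration only; biases are ignored.

```python
def fully_connected_weights(n, channels, hidden_units):
    """Every input value is linked to every hidden unit."""
    return n * n * channels * hidden_units


def conv_weights(k, channels, num_filters):
    """Each filter has k*k*channels coefficients, shared across all positions."""
    return k * k * channels * num_filters


fc = fully_connected_weights(32, 3, 1000)
conv = conv_weights(5, 3, 5)
print(fc, conv)
```

With these assumed sizes the fully connected layer needs several thousand times more weights than the convolutional one, because the convolutional filter coefficients are reused at every image position.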
  • 27. Multiple FIR filters in a convolutional layer ● A convolutional layer often contains multiple FIR filters. ● The filters' outputs serve as the inputs of the next layer. ● So, if a convolution layer uses c_l filters, the output of this layer forms an n_l × n_l × c_l volume.
  • 28. Multiple “volume” FIR filters ● So, the output of the convolution layer has c_l channels, forming an n_l × n_l × c_l volume. ● Actually, the FIR filters applied in a CNN are of size k × k × c_l (though we usually abbreviate this as k × k for simplicity); each is indeed a “volume” FIR filter.
  • 29. Input: an RGB (3-channel) image of size N × N ● E.g., N = 32, input to the first convolutional layer having 5 filters ● E.g., N = 40, input to a cascade of convolutional layers, a fully connected layer, and the final output layer (entire network).
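The "volume" filtering of slides 27 to 29 can be sketched directly in NumPy. This is a minimal, unoptimized sketch under stated assumptions: the N = 32 input and 5 filters come from slide 29, while the 5 × 5 filter size, the random values, the lack of padding, and a stride of 1 are assumptions. As in most CNN code, the filter is not flipped (cross-correlation).

```python
import numpy as np


def conv_layer(volume, filters):
    """Filter an (n, n, c_in) volume with (c_out, k, k, c_in) volume filters.

    Returns an (n - k + 1, n - k + 1, c_out) volume (no padding, stride 1).
    """
    n, _, c_in = volume.shape
    c_out, k, _, _ = filters.shape
    m = n - k + 1
    out = np.zeros((m, m, c_out))
    for f in range(c_out):
        for i in range(m):
            for j in range(m):
                # Inner product between the k x k x c_in window and filter f.
                out[i, j, f] = np.sum(volume[i:i + k, j:j + k, :] * filters[f])
    return out


rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32, 3))      # N = 32, RGB
filters = rng.standard_normal((5, 5, 5, 3))   # 5 filters, each 5 x 5 x 3
features = conv_layer(image, filters)
print(features.shape)
```

Each filter spans all input channels, so one filter produces one output channel; stacking the outputs of all filters gives the output volume that feeds the next layer.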
  • 30. A single neuron o Activation function examples o sigmoid o ReLU ● A nonlinear activation function is needed; otherwise, layers cascaded linearly can be replaced by a single equivalent layer.
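The point about linearly cascaded layers can be checked numerically. In this sketch the layer sizes and random weights are arbitrary assumptions; it shows that two linear layers without an activation equal one layer whose matrix is the product of the two.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # first layer: 3 inputs -> 4 units
W2 = rng.standard_normal((2, 4))   # second layer: 4 units -> 2 outputs
x = rng.standard_normal(3)

# Two linear layers applied in sequence ...
two_linear = W2 @ (W1 @ x)
# ... are equivalent to one layer with the combined weight matrix.
one_linear = (W2 @ W1) @ x


def relu(z):
    """ReLU activation: max(z, 0) element-wise."""
    return np.maximum(z, 0)


def sigmoid(z):
    """Sigmoid activation: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


# With a nonlinearity in between, no single matrix reproduces the mapping in general.
with_relu = W2 @ relu(W1 @ x)
print(np.allclose(two_linear, one_linear))
```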
  • 31. Pooling for dimension (size) reduction, or there will still be too many weights. ● Summarizes the input ● E.g., max pooling
  • 32. Max pooling layer (cont) After max pooling, the size (i.e., dimension) of the feature map is reduced.
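A small worked example of max pooling, assuming non-overlapping 2 × 2 windows (the slides do not state the window size) and a made-up 4 × 4 feature map:

```python
import numpy as np


def max_pool_2x2(fmap):
    """Non-overlapping 2x2 max pooling on a 2-D feature map."""
    h, w = fmap.shape
    # Drop an odd last row/column, then group pixels into 2x2 blocks.
    blocks = fmap[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 5],
                 [0, 1, 7, 2],
                 [3, 2, 4, 6]])
pooled = max_pool_2x2(fmap)
print(pooled)
# [[4 5]
#  [3 7]]
```

The 4 × 4 map becomes 2 × 2: each output value summarizes one block by its maximum, so height and width are halved.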
  • 33. Why are ConvNets good for detection? ● Sharing parameters is good ○ taking advantage of local coherence to learn a more efficient representation: ■ no redundancy ■ translation invariance ■ slight rotation invariance with pooling ● Efficient for detection: ○ all computations are shared ○ can handle varying input sizes (no need to relearn weights for new sizes) ● ConvNets are convolutional all the way up, including fully connected layers (slide: Pierre Sermanet)
  • 34. Big-data training images from the Internet ● ILSVRC competition (ImageNet Challenge) o ImageNet: collecting images according to the WordNet tree. o ILSVRC: choosing words in different tree branches.
  • 36. Fine tuning ● ILSVRC (ImageNet challenge) is a large dataset with diverse object classes. ● Using the pre-trained weights on ILSVRC for fine-tuning is a popular strategy.
  • 37. Winner of the ILSVRC 2012 image classification task: AlexNet ● 5 convolutional layers, 3 fully-connected layers ● The numbers of neurons in the layers are 253440, 186624, 64896, 64896, 43264, 4096, 4096, 1000. ● This was made possible by: ○ fast hardware: GPU-optimized code ○ big dataset: 1.2 million images vs. thousands before ○ better regularization: dropout
  • 38. Winner of the ILSVRC 2014 image classification task: GoogLeNet ● Inception: the basic building block of GoogLeNet ● GoogLeNet: many versions followed. (Here, 7 inception modules.) (figure: a single inception module)
  • 39. ILSVRC 2014 best single-network performance: VGG network (11-19 layers) Design criteria: ● Use 3 × 3 filters (to find small details in every layer) ● Max-pooling halves the height and width of the feature map ● The number of feature maps is doubled by doubling the number of filters.
  • 40. ILSVRC 2015 winner: Residual network (50-152 layers) Design criteria: ● Add short-cut links ● Replace the fully connected layer with average pooling ● Use batch normalization
  • 41. From image classification to object detection ● The above CNNs are designed for image classification (i.e., assuming only one concept is contained in the input image). ● However, they serve as important building blocks for feature extraction, and can be migrated to a new architecture for object detection. (figure: image classification task vs. object detection task)
  • 42. Object detection CNNs ● R-CNN, Fast R-CNN, Faster R-CNN ● R-FCN ● SSD ● PVANet ● YOLOv2 ● …
  • 43. R-CNN ● R-CNN: Regions with CNN features. Koen E. A. van de Sande, Jasper R. R. Uijlings, Theo Gevers, Arnold W. M. Smeulders, Segmentation As Selective Search for Object Recognition, ICCV 2011.
  • 44. ● Scan the input image for possible objects using an algorithm called Selective Search, generating ~2000 region proposals. ● Run a convolutional neural net (CNN) on top of each of these region proposals. The CNN is pre-trained on ImageNet and fine-tuned here. ● Take the output of each CNN and feed it into a) an SVM to classify the region and b) a regressor to tighten the bounding box of the object, if such an object exists.
  • 45. ● Bounding box regression: output the center and size of the tight bounding box of the object.
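As a sketch of how a regressor can output "center and size", the following uses the center/size offset parameterization common in the R-CNN family: center shifts are scaled by the proposal size, and size changes are expressed as log ratios. The slide does not say which parameterization is meant, so treat this as an assumption; the box values are made up.

```python
import math


def encode(proposal, gt):
    """Regression targets that move a proposal (cx, cy, w, h) onto a ground-truth box."""
    px, py, pw, ph = proposal
    gx, gy, gw, gh = gt
    return ((gx - px) / pw, (gy - py) / ph, math.log(gw / pw), math.log(gh / ph))


def decode(proposal, t):
    """Apply predicted offsets t = (tx, ty, tw, th) to a proposal box."""
    px, py, pw, ph = proposal
    tx, ty, tw, th = t
    return (px + tx * pw, py + ty * ph, pw * math.exp(tw), ph * math.exp(th))


proposal = (50.0, 50.0, 20.0, 40.0)   # hypothetical proposal: center (50, 50), 20 x 40
gt = (55.0, 60.0, 30.0, 40.0)         # hypothetical tight box around the object
targets = encode(proposal, gt)
refined = decode(proposal, targets)
print(targets, refined)
```

During training the regressor learns to predict `targets` from the region's features; at test time `decode` turns its prediction into the tightened box.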
  • 46. Fast R-CNN ● Extract the features of each region proposal from the last feature map of the network, not from the original image itself. As a result, we can train just one CNN for the entire image. ● The CNN is fine-tuned from the image classification network pre-trained on ImageNet. ● However, selective search on the original image is still needed. ● No SVMs: the SVMs are replaced by the CNN output.
  • 47. Faster R-CNN: region proposal network ● At the last layer of an initial CNN, a 3x3 sliding window moves across the feature map and maps it to a lower dimension (e.g. 256-d). ● For each sliding-window location, it generates multiple possible regions based on k fixed-ratio anchor boxes (default bounding boxes). ● Each region proposal consists of a) an “objectness” score for that region and b) 4 coordinates representing the bounding box of the region.
  • 48. ● The main insight of Faster R-CNN was to replace the slow selective search algorithm with a fast neural net. Specifically, it introduced the region proposal network (RPN). ● Faster R-CNN = RPN + Fast R-CNN
  • 49. ● In other words, look at each location in our last feature map and consider k boxes centered around it: a tall box, a wide box, a large box, etc. For each of those boxes, output whether or not we think it contains an object, and what the coordinates for that box are. ● Feed the proposals into what is essentially a Fast R-CNN. ● Share the bottom CNN between the region proposal network of Faster R-CNN and the bounding-box regression/object classification of Fast R-CNN.
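The k anchor boxes considered at each location can be generated as below. This is a minimal sketch: the three scales and three aspect ratios are illustrative values giving k = 9, and the function name and the (x1, y1, x2, y2) box layout are choices made here, not taken from the slides.

```python
import numpy as np


def make_anchors(cx, cy, scales, ratios):
    """Return a (len(scales) * len(ratios), 4) array of (x1, y1, x2, y2) boxes.

    Every box is centred at (cx, cy); a ratio r means height / width,
    and each box keeps an area of scale**2.
    """
    boxes = []
    for s in scales:
        for r in ratios:
            w = s / np.sqrt(r)
            h = s * np.sqrt(r)
            boxes.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.array(boxes)


# Assumed values: 3 scales x 3 ratios = 9 anchors at one feature-map location.
anchors = make_anchors(100, 100, scales=[128, 256, 512], ratios=[0.5, 1.0, 2.0])
print(anchors.shape)
```

The region proposal network then predicts, for each of these anchors, an objectness score and four box offsets.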
  • 51. SSD ● Region proposal and classification are trained simultaneously, unlike Faster R-CNN, where they are trained alternately. ● Early convolution layers are also used. Early layers correspond to smaller objects, and later layers correspond to larger objects. ● Faster than Faster R-CNN, with even better performance.
  • 52. YOLOv2 (CVPR 2017) ● Modified from Faster R-CNN and YOLO o use batch normalization; remove dropout. o higher-resolution pretrained CNN classifier: from 224 × 224 to 448 × 448 o use 9000 classes of ImageNet for pre-training, instead of 1000. o direct location prediction: solves the instability in the bounding box regression of Faster R-CNN. ● State-of-the-art on standard detection tasks like the PASCAL VOC and COCO datasets. o At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like Faster R-CNN with ResNet and SSD while still running significantly faster.
  • 53. Applications: Faster R-CNN for clothing detection ● Dataset: 9,667 images in total o 1,964 images annotated by ourselves o 7,703 images with bounding-box annotations from a public dataset (ATR)
  • 54. Dataset (cont'd) ● 9 categories: bag, belt, dress, footwear, glasses, hat, pants, skirt, upperclothes ● Number of bounding boxes per category (figure)
  • 55. LabelMe Annotation Tool ● A web-based tool to create bounding boxes and assign labels.
  • 60. Quantitative Results ● Metric: mAP (mean Average Precision) ● A detection is considered correct if its IoU (intersection over union) with the ground truth is ≥ 0.5 and its label is correct. ● Detection performance: o Performs better on larger items, e.g., upperclothes, dress, pants o Belts are very difficult to detect.
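The correctness criterion above (IoU ≥ 0.5 and matching label) can be written out as a short sketch. Boxes are assumed to be (x1, y1, x2, y2) tuples; the function names and example boxes are hypothetical, and the full mAP computation (ranking detections and averaging precision per class) is not shown.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_correct(pred_box, pred_label, gt_box, gt_label, thresh=0.5):
    """A detection counts as correct if the label matches and IoU >= thresh."""
    return pred_label == gt_label and iou(pred_box, gt_box) >= thresh


print(iou((0, 0, 10, 10), (5, 0, 15, 10)))                   # overlap 50, union 150
print(is_correct((0, 0, 10, 10), "hat", (0, 0, 10, 8), "hat"))
```

Small or thin objects lose IoU quickly under slight misalignment, which is consistent with the observation that belts are hard to detect.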
  • 61. Summary ● The clothing item detector trained with bounding box annotations can produce satisfactory results, even when only a small set of training data is used. ● Training data is an issue: it is time-consuming to obtain ground-truth bounding boxes.
  • 62. Face detection ● Face-detection CNN: it is trained on a large-scale face image dataset following similar ideas. ● We show that the face detector can be realized on a CPU-based machine, Zenbo.
  • 63. Deep CNN face detection/alignment on Zenbo ● Zenbo specifications o CPU: Intel Atom x5-Z8550 2.4 GHz o OS: Android 6.0.1 o RAM: 4 GB o without using GPUs ● Frames per second o 2.5 FPS at 640x480 resolution ● Code optimizations o C++ and the OpenBLAS library o Multi-threaded computation o without using any deep learning frameworks such as TensorFlow or PyTorch
  • 64. Stingray detection from marine aerial drone footage ● Chien-Hung Chen’s master thesis (Dept. of Mech. & Elec. Mach. Eng., NSYSU); ● advisor: Prof. Keng-Hao Liu ● A difficult problem: humans may fail to track all the stingrays successfully. ● Using Faster R-CNN to train and detect ■ base net: ZF or VGG ■ Detection based on video; using consecutive frames to refine the results.
  • 65. Demo (close range) ZF model VGG model ZF model with time information VGG model with time information
  • 66. Demo (distant range) ZF model VGG model ZF model with time information VGG model with time information
  • 67. Demo (hard case) ZF model VGG model ZF model with time information VGG model with time information
  • 68. Quantitative Results ● In the ground truth, some stingray sequences detected by our method were not marked by the human annotators. ● After these cases were re-investigated with human experts, they were re-marked as ground truth. (figure: results on some videos)
  • 69. Applications of deep CNN detectors ● Deep CNN object detection techniques have grown very fast in recent years. Several promising models have been developed. ● The methods can be used for machine inspection. ● Preparing data (with ground-truth regions) can be an issue. o Make the data types diverse o If only a little data with labeled regions can be collected, augmenting the data with perturbations (e.g., flipping, rotation, cropping, lighting changes, blurring, sharpening, JPEG compression, etc.) is a useful technique for training.
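A few of the augmentations listed on slide 69 (flipping, lighting change, cropping) can be sketched with NumPy alone. The flip probability, brightness range, and crop fraction are assumed values, and the function name is hypothetical; for detection training, the ground-truth boxes would have to be flipped and shifted in the same way, which is not shown here.

```python
import numpy as np


def augment(image, rng):
    """Return a randomly flipped, brightness-scaled, and cropped copy of an HxWx3 uint8 image."""
    out = image.astype(np.float32)
    # Horizontal flip with probability 0.5.
    if rng.random() < 0.5:
        out = out[:, ::-1, :]
    # Lighting change: scale intensities by a random gain, then clip to the valid range.
    gain = rng.uniform(0.8, 1.2)
    out = np.clip(out * gain, 0, 255)
    # Random crop covering 90% of each side.
    h, w, _ = out.shape
    ch, cw = h * 9 // 10, w * 9 // 10
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    return out[top:top + ch, left:left + cw, :].astype(np.uint8)


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(100, 80, 3), dtype=np.uint8)   # stand-in for a real photo
samples = [augment(image, rng) for _ in range(4)]
print(samples[0].shape, samples[0].dtype)
```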
  • 70. Acknowledgement: Some of the slides are from the CVPR 2014 tutorial, Deep Learning for Computer Vision.