Building and road detection
from large aerial imagery
Shunta SAITO*, Yoshimitsu AOKI*
* Graduate School of Science and Technology,
Keio University, Japan
Motivation
• Understanding aerial images is in high demand for generating maps, assessing the scale of disasters, detecting changes for estate management, etc.
• But it has usually been done by human experts, which makes it both slow and costly.
• The remote sensing community has focused on this task, but it is still difficult to detect terrestrial objects automatically from aerial images with high accuracy.
Goal
• Input: aerial image (RGB)
• Output: 3-channel map (R: road, G: building, B: others)
• The output should take into account the trade-off between different objects at the same pixel.
Previous Works
Senaras et al., Building detection with decision fusion, 2013
[Figure: process flow. From the input aerial image (Infrared, RGB), Infrared-Red, Infrared-Red-Green, Hue-Saturation-Intensity (HSI), and Normalized Difference Vegetation Index (NDVI) images are derived, together with a vegetation mask and a shadow mask. Mean shift segmentation produces segments, 15 different features are extracted, the features are fed to classifiers, and the multiple classification results are combined.]
[Figure: result. Input, predicted labels, ground truth.]
Previous Works
Volodymyr Mnih, Machine Learning for Aerial Image Labeling, 2013
• This work takes a patch-based approach, which is well suited to Convolutional Neural Networks (CNNs).
• It formulates the problem as learning a mapping from an aerial image patch to a label image patch.
• However, it trains two separate CNNs, one for buildings and another for roads, even though there may be trade-offs between them.
[Figure: process flow (aerial imagery, patches, CNN, noise model, predicted label), result, and description.]
[Figure: dataset. Aerial images with building labels (x 151) and road labels (x 1109).]
Our Approach
We train a Convolutional Neural Network (CNN) as a mapping from an input aerial image patch to a 3-channel label image patch using stochastic gradient descent.
[Figure: training loop. Patches of size 64 x 64 x 3 (RGB) are fed to the Convolutional Neural Network, which outputs predictions of size 16 x 16 x 3 (building, road, other). The loss between the predictions and the correct answers is calculated and backpropagated.]
[Figure: network architecture, from the input aerial image patch (R, G, B channels) to the predicted map patch]
C(64, 9x9/2) - P(2/1) - C(128, 7x7/1) - C(128, 5x5/1) - FC(4096) - FC(768)
• We train a CNN with the architecture above.
• The CNN takes a small RGB image patch as input and outputs a predicted 3-channel label patch.
• The predicted label patch consists of a road channel, a building channel, and an others channel.
• No pre-processing such as segmentation is needed, and we don't need to design any image features: the CNN learns good feature extractors automatically.
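The layer string above can be checked with a little shape arithmetic. The sketch below is only an illustration under stated assumptions: it reads C(n, k x k / s) as a convolution with n filters of size k x k and stride s, reads P(k / s) as pooling with window k and stride s, and assumes no padding anywhere; none of these conventions is spelled out on the slides. The 64-pixel input width and the 16 x 16 x 3 output come from the slides.

```python
def window_out(size, kernel, stride):
    """Output width of a convolution or pooling window, assuming no padding."""
    return (size - kernel) // stride + 1

# (name, kernel, stride, output channels); pooling keeps the channel count (None).
LAYERS = [
    ("C(64, 9x9/2)", 9, 2, 64),
    ("P(2/1)", 2, 1, None),
    ("C(128, 7x7/1)", 7, 1, 128),
    ("C(128, 5x5/1)", 5, 1, 128),
]

def trace_shapes(input_size=64, input_channels=3):
    """Return (layer name, width, channels) after each layer."""
    size, channels = input_size, input_channels
    shapes = [("input", size, channels)]
    for name, kernel, stride, out_channels in LAYERS:
        size = window_out(size, kernel, stride)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, size, channels))
    return shapes

shapes = trace_shapes()
for name, size, channels in shapes:
    print(f"{name:14s} -> {size} x {size} x {channels}")

# Number of values flattened into FC(4096) under the no-padding assumption.
flattened = shapes[-1][1] ** 2 * shapes[-1][2]
# FC(768) matches a 16 x 16 label patch with 3 channels.
output_units = 16 * 16 * 3
print(flattened, output_units)
```

Under these assumptions the last fully connected layer has exactly as many units as a 16 x 16 x 3 label patch, which is consistent with the predicted map patch described above.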
Using context
• The patch-based framework allows the network to use the surrounding pixels to predict the labels in the center patch.
[Figure: an aerial image patch s of size w_s x w_s and the predicted label patch m̃ of size w_m x w_m at its center. Each label pixel m̃_i is a 1-of-3 vector (m̃_i1, m̃_i2, m̃_i3) over Road, Building, and Otherwise.]
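To make the context idea concrete, the following sketch enumerates input windows and the label windows at their centers. It is a hypothetical illustration, not the authors' sampling code: the 64-pixel input and 16-pixel label sizes and the 1500-pixel image width come from the slides, while the regular grid with a stride equal to the label size and the function name `patch_windows` are assumptions.

```python
def patch_windows(image_size=1500, input_size=64, label_size=16):
    """Yield (input_box, label_box) pairs as (top, left, bottom, right).

    Assumption: windows lie on a regular grid whose stride equals the label
    patch size, so neighbouring label patches touch without overlapping.
    """
    margin = (input_size - label_size) // 2  # context pixels on each side
    last = image_size - input_size           # last valid top/left coordinate
    for top in range(0, last + 1, label_size):
        for left in range(0, last + 1, label_size):
            input_box = (top, left, top + input_size, left + input_size)
            label_box = (top + margin, left + margin,
                         top + margin + label_size, left + margin + label_size)
            yield input_box, label_box

windows = list(patch_windows())
print(len(windows), windows[0])
```

Each 16 x 16 label window is surrounded by 24 pixels of context on every side, which is the extra information the network can use when predicting the center labels.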
$$p(\tilde{m} \mid s) = \prod_{i=1}^{w_m^2} p(\tilde{m}_i \mid s)$$
We learn this distribution with a CNN.
Loss function
• Each pixel in the predicted label image
- is assumed to be independent of the others
- always belongs to exactly one of the 3 labels (building, road, others)
• Example for one pixel i: the network output (1.56, 4.37, 3.11) is passed through a softmax to give the predicted label \hat{m}_i = (0.05, 0.74, 0.21), which is compared with the correct label \tilde{m}_i = (1, 0, 0).
Asymmetric cross entropy
• c: channel (Building, Road, Others)
• P: correct label distribution (1-of-3 coding)
• Q: predicted label distribution
• w_c: weight for each channel loss
- w_building = 1.5, w_road = 1.5, w_others = 0*
• We simply minimize this cross entropy by Stochastic Gradient Descent.
* because the prediction loss in the others channel is not important
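The per-pixel example can be worked through in a few lines. This is a sketch under an assumption: the formula itself did not survive in this transcript, so the code uses one plausible form of a channel-weighted cross entropy, -sum_c w_c P(c) log Q(c), built from the symbols defined on the slide. The weights and the example vectors are taken from the slide; the function names are hypothetical.

```python
import math

CHANNELS = ("building", "road", "others")
WEIGHTS = {"building": 1.5, "road": 1.5, "others": 0.0}  # values from the slide

def softmax(logits):
    """Turn raw network outputs into a distribution over the 3 channels."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_cross_entropy(p, q, weights=WEIGHTS):
    """Assumed form for one pixel: -sum_c w_c * P(c) * log Q(c)."""
    return -sum(weights[c] * p_c * math.log(q_c)
                for c, p_c, q_c in zip(CHANNELS, p, q))

q = softmax([1.56, 4.37, 3.11])              # predicted label distribution Q
loss = weighted_cross_entropy([1, 0, 0], q)  # correct label P in 1-of-3 coding
print([round(v, 2) for v in q], round(loss, 2))
```

The softmax output approximately reproduces the predicted label shown on the slide. Under the assumed form, a pixel whose correct label is "others" contributes nothing to the loss because w_others = 0.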
Dataset
[Figure: aerial image; building label + road label → 3-channel label]
• We combine Volodymyr Mnih's road and building detection datasets* to create our 3-channel map dataset.
• Our dataset contains 147 sets of aerial images and 3-channel label images.
- 137 sets for training
- 10 sets for testing
• Each image is 1500 x 1500 pixels at a resolution of 1 m^2/pixel.
• The entire dataset covers roughly 340 km^2 of mainly urban regions in Massachusetts, the United States.
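Combining the two binary label images into one 3-channel label could look like the sketch below. It is a minimal illustration rather than the authors' dataset script: the channel order (R: road, G: building, B: others) follows the Goal slide, while the rule that a building label wins when both masks are set, and the function name `combine_labels`, are assumptions.

```python
def combine_labels(building, road):
    """Merge two binary masks into (R: road, G: building, B: others) pixels.

    Assumption: where both masks are set, the building label takes priority.
    """
    combined = []
    for building_row, road_row in zip(building, road):
        row = []
        for is_building, is_road in zip(building_row, road_row):
            if is_building:
                row.append((0, 1, 0))
            elif is_road:
                row.append((1, 0, 0))
            else:
                row.append((0, 0, 1))
        combined.append(row)
    return combined

# Tiny hypothetical masks (1 = labelled, 0 = not labelled).
building = [[1, 0, 0],
            [1, 0, 0]]
road = [[0, 1, 0],
        [1, 1, 0]]
label = combine_labels(building, road)
print(label)
```

Every output pixel is a 1-of-3 vector, which matches the assumption in the loss function that each pixel belongs to exactly one label.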
* http://www.cs.toronto.edu/~vmnih/data/
Experiment
• Training with 137 images and labels
• Testing with 10 images and labels
• We test some variants of the basic architecture.
Basic architecture
C(64, 9x9/2) - P(2/1) - C(128, 7x7/1) - C(128, 5x5/1) - FC(4096) - FC(768)
Tested architectures
Name            Activation  Dropout rate  Filter size
S-ReLU (Basic)  ReLU        N/A           9-7-5
S-ReLU-Dropout  ReLU        0.5           9-7-5
S-Maxout        Maxout      0.5           9-7-5
ReLU            ReLU        N/A           16-4-3
ReLU-Dropout    ReLU        0.5           16-4-3
Maxout          Maxout      0.5           16-4-3
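For readers unfamiliar with the two activations compared in these variants, here is a minimal sketch of how a ReLU unit and a Maxout unit differ. The parameters and the choice of k = 2 linear pieces are hypothetical; the slides do not state how many pieces the Maxout variants use.

```python
def relu(pre_activation):
    """ReLU: keep positive values, clamp negative values to zero."""
    return max(0.0, pre_activation)

def maxout(pre_activations):
    """Maxout: the unit outputs the largest of its k linear pieces."""
    return max(pre_activations)

def linear(weights, bias, inputs):
    """One linear piece: weighted sum of the inputs plus a bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

inputs = [0.5, -1.0]
# Hypothetical parameters for one unit with k = 2 linear pieces.
pieces = [([1.0, 2.0], 0.0), ([-1.0, 0.5], 0.25)]
pre = [linear(weights, bias, inputs) for weights, bias in pieces]

relu_out = relu(pre[0])   # ReLU sees a single linear piece
maxout_out = maxout(pre)  # Maxout picks the best of all pieces
print(pre, relu_out, maxout_out)
```

Unlike ReLU, a Maxout unit can output negative values, and it needs k times as many parameters per unit.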
Example of Test Results
[Figures: two examples, each showing an input aerial image and the 3-channel label image predicted by the basic architecture]
Precision-recall curve
[Figure: road channel result compared with Volodymyr Mnih's result]
[Figure: building channel result compared with Volodymyr Mnih's result]
Precision at breakeven point
Name            Road    Building
S-ReLU (Basic)  0.8905  0.9241
S-ReLU-Dropout  0.8889  0.9220
S-Maxout        0.8842  0.9185
ReLU            0.8657  0.8984
ReLU-Dropout    0.8650  0.8973
Maxout          0.8548  0.8940
Volodymyr Mnih  0.8873  0.9150
• Our basic architecture achieved the best results.
• Using Maxout, Dropout, or both does not seem to improve performance.
• The architectures with the smaller filter sizes (9-7-5) perform better than those with the bigger filters (16-4-3).
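The breakeven point is the point on the precision-recall curve where precision equals recall. The sketch below shows one way to locate it by sweeping a threshold over per-pixel scores. It uses the plain pixel-wise definitions of precision and recall, which is an assumption, since the slides do not say exactly how the two quantities are computed; the scores and labels are made-up toy data.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when pixels with score >= threshold are positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def breakeven(scores, labels):
    """Return (precision, recall, threshold) where |precision - recall| is smallest."""
    best = None
    for threshold in sorted(set(scores)):
        precision, recall = precision_recall(scores, labels, threshold)
        gap = abs(precision - recall)
        if best is None or gap < best[0]:
            best = (gap, precision, recall, threshold)
    return best[1:]

# Hypothetical per-pixel scores for one channel and their ground-truth labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
precision, recall, threshold = breakeven(scores, labels)
print(precision, recall, threshold)
```

On real data precision and recall rarely coincide exactly at any sampled threshold, so picking the threshold with the smallest gap is a simple approximation.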
Conclusion
• We propose a CNN-based building and road extraction method for aerial
imagery.
• Our method doesn't need hand-designed image features because good feature extractors are constructed automatically by training the CNN.
• Our CNN predicts building and road regions simultaneously with state-of-the-art accuracy.
Thank you for your kind attention.
All the code to generate our dataset, train the CNN, and test the resulting models is available on GitHub.
https://github.com/mitmul/ssai

Building and road detection from large aerial imagery

  • 1. Building and road detection from large aerial imagery Shunta SAITO*, Yoshimitsu AOKI* * Graduate School of Science and Technology, Keio University, Japan
  • 2. Motivation
      • Understanding aerial imagery is in high demand for generating maps, assessing the scale of disasters, detecting changes for estate management, etc.
      • It has usually been done by human experts, so it is both slow and costly.
      • The remote sensing community has focused on this task, but it is still difficult to detect terrestrial objects automatically from aerial images with high accuracy.
    Goal
      • Input: aerial image (RGB).
      • Output: 3-channel map (R: road, G: building, B: others).
      • Account for the trade-off between different objects at the same pixel.
  • 3. Previous Works: Senaras et al., Building detection with decision fusion, 2013
      • Process flow (figure labels): input aerial image (Infrared, RGB); mean shift segmentation into segments; extract 15 different features; one classifier per feature; combine the multiple classification results.
      • Derived images and masks (figure labels): Infrared-Red image, Infrared-Red-Green image, Hue-Saturation-Intensity (HSI) image, Normalized Difference Vegetation Index (NDVI) image, vegetation mask, shadow mask.
      • Result (figure labels): input, predicted labels, ground truth.
  • 4. Previous Works: Volodymyr Mnih, Machine Learning for Aerial Image Labeling, 2013
      • The work takes a patch-based approach, which is well suited to a Convolutional Neural Network (CNN).
      • It formulates the problem as learning a mapping from an aerial image patch to a label image patch.
      • However, it trains two CNNs, one for buildings and another for roads, even though there may be trade-offs between them.
      • Process flow (figure labels): aerial imagery, patches, CNN, noise model, predicted label.
      • Dataset (figure labels): aerial images with building labels (x 151) and road labels (x 1109).
  • 5. Our Approach
      • We train a Convolutional Neural Network (CNN) as a mapping from an input aerial image patch to a 3-channel label image patch using stochastic gradient descent.
      • Training loop (figure labels): 64 x 64 x 3 (RGB) patches go into the CNN; it produces 16 x 16 x 3 (building, road, other) predictions; the loss is calculated against the correct answers; the error is backpropagated.
      • Architecture (input aerial image patch to predicted map patch): C(64, 9x9/2), P(2/1), C(128, 7x7/1), C(128, 5x5/1), FC(4096), FC(768)
      • The CNN takes a small RGB image patch as input and outputs a predicted 3-channel label patch.
      • The predicted label patch consists of a road channel, a building channel, and an others channel.
      • No pre-processing such as segmentation is needed, and we do not need to design any image features. The CNN learns good feature extractors automatically.
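  • 6. Using context
      • The patch-based framework allows the network to use surrounding pixels to predict the labels in the center patch.
      • Notation (figure labels): $s$ is the aerial image patch of size $w_s \times w_s$; $\tilde{m}$ is the predicted label patch of size $w_m \times w_m$; for pixel $i$, $\tilde{m}_i$ is a one-hot 3-vector with components $\tilde{m}_{i1}, \tilde{m}_{i2}, \tilde{m}_{i3}$ indicating road, building, or otherwise.
      • $p(\tilde{m} \mid s) = \prod_{i=1}^{w_m^2} p(\tilde{m}_i \mid s)$
      • We learn this distribution with a CNN.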
  • 7. Loss function
      • Each pixel in the predicted label image is assumed to be independent of the others.
      • Each pixel always belongs to exactly one of the 3 labels (building, road, others).
      • Example: the predicted label $\hat{m}_i$ = (1.56, 4.37, 3.11) becomes (0.05, 0.74, 0.21) after softmax; the correct label $\tilde{m}_i$ is (1, 0, 0).
      • Asymmetric cross entropy, where c: channel (Building, Road, Others); P: correct label distribution (1-of-3 coding); Q: predicted label distribution; w_c: loss weight for each channel.
      • We simply minimize this cross entropy by Stochastic Gradient Descent.
      • w_building = 1.5, w_road = 1.5, w_others = 0 (*because prediction loss in the others channel is not important).
  • 8. Dataset
      • Building label + road label → 3-channel label (figure: aerial images paired with 3-channel labels).
      • We combine Volodymyr Mnih's road and building detection datasets* to create our 3-channel map dataset.
      • Our dataset contains 147 sets of aerial images and 3-channel label images: 137 sets for training and 10 sets for testing.
      • Each image is 1500 x 1500 pixels at a resolution of 1 m^2 per pixel.
      • The entire dataset covers roughly 340 km^2 of mainly urban regions in Massachusetts, the United States.
      * http://www.cs.toronto.edu/~vmnih/data/
  • 9. Experiment
      • Training with 137 images and labels.
      • Testing with 10 images and labels.
      • We test some variants of the basic architecture.
    Basic architecture (input aerial image patch to predicted map patch): C(64, 9x9/2), P(2/1), C(128, 7x7/1), C(128, 5x5/1), FC(4096), FC(768)
    Tested architectures:
      Name             Activation  Dropout rate  Filter size
      S-ReLU (Basic)   ReLU        N/A           9-7-5
      S-ReLU-Dropout   ReLU        0.5           9-7-5
      S-Maxout         Maxout      0.5           9-7-5
      ReLU             ReLU        N/A           16-4-3
      ReLU-Dropout     ReLU        0.5           16-4-3
      Maxout           Maxout      0.5           16-4-3
  • 10. Example of Test Results (figure): input aerial image and the predicted 3-channel label image from the basic architecture.
  • 11. Example of Test Results (figure): input aerial image and the predicted 3-channel label image from the basic architecture.
  • 12. Precision-recall curve (figure): road channel result compared with Volodymyr Mnih's result.
  • 13. Precision-recall curve (figure): building channel result compared with Volodymyr Mnih's result.
  • 14. Precision at breakeven point
      Model            Road    Building
      S-ReLU (Basic)   0.8905  0.9241
      S-ReLU-Dropout   0.8889  0.9220
      S-Maxout         0.8842  0.9185
      ReLU             0.8657  0.8984
      ReLU-Dropout     0.8650  0.8973
      Maxout           0.8548  0.8940
      Volodymyr        0.8873  0.9150
      • Our basic architecture achieved the best results.
      • Using Maxout, Dropout, or both does not seem to improve performance.
      • The architectures with the smaller filter size perform better than the ones with bigger filters.
    Conclusion
      • We propose a CNN-based building and road extraction method for aerial imagery.
      • Our method does not need hand-designed image features because good feature extractors are constructed automatically by training the CNN.
      • Our CNN predicts building and road regions simultaneously at state-of-the-art accuracy.
  • 15. Thank you for your kind attention. All code to generate our dataset, train the CNN, and test the resulting models is available on GitHub: https://github.com/mitmul/ssai
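
As a rough illustration of the output layer on slide 5: FC(768) matches the 16 x 16 x 3 label patch because 16 * 16 * 3 = 768. The sketch below assumes, since the slides do not say, that the 768 values are laid out pixel by pixel with the 3 channel values adjacent, and that softmax is applied per pixel as in the slide 7 example. The helper name `to_label_patch` is hypothetical and is not taken from the authors' code.

```python
import math

PATCH = 16      # label patch is 16 x 16 pixels (slide 5)
CHANNELS = 3    # building, road, others

def to_label_patch(fc_output):
    """Reshape a flat 768-vector into a 16 x 16 x 3 patch of per-pixel probabilities.

    Assumption: values are stored pixel by pixel, channels adjacent.
    """
    assert len(fc_output) == PATCH * PATCH * CHANNELS
    patch = []
    for row in range(PATCH):
        line = []
        for col in range(PATCH):
            start = (row * PATCH + col) * CHANNELS
            logits = fc_output[start:start + CHANNELS]
            # Numerically stable softmax over the 3 channels of this pixel.
            peak = max(logits)
            exps = [math.exp(v - peak) for v in logits]
            total = sum(exps)
            line.append([e / total for e in exps])
        patch.append(line)
    return patch

# All-zero activations give a uniform distribution at every pixel.
patch = to_label_patch([0.0] * 768)
```

Each `patch[row][col]` is then a 3-element probability vector that sums to 1, which is the form the loss on slide 7 expects.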
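
The formula on slide 7 did not survive in the transcript. One plausible reading, offered as an assumption rather than the authors' exact definition, is a channel-weighted cross entropy per pixel, $L_i = -\sum_c w_c P_c \log Q_c$, with Q obtained by softmax. The sketch below uses the numbers from slide 7 (channel order Building, Road, Others); the function names are illustrative only.

```python
import math

# Channel order and weights as listed on slide 7.
CHANNELS = ("building", "road", "others")
WEIGHTS = (1.5, 1.5, 0.0)

def softmax(logits):
    """Numerically stable softmax over one pixel's channel scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_cross_entropy(logits, one_hot, weights=WEIGHTS):
    """Assumed form of the loss: -sum_c w_c * P_c * log(Q_c)."""
    q = softmax(logits)
    return -sum(w * p * math.log(qc) for w, p, qc in zip(weights, one_hot, q))

# Slide 7 example: raw prediction (1.56, 4.37, 3.11), correct label (1, 0, 0).
q = softmax([1.56, 4.37, 3.11])
loss_building = weighted_cross_entropy([1.56, 4.37, 3.11], [1, 0, 0])
# With w_others = 0, a pixel whose correct label is "others" adds no loss.
loss_others = weighted_cross_entropy([1.56, 4.37, 3.11], [0, 0, 1])
```

Under this reading, setting w_others to 0 means only pixels labelled building or road drive the gradient, which is consistent with the slide's remark that the loss in the others channel is not important.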
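
The metric on slide 14, precision at the breakeven point, is the precision at the threshold where precision and recall are equal. The slides do not say how precision and recall were computed (for example, whether any spatial tolerance was used), so the sketch below is a plain per-pixel version on toy data, with hypothetical function names.

```python
def precision_recall(scores, labels, threshold):
    """Per-pixel precision and recall when predicting positive for score >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def breakeven(scores, labels):
    """Return (precision, recall, threshold) where |precision - recall| is smallest."""
    best = None
    for t in sorted(set(scores)):
        p, r = precision_recall(scores, labels, t)
        gap = abs(p - r)
        if best is None or gap < best[0]:
            best = (gap, p, r, t)
    return best[1], best[2], best[3]

# Toy data: predicted probabilities for one channel and the true binary labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
p, r, t = breakeven(scores, labels)
```

On this toy data the breakeven falls at the threshold where the number of predicted positives equals the number of true positives, which is the general condition for precision and recall to coincide.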