Generating Training Data from Noisy
Measurements
HAMED ALEMOHAMMAD
LEAD GEOSPATIAL DATA SCIENTIST
ML Hub Earth
 Machine Learning commons for EO
 Training data
 Models
 Standards and best practices
Global Land Cover Training Dataset
 Human-verified training dataset
 Using open-source Sentinel-2 imagery
 10 m spatial resolution.
 Global and geo-diverse
Workflow
[Workflow diagram; boxes: S2 L2A Reflectance, S2 L2A Classification, GlobeLand30 Labels (2010), Filtered Labels, Model, Class Predictions, Class Verification (Human), Training Data]
 Input Data:
 10 Sentinel-2 bands: Red, Green, Blue, Red-Edge1-3, NIR, Narrow NIR, SWIR1-2
 20 m bands resampled to 10 m using bicubic interpolation
 Reference/Label Data:
 GlobeLand30 labels for 2010 used as a source
 Classes mapped to REF Land Cover Taxonomy
 Labels re-gridded to Sentinel-2 grid using nearest neighbor
 Labels filtered by agreement with classes from Sentinel-2’s 20 m scene classification
(produced as part of atmospheric correction)
 Filtered labels used as reference labels for training
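The re-gridding and filtering steps above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the label codes and the compatibility table between scene-classification codes and label classes are hypothetical, and the 30 m, 20 m and 10 m grids are assumed to be perfectly aligned so that nearest-neighbor re-gridding reduces to repeating cells. The scene-classification codes 4 (vegetation), 5 (not vegetated) and 6 (water) follow the Sentinel-2 Level-2A product.

```python
import numpy as np

# Hypothetical label codes, for illustration only (not the REF taxonomy codes).
NODATA, WATER, WOODY, BARE = 0, 1, 2, 3

# Hypothetical compatibility table: scene-classification code -> label classes
# that count as "in agreement" with it. This is an assumption.
SCL_COMPATIBLE = {
    6: {WATER},  # SCL 6: water
    4: {WOODY},  # SCL 4: vegetation
    5: {BARE},   # SCL 5: not vegetated
}

def regrid_nearest(grid: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbor upsampling by an integer factor (aligned grids assumed)."""
    return np.repeat(np.repeat(grid, factor, axis=0), factor, axis=1)

def filter_labels(labels_10m: np.ndarray, scl_10m: np.ndarray) -> np.ndarray:
    """Keep a label only where the scene classification agrees with it."""
    keep = np.zeros(labels_10m.shape, dtype=bool)
    for scl_code, label_codes in SCL_COMPATIBLE.items():
        keep |= (scl_10m == scl_code) & np.isin(labels_10m, list(label_codes))
    return np.where(keep, labels_10m, NODATA)

# Toy scene: a 2x2 label grid at 30 m and a 3x3 scene classification at 20 m,
# both covering the same 60 m x 60 m area.
labels_30m = np.array([[WATER, WOODY],
                       [BARE, WATER]])
scl_20m = np.array([[6, 6, 4],
                    [6, 9, 4],   # 9: cloud, never agrees with a label
                    [5, 5, 6]])

labels_10m = regrid_nearest(labels_30m, 3)  # 30 m -> 10 m
scl_10m = regrid_nearest(scl_20m, 2)        # 20 m -> 10 m
filtered = filter_labels(labels_10m, scl_10m)
```

Pixels where the two sources disagree are set to `NODATA` and would simply be left out of the reference labels used for training.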
Methodology
 A pixel-based supervised Random Forests model trained for each scene.
 Pixels without valid reflectance are excluded from training.
 Training on class-stratified samples of half the pixels in a scene with one
Sentinel-2 pixel at 10 m for each label pixel at 30 m.
 Predictions are made on all pixels marked with usable classes during Level-2A
processing, including pixels labeled as unclassified.
 Annual labels will be generated by aggregating time series of predictions and
probabilities from the same tile throughout the year.
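As a rough sketch of the sampling step in the list above, the function below draws a class-stratified sample of half the usable pixels of a scene. It assumes the labels have already been flattened to one value per pixel and that a boolean mask marks pixels with valid reflectance and a usable label; the function name, the `fraction` parameter and the fixed seed are illustrative choices, not part of the original workflow. The sampled pixels would then be passed to whichever Random Forests implementation is used.

```python
import numpy as np

def stratified_half_sample(labels, valid, fraction=0.5, seed=0):
    """Return sorted flat indices of a class-stratified random sample.

    labels: 1-D integer array, one class label per pixel
    valid:  1-D boolean array, False where reflectance or label is unusable
    """
    rng = np.random.default_rng(seed)
    chosen = []
    for cls in np.unique(labels[valid]):
        idx = np.flatnonzero(valid & (labels == cls))
        # Take the same fraction of every class, but at least one pixel.
        n = max(1, int(round(fraction * idx.size)))
        chosen.append(rng.choice(idx, size=n, replace=False))
    return np.sort(np.concatenate(chosen))

# Toy scene: 8 pixels of class 1, 4 of class 2, 2 of class 3.
labels = np.array([1] * 8 + [2] * 4 + [3] * 2)
valid = np.ones(labels.size, dtype=bool)
valid[[0, 1]] = False  # two class-1 pixels without valid reflectance

sample = stratified_half_sample(labels, valid)
per_class = np.bincount(labels[sample], minlength=4)
```

Sampling within each class keeps rare classes represented in the training set in proportion to their share of the scene.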
Results
 88.75% average model accuracy across 4 diverse scenes.
 Some classes, like water and snow/ice, predicted with high accuracy and high
confidence across all scenes.
 Other classes, like wetland and (semi) natural vegetation, are subtler and were
expected to be more difficult to classify.
 Woody vegetation and cultivated vegetation were predicted relatively
accurately and not confused with each other, as a result of including 20 m red
edge bands, resampled to 10 m.
 Artificial bare ground tended to be predicted in regions left unclassified in the
reference data, and it also took over areas of natural bare ground and cultivated
vegetation. This suggests that traces of human activity lead pixels to be
classified as artificial bare ground in the off-vegetation season.
What about non-categorical variables?
 True value of categorical variables vs true value of continuous variables:
 Crop Yield
 Soil Moisture
 Temperature
 Precipitation
 All measurements of continuous variables are prone to uncertainty (noise and
bias).
 How to reduce/eliminate these uncertainties in training data?
[Figure: in-situ, model and satellite estimates shown as noisy and biased measurement systems around the truth. Slide courtesy of K. McColl]
Generating Training Dataset
 Triple collocation (TC) is a technique for estimating the unknown error standard
deviations (or RMSEs) of three mutually independent measurement systems,
without treating any one system as zero-error “truth”.
$$Q_{ij} \equiv \mathrm{Cov}(X_i, X_j), \qquad \sigma_{\varepsilon_i} = \sqrt{Q_{ii} - \frac{Q_{ij}\,Q_{ik}}{Q_{jk}}}$$
 TC-based RMSE estimates at each pixel are used to compute the a priori probability
($P_i$) of selecting a particular dataset:
$$P_i = \frac{1/\sigma_{\varepsilon_i}^2}{\sum_{j=1}^{3} 1/\sigma_{\varepsilon_j}^2}$$
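A minimal numerical sketch of the two formulas on this slide, assuming three collocated time series for a single pixel with mutually independent errors. The synthetic truth, the gains and offsets, and the noise levels (0.2, 0.3, 0.5) are made up for the demonstration; clipping small negative variance estimates to zero is an added safeguard, not something stated in the deck.

```python
import numpy as np

def triple_collocation(x1, x2, x3):
    """Estimate the error standard deviation of each of three systems."""
    q = np.cov(np.vstack([x1, x2, x3]))  # 3x3 covariance matrix Q
    var = np.array([
        q[0, 0] - q[0, 1] * q[0, 2] / q[1, 2],
        q[1, 1] - q[0, 1] * q[1, 2] / q[0, 2],
        q[2, 2] - q[0, 2] * q[1, 2] / q[0, 1],
    ])
    # Sampling noise can push an estimate slightly below zero; clip before sqrt.
    return np.sqrt(np.clip(var, 0.0, None))

def selection_probabilities(sigma):
    """P_i proportional to 1 / sigma_i^2, normalized to sum to one."""
    w = 1.0 / np.square(sigma)  # assumes every sigma is strictly positive
    return w / w.sum()

# Synthetic pixel: one unknown truth seen by three biased, noisy systems.
rng = np.random.default_rng(42)
truth = rng.normal(size=50_000)
x1 = truth + 0.2 * rng.normal(size=truth.size)
x2 = 0.5 + 0.8 * truth + 0.3 * rng.normal(size=truth.size)
x3 = -0.2 + 1.2 * truth + 0.5 * rng.normal(size=truth.size)

sigma = triple_collocation(x1, x2, x3)
p = selection_probabilities(sigma)
```

None of the three series is treated as the truth here: the error estimates come only from the pairwise covariances, and the system with the smallest estimated error gets the largest selection probability.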
Sample time series of a pixel
[Figure: table with columns $X_1$, $X_2$, $X_3$ and $X_T$ and rows $t_1$, $t_2$, $t_3$, ..., $t_N$]
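One way to read the figure is that, at each time step, one of the three datasets is drawn at random with probability $P_i$ and its value becomes the training target $X_T$. The sketch below illustrates that reading only; the function name, the toy values and the probabilities are hypothetical.

```python
import numpy as np

def sample_training_series(x, p, seed=0):
    """Pick one dataset per time step with selection probabilities p.

    x: array of shape (3, N); rows are X_1, X_2, X_3 for one pixel
    p: probabilities P_1..P_3, which must sum to one
    Returns the sampled series X_T and the index of the chosen dataset.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n_sets, n_times = x.shape
    choice = rng.choice(n_sets, size=n_times, p=p)
    return x[choice, np.arange(n_times)], choice

# Toy pixel with four time steps; the values make the source of each sample obvious.
x = np.array([[1.0, 2.0, 3.0, 4.0],
              [10.0, 20.0, 30.0, 40.0],
              [100.0, 200.0, 300.0, 400.0]])
x_t, choice = sample_training_series(x, p=[0.6, 0.3, 0.1])
```

Over many time steps the more reliable dataset contributes most of the targets, while the other two still appear occasionally.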
Backup Slides
Alemohammad, et al., Biogeosciences, 2017
Things to check
 Sentinel-2 L2A classes
 What are the usable classes there?
 Plot actual scene + artificial bare ground
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical InfrastructureVarsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
Varsha Sewlal- Cyber Attacks on Critical Critical Infrastructure
 
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 

Generating Training Data from Noisy Measurements

  • 1. Generating Training Data from Noisy Measurements. Hamed Alemohammad, Lead Geospatial Data Scientist
  • 2. ML Hub Earth
      • Machine Learning commons for EO
      • Training data
      • Models
      • Standards and best practices
  • 3. Global Land Cover Training Dataset
      • Human-verified training dataset
      • Using open-source Sentinel-2 imagery
      • 10 m spatial resolution
      • Global and geo-diverse
  • 4. Workflow [Diagram boxes: S2 L2A Reflectance; S2 L2A Classification; GlobeLand30 Labels (2010); Filtered Labels; Class Predictions; Class Verification (Human); Model Training]
  • 5. Data
      • Input Data:
          • 10 Sentinel-2 bands: Red, Green, Blue, Red-Edge 1-3, NIR, Narrow NIR, SWIR 1-2
          • 20 m bands resampled to 10 m using bicubic interpolation
      • Reference/Label Data:
          • GlobeLand30 labels for 2010 used as a source
          • Classes mapped to REF Land Cover Taxonomy
          • Labels re-gridded to the Sentinel-2 grid using nearest neighbor
          • Labels filtered by agreement with classes from Sentinel-2's 20 m scene classification (produced as part of atmospheric correction)
          • Filtered labels used as reference labels for training
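The label preparation on slide 5 (nearest-neighbor regridding, then filtering by agreement with the scene classification) could look roughly like the sketch below. It is an illustration under stated assumptions, not the pipeline used in the talk: the compatibility table `COMPATIBLE_SCL`, the label names, and both function names are hypothetical, the 30 m and 10 m grids are assumed to be perfectly aligned, and the scene classification is assumed to be already resampled to the 10 m grid. The integer codes follow the Sentinel-2 Level-2A scene classification (4 vegetation, 5 not vegetated, 6 water, 11 snow/ice).

```python
from typing import Dict, List, Optional, Set

Grid = List[List[Optional[str]]]

# Hypothetical table: which scene-classification (SCL) codes are taken to
# "agree" with each land cover label. The talk does not give this mapping.
COMPATIBLE_SCL: Dict[str, Set[int]] = {
    "water": {6},
    "snow_ice": {11},
    "woody_vegetation": {4},
    "natural_bare_ground": {5},
}


def regrid_nearest(labels_30m: Grid, factor: int = 3) -> Grid:
    """Nearest-neighbor regridding for aligned grids.

    Each coarse cell is repeated factor x factor times, so every 10 m
    pixel takes the label of the 30 m cell that contains it.
    """
    fine: Grid = []
    for row in labels_30m:
        fine_row = [cell for cell in row for _ in range(factor)]
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


def filter_labels(labels: Grid, scl: List[List[int]]) -> Grid:
    """Keep a label only where the SCL code is compatible with it.

    Pixels that disagree (or have no label) become None.
    """
    filtered: Grid = []
    for label_row, scl_row in zip(labels, scl):
        filtered.append([
            lab if lab is not None and code in COMPATIBLE_SCL.get(lab, set()) else None
            for lab, code in zip(label_row, scl_row)
        ])
    return filtered


# One 30 m row with two cells becomes a 3 x 6 block of 10 m pixels.
fine_labels = regrid_nearest([["water", "woody_vegetation"]])
scl_codes = [[6, 6, 4, 4, 4, 5] for _ in range(3)]
filtered_labels = filter_labels(fine_labels, scl_codes)
```

In this toy scene the third water pixel and the third woody-vegetation pixel of each row are dropped, because the scene classification reports vegetation and bare ground there instead.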
  • 7. Methodology
      • A pixel-based supervised Random Forests model is trained for each scene.
      • Pixels without valid reflectance are excluded from training.
      • Training uses class-stratified samples of half the pixels in a scene, with one Sentinel-2 pixel at 10 m for each label pixel at 30 m.
      • Predictions are made on all pixels marked with usable classes during Level-2A processing, including pixels labeled as unclassified.
      • Annual labels will be generated by aggregating time series of predictions and probabilities from the same tile throughout the year.
  • 8. Results
      • 88.75% average model accuracy across 4 diverse scenes.
      • Some classes, like water and snow/ice, are predicted with high accuracy and high confidence across all scenes.
      • Other classes, like wetland and (semi) natural vegetation, are subtler and were expected to be more difficult to classify.
      • Woody vegetation and cultivated vegetation were predicted relatively accurately and not confused with each other, as a result of including the 20 m red edge bands, resampled to 10 m.
      • Artificial bare ground tended to be predicted in regions that are unclassified in the reference data, taking over areas of natural bare ground and cultivated vegetation. This suggests that traces of human activity lead to pixels being classified as artificial bare ground in the off-vegetation season.
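Two steps from the methodology slide lend themselves to a small sketch: drawing a class-stratified sample of half the labeled pixels, and turning a year of per-date class probabilities into one annual label. The code below is only one possible reading. The function names are hypothetical, and the aggregation rule (largest summed probability) is an assumption, since the slide does not say how predictions and probabilities are combined. The Random Forests training itself is left out.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Optional, Sequence


def stratified_half_sample(
    labels: Sequence[Optional[Hashable]],
    fraction: float = 0.5,
    seed: int = 0,
) -> List[int]:
    """Return sorted pixel indices of a class-stratified sample.

    Roughly `fraction` of the pixels of every class are drawn (at least
    one per class); pixels whose label is None are never selected.
    """
    rng = random.Random(seed)
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, lab in enumerate(labels):
        if lab is not None:
            by_class[lab].append(idx)
    chosen: List[int] = []
    for idxs in by_class.values():
        k = max(1, round(len(idxs) * fraction))
        chosen.extend(rng.sample(idxs, k))
    return sorted(chosen)


def aggregate_annual(prob_series: Sequence[Dict[str, float]]) -> str:
    """Pick the class with the largest summed probability over all dates.

    Assumed rule for illustration; raises ValueError on an empty series.
    """
    totals: Dict[str, float] = defaultdict(float)
    for probs in prob_series:
        for cls, p in probs.items():
            totals[cls] += p
    return max(totals, key=lambda cls: totals[cls])


# Four water pixels, two wetland pixels, one pixel without a label.
train_idx = stratified_half_sample(["water"] * 4 + ["wetland"] * 2 + [None])

# Per-date class probabilities for one pixel across three acquisitions.
annual = aggregate_annual([
    {"water": 0.6, "wetland": 0.4},
    {"water": 0.3, "wetland": 0.7},
    {"water": 0.2, "wetland": 0.8},
])
```

Here `train_idx` holds two water pixels and one wetland pixel, and `annual` is `"wetland"` because its summed probability (1.9) exceeds that of water (1.1), even though water wins on the first date.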
  • 11. What about non-categorical variables?
      • True value of categorical variables vs. true value of continuous variables:
          • Crop yield
          • Soil moisture
          • Temperature
          • Precipitation
      • All measurements of continuous variables are prone to uncertainty (noise and bias).
      • How to reduce or eliminate these uncertainties in training data?
  • 12. Noisy and biased measurement systems [Figure: in-situ, model, and satellite measurement systems shown relative to the truth. Slide courtesy of K. McColl]
  • 13. Generating Training Dataset
      • Triple collocation (TC) is a technique for estimating the unknown error standard deviations (or RMSEs) of three mutually independent measurement systems, without treating any one system as zero-error "truth":
        $$Q_{ij} \equiv \mathrm{Cov}(X_i, X_j), \qquad \sigma_{\varepsilon_i} = \sqrt{Q_{ii} - \frac{Q_{ij}\, Q_{ik}}{Q_{jk}}}$$
      • TC-based RMSE estimates at each pixel are used to compute the a priori probability ($P_i$) of selecting a particular dataset:
        $$P_i = \frac{1/\sigma_{\varepsilon_i}^{2}}{\sum_{j=1}^{3} 1/\sigma_{\varepsilon_j}^{2}}$$
  • 14. Sample time series of a pixel [Figure: time series $X_1$, $X_2$, $X_3$, and $X_T$ at times $t_1, t_2, t_3, \dots, t_N$]
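The two formulas on slide 13 can be tried out on synthetic data with the sketch below. It is a minimal illustration, not the code behind the cited paper. All function names are hypothetical, `simulate_triplet` invents three measurement systems with known noise levels, and `draw_training_series` is only one plausible reading of "probability of selecting a particular dataset" (at each time step, one of the three series is picked at random with probability $P_i$). The estimator relies on the usual TC assumptions: each system is linearly related to the truth, and its errors are uncorrelated with the truth and with the errors of the other two systems.

```python
import math
import random
from typing import List, Sequence, Tuple


def cov(a: Sequence[float], b: Sequence[float]) -> float:
    """Sample covariance Q_ij of two equally long series."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def tc_error_std(x1, x2, x3) -> Tuple[float, float, float]:
    """Triple collocation error standard deviations of three systems."""
    q11, q22, q33 = cov(x1, x1), cov(x2, x2), cov(x3, x3)
    q12, q13, q23 = cov(x1, x2), cov(x1, x3), cov(x2, x3)
    variances = (
        q11 - q12 * q13 / q23,
        q22 - q12 * q23 / q13,
        q33 - q13 * q23 / q12,
    )
    # Sampling noise can push an estimate slightly below zero; clip it.
    s1, s2, s3 = (math.sqrt(max(v, 0.0)) for v in variances)
    return s1, s2, s3


def selection_probabilities(sigmas: Sequence[float]) -> List[float]:
    """P_i proportional to 1 / sigma_i^2 (assumes every sigma_i > 0)."""
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(weights)
    return [w / total for w in weights]


def draw_training_series(series, probs, seed: int = 0) -> List[float]:
    """Assumed sampling rule: pick one system per time step with weight P_i."""
    rng = random.Random(seed)
    picks = rng.choices(range(len(series)), weights=probs, k=len(series[0]))
    return [series[i][t] for t, i in enumerate(picks)]


def simulate_triplet(n: int, sigmas=(0.2, 0.3, 0.5), seed: int = 0):
    """Hypothetical test data: three noisy, biased views of one truth."""
    rng = random.Random(seed)
    truth = [rng.gauss(0.0, 1.0) for _ in range(n)]
    gains, offsets = (1.0, 0.8, 1.2), (0.0, 0.2, -0.1)
    return [
        [g * t + o + rng.gauss(0.0, s) for t in truth]
        for g, o, s in zip(gains, offsets, sigmas)
    ]


x1, x2, x3 = simulate_triplet(20000)
sigmas = tc_error_std(x1, x2, x3)          # close to (0.2, 0.3, 0.5)
probs = selection_probabilities(sigmas)    # largest weight on system 1
x_train = draw_training_series([x1, x2, x3], probs)
```

Note that the gains and offsets in `simulate_triplet` differ between systems, yet the error estimates still recover each system's own noise level, which is the reason no system has to be treated as the truth.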
  • 18. Alemohammad, et al., Biogeosciences, 2017
  • 19. Alemohammad, et al., Biogeosciences, 2017
  • 20. Things to check
      • Sentinel-2 L2A classes
      • What are the usable classes there?
      • Plot actual scene + artificial bare ground