SESIP-0720-JL
Using Apache Drill and Unidata TDS* for NASA HDF-EOS on S3
ESIP 2020 Summer / HDF-EOS Workshop XXIII
This work was supported by NASA/GSFC under Raytheon Technologies contract number NNG15HZ39C.
This document does not contain technology or Technical Data controlled under either the U.S. International Traffic
in Arms Regulations or the U.S. Export Administration Regulations.
H. Joe Lee
EED-2 / The HDF Group / Software Engineer
hyoklee@hdfgroup.org
*THREDDS Data Server
Hierarchical Data Format-Earth Observing System
• HDF4
– HDF-EOS2
• HDF5
– HDF-EOS5
– netCDF-4
HDF-EOS on S3
• HDF4?
– No elegant solution other than GDAL*
– Not so elegant: h4mapwriter / s3fs
• HDF5?
– Many OK solutions exist
– HDF5 VFD** / HSDS*** / GDAL / Hyrax DMR++**** / etc.
– But “Just OK is not OK.”
* Geospatial Data Abstraction Library
** Virtual File Driver
*** Highly Scalable Data Service
**** Dataset Metadata Response
Apache Drill
• Supports a variety of storage: Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS, and local files.
• Data agility: query the raw data in situ.
• Table: in-memory shredded columnar representation for complex data
• BI Tools and REST API
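As an illustration of the REST API bullet, here is a minimal Python sketch that builds a query request for Drill's REST endpoint. It assumes a Drillbit whose web server listens on the default port 8047 on localhost; the host, the timeout, and the function names are hypothetical and not taken from the slides.

```python
import json
import urllib.request

# Assumption: a local Drillbit with its web server on the default port 8047.
DRILL_URL = "http://localhost:8047/query.json"


def build_drill_request(sql, url=DRILL_URL):
    """Build (but do not send) a POST request carrying one SQL query."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_drill_query(sql, url=DRILL_URL, timeout=60):
    """Send the query and return the decoded JSON reply (needs a running Drill)."""
    request = build_drill_request(sql, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Building the request separately from sending it makes the payload easy to inspect before any server is involved.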
Apache Drill 1.18 (beta)
• Collection of HDF5 files on S3
• ANSI SQL
• Geoprocessing?
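A query against one dataset inside an HDF5 file might look like the sketch below. It assumes Drill's HDF5 format plugin with its `defaultPath` table-function option, and an S3 storage plugin registered under the name `s3` with a `root` workspace. The plugin name, workspace, file name, and dataset path are all hypothetical.

```python
def hdf5_dataset_query(storage, file_path, dataset_path, limit=10):
    """Return a Drill SQL string that reads one dataset from an HDF5 file.

    storage      -- storage plugin and workspace, e.g. "s3.root" (assumed name)
    file_path    -- path of the HDF5 file inside that workspace
    dataset_path -- HDF5 path of the dataset, e.g. "/Geolocation/Latitude"
    """
    # Keep the sketch simple: refuse characters that would need SQL escaping.
    if "`" in file_path or "'" in dataset_path:
        raise ValueError("backticks and single quotes are not supported here")
    return (
        "SELECT * FROM table({}.`{}` (type => 'hdf5', defaultPath => '{}')) "
        "LIMIT {:d}".format(storage, file_path, dataset_path, int(limit))
    )


if __name__ == "__main__":
    print(hdf5_dataset_query("s3.root", "granule.h5", "/Geolocation/Latitude", 5))
```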
THREDDS Data Server 5.0 (beta)
It supports S3!
• both HDF4 and HDF5
• NcML?
• Catalog for collection of files?
netCDF-Java
• This is the core library.
• THREDDS / Panoply / IDV share it.
• toolsUI is a generic GUI tool based on netCDF-Java.
• Like GDAL, if netCDF-Java works with S3, the rest is trivial.
toolsUI - HDF4 on S3
Benchmark: TerraFusion on S3
• Test file size: 24G
• Format: HDF5/netCDF-4 CF
• One orbit of data from 5 sensors on Terra
• S3 access from EC2 (m4.xlarge)
Apache Drill fails after 7 minutes.
read on s3a://basicterrafusion/TERRA_BF_L1B_O53557_20100112014327_F000_V001.h5:
com.amazonaws.AbortedException:
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657)
org.apache.drill.exec.store.hdf5.HDF5BatchReader.convertInputStreamToFile(HDF5BatchReader.java:356)
TDS responds within 2 minutes.
Float32 /MOPITT/granule_20100112/Geolocation/Latitude[ntrack_1 = 46][nstare = 29][npixels = 4];
Float32 /MOPITT/granule_20100112/Geolocation/Longitude[ntrack_1 = 36][nstare = 29][npixels = 4];
Float64 /MOPITT/granule_20100112/Geolocation/Time[ntrack_1 = 436];
} s3-test/TERRA_BF_L1B_O53557_20100112014327_F000_V001.h5;
real 1m47.065s
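The output above has the shape of an OPeNDAP DDS response. A request of that kind could be timed with a sketch like the one below, which assumes the server exposes its OPeNDAP service under the conventional `/thredds/dodsC/` prefix. The server address and the dataset path within the catalog are hypothetical and depend on how the TDS catalog is configured.

```python
import time
import urllib.request
from urllib.parse import quote


def opendap_url(server, dataset_path, suffix="dds"):
    """Build an OPeNDAP request URL (DDS by default) for a TDS dataset."""
    base = server.rstrip("/") + "/thredds/dodsC/"
    return base + quote(dataset_path.lstrip("/")) + "." + suffix


def timed_fetch(url, timeout=600):
    """Fetch a URL and return (text, elapsed_seconds); needs a running server."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        text = response.read().decode("utf-8", errors="replace")
    return text, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical server and catalog path; nothing is fetched here.
    print(opendap_url("http://localhost:8080", "s3-test/example.h5"))
```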
h5ls responds in 2.5 minutes.
• HDF5 Virtual File Driver (VFD)
• --enable-ros3-vfd configuration option
It takes 2X longer (5 minutes) outside AWS.
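The read-only S3 (ros3) driver works on https:// object URLs, so an s3:// location has to be rewritten first. The sketch below does that rewrite and assembles an `h5ls` command line. It assumes an `h5ls` from an HDF5 build configured with `--enable-ros3-vfd`, a publicly readable object, and a virtual-hosted-style URL; the bucket, key, and region are hypothetical.

```python
from urllib.parse import quote, urlparse


def s3_to_https(s3_uri, region):
    """Rewrite s3://bucket/key as a virtual-hosted-style https:// URL."""
    parts = urlparse(s3_uri)
    key = parts.path.lstrip("/")
    if parts.scheme != "s3" or not parts.netloc or not key:
        raise ValueError("expected s3://bucket/key, got {!r}".format(s3_uri))
    return "https://{}.s3.{}.amazonaws.com/{}".format(parts.netloc, region, quote(key))


def h5ls_ros3_command(https_url):
    """Return the argument list for a recursive h5ls listing through ros3."""
    return ["h5ls", "--vfd=ros3", "-r", https_url]


if __name__ == "__main__":
    url = s3_to_https("s3://example-bucket/dir/key.h5", "us-west-2")
    # Pass the list to subprocess.run(...) on a machine with a ros3-enabled h5ls.
    print(" ".join(h5ls_ros3_command(url)))
```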
Role-based Access Control (RBAC)
Drill    THREDDS    H5 VFD
Always   Yes        No
• RBAC eliminates the need for access keys and tokens.
• Access with s3://bucket/key.h5 (no https://)
• S3 buckets and objects can be private.
THREDDS 5.0 is a Clear Winner
Based on our Benchmark Results.
• Performance is good.
• It supports HDF4.
• RBAC is supported.
• Existing netCDF-Java / OPeNDAP-based software works seamlessly.
However, Use Case Still Matters
There are many (read-only) solutions for HDF-EOS on S3:
• SQL user? Try Drill after sanitization.
– Good for a collection of HDF5 files with 2D grids.
– Use AWS Lambda (w/ CUMULUS) for sanitization.
• Java user? Try netCDF-Java.
• Python user? Try the GDAL /vsis3/ driver for HDF5 and /vsicurl/ for HDF4.
• OPeNDAP user? Try THREDDS 5.0 beta.
• HDF5 C/Fortran user? Try the HDF5 VFD.
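For the Python route, GDAL addresses remote files through virtual file system prefixes. The sketch below only builds the path strings GDAL expects and leaves the open call as a comment, since it needs a GDAL build with the HDF5 and HDF4 drivers. The bucket, key, URL, and dataset names are hypothetical.

```python
from urllib.parse import urlparse


def gdal_vsis3_path(s3_uri):
    """Turn s3://bucket/key into GDAL's /vsis3/bucket/key form."""
    parts = urlparse(s3_uri)
    key = parts.path.lstrip("/")
    if parts.scheme != "s3" or not parts.netloc or not key:
        raise ValueError("expected s3://bucket/key, got {!r}".format(s3_uri))
    return "/vsis3/{}/{}".format(parts.netloc, key)


def gdal_vsicurl_path(http_url):
    """Prefix an http(s) URL with /vsicurl/ for range-request access."""
    if not http_url.startswith(("http://", "https://")):
        raise ValueError("expected an http(s) URL, got {!r}".format(http_url))
    return "/vsicurl/" + http_url


def gdal_hdf5_subdataset(vsi_path, dataset_path):
    """Name one dataset inside an HDF5 file using GDAL's subdataset syntax."""
    return 'HDF5:"{}":/{}'.format(vsi_path, dataset_path)


if __name__ == "__main__":
    path = gdal_vsis3_path("s3://example-bucket/dir/key.h5")
    print(gdal_hdf5_subdataset(path, "/Geolocation/Latitude"))
    # With GDAL installed: from osgeo import gdal; ds = gdal.Open(<name above>)
```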