SlideShare a Scribd company logo
1 of 11
ESIP-0722 JG
Hyrax: Serving Data from S3
Summer ESIP 2022
This work was supported by NASA/GSFC under Raytheon Technologies contract number 80GSFC21CA001.
This document does not contain technology or Technical Data controlled under either the U.S. International Traffic
in Arms Regulations or the U.S. Export Administration Regulations.
James Gallagher
Software Engineer/NASA EED-3 contractor
jgallagher@opendap.org
ESIP-0722 JG
2
• Serve existing files that are stored on S3*
• There’s no need to alter the files – just
copy them to S3
• Works for HDF5** and netCDF4***
What you can do
* Simple Storage Service
** Hierarchical Data Format, version 5
*** Network Common Data Format, version 4
ESIP-0722 JG
3
• Where the server will run
• Where you will store the ancillary
metadata files the server needs
– These files are not the big data files
– They provide a road map to the interior
structure of those data files
• The URLs* of the data files you want to
serve
What you need to know
* Universal Resource Locators
ESIP-0722 JG
4
• Use the command line tool (get_dmrpp)
to build a DMR++* file
• Use get_dmrpp’s default setting and see
if the way it represents your data is
acceptable (try it and see)
• Customize the configuration if needed
• Write a script to process your collection of
files
How to make the ancillary files
*Dataset Metadata Response Plus Plus
ESIP-0722 JG
5
• Get the Hyrax server Docker container
• Start the container – it contains both the
Hyrax data server and the get_dmrpp
command
• Run get_dmrpp inside the container – a
recipe follows
To run get_dmrpp
ESIP-0722 JG
6
Go to the directory where you want to store the ancillary information
Start the docker container that has the Hyrax server and the get_dmrpp tool
docker run --hostname hyrax 
--publish 8080:8080 
--volume $(pwd):/usr/share/hyrax 
--env AWS_ACCESS_KEY_ID 
--env AWS_SECRET_ACCESS_KEY 
--name hyrax 
opendap/hyrax:snapshot
If your computer uses the new Apple Silicon processor (M1), add
--platform linux/amd64
The get_dmrpp Recipe
ESIP-0722 JG
7
Run any command in the Docker container
docker exec –interactive --tty hyrax /bin/bash
Run get_dmrpp in the Docker container
docker exec –interactive --tty hyrax 
/bin/bash -c “cd /usr/share/hyrax; get_dmrpp …”
Where
get_dmrpp 
-b . 
-u https://url.for.your/data/file.nc 
-o file.nc.dmrpp 
s3://bucket/objectname
get_dmrpp, continued
ESIP-0722 JG
8
Run any command in the Docker container
docker exec –interactive --tty hyrax /bin/bash
Run get_dmrpp in the Docker container
get_dmrpp 
-b . 
-u https://cloudydap.s3.amazonaws.com/samples/1A.GPM.GMI.COUNT2014v3.20160105-
S230545-E003816.010538.V03B.h5 
-o 1A.GPM.GMI.COUNT2014v3.20160105-S230545-E003816.010538.V03B.h5.dmrpp 
s3://cloudydap/samples/1A.GPM.GMI.COUNT2014v3.20160105-S230545-E003816.010538.V03B.h5
This will build a DMR++ document for the named granule in
cloudydap S3 bucket. The DMR++ will use the URL
https://cloudydap.s3... to read data values.
get_dmrpp, example
ESIP-0722 JG
9
• Look at the default output and decide how
it should change
• Look at the HDF5 Handler documentation
for Hyrax – the same options can be used
with get_dmrpp
• Put those optional ‘keys’ in a file and pass
name of that file to get_dmrpp using the
command’s -s option
How to customize the DMR++
ESIP-0722 JG
10
• How to build & deploy DMR++ files for Hyrax
• Customization options for HDF5
• If you’d like more information or a
demonstration, see me after the session
More information
Some useful Docker commands
docker rm -f $(docker ps -aq) # remove all containers
docker rmi -f $(docker images -q) # remove all images
ESIP-0722 JG
11
This work was supported by NASA/GSFC under
Raytheon Technologies contract number
80GSFC21CA001.

More Related Content

Similar to Hyrax: Serving Data from S3

02 Hadoop deployment and configuration
02 Hadoop deployment and configuration02 Hadoop deployment and configuration
02 Hadoop deployment and configurationSubhas Kumar Ghosh
 
Dok Talks #124 - Intro to Druid on Kubernetes
Dok Talks #124 - Intro to Druid on KubernetesDok Talks #124 - Intro to Druid on Kubernetes
Dok Talks #124 - Intro to Druid on KubernetesDoKC
 
HDFS tiered storage: mounting object stores in HDFS
HDFS tiered storage: mounting object stores in HDFSHDFS tiered storage: mounting object stores in HDFS
HDFS tiered storage: mounting object stores in HDFSDataWorks Summit
 
R Data Access from hdfs,spark,hive
R Data Access  from hdfs,spark,hiveR Data Access  from hdfs,spark,hive
R Data Access from hdfs,spark,hivearunkumar sadhasivam
 
State of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDataState of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDatainside-BigData.com
 
DataStax | DSE: Bring Your Own Spark (with Enterprise Security) (Artem Aliev)...
DataStax | DSE: Bring Your Own Spark (with Enterprise Security) (Artem Aliev)...DataStax | DSE: Bring Your Own Spark (with Enterprise Security) (Artem Aliev)...
DataStax | DSE: Bring Your Own Spark (with Enterprise Security) (Artem Aliev)...DataStax
 
Design and Research of Hadoop Distributed Cluster Based on Raspberry
Design and Research of Hadoop Distributed Cluster Based on RaspberryDesign and Research of Hadoop Distributed Cluster Based on Raspberry
Design and Research of Hadoop Distributed Cluster Based on RaspberryIJRESJOURNAL
 
Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Fra...
Dache: A Data Aware Caching for Big-Data Applications Usingthe MapReduce Fra...Dache: A Data Aware Caching for Big-Data Applications Usingthe MapReduce Fra...
Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Fra...Govt.Engineering college, Idukki
 
R the unsung hero of Big Data
R the unsung hero of Big DataR the unsung hero of Big Data
R the unsung hero of Big DataDhafer Malouche
 
Hadoop Architecture and HDFS
Hadoop Architecture and HDFSHadoop Architecture and HDFS
Hadoop Architecture and HDFSEdureka!
 
Graphing Nagios services with pnp4nagios
Graphing Nagios services with pnp4nagiosGraphing Nagios services with pnp4nagios
Graphing Nagios services with pnp4nagiosjasonholtzapple
 

Similar to Hyrax: Serving Data from S3 (20)

GR740 User day
GR740 User dayGR740 User day
GR740 User day
 
Spark Working Environment in Windows OS
Spark Working Environment in Windows OSSpark Working Environment in Windows OS
Spark Working Environment in Windows OS
 
02 Hadoop deployment and configuration
02 Hadoop deployment and configuration02 Hadoop deployment and configuration
02 Hadoop deployment and configuration
 
HDF5 OPeNDAP Handler Updates, and Performance Discussion
HDF5 OPeNDAP Handler Updates, and Performance DiscussionHDF5 OPeNDAP Handler Updates, and Performance Discussion
HDF5 OPeNDAP Handler Updates, and Performance Discussion
 
Dok Talks #124 - Intro to Druid on Kubernetes
Dok Talks #124 - Intro to Druid on KubernetesDok Talks #124 - Intro to Druid on Kubernetes
Dok Talks #124 - Intro to Druid on Kubernetes
 
HDFS tiered storage: mounting object stores in HDFS
HDFS tiered storage: mounting object stores in HDFSHDFS tiered storage: mounting object stores in HDFS
HDFS tiered storage: mounting object stores in HDFS
 
Big Data
Big DataBig Data
Big Data
 
R Data Access from hdfs,spark,hive
R Data Access  from hdfs,spark,hiveR Data Access  from hdfs,spark,hive
R Data Access from hdfs,spark,hive
 
State of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigDataState of Containers and the Convergence of HPC and BigData
State of Containers and the Convergence of HPC and BigData
 
Rhel6
Rhel6Rhel6
Rhel6
 
DataStax | DSE: Bring Your Own Spark (with Enterprise Security) (Artem Aliev)...
DataStax | DSE: Bring Your Own Spark (with Enterprise Security) (Artem Aliev)...DataStax | DSE: Bring Your Own Spark (with Enterprise Security) (Artem Aliev)...
DataStax | DSE: Bring Your Own Spark (with Enterprise Security) (Artem Aliev)...
 
Docker.io
Docker.ioDocker.io
Docker.io
 
Exp-3.pptx
Exp-3.pptxExp-3.pptx
Exp-3.pptx
 
Accessing Cloud Data and Services Using EDL, Pydap, MATLAB
Accessing Cloud Data and Services Using EDL, Pydap, MATLABAccessing Cloud Data and Services Using EDL, Pydap, MATLAB
Accessing Cloud Data and Services Using EDL, Pydap, MATLAB
 
Hadoop
HadoopHadoop
Hadoop
 
Design and Research of Hadoop Distributed Cluster Based on Raspberry
Design and Research of Hadoop Distributed Cluster Based on RaspberryDesign and Research of Hadoop Distributed Cluster Based on Raspberry
Design and Research of Hadoop Distributed Cluster Based on Raspberry
 
Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Fra...
Dache: A Data Aware Caching for Big-Data Applications Usingthe MapReduce Fra...Dache: A Data Aware Caching for Big-Data Applications Usingthe MapReduce Fra...
Dache: A Data Aware Caching for Big-Data Applications Using the MapReduce Fra...
 
R the unsung hero of Big Data
R the unsung hero of Big DataR the unsung hero of Big Data
R the unsung hero of Big Data
 
Hadoop Architecture and HDFS
Hadoop Architecture and HDFSHadoop Architecture and HDFS
Hadoop Architecture and HDFS
 
Graphing Nagios services with pnp4nagios
Graphing Nagios services with pnp4nagiosGraphing Nagios services with pnp4nagios
Graphing Nagios services with pnp4nagios
 

More from The HDF-EOS Tools and Information Center

STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...The HDF-EOS Tools and Information Center
 

More from The HDF-EOS Tools and Information Center (20)

Cloud-Optimized HDF5 Files
Cloud-Optimized HDF5 FilesCloud-Optimized HDF5 Files
Cloud-Optimized HDF5 Files
 
Accessing HDF5 data in the cloud with HSDS
Accessing HDF5 data in the cloud with HSDSAccessing HDF5 data in the cloud with HSDS
Accessing HDF5 data in the cloud with HSDS
 
The State of HDF
The State of HDFThe State of HDF
The State of HDF
 
Highly Scalable Data Service (HSDS) Performance Features
Highly Scalable Data Service (HSDS) Performance FeaturesHighly Scalable Data Service (HSDS) Performance Features
Highly Scalable Data Service (HSDS) Performance Features
 
Creating Cloud-Optimized HDF5 Files
Creating Cloud-Optimized HDF5 FilesCreating Cloud-Optimized HDF5 Files
Creating Cloud-Optimized HDF5 Files
 
HDF - Current status and Future Directions
HDF - Current status and Future DirectionsHDF - Current status and Future Directions
HDF - Current status and Future Directions
 
HDFEOS.org User Analsys, Updates, and Future
HDFEOS.org User Analsys, Updates, and FutureHDFEOS.org User Analsys, Updates, and Future
HDFEOS.org User Analsys, Updates, and Future
 
HDF - Current status and Future Directions
HDF - Current status and Future Directions HDF - Current status and Future Directions
HDF - Current status and Future Directions
 
H5Coro: The Cloud-Optimized Read-Only Library
H5Coro: The Cloud-Optimized Read-Only LibraryH5Coro: The Cloud-Optimized Read-Only Library
H5Coro: The Cloud-Optimized Read-Only Library
 
MATLAB Modernization on HDF5 1.10
MATLAB Modernization on HDF5 1.10MATLAB Modernization on HDF5 1.10
MATLAB Modernization on HDF5 1.10
 
HDF for the Cloud - Serverless HDF
HDF for the Cloud - Serverless HDFHDF for the Cloud - Serverless HDF
HDF for the Cloud - Serverless HDF
 
HDF5 <-> Zarr
HDF5 <-> ZarrHDF5 <-> Zarr
HDF5 <-> Zarr
 
HDF for the Cloud - New HDF Server Features
HDF for the Cloud - New HDF Server FeaturesHDF for the Cloud - New HDF Server Features
HDF for the Cloud - New HDF Server Features
 
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
 
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
 
HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?
 
HDF5 Roadmap 2019-2020
HDF5 Roadmap 2019-2020HDF5 Roadmap 2019-2020
HDF5 Roadmap 2019-2020
 
Leveraging the Cloud for HDF Software Testing
Leveraging the Cloud for HDF Software TestingLeveraging the Cloud for HDF Software Testing
Leveraging the Cloud for HDF Software Testing
 
Google Colaboratory for HDF-EOS
Google Colaboratory for HDF-EOSGoogle Colaboratory for HDF-EOS
Google Colaboratory for HDF-EOS
 
Parallel Computing with HDF Server
Parallel Computing with HDF ServerParallel Computing with HDF Server
Parallel Computing with HDF Server
 

Recently uploaded

Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPathCommunity
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 

Recently uploaded (20)

Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 

Hyrax: Serving Data from S3

  • 1. ESIP-0722 JG Hyrax: Serving Data from S3 Summer ESIP 2022 This work was supported by NASA/GSFC under Raytheon Technologies contract number 80GSFC21CA001. This document does not contain technology or Technical Data controlled under either the U.S. International Traffic in Arms Regulations or the U.S. Export Administration Regulations. James Gallagher Software Engineer/NASA EED-3 contractor jgallagher@opendap.org
  • 2. ESIP-0722 JG 2 • Serve existing files that are stored on S3* • There’s no need to alter the files – just copy them to S3 • Works for HDF5** and netCDF4*** What you can do * Simple Storage Service ** Hierarchical Data Format, version 5 *** Network Common Data Format, version 4
  • 3. ESIP-0722 JG 3 • Where the server will run • Where you will store the ancillary metadata files the server needs – These files are not the big data files – They provide a road map to the interior structure of those data files • The URLs* of the data files you want to serve What you need to know * Universal Resource Locators
  • 4. ESIP-0722 JG 4 • Use the command line tool (get_dmrpp) to build a DMR++* file • Use get_dmrpp’s default setting and see if the way it represents your data is acceptable (try it and see) • Customize the configuration if needed • Write a script to process your collection of files How to make the ancillary files *Dataset Metadata Response Plus Plus
  • 5. ESIP-0722 JG 5 • Get the Hyrax server Docker container • Start the container – it contains both the Hyrax data server and the get_dmrpp command • Run get_dmrpp inside the container – a recipe follows To run get_dmrpp
  • 6. ESIP-0722 JG 6 Go to the directory where you want to store the ancillary information Start the docker container that has the Hyrax server and the get_dmrpp tool docker run --hostname hyrax --publish 8080:8080 --volume $(pwd):/usr/share/hyrax --env AWS_ACCESS_KEY_ID --env AWS_SECRET_ACCESS_KEY --name hyrax opendap/hyrax:snapshot If your computer uses the new Apple Silicon processor (M1), add --platform linux/amd64 The get_dmrpp Recipe
  • 7. ESIP-0722 JG 7 Run any command in the Docker container docker exec –interactive --tty hyrax /bin/bash Run get_dmrpp in the Docker container docker exec –interactive --tty hyrax /bin/bash -c “cd /usr/share/hyrax; get_dmrpp …” Where get_dmrpp -b . -u https://url.for.your/data/file.nc -o file.nc.dmrpp s3://bucket/objectname get_dmrpp, continued
  • 8. ESIP-0722 JG 8 Run any command in the Docker container docker exec –interactive --tty hyrax /bin/bash Run get_dmrpp in the Docker container get_dmrpp -b . -u https://cloudydap.s3.amazonaws.com/samples/1A.GPM.GMI.COUNT2014v3.20160105- S230545-E003816.010538.V03B.h5 -o 1A.GPM.GMI.COUNT2014v3.20160105-S230545-E003816.010538.V03B.h5.dmrpp s3://cloudydap/samples/1A.GPM.GMI.COUNT2014v3.20160105-S230545-E003816.010538.V03B.h5 This will build a DMR++ document for the named granule in cloudydap S3 bucket. The DMR++ will use the URL https://cloudydap.s3... to read data values. get_dmrpp, example
  • 9. ESIP-0722 JG 9 • Look at the default output and decide how it should change • Look at the HDF5 Handler documentation for Hyrax – the same options can be used with get_dmrpp • Put those optional ‘keys’ in a file and pass name of that file to get_dmrpp using the command’s -s option How to customize the DMR++
  • 10. ESIP-0722 JG 10 • How to build & deploy DMR++ files for Hyrax • Customization options for HDF5 • If you’d like more information or a demonstration, see me after the session More information Some useful Docker commands docker rm -f $(docker ps -aq) # remove all containers docker rmi -f $(docker images -q) # remove all images
  • 11. ESIP-0722 JG 11 This work was supported by NASA/GSFC under Raytheon Technologies contract number 80GSFC21CA001.