This paper describes a cloud computing platform called CryoCloud that is designed to enable open and collaborative cryosphere science. CryoCloud provides a simple and cost-effective managed cloud environment for training users in cloud workflows and determining best
1. Award #1928393
Twila Moon (1), Trey Stafford (1), Matt Fisher (1), Alyse Thurber (2)
(1) National Snow & Ice Data Center, CIRES, University of Colorado Boulder
(2) Education & Outreach, Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder
QGreenland
2. What is QGreenland?
• All-in-one Greenland-focused GIS environment for offline and online use with QGIS (free & open source software)
• Curated interdisciplinary data package for research, learning, decision making, and collaboration
3. How to use it?
Tutorials + How Tos
Research, Analysis,
Visualization
Field Planning
Education
6. Thank you!
Want to stay up-to-date? Sign up for our newsletter at...
QGreenland.org
github.com/nsidc/qgreenland/
Contact: qgreenland.info@gmail.com
7. Accelerating discovery for (NASA) Cryosphere communities with open cloud infrastructure
Tasha Snow (1), Joanna Millstein (2), Wilson Sauthoff (1,3), Wei Ji Leong (4), James Colliander (5,6), James Munroe (5), Denis Felikson (7), Jessica Scheick (8), Fernando Perez (9), Tyler Sutterley (10), Matthew Siegfried (1,3)
(1) Department of Geophysics, Colorado School of Mines
(2) Massachusetts Institute of Technology - Woods Hole Oceanographic Inst.
(3) Hydrologic Science & Engineering Program, Colorado School of Mines
(4) Byrd Polar Research Center, The Ohio State University
(5) International Interactive Computing Collaboration (2i2c.org)
(6) Department of Mathematics, University of British Columbia
(7) NASA Goddard Space Flight Center
(8) Earth Systems Research Center, University of New Hampshire
(9) Statistics Department, University of California, Berkeley
(10) Applied Physics Laboratory, University of Washington
8. What is the cloud?
Collaborative, reproducible, open science in the cloud
10. https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2020AV000354
Chelle Gentemann
Advances in data, software, and computing are enabling transformational, interdisciplinary science, changing the realm of possible questions
Deliberately designed open science communities can advance science and inclusivity simultaneously
Process massive amounts of data from a $36 computer...
...or from your cell phone or on a flight
Image credits: Chelle Gentemann
11. CryoCloud: A cloud-computing platform with bumpers
Goal: Simple and cost-effective managed cloud environment for training and transitioning new users to cloud workflows and determining community best practices
- Persistent for (at least) three years
- Small servers (32 GB / 4 CPU) for all users, with the option of larger servers if you bring your own cloud credits
Image credit: pixabay.com
12. CryoCloud community building
CryoCloud GitHub: github.com/cryointhecloud
- New Hub tools
- CryoCloud Slack
- Community office hours
- Training, tutorials, and resources
- Bringing in related Cryosphere communities and sharing in
infrastructure ideation and construction
cryointhecloud.com
13. CryoCloud helps you accelerate your science and
makes open science easy
Data read-ins 1-2 orders of magnitude faster
Easy to use, customizable: same software on your local/HPC/cloud
Collaboration made easy: co-coding, shared tools
Eliminates technology bottlenecks: environments, streamlined data access
No software expertise needed: cloud computing as a service
Democratizes science
14. Software training on Friday (9 am - 12 pm ET)
All are welcome. Feel free to join if you were not already signed up, but let us know so we can get you on CryoCloud first
It is hybrid: use the FOGSS Zoom link
Apologies to other time zones: a recording will be available on our Jupyter Book (book.cryointhecloud.com) next week
16. icepyx: Python tools for obtaining and
working with ICESat-2 data
GitHub: https://github.com/icesat2py/icepyx
Documentation: https://icepyx.readthedocs.io
Jessica Scheick
University of New Hampshire
Earth Systems Research Center
eScience Institute (Univ. of Washington)
FOGSS
23 March 2023
17. What is icepyx?
• A community of ICESat-2 data users, developers, data managers, and researchers
• An open-source Python software library
20. icepyx: ongoing
• Easier authentication and token management
• Data read-in (for cloud data)
• Query Unify Explore Spatio-Temporal (QUEST)
• More built-in data processing (e.g. strong/weak beams, analysis)
• Peer-reviewed software
• YOUR contributions!!
• Maintained JupyterHub cloud compute environment
• Handling of newest data products
• Parallelized workflows
21. Conclusions
• icepyx: community and software for working with ICESat-2
• Save time and effort; transition to the cloud
• Many ways to get involved, at multiple levels
• Shared computational tools and data → more cool science!
Funding
80NSSC21K0505
ICESat-2 2023 Hackweek: https://icesat-2.hackweek.io
Contact/More Info
• GitHub: @JessicaS11
• icepyx: https://github.com/icesat2py/icepyx
23. icepyx: how do I get involved?
• come to a meeting (they're not just for developers!)
• contribute (it doesn't have to be code!)
  - Report bugs
  - Improve documentation
  - Request features
• apply to help organize or attend the 2023 Hackweek (applications coming soon for August 2023 event in Seattle!)
24. icepyx: what can it do for you?
• save time
• learn in a safe space at your own pace
• work with code experts
• make FAIR and sharing easier
• practice open science
• get credit for your code!
• meet new friends and collaborators
• easily transition to the cloud
Artwork by @allison_horst
26. icepyx: an open-source community and Python library for obtaining and working with ICESat-2 data
Objectives
• Empower the ICESat-2 user community to utilize advanced computing to answer their research questions without needing to become software developers
• Build a community of domain scientists practicing open science (including contributing to open-source software)
27. FOGSS 2023 | March 23rd 2023
GrIMP* Products and Tools, Crash Course Edition
*Greenland Ice Sheet Mapping Project
Today's talk: Michalea King (1) & Ian Joughin (1)
Team members & contributors: Ian Howat (2) & Scott Henderson (1)
(1) UW Polar Science Center, (2) Ohio State Byrd Polar
28. GrIMP | What's Included
• Velocity
  - Optical + radar-derived
  - Annual mosaics
  - Quarterly mosaics
  - Monthly
  - 6/12 day (Sentinel)
  - Prepared pipeline for processing upcoming NISAR mission data
• Imagery
• Digital Elevation Models
• Terminus position for most large outlets
  - Annual
  - Newly added ~weekly to monthly traces
  - Plans for GrIS-wide sub-seasonal mélange masks
32. its_live - inter-mission time series of
land ice velocity and elevation
Goal: Most complete land ice velocity and elevation time series practical
- extensive velocity time series possible because all source imagery is in cloud
- source data, processing, and output products 100% cloud based
- workflow open source and on GitHub
33. Pairs of Landsat 4, 5, 7, 8 & 9, Sentinel-2 and Sentinel-1 images processed to velocity maps
• On a 120 m grid common to all image types
• 10M+ velocity maps are hard to work with
• Data made accessible by stacking all maps geographically into 100 km x 100 km data cubes
• Data cubes hosted in the cloud (as Zarr-format Xarray datasets)
36. Data Cubes are static objects - no servers, no moving parts
• all users access data at the same URLs
• code written by one person runs for others, no changes required
• this enables simple Python and Julia libraries, and future tools we haven't thought of yet
37. Data Cubes are static objects - no servers, no moving parts
conda install -c conda-forge itslive
Python:
    import itslive

    # Request a velocity time series at one (longitude, latitude) point
    ts = itslive.data_cube.get_time_series(
        points=[(-140.51, 60.07)],
        variables=["v", "acquisition_date_img1", "acquisition_date_img2", "date_dt"],
    )
returns an Xarray dataset
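Each velocity estimate comes from a pair of images, so the two acquisition dates define both a time separation and a mid-date for the measurement. The following minimal sketch uses only the Python standard library and made-up dates. It assumes `date_dt` denotes the separation between the two acquisitions, which the slide does not state, and `pair_summary` is a hypothetical helper rather than an itslive function.

```python
from datetime import datetime


def pair_summary(img1_iso, img2_iso):
    """Summarize an image pair from its two acquisition dates (ISO strings).

    Returns the mid-date of the pair and the separation in days.
    Hypothetical helper for illustration; not part of the itslive package.
    """
    t1 = datetime.fromisoformat(img1_iso)
    t2 = datetime.fromisoformat(img2_iso)
    separation = t2 - t1
    return {
        "mid_date": t1 + separation / 2,
        "date_dt_days": separation.total_seconds() / 86400,
    }


# Made-up acquisition dates for a 12-day pair
summary = pair_summary("2020-01-01", "2020-01-13")
```

Plotting speed against the mid-date, with the separation as a horizontal error bar, is a common way to show what time window each velocity value represents.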
42. Issues:
• Data Cubes in the cloud are not something we can put in a DAAC - when the project ends, data access will get much harder unless NASA "Data in the cloud" approaches support static datasets in object stores.
• Processing in the cloud is hard to do on a NASA-funded project - all computing costs are in your budget - along with your salary.
43. Takeaways: It is not the tools, it is the universal access to the data.
• This allows anyone and everyone to write tools - and other people can be more creative.
44. FOGSS Lightning Talk
3/23/2023
Sophie Goliber (1), Jason Briner (1), Sophie Nowicki (1), Beata Csatho (1), Renette Jones-Ivey (1),
William Lipscomb (2), Abani Patra (3), Kristin Poinar (1), Justin Quinn (4), Anton Schenk (1), Katherine
Thayer-Calder (2)
(1) University at Buffalo, USA, (2) University Corporation for Atmospheric Research, (3) Tufts
45. What is Ghub?
• Science gateway that provides resources to researchers, educators, and students focused on Ice Sheet Science (Paleo, Modern, Data scientists, modelers, and anything in between)
• Hosts and shares tools, datasets and resources with the goal of unifying ice sheet observations and modeling
• Supported by the US National Science Foundation and EarthCube
• Our web site is powered by the HUBzero open-source software
46. What is a tool?
• Tools on Ghub are a type of computational resource that you can create, publish, and share
• Place a user interface on your working scientific code and associate background information such as citations and documentation with your tool, and release the whole package in a citable way
• These can take the form of simple Jupyter notebooks, or complex workflows that use high-performance computing resources
Shekhar et al., 2021
47. What is a data set?
• Ghub can be used as a repository for data products generated during your research
• Large datasets can be made available for computations staged on the computing cluster
• Host for the ISMIP6 (Ice Sheet Model Intercomparison Project for CMIP6) datasets
48. High-performance computing at Ghub
• Ghub makes use of computing and data storage facilities at University at Buffalo's CCR (Center for Computational Research)
• Used to run certain complex tools or workflows and house large datasets
• For examples of tools that use HPC, see "HPC-based tools" on our Tool page
49. Ghub Groups and Projects
• Groups are a unique feature that allows communities to grow inside of the Hub
• Groups are more commonly used to share information between individuals in the group
• Projects are used to provide and share updates and notes for specific projects
• File Management
  - Link Google Drive, GitHub, AWS S3 Bucket, or Dropbox
  - Each project comes with a Git repository to store your files and data, accessible via the user interface or a Jupyter Notebook
50. Jupyter Notebook
• Starts the Jupyter notebook server in your home directory with access to a terminal
• Current kernels:
  - Python 3
  - Geospatial kernel, which includes useful packages like rasterio, geopandas, and cartopy
  - Octave
  - R
• Access your own files or files stored in project repositories!
• Useful in developing and testing tools, developing code for a project, or learning programming
51. For a more detailed overview of Ghub and an example of adding a tool, attend the session "Optional: Polar software training" tomorrow at 9am!
Contact: Sophie Goliber
sophiego@buffalo.edu
Link to survey!
https://qrco.de/bdomxI