3. Who is supporting HDF?
• NASA/ESDIS
– Earth science applications, instrument data
• DOE/ASCI (Accelerated Strategic Computing Init.)
– Simulations on massively parallel machines
• NCSA/NSF/State of Illinois
– HPC and Grid data intensive apps, Visualization, user support
– Atmospheric and ocean modeling environments
• DOE Scientific Data Analysis & Computation Program
– High performance I/O R & D
• National Archives and Records Administration
– Small grant to consider HDF5 as an archive format
-3-
HDF
4. HDF software in 2003
• Library releases
• Java Products
• Other tools
5. HDF4.2 Release 1
• Planned for October, 2003
• Alpha release available now from NCSA ftp server
• Bug fixes
• Szip compression
– Fast compression method for EOS data
• Not included: Error detection code in HDF4
– Evaluated, decided not needed in HDF4
– Will address outside the library
6. HDF4.2r1
• New compilers
– Intel
– Portland Group
• New OS
– Mac OS X
– AIX 5.1 64-bit
– OSF1
– Red Hat 8/9
7. HDF4.2r1
• Tools (per DAAC and Instrument Team requests)
– hdfimport
• Converts float and/or integer data to SDS and/or 8-bit Raster
• Image data can be scaled about the mean value
• Revision of earlier fp2hdf
– hdfdiff
• Compares two HDF4 files
• Revision of earlier hdfdiff tool
• Requested by DAAC & instrument teams
– hdfrepack
• Makes a copy of an HDF4 file
• Optionally rewrites objects with compression, decompression, and/or chunking
8. HDF5 software milestones in 2003
[Timeline chart: release milestones by quarter (Q1 ‘03 through Q4 ‘03) for four tracks: base library, high level library, Java products, and other software. Milestone labels recoverable from the chart: 1.4.5, 1.6.0, 1.6.1, HL APIs, 1.2, 1.3, 2.0, and H4-H5 conversion library 1.1.]
9. HDF5 1.4.5
• Released in February 2003
• New platforms
– AIX 5.1 (64-bit)
– Mac OS X
• New compiler support on Linux 2.4
– Portland Group: pgcc, pgf90, pgCC
– gcc and g++ 3.2
• Added some missing Fortran 90 APIs
• Fixed many bugs
• Some performance improvements
10. HDF5 1.6.0
• Released in July
• Most notable new features
– New filters
• szip compression
• “shuffling”
• checksum
– Properties
• Generic properties to allow users to extend property lists according to their needs
• Control allocation time and fill value properties
– Compact storage layout for datasets
– Redesigned I/O pipeline for better performance.
– Hyperslab operations
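The filter ideas listed above (shuffling, compression, checksum) can be sketched outside the library. The following is a minimal standard-library illustration, not the HDF5 filter pipeline itself: zlib stands in for szip, the checksum is a textbook Fletcher-32 that is not guaranteed to be byte-compatible with the library's filter, and the function names `shuffle`, `unshuffle`, `encode`, and `decode` are hypothetical.

```python
import struct
import zlib


def shuffle(data: bytes, elem_size: int) -> bytes:
    """Regroup bytes so byte 0 of every element comes first, then byte 1, etc."""
    n = len(data) // elem_size
    body, tail = data[: n * elem_size], data[n * elem_size:]
    return b"".join(body[k::elem_size] for k in range(elem_size)) + tail


def unshuffle(data: bytes, elem_size: int) -> bytes:
    """Invert shuffle(): interleave the byte planes back into elements."""
    n = len(data) // elem_size
    out = bytearray(n * elem_size)
    for k in range(elem_size):
        out[k::elem_size] = data[k * n:(k + 1) * n]
    return bytes(out) + data[n * elem_size:]


def fletcher32(data: bytes) -> int:
    """Textbook Fletcher-32 over little-endian 16-bit words (zero-padded)."""
    if len(data) % 2:
        data += b"\x00"
    s1 = s2 = 0
    for (word,) in struct.iter_unpack("<H", data):
        s1 = (s1 + word) % 65535
        s2 = (s2 + s1) % 65535
    return (s2 << 16) | s1


def encode(raw: bytes, elem_size: int) -> bytes:
    """Write path: shuffle, compress (zlib as a stand-in), append checksum."""
    packed = zlib.compress(shuffle(raw, elem_size))
    return packed + struct.pack("<I", fletcher32(packed))


def decode(stored: bytes, elem_size: int) -> bytes:
    """Read path: verify the checksum, decompress, then unshuffle."""
    packed = stored[:-4]
    (check,) = struct.unpack("<I", stored[-4:])
    if fletcher32(packed) != check:
        raise ValueError("checksum mismatch: stored data is corrupted")
    return unshuffle(zlib.decompress(packed), elem_size)


# Slowly varying 32-bit integers: the high-order bytes are nearly constant,
# which is the case where shuffling helps the compressor most.
values = list(range(1000, 2000))
raw = struct.pack(f"<{len(values)}I", *values)
stored = encode(raw, elem_size=4)
assert decode(stored, elem_size=4) == raw
print(len(zlib.compress(raw)), "bytes compressed without shuffle")
print(len(stored), "bytes with shuffle, compression, and checksum")
```

Shuffling changes no values; it only reorders bytes so that similar bytes sit next to each other before compression, which is why it pays off on numeric data whose elements differ mostly in their low-order bytes.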
11. HDF5 1.6.0
• New tools
– h5diff -- compare two HDF5 files
– h5import
• import ASCII and binary data to an HDF5 file
– h5fc & h5c++
• more easily compile Fortran and C++ applications that use HDF5
• Old tools
– h5toh4 conversion
• upgrade of h5toh4 utility
• updated the HDF4 to HDF5 Mapping specification
12. HDF5 High level APIs
• Make HDF5 easier to use
– More operations per call than the normal HDF5 API
• Encourage standard ways to store objects
– Enforce standard representation of objects in HDF5
13. HL HDF5: HDF5 Lite
• Higher-level functions that do more
operations per call than the basic API
• Wrap intuitive functions around certain sets
of features in the existing APIs
• Currently covers dataset and attribute
related functions
14. HL HDF5: HDF5 Image
• Defines a standard storage scheme for
datasets that are intended to be interpreted
as images
• 2 types of images
– 8-bit indexed to a palette
– 24-bit with 3 color planes (RGB)
• Also palette functions
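To make the two image types above concrete, here is a minimal sketch in plain Python. It does not use the HDF5 Image API; the helper names `indexed_to_rgb` and `split_planes`, the 4-entry palette, and the 2 x 3 image are all hypothetical and chosen only for illustration.

```python
def indexed_to_rgb(pixels, palette):
    """Expand rows of 8-bit palette indices into rows of (R, G, B) tuples."""
    if not 0 < len(palette) <= 256:
        raise ValueError("an 8-bit image can address at most 256 palette entries")
    return [[palette[index] for index in row] for row in pixels]


def split_planes(rgb_rows):
    """Split RGB rows into three color planes, each the shape of the image."""
    return tuple(
        [[pixel[channel] for pixel in row] for row in rgb_rows]
        for channel in range(3)
    )


# Hypothetical palette: black, red, green, blue.
palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]

# Hypothetical 2 x 3 indexed image: each value is an index into the palette.
pixels = [[0, 1, 2],
          [3, 2, 1]]

rgb = indexed_to_rgb(pixels, palette)      # 24-bit view of the same image
red, green, blue = split_planes(rgb)       # one plane per color component
print(rgb[1][0], red[0])
```

The indexed form stores one byte per pixel plus a shared palette, while the 24-bit form stores three bytes per pixel and needs no palette.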
15. HL HDF5: HDF5 Table
• Defines a standard storage scheme for datasets that are intended to be interpreted as tables
• A “table” is a collection of records with fixed-length fields
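As an illustration of what "records with fixed-length fields" buys, the sketch below packs a small table with Python's `struct` module. It is not the HDF5 Table API; the record layout (a 16-byte name, two 64-bit floats, one 32-bit integer) and the helper names are assumptions made up for this example.

```python
import struct

# Hypothetical record layout: 16-byte name, latitude and longitude as
# float64, and a 32-bit observation count. "<" means no padding is added.
RECORD = struct.Struct("<16sddi")


def pack_table(rows):
    """Serialize rows of (name, lat, lon, count) into one contiguous buffer."""
    return b"".join(
        RECORD.pack(name.encode("ascii"), lat, lon, count)
        for name, lat, lon, count in rows
    )


def record_count(table: bytes) -> int:
    """Every record has the same size, so the count is a simple division."""
    return len(table) // RECORD.size


def read_record(table: bytes, index: int):
    """Random access: record i starts at byte offset i * RECORD.size."""
    name, lat, lon, count = RECORD.unpack_from(table, index * RECORD.size)
    return name.rstrip(b"\x00").decode("ascii"), lat, lon, count


rows = [
    ("station-a", 34.5, -117.25, 12),
    ("station-b", 40.1, -88.2, 7),
    ("station-c", -12.0, 130.75, 3),
]
table = pack_table(rows)
print(record_count(table), read_record(table, 1))
```

Because every field has a fixed length, any record can be located by arithmetic on its index, without scanning the records before it.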
16. Parallel HDF5
• A few performance improvements
• MPICH/MPE instrumentation feature added
– Gives users performance analysis tools for their MPI programs
• “Flexible parallel HDF5” programming model
– More flexible model for parallel HDF5
• New parallel platforms supported
– Solaris 2.8 (32 & 64 bits)
– OSF 5.1
– Cray T3E, SV1, T90
– HPUX 11.0
– FreeBSD
17. HDF5 1.6.1
• Bug fixes needed by Aura team
• Due Oct. 15
• Thanks to Cheryl Craig and the Aura team
for finding the bugs and working with us
18. HDFView
HDFView – a Java based
visual tool to browse and
edit HDF4 and HDF5 files.
• Browse objects in hierarchy
• Import/export JPEG images
• Create and delete objects
• Copy/paste between files
• Change/delete data content
• Display/modify attributes
• Save data values to a text file
19. Modular HDFView
Modular HDFView – improved
HDFView where I/O and GUI
components are replaceable modules.
• Replaceable modules:
– File I/O (file/data format)
– Tree view (show file structure)
– Table view (spreadsheet-like)
– Text view (view/edit text dataset)
– Image view (view/process image)
– Palette view (view/change palette)
– Metadata (attribute) view
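The replaceable-module idea can be sketched as an interface with a default implementation that a user implementation may replace. HDFView itself is written in Java; the Python below is only a structural illustration, and every class and method name in it (`TableView`, `DefaultTableView`, `CsvTableView`, `Viewer`, `replace_module`) is hypothetical rather than taken from the HDFView code.

```python
from abc import ABC, abstractmethod


class TableView(ABC):
    """Interface the application depends on; it never names a concrete class."""

    @abstractmethod
    def show(self, rows) -> str:
        """Render rows of values as text."""


class DefaultTableView(TableView):
    """Default implementation shipped with the application."""

    def show(self, rows) -> str:
        return "\n".join("\t".join(str(v) for v in row) for row in rows)


class CsvTableView(TableView):
    """User implementation that can be swapped in for the default."""

    def show(self, rows) -> str:
        return "\n".join(",".join(str(v) for v in row) for row in rows)


class Viewer:
    """Application shell: holds one module per interface and delegates to it."""

    INTERFACES = {"table": TableView}

    def __init__(self):
        self._modules = {"table": DefaultTableView()}

    def replace_module(self, kind: str, module) -> None:
        # Reject modules that do not implement the expected interface.
        if not isinstance(module, self.INTERFACES[kind]):
            raise TypeError(f"{kind} module must implement {self.INTERFACES[kind].__name__}")
        self._modules[kind] = module

    def show_table(self, rows) -> str:
        return self._modules["table"].show(rows)


viewer = Viewer()
default_text = viewer.show_table([[1, 2], [3, 4]])
viewer.replace_module("table", CsvTableView())
csv_text = viewer.show_table([[1, 2], [3, 4]])
print(default_text)
print(csv_text)
```

The application code calls only the interface, so a new file format or view can be added without touching the application shell.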
[Diagram: the application (HDFView) sits on top of interfaces (I/O, TreeView, TableView, etc.), and each interface is backed by either a default implementation or a user implementation.]
20. Other tools work
• h5diff
– Compare the structure and contents of two HDF5 files,
and report differences
– Command line utility like Unix ‘diff’ and older ‘hdiff’
– Report missing objects, inconsistent size, datatype, etc.
– Compare values of numeric datasets
– First beta available January 2003
– See poster
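The kinds of differences listed above (missing objects, inconsistent size or datatype, differing numeric values) can be sketched on an in-memory stand-in for two files. This is not the h5diff implementation and reads no HDF5 data: the `Dataset` class, the path-to-dataset dictionaries, the `tolerance` parameter, and the report wording are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    dtype: str      # datatype name, e.g. "float32"
    shape: tuple    # dimensions
    values: list    # flattened numeric values


def diff_files(first, second, tolerance=0.0):
    """Compare two {path: Dataset} mappings and return a list of findings."""
    report = []
    for path in sorted(set(first) | set(second)):
        if path not in first:
            report.append(f"{path}: missing from first file")
            continue
        if path not in second:
            report.append(f"{path}: missing from second file")
            continue
        a, b = first[path], second[path]
        if a.dtype != b.dtype:
            report.append(f"{path}: datatype {a.dtype} vs {b.dtype}")
        if a.shape != b.shape:
            report.append(f"{path}: shape {a.shape} vs {b.shape}")
            continue  # element-wise comparison needs equal shapes
        changed = sum(1 for x, y in zip(a.values, b.values) if abs(x - y) > tolerance)
        if changed:
            report.append(f"{path}: {changed} differing value(s)")
    return report


first = {
    "/temp": Dataset("float32", (3,), [1.0, 2.0, 3.0]),
    "/mask": Dataset("uint8", (2,), [0, 1]),
}
second = {
    "/temp": Dataset("float32", (3,), [1.0, 2.5, 3.0]),
    "/mask": Dataset("int16", (2,), [0, 1]),
    "/extra": Dataset("int32", (1,), [9]),
}

for line in diff_files(first, second):
    print(line)
```

Structure is checked before values, so a shape mismatch is reported once instead of producing a flood of element-level differences.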
21. Other activities with EOS tools teams
• Collaboration with ECS contractor to add
HDF-EOS modules to HDF java tool
• Consultation & collaboration with the Data
Usability team
– XML and other tools
– Poster at AGU
23. DOE/ASCI*
“ASCI provides the integrating simulation and
modeling capabilities and technologies needed …for
future design assessment and certification of nuclear
weapons and their components”
• Massively parallel computing and I/O
• Complex data models and big data
• HDF5 a standard format for ASCI apps
• NCSA role
– Library development and maintenance
– Data modeling
– Porting and tuning on big machines
* “Advanced Simulation and Computing Program”
24. National Archives and Records Administration
• Pilot project with HDF5
• Explore scientific data format requirements for
long term archiving of electronic records
• Geospatial data archiving and access
– 2-d and 3-d raster data, vector data
– Converting common formats to HDF5 and HDF-EOS
– Exploring scalability, applicability
• See poster:
“HDF5, HDF-EOS and Geospatial Data Archives”
25. Extensible Terascale Facility (ETF)
• NSF-sponsored computing and data grid
– Charter members:
NCSA, SDSC, Caltech, Argonne National Lab,
Pittsburgh Supercomputing Center
– Others to join later
• Terascale computing and data
• HDF4 and HDF5 apps common among early users
• Parallel HDF5 on Linux clusters, others
• Challenging I/O requirements
26. NPOESS
• National Polar-orbiting Operational Environmental
Satellite System
– Combine satellite systems of civil and defense programs
• HDF5 to be used to distribute data to users
• See presentations/posters this afternoon
27. netCDF-HDF Project
• Enhanced NetCDF-4 Interface to HDF5 Data
• Combine desirable characteristics of netCDF
and HDF5, while taking advantage of their
separate strengths
• Preserve format and API compatibility for
netCDF users
• Demonstrate benefits of this combination in
advanced Earth science modeling efforts
28. Atmospheric and Ocean Models
• “Modeling Environment for Atmospheric Discovery”
• HDF5 for high performance I/O for atmospheric and ocean modeling
– Weather Research and Forecasting (WRF) model
– Regional Ocean Modeling System (ROMS)
– Coupling of WRF and ROMS
– Potential ETF application
• UAH ESML & data mining also involved
• See poster:
“An HDF5 WRF/IO Module: Lessons Learned”
29. DOE SciDAC* Program
• “Programming Models for Scalable Parallel Computing”
• High performance I/O R&D
– Effectiveness of compression on I/O performance
– Transformation of data during I/O
– Integration of HDF5 with high performance Fortran
– Improving parallel I/O performance in HDF5
* “Scientific Discovery through Advanced Computing”
30. HDF5 Mesh API prototype
• Support for structured and unstructured “mesh” data
• For applications such as computational fluid
dynamics, finite element analysis, and visualization.
• A higher-level API
• Format
– HDF5 groups and datasets to organize the data
• Collaboration involving NCSA, CEI and others
• Documentation still pretty sketchy, but see ftp://ftp.ensight.com/pub/HDF_RW/hdf_rw.tgz
• Discussion list
31. Information Sources
• HDF website
– http://hdf.ncsa.uiuc.edu/
• HDF5 Information Center
– http://hdf.ncsa.uiuc.edu/HDF5/
• HDF Helpdesk
– hdfhelp@ncsa.uiuc.edu
• HDF users mailing list
– hdfnews@ncsa.uiuc.edu
33. Acknowledgements
This report is based upon work supported in part by a Cooperative
Agreement with NASA under NASA grants NAG 5-2040 and NAG
NCCS-599. Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the
author(s) and do not necessarily reflect the views of the
National Aeronautics and Space Administration. Other
support provided by NCSA and other sponsors and agencies
(http://hdf.ncsa.uiuc.edu/acknowledge.html).
This is Ray Milburn and Larry Klein.
Discussions in Feb and June.
Exchanged email in July and Aug.
Sent early release of our code 8/29/03.
Ray and Larry are said to have made a proposal to Raytheon to let Ray work with us on this. We are waiting to hear.