2. Abstract
At the past several HDF & HDF-EOS Workshops, there has been
some informal discussion of building on the success of HDF-EOS to
design a new profile, tentatively called HDF-GEO.
This profile would incorporate lessons learned from Earth science,
Earth applications, and Earth model data systems.
It would encompass all types of data, data descriptions, and other
metadata. It might support 1-, 2-, and 3-D spatial data as well as time
series; and it would include raw, calibrated, and analyzed data sets.
It would support data exchange by building its needed complexity on
top of minimal specialized features; and by providing clear
mechanisms and requirements for all types of appropriate metadata.
The organizers propose to host a discussion among the workshop
participants on the need, scope, and direction for HDF-GEO.
3. Which buzzwords would fit?
[Diagram: candidate terms arranged around "HDF-GEO": naming rules; …;
profile; data model; best practices; metadata content; atomic &
compound types; geo ref; …ilities; self-documentation; data levels;
markup & schema; platform support; test suites; tools]
4. Questions for discussion by Earth science
practitioners - Bottom-up analysis:
What are the successful features of existing
community data formats and conventions? (HDF5,
HDF-EOS, netCDF, CDF, GRIB, BUFR, COARDS,
CF-1, NITF, FITS, FGDC RS extensions, ISO,
GeoTIFF, ...)
Progress being made -- John Caron's Common Data
Model
Specific needs for geo- and time-referenced data
conventions
Specific needs to support observed (raw), calibrated,
and analyzed data sets
5. Questions for discussion by Earth science
practitioners - Top-down analysis: (1 of 2)
What is a profile? Consider specifics of how a
standard or group of standards is implemented for a
related set of uses and applications.
How does a profile relate to a format or other
elements of a standard?
What constitutes overkill? How much profile would be
beneficial, and how much would be difficult to
implement and of limited utility?
6. Questions for discussion by Earth science
practitioners - Top-down analysis: (2 of 2)
Why is it useful?
Establishes specific meanings for complicated terms or
relationships
Establishes common preferred terms for attributes that
can be described in multiple ways
Establishes practices which are consistent with portability
across operating systems, hardware, or archives
Establishes common expectations and obligations for data
stewardship
Clarifies community (and sponsor) long-term expectations,
beyond short-term necessity
other ...
10. Motivation:
In many instances, application-specific 'profiles' or 'conventions'
or best practices have shown their utility for users. In particular,
profiles have encouraged data exchange within communities of
interest. HDF provides minimal guidance for applications.
HDF-EOS was a mission-specific profile; it resulted in successes
and lessons learned. HDF5 for NPOESS is another approach.
Is it time for another attempt, benefiting from all the lessons and
targeted at a broader audience?
21. Multi-Dimensional Attribute
Variables
2-Dimensional Independent
Variable Array(s)
e.g., lat/lon, XYZ, sun alt/az,
sat alt/az, or land mask
Key concept: Index Attributes organize the primary dependent variables,
or entities. The same Index Attributes may be used to organize associated
independent variables. Associated independent variables may be used
singly (almost always), in pairs (frequently), or in larger combinations.
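The key concept above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not an NPOESS or HDF API: the `Granule` class, the index names `scan` and `pixel`, the variable names, and all values are hypothetical, and nested lists stand in for 2-D HDF datasets. It shows a dependent variable and its independent variables (here lat/lon) organized by the same pair of indices.

```python
from dataclasses import dataclass, field


@dataclass
class Granule:
    """Hypothetical container for 2-D variables that share index attributes."""
    index_names: tuple   # names of the shared indices, e.g. ("scan", "pixel")
    shape: tuple         # extent of each index, e.g. (2, 3)
    dependent: dict = field(default_factory=dict)    # primary variables
    independent: dict = field(default_factory=dict)  # e.g. lat, lon

    def _check(self, array):
        # Every variable must match the shared 2-D index shape.
        rows, cols = self.shape
        if len(array) != rows or any(len(row) != cols for row in array):
            raise ValueError("array does not match the shared index shape")

    def add_dependent(self, name, array):
        self._check(array)
        self.dependent[name] = array

    def add_independent(self, name, array):
        self._check(array)
        self.independent[name] = array

    def lookup(self, dep_name, indep_names, scan, pixel):
        # The same (scan, pixel) pair addresses both kinds of variable.
        value = self.dependent[dep_name][scan][pixel]
        coords = tuple(self.independent[n][scan][pixel] for n in indep_names)
        return coords, value


granule = Granule(index_names=("scan", "pixel"), shape=(2, 3))
granule.add_dependent("brightness_temp",
                      [[280.0, 281.5, 279.9], [282.1, 283.0, 281.2]])
granule.add_independent("lat", [[10.0, 10.0, 10.0], [10.1, 10.1, 10.1]])
granule.add_independent("lon", [[20.0, 20.1, 20.2], [20.0, 20.1, 20.2]])

# Independent variables used as a pair (lat/lon) for one dependent value.
result = granule.lookup("brightness_temp", ("lat", "lon"), scan=1, pixel=2)
```

Using a single independent variable, or a larger combination, only changes the tuple of names passed to `lookup`; the shared indices stay the same.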
The driving NPOESS requirements differed from those of predecessor systems, owing both to the way in which the data are generated and to the way in which they will be used.
NPOESS exercises a broad range of HDF capabilities. We report entities with ranks of 0-6. Entities may be bits, bytes, C-types, structs, and arrays. They may contain large quantities of fill, due either to array design, missing data, or inapplicable regions (e.g., night, clouds, ocean).
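Because entities may contain large quantities of fill, a reader has to exclude fill before computing anything. The sketch below is a minimal plain-Python illustration, assuming a single numeric fill marker; the value `-999.0`, the function names, and the sample array are all hypothetical and are not taken from any NPOESS specification.

```python
FILL_VALUE = -999.0  # hypothetical fill marker, assumed for this sketch


def valid_mean(rows, fill=FILL_VALUE):
    """Mean of all non-fill elements, or None if everything is fill."""
    valid = [v for row in rows for v in row if v != fill]
    if not valid:
        return None
    return sum(valid) / len(valid)


def fill_fraction(rows, fill=FILL_VALUE):
    """Fraction of elements equal to the fill marker."""
    total = sum(len(row) for row in rows)
    if total == 0:
        return 0.0
    filled = sum(1 for row in rows for v in row if v == fill)
    return filled / total


# Illustrative 2 x 3 array; fill marks inapplicable pixels (e.g. clouds).
sea_surface_temp = [
    [FILL_VALUE, 290.0, 292.0],
    [FILL_VALUE, FILL_VALUE, 294.0],
]

mean_sst = valid_mean(sea_surface_temp)
fraction = fill_fraction(sea_surface_temp)
```

A single marker cannot tell a reader why an element is fill; distinguishing missing data from inapplicable regions would need separate markers or a companion quality array.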
NPOESS design process was driven by REQUIREMENTS and by ENGINEERING CHOICES
Requirements that were AWOL:
verification
testability
usability by automated applications
tools for development
tools for exploitation
Specific lessons learned along the way.
Things to think about that relate to the way HDF is employed in designing data products
Things to think about that relate to the way HDF works; extensions may be needed.
Collection metadata is often stored separately from the file metadata. HDF does not provide a clear mechanism for making the connection, except via a generalized ‘attribute’. It would be good to have a consistent mechanism which could be understood by HDF readers.
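One way to picture the connection described above is a file-level attribute that carries a collection identifier, which a reader then resolves against the separately stored collection record. The sketch below uses plain Python dictionaries as stand-ins for HDF attributes and for the external catalog; the attribute name `collection_id`, the catalog layout, and every value are hypothetical, and no HDF library call is implied.

```python
# Stand-in for a separately stored collection-level catalog (hypothetical).
collection_catalog = {
    "EXAMPLE-COLLECTION-001": {
        "title": "Example collection",
        "steward": "Example archive",
    },
}

# Stand-in for the attributes attached to one file (hypothetical names).
file_attributes = {
    "collection_id": "EXAMPLE-COLLECTION-001",
    "granule_name": "example_granule_0001",
}


def resolve_collection(file_attrs, catalog, key="collection_id"):
    """Merge collection-level metadata with file-level attributes.

    File-level values take precedence when both define the same name.
    """
    if key not in file_attrs:
        raise KeyError(f"file has no '{key}' attribute")
    collection_id = file_attrs[key]
    if collection_id not in catalog:
        raise LookupError(f"unknown collection: {collection_id}")
    merged = dict(catalog[collection_id])
    merged.update(file_attrs)
    return merged


metadata = resolve_collection(file_attributes, collection_catalog)
```

The point of agreeing on one attribute name is that any reader can perform this lookup without product-specific knowledge.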
There have been previous discussions at these workshops about an HDF-GEO extension. It would be based on lessons learned from EOSDIS, NPOESS, and other programs – both positive and negative.
Are we wise enough to come up with a specific set of requirements?
How firm would the standard need to be to assure success? Is there a required core with optional extensions?
What layers would be standardized?
Is it a preprocessing tool, or a set of rules?
Metadata?
What do we do about the inevitable errors or deficiencies?