Discussion on an HDF-GEO concept
HDF Workshop X
30 November 2006
Abstract

At the past several HDF & HDF-EOS Workshops, there has been some informal discussion of building on the success of HDF-EOS to design a new profile, tentatively called HDF-GEO. This profile would incorporate lessons learned from Earth science, Earth applications, and Earth model data systems. It would encompass all types of data, data descriptions, and other metadata. It might support 1-, 2-, and 3-D spatial data as well as time series; and it would include raw, calibrated, and analyzed data sets. It would support data exchange by building its needed complexity on top of minimal specialized features, and by providing clear mechanisms and requirements for all types of appropriate metadata. The organizers propose to host a discussion among the workshop participants on the need, scope, and direction for HDF-GEO.
Which buzzwords would fit?
[Word-cloud slide centered on HDF-GEO: naming rules, profile, data model, best practices, metadata content, atomic & compound types, geo ref, …ilities, self-documentation, data levels, markup & schema, platform support, test suites, tools, …]
Questions for discussion by Earth science practitioners - Bottom-up analysis:
– What are the successful features of existing community data formats and conventions? (HDF5, HDF-EOS, netCDF, CDF, GRIB, BUFR, COARDS, CF-1, NITF, FITS, FGDC RS extensions, ISO, geoTIFF, ...)
– Progress being made -- John Caron's Common Data Model
– Specific needs for geo- and time-referenced data conventions
– Specific needs to support observed (raw), calibrated, and analyzed data sets
Questions for discussion by Earth science practitioners - Top-down analysis: (1 of 2)
– What is a profile? Consider specifics of how a standard or group of standards is implemented for a related set of uses and applications.
– How does a profile relate to a format or other elements of a standard?
– What constitutes overkill? How much profile would be beneficial, and how much would be difficult to implement and of limited utility?
Questions for discussion by Earth science practitioners - Top-down analysis: (2 of 2)
Why is it useful?
– Establishes specific meanings for complicated terms or relationships
– Establishes common preferred terms for attributes which can be described in multiple ways (see the sketch after this list)
– Establishes practices which are consistent with portability across operating systems, hardware, or archives
– Establishes common expectations and obligations for data stewardship
– Clarifies community (and sponsor) long-term expectations, beyond short-term necessity
– other ...
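To make the "common preferred terms" point concrete, here is a minimal sketch (using h5py) of what a profile-mandated attribute vocabulary could look like. The attribute names are CF-style examples chosen only for illustration; an HDF-GEO profile would define its own list.

```python
# Hypothetical illustration: a profile pins down one preferred spelling for
# attributes that providers otherwise write many ways ("units" vs "Unit" vs
# "UNITS"). The names below are CF-style examples, not an HDF-GEO definition.
import h5py
import numpy as np

with h5py.File("example_granule.h5", "w") as f:
    dset = f.create_dataset("brightness_temperature",
                            data=np.zeros((10, 20), dtype=np.float32))
    dset.attrs["units"] = "K"                      # one preferred term, not "Unit"/"UNITS"
    dset.attrs["long_name"] = "brightness temperature"
    dset.attrs["_FillValue"] = np.float32(-999.0)  # one agreed sentinel, not several
```

The value lies less in the particular spellings than in every provider using the same ones, so generic tools can rely on finding them.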
Discussion
Wrap-up
Send your list of:
– provisional HDF-GEO requirements
– goals to be achieved
– how HDF-GEO would help
to me at alan@decisioninfo.com
Backup
Motivation:
In many instances, application-specific 'profiles' or 'conventions' or best practices have shown their utility for users. In particular, profiles have encouraged data exchange within communities of interest. HDF provides minimal guidance for applications.
HDF-EOS was a mission-specific profile; it resulted in successes and lessons learned. HDF5 for NPOESS is another approach.
Is it time for another attempt, benefiting from all the lessons, and targeted at a broader audience?
HDF Lessons from NPOESS & Future Opportunities (excerpt)
Alan M. Goldberg <agoldber@mitre.org>
HDF Workshop IX, December 2005

NOTICE
This technical data was produced for the U.S. Government under Contract No. 50-SPNA-9-00010, and is subject to the Rights in Technical Data - General clause at FAR 52.227-14 (JUN 1987).
Approved for public release, distribution unlimited.
© 2005 The MITRE Corporation. All rights reserved
Requirements for data products
Deal with complexity
– Large data granules (on the order of GB)
– Intrinsic data complexity: advanced sensors produce new challenges
– Multi-platform, multi-sensor, long-duration data production
– Many data processing levels and product types
Satisfy operational, archival, and field terminal users
– Multiple users with heritage traditions
NPOESS products delivered at multiple levels
[Block diagram of the space segment and IDPS data flow; labels include: SENSORS, environmental source components, calibration source, auxiliary sensor data, A/D conversion, detection, flux manipulation, filtration, packetization, compression, CCSDS (mux, code, frame) & encryption, comm transmitter, comm receiver, comm processing, C3S, data store, other subsystems, delivered raw data, and RDR, SDR, and EDR production yielding products at the RDR, TDR, SDR, and EDR levels.]
Sensor product types
Swath-oriented multispectral imagery
– VIIRS – cross-track whiskbroom
– CMIS – conical scan
– Imagery EDRs – resampled on uniform grid
Slit spectra
– OMPS SDRs – cross-track spectra, limb spectra
Image-array Fourier spectra
– CrIS SDR
Directional spectra
– SESS energetic particle sensor SDR
Point lists
– Active fires
3-D swath-oriented grid
– Vertical profile EDRs
2-D map grid
– Seasonal land products
Abstract byte structures
– RDRs
Abstract bit structures
– Encapsulated ancillary data
Bit planes
– Quality flags
Associated arrays (w/ stride?)
– Geolocation
NPOESS product design development
[Diagram: Requirements, Constraints, Intentions, and Resources feed the Design Process, which produces the Result.]
Requirements
– Multi-platform, multi-sensor, long-duration data production
– Many data processing levels and product types
– Satisfy operational, archival, and field terminal users
Constraints
– Processing architecture and optimization
– Heritage designs
– Contractor style and practices
– Budget and schedule
Intentions
– Use simple, robust standards
– Use best practices and experience from previous operational and EOS missions
– Provide robust metadata
– Maximize commonality among products
– Forward-looking, not backward-looking standardization
Design Process
– Experience
– Trades & analyses
Resources
– HDF5
– FGDC
– CF conventions
– Expectation of tools by others
Lessons & Way Forward

Observations from development to date
Avoid the temptation to use heritage approaches without reconsideration, but …
Novel concepts need to be tested
Data concepts, profiles, templates, or best practices should be defined before coding begins
Use broad, basic standards to the greatest possible extent
– FGDC has flexible definitions, if carefully thought through
Define terms in context; clarity and precision as appropriate
Attempts to predefine data organizations in the past (e.g., HDF-EOS ‘swath’ or HDF4 ‘palette’) have offered limited flexibility. Keep to simple standards which can be built upon and described well.
Lesson: be humble
It is a great service to future programs if we capture lessons and evolve the standards
How do we get true estimates of the life-cycle savings from good design?
Thoughts on future features for Earth remote sensing products
Need to more fully integrate product components with HDF features
Formalize the organization of metadata items which establish the data structure
– Need mechanism to associate arrays by their independent variables
Formalize the organization of metadata items which establish the data meaning
– XML is a potential mechanism – can it be well integrated?
– Work needed to understand the advantages and disadvantages
– Climate and Forecast (CF) sets a benchmark
Need a mechanism to encapsulate files in native format (see the sketch after this list)
– Case in which HDF is only used to provide consistent access
Need more investment in testing before committing to a design
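As a concrete illustration of the "encapsulate files in native format" item above, the sketch below stores the raw bytes of an external file inside an HDF5 granule with h5py, with HDF used only to provide consistent access. The dataset and attribute names are invented for this example; no such convention is defined here.

```python
# Sketch: wrap an arbitrary native-format file as a plain byte dataset so it
# can travel inside an HDF5 granule. Names are illustrative only.
import h5py
import numpy as np

def encapsulate(h5_path, native_path, dataset_name="encapsulated/ancillary"):
    with open(native_path, "rb") as src:
        payload = np.frombuffer(src.read(), dtype=np.uint8)  # opaque byte stream
    with h5py.File(h5_path, "a") as f:
        dset = f.create_dataset(dataset_name, data=payload)
        dset.attrs["original_filename"] = native_path
        dset.attrs["encoding"] = "opaque bytes; interpret with the native reader"
```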
Primary and Associated Arrays
[Diagram: an Index Attribute organizes an n-dimensional dependent variable (entity) array.]
Primary Array – e.g., flux, brightness, counts, NDVI
Associated Array(s) – e.g., QC, error bars; dimension ≤ n

1-Dimensional Attribute Variables
[Diagram: the same Index Attribute organizes the associated independent variable(s).]
Primary – e.g., UTC time or angle
Additional – e.g., IET time, angle, or pressure height

Multi-Dimensional Attribute Variables
2-Dimensional Independent Variable Array(s) – e.g., lat/lon, XYZ, sun alt/az, sat alt/az, or land mask

Key concept: Index Attributes organize the primary dependent variables, or entities. The same Index Attributes may be used to organize associated independent variables. Associated independent variables may be used singly (almost always), in pairs (frequently), or in larger combinations. (A sketch of one possible HDF5 realization follows.)
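One way the primary/associated-array relationship above could be expressed with existing HDF5 features is dimension scales for the shared 1-D index attributes, plus a CF-style `coordinates` attribute for the 2-D lat/lon arrays. The h5py sketch below is illustrative only: the shapes, names, and the CF-style attribute are assumptions, not the NPOESS or a proposed HDF-GEO convention.

```python
# Sketch of the primary/associated-array idea using HDF5 dimension scales.
# Group layout, shapes, and attribute names are illustrative.
import h5py
import numpy as np

n_scan, n_pixel = 100, 512
with h5py.File("swath_sketch.h5", "w") as f:
    primary = f.create_dataset("radiance",
                               data=np.zeros((n_scan, n_pixel), dtype=np.float32))
    qc = f.create_dataset("radiance_qc",
                          data=np.zeros((n_scan, n_pixel), dtype=np.uint8))

    # Shared 1-D independent variable (the "index attribute" for axis 0).
    scan_time = f.create_dataset("scan_start_time",
                                 data=np.arange(n_scan, dtype=np.float64))
    scan_time.make_scale("scan_start_time")
    for dset in (primary, qc):
        dset.dims[0].attach_scale(scan_time)
        dset.dims[0].label = "scan"
        dset.dims[1].label = "pixel"

    # 2-D independent variables associated by a CF-style attribute.
    f.create_dataset("latitude", data=np.zeros((n_scan, n_pixel), dtype=np.float32))
    f.create_dataset("longitude", data=np.zeros((n_scan, n_pixel), dtype=np.float32))
    primary.attrs["coordinates"] = "latitude longitude"
```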
Issues going forward - style
Issues with assuring access and understanding
– How will applications know which metadata is present?
– Need to define a core set with a default approach (see the sketch at the end of this slide)
Issues with users
– How to make providers and users comfortable with this or any standard
– How to communicate the value of: best practices; careful & flexible design; consistency; beauty of simplicity
– Ease of use as well as ease of creation
Issues with policy
– Helping to meet the letter and intent of the Information Quality Act
Capturing data product design best practices
– Flexibility vs. consistency vs. ease-of-use for a purpose
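To illustrate the "core set with a default approach" idea from the first bullet group, here is a small sketch of the check an application could run. The required attribute names are invented for the example; the profile itself would have to define the real list and its defaults.

```python
# Sketch: verify that a granule carries a (hypothetical) core metadata set so
# applications know what they can rely on. The names below are made up.
import h5py

CORE_ATTRS = ["title", "platform", "processing_level", "time_coverage_start"]

def missing_core_metadata(path):
    with h5py.File(path, "r") as f:
        return [name for name in CORE_ATTRS if name not in f.attrs]

# Example use: an empty list means the core set is complete.
# print(missing_core_metadata("example_granule.h5"))
```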
Issues going forward - features
Issues with tools
– Tools are needed to create, validate, and exploit the data sets, and to understand structure and semantics
Issues with collections
– How to implement file and collection metadata, with appropriate pointers forward and backward (see the sketch at the end of this slide)
– How to implement quasi-static collection metadata
Issues with HDF
– Processing efficiency (I/O) of compression, of compaction
– Repeated (fixed, not predetermined) metadata items with the same <tag> not handled
– Archival format
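For the file-and-collection-metadata item, the speaker notes at the end of this document observe that HDF's only general-purpose mechanism today is the attribute. The sketch below shows what attribute-based forward and backward pointers could look like; the attribute names and the external collection record are assumptions, not a defined convention.

```python
# Sketch: link a granule to quasi-static collection metadata with plain
# attributes. HDF5 itself defines no such convention; names are illustrative.
import h5py

with h5py.File("example_granule.h5", "a") as f:
    f.attrs["collection_id"] = "EXAMPLE_COLLECTION_V1"                # pointer up to the collection
    f.attrs["collection_metadata_ref"] = "EXAMPLE_COLLECTION_V1.xml"  # where the shared record lives
    f.attrs["predecessor_granule"] = ""   # backward pointer; empty if first in the series
    f.attrs["successor_granule"] = ""     # forward pointer; filled in when known
```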
Possible routes: Should there be an HDF-GEO?
Specify a profile for the use of HDF in Earth science applications:
– Generalized point (list), swath (sensor coordinates), grid (georeferenced), abstract (raw), and encapsulated (native) profiles
– Generalized approach to associating georeferencing information with observed information
– Generalized approach to incorporating associated variables with the mission data
– Generalized approach to ‘stride’
– Preferred core metadata to assure human and machine readability
– Identification metadata in UserBlock (see the sketch after this list)
– Map appropriate metadata items from HDF native features (e.g., array rank and axis sizes)
– Preferred approach to data object associations: arrays-of-structs or structs-of-arrays?
– Design guidelines or strict standardization?
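For the "identification metadata in UserBlock" route, HDF5 already reserves an optional user block at the front of the file that non-HDF readers can inspect. A minimal h5py sketch follows; the 512-byte size and the identification text are illustrative choices, not a proposed layout.

```python
# Sketch: reserve a user block at file creation, then write plain-text
# identification metadata into it so even non-HDF tools can read it.
import h5py

path = "example_granule.h5"
with h5py.File(path, "w", userblock_size=512) as f:
    f.create_group("Data")  # normal HDF5 content lives past the user block

ident = b"HDF-GEO sketch granule; platform=EXAMPLE; created=2006-11-30\n"
with open(path, "r+b") as raw:
    raw.write(ident.ljust(512, b"\0")[:512])  # fill the block; never spill past it
```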


Speaker notes (editor's notes)

  1. Driving NPOESS requirements were different from predecessor systems, due both to the way in which the data is generated and the way in which it will be used.
  2. Source complexity: multiple sources, complex sources, multiple processing levels.
  3. NPOESS exercises a broad range of HDF capabilities. We report entities with ranks of 0-6. Entities may be bits, bytes, C-types, structs, and arrays. They may contain large quantities of fill, due either to array design, missing data, or inapplicable regions (e.g., night, clouds, ocean).
  4. The NPOESS design process was driven by REQUIREMENTS and by ENGINEERING CHOICES. Requirements that were AWOL: verification, testability, usability by automated applications, tools for development, tools for exploitation.
  5. Specific lessons learned along the way.
  6. Things to think about that relate to the way HDF is employed in designing data products
  7. Things to think about that relate to the way HDF works; extensions may be needed. Collection metadata is often stored separate from the file metadata. HDF does not provide a clear mechanism for making the connection, except via generalized ‘attribute’. It would be good to have a consistent mechanism which could be understood by HDF readers.
  8. There have been previous discussions at these workshops about an HDF-GEO extension. It would be based on lessons learned from EOSDIS, NPOESS, and other programs – both positive and negative. Are we wise enough to come up with a specific set of requirements? How firm would the standard need to be to assure success? Is there a required core with optional extensions? What layers would be standardized? Is it a preprocessing tool, or a set of rules? Metadata? What do we do about the inevitable errors or deficiencies?