The HDF Group

Introduction to HDF5
Barbara Jones
The HDF Group
The 15th HDF and HDF-EOS Workshop
April 17-19, 2012

April 17-19, 2012

HDF/HDF-EOS Workshop XV

1

www.hdfgroup.org
Foreword
• We will be using H5Py – Python interface to HDF5
• Easy to learn
• Saves a lot of time for prototyping and getting data
and metadata out of HDF5 files
• Hides HDF5 complexity

• Resources
http://code.google.com/p/h5py/wiki/HowTo
http://alfven.org/wp/hdf5-for-python/

• Installation requires Python 2.7, NumPy 1.6.1, and
HDF5 1.8.3 (or later)

Topics Covered

• What HDF5 is
• HDF5 Data Model
• HDF5 Software and Tools
• Introduction to HDF5 APIs
• Examples

What is HDF5?
• Open file format
• Designed for high volume or complex data

• Open source software
• Works with data in the format

• A data model
• Structures for data organization and specification

HDF = Hierarchical Data Format

• HDF4 is the first HDF
• Originally called HDF; last major release was version 4

• HDF5 benefits from lessons learned with HDF4
• Changes to file format, software, and data model
• HDF5 and HDF4 are different

• No plans for an HDF6!

HDF5 has characteristics of …

HDF5 is designed …
• for small or high volume and/or complex data
• for every size and type of system (portable)
• for flexible, efficient storage and I/O
• to enable applications to evolve in their use of
HDF5 and to accommodate new models
• to support long-term data preservation
• Use it as a file format tool kit

HDF5 Technology Platform
• HDF5 data model
• The “building blocks” for data
organization and specification

• HDF5 software
• Library, language interfaces, tools

• HDF5 file format
• Bit-level organization of HDF5 file

HDF5 Data Model
[Figure: the HDF5 data model: File, Group, Link, Dataset, Datatype, Dataspace, Attribute, and Property List, with a region labeled "HDF5 Objects"]

a.k.a. HDF5 Abstract Data Model
a.k.a. HDF5 Logical Data Model

HDF5 File

An HDF5 file is a container that holds data objects.

lat | lon | temp
----|-----|-----
12  | 23  | 3.1
15  | 24  | 4.2
17  | 21  | 3.6

HDF5 Dataset

Metadata:
- Dataspace: Rank = 3; Dimensions: Dim_1 = 4, Dim_2 = 5, Dim_3 = 7
- Datatype: Integer
- Attributes (optional): Time = 32.4, Pressure = 987
- Properties: Chunked, Compressed

Data:
- Multi-dimensional array of identically typed data elements

• HDF5 datasets organize and contain “raw data values”.
• HDF5 datatypes describe individual data elements.
• HDF5 dataspaces describe the logical layout of the data elements.

HDF5 Dataset & Dataspace
HDF5 Dataspace: specifications for array dimensions
- Rank = 3
- Dimensions (the figure labels Dim_3 = 7)

Data: multi-dimensional array of identically typed data elements

• HDF5 datasets organize and contain “raw data values”.
• HDF5 dataspaces describe the logical layout of the data elements.

HDF5 Dataspaces
Describe the logical layout of the elements in an HDF5 dataset
• NULL
- no elements
• Scalar
- single element
• Simple array (most common)
- Multiple elements organized in a rectilinear array:
Rank = number of dimensions
Dimension size = number of elements in each dimension
Maximum number of elements in each dimension can
be fixed or unlimited
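The three kinds of dataspace can be tried out from h5py, the Python interface used in this tutorial. This is only a sketch: the in-memory file (driver='core' with backing_store=False), the file name, and the dataset names are assumptions made for the example, and h5py.Empty needs a reasonably recent h5py release.

```python
import h5py
import numpy as np

# In-memory file so the sketch leaves nothing on disk (an assumption of this example).
with h5py.File('dataspaces_demo.h5', 'w', driver='core', backing_store=False) as f:
    # NULL dataspace: a datatype but no elements.
    null_ds = f.create_dataset('null_ds', data=h5py.Empty('f'))

    # Scalar dataspace: a single element, shape ().
    scalar_ds = f.create_dataset('scalar_ds', data=42)

    # Simple dataspace: rank 2, current size 4 x 6. The first dimension is
    # unlimited (None in maxshape); the second is fixed at 6.
    simple_ds = f.create_dataset('simple_ds', shape=(4, 6),
                                 maxshape=(None, 6), dtype='i')

    null_shape = null_ds.shape        # None for a NULL dataspace
    scalar_shape = scalar_ds.shape    # ()
    rank = simple_ds.ndim             # 2
    dims_before = simple_ds.shape     # (4, 6)

    simple_ds.resize((10, 6))         # grow along the unlimited dimension
    dims_after = simple_ds.shape      # (10, 6)
```

In h5py the dataspace is mostly implicit: shape sets the current dimensions and maxshape sets the maximum ones.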

HDF5 Dataspaces
Two roles:
1. Dataspace contains spatial information (logical layout) about a dataset stored in a file
• Rank and dimensions
• Permanent part of dataset definition
• Example: Rank = 2, Dimensions = 4x6

2. Partial I/O: dataspaces describe the application's data buffers and the data elements participating in I/O
• Example: Rank = 1, Dimension = 10

HDF5 Dataset & Datatype
HDF5 Datatype: specifications for a single data element (e.g., Integer 32bit LE)

Data: multi-dimensional array of identically typed data elements

• HDF5 datasets organize and contain “raw data values”.

• HDF5 datatypes describe individual data elements.

HDF5 Datatypes
• Describe individual data elements in an HDF5 dataset
• Wide range of datatypes supported
• Integer (signed and unsigned, 32 and 64-bit, etc.)

• Float
• Variable-length sequence types (e.g., strings)
• Compound (similar to C structs)
• User-defined (e.g., 13-bit integer)
• Nested types
• Pretty much any type!
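A few of these datatypes can be sketched with h5py, which maps NumPy dtypes onto HDF5 datatypes. The field names in the compound type, the dataset names, and the in-memory file are assumptions made for illustration; h5py.string_dtype() is assumed to be available (it is present in recent h5py releases).

```python
import h5py
import numpy as np

# Compound type, similar to a C struct. Field names are made up for this sketch.
record = np.dtype([('id', np.int16),
                   ('flag', 'S1'),
                   ('count', np.int32),
                   ('samples', np.float32, (2, 3, 2))])  # nested 2x3x2 array

with h5py.File('datatypes_demo.h5', 'w', driver='core', backing_store=False) as f:
    # 32-bit integers
    ints = f.create_dataset('ints', shape=(5, 3), dtype=np.int32)

    # Variable-length strings
    names = f.create_dataset('names', shape=(4,), dtype=h5py.string_dtype())
    names[0] = 'temp'

    # One compound record per element of a 5 x 3 array
    table = f.create_dataset('table', shape=(5, 3), dtype=record)
    data = np.zeros((5, 3), dtype=record)
    data['id'] = 7
    data['samples'] = 1.5
    table[...] = data

    back = table[...]
    int_type = ints.dtype
    field_names = table.dtype.names
    id_read = int(back['id'][0, 0])
    samples_shape = back['samples'].shape
```

Each element of table is one record, so reading a single field returns an array with the dataset's shape, followed by the field's own shape for array-valued fields.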

HDF5 Dataset
Datatype: 32-bit Integer
Dataspace: Rank = 2, Dimensions = 5 x 3

HDF5 Dataset with Compound Datatype
Compound Datatype: int16, char, int32, 2x3x2 array of float32
Dataspace: Rank = 2, Dimensions = 5 x 3

HDF5 Property Lists
Property lists allow you to configure or control the behavior of the library.
They provide fine-grained control when creating or accessing objects, for example how datasets are stored or how performance is tuned.
Property lists have default values.

Dataset Storage Properties
Contiguous (default): data elements stored physically adjacent to each other
Chunked: better access time for subsets; extendible
Chunked & Compressed: improves storage efficiency, transmission speed
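In h5py these storage properties are keyword arguments to create_dataset. The sketch below creates one dataset per layout; the dataset names, the 10 x 10 chunk shape, and the gzip level are arbitrary choices for illustration, and the file is kept in memory.

```python
import h5py
import numpy as np

with h5py.File('storage_demo.h5', 'w', driver='core', backing_store=False) as f:
    data = np.arange(100 * 100, dtype='i4').reshape(100, 100)

    # Contiguous (default): no chunk shape is reported.
    contiguous = f.create_dataset('contiguous', data=data)

    # Chunked: stored as 10 x 10 tiles; chunking also lets the dataset be extended.
    chunked = f.create_dataset('chunked', data=data,
                               chunks=(10, 10), maxshape=(None, 100))

    # Chunked and compressed: each chunk passes through the gzip filter.
    compressed = f.create_dataset('compressed', data=data, chunks=(10, 10),
                                  compression='gzip', compression_opts=4)

    layouts = (contiguous.chunks, chunked.chunks, compressed.chunks)
    filter_name = compressed.compression
    round_trip_ok = bool(np.array_equal(compressed[...], data))
```

Compression is applied per chunk, which is why a compressed dataset is always chunked. Reading it back needs no extra arguments.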

HDF5 Attributes
• Typically contain user metadata

• Have a name and a value
• Attributes “decorate” HDF5 objects
• Value is described by a datatype and a dataspace
• Analogous to a dataset, but attributes do not support partial I/O
operations, and they cannot be compressed or extended

HDF5 Data Model: Are we there yet?
[Figure: the HDF5 data model revisited: Group and Link, Attribute, Property List, Dataspace, Datatype, Dataset, and File, under the label "HDF5 Objects"]

HDF5 Groups and Links
HDF5 groups and links organize data objects.
Every HDF5 file has a root group.
Similar to UNIX directories.

[Figure: example file with root group “/”, members named Viz and SimOut, and data objects including Parameters (10;100;1000), Timestep (36,000), and the table below]

lat | lon | temp
----|-----|-----
12  | 23  | 3.1
15  | 24  | 4.2
17  | 21  | 3.6

HDF5 Groups
• The path to an object defines it
• Objects can be shared:
/A/k and /B/m are the same
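[Figure: root group “/” with members A, B, and C; the paths /A/k and /B/m reach the same object; a dataset named temp also appears. The legend distinguishes groups from datasets.]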
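Sharing an object between two paths can be sketched in h5py by assigning an existing object to a new name, which adds a second hard link. The group and link names follow the slide (/A/k and /B/m); making the shared object a small integer dataset, and keeping the file in memory, are assumptions of this example.

```python
import h5py
import numpy as np

with h5py.File('links_demo.h5', 'w', driver='core', backing_store=False) as f:
    a = f.create_group('A')
    b = f.create_group('B')

    # Create the dataset once, reachable as /A/k.
    a.create_dataset('k', data=np.arange(6).reshape(2, 3))

    # Add a second (hard) link to the same object, reachable as /B/m.
    b['m'] = a['k']

    # A write through one path is visible through the other.
    f['/B/m'][0, 0] = 99
    value_via_a = int(f['/A/k'][0, 0])
    members = sorted(f.keys())
```

No data is copied by the assignment: both paths lead to one dataset stored once in the file.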

HDF5 Technology Platform
• HDF5 data model
• The “building blocks” for data
organization and specification

• HDF5 software
• Library, language interfaces,
tools

HDF5 Home Page
HDF5 home page: http://hdfgroup.org/HDF5/
• Latest release: HDF5 1.8.8 (1.8.9 coming in May)

HDF5 source code:
• Written in C, and includes optional C++, Fortran 90 APIs, and High Level APIs.
• Contains command-line utilities (h5dump, h5repack, h5diff, ...) and compile scripts

HDF5 pre-built binaries:
• When possible, include C, C++, F90, and High Level libraries.
Check ./lib/libhdf5.settings file.
• Built with and require the SZIP and ZLIB external libraries,
which are included.

HDF5 API and Applications
[Figure: software stack, top to bottom: Applications (an EOS application, MATLAB, …); Domain Data Objects (for example through the EOS library); HDF5 Library; Storage]

HDF5 Software Layers & Storage

[Figure: HDF5 software layers, top to bottom]
Tools and interfaces: h5dump tool, h5repack tool, HDFView tool, High Level APIs, Java Interface, …
API (language interfaces): C, Fortran, C++
HDF5 Library:
- HDF5 Data Model Objects: Groups, Datasets, Attributes, …
- Tunable Properties: Chunk Size, I/O Driver, …
- Internals: Memory Mgmt, Datatype Conversion, Filters, Chunked Storage, Version Compatibility, and so on…
- Virtual File Layer (I/O Drivers): Posix I/O, Split Files, MPI I/O, Custom
HDF5 File Format
Storage: File, Split Files, File on Parallel Filesystem, Other

HDF5 File Format
• Defined by the HDF5 File Format Specification.
http://www.hdfgroup.org/HDF5/doc/H5.format.html
• Specifies the bit-level organization of an HDF5 file on
storage media.

• The HDF5 library adheres to the File Format, so for the most part
basic users do not need to know these details.

Useful Tools For New Users
h5dump:
Tool to “dump” or display contents of HDF5 files
h5cc, h5c++, h5fc:
Scripts to compile applications
HDFView:
Java browser to view HDF4 and HDF5 files
http://www.hdfgroup.org/hdf-java-html/hdfview/

h5dump utility
h5dump [options] [file]
    -H, --header    Display header only – no data
    -d <names>      Display specified pathname/dataset(s)
    -g <names>      Display the specified group(s) and all members
    -p              Display properties

<names> is one or more appropriate object names.

Example of h5dump Output
    HDF5 "my.h5" {
    GROUP "/" {
       DATASET "mydata" {
          DATATYPE { H5T_STD_I32BE }
          DATASPACE { SIMPLE ( 4, 6 ) / ( 4, 6 ) }
          DATA {
             1, 2, 3, 4, 5, 6,
             7, 8, 9, 10, 11, 12,
             13, 14, 15, 16, 17, 18,
             19, 20, 21, 22, 23, 24
          }
       }
    }
    }

[Figure: file my.h5 with root group “/” containing dataset mydata]
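A similar header-style listing can be produced from Python by walking the file with h5py's visititems. The describe helper and its output format are hypothetical, loosely modeled on h5dump -H rather than reproducing it, and the file is built in memory only so that the sketch is self-contained.

```python
import h5py
import numpy as np

def describe(h5file):
    """Return one summary line per object (hypothetical helper, not h5dump itself)."""
    lines = ['GROUP "/"']

    def visit(name, obj):
        if isinstance(obj, h5py.Dataset):
            lines.append('DATASET "/%s" dtype=%s shape=%s'
                         % (name, obj.dtype, obj.shape))
        elif isinstance(obj, h5py.Group):
            lines.append('GROUP "/%s"' % name)

    h5file.visititems(visit)
    return lines

with h5py.File('my.h5', 'w', driver='core', backing_store=False) as f:
    # Big-endian 32-bit integers, like H5T_STD_I32BE in the dump shown earlier.
    f.create_dataset('mydata', data=np.arange(1, 25).reshape(4, 6), dtype='>i4')
    summary = describe(f)

print('\n'.join(summary))
```

visititems calls the function once for every group and dataset below the starting group, so the same helper also works on nested files.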

Introduction to
HDF5 Programming Model
and APIs

General Programming Paradigm
• Object is opened or created
• Object is written to or read from, possibly many
times
• Object is closed

• Properties of object are optionally defined
Creation properties
Access properties
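The open, access, close cycle can be sketched in h5py as follows. The scratch directory, file name, and chunk shape are assumptions for illustration; chunks=(2, 3) stands in for an optional creation property.

```python
import os
import tempfile

import h5py
import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'paradigm.h5')   # hypothetical scratch file

    # Create: the file and the dataset are opened for writing.
    f = h5py.File(path, 'w')
    dset = f.create_dataset('dset', shape=(4, 6), dtype='i',
                            chunks=(2, 3))     # a creation property (optional)

    # Access: write, possibly many times.
    dset[0, :] = 1
    dset[1, :] = 2

    # Close: in h5py, closing the file also releases the dataset handle.
    f.close()

    # The same cycle with a context manager, which closes the file
    # even if the body raises an exception.
    with h5py.File(path, 'r') as f:
        first_rows = f['dset'][0:2, :]
        chunk_shape = f['dset'].chunks

    file_is_open = bool(f)   # False once the file has been closed
```

The context-manager form is usually the safer default in Python because it makes the close step hard to forget.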

The HDF5 API
• The API is extensive

300+ functions
[Figure: Swiss Army Cybertool 34]

• This can be daunting… but there is hope
A few functions can do a lot
Start simple
Build up knowledge as more features are needed

HDF5 APIs
• Currently C, Fortran 90, C++ and Java bindings
supported by The HDF Group
• Others:
HDF5DotNet (C#, VB.NET, IronPython,..)
http://hdf5.net/
h5py (Python)
http://code.google.com/p/h5py/
(developed by Andrew Collette)

Language Specific Requirements
• For portability, the HDF5 library has its own defined
types. For example, hid_t is used for object handles.

• Must include language specific files in your application:
C: add #include "hdf5.h"
F90: add USE HDF5
     Call h5open_f/h5close_f to initialize/close the Fortran interface
C++: add #include "H5Cpp.h"
Python: add import h5py / import numpy

Example HDF5 Code

Steps to Create a File
1. Specify property lists (or use defaults)
2. Create the file
3. Close the file (and properties if necessary)

Creating an HDF5 File in Python
    import h5py

    # 'w' is the file access flag (create new file)
    file = h5py.File('file.h5', 'w')
    file.close()

Result: file.h5 containing only the root group “/”.

Creating an HDF5 File In C
    #include "hdf5.h"            /* 1. Specify include file */

    int main() {
        hid_t  file_id;          /* 2. Example of defined types */
        herr_t status;

        /* 3. H5F_ACC_TRUNC is the file access flag (create new file).
           4. H5P_DEFAULT specifies the default property lists. */
        file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC,
                            H5P_DEFAULT, H5P_DEFAULT);
        status = H5Fclose(file_id);
        return 0;
    }

Creating an HDF5 File in F90
    PROGRAM FILEEXAMPLE

      USE HDF5                    ! 1. Specify HDF5 module
      IMPLICIT NONE

      ! 2. Example of defined types
      CHARACTER(LEN=8), PARAMETER :: filename = "filef.h5" ! File name
      INTEGER(HID_T) :: file_id                            ! File identifier
      INTEGER :: error

      CALL h5open_f (error)       ! 3. Initialize Fortran interface
      CALL h5fcreate_f (filename, H5F_ACC_TRUNC_F, file_id, error)
      CALL h5fclose_f (file_id, error)
      CALL h5close_f (error)      ! 4. Close Fortran interface

    END PROGRAM FILEEXAMPLE

Steps to Create a Dataset
1. Define dataset characteristics
a) Datatype
b) Dataspace
c) Properties (or use default)

2. Decide where to put it
Group or root group

3. Create dataset in file
4. Close dataset handle from step 3.

Example: Create a Dataset
[Figure: file dset.h5 with root group “/” containing dataset dset (Integer, 4x6)]

Create a Dataset: h5_crtdat.py
    import h5py

    file = h5py.File('dset.h5', 'w')

    # Create dataset in root group: name, dataspace (shape), datatype
    dataset = file.create_dataset('dset', (4, 6), 'i')

    file.close()   # h5py closes the dataset for you

Write To/Read From a Dataset: h5_rdwt.py
    import h5py
    import numpy as np

    file = h5py.File('dset.h5', 'r+')

    # Open 'dset' in root group
    dataset = file['dset']

    data = np.zeros((4, 6))
    for i in range(4):
        for j in range(6):
            data[i][j] = i*6 + j + 1

    # Write buffer to 'dset'
    dataset[...] = data

    # Read data in 'dset' into buffer
    data_read = dataset[...]

    file.close()

How To Write to a Subset of the dataset?

[Figure: the 4 x 6 dataset with a 3 x 4 block of elements (dim1 indices 1 to 3, dim2 indices 2 to 5) set to 5]

    dataset[1:4, 2:6] = 5
(instead of using “dataset[...]”)
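A self-contained version of this subset write, followed by a partial read, might look like the sketch below. The in-memory file and its name are assumptions; the dataset matches the 4 x 6 integer dset used in the other examples.

```python
import h5py
import numpy as np

with h5py.File('subset_demo.h5', 'w', driver='core', backing_store=False) as f:
    # New integer dataset; unwritten elements read back as 0.
    dataset = f.create_dataset('dset', shape=(4, 6), dtype='i')

    # Write 5 to rows 1..3 and columns 2..5 only.
    dataset[1:4, 2:6] = 5

    # Partial read: one column, then the whole array.
    column = dataset[:, 2]
    full = dataset[...]

print(full)
# [[0 0 0 0 0 0]
#  [0 0 5 5 5 5]
#  [0 0 5 5 5 5]
#  [0 0 5 5 5 5]]
```

Only the selected elements are transferred in each call, which matters for datasets too large to hold in memory.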

Read integer into float buffer: h5_readtofloat.py
    import h5py
    import numpy as np

    file = h5py.File('dset.h5', 'r+')
    dataset = file['dset']

    data = np.zeros((4, 6))
    for i in range(4):
        for j in range(6):
            data[i][j] = i*6 + j + 1

    # Write buffer to integer 'dset'
    dataset[...] = data

    # Read data in 'dset' into float buffer
    data_read32 = np.zeros((4, 6), dtype=np.float32)
    dataset.id.read(h5py.h5s.ALL, h5py.h5s.ALL, data_read32,
                    mtype=h5py.h5t.NATIVE_FLOAT)

    file.close()
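The same integer-to-float read can also be sketched with h5py's high-level read_direct method, which fills an existing NumPy array and takes the in-memory type from that array. This is an alternative to the low-level dataset.id.read call, not a replacement endorsed by the slides; the in-memory file and its name are assumptions.

```python
import h5py
import numpy as np

with h5py.File('convert_demo.h5', 'w', driver='core', backing_store=False) as f:
    # Integer dataset holding 1..24, as in the example above.
    dataset = f.create_dataset(
        'dset', data=np.arange(1, 25, dtype='i4').reshape(4, 6))

    # Destination buffer whose element type differs from the file datatype.
    data_read32 = np.zeros((4, 6), dtype=np.float32)

    # The integer-to-float conversion happens during the read.
    dataset.read_direct(data_read32)

    stored_type = dataset.dtype
```

The data in the file stays 32-bit integer; only the copy delivered to the buffer is converted.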

Steps to Create a Group
1. Decide where to put it – “root group”
or other group
2. Define properties or use default

3. Create the group in file
4. Close the group

Example: Create a Group
[Figure: file dset.h5 with root group “/” containing dataset dset (4x6 array of integers) and group MyGroup]

Create a Group: h5_crtgrp.py
    import h5py

    file = h5py.File('dset.h5', 'r+')

    # Create group 'MyGroup' under root group
    group = file.create_group('MyGroup')

    file.close()   # h5py closes the group for you

Example: Create Attributes
[Figure: file dset.h5 with root group “/” containing dataset dset (4x6 array of integers)]
Attributes of dset:
Units = "Meters per second"
Speed = [100, 200]

Create Attributes: h5_crtatt.py
    import h5py
    import numpy as np

    file = h5py.File('dset.h5', 'r+')
    dataset = file['/dset']

    # Create string attribute
    dataset.attrs["Units"] = "Meters per second"

    # Create integer attribute
    attr_data = np.zeros((2,))
    attr_data[0] = 100
    attr_data[1] = 200
    dataset.attrs.create("Speed", attr_data, (2,), "i")

    file.close()

HDF5 Tutorial and Examples
HDF5 Tutorial:
http://www.hdfgroup.org/HDF5/Tutor/
HDF5 Examples:
http://www.hdfgroup.org/ftp/HDF5/examples/
HDF5 Documentation:
http://www.hdfgroup.org/HDF5/doc/

HDF5 Technology Platform
• HDF5 data model
• The “building blocks” for data
organization and specification

• HDF5 software
• Library, language interfaces, tools

• HDF5 file format
• Bit-level organization of HDF5 file

Thank You!

Questions/comments?

🐬 The future of MySQL is Postgres 🐘
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Introduction to HDF5 Data and Programming Models

  • 1. Introduction to HDF5. Barbara Jones, The HDF Group. The 15th HDF and HDF-EOS Workshop (HDF/HDF-EOS Workshop XV), April 17-19, 2012. www.hdfgroup.org
  • 2. Foreword • We will be using h5py, the Python interface to HDF5 • Easy to learn • Saves a lot of time for prototyping and getting data and metadata out of HDF5 files • Hides HDF5 complexity • Resources: http://code.google.com/p/h5py/wiki/HowTo http://alfven.org/wp/hdf5-for-python/ • Installation requires Python 2.7, NumPy 1.6.1, and HDF5 1.8.3 (or later)
  • 3. Topics Covered • What HDF5 is • HDF5 Data Model • HDF5 Software and Tools • Introduction to HDF5 APIs • Examples
  • 4. What is HDF5? • Open file format • Designed for high volume or complex data • Open source software • Works with data in the format • A data model • Structures for data organization and specification
  • 5. HDF = Hierarchical Data Format • HDF4 is the first HDF • Originally called HDF; last major release was version 4 • HDF5 benefits from lessons learned with HDF4 • Changes to file format, software, and data model • HDF5 and HDF4 are different • No plans for an HDF6!
  • 6. HDF5 has characteristics of …
  • 7. HDF5 is designed … • for small or high volume and/or complex data • for every size and type of system (portable) • for flexible, efficient storage and I/O • to enable applications to evolve in their use of HDF5 and to accommodate new models • to support long-term data preservation • Use it as a file format tool kit
  • 8. HDF5 Technology Platform • HDF5 data model • The “building blocks” for data organization and specification • HDF5 software • Library, language interfaces, tools • HDF5 file format • Bit-level organization of HDF5 file
  • 9. HDF5 Data Model (a.k.a. HDF5 Abstract Data Model, a.k.a. HDF5 Logical Data Model) • HDF5 Objects: File, Group, Link, Dataset, Datatype, Dataspace, Attribute, Property List
  • 10. HDF5 File: An HDF5 file is a container that holds data objects. Example object shown on the slide:
      lat | lon | temp
      ----|-----|-----
      12  | 23  | 3.1
      15  | 24  | 4.2
      17  | 21  | 3.6
  • 11. HDF5 Dataset • Metadata: Dataspace (Rank = 3; Dim_1 = 4, Dim_2 = 5, Dim_3 = 7); Datatype (Integer); Attributes, optional (Time = 32.4, Pressure = 987); Properties (Chunked, Compressed) • Data: multi-dimensional array of identically typed data elements • HDF5 datasets organize and contain “raw data values”. • HDF5 datatypes describe individual data elements. • HDF5 dataspaces describe the logical layout of the data elements.
  • 12. HDF5 Dataset & Dataspace • HDF5 Dataspace (Rank = 3; Dim_3 = 7 shown): specifications for array dimensions • Data: multi-dimensional array of identically typed data elements • HDF5 datasets organize and contain “raw data values”. • HDF5 dataspaces describe the logical layout of the data elements.
  • 13. HDF5 Dataspaces describe the logical layout of the elements in an HDF5 dataset • NULL: no elements • Scalar: single element • Simple array (most common): multiple elements organized in a rectilinear array. Rank = number of dimensions. Dimension size = number of elements in each dimension. The maximum number of elements in each dimension can be fixed or unlimited.
  • 14. HDF5 Dataspaces: two roles • A dataspace contains spatial information (logical layout) about a dataset stored in a file: rank and dimensions; a permanent part of the dataset definition (example: Rank = 2, Dimensions = 4x6) • Partial I/O: dataspaces describe applications’ data buffers and the data elements participating in I/O (example: Rank = 1, Dimension = 10)
  • 15. HDF5 Dataset & Datatype • HDF5 Datatype (Integer, 32-bit LE): specifications for a single data element • Data: multi-dimensional array of identically typed data elements • HDF5 datasets organize and contain “raw data values”. • HDF5 datatypes describe individual data elements.
  • 16. HDF5 Datatypes • Describe individual data elements in an HDF5 dataset • Wide range of datatypes supported • Integer (signed and unsigned, 32 and 64-bit, etc.) • Float • Variable-length sequence types (e.g., strings) • Compound (similar to C structs) • User-defined (e.g., 13-bit integer) • Nested types • Pretty much any type!
  • 17. HDF5 Dataset • Datatype: 32-bit Integer • Dataspace: Rank = 2, Dimensions = 5 x 3
  • 18. HDF5 Dataset with Compound Datatype • Compound datatype members: int16, char, int32, 2x3x2 array of float32 • Dataspace: Rank = 2, Dimensions = 5 x 3
  • 19. HDF5 Dataset • Metadata: Dataspace (Rank = 3; Dim_1 = 4, Dim_2 = 5, Dim_3 = 7); Datatype (Integer); Attributes, optional (Time = 32.4, Pressure = 987); Properties (Chunked, Compressed) • Data: multi-dimensional array of identically typed data elements
  • 20. HDF5 Property Lists • Property lists allow you to configure or control the behavior of the library. • They provide fine-grained control when creating or accessing objects, for example how datasets are stored or how performance is tuned. • There are default values associated with property lists.
  • 21. Dataset Storage Properties • Contiguous (default): data elements stored physically adjacent to each other • Chunked: better access time for subsets; extendible • Chunked & Compressed: improves storage efficiency, transmission speed
  • 22. HDF5 Attributes • Typically contain user metadata • Have a name and a value • Attributes “decorate” HDF5 objects • Value is described by a datatype and a dataspace • Analogous to a dataset, but attributes do not support partial I/O operations, and they cannot be compressed or extended
  • 23. HDF5 Data Model: Are we there yet? • HDF5 Objects: Group and Link, Attribute, Property List, Dataspace, Datatype, Dataset, File
  • 24. HDF5 Groups and Links • HDF5 groups and links organize data objects. • Every HDF5 file has a root group (“/”) • Similar to UNIX directories • Figure labels: /, Viz, SimOut, Parameters (10;100;1000), Timestep (36,000), and a lat | lon | temp table
  • 25. HDF5 Groups • The path to an object defines it • Objects can be shared: /A/k and /B/m are the same • Figure labels: root group “/”; groups A, B, C; links k, m; dataset temp
  • 26. HDF5 Technology Platform • HDF5 data model • The “building blocks” for data organization and specification • HDF5 software • Library, language interfaces, tools
  • 27. HDF5 Home Page • HDF5 home page: http://hdfgroup.org/HDF5/ • Latest release: HDF5 1.8.8 (1.8.9 coming in May) • HDF5 source code: written in C, and includes optional C++, Fortran 90 APIs, and High Level APIs; contains command-line utilities (h5dump, h5repack, h5diff, ..) and compile scripts • HDF5 pre-built binaries: when possible, include C, C++, F90, and High Level libraries (check the ./lib/libhdf5.settings file); built with and require the SZIP and ZLIB external libraries, which are included
  • 28. HDF5 API and Applications • Figure (layer diagram) labels: Applications; EOS; Application Domain Data Objects; EOS library; MATLAB; …; HDF5 Library; Storage
  • 29. HDF5 Software Layers & Storage • Tools: h5dump tool, h5repack tool, HDFView tool, High Level APIs, Java Interface, … • API, Language Interfaces: C, Fortran, C++ • HDF5 Data Model Objects: Groups, Datasets, Attributes, … • Tunable Properties: Chunk Size, I/O Driver, … • Internals: Memory Mgmt, Datatype Conversion, Filters, Chunked Storage, Version Compatibility, and so on… • Virtual File Layer, I/O Drivers: Posix I/O, Split Files, MPI I/O, Custom • Storage (HDF5 File Format): File, Split Files, File on Parallel Filesystem, Other
  • 30. HDF5 File Format • Defined by the HDF5 File Format Specification. http://www.hdfgroup.org/HDF5/doc/H5.format.html • Specifies the bit-level organization of an HDF5 file on storage media. • HDF5 library adheres to the File Format, so for the most part basic users do not need to know the guts of this information.
  • 31. Useful Tools For New Users • h5dump: tool to “dump” or display contents of HDF5 files • h5cc, h5c++, h5fc: scripts to compile applications • HDFView: Java browser to view HDF4 and HDF5 files http://www.hdfgroup.org/hdf-java-html/hdfview/
  • 32. h5dump utility: h5dump [options] [file]
      -H, --header    Display header only, no data
      -d <names>      Display specified pathname/dataset(s)
      -g <names>      Display the specified group(s) and all members
      -p              Display properties
      <names> is one or more appropriate object names.
  • 33. Example of h5dump Output (file my.h5, with dataset mydata in the root group “/”)
      HDF5 "my.h5" {
      GROUP "/" {
         DATASET "mydata" {
            DATATYPE { H5T_STD_I32BE }
            DATASPACE { SIMPLE ( 4, 6 ) / ( 4, 6 ) }
            DATA {
               1, 2, 3, 4, 5, 6,
               7, 8, 9, 10, 11, 12,
               13, 14, 15, 16, 17, 18,
               19, 20, 21, 22, 23, 24
            }
         }
      }
      }
  • 34. Introduction to HDF5 Programming Model and APIs
  • 35. General Programming Paradigm • Object is opened or created • Object is written to or read from, possibly many times • Object is closed • Properties of object are optionally defined: creation properties, access properties
  • 36. The HDF5 API • The API is extensive: 300+ functions (pictured: Swiss Army Cybertool 34) • This can be daunting… but there is hope • A few functions can do a lot • Start simple • Build up knowledge as more features are needed
  • 37. HDF5 APIs • Currently C, Fortran 90, C++ and Java bindings supported by The HDF Group • Others: HDF5DotNet (C#, VB.NET, IronPython, ..) http://hdf5.net/; h5py (Python) http://code.google.com/p/h5py/ (developed by Andrew Collette)
  • 38. Language Specific Requirements • For portability, the HDF5 library has its own defined types. For example, hid_t is used for object handles. • Must include language specific files in your application: C: add #include "hdf5.h"; F90: add USE HDF5 and call h5open_f/h5close_f to initialize/close the Fortran interface; C++: add #include "H5Cpp.h"; Python: add import h5py and import numpy
  • 39. Example HDF5 Code
  • 40. Steps to Create a File 1. Specify property lists (or use defaults) 2. Create the file 3. Close the file (and properties if necessary)
  • 41. Creating an HDF5 File in Python (result: file.h5 containing only the root group “/”)
      import h5py
      file = h5py.File('file.h5', 'w')    # 'w': file access flag (create new file)
      file.close()
  • 42. Creating an HDF5 File In C
      #include "hdf5.h"              /* 1. Specify include file */
      int main() {
          hid_t  file_id;            /* 2. Example of defined types */
          herr_t status;
          /* 3. H5F_ACC_TRUNC is the file access flag (create new file) */
          /* 4. H5P_DEFAULT specifies default property lists */
          file_id = H5Fcreate ("file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
          status = H5Fclose (file_id);
      }
  • 43. Creating an HDF5 File in F90
      PROGRAM FILEEXAMPLE
        USE HDF5                      ! 1. Specify HDF5 module
        IMPLICIT NONE
        ! 2. Example of defined types
        CHARACTER(LEN=8), PARAMETER :: filename = "filef.h5" ! File name
        INTEGER(HID_T) :: file_id     ! File identifier
        INTEGER :: error
        CALL h5open_f (error)         ! 3. Initialize Fortran interface
        CALL h5fcreate_f (filename, H5F_ACC_TRUNC_F, file_id, error)
        CALL h5fclose_f (file_id, error)
        CALL h5close_f (error)        ! 4. Close Fortran interface
      END PROGRAM FILEEXAMPLE
  • 44. Steps to Create a Dataset 1. Define dataset characteristics: a) Datatype b) Dataspace c) Properties (or use default) 2. Decide where to put it: a group or the root group 3. Create dataset in file 4. Close dataset handle from step 3.
  • 45. Example: Create a Dataset • File dset.h5: root group “/” containing dataset dset (Integer, 4x6)
  • 46. Create a Dataset: h5_crtdat.py
      import h5py
      file = h5py.File('dset.h5', 'w')
      # Create dataset in root group: name, dataspace (shape), datatype
      dataset = file.create_dataset('dset', (4, 6), 'i')
      file.close()    # h5py closes the dataset for you
  • 47. Write To/Read From a Dataset: h5_rdwt.py
      import h5py
      import numpy as np
      file = h5py.File('dset.h5', 'r+')
      dataset = file['dset']          # open 'dset' in root group
      data = np.zeros((4, 6))
      for i in range(4):
          for j in range(6):
              data[i][j] = i*6 + j + 1
      dataset[...] = data             # write buffer to 'dset'
      data_read = dataset[...]        # read data in 'dset' into buffer
      file.close()
  • 48. How To Write to a Subset of the dataset? • Use dataset[1:4, 2:6] = 5 (instead of “dataset[…]”): this sets the 3x4 block at rows 1-3 (dim1) and columns 2-5 (dim2) to 5
  • 49. Read integer into float buffer: h5_readtofloat.py
      import h5py
      import numpy as np
      file = h5py.File('dset.h5', 'r+')
      dataset = file['dset']
      data = np.zeros((4, 6))
      for i in range(4):
          for j in range(6):
              data[i][j] = i*6 + j + 1
      dataset[...] = data             # write buffer to integer 'dset'
      # Read data in 'dset' into float buffer
      data_read32 = np.zeros((4, 6), dtype=np.float32)
      dataset.id.read(h5py.h5s.ALL, h5py.h5s.ALL, data_read32, mtype=h5py.h5t.NATIVE_FLOAT)
      file.close()
  • 50. Steps to Create a Group 1. Decide where to put it: “root group” or other group 2. Define properties or use default 3. Create the group in file 4. Close the group
  • 51. Example: Create a Group • File dset.h5: root group “/” containing dset (4x6 array of integers) and group MyGroup
  • 52. Create a Group: h5_crtgrp.py
      import h5py
      file = h5py.File('dset.h5', 'r+')
      group = file.create_group('MyGroup')    # create group 'MyGroup' under root group
      file.close()                            # h5py closes the group for you
  • 53. Example: Create Attributes • File dset.h5: root group “/” containing dset (4x6 array of integers) with attributes Units = “Meters per second” and Speed = [100,200]
  • 54. Create Attributes: h5_crtatt.py
      1. import h5py
      2. import numpy as np
      3. file = h5py.File('dset.h5','r+')
      4. dataset = file['/dset']
      5. dataset.attrs["Units"] = "Meters per second"    # create string attribute
      6. attr_data = np.zeros((2,))
      7. attr_data[0] = 100
      8. attr_data[1] = 200
      9. dataset.attrs.create("Speed", attr_data, (2,), "i")    # create integer attribute
      10. file.close()
  • 55. HDF5 Tutorial and Examples • HDF5 Tutorial: http://www.hdfgroup.org/HDF5/Tutor/ • HDF5 Examples: http://www.hdfgroup.org/ftp/HDF5/examples/ • HDF5 Documentation: http://www.hdfgroup.org/HDF5/doc/
  • 56. HDF5 Technology Platform • HDF5 data model • The “building blocks” for data organization and specification • HDF5 software • Library, language interfaces, tools • HDF5 file format • Bit-level organization of HDF5 file
  • 57. Thank You!
  • 58. Questions/comments?
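
The Python steps from slides 41 to 54 (create a file, a dataset, a group, and attributes, then write and read data) can be strung together in one short script. The following is a minimal sketch, not the workshop's own example code. It assumes a reasonably recent h5py and NumPy, uses a `with` block instead of explicit `file.close()` calls, and opens the file with the in-memory `core` driver (`backing_store=False`) so nothing is written to disk. Those choices, and the use of `read_direct` in place of the low-level `dataset.id.read` call from slide 49, are assumptions made for illustration.

```python
import h5py
import numpy as np

# In-memory file: the "core" driver with backing_store=False keeps the
# sketch from leaving dset.h5 on disk (an assumption for illustration).
with h5py.File("dset.h5", "w", driver="core", backing_store=False) as f:
    # Slide 46: name, dataspace (shape), datatype.
    dset = f.create_dataset("dset", (4, 6), dtype="i")

    # Slide 47: write the values 1..24 to the whole dataset.
    dset[...] = np.arange(1, 25).reshape(4, 6)

    # Slide 48: partial I/O, overwrite rows 1-3, columns 2-5 with 5.
    dset[1:4, 2:6] = 5

    # Slide 52: create a group under the root group.
    f.create_group("MyGroup")

    # Slide 54: a string attribute and a 2-element integer attribute.
    dset.attrs["Units"] = "Meters per second"
    dset.attrs.create("Speed", np.array([100, 200]), shape=(2,), dtype="i")

    # Read everything back while the file is still open.
    data_read = dset[...]
    as_float = np.zeros((4, 6), dtype=np.float32)
    dset.read_direct(as_float)      # integer data converted into a float buffer
    names = sorted(f.keys())
    units = dset.attrs["Units"]
    speed = dset.attrs["Speed"]

print(names)
print(data_read)
```

The `with` block closes the file, and with it the dataset and group handles, when it exits, which matches the slides' remark that h5py closes these objects for you.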

Editor's notes

  1. HDF5 has the characteristics of other formats that are out there. It is hard to store metadata in a binary flat file, and a flat file is not scalable.
  2. A dataspace describes the “logical” layout and says nothing about how the data is actually stored on disk. For our purposes we describe dimension 1 as the vertical dimension, dimension 2 as the horizontal dimension, and dimension 3 as the depth or number of planes in the dataset.
  3. Arrows symbolize the links between objects.
  4. H5T_STD_I32BE is a pre-defined datatype with the encoding needed to fully interpret the data.
  5. In step 9, the attr_data in effect is your dataspace.
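
Note 2 says that a dataspace describes only the logical layout, not the storage on disk. A small sketch can make that concrete: two datasets with the same shape, one contiguous and one chunked and compressed, report the same dataspace but different storage properties. The file name, dataset names, chunk shape, and the choice of gzip are hypothetical, picked only for illustration; the in-memory `core` driver is again an assumption so that no file is left on disk.

```python
import h5py
import numpy as np

values = np.arange(24, dtype="i4").reshape(4, 6)

with h5py.File("layout.h5", "w", driver="core", backing_store=False) as f:
    # Default storage: contiguous.
    contiguous = f.create_dataset("contiguous", data=values)

    # Same logical layout, but chunked, gzip-compressed, and extendible
    # along the first dimension (maxshape None means unlimited).
    chunked = f.create_dataset(
        "chunked",
        data=values,
        chunks=(2, 3),
        compression="gzip",
        maxshape=(None, 6),
    )

    # Identical dataspace (rank and dimensions) ...
    same_space = contiguous.shape == chunked.shape
    # ... but different storage layout: None means contiguous.
    layouts = (contiguous.chunks, chunked.chunks)

    # Only the chunked dataset can be extended.
    chunked.resize((8, 6))
    grown_shape = chunked.shape
    first_rows_kept = bool((chunked[:4] == values).all())

print(same_space, layouts, grown_shape)
```

This mirrors slide 21: chunked storage is what makes a dataset extendible, and compression is layered on top of chunking.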