The HDF Group

Introduction to HDF5
Barbara Jones
The HDF Group
The 15th HDF and HDF-EOS Workshop
April 17-19, 2012

April 17-19, 2012

HDF/HDF-EOS Workshop XV

1

www.hdfgroup.org
Foreword
• We will be using h5py, the Python interface to HDF5
• Easy to learn
• Saves a lot of time for prototyping and getting data
and metadata out of HDF5 files
• Hides HDF5 complexity

• Resources
http://code.google.com/p/h5py/wiki/HowTo
http://alfven.org/wp/hdf5-for-python/

• Installation requires Python 2.7, NumPy 1.6.1, and
HDF5 1.8.3 (or later)

Topics Covered

• What HDF5 is
• HDF5 Data Model
• HDF5 Software and Tools
• Introduction to HDF5 APIs
• Examples

What is HDF5?
• Open file format
• Designed for high volume or complex data

• Open source software
• Works with data in the format

• A data model
• Structures for data organization and specification

HDF = Hierarchical Data Format

• HDF4 is the first HDF
• Originally called HDF; last major release was version 4

• HDF5 benefits from lessons learned with HDF4
• Changes to file format, software, and data model
• HDF5 and HDF4 are different

• No plans for an HDF6!

HDF5 has characteristics of …

HDF5 is designed …
• for small or high volume and/or complex data
• for every size and type of system (portable)
• for flexible, efficient storage and I/O
• to enable applications to evolve in their use of HDF5 and to accommodate new models
• to support long-term data preservation
• Use it as a file format tool kit

HDF5 Technology Platform
• HDF5 data model
• The “building blocks” for data
organization and specification

• HDF5 software
• Library, language interfaces, tools

• HDF5 file format
• Bit-level organization of HDF5 file

HDF5 Data Model
[Figure: the HDF5 objects: File, Group, Link, Dataset, Datatype, Dataspace, Attribute, Property List]

a.k.a. HDF5 Abstract Data Model
a.k.a. HDF5 Logical Data Model

HDF5 File

An HDF5 file is a container that holds data objects.

lat | lon | temp
----|-----|-----
12 | 23 | 3.1
15 | 24 | 4.2
17 | 21 | 3.6

HDF5 Dataset

Metadata:
• Dataspace: Rank = 3; Dimensions: Dim_1 = 4, Dim_2 = 5, Dim_3 = 7
• Datatype: Integer
• Attributes (optional): Time = 32.4, Pressure = 987
• Properties: Chunked, Compressed
Data:
• Multi-dimensional array of identically typed data elements

• HDF5 datasets organize and contain “raw data values”.
• HDF5 datatypes describe individual data elements.
• HDF5 dataspaces describe the logical layout of the data elements.

HDF5 Dataset & Dataspace
HDF5 Dataspace: specifications for array dimensions
• Rank: 3
• Dimensions: Dim_3 = 7 (the dimension labeled in the figure)
Data: multi-dimensional array of identically typed data elements

• HDF5 datasets organize and contain “raw data values”.
• HDF5 dataspaces describe the logical layout of the data elements.

HDF5 Dataspaces
Describe the logical layout of the elements in an HDF5 dataset
• NULL
- no elements
• Scalar
- single element
• Simple array (most common)
- Multiple elements organized in a rectilinear array:
Rank = number of dimensions
Dimension size = number of elements in each dimension
Maximum number of elements in each dimension can
be fixed or unlimited
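
The dataspace kinds listed above can be tried out with h5py. The following is a minimal sketch, assuming h5py is installed; the file and dataset names are made up for illustration, and the file lives only in memory (core driver) so nothing is written to disk. The NULL dataspace is not shown.

```python
import h5py

# In-memory file: the "core" driver with backing_store=False writes nothing to disk.
# All names below ("dataspaces_demo.h5", "scalar", "fixed", "growable") are hypothetical.
with h5py.File("dataspaces_demo.h5", "w", driver="core", backing_store=False) as f:
    # Scalar dataspace: a single element, shape ().
    scalar = f.create_dataset("scalar", shape=(), dtype="i4")
    scalar[()] = 42

    # Simple dataspace with fixed dimensions: rank 2, 4 x 6.
    fixed = f.create_dataset("fixed", shape=(4, 6), dtype="i4")

    # Simple dataspace whose first dimension is unlimited (None in maxshape).
    growable = f.create_dataset("growable", shape=(4, 6),
                                maxshape=(None, 6), dtype="i4")
    growable.resize((10, 6))  # extend the unlimited dimension

    scalar_shape = scalar.shape
    fixed_rank, fixed_maxshape = fixed.ndim, fixed.maxshape
    grown_shape, grown_maxshape = growable.shape, growable.maxshape
```

In h5py the rank is the length of the shape tuple, and an unlimited dimension shows up as None in maxshape.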

HDF5 Dataspaces
Two roles:
• A dataspace contains spatial information (logical layout) about a dataset stored in a file
- Rank and dimensions
- Permanent part of the dataset definition
- Example: Rank = 2, Dimensions = 4x6
• Partial I/O: dataspaces describe the application's data buffers and the data elements participating in I/O
- Example: Rank = 1, Dimension = 10

HDF5 Dataset & Datatype
HDF5 Datatype
Integer 32bit LE

Specifications for single data
element

Multi-dimensional array of
identically typed data elements

• HDF5 datasets organize and contain “raw data values”.

• HDF5 datatypes describe individual data elements.

HDF5 Datatypes
• Describe individual data elements in an HDF5 dataset
• Wide range of datatypes supported
• Integer (signed and unsigned, 32 and 64-bit, etc.)
• Float
• Variable-length sequence types (e.g., strings)
• Compound (similar to C structs)
• User-defined (e.g., 13-bit integer)
• Nested types
• Pretty much any type!

HDF5 Dataset
[Figure: a 5 x 3 array of integers; one element is shown holding the value 12]
Datatype: 32-bit Integer
Dataspace: Rank = 2, Dimensions = 5 x 3

HDF5 Dataset with Compound Datatype
[Figure: a 5 x 3 array in which every element V is a compound value]
Compound Datatype: int16, char, int32, 2x3x2 array of float32
Dataspace: Rank = 2, Dimensions = 5 x 3
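
A compound datatype like the one on this slide can be described in h5py with a NumPy structured dtype. This is a sketch under assumptions: the slide names only the member types, so the field names a, b, c, d and the file and dataset names are hypothetical. The file is kept in memory.

```python
import numpy as np
import h5py

# Field names are hypothetical; the slide lists only the member types.
compound = np.dtype([
    ("a", np.int16),
    ("b", "S1"),                    # one-byte character
    ("c", np.int32),
    ("d", np.float32, (2, 3, 2)),   # 2x3x2 array of float32
])

# Fill a 5 x 3 buffer in memory, then write it in one call.
data = np.zeros((5, 3), dtype=compound)
data["a"][0, 0] = 7
data["d"][0, 0, 1, 2, 1] = 1.5

with h5py.File("compound_demo.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("table", shape=(5, 3), dtype=compound)
    dset[...] = data
    out = dset[...]                 # read everything back
    names = dset.dtype.names
    shape = dset.shape
```

Each of the 15 elements carries all four members, so the rank-2 dataspace stays 5 x 3 no matter how large the compound type is.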

HDF5 Dataset
Metadata:
• Dataspace: Rank = 3; Dimensions: Dim_1 = 4, Dim_2 = 5, Dim_3 = 7
• Datatype: Integer
• Attributes (optional): Time = 32.4, Pressure = 987
• Properties: Chunked, Compressed
Data:
• Multi-dimensional array of identically typed data elements

HDF5 Property Lists
Property lists allow you to configure or control the behavior of the library.
They provide fine-grained control when creating or accessing objects, for example, how datasets are stored, performance tuning, …
Property lists come with default values.

Dataset Storage Properties
• Contiguous (default): data elements stored physically adjacent to each other
• Chunked: better access time for subsets; extendible
• Chunked & Compressed: improves storage efficiency, transmission speed
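
In h5py these storage properties are chosen through keyword arguments when a dataset is created. The sketch below assumes h5py with its standard gzip filter; the file name, dataset names, and the 10 x 10 chunk size are arbitrary choices for illustration.

```python
import numpy as np
import h5py

data = np.arange(100 * 100, dtype="i4").reshape(100, 100)

# In-memory file; all names and the chunk size are hypothetical.
with h5py.File("storage_demo.h5", "w", driver="core", backing_store=False) as f:
    # Contiguous layout is the default when no chunking is requested.
    contiguous = f.create_dataset("contiguous", data=data)

    # Chunked layout; an unlimited first dimension makes the dataset extendible.
    chunked = f.create_dataset("chunked", data=data,
                               chunks=(10, 10), maxshape=(None, 100))
    chunked.resize((200, 100))

    # Chunked and compressed: filters such as gzip are applied per chunk.
    compressed = f.create_dataset("compressed", data=data,
                                  chunks=(10, 10), compression="gzip")

    layouts = (contiguous.chunks, chunked.chunks, compressed.chunks)
    compression = compressed.compression
    extended_shape = chunked.shape
    roundtrip_ok = bool((compressed[...] == data).all())
```

Compression is transparent to the reader: the data read back from the compressed dataset is identical to what was written.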

HDF5 Attributes
• Typically contain user metadata

• Have a name and a value
• Attributes “decorate” HDF5 objects
• Value is described by a datatype and a dataspace
• Analogous to a dataset, but attributes do not support partial I/O operations, and they cannot be compressed or extended

HDF5 Data Model: Are we there yet?
[Figure: HDF5 objects: Group and Link, Attribute, Property List, Dataspace, Datatype, Dataset, File]


HDF5 Groups and Links
HDF5 groups and links organize data objects.
Every HDF5 file has a root group.
Similar to UNIX directories.

[Figure: root group “/” with members Viz and SimOut; objects shown include Parameters (10;100;1000), Timestep (36,000), and the table below]

lat | lon | temp
----|-----|-----
12 | 23 | 3.1
15 | 24 | 4.2
17 | 21 | 3.6

HDF5 Groups
• The path to an object defines it
• Objects can be shared:
/A/k and /B/m are the same
[Figure: root group “/” with groups A, B, and C; link k in A and link m in B lead to the same object; datasets named temp also appear in the tree. Legend: Group, Dataset]
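
Object sharing can be reproduced in h5py by assigning an existing object to a second path, which creates another hard link to it. A minimal sketch, reusing the paths /A/k and /B/m from the slide; the file name and the dataset contents are invented, and the file stays in memory.

```python
import h5py

with h5py.File("links_demo.h5", "w", driver="core", backing_store=False) as f:
    f.create_group("A")
    f.create_group("B")
    k = f.create_dataset("A/k", shape=(3,), dtype="i4")

    # Assigning an existing object to a new name creates a hard link,
    # so /B/m and /A/k now refer to the same dataset.
    f["B/m"] = k

    # Write through one path, read through the other.
    f["A/k"][...] = [1, 2, 3]
    via_m = f["B/m"][...].tolist()
    paths = sorted(name for name in ("A/k", "B/m") if name in f)
```

No data is copied: both paths lead to one object, so a write through either path is visible through the other.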

HDF5 Technology Platform
• HDF5 data model
• The “building blocks” for data
organization and specification

• HDF5 software
• Library, language interfaces,
tools

HDF5 Home Page
HDF5 home page: http://hdfgroup.org/HDF5/
• Latest release: HDF5 1.8.8 (1.8.9 coming in May)

HDF5 source code:
• Written in C, and includes optional C++, Fortran 90 APIs, and High Level APIs.
• Contains command-line utilities (h5dump, h5repack, h5diff, ..) and compile scripts

HDF5 pre-built binaries:
• When possible, include C, C++, F90, and High Level libraries.
Check ./lib/libhdf5.settings file.
• Built with and require the SZIP and ZLIB external libraries,
which are included.

HDF5 API and Applications
[Figure: applications (EOS application, MATLAB, …) and domain data objects (EOS library) built on the HDF5 Library, which sits on top of storage]

HDF5 Software Layers & Storage

[Figure: HDF5 software layers, top to bottom]
• Tools and interfaces: h5dump tool, h5repack tool, HDFView tool, High Level APIs, Java Interface, …
• API: language interfaces (C, Fortran, C++)
• HDF5 Library: HDF5 Data Model Objects (Groups, Datasets, Attributes, …) and Tunable Properties (Chunk Size, I/O Driver, …)
• Internals: Memory Mgmt, Datatype Conversion, Filters, Chunked Storage, Version Compatibility, and so on…
• Virtual File Layer (I/O drivers): Posix I/O, Split Files, MPI I/O, Custom
• Storage (HDF5 File Format): File, Split Files, File on Parallel Filesystem, Other

HDF5 File Format
• Defined by the HDF5 File Format Specification.
http://www.hdfgroup.org/HDF5/doc/H5.format.html
• Specifies the bit-level organization of an HDF5 file on
storage media.

• HDF5 library adheres to the File Format, so for the most
part basic users do not need to know the guts of this
information.

Useful Tools For New Users
h5dump:
Tool to “dump” or display contents of HDF5 files
h5cc, h5c++, h5fc:
Scripts to compile applications
HDFView:
Java browser to view HDF4 and HDF5 files
http://www.hdfgroup.org/hdf-java-html/hdfview/

h5dump utility
h5dump [options] [file]
-H, --header   Display header only (no data)
-d <names>     Display specified pathname/dataset(s)
-g <names>     Display the specified group(s) and all members
-p             Display properties

<names> is one or more appropriate object names.

Example of h5dump Output
HDF5 "my.h5" {
GROUP "/" {
   DATASET "mydata" {
      DATATYPE { H5T_STD_I32BE }
      DATASPACE { SIMPLE ( 4, 6 ) / ( 4, 6 ) }
      DATA {
         1, 2, 3, 4, 5, 6,
         7, 8, 9, 10, 11, 12,
         13, 14, 15, 16, 17, 18,
         19, 20, 21, 22, 23, 24
      }
   }
}
}

[Figure: file my.h5 with root group “/” containing dataset mydata]

Introduction to
HDF5 Programming Model
and APIs

General Programming Paradigm
• Object is opened or created
• Object is written to or read from, possibly many
times
• Object is closed

• Properties of object are optionally defined
Creation properties
Access properties

The HDF5 API
• The API is extensive
• 300+ functions
[Image: Swiss Army Cybertool 34]

• This can be daunting… but there is hope
• A few functions can do a lot
• Start simple
• Build up knowledge as more features are needed

HDF5 APIs
• Currently C, Fortran 90, C++ and Java bindings
supported by The HDF Group
• Others:
HDF5DotNet (C#, VB.NET, IronPython,..)
http://hdf5.net/
h5py (Python)
http://code.google.com/p/h5py/
(developed by Andrew Collette)

Language Specific Requirements
• For portability, the HDF5 library has its own defined
types. For example, hid_t is used for object handles.

• Must include language specific files in your application:
C: add #include "hdf5.h"
F90: add USE HDF5, and call h5open_f/h5close_f to initialize/close the Fortran interface
C++: add #include "H5Cpp.h"
Python: add import h5py and import numpy

Example HDF5 Code

Steps to Create a File
1. Specify property lists (or use defaults)
2. Create the file
3. Close the file (and properties if necessary)

Creating an HDF5 File in Python
import h5py

# 'w' is the file access flag (create new file)
file = h5py.File('file.h5', 'w')
file.close()

[Figure: file.h5 containing only the root group “/”]

Creating an HDF5 File In C
#include "hdf5.h"           /* 1. Specify include file */

int main() {
    hid_t  file_id;         /* 2. Example of defined types */
    herr_t status;

    /* 3. H5F_ACC_TRUNC is the file access flag (create new file).
       4. H5P_DEFAULT specifies the default property lists. */
    file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC,
                        H5P_DEFAULT, H5P_DEFAULT);
    status = H5Fclose(file_id);
}

Creating an HDF5 File in F90
PROGRAM FILEEXAMPLE

  USE HDF5                     ! 1. Specify HDF5 module
  IMPLICIT NONE

  ! 2. Example of defined types
  CHARACTER(LEN=8), PARAMETER :: filename = "filef.h5" ! File name
  INTEGER(HID_T) :: file_id                            ! File identifier
  INTEGER :: error

  CALL h5open_f (error)        ! 3. Initialize Fortran interface
  CALL h5fcreate_f (filename, H5F_ACC_TRUNC_F, file_id, error)
  CALL h5fclose_f (file_id, error)
  CALL h5close_f (error)       ! 4. Close Fortran interface
END PROGRAM FILEEXAMPLE

Steps to Create a Dataset
1. Define dataset characteristics
a) Datatype
b) Dataspace
c) Properties (or use default)

2. Decide where to put it
Group or root group

3. Create dataset in file
4. Close dataset handle from step 3.

Example: Create a Dataset
[Figure: file dset.h5 with root group “/” containing dataset dset (Integer, 4x6)]

Create a Dataset: h5_crtdat.py
import h5py
file = h5py.File('dset.h5', 'w')
# Create dataset in root group: name, dataspace (shape), datatype
dataset = file.create_dataset('dset', (4, 6), 'i')
# h5py closes the dataset for you
file.close()

Write To/Read From a Dataset: h5_rdwt.py
import h5py
import numpy as np
file = h5py.File('dset.h5', 'r+')

# Open 'dset' in root group
dataset = file['dset']
data = np.zeros((4, 6))
for i in range(4):
    for j in range(6):
        data[i][j] = i*6 + j + 1

# Write buffer to 'dset'
dataset[...] = data

# Read data in 'dset' into buffer
data_read = dataset[...]
file.close()

How To Write to a Subset of the dataset?

dataset[1:4, 2:6] = 5
(instead of using dataset[...])

[Figure: the 4 x 6 dataset with axes dim1 and dim2; the selected 3 x 4 block is filled with the value 5]
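
The subset write can be checked end to end with a short script. This is a sketch that assumes h5py and NumPy; it uses an in-memory file with a made-up name and fills the 4 x 6 dataset with 1..24 first, as in the earlier write example.

```python
import numpy as np
import h5py

# In-memory stand-in for dset.h5; the file name is hypothetical.
with h5py.File("subset_demo.h5", "w", driver="core", backing_store=False) as f:
    dataset = f.create_dataset("dset", (4, 6), "i")
    dataset[...] = np.arange(1, 25).reshape(4, 6)   # values 1..24

    # Write only rows 1-3 and columns 2-5; the rest of the dataset is untouched.
    dataset[1:4, 2:6] = 5

    block = dataset[1:4, 2:6]       # partial read of the same region
    first_row = dataset[0, :].tolist()
    whole = dataset[...]
```

Slices follow the usual Python convention: the end index is excluded, so 1:4 selects three rows and 2:6 selects four columns.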

Read integer into float buffer: h5_readtofloat.py
import h5py
import numpy as np
file = h5py.File('dset.h5', 'r+')
dataset = file['dset']
data = np.zeros((4, 6))
for i in range(4):
    for j in range(6):
        data[i][j] = i*6 + j + 1

# Write buffer to integer 'dset'
dataset[...] = data

# Read data in 'dset' into float buffer
data_read32 = np.zeros((4, 6), dtype=np.float32)
dataset.id.read(h5py.h5s.ALL, h5py.h5s.ALL, data_read32,
                mtype=h5py.h5t.NATIVE_FLOAT)
file.close()

Steps to Create a Group
1. Decide where to put it – “root group”
or other group
2. Define properties or use default

3. Create the group in file
4. Close the group

Example: Create a Group
[Figure: file dset.h5 with root group “/” containing dataset dset (4x6 array of integers) and group MyGroup]

Create a Group: h5_crtgrp.py
import h5py
file = h5py.File('dset.h5', 'r+')

# Create group 'MyGroup' under root group
group = file.create_group('MyGroup')
# h5py closes the group for you
file.close()
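
Once a group exists, objects can be placed inside it and the hierarchy can be listed. The sketch below assumes h5py and works on an in-memory file; the names Sub and data are hypothetical additions used only to show nesting.

```python
import h5py

# In-memory stand-in for dset.h5; "Sub" and "data" are hypothetical names.
with h5py.File("groups_demo.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("dset", (4, 6), "i")
    group = f.create_group("MyGroup")

    # Intermediate groups in a path are created automatically.
    group.create_dataset("Sub/data", (2,), "i")

    root_members = sorted(f.keys())     # direct members of the root group

    all_paths = []
    f.visit(all_paths.append)           # visit every object below the root
    all_paths.sort()
```

Paths given to create_dataset are relative to the group they are called on, so the new dataset ends up at /MyGroup/Sub/data.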

Example: Create Attributes
[Figure: file dset.h5 with root group “/” containing dataset dset (4x6 array of integers) with attributes Units = "Meters per second" and Speed = [100, 200]]

Create Attributes: h5_crtatt.py
import h5py
import numpy as np

file = h5py.File('dset.h5', 'r+')
dataset = file['/dset']

# Create string attribute
dataset.attrs["Units"] = "Meters per second"

attr_data = np.zeros((2,))
attr_data[0] = 100
attr_data[1] = 200

# Create integer attribute
dataset.attrs.create("Speed", attr_data, (2,), "i")
file.close()
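
Reading the attributes back is a good way to confirm what was stored. A minimal sketch, assuming a recent h5py on Python 3 (where string attributes are returned as str); it recreates the dataset in an in-memory file with a made-up name instead of reopening dset.h5.

```python
import numpy as np
import h5py

# In-memory stand-in for dset.h5; the file name is hypothetical.
with h5py.File("attrs_demo.h5", "w", driver="core", backing_store=False) as f:
    dataset = f.create_dataset("dset", (4, 6), "i")
    dataset.attrs["Units"] = "Meters per second"
    dataset.attrs.create("Speed", np.array([100, 200]), (2,), "i")

    names = sorted(dataset.attrs.keys())    # attribute names on the dataset
    units = dataset.attrs["Units"]          # scalar string attribute
    speed = dataset.attrs["Speed"]          # array attribute, read as a NumPy array
    speed_list, speed_kind = speed.tolist(), speed.dtype.kind
```

An attribute is always read or written as a whole, which is why attrs behaves like a dictionary and offers no slicing.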

HDF5 Tutorial and Examples
HDF5 Tutorial:
http://www.hdfgroup.org/HDF5/Tutor/
HDF5 Examples:
http://www.hdfgroup.org/ftp/HDF5/examples/
HDF5 Documentation:
http://www.hdfgroup.org/HDF5/doc/

HDF5 Technology Platform
• HDF5 data model
• The “building blocks” for data
organization and specification

• HDF5 software
• Library, language interfaces, tools

• HDF5 file format
• Bit-level organization of HDF5 file

The HDF Group

Thank You!

The HDF Group

Questions/comments?

 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 

Introduction to HDF5 Data and Programming Models

  • 1. The HDF Group. Introduction to HDF5. Barbara Jones, The HDF Group. The 15th HDF and HDF-EOS Workshop, April 17-19, 2012. HDF/HDF-EOS Workshop XV. www.hdfgroup.org
  • 2. Foreword. We will be using h5py, the Python interface to HDF5: it is easy to learn, saves a lot of time for prototyping and for getting data and metadata out of HDF5 files, and hides HDF5 complexity. Resources: http://code.google.com/p/h5py/wiki/HowTo and http://alfven.org/wp/hdf5-for-python/. Installation requires Python 2.7, NumPy 1.6.1, and HDF5 1.8.3 (or later).
  • 3. Topics Covered: what HDF5 is; HDF5 data model; HDF5 software and tools; introduction to HDF5 APIs; examples.
  • 4. What is HDF5? An open file format, designed for high-volume or complex data; open source software that works with data in the format; and a data model, with structures for data organization and specification.
  • 5. HDF = Hierarchical Data Format. HDF4 is the first HDF (originally called HDF; its last major release was version 4). HDF5 benefits from lessons learned with HDF4: the file format, software, and data model all changed, so HDF5 and HDF4 are different. No plans for an HDF6!
  • 6. HDF5 has characteristics of …
  • 7. HDF5 is designed: for small or high-volume and/or complex data; for every size and type of system (portable); for flexible, efficient storage and I/O; to enable applications to evolve in their use of HDF5 and to accommodate new models; and to support long-term data preservation. Use it as a file format tool kit.
  • 8. HDF5 Technology Platform. HDF5 data model: the “building blocks” for data organization and specification. HDF5 software: library, language interfaces, tools. HDF5 file format: bit-level organization of an HDF5 file.
  • 9. HDF5 Data Model (a.k.a. HDF5 Abstract Data Model, a.k.a. HDF5 Logical Data Model). HDF5 objects: File, Group, Link, Dataset, Datatype, Dataspace, Attribute, Property List.
  • 10. HDF5 File. An HDF5 file is a container that holds data objects. Example object, a table with columns lat | lon | temp and rows 12 | 23 | 3.1; 15 | 24 | 4.2; 17 | 21 | 3.6.
  • 11. HDF5 Dataset. Data: a multi-dimensional array of identically typed data elements. Metadata: dataspace (rank 3; Dim_1 = 4, Dim_2 = 5, Dim_3 = 7), datatype (integer), optional attributes (Time = 32.4, Pressure = 987), and properties (chunked, compressed). HDF5 datasets organize and contain “raw data values”. HDF5 datatypes describe individual data elements. HDF5 dataspaces describe the logical layout of the data elements.
  • 12. HDF5 Dataset & Dataspace. The HDF5 dataspace holds the specifications for array dimensions (rank and dimensions; the figure shows rank 3 with Dim_3 = 7). The data is a multi-dimensional array of identically typed data elements. HDF5 datasets organize and contain “raw data values”. HDF5 dataspaces describe the logical layout of the data elements.
  • 13. HDF5 Dataspaces describe the logical layout of the elements in an HDF5 dataset. NULL: no elements. Scalar: single element. Simple array (most common): multiple elements organized in a rectilinear array, where rank = number of dimensions, dimension size = number of elements in each dimension, and the maximum number of elements in each dimension can be fixed or unlimited.
  • 14. HDF5 Dataspaces have two roles. (1) A dataspace contains spatial information (logical layout) about a dataset stored in a file: rank and dimensions, a permanent part of the dataset definition (example: rank = 2, dimensions = 4x6). (2) Partial I/O: dataspaces describe the application's data buffers and the data elements participating in I/O (example: rank = 1, dimension = 10).
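The dataspace properties on slides 13 and 14 (rank, current dimensions, fixed or unlimited maximum dimensions) can be sketched in h5py roughly as follows. This is a minimal illustration, not the workshop's own code: the file name `dataspace_sketch.h5`, the dataset name `dset`, and the in-memory `core` driver are assumptions chosen so the example leaves no file on disk.

```python
import h5py

# Assumption: an in-memory file (core driver, no backing store) keeps the sketch self-contained.
f = h5py.File('dataspace_sketch.h5', 'w', driver='core', backing_store=False)

# Rank 2, current dimensions 4 x 6; None marks the first dimension as unlimited.
dset = f.create_dataset('dset', shape=(4, 6), dtype='i4', maxshape=(None, 6))

rank = len(dset.shape)      # number of dimensions
dims_before = dset.shape    # current size of each dimension
max_dims = dset.maxshape    # maximum size of each dimension (None = unlimited)

dset.resize((8, 6))         # allowed because the first dimension is unlimited
dims_after = dset.shape

f.close()
```

In h5py the `shape` and `maxshape` arguments together play the role of the dataspace; a dataset created without `maxshape` has fixed dimensions.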
  • 15. HDF5 Dataset & Datatype. The HDF5 datatype holds the specifications for a single data element (example: integer, 32-bit, little-endian). The data is a multi-dimensional array of identically typed data elements. HDF5 datasets organize and contain “raw data values”. HDF5 datatypes describe individual data elements.
  • 16. HDF5 Datatypes describe individual data elements in an HDF5 dataset. A wide range of datatypes is supported: integer (signed and unsigned, 32- and 64-bit, etc.); float; variable-length sequence types (e.g., strings); compound (similar to C structs); user-defined (e.g., 13-bit integer); nested types. Pretty much any type!
  • 17. HDF5 Dataset example. Datatype: 32-bit integer. Dataspace: rank = 2, dimensions = 5 x 3.
  • 18. HDF5 Dataset with Compound Datatype. Compound datatype: each element V has four members (int16, char, int32, and a 2x3x2 array of float32). Dataspace: rank = 2, dimensions = 5 x 3.
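One way to approximate the compound datatype on slide 18 in h5py is a NumPy structured dtype. The sketch below is an illustration under assumptions: the member names `a`, `b`, `c`, `d`, the dataset name `records`, and the in-memory file are all hypothetical, since the slide gives only the member types.

```python
import h5py
import numpy as np

# Assumption: member names are invented; the slide lists only the member types.
compound = np.dtype([
    ('a', np.int16),
    ('b', 'S1'),                     # one-byte string standing in for a C char
    ('c', np.int32),
    ('d', np.float32, (2, 3, 2)),    # 2x3x2 array member
])

f = h5py.File('compound_sketch.h5', 'w', driver='core', backing_store=False)
dset = f.create_dataset('records', shape=(5, 3), dtype=compound)

# Fill a NumPy buffer with the same dtype, then write it in one call.
data = np.zeros((5, 3), dtype=compound)
data['a'] = 7
data['d'][...] = 1.5
dset[...] = data

readback = dset[...]                 # structured array, one record per element
f.close()
```

Each of the 5 x 3 elements carries all four members, which is why `readback['d']` has the dataset shape followed by the member's own 2x3x2 shape.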
  • 19. HDF5 Dataset. Data: a multi-dimensional array of identically typed data elements. Metadata: dataspace (rank 3; Dim_1 = 4, Dim_2 = 5, Dim_3 = 7), datatype (integer), optional attributes (Time = 32.4, Pressure = 987), and properties (chunked, compressed).
  • 20. HDF5 Property Lists. Property lists allow you to configure or control the behavior of the library. They provide fine-grained control when creating or accessing objects, for example how datasets are stored or how performance is tuned. Property lists have default values.
  • 21. Dataset Storage Properties. Contiguous (default): data elements stored physically adjacent to each other. Chunked: better access time for subsets; extendible. Chunked & compressed: improves storage efficiency and transmission speed.
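The three storage layouts on slide 21 map onto keyword arguments of h5py's `create_dataset`. The following is a hedged sketch: the dataset names, the 10 x 10 chunk size, and the in-memory file are assumptions made for illustration, not recommendations from the slides.

```python
import h5py
import numpy as np

f = h5py.File('storage_sketch.h5', 'w', driver='core', backing_store=False)
data = np.arange(100 * 100, dtype='i4').reshape(100, 100)

# Contiguous is the default layout when no chunking or filter is requested.
contiguous = f.create_dataset('contiguous', data=data)
# Chunked: the array is stored as independent 10 x 10 tiles.
chunked = f.create_dataset('chunked', data=data, chunks=(10, 10))
# Chunked and compressed: each tile passes through the gzip filter.
compressed = f.create_dataset('compressed', data=data, chunks=(10, 10),
                              compression='gzip')

layouts = {d.name: (d.chunks, d.compression)
           for d in (contiguous, chunked, compressed)}

# Reading a subset only touches the chunks that overlap the selection.
subset = compressed[20:30, 40:50]
f.close()
```

Compression in HDF5 is applied per chunk, which is why the compressed layout on the slide is also a chunked one.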
  • 22. HDF5 Attributes. Typically contain user metadata. Have a name and a value. Attributes “decorate” HDF5 objects. The value is described by a datatype and a dataspace. Attributes are analogous to datasets, but they do not support partial I/O operations, nor can they be compressed or extended.
  • 23. HDF5 Data Model: Are we there yet? HDF5 objects: File, Dataset, Datatype, Dataspace, Attribute, Property List, Group and Link.
  • 24. HDF5 Groups and Links. HDF5 groups and links organize data objects. Every HDF5 file has a root group (“/”), and groups are similar to UNIX directories. The example figure shows “/”, Viz, SimOut, Parameters (10;100;1000), Timestep (36,000), and a table with columns lat | lon | temp and rows 12 | 23 | 3.1; 15 | 24 | 4.2; 17 | 21 | 3.6.
  • 25. HDF5 Groups. The path to an object defines it. Objects can be shared: /A/k and /B/m are the same dataset. The figure shows the root group “/” with groups A, B, and C, and a dataset temp.
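The shared object on slide 25 can be reproduced in h5py by assigning an existing dataset to a second path, which creates another hard link. This sketch assumes the group layout from the slide (A, B, C under the root) and invents the data values and the in-memory file.

```python
import h5py
import numpy as np

f = h5py.File('links_sketch.h5', 'w', driver='core', backing_store=False)
for name in ('A', 'B', 'C'):
    f.create_group(name)

# Create the dataset once, reachable as /A/k.
f['/A'].create_dataset('k', data=np.array([3.1, 4.2, 3.6]))
# Assigning an existing object to a new path adds a second hard link, /B/m.
f['/B/m'] = f['/A/k']

f['/B/m'][0] = 9.9              # write through one path...
value_via_a = f['/A/k'][0]      # ...and read the change through the other
f.close()
```

Both paths name the same stored object, so no data is copied when the second link is made.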
  • 26. HDF5 Technology Platform. HDF5 data model: the “building blocks” for data organization and specification. HDF5 software: library, language interfaces, tools.
  • 27. HDF5 Home Page: http://hdfgroup.org/HDF5/. Latest release: HDF5 1.8.8 (1.8.9 coming in May). HDF5 source code: written in C, and includes optional C++, Fortran 90 APIs, and High Level APIs; contains command-line utilities (h5dump, h5repack, h5diff, ..) and compile scripts. HDF5 pre-built binaries: when possible, include C, C++, F90, and High Level libraries (check the ./lib/libhdf5.settings file); built with and require the SZIP and ZLIB external libraries, which are included.
  • 28. HDF5 API and Applications. Figure labels: Applications (EOS, MATLAB, …), Application Domain Data Objects, EOS library, HDF5 Library, Storage.
  • 29. HDF5 Software Layers & Storage. Tools and API: h5dump tool, h5repack tool, HDFView tool, High Level APIs. Language interfaces: C, Fortran, C++, Java interface. HDF5 library: HDF5 data model objects (groups, datasets, attributes, …) and tunable properties (chunk size, I/O driver, …). Internals: memory management, datatype conversion, filters, chunked storage, version compatibility, and so on. Virtual File Layer I/O drivers: POSIX I/O, split files, MPI I/O, custom. Storage in the HDF5 file format: file, split files, file on parallel filesystem, other.
  • 30. HDF5 File Format. Defined by the HDF5 File Format Specification: http://www.hdfgroup.org/HDF5/doc/H5.format.html. Specifies the bit-level organization of an HDF5 file on storage media. The HDF5 library adheres to the file format, so for the most part basic users do not need to know the details.
  • 31. Useful Tools For New Users. h5dump: tool to “dump” or display contents of HDF5 files. h5cc, h5c++, h5fc: scripts to compile applications. HDFView: Java browser to view HDF4 and HDF5 files, http://www.hdfgroup.org/hdf-java-html/hdfview/
  • 32. h5dump utility. Usage: h5dump [options] [file]
        -H, --header    Display header only, no data
        -d <names>      Display specified pathname/dataset(s)
        -g <names>      Display the specified group(s) and all members
        -p              Display properties
        <names> is one or more appropriate object names.
  • 33. Example of h5dump Output for my.h5, which holds the dataset mydata in the root group “/”:
        HDF5 "my.h5" {
        GROUP "/" {
           DATASET "mydata" {
              DATATYPE { H5T_STD_I32BE }
              DATASPACE { SIMPLE ( 4, 6 ) / ( 4, 6 ) }
              DATA {
                 1, 2, 3, 4, 5, 6,
                 7, 8, 9, 10, 11, 12,
                 13, 14, 15, 16, 17, 18,
                 19, 20, 21, 22, 23, 24
              }
           }
        }
        }
  • 34. Introduction to HDF5 Programming Model and APIs
  • 35. General Programming Paradigm. An object is opened or created; the object is written to or read from, possibly many times; the object is closed. Properties of the object are optionally defined: creation properties and access properties.
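The open/create, read or write, close cycle on slide 35 can be sketched in h5py with a `with` block, which closes the file on exit. This is one possible idiom rather than the pattern the slides use (they call `file.close()` explicitly); the file name, dataset name, and in-memory driver are assumptions.

```python
import h5py

# Assumption: in-memory file so the sketch leaves nothing on disk.
with h5py.File('paradigm_sketch.h5', 'w', driver='core', backing_store=False) as f:
    dset = f.create_dataset('dset', shape=(4, 6), dtype='i4')   # object is created
    dset[0, :] = 1                                             # written to...
    dset[1, :] = 2                                             # ...possibly many times
    total = int(dset[...].sum())                               # read from
    open_inside = bool(f)

# Leaving the with block closed the file and the objects opened from it.
open_after = bool(f)
```

Creation and access properties are left at their defaults here, matching the "optionally defined" wording of the slide.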
  • 36. The HDF5 API. The API is extensive: 300+ functions (pictured as a Swiss Army Cybertool 34). This can be daunting… but there is hope: a few functions can do a lot. Start simple, and build up knowledge as more features are needed.
  • 37. HDF5 APIs. C, Fortran 90, C++, and Java bindings are currently supported by The HDF Group. Others: HDF5DotNet (C#, VB.NET, IronPython, ..), http://hdf5.net/; h5py (Python), http://code.google.com/p/h5py/ (developed by Andrew Collette).
  • 38. Language Specific Requirements. For portability, the HDF5 library has its own defined types; for example, hid_t is used for object handles. You must include language-specific files in your application. C: add #include "hdf5.h". F90: add USE HDF5, and call h5open_f/h5close_f to initialize/close the Fortran interface. C++: add #include "H5Cpp.h". Python: add import h5py and import numpy.
  • 39. The HDF Group. Example HDF5 Code
  • 40. Steps to Create a File. 1. Specify property lists (or use defaults). 2. Create the file. 3. Close the file (and properties if necessary).
  • 41. Creating an HDF5 File in Python. The 'w' file access flag creates a new file; the resulting file.h5 contains only the root group “/”.
        import h5py
        file = h5py.File('file.h5', 'w')
        file.close()
  • 42. Creating an HDF5 File in C. (1) Specify the include file. (2) hid_t and herr_t are examples of defined types. (3) H5F_ACC_TRUNC is the file access flag (create new file). (4) H5P_DEFAULT specifies default property lists.
        #include "hdf5.h"
        int main() {
            hid_t  file_id;
            herr_t status;
            file_id = H5Fcreate("file.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
            status = H5Fclose(file_id);
        }
  • 43. Creating an HDF5 File in F90. (1) Specify the HDF5 module. (2) HID_T is an example of a defined type. (3) Initialize the Fortran interface. (4) Close the Fortran interface.
        PROGRAM FILEEXAMPLE
          USE HDF5
          IMPLICIT NONE
          CHARACTER(LEN=8), PARAMETER :: filename = "filef.h5" ! File name
          INTEGER(HID_T) :: file_id                            ! File identifier
          INTEGER :: error
          CALL h5open_f (error)
          CALL h5fcreate_f (filename, H5F_ACC_TRUNC_F, file_id, error)
          CALL h5fclose_f (file_id, error)
          CALL h5close_f (error)
        END PROGRAM FILEEXAMPLE
  • 44. Steps to Create a Dataset. 1. Define dataset characteristics: (a) datatype, (b) dataspace, (c) properties (or use default). 2. Decide where to put it: a group or the root group. 3. Create the dataset in the file. 4. Close the dataset handle from step 3.
  • 45. Example: Create a Dataset. File dset.h5 with the root group “/” containing the dataset dset (integer, 4x6).
  • 46. Create a Dataset: h5_crtdat.py. Creates the dataset in the root group; the create_dataset arguments are the name, the dataspace (shape), and the datatype. h5py closes the dataset for you.
        import h5py
        file = h5py.File('dset.h5', 'w')
        dataset = file.create_dataset('dset', (4, 6), 'i')
        file.close()
  • 47. Write To/Read From a Dataset: h5_rdwt.py
        import h5py
        import numpy as np
        file = h5py.File('dset.h5', 'r+')
        dataset = file['dset']          # open 'dset' in root group
        data = np.zeros((4, 6))
        for i in range(4):
            for j in range(6):
                data[i][j] = i*6 + j + 1
        dataset[...] = data             # write buffer to 'dset'
        data_read = dataset[...]        # read data in 'dset' into buffer
        file.close()
  • 48. How to write to a subset of the dataset? Use a slice instead of “dataset[...]”: dataset[1:4, 2:6] = 5 sets the elements in rows 1 to 3 (dim1) and columns 2 to 5 (dim2) to 5.
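The subset write on slide 48 can be checked with a short, self-contained sketch. It assumes a freshly created 4 x 6 integer dataset (zero-filled by default) in an in-memory file rather than the dset.h5 file built in the earlier slides.

```python
import h5py

# Assumption: a new in-memory file stands in for dset.h5.
f = h5py.File('subset_sketch.h5', 'w', driver='core', backing_store=False)
dataset = f.create_dataset('dset', shape=(4, 6), dtype='i4')

# Partial I/O: only rows 1..3 and columns 2..5 are written.
dataset[1:4, 2:6] = 5

result = dataset[...]     # read the whole dataset back into a NumPy array
f.close()
```

The slice selects a 3 x 4 block, so twelve elements are set to 5 and the rest keep the fill value of 0.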
  • 49. Read integer into float buffer: h5_readtofloat.py
        import h5py
        import numpy as np
        file = h5py.File('dset.h5', 'r+')
        dataset = file['dset']
        data = np.zeros((4, 6))
        for i in range(4):
            for j in range(6):
                data[i][j] = i*6 + j + 1
        dataset[...] = data             # write buffer to integer 'dset'
        # read data in 'dset' into float buffer
        data_read32 = np.zeros((4, 6), dtype=np.float32)
        dataset.id.read(h5py.h5s.ALL, h5py.h5s.ALL, data_read32, mtype=h5py.h5t.NATIVE_FLOAT)
        file.close()
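Slide 49 uses h5py's low-level `dataset.id.read` call. A higher-level alternative, sketched here as an assumption about what a reader might prefer rather than as the slide's method, is `Dataset.read_direct`, which fills an existing NumPy array and lets HDF5 convert the stored integers to the buffer's type. The in-memory file and the way the data is generated are illustrative choices.

```python
import h5py
import numpy as np

f = h5py.File('readfloat_sketch.h5', 'w', driver='core', backing_store=False)

# Integer dataset holding 1..24, as in the slide's loop.
values = np.arange(1, 25, dtype='i4').reshape(4, 6)
dataset = f.create_dataset('dset', data=values)

# Destination buffer with a different (float32) element type.
data_read32 = np.zeros((4, 6), dtype=np.float32)
dataset.read_direct(data_read32)    # datatype conversion happens during the read

stored_dtype = dataset.dtype        # the dataset itself is still integer
f.close()
```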
  • 50. Steps to Create a Group. 1. Decide where to put it: the “root group” or another group. 2. Define properties or use defaults. 3. Create the group in the file. 4. Close the group.
  • 51. Example: Create a Group. File dset.h5 with the root group “/” containing dset (4x6 array of integers) and the group MyGroup.
  • 52. Create a Group: h5_crtgrp.py. Creates the group 'MyGroup' under the root group; h5py closes the group for you.
        import h5py
        file = h5py.File('dset.h5', 'r+')
        group = file.create_group('MyGroup')
        file.close()
  • 53. Example: Create Attributes. File dset.h5 with the root group “/” containing dset (4x6 array of integers), which carries the attributes Units = “Meters per second” and Speed = [100,200].
  • 54. Create Attributes: h5_crtatt.py
        1. import h5py
        2. import numpy as np
        3. file = h5py.File('dset.h5', 'r+')
        4. dataset = file['/dset']
        5. dataset.attrs["Units"] = "Meters per second"          # create string attribute
        6. attr_data = np.zeros((2,))
        7. attr_data[0] = 100
        8. attr_data[1] = 200
        9. dataset.attrs.create("Speed", attr_data, (2,), "i")   # create integer attribute
        10. file.close()
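To confirm what the attribute calls on slide 54 store, the values can be read back through the same `attrs` interface. The sketch below assumes a new in-memory file with its own dset rather than the dset.h5 file from the slides, and passes the attribute data as keyword arguments for readability.

```python
import h5py
import numpy as np

f = h5py.File('attrs_sketch.h5', 'w', driver='core', backing_store=False)
dataset = f.create_dataset('dset', shape=(4, 6), dtype='i4')

dataset.attrs['Units'] = 'Meters per second'                             # string attribute
dataset.attrs.create('Speed', data=np.array([100, 200]), dtype='i4')     # integer attribute

# Attributes are read whole; there is no partial I/O on them.
units = dataset.attrs['Units']
speed = dataset.attrs['Speed']
names = sorted(dataset.attrs.keys())
f.close()
```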
  • 55. HDF5 Tutorial and Examples. HDF5 Tutorial: http://www.hdfgroup.org/HDF5/Tutor/. HDF5 Examples: http://www.hdfgroup.org/ftp/HDF5/examples/. HDF5 Documentation: http://www.hdfgroup.org/HDF5/doc/
  • 56. HDF5 Technology Platform. HDF5 data model: the “building blocks” for data organization and specification. HDF5 software: library, language interfaces, tools. HDF5 file format: bit-level organization of an HDF5 file.
  • 57. The HDF Group. Thank You!
  • 58. The HDF Group. Questions/comments?

Editor's Notes

  1. HDF5 has the characteristics of other formats that are out there. It’s hard to store metadata in a binary flat file, and it is not scalable.
  2. Dataspace describes the “logical” layout and nothing about how the data is actually stored on disk. For our purposes we describe dimension 1 as the vertical dimension, dimension 2 as the horizontal dimension, and dimension 3 as the depth or number of planes in the dataset.
  3. Arrows symbolize the links between objects.
  4. H5T_STD_I32BE is a pre-defined datatype with encoding to fully interpret the data.
  5. In step 9, the attr_data in effect is your dataspace.