1. HDF4 and HDF5 Performance
Preliminary Results
Elena Pourmal
IV HDF-EOS Workshop
September 19-21, 2000
2. Why compare?
• HDF5 emerges as a new standard
  – proved to be robust
  – most of the planned features have been implemented in HDF5-1.2.2
  – has many new features compared to HDF4
  – time for performance study and tuning
• Users are moving their data and applications to HDF5
• HDF4 is not "bad," but it has limited capabilities
3. HDF5 vs. HDF4

  HDF5                                                    | HDF4
  --------------------------------------------------------|------------------------------------------------------
  Files over 2GB                                          | Files less than 2GB
  Unlimited number of objects                             | Max limit of 20000 objects
  One data model (multidimensional array of structures)   | Different data models for SD, GR, RI, Vdatas
  Parallel (||) support                                   | N/A
  Thread safe                                             | N/A
  Mounting files                                          | N/A
  Diversity of datatypes (compound, VL, opaque) and       | Only predefined datatypes such as float32, int16,
  operations (create, write, read, delete, shared)        | char8
  "Native" file is portable                               | "Native" file is not portable
  Modifiable I/O pipeline (registration of compression    | N/A
  methods)                                                |
  Selections (unions and regular blocks)                  | Selections (simple regular subsampling)
4. What to compare?
(short list of common features)
• File I/O operations
  – plain read and write
  – hyperslab selections
  – regular subsampling
  – access to a large number of objects
  – storage overhead
• Data organization in the file and access to it
  – Vdata vs. compound datasets
• Chunking, unlimited dimensions, compression
5. Benchmark Environment
• 440-MHz UltraSPARC-IIi
  – 1GB memory
  – SunOS 5.7
  – gettimeofday()
• 2 x 550-MHz Pentium III Xeon
  – 1GB memory
  – RedHat 6.2
  – clock()
• Each measurement was taken 10 times; average and best times were collected
6. Benchmarks
• Writing 1Dim and 2Dim datasets of integers
• Reading 2Dim contiguous hyperslabs of integers
• Reading 2Dim contiguous hyperslabs of integers with subsampling
• Reading fixed-size hyperslabs of integers from different locations in the dataset
• Writing and reading Vdatas and compound datasets
• CERES data
8. Writing 1Dim Datasets
• In this test we created one-dimensional arrays of integers with sizes varying from 8 Kbytes to 8000 Kbytes in steps of 8 Kbytes. We measured the average and best times for writing these arrays into HDF4 and HDF5 files.
• The test was performed on the Solaris platform. Neither HDF4 nor HDF5 performed data conversion.
10. Writing 2Dim Datasets
• In this test we created two-dimensional arrays with sizes varying from 40 x 40 bytes to 4000 x 4000 bytes in steps of 40 bytes in each dimension. We measured the average and best times for writing these arrays into HDF4 and HDF5 files. The graphs were plotted by averaging the values obtained for the same array size, without considering the shape of the array.
• The test was performed on the Solaris platform. Neither HDF4 nor HDF5 performed data conversion.
13. Reading Contiguous Hyperslabs
• In this test we created a file with a 1000 x 1000 array of integers. Subsequently, we read hyperslabs of different sizes starting from a fixed position in the array, and the read measurements were averaged over 10 runs. The HDF5-1.2.2, HDF5-1.2.2-patched, and HDF5 development libraries were tested.
• The test was performed on the Solaris platform. Neither HDF4 nor HDF5 performed data conversion.
15. Reading Hyperslabs
(latest version of the HDF5 development branch)
[Figure: "Hyperslab selection, best time" for the HDF5 development branch — best read time (microseconds) vs. size of hyperslab (number of elements, 100 to 8E+05), for HDF4 and HDF5.]
For hyperslabs > 2MB, HDF5 becomes about 1.5 times slower than HDF4. It still shows nonlinear growth.
16. Reading Contiguous Hyperslabs
(fixed size)
• In this test, the size of the hyperslab was fixed at 100x100 elements. The hyperslab was moved first along the X axis, then along the Y axis, and finally along the diagonal, and the read performance was measured.
• The test was performed on the Solaris platform. Neither HDF4 nor HDF5 performed data conversion.
17. Reading 100x100 Hyperslabs from Different Locations
[Figure: "Selection of 100x100 hyperslab (best time)" — best read time (microseconds) vs. hyperslab location (events 1 to 10), for HDF4, HDF5-1.2.2, HDF5-1.2.2-patched, and the HDF5 development branch.]
For small hyperslabs HDF5 performs about 3 times better than HDF4.
19. Subsampling Hyperslabs
• In this test we created a file with a 1000x1000 array of integers. Subsequently, we read every second element of hyperslabs of different sizes starting from a fixed position in the array, and the read measurements were averaged over 10 runs. The HDF5-1.2.2 and HDF5 development libraries were tested.
• The test was performed on the Solaris platform. Neither HDF4 nor HDF5 performed data conversion.
20. Reading Each Second Element of the Hyperslabs
[Figure: "Hyperslabs with subsampling each second element (best time)" — best read time (seconds, 0 to 35) vs. size of hyperslab (number of elements, 100 to 3E+05), for HDF4 and HDF5.]
HDF5 shows nonlinear growth. HDF4 performs about 3 times better for hyperslabs larger than 0.5MB.
21. First Attempt to Improve the Performance
[Figure: "Hyperslabs with selection (best time)" — best read time (minutes, 0 to 30) vs. size of hyperslab (100 to 3E+05 elements), for HDF4, HDF5, and HDF5 (latest).]
HDF4 still performs 2 times better for hyperslabs > 2MB. HDF5 shows nonlinear growth.
22. Current Behavior (HDF5 development branch)
[Figure: "Hyperslab with selection (best time)" — best read time (seconds, 0 to 18) vs. hyperslab size (number of elements, 100 to 3E+05), for HDF4 and HDF5.]
HDF5 growth is linear, and it performs about 10 times better than HDF4.
24. Vdatas and Compound Datasets
• In this test we created HDF4 files with Vdatas and HDF5 files with compound datasets, with sizes from 1000 to 1000000 records:
• float a; short b; float c[3]; char d;
• Write, write with data packing, and partial read operations were tested.
• The test was performed on Linux platforms. We also looked into data conversion issues.
25. Writing Data (VSwrite and H5Dwrite)
[Figure: "Writing Vdatas and Compound Datasets (average time)" — average write time (seconds, 0 to 1.8) vs. number of records (19 bytes each, 1000 to 4E+05), for HDF4 native, HDF4 with conversion, HDF5 native, and HDF5 with conversion.]
Conversion does not affect HDF4 performance. It does affect HDF5 (by more than a factor of 15).
26. Writing Data
(timing includes packing: VSpack and H5Tpack)
[Figure: "Writing Vdatas and Compound Datasets — effect of data packing in HDF4 and HDF5 (average time)" — average write time (seconds, 0 to 3.5) vs. number of records (1000 to 9E+05), for HDF4, HDF4 with packing, HDF5, and HDF5 with packing.]
Data packing was added to the previous test. For HDF5 the effect is very small.
27. Reading Two Fields
[Figure: "Reading Vdatas and Compound Datasets — native read (average time)" — average read time (seconds, 0 to 1) vs. number of records (1000 to 9E+05), for HDF4, HDF4 without unpacking data, and HDF5.]
Unpacking slows down HDF4 significantly (about 8 times). HDF5 was reading packed data in this test.
30. CERES File
• Used the H4toH5 converter to create an HDF5 version of the file
  – 81MB (HDF4), 80MB (HDF5)
  – 1 min 55 sec on Linux
  – 3 min 56 sec on Solaris
• Benchmarks
  – read up to 14 datasets (2148x660 floats)
  – subsampling: read two columns from the same datasets
• The benchmark was run on the Solaris and Linux platforms
31. Reading CERES Data on Big- and Little-Endian Machines
[Figure: "Reading CERES data on different platforms (best times)" — best read time (seconds, 0 to 3) vs. number of 2148x660 datasets read (1 to 14), for HDF4 and HDF5 on little-endian (LE) and big-endian (BE) machines.]
On the Solaris platform, HDF5 was twice as fast as HDF4. On Linux (data conversion is on), HDF4 was about 1.3-1.5 times faster.
32. Subsetting CERES Data
[Figure: "Selection of two columns from 2148x660 CERES dataset (best times)" — best read time (seconds, 0 to 25) vs. number of datasets (1 to 14), for HDF4, HDF5, and HDF5 tuned.]
The current version of HDF5 shows about 3 times better performance.
33. Conclusion
• Goal: tune HDF5 and give our users recommendations on its efficient usage
• Continue to study HDF4 and HDF5 performance
  – try more platforms: O2K, NT/Windows
  – try other features (e.g. chunking, compression)
  – try specific HDF5 features (e.g. writing/reading big files, VL datatypes, compound datatypes, selections)
• User input is necessary; send us the access patterns you use!
• Results will be available at http://hdf.ncsa.uiuc.edu