Initial investigation of pyJac: an analytical Jacobian generator for chemical kinetics
Kyle Niemeyer
Oregon State University
Nicholas Curtis, Chih-Jen Sung
University of Connecticut
Fall 2015 Meeting of WSSCI
5 October 2015
Funding: NSF awards 1535065 & 1534688
19. Introducing pyJac
• Accelerate chemical kinetic integration by
providing source code to evaluate chemical kinetic
Jacobian matrices analytically
• pyJac capable of generating source code for CPU
and GPU architectures
• Compatible with both CHEMKIN- and Cantera-
format mechanisms
32. Optimized Evaluation
• General idea:
• Large portions of Jacobian entries constant for a single
reaction
• Compute this portion once, and update as needed for
all species pairs
• Potential increase in computational efficiency
• Most expensive calculation can be performed once per
reaction
• Species-pair updates are relatively simple in comparison
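The once-per-reaction idea can be sketched in a few lines of Python. This is an illustration only, not pyJac's generated code: it assumes irreversible mass-action reactions and differentiates species production rates with respect to concentrations, and the names `arrhenius`, `jacobian_by_reaction`, and the reaction dictionary layout are hypothetical.

```python
import math

R_UNIVERSAL = 8.314462618  # J/(mol K)


def arrhenius(A, b, Ea, T):
    """Modified Arrhenius rate constant (assumed rate form for this sketch)."""
    return A * T**b * math.exp(-Ea / (R_UNIVERSAL * T))


def jacobian_by_reaction(reactions, conc, T):
    """Accumulate d(production rate of species i) / d(concentration of species j).

    Each reaction is a dict with keys (hypothetical layout):
      "A", "b", "Ea" : Arrhenius parameters
      "reactants"    : {species index: reaction order}
      "net"          : {species index: net stoichiometric coefficient}
    All reactant concentrations must be positive.
    """
    n = len(conc)
    jac = [[0.0] * n for _ in range(n)]
    for rxn in reactions:
        # Expensive part (exponential, powers): evaluated once per reaction.
        q = arrhenius(rxn["A"], rxn["b"], rxn["Ea"], T)
        for j, order in rxn["reactants"].items():
            q *= conc[j] ** order
        # Cheap part: reuse q for every species pair touched by this reaction.
        for j, order in rxn["reactants"].items():
            dq_dcj = q * order / conc[j]
            for i, nu in rxn["net"].items():
                jac[i][j] += nu * dq_dcj
    return jac
```

For a single reaction A + B → C, only the two reactant columns of the matrix receive entries, and each entry is the shared rate `q` scaled by a stoichiometric coefficient and a reaction order.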
35. Validation: PaSR (1)
PaSR conditions; run for 10 residence times
Parameter     H2/air             CH4/air            C2H4/air
ϕ             1                  1                  1
T             400, 600, 800 K    400, 600, 800 K    400, 600, 800 K
P             1, 10, 25 atm      1, 10, 25 atm      1, 10, 25 atm
# particles   100                100                100
τ_res         10 ms              5 ms               100 μs
τ_mix         1 ms               1 ms               10 μs
τ_pair        10 ms              5 ms               100 μs
Mechanisms used
Fuel     # Species   # Reactions   Source
H2/CO    13          27            Burke et al.
CH4      53          325           GRI Mech 3
C2H4     111         784           USC Mech II
42. Validation: PaSR (2)
• First ensured species concentrations, reaction rates, species
production rates, and derivative term matched Cantera output.
• Jacobian validation:
• Using Cantera was not possible, because finite differencing
produced negative densities in some cases
• In addition, step size issues led to large errors even with
high-order finite differences
• Therefore: used numdifftools* for accurate finite
difference Jacobian based on pyJac derivative output
*uses multiple-term Richardson extrapolation of
central differences (order 4–10)
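To show what Richardson extrapolation of central differences does, here is a hand-rolled sketch in plain Python. It is not the numdifftools implementation, and the function names and default step size are assumptions made for illustration. Each extrapolation level cancels the next even-order error term, so three levels give a nominally sixth-order estimate.

```python
import math


def central_diff_column(f, x, j, h):
    """Central difference of vector function f with respect to x[j]."""
    xp, xm = list(x), list(x)
    xp[j] += h
    xm[j] -= h
    return [(a - b) / (2.0 * h) for a, b in zip(f(xp), f(xm))]


def richardson_jacobian(f, x, h=0.1, levels=3):
    """Finite-difference Jacobian refined by Richardson extrapolation.

    Row i of the table starts from step h / 2**i; column m combines
    neighbouring estimates to remove the h**(2*m) error term.
    """
    n_out = len(f(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        table = [[central_diff_column(f, x, j, h / 2**i)] for i in range(levels)]
        for i in range(1, levels):
            for m in range(1, i + 1):
                w = 4.0**m
                fine, coarse = table[i][m - 1], table[i - 1][m - 1]
                table[i].append([(w * a - b) / (w - 1.0) for a, b in zip(fine, coarse)])
        best = table[-1][-1]
        for i_out in range(n_out):
            jac[i_out][j] = best[i_out]
    return jac


def example_rhs(x):
    """Toy right-hand side standing in for the derivative output."""
    return [math.sin(x[0]) * x[1], math.exp(x[0]) + x[1] ** 2]


jac_fd = richardson_jacobian(example_rhs, [1.0, 2.0])
```

The step size still has to keep the perturbed state physically valid, which is the same concern the slide raises about negative densities.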
46. Validation: PaSR (3)
• “Error”: 2-norm of relative difference with FD
• Discrepancies between analytical (CPU and GPU) and
FD Jacobian matrices small
• Maximum error less than 1% for all cases considered.
Mechanism   Sample size   Mean Error    Max Error
H2/CO       900,900       2.4×10⁻⁶ %    0.87%
CH4         450,900       3.4×10⁻³ %    0.26%
C2H4        91,800        2.2×10⁻⁵ %    3.4×10⁻³ %
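One plausible reading of the error metric, written as a small Python function. The slide gives only "2-norm of relative difference with FD", so skipping entries where the finite-difference value is exactly zero is an assumption of this sketch, as is the name `relative_error_norm`.

```python
import math


def relative_error_norm(jac_analytical, jac_fd):
    """2-norm of the entrywise relative difference between two Jacobians.

    Entries whose finite-difference value is zero are skipped to avoid
    division by zero (an assumption; the slides do not say how these
    entries were treated).
    """
    total = 0.0
    for row_a, row_f in zip(jac_analytical, jac_fd):
        for a, f in zip(row_a, row_f):
            if f != 0.0:
                total += ((a - f) / f) ** 2
    return math.sqrt(total)


# Multiply by 100 to express the result as a percentage.
error_percent = 100.0 * relative_error_norm(
    [[1.03, 0.0], [2.0, 4.16]],
    [[1.0, 0.0], [2.0, 4.0]],
)
```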
51. pyJac Performance (CPU)
• Compare performance of pyJac, TChem¹, and finite difference
• PaSR data from validation used here
• Mean runtime of 10 runs / # conditions
¹ Safta C, Najm HN, Knio OM. TChem - A Software
Toolkit for the Analysis of Complex Kinetic
Models. Sandia National Laboratories; 2011.
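The reported quantity (mean runtime of 10 runs divided by the number of conditions) can be sketched as follows. This is an assumed timing harness, not the one used for the results; `evaluate` stands in for any Jacobian routine and `mean_time_per_condition` is a hypothetical name.

```python
import time


def mean_time_per_condition(evaluate, conditions, n_runs=10):
    """Mean wall-clock time per thermochemical state, averaged over n_runs.

    Each run evaluates every state in `conditions` once; the mean run time
    is then divided by the number of states.
    """
    run_times = []
    for _ in range(n_runs):
        start = time.perf_counter()
        for state in conditions:
            evaluate(state)
        run_times.append(time.perf_counter() - start)
    return sum(run_times) / n_runs / len(conditions)
```

Normalizing by the number of conditions makes results comparable between mechanisms whose PaSR sample sizes differ.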
59. pyJac Performance (GPU)
• One Jacobian matrix evaluated per GPU thread
• Full utilization at same number of conditions, likely due to
memory bandwidth saturation
• Again, slightly superlinear growth with mechanism size
[Figure: GPU performance plot, annotated with factors 2.63×, 3.13×, and 3.59×]
67. Future Work
• Why do pyJac and TChem perform similarly for the
larger mechanism? Explore using larger mechanisms
• Cache optimization: Reorder species/reactions to
improve cache hit rates
• Shared memory usage for GPU pyJac acceleration
• Eventual code goals:
• Sparse matrix formats
• Support for constant volume
• Code generation in Fortran and Matlab
70. Conclusions
• Developed analytical, exact Jacobian generator
that supports both CPU and GPU platforms (and
all modern reaction rate formulations)
• pyJac v0.9-beta available today:
https://github.com/kyleniemeyer/pyJac