In this presentation I share my views on how mechanistic, first principles models can be used to accelerate innovation in the biotechnology industry. I present techniques and methodologies from process systems engineering, and I summarise key scientific works that enable us today to use first principles models to develop process understanding, reduce uncertainty and innovate faster than ever before. A Creative Commons version of this presentation will follow shortly.
First principles models as a tool to accelerate innovation in the design and operation of biotechnological processes
1. First principles models as a tool to accelerate innovation in the design and operation of biotechnological processes
Pablo A Rolandi, PhD
The more unpredictable the world is, the more we rely on predictions. – Steve Rivkin.
2. Overview
Models and model-centric technologies
The modeling landscape
Applications of first principles modeling:
• soft-sensing
• process troubleshooting
• process and model uncertainty
• process development
• design space
• bioreactors and bioseparations
3. Modeling: formalisms
All models are wrong, some are useful. – George EP Box.
Simplicity is the ultimate sophistication. – Leonardo da Vinci.
Nonlinear differential equation models:
• Ordinary differential (ODE)
• Differential-algebraic (DAE)
• Partial differential (PDE)
Regression models:
• Principal component analysis (PCA)
• Partial least squares (PLS)
Statistical models:
• Maximum likelihood (ML)
• Bayesian
And many more…
• Petri nets
• Boolean and Bayesian networks
• Agent-based models (ABM)
• Artificial neural networks (ANN)
• Delay differential equations (DDE)
• Stochastic differential equations (SDE)
• Master equations
• Gaussian process regression (kriging)
Key question: what are the requirements of the application?
$F(\dot{x}, x, y, u, \theta, t) = 0$
4. Modeling: applications
Process operations[1]
[1] Rolandi, PA and Romagnoli, JA; Integrated model-centric framework for support of manufacturing operations. Part I: The framework.; Comp & Chem Engng, 2010
Modeling can be used in R&D and process development as well, not only in process operations!
5. Application: soft-sensing
Overview
Bioreactor (Process) and Model (DAE): $F(\dot{x}, x, y, u, \theta, t) = 0$
• Controls/Manipulations: substrate feed rate, gas flow rate, agitation speed
• Disturbances: inoculum (i.e., initial conditions)
• Measured variables: bulk pH, exhaust CO2
• Unmeasured variables: cell viability, biomass yield, dissolved CO2
• Predictions: a subset of the above
Soft-sensing: use a model to compute the current (transient) values of unmeasured process variables of interest
Issues: unmeasured disturbances, model fitness and unmeasured state variables
Benefits: real-time process monitoring for troubleshooting and optimisation
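A minimal sketch of the soft-sensing idea, under loud assumptions: the two-state chemostat model with Monod kinetics, all parameter values and the logged input sequence are hypothetical and are not taken from the presentation. The logged dilution rate (feed rate over volume) plays the role of the known control, the inoculum is the initial condition, and biomass is the unmeasured variable that the model reconstructs.

```python
def monod_rhs(state, dilution_rate, p):
    """Right-hand side of a hypothetical chemostat model with Monod kinetics."""
    biomass, substrate = state
    mu = p["mu_max"] * substrate / (p["K_s"] + substrate)
    d_biomass = (mu - dilution_rate) * biomass
    d_substrate = dilution_rate * (p["S_feed"] - substrate) - mu * biomass / p["Y"]
    return (d_biomass, d_substrate)


def rk4_step(state, dilution_rate, p, dt):
    """One classical Runge-Kutta step with the input held constant."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = monod_rhs(state, dilution_rate, p)
    k2 = monod_rhs(shifted(k1, dt / 2), dilution_rate, p)
    k3 = monod_rhs(shifted(k2, dt / 2), dilution_rate, p)
    k4 = monod_rhs(shifted(k3, dt), dilution_rate, p)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def soft_sensor(initial_state, logged_dilution_rates, p, sample_time, substeps=10):
    """Return model estimates of (biomass, substrate) at every sample time."""
    state = tuple(initial_state)
    estimates = [state]
    for dilution_rate in logged_dilution_rates:
        for _ in range(substeps):
            state = rk4_step(state, dilution_rate, p, sample_time / substeps)
        estimates.append(state)
    return estimates


# Illustrative values only (assumptions, not data from the presentation).
params = {"mu_max": 0.4, "K_s": 0.5, "Y": 0.5, "S_feed": 10.0}
inoculum = (0.1, 5.0)                       # g/L biomass, g/L substrate
logged_inputs = [0.05] * 24 + [0.15] * 24   # 1/h, one logged value per hour
trajectory = soft_sensor(inoculum, logged_inputs, params, sample_time=1.0)
print(f"estimated biomass after 48 h: {trajectory[-1][0]:.2f} g/L")
```

This is an open-loop estimate: it does not correct itself with the measured variables, so it is exposed to exactly the issues listed on the slide (unmeasured disturbances and model fitness).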
6. Application: soft-sensing
Industrial continuous pulping digester[1]
Modeled in gPROMS v2.1 (~2005):
• ~10,000 variables/equations
• ~1,000 states
• ~100 degrees of freedom
• 1 reactor + ~15 auxiliary process units
[1] Rolandi, PA and Romagnoli, JA; Smart Enterprise for Pulp and Paper: Digester Modeling and Validation; CACE 14, 2003
7. Application: soft-sensing
Industrial continuous pulping digester[1]
Selectivity remains constant → constant lignin-to-cellulose ratio
Yield decreases → both lignin and cellulose were removed from the wood
Reactor design principle: early “impregnation” stages are used for intra-particle diffusion, not heterogeneous reaction → process operation inconsistent with process design
How fit-for-purpose are first principles models of bioprocesses?
Inspection of simulation profiles enabled troubleshooting:
• Temperatures too high
• Alkali concentrations too low
Model-based optimisation led to more favourable operation:
• Benefits: 500,000-2,000,000 US$/year
[1] Rolandi, PA and Romagnoli, JA; Optimisation and Transition Planning of a Continuous Industrial Digester; ESCAPE 14, 2004
8. Modeling: first principles models
Scale- and operation-invariant parameters[1]
[1] Craven, S; Shirsat, N; Whelan, J & Glennon, B; Process Model Comparison and Transferability Across Bioreactor Scales and Modes (…) ; Biotechnol Prog, 2013
Model of CHO cells (using Monod kinetics):
• i) 3 L bench-top and ii) 15 L pilot-scale bioreactors
• a) batch, b) bolus fed-batch and c) continuous fed-batch conditions
6 parameters determined experimentally and 11 parameters fitted to experimental data (all with direct physical interpretation)
Good transferability of model → better scale-up
9. Modeling: first principles models
Scale- and operation-invariant parameters[1]
[1] Craven, S; Shirsat, N; Whelan, J & Glennon, B; Process Model Comparison and Transferability Across Bioreactor Scales and Modes (…) ; Biotechnol Prog, 2013
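10. Modeling: first principles models
A simple benchmark bioreactor model[1]
General form: $F(\dot{x}, x, y, u, \theta, t) = 0$
Bioreactor balances [g/h]:
$V \cdot \dot{B} = (\mu \cdot V - q) \cdot B$
$V \cdot \dot{S} = q \cdot (S_0 - S) - r_1 \cdot V \cdot m_w \cdot B$
Rates [μmol/gDW.h]:
$r_1 = r_{1,max} \cdot S / (K_S + S)$
$r_2 = k_2 \cdot E \cdot M_1 / (K_{M1} + M_1)$
$r_3 = k_{3,max} \cdot K_I / (K_I + M_2)$
Intracellular balances [μmol/gDW.h]:
$\dot{M}_1 = r_1 - r_2 - \mu \cdot M_1$
$\dot{M}_2 = r_2 - r_3 - \mu \cdot M_2$
$\dot{M}_3 \equiv \dot{E} = r_3 - \mu \cdot E$
Growth rate [1/h]:
$\mu = Y_{B/S} \cdot r_1$
20 variables, 9 equations, 3 degrees of freedom ($u$), 8 parameters ($\theta$)
$u = \{V, q, S_0\}$
$\theta = \{m_w, \mathbf{Y_{B/S}}, r_{1,max}, K_S, \mathbf{k_2}, K_{M1}, \mathbf{k_{3,max}}, \mathbf{K_I}\}$
Parameters in bold are estimated numerically
A more realistic model than [1] can be found in [2] (e.g., taking into account the full set of amino acids in a CHO cell)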
[1] Kremling, A et al.; Genome Research, 2004; [2] Kontoravdi, C et al.; Biotechnol Prog, 2007
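The benchmark equations on this slide can be simulated in a few lines. The sketch below is only an illustration: the parameter values, inputs and initial conditions are hypothetical (they are not the published values of [1]), and a fixed-step Runge-Kutta integrator is used instead of a proper DAE solver.

```python
import math


def benchmark_rhs(state, u, theta):
    """Time derivatives of (B, S, M1, M2, E) for the benchmark model on this slide."""
    B, S, M1, M2, E = state
    V, q, S0 = u
    r1 = theta["r1_max"] * S / (theta["K_S"] + S)
    r2 = theta["k2"] * E * M1 / (theta["K_M1"] + M1)
    r3 = theta["k3_max"] * theta["K_I"] / (theta["K_I"] + M2)
    mu = theta["Y_BS"] * r1
    return [
        (mu - q / V) * B,                          # V*dB/dt = (mu*V - q)*B
        q / V * (S0 - S) - r1 * theta["m_w"] * B,  # V*dS/dt = q*(S0 - S) - r1*V*m_w*B
        r1 - r2 - mu * M1,
        r2 - r3 - mu * M2,
        r3 - mu * E,
    ]


def simulate(state, u, theta, t_end, dt):
    """Integrate with fixed-step classical Runge-Kutta (RK4)."""
    for _ in range(round(t_end / dt)):
        k1 = benchmark_rhs(state, u, theta)
        k2 = benchmark_rhs([s + dt / 2 * k for s, k in zip(state, k1)], u, theta)
        k3 = benchmark_rhs([s + dt / 2 * k for s, k in zip(state, k2)], u, theta)
        k4 = benchmark_rhs([s + dt * k for s, k in zip(state, k3)], u, theta)
        state = [
            s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        ]
    return state


# HYPOTHETICAL values, chosen only so that the sketch runs.
theta = {
    "m_w": 3.4e-4,    # g/umol
    "Y_BS": 7.0e-5,   # gDW/umol
    "r1_max": 1.0e4,  # umol/gDW.h
    "K_S": 0.5,       # g/L
    "k2": 1.0e5,      # 1/h
    "K_M1": 10.0,     # umol/gDW
    "k3_max": 0.02,   # umol/gDW.h
    "K_I": 1.0,       # umol/gDW
}
u = (1.0, 0.1, 10.0)                  # V [L], q [L/h], S0 [g/L]
x0 = [0.1, 5.0, 100.0, 100.0, 1e-3]   # B, S, M1, M2, E
x_end = simulate(x0, u, theta, t_end=50.0, dt=0.01)
print("B, S, M1, M2, E after 50 h:", [f"{v:.4g}" for v in x_end])
```

A quick consistency check for any implementation of these balances: the combination S + (m_w / Y_B/S) · B obeys a linear equation with rate q/V, so it must relax exponentially towards S0 regardless of the intracellular kinetics.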
12. Modeling: model calibration
A simple benchmark bioreactor model*
What do we do with this residual parametric uncertainty?
Using synthetic process data (i.e., simulation with noise)
Confidence regions (ML):
$SSR(\theta) - SSR(\theta_{ML}) \le \chi^2_{N_P, 1-\alpha}$
$(\theta - \theta_{ML})^T V^{-1} (\theta - \theta_{ML}) \le \frac{N_P \cdot SSR(\theta_{ML})}{N - N_P} F_{N_P, N - N_P, 1-\alpha}$
Issue: are linearised (i.e., ellipsoid-like) confidence regions good approximations?[1]
* Rolandi, PA; ongoing research towards an MSc on Digital Biology at the University of Manchester.
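One way to probe the question on this slide is to evaluate both kinds of region on a parameter grid and compare them. The sketch below is a simplified variant, not the author's study: it fits a two-parameter Monod rate to hypothetical synthetic data, assumes the measurement standard deviation is known (so both regions use a chi-square threshold on a variance-weighted SSR, rather than the F-based threshold above), and finds the ML estimate by brute-force grid search.

```python
import math
import random


def monod(S, r_max, K_s):
    return r_max * S / (K_s + S)


def ssr(theta, S_data, y_data):
    """Sum of squared residuals for theta = (r_max, K_s)."""
    return sum((y - monod(S, *theta)) ** 2 for S, y in zip(S_data, y_data))


def information_matrix(theta, S_data, sigma):
    """Entries (a, b, c) of J^T J / sigma^2, using analytic sensitivities."""
    r_max, K_s = theta
    a = b = c = 0.0
    for S in S_data:
        d_rmax = S / (K_s + S)
        d_Ks = -r_max * S / (K_s + S) ** 2
        a += d_rmax * d_rmax
        b += d_rmax * d_Ks
        c += d_Ks * d_Ks
    return a / sigma**2, b / sigma**2, c / sigma**2


# Synthetic data: simulation plus noise (all values are assumptions).
random.seed(1)
sigma = 0.05
S_data = [0.1, 0.2, 0.5, 1.0, 2.0, 5.0]
y_data = [monod(S, 1.0, 0.5) + random.gauss(0.0, sigma) for S in S_data]

grid = [(0.5 + 0.01 * i, 0.05 + 0.01 * j) for i in range(101) for j in range(146)]
theta_ml = min(grid, key=lambda th: ssr(th, S_data, y_data))  # grid-search estimate
ssr_ml = ssr(theta_ml, S_data, y_data)
threshold = -2.0 * math.log(0.05)  # 95% chi-square quantile for 2 degrees of freedom
a, b, c = information_matrix(theta_ml, S_data, sigma)


def in_likelihood_region(th):
    return (ssr(th, S_data, y_data) - ssr_ml) / sigma**2 <= threshold


def in_linearised_region(th):
    d1, d2 = th[0] - theta_ml[0], th[1] - theta_ml[1]
    return a * d1 * d1 + 2 * b * d1 * d2 + c * d2 * d2 <= threshold


n_lr = sum(in_likelihood_region(th) for th in grid)
n_lin = sum(in_linearised_region(th) for th in grid)
n_both = sum(in_likelihood_region(th) and in_linearised_region(th) for th in grid)
print(f"likelihood region: {n_lr} points, ellipsoid: {n_lin}, shared: {n_both}")
```

Grid points that belong to only one of the two regions show where the ellipsoid misrepresents the likelihood-based region; for strongly nonlinear models the mismatch can be substantial.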
13. Modeling: uncertainty quantification
Process and model uncertainty
Bioreactor (Process) and Model (DAE): $F(\dot{x}, x, y, u, \theta, t) = 0$
Factors: Critical Process Parameters (CPPs)
• Controls/Manipulations: improve actuation, re-design control system
• Disturbances: reduce uncertainty, ignore, measure directly
• Parameter values: reduce uncertainty, ignore, improve quantification
Responses: Critical Quality Attributes (CQAs)
• Measured variables: exhaust CO2, bulk pH
• Unmeasured variables: biomass, protein yield, bulk concentrations
• Predictions: a subset of the above
Uncertainty quantification (UQ) and global sensitivity analysis (GSA) are very powerful methods to develop process understanding, ensure quality and develop predictive models with targeted experimentation
14. Application: soft-sensing
Uncertainty quantification*
Numerics: Monte Carlo (MC) techniques (not one-at-a-time, OAT)
• Efficient sampling: low-discrepancy sequences/quasi-random numbers, correlated factors (e.g., Iman-Conover method)
• Pleasingly parallel computations: each run is independent of the others
• Number of runs: O(10^4 – 10^5)
Benefits: more realistic predictions taking into account process/model uncertainties (a family of trajectories!)
Monte Carlo: assessing the impact of parametric uncertainty; this framework can be applied to any CPP!
* Rolandi, PA; ongoing research towards an MSc on Digital Biology at the University of Manchester.
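The Monte Carlo workflow on this slide can be sketched as follows. Everything numeric here is an assumption made for illustration: a hypothetical batch Monod model, two independent normally distributed parameters, and only 500 runs instead of the 10^4 to 10^5 quoted above. Sampling uses a two-dimensional Halton sequence (a low-discrepancy sequence); correlated factors (e.g., via Iman-Conover) are not handled.

```python
from statistics import NormalDist


def radical_inverse(index, base):
    """Van der Corput radical inverse: the 1-D building block of a Halton sequence."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def simulate_biomass(mu_max, K_s, hours=12, dt=0.01):
    """Explicit-Euler batch simulation; returns biomass at t = 0, 1, ..., hours."""
    biomass, substrate, yield_coeff = 0.1, 10.0, 0.5  # hypothetical values
    steps_per_hour = round(1.0 / dt)
    profile = [biomass]
    for _ in range(hours):
        for _ in range(steps_per_hour):
            growth = mu_max * substrate / (K_s + substrate) * biomass * dt
            biomass += growth
            substrate -= growth / yield_coeff
        profile.append(biomass)
    return profile


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list."""
    return sorted_values[round(p * (len(sorted_values) - 1))]


n_runs = 500
mu_dist = NormalDist(mu=0.40, sigma=0.04)   # assumed uncertainty in mu_max [1/h]
ks_dist = NormalDist(mu=0.50, sigma=0.05)   # assumed uncertainty in K_s [g/L]

# Each run is independent of the others, so this loop is trivially parallelisable.
profiles = []
for i in range(1, n_runs + 1):
    mu_max = mu_dist.inv_cdf(radical_inverse(i, 2))
    K_s = ks_dist.inv_cdf(radical_inverse(i, 3))
    profiles.append(simulate_biomass(mu_max, K_s))

# 5th, 50th and 95th percentile of the family of trajectories at every hour.
band = []
for hour in range(13):
    values = sorted(profile[hour] for profile in profiles)
    band.append((percentile(values, 0.05), percentile(values, 0.50), percentile(values, 0.95)))

for hour in (0, 6, 12):
    low, median, high = band[hour]
    print(f"t = {hour:2d} h: biomass {low:.2f} / {median:.2f} / {high:.2f} g/L")
```

The percentile band is one compact way to report the "family of trajectories"; the same loop applies unchanged if the sampled factors are process inputs rather than model parameters.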
15. Modeling: first principles models
Hybrid multi-scale modeling[1]
Is the “well-mixed” (homogeneous) bioreactor assumption valid?
What can we learn from rigorous hydrodynamic calculations?
[1] Bezzo, F; Macchietto, S & Pantelides, CC; General Hybrid Multizonal/CFD Approach for Bioreactor Modeling; AIChE Journal, 2003
[Figures: 10, 15, 20, 25 zones; 300, 450, 600 rpm; CSTR vs hybrid at 300 (lhs) & 600 (rhs) rpm; representative zones; xanthan rate (lhs) & effective viscosity (rhs)]
[Diagram: CFD and Zones, linked by aggregation and disaggregation]
Modeling strategy: multi-compartmental model (based on CFD simulations with decoupled or coupled data flows)
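To illustrate why the well-mixed assumption matters, here is a deliberately simplistic two-zone compartment sketch. It is not the hybrid multizonal/CFD method of [1]: the first-order consumption kinetics, the zone volumes and the exchange flows are all hypothetical, whereas in the hybrid approach the inter-zone flows would come from CFD.

```python
def two_zone_steady_state(feed_rate, exchange_flow, k, v_feed_zone, v_bulk_zone):
    """Steady-state concentrations in two well-mixed zones linked by an exchange flow.

    Bulk zone balance:  Q*(c1 - c2) = k*V2*c2   ->  c2 = Q*c1 / (Q + k*V2)
    Feed zone balance:  F = k*V1*c1 + Q*(c1 - c2)
    """
    transfer = exchange_flow * k * v_bulk_zone / (exchange_flow + k * v_bulk_zone)
    c_feed_zone = feed_rate / (k * v_feed_zone + transfer)
    c_bulk_zone = exchange_flow * c_feed_zone / (exchange_flow + k * v_bulk_zone)
    return c_feed_zone, c_bulk_zone


# Hypothetical numbers: substrate fed at 10 g/h into a 1 L feed zone next to a
# 9 L bulk zone, consumed everywhere with a first-order rate constant of 2 1/h.
feed_rate, k, v1, v2 = 10.0, 2.0, 1.0, 9.0
well_mixed = feed_rate / (k * (v1 + v2))   # single-CSTR concentration, g/L

for exchange_flow in (1.0, 10.0, 100.0, 1000.0):   # L/h
    c1, c2 = two_zone_steady_state(feed_rate, exchange_flow, k, v1, v2)
    print(f"Q = {exchange_flow:6.0f} L/h: feed zone {c1:.3f}, bulk zone {c2:.3f}, "
          f"well-mixed {well_mixed:.3f} g/L")
```

With slow exchange the feed zone sees a much higher concentration than the bulk, while for very fast exchange both zones approach the single-CSTR value.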
17. Application: process troubleshooting
Industrial bioseparations[1]
Disturbances:
• Dimer protein concentrations: AA ~N(0.108, 0.024); AB ~N(0.127, 0.023); BB ~N(0.104, 0.023)
Controls:
• Mass challenge (mg/ml) and wash length (CV)
Control (decision) space:
• Continuous (region) or discrete (grid)
Product quality: the probability of meeting the product spec constraint, 0.25 < B (monomer) < 0.45, for both resins
Narrow operating region with >75% chance of meeting the CQA!
[Figure panels: HIGH & LOW]
[1] Close, EJ; Salm, JR; Bracewell, DG & Sorensen, E; A model based approach for identifying robust operating conditions (…); Chem Eng Sci, 2014
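The probability map described on this slide can be sketched with a Monte Carlo loop over a discrete control grid. Loud assumptions: the response function below is a made-up linear surrogate that merely stands in for the chromatography model of [1], the control grid ranges are hypothetical, only one resin is considered, and N(a, b) is read as mean and standard deviation.

```python
import random


def monomer_fraction(mass_challenge, wash_length, aa, ab, bb):
    """HYPOTHETICAL linear surrogate; stands in for the chromatography model of [1]."""
    return (
        0.35
        + 0.004 * (mass_challenge - 30.0)
        - 0.020 * (wash_length - 5.0)
        - 0.5 * (aa - 0.108)
        + 1.5 * (ab - 0.127)
        + 0.5 * (bb - 0.104)
    )


random.seed(0)
n_samples = 2000
# Feed variability as listed on the slide (disturbances AA, AB, BB).
feed_samples = [
    (random.gauss(0.108, 0.024), random.gauss(0.127, 0.023), random.gauss(0.104, 0.023))
    for _ in range(n_samples)
]

# Discrete control (decision) space; the same feed samples are reused at every point.
probability = {}
for mass_challenge in range(20, 45, 5):      # mg/ml, hypothetical grid
    for wash_length in range(2, 9):          # CV, hypothetical grid
        hits = sum(
            0.25 < monomer_fraction(mass_challenge, wash_length, aa, ab, bb) < 0.45
            for aa, ab, bb in feed_samples
        )
        probability[(mass_challenge, wash_length)] = hits / n_samples

robust = sorted(point for point, p in probability.items() if p > 0.75)
print(f"{len(robust)} of {len(probability)} grid points meet the spec with p > 0.75")
```

Replacing the surrogate with a calibrated first principles model, and repeating the loop per resin, gives the kind of operating-region map the slide refers to.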
18. Application: process development
Industrial bioseparations[1]
Better: no feed variability (computed with deterministic feed → deterministic design space)
Baseline: variability in feed (SD~0.02) and p>75% (computed with deterministic parameter values)
Worse: variability in feed (SD~0.01) and p>95% (computed with deterministic parameter values)
Reactive process troubleshooting → pro-active robust process development: probabilistic design space (i.e., Quality by Design, QbD)
Model-based quantification of the effect of different levels of uncertainty/variability in feed composition! (can be extended to account for model uncertainty as well)
[1] Close, EJ; Salm, JR; Bracewell, DG & Sorensen, E; A model based approach for identifying robust operating conditions (…); Chem Eng Sci, 2014
19. Engineering workflows:
Iterative model and process development
Model development by model-based design of experiments
Process development by model-targeted experimentation
Global Sensitivity Analysis (GSA)[1]:
• Algorithms: Sobol indices or derivative-based global sensitivity measures (DGSM)
• Numerics: based on Monte Carlo integration
Goals (factor ≡ CPP; response ≡ CQA):
• Which factors are most important?
• Which factors are unimportant?
• Which factors can reduce response variance to an acceptable value?
• Which factors affect the responses of interest?
• Meta-modeling
[Diagram: Modeling, Experimentation, Process]
[1] Saltelli, A et al; Sensitivity analysis practices: Strategies for model-based inference; Reliab Eng Syst Safe, 2006
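As a small illustration of Sobol indices computed by Monte Carlo integration, the sketch below uses a common pick-and-freeze estimator for first-order indices. The response is a hypothetical linear test function with three independent uniform factors, chosen because its indices are known analytically (a_i^2 / sum of a_j^2); in practice the function would be a process model mapping CPPs to a CQA.

```python
import random


def model(x):
    """Hypothetical linear test response with known first-order Sobol indices."""
    return 4.0 * x[0] + 2.0 * x[1] + 1.0 * x[2]


def first_order_sobol(func, n_factors, n_samples, rng):
    """Pick-and-freeze Monte Carlo estimate of first-order Sobol indices.

    Factors are sampled independently and uniformly on (-0.5, 0.5).
    """
    A = [[rng.random() - 0.5 for _ in range(n_factors)] for _ in range(n_samples)]
    B = [[rng.random() - 0.5 for _ in range(n_factors)] for _ in range(n_samples)]
    f_A = [func(row) for row in A]
    f_B = [func(row) for row in B]

    pooled = f_A + f_B
    mean = sum(pooled) / len(pooled)
    variance = sum((y - mean) ** 2 for y in pooled) / (len(pooled) - 1)

    indices = []
    for i in range(n_factors):
        total = 0.0
        for a_row, b_row, fa, fb in zip(A, B, f_A, f_B):
            mixed = list(a_row)
            mixed[i] = b_row[i]              # matrix A with column i taken from B
            total += fb * (func(mixed) - fa)
        indices.append(total / n_samples / variance)
    return indices


rng = random.Random(42)
estimates = first_order_sobol(model, n_factors=3, n_samples=20000, rng=rng)
analytic = [16 / 21, 4 / 21, 1 / 21]
for i, (estimate, exact) in enumerate(zip(estimates, analytic), start=1):
    print(f"factor {i}: estimated S = {estimate:.3f}, analytic S = {exact:.3f}")
```

Ranking the factors by these indices answers the first two goals above (most important and unimportant factors); the cost is (number of factors + 2) model runs per sample.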
20. Further applications:
Industrial continuous pulping digester[1]
Offline dynamic optimisation:
• Digester: yield maximisation, economic maximisation
• Benefits: ~1,400,000 US$/year (and simpler control structure)
• Bioreactors: yield maximisation, batch-time minimisation
(controls: substrate feed and aeration rate)
Dynamic data reconciliation:
• Digester: bias estimation (mass balance closure)
• Benefits: ~500,000 US$/year in utility savings (evaporators)
Real-time dynamic optimisation (advanced process control):
• Digester: optimal set-point tracking (during grade transitions)
• Benefits: on-spec product (selectivity)
• Bioreactors: model-based control for disturbance rejection (e.g., feed failure)
[1] Rolandi, PA; Model-Based Framework for Integrated Simulation, Optimisation and Control of Process Systems; PhD Thesis, 2005
21. Today’s modeling landscape
Numerical techniques are widely available
• Some key components of the puzzle are not implemented in commercial tools
Parallel computing (e.g., cloud) is a key enabler
• Amazon Web Services (AWS) sells ~15.6 hours of compute time for $1
Structured/segregated models of fermentation processes are being developed and calibrated
• Is this the decade they will become sufficiently predictive?
First principles modeling is becoming widely applied in industry
• Corporate modeling functions are being established
First principles modeling:
• Generates unparalleled process understanding
• Accelerates innovation in process development (e.g., scale-up/scale-down) and process operations (e.g., soft-sensing and troubleshooting)
• Provides competitive advantage and delivers value to organisations