Machine Learning and Surrogate Optimization
on Heterogeneous Catalysts
Ichigaku Takigawa
2019 PRESTO International Symposium on Materials Informatics

Feb 9-11, 2019 @ Tokyo
Graduate School of Information Science and Technology, Hokkaido University
Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University
Heterogeneous catalysts and surface reactions
Wolfgang Pauli
“God made the bulk; the surface was invented by the devil.”
[Figure: surface processes (adsorption, diffusion, desorption, dissociation, recombination) and surface sites (kinks, terraces, adatom, vacancy, steps)]
Many hard-to-quantify factors complicate their atomic-level characterization by modelling and experiments.
• reaction conditions
• composition
• support
• surface termination
• particle size & morphology
• atomic coordination environment
• disordered/amorphous structures in their active state
⋮
GAS (Reactants)
SOLID (Catalysts)
High-fidelity simulations are too time-consuming...
Then how can we characterize the catalytic activity?
K. Shimizu et al, ACS Catal. 2, 1904 (2012)
[Figures: Hammer–Nørskov d-band model (axis: d-band center (εd − EF) / eV); volcano trends in reaction rates; Brønsted–Evans–Polanyi relation with linear trends (axes: adsorption energy / eV, activation energy / eV)]
The d-electrons of transition metals govern...
Several DFT-calculated indexes capture the trend to some extent...
Outline: Our ML-based studies
1. Can we predict the d-band center?
   predicting DFT-calculated values by machine learning (Takigawa et al, RSC Advances, 2016)
2. Can we predict the adsorption energy?
   predicting DFT-calculated values by machine learning (Toyao et al, JPC C, 2018)
3. Can we predict the catalytic activity?
   predicting values from experiments reported in the literature by machine learning (Suzuki et al, in preparation)
Case 1. Predicting the d-band centers
[Figure: surface models built from a host metal and a guest metal]
Ruban, Hammer, Stoltze, Skriver, Nørskov, J Mol Catal A, 115:421-429 (1997)
J. K. Nørskov, et al., Advances in Catalysis, 2000
Two types of models
• 1% doped
• overlayer
[1% doped]
The d-bands of transition metals play central roles.
The beauty of the periodic table worked!
Fe Co Ni Cu Ru Rh Pd Ag Ir Pt Au
Fe -0.92 -0.96 -0.97 -1.65 -1.64 -2.24 -1.87 -2.4 -3.11
Co -1.37 -1.23 -2.12 -2.82 -2.53 -2.26 -3.56
Ni -0.33 -1.18 -1.92 -2.03 -2.43 -2.15 -2.82 -3.39
Cu -2.42 -2.49 -2.67 -2.89 -2.94 -3.82 -4.63
Ru -1.11 -1.04 -1.12 -1.41 -1.88 -1.81 -1.54 -2.27
Rh -1.42 -1.32 -1.51 -1.7 -1.73 -2.12 -1.81 -1.7 -2.18 -2.3
Pd -1.47 -1.29 -1.29 -1.03 -1.58 -1.83 -1.68 -1.52 -1.79
Ag -3.75 -3.56 -3.62 -3.8 -4.03 -3.5 -3.93 -4.51
Ir -1.78 -1.71 -1.78 -1.55 -2.14 -2.53 -2.2 -2.11 -2.6 -2.7
Pt -1.71 -1.47 -2.13 -2.01 -2.23 -2.06 -1.96 -2.33
Au -3.03 -2.82 -2.85 -2.89 -3.44 -3.56
Fe Co Ni Cu Ru Rh Pd Ag Ir Pt Au
Fe -0.78 -1.65 -1.64 -1.87
Co -1.18 -1.17 -1.37 -1.87 -2.12 -2.82 -2.26
Ni -0.33 -1.18 -1.17 -2.61 -2.43 -2.15 -2.82
Cu -2.42 -2.89 -2.94 -3.88 -4.63
Ru -1.11 -1.04 -1.12 -1.11 -1.41 -1.81 -2.27
Rh -1.42 -1.51 -2.12 -1.81 -1.7
Pd -1.29 -1.29 -1.03 -1.58 -1.83 -1.52 -1.79
Ag -3.68 -3.8 -3.63 -4.51
Ir -2.14 -2.11 -2.7
Pt -1.71 -1.47 -2.13 -2.01 -2.23 -2.06
Au -2.86 -3.09 -2.89 -3.44 -3.56
Fe Co Ni Cu Ru Rh Pd Ag Ir Pt Au
Fe -2.17 -3.11
Co -1.17 -1.37 -2.12
Ni -0.33 -1.18 -2.61 -2.43
Cu -2.42 -2.29 -2.49 -3.71 -4.63
Ru -2.02
Rh -1.32 -1.73 -2.12
Pd -1.94 -1.83 -1.97
Ag -3.75 -3.68 -4.51
Ir -1.78 -1.71 -2.7
Pt -2.13
Au -3.09 -2.89
Gradient boosting w/ 6 descriptors, mean RMSE over 100 repetitions:
• training sets (75%) / test sets (25%): 0.153 eV
• training sets (50%) / test sets (50%): 0.235 eV
• training sets (25%) / test sets (75%): 0.402 eV
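The evaluation protocol above (repeat a train/test split 100 times and report the mean RMSE) could be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes the 100 repetitions are random re-splits, the data are a made-up 11 x 11 grid, and a nearest-neighbor placeholder stands in for gradient boosting so that the sketch needs only the Python standard library. All function names are hypothetical.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_nearest_neighbor(X_train, y_train):
    """Placeholder model (NOT gradient boosting): copy the target of the closest training point."""
    def predict(x):
        nearest = min(range(len(X_train)),
                      key=lambda i: sum((a - b) ** 2 for a, b in zip(x, X_train[i])))
        return y_train[nearest]
    return predict


def mean_rmse_over_splits(X, y, fit, train_fraction, n_repeats=100, seed=0):
    """Average test RMSE over repeated random train/test splits."""
    rng = random.Random(seed)
    n_train = max(1, round(train_fraction * len(X)))
    scores = []
    for _ in range(n_repeats):
        indices = list(range(len(X)))
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(rmse([y[i] for i in test], [predict(X[i]) for i in test]))
    return sum(scores) / len(scores)


# Synthetic stand-in for an 11 x 11 host/guest table (values are made up).
X = [(float(h), float(g)) for h in range(11) for g in range(11)]
y = [-0.2 * h - 0.1 * g for h, g in X]

results = {frac: mean_rmse_over_splits(X, y, fit_nearest_neighbor, frac)
           for frac in (0.75, 0.50, 0.25)}
for frac, score in results.items():
    print(f"train {frac:.0%}: mean RMSE = {score:.3f}")
```

Any regressor exposing the same `fit(X_train, y_train) -> predict` shape could be dropped in place of the placeholder.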
The ML model
• Group (G)
• Bulk Wigner–Seitz radius (R) in Å
• Atomic number (AN)
• Atomic mass (AM) in g mol⁻¹
• Period (P)
• Electronegativity (EN)
• Ionization energy (IE) in eV
• Enthalpy of fusion (∆fusH) in J g⁻¹
• Density at 25 ℃ (ρ) in g cm⁻³
Readily available: 9 descriptors pretested for host & guest (18 in total)
Gradient Boosted Tree Regression (GBR) with only 6 descriptors
(1) Group in the periodic table (host)
(2) Density at 25 ℃ (host)
(3) Enthalpy of fusion (guest)
(4) Ionization energy (guest)
(5) Enthalpy of fusion (host)
(6) Ionization energy (host)
11 ML methods pretested
[3 Tree Ensembles (Nonlinear Regression Models)]
GBR (Gradient Boosted Tree Regression); ETR (Extra-Trees Regression); RFR (Random Forest Regression)
[5 Linear Regression Models]
OLS (Ordinary Least Squares Regression); PLS (Partial Least Squares Regression); LASSO (Lasso Regression); RIDGE (Ridge Regression); RANSAC (Random Sample Consensus Regression)
[3 Kernel Methods (Nonlinear Regression Models)]
GPR (Gaussian Process Regression); KRR (Kernel Ridge Regression); SVR (Support Vector Regression)
[Figure: comparison of the 11 methods; axis labels as extracted: GBR, ETR, GPR, RFR, KRR, OLS, RIDGE, PLS, RANSAC, SVR, LASSO]
training sets (75%)
test sets (25%)
Tree ensemble regressors (GBR, ETR, RFR)
Decision Tree (Regression Tree) and Tree Ensemble
[Figure: the surface $y = \sin\left(\sqrt{x_1^2 + x_2^2}\right)$ over (x1, x2) and its approximations ŷ ≈ y by a single regression tree, which assigns constants c1, c2, c3 to regions of the (x1, x2) plane, and by a tree ensemble, which is a sum of trees]
• Region-wise constant prediction
• The regions are given by recursive axis-parallel partitioning of the data space
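The two properties of a regression tree named above (region-wise constant prediction, recursive axis-parallel partitioning) could be illustrated with the following minimal sketch. It is a hand-rolled tree written only for illustration, fitted to random samples of the same test function y = sin(√(x1² + x2²)); the sample size, depth, and `min_leaf` value are arbitrary assumptions, and production libraries use faster split searches.

```python
import math
import random


def sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def build_tree(points, targets, depth, min_leaf=5):
    """Recursively split on the axis and threshold that minimize the squared error."""
    if depth == 0 or len(points) < 2 * min_leaf:
        return {"value": sum(targets) / len(targets)}
    best = None  # (error, axis, threshold, left indices, right indices)
    for axis in range(len(points[0])):
        order = sorted(range(len(points)), key=lambda i: points[i][axis])
        for cut in range(min_leaf, len(order) - min_leaf + 1):
            left, right = order[:cut], order[cut:]
            a, b = points[left[-1]][axis], points[right[0]][axis]
            if a == b:
                continue  # cannot place a threshold between identical coordinates
            error = sse([targets[i] for i in left]) + sse([targets[i] for i in right])
            if best is None or error < best[0]:
                best = (error, axis, (a + b) / 2.0, left, right)
    if best is None:
        return {"value": sum(targets) / len(targets)}
    _, axis, threshold, left, right = best
    return {
        "axis": axis,
        "threshold": threshold,
        "left": build_tree([points[i] for i in left], [targets[i] for i in left],
                           depth - 1, min_leaf),
        "right": build_tree([points[i] for i in right], [targets[i] for i in right],
                            depth - 1, min_leaf),
    }


def predict(tree, x):
    """Walk down the axis-parallel splits until a constant leaf is reached."""
    while "value" not in tree:
        tree = tree["left"] if x[tree["axis"]] <= tree["threshold"] else tree["right"]
    return tree["value"]


rng = random.Random(0)
points = [(rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)) for _ in range(300)]
targets = [math.sin(math.sqrt(x1 ** 2 + x2 ** 2)) for x1, x2 in points]

tree = build_tree(points, targets, depth=4)
predictions = [predict(tree, p) for p in points]
tree_mse = sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)
baseline_mse = sse(targets) / len(targets)  # error of a single constant prediction
print(f"distinct predicted values: {len(set(predictions))}")
print(f"training MSE: tree {tree_mse:.3f} vs constant {baseline_mse:.3f}")
```

A depth-4 tree can produce at most 16 distinct outputs, which is exactly the region-wise constant behavior the slide describes.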
Tree ensemble regressors (GBR, ETR, RFR)
Advantages
• quick, nonlinear, parallelizable
• highly accurate (widely used in many winning solutions for data prediction competitions)
• usually less hyperparameter dependent (compared to kernel methods and neural networks)
• conservative extrapolation
• "variable importance" provided
• popular implementations
  • Scikit-learn
  • XGBoost (by DMLC)
  • LightGBM (by Microsoft)
How to generate multiple decision trees?
• RFR / ETR: random patches (random subsampling of instances and variables) or random splits
• GBR (can also be mixed with the above strategy): sequentially add a new tree to compensate for the weak points of the current ensemble.
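The sequential strategy described for GBR could be sketched as below: each new tree is fitted to the residuals (the "weak points") of the current ensemble and added with a small learning rate. This is a simplified illustration with depth-1 trees (stumps), squared loss, and 1-D inputs, written with the standard library only; it is not scikit-learn's `GradientBoostingRegressor`, and the data and hyperparameters are arbitrary assumptions.

```python
import math


def mean(values):
    return sum(values) / len(values)


def fit_stump(xs, residuals):
    """Depth-1 regression tree on 1-D inputs: best threshold by squared error."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None  # (error, threshold, left value, right value)
    for cut in range(1, len(order)):
        a, b = xs[order[cut - 1]], xs[order[cut]]
        if a == b:
            continue  # cannot split between identical inputs
        left = [residuals[i] for i in order[:cut]]
        right = [residuals[i] for i in order[cut:]]
        left_value, right_value = mean(left), mean(right)
        error = (sum((r - left_value) ** 2 for r in left)
                 + sum((r - right_value) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, (a + b) / 2.0, left_value, right_value)
    return best[1:]


def fit_boosting(xs, ys, n_trees, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = mean(ys)
    predictions = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [p + learning_rate * (left_value if x <= threshold else right_value)
                       for x, p in zip(xs, predictions)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(lv if x <= t else rv for t, lv, rv in stumps)


def training_mse(model, xs, ys):
    return mean([(y - predict(model, x)) ** 2 for x, y in zip(xs, ys)])


# Made-up 1-D data: y = sin(x) on a regular grid.
xs = [i / 10.0 for i in range(60)]
ys = [math.sin(x) for x in xs]
for n_trees in (0, 1, 10, 50):
    model = fit_boosting(xs, ys, n_trees)
    print(f"{n_trees:2d} trees: training MSE = {training_mse(model, xs, ys):.4f}")
```

Random forests and extra-trees differ in that their trees are built independently on randomized data or splits and then averaged, rather than added one after another.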
Descriptor analysis and evaluation
Descriptor importances → descriptor selection (top-k)
• GBR with 18 descriptors: 100 times mean RMSE 0.204±0.047 eV
• GBR with 6 descriptors: 100 times mean RMSE 0.212±0.047 eV
• GBR with 4 descriptors: 100 times mean RMSE 0.214±0.046 eV
training sets (75%)
test sets (25%)
Case 2. Predicting the adsorption energy
Predicting the adsorption energy of CH3
DFT calculation of adsorption energy
• 10 hours on our 32-core workstation (CH3 on the Cu monometallic surface)
• even longer (about 34 hours) for a system containing another metal such as Pb
ML prediction
• < 1 sec on our 1-core laptop
• depends not on the target system but on the method we choose
training sets (75%)
test sets (25%)
But what do these mean for catalyst design and discovery!?
Standard procedure for optimizing the activity
All your available data
• Experiments
• Simulations
Hypothesis generation (abduction)
Check results
Feedback
Replace the time-consuming and costly part by ML
All your available data
Check the best predicted ones
Feed them as training data
Make predictions for many possible candidates
Machine Learning (any "data-driven" predictions)
The "surrogate (or proxy)" model for
• Demanding experiments
• Time-consuming high-fidelity simulations
This simple procedure won't work in most practical cases!
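The simple loop on the preceding slides (make predictions for many candidates, check the best predicted ones, feed them back as training data) could be sketched as follows. Everything here is a hypothetical stand-in: `expensive_activity` plays the role of an experiment or high-fidelity simulation, the surrogate is a Gaussian-weighted average rather than any model from the talk, and candidates are points on a 1-D grid in [0, 10].

```python
import math


def expensive_activity(x):
    """Hypothetical stand-in for a costly experiment or high-fidelity simulation."""
    return math.exp(-(x - 2.0) ** 2) + 2.0 * math.exp(-((x - 8.0) ** 2) / 0.1)


def fit_surrogate(known):
    """Placeholder surrogate: Gaussian-weighted average of the known activities."""
    def predict(x):
        weights = [math.exp(-((x - xk) ** 2) / 0.5) for xk in known]
        return sum(w * yk for w, yk in zip(weights, known.values())) / sum(weights)
    return predict


def greedy_search(candidates, initial, n_rounds):
    """Purely exploitative loop: always evaluate the candidate predicted to be best."""
    known = {x: expensive_activity(x) for x in initial}
    for _ in range(n_rounds):
        predict = fit_surrogate(known)                          # train on all data so far
        untested = [x for x in candidates if x not in known]
        best_guess = max(untested, key=predict)                 # best predicted candidate
        known[best_guess] = expensive_activity(best_guess)      # check it, feed it back
    return known


candidates = [i / 10.0 for i in range(101)]   # 0.0, 0.1, ..., 10.0
initial = [1.0, 3.0, 5.0]
known = greedy_search(candidates, initial, n_rounds=10)
best_x = max(known, key=known.get)
print(f"evaluated {len(known)} candidates; best so far: x = {best_x}, activity = {known[best_x]:.3f}")
```

Because the loop only ever follows the current prediction, it has no mechanism for visiting regions where the surrogate has seen no data, which is the weakness the following slides discuss.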
ML itself is not for "discovery"
[Figure: activity plotted against the descriptors x (high dimensional), marking the training samples, the best known value, the ML prediction, and an outlier]
• ML prediction: fitted to minimize the average error.
• Nicely predicted for the average (but mediocre)
• A nice "discovery" can deviate largely from the average of the knowns (an outlier)
• ML captures the average trend of the available knowns
• "Discovery" corresponds to something not in the knowns
• Mismatch
An ML model is just representative of the training data
Highly Inaccurate Model Predictions from Extrapolation (Lohninger 1999)
"Beware of the perils of extrapolation, and understand that ML algorithms build models that are representative of the available training samples."
• "exploration": to obtain new knowledge/data. We also need this.
• "exploitation": to use the knowledge/data to improve the performance. ML is basically for this.
Surrogate optimization (model-based optimization)
x<latexit sha1_base64="BLB8K/n7QYAsE73zsDEUiBvCSV8=">AAAB/XicbVDLSgMxFL2pr1pfVZdugkVwVWZE0GXRjcsK9gHtUDJppo1NMkOSEctQ/AW3uncnbv0Wt36JaTsLbT1w4XDOvZzLCRPBjfW8L1RYWV1b3yhulra2d3b3yvsHTROnmrIGjUWs2yExTHDFGpZbwdqJZkSGgrXC0fXUbz0wbXis7uw4YYEkA8UjTol1UrMbyuxx0itXvKo3A14mfk4qkKPeK393+zFNJVOWCmJMx/cSG2REW04Fm5S6qWEJoSMyYB1HFZHMBNns2wk+cUofR7F2oyyeqb8vMiKNGcvQbUpih2bRm4r/eqFcSLbRZZBxlaSWKToPjlKBbYynVeA+14xaMXaEUM3d75gOiSbUusJKrhR/sYJl0jyr+l7Vvz2v1K7yeopwBMdwCj5cQA1uoA4NoHAPz/ACr+gJvaF39DFfLaD85hD+AH3+ADzJlfc=</latexit><latexit sha1_base64="BLB8K/n7QYAsE73zsDEUiBvCSV8=">AAAB/XicbVDLSgMxFL2pr1pfVZdugkVwVWZE0GXRjcsK9gHtUDJppo1NMkOSEctQ/AW3uncnbv0Wt36JaTsLbT1w4XDOvZzLCRPBjfW8L1RYWV1b3yhulra2d3b3yvsHTROnmrIGjUWs2yExTHDFGpZbwdqJZkSGgrXC0fXUbz0wbXis7uw4YYEkA8UjTol1UrMbyuxx0itXvKo3A14mfk4qkKPeK393+zFNJVOWCmJMx/cSG2REW04Fm5S6qWEJoSMyYB1HFZHMBNns2wk+cUofR7F2oyyeqb8vMiKNGcvQbUpih2bRm4r/eqFcSLbRZZBxlaSWKToPjlKBbYynVeA+14xaMXaEUM3d75gOiSbUusJKrhR/sYJl0jyr+l7Vvz2v1K7yeopwBMdwCj5cQA1uoA4NoHAPz/ACr+gJvaF39DFfLaD85hD+AH3+ADzJlfc=</latexit><latexit sha1_base64="BLB8K/n7QYAsE73zsDEUiBvCSV8=">AAAB/XicbVDLSgMxFL2pr1pfVZdugkVwVWZE0GXRjcsK9gHtUDJppo1NMkOSEctQ/AW3uncnbv0Wt36JaTsLbT1w4XDOvZzLCRPBjfW8L1RYWV1b3yhulra2d3b3yvsHTROnmrIGjUWs2yExTHDFGpZbwdqJZkSGgrXC0fXUbz0wbXis7uw4YYEkA8UjTol1UrMbyuxx0itXvKo3A14mfk4qkKPeK393+zFNJVOWCmJMx/cSG2REW04Fm5S6qWEJoSMyYB1HFZHMBNns2wk+cUofR7F2oyyeqb8vMiKNGcvQbUpih2bRm4r/eqFcSLbRZZBxlaSWKToPjlKBbYynVeA+14xaMXaEUM3d75gOiSbUusJKrhR/sYJl0jyr+l7Vvz2v1K7yeopwBMdwCj5cQA1uoA4NoHAPz/ACr+gJvaF39DFfLaD85hD+AH3+ADzJlfc=</latexit><latexit 
sha1_base64="BLB8K/n7QYAsE73zsDEUiBvCSV8=">AAAB/XicbVDLSgMxFL2pr1pfVZdugkVwVWZE0GXRjcsK9gHtUDJppo1NMkOSEctQ/AW3uncnbv0Wt36JaTsLbT1w4XDOvZzLCRPBjfW8L1RYWV1b3yhulra2d3b3yvsHTROnmrIGjUWs2yExTHDFGpZbwdqJZkSGgrXC0fXUbz0wbXis7uw4YYEkA8UjTol1UrMbyuxx0itXvKo3A14mfk4qkKPeK393+zFNJVOWCmJMx/cSG2REW04Fm5S6qWEJoSMyYB1HFZHMBNns2wk+cUofR7F2oyyeqb8vMiKNGcvQbUpih2bRm4r/eqFcSLbRZZBxlaSWKToPjlKBbYynVeA+14xaMXaEUM3d75gOiSbUusJKrhR/sYJl0jyr+l7Vvz2v1K7yeopwBMdwCj5cQA1uoA4NoHAPz/ACr+gJvaF39DFfLaD85hD+AH3+ADzJlfc=</latexit>
Descriptors (High dimensional)
Activity ML prediction <latexit sha1_base64="0VEGB1BS2t8KmbZWf3FuR1QlwM8=">AAACrnichVFLLwNRFP6MV72LjcSm0RA2zZmiWithY+nVIjTNzLitiXllZtqg8QdsLSywILEQP8PGH7DwE8SSxMbCmemIWJRzc+899zvnO/e796iOoXs+0XOL1NrW3tEZ6+ru6e3rH4gPDhU8u+pqIq/Zhu1uqYonDN0SeV/3DbHluEIxVUNsqgdLQXyzJlxPt60N/8gRRVOpWHpZ1xSfoe3y5K5q1g9PpkrxJKVy2QzNpBOUIsqmKcPOLMk5OZeQGQksichW7PgjdrEHGxqqMCFgwWffgAKPxw5kEBzGiqgz5rKnh3GBE3Qzt8pZgjMURg94rfBpJ0ItPgc1vZCt8S0GT5eZCYzTE93RGz3SPb3QZ9Na9bBGoOWId7XBFU5p4HRk/eNflsm7j/0f1p+afZSRDbXqrN0JkeAVWoNfOz5/W59fG69P0A29sv5reqYHfoFVe9duV8XaxR96VNbS/MeCeJTBLfzuU6K5U0in5OkUrc4kFxajZsYwijFMcsfmsIBlrCDPN5g4wyWuJJIKUlEqNVKllogzjF8m7X8BK8iaxA==</latexit>
Use ML to guide the balance between "exploitation" and "exploration"!
Surrogate optimization (model-based optimization)
Surrogate optimization (model-based optimization)
x: Descriptors (high dimensional)
Activity: ML prediction
"Uncertainty" of prediction, e.g. prediction variance
Infill criterion, e.g. "expected improvement"
1. Initial sampling (DoE)
2. Loop:
   1. Construct a surrogate model.
   2. Search the infill criterion.
   3. Add new samples (intervention).
An Open Research Topic in ML:
• Reinforcement learning
• Blackbox optimization
• Bayesian optimization
• Sequential design of experiments
• Multi-armed bandit
• Evolutionary computation
• Game-theoretic approaches
• ...
Use ML to guide the balance between "exploitation" and "exploration"!
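The loop on this slide (initial sampling, then repeatedly fitting a surrogate, searching the infill criterion, and adding samples) can be sketched in Python as follows. This is only a minimal illustration under assumptions that are not in the slides: a 1-D input, a toy bootstrap ensemble of nearest-neighbour regressors as the surrogate, a random candidate search for the infill step, and the hypothetical names `fit_ensemble` and `surrogate_optimize`. Expected improvement uses its standard closed form for a Gaussian predictive distribution.

```python
import math
import random
import statistics


def expected_improvement(mu, sigma, best):
    """Closed-form EI for maximization, assuming a Gaussian prediction N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


def fit_ensemble(xs, ys, n_models, k, rng):
    """Toy surrogate (an assumption): bootstrap ensemble of k-nearest-neighbour regressors."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        models.append(([xs[i] for i in idx], [ys[i] for i in idx]))

    def predict(x):
        preds = []
        for bx, by in models:
            nearest = sorted(range(len(bx)), key=lambda i: abs(bx[i] - x))[:k]
            preds.append(sum(by[i] for i in nearest) / len(nearest))
        # Ensemble mean is the prediction; ensemble spread is the "uncertainty".
        return statistics.fmean(preds), statistics.pstdev(preds)

    return predict


def surrogate_optimize(objective, lo, hi, n_init=5, n_iter=15, seed=0):
    rng = random.Random(seed)
    # Initial sampling (a plain random design stands in for a DoE).
    xs = [rng.uniform(lo, hi) for _ in range(n_init)]
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        # Construct a surrogate model from all samples so far.
        predict = fit_ensemble(xs, ys, n_models=20, k=2, rng=rng)
        # Search the infill criterion (EI) over random candidates.
        best = max(ys)
        candidates = [rng.uniform(lo, hi) for _ in range(200)]
        x_new = max(candidates, key=lambda x: expected_improvement(*predict(x), best))
        # Add the new sample (the "intervention": run the experiment).
        xs.append(x_new)
        ys.append(objective(x_new))
    i_best = max(range(len(ys)), key=ys.__getitem__)
    return xs[i_best], ys[i_best]
```

For example, `surrogate_optimize(lambda x: -(x - 0.3) ** 2, 0.0, 1.0)` returns the best sampled input and its value. A real study would replace the toy surrogate with a proper regressor and the lambda with an experiment or simulation.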
Structure-activity landscapes are nonsmooth...
J. Med. Chem. 2012, 55, 2932−2942
The structure-activity landscape is often nonsmooth: small changes in descriptors can greatly affect activity or selectivity.
Figure panels: activity cliffs; selectivity cliffs.
Batched + restricted surrogate optimization
Our optimization model for heterogeneous catalyst data:
• Nonsmooth interpolation (SMAC-like, using tree ensemble regressors), with multistart local search by L-BFGS plus random samples
• Use EI as the infill criterion, estimated by OOB (out-of-bag) or quantile regression
• Get multiple samples in each loop by batched optimization, with samples "diversified" by clustering (some margin is needed; in reality, suggested catalysts are not easy to realize)
• Impose several restrictions on new samples to be tested:
  • the compositions must sum to 1 (compositional restriction)
  • the number of elements in a catalyst is restricted (bounded nonzeros), because a catalyst with 60 elements would not be realistic
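The two restrictions and the batched, diversified selection on this slide can be illustrated with a short Python sketch. Everything named here is hypothetical (`random_composition`, `is_feasible`, `diversified_batch`), and the batch step uses greedy farthest-point selection among the top-scoring candidates as a simpler stand-in for the clustering mentioned on the slide. It is not the authors' implementation.

```python
import random


def random_composition(n_elements, max_nonzero, rng):
    """Sample a composition: nonnegative, sums to 1, at most max_nonzero nonzero entries."""
    k = rng.randint(1, max_nonzero)
    support = rng.sample(range(n_elements), k)
    weights = [rng.expovariate(1.0) for _ in support]
    total = sum(weights)
    x = [0.0] * n_elements
    for i, w in zip(support, weights):
        x[i] = w / total  # normalise so the chosen entries sum to 1
    return x


def is_feasible(x, max_nonzero, tol=1e-9):
    """Check the compositional restriction and the bound on the number of elements."""
    nonzero = sum(1 for v in x if v > tol)
    return all(v >= 0.0 for v in x) and abs(sum(x) - 1.0) <= tol and nonzero <= max_nonzero


def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def diversified_batch(candidates, scores, batch_size, pool_size=50):
    """Pick a batch from the top-scoring pool, greedily maximising mutual distance.

    Assumption: farthest-point selection replaces clustering for brevity.
    """
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    pool = order[:pool_size]
    chosen = [pool[0]]  # always keep the best-scoring candidate
    while len(chosen) < min(batch_size, len(pool)):
        rest = [i for i in pool if i not in chosen]
        nxt = max(rest, key=lambda i: min(sq_dist(candidates[i], candidates[j]) for j in chosen))
        chosen.append(nxt)
    return [candidates[i] for i in chosen]
```

In use, `scores` would hold an infill criterion such as EI evaluated on each feasible candidate, and the returned batch would be the set of catalysts proposed for the next round of experiments.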
Case 3. Predicting the catalytic activity (in prep)
Test on 3 datasets:
• Oxidative coupling of methane (OCM) [Zavyalova+ 2011]
• Water gas shift reaction (WGS) [Odabaşi+ 2014]
• CO oxidation [Günay+ 2013]
Plot legend: Our model, GPR-based BO, Random
ICReDD, Hokkaido University
Check the website for any collaborations and postdoc positions!
Our mission:
To rationally design and discover new chemical reactions by seamlessly fusing
• experimental sciences (realization)
• computational sciences (theory-driven)
• information sciences (data-driven)
Started Oct 2018; funded by the government at $6.4 million per year (for 10 years)
Sapporo, Hokkaido (map also marks Tokyo)
• population of 2 million (5th largest city in Japan)
• 6.3 m (248 inches) average annual snowfall
Institute for Chemical Reaction Design and Discovery
(WPI-ICReDD), Hokkaido University
Summary
• Predicting the d-band centers by ML (Takigawa et al., RSC Advances, 2016)
• Predicting the adsorption energy by ML (Toyao et al., JPC C, 2018)
• Predicting the experimentally reported catalytic activity by ML (Suzuki et al., in preparation)
Acknowledgements
Ken-ichi SHIMIZU (ICAT)
Satoru TAKAKUSAGI (ICAT)
Takashi TOYAO (ICAT)
Keisuke SUZUKI (DENSO)

Machine Learning and Surrogate Optimization on Heterogeneous Catalysts

  • 1. Machine Learning and Surrogate Optimization on Heterogeneous Catalysts Ichigaku Takigawa 2019 PRESTO International Symposium on Materials Informatics
 Feb 9-11, 2019 @ Tokyo Graduate School of Information Science and Technology, Hokkaido University Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University
  • 2. Heterogeneous catalysts and surface reactions. Wolfgang Pauli: “God made the bulk; the surface was invented by the devil.” Surface processes: adsorption, diffusion, desorption, dissociation, recombination. Surface sites: kinks, terraces, adatom, vacancy, steps. [Diagram: GAS (reactants) over SOLID (catalysts)] Many hard-to-quantify factors complicate their atomic-level characterization by modelling and experiments: • reaction conditions • composition • support • surface termination • particle size & morphology • atomic coordination environment • disordered/amorphous structures in their active state • ... High-fidelity simulations are too time-consuming...
  • 3. Then how can we characterize the catalytic activity? The d-electrons of transition metals govern... Several DFT-calculated indexes capture the trend to some extent... Hammer–Nørskov d-band model [plot: reaction rates vs. d-band center (εd − EF) / eV]: Volcano trends! Brønsted-Evans-Polanyi relation [plot: activation energy / eV vs. adsorption energy / eV]: Linear trends! (K. Shimizu et al, ACS Catal. 2, 1904 (2012))
  • 4. Outline: Our ML-based studies 1. Can we predict the d-band center? Predicting DFT-calculated values by machine learning (Takigawa et al, RSC Advances, 2016) 2. Can we predict the adsorption energy? Predicting DFT-calculated values by machine learning (Toyao et al, JPC C, 2018) 3. Can we predict the catalytic activity? Predicting values from experiments reported in the literature by machine learning (Suzuki et al, in preparation)
  • 5. Case 1. Predicting the d-band centers. The d-bands of transition metals play central roles. Two types of host/guest models: • 1% doped • overlayer [Figure: host and guest metals, 1% doped] Ruban, Hammer, Stoltze, Skriver, Nørskov, J Mol Catal A, 115:421-429 (1997); J. K. Nørskov, et al., Advances in Catalysis, 2000
  • 6. The beauty of the periodic table worked! Gradient boosting w/ 6 descriptors, 100 random splits each: training sets (75%) / test sets (25%): mean RMSE 0.153 eV; training sets (50%) / test sets (50%): mean RMSE 0.235 eV; training sets (25%) / test sets (75%): mean RMSE 0.402 eV. [Tables: d-band center values (eV) on an 11 × 11 grid of metals (Fe, Co, Ni, Cu, Ru, Rh, Pd, Ag, Ir, Pt, Au), showing the training and test entries of each split]
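The evaluation protocol on this slide (repeat a random train/test split 100 times and report the mean test RMSE for each training fraction) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' code: `fit_line` is a hypothetical stand-in for the gradient boosting model, the toy data are synthetic rather than d-band centers, and all function names are illustrative.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (stand-in for any regressor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx > 0 else 0.0
    b = my - a * mx
    return lambda x: a * x + b


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def repeated_holdout_rmse(xs, ys, test_fraction, n_repeats=100, seed=0):
    """Mean and standard deviation of test RMSE over repeated random splits."""
    rng = random.Random(seed)
    n_test = max(1, round(test_fraction * len(xs)))
    scores = []
    for _ in range(n_repeats):
        idx = list(range(len(xs)))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(rmse([ys[i] for i in test], [model(xs[i]) for i in test]))
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return mean, std


# Toy data (hypothetical): a noisy linear trend, not real d-band centers.
data_rng = random.Random(1)
xs = [i / 10 for i in range(60)]
ys = [-1.0 - 0.5 * x + data_rng.gauss(0, 0.2) for x in xs]
for frac in (0.25, 0.50, 0.75):
    mean, std = repeated_holdout_rmse(xs, ys, frac)
    print(f"test fraction {frac:.2f}: mean RMSE {mean:.3f} +/- {std:.3f}")
```

Averaging over many splits reduces the dependence of the reported error on one lucky or unlucky partition, which matters when the table holds only on the order of a hundred entries.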
  • 7. The ML model. 9 readily available descriptors pretested for host & guest (18 in total): • Group (G) • Bulk Wigner–Seitz radius (R) in Å • Atomic number (AN) • Atomic mass (AM) in g mol⁻¹ • Period (P) • Electronegativity (EN) • Ionization energy (IE) in eV • Enthalpy of fusion (∆fusH) in J g⁻¹ • Density at 25 ℃ (ρ) in g cm⁻³. Gradient Boosted Tree Regression (GBR) with only 6 descriptors: (1) Group in the periodic table (host) (2) Density at 25 ℃ (host) (3) Enthalpy of fusion (guest) (4) Ionization energy (guest) (5) Enthalpy of fusion (host) (6) Ionization energy (host)
  • 8. 11 ML methods pretested (training sets (75%), test sets (25%)) [3 Tree Ensembles (Nonlinear Regression Models)] GBR (Gradient Boosted Tree Regression); ETR (Extra-Trees Regression); RFR (Random Forest Regression) [3 Kernel Methods (Nonlinear Regression Models)] GPR (Gaussian Process Regression); KRR (Kernel Ridge Regression); SVR (Support Vector Regression) [5 Linear Regression Models] OLS (Ordinary Least Squares Regression); PLS (Partial Least Squares Regression); LASSO (Lasso Regression); RIDGE (Ridge Regression); RANSAC (Random Sample Consensus Regression) [Plot: methods shown in the order GBR, ETR, GPR, RFR, KRR, OLS, RIDGE, PLS, RANSAC, SVR, LASSO]
  • 9. Tree ensemble regressors (GBR, ETR, RFR). Decision Tree (Regression Tree): the target $y = \sin\left(\sqrt{x_1^2 + x_2^2}\right)$ is approximated by $\hat{y}$, which takes constant values $c_1, c_2, c_3, \ldots$ on regions of the $(x_1, x_2)$ plane. • Region-wise constant prediction • The regions are given by recursive axis-parallel partitioning of the data space. Tree Ensemble: $\hat{y}$ = (tree 1) + (tree 2) + (tree 3) + ..., which again approximates $y = \sin\left(\sqrt{x_1^2 + x_2^2}\right)$.
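The two properties listed on this slide (region-wise constant prediction and recursive axis-parallel partitioning) can be sketched as a small regression tree fitted to y = sin(sqrt(x1² + x2²)). This is an illustrative sketch only: `build_tree`, `predict`, the depth, `min_leaf`, and the sampled data are assumptions made for the example, and library implementations are far more efficient.

```python
import math
import random


def build_tree(points, targets, depth, min_leaf=5):
    """Recursively split on the axis/threshold that minimizes squared error."""
    mean = sum(targets) / len(targets)
    if depth == 0 or len(points) < 2 * min_leaf:
        return mean  # leaf: one constant value for the whole region
    best = None
    for axis in range(len(points[0])):
        order = sorted(range(len(points)), key=lambda i: points[i][axis])
        for cut in range(min_leaf, len(order) - min_leaf + 1):
            lo, hi = order[:cut], order[cut:]
            if points[lo[-1]][axis] == points[hi[0]][axis]:
                continue  # equal values cannot be separated by a threshold
            sse = 0.0
            for side in (lo, hi):
                m = sum(targets[i] for i in side) / len(side)
                sse += sum((targets[i] - m) ** 2 for i in side)
            if best is None or sse < best[0]:
                thr = 0.5 * (points[lo[-1]][axis] + points[hi[0]][axis])
                best = (sse, axis, thr, lo, hi)
    if best is None:
        return mean
    _, axis, thr, lo, hi = best
    left = build_tree([points[i] for i in lo], [targets[i] for i in lo],
                      depth - 1, min_leaf)
    right = build_tree([points[i] for i in hi], [targets[i] for i in hi],
                       depth - 1, min_leaf)
    return (axis, thr, left, right)


def predict(tree, x):
    """Follow the axis-parallel splits down to a leaf constant."""
    while isinstance(tree, tuple):
        axis, thr, left, right = tree
        tree = left if x[axis] <= thr else right
    return tree


# Hypothetical sample of the target function from the slide.
rng = random.Random(0)
X = [(rng.uniform(-3, 3), rng.uniform(-3, 3)) for _ in range(300)]
y = [math.sin(math.sqrt(a * a + b * b)) for a, b in X]
tree = build_tree(X, y, depth=4)
y_mean = sum(y) / len(y)
mse = sum((predict(tree, x) - t) ** 2 for x, t in zip(X, y)) / len(X)
baseline = sum((t - y_mean) ** 2 for t in y) / len(y)
print(f"training MSE: tree {mse:.3f} vs. single constant {baseline:.3f}")
```

A single shallow tree gives a coarse staircase approximation; the ensembles on this slide combine many such trees to smooth it out.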
  • 10. Tree ensemble regressors (GBR, ETR, RFR) Advantages • quick, nonlinear, parallelizable • highly accurate (widely used in many winning solutions for data prediction competitions) • usually less hyperparameter-dependent (compared to kernel methods and neural networks) • conservative extrapolation • "variable importance" provided • popular implementations: scikit-learn, XGBoost (by DMLC), LightGBM (by Microsoft). How to generate multiple decision trees from the data? • RFR / ETR: random patches (random subsampling of instances and variables) or random splits • GBR (can also be mixed with the above strategy): sequentially add a new tree to compensate for the weak points of the current ensemble.
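The GBR idea of sequentially adding a tree that compensates for the weak points of the current ensemble can be sketched for squared loss, where each new tree is fitted to the current residuals. This is a minimal sketch under stated assumptions: one-dimensional inputs, depth-1 trees (stumps), and hypothetical names and settings (`fit_stump`, `boost`, `n_trees`, `learning_rate`); it is not how scikit-learn, XGBoost, or LightGBM are implemented internally.

```python
import math


def fit_stump(xs, residuals):
    """Best single-threshold split of 1-D inputs under squared error."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    best = None
    for cut in range(1, len(order)):
        lo, hi = order[:cut], order[cut:]
        if xs[lo[-1]] == xs[hi[0]]:
            continue  # equal inputs cannot be separated
        m_lo = sum(residuals[i] for i in lo) / len(lo)
        m_hi = sum(residuals[i] for i in hi) / len(hi)
        sse = sum((residuals[i] - m_lo) ** 2 for i in lo)
        sse += sum((residuals[i] - m_hi) ** 2 for i in hi)
        if best is None or sse < best[0]:
            best = (sse, 0.5 * (xs[lo[-1]] + xs[hi[0]]), m_lo, m_hi)
    if best is None:
        return lambda x: 0.0  # all inputs identical: nothing to split
    _, thr, m_lo, m_hi = best
    return lambda x: m_lo if x <= thr else m_hi


def boost(xs, ys, n_trees=50, learning_rate=0.3):
    """Squared-loss boosting: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    stumps, history = [], []
    preds = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        history.append(sum(r * r for r in residuals) / len(xs))  # MSE so far
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]

    def model(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return model, history


# Toy data (hypothetical): a smooth 1-D curve.
xs = [i / 20 for i in range(80)]
ys = [math.sin(x) for x in xs]
model, history = boost(xs, ys)
print(f"training MSE before first tree {history[0]:.4f}, "
      f"before last tree {history[-1]:.4f}")
```

With leaf values set to residual means and a learning rate in (0, 1], the training error cannot increase from one stage to the next, which is why `history` is non-increasing here.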
  • 11. Descriptor analysis and evaluation (training sets (75%), test sets (25%)). Descriptor importances → descriptor selection (top-k). GBR with 18 descriptors: 100 times, mean RMSE 0.204±0.047 eV; GBR with 6 descriptors: 100 times, mean RMSE 0.212±0.047 eV; GBR with 4 descriptors: 100 times, mean RMSE 0.214±0.046 eV
  • 12. Case 2. Predicting the adsorption energy of CH3 (training sets (75%), test sets (25%)). DFT calculation of adsorption energy: • 10 hours on our 32-core workstation (CH3 on the Cu monometallic surface) • even longer (about 34 hours) for a system containing another metal such as Pb. ML prediction: • < 1 sec on our 1-core laptop • depends not on the target system but on the method we choose
  • 13. But what do these mean for catalyst design and discovery!? [Repeats the results of slide 6: gradient boosting w/ 6 descriptors, 100 random splits each; mean RMSE 0.153 eV (training 75% / test 25%), 0.235 eV (training 50% / test 50%), 0.402 eV (training 25% / test 75%)]
  • 14. Standard procedure for optimizing the activity All your available data • Experiments • Simulations Hypothesis generation (abduction) Check results Feedback
  • 15. Replace the time-consuming and costly part with ML. All your available data → Machine Learning (any "data-driven" predictions) → Make predictions for many possible candidates → Check the best predicted ones → Feed them as training data. The "Surrogate (or proxy)" model for • Demanding experiments • Time-consuming hi-fidelity simulations
  • 16. Replace the time-consuming and costly part with ML. All your available data → Machine Learning (any "data-driven" predictions) → Make predictions for many possible candidates → Check the best predicted ones → Feed them as training data. The "Surrogate (or proxy)" model for • Demanding experiments • Time-consuming hi-fidelity simulations. This simple procedure won't work in most practical cases!
  • 17. ML itself is not for "discovery" [Plot: Activity vs. x, Descriptors (High dimensional)] Training samples; Best known value
  • 18. ML itself is not for "discovery" [Plot: Activity vs. x, Descriptors (High dimensional)] Training samples; Best known value; ML prediction: Fitted to minimize the average error.
  • 19. ML itself is not for "discovery" [Plot: Activity vs. x, Descriptors (High dimensional)] Training samples; Best known value; ML prediction: Fitted to minimize the average error. Nicely predicted for the average (but mediocre)
  • 20. ML itself is not for "discovery" [Plot: Activity vs. x, Descriptors (High dimensional)] Training samples; Best known value; ML prediction: Fitted to minimize the average error. A nice "discovery" can deviate largely from the average of the knowns (an outlier). Nicely predicted for the average (but mediocre)
  • 21. ML itself is not for "discovery" [Plot: Activity vs. x, Descriptors (High dimensional)] Training samples; Best known value; ML prediction: Fitted to minimize the average error. ML captures the average trend of the available knowns; "discovery" corresponds to something not in the knowns: Mismatch. A nice "discovery" can deviate largely from the average of the knowns (an outlier). Nicely predicted for the average (but mediocre)
  • 22. An ML model is just representative of the training data. Highly Inaccurate Model Predictions from Extrapolation (Lohninger 1999): "Beware of the perils of extrapolation, and understand that ML algorithms build models that are representative of the available training samples." "Exploration": to obtain new knowledge/data (we also need this). "Exploitation": to use the knowledge/data to improve the performance (ML is basically for this).
  • 23. Surrogate optimization (model-based optimization) [Plot: Activity vs. x, Descriptors (High dimensional)] ML prediction. Use ML to guide the balance between "exploitation" and "exploration"!
  • 24. Surrogate optimization (model-based optimization) [Plot: Activity vs. x, Descriptors (High dimensional)] ML prediction; "Uncertainty" of prediction, e.g. prediction variance. Use ML to guide the balance between "exploitation" and "exploration"!
  • 25. Surrogate optimization (model-based optimization) x<latexit sha1_base64="BLB8K/n7QYAsE73zsDEUiBvCSV8=">AAAB/XicbVDLSgMxFL2pr1pfVZdugkVwVWZE0GXRjcsK9gHtUDJppo1NMkOSEctQ/AW3uncnbv0Wt36JaTsLbT1w4XDOvZzLCRPBjfW8L1RYWV1b3yhulra2d3b3yvsHTROnmrIGjUWs2yExTHDFGpZbwdqJZkSGgrXC0fXUbz0wbXis7uw4YYEkA8UjTol1UrMbyuxx0itXvKo3A14mfk4qkKPeK393+zFNJVOWCmJMx/cSG2REW04Fm5S6qWEJoSMyYB1HFZHMBNns2wk+cUofR7F2oyyeqb8vMiKNGcvQbUpih2bRm4r/eqFcSLbRZZBxlaSWKToPjlKBbYynVeA+14xaMXaEUM3d75gOiSbUusJKrhR/sYJl0jyr+l7Vvz2v1K7yeopwBMdwCj5cQA1uoA4NoHAPz/ACr+gJvaF39DFfLaD85hD+AH3+ADzJlfc=</latexit><latexit sha1_base64="BLB8K/n7QYAsE73zsDEUiBvCSV8=">AAAB/XicbVDLSgMxFL2pr1pfVZdugkVwVWZE0GXRjcsK9gHtUDJppo1NMkOSEctQ/AW3uncnbv0Wt36JaTsLbT1w4XDOvZzLCRPBjfW8L1RYWV1b3yhulra2d3b3yvsHTROnmrIGjUWs2yExTHDFGpZbwdqJZkSGgrXC0fXUbz0wbXis7uw4YYEkA8UjTol1UrMbyuxx0itXvKo3A14mfk4qkKPeK393+zFNJVOWCmJMx/cSG2REW04Fm5S6qWEJoSMyYB1HFZHMBNns2wk+cUofR7F2oyyeqb8vMiKNGcvQbUpih2bRm4r/eqFcSLbRZZBxlaSWKToPjlKBbYynVeA+14xaMXaEUM3d75gOiSbUusJKrhR/sYJl0jyr+l7Vvz2v1K7yeopwBMdwCj5cQA1uoA4NoHAPz/ACr+gJvaF39DFfLaD85hD+AH3+ADzJlfc=</latexit><latexit sha1_base64="BLB8K/n7QYAsE73zsDEUiBvCSV8=">AAAB/XicbVDLSgMxFL2pr1pfVZdugkVwVWZE0GXRjcsK9gHtUDJppo1NMkOSEctQ/AW3uncnbv0Wt36JaTsLbT1w4XDOvZzLCRPBjfW8L1RYWV1b3yhulra2d3b3yvsHTROnmrIGjUWs2yExTHDFGpZbwdqJZkSGgrXC0fXUbz0wbXis7uw4YYEkA8UjTol1UrMbyuxx0itXvKo3A14mfk4qkKPeK393+zFNJVOWCmJMx/cSG2REW04Fm5S6qWEJoSMyYB1HFZHMBNns2wk+cUofR7F2oyyeqb8vMiKNGcvQbUpih2bRm4r/eqFcSLbRZZBxlaSWKToPjlKBbYynVeA+14xaMXaEUM3d75gOiSbUusJKrhR/sYJl0jyr+l7Vvz2v1K7yeopwBMdwCj5cQA1uoA4NoHAPz/ACr+gJvaF39DFfLaD85hD+AH3+ADzJlfc=</latexit><latexit 
sha1_base64="BLB8K/n7QYAsE73zsDEUiBvCSV8=">AAAB/XicbVDLSgMxFL2pr1pfVZdugkVwVWZE0GXRjcsK9gHtUDJppo1NMkOSEctQ/AW3uncnbv0Wt36JaTsLbT1w4XDOvZzLCRPBjfW8L1RYWV1b3yhulra2d3b3yvsHTROnmrIGjUWs2yExTHDFGpZbwdqJZkSGgrXC0fXUbz0wbXis7uw4YYEkA8UjTol1UrMbyuxx0itXvKo3A14mfk4qkKPeK393+zFNJVOWCmJMx/cSG2REW04Fm5S6qWEJoSMyYB1HFZHMBNns2wk+cUofR7F2oyyeqb8vMiKNGcvQbUpih2bRm4r/eqFcSLbRZZBxlaSWKToPjlKBbYynVeA+14xaMXaEUM3d75gOiSbUusJKrhR/sYJl0jyr+l7Vvz2v1K7yeopwBMdwCj5cQA1uoA4NoHAPz/ACr+gJvaF39DFfLaD85hD+AH3+ADzJlfc=</latexit> Descriptors (High dimensional) Activity ML prediction <latexit sha1_base64="0VEGB1BS2t8KmbZWf3FuR1QlwM8=">AAACrnichVFLLwNRFP6MV72LjcSm0RA2zZmiWithY+nVIjTNzLitiXllZtqg8QdsLSywILEQP8PGH7DwE8SSxMbCmemIWJRzc+899zvnO/e796iOoXs+0XOL1NrW3tEZ6+ru6e3rH4gPDhU8u+pqIq/Zhu1uqYonDN0SeV/3DbHluEIxVUNsqgdLQXyzJlxPt60N/8gRRVOpWHpZ1xSfoe3y5K5q1g9PpkrxJKVy2QzNpBOUIsqmKcPOLMk5OZeQGQksichW7PgjdrEHGxqqMCFgwWffgAKPxw5kEBzGiqgz5rKnh3GBE3Qzt8pZgjMURg94rfBpJ0ItPgc1vZCt8S0GT5eZCYzTE93RGz3SPb3QZ9Na9bBGoOWId7XBFU5p4HRk/eNflsm7j/0f1p+afZSRDbXqrN0JkeAVWoNfOz5/W59fG69P0A29sv5reqYHfoFVe9duV8XaxR96VNbS/MeCeJTBLfzuU6K5U0in5OkUrc4kFxajZsYwijFMcsfmsIBlrCDPN5g4wyWuJJIKUlEqNVKllogzjF8m7X8BK8iaxA==</latexit> "Uncertainty" of prediction e.g. prediction variance e.g.
 "expected improvement" Use ML to guide the balance between "exploitation" and "exploration"!
  • 26. Surrogate optimization (model-based optimization) Descriptors x (high-dimensional) → ML prediction → Activity. "Uncertainty" of prediction, e.g. prediction variance. Infill criterion, e.g. "expected improvement".
 1. Initial sampling (DoE) 2. Loop: 2.1. Construct a surrogate model. 2.2. Search the infill criterion. 2.3. Add new samples (intervention).
 • Reinforcement learning • Black-box optimization • Bayesian optimization • Sequential design of experiments • Multi-armed bandit • Evolutionary computation • Game-theoretic approaches
 : an open research topic in ML. Use ML to guide the balance between "exploitation" and "exploration"!
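The loop on this slide (initial sampling, surrogate model, infill criterion, new samples) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the speaker's implementation: the surrogate is a bootstrap ensemble of 1-nearest-neighbour predictors, the search space is one-dimensional, the infill criterion is expected improvement (EI) for maximization under a normal approximation, and all function names are hypothetical.

```python
import math
import random


def expected_improvement(mu, sigma, best):
    """EI for maximization, assuming a normal predictive distribution."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf


def fit_surrogate(xs, ys, rng, n_models=30):
    """Toy surrogate (assumption): bootstrap resamples of the observed data."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        picks = [rng.randrange(n) for _ in range(n)]
        models.append([(xs[i], ys[i]) for i in picks])
    return models


def predict(models, x):
    """Mean and spread of the 1-nearest-neighbour predictions of the ensemble."""
    preds = [min(m, key=lambda p: abs(p[0] - x))[1] for m in models]
    mu = sum(preds) / len(preds)
    var = sum((p - mu) ** 2 for p in preds) / len(preds)
    return mu, math.sqrt(var)


def surrogate_optimize(f, lo, hi, n_init=5, n_iter=20, n_cand=200, seed=0):
    rng = random.Random(seed)
    # 1. Initial sampling (here: uniform random instead of a designed DoE).
    xs = [rng.uniform(lo, hi) for _ in range(n_init)]
    ys = [f(x) for x in xs]
    # 2. Loop.
    for _ in range(n_iter):
        models = fit_surrogate(xs, ys, rng)               # 2.1 surrogate model
        best = max(ys)
        cands = [rng.uniform(lo, hi) for _ in range(n_cand)]
        x_new = max(                                      # 2.2 infill criterion
            cands,
            key=lambda x: expected_improvement(*predict(models, x), best),
        )
        xs.append(x_new)                                  # 2.3 add new sample
        ys.append(f(x_new))
    i_best = max(range(len(ys)), key=ys.__getitem__)
    return xs[i_best], ys[i_best], xs, ys
```

For example, `surrogate_optimize(lambda x: -(x - 2.0) ** 2, 0.0, 5.0)` evaluates a hypothetical objective 25 times in total (5 initial points plus 20 loop iterations) and returns the best point seen. In practice each evaluation of `f` would be an experiment or an expensive simulation.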
  • 27. Structure-activity landscapes are nonsmooth... (J. Med. Chem. 2012, 55, 2932−2942) The structure-activity landscape is often nonsmooth: small changes in descriptors can strongly affect the activity/selectivity. Figure labels: activity cliffs, selectivity cliffs.
  • 34. Batched + restricted surrogate optimization Our optimization model for heterogeneous catalyst data: • Nonsmooth interpolation (SMAC-like, by tree ensemble regressors)
 with multistart local search using L-BFGS + random samples • Use EI as the infill criterion, estimated by OOB or quantile regression • Get multiple samples in each loop
 by batched optimization with "diversified" samples obtained by clustering
 (some margin is needed; in reality, it is not easy to realize the suggested catalysts) • Impose several restrictions on new samples to be tested • the compositions sum to 1 (compositional restriction) • restrict the number of elements in a catalyst (bounded nonzeros)
 (because a catalyst with 60 elements would not be realistic...)
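Two of the ingredients listed above, the restrictions on candidates (compositions sum to 1, bounded number of nonzero elements) and the "diversified" batch obtained by clustering, can be illustrated with a small Python sketch. It is a hedged toy version, not the authors' code: candidates are drawn at random rather than by L-BFGS multistart search, the clustering is a plain k-means written inline, and `scores` stands in for whatever infill criterion (e.g. EI) is used. All names are hypothetical.

```python
import random


def random_composition(n_elements, max_nonzero, rng):
    """Composition vector with at most max_nonzero nonzeros that sums to 1."""
    k = rng.randint(1, max_nonzero)
    chosen = rng.sample(range(n_elements), k)
    weights = [rng.random() + 1e-9 for _ in chosen]
    total = sum(weights)
    x = [0.0] * n_elements
    for idx, w in zip(chosen, weights):
        x[idx] = w / total
    return x


def is_feasible(x, max_nonzero, tol=1e-9):
    """Check the compositional and bounded-nonzeros restrictions."""
    nonneg = all(v >= 0.0 for v in x)
    sums_to_one = abs(sum(x) - 1.0) <= tol
    few_elements = sum(v > 0.0 for v in x) <= max_nonzero
    return nonneg and sums_to_one and few_elements


def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


def kmeans_labels(points, k, rng, n_rounds=10):
    """Plain k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_rounds):
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster became empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def select_batch(cands, scores, batch_size, rng):
    """Pick the best-scoring candidate from each cluster (at most batch_size)."""
    labels = kmeans_labels(cands, batch_size, rng)
    batch = []
    for c in range(batch_size):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if idx:
            batch.append(cands[max(idx, key=scores.__getitem__)])
    return batch
```

A possible use: draw a few hundred feasible candidates with `random_composition(10, 3, rng)`, score them with the infill criterion, and pass both lists to `select_batch` to obtain up to `batch_size` mutually different suggestions for the next round of experiments.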
  • 35. Case 3. Predicting the catalytic activity (in prep) Test on 3 datasets: • Oxidative coupling of methane (OCM)
 [Zavyalova+ 2011] • Water gas shift reaction (WGS)
 [Odabaşi+ 2014] • CO oxidation [Günay+ 2013] Methods compared: our model, GPR-based BO, random.
  • 36. ICReDD, Hokkaido University Check the website for any collaborations and postdoc positions! Our mission:
 To rationally design and discover new chemical reactions
 by seamlessly fusing • experimental sciences (realization) • computational sciences (theory-driven) • information sciences (data-driven). Started Oct 2018, funded by the government at $6.4 million per year (for 10 years). Sapporo, Hokkaido (map labels: Sapporo, Tokyo, HOKKAIDO) • population of 2 million
 (5th largest city in Japan) • 6.3 m / 248 inches
 avg. annual snowfall. Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University
  • 38. Summary • Predicting the d-band centers by ML
 (Takigawa et al, RSC Advances. 2016) • Predicting the adsorption energy by ML
 (Toyao et al, JPC C 2018) • Predicting the experimentally-reported catalytic activity by ML
 (Suzuki et al, in preparation) Acknowledgements: Ken-ichi SHIMIZU (ICAT), Satoru TAKAKUSAGI (ICAT), Takashi TOYAO (ICAT), Keisuke SUZUKI (DENSO)