1. Transfer Learning for
Software Performance Analysis
An Exploratory Analysis
Pooyan Jamshidi · Norbert Siegmund · Miguel Velez · Christian Kaestner · Akshay Patel · Yuvraj Agarwal
4. Empirical observations confirm that systems are becoming increasingly configurable
Modern systems are:
• increasingly configurable with software evolution
• deployed in dynamic and uncertain environments
Understanding and Dealing with Over-Designed Configuration in System Software
Tianyin Xu*, Long Jin*, Xuepeng Fan*‡, Yuanyuan Zhou*,
Shankar Pasupathy† and Rukma Talwadker†
*University of California San Diego, ‡Huazhong Univ. of Science & Technology, †NetApp, Inc
{tixu, longjin, xuf001, yyzhou}@cs.ucsd.edu
{Shankar.Pasupathy, Rukma.Talwadker}@netapp.com
ABSTRACT
Configuration problems are not only prevalent, but also severely impair the reliability of today's system software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters ("knobs"). With hundreds of knobs, configuring system software to ensure high reliability and performance becomes a daunting, error-prone task.
This paper makes a first step in understanding a fundamental question of configuration design: "do users really need so many knobs?" To provide a quantitative answer, we study the configuration settings of real-world users, including thousands of customers of a commercial storage system (Storage-A), and hundreds of users of two widely used open-source system software projects. Our study reveals a series of interesting findings that motivate software architects and developers to be more cautious and disciplined in configuration design. Motivated by these findings, we provide a few concrete, practical guidelines which can significantly reduce the configuration space. Taking Storage-A as an example, the guidelines can remove 51.9% of its parameters and simplify 19.7% of the remaining ones with little impact on existing users. We also study existing configuration navigation methods in the context of "too many knobs" to understand their effectiveness in dealing with over-designed configuration, and to provide practices for building navigation support in system software.
[Plots: number of parameters vs. release time for Storage-A (2006–2014), MySQL (1999–2014), Apache (1998–2014), and Hadoop MapReduce/HDFS (2006–2014).]
Figure 1: The increasing number of configuration parameters with software evolution. Storage-A is a commercial storage system from a major storage company in the U.S.
[Tianyin Xu, et al., “Too Many Knobs…”, FSE’15]
15. This is where transfer learning comes into play
[Diagram: Source (given) and Target (learn), each with data and a model; transferable knowledge flows from source to target.]
II. INTUITION
Understanding the performance behavior of configurable software systems can enable (i) performance debugging, (ii) performance tuning, (iii) design-time evolution, or (iv) runtime adaptation [11]. We lack an empirical understanding of how the performance behavior of a system will vary when the environment of the system changes. Such an empirical understanding would provide important insights for developing faster and more accurate learning techniques that allow us to make predictions and optimizations of performance for highly configurable systems in changing environments [10]. For instance, we can learn the performance behavior of a system on cheap hardware in a controlled lab environment and use it to understand the performance behavior of the system on a production server before shipping to the end user. More specifically, we would like to know the relationship between the performance of a system in a specific environment (characterized by software configuration, hardware, workload, and system version) and its performance when we vary its environmental conditions.
In this research, we aim for an empirical understanding of performance behavior to improve learning via an informed sampling process. In other words, we aim at learning a performance model in a changed environment based on a well-suited sampling set that has been determined by the knowledge we gained in other environments. Therefore, the main research question is whether there exists common information (transferable/reusable knowledge) that applies to both source and target environments of systems and therefore can be carried over from either environment to the other. This transferable knowledge is a case for transfer learning [10].
Let us first introduce the different changes that we consider in this work: (i) Configuration: A configuration is a set of decisions over configuration options. This is the primary variation in the system that we consider to understand performance behavior. More specifically, we would like to understand how the performance of the system under study will be influenced as a result of configuration changes. This kind of change has been the primary focus of previous work in this area [26], [9]; however, prior approaches assumed a predetermined environment (i.e., a specific workload, hardware, and software version). (ii) Workload: The workload describes the input on which the system operates. The performance behavior of a system can vary under different workload conditions.
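To make the environment notion concrete, here is a minimal sketch of an environment space E = W × H × V with purely hypothetical workloads, hardware, and version values (none of these come from the study):

```python
from itertools import product

# Hypothetical environment dimensions (illustrative values only).
workloads = ["read-heavy", "write-heavy"]  # W
hardware = ["lab-node", "prod-server"]     # H
versions = ["v1.0", "v2.0"]                # V

# Environment space E = W x H x V; each member is an instance e = [w, h, v].
E = [list(e) for e in product(workloads, hardware, versions)]

print(len(E))  # 8 instances: 2 * 2 * 2
print(E[0])    # ['read-heavy', 'lab-node', 'v1.0']
```

A source/target pair in this terminology is simply two such instances, e.g. a lab node versus a production server under the same workload.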
A. Preliminary concepts
In this section, we provide formal definitions of four concepts that we use throughout this study. The formal notations enable us to concisely convey concepts throughout the paper.
1) Configuration and environment space: Let Fi indicate the i-th feature of a configurable system A, which is either enabled or disabled, and one of them holds by default. The configuration space is mathematically a Cartesian product of all the features, C = Dom(F1) × · · · × Dom(Fd), where Dom(Fi) = {0, 1}. A configuration of a system is then a member of the configuration space (feature space) where all the parameters are assigned a specific value in their range (i.e., complete instantiations of the system's parameters). We also describe an environment instance by 3 variables e = [w, h, v] drawn from a given environment space E = W × H × V, where W, H, and V respectively represent the sets of possible values for workload, hardware, and system version.
2) Performance model: Given a software system A with configuration space F and environmental instances E, a performance model is a black-box function f : F × E → R given some observations of the system performance for each combination of the system's features x ∈ F in an environment e ∈ E. To construct a performance model for a system A with configuration space F, we run A in environment instance e ∈ E on various combinations of configurations xi ∈ F, and record the resulting performance values yi = f(xi) + εi, xi ∈ F, where εi ∼ N(0, σi). The training data for our regression models is then simply Dtr = {(xi, yi)}, i = 1, ..., n. In other words, a response function is simply a mapping from the input space to a measurable performance metric that produces interval-scaled data (here we assume it produces real numbers).
3) Performance distribution: For the performance model, we measured and associated the performance response with each configuration; now we introduce another concept where we vary the environment and measure the performance. An empirical performance distribution is a stochastic process, pd : E → Δ(R), that defines a probability distribution over performance measures for each environmental condition. To construct a performance distribution for a system A with configuration space F, similarly to the process of deriving the performance models, we run A on various combinations of configurations xi ∈ F for a specific environment instance e ∈ E and record the resulting performance values yi. We then fit a probability distribution to the set of measured performance values De = {yi} using kernel density estimation [2].
[Diagram: learn on source, extract transferable knowledge, reuse it to learn on target.]
• An ML approach
• Uses the knowledge learned on the source
• To learn a cheaper model for the target
16. A simple Transfer Learning via model shift
Machines twice as fast
[Figure: source and target response functions over the configuration space, related by a transfer function; y-axis: throughput (higher is better).]
[Pavel Valov, et al., "Transferring performance prediction models…", ICPE'17]
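A minimal sketch of such a linear "model shift" (hypothetical response functions, in the spirit of Valov et al. rather than their exact implementation): keep the source model, measure only a few target configurations, fit a linear transfer function, and reuse the source model on the target.

```python
# Hypothetical source response and a target that is a linear shift of it
# (e.g., machines twice as fast plus a constant overhead).
def f_source(x):
    return 10.0 + 3.0 * x[0] - 2.0 * x[1]

def f_target(x):
    return 0.5 * f_source(x) + 4.0

configs = [(a, b) for a in (0, 1) for b in (0, 1)]

# Fit the transfer function g(y) = w*y + c by least squares,
# using only two measured target configurations.
sample = configs[:2]
xs = [f_source(x) for x in sample]
ys = [f_target(x) for x in sample]
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
w = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sum((a - mx) ** 2 for a in xs)
c = my - w * mx

# Predict the target from the (cheap) source model for all configurations.
pred = [w * f_source(x) + c for x in configs]
true = [f_target(x) for x in configs]
print(max(abs(p - t) for p, t in zip(pred, true)))  # 0.0: exact for a linear shift
```

When the environment change is not such a simple shift, no linear g exists and the transferred model can mispredict, which is exactly the failure mode the next slide illustrates.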
17. However, when the environment change is
not homogeneous, things can go wrong
[Figure: source and target response functions under a non-homogeneous environment change, where a simple transfer function no longer fits; y-axis: throughput (higher is better).]
19. Even learning from a source with a small
correlation is better than no transfer
[Box plots of absolute percentage error (%) per source.]

Source        s      s1     s2     s3     s4     s5     s6
noise level   0      5      10     15     20     25     30
corr. coeff.  0.98   0.95   0.89   0.75   0.54   0.34   0.19
µ(pe)         15.34  14.14  17.09  18.71  33.06  40.93  46.75

Fig. 6: Prediction accuracy of the model learned with samples from different sources of different relatedness to the target. GP is the model without transfer learning.
Models become more accurate when the source is more related to the target.
[P. Jamshidi, et al., “Transfer learning for improving model predictions ….”, SEAMS’17]
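The relatedness measure in the table above is the correlation between source and target responses over the same configurations. A minimal sketch with synthetic data (purely illustrative, not the study's measurements):

```python
import math
import random

random.seed(2)

def pearson(u, v):
    # Pearson correlation coefficient between two response vectors.
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

# Target responses over 20 configurations, plus two candidate sources:
# the same response perturbed by low noise vs. high noise.
target = [float(i) for i in range(20)]
low_noise_src = [t + random.gauss(0, 1.0) for t in target]
high_noise_src = [t + random.gauss(0, 30.0) for t in target]

print(round(pearson(target, low_noise_src), 2))   # close to 1
print(round(pearson(target, high_noise_src), 2))  # typically much weaker
```

This mirrors the table's trend: as the injected noise grows, the correlation coefficient drops and, per Fig. 6, the transferred model's prediction error rises.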
20. We need to know when and why transfer learning works
• When does simple transfer work, and when does it fail?
• How are the source and target "related"?
• What knowledge can we transfer across environments?
22. Establishing theoretical principles of transfer
learning for performance analysis
[Slide: Theoretical Principles of Transfer Learning — diagram with Learn, Extract, Reuse, Learn steps]
Many systems are now configurable
Here is where transfer learning comes into the picture
[Diagram: Source (Given) → Transferable Knowledge → Target (Learn), connecting Data and Model]
II. INTUITION
Understanding the performance behavior of configurable software systems can enable (i) performance debugging, (ii) performance tuning, (iii) design-time evolution, or (iv) runtime adaptation [11]. We lack an empirical understanding of how the performance behavior of a system will vary when the environment of the system changes. Such empirical understanding will provide important insights for developing faster and more accurate learning techniques that allow us to make predictions and optimizations of performance for highly configurable systems in changing environments [10]. For instance, we can learn the performance behavior of a system on cheap hardware in a controlled lab environment and use that to understand the performance behavior of the system on a production server before shipping it to the end user. More specifically, we would like to know what the relationship is between the performance of a system in a specific environment (characterized by software configuration, hardware, workload, and system version) and its performance when we vary those environmental conditions.
In this research, we aim for an empirical understanding of performance behavior to improve learning via an informed sampling process. In other words, we aim at learning a performance model in a changed environment based on a well-suited sampling set that has been determined by the knowledge we gained in other environments. Therefore, the main research question is whether there exists common information (transferable knowledge) that can be carried over from one environment to the other.
A. Preliminary concepts
In this section, we provide formal definitions of four concepts that we use throughout this study. The formal notations enable us to concisely convey these concepts throughout the paper.
1) Configuration and environment space: Let F_i indicate the i-th feature of a configurable system A, which is either enabled or disabled, and one of these two values holds by default. The configuration space is mathematically a Cartesian product of all the features, C = Dom(F_1) × · · · × Dom(F_d), where Dom(F_i) = {0, 1}. A configuration of a system is then a member of the configuration space (feature space) where all the parameters are assigned a specific value in their range (i.e., complete instantiations of the system's parameters). We also describe an environment instance by 3 variables e = [w, h, v] drawn from a given environment space E = W × H × V, where w, h, and v respectively range over sets of possible values for workload, hardware, and system version.
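As a concrete illustration, both spaces defined above can be enumerated directly. The feature count and the workload, hardware, and version names below are invented for this sketch; they are not from the paper's subject systems.

```python
# Sketch of the configuration space C = Dom(F_1) x ... x Dom(F_d) with
# Dom(F_i) = {0, 1}, and of the environment space E = W x H x V.
from itertools import product

d = 4                                              # number of binary features
configurations = list(product([0, 1], repeat=d))   # the full space C, 2^d members

# An environment instance e = [w, h, v] is one choice per dimension.
W = ["write-heavy", "read-heavy"]                  # workloads (illustrative)
H = ["lab-node", "production-server"]              # hardware (illustrative)
V = ["v1.0", "v2.0"]                               # system versions (illustrative)
environments = list(product(W, H, V))              # the environment space E

print(len(configurations), len(environments))      # 16 8
```

Even this toy example hints at the scaling problem: the configuration space doubles with every added binary feature, and each environment instance multiplies the measurement effort again.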
2) Performance model: Given a software system A with configuration space F and environmental instances E, a performance model is a black-box function f : F × E → R, given some observations of the system performance for each combination of the system's features x ∈ F in an environment e ∈ E. To construct a performance model for a system A with configuration space F, we run A in an environment instance e ∈ E on various combinations of configurations x_i ∈ F, and record the resulting performance values y_i = f(x_i) + ε_i, x_i ∈ F, where ε_i ∼ N(0, σ_i). The training data for our regression models is then simply D_tr = {(x_i, y_i)}_{i=1}^{n}. In other words, a
response function is simply a mapping from the input space to
a measurable performance metric that produces interval-scaled
data (here we assume it produces real numbers).
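A minimal sketch of this procedure: build the training data D_tr = {(x_i, y_i)} for one environment and fit a regression model to it. The response function below is synthetic and stands in for actually running system A; the linear model is just one possible choice of regression family.

```python
# Construct D_tr for one environment e and fit a black-box regression model.
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 50                                        # features, sample size
X = rng.integers(0, 2, size=(n, d)).astype(float)   # sampled configurations x_i

true_coeffs = np.array([3.0, -1.0, 0.5, 2.0])       # hidden "system" behavior

def measure(x):
    """Stand-in for running A and measuring: y_i = f(x_i) + eps_i."""
    return x @ true_coeffs + rng.normal(0.0, 0.1)

y = np.array([measure(x) for x in X])               # recorded responses y_i

# Fit a simple linear performance model (with intercept) to D_tr.
A = np.hstack([X, np.ones((n, 1))])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(coeffs[:d], 1))                      # close to the true coefficients
```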
3) Performance distribution: For the performance model, we measured and associated a performance response with each configuration; let us now introduce another concept where we vary the environment and measure the performance. An empirical performance distribution is a stochastic process, pd : E → Δ(R), that defines a probability distribution over performance measures for each environmental condition. To construct a performance distribution for a system A with configuration space F, similarly to the process of deriving the performance models, we run A on various combinations of configurations x_i ∈ F for a specific environment instance e ∈ E and record the resulting performance values y_i. We then fit a probability distribution to the set of measured performance values D_e = {y_i} using kernel density estimation [2].
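A sketch of this construction: collect many performance measurements under one environment, then fit a Gaussian kernel density estimate to D_e = {y_i}. The measurements below are synthetic, and the hand-rolled KDE (rather than a library routine) and the fixed bandwidth are choices of this sketch.

```python
# Fit an empirical performance distribution to measured values via Gaussian KDE.
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(loc=20.0, scale=2.0, size=200)   # measured D_e = {y_i} (synthetic)

def kde(samples, points, bandwidth=0.5):
    """Gaussian kernel density estimate of `samples`, evaluated at `points`."""
    diffs = (points[:, None] - samples[None, :]) / bandwidth
    kernels = np.exp(-0.5 * diffs ** 2) / np.sqrt(2 * np.pi)
    return kernels.mean(axis=1) / bandwidth

grid = np.linspace(10.0, 30.0, 201)
density = kde(y, grid)
mode = grid[np.argmax(density)]                 # should sit near the sample mean
mass = density.sum() * (grid[1] - grid[0])      # Riemann sum; should be roughly 1
print(round(mode, 1), round(mass, 2))
```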
[Slide: Learn → Extract → Reuse → Learn]
Transfer learning:
• An ML approach
• Uses the knowledge learned on the source
• To learn a cheaper model for the target
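The bullets above can be sketched numerically. This is not the paper's actual method; the sketch assumes a simple linear relation f_t(x) ≈ a·f_s(x) + b as the transferable knowledge, so that only a handful of expensive target measurements are needed to calibrate the cheap source model. All responses are synthetic.

```python
# Reuse a source-environment model to learn a cheap target model.
import numpy as np

rng = np.random.default_rng(2)
d = 5
w = rng.normal(size=d)                    # hidden source response weights

def f_source(X):                          # cheap, well-sampled source model
    return X @ w

def f_target(X):                          # expensive target environment (noisy)
    return 2.0 * f_source(X) + 3.0 + rng.normal(0.0, 0.05, len(X))

# Fit the correction (a, b) from just 8 target measurements.
X_small = rng.integers(0, 2, size=(8, d)).astype(float)
ys, yt = f_source(X_small), f_target(X_small)
A = np.vstack([ys, np.ones_like(ys)]).T
(a, b), *_ = np.linalg.lstsq(A, yt, rcond=None)

# The transferred model a*f_s + b now predicts the target well on new configs.
X_new = rng.integers(0, 2, size=(100, d)).astype(float)
err = np.abs((a * f_source(X_new) + b) - (2.0 * f_source(X_new) + 3.0)).max()
print(round(a, 1), round(b, 1), err < 0.5)
```

The design choice here is the usual transfer-learning trade: the source model is cheap to build (e.g., lab hardware), and only the low-dimensional correction must be learned from expensive target measurements.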
Hypotheses were categorized into 4 research questions
[Figure 5: six panels plotting log P(θ, X_obs) and P(θ|X_obs) against Θ]
Figure 5: The first column shows the log joint probability and the corresponding posterior. In the second column we have estimates of the log joint and the posterior for uniformly spaced points. In the third column we have the same except that more points were chosen in high-likelihood regions.
AGPR will query point (x). However, given sufficient smoothness, we know that the joint probability will be very low there after exponentiation, due to points (3) and (4). Therefore, the BAPE active learner will not be as interested in (x) as AGPR. Observe that the uncertainty at (x) is large in the log joint probability space in comparison to the uncertainty elsewhere; however, in the probability space it is smaller than the uncertainty at the high-probability regions. As Figure 5 indicates, while we model the log joint probability as a GP, we are more interested in the uncertainty in the probability space.
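The point can be made numerically: the same predictive standard deviation in log-probability space shrinks dramatically after exponentiation at low-likelihood points. The particular values below are illustrative, not taken from Figure 5.

```python
# Equal uncertainty in log space maps to very unequal uncertainty after exp().
import math

sigma = 1.0                             # same GP predictive std at both points
log_p_low, log_p_high = -10.0, -1.0     # log joint at (x) vs. a likely region

def prob_space_spread(log_mu, s):
    """Width of the +/- one-std interval after exponentiation."""
    return math.exp(log_mu + s) - math.exp(log_mu - s)

low = prob_space_spread(log_p_low, sigma)
high = prob_space_spread(log_p_high, sigma)
# The low-likelihood point carries far less uncertainty in probability space,
# so an active learner targeting the posterior should prefer the other region.
print(low < high)  # True
```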
[Figure: CPU usage [%] heatmaps, panels (a)–(d): prediction without transfer learning vs. prediction with transfer learning]
RQ1: consistent across environments
RQ2: influence of configuration options
RQ3: option interactions, e.g. f_s = … − 7·o1·o2 + … vs. f_t = … − 3·o1·o2 + …
RQ4: invalid configurations
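The f_s / f_t example above suggests a simple check: fit models with an o1·o2 interaction term in two environments and compare the coefficients. The responses below are synthetic, chosen so the interaction strengths mirror the slide's −7 (source) and −3 (target); the real study measures these from actual systems.

```python
# Compare option-interaction coefficients across two environments.
import numpy as np

rng = np.random.default_rng(3)
X = rng.integers(0, 2, size=(64, 2)).astype(float)     # options o1, o2
inter = (X[:, 0] * X[:, 1])[:, None]                   # interaction o1*o2
design = np.hstack([X, inter, np.ones((len(X), 1))])   # o1, o2, o1*o2, bias

y_source = 5 * X[:, 0] + 2 * X[:, 1] - 7 * inter[:, 0] + rng.normal(0, 0.1, 64)
y_target = 4 * X[:, 0] + 1 * X[:, 1] - 3 * inter[:, 0] + rng.normal(0, 0.1, 64)

cs, *_ = np.linalg.lstsq(design, y_source, rcond=None)
ct, *_ = np.linalg.lstsq(design, y_target, rcond=None)
# The same interaction term is influential in both environments, with
# different strength (approximately -7 vs. -3): transferable structure.
print(round(cs[2], 1), round(ct[2], 1))
```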
Building performance models is expensive:
25 options × 10 values each = 10^25 configurations to measure
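The arithmetic behind the slide, with an illustrative (assumed) per-measurement cost, shows why exhaustive measurement is hopeless:

```python
# 25 options with 10 possible values each yield a space of size 10**25.
options, values = 25, 10
space = values ** options

seconds_per_measurement = 60                    # assumed cost per benchmark run
years = space * seconds_per_measurement / (3600 * 24 * 365)

print(space == 10 ** 25, years > 1e15)  # True True: far beyond any budget
```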