Transfer Learning for
Software Performance Analysis
An Exploratory Analysis
Pooyan Jamshidi, Norbert Siegmund, Miguel Velez, Christian Kaestner, Akshay Patel, Yuvraj Agarwal
Many	systems	are	now	configurable
Empirical observations confirm that systems are becoming increasingly configurable.

Modern systems are:
• Increasingly configurable with software evolution
• Deployed in dynamic and uncertain environments
Understanding and Dealing with Over-Designed Configuration in System Software
Tianyin Xu*, Long Jin*, Xuepeng Fan*‡, Yuanyuan Zhou*,
Shankar Pasupathy† and Rukma Talwadker†
*University of California San Diego, ‡Huazhong Univ. of Science & Technology, †NetApp, Inc
{tixu, longjin, xuf001, yyzhou}@cs.ucsd.edu
{Shankar.Pasupathy, Rukma.Talwadker}@netapp.com
ABSTRACT
Configuration problems are not only prevalent, but also severely impair the reliability of today's system software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters ("knobs"). With hundreds of knobs, configuring system software to ensure high reliability and performance becomes a daunting, error-prone task.
This paper makes a first step in understanding a fundamental question of configuration design: "do users really need so many knobs?" To provide a quantitative answer, we study the configuration settings of real-world users, including thousands of customers of a commercial storage system (Storage-A), and hundreds of users of two widely-used open-source system software projects. Our study reveals a series of interesting findings to motivate software architects and developers to be more cautious and disciplined in configuration design. Motivated by these findings, we provide a few concrete, practical guidelines which can significantly reduce the configuration space. Take Storage-A as an example: the guidelines can remove 51.9% of its parameters and simplify 19.7% of the remaining ones with little impact on existing users. Also, we study the existing configuration navigation methods in the context of "too many knobs" to understand their effectiveness in dealing with the over-designed configuration, and to provide practices for building navigation support in system software.
[Figure 1 from the paper: the number of configuration parameters grows with software evolution, plotted over release time for Storage-A, MySQL, Apache, and Hadoop (MapReduce, HDFS). Storage-A is a commercial storage system from a major storage company in the U.S.]
[Tianyin Xu, et al., "Too Many Knobs…", FSE'15]

Configurations determine the performance behavior of configurable software systems:
• Configuration options enable different code paths depending on environmental conditions.
• Performance = non-functional property (e.g., response time, energy).
/* Example from the Parrot VM: a configuration option selects one of two code paths. */
void Parrot_setenv(. . . name, . . . value) {
#ifdef PARROT_HAS_SETENV
    /* Path 1: the platform provides setenv(). */
    my_setenv(name, value, 1);
#else
    /* Path 2: emulate setenv() by writing "name=value" into a global buffer. */
    int name_len = strlen(name);
    int val_len  = strlen(value);
    char *envs = glob_env;
    if (envs == NULL) {
        return;
    }
    strcpy(envs, name);
    strcpy(envs + name_len, "=");
    strcpy(envs + name_len + 1, value);
    putenv(envs);
#endif
}

#ifdef LINUX
extern int Parrot_signbit(double x) {
    union {
        double d;
        /* ... (snippet truncated on the slide) */
The influence of options is typically significant.
[3-D plot: latency (ms) as a function of the number of counters and the number of splitters, cubic interpolation over a finer grid.]
Only by tweaking 2 options out of 200 in Apache Storm, we observed a ~100% change in latency.
Developers, users, and operators need to understand the influence of configurations on:
• Execution time
• Energy consumption
• Safety
• …
Understanding Performance
Behavior of Software Matters
What do we mean by "performance model"?

$f(o_1, o_2) = 5 + 3o_1 + 15o_2 - 7\,o_1 \times o_2$

$c = \langle o_1, o_2 \rangle$
$c = \langle o_1, o_2, \dots, o_{10} \rangle$
$c = \langle o_1, o_2, \dots, o_{100} \rangle$
$\vdots$

$f : \mathbb{C} \to \mathbb{R}$

[Norbert Siegmund, et al., "Performance-influence models for highly configurable systems", FSE'15]
Measure, then learn:
$f(o_1, o_2) = 5 + 3o_1 + 15o_2 - 7\,o_1 \times o_2$
(example system: TurtleBot)

Learning predictive performance models via sensitivity analysis supports optimization, reasoning, and debugging.
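As a concrete illustration, here is a minimal sketch (hypothetical measurements, NumPy only) of how such a performance-influence model can be recovered by least squares over the main effects and the pairwise interaction term:

```python
import numpy as np

# Hypothetical measurements: each row is a configuration <o1, o2>,
# y is the measured performance (e.g., response time in ms).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([5.0, 20.0, 8.0, 16.0])  # consistent with f = 5 + 3*o1 + 15*o2 - 7*o1*o2

# Design matrix: intercept, main effects, and the o1*o2 interaction.
D = np.column_stack([np.ones(len(X)), X[:, 0], X[:, 1], X[:, 0] * X[:, 1]])

# Least squares recovers the coefficients of the performance-influence model.
coef, *_ = np.linalg.lstsq(D, y, rcond=None)
print(coef)  # approx. [5, 3, 15, -7]
```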
Building performance models is expensive:
25 options × 2 values = 2^25 configurations.
At 1 minute per measurement, it would take over 60 years to finish the measurements!
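A quick back-of-the-envelope check of that claim, in plain Python arithmetic:

```python
# Cost of exhaustively measuring 25 binary options at 1 minute per measurement.
configs = 2 ** 25                  # 33,554,432 configurations
minutes_per_year = 60 * 24 * 365   # 525,600 minutes in a year
print(configs / minutes_per_year)  # ~63.8 years of wall-clock measurement time
```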
Moreover, each set of measurements is tied to a:
• Specific hardware
• Specific workload
• Specific version
• …
[3-D plots: latency (µs) over concurrent_reads and concurrent_writes for (a) cass-20 v1 and (b) cass-20 v2.]
Performance models are built assuming fixed environments.
Environment change → new performance model.
Transfer Learning
Here is where transfer learning enters the scene:

Source (Given) → Transferable Knowledge → Target (Learn)
Excerpt from the paper:

II. INTUITION

Understanding the performance behavior of configurable software systems can enable (i) performance debugging, (ii) performance tuning, (iii) design-time evolution, or (iv) runtime adaptation [11]. We lack empirical understanding of how the performance behavior of a system will vary when the environment of the system changes. Such empirical understanding will provide important insights to develop faster and more accurate learning techniques that allow us to make predictions and optimizations of performance for highly configurable systems in changing environments [10]. For instance, we can learn performance behavior of a system on cheap hardware in a controlled lab environment and use that to understand the performance behavior of the system on a production server before shipping to the end user. More specifically, we would like to know what the relationship is between the performance of a system in a specific environment (characterized by software configuration, hardware, workload, and system version) and the one where we vary its environmental conditions.

In this research, we aim for an empirical understanding of performance behavior to improve learning via an informed sampling process. In other words, we aim at learning a performance model in a changed environment based on a well-suited sampling set that has been determined by the knowledge we gained in other environments. Therefore, the main research question is whether there exists common information (transferable/reusable knowledge) that applies to both source and target environments of systems and therefore can be carried over from either environment to the other. This transferable knowledge is a case for transfer learning [10].

Let us first introduce the different changes that we consider in this work: (i) Configuration: a configuration is a set of decisions over configuration options. This is the primary variable of the system that we consider to understand performance behavior. More specifically, we would like to understand how the performance of the system under study will be influenced as a result of configuration changes. This kind of change was the primary focus of previous work in this area; however, they assumed a predetermined environment (i.e., a specific workload, hardware, and software version). (ii) Workload: the workload describes the input of the system on which it operates. The performance behavior of the system can vary under different workload conditions. …

A. Preliminary concepts

1) Configuration and environment space: Let $F_i$ indicate the $i$-th feature of a configurable system $A$, which is either enabled or disabled, and one of the two holds by default. The configuration space is mathematically a Cartesian product of all the features, $C = Dom(F_1) \times \dots \times Dom(F_d)$, where $Dom(F_i) = \{0, 1\}$. A configuration of a system is then a member of the configuration space (feature space) in which all the parameters are assigned a specific value in their range (i.e., complete instantiations of the system's parameters). We also describe an environment instance by three variables $e = [w, h, v]$ drawn from a given environment space $E = W \times H \times V$, whose components respectively represent sets of possible values for workload, hardware, and system version.

2) Performance model: Given a software system $A$ with configuration space $F$ and environmental instances $E$, a performance model is a black-box function $f : F \times E \to \mathbb{R}$ learned from some observations of the system's performance for each combination of the system's features $x \in F$ in an environment $e \in E$. To construct a performance model for a system $A$ with configuration space $F$, we run $A$ in environment instance $e \in E$ on various combinations of configurations $x_i \in F$ and record the resulting performance values $y_i = f(x_i) + \epsilon_i$, $x_i \in F$, where $\epsilon_i \sim \mathcal{N}(0, \sigma_i)$. The training data for our regression models is then simply $D_{tr} = \{(x_i, y_i)\}_{i=1}^{n}$. In other words, a response function is simply a mapping from the input space to a measurable performance metric that produces interval-scaled data (here we assume it produces real numbers).

3) Performance distribution: For the performance model, we measured and associated a performance response with each configuration; now we introduce another concept in which we vary the environment and measure the performance. An empirical performance distribution is a stochastic process, $pd : E \to \Delta(\mathbb{R})$, that defines a probability distribution over performance measures for each environmental condition. To construct a performance distribution for a system $A$ with configuration space $F$, similarly to the process of deriving the performance models, we run $A$ on various combinations of configurations $x_i \in F$ for a specific environment instance $e \in E$ and record the resulting performance values $y_i$. We then fit a probability distribution to the set of measured performance values $D_e = \{y_i\}$ using kernel density estimation [2].
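Following the definition above, a minimal sketch of constructing an empirical performance distribution might look like this (hypothetical runtimes; SciPy's kernel density estimator):

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical runtimes measured over many configurations in one environment e.
runtimes = np.random.lognormal(mean=3.0, sigma=0.5, size=1000)

# Fit the empirical performance distribution pd(e) via kernel density estimation.
pd_e = gaussian_kde(runtimes)

# Evaluate the estimated density on a grid of performance values.
grid = np.linspace(runtimes.min(), runtimes.max(), 200)
density = pd_e(grid)
```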
Extract → Reuse; Learn → Learn.

Transfer learning is an ML approach that:
• uses the knowledge learned on the source
• to learn a cheaper model for the target.
A simple transfer learning via model shift
(example environment change: machines twice as fast)
[Plot: a transfer function maps the source response curve to the target response curve. Throughput (higher is better).]

[Pavel Valov, et al., "Transferring performance prediction models…", ICPE'17]
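A minimal sketch of this model-shift idea, assuming a handful of hypothetical paired source/target measurements (an illustration in the spirit of Valov et al., not their exact procedure):

```python
import numpy as np

# Hypothetical paired observations: the same configurations measured in the
# source environment (f_s) and in the target environment (f_t).
f_s = np.array([12.0, 18.5, 25.1, 31.0, 44.2])
f_t = np.array([6.3, 9.4, 12.8, 15.7, 22.4])

# Fit the linear transfer function f_t ~ alpha * f_s + beta from a few samples.
alpha, beta = np.polyfit(f_s, f_t, deg=1)

def predict_target(source_prediction):
    """Shift any source-model prediction into the target environment."""
    return alpha * source_prediction + beta
```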
However, when the environment change is not homogeneous, things can go wrong.
[Plot: for a non-homogeneous change, the source and target response curves differ in shape, so a simple shift no longer fits. Throughput (higher is better).]
Other forms of transfer learning exist. For example, we can measure a configurable robot (TurtleBot) in a simulator (Gazebo), reuse that data, and learn the model for the real robot:

Measure (simulator) → Data → Reuse → Learn (target model)

$f(o_1, o_2) = 5 + 3o_1 + 15o_2 - 7\,o_1 \times o_2$

[P. Jamshidi, et al., "Transfer learning for improving model predictions…", SEAMS'17]
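One simple way to sketch this data-reuse idea is to pool the cheap source (simulator) samples with a few target samples and fit a single Gaussian process. Note that this pooling is a simplification of the SEAMS'17 approach (which uses a multi-task GP), and all data below is hypothetical:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

# Hypothetical data: many cheap simulator (source) samples, few real (target) samples.
X_sim, y_sim = np.random.rand(100, 3), np.random.rand(100)
X_real, y_real = np.random.rand(5, 3), np.random.rand(5)

# Reuse the source data directly: train one model on the combined observations,
# so the few target samples correct the simulator where they disagree.
X = np.vstack([X_sim, X_real])
y = np.concatenate([y_sim, y_real])
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)

# Predict performance of an unseen configuration, with uncertainty.
mean, std = gp.predict(np.random.rand(1, 3), return_std=True)
```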
Even learning from a source with a small correlation is better than no transfer.

Fig. 6 (SEAMS'17): prediction accuracy (absolute percentage error, %) of the model learned with samples from different sources of different relatedness to the target. GP is the model without transfer learning.

Source        s      s1     s2     s3     s4     s5     s6
noise level   0      5      10     15     20     25     30
corr. coeff.  0.98   0.95   0.89   0.75   0.54   0.34   0.19
µ(pe)         15.34  14.14  17.09  18.71  33.06  40.93  46.75
Models become more accurate when the source is more related to the target.

[P. Jamshidi, et al., "Transfer learning for improving model predictions…", SEAMS'17]
We need to know when and why transfer learning works:
• When does simple transfer work, and when does it not?
• How are source and target "related"?
• What knowledge can we transfer across environments?
Source (Given) → Transferable Knowledge → Target (Learn)
Extract → Reuse; Learn → Learn. The open questions: why, and when?
Theoretical Principles of
Transfer Learning
Establishing theoretical principles of transfer learning for performance analysis
Source (Given) → Transferable Knowledge → Target (Learn)
Extract → Reuse; Learn → Learn: the theoretical principles of transfer learning.
We conducted an exploratory/empirical study:
• 10 hypotheses (assumptions about "relatedness")
• 4 real-world configurable systems
• 36 comparisons of environmental changes (hardware, workloads, versions)
• Statistical analyses to verify the hypotheses
Our research questions are about similarities across environments:
• RQ1: Does the performance behavior stay consistent?
• RQ2: Is the influence of options on performance consistent?
• RQ3: Are the interactions among options preserved?
• RQ4: Are the invalid configurations similar across environments?

Similarity across environments matters!
Subject systems we investigated:

SPEAR (SAT solver): analysis time; 14 options, 16,384 configurations; workloads: SAT problems; 3 hardware platforms; 2 versions.
x264 (video encoder): encoding time; 16 options, 4,000 configurations; workloads: video quality/size; 2 hardware platforms; 3 versions.
SQLite (DB engine): query time; 14 options, 1,000 configurations; workloads: DB queries; 2 hardware platforms; 2 versions.
SaC (compiler): execution time; 50 options, 71,267 configurations; workloads: 10 demo programs.
TABLE II: Results indicate that there exist several forms of knowledge that can be transferred across environments and can be used in transfer learning.
RQ1 RQ2 RQ3 RQ4
H1.1 H1.2 H1.3 H1.4 H2.1 H2.2 H3.1 H3.2 H4.1 H4.2
Environment ES M1 M2 M3 M4 M5 M6 M7 M8 M9 M10 M11 M12 M13 M14 M15 M16 M17 M18
SPEAR— Workload (#variables/#clauses): w1 : 774/5934, w2 : 1008/7728, w3 : 1554/11914, w4 : 978/7498; Version: v1 : 1.2, v2 : 2.7
ec1 : [h2 ! h1, w1, v2] S 1.00 0.22 0.97 0.92 0.92 9 7 7 0 1 25 25 25 1.00 0.47 0.45 1 1.00
ec2 : [h4 ! h1, w1, v2] L 0.59 24.88 0.91 0.76 0.86 12 7 4 2 0.51 41 27 21 0.98 0.48 0.45 1 0.98
ec3 : [h1, w1 ! w2, v2] L 0.96 1.97 0.17 0.44 0.32 9 7 4 3 1 23 23 22 0.99 0.45 0.45 1 1.00
ec4 : [h1, w1 ! w3, v2] M 0.90 3.36 -0.08 0.30 0.11 7 7 4 3 0.99 22 23 22 0.99 0.45 0.49 1 0.94
ec5 : [h1, w1, v2 ! v1] S 0.23 0.30 0.35 0.28 0.32 6 5 3 1 0.32 21 7 7 0.33 0.45 0.50 1 0.96
ec6 : [h1, w1 ! w2, v1 ! v2] L -0.10 0.72 -0.05 0.35 0.04 5 6 1 3 0.68 7 21 7 0.31 0.50 0.45 1 0.96
ec7 : [h1 ! h2, w1 ! w4, v2 ! v1] VL -0.10 6.95 0.14 0.41 0.15 6 4 2 2 0.88 21 7 7 -0.44 0.47 0.50 1 0.97
x264— Workload (#pictures/size): w1 : 8/2, w2 : 32/11, w3 : 128/44; Version: v1 : r2389, v2 : r2744, v3 : r2744
ec1 : [h2 ! h1, w3, v3] SM 0.97 1.00 0.99 0.97 0.92 9 10 8 0 0.86 21 33 18 1.00 0.49 0.49 1 1
ec2 : [h2 ! h1, w1, v3] S 0.96 0.02 0.96 0.76 0.79 9 9 8 0 0.94 36 27 24 1.00 0.49 0.49 1 1
ec3 : [h1, w1 ! w2, v3] M 0.65 0.06 0.63 0.53 0.58 9 11 8 1 0.89 27 33 22 0.96 0.49 0.49 1 1
ec4 : [h1, w1 ! w3, v3] M 0.67 0.06 0.64 0.53 0.56 9 10 7 1 0.88 27 33 20 0.96 0.49 0.49 1 1
ec5 : [h1, w3, v2 ! v3] L 0.05 1.64 0.44 0.43 0.42 12 10 10 0 0.83 47 33 29 1.00 0.49 0.49 1 1
ec6 : [h1, w3, v1 ! v3] L 0.06 1.54 0.43 0.43 0.37 11 10 9 0 0.80 46 33 27 0.99 0.49 0.49 1 1
ec7 : [h1, w1 ! w3, v2 ! v3] L 0.08 1.03 0.26 0.25 0.22 8 10 5 1 0.78 33 33 20 0.94 0.49 0.49 1 1
ec8 : [h2 ! h1, w1 ! w3, v2 ! v3] VL 0.09 14.51 0.26 0.23 0.25 8 9 5 2 0.58 33 21 18 0.94 0.49 0.49 1 1
SQLite— Workload: w1 : write seq, w2 : write batch, w3 : read rand, w4 : read seq; Version: v1 : 3.7.6.3, v2 : 3.19.0
ec1 : [h3 ! h2, w1, v1] S 0.99 0.37 0.82 0.35 0.31 5 2 2 0 1 13 9 8 1.00 N/A N/A N/A N/A
ec2 : [h3 ! h2, w2, v1] M 0.97 1.08 0.88 0.40 0.49 5 5 4 0 1 10 11 9 1.00 N/A N/A N/A N/A
ec3 : [h2, w1 ! w2, v1] S 0.96 1.27 0.83 0.40 0.35 2 3 1 0 1 9 9 7 0.99 N/A N/A N/A N/A
ec4 : [h2, w3 ! w4, v1] M 0.50 1.24 0.43 0.17 0.43 1 1 0 0 1 4 2 2 1.00 N/A N/A N/A N/A
ec5 : [h1, w1, v1 ! v2] M 0.95 1.00 0.79 0.24 0.29 2 4 1 0 1 12 11 7 0.99 N/A N/A N/A N/A
ec6 : [h1, w2 ! w1, v1 ! v2] L 0.51 2.80 0.44 0.25 0.30 3 4 1 1 0.31 7 11 6 0.96 N/A N/A N/A N/A
ec7 : [h2 ! h1, w2 ! w1, v1 ! v2] VL 0.53 4.91 0.53 0.42 0.47 3 5 2 1 0.31 7 13 6 0.97 N/A N/A N/A N/A
SaC— Workload: w1 : srad, w2 : pfilter, w3 : kmeans, w4 : hotspot, w5 : nw, w6 : nbody100, w7 : nbody150, w8 : nbody750, w9 : gc, w10 : cg
ec1 : [h1, w1 ! w2, v1] L 0.66 25.02 0.65 0.10 0.79 13 14 8 0 0.88 82 73 52 0.27 0.18 0.17 0.88 0.73
ec2 : [h1, w1 ! w3, v1] L 0.44 15.77 0.42 0.10 0.65 13 10 8 0 0.91 82 63 50 0.56 0.18 0.12 0.90 0.84
ec3 : [h1, w1 ! w4, v1] S 0.93 7.88 0.93 0.36 0.90 12 10 9 0 0.96 37 64 34 0.94 0.16 0.15 0.26 0.88
ec4 : [h1, w1 ! w5, v1] L 0.96 2.82 0.78 0.06 0.81 16 12 10 0 0.94 34 58 25 0.04 0.15 0.22 0.19 -0.29
ec5 : [h1, w2 ! w3, v1] M 0.76 1.82 0.84 0.67 0.86 17 11 9 1 0.95 79 61 47 0.55 0.27 0.13 0.83 0.88
ec6 : [h1, w2 ! w4, v1] S 0.91 5.54 0.80 0.00 0.91 14 11 8 0 0.85 64 65 31 -0.40 0.13 0.15 0.12 0.64
ec7 : [h1, w2 ! w5, v1] L 0.68 25.31 0.57 0.11 0.71 14 14 8 0 0.88 67 59 29 0.05 0.21 0.22 0.09 -0.13
ec8 : [h1, w3 ! w4, v1] L 0.68 1.70 0.56 0.00 0.91 14 13 9 1 0.88 57 67 36 0.34 0.11 0.14 0.05 0.67
ec9 : [h1, w3 ! w5, v1] VL 0.06 3.68 0.20 0.00 0.64 16 10 9 0 0.90 51 58 35 -0.52 0.11 0.21 0.06 -0.41
ec10 : [h1, w4 ! w5, v1] L 0.70 4.85 0.76 0.00 0.75 12 12 11 0 0.95 58 57 43 0.29 0.14 0.20 0.64 -0.14
ec11 : [h1, w6 ! w7, v1] S 0.82 5.79 0.77 0.25 0.88 36 30 28 2 0.89 109 164 102 0.96 N/A N/A N/A N/A
ec12 : [h1, w6 ! w8, v1] S 1.00 0.52 0.92 0.80 0.97 38 30 22 6 0.94 51 53 43 0.99 N/A N/A N/A N/A
ec13 : [h1, w8 ! w7, v1] S 1.00 0.32 0.92 0.53 0.99 30 33 26 1 0.98 53 89 51 1.00 N/A N/A N/A N/A
ec14 : [h1, w9 ! w10, v1] L 0.24 4.85 0.56 0.44 0.77 22 21 18 3 0.69 237 226 94 0.86 N/A N/A N/A N/A
ES: Expected severity of change (Sec. III-B): S: small change; SM: small medium change; M: medium change; L: large change; VL: very large change.
SaC workload descriptions: srad: random matrix generator; pfilter: particle filtering; hotspot: heat transfer differential equations; k-means: clustering; nw: optimal matching;
nbody: simulation of dynamic systems; cg: conjugate gradient; gc: garbage collector. Hardware descriptions (ID: Type/CPUs/Clock (GHz)/RAM (GiB)/Disk):
h1: NUC/4/1.30/15/SSD; h2: NUC/2/2.13/7/SCSI; h3:Station/2/2.8/3/SCSI; h4: Amazon/1/2.4/1/SSD; h5: Amazon/1/2.4/0.5/SSD; h6: Azure/1/2.4/3/SCSI
Metrics: M1: Pearson correlation; M2: Kullback-Leibler (KL) divergence; M3: Spearman correlation; M4/M5: Perc. of top/bottom conf.; M6/M7: Number of influential options;
M8/M9: Number of options agree/disagree; M10: Correlation btw importance of options; M11/M12: Number of interactions; M13: Number of interactions agree on effects;
M14: Correlation btw the coeffs; M15/M16: Perc. of invalid conf. in source/target; M17: Perc. of invalid conf. common btw environments; M18: Correlation btw coeffs
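As an illustration of how metrics such as M1 (Pearson correlation), M3 (Spearman correlation), and M2 (KL divergence) can be computed from paired measurements, here is a hedged sketch with synthetic data (the study's exact estimation details may differ):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr, gaussian_kde, entropy

# Hypothetical performance measurements of the same configurations in two environments.
y_source = np.random.lognormal(3.0, 0.4, size=500)
y_target = 0.5 * y_source + np.random.normal(0, 1, size=500)

r, _ = pearsonr(y_source, y_target)     # M1: linear correlation
rho, _ = spearmanr(y_source, y_target)  # M3: rank correlation

# M2: KL divergence between the two empirical performance distributions,
# estimated on a common grid via kernel density estimation.
grid = np.linspace(min(y_source.min(), y_target.min()),
                   max(y_source.max(), y_target.max()), 512)
p, q = gaussian_kde(y_source)(grid), gaussian_kde(y_target)(grid)
kl = entropy(p, q)  # scipy normalizes p and q to sum to 1
```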
RQ1: Does the performance behavior stay consistent across environments?

Environmental change                    Severity  Lin. corr.
SPEAR:  NUC/2 → NUC/4                   S          1.00
SPEAR:  Amazon_nano → NUC               L          0.59
SPEAR:  hardware/workload/version       VL        -0.10
x264:   version                         L          0.06
x264:   workload                        M          0.65
SQLite: write-seq → write-batch         S          0.96
SQLite: read-rand → read-seq            M          0.50
Throughput: for related environments, the target response is a linear shift of the source response, $f_t = \alpha \times f_s + \beta$.

Insight: We observed a linear shift only for non-severe hardware changes.
We observed similar performance distributions across environments.
Environmental change                    Severity  Lin. corr.  KL divergence
x264:   version                         L         0.05        1.64
x264:   workload/version                L         0.08        1.03
SQLite: write-seq → write-batch         VL        0.51        2.80
SQLite: read-rand → read-seq            M         0.50        1.24
SaC:    workload                        VL        0.06        3.68
[Histograms (a)-(d): frequency of configurations vs. runtime (s) in four environments; the distribution shapes are visibly similar across environments.]
Insight: For	severe	changes,	the	performance	distributions	are	similar,	showing	
the	potential	for	learning	a	non-linear	transfer	function.
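A hedged sketch of what such a non-linear transfer function could look like, regressing target responses on source responses with a Gaussian process (synthetic data; not the paper's implementation):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical paired measurements: the same configurations, source vs. target.
f_s = np.linspace(1, 10, 20).reshape(-1, 1)
f_t = np.log1p(f_s).ravel() * 5 + np.random.normal(0, 0.1, 20)

# When a linear shift is insufficient, learn a non-linear transfer function g
# with f_t ~ g(f_s) via GP regression.
g = GaussianProcessRegressor(kernel=RBF() + WhiteKernel()).fit(f_s, f_t)
pred, std = g.predict(np.array([[4.2]]), return_std=True)
```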
RQ2: Is the influence of configuration options on performance consistent across environments?
Environmental change                    Severity  Dim  Influential options (paired t-test), S / T
x264:   version                         L         16   12 / 10
x264:   hardware/workload/version       VL        16    8 / 9
SQLite: write-seq → write-batch         VL        14    3 / 4
SQLite: read-rand → read-seq            M         14    1 / 1
SaC:    workload                        VL        50   16 / 10
Insight: Only a subset of options is influential, and this subset is largely preserved across all environment changes.

$C = \langle o_1, o_2, o_3, o_4, o_5, o_6, o_7 \rangle$
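To make the RQ2 analysis concrete, here is a minimal sketch (hypothetical data and threshold; the paper's exact procedure may differ) of flagging an influential option with a paired t-test:

```python
import numpy as np
from scipy.stats import ttest_rel

def option_is_influential(perf_off, perf_on, alpha=0.05):
    """Paired t-test over configurations that differ only in one option:
    perf_off[i] and perf_on[i] measure the same partial configuration
    with the option disabled vs. enabled."""
    _, p_value = ttest_rel(perf_off, perf_on)
    return p_value < alpha

# Hypothetical paired measurements for one option in one environment.
perf_off = np.array([10.1, 12.3, 9.8, 11.5, 10.9])
perf_on = np.array([14.2, 16.8, 13.9, 15.1, 15.6])
print(option_is_influential(perf_off, perf_on))  # True: the option shifts performance
```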
RQ3: Are the interactions among configuration options preserved across environments?
Environmental change            Severity  Dim  Interactions, S / T  Corr.
SPEAR:  Amazon_nano → NUC       L         14   41 / 27              0.98
x264:   version                 L         16   47 / 33              1.00
SQLite: workload                VL        50   109 / 164            0.96
$f_s = \dots - 7\,o_1 o_2 + 2\,o_1 o_3 - 0.2\,o_2 o_3$
$f_t = \dots - 6\,o_1 o_2 + 2\,o_1 o_3 - 0.1\,o_2 o_3$

Insight: Only a subset of option interactions is influential, and it is largely preserved across all environment changes.
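A sketch of how interaction coefficients can be extracted for such a comparison, using an ordinary least-squares model with all pairwise interaction terms (`interaction_coefficients` is a hypothetical helper, illustrative only):

```python
import numpy as np

def interaction_coefficients(X, y):
    """Fit a linear model with all pairwise interaction terms and return
    the interaction coefficients, indexed by option pair (i, j)."""
    n, d = X.shape
    pairs = [(i, j) for i in range(d) for j in range(i + 1, d)]
    D = np.column_stack([np.ones(n), X] + [X[:, i] * X[:, j] for i, j in pairs])
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    return dict(zip(pairs, coef[1 + d:]))

# Correlating the interaction coefficients fitted in the source and the target
# (cf. metric M14) indicates whether interactions survive the environment change.
```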
RQ4: Are the invalid configurations in the source also invalid in the target environment?
Environmental change                    Severity  Invalid, S / T  Corr.
SPEAR: workload/version                 L         50% / 45%       0.96
SPEAR: hardware/workload/version        VL        47% / 50%       0.97
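A minimal sketch of measuring this similarity as set overlap between the invalid configurations of the two environments (hypothetical configurations):

```python
# Hypothetical sets of configurations that failed to run (e.g., crashed or
# timed out) in the source and target environments.
invalid_source = {frozenset({"opt1", "opt3"}), frozenset({"opt2"})}
invalid_target = {frozenset({"opt1", "opt3"}), frozenset({"opt4"})}

# Fraction of invalid configurations common to both environments (cf. M17).
common = len(invalid_source & invalid_target)
overlap = common / len(invalid_source | invalid_target)
```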
[Heat maps of predicted CPU usage (%): (a) prediction without transfer learning, (b) prediction with transfer learning.]
Insight: A	moderate	percentage	of	configurations	are	invalid	across	environments.
Data is publicly available:
https://github.com/pooyanjamshidi/ase17
Preprint: https://arxiv.org/abs/1709.02280
Findings and implications:
• Simple change: strong correlations → simple transfer learning works.
• Severe change: similar performance distributions, similar influential options and interactions, and overlapping invalid configurations → establish a non-linear relation and focus sampling on interesting regions.

Implications: this study opens up several future research opportunities.
Opportunities: sampling, learning, performance testing, and performance tuning.

[Repeated 3-D plot: latency (ms) over the number of counters and the number of splitters in Apache Storm.]
Similarity across environments matters!

Bello… Hire me!
Many systems are now configurable.

Here is where transfer learning enters the scene:
Source (Given) → Transferable Knowledge → Target (Learn)
Extract → Reuse; Learn → Learn.

Transfer learning is an ML approach that:
• uses the knowledge learned on the source
• to learn a cheaper model for the target.
Hypotheses were categorized into 4 research questions.
[Screenshot from an active-learning paper, used as an illustration. Figure 5: The first column shows the log joint probability log P(θ, X_obs) and the corresponding posterior P(θ|X_obs). In the second column we have estimates of the log joint and the posterior for uniformly spaced points. In the third column we have the same except that more points were chosen in high likelihood regions. Accompanying text: AGPR will query point (x); however, given sufficient smoothness, we know that the joint probability will be very low there after exponentiation due to points (3) and (4), so the BAPE active learner will not be as interested in (x) as AGPR. Observe that the uncertainty at (x) is large in the log joint probability space in comparison to the uncertainty elsewhere; however, in the probability space this is smaller than the uncertainty at the high probability regions. As Figure 5 indicates, the log joint probability is modeled as a GP, while the uncertainty of interest lies in the probability space.]
[Heatmaps (a)-(d) over CPU usage [%] × CPU usage [%]: predictions without transfer learning vs. predictions with transfer learning.]
RQ1: consistent across environments
RQ2: influence of configuration options
fs = … − 7o₁o₂ + …
ft = … − 3o₁o₂ + …
RQ3: option interactions
RQ4: invalid configurations
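For instance, the consistency behind RQ1 can be probed by correlating the performance of the same configurations measured in a source and a target environment; a small sketch with synthetic numbers standing in for real benchmark data:

import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(3)
n = 50
perf_source = rng.uniform(10, 40, n)                     # responses of n configurations, source env
perf_target = 0.5 * perf_source + rng.normal(0, 2.0, n)  # same configurations, changed environment

r, _ = pearsonr(perf_source, perf_target)     # strength of the linear relationship
rho, _ = spearmanr(perf_source, perf_target)  # is the ranking of configurations preserved?
print(f"pearson={r:.2f} spearman={rho:.2f}")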
Building performance models is expensive:
25 options × 10 values = 10²⁵ configurations to measure.
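The arithmetic behind this claim, as a quick sanity check:

options, values = 25, 10
n_configs = values ** options                # 10^25 distinct configurations
seconds = n_configs * 1e-3                   # even at an optimistic 1 ms per measurement
years = seconds / (3600 * 24 * 365)
print(f"{n_configs:.0e} configurations, about {years:.0e} years to measure exhaustively")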

Transfer Learning for Software Performance Analysis: An Exploratory Analysis

  • 6. The influence of options is typically significant. [3-D plot: latency (ms) over number of counters × number of splitters, cubic interpolation over a finer grid.] Only by tweaking 2 options out of 200 in Apache Storm, we observed ~100% change in latency.
  • 9. What do we mean by "performance model"? 𝑓(𝒐₁, 𝒐₂) = 5 + 3𝒐₁ + 15𝒐₂ − 7𝒐₁×𝒐₂, for configurations 𝒄 = <𝒐₁, 𝒐₂>, 𝒄 = <𝒐₁, 𝒐₂, …, 𝒐₁₀>, 𝒄 = <𝒐₁, 𝒐₂, …, 𝒐₁₀₀>, …; in general, 𝑓: ℂ → ℝ. [Norbert Siegmund, et al., "Performance-influence models for highly configurable systems", FSE'15]
  • 10. Measure → Learn: 𝑓(𝒐₁, 𝒐₂) = 5 + 3𝒐₁ + 15𝒐₂ − 7𝒐₁×𝒐₂ (TurtleBot). Optimization + Reasoning + Debugging. Learning predictive performance models via sensitivity analysis: 25 options × 2 values = 2²⁵ configurations.
  • 13. Performance models are built assuming fixed environments: • Specific hardware • Specific workload • Specific version • … [Surface plots: latency (µs) over concurrent_reads × concurrent_writes for (a) cass-20 v1 and (b) cass-20 v2.] Environment change → New performance model.
  • 15. Here is when transfer learning comes to the scene. [Diagram: Source (Given) → Target (Learn); Data, Model, Transferable Knowledge; Extract, Reuse, Learn, Learn. The backdrop reproduces the Section II excerpts given above.] Transfer learning: • An ML approach • Uses the knowledge learned on the source • To learn a cheaper model for the target
  • 16. A simple transfer learning via model shift: machines twice as fast. [Plot: throughput (higher is better) on source and target machines, related by a transfer function from source to target.] [Pavel Valov, et al., "Transferring performance prediction models…", ICPE'17]
  • 17. However, when the environment change is not homogeneous, things can go wrong. [Plot: throughput (higher is better) where a simple model shift no longer relates source and target; the backdrop reproduces the Figure 5 active-learning illustration summarized above.]
  • 19. Even learning from a source with a small correlation is better than no transfer. Fig. 6: Prediction accuracy (absolute percentage error [%]) of the model learned with samples from different sources of different relatedness to the target; GP is the model without transfer learning.
        source        s      s1     s2     s3     s4     s5     s6
        noise-level   0      5      10     15     20     25     30
        corr. coeff.  0.98   0.95   0.89   0.75   0.54   0.34   0.19
        µ(pe)         15.34  14.14  17.09  18.71  33.06  40.93  46.75
    Models become more accurate when the source is more related to the target. [P. Jamshidi, et al., "Transfer learning for improving model predictions …", SEAMS'17]
  • 20. We need to know when and why transfer learning works • When does simple transfer work, and when does it not? • How are source and target "related"? • What knowledge can we transfer across environments? 28

[Slide background: diagram — Source (Given): Data, Model → Extract → Transferable Knowledge → Reuse → Target (Learn): Data, Model — annotated with "when" and "why", over an excerpt of Section II (Intuition) of the paper. The excerpt, reconstructed once here (it recurs as a background on later slides):]

II. INTUITION. Understanding the performance behavior of configurable software systems can enable (i) performance debugging, (ii) performance tuning, (iii) design-time evolution, or (iv) runtime adaptation [11]. We lack empirical understanding of how the performance behavior of a system will vary when the environment of the system changes. Such empirical understanding will provide important insights to develop faster and more accurate learning techniques that allow us to make predictions and optimizations of performance for highly configurable systems in changing environments [10]. For instance, we can learn the performance behavior of a system on cheap hardware in a controlled lab environment and use that to understand the performance behavior of the system on a production server before shipping to the end user. More specifically, we would like to know what the relationship is between the performance of a system in a specific environment (characterized by software configuration, hardware, workload, and system version) and its performance under varied environmental conditions. In this research, we aim for an empirical understanding of performance behavior to improve learning via an informed sampling process. In other words, we aim at learning a performance model in a changed environment based on a well-suited sampling set that has been determined by the knowledge we gained in other environments. Therefore, the main research question is whether there exists common information (transferable/reusable knowledge) that applies to both the source and target environments and can therefore be carried over from one environment to the other. This transferable knowledge is a case for transfer learning [10].

A. Preliminary concepts. 1) Configuration and environment space: Let Fi indicate the i-th feature of a configurable system A, which is either enabled or disabled, with one of the two holding by default. The configuration space is mathematically a Cartesian product of all the features, C = Dom(F1) × · · · × Dom(Fd), where Dom(Fi) = {0, 1}. A configuration of a system is then a member of the configuration space (feature space) where all the parameters are assigned a specific value in their range (i.e., complete instantiations of the system's parameters). We also describe an environment instance by 3 variables e = [w, h, v] drawn from a given environment space E = W × H × V, whose factors respectively represent the sets of possible values for workload, hardware, and system version. 2) Performance model: Given a software system A with configuration space C and environment instances E, a performance model is a black-box function f : C × E → ℝ, given some observations of the system performance for each combination of the system's features x ∈ C in an environment e ∈ E. To construct a performance model for a system A with configuration space C, we run A in environment instance e ∈ E on various combinations of configurations xi ∈ C and record the resulting performance values yi = f(xi) + εi, where εi ∼ N(0, σi). The training data for our regression models is then simply Dtr = {(xi, yi)}, i = 1..n. In other words, a response function is simply a mapping from the input space to a measurable performance metric that produces interval-scaled data (here we assume it produces real numbers). 3) Performance distribution: For the performance model, we measured and associated the performance response with each configuration; now we introduce another concept where we vary the environment and measure the performance. An empirical performance distribution is a stochastic process, pd : E → Δ(ℝ), that defines a probability distribution over performance measures for each environmental condition. To construct a performance distribution for a system A with configuration space C, similarly to the process of deriving the performance models, we run A on various combinations of configurations xi ∈ C for a specific environment instance e ∈ E and record the resulting performance values yi. We then fit a probability distribution to the set of measured performance values De = {yi} using kernel density estimation [2].
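To make the formal setup above concrete, here is a minimal Python sketch of constructing the training set Dtr = {(xi, yi)} for a performance model. It is a hypothetical illustration, not the paper's artifact: the `measure` function stands in for actually running system A in environment e, and the toy response is invented.

```python
import itertools
import random

def measure(config, env):
    """Stand-in for running system A with `config` in environment `env`
    and recording a performance value y = f(x) + eps (hypothetical)."""
    base = sum((i + 1) * bit for i, bit in enumerate(config))  # toy response
    noise = random.gauss(0, 0.1)                               # eps ~ N(0, sigma)
    return base + noise

d = 4                                                # number of binary options F1..Fd
env = ("workload_w1", "hardware_h1", "version_v1")   # e = [w, h, v]

# Configuration space C = Dom(F1) x ... x Dom(Fd), with Dom(Fi) = {0, 1}
configs = list(itertools.product([0, 1], repeat=d))

# Training data D_tr = {(x_i, y_i)}
D_tr = [(x, measure(x, env)) for x in configs]
print(D_tr[:3])
```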
  • 22. Establishing theoretical principles of transfer learning for performance analysis 30

[Slide background: the same Source (Given) → Extract → Transferable Knowledge → Reuse → Target (Learn) diagram and Section II excerpt as on slide 20, labeled "Theoretical Principles of Transfer Learning".]
  • 23. We conducted an exploratory/empirical study • 10 hypotheses (assumptions about "relatedness") • 4 real-world configurable systems • 36 comparisons of environmental changes • Statistical analyses to verify the hypotheses 32 [Figure: Hardware, Workloads, Versions]
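As a taste of the statistical analyses behind these comparisons, the sketch below computes the two correlation metrics that recur in the study (Pearson and Spearman) for the same configurations measured in a source and a target environment. The numbers are invented for illustration; `scipy.stats.pearsonr` and `scipy.stats.spearmanr` are used with their standard signatures.

```python
from scipy import stats

# Performance of the same configurations measured in two environments
# (hypothetical values for illustration).
y_source = [12.1, 30.4, 18.9, 25.2, 40.8, 22.3]
y_target = [24.0, 61.5, 37.7, 50.9, 82.1, 44.6]

pearson, _ = stats.pearsonr(y_source, y_target)    # M1: linear correlation
spearman, _ = stats.spearmanr(y_source, y_target)  # M3: rank correlation
print(f"Pearson r = {pearson:.2f}, Spearman rho = {spearman:.2f}")
```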
  • 24. Our research questions are about similarities across environments • RQ1: Does the performance behavior stay consistent? • RQ2: Is the influence of options on performance consistent? • RQ3: Are the interactions among options preserved? • RQ4: Are the invalid configurations similar across environments? 33
  • 25. Our research questions are about similarities across environments • RQ1: Does the performance behavior stay consistent? • RQ2: Is the influence of options on performance consistent? • RQ3: Are the interactions among options preserved? • RQ4: Are the invalid configurations similar across environments? 34 Similarity across environments matters!
  • 27. 38 TABLE II: Results indicate that there exist several forms of knowledge that can be transferred across environments and can be used in transfer learning.

Environment | ES | RQ1 (H1.1–H1.4): M1 M2 M3 M4 M5 | RQ2 (H2.1–H2.2): M6 M7 M8 M9 M10 | RQ3 (H3.1–H3.2): M11 M12 M13 M14 | RQ4 (H4.1–H4.2): M15 M16 M17 M18

SPEAR — Workload (#variables/#clauses): w1: 774/5934, w2: 1008/7728, w3: 1554/11914, w4: 978/7498; Version: v1: 1.2, v2: 2.7
ec1: [h2→h1, w1, v2] | S | 1.00 0.22 0.97 0.92 0.92 | 9 7 7 0 1 | 25 25 25 1.00 | 0.47 0.45 1 1.00
ec2: [h4→h1, w1, v2] | L | 0.59 24.88 0.91 0.76 0.86 | 12 7 4 2 0.51 | 41 27 21 0.98 | 0.48 0.45 1 0.98
ec3: [h1, w1→w2, v2] | L | 0.96 1.97 0.17 0.44 0.32 | 9 7 4 3 1 | 23 23 22 0.99 | 0.45 0.45 1 1.00
ec4: [h1, w1→w3, v2] | M | 0.90 3.36 -0.08 0.30 0.11 | 7 7 4 3 0.99 | 22 23 22 0.99 | 0.45 0.49 1 0.94
ec5: [h1, w1, v2→v1] | S | 0.23 0.30 0.35 0.28 0.32 | 6 5 3 1 0.32 | 21 7 7 0.33 | 0.45 0.50 1 0.96
ec6: [h1, w1→w2, v1→v2] | L | -0.10 0.72 -0.05 0.35 0.04 | 5 6 1 3 0.68 | 7 21 7 0.31 | 0.50 0.45 1 0.96
ec7: [h1→h2, w1→w4, v2→v1] | VL | -0.10 6.95 0.14 0.41 0.15 | 6 4 2 2 0.88 | 21 7 7 -0.44 | 0.47 0.50 1 0.97

x264 — Workload (#pictures/size): w1: 8/2, w2: 32/11, w3: 128/44; Version: v1: r2389, v2: r2744, v3: r2744
ec1: [h2→h1, w3, v3] | SM | 0.97 1.00 0.99 0.97 0.92 | 9 10 8 0 0.86 | 21 33 18 1.00 | 0.49 0.49 1 1
ec2: [h2→h1, w1, v3] | S | 0.96 0.02 0.96 0.76 0.79 | 9 9 8 0 0.94 | 36 27 24 1.00 | 0.49 0.49 1 1
ec3: [h1, w1→w2, v3] | M | 0.65 0.06 0.63 0.53 0.58 | 9 11 8 1 0.89 | 27 33 22 0.96 | 0.49 0.49 1 1
ec4: [h1, w1→w3, v3] | M | 0.67 0.06 0.64 0.53 0.56 | 9 10 7 1 0.88 | 27 33 20 0.96 | 0.49 0.49 1 1
ec5: [h1, w3, v2→v3] | L | 0.05 1.64 0.44 0.43 0.42 | 12 10 10 0 0.83 | 47 33 29 1.00 | 0.49 0.49 1 1
ec6: [h1, w3, v1→v3] | L | 0.06 1.54 0.43 0.43 0.37 | 11 10 9 0 0.80 | 46 33 27 0.99 | 0.49 0.49 1 1
ec7: [h1, w1→w3, v2→v3] | L | 0.08 1.03 0.26 0.25 0.22 | 8 10 5 1 0.78 | 33 33 20 0.94 | 0.49 0.49 1 1
ec8: [h2→h1, w1→w3, v2→v3] | VL | 0.09 14.51 0.26 0.23 0.25 | 8 9 5 2 0.58 | 33 21 18 0.94 | 0.49 0.49 1 1

SQLite — Workload: w1: write-seq, w2: write-batch, w3: read-rand, w4: read-seq; Version: v1: 3.7.6.3, v2: 3.19.0
ec1: [h3→h2, w1, v1] | S | 0.99 0.37 0.82 0.35 0.31 | 5 2 2 0 1 | 13 9 8 1.00 | N/A N/A N/A N/A
ec2: [h3→h2, w2, v1] | M | 0.97 1.08 0.88 0.40 0.49 | 5 5 4 0 1 | 10 11 9 1.00 | N/A N/A N/A N/A
ec3: [h2, w1→w2, v1] | S | 0.96 1.27 0.83 0.40 0.35 | 2 3 1 0 1 | 9 9 7 0.99 | N/A N/A N/A N/A
ec4: [h2, w3→w4, v1] | M | 0.50 1.24 0.43 0.17 0.43 | 1 1 0 0 1 | 4 2 2 1.00 | N/A N/A N/A N/A
ec5: [h1, w1, v1→v2] | M | 0.95 1.00 0.79 0.24 0.29 | 2 4 1 0 1 | 12 11 7 0.99 | N/A N/A N/A N/A
ec6: [h1, w2→w1, v1→v2] | L | 0.51 2.80 0.44 0.25 0.30 | 3 4 1 1 0.31 | 7 11 6 0.96 | N/A N/A N/A N/A
ec7: [h2→h1, w2→w1, v1→v2] | VL | 0.53 4.91 0.53 0.42 0.47 | 3 5 2 1 0.31 | 7 13 6 0.97 | N/A N/A N/A N/A

SaC — Workload: w1: srad, w2: pfilter, w3: kmeans, w4: hotspot, w5: nw, w6: nbody100, w7: nbody150, w8: nbody750, w9: gc, w10: cg
ec1: [h1, w1→w2, v1] | L | 0.66 25.02 0.65 0.10 0.79 | 13 14 8 0 0.88 | 82 73 52 0.27 | 0.18 0.17 0.88 0.73
ec2: [h1, w1→w3, v1] | L | 0.44 15.77 0.42 0.10 0.65 | 13 10 8 0 0.91 | 82 63 50 0.56 | 0.18 0.12 0.90 0.84
ec3: [h1, w1→w4, v1] | S | 0.93 7.88 0.93 0.36 0.90 | 12 10 9 0 0.96 | 37 64 34 0.94 | 0.16 0.15 0.26 0.88
ec4: [h1, w1→w5, v1] | L | 0.96 2.82 0.78 0.06 0.81 | 16 12 10 0 0.94 | 34 58 25 0.04 | 0.15 0.22 0.19 -0.29
ec5: [h1, w2→w3, v1] | M | 0.76 1.82 0.84 0.67 0.86 | 17 11 9 1 0.95 | 79 61 47 0.55 | 0.27 0.13 0.83 0.88
ec6: [h1, w2→w4, v1] | S | 0.91 5.54 0.80 0.00 0.91 | 14 11 8 0 0.85 | 64 65 31 -0.40 | 0.13 0.15 0.12 0.64
ec7: [h1, w2→w5, v1] | L | 0.68 25.31 0.57 0.11 0.71 | 14 14 8 0 0.88 | 67 59 29 0.05 | 0.21 0.22 0.09 -0.13
ec8: [h1, w3→w4, v1] | L | 0.68 1.70 0.56 0.00 0.91 | 14 13 9 1 0.88 | 57 67 36 0.34 | 0.11 0.14 0.05 0.67
ec9: [h1, w3→w5, v1] | VL | 0.06 3.68 0.20 0.00 0.64 | 16 10 9 0 0.90 | 51 58 35 -0.52 | 0.11 0.21 0.06 -0.41
ec10: [h1, w4→w5, v1] | L | 0.70 4.85 0.76 0.00 0.75 | 12 12 11 0 0.95 | 58 57 43 0.29 | 0.14 0.20 0.64 -0.14
ec11: [h1, w6→w7, v1] | S | 0.82 5.79 0.77 0.25 0.88 | 36 30 28 2 0.89 | 109 164 102 0.96 | N/A N/A N/A N/A
ec12: [h1, w6→w8, v1] | S | 1.00 0.52 0.92 0.80 0.97 | 38 30 22 6 0.94 | 51 53 43 0.99 | N/A N/A N/A N/A
ec13: [h1, w8→w7, v1] | S | 1.00 0.32 0.92 0.53 0.99 | 30 33 26 1 0.98 | 53 89 51 1.00 | N/A N/A N/A N/A
ec14: [h1, w9→w10, v1] | L | 0.24 4.85 0.56 0.44 0.77 | 22 21 18 3 0.69 | 237 226 94 0.86 | N/A N/A N/A N/A

ES: expected severity of change (Sec. III-B): S: small change; SM: small-medium change; M: medium change; L: large change; VL: very large change.
SaC workload descriptions: srad: random matrix generator; pfilter: particle filtering; hotspot: heat transfer differential equations; k-means: clustering; nw: optimal matching; nbody: simulation of dynamic systems; cg: conjugate gradient; gc: garbage collector.
Hardware descriptions (ID: Type/CPUs/Clock (GHz)/RAM (GiB)/Disk): h1: NUC/4/1.30/15/SSD; h2: NUC/2/2.13/7/SCSI; h3: Station/2/2.8/3/SCSI; h4: Amazon/1/2.4/1/SSD; h5: Amazon/1/2.4/0.5/SSD; h6: Azure/1/2.4/3/SCSI.
Metrics: M1: Pearson correlation; M2: Kullback-Leibler (KL) divergence; M3: Spearman correlation; M4/M5: percentage of top/bottom configurations; M6/M7: number of influential options (source/target); M8/M9: number of options that agree/disagree; M10: correlation between importance of options; M11/M12: number of interactions (source/target); M13: number of interactions that agree on effects; M14: correlation between the coefficients; M15/M16: percentage of invalid configurations in source/target; M17: percentage of invalid configurations common between environments; M18: correlation between coefficients.
  • 28. RQ1: Does the performance behavior stay consistent across environments?

Environmental change | Severity | Linear corr.
SPEAR: NUC/2 → NUC/4 | S | 1.00
SPEAR: Amazon_nano → NUC | L | 0.59
SPEAR: Hardware/workload/version | VL | -0.10
x264: Version | L | 0.06
x264: Workload | M | 0.65
SQLite: write-seq → write-batch | S | 0.96
SQLite: read-rand → read-seq | M | 0.50

40 [Figure: throughput of target vs. source environment, with linear relation f_t = α·f_s + β]

Insight: We observed a linear shift only for non-severe hardware changes
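The linear shift suggested by this insight can be checked directly: under the assumption f_t ≈ α·f_s + β, a least-squares fit recovers α and β. A minimal sketch with made-up measurements (not data from the study):

```python
import numpy as np

# Source and target measurements of the same configurations (illustrative).
f_s = np.array([10.0, 14.2, 21.5, 30.3, 41.7])
f_t = np.array([21.1, 29.0, 44.3, 61.8, 84.0])

alpha, beta = np.polyfit(f_s, f_t, deg=1)     # fit f_t = alpha * f_s + beta
residuals = f_t - (alpha * f_s + beta)        # small residuals => linear shift holds
print(f"alpha={alpha:.2f}, beta={beta:.2f}, max |residual|={abs(residuals).max():.2f}")
```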
  • 29. We observed similar performance distributions across environments 44

Environmental change | Severity | Linear corr. | Divergence
x264: Version | L | 0.05 | 1.64
x264: Workload/version | L | 0.08 | 1.03
SQLite: write-seq → write-batch | VL | 0.51 | 2.80
SQLite: read-rand → read-seq | M | 0.50 | 1.24
SaC: Workload | VL | 0.06 | 3.68

[Figure: histograms (a)–(d) of runtime [s] vs. frequency in four environments]

Insight: For severe changes, the performance distributions are similar, showing the potential for learning a non-linear transfer function.
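The distribution comparison can be sketched with kernel density estimation and a numerical KL-divergence estimate, mirroring the study's use of KDE and KL divergence (M2). The data below are synthetic; `scipy.stats.gaussian_kde` and `scipy.stats.entropy` are used with their standard signatures.

```python
import numpy as np
from scipy.stats import gaussian_kde, entropy

rng = np.random.default_rng(0)
runtime_src = rng.normal(100, 15, size=500)   # runtimes in the source env (toy)
runtime_tgt = rng.normal(130, 20, size=500)   # runtimes in the target env (toy)

# Fit empirical performance distributions via kernel density estimation.
kde_src, kde_tgt = gaussian_kde(runtime_src), gaussian_kde(runtime_tgt)

# Evaluate both densities on a common grid and estimate KL(src || tgt).
grid = np.linspace(40, 220, 400)
p, q = kde_src(grid) + 1e-12, kde_tgt(grid) + 1e-12   # avoid log(0)
print(f"KL divergence ~ {entropy(p, q):.2f}")          # entropy(p, q) = sum p*log(p/q)
```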
  • 30. RQ2: Is the influence of configuration options on performance consistent across environments? 48

Environmental change | Severity | Dim | Influential options (paired t-test): S | T
x264: Version | L | 16 | 12 | 10
x264: Hardware/workload/version | VL | 16 | 8 | 9
SQLite: write-seq → write-batch | VL | 14 | 3 | 4
SQLite: read-rand → read-seq | M | 14 | 1 | 1
SaC: Workload | VL | 50 | 16 | 10

C = ⟨o1, o2, o3, o4, o5, o6, o7⟩

Insight: Only a subset of options is influential, and it is largely preserved across all environment changes.
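A sketch of how one might flag an influential option in the spirit of the paired t-test named on the slide: compare measurements of configuration pairs that differ only in that option. The values are invented; `scipy.stats.ttest_rel` is the standard paired-test call.

```python
from scipy.stats import ttest_rel

# Performance of paired configurations that differ only in option o_k:
# the same configurations with o_k disabled vs. enabled (illustrative values).
perf_option_off = [10.2, 11.1, 9.8, 10.5, 10.9, 11.3]
perf_option_on  = [14.9, 15.6, 14.1, 15.0, 15.8, 16.2]

t_stat, p_value = ttest_rel(perf_option_off, perf_option_on)
influential = p_value < 0.05   # option o_k significantly shifts performance
print(f"t={t_stat:.2f}, p={p_value:.4f}, influential={influential}")
```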
  • 31. RQ3: Are the interactions among configuration options preserved across environments? 52

Environmental change | Severity | Dim | Interactions: S | T | Corr.
SPEAR: Amazon_nano → NUC | L | 14 | 41 | 27 | 0.98
x264: Version | L | 16 | 47 | 33 | 1
SQLite: Workload | VL | 50 | 109 | 164 | 0.96

f_s = … − 7·o1·o2 + 2·o1·o3 − 0.2·o2·o3
f_t = … − 6·o1·o2 + 2·o1·o3 − 0.1·o2·o3

Insight: Only a subset of option interactions is influential, and it is largely preserved across all environment changes.
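Interaction terms like the o1·o2 coefficients above can be estimated by regressing performance on the options plus their pairwise products, and then correlating the learned coefficients across environments (in the spirit of M14). A minimal numpy sketch with toy data, not the study's learner:

```python
import numpy as np
from itertools import combinations

def design_with_interactions(X):
    """Augment a 0/1 option matrix with all pairwise interaction columns."""
    pairs = [(X[:, i] * X[:, j])[:, None] for i, j in combinations(range(X.shape[1]), 2)]
    return np.hstack([X] + pairs)

rng = np.random.default_rng(1)
X = rng.integers(0, 2, size=(64, 3)).astype(float)   # configs over o1..o3
Z = design_with_interactions(X)                      # columns: o1 o2 o3 o1o2 o1o3 o2o3

# Toy source/target responses sharing the same interaction structure.
y_s = Z @ np.array([3, 1, 2, -7, 2, -0.2]) + rng.normal(0, 0.1, 64)
y_t = Z @ np.array([4, 1, 2, -6, 2, -0.1]) + rng.normal(0, 0.1, 64)

coef_s, *_ = np.linalg.lstsq(Z, y_s, rcond=None)
coef_t, *_ = np.linalg.lstsq(Z, y_t, rcond=None)
print(np.corrcoef(coef_s, coef_t)[0, 1])   # correlation between coefficients
```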
  • 32. RQ4: Are the invalid configurations in the source also invalid in the target environment? 55

Environmental change | Severity | Invalid: S | T | Corr.
SPEAR: Workload/version | L | 50% | 45% | 0.96
SPEAR: Hardware/workload/version | VL | 47% | 50% | 0.97

[Figure: CPU usage [%] — prediction without transfer learning vs. prediction with transfer learning]

Insight: A moderate percentage of configurations are invalid, and they largely coincide across environments.
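Checking whether invalid configurations coincide across environments reduces to simple set arithmetic over the sampled configurations; a toy sketch with invented data:

```python
# Configurations observed to be invalid (e.g., crash or timeout) in each
# environment, keyed by their option tuple (illustrative data).
invalid_source = {(0, 1, 1), (1, 1, 0), (1, 1, 1), (0, 0, 1)}
invalid_target = {(0, 1, 1), (1, 1, 1), (1, 0, 1), (0, 0, 1)}
n_sampled = 8   # total configurations measured in each environment (toy)

pct_src = len(invalid_source) / n_sampled                         # invalid in source
pct_tgt = len(invalid_target) / n_sampled                         # invalid in target
overlap = len(invalid_source & invalid_target) / len(invalid_source | invalid_target)
print(pct_src, pct_tgt, overlap)   # how much invalidity carries over
```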
  • 34. Findings and implications 58
Simple change — Finding: strong correlations → Implication: simple transfer learning.
Severe change — Findings: similar performance distributions; similar options or interactions; high percentage of invalid configurations → Implications: establish a non-linear relation; focus on interesting regions.
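One way to read "focus on interesting regions" is to bias target sampling toward configurations the source model ranks highly. The sketch below is a hedged illustration of such guided sampling; the top-20% threshold and the ranking rule are assumptions for the example, not the paper's procedure.

```python
import random

# Source-environment performance predictions for candidate configurations
# (lower is better, e.g., runtime); illustrative values.
source_pred = {f"cfg{i}": random.uniform(10, 100) for i in range(100)}

# Keep the top 20% of configurations according to the source model and
# spend the (small) target measurement budget there first.
budget = 10
ranked = sorted(source_pred, key=source_pred.get)      # best first
interesting = ranked[: len(ranked) // 5]               # "interesting region"
to_measure_in_target = random.sample(interesting, k=budget)
print(to_measure_in_target)
```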
  • 35. Implications: This study opens up several future research opportunities 59 • Sampling • Learning • Performance testing • Performance tuning [Figure: latency (ms) response surface over number of counters × number of splitters; cubic interpolation over a finer grid] Similarity across environments matters!
  • 37. Recap 61 • Many systems are now configurable • Building performance models is expensive: with 25 options and 10 values each, there are 10^25 configurations to measure • Here is where transfer learning comes into the picture: an ML approach that uses the knowledge learned on the source to learn a cheaper model for the target • Hypotheses were categorized in 4 research questions — RQ1: consistent performance behavior across environments; RQ2: influence of configuration options; RQ3: option interactions (f_s = … − 7·o1·o2 + …, f_t = … − 3·o1·o2 + …); RQ4: invalid configurations

[Slide background: thumbnails of earlier slides — the Source (Given)/Target (Learn) transfer-learning diagram with the Section II excerpt (see slide 20); posterior-estimation plots of log P(θ, X_obs) and P(θ|X_obs) over Θ, comparing estimates from uniformly spaced points with estimates from points concentrated in high-likelihood regions; and CPU-usage predictions without vs. with transfer learning.]
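To illustrate "use the knowledge learned on the source to learn a cheaper model for the target", a common simple recipe (an assumption here, not necessarily the authors' method) is to feed the source model's prediction as an extra feature to a target model trained on only a handful of target samples. A minimal scikit-learn sketch with synthetic data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
X = rng.integers(0, 2, size=(200, 8)).astype(float)     # configurations
w = rng.normal(size=8)
y_src = X @ w + rng.normal(0, 0.1, 200)                 # plentiful source data
y_tgt = 2.0 * (X @ w) + 5 + rng.normal(0, 0.1, 200)     # related target response

# Source model trained on cheap/abundant source measurements.
src_model = RandomForestRegressor(random_state=0).fit(X, y_src)

# Cheap target model: a few target samples, with the source prediction
# appended as an extra feature (the transferable knowledge).
few = rng.choice(200, size=15, replace=False)
X_aug = np.column_stack([X, src_model.predict(X)])
tgt_model = RandomForestRegressor(random_state=0).fit(X_aug[few], y_tgt[few])
print(tgt_model.predict(X_aug[:5]))
```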