Data Quality Issues in Online Reinforcement Learning for Self-Adaptive Systems (Keynote)

© SSE, Prof. Dr. Andreas Metzger
Data Quality Issues in Online Reinforcement
Learning for Self-adaptive Systems
Andreas Metzger
SEA4DQ@ESEC/FSE 2022
Andreas Metzger. 2022. Data Quality Issues in Online Reinforcement Learning for Self-Adaptive Systems (Keynote). In Proceedings of the 2nd International Workshop on Software Engineering and AI for Data Quality in Cyber-Physical Systems/Internet of Things (SEA4DQ ’22), November 17, 2022, Singapore, Singapore. ACM, New York, NY, USA.
https://doi.org/10.1145/3549037.3570194

SOFTWARE SYSTEMS ENGINEERING
Prof. Dr. K. Pohl
Agenda
1. Fundamentals and Motivation
2. Addressing Data Quality Issues
a. Issue 1
b. Issue 2
c. Issue 3
3. Discussion

Fundamentals
(Self-)adaptive Software Systems [Salehie & Tahvildari, 2009; Weyns, 2021]
• Observe changes in environment, requirements and themselves
• Modify their structure, parameters and behavior
Adaptive Software Life-Cycle Model
[Figure: life cycle connecting DEV, OPS and ADAPT (self-observe, self-modify), along a spectrum from manual via human-in-the-loop to autonomous]

Fundamentals
MAPE-K Reference Model [Kephart & Chess, 2003; Salehie & Tahvildari, 2009]
• Architecture as defining characteristic of adaptive system [Weyns, 2021]
Example: Adaptive Web Shop
• Monitor: Sudden increase in number of concurrent users (workload peak)
• Analyze: Slow response of web shop if no adaptation
• Plan: Deactivating optional recommendation feature
• Execute: Replace dynamic recommendations with static banner
[Figure: MAPE-K loop. The self-adaptation logic (Monitor, Analyze, Plan, Execute around shared Knowledge) sits on top of the system logic and is connected to it via sensors and effectors.]
• Monitor: Collect and aggregate observations
• Analyze: Determine need for adaptation
• Plan: Identify concrete adaptations
• Execute: Enact concrete adaptation
[Plot: workload over time t]
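The web shop example can be read as one pass through the MAPE-K loop. The following Python sketch is only an illustration under assumed names: the threshold `MAX_USERS`, the `Knowledge` class and all function names are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass, field

# Hypothetical threshold for a workload peak (assumption, not from the talk).
MAX_USERS = 1000


@dataclass
class Knowledge:
    """Shared knowledge (K): current configuration and observation history."""
    recommendations_enabled: bool = True
    history: list = field(default_factory=list)


def monitor(knowledge, concurrent_users):
    """Monitor: collect and aggregate observations."""
    knowledge.history.append(concurrent_users)
    return concurrent_users


def analyze(knowledge, concurrent_users):
    """Analyze: determine the need for adaptation."""
    return concurrent_users > MAX_USERS and knowledge.recommendations_enabled


def plan(adaptation_needed):
    """Plan: identify a concrete adaptation (None if nothing to do)."""
    return "deactivate_recommendations" if adaptation_needed else None


def execute(knowledge, adaptation):
    """Execute: enact the adaptation, here by switching to a static banner."""
    if adaptation == "deactivate_recommendations":
        knowledge.recommendations_enabled = False


def mape_k_step(knowledge, concurrent_users):
    """One pass through Monitor, Analyze, Plan and Execute."""
    observed = monitor(knowledge, concurrent_users)
    adaptation = plan(analyze(knowledge, observed))
    execute(knowledge, adaptation)
    return adaptation
```

Note that both the "when" (the fixed threshold) and the "how" (the single hard-coded adaptation) are decided at design time in this sketch.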

Fundamentals
Engineering Challenge: “Design Time Uncertainty” [Weyns et al., 2013; Weyns, 2021; Weyns et al., 2022]
Defining when to adapt
• Requires anticipating all relevant environment states
• Infeasible in most cases due to incomplete information at design time
Example:
• Concrete services dynamically bound during execution are not known at design time
Defining how to adapt
• Requires knowing the concrete effect of an adaptation
• Concrete effect typically not precisely known at design time
Example:
• Deactivating the recommendation engine has a positive impact on performance
• But: How much exactly?

Online Reinforcement Learning
Basic Idea
[Metzger et al., 2022; Palm et al., 2020; Xu et al., 2012; Jamshidi et al., 2015; Arabnejad et al., 2017; Wang et al., 2020]
• Continuously learn and update the adaptation policy
• Based on concrete observations (data, feedback) from live system execution
• Facilitates leveraging information only available at runtime
[Figure: MAPE-K loop (Monitor, Analyze, Plan, Execute, Knowledge; system logic with sensors and effectors) extended with a Learn element (Feedback, Update)]

Fundamental RL Model
[Sutton & Barto, 2018]
• Reward function R defines learning goal
• RL agent aims to optimize cumulative rewards
Exploitation-Exploration trade-off
• In each learning step:
either exploitation or exploration
• Exploitation: Reuse of existing knowledge
• Exploration: Collection of new knowledge
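A minimal sketch of one learning step, assuming tabular Q-learning as described in [Sutton & Barto, 2018]. The learning rate `alpha`, the discount factor `gamma` and the dictionary-based Q-table are illustrative choices, not values from the talk.

```python
from collections import defaultdict

ACTIONS = ["UP", "DOWN", "LEFT", "RIGHT"]


def make_q_table():
    """Q-table mapping (state, action) to an estimated cumulative reward."""
    return defaultdict(float)


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=1.0):
    """Policy update: move Q(S, A) towards R + gamma * max_a Q(S', a)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]
```

Each call consumes one observed transition (S, A, R, S’), which is the feedback the agent receives from the environment in every learning step.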
[Figure: RL agent (action selection, policy, policy update) interacting with the environment via action A, state S, reward R and next state S’]
Textbook example “Cliff Walk” [Sutton & Barto, 2018]: actions = {UP, DOWN, LEFT, RIGHT}
[Figure: Cliff Walk grid with states and rewards]

Integrating MAPE-K and Reinforcement Learning [Metzger et al., 2022]
[Figure: three models side by side]
• MAPE-K Model: Monitor, Analyze, Plan, Execute, Knowledge
• RL Model: RL agent (action selection, policy, policy update) and environment, exchanging action A, state S, reward R and next state S’
• Integrated Model (Online RL for SAS): Monitor provides state S, reward R and next state S’; action selection takes the role of Analyze and Plan (A + P); the policy takes the role of Knowledge (K) and is refined by policy update; Execute enacts action A

Selected Data Quality Issues
[Figure: adaptive software life-cycle model (DEV, OPS, ADAPT with self-observe and self-modify; manual, human-in-the-loop) annotated with three issues]
(1) Data Sparsity: “Slow RL”
(2) Data Drift: “Fluctuating RL performance”
(3) Data Opaqueness: “RL as black box”

Agenda
a. Sparsity
b. Drift
c. Opaqueness
3. Discussion

Data Sparsity
Situation
• Large, discrete adaptation space (i.e., set of possible adaptations)
• Example: Service-oriented system
• 8 abstract services, with 2 concrete services each → 2^8 = 256 discrete adaptations possible
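The size of this adaptation space can be checked by enumeration. The sketch below assumes that an adaptation binds exactly one concrete service to every abstract service; the function name is hypothetical.

```python
from itertools import product

ABSTRACT_SERVICES = 8       # from the example
CONCRETE_PER_SERVICE = 2    # from the example


def enumerate_adaptations(n_abstract, n_concrete):
    """Each adaptation binds one concrete service to every abstract service."""
    return list(product(range(n_concrete), repeat=n_abstract))


adaptations = enumerate_adaptations(ABSTRACT_SERVICES, CONCRETE_PER_SERVICE)
print(len(adaptations))  # 2 ** 8 = 256
```

The count grows exponentially with the number of abstract services, which is why purely random exploration becomes slow.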
Shortcomings of state-of-the-art solutions
[Xu et al. 2012; Jamshidi et al. 2015; Arabnejad et al., 2017; Wang et al. 2020]
• Use random exploration
→ Slow learning in the presence of many adaptations
[Filho & Porter, 2017; Dulac-Arnold et al., 2015]
• Do not consider system evolution (“DevOps”)
→ New adaptations explored late

Addressing Data Sparsity
Feature-Model-based Exploration
[Metzger et al., 2020a; Metzger et al., 2022]
• Explicitly modeling adaptations in a feature model (FM) from software product-line engineering [Metzger & Pohl, 2014]
• FM typically encodes semantic relationships
• Exploration considers FM structure
[Figure: online RL loop for SAS (Monitor, action selection (A + P), policy (K), policy update, Execute; action A, state S, reward R, next state S’) complemented by a feature model]

Modeling Adaptations as Feature Model
[Figure: feature model of the web shop, shown before and after an adaptation. Root: Web Shop; features: Data Logging (Min, Medium, Max) and Content Discovery (Search, Recommendation). Legend: mandatory, optional, alternative, activated. Trigger shown: number of concurrent users exceeds 1000 → adaptation]
• FM = compact specification of valid system configurations
• Concrete system configuration = combination of activated features
• Adaptation = Change of concrete system configuration at run time
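The three definitions above can be sketched in a few lines of Python. The feature names come from the web shop example; which features are mandatory, optional or alternative is an assumption made only for this sketch, as are the function names.

```python
from itertools import combinations

# Feature names from the web shop example. The constraints below are
# ASSUMPTIONS for illustration, not the constraints of the original model.
LOGGING_LEVELS = ["Min", "Medium", "Max"]      # assumed: exactly one active
DISCOVERY = ["Search", "Recommendation"]       # assumed: each optional
ALWAYS_ON = {"WebShop", "DataLogging", "ContentDiscovery"}  # assumed mandatory


def valid_configurations():
    """Enumerate all configurations that satisfy the assumed constraints."""
    configs = []
    for level in LOGGING_LEVELS:
        for k in range(len(DISCOVERY) + 1):
            for chosen in combinations(DISCOVERY, k):
                configs.append(frozenset({*ALWAYS_ON, level, *chosen}))
    return configs


def adaptation(before, after):
    """An adaptation changes the configuration: features to deactivate and activate."""
    return {"deactivate": before - after, "activate": after - before}
```

Under these assumptions there are 3 × 4 = 12 valid configurations. Moving from a configuration with Max and Recommendation to one with only Medium deactivates Max and Recommendation and activates Medium.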
Example: [Figure: configuration before and after the adaptation, involving the features Recommendation, Max and Medium]

Feature-Model-based Exploration
[Figure: web shop feature model. Root: Web Shop; features: Data Logging (Min, Medium, Max) and Content Discovery (Search, Recommendation)]
State of the Art: random
FM-guided:
1. Start with any leaf node
2. Explore all configurations including this leaf node…
3. …only then explore configurations involving sibling features
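The FM-guided strategy can be approximated as an ordering over configurations. The sketch below is a simplification under stated assumptions: configurations are sets of feature names, the feature model is reduced to a mapping from parent features to their leaves, and the function name is hypothetical. It is not the algorithm from the cited papers.

```python
def fm_guided_order(configurations, leaves_by_parent, start_leaf):
    """Order configurations so that those sharing a leaf are explored together.

    configurations: list of frozensets of feature names
    leaves_by_parent: dict mapping a parent feature to its leaf features
    start_leaf: leaf feature to start with
    """
    # Visit the start leaf first, then its siblings, then the remaining leaves.
    start_parent = next(
        parent for parent, leaves in leaves_by_parent.items() if start_leaf in leaves
    )
    leaf_order = [start_leaf]
    leaf_order += [leaf for leaf in leaves_by_parent[start_parent] if leaf != start_leaf]
    for parent, leaves in leaves_by_parent.items():
        if parent != start_parent:
            leaf_order += leaves

    ordered, seen = [], set()
    for leaf in leaf_order:
        for config in configurations:
            if leaf in config and config not in seen:
                ordered.append(config)
                seen.add(config)
    # Configurations without any listed leaf come last.
    ordered += [c for c in configurations if c not in seen]
    return ordered
```

Compared with random exploration, semantically related configurations are visited in a block, so knowledge gained for one leaf is used before moving on to its siblings.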
Evolution-aware:
• Determine set-theoretic difference between FM before and after evolution step
• Removed configurations → Delete from policy (knowledge)
• Added configurations → Explore them first
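The evolution-aware step amounts to two set differences. A minimal sketch, assuming the policy is a dictionary from configurations to learned values; the function name and this data layout are hypothetical.

```python
def evolve_policy(policy, configs_before, configs_after):
    """Update the knowledge after an evolution step of the feature model.

    policy: dict mapping a configuration (frozenset) to its learned value
    Returns the pruned policy and the added configurations to explore first.
    """
    before, after = set(configs_before), set(configs_after)
    removed = before - after      # no longer valid: delete from policy
    added = after - before        # new: explore these first
    pruned = {c: v for c, v in policy.items() if c not in removed}
    explore_first = sorted(added, key=sorted)  # deterministic order
    return pruned, explore_first
```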

Validation
Selected Exemplar: CloudRM [Mann, 2016]
• Number of features: 63
• Number of adaptations: 344
• FM depth: 3

Validation
Selected Results [Metzger et al., 2020a; Metzger et al., 2022]
Improvement over SotA    FM-guided    Evol.-aware
Time to Threshold        23%          57%
Total Performance        38%          60%
[Plots: FM-guided, Evolution-aware]


Data Drift
Situation
• Non-stationarity of environment and / or data
• Adaptation policy may become sub-optimal over time
• Example: Cloud application
• Change of CPU of physical machines or OS performance of virtual machine
• Has effect on application performance (but is not observed by adaptation logic)
Shortcomings of state-of-the-art solutions
[Xu et al. 2012; Jamshidi et al. 2015; Arabnejad et al., 2017; Wang et al. 2020]
• Balance between exploration and exploitation via ε-greedy:
• ε: Exploration: Select random action
• (1 − ε): Exploitation: Select best action (according to knowledge)
• Use of ε-decay for convergence of learning
→ Insufficient exploration if ε too small
(need for detecting non-stationarity and dynamically changing ε)
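A minimal sketch of ε-greedy action selection with ε-decay. The exponential decay schedule and its parameter values are assumptions for illustration; the cited approaches may use other schedules.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Select an action from a dict {action: estimated value}."""
    if rng.random() < epsilon:
        return rng.choice(list(q_values))      # exploration
    return max(q_values, key=q_values.get)     # exploitation


def decayed_epsilon(step, start=1.0, decay=0.99, minimum=0.0):
    """Exponential epsilon-decay; all parameter values are illustrative."""
    return max(minimum, start * decay ** step)
```

With these illustrative values, ε has dropped below 0.001 after 1000 steps. If the environment drifts afterwards, the agent almost never explores and keeps exploiting outdated knowledge.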

Addressing Data Drift
Deep RL to avoid need for changing exploration rate
[Palm et al. 2020; Metzger et al. 2020b]
Deep RL = Knowledge represented as Deep Artificial Neural Network
• Natural handling of non-stationarity due to stochastic action selection (sampling)
• Additional benefits
• Handling of continuous states and actions
• Generalization over unseen, neighbouring states
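Stochastic action selection can be sketched as sampling from a probability distribution over actions. The sketch assumes a softmax policy and replaces the neural network by a plain list of action preferences; both are simplifications, and the function names are hypothetical.

```python
import math
import random


def softmax(preferences):
    """Turn action preferences (e.g. network outputs) into probabilities."""
    highest = max(preferences)
    exps = [math.exp(p - highest) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(actions, preferences, rng=random):
    """Stochastic action selection: sample an action from the policy distribution."""
    return rng.choices(actions, weights=softmax(preferences), k=1)[0]
```

Because every action keeps a non-zero probability, less preferred actions are still tried occasionally, without a separate exploration rate that would have to be tuned.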
[Figure: online RL loop for SAS (Monitor, action selection, policy (K), policy update, Execute; action A, state S, reward R, next state S’), with action selection realized by sampling]

Validation
Selected Exemplar: Brownout-RUBiS, an adaptive web shop [Klein et al., 2014]
• Change of rate of recommendations via “dimmer” value

Validation
Selected Results [Palm et al., 2020; Metzger et al., 2020b]
[Figure: time series over t of dimmer value, reward, latency (rel. response time), workload and CPU performance (levels 100% and 50%); annotations: non-stationarity, automatic handling of non-stationarity]


Data Opaqueness
Decision-making of RL agent not transparent
→ Difficult to deduce decision-making by only observing R, S, A
Example: [Figure: plots of R and of S, A]

Data Opaqueness
→ Reward function often too complex to deduce dynamic RL behavior
Example: Prescriptive Business Process Monitoring System
• Raises alarms to proactively adapt running processes
[Figure: reward function with the cases Alarm and No Alarm]

Data Opaqueness
→ Knowledge in Deep RL not explicitly represented
Classical RL (Q-Learning)
Cliff Walk: actions A = {UP, DOWN, LEFT, RIGHT}; states S = {0, …, 47}, numbered row by row (first row 0 to 11, third row 24 to 35)
Q-values for states 24 to 35:
State        24       25       26       27       28       29       30       31       32       33       34       35
UP       -13.36   -12.57   -11.73   -10.74    -9.95    -8.91    -7.99    -6.98    -5.95    -4.92    -3.93    -2.98
RIGHT    -12.00   -11.00   -10.00    -9.00    -8.00    -7.00    -6.00    -5.00    -4.00    -3.00    -2.00    -1.93
LEFT     -12.99   -13.00   -11.98   -10.95    -9.99    -8.89    -7.97    -6.98    -5.94    -4.81    -3.94    -2.98
DOWN     -13.95  -112.18  -112.80  -111.49  -112.13  -112.68  -112.91  -112.42  -111.81  -110.62  -112.86    -1.00
Annotations: DOWN in states 25 to 34 = falling into the cliff; DOWN in state 35 = reaching the goal (G)
Deep RL
[Figure: neural network mapping state S to action A]
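Because classical Q-learning stores its knowledge as an explicit table, the learned behavior can be read off directly. The sketch below copies the Q-values for states 24 to 35 from this slide and derives the greedy action per state; the helper name is hypothetical.

```python
# Q-values for states 24..35, taken from the Q-table on this slide.
STATES = list(range(24, 36))
Q = {
    "UP":    [-13.36, -12.57, -11.73, -10.74, -9.95, -8.91,
              -7.99, -6.98, -5.95, -4.92, -3.93, -2.98],
    "RIGHT": [-12.00, -11.00, -10.00, -9.00, -8.00, -7.00,
              -6.00, -5.00, -4.00, -3.00, -2.00, -1.93],
    "LEFT":  [-12.99, -13.00, -11.98, -10.95, -9.99, -8.89,
              -7.97, -6.98, -5.94, -4.81, -3.94, -2.98],
    "DOWN":  [-13.95, -112.18, -112.80, -111.49, -112.13, -112.68,
              -112.91, -112.42, -111.81, -110.62, -112.86, -1.00],
}


def greedy_action(state):
    """Return the action with the highest Q-value in the given state."""
    col = STATES.index(state)
    return max(Q, key=lambda action: Q[action][col])


policy = {state: greedy_action(state) for state in STATES}
```

The resulting policy walks RIGHT along the row above the cliff and moves DOWN only in state 35. A deep RL policy offers no comparable table to inspect.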

Addressing Data Opaqueness
Explainable AI for Online RL
Reward Decomposition [Juozapaitis et al., 2019]
Decompose reward function to explain short-term goal orientation of RL (train sub-RL agents)
Pro
• Helpful in the presence of multiple, “competing” quality goals for learning
• Provides contrastive (counterfactual) explanations
Con
• No indication of explanation’s relevance
• Requires manually selecting relevant explanations → cognitive overhead
Interestingness Elements [Sequeira et al., 2020]
Identify relevant moments of interaction between agent and environment at runtime
Pro
• Facilitates automatically selecting relevant interactions to be explained
Con
• Does not explain whether RL behaves as expected and for the right reasons
Reward Decomposition + Interestingness Elements = Decomposed Interestingness Elements (DINEs)
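A minimal sketch of reward decomposition, assuming one Q-table per reward channel (sub-agent) and a composed decision based on the sum of the channel values. The channel names, action names and numbers in `SUB_Q` are hypothetical and only illustrate two competing quality goals.

```python
# Hypothetical per-channel Q-values for one state (assumption for illustration).
SUB_Q = {
    "performance": {"keep": -4.0, "deactivate": 2.0},
    "revenue": {"keep": 3.0, "deactivate": -1.0},
}


def composed_q(sub_q, action):
    """Composed value of an action: sum over all reward channels (sub-agents)."""
    return sum(channel[action] for channel in sub_q.values())


def choose_action(sub_q, actions):
    """Composed decision: action with the highest summed value."""
    return max(actions, key=lambda a: composed_q(sub_q, a))


def contrast(sub_q, chosen, alternative):
    """Contrastive explanation: per-channel advantage of chosen over alternative."""
    return {name: channel[chosen] - channel[alternative]
            for name, channel in sub_q.items()}
```

For the hypothetical values, `deactivate` is chosen, and the contrast against `keep` shows a gain in the performance channel that outweighs a loss in the revenue channel.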

Explainable RL
Three Types of DINEs
Important Interaction
Is RL in a given state uncertain (wide range of actions) or certain (almost always the same action)?
• How much does the relative importance of actions differ for each sub-agent?
• Number of DINEs shown can be tuned via threshold ρ (level of inequality)
[Figure: visualization in dashboard, showing a certain and an uncertain state]
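How a threshold on inequality could select interactions is sketched below. The talk does not state which inequality measure is used; the Gini coefficient here is a stand-in chosen for illustration, and the function names and the direction of the comparison are assumptions.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 = all equal, higher = more unequal."""
    n, total = len(values), sum(values)
    if n == 0 or total == 0:
        return 0.0
    diff_sum = sum(abs(x - y) for x in values for y in values)
    return diff_sum / (2 * n * total)


def is_important_interaction(action_importances, rho):
    """Flag a state for one sub-agent when inequality among actions exceeds rho.

    action_importances: relative importance of each action in this state
    rho: threshold on the level of inequality (assumed to lie in [0, 1))
    """
    return gini(action_importances) > rho
```

A higher ρ flags fewer states, which is one way to read the statement that the number of DINEs shown can be tuned via the threshold.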

Explainable RL
Reward Channel Dominance
What influence does a sub-agent have on possible actions?
• Influence of rewards of sub-agents on composed decision
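One way to quantify the influence of each sub-agent on an action is its relative share of the composed value. This is a sketch under assumptions: the normalization by absolute values, the channel names and the numbers in `SUB_Q` are hypothetical and not taken from the talk.

```python
# Hypothetical per-channel Q-values for one state (assumption for illustration).
SUB_Q = {
    "performance": {"keep": -3.0, "deactivate": 1.0},
    "revenue": {"keep": 1.0, "deactivate": -1.0},
}


def channel_shares(sub_q, action):
    """Relative share of each channel in the composed value of one action.

    Uses absolute values so that negative rewards also count as influence;
    this normalization is an assumption of the sketch.
    """
    magnitudes = {name: abs(channel[action]) for name, channel in sub_q.items()}
    total = sum(magnitudes.values())
    if total == 0:
        return {name: 0.0 for name in magnitudes}
    return {name: m / total for name, m in magnitudes.items()}


def dominant_channel(sub_q, action):
    """Channel with the largest relative share for the given action."""
    shares = channel_shares(sub_q, action)
    return max(shares, key=shares.get)
```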

Explainable RL
Reward Channel Extremum
Where are RL decisions for potentially critical states?
• Points after local minimum/maximum of state-value
• ExpectedReward(S) − ExpectedReward(S’) > ϕ → Maximum
• Number of DINEs shown can be tuned via threshold ϕ
[Figure: state-value curve with marked minimum and maximum]
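The threshold condition can be applied to a sequence of expected rewards along a trajectory. In the sketch below, the maximum condition follows the slide; the mirrored condition for a minimum is an assumption, as is the function name.

```python
def reward_channel_extrema(values, phi):
    """Find extrema in a sequence of expected rewards (state-values).

    values[t] is the expected reward of the state visited at step t.
    Returns a list of (t, kind) with kind "maximum" or "minimum".
    """
    extrema = []
    for t in range(len(values) - 1):
        drop = values[t] - values[t + 1]
        if drop > phi:
            extrema.append((t, "maximum"))   # condition from the slide
        elif -drop > phi:
            extrema.append((t, "minimum"))   # mirrored condition (assumption)
    return extrema
```

Raising ϕ keeps only the steepest changes, so fewer DINEs are shown.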

Validation
Addressing Cognitive Overhead of Explanations
[Figures: Important Interactions; Reward Channel Extrema]


Discussion
Current Limitations of RL → and future directions
RL not applicable for all systems
• High risk if wrong adaptations lead to damage → Safe RL
• Data can be used to manipulate RL → Adversarial RL
Performance / Sustainability of RL
• Deep RL (like all Deep ML) very resource intensive
• Simple dependencies need to be learned → Meta RL
→ Hybrid ML
Difficult to get Reward Function right
• Explainable RL = “Debugging”
• How to align human goals and AI goals? → Reward Engineering

Thanks!
Research leading to these results has received funding from the EU’s
Horizon 2020 research and innovation programme under grant
agreement no. 871493 – www.dataports-project.eu
More information:
Call for Papers of related events:
• Track on Data and AI Driven Engineering (DAIDE), as part of SEAA 2023 (49th Euromicro Conference Series on Software Engineering and Advanced Applications), Montenegro/Albania, Summer 2023
• 2nd International Conference on AI Engineering – Software Engineering for AI (CAIN 2023), collocated with ICSE 2023, Melbourne, Australia, May 2023

References
[Arabnejad et al., 2017] H. Arabnejad, C. Pahl, P. Jamshidi, and G. Estrada, “A comparison of reinforcement learning techniques for fuzzy cloud autoscaling,” in 17th Intl Symposium on Cluster, Cloud and Grid Computing, CCGRID 2017
[De Lemos et al., 2010] R. de Lemos et al., “Software Engineering for Self-Adaptive Systems: A Second Research Roadmap,” in Softw. Eng. for Self-Adaptive Systems II, ser. LNCS. Springer, 2013, vol. 7475, pp. 1–32
[Di Francescomarino et al., 2018] Chiara Di Francescomarino, Chiara Ghidini, Fabrizio Maria Maggi, Fredrik Milani: Predictive Process Monitoring Methods: Which One Suits Me Best? BPM 2018: 462-479
[Dulac-Arnold et al., 2015] Gabriel Dulac-Arnold, Richard Evans, Peter Sunehag, Ben Coppin: Reinforcement Learning in Large Discrete Action Spaces. CoRR abs/1512.07679 (2015)
[Evermann et al., 2017] Evermann, J., Rehse, J., Fettke, P.: Predicting process behaviour using deep learning. Decision Support Systems 100, 2017
[Filho & Porter, 2017] Filho, R.V.R., Porter, B.: Defining emergent software using continuous self-assembly, perception, and learning. TAAS 12(3), 16:1–16:25 (2017)
[Jamshidi et al., 2015] P. Jamshidi, A. Molzam Sharifloo, C. Pahl, A. Metzger, and G. Estrada, “Self-learning cloud controllers: Fuzzy Q-learning for knowledge evolution (short paper),” in Int’l Conference on Cloud and Autonomic Computing (ICCAC 2015), Cambridge, USA, September 21-24, 2015
[Kephart & Chess, 2003] J. O. Kephart and D. M. Chess, “The vision of autonomic computing,” IEEE Computer, vol. 36, no. 1, pp. 41–50, 2003.
[Klein et al., 2014] C. Klein, M. Maggio, K. Arzen, F. Hernandez-Rodriguez, “Brownout: building more robust cloud applications”. In: 36th Intl Conf. on Software Engineering (ICSE 2014), pp. 700–711. ACM, 2014
[Mann, 2016] Z. Mann, “Interplay of virtual machine selection and virtual machine placement”, in: 5th European Conf. on Service-Oriented and Cloud Computing, ESOCC’16, LNCS vol. 9846, pp. 137–151 (2016)
[Metzger & Pohl, 2014] A. Metzger, K. Pohl, “Software product line engineering and variability management: Achievements and challenges,” in ICSE Future of Software Engineering Track (FOSE 2014), ACM, 2014, pp. 70–84.

[Metzger et al., 2019] A. Metzger, A. Neubauer, P. Bohn, and K. Pohl, “Proactive process adaptation using deep learning ensembles,” in 31st Int’l Conf. on Advanced Information Systems Engineering (CAiSE 2019), LNCS, vol. 11483. Springer, 2019, pp. 547–562
[Metzger et al., 2022] A. Metzger, C. Quinton, Z. Á. Mann, L. Baresi, K. Pohl, “Realizing Self-Adaptive Systems via Online Reinforcement Learning and Feature-Model-guided Exploration”, Computing, Springer, March 2022
[Metzger et al., 2020a] A. Metzger, C. Quinton, Z. Mann, L. Baresi, and K. Pohl, “Feature model-guided online reinforcement learning for self-adaptive services,” in 18th Int’l Conf. on Service-Oriented Computing (ICSOC 2020), LNCS 12571, Springer, 2020
[Metzger et al., 2020b] A. Metzger, T. Kley, and A. Palm, “Triggering proactive business process adaptations via online reinforcement learning,” in 18th Int’l Conf. on Business Process Management (BPM 2020), LNCS 12168. Springer, 2020
[Palm et al., 2020] A. Palm, A. Metzger, and K. Pohl, “Online reinforcement learning for self-adaptive information systems,” in 32nd Int’l Conf. on Advanced Information Systems Engineering (CAiSE 2020), LNCS 12127. Springer, 2020
[Porter et al., 2020] B. Porter, R. R. Filho, and P. Dean, “A survey of methodology in self-adaptive systems research,” in ACSOS. IEEE, 2020
[Salehie & Tahvildari, 2009] M. Salehie and L. Tahvildari, “Self-adaptive software: Landscape and research challenges,” TAAS, vol. 4, no. 2, 2009.

[Siegmund et al., 2012] N. Siegmund, S. Kolesnikov, C. Kästner, S. Apel, D. Batory, M. Rosenmüller, G. Saake: Predicting Performance via Automated Feature-interaction Detection. In: 34th Intl Conf. on Software Engineering (ICSE 2012), pp. 167–177, ACM, 2012
[Sutton & Barto, 2018] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, 2nd ed. Cambridge, MA, USA: MIT Press, 2018
[Taylor & Stone, 2009] M. Taylor, P. Stone: Transfer learning for reinforcement learning domains: A survey. J. Mach. Learn. Res. 10, 1633–1685 (2009)
[Wang et al., 2020] Hongbing Wang, Jiajie Li, Qi Yu, Tianjing Hong, Jia Yan, Wei Zhao: Integrating recurrent neural networks and reinforcement learning for dynamic service composition. Future Gener. Comput. Syst. 107: 551-563 (2020)
[Weyns et al., 2013] Danny Weyns, Nelly Bencomo, Radu Calinescu, Javier Cámara, Carlo Ghezzi, Vincenzo Grassi, Lars Grunske, Paola Inverardi, Jean-Marc Jézéquel, Sam Malek, Raffaela Mirandola, Marco Mori, Giordano Tamburrelli: Perpetual Assurances for Self-Adaptive Systems. Software Engineering for Self-Adaptive Systems 2013: 31-63
[Weyns, 2021] Danny Weyns, Introduction to Self-Adaptive Systems: A Contemporary Software Engineering Perspective, Wiley, 2021.
[Weyns et al., 2022] D. Weyns, I. Gerostathopoulos, N. Abbas, J. Andersson, S. Biffl, P. Brada, T. Bures, A. D. Salle, P. Lago, A. Musil, J. Musil, and P. Pelliccione, “Preliminary results of a survey on the use of self-adaptation in industry,” in 17th Intl Symp. on Software Engineering for Adaptive and Self-Managing Systems, SEAMS@ICSE 2022, 2022.
[Xu et al., 2012] C. Xu, J. Rao, and X. Bu, “URL: A unified reinforcement learning approach for autonomic cloud management,” J. Parallel Distrib. Comput., vol. 72, no. 2, pp. 95–105, 2012
