A Reinforcement Learning Approach to Solving Hybrid Flexible Flowline Scheduling Problems
Bert Van Vreckem, Dmitriy Borodin, Wim De Bruyn, Ann Nowé
Authors
• Bert Van Vreckem, HoGent Business and Information
Management
bert.vanvreckem@hogent.be
• Dmitriy Borodin, OMPartners
dborodin@ompartners.com
• Wim De Bruyn, HoGent Business and Information
Management
wim.debruyn@hogent.be
• Ann Nowé, Artificial Intelligence Lab, Vrije Universiteit Brussel
ann.nowe@vub.ac.be
HFFSP MISTA2013: 29 August 2013 3/28
Contents
1 Hybrid Flexible Flowline Scheduling Problems
2 A Machine Learning Approach
3 Learning Permutations with Precedence Constraints
4 Experiments & results
5 Conclusion
Hybrid Flexible Flowline Scheduling Problems
Powerful model for complex real-life production scheduling problems.
In α/β/γ notation (Urlings, 2010):
$$HFFL_m,\ ((RM^{(i)})_{i=1}^{(m)}) \,/\, M_j,\ rm,\ prec,\ S_{iljk},\ A_{iljk},\ lag \,/\, C_{\max}$$
Flowline scheduling problems: jobs are processed in consecutive stages.
[Figure: Stage 1, Stage 2, Stage 3, Stage 4 in sequence]
Hybrid case: unrelated parallel machines
[Figure: machines M11, M12, M13 (stage 1); M21, M22 (stage 2); M31, M32, M33, M34 (stage 3); M41, M42 (stage 4)]
Flexible case: stages may be skipped
[Figure: machines M11, M12, M13; M21, M22; M41, M42 (the stage 3 machines are not shown)]
Other constraints: Machine eligibility
[Figure: only machines M11, M13; M21, M22; M31, M33; M42 are shown]
Other constraints: Time lag between stages
[Figure: Stage 1, Stage 2, Stage 3, Stage 4]
Other constraints: Sequence dependent setup times
[Figure: Gantt charts over time units 1 to 12 on machines M1 and M2, once for the job order J1, J2 and once for the job order J2, J1]
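To illustrate why the job order matters under sequence dependent setup times, here is a minimal sketch for a single machine. The processing and setup times are made-up values, not read from the slide, and the helper name `completion_time` is hypothetical; any setup before the first job is ignored.

```python
def completion_time(order, proc, setup):
    """Completion time of the last job on one machine.

    order -- job sequence, e.g. ["J1", "J2"]
    proc  -- processing time per job
    setup -- setup[(prev, nxt)]: setup time before nxt when it follows prev
    """
    t = 0
    prev = None
    for job in order:
        if prev is not None:
            t += setup[(prev, job)]  # depends on the pair, not only on the job
        t += proc[job]
        prev = job
    return t


# Hypothetical data: switching from J2 to J1 is more expensive than J1 to J2.
proc = {"J1": 3, "J2": 4}
setup = {("J1", "J2"): 1, ("J2", "J1"): 3}

first = completion_time(["J1", "J2"], proc, setup)   # 3 + 1 + 4 = 8
second = completion_time(["J2", "J1"], proc, setup)  # 4 + 3 + 3 = 10
print(first, second)
```

With these assumed numbers the same two jobs finish at different times depending on their order, which is the effect the two Gantt charts contrast.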
Other constraints: Precedence relations between jobs
[Figure: Gantt charts over time units 1 to 12 on machines M1 and M2, once for the job order J1, J2 and once for the job order J2, J1]
Precedence relations between jobs make the problem much harder, to the point that the MILP/CPLEX approach no longer works for larger instances (Urlings, 2010).
A Machine Learning Approach
Solving Hybrid Flexible Flowline Scheduling Problems in two stages:
• Job permutations → Learning Automata
• Machine assignment → Earliest Preparation Next Stage (EPNS) (Urlings, 2010)
Reinforcement learning
At every discrete time step t:
• Agent perceives environment state s(t)
• Agent chooses action $a(t) \in A = \{a_1, \ldots, a_n\}$ according to some policy
• Environment places agent in new state s(t + 1) and gives reinforcement r(t)
• Goal: learn a policy that maximizes the long-term cumulative reward $\sum_t r(t)$
[Figure: agent-environment loop; the environment passes state s and reinforcement r to the agent, and the agent passes action a to the environment]
Learning Automata (LA)
Reinforcement Learning agents that choose an action according to a probability distribution $p(t) = (p_1(t), \ldots, p_n(t))$, with $p_i = \mathrm{Prob}[a(t) = a_i]$ and s.t. $\sum_{i=1}^{n} p_i = 1$
$$p_i(0) = \frac{1}{n} \quad (1)$$
$$p_i(t+1) = p_i(t) + \alpha_{rew}\, r(t)\,(1 - p_i(t)) - \alpha_{pen}\,(1 - r(t))\, p_i(t) \quad (2)$$
if $a_i$ is the action taken at instant $t$
$$p_j(t+1) = p_j(t) - \alpha_{rew}\, r(t)\, p_j(t) + \alpha_{pen}\,(1 - r(t)) \left( \frac{1}{n-1} - p_j(t) \right) \quad (3)$$
if $a_j \neq a_i$
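Update rules (1) to (3) can be sketched in a few lines of Python. This is only an illustration: the function name `la_update` and its argument names are hypothetical. The example uses four actions with action 3 (index 2) chosen, and the learning rates 0.1 and 0.5 reported on the Experiments slide.

```python
def la_update(p, chosen, r, alpha_rew, alpha_pen):
    """Return the probability vector after one learning automaton step.

    p      -- current action probabilities, summing to 1
    chosen -- index i of the action taken at instant t
    r      -- reinforcement r(t) in [0, 1]
    """
    n = len(p)
    new_p = []
    for j, pj in enumerate(p):
        if j == chosen:
            # Equation (2): the action that was taken
            pj_next = pj + alpha_rew * r * (1 - pj) - alpha_pen * (1 - r) * pj
        else:
            # Equation (3): every other action
            pj_next = (pj - alpha_rew * r * pj
                       + alpha_pen * (1 - r) * (1 / (n - 1) - pj))
        new_p.append(pj_next)
    return new_p


p0 = [0.25, 0.25, 0.25, 0.25]  # equation (1) with n = 4

# Action 3 (index 2) rewarded: its probability grows to about 0.325,
# the others shrink to about 0.225 each.
rewarded = la_update(p0, chosen=2, r=1.0, alpha_rew=0.1, alpha_pen=0.5)

# Action 3 penalised: its probability drops to about 0.125,
# the others grow to about 0.2917 each.
penalised = la_update(p0, chosen=2, r=0.0, alpha_rew=0.1, alpha_pen=0.5)

print(rewarded, penalised)
```

In both cases the updated values still sum to 1, so the vector remains a probability distribution.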
Learning Automaton update
[Figure: three bar charts of the probabilities p_i (axis from 0 to 1) for actions i = 1, ..., 4: before the update; after action 3 was chosen and r(t) = 1; after action 3 was chosen and r(t) = 0]
Learning Permutations with Precedence Constraints
Probabilistic Basic Simple Strategy (PBSS)
(Wauters, 2012)
• An LA is assigned to every position of a permutation
• The LAs play a dispersion game in which each chooses a unique action, resulting in a permutation
• The quality of the solution is evaluated
• Probabilities are updated according to the Linear Reward-Inaction LA update rule ($\alpha_{pen} = 0$):
  • Better result than the best one so far: r(t) = 1
  • If not: r(t) = 0
• Repeat until convergence
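The loop above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation from (Wauters, 2012): the dispersion game is replaced by a stand-in in which positions pick in turn from the jobs still unused, the loop runs for a fixed number of iterations instead of testing convergence, and the names `sample_permutation`, `reward_inaction`, `pbss` and `toy_cost` are hypothetical.

```python
import random


def sample_permutation(probs, rng):
    """Draw one job per position without repetition.

    Stand-in for the dispersion game (an assumption): positions pick in
    turn, each sampling from its own probabilities restricted to the jobs
    that are still unused.
    """
    free = list(range(len(probs)))
    perm = []
    for pos in range(len(probs)):
        weights = [probs[pos][job] for job in free]
        job = rng.choices(free, weights=weights, k=1)[0]
        perm.append(job)
        free.remove(job)
    return perm


def reward_inaction(p, chosen, r, alpha_rew):
    """Linear Reward-Inaction: LA update with alpha_pen = 0."""
    return [pj + alpha_rew * r * (1 - pj) if j == chosen
            else pj - alpha_rew * r * pj
            for j, pj in enumerate(p)]


def pbss(n, cost, iterations=200, alpha_rew=0.1, seed=0):
    """Learn a permutation of n jobs that makes cost(perm) small."""
    rng = random.Random(seed)
    probs = [[1.0 / n] * n for _ in range(n)]  # one LA per position
    best_perm, best_cost = None, float("inf")
    for _ in range(iterations):
        perm = sample_permutation(probs, rng)
        c = cost(perm)
        r = 1.0 if c < best_cost else 0.0  # reward only strict improvement
        if r:
            best_perm, best_cost = perm, c
        for pos, job in enumerate(perm):
            probs[pos] = reward_inaction(probs[pos], job, r, alpha_rew)
    return best_perm, best_cost, probs


def toy_cost(perm):
    """Hypothetical objective: distance of perm from the identity order."""
    return sum(abs(job - pos) for pos, job in enumerate(perm))


best_perm, best_cost, probs = pbss(5, toy_cost)
print(best_perm, best_cost)
```

Because r(t) = 0 leaves every probability unchanged when $\alpha_{pen} = 0$, the automata in this sketch only move when a new best solution is found.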
Probabilistic Basic Simple Strategy (PBSS)
• PBSS gives great results in several optimization problems that involve learning permutations
• but it doesn't work well when precedence constraints are involved
• PBSS only learns from positive experience (i.e. improving on previous solutions)
• It doesn't learn to avoid invalid permutations
Extending PBSS for precedence constraints
Updating probabilities:
• If the job permutation is invalid, perform an update with r(t) = 0 and $\alpha_{pen} > 0$ for all agents that are involved in the violation of precedence constraints.
• If the job permutation is valid, perform an $L_{R-I}$ update in all agents, depending on the resulting makespan $ms$ and the best makespan so far $ms_{best}$:
  • improved: r(t) = 1;
  • equally good: r(t) = 1/2;
  • worse: $r(t) = \frac{ms_{best}}{2\,ms}$;
  • no valid schedule found: r(t) = 0.
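The two cases above can be sketched as two small helpers. The names `violating_positions` and `reward` are hypothetical, precedence constraints are assumed to be given as (before, after) job pairs, and a missing schedule is assumed to be signalled by `ms = None`; none of these details come from the slides.

```python
def violating_positions(perm, precedences):
    """Positions (agents) involved in a violated precedence constraint.

    perm        -- job permutation, perm[pos] is the job at position pos
    precedences -- iterable of (before, after) job pairs (assumed format)
    """
    position = {job: pos for pos, job in enumerate(perm)}
    bad = set()
    for before, after in precedences:
        if position[before] > position[after]:
            bad.update((position[before], position[after]))
    return bad


def reward(ms, ms_best):
    """Reinforcement r(t) for a valid permutation with makespan ms.

    ms is None when no valid schedule was found (assumed convention).
    """
    if ms is None:
        return 0.0
    if ms < ms_best:
        return 1.0                  # improved
    if ms == ms_best:
        return 0.5                  # equally good
    return ms_best / (2 * ms)       # worse: always below 1/2


# Job 1 must precede job 2, but job 2 sits at position 0 and job 1 at
# position 2, so the agents at positions 0 and 2 receive the penalty update.
print(violating_positions([2, 0, 1], [(0, 1), (1, 2)]))  # {0, 2}
print(reward(120, 100))  # 100 / 240, about 0.417
```

Since $ms > ms_{best}$ in the worse case, that reward stays strictly below the 1/2 given to an equally good solution and shrinks as the makespan gets worse.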
Experiments
• HFFSP benchmark problems from (Ruiz et al., 2008), available at http://soa.iti.es/problem-instances
• Problem sets with 5, 7, 9, 11, 13, 15 jobs, 96 instances in each set
• + other constraints that make problems harder (precedence relations!)
• $\alpha_{rew} = 0.1$; $\alpha_{pen} = 0.5$ (no tuning)
• Run until convergence, or for at most 300 seconds
Results
| Instance set | 5 | 7 | 9 | 11 | 13 | 15 | overall |
|---|---|---|---|---|---|---|---|
| mean RD (%) | 0.0697 | 2.0131 | 1.1568 | 1.6565 | 3.7294 | 7.9189 | 2.7484 |
| best RD (%) | -35.70 | -24.71 | -26.92 | -21.10 | -43.34 | -10.46 | -43.34 |
| # improved | 11 | 12 | 18 | 12 | 9 | 6 | 68 |
| # equal | 62 | 40 | 19 | 18 | 8 | 7 | 154 |
| # worse | 23 | 44 | 59 | 66 | 79 | 82 | 354 |
Results and Discussion
Contributions:
• Extension of PBSS for learning permutations with precedence constraints
• Simple model + RL approach can yield good quality results for challenging HFFSP instances
Discussion & future work:
• Precedence relations do make the problem harder
• Parameter tuning
• Convergence
• Larger instances (50, 100 jobs)
• Explore possibilities for improvement in machine assignment
Thank you!
Questions?
bert.vanvreckem@hogent.be
http://www.slideshare.net/bertvanvreckem/
HFFSP MISTA2013: 29 August 2013 28/28

Weitere ähnliche Inhalte

Was ist angesagt?

Stein's method for functional Poisson approximation
Stein's method for functional Poisson approximationStein's method for functional Poisson approximation
Stein's method for functional Poisson approximationLaurent Decreusefond
 
Simplified Runtime Analysis of Estimation of Distribution Algorithms
Simplified Runtime Analysis of Estimation of Distribution AlgorithmsSimplified Runtime Analysis of Estimation of Distribution Algorithms
Simplified Runtime Analysis of Estimation of Distribution AlgorithmsPK Lehre
 
Hierarchical Reinforcement Learning with Option-Critic Architecture
Hierarchical Reinforcement Learning with Option-Critic ArchitectureHierarchical Reinforcement Learning with Option-Critic Architecture
Hierarchical Reinforcement Learning with Option-Critic ArchitectureNecip Oguz Serbetci
 
Problem Understanding through Landscape Theory
Problem Understanding through Landscape TheoryProblem Understanding through Landscape Theory
Problem Understanding through Landscape Theoryjfrchicanog
 
better together? statistical learning in models made of modules
better together? statistical learning in models made of modulesbetter together? statistical learning in models made of modules
better together? statistical learning in models made of modulesChristian Robert
 
Variational inference
Variational inference  Variational inference
Variational inference Natan Katz
 
prior selection for mixture estimation
prior selection for mixture estimationprior selection for mixture estimation
prior selection for mixture estimationChristian Robert
 
RuleML2015: Input-Output STIT Logic for Normative Systems
RuleML2015: Input-Output STIT Logic for Normative SystemsRuleML2015: Input-Output STIT Logic for Normative Systems
RuleML2015: Input-Output STIT Logic for Normative SystemsRuleML
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerChristian Robert
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Valentin De Bortoli
 
Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?Christian Robert
 
Bayesian Nonparametrics: Models Based on the Dirichlet Process
Bayesian Nonparametrics: Models Based on the Dirichlet ProcessBayesian Nonparametrics: Models Based on the Dirichlet Process
Bayesian Nonparametrics: Models Based on the Dirichlet ProcessAlessandro Panella
 

Was ist angesagt? (20)

Stein's method for functional Poisson approximation
Stein's method for functional Poisson approximationStein's method for functional Poisson approximation
Stein's method for functional Poisson approximation
 
Simplified Runtime Analysis of Estimation of Distribution Algorithms
Simplified Runtime Analysis of Estimation of Distribution AlgorithmsSimplified Runtime Analysis of Estimation of Distribution Algorithms
Simplified Runtime Analysis of Estimation of Distribution Algorithms
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
Hierarchical Reinforcement Learning with Option-Critic Architecture
Hierarchical Reinforcement Learning with Option-Critic ArchitectureHierarchical Reinforcement Learning with Option-Critic Architecture
Hierarchical Reinforcement Learning with Option-Critic Architecture
 
Problem Understanding through Landscape Theory
Problem Understanding through Landscape TheoryProblem Understanding through Landscape Theory
Problem Understanding through Landscape Theory
 
better together? statistical learning in models made of modules
better together? statistical learning in models made of modulesbetter together? statistical learning in models made of modules
better together? statistical learning in models made of modules
 
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
 
Variational inference
Variational inference  Variational inference
Variational inference
 
prior selection for mixture estimation
prior selection for mixture estimationprior selection for mixture estimation
prior selection for mixture estimation
 
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
QMC Program: Trends and Advances in Monte Carlo Sampling Algorithms Workshop,...
 
RuleML2015: Input-Output STIT Logic for Normative Systems
RuleML2015: Input-Output STIT Logic for Normative SystemsRuleML2015: Input-Output STIT Logic for Normative Systems
RuleML2015: Input-Output STIT Logic for Normative Systems
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like sampler
 
master thesis presentation
master thesis presentationmaster thesis presentation
master thesis presentation
 
talk MCMC & SMC 2004
talk MCMC & SMC 2004talk MCMC & SMC 2004
talk MCMC & SMC 2004
 
QMC: Operator Splitting Workshop, Incremental Learning-to-Learn with Statisti...
QMC: Operator Splitting Workshop, Incremental Learning-to-Learn with Statisti...QMC: Operator Splitting Workshop, Incremental Learning-to-Learn with Statisti...
QMC: Operator Splitting Workshop, Incremental Learning-to-Learn with Statisti...
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...
 
Can we estimate a constant?
Can we estimate a constant?Can we estimate a constant?
Can we estimate a constant?
 
CLIM Fall 2017 Course: Statistics for Climate Research, Guest lecture: Data F...
CLIM Fall 2017 Course: Statistics for Climate Research, Guest lecture: Data F...CLIM Fall 2017 Course: Statistics for Climate Research, Guest lecture: Data F...
CLIM Fall 2017 Course: Statistics for Climate Research, Guest lecture: Data F...
 
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
 
Bayesian Nonparametrics: Models Based on the Dirichlet Process
Bayesian Nonparametrics: Models Based on the Dirichlet ProcessBayesian Nonparametrics: Models Based on the Dirichlet Process
Bayesian Nonparametrics: Models Based on the Dirichlet Process
 

Andere mochten auch

Linux troubleshooting tips
Linux troubleshooting tipsLinux troubleshooting tips
Linux troubleshooting tipsBert Van Vreckem
 
Een fileserver opzetten met Samba
Een fileserver opzetten met SambaEen fileserver opzetten met Samba
Een fileserver opzetten met SambaBert Van Vreckem
 
One vagrantfile to rule them all
One vagrantfile to rule them allOne vagrantfile to rule them all
One vagrantfile to rule them allBert Van Vreckem
 
Linux Enterprise - inleiding cursus, 5 trends in systeembeheer
Linux Enterprise - inleiding cursus, 5 trends in systeembeheerLinux Enterprise - inleiding cursus, 5 trends in systeembeheer
Linux Enterprise - inleiding cursus, 5 trends in systeembeheerBert Van Vreckem
 
Gebruikers, groepen en permissies
Gebruikers, groepen en permissiesGebruikers, groepen en permissies
Gebruikers, groepen en permissiesBert Van Vreckem
 
Een literatuurstudie maken: hoe & waarom
Een literatuurstudie maken: hoe & waaromEen literatuurstudie maken: hoe & waarom
Een literatuurstudie maken: hoe & waaromBert Van Vreckem
 

Andere mochten auch (8)

Linux troubleshooting tips
Linux troubleshooting tipsLinux troubleshooting tips
Linux troubleshooting tips
 
Een fileserver opzetten met Samba
Een fileserver opzetten met SambaEen fileserver opzetten met Samba
Een fileserver opzetten met Samba
 
One vagrantfile to rule them all
One vagrantfile to rule them allOne vagrantfile to rule them all
One vagrantfile to rule them all
 
Wachtwoorden in Linux
Wachtwoorden in LinuxWachtwoorden in Linux
Wachtwoorden in Linux
 
Workshop latex
Workshop latexWorkshop latex
Workshop latex
 
Linux Enterprise - inleiding cursus, 5 trends in systeembeheer
Linux Enterprise - inleiding cursus, 5 trends in systeembeheerLinux Enterprise - inleiding cursus, 5 trends in systeembeheer
Linux Enterprise - inleiding cursus, 5 trends in systeembeheer
 
Gebruikers, groepen en permissies
Gebruikers, groepen en permissiesGebruikers, groepen en permissies
Gebruikers, groepen en permissies
 
Een literatuurstudie maken: hoe & waarom
Een literatuurstudie maken: hoe & waaromEen literatuurstudie maken: hoe & waarom
Een literatuurstudie maken: hoe & waarom
 

Ähnlich wie A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems

MOMDPSO_IDETC_2014_Weiyang
MOMDPSO_IDETC_2014_WeiyangMOMDPSO_IDETC_2014_Weiyang
MOMDPSO_IDETC_2014_WeiyangMDO_Lab
 
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAIDeep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAIJack Clark
 
Sparsenet
SparsenetSparsenet
Sparsenetndronen
 
Uncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game TheoryUncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game TheoryRikiya Takahashi
 
Introduction to Max-SAT and Max-SAT Evaluation
Introduction to Max-SAT and Max-SAT EvaluationIntroduction to Max-SAT and Max-SAT Evaluation
Introduction to Max-SAT and Max-SAT EvaluationMasahiro Sakai
 
presentation
presentationpresentation
presentationjie ren
 
Synthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingSynthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingAdam Doyle
 
Cold_Start_Reinforcement_Learning_with_Softmax_Policy_Gradient.pdf
Cold_Start_Reinforcement_Learning_with_Softmax_Policy_Gradient.pdfCold_Start_Reinforcement_Learning_with_Softmax_Policy_Gradient.pdf
Cold_Start_Reinforcement_Learning_with_Softmax_Policy_Gradient.pdfPo-Chuan Chen
 
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...Yuko Kuroki (黒木祐子)
 
Planning for power systems
Planning for power systemsPlanning for power systems
Planning for power systemsOlivier Teytaud
 
Massive Matrix Factorization : Applications to collaborative filtering
Massive Matrix Factorization : Applications to collaborative filteringMassive Matrix Factorization : Applications to collaborative filtering
Massive Matrix Factorization : Applications to collaborative filteringArthur Mensch
 
Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix Factorizationrecsysfr
 
Facebook Talk at Netflix ML Platform meetup Sep 2019
Facebook Talk at Netflix ML Platform meetup Sep 2019Facebook Talk at Netflix ML Platform meetup Sep 2019
Facebook Talk at Netflix ML Platform meetup Sep 2019Faisal Siddiqi
 
Tracking the tracker: Time Series Analysis in Python from First Principles
Tracking the tracker: Time Series Analysis in Python from First PrinciplesTracking the tracker: Time Series Analysis in Python from First Principles
Tracking the tracker: Time Series Analysis in Python from First Principleskenluck2001
 
Sampling-Based Planning Algorithms for Multi-Objective Missions
Sampling-Based Planning Algorithms for Multi-Objective MissionsSampling-Based Planning Algorithms for Multi-Objective Missions
Sampling-Based Planning Algorithms for Multi-Objective MissionsMd Mahbubur Rahman
 

Ähnlich wie A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems (20)

MOMDPSO_IDETC_2014_Weiyang
MOMDPSO_IDETC_2014_WeiyangMOMDPSO_IDETC_2014_Weiyang
MOMDPSO_IDETC_2014_Weiyang
 
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAIDeep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
Deep Reinforcement Learning Through Policy Optimization, John Schulman, OpenAI
 
Sparsenet
SparsenetSparsenet
Sparsenet
 
Uncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game TheoryUncertainty Awareness in Integrating Machine Learning and Game Theory
Uncertainty Awareness in Integrating Machine Learning and Game Theory
 
Introduction to Max-SAT and Max-SAT Evaluation
Introduction to Max-SAT and Max-SAT EvaluationIntroduction to Max-SAT and Max-SAT Evaluation
Introduction to Max-SAT and Max-SAT Evaluation
 
presentation
presentationpresentation
presentation
 
Optimization Using Evolutionary Computing Techniques
Optimization Using Evolutionary Computing Techniques Optimization Using Evolutionary Computing Techniques
Optimization Using Evolutionary Computing Techniques
 
Synthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-makingSynthesis of analytical methods data driven decision-making
Synthesis of analytical methods data driven decision-making
 
Cold_Start_Reinforcement_Learning_with_Softmax_Policy_Gradient.pdf
Cold_Start_Reinforcement_Learning_with_Softmax_Policy_Gradient.pdfCold_Start_Reinforcement_Learning_with_Softmax_Policy_Gradient.pdf
Cold_Start_Reinforcement_Learning_with_Softmax_Policy_Gradient.pdf
 
Ds33717725
Ds33717725Ds33717725
Ds33717725
 
Ds33717725
Ds33717725Ds33717725
Ds33717725
 
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
[AAAI2021] Combinatorial Pure Exploration with Full-bandit or Partial Linear ...
 
Planning for power systems
Planning for power systemsPlanning for power systems
Planning for power systems
 
MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...
MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...
MUMS: Transition & SPUQ Workshop - Practical Bayesian Optimization for Urban ...
 
Massive Matrix Factorization : Applications to collaborative filtering
Massive Matrix Factorization : Applications to collaborative filteringMassive Matrix Factorization : Applications to collaborative filtering
Massive Matrix Factorization : Applications to collaborative filtering
 
Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix Factorization
 
MUMS: Bayesian, Fiducial, and Frequentist Conference - Model Selection in the...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Model Selection in the...MUMS: Bayesian, Fiducial, and Frequentist Conference - Model Selection in the...
MUMS: Bayesian, Fiducial, and Frequentist Conference - Model Selection in the...
 
Facebook Talk at Netflix ML Platform meetup Sep 2019
Facebook Talk at Netflix ML Platform meetup Sep 2019Facebook Talk at Netflix ML Platform meetup Sep 2019
Facebook Talk at Netflix ML Platform meetup Sep 2019
 
Tracking the tracker: Time Series Analysis in Python from First Principles
Tracking the tracker: Time Series Analysis in Python from First PrinciplesTracking the tracker: Time Series Analysis in Python from First Principles
Tracking the tracker: Time Series Analysis in Python from First Principles
 
Sampling-Based Planning Algorithms for Multi-Objective Missions
Sampling-Based Planning Algorithms for Multi-Objective MissionsSampling-Based Planning Algorithms for Multi-Objective Missions
Sampling-Based Planning Algorithms for Multi-Objective Missions
 

A Reinforcement Learning Approach for Hybrid Flexible Flowline Scheduling Problems

  • 2. A Reinforcement Learning Approach to Solving Hybrid Flexible Flowline Scheduling Problems. Bert Van Vreckem, Dmitriy Borodin, Wim De Bruyn, Ann Nowé
  • 3. Authors
      • Bert Van Vreckem, HoGent Business and Information Management, bert.vanvreckem@hogent.be
      • Dmitriy Borodin, OMPartners, dborodin@ompartners.com
      • Wim De Bruyn, HoGent Business and Information Management, wim.debruyn@hogent.be
      • Ann Nowé, Artificial Intelligence Lab, Vrije Universiteit Brussel, ann.nowe@vub.ac.be
      HFFSP MISTA2013: 29 August 2013 3/28
  • 4. Contents
      1. Hybrid Flexible Flowline Scheduling Problems
      2. A Machine Learning Approach
      3. Learning Permutations with Precedence Constraints
      4. Experiments & results
      5. Conclusion
  • 6. Hybrid Flexible Flowline Scheduling Problems
      A powerful model for complex real-life production scheduling problems. In α/β/γ notation (Urlings, 2010):
      $HFFL_m, ((RM^{(i)})_{i=1}^{(m)}) / M_j, rm, prec, S_{iljk}, A_{iljk}, lag / C_{\max}$
      Flowline scheduling problems: jobs are processed in consecutive stages. [Figure: Stage 1 → Stage 2 → Stage 3 → Stage 4]
  • 7. Hybrid Flexible Flowline Scheduling Problems. Hybrid case: unrelated parallel machines. [Figure: machines grouped by stage: M11, M12, M13 | M21, M22 | M31, M32, M33, M34 | M41, M42]
  • 8. Hybrid Flexible Flowline Scheduling Problems. Flexible case: stages may be skipped. [Figure: machines M11, M12, M13 | M21, M22 | M41, M42, with no machine from the third group]
  • 9. Hybrid Flexible Flowline Scheduling Problems. Other constraints: machine eligibility. [Figure: machines M11, M13 | M21, M22 | M31, M33 | M42]
  • 10. Hybrid Flexible Flowline Scheduling Problems. Other constraints: time lag between stages. [Figure: Stage 1, Stage 2, Stage 3, Stage 4]
  • 13. Hybrid Flexible Flowline Scheduling Problems. Other constraints: sequence-dependent setup times. [Figure: Gantt charts over time units 1 to 12 for machines M1 and M2, showing job order J1, J2 and job order J2, J1]
  • 14. Hybrid Flexible Flowline Scheduling Problems. Other constraints: precedence relations between jobs. [Figure: Gantt charts over time units 1 to 12 for machines M1 and M2, showing job order J1, J2 and job order J2, J1]
  • 15. Hybrid Flexible Flowline Scheduling Problems. Precedence relations between jobs make the problem much harder, to the point that the MILP/CPLEX approach no longer works for larger instances (Urlings, 2010).
  • 20. A Machine Learning Approach
      Solving Hybrid Flexible Flowline Scheduling Problems in two stages:
      • Job permutations → Learning Automata
      • Machine assignment → Earliest Preparation Next Stage (EPNS) (Urlings, 2010)
  • 21. Reinforcement learning
      At every discrete time step $t$:
      • The agent perceives the environment state $s(t)$
      • The agent chooses an action $a(t) \in A = \{a_1, \dots, a_n\}$ according to some policy
      • The environment places the agent in a new state $s(t+1)$ and gives a reinforcement $r(t)$
      • Goal: learn a policy that maximizes the long-term cumulative reward $\sum_t r(t)$
      [Figure: agent-environment loop; the environment passes state s and reinforcement r to the agent, the agent passes action a to the environment]
  • 24. Learning Automata (LA)
      Reinforcement learning agents that choose an action according to a probability distribution $p(t) = (p_1(t), \dots, p_n(t))$, with $p_i = \mathrm{Prob}[a(t) = a_i]$ and such that $\sum_{i=1}^{n} p_i = 1$.
      $p_i(0) = \frac{1}{n}$  (1)
      $p_i(t+1) = p_i(t) + \alpha_{rew}\, r(t)\,(1 - p_i(t)) - \alpha_{pen}\,(1 - r(t))\, p_i(t)$  (2)
          if $a_i$ is the action taken at instant $t$
      $p_j(t+1) = p_j(t) - \alpha_{rew}\, r(t)\, p_j(t) + \alpha_{pen}\,(1 - r(t)) \left( \frac{1}{n-1} - p_j(t) \right)$  (3)
          if $a_j \neq a_i$
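
The initialisation and update rules (1) to (3) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `la_init` and `la_update` are hypothetical, and the default step sizes are simply the values quoted on the Experiments slide.

```python
def la_init(n):
    """Rule (1): start from the uniform distribution over n actions."""
    return [1.0 / n] * n


def la_update(p, chosen, r, alpha_rew=0.1, alpha_pen=0.5):
    """Return the distribution after one step of rules (2) and (3).

    p       -- current probabilities, one per action
    chosen  -- index of the action taken at this step
    r       -- reinforcement in [0, 1]
    """
    n = len(p)
    updated = []
    for j, pj in enumerate(p):
        if j == chosen:
            # Rule (2): reinforce or penalise the chosen action.
            pj_next = pj + alpha_rew * r * (1 - pj) - alpha_pen * (1 - r) * pj
        else:
            # Rule (3): the other actions absorb the opposite change.
            pj_next = (pj - alpha_rew * r * pj
                       + alpha_pen * (1 - r) * (1.0 / (n - 1) - pj))
        updated.append(pj_next)
    return updated


if __name__ == "__main__":
    p = la_init(4)
    print(la_update(p, chosen=2, r=1.0))  # chosen action gains probability
    print(la_update(p, chosen=2, r=0.0))  # chosen action loses probability
```

The changes applied to the chosen action and to the other actions cancel out, so the result remains a probability distribution after every step.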
  • 28. Learning Automaton update. [Figure: bar charts of the probabilities $p_i$ (vertical axis from 0 to 1) for actions $i = 1, \dots, 4$: the distribution before the update, and, e.g. when action 3 was chosen, the distribution after an update with $r(t) = 1$ and after an update with $r(t) = 0$]
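
As a worked illustration of the learning automaton update, the numbers below assume a uniform starting distribution over $n = 4$ actions and the step sizes $\alpha_{rew} = 0.1$, $\alpha_{pen} = 0.5$ from the Experiments slide, with action 3 chosen. The starting distribution is an assumption made for the example; the bar heights in the original figure are not recoverable.

```latex
\begin{align*}
\text{Start: } & p_1 = p_2 = p_3 = p_4 = 0.25 \\[4pt]
r(t) = 1: \quad
  & p_3(t+1) = 0.25 + 0.1 \cdot 1 \cdot (1 - 0.25) = 0.325 \\
  & p_j(t+1) = 0.25 - 0.1 \cdot 1 \cdot 0.25 = 0.225 \quad (j \neq 3) \\
  & 0.325 + 3 \cdot 0.225 = 1 \\[4pt]
r(t) = 0: \quad
  & p_3(t+1) = 0.25 - 0.5 \cdot 1 \cdot 0.25 = 0.125 \\
  & p_j(t+1) = 0.25 + 0.5 \cdot 1 \cdot \left( \tfrac{1}{3} - 0.25 \right) \approx 0.2917 \quad (j \neq 3) \\
  & 0.125 + 3 \cdot 0.2917 \approx 1
\end{align*}
```

A reward moves probability mass towards the chosen action, while a penalty spreads it over the other actions.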
  • 36. Probabilistic Basic Simple Strategy (PBSS) (Wauters, 2012)
      • An LA is assigned to every position of a permutation
      • The LAs play a dispersion game so that each chooses a unique action, resulting in a permutation
      • The quality of the solution is evaluated
      • The probabilities are updated according to the LA update rule Linear Reward-Inaction ($\alpha_{pen} = 0$):
          • Better result than the best one so far: $r(t) = 1$
          • If not, $r(t) = 0$
      • Repeat until convergence
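
The PBSS loop can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the method of (Wauters, 2012): the dispersion game is replaced by a sequential draw in which each position samples among the jobs not yet taken, a fixed iteration budget stands in for "repeat until convergence", and the names `sample_permutation`, `reward_inaction`, `pbss` and the toy cost function are all hypothetical.

```python
import random


def sample_permutation(probs, rng):
    """Each position draws a job that is still free (stand-in for the dispersion game)."""
    n = len(probs)
    taken = set()
    perm = []
    for pos in range(n):
        free = [job for job in range(n) if job not in taken]
        weights = [probs[pos][job] for job in free]
        if sum(weights) > 0:
            job = rng.choices(free, weights=weights, k=1)[0]
        else:
            job = rng.choice(free)  # fall back to a uniform draw
        perm.append(job)
        taken.add(job)
    return perm


def reward_inaction(p, chosen, r, alpha_rew):
    """Linear Reward-Inaction: the LA update rule with alpha_pen = 0."""
    return [pj + alpha_rew * r * (1 - pj) if j == chosen else pj - alpha_rew * r * pj
            for j, pj in enumerate(p)]


def pbss(n, cost, iterations=2000, alpha_rew=0.1, seed=0):
    """Search for a low-cost permutation of n jobs with one LA per position."""
    rng = random.Random(seed)
    probs = [[1.0 / n] * n for _ in range(n)]
    best_perm, best_cost = None, float("inf")
    for _ in range(iterations):
        perm = sample_permutation(probs, rng)
        c = cost(perm)
        r = 1.0 if c < best_cost else 0.0  # reward only strict improvements
        if r:
            best_perm, best_cost = perm, c
        for pos, job in enumerate(perm):
            probs[pos] = reward_inaction(probs[pos], job, r, alpha_rew)
    return best_perm, best_cost


if __name__ == "__main__":
    # Toy cost (assumption): total displacement from the identity permutation.
    toy_cost = lambda perm: sum(abs(job - pos) for pos, job in enumerate(perm))
    print(pbss(6, toy_cost))
```

Because $r(t) = 0$ leaves the probabilities untouched under Linear Reward-Inaction, this loop only ever learns from improving permutations.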
  • 40. Probabilistic Basic Simple Strategy (PBSS)
      • PBSS gives great results in several optimization problems that involve learning permutations
      • but it doesn't work well when precedence constraints are involved
      • PBSS only learns from positive experience (i.e. improving on previous solutions)
      • It doesn't learn to avoid invalid permutations
  • 46. Extending PBSS for precedence constraints
      Updating probabilities:
      • If the job permutation is invalid, perform an update with $r(t) = 0$ and $\alpha_{pen} > 0$ for all agents that are involved in the violation of precedence constraints.
      • If the job permutation is valid, perform an $L_{R-I}$ update in all agents, depending on the resulting makespan $ms$ and the best makespan until now $ms_{best}$:
          • improved: $r(t) = 1$;
          • equally good: $r(t) = 1/2$;
          • worse: $r(t) = \frac{ms_{best}}{2\,ms}$;
          • no valid schedule found: $r(t) = 0$.
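
The two cases of the extended update can be sketched as below. The slides do not spell out which agents count as "involved" in a violation, so this sketch assumes they are the positions holding the two jobs of each violated pair; treating the first valid schedule as an improvement is likewise an assumption. The function names are hypothetical.

```python
def violating_positions(perm, precedences):
    """Positions involved in violated precedences.

    precedences -- iterable of pairs (a, b) meaning job a must come before job b
    """
    position_of = {job: pos for pos, job in enumerate(perm)}
    involved = set()
    for a, b in precedences:
        if position_of[a] > position_of[b]:
            # Assumption: both positions of a violated pair are penalised.
            involved.update((position_of[a], position_of[b]))
    return involved


def reward(ms, ms_best):
    """Reinforcement for a valid permutation.

    ms      -- makespan of the new schedule, or None if no valid schedule was found
    ms_best -- best makespan so far, or None if there is none yet
    """
    if ms is None:
        return 0.0                 # no valid schedule found
    if ms_best is None or ms < ms_best:
        return 1.0                 # improved
    if ms == ms_best:
        return 0.5                 # equally good
    return ms_best / (2.0 * ms)    # worse: always below 1/2


if __name__ == "__main__":
    print(violating_positions([1, 0, 2], [(0, 1)]))
    print(reward(120, 100))
```

The reward for a worse schedule approaches 1/2 as $ms$ approaches $ms_{best}$, so the three valid cases form a continuous scale rather than a hard cut-off.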
  • 48. Experiments
      • HFFSP benchmark problems from (Ruiz et al., 2008), available at http://soa.iti.es/problem-instances
      • Problem sets with 5, 7, 9, 11, 13, 15 jobs, 96 instances in each set
      • Plus other constraints that make the problems harder (precedence relations!)
      • $\alpha_{rew} = 0.1$; $\alpha_{pen} = 0.5$ (no tuning)
      • Run until convergence, or for at most 300 seconds
  • 50. Results
      Instance set    5        7        9        11       13       15       overall
      mean RD (%)     0.0697   2.0131   1.1568   1.6565   3.7294   7.9189   2.7484
      best RD (%)     -35.70   -24.71   -26.92   -21.10   -43.34   -10.46   -43.34
      # improved      11       12       18       12       9        6        68
      # equal         62       40       19       18       8        7        154
      # worse         23       44       59       66       79       82       354
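
The slides do not define RD. Assuming it denotes the relative deviation (in percent) of the obtained makespan from a reference makespan for the same instance, so that negative values are improvements, the rows of the results table could be computed along these lines. The function names and the sample numbers are hypothetical and are not taken from the experiments.

```python
def relative_deviation(ms, ms_ref):
    """Relative deviation in percent (assumed definition); negative means better than the reference."""
    return 100.0 * (ms - ms_ref) / ms_ref


def summarise(results):
    """Aggregate (makespan, reference makespan) pairs into the table's rows."""
    rds = [relative_deviation(ms, ms_ref) for ms, ms_ref in results]
    return {
        "mean_rd": sum(rds) / len(rds),
        "best_rd": min(rds),
        "improved": sum(1 for rd in rds if rd < 0),
        "equal": sum(1 for rd in rds if rd == 0),
        "worse": sum(1 for rd in rds if rd > 0),
    }


if __name__ == "__main__":
    # Made-up makespans for three instances, purely for illustration.
    print(summarise([(90, 100), (100, 100), (120, 100)]))
```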
  • 53. Results and Discussion
      Contributions:
      • Extension of PBSS for learning permutations with precedence constraints
      • A simple model plus an RL approach can yield good quality results for challenging HFFSP instances
      Discussion & future work:
      • Precedence relations do make the problem harder
      • Parameter tuning
      • Convergence
      • Larger instances (50, 100 jobs)
      • Explore possibilities for improvement in machine assignment