Machine Learning and ILP for Multi-Agent Systems
Daniel Kudenko & Dimitar Kazakov
Department of Computer Science, University of York, UK
ACAI-01, Prague, July 2001
Why Learning Agents?
A Brief History
Machine Learning and Agents: Disembodied ML → Single-Agent Learning within a Single-Agent System → Multiple Single-Agent Learners within a Multiple Single-Agent System → Social Multi-Agent Learners within a Social Multi-Agent System
Outline
What is Machine Learning?
Types of Learning
Inductive Learning
Inductive Learning
Examples of category C1, examples of category C2, ..., examples of category Cn → Inductive Learning System → Hypothesis (procedure to classify new examples)
Inductive Learning Example
Training examples:
  Ammo: low,  Monster: near, Light: good   → Category: shoot
  Ammo: low,  Monster: far,  Light: medium → Category: ¬shoot
  Ammo: high, Monster: far,  Light: good   → Category: shoot
Inductive Learning System output: If (Ammo = high) and (Light ∈ {medium, good}) then shoot; ...
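A hedged Prolog rendering of this slide (the predicate names and argument order are assumptions; only the attribute values and the rule come from the slide):

% The three training examples: example(Ammo, Monster, Light, Category).
example(low,  near, good,   shoot).
example(high, far,  good,   shoot).
example(low,  far,  medium, dont_shoot).

% The hypothesis shown above, written as a clause:
% if (Ammo = high) and (Light ∈ {medium, good}) then shoot.
shoot(high, _Monster, Light) :- member(Light, [medium, good]).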
Performance Measure
Where’s the knowledge?
Example Language
Hypothesis Language
Learning bias
Inductive Learning
Inductive Learning for Agents
Batch vs. Incremental Learning
Batch Learning for Agents
Eager vs. Lazy Learning
Active Learning
Black-Box vs. White-Box
Reinforcement Learning
Q Learning
Value of a state: discounted cumulative reward
V^π(s_t) = Σ_{i≥0} γ^i r(s_{t+i}, a_{t+i})
0 ≤ γ < 1 is a discount factor (γ = 0 means that only immediate reward is considered).
r(s_{t+i}, a_{t+i}) is the reward determined by performing the actions specified by policy π.
Q(s,a) = r(s,a) + γ V*(δ(s,a))
Optimal policy: π*(s) = argmax_a Q(s,a)
Q Learning
Initialize all Q(s,a) to 0.
In some state s choose some action a. Let s' be the resulting state.
Update Q: Q(s,a) = r + γ max_{a'} Q(s',a')
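A minimal runnable sketch of this tabular update in SWI-Prolog; the states, actions, reward and discount factor below are assumed toy values, not taken from the slides:

% Q-table stored as dynamic facts q(State, Action, Value).
:- dynamic q/3.
q(s1, a1, 0).
q(s2, a1, 2).
q(s2, a2, 5).

% update_q(S, A, R, SNext, Gamma): Q(S,A) := R + Gamma * max over a' of Q(SNext, a')
update_q(S, A, R, SNext, Gamma) :-
    findall(Q, q(SNext, _, Q), Qs),
    max_list(Qs, MaxQ),
    NewQ is R + Gamma * MaxQ,
    retractall(q(S, A, _)),
    assertz(q(S, A, NewQ)).

% Example query: ?- update_q(s1, a1, 1, s2, 0.9), q(s1, a1, Q).
% gives Q = 5.5, i.e. 1 + 0.9 * max(2, 5).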
Q Learning
Pros and Cons of RL
Combination of IL and RL
Unsupervised Learning
Learning and Verification
Learning and Verification [Gordon ’00]
Learning and Verification
Learning in Multi-Agent Systems
Types of Multi-Agent Learning [Weiss & Dillenbourg 99]
Social Awareness
Levels of Social Awareness [Vidal & Durfee 97]
Social Awareness and Q Learning
Agent Models and Q Learning
Agent Models and Q Learning
Q Learning and Communication [Tan 93]
Role Learning
Q Learning of Roles
Q Learning of Roles [Balch 99]
Distributed Learning
Distributed Data Mining
Bibliography
[Mitchell 97] T. Mitchell. Machine Learning. McGraw Hill, 1997.
[Michalski et al. 98] R.S. Michalski, I. Bratko, M. Kubat. Machine Learning and Data Mining: Methods and Applications. Wiley, 1998.
[Dietterich & Flann 95] T. Dietterich and N. Flann. Explanation-based Learning and Reinforcement Learning. In Proceedings of the Twelfth International Conference on Machine Learning, 1995.
[Dzeroski et al. 98] S. Dzeroski, L. De Raedt, and H. Blockeel. Relational Reinforcement Learning. In Proceedings of the Eighth International Conference on Inductive Logic Programming (ILP-98). Springer, 1998.
[Gordon 00] D. Gordon. Asimovian Adaptive Agents. Journal of Artificial Intelligence Research, 13, 2000.
[Weiss & Dillenbourg 99] G. Weiss and P. Dillenbourg. What is ‘Multi’ in Multi-Agent Learning? In P. Dillenbourg (ed.), Collaborative Learning: Cognitive and Computational Approaches. Pergamon Press, 1999.
[Vidal & Durfee 97] J.M. Vidal and E. Durfee. Agents Learning about Agents: A Framework and Analysis. In Working Notes of the AAAI-97 Workshop on Multiagent Learning, 1997.
[Mundhe & Sen 00] M. Mundhe and S. Sen. Evaluating Concurrent Reinforcement Learners. In Proceedings of the Fourth International Conference on Multiagent Systems. IEEE Press, 2000.
[Claus & Boutilier 98] C. Claus and C. Boutilier. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems. AAAI-98, 1998.
[Lauer & Riedmiller 00] M. Lauer and M. Riedmiller. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems. In Proceedings of the Seventeenth International Conference on Machine Learning, 2000.
Bibliography
[Tan 93] M. Tan. Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents. In Proceedings of the Tenth International Conference on Machine Learning, 1993.
[Prasad et al. 96] M.V.N. Prasad, S.E. Lander and V.R. Lesser. Learning Organizational Roles for Negotiated Search. International Journal of Human-Computer Studies, 48(1), 1996.
[Ono & Fukomoto 96] N. Ono and K. Fukomoto. A Modular Approach to Multi-Agent Reinforcement Learning. In Proceedings of the First International Conference on Multi-Agent Systems, 1996.
[Crites & Barto 98] R. Crites and A. Barto. Elevator Group Control Using Multiple Reinforcement Learning Agents. Machine Learning, 1998.
[Balch 99] T. Balch. Reward and Diversity in Multi-Robot Foraging. In Proceedings of the IJCAI-99 Workshop on Agents Learning About, From, and With other Agents, 1999.
[Provost & Kolluri 99] F. Provost and V. Kolluri. A Survey of Methods for Scaling Up Inductive Algorithms. Data Mining and Knowledge Discovery, 3, 1999.
[Provost & Hennessy 96] F. Provost and D. Hennessy. Scaling Up: Distributed Machine Learning with Cooperation. AAAI-96, 1996.
B   R   E   A   K
Machine Learning and ILP for MAS: Part II
Machine Learning and ILP for MAS: Part II
From Machine Learning to Learning Agents
Classic Machine Learning → Active Learning → Closed Loop Machine Learning → Learning as one of many goals: Learning Agent(s)
Integrating Machine Learning into the Agent Architecture
Time Constraints on Learning
Doing Eager vs. Lazy Learning under Time Pressure
“Clear-cut” vs. Any-time Learning
Time Constraints on Learning in Simulated Environments
Synchronisation and Time Constraints
- Multi-agent Progol (Muggleton): asynchronous
- The York MA Environment (Kazakov et al.): 1-move-per-round, immediate update
- Logic-based MAS for conflict simulations (Kudenko, Alonso): 1-move-per-round, batch update
Time constraints range from real time, through an upper bound per move, to unlimited time.
Learning and Recall
Learning and Recall (2)
Cycle: update sensory information → recall the current model of the world to choose and carry out an action → observe the action outcome → learn a new model of the world.
Learning and Recall (3)
Cycle: update sensory information → recall the current model of the world to choose and carry out an action → learn a new model of the world.
Learning and Recall (4)
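A minimal sketch of the sense-recall-act-learn cycle from the preceding slides; every predicate name and the trivial stubs below are assumptions made purely for illustration:

sense(percept(light, good)).              % update sensory information
recall(percept(_, good), shoot).          % recall the current model to choose an action
act(shoot, hit).                          % carry out the action and observe the outcome
learn(P, A, O) :-                         % learn a new model of the world
    format("update model with ~w, ~w -> ~w~n", [P, A, O]).

agent_step :-
    sense(Percept),
    recall(Percept, Action),
    act(Action, Outcome),
    learn(Percept, Action, Outcome).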
Machine Learning and ILP for MAS: Part II
Machine Learning Revisited
Object and Concept Language
(figure: positive (+) and negative (-) examples in the instance space)
Machine Learning Biases
Preference Bias, Search Bias & Version Space
(figure: positive and negative examples with the version space bounded by the most specific and the most general concept)
Inductive Logic Programming
LP as ILP Object Language
ILP Object Language Example
Good bargain cars:
  Model      Mileage   Price   y/n   ILP representation
  Audi V8    30,000    £4000   +     gbc(v8,30000,4000).
  Fiat Uno   90,000    £3000   -     :- gbc(uno,90000,3000).
  BMW Z3     50,000    £5000   +     gbc(z3,50000,5000).
LP as ILP Concept Language
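As a hedged illustration only (this clause is not from the slides), one Horn-clause hypothesis consistent with the good bargain car data above is:

% Covers the BMW Z3 and Audi V8 examples and excludes the Fiat Uno.
gbc(_Model, Mileage, _Price) :- Mileage =< 50000.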
Modes in ILP
Modes in ILP
Modes in ILP
Modes in ILP
Types in ILP
ILP Types and Modes: Example
Good bargain cars (ILP representation: Progol):
  Model      Mileage   Price   y/n   ILP representation
  Audi V8    30,000    4000    +     gbc(v8,30000,4000).
  Fiat Uno   90,000    3000    -     :- gbc(uno,90000,3000).
  BMW Z3     50,000    5000    +     gbc(z3,50000,5000).
Mode declaration: modeh(1,gbc(+model,+mileage,+price))?
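A hedged sketch of how the remainder of a Progol input for this example might look; the modeh declaration is the one on the slide, while the type definitions, the modeb declaration and the background predicate leq/2 are assumptions:

% Types as monadic background predicates.
model(z3).      model(v8).      model(uno).
mileage(30000). mileage(50000). mileage(90000).
price(3000).    price(4000).    price(5000).

% Mode declarations, following the form shown on the slide.
modeh(1,gbc(+model,+mileage,+price))?
modeb(1,leq(+mileage,#mileage))?

% Background predicate made available to the learner through modeb.
leq(X, Y) :- X =< Y.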
Positive Only Learning
Background Knowledge
Background Knowledge (2)
Choice of Background Knowledge
ILP Preference Bias
Induction in ILP
Example of Induction
BK: q(b). q(c).
Training examples: p(b,a). p(f,g). (positive)   :- p(i,j). (negative)
Candidate hypotheses: p(X,Y).   p(X,a).   p(X,Y) :- q(X).   p(b,a) :- q(b).
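The coverage of one candidate hypothesis can be checked directly; the sketch below is runnable in SWI-Prolog and uses only the background knowledge and examples from the slide:

% Background knowledge.
q(b).
q(c).

% Candidate hypothesis: p(X,Y) :- q(X).
p(X, _Y) :- q(X).

% ?- p(b,a).   succeeds: the positive example p(b,a) is covered.
% ?- p(f,g).   fails: the positive example p(f,g) is not covered.
% ?- p(i,j).   fails: the negative example :- p(i,j) is correctly rejected.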
Induction in Progol
Most general clause: T = p(X,Y)
Most specific (bottom) clause: ⊥ = p(X,a) :- q(X).
Clauses in between: p(X,a).   p(X,Y) :- q(X).
Summary of ILP Basics
Learning Pure Logic Programs vs. Decision Lists
Decision List Example
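The slide's own example is not preserved in this transcript; as a hedged substitute, a first-order decision list can be written as ordered Prolog clauses with cuts, the last clause acting as the default (the predicate name, domain and thresholds below are assumptions, reusing the good bargain car data):

% Rules are tried in order; the first whose condition holds decides the class.
bargain(_Model, Mileage, _Price, no)  :- Mileage > 80000, !.
bargain(_Model, _Mileage, Price, yes) :- Price =< 5000, !.
bargain(_Model, _Mileage, _Price, no).     % default rule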
Updating Decision Lists with Exceptions
Updating Decision Lists with Exceptions
Replacing Exceptions with Rules: Before
Replacing Exceptions with Rules: After
Eager ILP vs. Analogical Prediction
Analogical Prediction Example
Analogical Prediction Example
Timing Analysis of Theories Learned with ILP
Timing Analysis of ILP Theories: Example
Machine Learning and ILP for MAS: Part II
Agent Applications of ILP
Agent Applications of ILP
Agent Applications of ILP
Agent Applications of ILP
The York MA Environment
The York MA Environment
The York MA Environment
Machine Learning and ILP for MAS: Part II
Learning and Natural Selection
Darwinian vs. Lamarckian Evolution
Darwinian vs. Lamarckian Evolution (2)
Learning and Language
Communication and Learning
Communication and Learning (2)
Communication and Learning (3)
Our Current Research
The End
