• PS: This file is for reference only. Do not
depend solely on it for the content. It is meant to
supplement your textbook. Go through the suggested
readings/textbook for detailed
knowledge of the content.
1. Introduction
Definition
• In 1959, Arthur Samuel, a pioneer in the field
of machine learning (ML) defined it as the
“field of study that gives computers the
ability to learn without being explicitly
programmed”
Definition
“A computer program is said to learn from experience
with respect to some class of tasks and performance
measure, if the performance at the tasks, as measured by
the performance measure, improves with experience”
Features of a well-defined learning problem:
• The learning task
• The measure of performance
• The task experience
• Types of learning tasks
What is the Learning Problem?
• Learning = Improving with experience at some
task
• Improve over task T ,
• with respect to performance measure P ,
• based on experience E.
What is the Learning Problem?
• E.g., Learn to play checkers
T : Play checkers
P : % of games won in world tournament
E: opportunity to play against self
Learning to Play Checkers
• E.g., Learn to play checkers
T : Play checkers
P : % of games won in world tournament
• What experience?
• What exactly should be learned?
• How shall it be represented?
• What specific algorithm should learn it?
Designing a Learning System
• Consider designing a program to learn to play
checkers, with the goal of entering it in the world
checkers tournament
• Performance measure: the percentage of games it
wins in this tournament.
• Requires the following design steps
– Choosing Training Experience
– Choosing the Target Function
– Choosing the Representation of the Target Function
– Choosing the Function Approximation Algorithm
Choosing the Training Experience
1. What training experience should the system have?
– A design choice with great impact on the outcome.
2. What amount of interaction should there be
between the system and the supervisor?
3. Which training examples?
Choosing the Training Experience
1. What training experience should the system have?
– A design choice with great impact on the outcome.
• Will the training experience provide direct or indirect
feedback?
– Direct feedback: the system learns from examples of individual checkers
board states, each paired with the correct move.
– Indirect feedback: a set of recorded games, where the correctness
of the moves is inferred from the result of the game.
• Credit assignment problem: the value of early states must be inferred from
the final outcome.
• Direct feedback is easier to learn from.
Choosing the Training Experience
2. What amount of interaction should there be between the
system and the supervisor?
– Choice #1: No freedom. Supervisor provides all training
examples.
– Choice #2: Semi-free. Supervisor provides training
examples, system constructs its own examples too, and
asks questions to the supervisor in cases of doubt.
– Choice #3: Total freedom. System learns to play
completely unsupervised.
• How “daring” should the system be in exploring new boards?
Choosing the Training Experience
3. Which training examples?
– There is a huge number of possible games.
– No time to try all possible games.
– System should learn with examples that it will
encounter in the future.
– For example, if the goal is to beat humans, it
should be able to do well in situations that
humans encounter when they play (this is hard to
achieve in practice).
Choosing the Training Experience
– If training the checkers program consists only of
experiences played against itself, it may never encounter
crucial board states that are likely to be played by the
human checkers champion
– Most theory of machine learning rests on the assumption
that the distribution of training examples is identical to the
distribution of test examples
Partial Design of Checkers Learning
Program
• A checkers learning problem:
– Task T: playing checkers
– Performance measure P: percent of games won in the
world tournament
– Training experience E: games played against itself
• Remaining choices
– The exact type of knowledge to be learned
– A representation for this target knowledge
– A learning mechanism
Choosing the Target Function
What should be learned exactly?
• The computer program knows the legal moves; it
needs to learn how to choose the best move from among them.
• The computer should learn a ‘hidden’ function.
– Target function: ChooseMove : B → M
– B: the set of legal board states; M: the set of legal moves
• ChooseMove is difficult to learn given only indirect training experience
Choosing the Target Function
• So, our Alternative target function
– An evaluation function that assigns a numerical score to any given
board state
– V : B → ℝ (where ℝ is the set of real numbers)
• V(b) for an arbitrary board state b in B
– if b is a final board state that is won, then V(b) = 100
– if b is a final board state that is lost, then V(b) = -100
– if b is a final board state that is drawn, then V(b) = 0
– if b is not a final state, then V(b) = V(b'), where b' is the
best final board state that can be achieved starting from b
and playing optimally until the end of the game
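To make the recursive definition concrete, here is a minimal Python sketch on a made-up toy game tree rather than real checkers. The tree, the state names, and the assumption that the opponent also plays optimally (a minimax search) are illustrative choices, not part of the slides. It also shows why the definition is expensive: every line of play is searched to the end of the game.

```python
# Hypothetical toy game: non-final states map to their successor states,
# final states map to an outcome from black's point of view.
WON, LOST, DRAWN = 100, -100, 0

def ideal_v(state, tree, outcomes, black_to_move=True):
    """Return V(state) by exhaustive search to the end of the game."""
    if state in outcomes:                      # final board state
        return outcomes[state]
    values = [ideal_v(s, tree, outcomes, not black_to_move)
              for s in tree[state]]
    # Assumed optimal play: black maximises the value, red minimises it.
    return max(values) if black_to_move else min(values)

tree = {"start": ["a", "b"], "a": ["a1", "a2"], "b": ["b1", "b2"]}
outcomes = {"a1": WON, "a2": LOST, "b1": DRAWN, "b2": WON}
print(ideal_v("start", tree, outcomes))  # prints 0: the best black can force is a draw
```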
Choosing the Target Function
• V(b) gives a recursive definition for board state b
– Not usable in practice because it is not efficient to compute, except in the first
three trivial cases
– It is a nonoperational definition
• Goal of learning is to discover an operational
description of V
• Learning the target function is often called function
approximation
– The function actually learned is referred to as V̂
Choosing a Representation for the Target
Function
• The choice of representation involves trade-offs
– A very expressive representation allows a close approximation to
the ideal target function V
– The more expressive the representation, the more training data is required to choose among
alternative hypotheses
• Use linear combination of the following board features:
– x1: the number of black pieces on the board
– x2: the number of red pieces on the board
– x3: the number of black kings on the board
– x4: the number of red kings on the board
– x5: the number of black pieces threatened by red (i.e. which can be
captured on red's next turn)
– x6: the number of red pieces threatened by black
$$\hat{V}(b) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + w_6 x_6$$
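A minimal Python sketch of this linear representation follows. It assumes the features are passed as a list [x1, ..., x6] and the weights as [w0, ..., w6]. The weight values are invented for illustration and do not come from the slides.

```python
def v_hat(weights, features):
    """Linear evaluation: w0 + w1*x1 + ... + w6*x6."""
    w0, rest = weights[0], weights[1:]
    return w0 + sum(w * x for w, x in zip(rest, features))

# Illustrative weights only (hypothetical values): w0..w6.
weights = [0.0, 1.0, -1.0, 2.0, -2.0, -0.5, 0.5]
# x1=3 black pieces, x2=0 red pieces, x3=1 black king, x4..x6 = 0.
features = [3, 0, 1, 0, 0, 0]
print(v_hat(weights, features))  # 0 + 3*1.0 + 1*2.0 = 5.0
```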
Partial Design of Checkers Learning
Program
• A checkers learning problem:
– Task T: playing checkers
– Performance measure P: percent of games won in the
world tournament
– Training experience E: games played against itself
– Target function: V : Board → ℝ
– Target function representation:
$$\hat{V}(b) = w_0 + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + w_6 x_6$$
Choosing a Function Approximation
Algorithm
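• To learn V̂ we require a set of training
examples, each describing a board state b and the
training value Vtrain(b) for b
– Each training example is an ordered pair $\langle b, V_{train}(b) \rangle$
– Example: $\langle \langle x_1 = 3, x_2 = 0, x_3 = 1, x_4 = 0, x_5 = 0, x_6 = 0 \rangle, +100 \rangle$ (black has won, since x2 = 0 means red has no pieces left)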
Choosing a Function Approximation
Algorithm
• Need a procedure that first derives such training
examples from the indirect training experience, then
adjusts the weights wi to best fit these training
examples.
Estimating Training Values
• Need to assign specific scores to intermediate
board states
• Approximate the training value of an intermediate board state b using
the learner's current approximation V̂ for the
next board state following b
– Simple and successful approach
– More accurate for states closer to end states
$$V_{train}(b) \leftarrow \hat{V}(\mathrm{Successor}(b))$$
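As a hedged sketch of this estimation rule, the Python below turns one recorded game into training pairs. It assumes the game is given as a list of feature vectors for the boards at which it is the learner's turn to move (in Mitchell's Chapter 1, Successor(b) is the next board at which it is again the program's turn), and that the final board receives the known game outcome. The function names and weight values are hypothetical.

```python
def v_hat(weights, features):
    """Linear evaluation w0 + w1*x1 + ... + w6*x6."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def training_examples(trace, final_value, weights):
    """Pair each board in the trace with an estimated training value."""
    examples = []
    for i, features in enumerate(trace):
        if i + 1 < len(trace):
            target = v_hat(weights, trace[i + 1])   # V_hat(Successor(b))
        else:
            target = final_value                    # final state: known outcome
        examples.append((features, target))
    return examples

weights = [0.0, 1.0, -1.0, 2.0, -2.0, -0.5, 0.5]    # illustrative values
trace = [[3, 1, 0, 0, 1, 0], [3, 0, 1, 0, 0, 0]]    # made-up two-board game
examples = training_examples(trace, final_value=100, weights=weights)
```

Here the first board is assigned the current estimate for the board that follows it, and the last board is assigned +100 because black has won.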
Adjusting the Weights
• Choose the weights wi to best fit the set of training examples
• Minimize the squared error E between the training values and
the values predicted by the hypothesis V̂
• Require an algorithm that
– will incrementally refine weights as new training examples become
available
– will be robust to errors in these estimated training values
• Least Mean Squares (LMS) is one such algorithm
$$E \equiv \sum_{\langle b, V_{train}(b) \rangle \in \text{training examples}} \left( V_{train}(b) - \hat{V}(b) \right)^2$$
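The squared error can be computed directly from its definition. The sketch below assumes training examples are stored as (features, training value) pairs and uses invented weights and examples for illustration.

```python
def v_hat(weights, features):
    """Linear evaluation w0 + w1*x1 + ... + w6*x6."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def squared_error(examples, weights):
    """E = sum over training examples of (V_train(b) - V_hat(b))^2."""
    return sum((v_train - v_hat(weights, x)) ** 2 for x, v_train in examples)

weights = [0.0, 1.0, -1.0, 2.0, -2.0, -0.5, 0.5]          # illustrative values
examples = [([3, 0, 1, 0, 0, 0], 100), ([3, 1, 0, 0, 1, 0], 5.0)]
print(squared_error(examples, weights))  # 95**2 + 3.5**2 = 9037.25
```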
LMS Weight Update Rule
• For each training example $\langle b, V_{train}(b) \rangle$
– Use the current weights to calculate $\hat{V}(b)$
– For each weight $w_i$, update it as
$$w_i \leftarrow w_i + \eta \left( V_{train}(b) - \hat{V}(b) \right) x_i$$
– where $\eta$ is a small constant (e.g. 0.1)
Summary of Design Choices
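One way to see how the design choices fit together is a short end-to-end sketch: a linear V̂, training values taken from the successor board, and one LMS update per board. This is an illustrative sketch under stated assumptions, not the program from the textbook. The game trace is made up, the function names are hypothetical, and the constant feature x0 = 1 is an assumption introduced so that w0 is updated by the same rule.

```python
def v_hat(weights, features):
    """Linear evaluation w0 + w1*x1 + ... + w6*x6."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

def lms_update(weights, features, v_train, eta=0.1):
    """One LMS step: w_i <- w_i + eta * (V_train(b) - V_hat(b)) * x_i."""
    error = v_train - v_hat(weights, features)
    xs = [1] + list(features)        # assumption: x0 = 1, so w0 is updated too
    return [w + eta * error * x for w, x in zip(weights, xs)]

def train_on_game(weights, trace, final_value, eta=0.1):
    """trace: feature vectors of the boards seen in one self-play game."""
    for i, features in enumerate(trace):
        if i + 1 < len(trace):
            v_train = v_hat(weights, trace[i + 1])   # V_hat(Successor(b))
        else:
            v_train = final_value                    # outcome of the game
        weights = lms_update(weights, features, v_train, eta)
    return weights

# Hypothetical two-board trace ending in a win for black (+100).
trace = [[3, 1, 0, 0, 1, 0], [3, 0, 1, 0, 0, 0]]
weights = train_on_game([0.0] * 7, trace, final_value=100)
```

Starting from all-zero weights, the first board produces no change (its estimated training value is also zero), while the final board pulls the weights of its non-zero features toward the +100 outcome. Weights whose feature is zero on a board are left unchanged by that update.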
Suggested Readings
• “Machine Learning” by Tom Mitchell, McGraw
Hill Publisher, Chapter 1