Machine Learning
Ref: Machine Learning By Tom Mitchell
Figure source: https://www.ibm.com/blogs/systems/ai-machine-learning-and-deep-learning-whats-the-difference/
AI is anything capable of mimicking human behavior.
Machine learning algorithms apply statistical methods to identify patterns in past behavior and make decisions.
DL techniques can adapt on their own, uncovering features in data that we never specifically programmed them to find; in this sense, we say they learn on their own.
Machine Learning - Andrew Ng
“If a typical person can do a mental task with less than one second of thought, we
can probably automate it using AI either now or in the near future.”
Learning - Definition
A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as measured by P, improves with experience
E.
To have a well-defined learning problem
● Three features must be identified
○ the class of tasks, the measure of performance to be improved, and the source of experience
A checkers learning problem
Objective : Designing a program to learn to play Checkers
Checkers
Video : https://youtu.be/ScKIdStgAfU
How to : https://www.wikihow.com/Play-Checkers
A checkers learning problem
● Task T: playing checkers
● Performance measure P: percent of games won against opponents
● Training experience E: playing practice games against itself
A handwriting recognition learning problem
A handwriting recognition learning problem
● Task T: recognizing and classifying handwritten words within images
● Performance measure P: percent of words correctly classified
● Training experience E: a database of handwritten words with given classifications
A robot driving learning problem
A robot driving learning problem
● Task T: driving on public four-lane highways using vision sensors
● Performance measure P: average distance traveled before an error (as judged by human overseer)
● Training experience E: a sequence of images and steering commands recorded while observing a
human driver
Designing a Learning System
● Choosing the Training Experience
● Choosing the Target Function
● Choosing a Representation for the Target function
● Choosing a Function Approximation Algorithm
○ Estimating Training Values
○ Adjusting Weights
● The Final Design
Choosing the Training Experience
First Design Choice
Choose the type of training experience from which our system will learn
The type of training experience has a significant impact on the success or failure of the learner
Training Experience - Attributes 1
Type of Training Data
● Direct
○ individual checkers board states, each paired with the correct move
● Indirect
○ move sequences and the final outcomes of the various games played
○ the correctness of specific moves early in the game must be inferred indirectly from whether the game was eventually won or lost
○ requires credit assignment - determining the degree to which each move in the sequence deserves credit or blame for the final outcome
○ credit assignment is a difficult problem: the game can be lost even when early moves are optimal, if they are followed later by poor moves
Training Experience - Attributes 2
The degree to which the learner controls the sequence of training examples
● The teacher may select informative board states and provide the correct move for each
● The learner may propose board states it finds confusing and ask the teacher for the correct move
● The learner may have complete control
○ e.g., when it learns by playing against itself with no teacher - it may choose between experimenting with novel board states and honing its skill by playing minor variations of promising lines of play
Training Experience - Attributes 3
How well E represents the distribution of examples over which the final performance P must be measured
● learning is most reliable when the training examples follow a distribution similar to that of future test
examples.
Attributes 3 - checkers learning scenario
● The performance metric P is the percent of games the system wins in the world tournament.
● If its training experience E consists only of games played against itself, there is an obvious danger that
this training experience might not be fully representative of the distribution of situations over which
it will later be tested.
● For example, the learner might never encounter certain crucial board states that are very likely to be
played by the human checkers champion.
Attributes 3 (Contd…)
In practice, it is often necessary to learn from a distribution of examples that is somewhat different from
those on which the final system will be evaluated.
Such situations are problematic because mastery of one distribution of examples will not necessarily lead
to strong performance over some other distribution.
Design of Learning System
Needs to choose
● the exact type of knowledge to be learned
● a representation for this target knowledge
● a learning mechanism
Choosing the Target Function
Determine what type of knowledge will be learned
Assume a checkers-playing program
● Can generate the legal moves from any board state
● Need to learn how to choose the best move from these legal moves
● This learning task is representative of a large class of tasks for which the legal moves that define
some large search space are known a priori, but for which the best search strategy is not known.
Choosing the Target Function contd..
The type of information to be learned is a
program that chooses the best move for any
given board state
● ChooseMove : B → M
● where B is the set of legal board states
● M is the set of legal moves
Very difficult to learn from indirect training experience
The problem of improving P at task T
Reduces to
Learning a Target Function such as ChooseMove
CHOICE OF THE TARGET FUNCTION WILL BE THE
KEY
Alternate Target Function
An evaluation function that assigns a numerical
score to any given board state
Should assign higher score to better board states
V : B → ℛ
● where B is the set of legal board states
● ℛ denotes the set of real numbers
Alternate Target Function (Contd…)
If system can learn V
● It can select the best move from any current board position
○ Generate the successor board state for every legal move
○ Use V to choose the best successor
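The two-step selection procedure above can be sketched as follows; `legal_moves`, `apply_move`, and the evaluation function passed as `V` are hypothetical placeholders for a real checkers implementation:

```python
def choose_move(board, legal_moves, apply_move, V):
    """Pick the legal move whose successor board state V scores highest:
    generate the successor state for every legal move, then use V to
    choose the best successor."""
    return max(legal_moves(board),
               key=lambda move: V(apply_move(board, move)))
```

Any learned evaluation function with the signature V : board → real number can be plugged in unchanged, which is what makes V a more convenient learning target than ChooseMove itself.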
Possible Definition for Target function
● if b is a final board state that is won, then V(b) = 100
● If b is a final board state that is lost, then V(b) = -100
● if b is a final board state that is drawn, then V(b) = 0
● if b is not a final state in the game, then V(b) = V(b'), where b' is the best final board state that can
be achieved starting from b and playing optimally until the end of the game
This definition cannot be efficiently computed - it is a NON-OPERATIONAL definition
The goal of learning is to discover an operational description of V - one that can evaluate and select moves within realistic
time bounds
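To see why this definition is non-operational, note that computing V(b) for a non-final state amounts to a full minimax search to the end of the game - intractable for real checkers. A toy sketch (the `is_final`, `outcome`, and `successors` helpers are hypothetical placeholders, not from the slides):

```python
def V(b, is_final, outcome, successors, maximizing=True):
    """Value of board b under optimal play to the end of the game.

    For a final board: +100 won, -100 lost, 0 drawn.
    Otherwise: recurse over every successor, alternating sides -
    the cost of this recursion is what makes the definition non-operational.
    """
    if is_final(b):
        return {"won": 100, "lost": -100, "drawn": 0}[outcome(b)]
    values = (V(s, is_final, outcome, successors, not maximizing)
              for s in successors(b))
    return max(values) if maximizing else min(values)
```

On a toy two-ply game tree this terminates instantly; on checkers the recursion would have to expand an astronomically large game tree, which is exactly why an operational approximation must be learned instead.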
Approximation to Target Function
The problem of improving P at task T
Reduces to
Learning a Target Function such as ChooseMove
Reduces to
Learning an operational description of the ideal target function V
It is difficult to learn an operational form of V perfectly; in practice we can only acquire an approximation.
This process of learning the target function only approximately is called Function Approximation.
The function that is actually learned by our program is denoted V̂, to distinguish it from the ideal target function V.
Choosing a Representation for the Target
Function
Represent as
● A large table with all board states and a value for each board state
● A collection of rules that match against features of the board state
● A quadratic polynomial of predefined board features
● An ANN
A Simple Representation of V̂
Represent V̂ as a linear combination of six board features:
V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
● x1: the number of black pieces on the board
● x2: the number of red pieces on the board
● x3: the number of black kings on the board
● x4: the number of red kings on the board
● x5: the number of black pieces threatened by red (i.e., which can be captured on red's next turn)
● x6: the number of red pieces threatened by black
where w0 through w6 are numerical coefficients, or weights, to be chosen by the learning algorithm
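Under this linear representation, evaluating the learned function on a board reduces to a weighted sum of the six features. A minimal sketch (extracting x1…x6 from an actual board is assumed to happen elsewhere):

```python
def v_hat(weights, features):
    """Linear evaluation: w0 + w1*x1 + ... + w6*x6.

    weights  = [w0, w1, ..., w6]  (7 coefficients, w0 is the bias term)
    features = [x1, ..., x6]      (the six board features defined above)
    """
    w0, rest = weights[0], weights[1:]
    return w0 + sum(w * x for w, x in zip(rest, features))
```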
Partial Design - Checker’s Learning
Program
● Task T: playing checkers
● Performance measure P: percent of games won in the world tournament (specification of the learning task)
● Training experience E: games played against itself
● Target function: V : Board → ℛ (design choices for the implementation of the learning system)
● Target function representation: a linear combination of board features with weights w0 through w6
Partial Design (Contd…)
The net effect of this set of design choices is to reduce
The problem of learning a checkers strategy
to
The problem of learning values for the coefficients w0 through w6 in the target function representation
Choosing a Function Approximation
Algorithm
To learn the target function V̂, training examples are needed - pairs (b, Vtrain(b))
For example,
((x1 = 3, x2 = 0, x3 = 1, x4 = 0, x5 = 0, x6 = 0), +100) describes a final board state b in which black has won (red has no remaining pieces, so x2 = 0)
Steps involved
● Estimating training values from the indirect training experience available
● Adjusting the weights to best fit the training examples
Estimating Training Values
● The only training information available to our learner is whether the game was eventually won or
lost.
● We require training examples that assign specific scores to specific board states.
● While it is easy to assign a value to board states that correspond to the end of the game, it is less obvious
how to assign training values to the more numerous intermediate board states that occur before the
game's end.
● That the game was eventually won or lost does not necessarily indicate that every board state along the
game path was good or bad.
● Even if the program loses the game, it may still be the case that board states occurring early in the
game should be rated very highly and that the cause of the loss was a subsequent poor move.
Estimating Training Values
One simple approach has been found to be surprisingly successful: set the training value of any intermediate board state b to the current learned value of its successor,
Vtrain(b) ← V̂(Successor(b))
where Successor(b) denotes the next board state following b in which it is again the program's turn to move.
It may seem strange to use the current version of V̂ to estimate training values that will be used to refine this very
same function, but this makes sense because V̂ tends to be more accurate for board states closer to the game's end.
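This training-value rule can be sketched as follows, assuming a hypothetical list of the program's recorded board states from one game, oldest first:

```python
def estimate_training_values(game_states, final_value, v_hat):
    """Turn one game's board-state trace into (b, Vtrain(b)) examples.

    Each intermediate state is scored by applying the CURRENT learned
    function v_hat to the state that follows it in the trace; only the
    terminal state receives the true game outcome (e.g. +100 / -100 / 0).
    """
    examples = []
    for i, b in enumerate(game_states):
        if i + 1 < len(game_states):
            examples.append((b, v_hat(game_states[i + 1])))
        else:
            examples.append((b, final_value))  # end of game: true outcome
    return examples
```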
Adjusting Weights - Common Approach
Define the best hypothesis, or set of weights, as the one that minimizes the squared error E between the
training values and the values predicted by the hypothesis:
E = Σ (Vtrain(b) − V̂(b))², summed over all training examples (b, Vtrain(b))
Least Mean Squares (LMS) training rule
For each training example (b, Vtrain(b))
● Use the current weights to calculate V̂(b)
● For each weight wi, update it as
wi ← wi + 𝜼 (Vtrain(b) − V̂(b)) xi
where 𝜼 is a small constant (e.g., 0.1) that moderates the size of the weight update
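The LMS rule above can be sketched directly; the board is assumed to be already encoded as its feature vector (x1, …, x6):

```python
def lms_update(weights, features, v_train, eta=0.1):
    """One LMS step for the linear V-hat: each weight moves in proportion
    to its feature value and the prediction error (Vtrain - V_hat).

    weights  = [w0, ..., w6]; features = [x1, ..., x6];
    eta is the small learning-rate constant from the slide.
    """
    x = [1.0] + list(features)          # x0 = 1 pairs with the bias w0
    v_hat = sum(w * xi for w, xi in zip(weights, x))
    error = v_train - v_hat
    return [w + eta * error * xi for w, xi in zip(weights, x)]
```

Note that when V̂(b) already equals Vtrain(b) the error is zero and no weight changes, and a feature with value zero leaves its weight untouched - both properties of the rule as stated.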
Final Design
Final Design - Four Modules
● Performance System
○ the module that must solve the given performance task - playing checkers - using the learned target
function(s)
○ Input: a new game
○ Output: the game history (trace)
○ Selects its next move as determined by the learned evaluation function V̂
○ The system's performance improves as V̂ becomes increasingly accurate
● Critic
○ Input: the history (trace) of the game
○ Output: a set of training examples (b, Vtrain(b))
Final Design - Four Modules Contd...
● Generalizer
○ Input: training examples
○ Output: a hypothesis - its estimate of the target function
○ Generalizes from the specific examples
○ Our case: the LMS algorithm, producing the learned linear function V̂
● Experiment Generator
○ Input: the current hypothesis
○ Output: a new problem
○ Picks a new practice problem that will maximize the learning rate
○ E.g., the standard initial game board, or board positions designed to explore a particular region of the state space
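The interaction of the four modules above can be sketched as a single training iteration; all four module interfaces here are hypothetical, chosen only to show the data flow:

```python
def training_iteration(experiment_generator, performance_system, critic,
                       generalizer, hypothesis):
    """One loop of the final design:
    Experiment Generator -> Performance System -> Critic -> Generalizer."""
    problem = experiment_generator(hypothesis)       # pick a practice game
    trace = performance_system(problem, hypothesis)  # play it, record history
    examples = critic(trace)                         # (b, Vtrain(b)) pairs
    return generalizer(examples, hypothesis)         # improved hypothesis
```

Iterating this function is what lets the checkers learner improve indefinitely: each pass produces a better hypothesis, which in turn generates better play and better training examples.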
Issues in Machine Learning
➔ What algorithms exist for learning general target functions? Which ones perform best for which types
of problems and representations?
➔ How much training data is sufficient?
➔ When and how can prior knowledge held by the learner guide the process of generalizing from
examples?