LiTH, 1998

Abstract

   This essay deals with the study of machine learning, an important part of computer
science. The emphasis is put on three major subareas: decision trees, artificial neural
networks and evolutionary computation. For each approach, the theory behind the
algorithms is explained, as well as the experience I gained when examining different
implementations of them.




Contents

  Introduction

  Machine Learning

  Decision tree learning
     Theory
     The Algorithm
     Practice

  Artificial Neural Networks
     Theory
     Biology
     Computer
     Practice

  Evolutionary computation
     Theory
     Genetic algorithms
     Genetic programming
     Practice

  Other approaches

  Conclusion

  References
     Books
     Websites




Introduction

As part of the course TDDB55 - "Medieinformatik, projekt 1" at Linköping University I
chose to look into the field of machine learning. To be more precise, I chose the
assignment "Evaluate machine learning algorithms for user modeling". The algorithms
I have evaluated are decision tree learning, artificial neural networks and
evolutionary computation. I also mention other approaches such as Bayesian
networks and PAC learning. As my main sources of information I have used the books
Artificial Intelligence – A Modern Approach by Peter Norvig and Stuart Russell and Machine
Learning by Tom M. Mitchell, as well as various enlightening sites on the Internet.

Machine Learning

What is machine learning? That was the first question I faced when I started looking into
the subject. It is a fairly young science, approximately as old as computer science itself. Ever
since the realization of the very first computers, people have dreamed of teaching their
machines to reason and act like humans and other forms of intelligent life. This is
where machine learning and other closely related fields, such as Artificial Intelligence (AI),
come in.

Machine learning is the technique of implementing algorithms that learn on computers. What
then is learning? Tom M. Mitchell gives this definition:

             "A computer program is said to learn from experience E with respect
             to some class of tasks T and performance measure P, if its
             performance at tasks in T, as measured by P, improves with
             experience E."

The task may for instance be to recognize the faces of different people (see Artificial Neural
Networks) and the experience would then be the set of pictures of people used to
train the system. Measuring the performance here means checking whether the computer can
correctly determine who is in the picture. In this case the performance would partly be
evaluated by humans and fed back to the computer, but there are many tasks
where the performance can be measured by the computer itself, e.g. learning to play chess,
where it is easier to define rules for measuring the success of the algorithm.

A fundamental part of learning is searching. By searching through a hypothesis space for the
hypothesis that fits the training examples best, the algorithms can simulate a modest form of
intelligence.




Decision tree learning

Theory

Decision tree learning is one of the most popular learning methods and has been
successfully applied to tasks as different as diagnosing medical cases and assessing the
credit risk of loan applications. The method is best suited for problems that have discrete
output values and instances with a fixed set of attributes whose values are preferably
discrete and disjoint, at least in the basic form of the algorithm. With modifications to the
basic algorithm, decision trees can be made to handle continuous, real-valued attributes
as well as to output real-valued results, although applications with these features are less
common. Other advantages of decision trees are that they are robust to errors and that
they can still function with missing attribute values. If the set of training examples is
contaminated with incorrect instances, the algorithm is still able to draw correct
conclusions from the set, and if not all attributes are represented, it will still work as long
as the missing value is not required in the tree search. A major drawback of inductive
learning through decision trees is that the algorithms cannot recognize patterns beyond
the training examples and therefore know nothing about cases that have not yet been
covered by examples.




                                   Figure 1 Occupation decision tree

              One possible implementation of a decision tree for trying to establish a person’s occupation.




The Algorithm

Basically, the algorithm is the assembly of a tree graph. The tree is then used to make
decisions, hence the name decision tree. The leaf nodes of the tree represent the
classifications; the rest of the nodes are tests on the attribute values. An important aspect of
the construction of the tree is deciding the order of the tests (the internal structure of the
nodes in the tree). The most popular philosophy on this subject dates back to the 14th
century and William of Ockham, who preferred the simplest hypothesis consistent with the
examples, also known as Ockham's razor.

             Ockham’s razor: Prefer the simplest hypothesis that fits all the data.

In other words, this means that we are interested in asking the "most interesting" questions
first, or, to be more scientific, the questions that give us the greatest information gain.

Claude Shannon, the father of information theory, defined a measure of uncertainty in the
1940s called entropy. For a two-class problem the entropy is a value between 0 and 1, where
1 means maximum uncertainty; the information gain of a question is the reduction in
entropy it brings about.

             Entropy(S) = Σi –pi log2 pi, summed over the disjoint
             classifications i of the examples (such as e.g. positive and
             negative), where pi is the proportion of examples in S belonging
             to classification i.

This way, the question with the greatest information gain, out of the remaining questions, is
chosen until all relevant questions have been asked (questions with zero information gain
are pointless to ask, since they do not add any information to the tree). Since the information
gain of each question changes as new examples are added to the training domain, the tree
would ideally be reconstructed for every new example. This is obviously not very practical.
A better approach is to wait until a notable amount of new examples (e.g. 10%) have been
added to the original set and then reconstruct.
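As a sketch, the entropy and information-gain calculations can be written in a few lines of Python. The example data (a "beard" attribute predicting an "occupation" label, in the spirit of figure 1) is made up for illustration:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of classification labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(examples, attribute, label):
    """Reduction in entropy achieved by splitting `examples` on `attribute`."""
    base = entropy([e[label] for e in examples])
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e[label] for e in examples if e[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder

# Hypothetical examples: the attribute 'beard' perfectly separates the labels,
# so splitting on it gains a full bit of information.
people = [{"beard": "yes", "occupation": "sea captain"},
          {"beard": "yes", "occupation": "sea captain"},
          {"beard": "no", "occupation": "clerk"},
          {"beard": "no", "occupation": "clerk"}]
```

The attribute with the greatest information gain over the current set of examples is then chosen as the next test in the tree.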

Practice

When looking for implementations of algorithms using decision trees, I soon discovered that
most available systems were more or less associated with data mining and expert systems.
Some examples that I found were Alice from Isoft (France) and CART from Salford Systems
(USA). I examined the CART system on a Windows 98 platform and found that it had a
very nice graphical interface that showed the decision tree. The major limitation of CART
was that it only produced binary trees, but there were also many interesting parameters that
could be tuned in the tree modeling process.

A free demo of the CART system can be downloaded from Salford Systems’ website.

http://www.salford-systems.com/

Artificial Neural Networks

Theory

Artificial neural networks have proved to be an efficient approach to learning real-valued
functions over both continuous and discrete-valued attributes. One of the biggest advantages
of artificial neural networks is that they are robust to noise in the training data. This ability
has contributed to successful applications such as face and handwriting recognition, robot
control, language translation, pronunciation software, vehicle control, etc.

In a "pure" artificial neural network, all the nodes work in parallel. This requires special and
expensive hardware. Most of today's implementations of artificial neural networks run on
single-processor computers that simulate the parallelism, which achieves only a fraction of
the speed of a "pure" neural network.

The idea for artificial neural networks came from the science of biology.




                                 Figure 2 Artificial Neural Network

                  Example of neural network for establishing identity of a human face on a picture.




Biology

The idea for artificial neural networks originated from studies of the brain. Since the
brain seems to have an unprecedented ability to learn a wide range of things, it has been an
inspiring challenge to copy its characteristics. The thinking part of the brain is a vast network
made up of approximately 10^11 nerve cells, called neurons. Each neuron is connected to, on
average, ten thousand other cells. The connections are organized in layers.

Each neuron consists of a cell body called the soma, several shorter fibers called dendrites,
and a long output fiber called the axon. A junction called the synapse serves as the
connection between cells.

When a signal propagates from neuron to neuron, it is first handled by the synapse, which
can either increase or decrease its electrochemical potential. Synapses have the ability to
change their characteristics over time. This ability, researchers believe, is what we refer to as
learning. The synapse then leads the signal into the cell via the dendrite. If the total potential
of the cell reaches a certain threshold, an electric pulse, also called an action potential, is
sent down the axon and on to the synapses. This is then repeated for each neuron layer in
the network. The last layer is the output layer.




                                 Figure 3 Neuron – the Brain




Computer

The computer counterpart of the neuron is the unit, and its connections are called links.
Each link has a numeric weight; by updating the weights, the links come to play the same
role as the synapses in the brain. The input function sums all the incoming signals
multiplied by their associated weights. The activation function (f in figure 4) then
determines whether to send an activation signal (a in figure 4) onto the output links. There
are several ideas for different activation functions, but they all have in common that they
depend on whether the sum of the input function reaches the threshold or not.
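The behaviour of a single unit can be sketched in a couple of lines of Python (a simple step-function activation is assumed here; the numbers in the usage comments are arbitrary):

```python
def unit(inputs, weights, threshold):
    """One artificial neuron: the input function sums the weighted signals,
    and the step activation fires if the sum reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))  # input function
    return 1 if total >= threshold else 0                # activation function

# With weights 0.6 each and threshold 1.0, both inputs must be active to fire.
active = unit([1, 1], [0.6, 0.6], 1.0)
quiet = unit([1, 0], [0.6, 0.6], 1.0)
```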




                                Figure 4 Unit – the Computer

Some units are connected to the outside environment and assigned as input or output units.
The rest of the units are called hidden units and serve as network layers between the input
and output layer. There are two major varieties of network structures, the feed-forward and
the recurrent networks. In feed-forward networks the signal only travels in one direction and
there are no loops. In a recurrent network there are no such restrictions. The recurrent
network is much more advanced and can hold memory, but it is also more vulnerable to
chaotic behavior and instability. The brain is a recurrent network. Some other examples of
successful recurrent networks are the Hopfield and the Boltzmann networks.

The simplest form of feed-forward network is the perceptron. Perceptrons have no hidden
units, yet they can represent such Boolean functions as AND, OR and NOT. Networks with
one or more hidden layers are called multilayer networks. The most popular method for
learning in multilayer networks is called backpropagation. The basic idea of
backpropagation is to minimize the squared error between the network output and the
target values of the training examples by dividing the "blame" among the contributing
weights.
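As an illustration, a perceptron can learn the Boolean AND function with the classic perceptron training rule. This is a sketch; the learning rate and epoch count are arbitrary choices:

```python
def train_perceptron(examples, rate=1.0, epochs=20):
    """Perceptron learning rule on (inputs, target) pairs.
    Returns the learned weights and bias."""
    w = [0.0] * len(examples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, target in examples:
            # forward pass: weighted sum plus bias, step activation
            out = 1 if sum(xi * wi for xi, wi in zip(x, w)) + b >= 0 else 0
            # nudge weights in proportion to the error
            err = target - out
            w = [wi + rate * err * xi for wi, xi in zip(w, x)]
            b += rate * err
    return w, b

# Truth table for Boolean AND as (inputs, target) pairs.
AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND)
```

After training, the weights and bias classify all four input patterns correctly; a function like XOR, by contrast, has no such single-layer solution.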

Overfitting is an important issue in machine learning, especially so in neural network
learning. Overfitting occurs when a network is trained too much on a small domain of
training data: it then performs very well on that data, but cannot generalize sufficiently
when new data is added.




                                     Figure 5 Perceptron

Perceptrons are single-layered feed-forward networks. They were the first approach to artificial neural networks that computer scientists began to study, in the late 1950s.


Practice

I first looked at Tom M. Mitchell's implementation of face recognition using an artificial
neural network. It is made for the Unix platform and can be downloaded over the Internet
(see URL below). It requires a graphic-display program in order to view the images
processed by the system. The images used in the system were in the pgm format, and I used
XV by John Bradley to view them. The system gave interesting insight into how artificial
neural networks can discover patterns in, for example, pictures. The main drawback of this
system was that it used tiny images, only 32x30 pixels in size, so it took some time to get
used to.

http://www.cs.cmu.edu/afs/cs/zproject/theo-11/www/decision-trees.html

I also looked at a similar commercial system, ImageFinder 3.4 from Attrasoft Inc. It had a
very user-friendly interface and was easy to get started with. ImageFinder is a Java
application that can take gif or jpeg images as input and learn their characteristics in an
artificial neural network. The number of images to learn and the number of times to
"practice" on them can be decided by the user. When the network is done, the user can
specify a directory in which to search for similar images. The output is the names of the
closest matching images and their scores based on how closely they resemble the training
examples. Unfortunately, the free demo that I could download did not allow for any
adjustment of parameters, which would have made it even more interesting to evaluate.

http://attrasoft.com/




Figure 6 ImageFinder 3.4


Evolutionary computation

Theory

Genetic algorithms and genetic programming are two closely related forms of evolutionary
computation. Some authors consider these terms synonyms, while others choose to refer to
genetic algorithms when the hypothesis or "gene" is a simple bit string and to genetic
programming when the hypothesis is more advanced, usually symbolic expressions or
programming code. Genetic algorithms have been utilized successfully, especially on
optimization problems. Since many problems can be thought of as optimization problems,
this is hardly a limitation on their usefulness.

Genetic algorithms

                                         Background

Nature is the best known producer of robust and successful organisms. Over time,
organisms that are not well suited for their environment die off, while others that are better
suited live to reproduce. Parents form offspring, so that each new generation carries on the
previous generations' experience. If the environment changes slowly, species can adapt to
the changes. Occasionally, random mutations take place. Most of these result in the death of
the mutated individual, but a few result in new, successful species.

These facts were first revealed by Charles Darwin in his publication On the Origin of
Species by Means of Natural Selection.


Figure 7 Reproduction in genetic algorithms

                                               Algorithm

In order to simulate evolution, the algorithm needs a metric to establish which individuals
are better than others with respect to solving the problem at hand. This metric is called the
fitness function. The most promising individuals (those with the highest scores on the
fitness function) receive a higher reproduction likelihood. The next step is to decide where
in the bit string to make the crossover. This is usually done randomly somewhere along the
string. Then the parts of the original bit strings are swapped to form two new strings. This is
the natural-selection part of the algorithm. But if this had been the only step in the
reproduction, most algorithms would only have been able to find local optima. To solve
this problem, the algorithm incorporates another basic component of regeneration in
nature: mutation. This way an individual bit string can leave a population that is "stuck".
The chance of a mutation is usually very low.

             Evolution algorithm

             1. Choose individuals for reproduction based on the fitness function.

             2. Choose where to make the crossover.

             3. Reproduce using crossover.

             4. Mutate single bits with a small random chance.

             5. Repeat from step 1.
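The five steps above can be sketched as a small genetic algorithm in Python. The fitness function here, counting the 1-bits of the string (the classic "OneMax" problem), and all parameter values are purely illustrative:

```python
import random

def one_max(bits):
    """Illustrative fitness function: count the 1-bits in the string."""
    return sum(bits)

def evolve(pop_size=20, length=16, generations=60, mutation_rate=0.01):
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # 1. choose individuals for reproduction, weighted by fitness
        weights = [one_max(ind) + 1 for ind in pop]
        parents = random.choices(pop, weights=weights, k=pop_size)
        nxt = []
        for a, b in zip(parents[::2], parents[1::2]):
            # 2-3. crossover at a randomly chosen point
            point = random.randrange(1, length)
            for child in (a[:point] + b[point:], b[:point] + a[point:]):
                # 4. mutate single bits with a small random chance
                nxt.append([bit ^ 1 if random.random() < mutation_rate else bit
                            for bit in child])
        pop = nxt  # 5. repeat with the new generation
    return max(pop, key=one_max)

random.seed(0)
best = evolve()
```

After a few dozen generations, the fittest bit string is close to (or exactly) all ones.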



Genetic programming

Genetic programming differs from genetic algorithms in that it strives to optimize code
rather than bit strings. Programs manipulated by genetic programming are usually
represented by trees corresponding to the parse tree of the program. Just as in genetic
algorithms, the individuals produce new generations through selection, crossover and
mutation. The fitness of an individual is usually determined by executing the program on
training data. Crossover is performed by swapping randomly selected subtrees.
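Subtree crossover can be sketched on parse trees represented as nested Python lists. The operator symbols and example trees below are made up for illustration:

```python
import copy
import random

def paths(tree, path=()):
    """Yield the index path to every node of a parse tree (nested lists)."""
    yield path
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, path + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def put(tree, path, subtree):
    for i in path[:-1]:
        tree = tree[i]
    tree[path[-1]] = subtree

def crossover(a, b):
    """Swap a randomly chosen subtree of `a` with one of `b` (parents untouched)."""
    a, b = copy.deepcopy(a), copy.deepcopy(b)
    pa = random.choice([p for p in paths(a) if p])  # any non-root node
    pb = random.choice([p for p in paths(b) if p])
    sa, sb = get(a, pa), get(b, pb)
    put(a, pa, sb)
    put(b, pb, sa)
    return a, b

# Two made-up parse trees: (x + y*2) and (3 - z).
random.seed(1)
child1, child2 = crossover(["+", "x", ["*", "y", "2"]], ["-", "3", "z"])
```

Whichever subtrees are chosen, the two children together contain exactly the same symbols as the two parents.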




                 Figure 8 Crossover operation in genetic programming


Practice

I found several very interesting sites on evolutionary computation on the Web. Two of the
best were Java applets: one of the game Tron and the other a site called the GA Playground.

             Tron:                      http://dendrite.cs.brandeis.edu/tron/

             The GA playground:         http://www.aridolan.com/ga/gaa/gaa.html

Tron is a computer game based on the 1982 Walt Disney movie of the same name. It uses
a genetic algorithm in order to learn from previous games.

According to the people behind the program, they

  "... have put a genetic learning algorithm online. A 'background' GA generates players by
       having the computer play itself. A 'foreground' GA leads the evolutionary process,
                  evaluating players by their performance against real people."

At present it is very hard to beat the computer at Tron. I only succeeded twice out of
approximately 50 attempts. The online game Tron is a good example of successful
utilization of evolutionary computation.




Figure 9 Tron - Computer winning rate

                      The evolution of the Tron program over time according to the authors.


The other site I explored, the GA Playground, was similar in that it also provided online
Java applets for the evaluation of algorithms. This site offered more freedom to choose
different algorithms and parameters for different, user-selected problems.

One interesting problem was the Travelling Salesman Problem, which was implemented in
three different cases (all cities on a circle, cities in Bavaria, and capitals of the US). These
examples gave good insight into how the applet worked.




Figure 10 GA Playground’s TSP solving algorithm

  The GA Playground has a very nice and adjustable user interface that allows for different setups. Some features require that the program is downloaded and run as an application.


Other approaches

There are several interesting approaches to machine learning other than the ones mentioned
above. Probably Approximately Correct (PAC) learning is one good model for learning.
The Bayesian learning model is another. It is based on Bayes theorem for calculating the
posterior probability P(h|D) from the prior probability P(h), together with P(D) and P(D|h).

                                                     Bayes theorem:

             P(h|D) = P(D|h) P(h) / P(D)
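As a numeric illustration, the theorem can be applied directly; the probabilities below are made up:

```python
def posterior(p_h, p_d_given_h, p_d):
    """Bayes theorem: P(h|D) = P(D|h) * P(h) / P(D)."""
    return p_d_given_h * p_h / p_d

# Made-up numbers: prior 0.1, likelihood 0.8, data probability 0.2
# give a posterior of 0.8 * 0.1 / 0.2 = 0.4.
p = posterior(p_h=0.1, p_d_given_h=0.8, p_d=0.2)
```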

A third promising model is reinforcement learning. It is closely related to dynamic
programming and is frequently used to solve optimization problems. The Q-learning
algorithm is an interesting example from this category.
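Q-learning maintains a table of values Q(s, a) and, after each observed transition, moves Q(s, a) towards the reward plus the discounted value of the best next action. The sketch below is illustrative; the state and action names and the parameter values are made up:

```python
def q_update(Q, s, a, reward, s_next, actions, alpha=0.5, gamma=0.9):
    """One Q-learning step: move Q(s,a) towards r + gamma * max_a' Q(s',a')."""
    best_next = max(Q.get((s_next, an), 0.0) for an in actions)
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (
        reward + gamma * best_next - Q.get((s, a), 0.0))

# Made-up states and actions: a rewarding transition from s1 to s2
# raises Q(s1, a) from 0 to 0.5 * (1.0 + 0) = 0.5.
Q = {}
q_update(Q, "s1", "a", reward=1.0, s_next="s2", actions=["a", "b"])
```

Repeated updates of this kind let the values propagate backwards through the state space, so earlier states also come to reflect the eventual reward.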

There are many more approaches, and since this is a fairly young science, even more are sure to come.




Conclusion

Throughout my personal project I have studied a variety of algorithms that have proven to
be more or less useful for their different objectives. Some systems have been pure genetic
algorithms or pure artificial neural networks, while others have integrated different
approaches in an attempt to get the best of each algorithm. Different algorithms have
different advantages and disadvantages, e.g. decision trees are better suited for
discrete-valued environments. I have found that accurate knowledge about the
characteristics of the problem and basic knowledge about the algorithms are essential to
finding a good algorithm for the task. Some problems are better suited for machine learning
algorithms than others. This may be because there is still a long way to go in the science of
machine learning, or because some of the expectations placed on machine learning are too
high. For instance, Russell and Norvig suggest in Artificial Intelligence – A Modern
Approach that it might always be worth trying a simple implementation of an artificial
neural network, or even a genetic algorithm, on a problem just to see if it works. Our
knowledge of how, for example, neural networks work is very limited, especially for
recurrent networks.

There are other aspects where knowledge is important. For example, overfitting is a trap
that anyone dealing with machine learning should be aware of. Even if the algorithm is
perfect, the handling of the set of training data is still a delicate matter.




References

Books

Norvig, Peter & Russell, Stuart (1995). Artificial Intelligence – A Modern Approach. Prentice Hall, USA.

              http://www.cs.berkeley.edu/~russell/aima.html


Mitchell, Tom M. (1997). Machine Learning. McGraw-Hill, USA.

              http://www.cs.cmu.edu/~tom/mlbook.html




Websites

http://www.isoft.fr/

http://www.salford-systems.com/

http://dendrite.cs.brandeis.edu/tron/

http://www.aridolan.com/ga/gaa/gaa.html




Virtualization 101: Everything You Need To Know To Get Started With VMwareDatapath Consulting
 

Andere mochten auch (9)

Predictive Analytics and Azure Machine Learning Case Studies
Predictive Analytics and Azure Machine Learning Case StudiesPredictive Analytics and Azure Machine Learning Case Studies
Predictive Analytics and Azure Machine Learning Case Studies
 
SOSCON 2016 - OSS "개발자"의 Machine Learning 분투기
SOSCON 2016 - OSS "개발자"의 Machine Learning 분투기SOSCON 2016 - OSS "개발자"의 Machine Learning 분투기
SOSCON 2016 - OSS "개발자"의 Machine Learning 분투기
 
Cloud and Machine Learning in real world business
Cloud and Machine Learning in real world businessCloud and Machine Learning in real world business
Cloud and Machine Learning in real world business
 
Introduction to Machine Learning (case studies)
Introduction to Machine Learning (case studies)Introduction to Machine Learning (case studies)
Introduction to Machine Learning (case studies)
 
10 uses cases - Artificial Intelligence and Machine Learning in Education - b...
10 uses cases - Artificial Intelligence and Machine Learning in Education - b...10 uses cases - Artificial Intelligence and Machine Learning in Education - b...
10 uses cases - Artificial Intelligence and Machine Learning in Education - b...
 
Amazon Machine Learning Case Study: Predicting Customer Churn
Amazon Machine Learning Case Study: Predicting Customer ChurnAmazon Machine Learning Case Study: Predicting Customer Churn
Amazon Machine Learning Case Study: Predicting Customer Churn
 
Build a Recommendation Engine using Amazon Machine Learning in Real-time
Build a Recommendation Engine using Amazon Machine Learning in Real-timeBuild a Recommendation Engine using Amazon Machine Learning in Real-time
Build a Recommendation Engine using Amazon Machine Learning in Real-time
 
Transform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine LearningTransform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine Learning
 
Virtualization 101: Everything You Need To Know To Get Started With VMware
Virtualization 101: Everything You Need To Know To Get Started With VMwareVirtualization 101: Everything You Need To Know To Get Started With VMware
Virtualization 101: Everything You Need To Know To Get Started With VMware
 

Ähnlich wie Introduction.doc

On Machine Learning and Data Mining
On Machine Learning and Data MiningOn Machine Learning and Data Mining
On Machine Learning and Data Miningbutest
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationAnkit Gupta
 
Ai based projects
Ai based projectsAi based projects
Ai based projectsaliaKhan71
 
A Machine Learning Primer,
A Machine Learning Primer,A Machine Learning Primer,
A Machine Learning Primer,Eirini Ntoutsi
 
Deep learning Introduction and Basics
Deep learning  Introduction and BasicsDeep learning  Introduction and Basics
Deep learning Introduction and BasicsNitin Mishra
 
Lect 7 intro to M.L..pdf
Lect 7 intro to M.L..pdfLect 7 intro to M.L..pdf
Lect 7 intro to M.L..pdfHassanElalfy4
 
Introduction AI ML& Mathematicals of ML.pdf
Introduction AI ML& Mathematicals of ML.pdfIntroduction AI ML& Mathematicals of ML.pdf
Introduction AI ML& Mathematicals of ML.pdfGandhiMathy6
 
LearningAG.ppt
LearningAG.pptLearningAG.ppt
LearningAG.pptbutest
 
Machine learning
Machine learningMachine learning
Machine learningAbrar ali
 
Chapter 6 - Learning data and analytics course
Chapter 6 - Learning data and analytics courseChapter 6 - Learning data and analytics course
Chapter 6 - Learning data and analytics coursegideymichael
 
Machine Learning Chapter one introduction
Machine Learning Chapter one introductionMachine Learning Chapter one introduction
Machine Learning Chapter one introductionARVIND SARDAR
 
Machine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-codeMachine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-codeOsama Ghandour Geris
 
Introduction to Artificial Intelligence.doc
Introduction to Artificial Intelligence.docIntroduction to Artificial Intelligence.doc
Introduction to Artificial Intelligence.docbutest
 
Machine Learning
Machine LearningMachine Learning
Machine Learningbutest
 
ML All Chapter PDF.pdf
ML All Chapter PDF.pdfML All Chapter PDF.pdf
ML All Chapter PDF.pdfexample43
 
A Few Useful Things to Know about Machine Learning
A Few Useful Things to Know about Machine LearningA Few Useful Things to Know about Machine Learning
A Few Useful Things to Know about Machine Learningnep_test_account
 
Machine Learning Ch 1.ppt
Machine Learning Ch 1.pptMachine Learning Ch 1.ppt
Machine Learning Ch 1.pptARVIND SARDAR
 

Ähnlich wie Introduction.doc (20)

syllabus-CBR.pdf
syllabus-CBR.pdfsyllabus-CBR.pdf
syllabus-CBR.pdf
 
On Machine Learning and Data Mining
On Machine Learning and Data MiningOn Machine Learning and Data Mining
On Machine Learning and Data Mining
 
AI Presentation 1
AI Presentation 1AI Presentation 1
AI Presentation 1
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning Presentation
 
Ai based projects
Ai based projectsAi based projects
Ai based projects
 
A Machine Learning Primer,
A Machine Learning Primer,A Machine Learning Primer,
A Machine Learning Primer,
 
Deep learning Introduction and Basics
Deep learning  Introduction and BasicsDeep learning  Introduction and Basics
Deep learning Introduction and Basics
 
Lect 7 intro to M.L..pdf
Lect 7 intro to M.L..pdfLect 7 intro to M.L..pdf
Lect 7 intro to M.L..pdf
 
Introduction AI ML& Mathematicals of ML.pdf
Introduction AI ML& Mathematicals of ML.pdfIntroduction AI ML& Mathematicals of ML.pdf
Introduction AI ML& Mathematicals of ML.pdf
 
LearningAG.ppt
LearningAG.pptLearningAG.ppt
LearningAG.ppt
 
Machine learning
Machine learningMachine learning
Machine learning
 
Chapter 6 - Learning data and analytics course
Chapter 6 - Learning data and analytics courseChapter 6 - Learning data and analytics course
Chapter 6 - Learning data and analytics course
 
Machine Learning Chapter one introduction
Machine Learning Chapter one introductionMachine Learning Chapter one introduction
Machine Learning Chapter one introduction
 
Machine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-codeMachine learning-in-details-with-out-python-code
Machine learning-in-details-with-out-python-code
 
Introduction to Artificial Intelligence.doc
Introduction to Artificial Intelligence.docIntroduction to Artificial Intelligence.doc
Introduction to Artificial Intelligence.doc
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
ML All Chapter PDF.pdf
ML All Chapter PDF.pdfML All Chapter PDF.pdf
ML All Chapter PDF.pdf
 
Lect 01, 02
Lect 01, 02Lect 01, 02
Lect 01, 02
 
A Few Useful Things to Know about Machine Learning
A Few Useful Things to Know about Machine LearningA Few Useful Things to Know about Machine Learning
A Few Useful Things to Know about Machine Learning
 
Machine Learning Ch 1.ppt
Machine Learning Ch 1.pptMachine Learning Ch 1.ppt
Machine Learning Ch 1.ppt
 

Mehr von butest

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEbutest
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jacksonbutest
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALbutest
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer IIbutest
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.docbutest
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1butest
 
Facebook
Facebook Facebook
Facebook butest
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...butest
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...butest
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTbutest
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docbutest
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docbutest
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.docbutest
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!butest
 

Mehr von butest (20)

EL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBEEL MODELO DE NEGOCIO DE YOUTUBE
EL MODELO DE NEGOCIO DE YOUTUBE
 
1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同1. MPEG I.B.P frame之不同
1. MPEG I.B.P frame之不同
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Timeline: The Life of Michael Jackson
Timeline: The Life of Michael JacksonTimeline: The Life of Michael Jackson
Timeline: The Life of Michael Jackson
 
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...
 
LESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIALLESSONS FROM THE MICHAEL JACKSON TRIAL
LESSONS FROM THE MICHAEL JACKSON TRIAL
 
Com 380, Summer II
Com 380, Summer IICom 380, Summer II
Com 380, Summer II
 
PPT
PPTPPT
PPT
 
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet JazzThe MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazz
 
MICHAEL JACKSON.doc
MICHAEL JACKSON.docMICHAEL JACKSON.doc
MICHAEL JACKSON.doc
 
Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1Social Networks: Twitter Facebook SL - Slide 1
Social Networks: Twitter Facebook SL - Slide 1
 
Facebook
Facebook Facebook
Facebook
 
Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...Executive Summary Hare Chevrolet is a General Motors dealership ...
Executive Summary Hare Chevrolet is a General Motors dealership ...
 
Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...Welcome to the Dougherty County Public Library's Facebook and ...
Welcome to the Dougherty County Public Library's Facebook and ...
 
NEWS ANNOUNCEMENT
NEWS ANNOUNCEMENTNEWS ANNOUNCEMENT
NEWS ANNOUNCEMENT
 
C-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.docC-2100 Ultra Zoom.doc
C-2100 Ultra Zoom.doc
 
MAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.docMAC Printing on ITS Printers.doc.doc
MAC Printing on ITS Printers.doc.doc
 
Mac OS X Guide.doc
Mac OS X Guide.docMac OS X Guide.doc
Mac OS X Guide.doc
 
hier
hierhier
hier
 
WEB DESIGN!
WEB DESIGN!WEB DESIGN!
WEB DESIGN!
 

Introduction.doc

Contents

Introduction
Machine Learning
Decision tree learning
    Theory
    The Algorithm
    Practice
Artificial Neural Networks
    Theory
    Biology
    Computer
    Practice
Evolutionary computation
    Theory
    Genetic algorithms
    Genetic programming
    Practice
Other approaches
Conclusion
References
    Books
    Websites
Introduction

As part of the course TDDB55 - "Medieinformatik, projekt 1" at the University of Linköping, I chose to look into the field of machine learning. To be more precise, I chose the assignment "Evaluate machine learning algorithms for user modeling". The algorithms I have evaluated are decision tree learning, artificial neural networks and evolutionary computation. I also mention other approaches, such as Bayesian networks and PAC learning. As my main sources of information I have used the books Artificial Intelligence – A Modern Approach by Peter Norvig and Stuart Russell and Machine Learning by Tom M. Mitchell, as well as various enlightening sites on the Internet.

Machine Learning

What is machine learning? That was the first question I faced when I started looking into the subject. It is a fairly young science, approximately as old as computer science itself. Ever since the realization of the very first computers, people have dreamed of teaching their machines to reason and act like humans and other forms of intelligent life. This is where machine learning and closely related fields such as Artificial Intelligence, AI, come in. Machine learning is the technique of implementing algorithms that learn on computers.

What, then, is learning? Tom M. Mitchell gives this definition:

"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

The task may for instance be to recognize the faces of different people (see Artificial Neural Networks), and the experience would then be the set of pictures of people used to train the system. The performance measure here is whether the computer correctly determines who is in the picture.
In this case the performance would partly be evaluated by humans and fed back to the computer, but there are many tasks where the performance can be measured by the computer itself, e.g. learning to play chess, where it is easier to define rules for measuring the success of the algorithm.

A fundamental part of learning is searching. By searching through a hypothesis space for the hypothesis that best fits the training examples, an algorithm can simulate a modest form of intelligence.
Decision tree learning

Theory

Decision tree learning is one of the most popular learning methods and has been successfully applied to tasks as different as learning to diagnose medical cases and assessing the credit risk of loan applications. In its basic form, the method is best suited for problems with discrete output values and instances described by a fixed set of attributes whose values are preferably discrete and disjoint. With modifications to the basic algorithm, decision trees can be made to handle continuous, real-valued attributes as well as output real-valued results, although applications with these features are less common.

Other advantages of decision trees are that they are robust to errors and can cope with missing attribute values. If the set of training examples contains incorrect instances, the algorithm is still able to draw correct conclusions from the set, and if not all attributes are represented, it will still work as long as the missing value is not required in the tree search. A major drawback of inductive learning through decision trees is that the resulting tree knows nothing about cases that have not been covered by the training examples, and therefore cannot generalize beyond them.

Figure 1 Occupation decision tree: one possible implementation of a decision tree for trying to establish a person's occupation.
The Algorithm

Basically, the algorithm assembles a tree graph. The tree is then used to make decisions, hence "decision tree". The leaf nodes of the tree represent the classifications; the remaining nodes are tests on the attribute values. An important aspect of the construction of the tree is deciding the order of the tests (the internal structure of the nodes in the tree). The most popular philosophy on this subject dates back to the 14th century and William of Ockham, who preferred the simplest hypothesis consistent with the examples, a principle known as Ockham's razor:

Ockham's razor: Prefer the simplest hypothesis that fits all the data.

In other words, we want to ask the "most interesting" questions first, or, more scientifically, the questions that give us the greatest information gain. The underlying metric, entropy, was defined in the 1940s by Claude Shannon, the father of information theory. For a set of examples S:

    Entropy(S) = Σ_i -p_i log2 p_i

where p_i is the proportion of examples in S belonging to classification i (such as e.g. positive or negative). For a two-class problem the entropy is a value between 0 and 1, where 1 means the set is maximally mixed and there is the most information left to gain. The information gain of a question is the reduction in entropy achieved by splitting the examples on it. The question with the highest information gain, out of the remaining questions, is chosen at each node until all relevant questions are asked (questions with a gain of 0 are pointless to ask, since they do not add any information to the tree). Since the gain of each question changes as new examples are added to the training domain, the tree would have to be reconstructed for every new example. This is obviously not very practical. A better approach is to wait until a notable amount of new examples (e.g. 10%) have been added to the original set and then reconstruct.
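The entropy measure and the resulting information gain can be sketched in Python; the toy "outlook" attribute and the examples below are invented for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a label list: sum over classes of -p_i * log2(p_i)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(examples, labels, attribute):
    """Reduction in entropy obtained by splitting the examples on one attribute."""
    n = len(labels)
    before = entropy(labels)
    # group the labels by the attribute's value
    groups = {}
    for ex, lab in zip(examples, labels):
        groups.setdefault(ex[attribute], []).append(lab)
    # weighted average entropy of the subsets after the split
    after = sum((len(subset) / n) * entropy(subset) for subset in groups.values())
    return before - after

# Hypothetical toy data: a yes/no classification split perfectly by "outlook".
examples = [{"outlook": "sunny"}, {"outlook": "sunny"},
            {"outlook": "rain"}, {"outlook": "rain"}]
labels = ["no", "no", "yes", "yes"]
print(entropy(labels))                                # 1.0 (evenly mixed set)
print(information_gain(examples, labels, "outlook"))  # 1.0 (pure subsets)
```

A gain of 1.0 means the question resolves all remaining uncertainty, so it would be asked first.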
Practice

When looking for implementations of algorithms using decision trees, I soon discovered that most available systems were more or less associated with data mining and expert systems. Some examples that I found were Alice from Isoft (France) and CART from Salford Systems (USA). I examined the CART system on a Windows 98 platform and found that it had a very nice graphical interface that showed the decision tree. The major limitation of CART was that it only produced binary trees, but there were also many interesting parameters that could be tuned in the tree-modeling process. A free demo of the CART system can be downloaded from Salford Systems' website:

http://www.salford-systems.com/

Artificial Neural Networks

Theory

Artificial neural networks have proved to be an efficient approach to learning real-valued functions over both continuous and discrete-valued attributes. One of their biggest advantages is that they are robust to noise in the training data. This has contributed to successful applications such as face and handwriting recognition, robot control, language translation, pronunciation software, vehicle control, etc.

In a "pure" artificial neural network, all the nodes work in parallel, which requires special and expensive hardware. Most of today's implementations of artificial neural networks therefore run on single-processor computers that simulate the parallelism, which yields only a fraction of the speed of a "pure" neural network. The idea for artificial neural networks came from biology.

Figure 2 Artificial Neural Network: example of a neural network for establishing the identity of a human face in a picture.
Biology

The idea for artificial neural networks originated from studies of the brain. Since the brain seems to have an unprecedented ability to learn a wide range of things, it has been an inspiring challenge to copy its characteristics. The thinking part of the brain is a vast network made up of approximately 10^11 nerve cells, called neurons. Each neuron is connected to, on average, ten thousand other cells, and the connections are organized in layers. Each neuron consists of a cell body called the soma, several shorter fibers called dendrites, and a long output fiber called the axon. A junction called the synapse serves as a connection between cells. When a signal propagates from neuron to neuron, it is first handled by the synapse, which can either increase or decrease its electrochemical potential. Synapses have the ability to change their characteristics over time; this ability, researchers believe, is what we refer to as learning. The synapses then lead the signal into the cell via the dendrites. If the total potential of the cell reaches a certain threshold, an electric pulse, also called an action potential, is sent down the axon and on to the synapses. This is repeated for each neuron layer in the network; the last layer is the output layer.

Figure 3 Neuron – the Brain

Computer

The computer counterpart of the neuron is the unit, and of its connections the links. Each link has a numeric weight, and by updating the weights the links come to serve the same function as the synapses in the brain. The input function sums all the incoming signals multiplied by their associated weights. The activation function (f in figure 4) then determines whether to send an activation signal (a in figure 4) onto the output links. There are several ideas for different activation functions, but they all have in common that they depend on whether the sum of the input function reaches the threshold or not.

Figure 4 Unit – the Computer

Some units are connected to the outside environment and assigned as input or output units. The remaining units are called hidden units and form network layers between the input and output layers. There are two major varieties of network structures: feed-forward and recurrent networks. In a feed-forward network the signal only travels in one direction and there are no loops; in a recurrent network there are no such restrictions. The recurrent network is much more powerful and can hold memory, but it is also more vulnerable to chaotic behavior and instability. The brain is a recurrent network; other examples of successful recurrent networks are the Hopfield and Boltzmann networks.

The simplest form of feed-forward network is the perceptron, which has no hidden units. Even so, perceptrons are able to represent boolean functions such as AND, OR and NOT. Networks with one or more hidden layers are called multilayer networks. The most popular method for learning in multilayer networks is backpropagation. The basic idea in backpropagation is to minimize the squared error between the network output and the target values of the training examples by dividing the "blame" among the contributing weights.

Overfitting is an important issue in machine learning, and especially so in neural network learning. Overfitting occurs when a network is trained too much on a small domain of training data: it then performs very well on that data, but cannot generalize sufficiently when new data is added.

Figure 5 Perceptron
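The behavior of a single unit can be sketched as follows; the step activation function and the hand-picked weights for boolean AND are illustrative assumptions of mine, not learned values:

```python
def unit_output(inputs, weights, threshold):
    """A single unit: sum the weighted inputs (the input function), then
    apply a step activation function against the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Boolean AND with two inputs: the unit fires only when both inputs are 1,
# because only then does the weighted sum (2.0) reach the threshold (1.5).
and_weights, and_threshold = [1.0, 1.0], 1.5
for a in (0, 1):
    for b in (0, 1):
        print(a, b, unit_output([a, b], and_weights, and_threshold))
```

Swapping the threshold for 0.5 would turn the same unit into boolean OR, which illustrates how the weights and threshold together determine the function a unit represents.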
Perceptrons are single-layered feed-forward networks. They were the first approach to artificial neural networks that computer scientists began to study, in the late 1950s.

Practice

I first looked at Tom M. Mitchell's implementation of face recognition using an artificial neural network. It is made for the Unix platform and can be downloaded over the Internet (see URL below). It requires a graphic-display program in order to view the images processed by the system. The images used in the system are in the pgm format; I used XV by John Bradley to view them. The system gave an interesting insight into how artificial neural networks can discover patterns in, for example, pictures. The main drawback of this system was that it used tiny images, only 32x30 pixels in size, which took some getting used to.

http://www.cs.cmu.edu/afs/cs/zproject/theo-11/www/decision-trees.html

I also looked at a similar commercial system, ImageFinder 3.4 from Attrasoft Inc. It had a very user-friendly interface and was easy to get started with. ImageFinder is a Java application that can take gif or jpeg images as input and learn their characteristics in an artificial neural network. The user decides how many images to learn and how many times to "practice" on them. When the network is done, the user can specify a directory in which to search for similar images. The output is the names of the closest-resembling images and a score based on how closely they resemble the training examples. Unfortunately, the free demo that I could download did not allow any adjustment of parameters, which would have made it even more interesting to evaluate.

http://attrasoft.com/
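As a rough sketch of the backpropagation idea described in the Computer section: a tiny two-layer network trained by gradient descent on the squared error, with the "blame" for the output error propagated back to the hidden weights. The network shape, learning rate and toy task below are all invented for illustration:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(w_hidden, w_out, x, target, lr=0.5):
    """One backpropagation step for a 2-input, 2-hidden-unit, 1-output net.
    Returns the squared error before the weight update."""
    # forward pass: hidden activations, then the output
    h = [sigmoid(sum(wi * xi for wi, xi in zip(w, x))) for w in w_hidden]
    out = sigmoid(sum(wi * hi for wi, hi in zip(w_out, h)))
    # backward pass: output delta, then each hidden unit's share of the blame
    d_out = (out - target) * out * (1 - out)
    d_hidden = [d_out * w_out[j] * h[j] * (1 - h[j]) for j in range(len(h))]
    # gradient-descent weight updates
    for j in range(len(w_out)):
        w_out[j] -= lr * d_out * h[j]
    for j, w in enumerate(w_hidden):
        for i in range(len(w)):
            w[i] -= lr * d_hidden[j] * x[i]
    return 0.5 * (out - target) ** 2

rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
w_out = [rng.uniform(-1, 1) for _ in range(2)]
# toy task: output 1 for input [1, 0] and 0 for [0, 1]
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
first = sum(train_step(w_hidden, w_out, x, t) for x, t in data)
for _ in range(500):
    for x, t in data:
        train_step(w_hidden, w_out, x, t)
last = sum(train_step(w_hidden, w_out, x, t) for x, t in data)
print(first, last)  # the error shrinks as training proceeds
```

Real systems such as the face-recognition network above use the same principle, only with far more units and weights.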
Figure 6 ImageFinder 3.4

Evolutionary computation

Theory

Genetic algorithms and genetic programming are two closely related forms of evolutionary computation. Some authors consider the terms synonyms, while others refer to genetic algorithms when the hypothesis, or "gene", is a simple bit string, and to genetic programming when the hypothesis is more advanced, usually symbolic expressions or program code. Genetic algorithms have been successfully utilized especially on optimization problems. Since many problems can be thought of as optimization problems, this is no great limitation to their usefulness.

Genetic algorithms

Background

Nature is the best known producer of robust and successful organisms. Over time, organisms that are not well suited for their environment die off, while others that are better suited live to reproduce. Parents form offspring, so that each new generation carries on earlier generations' experience. If the environment changes slowly, species can adapt to the changes. Occasionally, random mutations take place. Most of these result in death for the mutated individual, but a few result in new successful species. These facts were first revealed by Charles Darwin in his publication On the Origin of Species by Means of Natural Selection.
Figure 7 Reproduction in genetic algorithms

Algorithm

In order to simulate evolution, the algorithm needs a metric to establish which selections are better than others with respect to solving the problem at hand. This metric is called the fitness function. The most promising individuals (those with the highest scores on the fitness function) then receive a higher reproduction likelihood. The next step is to decide where in the bit string to make the crossover. This is usually done randomly somewhere along the string. Then the parts of the original bit strings are swapped to form two new strings. This is the natural selection part of the algorithm. But if this were the only step in the reproduction, most algorithms would only be able to find local optima. To solve this problem, the algorithm incorporates another basic component of regeneration in nature: mutation. This way an individual bit string can leave a population that is "stuck". The chance of a mutation is usually very low.

Evolution algorithm
1. Choose individuals for reproduction based on the fitness function.
2. Choose where to make the crossover.
3. Reproduce using crossover.
4. Mutate single bits with a small random chance.
5. Repeat from step 1.

Genetic programming

Genetic programming differs from genetic algorithms in that it strives to optimize code and not bit strings. Programs manipulated by a genetic programming system are usually represented by trees corresponding to the parse tree of the program. Just as in genetic algorithms, the individuals produce new generations through selection, crossover and mutation. The fitness
of an individual is usually determined by executing the program on training data. Crossover is performed by swapping randomly selected subtrees.

Figure 8 Crossover operation in genetic programming

Practice

I found several very interesting sites on evolutionary programming on the Web. Two of the best were Java applets: one of the game Tron and the other a site called the GA Playground.

Tron: http://dendrite.cs.brandeis.edu/tron/
The GA playground: http://www.aridolan.com/ga/gaa/gaa.html

Tron is a computer game based on the 1982 Walt Disney movie of the same name. It uses a genetic algorithm in order to learn from previous games. According to the people behind the program, they "... have put a genetic learning algorithm online. A 'background' GA generates players by having the computer play itself. A 'foreground' GA leads the evolutionary process, evaluating players by their performance against real people." It is currently very hard to beat the computer at Tron; I succeeded in only two out of approximately 50 games. The on-line game Tron is a good example of a successful application of evolutionary computation.
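The five-step evolution algorithm listed earlier can be sketched in a few dozen lines. The following toy genetic algorithm uses the "OneMax" problem (maximize the number of 1-bits) as a stand-in fitness function; all names, sizes and rates are illustrative choices, not taken from any of the systems discussed.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

LENGTH, POP, GENERATIONS, MUTATION = 20, 30, 60, 0.01

def fitness(bits):
    # OneMax: simply count the 1-bits.
    return sum(bits)

def select(pop):
    # Step 1: fitter individuals get a proportionally higher chance.
    return random.choices(pop, weights=[fitness(b) + 1 for b in pop], k=2)

def crossover(a, b):
    # Steps 2-3: pick a random crossover point and swap the tails.
    point = random.randrange(1, LENGTH)
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(bits):
    # Step 4: flip each bit with a small probability.
    return [bit ^ 1 if random.random() < MUTATION else bit for bit in bits]

population = [[random.randint(0, 1) for _ in range(LENGTH)]
              for _ in range(POP)]
for _ in range(GENERATIONS):          # step 5: repeat
    nxt = []
    while len(nxt) < POP:
        a, b = select(population)
        for child in crossover(a, b):
            nxt.append(mutate(child))
    population = nxt[:POP]

best = max(population, key=fitness)
```

After a few dozen generations the best individual is close to the all-ones string; removing the mutation step makes it much easier for the population to get "stuck", as described above.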
Figure 9 Tron - Computer winning rate. The evolution of the Tron program over time, according to the authors.

The other site explored, the GA Playground, was similar in that it provided on-line Java applets for the evaluation of algorithms. This site offered more freedom to choose different algorithms and parameters for different, user-selected problems. One interesting example was the Travelling Salesman Problem, which was implemented in three different cases (all cities on a circle, cities in Bavaria, and capitals of the US). These examples gave good insight into how the applet worked.
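Returning to genetic programming for a moment, the subtree-swapping crossover shown in Figure 8 can also be sketched. Here programs are parse trees written as nested lists (operator first, then operands); the trees and helper names are my own illustrative choices.

```python
import random

random.seed(1)

def subtrees(tree, path=()):
    """Enumerate (path, subtree) pairs for every node in the tree."""
    yield path, tree
    if isinstance(tree, list):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    """Return a copy of tree with the node at path swapped for new."""
    if not path:
        return new
    copy = list(tree)
    copy[path[0]] = replace(copy[path[0]], path[1:], new)
    return copy

def crossover(a, b):
    """Swap one randomly chosen subtree between the two parents."""
    pa, sa = random.choice(list(subtrees(a)))
    pb, sb = random.choice(list(subtrees(b)))
    return replace(a, pa, sb), replace(b, pb, sa)

parent1 = ['+', 'x', ['*', 'x', 'x']]   # the expression x + x*x
parent2 = ['-', ['*', 2, 'x'], 1]       # the expression 2*x - 1
child1, child2 = crossover(parent1, parent2)
```

In a full genetic programming system the children would then be executed on training data to determine their fitness, exactly as described above.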
Figure 10 GA Playground's TSP solving algorithm

The GA Playground has a very nice, adjustable user interface that allows for different setups. Some features require that the program is downloaded and run as an application.

Other approaches

There are several other interesting approaches to machine learning besides the ones mentioned above. Probably Approximately Correct (PAC) learning is one good model. The Bayesian learning model is another. It is built around Bayes' theorem for calculating the posterior probability P(h|D) from the prior probability P(h), together with P(D) and P(D|h).

Bayes' theorem: P(h|D) = P(D|h) P(h) / P(D)

A third promising model is reinforcement learning. It is closely related to dynamic programming and frequently used to solve optimization problems. The Q learning algorithm is an interesting example from this category. There are many more, and since this is a fairly young science, even more are sure to come.
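The Bayesian calculation above is mechanical once the three probabilities are known. A tiny worked example, with numbers invented purely for illustration:

```python
# Bayes' theorem: the posterior P(h|D) from the prior P(h),
# the likelihood P(D|h), and the overall evidence P(D).

def posterior(prior_h, likelihood, evidence):
    """P(h|D) = P(D|h) * P(h) / P(D)."""
    return likelihood * prior_h / evidence

# Example: a hypothesis with prior 0.3; the observed data D has
# probability 0.8 under the hypothesis and 0.5 overall.
p = posterior(prior_h=0.3, likelihood=0.8, evidence=0.5)
# The data is likelier under h than overall, so the posterior (0.48)
# is higher than the prior (0.3).
```

A Bayesian learner applies this update to every candidate hypothesis and prefers the one with the highest posterior.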
Conclusion

Throughout my personal project I have studied a variety of algorithms that have proven to be more or less useful for their different objectives. Some systems have been pure genetic algorithms or pure artificial neural networks, while others have integrated different approaches in an attempt to get the best of each algorithm. Different algorithms have different advantages and disadvantages; for example, decision trees are better suited for discrete-valued environments. I have found that accurate knowledge of the characteristics of the problem, together with basic knowledge of the algorithms, is essential to finding a good algorithm for the task.

Some problems are better suited for machine learning algorithms than others. This may be because there is still a long way to go in the science of machine learning, or because some of the expectations placed on machine learning are too high. For instance, Russell and Norvig suggest in Artificial Intelligence – a modern approach that it might always be worth trying a simple implementation of an artificial neural network, or even a genetic algorithm, on a problem just to see if it works. Our knowledge of how, for example, neural networks work is very limited, especially for recurrent networks. There are other aspects where knowledge is important: the occurrence of overfitting, for instance, is a trap that anyone dealing with machine learning should be aware of. Even if the algorithm is perfect, the handling of the training data set is still a delicate matter.
References

Books

Russell, Stuart & Norvig, Peter (1995). Artificial Intelligence – A Modern Approach. USA. http://www.cs.berkeley.edu/~russell/aima.html
Mitchell, Tom M. (1997). Machine Learning. McGraw-Hill, USA. http://www.cs.cmu.edu/~tom/mlbook.html

Websites

http://www.isoft.fr/
http://www.salford-systems.com/
http://dendrite.cs.brandeis.edu/tron/
http://www.aridolan.com/ga/gaa/gaa.html