Decision trees use a tree structure to show possible consequences of decisions. A decision tree has internal nodes that represent tests of attributes, branches that represent test outcomes, and leaf nodes that represent class labels. Entropy is a measure of uncertainty in a random variable. Information gain is used to calculate the difference between entropy before and after splitting on an attribute, to determine which attribute provides the most information about the class. The attribute with the highest information gain is selected as the root node of the decision tree.
2. Decision Tree
A decision tree is a decision support tool that
uses a tree-like model of decisions and their possible consequences.
A decision tree is a flow-chart-like structure in which:
each internal node represents a test on an attribute
each branch represents an outcome of the test
each leaf node represents a class label (the decision taken after computing all attributes)
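As a rough sketch, such a structure can be written down directly in Python; the nested-dict layout and the classify helper below are illustrative choices of mine, not something from the slides (the attribute names come from the decision table later in the deck):

tree = {
    "attribute": "Length",                  # internal node: test on an attribute
    "branches": {
        "long": "skip",                     # leaf node: class label
        "short": {                          # branch outcome leads to another test
            "attribute": "Thread",
            "branches": {"new": "read", "old": "skip"},
        },
    },
}

def classify(node, example):
    # Follow the branch matching the example's attribute value until a leaf is reached.
    while isinstance(node, dict):
        node = node["branches"][example[node["attribute"]]]
    return node

print(classify(tree, {"Length": "short", "Thread": "new"}))  # read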
3. Components of a DT
A decision tree consists of 3 types of nodes:
1. Decision nodes
2. Chance nodes
3. End nodes
4. Types of variables in DT
Four types of tree can be generated from a variable. Those are:
[Figure: the four branching patterns - terminal, both branches on the left side, both branches on the right side, and branches separated on both sides]
5. Decision Table
Evidence  Action  Author   Thread  Length
e1        skip    known    new     long
e2        read    unknown  new     short
e3        skip    unknown  old     long
e4        skip    known    old     long
e5        read    known    new     short
e6        skip    known    old     long
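For the calculations on the later slides, the same table can be sketched in Python; the list name examples and the treatment of Action as the class label are my assumptions:

# Each evidence item maps attribute names to values; "Action" is the class label.
examples = [
    {"Author": "known",   "Thread": "new", "Length": "long",  "Action": "skip"},  # e1
    {"Author": "unknown", "Thread": "new", "Length": "short", "Action": "read"},  # e2
    {"Author": "unknown", "Thread": "old", "Length": "long",  "Action": "skip"},  # e3
    {"Author": "known",   "Thread": "old", "Length": "long",  "Action": "skip"},  # e4
    {"Author": "known",   "Thread": "new", "Length": "short", "Action": "read"},  # e5
    {"Author": "known",   "Thread": "old", "Length": "long",  "Action": "skip"},  # e6
]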
8. Entropy
Entropy is a measure of the uncertainty in a random variable.
The term entropy usually refers to the Shannon entropy, which quantifies the expected value of the information contained in a message.
Given a random variable V with values v_k, the entropy of V is defined by

H(V) = -\sum_k P(v_k) \log_2 P(v_k)
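A minimal sketch of this formula in Python; the function name entropy and the frequency-based estimate of P(v_k) are my assumptions:

import math
from collections import Counter

def entropy(values):
    # H(V) = -sum_k P(v_k) * log2 P(v_k), with P(v_k) estimated from frequencies
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())

# The Action column of the decision table above: 2 read, 4 skip
print(entropy(["skip", "read", "skip", "skip", "read", "skip"]))  # ≈ 0.918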
9. Entropy Measurement Unit
bit
  Logarithmic unit, base 2; values {0, 1}
nat
  Also known as nit or nepit
  Logarithmic unit, base e
  1 nat = 1.44 bits = 0.434 ban
ban
  Also known as hartley or dit (short for decimal digit)
  Logarithmic unit, base 10
  Introduced by Alan Turing and I. J. Good
  1 ban = 3.32 bits = 2.30 nats
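A quick check of these conversion factors in Python (assuming the standard base-change definitions):

import math

print(1 / math.log(2), 1 / math.log(10))  # 1 nat ≈ 1.443 bits ≈ 0.434 ban
print(math.log2(10), math.log(10))        # 1 ban ≈ 3.322 bits ≈ 2.303 nats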
11. Entropy for p + n examples
Suppose we have p + n examples, where p are positive and n are negative. The entropy of this collection is the entropy of a Boolean variable that is true with probability p/(p+n):

B\!\left(\frac{p}{p+n}\right) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}
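A sketch of this Boolean-entropy formula in Python (the name boolean_entropy is mine; the 0 * log2(0) = 0 convention is assumed):

import math

def boolean_entropy(p, n):
    # B(p/(p+n)); skip zero counts so that 0 * log2(0) is treated as 0
    total = p + n
    return -sum((c / total) * math.log2(c / total) for c in (p, n) if c)

print(boolean_entropy(2, 4))  # ≈ 0.918, matching the 2-read / 4-skip table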
12. Remainder
The expected entropy (EH), or remainder, remaining after testing attribute A (with branches k = 1, 2, ..., d) is:

\mathrm{Remainder}(A) = \sum_{k=1}^{d} \frac{p_k + n_k}{p + n}\, B\!\left(\frac{p_k}{p_k + n_k}\right)
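A sketch of this formula, reusing the examples list and boolean_entropy from the earlier sketches; the defaults label="Action" and positive="read" are my assumptions for this table:

def remainder(examples, attribute, label="Action", positive="read"):
    # Group the examples by their value of `attribute`, then weight each
    # branch's Boolean entropy by the fraction of examples reaching it.
    branches = {}
    for ex in examples:
        branches.setdefault(ex[attribute], []).append(ex)
    total = len(examples)
    rem = 0.0
    for subset in branches.values():
        p = sum(ex[label] == positive for ex in subset)
        rem += len(subset) / total * boolean_entropy(p, len(subset) - p)
    return rem

print(round(remainder(examples, "Thread"), 3))  # ≈ 0.459 for the table above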
13. Information Gain (IG)
In general, information gain (Kullback-Leibler divergence) is a non-symmetric measure of the difference between two probability distributions P and Q. In decision tree learning, the information gain from testing attribute A is the expected reduction in entropy:

\mathrm{Gain}(A) = B\!\left(\frac{p}{p+n}\right) - \mathrm{Remainder}(A)
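Putting the pieces together, a sketch of the gain computation over the earlier examples list, again reusing boolean_entropy and remainder from the sketches above:

def gain(examples, attribute, label="Action", positive="read"):
    # Gain(A) = B(p/(p+n)) - Remainder(A)
    p = sum(ex[label] == positive for ex in examples)
    return boolean_entropy(p, len(examples) - p) - remainder(
        examples, attribute, label, positive)

for attr in ("Author", "Thread", "Length"):
    print(attr, round(gain(examples, attr), 3))
# Author 0.044, Thread 0.459, Length 0.918
# Length yields the highest gain, so it would be selected as the root node.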