Decision trees use a tree structure to show possible consequences of decisions. A decision tree has internal nodes that represent tests of attributes, branches that represent test outcomes, and leaf nodes that represent class labels. Entropy is a measure of uncertainty in a random variable. Information gain is used to calculate the difference between entropy before and after splitting on an attribute, to determine which attribute provides the most information about the class. The attribute with the highest information gain is selected as the root node of the decision tree.
2. Decision Tree
A decision tree is a decision support tool that
uses a tree-like model of decisions and their possible consequences.
A decision tree is a flow-chart-like structure in which:
each internal node represents a test on an attribute
each branch represents an outcome of the test
each leaf node represents a class label (the decision taken after computing all attributes)
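As a rough sketch, such a structure can be written down directly in Python; the nested-dict layout and the classify helper below are illustrative choices of mine, not something from the slides (the attribute names come from the decision table later in the deck):

tree = {
    "attribute": "Length",                  # internal node: test on an attribute
    "branches": {
        "long": "skip",                     # leaf node: class label
        "short": {                          # branch outcome leads to another test
            "attribute": "Thread",
            "branches": {"new": "read", "old": "skip"},
        },
    },
}

def classify(node, example):
    # Follow the branch matching the example's attribute value until a leaf is reached.
    while isinstance(node, dict):
        node = node["branches"][example[node["attribute"]]]
    return node

print(classify(tree, {"Length": "short", "Thread": "new"}))  # read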
3. Components of a DT
A decision tree consists of 3 types of nodes:
1. Decision nodes
2. Chance nodes
3. End nodes
4. Types of variables in DT
Four types of tree can be generated from a variable. Those are:
[Figure: the four branching patterns - terminal, both branches on the left side, both branches on the right side, and branches separated on both sides]
5. Decision Table
Evidence  Action  Author   Thread  Length
e1        skip    known    new     long
e2        read    unknown  new     short
e3        skip    unknown  old     long
e4        skip    known    old     long
e5        read    known    new     short
e6        skip    known    old     long
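For the calculations on the later slides, the same table can be sketched in Python; the list name examples and the treatment of Action as the class label are my assumptions:

# Each evidence item maps attribute names to values; "Action" is the class label.
examples = [
    {"Author": "known",   "Thread": "new", "Length": "long",  "Action": "skip"},  # e1
    {"Author": "unknown", "Thread": "new", "Length": "short", "Action": "read"},  # e2
    {"Author": "unknown", "Thread": "old", "Length": "long",  "Action": "skip"},  # e3
    {"Author": "known",   "Thread": "old", "Length": "long",  "Action": "skip"},  # e4
    {"Author": "known",   "Thread": "new", "Length": "short", "Action": "read"},  # e5
    {"Author": "known",   "Thread": "old", "Length": "long",  "Action": "skip"},  # e6
]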
8. Entropy
Entropy is a measure of the uncertainty in a random variable.
The term entropy usually refers to the Shannon entropy, which quantifies the expected value of the information contained in a message.
Given a random variable V with values v_k, the entropy of V is defined by

H(V) = -\sum_k P(v_k) \log_2 P(v_k)
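A minimal sketch of this formula in Python; the function name entropy and the frequency-based estimate of P(v_k) are my assumptions:

import math
from collections import Counter

def entropy(values):
    # H(V) = -sum_k P(v_k) * log2 P(v_k), with P(v_k) estimated from frequencies
    total = len(values)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(values).values())

# The Action column of the decision table above: 2 read, 4 skip
print(entropy(["skip", "read", "skip", "skip", "read", "skip"]))  # ≈ 0.918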
9. Entropy Measurement Unit
bit
  Logarithmic unit, base 2; values {0, 1}
nat
  Also known as nit or nepit
  Logarithmic unit, base e
  1 nat = 1.44 bits = 0.434 ban
ban
  Also known as hartley or dit (short for decimal digit)
  Logarithmic unit, base 10
  Introduced by Alan Turing and I. J. Good
  1 ban = 3.32 bits = 2.30 nats
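A quick check of these conversion factors in Python (assuming the standard base-change definitions):

import math

print(1 / math.log(2), 1 / math.log(10))  # 1 nat ≈ 1.443 bits ≈ 0.434 ban
print(math.log2(10), math.log(10))        # 1 ban ≈ 3.322 bits ≈ 2.303 nats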
11. Entropy for p + n examples
Suppose we have p + n examples, where p are positive and n are negative. The entropy of this collection is the entropy of a Boolean variable that is true with probability p/(p+n):

B\!\left(\frac{p}{p+n}\right) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}
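A sketch of this Boolean-entropy formula in Python (the name boolean_entropy is mine; the 0 * log2(0) = 0 convention is assumed):

import math

def boolean_entropy(p, n):
    # B(p/(p+n)); skip zero counts so that 0 * log2(0) is treated as 0
    total = p + n
    return -sum((c / total) * math.log2(c / total) for c in (p, n) if c)

print(boolean_entropy(2, 4))  # ≈ 0.918, matching the 2-read / 4-skip table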
12. Remainder
The expected entropy (EH), or remainder, remaining after testing attribute A (with branches k = 1, 2, ..., d) is:

\mathrm{Remainder}(A) = \sum_{k=1}^{d} \frac{p_k + n_k}{p + n}\, B\!\left(\frac{p_k}{p_k + n_k}\right)
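A sketch of this formula, reusing the examples list and boolean_entropy from the earlier sketches; the defaults label="Action" and positive="read" are my assumptions for this table:

def remainder(examples, attribute, label="Action", positive="read"):
    # Group the examples by their value of `attribute`, then weight each
    # branch's Boolean entropy by the fraction of examples reaching it.
    branches = {}
    for ex in examples:
        branches.setdefault(ex[attribute], []).append(ex)
    total = len(examples)
    rem = 0.0
    for subset in branches.values():
        p = sum(ex[label] == positive for ex in subset)
        rem += len(subset) / total * boolean_entropy(p, len(subset) - p)
    return rem

print(round(remainder(examples, "Thread"), 3))  # ≈ 0.459 for the table above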
13. Information Gain (IG)
In general, information gain (Kullback-Leibler divergence) is a non-symmetric measure of the difference between two probability distributions P and Q. In decision tree learning, the information gain from testing attribute A is the expected reduction in entropy:

\mathrm{Gain}(A) = B\!\left(\frac{p}{p+n}\right) - \mathrm{Remainder}(A)
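Putting the pieces together, a sketch of the gain computation over the earlier examples list, again reusing boolean_entropy and remainder from the sketches above:

def gain(examples, attribute, label="Action", positive="read"):
    # Gain(A) = B(p/(p+n)) - Remainder(A)
    p = sum(ex[label] == positive for ex in examples)
    return boolean_entropy(p, len(examples) - p) - remainder(
        examples, attribute, label, positive)

for attr in ("Author", "Thread", "Length"):
    print(attr, round(gain(examples, attr), 3))
# Author 0.044, Thread 0.459, Length 0.918
# Length yields the highest gain, so it would be selected as the root node.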