1. ROUGH SET & ITS VARIANTS: VARIABLE
PRECISION ROUGH SET AND FUZZY ROUGH
SET APPROACHES
Presented by
Rajdeep Chatterjee
PLP, MIU ISI Kolkata
2. Overview
Introduction to Rough Set
Information/Decision Systems
Indiscernibility
Set Approximations of Rough Set
Reducts and Core
Dependency of Attributes
Variable Precision Rough Set (VPRS)
Set Approximations (VPRS)
Fuzzy Rough Set (FRS)
Set Approximations and Dependency (FRS)
Observations
3. Introduction
Often, information on the surrounding world is
◦ Imprecise
◦ Incomplete
◦ Uncertain
We should be able to process uncertain and/or incomplete information.
4. Introduction
"Rough set theory" was developed by
Zdzislaw Pawlak in the early 1980s.
Representative publications:
◦ Z. Pawlak, "Rough Sets", International Journal of Computer and Information Sciences, Vol. 11, 341-356 (1982).
◦ Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers (1991).
5. Information Systems
Age LEMS
X1 16-30 50
X2 16-30 0
X3 31-45 1-25
X4 31-45 1-25
X5 46-60 26-49
X6 16-30 26-49
X7 46-60 26-49
An information system is a pair IS = (U, A), where
U is a non-empty finite set of objects, and
A is a non-empty finite set of attributes such that
$a : U \rightarrow V_a$ for every $a \in A$.
$V_a$ is called the value set of $a$.
6. Decision Systems
Age LEMS Walk
X1 16-30 50 Yes
X2 16-30 0 No
X3 31-45 1-25 No
X4 31-45 1-25 Yes
X5 46-60 26-49 No
X6 16-30 26-49 Yes
X7 46-60 26-49 No
A decision system is a pair $T = (U, A \cup \{d\})$, where
$d \notin A$ is the decision attribute (instead of one,
we can consider more decision attributes).
The elements of A are called the condition attributes.
In the table above, Age and LEMS are the condition
attributes and Walk is the decision attribute.
7. Indiscernibility
An equivalence relation is a binary relation $R \subseteq X \times X$ which is
reflexive ($xRx$ for any object $x$),
symmetric (if $xRy$ then $yRx$), and
transitive (if $xRy$ and $yRz$ then $xRz$).
The equivalence class $[x]_R$ of an element $x \in X$
consists of all objects $y \in X$ such that $xRy$.
8. Indiscernibility
Let IS = (U, A) be an information system; then with
any $B \subseteq A$ there is an associated equivalence relation
$IND_{IS}(B) = \{(x, x') \in U^2 \mid \forall a \in B,\ a(x) = a(x')\}$,
where $IND_{IS}(B)$ is called the B-indiscernibility relation.
If $(x, x') \in IND_{IS}(B)$, then objects x and x' are
indiscernible from each other by attributes from B.
The equivalence classes of the B-indiscernibility
relation are denoted by $[x]_B$.
9. An Example of Indiscernibility
Age LEMS Walk
X1 16-30 50 Yes
X2 16-30 0 No
X3 31-45 1-25 No
X4 31-45 1-25 Yes
X5 46-60 26-49 No
X6 16-30 26-49 Yes
X7 46-60 26-49 No
The non-empty
subsets of the
condition attributes
are {Age}, {LEMS}, and
{Age, LEMS}.
IND({Age}) =
{{x1,x2,x6}, {x3,x4},
{x5,x7}}
IND({LEMS}) = {{x1},
{x2}, {x3,x4}, {x5,x6,x7}}
IND({Age,LEMS}) =
{{x1}, {x2}, {x3,x4},
{x5,x7}, {x6}}.
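These partitions can be computed mechanically. Below is a minimal Python sketch; the dictionary encoding of the table and the helper name ind_classes are illustrative choices, not from the slides.

from collections import defaultdict

# The information system from the slides, encoded as object -> attribute values.
table = {
    'x1': {'Age': '16-30', 'LEMS': '50'},
    'x2': {'Age': '16-30', 'LEMS': '0'},
    'x3': {'Age': '31-45', 'LEMS': '1-25'},
    'x4': {'Age': '31-45', 'LEMS': '1-25'},
    'x5': {'Age': '46-60', 'LEMS': '26-49'},
    'x6': {'Age': '16-30', 'LEMS': '26-49'},
    'x7': {'Age': '46-60', 'LEMS': '26-49'},
}

def ind_classes(table, B):
    # Objects agreeing on every attribute in B fall into the same class.
    classes = defaultdict(set)
    for x, row in table.items():
        classes[tuple(row[a] for a in B)].add(x)
    return list(classes.values())

print(ind_classes(table, ['Age']))          # {x1,x2,x6}, {x3,x4}, {x5,x7}
print(ind_classes(table, ['Age', 'LEMS']))  # {x1}, {x2}, {x3,x4}, {x5,x7}, {x6}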
10. Set Approximation
Let T = (U, A), $B \subseteq A$ and $X \subseteq U$.
We can approximate X using only the
information contained in B by
constructing the B-lower and B-upper
approximations of X, denoted $\underline{B}X$ and
$\overline{B}X$ respectively, where
$\underline{B}X = \{x \mid [x]_B \subseteq X\}$,
$\overline{B}X = \{x \mid [x]_B \cap X \neq \emptyset\}$.
11. Set Approximation
The B-boundary region of X,
$BN_B(X) = \overline{B}X - \underline{B}X$,
consists of those objects that we cannot
decisively classify into X on the basis of B.
The B-outside region of X,
$U - \overline{B}X$,
consists of those objects that can with
certainty be classified as not belonging to X.
A set is said to be "rough" if its boundary
region is non-empty; otherwise the set is crisp.
12. An Example of Set Approximation
Let W = {x | Walk(x) = Yes}.
$\underline{A}W = \{x1, x6\}$,
$\overline{A}W = \{x1, x3, x4, x6\}$,
$BN_A(W) = \{x3, x4\}$,
$U - \overline{A}W = \{x2, x5, x7\}$.
The decision class Walk is rough, since
the boundary region is not empty.
Age LEMS Walk
X1 16-30 50 Yes
X2 16-30 0 No
X3 31-45 1-25 No
X4 31-45 1-25 Yes
X5 46-60 26-49 No
X6 16-30 26-49 Yes
X7 46-60 26-49 No
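Reusing ind_classes from the earlier sketch, the approximations above can be reproduced as follows (lower_upper is an illustrative helper name):

def lower_upper(table, B, X):
    lower, upper = set(), set()
    for E in ind_classes(table, B):
        if E <= X:        # [x]_B fully inside X: certainly in X
            lower |= E
        if E & X:         # [x]_B overlaps X: possibly in X
            upper |= E
    return lower, upper

W = {'x1', 'x4', 'x6'}    # objects with Walk = Yes
lower, upper = lower_upper(table, ['Age', 'LEMS'], W)
print(sorted(lower))          # ['x1', 'x6']
print(sorted(upper))          # ['x1', 'x3', 'x4', 'x6']
print(sorted(upper - lower))  # boundary region: ['x3', 'x4']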
13. A Pictorial Depiction of Set Approximation
[Figure: the universe is partitioned into three zones:
"yes" -- the lower approximation $\underline{A}W$: {{x1}, {x6}};
"yes/no" -- the boundary region: {{x3,x4}};
"no" -- the outside region: {{x2}, {x5,x7}}.]
17. Four Basic Classes of Rough Sets
X is roughly B-definable iff $\underline{B}(X) \neq \emptyset$ and $\overline{B}(X) \neq U$.
X is internally B-undefinable iff $\underline{B}(X) = \emptyset$ and $\overline{B}(X) \neq U$.
X is externally B-undefinable iff $\underline{B}(X) \neq \emptyset$ and $\overline{B}(X) = U$.
X is totally B-undefinable iff $\underline{B}(X) = \emptyset$ and $\overline{B}(X) = U$.
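As a sketch, the four cases can be decided directly from the two approximations computed earlier (the function name is an illustrative choice):

def roughness_type(lower, upper, U):
    # Classify X by emptiness of its lower approximation and
    # whether its upper approximation covers the whole universe.
    if lower and upper != U:
        return 'roughly B-definable'
    if not lower and upper != U:
        return 'internally B-undefinable'
    if lower and upper == U:
        return 'externally B-undefinable'
    return 'totally B-undefinable'

U = set(table)
print(roughness_type(lower, upper, U))  # 'roughly B-definable' for the Walk example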
18. Accuracy of Approximation
The accuracy of approximation is
$\alpha_B(X) = \frac{|\underline{B}X|}{|\overline{B}X|}$,
where |X| denotes the cardinality of X.
Obviously $0 \leq \alpha_B \leq 1$.
If $\alpha_B(X) = 1$, X is crisp with respect to B.
If $\alpha_B(X) < 1$, X is rough with respect to B.
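For the Walk example, reusing the lower and upper sets computed in the earlier sketch:

alpha = len(lower) / len(upper)
print(alpha)  # 2 / 4 = 0.5 < 1, so W is rough with respect to B = {Age, LEMS}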
19. Issues in the Decision Table
The same or indiscernible objects may be
represented several times.
Some of the attributes may be
superfluous (redundant).
That is, their removal cannot worsen the
classification.
20. Reduct
Keep only those attributes that preserve
the indiscernibility relation and,
consequently, set approximation.
There are usually several such subsets of
attributes and those which are minimal
are called reducts.
21. Dispensable & Indispensable Attributes
Let $c \in C$.
Attribute c is dispensable in T
if $POS_C(D) = POS_{C - \{c\}}(D)$; otherwise
attribute c is indispensable in T.
The C-positive region of D:
$POS_C(D) = \bigcup_{X \in U/D} \underline{C}X$.
22. Independent
T = (U, C, D) is independent
if all $c \in C$ are indispensable in T.
23. Reduct & Core
The set of attributes $R \subseteq C$ is called a
reduct of C if T' = (U, R, D) is independent
and $POS_R(D) = POS_C(D)$.
The set of all the condition attributes
indispensable in T is denoted by CORE(C):
$CORE(C) = \bigcap RED(C)$,
where RED(C) is the set of all reducts of C.
24. Discernibility Matrix
Let T = (U, C, D) be a decision table, with $U = \{u_1, u_2, \ldots, u_n\}$.
By a discernibility matrix of T, denoted M(T),
we will mean the $n \times n$ matrix $(c_{ij})$ defined as
$c_{ij} = \{a \in C \mid a(u_i) \neq a(u_j)\}$ for $i, j = 1, 2, \ldots, n$,
i.e. $c_{ij}$ is the set of all the condition attributes
that classify objects $u_i$ and $u_j$ into different classes.
25. Discernibility Function
A discernibility function $f_{IS}$ for an information
system IS is a boolean function of m boolean
variables $a_1^*, a_2^*, \ldots, a_m^*$ (corresponding to the
attributes $a_1, a_2, \ldots, a_m$) defined as
$f_{IS}(a_1^*, \ldots, a_m^*) = \bigwedge \{\bigvee c_{ij}^* \mid 1 \leq j < i \leq n,\ c_{ij} \neq \emptyset\}$,
where $c_{ij}^* = \{a^* \mid a \in c_{ij}\}$. The set of all prime
implicants of $f_{IS}$ determines the set of all reducts of IS.
26. Examples of Discernibility Matrix
a b c d
u1 a0 b1 c1 Y
u2 a1 b1 c0 n
u3 a0 b2 c1 n
u4 a1 b1 c1 Y
In order to discern equivalence classes of the
decision attribute d, we must preserve the conditions
described by the discernibility matrix for this table:
     u1    u2    u3
u2   a,c
u3   b
u4         c     a,b
C = {a, b, c}, D = {d}
$f = (a \vee c) \wedge b \wedge c \wedge (a \vee b) = b \wedge c$
Reduct = {b, c}
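A sketch that reproduces the decision-relative matrix above; only pairs with different decisions get an entry, and the variable names are illustrative:

from itertools import combinations

rows = {
    'u1': ({'a': 'a0', 'b': 'b1', 'c': 'c1'}, 'Y'),
    'u2': ({'a': 'a1', 'b': 'b1', 'c': 'c0'}, 'n'),
    'u3': ({'a': 'a0', 'b': 'b2', 'c': 'c1'}, 'n'),
    'u4': ({'a': 'a1', 'b': 'b1', 'c': 'c1'}, 'Y'),
}

matrix = {}
for (ui, (vi, di)), (uj, (vj, dj)) in combinations(rows.items(), 2):
    if di != dj:  # only objects in different decision classes must be discerned
        matrix[(ui, uj)] = {a for a in vi if vi[a] != vj[a]}

print(matrix)
# {('u1','u2'): {'a','c'}, ('u1','u3'): {'b'},
#  ('u2','u4'): {'c'}, ('u3','u4'): {'a','b'}}
# Conjoining the entries: (a or c) and b and c and (a or b) = b and c,
# hence Reduct = {b, c}.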
27. Dependency of Attributes
A set of attributes D depends totally on a set
of attributes C, denoted $C \Rightarrow D$, if all
values of attributes from D are uniquely
determined by values of attributes from C.
28. Dependency of Attributes
Let D and C be subsets of A. We will say
that D depends on C in a degree k $(0 \leq k \leq 1)$,
denoted by $C \Rightarrow_k D$, if
$k = \gamma(C, D) = \frac{|POS_C(D)|}{|U|}$,
where $POS_C(D) = \bigcup_{X \in U/D} \underline{C}X$ is called
the C-positive region of D.
29. Dependency of Attributes
Obviously,
$k = \gamma(C, D) = \sum_{X \in U/D} \frac{|\underline{C}X|}{|U|}$.
If k = 1, we say that D depends totally on C.
If k < 1, we say that D depends partially
(in a degree k) on C.
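Reusing the helpers from the earlier sketches, the dependency degree for the Walk example works out as follows (the walk dictionary simply completes the slide-5 table with its decision column):

walk = {'x1': 'Yes', 'x2': 'No', 'x3': 'No', 'x4': 'Yes',
        'x5': 'No', 'x6': 'Yes', 'x7': 'No'}

def dependency(table, C, walk):
    U = set(table)
    pos = set()
    for v in set(walk.values()):                 # iterate over U / D
        X = {x for x in U if walk[x] == v}
        pos |= lower_upper(table, C, X)[0]       # C-lower approximation of X
    return len(pos) / len(U)

print(dependency(table, ['Age', 'LEMS'], walk))  # 5/7: only x3, x4 fall outside POS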
30. Variable Precision Rough Set
A generalized model of rough sets, called the
variable precision rough set model (VPRS), aimed at
modeling classification problems involving
uncertain or imprecise information, was
presented by Wojciech Ziarko in 1993.
This extended rough set model allows
some degree of misclassification in
the largely correct classification.
31. Variable Precision Rough Set
The measure c(X, Y) of the relative degree of misclassification
of the set X with respect to set Y is defined as
$c(X, Y) = 1 - \frac{card(X \cap Y)}{card(X)}$ if card(X) > 0, and c(X, Y) = 0 if card(X) = 0,
where card denotes set cardinality.
The quantity c(X, Y) will be referred to as the relative
classification error.
The actual number of misclassified elements is given
by the product c(X, Y) * card(X), which is referred to as
the absolute classification error.
32. $\beta$-majority (VPRS)
$\beta$, known as the admissible classification
error, must be within the range $0 \leq \beta < 0.5$.
If more than $(1 - \beta) \cdot 100\%$ of the elements of X are
common with Y, i.e. $c(X, Y) \leq \beta$, we write $Y \supseteq_\beta X$
and call it a $\beta$-majority relation.
Let X1 = {x1, x2, x3, x4},
X2 = {x1, x2, x5},
Y = {x1, x2, x3, x8}.
Then c(X1, Y) = 0.25, so $Y \supseteq_{0.25} X1$;
c(X2, Y) ≈ 0.33, so $Y \supseteq_{0.33} X2$.
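The error measure and the example above in a minimal sketch:

def c(X, Y):
    # Relative degree of misclassification of X with respect to Y.
    return 1 - len(X & Y) / len(X) if X else 0.0

X1 = {'x1', 'x2', 'x3', 'x4'}
X2 = {'x1', 'x2', 'x5'}
Y  = {'x1', 'x2', 'x3', 'x8'}
print(c(X1, Y))            # 0.25, so Y beta-majority-includes X1 for any beta >= 0.25
print(round(c(X2, Y), 2))  # 0.33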
33. Set Approximations inVPRS
Let A = (U, R), which consists of a non-empty
finite universe U and an equivalence
relation R on U. The
equivalence relation R, referred to as an
indiscernibility relation, corresponds to a
partitioning of the universe U into a
collection of equivalence classes or
elementary sets R* = {E1, E2, ..., En}.
34. Lower & Upper Approximation (VPRS)
Lower approximation:
$\underline{R}_\beta X = \bigcup \{E \in R^* \mid c(E, X) \leq \beta\}$
or, equivalently,
$\underline{R}_\beta X = \bigcup \{E \in R^* \mid card(E \cap X)/card(E) \geq 1 - \beta\}$.
Upper approximation:
$\overline{R}_\beta X = \bigcup \{E \in R^* \mid c(E, X) < 1 - \beta\}$.
35. Boundary & Negative Region (VPRS)
Boundary region:
$BNR_\beta(X) = \bigcup \{E \in R^* \mid \beta < c(E, X) < 1 - \beta\}$.
Negative region:
$NEG_\beta(X) = \bigcup \{E \in R^* \mid c(E, X) \geq 1 - \beta\}$.
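All four regions follow directly from c(E, X) over the elementary sets. A sketch reusing c, ind_classes, table and W from the earlier snippets (vprs_regions is an illustrative name):

def vprs_regions(R_star, X, beta):
    # Each region is a union of whole elementary sets, selected by c(E, X).
    lower    = {x for E in R_star if c(E, X) <= beta     for x in E}
    upper    = {x for E in R_star if c(E, X) < 1 - beta  for x in E}
    negative = {x for E in R_star if c(E, X) >= 1 - beta for x in E}
    return lower, upper, upper - lower, negative

R_star = ind_classes(table, ['Age', 'LEMS'])
print(sorted(vprs_regions(R_star, W, beta=0.0)[0]))
# beta = 0 recovers the classic lower approximation: ['x1', 'x6']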
36. Theoretical Aspects of Approximation (VPRS)
The lower approximation of the set X can
be interpreted as the collection of all
those elements of U which can be
classified into X with the classification
error not greater than $\beta$.
The negative region of the set X is the
collection of all those elements of U
which can be classified into the
complement of X, $-X$, with the
classification error not greater than $\beta$.
37. Theoretical Aspects of Approximation (VPRS)
The boundary region of the set X cannot
be classified either into X or $-X$ with the
classification error not greater than $\beta$.
The upper approximation of the set X
includes all those elements of U which
cannot be classified into $-X$ with the error
not greater than $\beta$.
38. Fuzzy-Rough Sets (FRS)
One particular use of RST is attribute reduction
in datasets. Given a dataset with discretized attribute
values, it is possible to find a subset of the original
attributes that is the most informative (termed a reduct).
However, it is most often the case that attribute
values are real-valued and cannot be handled by the
traditional rough set model.
Some discretization is possible, but it in turn causes
loss of information.
To deal with vagueness and noisy data in the dataset,
fuzzy rough sets were introduced by Dubois and Prade
and applied to attribute reduction by Richard Jensen.
39. Fuzzification for Conditional Features
a b c q
1 -0.4 -0.3 -0.5 No
2 -0.4 0.2 -0.1 Yes
3 -0.3 -0.4 -0.3 No
4 0.3 -0.3 0 Yes
5 0.2 -0.3 0 Yes
6 0.2 0 0 No
Fuzzy-rough set is defined by two
fuzzy sets: fuzzy lower and upper
approximations, obtained by
extending the corresponding
crisp rough set notions.
In the crisp case, elements that
belong to the lower
approximation (i.e., have
membership 1) are said to
belong to the approximated set
with absolute certainty. In the fuzzy-
rough case, elements may have a
membership in the range [0, 1],
allowing greater flexibility in
handling uncertainty.
40. Membership values from MFs of
linguistic labels
a b c q
Na Za Nb Zb Nc Zc {1,3,6} {2,4,5}
1 0.8 0.2 0.6 0.4 1.0 0.0 1.0 0.0
2 0.8 0.2 0.0 0.6 0.2 0.8 0.0 1.0
3 0.6 0.4 0.8 0.2 0.6 0.4 1.0 0.0
4 0.0 0.4 0.6 0.4 0.0 1.0 0.0 1.0
5 0.0 0.6 0.6 0.4 0.0 1.0 0.0 1.0
6 0.0 0.6 0.0 1.0 0.0 1.0 1.0 0.0
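As a sketch, a common formulation of the fuzzy lower approximation in fuzzy-rough feature selection takes, for a fuzzy class F and decision class X, the infimum over objects of max(1 - mu_F(x), mu_X(x)) (the min/max implication form); this specific formulation is an assumption, since the slides do not spell it out:

def fuzzy_lower(mu_F, mu_X, U):
    # mu_F, mu_X: dicts mapping each object to a membership degree in [0, 1].
    # Assumed min/max (Kleene-Dienes implication) form of the lower approximation.
    return min(max(1 - mu_F[x], mu_X[x]) for x in U)

# Example with the Na column and the decision class {1, 3, 6} from the table above.
U = [1, 2, 3, 4, 5, 6]
mu_Na = {1: 0.8, 2: 0.8, 3: 0.6, 4: 0.0, 5: 0.0, 6: 0.0}
mu_q  = {1: 1.0, 2: 0.0, 3: 1.0, 4: 0.0, 5: 0.0, 6: 1.0}
print(fuzzy_lower(mu_Na, mu_q, U))  # 0.2, attained at object 2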
42. Positive Region & Dependency Measure (FRS)
The membership of an object $x \in U$ in the fuzzy
positive region can be defined by
$\mu_{POS_P(Q)}(x) = \sup_{X \in U/Q} \mu_{\underline{P}X}(x)$.
Using the definition of the fuzzy positive region, the new
dependency function can be defined as follows:
$\gamma'_P(Q) = \frac{\sum_{x \in U} \mu_{POS_P(Q)}(x)}{|U|}$.
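A sketch of the two formulas, assuming a helper such as fuzzy_lower from the previous snippet has already produced one lower-approximation membership dict per decision class (the argument layout is an illustrative choice):

def fuzzy_positive_region(mu_lower_by_class, x):
    # sup over the decision classes X in U/Q of mu_{P-lower X}(x).
    return max(mu[x] for mu in mu_lower_by_class.values())

def fuzzy_dependency(mu_lower_by_class, U):
    # gamma'_P(Q): total positive-region membership normalised by |U|.
    return sum(fuzzy_positive_region(mu_lower_by_class, x) for x in U) / len(U)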
43. Observations
Evaluation of the importance of particular attributes and
elimination of redundant attributes from the decision
table.
Construction of a minimal subset of independent
attributes ensuring the same quality of classification as
the whole set, i.e. reducts of the set of attributes.
Intersection of these reducts gives a core of attributes,
which cannot be eliminated without disturbing the
ability to approximate the classification.
Generation of logical rules from the reduced decision
table.