11.11.08_GrC2011_Decision Rule Visualization for Knowledge Discovery by Means of Rough Set Approach
1. 11/09/2011 GrC2011
Decision Rule Visualization for Knowledge
Discovery by Means of Rough Set Approach
Motoyuki Ohki, Masahiro Inuiguchi, Toshinobu Harada
Graduate School of Engineering Science, Osaka University
Faculty of Systems Engineering, Wakayama University
2. 00. Outline 1 / 25
01. Background and Purpose
02. Algorithm for Decision Rule Visualization
03. Visualization System
04. Evaluation Experiment
05. Summary and Future Work
3. 01. Background 2 / 25
Rough Set Approach
- Attribute Reduction
- Induce Decision Rules
Application to various fields
4. 01. Background 3 / 25
A Decision Table
Decision rule: If “b1” then “1”
Sample  Color (a)        Shape (b)     Number of doors (c)  Type (d)       Preference
car1    colored (a1)     nature (b1)   two (c1)             personal (d1)  I'd like to buy (1)
car2    colored (a1)     rounded (b2)  four (c2)            sporty (d2)    I don't know (2)
car3    monochrome (a2)  rounded (b2)  four (c2)            formal (d3)    I don't know (2)
car4    monochrome (a2)  nature (b1)   four (c2)            personal (d1)  I'd like to buy (1)
car5    monochrome (a2)  rounded (b2)  two (c1)             personal (d1)  I don't know (2)
car6    colored (a1)     rounded (b2)  two (c1)             sporty (d2)    I'd like to buy (1)
Decision rule: If “a1 and d2” then “1”
We select useful decision rules among many rules.
We apply the rules to actual problems.
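As a small illustration of how a rule's condition part is read against the decision table, the sketch below encodes the six cars and lists the samples that satisfy a given condition. The names `TABLE` and `matches` are illustrative assumptions, not part of the presented system.

```python
# Decision table from the slide: sample -> (condition attribute values, preference)
TABLE = {
    "car1": ({"a1", "b1", "c1", "d1"}, 1),
    "car2": ({"a1", "b2", "c2", "d2"}, 2),
    "car3": ({"a2", "b2", "c2", "d3"}, 2),
    "car4": ({"a2", "b1", "c2", "d1"}, 1),
    "car5": ({"a2", "b2", "c1", "d1"}, 2),
    "car6": ({"a1", "b2", "c1", "d2"}, 1),
}


def matches(condition):
    """Return {sample: preference} for samples holding every value in `condition`."""
    cond = set(condition)
    return {name: pref for name, (values, pref) in TABLE.items() if cond <= values}


print(matches({"b1"}))  # {'car1': 1, 'car4': 1}
```

Every sample with shape “b1” has preference 1, which is what the rule If “b1” then “1” states.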
5. 01. Background 4 / 25
Technical issue
- Difficulty of interpretation
- Depending on analysts
Difficulty of finding
useful decision rules ...
An example of
inducing decision rules[1]
[1] HOLON CREATE, Rough Sets Analysis Program, http://www.holon.com/program.html
6. 01. Purpose 5 / 25
Proposing an algorithm for visualizing
decision rules in the rough set approach
Supporting the discovery of useful decision rules
Examples of visual data mining [1,2,3]
[1] SOM Self-organization maps http://www.mindware-jp.com/Viscovery/self-organizing-maps.html
[2] Purple Insight MineSet http://journal.mycom.co.jp/news/2006/06/28/347.html
[3] Natto View http://www.holon.com/program.html
7. 02. Methods used in the proposed visualization 6 / 25
Three Methods
(i) The decision matrix-based rule induction
(ii) Calculation of Co-occurrence Rates
(iii) Hayashi’s Quantification Method Ⅳ
We evaluate the dependencies between attribute
values and conclusions quantitatively.
8. 02. Co-occurrence Rate 7 / 25
Definition
- Degrees of the dependencies “between attribute values”
and “between attribute values and conclusion”
- Jaccard coefficient
Formula
Jaccard(X, Y) = |X ∩ Y| / |X ∪ Y|, |X| : cardinality of set X
9. 02. Co-occurrence Rate 8 / 25
Calculation Example
Sample  Color (a)        Shape (b)     Number of doors (c)  Type (d)       Preference
car1    colored (a1)     nature (b1)   two (c1)             personal (d1)  I'd like to buy (1)
car2    colored (a1)     rounded (b2)  four (c2)            sporty (d2)    I don't know (2)
car3    monochrome (a2)  rounded (b2)  four (c2)            formal (d3)    I don't know (2)
car4    monochrome (a2)  nature (b1)   four (c2)            personal (d1)  I'd like to buy (1)
car5    monochrome (a2)  rounded (b2)  two (c1)             personal (d1)  I don't know (2)
car6    colored (a1)     rounded (b2)  two (c1)             sporty (d2)    I'd like to buy (1)
The rate between “a1” and “b1”: |{car1}| / |{car1, car2, car4, car6}| = 1/4 = 0.25
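The co-occurrence rate between “a1” and “b1” can be checked with a few lines of Python. This is a minimal sketch: the sample sets are read off the table above, and the helper name `jaccard` is an assumption.

```python
def jaccard(x, y):
    """Jaccard coefficient |X ∩ Y| / |X ∪ Y| of two collections of samples."""
    x, y = set(x), set(y)
    union = x | y
    return len(x & y) / len(union) if union else 0.0


# Samples that take each attribute value in the table.
a1 = {"car1", "car2", "car6"}  # colored
b1 = {"car1", "car4"}          # nature

rate = jaccard(a1, b1)
print(rate)  # 0.25
```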
10. 02. Hayashi’s Quantification Method Ⅳ 9 / 25
Definition
- A kind of multi-dimensional scaling
- Plot all objects in the two dimensional coordinate system
Algorithm
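A minimal sketch of the placement step, assuming the commonly described formulation of Hayashi's quantification method Ⅳ: given similarities e_ij, choose coordinates x that maximize Q = -Σ e_ij (x_i - x_j)^2 subject to Σ x_i = 0 and Σ x_i^2 = 1, which leads to an eigenvector problem. The function name, the power-iteration solver, and the toy similarity matrix are illustrative assumptions and are not taken from the slides.

```python
import math
import random


def quantification_iv(e, dims=2, iters=500, seed=0):
    """Return n rows of `dims` coordinates for an n x n similarity matrix `e`.

    Assumes dims < n.
    """
    n = len(e)
    # Symmetrise: h[i][j] = e[i][j] + e[j][i] off the diagonal, and each
    # diagonal entry is minus its row sum, so that x^T H x equals Q.
    h = [[(e[i][j] + e[j][i]) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    for i in range(n):
        h[i][i] = -sum(h[i])
    # Adding c*I makes all eigenvalues positive, so power iteration
    # converges towards the largest eigenvalue of H.
    c = 1.0 + sum(abs(v) for row in h for v in row)
    rng = random.Random(seed)
    axes = []
    for _ in range(dims):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            x = [sum(h[i][j] * x[j] for j in range(n)) + c * x[i]
                 for i in range(n)]
            # Enforce sum(x) = 0 and orthogonality to the axes found so far.
            mean = sum(x) / n
            x = [v - mean for v in x]
            for a in axes:
                dot = sum(p * q for p, q in zip(x, a))
                x = [p - dot * q for p, q in zip(x, a)]
            norm = math.sqrt(sum(v * v for v in x))
            x = [v / norm for v in x]
        axes.append(x)
    return [[a[i] for a in axes] for i in range(n)]


# Toy similarities (made up): items 0 and 1 are close, items 2 and 3 are close.
e = [
    [0.0, 1.0, 0.1, 0.1],
    [1.0, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 1.0],
    [0.1, 0.1, 1.0, 0.0],
]
coords = quantification_iv(e)
```

On the toy matrix, the first coordinate separates the pair {0, 1} from the pair {2, 3}, so similar items end up near each other.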
11. 02. Flow of the Decision Rule Visualization 10 / 25
Input
A decision table
Analysis
Calculate Jaccard coefficients between attribute values
Apply Hayashi’s quantification method
1. We obtain the locations of attribute values in the X-Y plane.
Output
Attribute values
12. 02. Flow of the Decision Rule Visualization 11 / 25
Input
A decision table
Analysis
Calculate Jaccard coefficients
between attribute values and conclusion
2. We obtain the locations of attribute values along the Z axis.
Output
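The Z step can be sketched as follows: for each attribute value, take the Jaccard coefficient between the samples having that value and the samples having the chosen conclusion. The example uses the six-car table from the Background slides, not the data behind the figure, and the names `TABLE`, `jaccard`, and `z_values` are assumptions.

```python
# Decision table: sample -> (condition attribute values, preference)
TABLE = {
    "car1": ({"a1", "b1", "c1", "d1"}, 1),
    "car2": ({"a1", "b2", "c2", "d2"}, 2),
    "car3": ({"a2", "b2", "c2", "d3"}, 2),
    "car4": ({"a2", "b1", "c2", "d1"}, 1),
    "car5": ({"a2", "b2", "c1", "d1"}, 2),
    "car6": ({"a1", "b2", "c1", "d2"}, 1),
}


def jaccard(x, y):
    """Jaccard coefficient |X ∩ Y| / |X ∪ Y|."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def z_values(table, conclusion):
    """Map each attribute value to its co-occurrence rate with `conclusion`."""
    target = {s for s, (_, pref) in table.items() if pref == conclusion}
    all_values = sorted({v for vals, _ in table.values() for v in vals})
    return {
        v: jaccard({s for s, (vals, _) in table.items() if v in vals}, target)
        for v in all_values
    }


z = z_values(TABLE, conclusion=1)
for value, height in z.items():
    print(f"{value}: {height:.3f}")
```

In this table “b1” gets the highest value (2/3) and “d3” gets 0, so “b1” would be plotted highest for conclusion 1.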
13. 02. Flow of the Decision Rule Visualization 12 / 25
Input
A decision table
Analysis
Induce decision rules by rough set approach
Calculate C.I values
3. Decision rules are represented as links.
Output
Decision rule a1b2, drawn as a link between a1 and b2
14. 03. Visualization System 13 / 25
Decision table
- Attribute values : 16
- Induced decision rules : 31
c1 (0.500) : strongly dependent on the conclusion
Decision rule c1d3 : candidate for a useful decision rule
15. 04. Evaluation Experiment 14 / 25
Two evaluation experiments
- We check the efficiency and usefulness of the visualization method.
[1] Product evaluation experiment
- To check the advantage of the visualization method
[2] Numerical experiment
- To check the usefulness of decision rules selected by
examinees using the visualization system
16. 04. Product Evaluation Experiment 15 / 25
Procedure 1
Samples and attribute values
- 24 digital cameras as samples
- 7 condition attributes
e.g., face shape, position of lens, etc.
Procedure 2
We ask three examinees about their motivation to buy these
digital cameras.
- conclusion 1 : “I want to buy it”
- conclusion 2 : “I will not buy it”
17. 04. Product Evaluation Experiment 16 / 25
Procedure 3
We compare how well useful decision rules can be selected
with the following two methods.
- one : Proposed Visualization Method
- the other : Commercial Software provided by HOLON[1]
[1] HOLON CREATE Rough Sets Analysis Program http://www.holon.com/program.html
18. 04. Product Evaluation Experiment 17 / 25
Evaluation of Commercial Software
List of decision rules with C.I values
Difficulty in finding the useful decision rules
The selected decision rules are different among examinees.
Decision rules and C.I values induced by commercial software:
Decision Rules  C.I value
e2f3            0.167
b2f2            0.167
a2d2            0.167
c1f1g2          0.167
b1c1f1          0.167
a2f2g1          0.167
a2b1e2          0.167
b2e1            0.083
d2f3            0.083
a1d3            0.083
19. 04. Product Evaluation Experiment 18 / 25
Evaluation of Visualization System
1. It is easy to understand the
strength of dependencies
at a glance.
Examples
- e2 (no dial, Z-value = 0.450)
- c1 (shape of face is straight line,
Z-value = 0.429)
- g2 (shape of edge strip is rounded,
Z-value = 0.412)
20. 04. Product Evaluation Experiment 19 / 25
Evaluation of Visualization System
2. We can find weakly related
condition attribute values.
Examples
- f1, f2, and f3 are located
at lower positions
- “f” (location of flash) is not very
influential for this examinee’s
preference.
21. 04. Product Evaluation Experiment 20 / 25
Evaluation of Visualization System
3. The length of links can
express the imbalanced influence
of attribute values.
Examples
- “e2f3” : long link
→ unreliable decision rule
- “a2b1e2” : short link
→ reliable decision rule
Decision rules composed of three attribute values
Decision rules composed of two attribute values
22. 04. Numerical Experiment 21 / 25
Procedure 1
Partition the “car” data set into ten subsets randomly
- “car” data set : obtained from the UCI web site [1]
[1] UCI Machine Learning Repository http://archive.ics.uci.edu/ml/
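A random partition into ten subsets can be sketched as below. This assumes a plain shuffle followed by round-robin assignment; the slides do not say how the partition was drawn, and the function name, the seed, and the placeholder records are illustrative.

```python
import random


def partition(items, k=10, seed=0):
    """Shuffle `items` and deal them into `k` subsets of near-equal size."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Subset i takes every k-th record, starting at position i.
    return [shuffled[i::k] for i in range(k)]


# Placeholder record indices stand in for the rows of the data set.
subsets = partition(range(25))
print([len(s) for s in subsets])
```

Every record lands in exactly one subset, and subset sizes differ by at most one.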
23. 04. Numerical Experiment 22 / 25
Procedure 2
Ask each of three examinees to select three decision
rules for each subset of the “car” data set
a1c3
b1d2
a1c2
a1d2
a1c3
d2b1
a1c3
d2b1
b1d2
24. 04. Numerical Experiment 23 / 25
Procedure 3
Compare the three selected decision rules (Rule Set 1) with non-selected
decision rules (Rule Set 2) having the same C.I values
Rule Set 1 : a1d2, a1c3, b1d2
Rule Set 2 : c2d2, b3c3 …
Subsets 1, 2, …, 9 (shown for each rule set)
Calculation of Average Accuracy
25. 04. Numerical Experiment 24 / 25
Results of Average Accuracy
By the paired t-test with
significance level α =
0.05, we confirmed the
advantage of Rule Set 1
over Rule Set 2.
We confirmed the
usefulness of the
proposed method.
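For reference, the paired t statistic compares two rule sets through the per-trial differences of their accuracies. The sketch below shows only the arithmetic: the input numbers are made-up placeholders, not the experiment's accuracies, and the function name is an assumption.

```python
import math


def paired_t(a, b):
    """Paired t statistic for two equally long sequences of measurements."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    # Unbiased sample variance of the differences.
    var = sum((v - mean) ** 2 for v in d) / (n - 1)
    return mean / math.sqrt(var / n)


# Placeholder values only (NOT results from the slides).
t = paired_t([1, 2, 3, 4], [0, 1, 1, 2])
print(round(t, 3))  # 5.196
```

The statistic is then compared with the critical value of the t distribution with n - 1 degrees of freedom at the chosen significance level.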
26. 05. Summary and Future Work 25 / 25
Summary
1. We proposed a method of visualizing decision rules
2. We developed a visualization system based on the proposed
method
3. We conducted two experiments. We confirmed the
effectiveness and usefulness of the visualization system.
Future Work
1. To conduct more experiments with many different decision
tables.
2. To improve the system in order to enhance the precision of
the analysis method.
27. Thank you for listening !
Motoyuki Ohki
Graduate School of Engineering Science, Osaka University
E-Mail : ohki@inulab.sys.es.osaka-u.ac.jp
29. 00. Samples and Attribute
24 digital cameras, 7 condition attributes
30. 00. Conventional Research
Multi-valued decision diagrams [1]
- This method uses a multi-valued
decision diagram.
Hierarchical visualization method[2]
- This method uses a hierarchical
graph structure.
[1] Y. Tomoto, T. Ohira, T. Nakamura, M. Kanoh, and H. Itoh, “Applying Multi-valued Decision Diagram to
Visualization of If-Then Rules,” Kansei Engineering International Journal, vol.9, no.2, 2010, pp.259-267.
[2] A. Ito, T. Yoshikawa, T. Furuhashi, S. Mitsumatsu, “Profiling by Association Analysis using Hierarchical
Visualization Method,” Kansei Engineering International Journal, vol.10, no.2, 2011, pp.205-212.
31. 00. Co-occurrence Rate 30 / 14
Reason for selecting the Jaccard coefficient
- Attribute value X and attribute value Y
For example
(1) |X| = 100, |Y| = 1, |X∩Y| = 1, |X∪Y| = 100
Jaccard = 1/100 Simpson = 1
Cosine = 1/10 Dice = 2/101
(2) |X| = 100, |Y| = 100, |X∩Y| = 50, |X∪Y| = 150
Jaccard = 1/3 Simpson = 1/2
Cosine = 1/2 Dice = 1/2
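The two cases above can be reproduced from the set sizes alone. This is a minimal sketch using the standard definitions of the four coefficients; the function name and argument names are assumptions.

```python
import math


def coefficients(nx, ny, nxy):
    """Similarity coefficients from |X|, |Y| and |X ∩ Y|."""
    return {
        "jaccard": nxy / (nx + ny - nxy),   # |X ∩ Y| / |X ∪ Y|
        "simpson": nxy / min(nx, ny),       # |X ∩ Y| / min(|X|, |Y|)
        "cosine": nxy / math.sqrt(nx * ny), # |X ∩ Y| / sqrt(|X| |Y|)
        "dice": 2 * nxy / (nx + ny),        # 2 |X ∩ Y| / (|X| + |Y|)
    }


case1 = coefficients(100, 1, 1)     # very unbalanced set sizes
case2 = coefficients(100, 100, 50)  # balanced set sizes
print(case1)
print(case2)
```

In case (1) the Jaccard coefficient is the smallest of the four, while the Simpson coefficient reaches 1 even though Y covers only one of the 100 elements of X.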