Models obtained by decision tree induction techniques excel in interpretability. However, they are prone to overfitting, which results in low predictive performance. Ensemble techniques mitigate this problem and are hence able to achieve higher accuracies, but this comes at the cost of the excellent interpretability of the resulting model, making ensemble techniques impractical in applications where decision support, rather than decision making, is crucial.
To bridge this gap, we present GENESIM, an algorithm that uses a genetic algorithm to transform an ensemble of decision trees into a single decision tree with enhanced predictive performance while maintaining interpretability. We compared GENESIM to prevalent decision tree induction algorithms, ensemble techniques and a similar technique, called ISM, on twelve publicly available data sets. The results show that GENESIM achieves better predictive performance on most of these data sets than the decision tree induction techniques and ISM, and that its predictive performance is of the same order of magnitude as that of the ensemble techniques. Moreover, the model produced by GENESIM outperforms the ensemble techniques in terms of interpretability, as it has a very low complexity.
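As a rough illustration of the idea, a genetic-algorithm loop over a population of decision trees might look like the sketch below. Note that the recombination step here (refitting a tree on labels pooled from two parents) and the size-penalised fitness are simplified placeholders, not GENESIM's actual decision-space merge operator.

```python
# Illustrative genetic-algorithm loop over decision trees.
# NOTE: recombine() is a simplified stand-in for GENESIM's real merge step.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)
X, y = make_classification(n_samples=400, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

def fitness(tree):
    # Reward validation accuracy, penalise model complexity (node count).
    return tree.score(X_val, y_val) - 0.001 * tree.tree_.node_count

def random_tree():
    idx = rng.choice(len(X_tr), len(X_tr), replace=True)  # bootstrap sample
    return DecisionTreeClassifier(max_depth=int(rng.randint(2, 8)),
                                  random_state=0).fit(X_tr[idx], y_tr[idx])

def recombine(a, b):
    # Placeholder "crossover": fit one tree on the parents' pooled labels.
    labels = np.where(rng.rand(len(X_tr)) < 0.5,
                      a.predict(X_tr), b.predict(X_tr))
    return DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_tr, labels)

population = [random_tree() for _ in range(10)]
for generation in range(5):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]                      # elitist selection
    children = [recombine(parents[rng.randint(5)], parents[rng.randint(5)])
                for _ in range(5)]
    population = parents + children

best = max(population, key=fitness)
print(best.tree_.node_count, round(best.score(X_val, y_val), 3))
```

The key point the sketch shares with GENESIM is the selection pressure: candidates are ranked by a fitness that trades off accuracy against tree size, so the surviving single tree stays small.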
[PAKDD - BDM17] GENESIM: GENetic Extraction of a Single, Interpretable Model
Gilles Vandewiele, Kiani Lannoye, Olivier Janssens,
Femke Ongenae, Filip De Turck, Sofie Van Hoecke
2. Merging decision trees
Andrzejak, Artur, Felix Langner, and Silvestre Zabala. "Interpretable models from distributed data via merging of decision trees."
4. Other similar techniques: inTrees & ISM
Deng, Houtao. "Interpreting tree ensembles with inTrees."
Van Assche, Anneleen, and Hendrik Blockeel. "Seeing the forest through the trees: Learning a comprehensible model from an ensemble."
Compared GENESIM to all prominent decision tree induction algorithms:
- C4.5
- CART
- QUEST
- GUIDE
Two (decision tree) ensemble methods:
- Random Forests
- eXtreme Gradient Boosting (XGBoost)
And one similar model extraction technique
- ISM (Interpretable, Single Model)
Hyper-parameters are tuned using grid search or Bayesian optimization (with Gaussian processes)
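For the single-tree baselines, the grid-search tuning can be sketched with scikit-learn's `GridSearchCV`; the data set and the parameter grid below are illustrative assumptions, not the paper's exact search space.

```python
# Grid-search tuning of a CART-style tree (scikit-learn's
# DecisionTreeClassifier); the grid itself is illustrative.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
grid = {"max_depth": [2, 4, 6, None],
        "min_samples_leaf": [1, 5, 10]}
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      grid, cv=3, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Bayesian optimization with Gaussian processes replaces the exhaustive grid with a surrogate model that proposes the next hyper-parameter setting to evaluate, which pays off when each evaluation (a full cross-validation run) is expensive.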
* 10 measurements with 3-fold cross-validation
→ Measure accuracy and model complexity (number of nodes)
* Construct a Win-Tie-Loss matrix with bootstrap testing:
→ Algorithm A wins over algorithm B when its mean accuracy is higher
and the p-value of the bootstrap test is <= 0.05
→ A and B tie when the p-value is > 0.05
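The win/tie/loss decision for one pair of algorithms could be sketched as a paired bootstrap test over the per-fold accuracies. The 0.05 threshold follows the slide; the specific resampling scheme below (bootstrapping the paired accuracy differences) is one common choice, not necessarily the paper's exact test.

```python
import numpy as np

def bootstrap_compare(acc_a, acc_b, n_boot=2000, alpha=0.05, seed=0):
    """Return 'win' (A beats B), 'loss', or 'tie' from paired accuracies."""
    rng = np.random.RandomState(seed)
    diffs = np.asarray(acc_a, dtype=float) - np.asarray(acc_b, dtype=float)
    # Resample the paired differences; p = how often the mean flips sign.
    idx = rng.randint(0, len(diffs), size=(n_boot, len(diffs)))
    means = diffs[idx].mean(axis=1)
    if diffs.mean() > 0:
        p = np.mean(means <= 0.0)
    else:
        p = np.mean(means >= 0.0)
    if p > alpha:
        return "tie"
    return "win" if diffs.mean() > 0 else "loss"

# 10 repetitions x 3 folds = 30 paired accuracy measurements (toy numbers)
acc_a = [0.90, 0.91, 0.89, 0.92, 0.90, 0.91] * 5
acc_b = [0.85, 0.86, 0.84, 0.87, 0.85, 0.86] * 5
print(bootstrap_compare(acc_a, acc_b))  # → win
```

Running this pairwise over all algorithms fills the Win-Tie-Loss matrix: each cell records whether the row algorithm wins, ties, or loses against the column algorithm.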
GENESIM outperforms all decision tree induction algorithms except C4.5 (but GENESIM produces smaller trees!)
[Figures: model complexity and accuracy per algorithm]