An insight into Educational Data Mining at Muğla Sıtkı Koçman University, Turkey
1. Except where otherwise noted, content of this work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Educational Data Mining
An insight into EDM at Muğla Sıtkı Koçman University
Presentation by Steven Strehl, HTW Berlin
E-Mail: steven.strehl@htw-berlin.de
June 30th, 2014
2. Primary sources
Gürüler, Hüseyin and Istanbullu, Ayhan (2014): Modeling Student Performance in Higher Education Using Data Mining. In: Educational Data Mining, Alejandro Peña-Ayala (Ed.).
Gürüler, Hüseyin (2005): Veritabanları Üzerinde Veri Madenciliği Uygulaması. Muğla Sıtkı Koçman Üniversitesi.
Thank you (teşekkür ederim) to Hüseyin Gürüler for providing support and additional information by mail, and to Sema Karakurt for helping me translate important aspects of Hüseyin’s Master’s thesis from Turkish to English.
3. Vision and Goals
Student Knowledge Discovery Software
Improve efficiency, quality,
“experience” of studies
Predict failure or success of students
Eliminate factors that lead to failure
Figure 1: SKDS user interface
4. Context
Student profiling
Figure 2: Female student
Gender
Military service
Grade Point Average
Name
Secondary school
Scholarship
Age
Final mark
Secondary school
Nationality
Study programme
Family income
Marital status
Focus subject
Parents’ professions
Religion
Hometown
…
5. Context
Figure 3: Individual students
Figure 4: Anonymous students
6. Basics
Knowledge Discovery in Databases
Available Data → Selected Data → Pre-processed Data → Transformed Data → Data Mining → Interpretation and Evaluation
Figure 5: Stages of the KDD process
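The stages in Figure 5 can be read as a chain of small functions, each consuming the output of the previous one. The following is a minimal sketch only: the records, field names, and the grouping step used as a stand-in for "data mining" are invented for illustration and are not taken from SKDS.

```python
from statistics import mean

# Hypothetical raw records; names and values are invented for illustration.
available = [
    {"name": "A", "gpa": "3.4", "family_income": "high", "hometown": "X"},
    {"name": "B", "gpa": "1.8", "family_income": "low", "hometown": "Y"},
    {"name": "C", "gpa": None, "family_income": "low", "hometown": "X"},
    {"name": "D", "gpa": "2.6", "family_income": "medium", "hometown": "Z"},
]


def select(records, fields):
    """Selection: keep only the attributes relevant to the question."""
    return [{f: r[f] for f in fields} for r in records]


def preprocess(records):
    """Pre-processing: drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]


def transform(records):
    """Transformation: cast GPA to a number and derive a binary target."""
    out = []
    for r in records:
        gpa = float(r["gpa"])
        out.append({
            "family_income": r["family_income"],
            "gpa": gpa,
            "passed": gpa >= 2.0,
        })
    return out


def mine(records):
    """Data mining (stand-in): mean GPA per family-income group."""
    groups = {}
    for r in records:
        groups.setdefault(r["family_income"], []).append(r["gpa"])
    return {income: mean(gpas) for income, gpas in groups.items()}


selected = select(available, ["gpa", "family_income"])
cleaned = preprocess(selected)
transformed = transform(cleaned)
patterns = mine(transformed)
print(patterns)  # interpretation and evaluation remain a human task
```

Keeping each stage as a separate function mirrors the diagram: any stage can be revisited without touching the others.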
7. Basics
Data Mining
Cross Industry Standard Process
for Data Mining (CRISP-DM)
Verification-driven DM
Aims at verifying assumptions
by data queries
Discovery-driven DM
Aims at gaining new insights
by unveiling patterns
Figure 6: CRISP-DM Process Diagram
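The difference between the two styles of data mining can be illustrated with a small sketch. The verification-driven function answers one question that was formulated in advance; the discovery-driven function scans all attribute values without a prior hypothesis and ranks them. The toy records and attribute names are assumptions made for this example, not data from the study.

```python
# Invented toy records for illustration only.
students = [
    {"scholarship": "yes", "language_prep": "yes", "success": "yes"},
    {"scholarship": "yes", "language_prep": "no", "success": "no"},
    {"scholarship": "no", "language_prep": "yes", "success": "yes"},
    {"scholarship": "no", "language_prep": "yes", "success": "yes"},
    {"scholarship": "no", "language_prep": "no", "success": "no"},
    {"scholarship": "yes", "language_prep": "no", "success": "yes"},
]


def success_rate(records, attr, value):
    """Verification-driven: check one stated assumption with a query."""
    subset = [r for r in records if r[attr] == value]
    return sum(r["success"] == "yes" for r in subset) / len(subset)


def rank_patterns(records, target="success"):
    """Discovery-driven: score every attribute value, no hypothesis needed."""
    ranked = []
    for attr in records[0]:
        if attr == target:
            continue
        for value in sorted({r[attr] for r in records}):
            ranked.append((success_rate(records, attr, value), attr, value))
    return sorted(ranked, reverse=True)


# Verification: "Do scholarship holders succeed more often?"
print(success_rate(students, "scholarship", "yes"))

# Discovery: which attribute value is most associated with success?
print(rank_patterns(students)[0])
```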
8. Approach
Data curation
Figure 7: Male student
Gender
Military service
Grade Point Average
Name
Secondary school
Scholarship
Age
Final mark
Secondary school
Nationality
Study programme
Family income
Marital status
Focus subject
Parents’ professions
Religion
Hometown
…
9. Approach
Data volume and provenance
Available Data: University departments, faculties, central registration system and archives
Selected Data: 13 tables related to the scope of SKDS
Pre-processed Data: 6 tables
Transformed Data: View consisting of 111 columns and 6,470 records
Data Mining: Microsoft Decision Tree Algorithm
Interpretation and Evaluation
Figure 8: Stages of the KDD process
10. Approach
Models and results
Model 1: GPA >= 3.0
- Language prep. (English)
- Registration preference
Model 2: GPA >= 2.0
- Family income
Figure 9: Decision Tree for Model 1
Figure 10: Decision Tree for Model 2
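How a decision tree comes to place an attribute such as language preparation near the root can be sketched with a generic entropy-based information gain. This is a textbook criterion, not necessarily the scoring used by the Microsoft Decision Tree Algorithm, and the eight records below are invented for illustration; they are not results from the study.

```python
from collections import Counter
from math import log2

# Invented records; "gpa_ge_3" is a hypothetical binary target (GPA >= 3.0).
records = [
    {"language_prep": "yes", "hometown": "A", "gpa_ge_3": True},
    {"language_prep": "yes", "hometown": "B", "gpa_ge_3": True},
    {"language_prep": "yes", "hometown": "A", "gpa_ge_3": True},
    {"language_prep": "yes", "hometown": "B", "gpa_ge_3": False},
    {"language_prep": "no", "hometown": "A", "gpa_ge_3": False},
    {"language_prep": "no", "hometown": "B", "gpa_ge_3": False},
    {"language_prep": "no", "hometown": "A", "gpa_ge_3": False},
    {"language_prep": "no", "hometown": "B", "gpa_ge_3": True},
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """Reduction in target entropy obtained by splitting rows on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


gains = {
    attr: information_gain(records, attr, "gpa_ge_3")
    for attr in ("language_prep", "hometown")
}
best = max(gains, key=gains.get)
print(best, gains)
```

In this toy data the attribute with the larger gain would be chosen for the first split; a full tree repeats the same selection recursively on each resulting subset.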
11. Issues and Outlook
Issues
Data: availability, variety, format
Usability: not yet suitable for everyday use
Transfer: impact of findings
Outlook
Data Mining: new algorithms available
MÜKÜP/SKDS was an early WEKA
Improvement: new GUI, easier to use
12. Discussion
Topics
Social background: Family income has been discovered as an important factor for success. Does the data scientist’s work end here?
Study conditions: How to measure and find aspects that counterbalance negative preconditions of the social background?
Health: How does the mental and physical state influence success and failure?
Open Data: What if Student Life Cycle Data were open?
13. List of Figures
[1] Gürüler, Hüseyin and Istanbullu, Ayhan (2014): Modeling Student Performance in Higher Education Using Data Mining. No CC license.
[2] http://openclipart.org/user-detail/ryanlerch Public domain.
[3] http://openclipart.org/user-detail/ryanlerch Public domain.
[4] http://openclipart.org/user-detail/thekua Public domain.
[5] Composed graphic from http://openclipart.org/user-detail/jean_victor_balin, http://openclipart.org/user-detail/buggi, http://openclipart.org/user-detail/gsagri04 Public domain.
[6] Kenneth Jensen, https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining#mediaviewer/File:CRISP-DM_Process_Diagram.png
[7] http://openclipart.org/user-detail/ryanlerch Public domain.
[8] Composed graphic from http://openclipart.org/user-detail/jean_victor_balin, http://openclipart.org/user-detail/buggi, http://openclipart.org/user-detail/gsagri04 Public domain.
[9] Gürüler, Hüseyin and Istanbullu, Ayhan (2014): Modeling Student Performance in Higher Education Using Data Mining. No CC license.
[10] Gürüler, Hüseyin and Istanbullu, Ayhan (2014): Modeling Student Performance in Higher Education Using Data Mining. No CC license.