2. Overview
Main Objective of Our Research
Concept of KDD
Methods of Preprocessing Academic Data for Mining
Data Analysis
Relational Database
Universal Database
Synthetic Data Population
Data Transformation
Association Rules of Interest for Academic Data
Preprocessing of Academic Data for Mining Association Rule 2
3. Main Objective of Our Research
To gain knowledge of, and find correlations among, several explicit and implicit factors related to:
Students' academic progress
Decay of students' potential
Abandonment: why do students drop out before graduation?
Retention: why do some students continue well beyond the expected duration?
4. Concept of KDD
Knowledge Discovery and Data Mining Process
[Diagram: Data -> (Selection) -> Target Data -> (Preprocessing) -> Preprocessed Data -> (Transformation) -> Transformed Data -> (Data mining) -> Patterns/Models -> (Interpretation/Evaluation) -> Knowledge]
5. Why Preprocessing before Data Mining?
Reasons for proposing a preprocessing technique before mining association rules from academic data:
Proper interpretation of the mining results is essential to ensure that useful knowledge is derived from the data.
Blind application of data-mining methods can be dangerous, easily leading to the discovery of meaningless and invalid patterns.
6. Methods of Preprocessing Academic Data for Mining
Data Analysis of BIIS Database
Personal Information:
Age
Gender
Origin Area (Birth Place)
Present Address
Hall Resident/Attached
Academic Information:
SSC or equivalent GPA, Board
HSC or equivalent GPA, Board
Admission Year / Batch
Department
Current Level/Term
Current CGPA
Term-wise CGPA
Subject-wise detailed Grade
Credit Hours Completed
7. Methods of Preprocessing Academic Data for Mining (Contd.)
Data Analysis (contd.)
Factors related to Academic Performance of Student
[Diagram: Age, Origin Area, Record of Taken Courses, Experience of Teachers, Hall Resident/Attached, Term Duration, SSC & HSC GPA/Board, Gender, and CGPA, shown as factors of Academic Performance]
8. Methods of Preprocessing Academic Data for Mining (Contd.)
Data Analysis (contd.)
Factors related to Abandonment/Retention of student
[Diagram: Age, Origin Area, Credit Hour Ratio, Session Jam, Hall Resident/Attached, Term Duration, SSC & HSC GPA/Board, Gender, Current CGPA, and Stay Duration, shown as factors of Abandonment/Retention of student]
9. Methods of Preprocessing Academic Data for Mining (Contd.)
Data Analysis (contd.)
Factors related to Condition of Academic Institution
[Diagram: Rate of Student Retention, Rate of Student Abandonment, Average CGPA of all Students, Experience of Teachers, and Research & Publications, shown as factors of Condition of Academic Institution]
10. Methods of Preprocessing Academic Data for Mining (Contd.)
Relational Database
[ER diagram: entities Student, Course, and Grade Sheet, linked by the relationships "achieves" and "represents"]
Finding correlations between performance in different courses
11. Methods of Preprocessing Academic Data for Mining (Contd.)
Universal Database & Synthetic Data Population
How have we populated data in the universal database?
Let us consider a 3-credit course, CSE 303.
We assume 5 possible scenarios:
The student appeared in the class tests (CT), had more than 60% attendance, and appeared in the term final examination.
The student appeared in the CT but had less than 60% attendance, and appeared in the term final examination.
The class test and attendance marks were carried over, and the student appeared in the term final examination.
The student appeared in the CT and had more than 60% attendance, but did not appear in the term final examination.
The student attended less than 60% of classes and appeared in neither the CT nor the term final examination.
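The five scenarios can be sketched as a small record generator. This is only an illustrative sketch, not the authors' Synthetic_Generation() algorithm, which the slides do not show. The names `CourseRecord` and `synthetic_record` are hypothetical, the mark maxima (105 for each answer-script section, 60 for CT) are inferred from the ranges given elsewhere in these slides, marks are drawn uniformly at random as a placeholder distribution, and "carried over" is assumed to mean that CT and attendance are copied from an earlier record.

```python
import random
from dataclasses import dataclass
from typing import Optional

# Assumed maxima for a 3-credit course (inferred from the slide ranges).
MAX_SEC_A = 105
MAX_SEC_B = 105
MAX_CT = 60


@dataclass
class CourseRecord:
    attendance_pct: int        # percentage of classes attended
    ct: Optional[int]          # None means the student did not appear in CT
    sec_a: Optional[int]       # None means no term final examination
    sec_b: Optional[int]
    carried_over: bool = False


def synthetic_record(scenario, rng, previous=None):
    """Draw one hypothetical record for scenarios 1..5 listed above."""

    def final_marks():
        return rng.randint(0, MAX_SEC_A), rng.randint(0, MAX_SEC_B)

    if scenario == 1:   # CT taken, attendance > 60%, final taken
        return CourseRecord(rng.randint(61, 100), rng.randint(0, MAX_CT), *final_marks())
    if scenario == 2:   # CT taken, attendance < 60%, final taken
        return CourseRecord(rng.randint(0, 59), rng.randint(0, MAX_CT), *final_marks())
    if scenario == 3:   # CT and attendance carried over, final taken
        if previous is None or previous.ct is None:
            raise ValueError("scenario 3 needs an earlier record with CT marks")
        return CourseRecord(previous.attendance_pct, previous.ct, *final_marks(),
                            carried_over=True)
    if scenario == 4:   # CT taken, attendance > 60%, final not taken
        return CourseRecord(rng.randint(61, 100), rng.randint(0, MAX_CT), None, None)
    if scenario == 5:   # attendance < 60%, neither CT nor final taken
        return CourseRecord(rng.randint(0, 59), None, None, None)
    raise ValueError("scenario must be between 1 and 5")
```

Attendance of exactly 60% is not covered by the scenarios as stated, so the sketch never generates it.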
12. Methods of Preprocessing Academic Data for Mining (Contd.)
Universal Database & Synthetic Data Population (contd.)
Two algorithms have been developed to populate the universal table:
Synthetic_Generation()
Generate_Grade()
Student_Id  CSE303_secA  CSE303_secB  CSE303_CT  CSE303_Attendance  CSE303_Total  CSE303_Grade
0805001     90           75           55         30                 250           A+
0805002     85           70           45         25                 225           A
…           …            …            …          …                  …             …
The records of all courses taken by a student are generated synthetically in a single row of the universal table, keyed by student ID.
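In the sample rows, the total is the sum of the four component marks (90 + 75 + 55 + 30 = 250 and 85 + 70 + 45 + 25 = 225). The sketch below reproduces that arithmetic and attaches a letter grade. It is not the authors' Generate_Grade() algorithm, which the slides do not show. The function names are hypothetical, and the percentage cutoffs are an assumption: only the A+ and A outcomes are anchored by the two sample rows.

```python
# Component columns of one course in the universal table (suffixes as on the slide).
COMPONENTS = ("secA", "secB", "CT", "Attendance")

MAX_TOTAL = 300  # 105 + 105 + 60 + 30 for a 3-credit course (inferred from the slides)

# ASSUMPTION: letter grades by percentage of the total; only A+ and A are
# confirmed by the sample rows, the remaining cutoffs are illustrative.
GRADE_SCALE = [(80, "A+"), (75, "A"), (70, "A-"), (65, "B+"), (60, "B"),
               (55, "B-"), (50, "C+"), (45, "C"), (40, "D")]


def course_total(row, course):
    """Sum the component marks of one course from a universal-table row."""
    return sum(row[f"{course}_{part}"] for part in COMPONENTS)


def letter_grade(total, max_total=MAX_TOTAL):
    """Map a total mark to a letter grade using the assumed scale."""
    pct = 100 * total / max_total
    for cutoff, letter in GRADE_SCALE:
        if pct >= cutoff:
            return letter
    return "F"


row = {"Student_Id": "0805001", "CSE303_secA": 90, "CSE303_secB": 75,
       "CSE303_CT": 55, "CSE303_Attendance": 30}
row["CSE303_Total"] = course_total(row, "CSE303")
row["CSE303_Grade"] = letter_grade(row["CSE303_Total"])
```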
13. Methods of Preprocessing Academic Data for Mining (Contd.)
Data Transformation
Definition                    Credit Hour  Range
SecA_high or SecB_high        3            >= 75 && <= 105
SecA_average or SecB_average  3            >= 60 && < 75
SecA_low or SecB_low          3            < 60
CT_high                       3            >= 48 && <= 60
CT_average                    3            >= 36 && < 48
CT_low                        3            < 36
Grade_high                    3            >= 225 && <= 300
Grade_average                 3            >= 180 && < 225
Grade_low                     3            < 180
Transformation rule table for a 3.0 credit course
Student_ID  SecA_high  SecA_average  SecA_low  SecB_high  SecB_average  SecB_low  CT_high  CT_average  CT_low  Grade_high  Grade_average  Grade_low
0805001     1          0             0         1          0             0         1        0           0       1           0              0
0805002     1          0             0         0          1             0         0        1           0       1           0              0
…           …          …             …         …          …             …         …        …           …       …           …              …
Transformed table from universal table
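The transformation from marks to 0/1 columns can be sketched as follows. This is an illustrative sketch using the cutoffs from the rule table for a 3.0 credit course; the names `CUTOFFS` and `binarize` are hypothetical, and a CT mark of exactly 48 is assigned to CT_high here, which is an assumption about that boundary.

```python
# (high cutoff, average cutoff) per attribute, from the transformation rule table.
CUTOFFS = {
    "SecA": (75, 60),
    "SecB": (75, 60),
    "CT": (48, 36),
    "Grade": (225, 180),
}

LEVELS = ("high", "average", "low")


def binarize(marks):
    """Turn numeric marks into one 0/1 flag per (attribute, level) pair."""
    row = {}
    for attr, (high_cut, avg_cut) in CUTOFFS.items():
        value = marks[attr]
        if value >= high_cut:
            level = "high"
        elif value >= avg_cut:
            level = "average"
        else:
            level = "low"
        for name in LEVELS:
            row[f"{attr}_{name}"] = int(name == level)
    return row


# The two sample students from the universal table (Grade is the course total).
student_1 = binarize({"SecA": 90, "SecB": 75, "CT": 55, "Grade": 250})
student_2 = binarize({"SecA": 85, "SecB": 70, "CT": 45, "Grade": 225})
```

Applied to the two sample students, the flags agree with the rows of the transformed table shown on this slide.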
14. Association Rules for Academic Data
No.  Association Rule of Interest                Purpose
1.   Course_No => CGPA_high                      Performance of Individual Course
2.   Course_No => CGPA_low                       Performance of Individual Course
3.   Sec_A_high => CGPA_high                     Impact of Section of Answer Script
4.   Sec_B_high => CGPA_high                     Impact of Section of Answer Script
5.   CT_high => CGPA_high                        Impact of Class Test
6.   CT_low => CGPA_low                          Impact of Class Test
7.   Hall_Resident => CGPA_low                   Impact of Residence
8.   Attached => CGPA_high                       Impact of Residence
9.   Course_No_1 => Course_No_2                  Correlation of Courses
10.  (Course_No_1, Course_No_2) => Course_No_3   Correlation of Courses
11.  Permanent_Address_City => CGPA_high         Impact of Locality
12.  Permanent_Address_Rural => CGPA_low         Impact of Locality
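Rules of this form are usually scored by support and confidence, the standard measures in association rule mining. The sketch below computes both over the two sample rows of the transformed table from slide 13, treating each row as the set of columns whose value is 1. The function names are hypothetical, and Grade_high stands in for CGPA_high because the sample transformed table has no CGPA column.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent), or None if undefined."""
    base = support(transactions, antecedent)
    if base == 0:
        return None
    return support(transactions, set(antecedent) | set(consequent)) / base


# Each student becomes the set of flags that are 1 in the transformed table.
transactions = [
    {"SecA_high", "SecB_high", "CT_high", "Grade_high"},        # 0805001
    {"SecA_high", "SecB_average", "CT_average", "Grade_high"},  # 0805002
]

rule_support = support(transactions, {"CT_high", "Grade_high"})
rule_confidence = confidence(transactions, {"CT_high"}, {"Grade_high"})
```

With only two rows the numbers carry no statistical weight; the example shows the computation, not a finding.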
15. Future Work
Additional factors to be considered:
Academic Performance: Family Background, Previous Academic Record, Seat Allotment in Hall, Offering Scholarship
Abandonment/Retention: Stay Duration, Session Jam, Unwanted Leaves, Long-term Break
Condition of Institution: Average CGPA of all Students, Term Completion Rate, Abandonment/Retention Rate, Research & Publications
Developing a new mining algorithm, which will be tested using the synthetic dataset
Collecting real data from BIIS and using it, without compromising privacy, to discover knowledge
16. Conclusions
Transforms continuous-valued attributes into a form to which association rule mining algorithms can be applied to obtain the required educational knowledge
Guides the discovery of the required knowledge using a realistic dataset and its application in real-life scenarios
The model is developed using BIIS data but can be generalized to any higher educational institution
17. Any Questions or Suggestions are Welcome