Towards Automated Classification of Discussion Transcripts: A Cognitive Presence Case

LAK'16 Conference paper presentation:

Abstract:
In this paper, we present the results of an exploratory study that examined the problem of automating content analysis of student online discussion transcripts. We looked at the problem of coding discussion transcripts for the levels of cognitive presence, one of the three main constructs in the Community of Inquiry (CoI) model of distance education. Using Coh-Metrix and LIWC features, together with a set of custom features developed to capture discussion context, we developed a random forest classification system that achieved 70.3% classification accuracy and 0.63 Cohen’s kappa, which is significantly higher than values reported in previous studies. Besides the improvement in classification accuracy, the developed system is also less sensitive to overfitting, as it uses only 205 classification features, which is around 100 times fewer features than in similar systems based on bag-of-words features. We also provide an overview of the classification features most indicative of the different phases of cognitive presence, which gives additional insights into the nature of the cognitive presence learning cycle. Overall, our results show great potential of the proposed approach, with the added benefit of providing further characterization of the cognitive presence coding scheme.

  1. Towards Automated Content Analysis of Discussion Transcripts: A Cognitive Presence Case
     Vitomir Kovanović¹, Srećko Joksimović¹, Zak Waters², Dragan Gašević¹, Kirsty Kitto², Marek Hatala³, and George Siemens⁴
     ¹ The University of Edinburgh, ² Queensland University of Technology, ³ Simon Fraser University, ⁴ University of Texas at Arlington
     April 27, 2016, Edinburgh, UK
     v.kovanovic@ed.ac.uk, http://Vitomir.kovanovic.info, @vkovanovic
  2. Overall goal and why it matters
     Goal: automate content analysis of online discussions for the levels of cognitive presence.
     Benefits:
     • Faster coding of messages
     • Operationalization of the coding scheme
     • Monitoring of student progress
     • Speeding up research
  3. Online discussions
     The key ingredient in distance education
     • Used all the time: adopted by most online and blended courses, often not in a productive manner.
     • Supported by social constructivism: the social (co)construction of knowledge is essential for the social-constructivist pedagogies adopted by most online instructors.
     • Produce large amounts of data: often called a gold mine of information about learning processes, they can be used to understand how people learn online.
     • Require a lot of work from instructors: that is why social-constructivist pedagogies work with up to ~30 students.
  4. Community of Inquiry
     Model of online learning experience
     • Cognitive presence: student cognitive engagement
     • Social presence: social climate in the course
     • Teaching presence: course organization & design
  5. Cognitive presence
     Central construct of the CoI model
     Cognitive presence definition: “an extent to which the participants in any particular configuration of a community of inquiry are able to construct meaning through sustained communication” (Garrison, Anderson, and Archer, 1999, p. 89)
     Learning cycle:
     • Triggering event: start of the learning cycle, sense of puzzlement, dilemma.
     • Exploration: brainstorming of different ideas, information gathering.
     • Integration: synthesis of the relevant ideas and information.
     • Resolution: application and testing of the acquired knowledge.
  6. Assessment of three presences
     How we measure cognitive, social, and teaching presence
     Two ways of assessing the levels of the three presences:
     1. Quantitative content analysis instrument: a coding scheme for each presence
     2. CoI survey instrument: 34 Likert-scale questions
  7. Cognitive presence coding scheme
  8. Cognitive presence coding scheme
  9. Challenges of cognitive presence assessment
     Content analysis instrument:
     1. Manual, labor-intensive, time-consuming
     2. Requires expertise with CoI coding schemes
     Survey instrument:
     1. Self-reported: perceived instead of objective values
     2. Selection bias: not all students answer the survey
     3. No real-time feedback on student learning progress
     4. Almost no impact on educational practice
  10. Text classification
      • Build a classifier for coding cognitive presence: by automating the coding of messages, we can overcome many of the challenges identified with CoI model adoption.
      • Builds on previous text-mining work in education: we build on the previous work on the same topic (Kovanovic et al. 2014, …
      • Abandon the kitchen-sink approach: we do not want a bag-of-words approach that is prone to overfitting.
      • Five-class text classifier: the classifier needs to assign a cognitive presence class: 1 = Triggering event, 2 = Exploration, 3 = Integration, 4 = Resolution, 0 = Other (non-cognitive).
  11. Data: Courses
      1. Six offerings of a graduate-level course in software engineering at a distance learning university.
      2. A total of 1,747 messages from 81 students.

      Study dataset:

      | Offering     | Students   | Messages      |
      |--------------|------------|---------------|
      | Winter 2008  | 15         | 212           |
      | Fall 2008    | 22         | 633           |
      | Summer 2009  | 10         | 243           |
      | Fall 2009    | 7          | 63            |
      | Winter 2010  | 14         | 359           |
      | Winter 2011  | 13         | 237           |
      | Average (SD) | 13.5 (5.1) | 291.2 (192.4) |
      | Total        | 81         | 1,747         |
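
The summary rows of the study dataset table can be re-derived from the per-offering counts. The sketch below is only a consistency check, not part of the original analysis; it assumes the reported SD is the sample standard deviation (n − 1 in the denominator), which is what reproduces the slide's figures.

```python
from statistics import mean, stdev

# Per-offering counts copied from the study dataset table.
students = [15, 22, 10, 7, 14, 13]
messages = [212, 633, 243, 63, 359, 237]

def summarize(values):
    """Return (total, mean, sample SD) for a list of counts."""
    return sum(values), mean(values), stdev(values)

student_total, student_mean, student_sd = summarize(students)
message_total, message_mean, message_sd = summarize(messages)

print(f"students: total={student_total}, mean={student_mean:.1f}, SD={student_sd:.1f}")
print(f"messages: total={message_total}, mean={message_mean:.1f}, SD={message_sd:.1f}")
```

The large SD for messages relative to its mean reflects how unevenly the offerings contributed: Fall 2008 alone accounts for more than a third of all messages.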
  12. Data: Messages
      1. Messages coded for the level of cognitive presence on a scale of 0-4.
      2. Manually coded by two coders (agreement = 98.1%; Cohen's κ = 0.974).

      Message coding results:

      | ID  | Phase            | Messages | (%)    |
      |-----|------------------|----------|--------|
      | 0   | Other            | 140      | 8.01%  |
      | 1   | Triggering Event | 308      | 17.63% |
      | 2   | Exploration      | 684      | 39.15% |
      | 3   | Integration      | 508      | 29.08% |
      | 4   | Resolution       | 107      | 6.12%  |
      | All |                  | 1,747    | 100%   |
  13. SMOTE preprocessing
      SMOTE preprocessing is used for class balancing: we generate new data points in the minority classes by synthetic resampling.
      [Figure: SMOTE class balancing. Dark blue: original instances that are preserved; light blue: synthetic instances; red: original instances that are removed.]
      To generate a new data point Z ∈ ℝⁿ:
      • Pick a random existing data point X.
      • Pick the K (in our case 5) instances most similar to X.
      • Pick one of the K neighbors at random (Y).
      • Create the new data point Z as a linear combination: Z = X + rand(0,1) * (Y − X)
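
The four SMOTE steps can be sketched in a few lines. This is a minimal illustration and not the WEKA implementation used in the study; the function name `smote_sample`, the Euclidean distance as the similarity measure, and the toy data are all assumptions made for the example.

```python
import math
import random

def smote_sample(minority, k=5, rng=random):
    """Create one synthetic point from a list of minority-class vectors."""
    # Step 1: pick a random existing data point X.
    x = rng.choice(minority)
    # Step 2: find the k nearest neighbours of X (Euclidean distance), excluding X itself.
    others = [p for p in minority if p is not x]
    neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
    # Step 3: pick one neighbour Y at random.
    y = rng.choice(neighbours)
    # Step 4: interpolate, Z = X + gap * (Y - X), with gap drawn from [0, 1).
    gap = rng.random()
    return [xi + gap * (yi - xi) for xi, yi in zip(x, y)]

# Hypothetical two-dimensional minority class, all points inside the unit square.
rng = random.Random(0)
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
synthetic = [smote_sample(minority, k=5, rng=rng) for _ in range(4)]
```

Because each synthetic point lies on the line segment between an existing point and one of its neighbours, it always stays inside the region already covered by the minority class.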
  14. Extracted features
      205 features extracted in total:
      • LIWC features: 93 different counts indicative of different psychological processes (e.g., affective, cognitive, social, perceptual).
      • Coh-Metrix features: 108 metrics of text coherence (and related metrics).
      • LSA similarity: average coherence of a message's paragraphs to each other. The LSA space is built from Wikipedia articles related to concepts extracted from the topic start message (using TAGME).
      • Named entity count: number of concepts related to the DBPedia computer science category (using DBPedia Spotlight).
      • Discussion context features:
        1. Number of replies
        2. Message depth
        3. Cosine similarity to previous/next message
        4. Thread start/end boolean indicators
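
To make the discussion context features more concrete, here is a minimal sketch of the cosine similarity to the previous/next message and the thread start/end indicators. The slides do not say how messages were vectorized, so the plain term-count representation, the tokenizer, and the function names are assumptions for illustration only.

```python
import math
import re
from collections import Counter

def term_counts(text):
    """Lower-case the text and count word tokens (a deliberately simple tokenizer)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def context_features(thread, i):
    """Context features for message i in a chronologically ordered thread (list of strings)."""
    vectors = [term_counts(m) for m in thread]
    last = len(thread) - 1
    return {
        "is_thread_start": i == 0,
        "is_thread_end": i == last,
        "sim_prev": cosine_similarity(vectors[i], vectors[i - 1]) if i > 0 else 0.0,
        "sim_next": cosine_similarity(vectors[i], vectors[i + 1]) if i < last else 0.0,
    }

# Hypothetical three-message thread.
thread = [
    "How should we test the parser module?",
    "We could test the parser module with unit tests.",
    "Thanks everyone!",
]
features = context_features(thread, 1)
print(features)
```

In this toy thread the second message shares most of its vocabulary with the question it answers and nothing with the closing "thanks" message, which is the kind of contrast these features are meant to capture.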
  15. Random Forest classifier
      • A state-of-the-art ensemble learning method:
        • Builds a large collection of decision trees (i.e., a forest), each using a subset of features (i.e., columns)
        • Reduces the variance without increasing the bias
      • Final class for a data point: a simple majority vote across the forest.
      • Two parameters:
        1. ntree: the number of trees built
        2. mtry: the number of features used
      [Figure: example forest with ntree = 6]
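
The majority-vote step is simple enough to show directly. The sketch below only covers how per-tree predictions are combined; the votes are made up, and the tie-breaking rule (first class encountered wins) is an assumption of this example, not something stated in the slides.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the class predicted by the most trees (ties go to the first class seen)."""
    counts = Counter(tree_predictions)
    return counts.most_common(1)[0][0]

# Hypothetical votes from a forest of ntree = 6 trees for one message,
# using the class codes from the slides (0 = Other ... 4 = Resolution).
votes = [2, 3, 2, 2, 1, 3]
predicted = majority_vote(votes)
print(predicted)
```

Here three of the six trees vote for class 2 (Exploration), so that is the forest's prediction for the message.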
  16. Random Forest classifier
      [Figure: an individual tree built with mtry = 3 out of 8 features]
  17. Hyper-parameter tuning
      • We split the data into training and test sets in a 3:1 ratio.
      • Two parameters:
        1. ntree: the number of trees built (we built 1,000)
        2. mtry: the number of features used (evaluated using 10-fold CV)
      Values of mtry evaluated: {2, 12, 23, 34, 44, 55, 66, 76, 87, 98, 108, 119, 130, 140, 151, 162, 172, 183, 194, 205}.

      Hyper-parameter tuning results:

      |            | mtry | Accuracy (SD) | Kappa (SD)  |
      |------------|------|---------------|-------------|
      | Min        | 194  | 0.68 (0.04)   | 0.59 (0.04) |
      | Max        | 12   | 0.72 (0.04)   | 0.65 (0.05) |
      | Difference |      | 0.04          | 0.06        |
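
The 20 mtry values listed above are consistent with evenly spaced points between 2 and 205, rounded down. Whether the authors generated the grid this way is an assumption; the sketch below only shows that this rule reproduces the listed values.

```python
n_features = 205   # total number of features (from the slides)
n_values = 20      # number of candidate mtry values listed
low = 2            # smallest candidate

# Evenly spaced points between `low` and `n_features`, rounded down.
# Integer arithmetic avoids floating-point rounding surprises.
span = n_features - low
mtry_grid = [low + (i * span) // (n_values - 1) for i in range(n_values)]
print(mtry_grid)
```

Each candidate would then be scored with 10-fold cross-validation on the training set, and the value with the best average performance kept.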
  18. Implementation
      Feature extraction:
      • Coh-Metrix (McNamara, Graesser, McCarthy, & Cai, 2014)
      • LIWC (Tausczik & Pennebaker, 2010)
      • LSA similarity: Text Mining Library for LSA (TML)
      Algorithm implementation:
      • SMOTE algorithm implemented using WEKA
      • Random Forest classifier using the randomForest R package
      • Repeated cross-validation using the caret R package
  19. Performance evaluation
      • We obtained 70.3% classification accuracy (95% CI [0.66, 0.75]) and 0.63 Cohen’s κ.
      • Significant improvements over the Cohen’s κ of 0.41 and 0.48 reported in the Kovanovic et al. (2014) and Waters et al. (2015) studies.

      Confusion matrix (rows: actual class, columns: predicted class):

      | Actual \ Predicted | Other | Triggering | Exploration | Integration | Resolution |
      |--------------------|-------|------------|-------------|-------------|------------|
      | Other              | 79    | 2          | 2           | 2           | 2          |
      | Triggering         | 5     | 67         | 9           | 6           | 0          |
      | Exploration        | 9     | 15         | 35          | 27          | 1          |
      | Integration        | 2     | 2          | 23          | 44          | 16         |
      | Resolution         | 0     | 0          | 4           | 2           | 81         |

      [Figure: out-of-bag (OOB) error rate]
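
The reported accuracy and Cohen's κ can be recomputed from the confusion matrix. The sketch below uses the standard definitions (observed agreement on the diagonal, chance agreement from the row and column totals); the function name is an assumption for this example.

```python
labels = ["Other", "Triggering", "Exploration", "Integration", "Resolution"]

# Rows are actual classes, columns are predicted classes (from the slide).
confusion = [
    [79, 2, 2, 2, 2],
    [5, 67, 9, 6, 0],
    [9, 15, 35, 27, 1],
    [2, 2, 23, 44, 16],
    [0, 0, 4, 2, 81],
]

def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix."""
    n = sum(sum(row) for row in matrix)
    # Observed agreement: share of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

accuracy, kappa = accuracy_and_kappa(confusion)
print(f"accuracy = {accuracy:.3f}, kappa = {kappa:.2f}")
```

Every row of the matrix sums to 87, so chance agreement is exactly 0.2 for five balanced classes; most of the remaining error comes from confusion between Exploration and Integration.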
  20. Performance evaluation
      • Much better performance than in previous studies
      • Slightly below the commonly accepted Cohen’s κ of 0.7
      • Parameter optimization plays an important role (a difference of 0.06 in Cohen’s κ and 4 percentage points in classification accuracy)
      • Feature space ~100x smaller than in the previous study
        • Limits the chances of overfitting
      • Features are more context-independent
        • Particularly important for different pedagogical contexts (e.g., MOOC discussions)
      • “Theory-driven” feature space
  21. Feature importance
      • A side product of the Random Forest algorithm
      • Mean Decrease Gini (MDG): a measure of a feature's contribution to reducing decision tree impurity
      • A long tail of feature importance
        • A few features are very important, most not so much
      • Provides a more detailed operationalization of the CoI coding scheme
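
As a rough illustration of what MDG accumulates, the sketch below computes the Gini impurity of a node and the impurity decrease produced by a single split. An importance score of this kind sums such decreases over all splits that use a given feature and averages them across trees; the exact scaling differs between implementations, and the toy class counts are hypothetical.

```python
def gini(counts):
    """Gini impurity 1 - sum(p_k^2) for a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)

def gini_decrease(parent, left, right):
    """Weighted impurity reduction achieved by splitting `parent` into `left` and `right`."""
    n = sum(parent)
    weighted_children = (sum(left) / n) * gini(left) + (sum(right) / n) * gini(right)
    return gini(parent) - weighted_children

# Hypothetical node with 10 messages from each of two classes, split perfectly.
parent = [10, 10]
left, right = [10, 0], [0, 10]
decrease = gini_decrease(parent, left, right)
print(decrease)
```

A feature that often produces clean splits like this one accumulates a large total decrease, which is why a handful of features dominate the ranking while most contribute little.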
  22. Feature importance

      | #  | Variable          | Description                      | MDG*  | Other | TE    | Exp.   | Int.   | Res.   |
      |----|-------------------|----------------------------------|-------|-------|-------|--------|--------|--------|
      | 1  | cm.DESWC          | Number of words                  | 32.91 | 55.41 | 80.91 | 117.71 | 183.30 | 280.68 |
      | 2  | ner.entity.cnt    | Number of named entities         | 26.41 | 13.44 | 21.67 | 28.84  | 44.75  | 64.18  |
      | 3  | cm.LDTTRa         | Lexical diversity, all words     | 21.98 | 0.85  | 0.77  | 0.71   | 0.65   | 0.58   |
      | 4  | message.depth     | Position within a discussion     | 19.09 | 2.39  | 1.00  | 1.84   | 1.87   | 2.00   |
      | 5  | cm.LDTTRc         | Lexical diversity, content words | 17.12 | 0.95  | 0.90  | 0.86   | 0.82   | 0.78   |
      | 6  | cm.LSAGN          | Avg. givenness of each sentence  | 16.63 | 0.10  | 0.14  | 0.18   | 0.21   | 0.24   |
      | 7  | liwc.Qmark        | Number of question marks         | 16.59 | 0.27  | 1.84  | 0.92   | 0.58   | 0.38   |
      | 8  | message.sim.prev  | Similarity with previous message | 16.41 | 0.20  | 0.06  | 0.22   | 0.30   | 0.39   |
      | 9  | cm.LDVOCD         | Lexical diversity, VOCD          | 15.43 | 12.92 | 28.99 | 53.57  | 83.47  | 97.16  |
      | 10 | liwc.money        | Number of money-related words    | 14.38 | 0.21  | 0.32  | 0.32   | 0.65   | 0.99   |
      | 11 | cm.DESPL          | Avg. number of paragraphs        | 12.47 | 4.26  | 6.37  | 7.49   | 10.17  | 14.05  |
      | 12 | Message.sim.next  | Similarity with next message     | 11.74 | 0.08  | 0.34  | 0.20   | 0.22   | 0.22   |
      | 13 | Message.reply.cnt | Number of replies                | 11.67 | 0.42  | 1.44  | 0.82   | 1.10   | 0.84   |
      | 14 | cm.DESSC          | Sentence count                   | 11.67 | 4.28  | 6.36  | 7.49   | 10.17  | 14.29  |
      | 15 | lsa.similarity    | Avg. LSA sim. between sentences  | 9.69  | 0.29  | 0.47  | 0.54   | 0.62   | 0.67   |
      | 16 | cm.DESSL          | Avg. sentence length             | 9.60  | 11.88 | 13.62 | 16.69  | 19.36  | 21.73  |
      | 17 | cm.DESWLsyd       | SD of word syllables count       | 8.92  | 0.98  | 1.33  | 0.98   | 0.97   | 0.97   |
      | 18 | liwc.i            | Number of FPS* pronouns          | 8.84  | 4.33  | 2.82  | 2.37   | 2.51   | 2.19   |
      | 19 | cm.RDFKGL         | Flesch-Kincaid Grade level       | 8.29  | 7.68  | 10.30 | 10.19  | 11.13  | 11.99  |
      | 20 | cm.SMCAUSwn       | WordNet overlap between verbs    | 8.14  | 0.38  | 0.48  | 0.51   | 0.50   | 0.47   |

      * MDG: mean decrease in Gini impurity index; FPS: first person singular. Phase columns: Other, TE (triggering event), Exp. (exploration), Int. (integration), Res. (resolution).
  23. Feature importance (same table as on slide 22)
  24. Operationalization of cognitive presence
      Higher levels of cognitive presence actually mean…
      The higher the cognitive presence (O -> TE -> E -> I -> R):
      • The longer the message.
      • The more concepts mentioned (more named entities).
      • The lower the lexical diversity (both at the content level and in general).
      • The later its position in the thread. The exception is non-cognitive messages, which tend to occur closer to the end as well.
      • The higher the givenness of each sentence.
      • The fewer the question marks. The exception is non-cognitive messages, which have the smallest number of question marks.
      • The higher the number of paragraphs and sentences.
      • The higher the average sentence length and the similarity of sentences to each other.
      • The more money-related terms.
  25. Operationalization of cognitive presence
      Unique characteristics
      • Triggering event (TE): inconsistent syllable count; most replies; low similarity with the next message.
      • Exploration (E): aside from non-cognitive messages, fewest replies; question marks more frequent than in integration and resolution.
      • Integration (I): more replies than exploration and resolution.
      • Resolution (R): lowest readability.
      • Non-cognitive (other) (NC): high readability; very few replies; late in the thread; consistent syllable count; little verb overlap; use of first person singular pronouns; no similarity with the next message; fewest question marks.
  26. Summary
      Almost done
      • We developed a classifier for automated coding of discussion messages for the levels of cognitive presence.
      • We significantly improved the classification accuracy (Cohen’s κ = 0.63, classification accuracy = 70.3%).
      • The feature space is ~100x smaller.
      • The feature space is also more generalizable.
      • We provided a more detailed operationalization of the cognitive presence coding scheme.
      Future work:
      • We are currently coding a dataset from two MOOCs offered by the University of Edinburgh.
      • Evaluation of the classifier in the MOOC context.
  27. Our plan: It will be a small and fun project
  28. Reality: But it is definitely not a small project
  29. The end: That is all folks. Thank you
  30. References
      • Garrison, D. R., Anderson, T., & Archer, W. (1999). Critical Inquiry in a Text-Based Environment: Computer Conferencing in Higher Education. The Internet and Higher Education, 2(2–3), 87–105.
      • Kovanović, V., Joksimović, S., Gašević, D., & Hatala, M. (2014). Automated Content Analysis of Online Discussion Transcripts. In Proceedings of the Workshops at the LAK 2014 Conference co-located with 4th International Conference on Learning Analytics and Knowledge (LAK 2014). Indianapolis, IN. Retrieved from http://ceur-ws.org/Vol-1137/
      • McNamara, D. S., Graesser, A. C., McCarthy, P. M., & Cai, Z. (2014). Automated Evaluation of Text and Discourse with Coh-Metrix. Cambridge University Press.
      • Tausczik, Y. R., & Pennebaker, J. W. (2010). The Psychological Meaning of Words: LIWC and Computerized Text Analysis Methods. Journal of Language and Social Psychology, 29(1), 24–54. http://doi.org/10.1177/0261927X09351676
      • Waters, Z., Kovanović, V., Kitto, K., & Gašević, D. (2015). Structure matters: Adoption of structured classification approach in the context of cognitive presence classification. In Proceedings of the 11th Asia Information Retrieval Societies Conference, AIRS 2015.
