Assessing Writing 19 (2014) 51–65
The effects of computer-generated feedback on the quality of writing
Marie Stevenson∗, Aek Phakiti
University of Sydney, Australia
Article info
Article history:
Available online 17 December 2013
Keywords:
Automated writing evaluation (AWE)
Computer-generated feedback
Effects on writing quality
Critical review
Abstract
This study provides a critical review of research into the effects of computer-generated feedback, known as
automated writing evaluation (AWE), on the quality of students’ writing. An initial research survey revealed
that only a relatively small number of studies have been carried out and that most of these studies have
examined the effects of AWE feedback on measures of written production such as scores and error frequencies.
The critical review of the findings for written production measures suggested that there is modest evidence
that AWE feedback has a positive effect on the quality of the texts that students produce using AWE, and that
as yet there is little evidence that the effects of AWE transfer to more general improvements in writing
proficiency. Paucity of research, the mixed nature of research findings, heterogeneity of participants, contexts
and designs, and methodological issues in some of the existing research were identified as factors that limit
our ability to draw firm conclusions concerning the effectiveness of AWE feedback. The study provides
recommendations for further AWE research, and in particular calls for more research that places emphasis on
how AWE can be integrated effectively in the classroom to support writing instruction.
© 2013 Elsevier Ltd. All rights reserved.

∗ Corresponding author.
E-mail addresses: marie.stevenson@sydney.edu.au (M. Stevenson), aek.phakiti@sydney.edu.au (A. Phakiti).
1075-2935/$ – see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.asw.2013.11.007
1. Introduction
This study provides a critical review of literature on the pedagogical effectiveness of computer-
based educational technology for providing students with feedback on their writing that is commonly
known as Automated Writing Evaluation (AWE).1 AWE software provides computer-generated feed-
back on the quality of written texts. A central component of AWE software is a scoring engine that
generates automated scores based on techniques such as artificial intelligence, natural language
processing and latent semantic analysis (See Dikli, 2006; Philips, 2007; Shermis & Burstein, 2003;
Yang, Buckendahl, Juszkiewicz, & Bhola, 2002). AWE software that is used for pedagogical purposes
also provides written feedback in the form of general comments, specific comments and/or corrections.
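As a rough illustration of how latent semantic analysis can be used to compare texts by content rather than exact wording, the minimal Python sketch below scores an invented student sentence against two invented reference texts using standard scikit-learn components. It is a hypothetical toy example, not the implementation of any of the scoring engines or programs discussed in this article.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# Invented texts for illustration only.
reference_texts = [
    "Plants convert sunlight into chemical energy through photosynthesis.",
    "Photosynthesis uses light, water and carbon dioxide to produce glucose.",
]
student_text = "Leaves use light and water to make sugar for the plant."

# Build a term-document matrix and project it into a low-dimensional
# "semantic space"; texts with similar content end up close together
# even when their surface wording differs.
tfidf = TfidfVectorizer(stop_words="english").fit_transform(reference_texts + [student_text])
semantic_space = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# A crude content score: the student text's average cosine similarity
# to the reference texts in the reduced space.
score = cosine_similarity(semantic_space[-1:], semantic_space[:-1]).mean()
print(f"Content similarity score: {score:.2f}")
```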
Originally, AWE was primarily used in high-stakes testing situations to generate summative scores
to be used for assessment purposes. Widely used, commercially available scoring engines are Project
Essay GraderTM (PEG), e-rater®, Intelligent Essay AssessorTM (IEA), and IntelliMetricTM. In recent
years, the use of AWE for the provision of formative feedback in the writing classroom has steadily
increased, particularly in classrooms in the United States. AWE programs are currently being used
in many elementary, high school, college and university classrooms with a range of writers from
diverse backgrounds. Examples of commercially available AWE programs designed for classroom use
are: Criterion (Educational Testing Service); MY Access! (Vantage Learning); Write to Learn and Summary
Street (Pearson Knowledge Technologies); and Writing Roadmap (McGraw Hill). These programs
sometimes incorporate the same scoring engine as used in summative programs. For example, Crite-
rion incorporates the e-rater scoring engine and MY Access! incorporates the IntellimetricTM scoring
engine.
Common to all AWE programs designed for classroom use is that they provide writers with multiple
drafting opportunities, and upon receiving feedback writers can choose whether or not to use this
feedback to revise their texts. AWE programs vary in the kinds of feedback they provide writers. Some
provide feedback on both global writing skills and language use (e.g., Criterion, MY Access!), whereas
others focus on language use (e.g., QBL) and some claim to focus primarily on content knowledge (e.g.,
Write to Learn and Summary Street). Some programs incorporate other tools such as model essays,
scoring rubrics, graphic organizers, and dictionaries and thesauri.
Like many other forms of educational technology, the use of AWE in the classroom has been the
subject of controversy, with scholars taking divergent stances. On the one hand, AWE has been hailed
as a means of liberating instructors, freeing them up to devote valuable time to aspects of writing
instruction other than marking assignments (e.g., Burstein, Chodorow, & Leacock, 2004; Herrington
& Moran, 2001; Hyland & Hyland, 2006; Philips, 2007). It has been seen as impacting positively on
the quality of students’ writing, due to the immediacy of its ‘on-line’ feedback (Dikli, 2006), and the
multiple practice and revision opportunities it provides (Warschauer & Ware, 2006). It has also been
claimed to have positive effects on student autonomy (Chen & Cheng, 2008).
On the other hand, the notion that computers are capable of providing effective writing feedback has
aroused considerable suspicion, perhaps fueled by the fearful specter of a world in which humans are
replaced by machines. Criticisms have been made concerning the capacity of AWE to provide accurate
and meaningful scores (e.g., Anson, 2006; Freitag Ericsson, 2006). There is a common perception that
computers are not capable of scoring human texts, as they do not possess human inferencing skills and
background knowledge (Anson, 2006). Other criticisms relate to the effects that AWE has on students’
writing. AWE has been accused of reflecting and promoting a primarily formalist approach to writing,
in which writing is viewed as simply being “mastery of a set of subskills” (Hyland & Hyland, 2006,
p. 95). Comments generated by AWE have been said to place too much emphasis on surface features
of writing, such as grammatical correctness (Hyland & Hyland, 2006) and the effects of writing for a
non-human audience have been decried. There is also fear that using AWE feedback may be more of an
exercise in developing test-taking strategies than in developing writing skills, with students writing
to the test by consciously or unconsciously adjusting their writing to meet the criteria of the software
(Patterson, 2005).
Positive and negative claims regarding the effects of AWE on students’ writing are not always based
on empirical evidence, and at times appear to reflect authors’ own ‘techno-positivistic’ or ‘technopho-
bic’ stances toward technology in the writing classroom. Moreover, quite a lot of the research that has
been carried out is from authors who have been involved in developing a particular AWE program
or who are affiliated with organizations that have developed these programs, and so could contain a bias
toward showing AWE in a positive light. Consequently, there is a lack of clarity concerning the current
state of evidence for the effects on the quality of students’ writing of AWE programs designed for
teaching and learning purposes.

1. Other terms found in the literature are automated essay evaluation (AEE) (See Shermis & Burstein, 2013) and writing evaluation technology.
However, it is important to be aware that over the past decades there has also been controversy
about the effects of teacher feedback on writing. Perhaps the strongest opponent of classroom writing
feedback was Truscott (1996), who claimed that feedback on grammar should be abandoned, as it
ignored deeper learning processes, only led to pseudo-learning and had a negative effect on the quality
of students’ writing. While most scholars have taken less extreme positions, in a review of issues
relating to feedback in the classroom, Hyland and Hyland (2006) concluded that there was surprisingly
little consensus about the kinds of feedback that are effective and in particular about the long term
effects of feedback on writing development. However, some research synthesis evidence exists for the
effectiveness of teacher feedback. In a recent meta-analytic study, Biber, Nekrasova, and Horn (2011)
found that, when compared to no feedback, teacher feedback was associated with gains in writing
development for both first and second language writers. They found that a focus on content and
language use was more effective than a focus on form only, especially for second language
writers. They also found that comments were more effective than error correction, even for improving
grammatical accuracy. It is therefore timely to evaluate whether there is evidence that computer-
generated feedback is also associated with improvements in writing.
To date, the thrust of AWE research has been on validation through the examination of the psy-
chometric properties of AWE scores by, for example, calculating the degree of correlation between
computer-generated scores and scores given by human raters. Studies have frequently found high
correlations between AWE scores and human scores, and these results have been taken as providing
evidence that AWE scores provide a psychometrically valid measure of students’ writing. (See two
volumes edited by Shermis and Burstein (2003, 2013) for detailed results and in-depth discussion of
the reliability and validity of specific AWE systems). Such studies, however, do not inform us about
whether AWE is effective as a classroom tool to actually improve students’ writing. As Warschauer
and Ware (2006) pointed out, while evidence of psychometric reliability and validity is a necessary
pre-requisite, it is not sufficient for understanding whether AWE ‘works’ in the sense of contributing
to positive outcomes for student learning. Even the recently published ‘Handbook of Automated Essay
Evaluation’ (Shermis & Burstein, 2013), although it pays some attention to AWE as a teaching and
learning tool, still has a strong psychometric and assessment focus.
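For readers unfamiliar with this kind of validation work, the short sketch below (in Python, with invented scores) shows the sort of human-machine correlation that such studies typically report; as argued above, a high correlation of this kind does not by itself show that AWE improves student writing.

```python
from scipy.stats import pearsonr

# Hypothetical holistic scores (1-5 scale) for ten essays; these values are
# invented for illustration and are not taken from any study reviewed here.
human_scores   = [3, 4, 2, 5, 4, 3, 5, 2, 4, 3]
machine_scores = [3, 4, 3, 5, 4, 2, 5, 2, 4, 4]

# A high Pearson correlation is usually reported as evidence of scoring
# validity, but it says nothing about instructional effectiveness.
r, p = pearsonr(human_scores, machine_scores)
print(f"Human-machine agreement: r = {r:.2f}, p = {p:.3f}")
```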
Although a number of individual studies have examined the effects of AWE feedback in the
classroom, no comprehensive review of the literature exists that examines whether AWE feedback
improves the quality of students’ writing. Warschauer and Ware (2006) provided a thought-provoking
discussion of some existing research on AWE in the classroom and used this to make recommenda-
tions for future AWE research. However, they only provided a limited review that did not include all of
the then available research and did not provide an overview of the evidence for the effects of AWE on
students’ writing. Moreover, since their paper was written a number of studies have been published
in this area.
2. The current study
The current study provides an evaluation of the available evidence for the effects of AWE feedback
in the writing classroom in terms of written production. The study focuses on research involving AWE
systems specifically designed as tools for providing formative evaluation in the writing classroom,
rather than AWE systems designed to provide summative assessment in testing situations. The purpose
of formative evaluation is to provide writers with individual feedback that can form the basis for further
learning (Philips, 2007). In formative evaluation, there is a need to inform students not only about their
level of achievement, but also about their specific strengths and weaknesses. Formative evaluation can
be said to involve assessment for learning, rather than assessment of learning (Taylor, 2005). In this
study, feedback is viewed as encompassing both numeric feedback (i.e., scores and ratings) and written
feedback (i.e., global or specific comments on the quality of the text and/or identification of specific
problems in the actual text).
The study focuses on the effects of AWE on written production, because the capability to improve
the quality of students’ texts is central to claims made about the effectiveness of AWE feedback, and
because, likely as a consequence of this, the bulk of AWE pedagogical research focuses on written pro-
duction outcomes. The study includes AWE research on students from diverse backgrounds, in diverse
teaching contexts, and receiving diverse kinds of feedback from diverse AWE programs. The scope of
the research included is broad due to the relatively small number of existing studies and the hetero-
geneity of these studies. The study does not aim to make comparisons or draw conclusions about the
relative effects of AWE feedback on student writing for specific populations, contexts, feedback types
or programs. Instead, it aims to critically evaluate the effects of AWE feedback on written production
by identifying general patterns and trends, and identifying issues and factors that may impact on these
effects.
The study is divided into two stages: a research survey and a critical review. The objective of the
research survey is to determine the maturity of the research domain, and to provide a characterization
of the existing research that can be drawn on in the critical review. The objective of the critical review,
which is the central stage, is to identify overall patterns in the research findings and to evaluate and
interpret these findings, taking account of relevant issues and factors.
3. Method
3.1. The literature search
A comprehensive and systematic literature search was conducted to identify relevant primary
sources for inclusion in the research survey and critical review. Both published research (i.e., journal
articles, book chapters and reports) and unpublished research (i.e., theses and conference papers)
were identified.
The following means of identifying research were used:
a) Search engines: Google Scholar, Google.
b) Databases: ERIC, MLA, PsychInfo, SSCI, Ovid, PubPsych, Linguistics and Language Behavior
Abstracts (LLBA), Dissertation Abstracts International, Academic Search Elite, Expanded Academic,
ProQuest Dissertation and Theses Full-text, and Australian Education Index.
c) Search terms used: automated writing evaluation, automated writing feedback, computer-
generated feedback, computer feedback, automated essay scoring, automated evaluation,
electronic feedback, and program names (e.g., Criterion, Summary Street, Intelligent Essay Assessor,
Write to Learn, MY Access!).
d) Websites: ETS website (ets.org) (ETS Research Reports, TOEFL iBT Insight series, TOEFL iBT research
series, TOEFL Research Reports); AWE software websites.
e) Journals from 1990 to 2011: CAELL Journal; CALICO Journal; College English; English Journal; Com-
puter Assisted Language Learning; Computers and Composition; Educational Technology Research
and Development; English for Specific Purposes; IEEE Intelligent Systems; Journal of Basic Writ-
ing; Journal of Computer-Based Instruction; Journal of Educational Computing Research; Journal
of Research on Technology in Education; Journal of Second Language Writing; Journal of Technology,
Learning and Assessment; Language Learning and Technology; Language
Learning; Language Teaching Research; ReCALL; System; TESL-EJ.
f) Reference lists of already identified publications. In particular, the Ericsson and Haswell (2006)
bibliography.
To be included, a primary source had to focus on empirical research on the use of AWE feedback
generated by one or more commercially or non-commercially available programs for the formative
evaluation of texts in the writing classroom. The program reported on needed to provide text-
specific feedback. Studies were excluded that reported on programs that provided generic writing
guidelines (e.g., The Writing Partner: Zellermayer, Salomon, Globerson, & Givon, 1991; Essay Assist:
Chandrasegaran, Ellis, & Poedjosoedarmo, 2005). Studies that reported results already reported else-
where were also excluded. Where the same results were reported more than once, published studies
were chosen over unpublished ones or, if both were published, the first publication was chosen. This
led to the exclusion of Grimes (2005) and Kintsch et al. (2000).
Based on the above criteria, 33 primary sources were identified for inclusion in the research survey
(See Appendix A).
3.2. Coding of research survey
A coding scheme of study descriptors was developed for the research survey. The unit of coding
was the study. A study was defined as consisting of “a set of data collected under a single research plan
from a designated sample of respondents” (Lipsey & Wilson, 2001, p. 76). As one of the publications,
Elliot and Mikulas (2004), included four studies with different samples, this led to a total of 36 studies
being identified.
In order to obtain an overview of the scope of the research domain, the studies were first classified
in terms of constructs of effectiveness: Product, Process and Perceptions. Lai (2010) defined effec-
tiveness of AWE feedback in terms of three dimensions: (1) the effects on written production (e.g.,
quality scores, error frequencies and rates, lexical measures and text length); (2) the effects on writing
processes (e.g., rates and types of revisions, editing time, time on task, and rates of text production);
and (3) perceived usefulness. In our study, combinations of these constructs were possible, as some
studies included more than one construct.
Subsequently, as the focus of the study is writing outcomes, only studies that included Product
measurements were coded in terms of Substantive descriptors and Methodological descriptors (See
Lipsey & Wilson, 2001). Substantive descriptors relate to substantive aspects of the study, such as the
characteristics of the intervention and the research context. Methodological descriptors relate to the
methods and procedures used in the study. Table 1 lists the coding categories for both kinds of descrip-
tors and the coding options within each category. In the methodological descriptors, ‘Control group’
refers to whether the study included a control condition and whether this involved comparing AWE
feedback with a no feedback condition or with a teacher feedback condition. ‘Text’ refers to whether
outcomes were measured using texts for which AWE feedback had been received or other texts, such
as writing assessment tasks. ‘Outcome measure’ refers to the measure(s) of written production that
were included in the study.
The coding categories and options were developed inductively by reading through the sample
studies. Developing the coding scheme was a cyclical process, and each study was coded a number
of times, until the coding scheme was sufficiently refined. These coding cycles were carried out by
the first researcher.

Table 1
Research survey coding scheme.
Substantive descriptors:
Publication type: ISI-listed journal; non-ISI listed journal; book chapter; thesis; report; unpublished paper
AWE program: open coding
Country: open coding
Educational context: elementary; high school; elementary/high school; university & college
Language background: L1; L1 & ESL; EFL/ESL only; unspecified
Methodological descriptors:
Design: between group; within group; between & within group; single group
Reporting: statistical testing; descriptive statistics; no statistics
Control group: no feedback; teacher feedback; no feedback & teacher feedback; different AWE conditions; no control group
Text: AWE texts; other texts; AWE texts & other texts
Outcome measure: scores; scores & other product measures; errors; citations

Table 2
Research survey: constructs.
Product: 17
Product & process: 4
Product & perceptions: 5
Product, process, & perceptions: 4
Perceptions: 5
Perceptions & process: 1
Total: 36

The reliability of the coding was checked through the coding of 12 studies (one
third of the data) by the second researcher. Rater reliability was calculated using Cohen’s kappa. For
the substantive descriptors the kappa values were all 1.00, except for language background, which
was .75. For the methodological descriptors the kappa values were .85 for Design, 1.00 for Reporting,
.85 for Control group, 1.00 for Text and .85 for Outcome measure. Any disagreements were resolved
through discussion.
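As a concrete illustration of the agreement statistic used in this reliability check, the sketch below computes Cohen's kappa for two coders over a handful of hypothetical Design codings; the labels are invented and do not reproduce the actual codings of the 12 double-coded studies.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical Design-descriptor codings by two coders for six studies.
coder_1 = ["between", "within", "between", "single", "between", "within"]
coder_2 = ["between", "within", "between", "between", "between", "within"]

# Cohen's kappa corrects raw percentage agreement for the agreement
# expected by chance; 1.00 indicates perfect agreement.
kappa = cohen_kappa_score(coder_1, coder_2)
print(f"Cohen's kappa for the Design descriptor: {kappa:.2f}")
```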
For the research survey, the frequencies of the coding categories were collated and this information
was used to describe the characteristics of the studies in the research sample. For the critical literature
review, the findings of the sample studies were critically discussed in relation to the characteristics
of the studies identified in the research survey and also in relation to strengths or weaknesses of
particular studies.
3.3. Research survey
Table 2 shows that the primary focus of AWE research has so far been on the effects of AWE on
written production. Thirty of the thirty-six studies include Product measures: 17 focus solely on Product,
and another 13 studies involve Product in combination with one or more of the other constructs.
The secondary focus has been on Perceptions, with five studies focusing solely on Perceptions, and
another 10 including Perceptions. No studies have focused solely on Process. In the remainder of the survey,
the thirty studies involving Product measurements are characterized.
Table 3 shows that, in terms of types of publication, relatively few of the studies have appeared in
ISI-listed journals or in books. A number of the studies are from non-ISI-listed journals, and a number
are unpublished papers from conferences or websites. Table 3 also shows that 10 AWE programs are
involved in the sample and that the majority of these have been developed by organizations that are
major players in the field of educational technology: Criterion from ETS, MY Access! from Vantage
Learning, IEA and Summary Street from Pearson Knowledge Analysis Technologies. Criterion is the
program that has been examined most frequently.
Table 3
Publication, program and feedback.
Publication (k): ISI-listed 7; non-ISI-listed 7; book chapter 1; thesis 5; report 1; unpublished paper 9.
Program (k): Criterion 11; MY Access! 5; Writing Roadmap 1; ETIPS 2; IEA 1; LSA semantic space 1; Summary Street 3; ECS 1; SAIF 1; QBL 4.
Feedback (k): content & language 20; content 5; language 4; citations 1.

Criterion, MY Access! and Writing Roadmap provide scores and feedback on both content and
language. However, one of the studies that examined Criterion (i.e., Chodorow, Gamon, & Tetreault,
2010) limited itself to examining feedback on article errors. Summary Street, IEA, LSA and ECS are all
based on a technique known as latent semantic analysis that purports to focus primarily on content
feedback. ETIPS provides feedback for pre-service teachers on tasks carried out in an on-line case-based
learning environment. SAIF provides feedback on the citations in a text. QBL provides comments on
language errors only. The table shows that most of the studies have involved programs that provide
both content and language feedback.
Table 4 shows that the majority of studies were carried out in classrooms in the United States, with
the remaining studies being carried out in Asian countries, with the exception of a single study carried
out in Egypt. University and college contexts were the most common, followed by high school contexts,
and then elementary contexts. Almost half the studies do not specify the language background of the
participants. Among the studies that did report the language backgrounds of the participants, only two
(i.e., Chodorow et al., 2010; Choi, 2010) investigated language background as a variable affecting the
effects of AWE feedback. Chodorow et al. (2010) compared the effects of Criterion
feedback on the article errors of native and non-native speakers, and Choi (2010) compared the effects
of Criterion feedback on written production measures of EFL students in Korea and ESL students in
the U.S.

Table 4
Country, context, language background and sample size.
Country (k): USA 21; Taiwan 4; USA & Korea 1; Japan 1; China 1; Hong Kong 1; Egypt 1.
Educational context (k): university & college 17; high school 8; elementary 3; elementary & high school 2.
Language background (k): L1 1; mixed 6; EFL 8; EFL & ESL 1; unspecified 14.
Sample size (k): <10 1; 11–50 4; 51–100 9; 101–200 5; >200 10; unspecified 1.
3.4. Methodological features
Table 5 shows that most of the studies involved statistical testing, and that between-group designs,
in which one or more AWE conditions were compared with one or more control conditions, are the
most common. There were also a number of within-group comparisons in which the same
group of students was compared across drafts and/or texts. One study (i.e., Scharber, Dexter, & Riedel,
2008) used a single group design in which students’ ETIPS scores were correlated with the number of
drafts they submitted.
Table 5 also shows that the most common control group for the between group comparisons
involved a condition in which students received no feedback. In some cases, students in this con-
dition wrote the same texts as students in the experimental condition(s) but received no feedback on
them, and in other cases students in the control condition did not produce any experimental texts.
However, it is unclear in most of the studies whether students in the control condition did receive
some teacher feedback during their normal classroom instruction. Only three studies have explicitly
compared AWE feedback to teacher feedback.
In addition, the table shows that many of the studies have examined the effects of AWE feedback
on AWE texts. However, 11 of the studies focus partly or exclusively on the transfer effects of AWE to
the quality of texts that were not written using AWE.
Lastly, Table 5 shows that scores, followed by errors, are the most common written production
measures that have been examined in the studies. Other measures that have been examined include
text length, sentence length, lexical measures and number of citations.

Table 5
Methodological features.
Design (k): between groups 20; within groups 7; between & within 2; single group 1.
Reporting (k): statistical testing 23; descriptive statistics 3; no statistics 4.
Control (k): no feedback 17; teacher feedback 3; no feedback & teacher feedback 1; different AWE conditions 1; no control 8.
Text (k): AWE text 19; other text 9; both AWE and other text 2.
Outcome (k): scores 13; scores + other measures 11; errors 5; citations 1.
4. Critical review
The research survey has shown that the AWE pedagogical research domain is not a very mature
one. Even though written production has been the main focus of research to date, the total number
of studies carried out remains relatively small, and a number of these studies are either unpublished
papers or published in unranked journals, and perhaps as a consequence are lacking in rigor. Moreover,
these studies are highly heterogeneous, varying in terms of factors such as the AWE program that is
examined, the design of the study, and the educational context in which the studies were carried
out. Hence, not surprisingly, the research has produced mixed and sometimes contradictory results.
As a result, there is only modest evidence that AWE feedback has a positive effect on the quality of
students’ writing and, as the research survey showed, much of the available evidence relates to the
effectiveness of AWE in improving the quality of texts written using AWE feedback.
The evidence for the effects of AWE on writing quality from within group comparisons can be said
to be stronger than the evidence from between-group comparisons. In general, within-group studies
have shown that AWE scores increase and the number of errors decreases across AWE drafts and
texts produced by the same writers (e.g., Attali, 2004; Choi, 2010; El Ebyary & Windeatt, 2010; Foltz,
Laham, & Landauer, 1999; Shermis, Garvan, & Diao, 2008; Warden & Chen, 1995). This would appear
to indicate that writers are able to incorporate AWE feedback to improve the quality and accuracy of
AWE texts – at least according to the criteria that AWE programs use to evaluate texts. However, due
to methodological issues, some of the results of within-group studies need to be carefully interpreted.
To give an example, Attali (2004) excluded 71% of his data set from analysis because the writers did
not undertake any revising or redrafting. While the remaining students did on average increase their
score across drafts of the same texts, the lack of utilization of AWE by over two thirds of the cohort
at the very least places a question mark against the efficacy of AWE for stimulating students to revise
their texts. Moreover, an obvious limitation of within-group comparisons is that the lack of a control
group makes it difficult to conclude with certainty that improvements are actually attributable to the
use of AWE software. Improvements made by students to successive drafts of a particular text could be
attributable to their own revising skills rather than to their use of revisions suggested by AWE feedback.
Improvements made to successive texts could possibly be attributable to other instructional factors
or possibly even to developmental factors.
The findings from between-group comparisons, which compare one or more AWE conditions with
one or more control conditions, are more mixed, and those findings that provide positive evidence
frequently suffer from serious methodological drawbacks. More than half the studies using between
groups comparisons showed either mixed effects or no effects for AWE feedback on writing outcomes.
Mixed effects involve effects being found for some texts but not for others (e.g., Riedel, Dexter, Scharber,
& Doering, 2006), for some measures but not for others (e.g., Rock, 2007), or for some groups of writers
and not for others (e.g., Schroeder, Grohe, & Pogue, 2008). In a number of cases, in their discussions
these studies largely ignore any negative evidence and hence draw conclusions about the effectiveness
of AWE that are more optimistic than appear to be warranted. For example, in a study by Schroeder
et al. (2008) on the effectiveness of Criterion in improving writing in a criminal justice writing course,
one of the three groups of students utilizing AWE feedback did not achieve significantly higher final
course grades than the control group. However, possible reasons for the non-significance of the results
for this third group are not mentioned and a very strong positive conclusion is drawn: “Results from
this study overwhelmingly point toward the value of technology when teaching writing skills” (p.
444). However, we did also find an example in which the authors did not appear to do justice to
their findings. Chodorow et al. (2010) found that Criterion reduced the article error rate of non-native
speakers, but not of native speakers. However, the study did not report the article error rates for the
native speakers and did not raise the point that AWE may be less effective for native speakers simply
because native speakers do not tend to make many article errors. In this particular case, the lack of
a significant effect for native speakers should not be taken at face value as negative evidence for the
effectiveness of AWE.
A number of studies comparing AWE feedback to no feedback have found significant positive effects
for AWE on writing outcomes. For example, in a study by Franzke et al. (2005) on Summary Street
using a pre-test/posttest design with random assignment to an AWE or a no-feedback condition,
students in both conditions wrote four texts, the quality of which was scored by human raters. It was
found that the AWE condition had higher holistic and content scores on both the averaged score for
the four texts and for orthogonal comparisons of the scores for the first two texts with the last two
texts. However, many of the studies are not as well-designed, and do not include a pretest or other
information on the comparability of students in experimental and control groups. In particular, results
of studies that have compared writing outcomes of students who received AWE with those of students
in previous cohorts should be viewed with caution. For example, Grimes (2008) found that in three
out of four schools students who used My Access had higher external test scores than students from a
previous year who did not receive AWE feedback. However, the author acknowledges that it is difficult
to attribute this improvement to AWE as during the intervention period important improvements to
the quality of writing instruction provided by teachers were also instituted.
As shown by the research survey, only three studies have explicitly compared AWE feedback with
teacher feedback (i.e., Frost, 2008; Rock, 2007; Warden, 2000). As the evidence from these studies is
also mixed, it seems premature to draw any firm conclusions. However, it should be pointed out that
none of the studies shows that AWE feedback is less effective than teacher feedback, which as such
could be taken as a positive sign. Nonetheless, of concern is that these studies report little about the
nature of the teacher feedback given or whether this feedback was comparable to the AWE feedback.
For example, in Warden (2000), an AWE condition in which students received specific error feedback
is compared with a teacher feedback condition in which students received no specific feedback, but
only general comments on content, organization, and grammar. As students in the teacher feedback
condition received no specific feedback on the accuracy of their texts, it is hardly surprising that the
number of errors decreased more in the AWE condition.
In general, there appears to be more support for improvement of error rates than improvement of
holistic scores. For example, Kellogg, Whiteford, and Quinlan (2010) found that holistic scores did not
improve, but that errors were reduced. As the error types that were reduced largely related to linguistic
aspects of the text, they drew the conclusion that there was tentative support for learning about
mechanical aspects of writing from AWE. In contrast, Chen (1997) found that an AWE group and a no-
feedback control group decreased linguistic errors equally. However, the results of this study could
well be attributable to a methodological drawback, as both experimental and control groups were in
the same classes. In these classes, the teachers spent time reviewing the most common error types
found by the computer, in the presence of all the students. Hence, both groups of students may have
benefited from this instruction.
There appears to be no clear evidence as yet concerning whether AWE feedback is associated with
more generalized improvements in writing proficiency. Some of the studies that have examined trans-
fer of the effects of AWE to texts for which no AWE feedback has been provided found no significant
differences between scores for AWE and non-AWE conditions (i.e., Choi, 2010; Kellogg et al., 2010;
Shermis, Burstein, & Bliss, 2004). Moreover, although three studies did find evidence of transfer (Elliot
& Mikulas, 2004; Grimes, 2008; Wang & Wang, 2012), none of these studies is rigorously designed.
The Wang and Wang (2012) study had only one participant in each condition. The flaws in the Grimes
(2008) study have already been discussed. In Elliot and Mikulas (2004), in each of four sub-studies it
was claimed that AWE feedback was associated with better exam performance. However, there was
no random assignment to conditions and the reader is given no information concerning the char-
acteristics of the participants in the two conditions. In one of the sub-studies, students’ results are
compared with students from a year 2000 baseline. In addition, results for two of the four sub-studies
were not tested statistically, and those that were tested were tested non-parametrically. Also, some of
the claims seem to be rather remarkable, such as that a group who used MY Access! between February
and March of 2003 had a pass rate of 81% compared to only 46% for a group who did not receive AWE
feedback. It seems rather unlikely that such a short AWE intervention could lead to such a substantial
change in assessment outcomes, indicating that other factors may also be in operation.
However, it is important to be aware that one of the big unknowns of writing feedback received from
teachers is also whether it leads to any generalized improvements in students’ revising ability or in the
quality of their texts. Hyland and Hyland (2006) pointed out that research on human feedback rarely
looks beyond immediate correction in a subsequent draft, so AWE research is not alone
in neglecting this area. Closely connected to whether feedback can lead to generalized improvements
in writing is whether it assists students in developing their ability to revise independently. One of
the first steps in developing revising skills is that writers are able to notice aspects of their texts
that have not, up to that point, been salient (Schmidt, 1990; Truscott, 1998). Once a feature has been
noticed it becomes available for reflection and analysis. As Hyland and Hyland (2006) pointed out,
demonstrating that a student can utilize feedback to edit a draft tells us little about whether the
student has successfully acquired a feature. Similarly, it tells us little about whether the student has
developed the meta-cognitive skills to be able to notice, and then subsequently evaluate and correct
textual problems in other texts successfully.
Currently, we know little about whether AWE actually promotes independent revising. However,
there is some evidence that receiving AWE feedback may not actually encourage students to make
changes either between or within drafts. Attali (2004) reported that 71% of students did not redraft
their essays and 48% of those who did redraft did this only once. Grimes (2005) reported that a typical
revision pattern for students was to submit a first draft, correct a few mechanical errors and resubmit
as fast as possible to see if the score improved. Warden (2000) found that students who were offered
a redrafting opportunity after receiving AWE feedback from QBL actually spent significantly less time
revising their first drafts than students who received AWE feedback on a single draft with no redrafting
opportunity, or who received teacher feedback instead of AWE feedback. Students who received no
redrafting opportunity revised their texts before they received any feedback. They then submitted their
texts for marking, received a mark and AWE feedback, but were not given an opportunity to redraft
the text. In contrast, students who received AWE feedback and had an opportunity to redraft appeared
to carry out little independent editing, instead waiting for the program to tell them what was wrong
with their texts and then specifically correcting these errors. While these students were successful in
correcting errors detected by AWE, they made few other changes to their texts. Moreover, this trend
continued across successive assignments, suggesting that AWE feedback was not leading to much
development in revising skills. However, it is important to remember that these findings corroborate
findings from revision research that writers, particularly younger writers, revise little and revise
superficially (Faigley & Witte, 1981; Whalen & Ménard, 1995). It may be that some students simply
do not possess the revising skills needed to allow them to benefit from the revision opportunities
afforded by AWE.
5. Conclusions and recommendations
This critical review suggests that there is only modest evidence that AWE feedback has a positive
effect on the quality of the texts that students produce using AWE, and that as yet there is little
clarity about whether AWE is associated with more general improvements in writing proficiency.
Paucity of research, heterogeneity of existing research, the mixed nature of research findings, and
methodological issues in some of the existing research are factors that limit our ability to draw firm
conclusions concerning the effectiveness of AWE feedback.
Initially, we endeavored to meta-analyze effect sizes for the product studies in this sample. How-
ever, due to methodological issues, many of the studies had to be excluded, leaving us with a very
small but still highly heterogeneous sample. Heterogeneity necessitates the inclusion of moderator
analyses that examine the effects of variables such as AWE program, educational context and whether
AWE feedback was compared with no feedback or teacher feedback. However, with such a small sam-
ple, there was insufficient power to conduct moderator analyses. We felt that simply providing an
overall effect size that ignores possible effects of moderator variables was not a viable or meaningful option.
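To make concrete what such a meta-analysis would have pooled, the sketch below computes a standardized mean difference (Hedges' g) for a single hypothetical AWE-versus-control comparison; the means, standard deviations and group sizes are invented and are not drawn from any study in the sample.

```python
import math

def hedges_g(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Standardized mean difference with small-sample correction."""
    # Pooled standard deviation across the two groups.
    pooled_sd = math.sqrt(((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / (n_1 + n_2 - 2))
    d = (mean_1 - mean_2) / pooled_sd
    # Correction factor that distinguishes Hedges' g from Cohen's d.
    return d * (1 - 3 / (4 * (n_1 + n_2) - 9))

# Invented example: mean holistic scores for an AWE group and a control group.
g = hedges_g(mean_1=4.1, sd_1=0.8, n_1=30, mean_2=3.7, sd_2=0.9, n_2=28)
print(f"Hedges' g = {g:.2f}")
```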
Instead, by carrying out a critical review we have been able to identify patterns in the existing
research as well as discuss gaps in the findings and issues in the methodologies. Below are rec-
ommendations that follow from this review that can serve as a guideline for further research in this
area.
Although this review has not allowed us to differentiate the effectiveness of specific AWE programs,
given differences in the objectives of the programs and the nature of the feedback provided, it is likely
that such differences do exist. So far, more research on the effects of AWE has been carried out for
Criterion than for other programs. Therefore, more studies examining other programs are called for,
and in particular studies comparing the effectiveness of more than one AWE program.
A number of the studies provided only sketchy descriptions of their participants in terms of factors
such as SES, language background, literacy levels, and computer literacy. Future research needs to
be more rigorous in reporting participant characteristics, in controlling for participant variables and,
where appropriate, including these as variables in the research design. In particular, further research
is needed that examines the effectiveness of AWE feedback in ESL and EFL settings, and compares
these to L1 settings. Given the tremendous diversity of student populations within the United States,
not to mention the diversity in potential markets for AWE programs in both English-speaking and
EFL contexts outside the United States, it is of particular importance that the effectiveness of AWE
feedback for second language learners be investigated. The commercial programs in use in the United
States were not originally designed for English as a second language populations, even though they
are being marketed with such populations in mind (Warschauer & Ware, 2006).
In addition, further research examining the relative effects of AWE feedback and teacher feedback
is needed, in which greater explanation of the nature and quality of feedback provided by teachers
is given and in which it is ensured that the kinds of feedback offered by teachers and AWE programs
are more comparable. As there are so many factors in play, it is likely to turn out to be too simplis-
tic to make overall pronouncements about whether human feedback or computer feedback is better.
What needs to be disentangled is whether it really is the source of the feedback that matters,
or whether it is other factors such as the way it is delivered, and the nature of the feedback pro-
vided that make the difference. It is also important to be aware that, as it is frequently reiterated by
developers and researchers alike that AWE feedback is intended to augment teacher feedback rather
than replace it (e.g., Chen & Cheng, 2008; Kellogg et al., 2010; Philips, 2007), research into the rel-
ative effects of different ways of integrating AWE feedback into classroom writing instruction may
have greater ecological validity. In a qualitative study involving the use of AWE feedback in three
classrooms, Chen and Cheng (2008) found indications that AWE feedback may indeed be more
effective when it is combined with human feedback. However, this study did not examine the effects
of different methods of integration on written production. There are a variety of possible ways of
combining AWE with teacher feedback, and of scaffolding AWE feedback. To name just a few, students
can use AWE to help them improve the quality of initial drafts and then submit them to the teacher
for feedback; teachers can use AWE as a diagnostic tool for identifying the problems that students
have with their writing; and/or teachers can provide initial training. Research that investigates differ-
ent possibilities for integrating AWE into classroom writing instruction would also be of pedagogical
value.
Some might argue that in terms of the effectiveness of AWE feedback, the bottom line is whether
the scores it generates correlate with external assessment outcomes and whether its repeated use in
the classroom improves students’ test results. However, while it is highly desirable that the transfer of
the effects of AWE feedback to non-AWE texts be established, it is questionable whether external exams
provide the most appropriate means of doing so. Firstly, as Warschauer and Ware (2006) remark,
exam writing is generally based on a single draft in timed circumstances, whereas the whole point of
AWE is that it encourages multiple drafting. Secondly, the scoring on exams may be too far removed
from the aspects for which AWE provides feedback. Thirdly, AWE feedback may not be robust enough
as an instructional intervention to impact noticeably on exam scores. Instead, we would recommend
examining transfer of the effects of AWE feedback in non-test situations using texts that are similar
in terms of genres and topics to the AWE texts students have been writing.
The question remains, of course, whether the kinds of writing that AWE feedback gives writers the
opportunity to engage in actually reflect the kinds of writing that students do in their classrooms.
AWE programs generally offer only a limited number of genres, such as persuasive, narrative and
informative genres, though some programs such as My Access! additionally enable teachers to use
their own prompts (See Grimes & Warschauer, 2010). Moreover, as mentioned, AWE has been accused
of promoting formulaic writing with an unimaginative five-paragraph structure. The way lies open for
AWE research to include a greater consideration of genre by controlling for genre as a variable, and by
systematically examining the influence of genre on the effectiveness of AWE feedback, for example,
by comparing the effects of AWE when standard prompts are used with the effects when teachers’
own prompts are used.
In conclusion, this study has carried out a critical review of research that examines the effects
of formative AWE feedback on the quality of texts that students produce. It has illuminated what is
known and what is not known about the effects of AWE feedback on writing. It could be argued that
a limitation of the study is that it takes a narrow view of effectiveness in terms of a single dimension:
written production measures. It does not focus on either of the other two dimensions of effectiveness
identified by Lai (2010): the effects on writing processes or perceived usefulness. However, we feel that
Lai’s first dimension is an appropriate and valuable focal point for a critical review, because improving
students’ writing is central to the objectives of AWE and to claims regarding its effectiveness, both
of which are reflected in the fact that, as this study has shown, the bulk of research conducted so
far focuses on written production. We certainly do applaud AWE research that takes a triangulated
approach to AWE by incorporating the effects of AWE on written production (product perspective), on
revision processes and learning and teaching processes (process perspective) and on writers’ and
teachers’ perceptions (perception perspective) (e.g., Choi, 2010; Grimes, 2008). We would also join
in the plea made by Liu et al. (2002) concerning research on computer-based technology: “rather
than focusing on the benefits and potentials of computer technology, research needs to move toward
explaining how computers can be used to support (second) language learning – i.e., what kind of tasks
or activities should be used and in what kinds of settings” (pp. 26–27). Consequently, as the next step,
in a follow-up study we will examine the use of AWE feedback in the classroom, including teaching
and learning processes and teacher and learner perceptions.
Appendix A. Research survey sample
References marked with an asterisk indicate studies that examine solely or partially the effects of AWE on writing outcomes, and which therefore have been included in the critical review.
*Attali, Y. (2004). Exploring feedback and revision features of Criterion. Paper presented at the National Council on Measurement in
Education San Diego, April 12–16, 2004.
*Chen, J. F. (1997). Computer generated error feedback and writing process: A link [Electronic Version]. TESL-EJ, 2. Retrieved from
http://tesl-ej.org/ej07/a1.html
Chen, C. E., & Cheng, W. (2008). Beyond the design of automated writing evaluation: Pedagogical practices and perceived learning
effectiveness in EFL writing classes. Language Learning and Technology, 12(2), 94–112.
*Chodorow, M., Gamon, M., & Tetreault, J. (2010). The utility of article and preposition error correction systems for English
language learners: Feedback and assessment. Language Testing, 27(3), 419–436.
*Choi, J. (2010). The impact of automated essay scoring (AES) for improving English language learners essay writing. (Doctoral
dissertation. University of Virginia, 2010).
*El Ebyary, K., & Windeatt, S. (2010). The impact of computer-based feedback on students’ written work. International Journal
of English Studies, 10(2), 121–142.
*Elliot, S., & Mikulas, C. (2004). The impact of MY Access! use on student writing performance: A technology overview and four
studies. Paper presented at the Annual Meeting of the American Educational Research Association.
*Foltz, P. W., Laham, D., & Landauer, T. K. (1999). The intelligent essay assessor: Applications to educational technology. Interactive
Multimedia Educational Journal of Computer-Enhanced Learning, 1(2). Retrieved from www.knowledge-technologies.com
*Franzke, M., Kintsch, E., Caccamise, D., & Johnson, N. (2005). Summary Street: Computer support for comprehension and writing.
Journal of Educational Computing Research, 33(1), 53–80.
*Frost, K. L. (2008). The effects of automated essay scoring as a high school classroom Intervention, PhD thesis. Las Vegas: University
of Nevada.
*Grimes, D. C. (2008). Middle school use of automated writing evaluation: A multi-site case study, PhD thesis. Irvine: University of
California.
*Grimes, D., & Warschauer, M. (2010). Utility in a fallible tool: A multi-site case study of automated writing evaluation. Journal
of Technology, Learning, and Assessment, 8(6), 1–43.
*Kellogg, R., Whiteford, A., & Quinlan, T. (2010). Does automated feedback help students learn to write? Journal of Educational
Computing Research, 42, 173–196.
Lai, Y.-H. (2010). Which do students prefer to evaluate their essays: Peers or computer program. British Journal of Educational
Technology, 41(3), 432–454.
*Riedel, E., Dexter, S. L., Scharber, C., & Doering, A. (2006). Experimental evidence on the effectiveness of automated essay scoring
in teacher education cases. Journal of Educational Computing Research, 35(3), 267–287.
*Rock, J. (2007). The impact of short-term use of Criterion on writing skills in 9th grade (Research Report RR-07-07). Princeton, NJ:
Educational Testing Service.
Scharber, C., Dexter, S., & Riedel, E. (2008). Students’ experiences with an automated essay scorer. The Journal of Technology,
Learning and Assessment, 7(1), 1–44.
*Shermis, M. D., Burstein, J., & Bliss, L. (2004). The impact of automated essay scoring on high stakes writing assessments. Paper
Presented at the Annual Meeting of the National Council on Measurement in Education.
*Shermis, M., Garvan, C. W., & Diao, Y. (2008). The impact of automated essay scoring on writing outcomes. Paper presented at the
Annual Meetings of the National Council on Measurement in Education, March 25–27, 2008.
*Schroeder, J. A., Grohe, B., & Pogue, R. (2008). The impact of criterion writing evaluation technology on criminal justice student
writing skills. Journal of Criminal Justice Education, 19(3), 432–445.
*Wang, F., & Wang, S. (2012). A comparative study on the influence of automated evaluation system and teacher grading on
students’ English writing. Procedia Engineering, 29, 993–997.
*Warden, C. A. (2000). EFL business writing behavior in differing feedback environments. Language Learning, 50(4), 573–616.
Warden, C. A., & Chen, J. F. (1995). Improving feedback while decreasing teacher burden in ROC ESL business English classes.
In P. Porythiaux, T. Boswood, & B. Babcock (Eds.), Explorations in English for professional communications. Hong Kong: City
University of Hong Kong.
Other references
Anson, C. M. (2006). Can’t touch this: Reflections on the servitude of computers as readers. In P. Freitag Ericsson, & R. Haswell
(Eds.), Machine scoring of student essays (pp. 38–56). Logan, Utah: Utah State University
Press.
Biber, D., Nekrasova, T., & Horn, B. (2011). The effectiveness of feedback for L1-english and L2-writing development: A meta-analysis.
(ETS Research Report RR-11-05). Princeton, NJ: ETS.
Burstein, J., Chodorow, M., & Leacock, C. (2004). Automated essay evaluation: The Criterion online writing service. AI Magazine
(Fall), 27–36.
Chandrasegaran, A., Ellis, M., & Poedjosoedarmo, G. (2005). Essay assist: Developing software for writing skills improvement in
partnership with students. RELC Journal, 36(2), 137–155.
Dikli, S. (2006). An overview of automated scoring of essays. The Journal of Technology, Learning and Assessment, 5(1), 1–35.
Faigley, L., & Witte, S. (1981). Analyzing revision. College Composition and Communication, 32, 400–414.
Freitag Ericsson, P. (2006). The meaning of meaning. In P. Freitag Ericsson, & R. Haswell (Eds.), Machine scoring of student essays.
Logan Utah: Utah State University Press.
Grimes, D. (2005). Assessing automated assessment: Essay evaluation software in the classroom. Paper presented at the Computers
and Writing Conference Stanford, CA.
Herrington, A., & Moran, C. (2001). What happens when machines read our students’ writing? College English, 63(4), 480–499.
Hyland, K., & Hyland, F. (2006). Feedback on second language students’ writing. Language Teaching, 39, 83–101.
Patterson, N. (2005). Computerized writing assessment: Technology gone wrong. Voices From the Middle, 13(2), 56–57.
Philips, S. M. (2007). Automated essay scoring: A literature review (SAEE research series #30). Kelowna, BC: Society for the
Advancement of Excellence in Education.
Schmidt, R. W. (1990). The role of consciousness in second language learning. Applied Linguistics, 11(2), 129–158.
Shermis, M. D., & Burstein, J. (Eds.). (2003). Automated essay scoring: A cross-disciplinary perspective. Hillsdale, NJ: Lawrence
Erlbaum Associates.
Shermis, M. D., & Burstein, J. (Eds.). (2013). Handbook of automated essay evaluation: Current applications and new directions. New
York and London: Routledge.
Taylor, A. R. (2005). A future in the process of arrival: Using computer technologies for the assessment of learning. TASA Institute,
Society for the Advancement of Excellence in Education.
Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning, 46(2), 327–369.
Truscott, J. (1998). Noticing in second language acquisition: A critical review. Second Language Research, 14(2), 103–135.
Warschauer, M., & Ware, J. (2006). Automated writing evaluation: Defining the classroom research agenda. Language Teaching
Research, 10(2), 1–24.
Whalen, K., & Ménard, N. (1995). L1 and L2 writers’ strategic and linguistic knowledge: A model of multiple-level discourse
processing. Language Learning, 44(3), 381–418.
Yang, Y., Buckendahl, C. W., Juszkiewicz, P. J., & Bhola, D. S. (2002). A review of strategies for validating computer-automated scoring. Applied Measurement in Education, 15(4), 391–412.
Zellermayer, M., Salomon, G., Globerson, T., & Givon, H. (1991). Enhancing writing-related metacognitions through a computerized writing partner. American Educational Research Journal, 28(2), 373–391.
Further reading
*Britt, A., Wiemer-Hastings, P., Larson, A., & Perfetti, C. (2004). Using intelligent feedback to improve sourcing and integration
in students’ essays. International Journal of Artificial Intelligence in Education, 14, 359–374.
Dikli, S. (2007). Automated essay scoring in an ESL setting. (Doctoral dissertation, Florida State University, 2007).
*Hoon, T. (2006). Online automated essay assessment: Potentials for writing development. Retrieved from http://ausweb.scu.edu.au/aw06/papers/refereed/tan3/paper.html
*Lee, C., Wong, K. C. K., Cheung, W. K., & Lee, F. S. L. (2009). Web-based essay critiquing system and EFL students’ writing: A
quantitative and qualitative investigation. Computer Assisted Language Learning, 22(1), 57–72.
*Matsumoto, K., & Akahori, K. (2008). Evaluation of the use of automated writing assessment software. In C. Bonk, et al. (Eds.), Proceedings of World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education 2008 (pp. 1827–1832). Chesapeake, VA: AACE.
*Schreiner, M. E. (2002). The role of automatic feedback in the summarization of narrative text (PhD thesis). University of Colorado.
*Steinhart, D. J. (2001). An intelligent tutoring system for improving student writing through the use of latent semantic analysis.
Boulder: University of Colorado.
Wade-Stein, D., & Kintsch, E. (2004). Summary Street: Interactive computer support for writing. Cognition and Instruction, 22(3),
333–362.
Warschauer, M., & Grimes, D. (2008). Automated writing assessment in the classroom. Pedagogies: An International Journal, 3,
22–36.
Yao, Y. C., & Warden, C. A. (1996). Process writing and computer correction: Happy wedding or shotgun marriage? [Electronic version]. CALL Electronic Journal. Available at http://www.lerc.ritsumei.ac.jp/callej/1-1/Warden1.html
  • 2. 52 M. Stevenson, A. Phakiti / Assessing Writing 19 (2014) 51–65 known as Automated Writing Evaluation (AWE).1 AWE software provides computer-generated feed- back on the quality of written texts. A central component of AWE software is a scoring engine that generates automated scores based on techniques such as artificial intelligence, natural language processing and latent semantic analysis (See Dikli, 2006; Philips, 2007; Shermis & Burstein, 2003; Yang, Buckendahl, Juszkiewicz, & Bhola, 2002). AWE software that is used for pedagogical purposes also provides written feedback in the form of general comments, specific comments and/or corrections. Originally, AWE was primarily used in high-stakes testing situations to generate summative scores to be used for assessment purposes. Widely used, commercially available scoring engines are Project Essay GraderTM (PEG), e-rater®, Intelligent Essay AssessorTM (IEA), and IntelliMetricTM. In recent years, the use of AWE for the provision of formative feedback in the writing classroom has steadily increased, particularly in classrooms in the United States. AWE programs are currently being used in many elementary, high school, college and university classrooms with a range of writers from diverse backgrounds. Examples of commercially available AWE programs designed for classroom use are: Criterion (Educational Testing Service: MY Access! (Vantage Learning): Write to Learn and Sum- mary Street (Pearson Knowledge Technologies); and Writing Roadmap (McGraw Hill). These programs sometimes incorporate the same scoring engine as used in summative programs. For example, Crite- rion incorporates the e-rater scoring engine and MY Access! incorporates the IntellimetricTM scoring engine. Common to all AWE programs designed for classroom use is that they provide writers with multiple drafting opportunities, and upon receiving feedback writers can choose whether or not to use this feedback to revise their texts. AWE programs vary in the kinds of feedback they provide writers. Some provide feedback on both global writing skills and language use (e.g., Criterion, MY Access!), whereas others focus on language use (e.g., QBL) and some claim to focus primarily on content knowledge (e.g., Write to Learn and Summary Street). Some programs incorporate other tools such as model essays, scoring rubrics, graphic organizers, and dictionaries and thesauri. Like many other forms of educational technology, the use of AWE in the classroom has been the subject of controversy, with scholars taking divergent stances. On the one hand, AWE has been hailed as a means of liberating instructors, freeing them up to devote valuable time to aspects of writing instruction other than marking assignments (e.g., Burstein, Chodorow, & Leacock, 2004; Herrington & Moran, 2001; Hyland & Hyland, 2006; Philips, 2007). It has been seen as impacting positively on the quality of students’ writing, due to the immediacy of its ‘on-line’ feedback (Dikli, 2006), and the multiple practice and revision opportunities it provides (Warschauer & Ware, 2006). It has also been claimed to have positive effects on student autonomy (Chen & Cheng, 2008). On the other hand, the notion that computers are capable of providing effective writing feedback has aroused considerable suspicion, perhaps fueled by the fearful specter of a world in which humans are replaced by machines. Criticisms have been made concerning the capacity of AWE to provide accurate and meaningful scores (e.g., Anson, 2006; Freitag Ericsson, 2006). 
There is a common perception that computers are not capable of scoring human texts, as they do not possess human inferencing skills and background knowledge (Anson, 2006). Other criticisms relate to the effects that AWE has on students’ writing. AWE has been accused of reflecting and promoting a primarily formalist approach to writing, in which writing is viewed as simply being “mastery of a set of subskills” (Hyland & Hyland, 2006, p. 95). Comments generated by AWE have been said to place too much emphasis on surface features of writing, such as grammatical correctness (Hyland & Hyland, 2006) and the effects of writing for a non-human audience have been decried. There is also fear that using AWE feedback may be more of an exercise in developing test-taking strategies than in developing writing skills, with students writing to the test by consciously or unconsciously adjusting their writing to meet the criteria of the software (Patterson, 2005). Positive and negative claims regarding the effects of AWE on students’ writing are not always based on empirical evidence, and at times appear to reflect authors’ own ‘techno-positivistic’ or ‘technopho- bic’ stances toward technology in the writing classroom. Moreover, quite a lot of the research that has 1 Other terms found in the literature are automated essay evaluation (AEE) (See Shermis & Burstein, 2013) and writing evaluation technology.
  • 3. M. Stevenson, A. Phakiti / Assessing Writing 19 (2014) 51–65 53 been carried out is or from authors who have been involved in developing a particular AWE program or who are affiliated with organizations that have developed these programs, so could contain a bias toward showing AWE in a positive light. Consequently, there is lack of clarity concerning the current state of evidence for the effects on the quality of students’ writing of AWE programs designed for teaching and learning purposes. However, it is important to be aware that over the past decades there has also been controversy about the effects of teacher feedback on writing. Perhaps the strongest opponent of classroom writing feedback was Truscott (1996), who claimed that feedback on grammar should be abandoned, as it ignored deeper learning processes, only led to pseudo-learning and had a negative effect on the quality of students’ writing. While most scholars have taken less extreme positions, in a review of issues relating to feedback in the classroom, Hyland and Hyland (2006) concluded that there was surprisingly little consensus about the kinds of feedback that are effective and in particular about the long term effects of feedback on writing development. However, some research synthetic evidence exists for the effectiveness of teacher feedback. In a recent met-analytic study, Biber, Nekrasova, and Horn (2011) found that, when compared to no feedback, teacher feedback was associated with gains in writing development for both first and second language writers. They found that a focus on content and language use was more effective than focus on a focus on form only, especially for second language writers. They also found that comments were more effective than error correction, even for improving grammatical accuracy. It is therefore timely to evaluate whether there is evidence that computer- generated feedback is also associated with improvements in writing. To date, the thrust of AWE research has been on validation through the examination of the psy- chometric properties of AWE scores by, for example, calculating the degree of correlation between computer-generated scores and scores given by human raters. Studies have frequently found high correlations between AWE scores and human scores. and these results have been taken as providing evidence that AWE scores provide a psychometrically valid measure of students’ writing. (See two volumes edited by Shermis and Burstein (2003, 2013) for detailed results and in-depth discussion of the reliability and validity of specific AWE systems). Such studies, however, do not inform us about whether AWE is effective as a classroom tool to actually improve students’ writing. As Warschauer and Ware (2006) pointed out, while evidence of psychometric reliability and validity is a necessary pre-requisite, it is not sufficient for understanding whether AWE ‘works’ in the sense of contributing to positive outcomes for student learning. Even the recently published ‘Handbook of Automated Essay Evaluation’ (Shermis & Burstein, 2013), although it pays some attention to AWE as a teaching and learning tool, still has a strong psychometric and assessment focus. Although a number of individual studies have examined the effects of AWE feedback in the classroom, no comprehensive review of the literature exists that examines whether AWE feedback improves the quality of students’ writing. 
Warschauer and Ware (2006) provided a thought-provoking discussion of some existing research on AWE in the classroom and used this to make recommenda- tions for future AWE research. However, they only provided a limited review that did not include all of the then available research and did not provide an overview of the evidence for the effects of AWE on students’ writing. Moreover, since their paper was written a number of studies have been published in this area. 2. The current study The current study provides an evaluation of the available evidence for the effects of AWE feedback in the writing classroom in terms of written production. The study focuses on research involving AWE systems specifically designed as tools for providing formative evaluation in the writing classroom, rather than AWE systems designed to provide summative assessment in testing situations. The purpose of formative evaluation is to provide writers with individual feedback that can form the basis for further learning (Philips, 2007). In formative evaluation, there is a need to inform students not only about their level of achievement, but also about their specific strengths and weaknesses. Formative evaluation can be said to involve assessment for learning, rather than assessment of learning (Taylor, 2005). In this study, feedback is viewed as encompassing both numeric feedback (i.e., scores and ratings) and written
  • 4. 54 M. Stevenson, A. Phakiti / Assessing Writing 19 (2014) 51–65 feedback (i.e., global or specific comments on the quality of the text and/or identification of specific problems in the actual text). The study focuses on the effects of AWE on written production, because the capability to improve the quality of students’ texts is central to claims made about the effectiveness of AWE feedback, and because, likely as a consequence of this, the bulk of AWE pedagogical research focuses on written pro- duction outcomes. The study includes AWE research on students from diverse backgrounds, in diverse teaching contexts, and receiving diverse kinds of feedback from diverse AWE programs. The scope of the research included is broad due to the relatively small number of existing studies and the hetero- geneity of these studies. The study does not aim to make comparisons or draw conclusions about the relative effects of AWE feedback on student writing for specific populations, contexts, feedback types or programs. Instead, it aims to critically evaluate the effects of AWE feedback on written production by identifying general patterns and trends, and identifying issues and factors that may impact on these effects. The study is divided into two stages: a research survey and a critical review. The objective of the research survey is to determine the maturity of the research domain, and to provide a characterization of the existing research that can be drawn on in the critical review. The objective of the critical review, which is the central stage, is to identify overall patterns in the research findings and to evaluate and interpret these findings, taking account of relevant issues and factors. 3. Method 3.1. The literature search A comprehensive and systematic literature search was conducted to identify relevant primary sources for inclusion in the research survey and critical review. Both published research (i.e., journal articles, book chapters and reports) and unpublished research (i.e., theses and conference papers) were identified. The following means of identifying research were used: a) Search engines: Google Scholar, Google. b) Databases: ERIC, MLA, PsychInfo, SSCI, MLA, Ovid, PubPsych, Linguistics and Language Behavior Abstracts (LLBA), Dissertation Abstracts International, Academic Search Elite, Expanded Academic, ProQuest Dissertation and Theses Full-text, and Australian Education Index. c) Search terms used: automated writing evaluation, automated writing feedback, computer- generated feedback, computer feedback, and automated essay scoring automated evaluation, electronic feedback, and program names (e.g., Criterion, Summary Street, Intelligent Essay Assessor, Write to Learn, MY Access!). d) Websites: ETS website (ets.org) (ETS Research Reports, TOEFL iBT Insight series, TOEFL iBT research series, TOEFL Research Reports); AWE software websites. 
e) Journals from 1990 to 2011: CAELL Journal; CALICO Journal; College English; English Journal; Com- puter Assisted Language Learning; Computers and Composition; Educational Technology Research and Development; English for Specific Purposes; IEEE Intelligent Systems; Journal of Basic Writ- ing; Journal of Computer-Based Instruction; Journal of Educational Computing Research; Journal of Research on Technology in Education; Journal of Second Language Writing; Journal of Technol- ogy; Journal of Technology, Learning and Assessment, Language Learning and Technology; Language Learning; Language Teaching Research; Learning, and Assessment; ReCALL; System; TESL-EJ. f) Reference lists of already identified publications. In particular, the Ericson and Haswell (2006) bibliography. To be included, a primary source had to focus on empirical research on the use AWE feedback generated by one or more commercially or non-commercially available programs for the formative evaluation of texts in the writing classroom. The program reported on needed to provide text- specific feedback. Studies were excluded that reported on programs that provided generic writing guidelines (e.g., The Writing Partner: Zellermayer, Salomon, Globerson, & Givon, 1991; Essay Assist:
  • 5. M. Stevenson, A. Phakiti / Assessing Writing 19 (2014) 51–65 55 Chandrasegaran, Ellis, & Poedjosoedarmo, 2005). Studies that reported results already reported else- where were also excluded. Where the same results were reported more than once, published studies were chosen above unpublished ones, or if both were published, the first publication was chosen. This led to the exclusion of Grimes (2005) and Kintsch et al. (2000). Based on the above criteria, 33 primary sources were identified for inclusion in the research survey (See Appendix A). 3.2. Coding of research survey A coding scheme of study descriptors was developed for the research survey. The unit of coding was the study. A study was defined as consisting of “a set of data collected under a single research plan from a designated sample of respondents” (Lipsey & Wilson, 2001, p. 76). As one of the publications, Elliot and Mikulas (2004), included four studies with different samples, this led to a total of 36 studies being identified. In order to obtain an overview of the scope of the research domain, the studies were first classified in terms of constructs of effectiveness: Product, Process and Perceptions. Lai (2010) defined effec- tiveness of AWE feedback in terms of three dimensions: (1) the effects on written production (e.g., quality scores, error frequencies and rates, lexical measures and text length); (2) the effects on writing processes (e.g., rates and types of revisions, editing time, time on task, and rates of text production); and (3) perceived usefulness. In our study, combinations of these constructs were possible, as some studies included more than one construct. Subsequently, as the focus of the study is writing outcomes, only studies that included Product measurements were coded in terms of Substantive descriptors and Methodological descriptors (See Lipsey & Wilson, 2001). Substantive descriptors relate to substantive aspects of the study, such as the characteristics of the intervention and the research context. Methodological descriptors relate to the methods and procedures used in the study. Table 1 lists the coding categories for both kinds of descrip- tors and the coding options within each category. In the methodological descriptors, ‘Control group’ refers to whether the study included a control condition and whether this involved comparing AWE feedback with a no feedback condition or with a teacher feedback condition. ‘Text’ refers to whether outcomes were measured using texts for which AWE feedback had been received or other texts, such as writing assessment tasks. ‘Outcome measure’ refers to the measure(s) of written production that were included in the study. The coding categories and options were developed inductively by reading through the sample studies. Developing the coding scheme was a cyclical process, and each study was coded a number of times, until the coding scheme was sufficiently refined. These coding cycles were carried out by the first researcher. The reliability of the coding was checked through the coding of 12 studies (one Table 1 Research survey coding scheme. 
Categories Descriptors Substantive descriptors Publication type ISI-listed journal; non-ISI listed journal; book chapter; thesis; report; unpublished paper AWE program Open coding Country Open coding Educational context Elementary; high school; elementary/high school; university & college Language background L1; L1 & ESL; EFL/ESL only; unspecified Methodological descriptors Design Between group; within-groups; between & within group; single group Reporting Statistical testing; descriptive statistics; no statistics Control group No feedback; teacher feedback; no feedback & teacher feedback; different AWE conditions; no control group Text AWE texts; other texts; AWE texts & other texts Outcome measure Scores; scores & other product measures; errors; citations
  • 6. 56 M. Stevenson, A. Phakiti / Assessing Writing 19 (2014) 51–65 Table 2 Research survey: constructs. Construct Frequency Product 17 Product & process 4 Product & perceptions 5 Product, process, & perceptions 4 Perceptions 5 Perceptions & process 1 Total 36 third of the data) by the second researcher. Rater reliability was calculated using Cohen’s kappa. For the substantive descriptors the kappa values were all 1.00, except for language background, which was .75. For the methodological descriptors the kappa values were .85 for Design, 1.00 for Reporting, .85 for Control group, 1.00 for Text and .85 for Outcome measure. Any disagreements were resolved through discussion. For the research survey, the frequencies of the coding categories were collated and this information was used to describe the characteristics of the studies in the research sample. For the critical literature review, the findings of the sample studies were critically discussed in relation to the characteristics of the studies identified in the research survey and also in relation to strengths or weaknesses of particular studies. 3.3. Research survey Table 2 shows that the primary focus of AWE research has so far been on the effects of AWE on written production. Thirty of the thirty six studies include Product measures: 17 focus solely on Prod- uct, and another 13 studies involve Product in combination with one or more of the other constructs. The secondary focus has been on Perceptions, with five studies focusing solely on Perceptions, and another 10 including Perceptions. No studies have focused solely on Process. In the remaining survey, the thirty studies involving product measurements are characterized. Table 3 shows that, in terms of types of publication, relatively few of the studies have appeared in ISI-listed journals or in books. A number of the studies are from non-ISI-listed journals, and a number are unpublished papers from conferences or websites. Table 3 also shows that 10 AWE programs are involved in the sample and that the majority of these have been developed by organizations that are major players in the field of educational technology: Criterion from ETS, MY Access! from Vantage Learning, IEA and Summary Street from Pearson Knowledge Analysis Technologies. Criterion is the program that has been examined most frequently. Criterion, MY Access! and Writing Roadmap provide scores and feedback on both content and language. However, one of the studies that examined Criterion (i.e., Chodorow, Gamon, & Tetreault, 2010) limited itself to examining feedback on article errors. Summary Street, IEA, LSA and ECS are all Table 3 Publication, program and feedback. Publication K Program K Feedback K ISI-listed 7 Criterion 11 Content & language 20 Non-ISI-listed 7 My access 5 Content 5 Book chapter 1 Writing roadmap 1 Language 4 Thesis 5 ETIPS 2 Citations 1 Report 1 IEA 1 Unpublished paper 9 LSA semantic space 1 Summary street 3 ECS 1 SAIF 1 QBL 4
  • 7. M. Stevenson, A. Phakiti / Assessing Writing 19 (2014) 51–65 57 based on a technique known as latent sematic analysis that purports to focus primarily on content feedback. ETIPS provides feedback for pre-service teachers on tasks carried out in an on-line case-based learning environment. SAIF provides feedback on the citations in a text. QBL provides comments on language errors only. The table shows that most of the studies have involved programs that provide both content and language feedback. Table 4 shows that the majority of studies were carried out in classrooms in the United States, with the remaining studies being carried out in Asian countries, with the exception of a single study carried out in Egypt. University and college contexts were the most common, followed by high school contexts, and then elementary contexts. Almost half the studies do not specify the language background of the participants. Among the studies that did report the language backgrounds of the participants, only two of the studies (i.e., Chodorow et al., 2010; Choi, 2010) investigated the effects of language background on the effects of AWE feedback as a variable. Chodorow et al. (2010) compared the effects of Criterion feedback on the article errors of native and non-native speakers, and Choi (2010) compared the effects of Criterion feedback on written production measures of EFL students in Korea and ESL students in the U.S. 3.4. Methodological features Table 5 shows that most of the studies involved statistical testing, and that between group designs, in which one or more AWE conditions were compared with one or more control conditions, are the most common design. There were also a number of within group comparisons in which the same group of students was compared across drafts and/or texts. One study (i.e., Scharber, Dexter, & Riedel, 2008) used a single group design in which students’ ETIPS scores were correlated with the number of drafts they submitted. Table 5 also shows that the most common control group for the between group comparisons involved a condition in which students received no feedback. In some cases, students in this con- dition wrote the same texts as students in the experimental condition(s) but received no feedback on them, and in other cases students in the control condition did not produce any experimental texts. However, it is unclear in most of the studies whether students in the control condition did receive some teacher feedback during their normal classroom instruction. Only three studies have explicitly compared AWE feedback to teacher feedback. In addition, the table shows that many of the studies have examined the effects of AWE feedback on AWE texts. However, 11 of the studies focus partly or exclusively on the transfer effects of AWE to the quality of texts that were not written using AWE. Lastly, Table 5 shows that scores followed by errors are the most common writing production measures that have been examined in the studies. Other measures that have been examined include text length, sentence length, lexical measures and number of citations. 4. Critical review The research survey has shown that the AWE pedagogical research domain is not a very mature one. Even though written production has been the main focus of research to date, the total number of studies carried out remains relatively small, and a number of these studies are either unpublished papers or published in unranked journals, and perhaps as a consequence are lacking in rigor. 
Moreover, these studies are highly heterogeneous, varying in terms of factors such as the AWE program that is examined, the design of the study, and the educational context in which the studies were carried out. Hence, not surprisingly, the research has produced mixed and sometimes contradictory results. As a result, there is only modest evidence that AWE feedback has a positive effect on the quality of students’ writing and, as the research survey showed, much of the available evidence relates to the effectiveness of AWE in improving the quality of texts written using AWE feedback. The evidence for the effects of AWE on writing quality from within group comparisons can be said to be stronger than the evidence from between-group comparisons. In general, within-group studies have shown that AWE scores increase and the number of errors decrease across AWE drafts and texts produced by the same writers (e.g., Attali, 2004; Choi, 2010; El Ebyary & Windeatt, 2010; Foltz,
Table 4
Country, context, language background and sample size.

Country       k    Educational context        k    Language background   k    Sample size   k
USA           21   University & College       17   L1                    1    <10           1
Taiwan        4    High school                8    Mixed                 6    11–50         4
USA & Korea   1    Elementary                 3    EFL                   8    51–100        9
Japan         1    Elementary & High school   2    EFL & ESL             1    101–200       5
China         1                                    Unspecified           14   >200          10
Hong Kong     1                                                               Unspecified   1
Egypt         1
Table 5
Methodological features.

Design             k    Reporting                k    Control                          k    Text                      k    Outcome                   k
Between groups     20   Statistical testing      23   No feedback                      17   AWE text                  19   Scores                    13
Within groups      7    Descriptive statistics   3    Teacher feedback                 3    Other text                9    Scores + other measures   11
Between & within   2    No statistics            4    No feedback & teacher feedback   1    Both AWE and other text   2    Errors                    5
Single group       1                                  Different AWE conditions         1                                   Citations                 1
                                                      No control                       8
This would appear to indicate that writers are able to incorporate AWE feedback to improve the quality and accuracy of AWE texts, at least according to the criteria that AWE programs use to evaluate texts. However, due to methodological issues, some of the results of within-group studies need to be interpreted carefully. To give an example, Attali (2004) excluded 71% of his data set from analysis because the writers did not undertake any revising or redrafting. While the remaining students did on average increase their scores across drafts of the same texts, the lack of utilization of AWE by over two thirds of the cohort at the very least places a question mark against the efficacy of AWE for stimulating students to revise their texts. Moreover, an obvious limitation of within-group comparisons is that the lack of a control group makes it difficult to conclude with certainty that improvements are actually attributable to the use of AWE software. Improvements made by students to successive drafts of a particular text could be attributable to their own revising skills rather than to their use of revisions suggested by AWE feedback. Improvements made to successive texts could be attributable to other instructional factors, or possibly even to developmental factors.

The findings from between-group comparisons, which compare one or more AWE conditions with one or more control conditions, are more mixed, and those findings that provide positive evidence frequently suffer from serious methodological drawbacks. More than half the studies using between-group comparisons showed either mixed effects or no effects for AWE feedback on writing outcomes. Mixed effects involve effects being found for some texts but not for others (e.g., Riedel, Dexter, Scharber, & Doering, 2006), for some measures but not for others (e.g., Rock, 2007), or for some groups of writers but not for others (e.g., Schroeder, Grohe, & Pogue, 2008). In a number of cases, the discussions in these studies largely ignore negative evidence and hence draw conclusions about the effectiveness of AWE that are more optimistic than appears warranted. For example, in a study by Schroeder et al. (2008) on the effectiveness of Criterion in improving writing in a criminal justice writing course, one of the three groups of students utilizing AWE feedback did not achieve significantly higher final course grades than the control group. However, possible reasons for the non-significance of the results for this third group are not mentioned, and a very strong positive conclusion is drawn: "Results from this study overwhelmingly point toward the value of technology when teaching writing skills" (p. 444). Conversely, we also found an example in which the authors did not appear to do full justice to their findings. Chodorow et al. (2010) found that Criterion reduced the article error rate of non-native speakers, but not of native speakers. However, the study did not report the article error rates for the native speakers and did not raise the point that AWE may be less effective for native speakers simply because native speakers do not tend to make many article errors. In this particular case, the lack of a significant effect for native speakers should not be taken at face value as negative evidence for the effectiveness of AWE.
A number of studies comparing AWE feedback to no feedback have found significant positive effects for AWE on writing outcomes. For example, in a study by Franzke et al. (2005) on Summary Street, which used a pretest/posttest design with random assignment to an AWE or a no-feedback condition, students in both conditions wrote four texts, which were scored for quality by human raters. It was found that the AWE condition had higher holistic and content scores, both on the averaged score for the four texts and in orthogonal comparisons of the scores for the first two texts with those for the last two texts. However, many of the studies are not as well designed, and do not include a pretest or other information on the comparability of students in experimental and control groups. In particular, results of studies that have compared writing outcomes of students who received AWE with those of students in previous cohorts should be viewed with caution. For example, Grimes (2008) found that in three out of four schools students who used MY Access! had higher external test scores than students from a previous year who did not receive AWE feedback. However, the author acknowledges that it is difficult to attribute this improvement to AWE, as important improvements to the quality of the writing instruction provided by teachers were also instituted during the intervention period.

As shown by the research survey, only three studies have explicitly compared AWE feedback with teacher feedback (i.e., Frost, 2008; Rock, 2007; Warden, 2000). As the evidence from these studies is also mixed, it seems premature to draw any firm conclusions. However, it should be pointed out that none of the studies shows that AWE feedback is less effective than teacher feedback, which could be taken as a positive sign. Nonetheless, of concern is that these studies report little about the
nature of the teacher feedback given or whether this feedback was comparable to the AWE feedback. For example, in Warden (2000), an AWE condition in which students received specific error feedback is compared with a teacher feedback condition in which students received no specific feedback, but only general comments on content, organization, and grammar. As students in the teacher feedback condition received no specific feedback on the accuracy of their texts, it is hardly surprising that the number of errors decreased more in the AWE condition.

In general, there appears to be more support for improvement of error rates than for improvement of holistic scores. For example, Kellogg, Whiteford, and Quinlan (2010) found that holistic scores did not improve, but that errors were reduced. As the error types that were reduced largely related to linguistic aspects of the text, they drew the conclusion that there was tentative support for learning about mechanical aspects of writing from AWE. In contrast, Chen (1997) found that an AWE group and a no-feedback control group decreased linguistic errors equally. However, the results of this study could well be attributable to a methodological drawback, as both experimental and control groups were in the same classes. In these classes, the teachers spent time reviewing the most common error types found by the computer in the presence of all the students. Hence, both groups of students may have benefited from this instruction.

There appears to be no clear evidence as yet concerning whether AWE feedback is associated with more generalized improvements in writing proficiency. Some of the studies that have examined transfer of the effects of AWE to texts for which no AWE feedback has been provided found no significant differences between scores for AWE and non-AWE conditions (i.e., Choi, 2010; Kellogg et al., 2010; Shermis, Burstein, & Bliss, 2004). Moreover, although three studies did find evidence of transfer (Elliot & Mikulas, 2004; Grimes, 2008; Wang & Wang, 2012), none of these studies is rigorously designed. The Wang and Wang (2012) study had only one participant in each condition. The flaws in the Grimes (2008) study have already been discussed. In Elliot and Mikulas (2004), in each of four sub-studies it was claimed that AWE feedback was associated with better exam performance. However, there was no random assignment to conditions, and the reader is given no information concerning the characteristics of the participants in the two conditions. In one of the sub-studies, students' results are compared with those of students from a year 2000 baseline. In addition, results for two of the four sub-studies were not tested statistically, and those that were tested were tested non-parametrically. Also, some of the claims seem rather remarkable, such as that a group who used MY Access! between February and March of 2003 had a pass rate of 81% compared to only 46% for a group who did not receive AWE feedback. It seems rather unlikely that such a short AWE intervention could lead to such a substantial change in assessment outcomes, indicating that other factors may also have been in operation. However, it is important to be aware that whether feedback leads to any generalized improvements in students' revising ability or in the quality of their texts is also one of the big unknowns of the writing feedback that students receive from teachers.
Hyland and Hyland (2006) pointed out that research on human feedback rarely looks beyond immediate correction in a subsequent draft, so AWE research is not alone in neglecting this area. Closely connected to whether feedback can lead to generalized improvements in writing is whether it assists students in developing their ability to revise independently. One of the first steps in developing revising skills is that writers are able to notice aspects of their texts that have not, up to that point, been salient (Schmidt, 1990; Truscott, 1998). Once a feature has been noticed, it becomes available for reflection and analysis. As Hyland and Hyland (2006) pointed out, demonstrating that a student can utilize feedback to edit a draft tells us little about whether the student has successfully acquired a feature. Similarly, it tells us little about whether the student has developed the meta-cognitive skills needed to notice, and then subsequently evaluate and correct, textual problems in other texts.

Currently, we know little about whether AWE actually promotes independent revising. However, there is some evidence that receiving AWE feedback may not actually encourage students to make changes either between or within drafts. Attali (2004) reported that 71% of students did not redraft their essays and that 48% of those who did redraft did so only once. Grimes (2005) reported that a typical revision pattern for students was to submit a first draft, correct a few mechanical errors and resubmit as fast as possible to see if the score improved. Warden (2000) found that students who were offered a redrafting opportunity after receiving AWE feedback from QBL actually spent significantly less time revising their first drafts than students who received AWE feedback on a single draft with no redrafting
opportunity, or who received teacher feedback instead of AWE feedback. Students who received no redrafting opportunity revised their texts before they received any feedback. They then submitted their texts for marking, received a mark and AWE feedback, but were not given an opportunity to redraft the text. In contrast, students who received AWE feedback and had an opportunity to redraft appeared to carry out little independent editing, instead waiting for the program to tell them what was wrong with their texts and then specifically correcting these errors. While these students were successful in correcting errors detected by AWE, they made few other changes to their texts. Moreover, this trend continued across successive assignments, suggesting that AWE feedback was not leading to much development in revising skills. However, it is important to remember that these findings corroborate findings from revision research that writers, particularly younger writers, revise little and revise superficially (Faigley & Witte, 1981; Whalen & Ménard, 1995). It may be that some students simply do not possess the revising skills needed to allow them to benefit from the revision opportunities afforded by AWE.

5. Conclusions and recommendations

This critical review suggests that there is only modest evidence that AWE feedback has a positive effect on the quality of the texts that students produce using AWE, and that as yet there is little clarity about whether AWE is associated with more general improvements in writing proficiency. Paucity of research, heterogeneity of existing research, the mixed nature of research findings, and methodological issues in some of the existing research are factors that limit our ability to draw firm conclusions concerning the effectiveness of AWE feedback.

Initially, we endeavored to meta-analyze effect sizes for the product studies in this sample. However, due to methodological issues, many of the studies had to be excluded, leaving us with a very small but still highly heterogeneous sample. Heterogeneity necessitates the inclusion of moderator analyses that examine the effects of variables such as AWE program, educational context, and whether AWE feedback was compared with no feedback or with teacher feedback. However, with such a small sample, there was insufficient power to conduct moderator analyses. We felt that simply providing an overall effect size that ignores possible effects of moderator variables was not a viable or meaningful option. Instead, by carrying out a critical review we have been able to identify patterns in the existing research, as well as discuss gaps in the findings and issues in the methodologies. Below are recommendations that follow from this review and that can serve as a guideline for further research in this area.
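To make the procedure we attempted more concrete, the sketch below illustrates how standardized mean differences from individual studies would be pooled under a random-effects model before any moderator analysis could be run. It is purely illustrative: the effect sizes and variances are hypothetical values, not data from the reviewed studies, and the code is a minimal sketch of one common estimator (DerSimonian-Laird) rather than a description of any analysis reported here.

    # Illustrative sketch only: hypothetical effect sizes, not data from the reviewed studies.
    # Pools standardized mean differences (Hedges' g) with a DerSimonian-Laird
    # random-effects model, the step that would precede any moderator analysis.

    import math

    # (g, variance of g) for a handful of hypothetical AWE-vs-control comparisons
    studies = [(0.45, 0.04), (0.10, 0.06), (0.62, 0.09), (-0.05, 0.05), (0.30, 0.07)]

    def random_effects_pool(effects):
        """Return the DerSimonian-Laird pooled estimate, its standard error, and tau^2."""
        w = [1.0 / v for _, v in effects]                      # fixed-effect weights
        fe_mean = sum(wi * g for wi, (g, _) in zip(w, effects)) / sum(w)
        q = sum(wi * (g - fe_mean) ** 2 for wi, (g, _) in zip(w, effects))
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - (len(effects) - 1)) / c)          # between-study variance
        w_star = [1.0 / (v + tau2) for _, v in effects]        # random-effects weights
        pooled = sum(wi * g for wi, (g, _) in zip(w_star, effects)) / sum(w_star)
        se = math.sqrt(1.0 / sum(w_star))
        return pooled, se, tau2

    pooled, se, tau2 = random_effects_pool(studies)
    print(f"pooled g = {pooled:.2f}, "
          f"95% CI = [{pooled - 1.96 * se:.2f}, {pooled + 1.96 * se:.2f}], tau^2 = {tau2:.3f}")
    # Splitting five studies into moderator subgroups (e.g., no-feedback vs. teacher-feedback
    # controls) leaves only two or three studies per subgroup, so the subgroup estimates are
    # too imprecise to support meaningful moderator comparisons.

As the closing comment indicates, with so few usable studies the subgroup estimates needed for moderator comparisons become too imprecise to interpret, which is why a single overall effect size was not a meaningful option for this sample.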
Although this review has not allowed us to differentiate the effectiveness of specific AWE programs, given differences in the objectives of the programs and the nature of the feedback provided, it is likely that such differences do exist. So far, more research on the effects of AWE has been carried out for Criterion than for other programs. Therefore, more studies examining other programs are called for, and in particular studies comparing the effectiveness of more than one AWE program.

A number of the studies provided only sketchy descriptions of their participants in terms of factors such as socioeconomic status, language background, literacy levels, and computer literacy. Future research needs to be more rigorous in reporting participant characteristics, in controlling for participant variables and, where appropriate, in including these as variables in the research design. In particular, further research is needed that examines the effectiveness of AWE feedback in ESL and EFL settings, and compares these to L1 settings. Given the tremendous diversity of student populations within the United States, not to mention the diversity in potential markets for AWE programs in both English-speaking and EFL contexts outside the United States, it is of particular importance that the effectiveness of AWE feedback for second language learners be investigated. The commercial programs in use in the United States were not originally designed for English as a second language populations, even though they are being marketed with such populations in mind (Warschauer & Ware, 2006).

In addition, further research examining the relative effects of AWE feedback and teacher feedback is needed, in which greater explanation of the nature and quality of the feedback provided by teachers is given and in which it is ensured that the kinds of feedback offered by teachers and by AWE programs are more comparable. As there are so many factors in play, it is likely to turn out to be too simplistic to make overall pronouncements about whether human feedback or computer feedback is better.
What needs to be disentangled is whether it really is the source of the feedback that matters, or whether it is other factors, such as the way the feedback is delivered and the nature of the feedback provided, that make the difference. It is also important to be aware that, as developers and researchers alike frequently reiterate, AWE feedback is intended to augment teacher feedback rather than replace it (e.g., Chen & Cheng, 2008; Kellogg et al., 2010; Philips, 2007); research into the relative effects of different ways of integrating AWE feedback into classroom writing instruction may therefore have greater ecological validity. In a qualitative study involving the use of AWE feedback in three classrooms, Chen and Cheng (2008) found indications that AWE feedback may indeed be more effective when it is combined with human feedback. However, this study did not examine the effects of different methods of integration on written production. There are a variety of possible ways of combining AWE with teacher feedback, and of scaffolding AWE feedback. To name just a few, students can use AWE to help them improve the quality of initial drafts and then submit these to the teacher for feedback, teachers can use AWE as a diagnostic tool for identifying the problems that students have with their writing, and/or teachers can provide initial training. Research that investigates different possibilities for integrating AWE into classroom writing instruction would also be of pedagogical value.

Some might argue that, in terms of the effectiveness of AWE feedback, the bottom line is whether the scores it generates correlate with external assessment outcomes and whether its repeated use in the classroom improves students' test results. However, while it is highly desirable that the transfer of the effects of AWE feedback to non-AWE texts be established, it is questionable whether external exams provide the most appropriate means of doing so. Firstly, as Warschauer and Ware (2006) remark, exam writing is generally based on a single draft written under timed circumstances, whereas the whole point of AWE is that it encourages multiple drafting. Secondly, the scoring on exams may be too far removed from the aspects for which AWE provides feedback. Thirdly, AWE feedback may not be robust enough as an instructional intervention to impact noticeably on exam scores. Instead, we would recommend examining transfer of the effects of AWE feedback in non-test situations, using texts that are similar in terms of genre and topic to the AWE texts students have been writing.

The question remains, of course, whether the kinds of writing that AWE feedback gives writers the opportunity to engage in actually reflect the kinds of writing that students do in their classrooms. AWE programs generally offer only a limited number of genres, such as persuasive, narrative and informative genres, though some programs, such as MY Access!, additionally enable teachers to use their own prompts (see Grimes & Warschauer, 2010). Moreover, as mentioned, AWE has been accused of promoting formulaic writing with an unimaginative five-paragraph structure. The way lies open for AWE research to include a greater consideration of genre by controlling for genre as a variable, and by systematically examining the influence of genre on the effectiveness of AWE feedback, for example, by comparing the effects of AWE when standard prompts are used with the effects when teachers' own prompts are used.
In conclusion, this study has carried out a critical review of research that examines the effects of formative AWE feedback on the quality of the texts that students produce. It has illuminated what is known and what is not known about the effects of AWE feedback on writing. It could be argued that a limitation of the study is that it takes a narrow view of effectiveness in terms of a single dimension: written production measures. It does not focus on either of the other two dimensions of effectiveness identified by Lai (2010): the effects on writing processes or perceived usefulness. However, we feel that Lai's first dimension is an appropriate and valuable focal point for a critical review, because improving students' writing is central to the objectives of AWE and to claims regarding its effectiveness, both of which are reflected in the fact that, as this study has shown, the bulk of research conducted so far focuses on written production. We certainly applaud AWE research that takes a triangulated approach by incorporating the effects of AWE on written production (product perspective), on revision processes and learning and teaching processes (process perspective), and on writers' and teachers' perceptions (perception perspective) (e.g., Choi, 2010; Grimes, 2008). We would also join in the plea made by Liu et al. (2002) concerning research on computer-based technology: "rather than focusing on the benefits and potentials of computer technology, research needs to move toward explaining how computers can be used to support (second) language learning – i.e., what kind of tasks or activities should be used and in what kinds of settings" (pp. 26–27). Consequently, as the next step,
in a follow-up study we will examine the use of AWE feedback in the classroom, including teaching and learning processes and teacher and learner perceptions.

Research survey sample

(References marked with an asterisk indicate studies that examine solely or partially the effects of AWE on writing outcomes, and which have therefore been included in the critical review.)

*Attali, Y. (2004). Exploring feedback and revision features of Criterion. Paper presented at the National Council on Measurement in Education, San Diego, April 12–16, 2004.
*Chen, J. F. (1997). Computer generated error feedback and writing process: A link [Electronic version]. TESL-EJ, 2. Retrieved from http://tesl-ej.org/ej07/a1.html
Chen, C. E., & Cheng, W. (2008). Beyond the design of automated writing evaluation: Pedagogical practices and perceived learning effectiveness in EFL writing classes. Language Learning and Technology, 12(2), 94–112.
*Chodorow, M., Gamon, M., & Tetreault, J. (2010). The utility of article and preposition error correction systems for English language learners: Feedback and assessment. Language Testing, 27(3), 419–436.
*Choi, J. (2010). The impact of automated essay scoring (AES) for improving English language learners' essay writing. (Doctoral dissertation, University of Virginia, 2010).
*El Ebyary, K., & Windeatt, S. (2010). The impact of computer-based feedback on students' written work. International Journal of English Studies, 10(2), 121–142.
*Elliot, S., & Mikulas, C. (2004). The impact of MY Access! use on student writing performance: A technology overview and four studies. Paper presented at the Annual Meeting of the American Educational Research Association.
*Foltz, P. W., Laham, D., & Landauer, T. K. (1999). The intelligent essay assessor: Applications to educational technology. Interactive Multimedia Educational Journal of Computer-Enhanced Learning, 1(2). Retrieved from www.knowledge-technologies.com
*Franzke, M., Kintsch, E., Caccamise, D., & Johnson, N. (2005). Summary Street: Computer support for comprehension and writing. Journal of Educational Computing Research, 33(1), 53–80.
*Frost, K. L. (2008). The effects of automated essay scoring as a high school classroom intervention, PhD thesis. Las Vegas: University of Nevada.
*Grimes, D. C. (2008). Middle school use of automated writing evaluation: A multi-site case study, PhD thesis. Irvine: University of California.
*Grimes, D., & Warschauer, M. (2010). Utility in a fallible tool: A multi-site case study of automated writing evaluation. The Journal of Technology, Learning and Assessment, 8(6), 1–43.
*Kellogg, R., Whiteford, A., & Quinlan, T. (2010). Does automated feedback help students learn to write? Journal of Educational Computing Research, 42, 173–196.
Lai, Y.-H. (2010). Which do students prefer to evaluate their essays: Peers or computer program. British Journal of Educational Technology, 41(3), 432–454.
*Riedel, E., Dexter, S. L., Scharber, C., & Doering, A. (2006). Experimental evidence on the effectiveness of automated essay scoring in teacher education cases. Journal of Educational Computing Research, 35(3), 267–287.
*Rock, J. (2007). The impact of short-term use of Criterion on writing skills in 9th grade (Research Report RR-07-07). Princeton, NJ: Educational Testing Service.
Scharber, C., Dexter, S., & Riedel, E. (2008). Students' experiences with an automated essay scorer. The Journal of Technology, Learning and Assessment, 7(1), 1–44.
*Shermis, M. D., Burstein, J., & Bliss, L. (2004). The impact of automated essay scoring on high stakes writing assessments. Paper presented at the Annual Meeting of the National Council on Measurement in Education.
*Shermis, M., Garvan, C. W., & Diao, Y. (2008). The impact of automated essay scoring on writing outcomes. Paper presented at the Annual Meeting of the National Council on Measurement in Education, March 25–27, 2008.
*Schroeder, J. A., Grohe, B., & Pogue, R. (2008). The impact of Criterion writing evaluation technology on criminal justice student writing skills. Journal of Criminal Justice Education, 19(3), 432–445.
*Wang, F., & Wang, S. (2012). A comparative study on the influence of automated evaluation system and teacher grading on students' English writing. Procedia Engineering, 29, 993–997.
*Warden, C. A. (2000). EFL business writing behavior in differing feedback environments. Language Learning, 50(4), 573–616.
Warden, C. A., & Chen, J. F. (1995). Improving feedback while decreasing teacher burden in ROC ESL business English classes. In P. Porythiaux, T. Boswood, & B. Babcock (Eds.), Explorations in English for professional communications. Hong Kong: City University of Hong Kong.

Other references

Anson, C. M. (2006). Can't touch this: Reflections on the servitude of computers as readers. In P. Freitag Ericsson, & R. Haswell (Eds.), Machine scoring of student essays (pp. 38–56). Logan, Utah: Utah State University Press.
Biber, D., Nekrasova, T., & Horn, B. (2011). The effectiveness of feedback for L1-English and L2-writing development: A meta-analysis (ETS Research Report RR-11-05). Princeton, NJ: ETS.
Burstein, J., Chodorow, M., & Leacock, C. (2004). Automated essay evaluation: The Criterion online writing service. AI Magazine (Fall), 27–36.
Chandrasegaran, A., Ellis, M., & Poedjosoedarmo, G. (2005). Essay assist: Developing software for writing skills improvement in partnership with students. RELC Journal, 36(2), 137–155.
Dikli, S. (2006). An overview of automated scoring of essays. The Journal of Technology, Learning and Assessment, 5(1), 1–35.
Faigley, L., & Witte, S. (1981). Analyzing revision. College Composition and Communication, 32, 400–414.
Freitag Ericsson, P. (2006). The meaning of meaning. In P. Freitag Ericsson, & R. Haswell (Eds.), Machine scoring of student essays. Logan, Utah: Utah State University Press.
Grimes, D. (2005). Assessing automated assessment: Essay evaluation software in the classroom. Paper presented at the Computers and Writing Conference, Stanford, CA.
Herrington, A., & Moran, C. (2001). What happens when machines read our students' writing? College English, 63(4), 480–499.
Hyland, K., & Hyland, F. (2006). Feedback on second language students' writing. Language Teaching, 39, 83–101.
Patterson, N. (2005). Computerized writing assessment: Technology gone wrong. Voices From the Middle, 13(2), 56–57.
Philips, S. M. (2007). Automated essay scoring: A literature review (SAEE Research Series #30). Kelowna, BC: Society for the Advancement of Excellence in Education.
Schmidt, R. W. (1990). The role of consciousness in second language learning. Applied Linguistics, 11(2), 129–158.
Shermis, M. D., & Burstein, J. (Eds.). (2003). Automated essay scoring: A cross-disciplinary perspective. Hillsdale, NJ: Lawrence Erlbaum Associates.
Shermis, M. D., & Burstein, J. (Eds.). (2013). Handbook of automated essay evaluation: Current applications and new directions. New York and London: Routledge.
Taylor, A. R. (2005). A future in the process of arrival: Using computer technologies for the assessment of learning. TASA Institute, Society for the Advancement of Excellence in Education.
Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning, 46(2), 327–369.
Truscott, J. (1998). Noticing in second language acquisition: A critical review. Second Language Research, 14(2), 103–135.
Warschauer, M., & Ware, J. (2006). Automated writing evaluation: Defining the classroom research agenda. Language Teaching Research, 10(2), 1–24.
Whalen, K., & Ménard, N. (1995). L1 and L2 writers' strategic and linguistic knowledge: A model of multiple-level discourse processing. Language Learning, 44(3), 381–418.
Yang, Y., Buckendahl, C. W., Juszkiewicz, P. J., & Bhola, D. S. (2002). A review of strategies for validating computer-automated scoring. Applied Measurement in Education, 15(4), 391–412.
Zellermayer, M., Salomon, G., Globerson, T., & Givon, H. (1991). Enhancing writing-related metacognitions through a computerized writing partner. American Educational Research Journal, 28(2), 373–391.

Further reading

*Britt, A., Wiemer-Hastings, P., Larson, A., & Perfetti, C. (2004). Using intelligent feedback to improve sourcing and integration in students' essays. International Journal of Artificial Intelligence in Education, 14, 359–374.
Dikli, S. (2007). Automated essay scoring in an ESL setting. (Doctoral dissertation, Florida State University, 2007).
*Hoon, T. (2006). Online automated essay assessment: Potentials for writing development. Retrieved from http://ausweb.scu.edu.au/aw06/papers/refereed/tan3/paper.html
*Lee, C., Wong, K. C. K., Cheung, W. K., & Lee, F. S. L. (2009). Web-based essay critiquing system and EFL students' writing: A quantitative and qualitative investigation. Computer Assisted Language Learning, 22(1), 57–72.
*Matsumoto, K., & Akahori, K. (2008). Evaluation of the use of automated writing assessment software. In C. Bonk et al. (Eds.), Proceedings of World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education 2008 (pp. 1827–1832). Chesapeake, VA: AACE.
*Schreiner, M. E. (2002). The role of automatic feedback in the summarization of narrative text, PhD thesis. University of Colorado.
*Steinhart, D. J. (2001). An intelligent tutoring system for improving student writing through the use of latent semantic analysis. Boulder: University of Colorado.
Wade-Stein, D., & Kintsch, E. (2004). Summary Street: Interactive computer support for writing. Cognition and Instruction, 22(3), 333–362.
Warschauer, M., & Grimes, D. (2008). Automated writing assessment in the classroom. Pedagogies: An International Journal, 3, 22–36.
Yao, Y. C., & Warden, C. A. (1996). Process writing and computer correction: Happy wedding or shotgun marriage? [Electronic version]. CALL Electronic Journal. Available at http://www.lerc.ritsumei.ac.jp/callej/1-1/Warden1.html.