This study examined differences in vocabulary size and lexical diversity between non-native English speakers (NNS) and native English speakers (NS) in their academic writing. It found that NS texts had greater lexical diversity and used lower-frequency words than NNS texts. Lexical diversity, as measured by the MTLD score, was the strongest predictor of writing quality for both groups and accounted for more variation in ratings than vocabulary size. While vocabulary size helped advance scores from level 2 to level 3, lexical diversity was more important for pushing compositions into the higher 4 to 5 quality range. The findings suggest that vocabulary instruction needs to go beyond growing word banks and teach writers how to vary words in their compositions.
2. Introduction
Many second language (L2) learners lack sufficient vocabulary
knowledge to meet the academic demands of university-level writing
tasks (Ferris, 1994; Laufer, 1994; Laufer & Nation, 1995;
Ruegg, Fritz, & Holland, 2011).
What does this gap in productive word knowledge look like?
Is it just a problem solved by teaching and learning more words?
Or is it more?
3. Background
Vocabulary size: frequency-based number of words in an essay’s lexicon
• Learners with larger vocabulary sizes used fewer higher-frequency
words and more academic and uncommon words in their
compositions (Laufer & Nation, 1995)
• Significant correlations between learners’ vocabulary size and
measures of writing quality (Agustin Llach & Gallego, 2009;
Albrechtsen, Haastrup, & Henriksen, 2008; Crossley &
McNamara, 2009; Staehr, 2008)
Lexical diversity: varied use of different words in writing
• Tended to be the strongest predictor of writing quality (de Haan &
van Esch, 2005; Grobe, 1981; Linnerud, 1986; Crossley &
McCarthy, 2009; Crossley, McNamara, & Jarvis, 2010; Schoonen et
al., 2003)
• There is an assumption that the ability to vary words in discourse
requires a large vocabulary size (Laufer, 1994)
4. Research Questions
1. Are there significant differences between advanced NNS learners’
and NS learners’ measures of vocabulary size and lexical diversity
as evidenced in samples of their academic writing?
2. Is there a relationship between vocabulary size and lexical
diversity?
3. Is vocabulary size or lexical diversity a greater predictor of writing
score in NNS and NS college writing?
5. Methods
• Description of the Sample:
• 104 advanced NNS academic essays, 68 NS academic essays (N =
172) collected from six intensive English writing programs in the
U.S.
• Spanned 14 different L1s and 7 writing genres
• 3 raters
• Instruments:
• MTLD – typical score range between 70 and 120
• Voc-D – range is highly variable
• CELEX – score range 0 to 6; 0 = rarest words, 6 = common words
• Available within the Coh-Metrix 3.0 computational linguistics
tool
• TOEFL iBT Writing Rubric – score range 0 to 5; 0 = lowest score,
5 = highest proficiency
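The slides do not show how MTLD is computed. As a rough illustration only, the sketch below follows the commonly described procedure: read the text token by token, count a "factor" each time the running type-token ratio (TTR) falls to a threshold (0.72 is the conventional default), add a partial factor for the leftover tokens, and average a forward and a backward pass. It is not the Coh-Metrix implementation; the function names, the simple tokenizer, and the use of `<=` at the threshold are assumptions made for this example.

```python
import re


def tokenize(text):
    """Very simple tokenizer (an assumption): lowercase alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def mtld_one_pass(tokens, threshold=0.72):
    """One directional MTLD pass: tokens divided by the number of factors."""
    factors = 0.0
    types = set()
    count = 0
    for tok in tokens:
        count += 1
        types.add(tok)
        if len(types) / count <= threshold:
            # TTR has dropped to the threshold: close a full factor, restart.
            factors += 1.0
            types = set()
            count = 0
    if count > 0:
        # Leftover tokens contribute a partial factor.
        ttr = len(types) / count
        factors += (1.0 - ttr) / (1.0 - threshold)
    if factors == 0:
        # Every token was unique, so the score is undefined for this text.
        return float("nan")
    return len(tokens) / factors


def mtld(tokens, threshold=0.72):
    """Mean of the forward and backward passes."""
    forward = mtld_one_pass(tokens, threshold)
    backward = mtld_one_pass(list(reversed(tokens)), threshold)
    return (forward + backward) / 2.0


if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat on the log"
    print(mtld(tokenize(sample)))
```

Toy inputs like the one above are far too short to give meaningful values; the 70 to 120 range listed for the instrument applies to full essays.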
7. Research Question 1
RQ1: Are there significant differences between advanced NNS
learners’ and NS learners’ measures of vocabulary size and lexical
diversity?
Results:
• NS texts exhibited significantly higher levels of lexical diversity and
used lower-frequency words than NNS texts (F(3, 168) = 20.30, p < .05,
η² = .27)
• Voc-D showed the greatest differences (F(3, 168) = 55.02, p < .05,
η² = .25)
• For NS texts, only the MTLD was able to detect differences (F(1, 66) =
4.17, p < .05, η² = .06)
Native speakers vary their words more and produce less common
words than non-native speakers.
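For readers who want to check how an effect size relates to its F statistic, the sketch below recovers η² from F and its degrees of freedom. It assumes the reported values are partial eta squared, which the slides do not state; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its dfs."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


if __name__ == "__main__":
    # Overall group difference reported above: F(3, 168) = 20.30
    print(round(partial_eta_squared(20.30, 3, 168), 2))
    # NS-only MTLD result reported above: F(1, 66) = 4.17
    print(round(partial_eta_squared(4.17, 1, 66), 2))
```

Under that assumption, the two calls reproduce the reported .27 and .06.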
8. Research Question 2
RQ2: Is there a significant relationship between vocabulary size
and lexical diversity?
Results:
• A moderate correlation between vocabulary size and lexical
diversity existed in the sample (MTLD [r = −.44, p < .001];
voc-D [r = −.46, p < .001])
• Split-file analysis shows that for NS, the correlation is somewhat
weaker (MTLD [r = −.36, p < .05]; voc-D [r = −.31, p < .05])
Essays with greater lexical diversity utilized lower-frequency
words, but only to a moderate degree.
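The negative sign follows from the CELEX scale, where higher scores mean more common words: an essay with higher lexical diversity tends to have a lower mean CELEX score. A minimal sketch of the Pearson correlation behind such a result is shown below. The per-essay numbers are invented purely for illustration and are not the study's data; the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with 2+ values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical per-essay scores (NOT from the study).
    mtld_scores = [72.0, 85.5, 90.1, 101.3, 118.7]
    celex_scores = [2.9, 2.7, 2.8, 2.5, 2.4]
    # Prints a negative value: more diverse essays use rarer words here.
    print(pearson_r(mtld_scores, celex_scores))
```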
9. Research Question 3
RQ3: Is vocabulary size or lexical diversity a greater predictor of writing
score in NNS and NS college writing?
Results:
• Lexical diversity was the only significant contributor to the model
for both NS and NNS writings (Exp[B] = 1.07, p < .001).
• Although both the MTLD and CELEX scores differed significantly across
score levels (F(6, 336) = 10.61, p < .001, η² = .16), the MTLD
accounted for the greater share of the variation in ratings (F(3, 168) =
21.66, p < .001, η² = .28).
As lexical diversity within an essay increased, so did its likelihood of
earning a score of 5.
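To make the size of this effect concrete, the sketch below treats Exp[B] = 1.07 as an odds ratio per one-point increase in MTLD, which is the usual reading of Exp(B) in logistic regression output. The slides do not specify the exact model, so this interpretation is an assumption, and the function name is hypothetical.

```python
def odds_multiplier(exp_b, delta):
    """Factor by which the odds change when the predictor rises by delta units."""
    return exp_b ** delta


if __name__ == "__main__":
    # A 1-point MTLD increase multiplies the odds by 1.07 (about 7% higher).
    print(odds_multiplier(1.07, 1))
    # A 10-point MTLD increase compounds to roughly double the odds.
    print(round(odds_multiplier(1.07, 10), 2))
```

Because the effect compounds multiplicatively, a small per-point odds ratio can still matter across the wide range of MTLD scores seen in essays.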
10. Discussion
Figure 1. Lexical Diversity
Figure 2. Vocabulary Size
Writers’ vocabulary size helps early on to advance scores from level 2
to 3, but it is their ability to diversify lexis that pushes the composition’s
quality into the 4 to 5 range.
11. Significance of the Findings
• Offers further explanation of vocabulary criteria for assessment rubrics
• Suggests that mid-range vocabulary words could account for some of the
differences between native speaker and non-native speaker writers’
ratings
• Indicates that vocabulary instruction needs to go beyond growing
advanced learner lexicons and teach advanced NNS writers how to vary
these words in composition
• Offers some validation of the MTLD; it performed well despite large
variation in text length
12. Implications for Practice
• Highlights the importance of vocabulary instruction in the composition
classroom, even for advanced learners
• Instruction should focus not only on expanding learner lexicons in the
mid-frequency range, but also on how to diversify these words in production
• Help interpret vocabulary benchmarks such as “appropriate word choice”,
“sufficient range of vocabulary”, or “control of lexical features”
• Allow for instructors to give more targeted feedback
13. Limitations
• Text length, task topic, and writing genre present challenges to any study
of lexical diversity
• CELEX frequency bands were created in 1995; it is possible that word
frequencies have changed
• No covariates were controlled for
• Generalizability is limited by the demographics of the sample
14. Future Directions
• Study assignments from intact freshman composition courses that contain
both NNS and NS
• Control for covariates such as text length, grammar, cohesion, lexical error,
or other factors that relate to writing quality
• Qualitative component to raters’ scores
• Include an independent measure of productive vocabulary size such as a
productive version of Nation’s Vocabulary Size Test; also correlate CELEX
frequencies to BNC/COCA
• Examine lexical density, or content words, and its impact on writing quality