Frequency Effects on Perceptual Compensation for Coarticulation
Alan C. L. Yu¹, Ed King¹, Morgan Sonderegger²
¹ Phonology Laboratory, Department of Linguistics, University of Chicago
² Department of Computer Science, University of Chicago
aclyu@uchicago.edu, etking@uchicago.edu, morgan@cs.uchicago.edu
Abstract
Errors in compensating perceptually for effects of coarticulation in speech have been hypothesized as one of the major sources of sound change in language. Little research has elucidated the conditions under which such errors might take place. Using the paradigm of selective adaptation, this paper reports the results of a series of experiments testing for the effect of frequency on the likelihood of perceptual compensation for coarticulation by listeners. The results suggest that perceptual compensation might be ameliorated (which might result in hypocorrection) or exaggerated (i.e. hypercorrection) depending on the relative frequency of the categories that are being perceived in their specific coarticulated contexts.
Index Terms: perceptual compensation, sound change, selective adaptation.

1. Introduction
A fundamental property of speech is its tremendous variability. Much research has shown that human listeners take such variability into account in speech perception [1, 2, 3, 4, 5, 6]. Beddor and colleagues [7], for example, found that adult English and Shona speakers perceptually compensate for the coarticulatory anticipatory raising of /a/ in C/i/ context and the anticipatory lowering of /e/ in C/a/ context. Both English and Shona listeners report hearing more /a/ in the context of a following /i/ than in the context of a following /a/.

Many scholars have hypothesized that a primary source of systematic sound changes in language comes from errors in perceiving the intended speech signal [8, 9]. That is, errors in listeners' classification of speakers' intended pronunciation, if propagated, might result in systematic changes in the sound systems of all speaker-listeners within the speech community. Ohala [8], in particular, argues that hypocorrective sound change (e.g., assimilation and vowel harmony) obtains when a contextual effect is misinterpreted as an intrinsic property of the segment (i.e. an increase in false positives in sound categorization). For example, an ambiguous /a/ token might be erroneously categorized as /e/ in the context of a following /i/ if listeners fail to take into account the anticipatory raising effect of /i/. If enough /a/ exemplars are misidentified as /e/, a pattern of vowel harmony might emerge. That is, the language will show a prevalence of mid vowels before /i/ and low vowels before /a/. On the other hand, a hypercorrective sound change (e.g., dissimilation) emerges when the listener erroneously attributes intended phonetic properties to contextual variation (i.e. an increase in false negatives in sound identification). In this case, an ambiguous /e/ might be misclassified as /a/ in the context of a following /i/. If enough /e/ exemplars are misidentified as /a/ when followed by /i/, a dissimilatory pattern of only low vowels before high vowels and non-low vowels before low vowels might emerge.

While much effort has gone into identifying the likely sources of such errors [10, 11, 8, 12, 9], little is known about the source of regularity in listener misperception that leads to the systematic nature of sound change. That is, why would random and haphazard misperception in an individual's percept lead to systematic reorganization of the sound system within the individual and within the speech community? The present study demonstrates that the likelihood of listeners adjusting their categorization pattern contextually (i.e. perceptual compensation) may be affected by the frequencies of the sound categories occurring in the specific contexts. In particular, the present study expands on Beddor et al.'s work [7] on perceptual compensation for vowel-to-vowel coarticulation in English, showing that the way English listeners compensate perceptually for the effect of regressive coarticulation from a following vowel (either /i/ or /a/) depends on the relative frequency of the coarticulated vowels (i.e. the relative frequency of /a/ and /e/ appearing before /i/ or /a/).

The idea that category frequency information affects speech perception is not new. Research on selective adaptation has shown that repeated exposure to a particular speech sound, say /s/, would shift the identification of ambiguous sounds, say sounds that are half-way between /s/ and /ʃ/, away from the repeatedly presented sound towards the alternative [13, 14, 15]. In perceptual learning studies, repeated exposure to an ambiguous sound, say a /s/-/f/ mixture, in /s/-biased lexical contexts induces retuned perception such that subsequent sounds are heard as /s/ even in lexically neutral contexts [16, 17]. The experiments reported below extend the findings of Beddor et al. [7] by presenting three groups of participants with the same training stimuli but varying the frequency with which they hear each token. The purpose of the present study is to demonstrate that contextually-sensitive category frequency information can induce selective adaptation effects in perceptual compensation and to examine the implications of such effects for theories of sound change.

2. Methods
2.1. Stimuli
The training stimuli consisted of CV1CV2 disyllables where C is one of /p, t, k/, V1 is either /a/ or /e/, and V2 is either /a/ or /i/. To avoid any vowel-to-vowel coarticulatory effect in the training stimuli, a phonetically-trained native English speaker (second author) produced each syllable of the training stimuli in isolation (/pa/, /pe/, /pi/, /ta/, /te/, /ti/, /ka/, /ke/, /ki/). The disyllabic training stimuli were assembled by splicing together the appropriate syllables and were resynthesized with a consistent intensity and pitch profile to avoid a potential confound of stress. The test stimuli consisted of two series of /pV1pV2/ disyllables where V2 is either /a/ or /i/. The first syllable, pV1, is a 9-step continuum resynthesized in PRAAT by varying F1, F2, and F3 in equidistant steps between the abovementioned speaker's /pa/ and /pe/ syllables. The original /pa/ and /pe/ syllables serve as the end points of the 9-step continuum.
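As an illustration of what "equidistant steps" between the two endpoint syllables could look like, the following minimal Python sketch linearly interpolates F1, F2, and F3 across nine steps. It is not the resynthesis procedure used in the study: the endpoint formant values are hypothetical placeholders, the function name `interpolate_continuum` is invented for this example, and spacing in Hz (rather than on another frequency scale) is an assumption.

```python
def interpolate_continuum(start, end, n_steps=9):
    """Return n_steps formant settings equally spaced from start to end (inclusive)."""
    if n_steps < 2:
        raise ValueError("a continuum needs at least two steps")
    steps = []
    for i in range(n_steps):
        weight = i / (n_steps - 1)  # 0.0 at the first endpoint, 1.0 at the last
        steps.append({
            name: start[name] + weight * (end[name] - start[name])
            for name in start
        })
    return steps


# Hypothetical endpoint formants in Hz; these values are NOT taken from the paper.
PA_FORMANTS = {"F1": 750.0, "F2": 1250.0, "F3": 2500.0}
PE_FORMANTS = {"F1": 550.0, "F2": 1850.0, "F3": 2600.0}

continuum = interpolate_continuum(PA_FORMANTS, PE_FORMANTS, n_steps=9)

for step_number, formants in enumerate(continuum, start=1):
    print(step_number, formants)
```

Under these assumptions, step 1 and step 9 reproduce the two endpoint settings and step 5 lies exactly midway between them on every formant.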
2.2. Participants and procedure
The experiment consists of two parts: exposure and testing. Subjects were assigned randomly to three exposure conditions. One group was exposed to CeCi tokens four times more often than to CeCa tokens and to CaCa tokens four times more often than to CaCi ones (the HYPER condition). The second group was exposed to CeCa tokens four times more often than to CeCi tokens and to CaCi tokens four times more often than to CaCa ones (the HYPO condition). The final group was exposed to an equal number of /e/ and /a/ vowels preceding /i/ and /a/ (the BALANCED condition). See Table 1 for a summary of the frequency distribution of the exposure stimuli. The exposure stimuli were presented over headphones automatically in random order in E-Prime in a sound-proof booth. Subjects performed a phoneme monitoring task during the exposure phase in which they were asked to press a response button when the word contained a medial /t/. Each subject heard 360 exposure tokens three times; a short break followed each block of 360 tokens. A total of forty-eight students at the University of Chicago, all native speakers of American English, participated in the experiment for course credit or a nominal fee. Eleven subjects took part in the HYPO condition; sixteen subjects each participated in the HYPER condition and the BALANCED condition.

During the testing phase, subjects performed a 2-alternative forced-choice task. The subjects listened to a randomized set of test stimuli and were asked to decide whether the first vowel sounded like /e/ or /a/.

3. Analysis
Subjects' responses (i.e. subjects' /a/ response rates) were modeled using a mixed-effects logistic regression. The model contains four fixed-effect variables: TRIAL (1-180), CONTINUUM STEP (1-9), EXPOSURE CONDITION (balanced, hyper, hypo) and VOCALIC CONTEXT (/a/ vs. /i/). The model also includes three two-way interactions: VOCALIC CONTEXT x STEP, STEP x CONDITION, and VOCALIC CONTEXT x CONDITION. In addition, the model includes by-subject random slopes for TRIAL. A likelihood ratio test comparing a model with a VOCALIC CONTEXT x STEP x CONDITION three-way interaction term and one without it shows that the added three-way interaction does not significantly improve model log-likelihood (χ² = 3.253, df = 2, Pr(> χ²) = 0.1966). Table 2 summarizes the parameter estimate β for all fixed effects in the model, as well as the estimate of their standard error SE(β), the associated Wald's z-score, and the significance level. To eliminate collinearity, scalar variables were centered, while the categorical variables were sum-coded.

Consistent with Beddor et al.'s findings, continuum step and vocalic context are both significant predictors of /a/ response. That is, listeners reported hearing less and less /a/ from the /a/-end of the continuum to the /e/-end of the continuum, and they heard more /a/ when the target vowel was followed by /i/ than when it was followed by /a/. Specifically, the odds of hearing /a/ before /i/ are 1.3 times those before /a/. The significant interaction between STEP and VOCALIC CONTEXT suggests that the vocalic context effect differs depending on where the test stimulus is along the /a/-/e/ continuum. As illustrated in Figure 1, the effect of vocalic context is largest around steps 4-6, while identification is close to ceiling at the two endpoints of the continuum regardless of vocalic context.

[Figure 1 plot: x-axis: Step; y-axis: Probability of 'a' (0.0 to 1.0); separate curves for V2 = /i/ and V2 = /a/.]
Figure 1: Interaction between VOCALIC CONTEXT and STEP. The predictor variables were back-transformed to their original scales in the figure.

Of particular interest here is the significant interaction between the exposure condition and vocalic context. Figure 2 illustrates this interaction clearly; the effect of vocalic context on /a/ response is influenced by the nature of the exposure data. When the exposure data contain more CaCa and CeCi tokens than CaCi and CeCa tokens (i.e. the hyper condition), listeners report hearing more /a/ in the /i/ context than in the /a/ context, compared to the response rate after the balanced condition, where the frequencies of CaCa, CeCi, CaCi, and CeCa tokens are equal. On the other hand, in the hypo condition, where listeners heard more CaCi and CeCa tokens than CaCa and CeCi ones, listeners reported hearing less /a/ in the /i/ context than in the /a/ context, the opposite of what is observed in both the balanced and hyper conditions. The model also shows a significant interaction between CONDITION and STEP. As illustrated in Figure 3, the slope of the identification function is steepest after the hyper condition and shallowest after the hypo condition.
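Two of the figures reported in the analysis can be checked with simple arithmetic, and the short Python sketch below does so. It is offered only as a reader's sanity check under stated assumptions, not as the authors' analysis code: it assumes the reported odds ratio of 1.3 corresponds to exponentiating the magnitude of the VOCALIC CONTEXT coefficient in Table 2, and it relies on the fact that the upper tail of a chi-square distribution with 2 degrees of freedom has the closed form exp(-x/2). The variable and function names are invented for this example.

```python
import math


def chi2_upper_tail_df2(statistic):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


# Likelihood ratio test for the three-way interaction (values quoted from the text).
lrt_statistic = 3.253
lrt_p_value = chi2_upper_tail_df2(lrt_statistic)

# VOCALIC CONTEXT = a coefficient and standard error (values quoted from Table 2).
context_coef = -0.2690
context_se = 0.0316

# Wald z-score: coefficient divided by its standard error.
context_z = context_coef / context_se

# Assumption: the reported odds ratio is exp(|coefficient|).
context_odds_ratio = math.exp(abs(context_coef))

print(f"LRT p-value:        {lrt_p_value:.4f}")
print(f"Wald z:             {context_z:.2f}")
print(f"Odds ratio (/i/ vs /a/ context): {context_odds_ratio:.2f}")
```

The three printed values agree, after rounding, with the p-value of 0.1966, the z-score of -8.51, and the odds ratio of 1.3 given in the text and in Table 2.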
Table 1: Stimuli presentation frequency during the exposure phase. C = /p, t, k/

Type   BALANCED   HYPER   HYPO
CeCi   90         144     36
CeCa   90         36      144
CaCi   90         36      144
CaCa   90         144     36

Table 2: Estimates for all predictors in the analysis of listener response in the identification task.

Predictor                                   Coef. β    SE(β)     z        p
Intercept                                   -0.0096    0.0674    -0.14    0.8867
TRIAL                                       -0.0008    0.0009    -0.87    0.3825
STEP                                        -0.9260    0.0185    -49.98   < 0.001 ***
VOCALIC CONTEXT = a                         -0.2690    0.0316    -8.51    < 0.001 ***
CONDITION = hyper                           -0.0594    0.0934    -0.64    0.5251
CONDITION = hypo                             0.0126    0.0950     0.13    0.8942
STEP x VOCALIC CONTEXT = a                   0.0372    0.0170     2.19    < 0.05 *
STEP x CONDITION = hyper                     0.0481    0.0247     1.95    0.0514
STEP x CONDITION = hypo                     -0.2631    0.0296    -8.89    < 0.001 ***
VOCALIC CONTEXT = a x CONDITION = hyper      0.0311    0.0432     0.72    0.4717
VOCALIC CONTEXT = a x CONDITION = hypo      -0.4146    0.0477    -8.70    < 0.001 ***

[Figure 2 plot: "Vocalic Context x Condition"; x-axis: Condition (Balanced, Hyper, Hypo); y-axis: Probability of 'a' (0.0 to 1.0); separate lines for V2 = /i/ and V2 = /a/.]
Figure 2: Interaction between VOCALIC CONTEXT and CONDITION.

[Figure 3 plot: "Condition x Step"; x-axis: Step; y-axis: Probability of 'a' (0.0 to 1.0); separate curves for the Hypo, Balanced, and Hyper conditions.]
Figure 3: Interaction between CONDITION and STEP.

4. Discussion and conclusion
The present study shows that the classification of vowels in different prevocalic contexts is influenced by the relative frequency distribution of the relevant vowels in specific contexts. For example, when /a/ frequently occurs before /a/, listeners are less likely to identify future instances of ambiguous /a/-/e/ vowels as /a/ in the same context; listeners would report hearing more /e/ before /a/ if CaCa exemplars outnumber CeCa exemplars. Likewise, when /a/ occurs frequently before /i/, listeners would reduce their rate of identification of /a/ in the same context; listeners would report hearing more /e/ before /i/ if CaCi tokens are more prevalent than CeCi tokens. These results suggest that listeners exhibit selective adaptation when frequency information of the target sounds varies in a context-specific fashion. That is, repeated exposure to an adaptor (the more frequent variant) results in heightened identification of the alternative. This finding has serious implications for models of sound change that afford a prominent role to listener misperception to account for sources of variation that lead to change.

To begin with, subjects in the hyper exposure condition exhibit what can be interpreted as hypercorrective behavior. That is, speech tokens that were classified as /a/ in the balanced condition were classified as /e/ in the hyper condition when V2 = /a/; likewise, sounds that were classified as /e/ in the balanced condition were treated as /a/ in the hyper condition when V2 = /i/. If this type of hypercorrective behavior persists, the pseudo-lexicon of the made-up language our subjects experienced would gradually develop a prevalence of disyllabic "words" that do not allow two low vowels or two non-low vowels in consecutive syllables. This would represent a state of vocalic height dissimilation, not unlike the pattern found in the Vanuatu languages [18]. On the other hand, listeners in the hypo exposure condition exhibit what could be interpreted as hypocorrective behavior. That is, tokens that were classified as /e/ in the balanced condition were classified as /a/ in the HYPO condition when V2 = /a/; likewise, vowels heard as /a/ in the balanced condition were heard as /e/ in the HYPO condition when V2 = /i/. If this type of reclassification persists, listeners in the hypo condition would develop a pseudo-lexicon where vowels in disyllabic words must agree in lowness, and a state of vocalic height harmony would obtain, similar to many cases found in the Bantu languages of Africa [19].

Another ramification the present findings have for listener-misperception models of sound change concerns the role of the conditioning environment. That is, such models of sound change often attribute misperception to listeners failing to detect the contextual information properly and thus failing to properly normalize for the effect of context on the realization of the sound in question. Here, our findings establish that systematic "failure" of perceptual compensation takes place despite the presence of the coarticulatory source; perceptual compensation "failure" is understood here as any case in which the context-specific identification functions deviate from the canonical identification functions observed in the balanced condition. This finding echoes earlier findings that perceptual compensation may only be partial under certain circumstances. Taken together, these findings suggest that failure to compensate perceptually for coarticulatory influence need not be the result of not detecting the source of coarticulation. Listeners may fail to take properly into account the role that coarticulatory contexts have in speech production and perception.

It is worth pointing out in closing that selective adaptation effects have generally been attributed to adaptors fatiguing specialized linguistic feature detectors [13], which suggests that the
neural mechanism that subserves speech perception may eventually recuperate from adaptor fatigue and the selective adaptation might dissipate. There is some evidence that selective adaptation effects are temporary [20]. The lack of durativity of selective adaptation raises doubt about its implications for sound change, since sound change necessitates the longevity of the influencing factors. Additional research is underway to ascertain the longitudinal effects of selective adaptation. Such data will provide much needed information regarding the significance of selective adaptation effects for speech perception and sound change.

5. Acknowledgements
This work is partially supported by National Science Foundation Grant BCS-0949754.

6. References
[1] V. Mann, "Influence of preceding liquid on stop-consonant perception," Perception & Psychophysics, vol. 28, no. 5, pp. 407–12, 1980.
[2] V. A. Mann and B. H. Repp, "Influence of vocalic context on perception of the [ʃ]-[s] distinction," Perception & Psychophysics, vol. 28, pp. 213–228, 1980.
[3] J. S. Pardo and C. A. Fowler, "Perceiving the causes of coarticulatory acoustic variation: consonant voicing and vowel pitch," Perception & Psychophysics, vol. 59, no. 7, pp. 1141–52, 1997.
[4] A. Lotto and K. Kluender, "General contrast effects in speech perception: effect of preceding liquid on stop consonant identification," Perception & Psychophysics, vol. 60, no. 4, pp. 602–19, 1998.
[5] P. Beddor and R. A. Krakow, "Perception of coarticulatory nasalization by speakers of English and Thai: Evidence for partial compensation," Journal of the Acoustical Society of America, vol. 106, no. 5, pp. 2868–2887, 1999.
[6] C. Fowler, "Compensation for coarticulation reflects gesture perception, not spectral contrast," Perception & Psychophysics, vol. 68, no. 2, pp. 161–177, 2006.
[7] P. S. Beddor, J. Harnsberger, and S. Lindemann, "Language-specific patterns of vowel-to-vowel coarticulation: acoustic structures and their perceptual correlates," Journal of Phonetics, vol. 30, pp. 591–627, 2002.
[8] J. Ohala, "The phonetics of sound change," in Historical Linguistics: Problems and Perspectives, C. Jones, Ed. London: Longman Academic, 1993, pp. 237–278.
[9] J. Blevins, Evolutionary Phonology: the emergence of sound patterns. Cambridge: Cambridge University Press, 2004.
[10] J. Ohala, Sound change is drawn from a pool of synchronic variation. Berlin: Mouton de Gruyter, 1989, pp. 173–198.
[11] J. Ohala, "The phonetics and phonology of aspects of assimilation," in Papers in Laboratory Phonology I: Between the Grammar and the Physics of Speech, J. Kingston and M. Beckman, Eds. Cambridge: Cambridge University Press, 1990, vol. 1, pp. 258–275.
[12] J. Ohala, "Towards a universal, phonetically-based, theory of vowel harmony," ICSLP, Yokohama, vol. 3, pp. 491–494, 1994.
[13] P. Eimas and J. Corbit, "Selective adaptation of linguistic feature detectors," Cognitive Psychology, vol. 4, pp. 99–109, 1973.
[14] P. D. Eimas and J. L. Miller, "Effects of selective adaptation of speech and visual patterns: Evidence for feature detectors," in Perception and Experience, H. L. Pick and R. D. Walk, Eds. N.J.: Plenum, 1978.
[15] A. G. Samuel, "Red herring detectors and speech perception: In defense of selective adaptation," Cognitive Psychology, vol. 18, pp. 452–499, 1986.
[16] D. Norris, J. M. McQueen, and A. Cutler, "Perceptual learning in speech," Cognitive Psychology, vol. 47, no. 2, pp. 204–238, 2003.
[17] A. G. Samuel and T. Kraljic, "Perceptual learning for speech," Attention, Perception, & Psychophysics, vol. 71, no. 6, pp. 1207–1218, 2009.
[18] J. Lynch, "Low vowel dissimilation in Vanuatu languages," Oceanic Linguistics, vol. 42, no. 2, pp. 359–406, 2003.
[19] F. B. Parkinson, "The representation of vowel height in phonology," PhD dissertation, Ohio State University, 1996.
[20] J. Vroomen, S. van Linden, M. Keetels, B. de Gelder, and P. Bertelson, "Selective adaptation and recalibration of auditory speech by lipread information: dissipation," Speech Communication, vol. 44, pp. 55–61, 2004.