SlideShare ist ein Scribd-Unternehmen logo
1 von 1
Downloaden Sie, um offline zu lesen
EXP 2 RESULTS
Voice learning
•  Voice ID significantly improved across training sessions. This was
true for performance in the training task (F(3, 38) = 80.15, p < .0001) and
in the post-training ID task (F(3, 38) = 43.95, p < .0001, See Fig. 6).
• Voice recognition in the training task was significantly better for AV
than the A group (F(1, 38) = 6.96, p < .05).
• Performance in the post-training voice identification task did not differ
as a function of training format (F(1, 38) = 1.807, p > .05).
Auditory word recognition
• Word recognition performance did not differ significantly for trained vs.
untrained voices (F(1, 38) = .211, p > .05, See Fig. 7).
EXP 1 RESULTS
Voice learning
• Voice identification was significantly more difficult for Talker Set
1 than Talker Set 2 (See Fig. 3).
• F0s for Talker Set 1 overlapped within the same gender
whereas within-gender F0s for Talker Set 2 were spread
apart.
• When a less stringent criterion was used for the training task,
there was a significant difference in mean total sessions
between the auditory and auditory-visual groups in Talker Set 1
(See Table 5).
Table 5. Total sessions required to achieve 75% criterion performance in
the voice training task.
Auditory word recognition
• There was no significant difference in auditory word
recognition between the trained voices and the untrained voices
(See Fig. 4). This was may be due to: 1) intelligibility differences
between Talker Sets 1 and 2; and 2) different numbers of
training sessions across subjects and conditions.
FIG 3. Average performance in
the voice identification task. Error
bars represent 95% confidence
interval. The dotted line
represents chance level (25%).
STIMULI
• 320 sentences created by Bell & Wilson (2001).
• 7-9 words in each sentence, semantically neutral.
• 3 key words per sentence (e.g., She felt a rough spot on her
skin.).
•  5 male and 5 female talkers produced the sentences
(See Table 1); recordings were in an audiovisual HD
format.
Table 1. Talker Acoustics.
• All sentences were processed via12-channel noise-band CI
simulator (See Fig. 1 & Table 2).
Table 2. Corner frequencies
for frequency analysis filters
for CI simulation.
1pPP25. Effects of Training Format on Voice Learning of Spectrally Degraded Voices 
Sangsook Choi1, Karen Iler Kirk1, Tom Talavage2, Vidya Krull1, Christopher Smalt2, Steve Baker2
1. Dept. of Speech Language and Hearing Science, 2. Dept. of Electrical and Computer Engineering
BACKGROUND
• Perception of talker-specific information (i.e., indexical
information such as a talker’s voice) is closely related to
linguistic processing of an utterance (For review, Nygaard,
2005).
• Audiovisual speech facilitates speech perception in adverse
listening environments (Sumby  Pollack, 1954). Similar
facilitative effects were found in voice identification in NH
listeners (Sheffert  Olson, 2004).
• It is well known that audiovisual speech provides large
benefits to individuals with hearing impairment (Erber, 1972;
1975), including CI recipients (Tyler et al., 1997).
• Little is known about indexical perception in CI recipients
and their abilities to integrate indexical and linguistic
information in speech recognition.
PURPOSE OF THE STUDY
In the present study, CI-simulated speech was used to
examine the effect of presentation format (auditory vs.
auditory-visual) on learning of spectrally degraded voices and
the transfer of voice learning to auditory recognition of
spectrally degraded speech.
EXPERIMENT 1
Participants: 32 native speakers of General American
English with normal-hearing and normal or corrected-to-
normal vision.
Procedures:
Voice learning
• Familiarized with voices in A or AV presentation format.
• 1/2 group in each presentation format trained on Talker Set 1, and
the remainder trained on Talker Set 2.
•  After familiarization, voice identification was tested in an A format
and the feedback was provided on every trial.
• Voice identification training continued until criterion performance
was reached or 10 sessions were completed.
Auditory word recognition
•  Upon completion of voice ID training, auditory word recognition
was tested using novel sentences produced by all talkers in Sets 1
and 2 (1/2 trained voices, 1/2 untrained voices).
Hypothesis 1: Auditory-visual familiarization will accelerate
the learning of spectrally degraded voices. Therefore, fewer
training sessions should be necessary to achieve voice
identification criterion performance (80% in 2 consecutive
sessions) in the AV training group compared to the A training
group.
Hypothesis 2: Familiarity with a talker’s voice will facilitate
subsequent spoken word recognition. That is, word
recognition performance for the trained voices will be greater
than for the untrained voices.
EXP 1 RESULTS
Voice learning
• 75 % of the listeners met the criterion after an average of
4.83 sessions (see Tables 3  4).
• All who did not meet the criterion were trained with
Talker Set 1, and 75% of the listeners who did not meet
the criterion were in the auditory group (See Table 3).
Table 3. Total number of subjects who met and did not meet criterion.
• Overall, there was no significant difference in mean total
training sessions between the auditory and auditory-visual
training formats (t (1, 21.99) = .451, p  .05, See Table 4).
Table 4. Total sessions required to achieve 80% criterion performance
in the voice training task.
EXPERIMENT 2
Participants: A new group of 40 native speakers of General American
English with NH and normal or corrected-to normal vision
Stimuli: Six talkers in the training set and 4 talkers in the untrained set.
Sets were matched on average intelligibility.
Procedures:
Voice learning
• Familiarized with voices in A or AV format
• Voice ID training was carried in an A format. But the feedback about the
accuracy of each answer was provided in A or AV format.
• Upon completion of each training, subjects completed an auditory-only voice
ID task consisting of 20 new sentences spoken by 6 trained voices (total 120
tokens).
Auditory word recognition
• Pre- and post-training word recognition testing was carried in an A format
using sentences spoken by both trained and untrained talkers.
Hypothesis 1: Auditory-visual voice training will facilitate voice learning.
Therefore, voice ID performance should be better for subjects trained in
AV compared with those trained in A.
Hypothesis 2: Talker-specific knowledge learned from voice ID training
should facilitate subsequent word recognition. Therefore, recognition of
spectrally degraded speech should be greater for trained than untrained
voices.
FIG 4. Average auditory word
recognition performance. Error
bars represent 95%
confidence interval.
CONCLUSIONS
• Auditory-visual talker information facilitated recognition for
spectrally degraded voices.
• Voice ID training significantly improved perception of spectrally
degraded voices.
• Improved familiarity with a talker’s voice through voice training
did not improve subsequent auditory recognition of spectrally
degraded speech.
FIG 6. Mean voice ID
performance. Error bars
represent 95% confidence
interval. The dotted line is at
chance performance (33%).
FIG 7. Pre- and post-training mean
key-word recognition performance.
Error bars represent 95% confidence
interval.
ACKNOWLEDGEMENTS
This work was supported by NIH NIDCD grant DC008875.
Channel Start Freq Stop Freq
1 188 313
2 313 563
3 563 813
4 813 1063
5 1063 1313
6 1313 1688
7 1688 2188
8 2188 2813
9 2813 3688
10 3688 4813
11 4813 6188
12 6188 7938 FIG 1. Overview of the CI simulator
Total
Talker Set 1 Talker Set 2 Talker Set 1 Talker Set 2
Criterion met 2 8 6 8 24
Criterion unmet 6 0 2 0 8
Auditory Auditory-visual
Auditory Auditory-visual Auditory Auditory-visual
Mean 7 4.13 3.88 4.13
SD 2.08 1.25 1.36 2.75
t-statistic
Talker Set 1 Talker Set 2
t (1,14) = .231, p .05t (1,9.55) = -3.19, p  .05
FIG 2. A schematic of
voice ID training and word
recognition tasks.
Voice ID Training
Sentence
transcription
A-only
Pre-training word recognition
Voice ID with
Feedback
A or AV A-onlyA or AV
Voice ID TestFamiliarization
Sentence
transcription
Post-training word recognition
A-only
Training repeated for 4 sessions
FIG 5. A schematic of voice
training and word recognition
tasks used for EXP 2.
Talker Sex Age
Mean F0
(Hz)
Min F0
(Hz)
Max F0
(Hz)
01 f 25 207 77 366
02 f 23 201 78 308
03 f 21 183 104 322
04 f 24 172 78 452
05 f 36 145 84 334
06 m 27 137 80 295
07 m 20 124 90 181
08 m 27 116 81 191
09 m 28 113 103 175
10 m 28 113 76 271
Total
Talker Set 1 Talker Set 2 Talker Set 1 Talker Set 2
Mean 6 4.25 5.67 4.5 4.83
SD 1.41 1.75 2.73 2.51 2.24
Mean 4.83
SD 2.24
t-statistic
5 4.6
2.57 1.78
t (1, 21.99) = .451, p  .05
Auditory Auditory-visual

Weitere ähnliche Inhalte

Was ist angesagt?

Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...
Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...
Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...ravi sharma
 
Playing with improv theatre to battle public speaking anxiety
Playing with improv theatre to battle public speaking anxietyPlaying with improv theatre to battle public speaking anxiety
Playing with improv theatre to battle public speaking anxietyJordi Casteleyn
 
Predictability of Consonant Perception Ability Through a Listening Comprehens...
Predictability of Consonant Perception Ability Through a Listening Comprehens...Predictability of Consonant Perception Ability Through a Listening Comprehens...
Predictability of Consonant Perception Ability Through a Listening Comprehens...Kosuke Sugai
 
Investigations on the role of analysis window shape parameter in speech enhan...
Investigations on the role of analysis window shape parameter in speech enhan...Investigations on the role of analysis window shape parameter in speech enhan...
Investigations on the role of analysis window shape parameter in speech enhan...karthik annam
 
A literature review on improving speech intelligibility in noisy environment
A literature review on improving speech intelligibility in noisy environmentA literature review on improving speech intelligibility in noisy environment
A literature review on improving speech intelligibility in noisy environmentOHSU | Oregon Health & Science University
 
Does Pronunciation Instruction Promote Intelligibility and Comprehensibility?
Does Pronunciation Instruction Promote Intelligibility and Comprehensibility?Does Pronunciation Instruction Promote Intelligibility and Comprehensibility?
Does Pronunciation Instruction Promote Intelligibility and Comprehensibility?ozpar
 
IRJET- Spoken Language Identification System using MFCC Features and Gaus...
IRJET-  	  Spoken Language Identification System using MFCC Features and Gaus...IRJET-  	  Spoken Language Identification System using MFCC Features and Gaus...
IRJET- Spoken Language Identification System using MFCC Features and Gaus...IRJET Journal
 
PPT-CCL: A Universal Phrase Tagset for Multilingual Treebanks
PPT-CCL: A Universal Phrase Tagset for Multilingual TreebanksPPT-CCL: A Universal Phrase Tagset for Multilingual Treebanks
PPT-CCL: A Universal Phrase Tagset for Multilingual TreebanksLifeng (Aaron) Han
 
ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...
ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...
ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...kevig
 
Pedagogical Approaches Based on Intensive Listening & Reading Aloud of Auth...
  Pedagogical Approaches Based on Intensive Listening & Reading Aloud of Auth...  Pedagogical Approaches Based on Intensive Listening & Reading Aloud of Auth...
Pedagogical Approaches Based on Intensive Listening & Reading Aloud of Auth...N.K KooZN
 
Speech Feature Extraction and Data Visualisation
Speech Feature Extraction and Data VisualisationSpeech Feature Extraction and Data Visualisation
Speech Feature Extraction and Data VisualisationITIIIndustries
 
Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing
Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing
Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing IJECEIAES
 

Was ist angesagt? (14)

Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...
Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...
Comparative study of Text-to-Speech Synthesis for Indian Languages by using S...
 
Playing with improv theatre to battle public speaking anxiety
Playing with improv theatre to battle public speaking anxietyPlaying with improv theatre to battle public speaking anxiety
Playing with improv theatre to battle public speaking anxiety
 
Predictability of Consonant Perception Ability Through a Listening Comprehens...
Predictability of Consonant Perception Ability Through a Listening Comprehens...Predictability of Consonant Perception Ability Through a Listening Comprehens...
Predictability of Consonant Perception Ability Through a Listening Comprehens...
 
Investigations on the role of analysis window shape parameter in speech enhan...
Investigations on the role of analysis window shape parameter in speech enhan...Investigations on the role of analysis window shape parameter in speech enhan...
Investigations on the role of analysis window shape parameter in speech enhan...
 
A literature review on improving speech intelligibility in noisy environment
A literature review on improving speech intelligibility in noisy environmentA literature review on improving speech intelligibility in noisy environment
A literature review on improving speech intelligibility in noisy environment
 
Does Pronunciation Instruction Promote Intelligibility and Comprehensibility?
Does Pronunciation Instruction Promote Intelligibility and Comprehensibility?Does Pronunciation Instruction Promote Intelligibility and Comprehensibility?
Does Pronunciation Instruction Promote Intelligibility and Comprehensibility?
 
rs_day_10012015
rs_day_10012015rs_day_10012015
rs_day_10012015
 
IRJET- Spoken Language Identification System using MFCC Features and Gaus...
IRJET-  	  Spoken Language Identification System using MFCC Features and Gaus...IRJET-  	  Spoken Language Identification System using MFCC Features and Gaus...
IRJET- Spoken Language Identification System using MFCC Features and Gaus...
 
PPT-CCL: A Universal Phrase Tagset for Multilingual Treebanks
PPT-CCL: A Universal Phrase Tagset for Multilingual TreebanksPPT-CCL: A Universal Phrase Tagset for Multilingual Treebanks
PPT-CCL: A Universal Phrase Tagset for Multilingual Treebanks
 
ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...
ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...
ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...
 
Ijeer journal
Ijeer journalIjeer journal
Ijeer journal
 
Pedagogical Approaches Based on Intensive Listening & Reading Aloud of Auth...
  Pedagogical Approaches Based on Intensive Listening & Reading Aloud of Auth...  Pedagogical Approaches Based on Intensive Listening & Reading Aloud of Auth...
Pedagogical Approaches Based on Intensive Listening & Reading Aloud of Auth...
 
Speech Feature Extraction and Data Visualisation
Speech Feature Extraction and Data VisualisationSpeech Feature Extraction and Data Visualisation
Speech Feature Extraction and Data Visualisation
 
Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing
Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing
Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing
 

Andere mochten auch

Thank you letter for SG
Thank you letter for SGThank you letter for SG
Thank you letter for SGJoe Anderson
 
Universidad Tecnológica Equinoccial Curso: QTOB
Universidad Tecnológica Equinoccial Curso: 		QTOBUniversidad Tecnológica Equinoccial Curso: 		QTOB
Universidad Tecnológica Equinoccial Curso: QTOBCarito Criollo
 
Die bremer stadtmusikanten
Die bremer stadtmusikantenDie bremer stadtmusikanten
Die bremer stadtmusikantenlavipo
 
Paul WuenschelRecomendation
Paul WuenschelRecomendationPaul WuenschelRecomendation
Paul WuenschelRecomendationDouglas Fisher
 
Embedr.eu & Omeka
Embedr.eu & OmekaEmbedr.eu & Omeka
Embedr.eu & OmekaIIIF_io
 
Pcori challenge final_proposal_04152013
Pcori challenge final_proposal_04152013Pcori challenge final_proposal_04152013
Pcori challenge final_proposal_04152013Paul Diefenbach
 
Salvador vista aérea
Salvador vista aéreaSalvador vista aérea
Salvador vista aéreaRita Rocha
 
Young & Rubicam
Young & RubicamYoung & Rubicam
Young & Rubicammark10d
 
Jornalismo de Dados, Panamá Papers e Wikileaks
Jornalismo de Dados, Panamá Papers e WikileaksJornalismo de Dados, Panamá Papers e Wikileaks
Jornalismo de Dados, Panamá Papers e WikileaksHaydee Svab
 
Звіт Асоціації велосипедистів Києва 2015
Звіт Асоціації велосипедистів Києва 2015 Звіт Асоціації велосипедистів Києва 2015
Звіт Асоціації велосипедистів Києва 2015 velotransport
 
Delegados aecistas contra intenert
Delegados aecistas contra intenertDelegados aecistas contra intenert
Delegados aecistas contra intenertMiguel Rosario
 

Andere mochten auch (16)

Thank you letter for SG
Thank you letter for SGThank you letter for SG
Thank you letter for SG
 
Universidad Tecnológica Equinoccial Curso: QTOB
Universidad Tecnológica Equinoccial Curso: 		QTOBUniversidad Tecnológica Equinoccial Curso: 		QTOB
Universidad Tecnológica Equinoccial Curso: QTOB
 
26 sustentacio proyectofinal
26 sustentacio proyectofinal26 sustentacio proyectofinal
26 sustentacio proyectofinal
 
FriendsWhoLend
FriendsWhoLendFriendsWhoLend
FriendsWhoLend
 
Die bremer stadtmusikanten
Die bremer stadtmusikantenDie bremer stadtmusikanten
Die bremer stadtmusikanten
 
Paul WuenschelRecomendation
Paul WuenschelRecomendationPaul WuenschelRecomendation
Paul WuenschelRecomendation
 
Bachelor Diploma
Bachelor DiplomaBachelor Diploma
Bachelor Diploma
 
Embedr.eu & Omeka
Embedr.eu & OmekaEmbedr.eu & Omeka
Embedr.eu & Omeka
 
Pcori challenge final_proposal_04152013
Pcori challenge final_proposal_04152013Pcori challenge final_proposal_04152013
Pcori challenge final_proposal_04152013
 
Salvador vista aérea
Salvador vista aéreaSalvador vista aérea
Salvador vista aérea
 
Coase-EPA. El problema del Costo Social
Coase-EPA. El problema del Costo SocialCoase-EPA. El problema del Costo Social
Coase-EPA. El problema del Costo Social
 
Young & Rubicam
Young & RubicamYoung & Rubicam
Young & Rubicam
 
Jornalismo de Dados, Panamá Papers e Wikileaks
Jornalismo de Dados, Panamá Papers e WikileaksJornalismo de Dados, Panamá Papers e Wikileaks
Jornalismo de Dados, Panamá Papers e Wikileaks
 
Звіт Асоціації велосипедистів Києва 2015
Звіт Асоціації велосипедистів Києва 2015 Звіт Асоціації велосипедистів Києва 2015
Звіт Асоціації велосипедистів Києва 2015
 
Delegados aecistas contra intenert
Delegados aecistas contra intenertDelegados aecistas contra intenert
Delegados aecistas contra intenert
 
IT services for fashion and retail
IT services for fashion and retailIT services for fashion and retail
IT services for fashion and retail
 

Ähnlich wie ASA 09 Poster-portlandOR-051209

3. speech processing algorithms for perception improvement of hearing impaire...
3. speech processing algorithms for perception improvement of hearing impaire...3. speech processing algorithms for perception improvement of hearing impaire...
3. speech processing algorithms for perception improvement of hearing impaire...k srikanth
 
Speaker identification system using close set
Speaker identification system using close setSpeaker identification system using close set
Speaker identification system using close seteSAT Journals
 
Speaker identification system using close set
Speaker identification system using close setSpeaker identification system using close set
Speaker identification system using close seteSAT Publishing House
 
To Improve Speech Recognition in Noise for Cochlear Implant Users
To Improve Speech Recognition in Noise for Cochlear Implant UsersTo Improve Speech Recognition in Noise for Cochlear Implant Users
To Improve Speech Recognition in Noise for Cochlear Implant UsersIOSR Journals
 
Writing the discussion chapter for quantitative research.pdf
Writing the discussion chapter for quantitative research.pdfWriting the discussion chapter for quantitative research.pdf
Writing the discussion chapter for quantitative research.pdfMartin McMorrow
 
Esophageal Speech Recognition using Artificial Neural Network (ANN)
Esophageal Speech Recognition using Artificial Neural Network (ANN)Esophageal Speech Recognition using Artificial Neural Network (ANN)
Esophageal Speech Recognition using Artificial Neural Network (ANN)Saibur Rahman
 
Bachelors project summary
Bachelors project summaryBachelors project summary
Bachelors project summaryAditya Deshmukh
 
Recording Distortion Product Otoacoustic Emissions using the Adaptive Noise C...
Recording Distortion Product Otoacoustic Emissions using the Adaptive Noise C...Recording Distortion Product Otoacoustic Emissions using the Adaptive Noise C...
Recording Distortion Product Otoacoustic Emissions using the Adaptive Noise C...RicardoVallejo30
 
Frequency Lowering Hearing Aids: Procedures for Assessing Candidacy and Fine ...
Frequency Lowering Hearing Aids: Procedures for Assessing Candidacy and Fine ...Frequency Lowering Hearing Aids: Procedures for Assessing Candidacy and Fine ...
Frequency Lowering Hearing Aids: Procedures for Assessing Candidacy and Fine ...Phonak
 
Central Auditory Function: Willeford Test Battery
Central Auditory Function: Willeford Test BatteryCentral Auditory Function: Willeford Test Battery
Central Auditory Function: Willeford Test Batteryalexandracostlow
 
Writing results and discussion chapters for quantitative research
Writing results and discussion chapters for quantitative researchWriting results and discussion chapters for quantitative research
Writing results and discussion chapters for quantitative researchMartin McMorrow
 
Effects of Nonlinear Frequency Compression on Performance of Individuals Who ...
Effects of Nonlinear Frequency Compression on Performance of Individuals Who ...Effects of Nonlinear Frequency Compression on Performance of Individuals Who ...
Effects of Nonlinear Frequency Compression on Performance of Individuals Who ...Arun Joshi
 
20211008 修論中間発表
20211008 修論中間発表20211008 修論中間発表
20211008 修論中間発表Tomoya Koike
 
Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...
Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...
Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...IJERA Editor
 

Ähnlich wie ASA 09 Poster-portlandOR-051209 (20)

3. speech processing algorithms for perception improvement of hearing impaire...
3. speech processing algorithms for perception improvement of hearing impaire...3. speech processing algorithms for perception improvement of hearing impaire...
3. speech processing algorithms for perception improvement of hearing impaire...
 
Speaker identification system using close set
Speaker identification system using close setSpeaker identification system using close set
Speaker identification system using close set
 
Speaker identification system using close set
Speaker identification system using close setSpeaker identification system using close set
Speaker identification system using close set
 
To Improve Speech Recognition in Noise for Cochlear Implant Users
To Improve Speech Recognition in Noise for Cochlear Implant UsersTo Improve Speech Recognition in Noise for Cochlear Implant Users
To Improve Speech Recognition in Noise for Cochlear Implant Users
 
ship
shipship
ship
 
Bioscientist Poster2
Bioscientist Poster2Bioscientist Poster2
Bioscientist Poster2
 
F010334548
F010334548F010334548
F010334548
 
Int journal 01
Int journal 01Int journal 01
Int journal 01
 
Writing the discussion chapter for quantitative research.pdf
Writing the discussion chapter for quantitative research.pdfWriting the discussion chapter for quantitative research.pdf
Writing the discussion chapter for quantitative research.pdf
 
03.Acoustic_Eduardo
03.Acoustic_Eduardo03.Acoustic_Eduardo
03.Acoustic_Eduardo
 
Esophageal Speech Recognition using Artificial Neural Network (ANN)
Esophageal Speech Recognition using Artificial Neural Network (ANN)Esophageal Speech Recognition using Artificial Neural Network (ANN)
Esophageal Speech Recognition using Artificial Neural Network (ANN)
 
Bachelors project summary
Bachelors project summaryBachelors project summary
Bachelors project summary
 
Frequency Transposition
Frequency TranspositionFrequency Transposition
Frequency Transposition
 
Recording Distortion Product Otoacoustic Emissions using the Adaptive Noise C...
Recording Distortion Product Otoacoustic Emissions using the Adaptive Noise C...Recording Distortion Product Otoacoustic Emissions using the Adaptive Noise C...
Recording Distortion Product Otoacoustic Emissions using the Adaptive Noise C...
 
Frequency Lowering Hearing Aids: Procedures for Assessing Candidacy and Fine ...
Frequency Lowering Hearing Aids: Procedures for Assessing Candidacy and Fine ...Frequency Lowering Hearing Aids: Procedures for Assessing Candidacy and Fine ...
Frequency Lowering Hearing Aids: Procedures for Assessing Candidacy and Fine ...
 
Central Auditory Function: Willeford Test Battery
Central Auditory Function: Willeford Test BatteryCentral Auditory Function: Willeford Test Battery
Central Auditory Function: Willeford Test Battery
 
Writing results and discussion chapters for quantitative research
Writing results and discussion chapters for quantitative researchWriting results and discussion chapters for quantitative research
Writing results and discussion chapters for quantitative research
 
Effects of Nonlinear Frequency Compression on Performance of Individuals Who ...
Effects of Nonlinear Frequency Compression on Performance of Individuals Who ...Effects of Nonlinear Frequency Compression on Performance of Individuals Who ...
Effects of Nonlinear Frequency Compression on Performance of Individuals Who ...
 
20211008 修論中間発表
20211008 修論中間発表20211008 修論中間発表
20211008 修論中間発表
 
Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...
Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...
Identification of Sex of the Speaker With Reference To Bodo Vowels: A Compara...
 

ASA 09 Poster-portlandOR-051209

  • 1. EXP 2 RESULTS Voice learning •  Voice ID significantly improved across training sessions. This was true for performance in the training task (F(3, 38) = 80.15, p < .0001) and in the post-training ID task (F(3, 38) = 43.95, p < .0001, See Fig. 6). • Voice recognition in the training task was significantly better for AV than the A group (F(1, 38) = 6.96, p < .05). • Performance in the post-training voice identification task did not differ as a function of training format (F(1, 38) = 1.807, p > .05). Auditory word recognition • Word recognition performance did not differ significantly for trained vs. untrained voices (F(1, 38) = .211, p > .05, See Fig. 7). EXP 1 RESULTS Voice learning • Voice identification was significantly more difficult for Talker Set 1 than Talker Set 2 (See Fig. 3). • F0s for Talker Set 1 overlapped within the same gender whereas within-gender F0s for Talker Set 2 were spread apart. • When a less stringent criterion was used for the training task, there was a significant difference in mean total sessions between the auditory and auditory-visual groups in Talker Set 1 (See Table 5). Table 5. Total sessions required to achieve 75% criterion performance in the voice training task. Auditory word recognition • There was no significant difference in auditory word recognition between the trained voices and the untrained voices (See Fig. 4). This was may be due to: 1) intelligibility differences between Talker Sets 1 and 2; and 2) different numbers of training sessions across subjects and conditions. FIG 3. Average performance in the voice identification task. Error bars represent 95% confidence interval. The dotted line represents chance level (25%). STIMULI • 320 sentences created by Bell & Wilson (2001). • 7-9 words in each sentence, semantically neutral. • 3 key words per sentence (e.g., She felt a rough spot on her skin.). •  5 male and 5 female talkers produced the sentences (See Table 1); recordings were in an audiovisual HD format. Table 1. Talker Acoustics. • All sentences were processed via12-channel noise-band CI simulator (See Fig. 1 & Table 2). Table 2. Corner frequencies for frequency analysis filters for CI simulation. 1pPP25. Effects of Training Format on Voice Learning of Spectrally Degraded Voices Sangsook Choi1, Karen Iler Kirk1, Tom Talavage2, Vidya Krull1, Christopher Smalt2, Steve Baker2 1. Dept. of Speech Language and Hearing Science, 2. Dept. of Electrical and Computer Engineering BACKGROUND • Perception of talker-specific information (i.e., indexical information such as a talker’s voice) is closely related to linguistic processing of an utterance (For review, Nygaard, 2005). • Audiovisual speech facilitates speech perception in adverse listening environments (Sumby Pollack, 1954). Similar facilitative effects were found in voice identification in NH listeners (Sheffert Olson, 2004). • It is well known that audiovisual speech provides large benefits to individuals with hearing impairment (Erber, 1972; 1975), including CI recipients (Tyler et al., 1997). • Little is known about indexical perception in CI recipients and their abilities to integrate indexical and linguistic information in speech recognition. PURPOSE OF THE STUDY In the present study, CI-simulated speech was used to examine the effect of presentation format (auditory vs. auditory-visual) on learning of spectrally degraded voices and the transfer of voice learning to auditory recognition of spectrally degraded speech. EXPERIMENT 1 Participants: 32 native speakers of General American English with normal-hearing and normal or corrected-to- normal vision. Procedures: Voice learning • Familiarized with voices in A or AV presentation format. • 1/2 group in each presentation format trained on Talker Set 1, and the remainder trained on Talker Set 2. •  After familiarization, voice identification was tested in an A format and the feedback was provided on every trial. • Voice identification training continued until criterion performance was reached or 10 sessions were completed. Auditory word recognition •  Upon completion of voice ID training, auditory word recognition was tested using novel sentences produced by all talkers in Sets 1 and 2 (1/2 trained voices, 1/2 untrained voices). Hypothesis 1: Auditory-visual familiarization will accelerate the learning of spectrally degraded voices. Therefore, fewer training sessions should be necessary to achieve voice identification criterion performance (80% in 2 consecutive sessions) in the AV training group compared to the A training group. Hypothesis 2: Familiarity with a talker’s voice will facilitate subsequent spoken word recognition. That is, word recognition performance for the trained voices will be greater than for the untrained voices. EXP 1 RESULTS Voice learning • 75 % of the listeners met the criterion after an average of 4.83 sessions (see Tables 3 4). • All who did not meet the criterion were trained with Talker Set 1, and 75% of the listeners who did not meet the criterion were in the auditory group (See Table 3). Table 3. Total number of subjects who met and did not meet criterion. • Overall, there was no significant difference in mean total training sessions between the auditory and auditory-visual training formats (t (1, 21.99) = .451, p .05, See Table 4). Table 4. Total sessions required to achieve 80% criterion performance in the voice training task. EXPERIMENT 2 Participants: A new group of 40 native speakers of General American English with NH and normal or corrected-to normal vision Stimuli: Six talkers in the training set and 4 talkers in the untrained set. Sets were matched on average intelligibility. Procedures: Voice learning • Familiarized with voices in A or AV format • Voice ID training was carried in an A format. But the feedback about the accuracy of each answer was provided in A or AV format. • Upon completion of each training, subjects completed an auditory-only voice ID task consisting of 20 new sentences spoken by 6 trained voices (total 120 tokens). Auditory word recognition • Pre- and post-training word recognition testing was carried in an A format using sentences spoken by both trained and untrained talkers. Hypothesis 1: Auditory-visual voice training will facilitate voice learning. Therefore, voice ID performance should be better for subjects trained in AV compared with those trained in A. Hypothesis 2: Talker-specific knowledge learned from voice ID training should facilitate subsequent word recognition. Therefore, recognition of spectrally degraded speech should be greater for trained than untrained voices. FIG 4. Average auditory word recognition performance. Error bars represent 95% confidence interval. CONCLUSIONS • Auditory-visual talker information facilitated recognition for spectrally degraded voices. • Voice ID training significantly improved perception of spectrally degraded voices. • Improved familiarity with a talker’s voice through voice training did not improve subsequent auditory recognition of spectrally degraded speech. FIG 6. Mean voice ID performance. Error bars represent 95% confidence interval. The dotted line is at chance performance (33%). FIG 7. Pre- and post-training mean key-word recognition performance. Error bars represent 95% confidence interval. ACKNOWLEDGEMENTS This work was supported by NIH NIDCD grant DC008875. Channel Start Freq Stop Freq 1 188 313 2 313 563 3 563 813 4 813 1063 5 1063 1313 6 1313 1688 7 1688 2188 8 2188 2813 9 2813 3688 10 3688 4813 11 4813 6188 12 6188 7938 FIG 1. Overview of the CI simulator Total Talker Set 1 Talker Set 2 Talker Set 1 Talker Set 2 Criterion met 2 8 6 8 24 Criterion unmet 6 0 2 0 8 Auditory Auditory-visual Auditory Auditory-visual Auditory Auditory-visual Mean 7 4.13 3.88 4.13 SD 2.08 1.25 1.36 2.75 t-statistic Talker Set 1 Talker Set 2 t (1,14) = .231, p .05t (1,9.55) = -3.19, p .05 FIG 2. A schematic of voice ID training and word recognition tasks. Voice ID Training Sentence transcription A-only Pre-training word recognition Voice ID with Feedback A or AV A-onlyA or AV Voice ID TestFamiliarization Sentence transcription Post-training word recognition A-only Training repeated for 4 sessions FIG 5. A schematic of voice training and word recognition tasks used for EXP 2. Talker Sex Age Mean F0 (Hz) Min F0 (Hz) Max F0 (Hz) 01 f 25 207 77 366 02 f 23 201 78 308 03 f 21 183 104 322 04 f 24 172 78 452 05 f 36 145 84 334 06 m 27 137 80 295 07 m 20 124 90 181 08 m 27 116 81 191 09 m 28 113 103 175 10 m 28 113 76 271 Total Talker Set 1 Talker Set 2 Talker Set 1 Talker Set 2 Mean 6 4.25 5.67 4.5 4.83 SD 1.41 1.75 2.73 2.51 2.24 Mean 4.83 SD 2.24 t-statistic 5 4.6 2.57 1.78 t (1, 21.99) = .451, p .05 Auditory Auditory-visual