Visit https://alexisbaskind.net/teaching for a full interactive version of this course with sound and video material, as well as more courses and material.
Course series: Fundamentals of acoustics for sound engineers and music producers
Level: undergraduate (Bachelor)
Language: English
Revision: February 2020
To cite this course: Alexis Baskind, Psychoacoustics 4 – Spatial Hearing
course material, license: Creative Commons BY-NC-SA.
Course content:
1. Introduction
sound localization, lateralization, perception of height, perception of distance
2. Interaural level and time differences
head as acoustic shadow, ITD, ILD, frequency dependence, interindividual differences
3. Cone of confusion
ambiguity of ITD and ILD in the cone of confusion, front/back confusions, need for extra information (vision, previous knowledge, head movements, distance-based cues, spectral cues)
4. Estimating distance in a dry environment
use of absolute level and spectrum of the sound
5. Cocktail-Party Effect
selective attention based on spectral, spatial and time cues
6. Summing Localization
base of stereophony, phantom sources, influence of interchannel time and level differences, time-based, level-based and mixed stereophony, sweet spot
7. Precedence Effect
Haas effect / Law of the first wavefront, echo threshold, application in music production
2. Alexis Baskind
Psychoacoustics 4 – Spatial Hearing
Course series
Fundamentals of acoustics for sound engineers and music producers
Level
undergraduate (Bachelor)
Language
English
Revision
January 2020
To cite this course
Alexis Baskind, Psychoacoustics 4 – Spatial Hearing, course material, license: Creative
Commons BY-NC-SA.
Full interactive version of this course with sound and video material, as well as more
courses and material on https://alexisbaskind.net/teaching.
Except where otherwise noted, content of this course
material is licensed under a Creative Commons Attribution-
NonCommercial-ShareAlike 4.0 International License.
Psychoacoustics 3 - Spatial Hearing
Outline
1. Introduction
2. Interaural level and time differences
3. Cone of confusion
4. Estimating distance in a dry environment
5. Cocktail-Party effect
6. Summing Localization
7. Precedence effect
Introduction
• Sound localization, i.e. our ability to estimate the position of
objects in 3 dimensions based on sound only, is one of the
fundamental attributes of human perception, as it
complements vision (which is only frontal) in spatial
perception of the environment
• It consists of three aspects:
1. Lateralization (left-right localization)
2. Perception of height
3. Perception of distance
Aspects 1 and 2 together form the perception of direction.
• It is a fairly complex mechanism that relies on objective cues
(time and level differences between the two ears, filtering
caused by reflections…) as well as on other sources of
information (prior knowledge, interaction with vision and
movements, etc.)
• This course focuses only on the direct sound (i.e. it considers
neither reverberation nor reflections)
Interaural Level and Time Differences
• Depending on the direction of incidence of the sound wave, the
sound differs between the two ears
• The head creates an acoustic shadow
• At the ear farthest from the source, the sound is:
– softer and filtered (especially at high frequencies)
– delayed
• The auditory system relies on these interaural time and level
differences to localize the source
Interaural Level Differences
• Interaural level differences (“ILD”) depend on the direction of
sound incidence:
• If the source is in the so-called median plane (in front, behind
or above, for example), it is at the same distance from both ears
=> the level difference is zero
• If the source is at 90° to one side, the shadowing is maximal,
and thus the level difference is also maximal
Interaural Level Differences
• Interaural level differences (“ILD”) depend on frequency:
• The head is an obstacle only at frequencies for which the
wavelength is smaller than the dimensions of the head (i.e.
above 1000-1500 Hz)
• At low frequencies, diffraction of the sound around the head
dominates, and level differences become smaller as the
frequency decreases.
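To get a feel for the frequency range quoted above, the wavelength can be compared with the size of the head. The following is a minimal sketch; the speed of sound (343 m/s) and the head width (0.18 m) are assumed round values, not figures from the course.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumption: air at about 20 °C
HEAD_WIDTH_M = 0.18         # assumption: rough adult head width


def wavelength_m(frequency_hz: float) -> float:
    """Wavelength of a sound wave in air, in metres."""
    return SPEED_OF_SOUND_M_S / frequency_hz


def head_to_wavelength_ratio(frequency_hz: float) -> float:
    """Head width expressed in wavelengths (larger ratio: stronger shadowing)."""
    return HEAD_WIDTH_M / wavelength_m(frequency_hz)


if __name__ == "__main__":
    for f in (100, 500, 1000, 1500, 4000, 10000):
        print(f"{f:>6} Hz: wavelength {wavelength_m(f):.3f} m, "
              f"head/wavelength = {head_to_wavelength_ratio(f):.2f}")
```

With these assumed values, the wavelength becomes comparable to the head width between roughly 1 and 2 kHz, which is the same order of magnitude as the range given on the slide.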
Interaural Level Differences
Example of interaural level differences as a function of frequency and angle
Note: these values vary from one individual to another!
Interaural Time Differences
• Interaural time differences (“ITD”) depend on the direction of
sound incidence:
[Figure: sound paths to the closest ear (shorter path) and to the farthest ear (longer path)]
• If the source is in the so-called median plane (in front, behind
or above, for example), it is at the same distance from both ears
=> the time difference is zero
• If the source is at 90° to one side, the extra distance to the
farthest ear is maximal, and so the time difference is also maximal
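The dependence of the ITD on the angle can be approximated with a simple geometric model. The sketch below uses Woodworth's rigid-sphere formula, ITD = (r/c)(θ + sin θ), which is a textbook approximation and not taken from this course; the head radius and the speed of sound are assumed values.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at about 20 °C
HEAD_RADIUS_M = 0.0875      # assumption: commonly used average head radius


def itd_woodworth_s(azimuth_deg: float) -> float:
    """Approximate ITD in seconds for a distant source.

    azimuth_deg: 0 = straight ahead (median plane), 90 = fully to one side.
    The formula is meant for azimuths between 0 and 90 degrees.
    """
    theta = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND_M_S * (theta + math.sin(theta))


if __name__ == "__main__":
    for angle in (0, 15, 30, 45, 60, 75, 90):
        print(f"{angle:>2}°: ITD = {itd_woodworth_s(angle) * 1000:.3f} ms")
```

Under these assumptions the model gives zero in the median plane and roughly 0.66 ms at 90°, which is the order of magnitude expected for an average head.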
Interaural Time Differences
• Interaural time differences (“ITD”) depend on frequency:
• The maximum time difference is specific to each person, but is
on average around 0.6 ms
• At low frequencies, these differences are small relative to the
period: the localization blur increases with decreasing frequency
• At high frequencies, the period is very short
=> possible ambiguity of the time delay
=> increasing localization blur
Conclusion: time-based sound localization is most precise in the
mid frequencies
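The ambiguity at high frequencies can be illustrated for a pure tone: once the ITD exceeds half a period, the same phase difference could also be produced by a shorter delay of opposite sign. A minimal sketch, assuming a sinusoidal signal and taking the average maximum ITD of 0.6 ms quoted above:

```python
MAX_ITD_S = 0.6e-3  # average maximum ITD quoted in the course


def itd_in_periods(frequency_hz: float, itd_s: float = MAX_ITD_S) -> float:
    """ITD expressed as a fraction of the period of a pure tone."""
    return itd_s * frequency_hz


def is_phase_ambiguous(frequency_hz: float, itd_s: float = MAX_ITD_S) -> bool:
    """True if the ITD is longer than half a period of the tone."""
    return itd_in_periods(frequency_hz, itd_s) > 0.5


def ambiguity_limit_hz(itd_s: float = MAX_ITD_S) -> float:
    """Frequency above which the given ITD exceeds half a period."""
    return 1.0 / (2.0 * itd_s)


if __name__ == "__main__":
    print(f"Limit for a 0.6 ms ITD: {ambiguity_limit_hz():.0f} Hz")
    for f in (100, 500, 1000, 2000, 5000):
        print(f"{f:>5} Hz: ITD = {itd_in_periods(f):.2f} periods, "
              f"ambiguous = {is_phase_ambiguous(f)}")
```

For the maximum ITD the limit is about 830 Hz; smaller ITDs (sources closer to the median plane) remain unambiguous up to higher frequencies. At 100 Hz the same ITD is only a few percent of the period, which illustrates why precision also drops at the low end.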
Interaural Time Differences
Interaural time differences as a function of angle
Note: these values also vary from one individual to another!
Cone of Confusion
• Interaural level and time differences, which carry the most
important information for localizing sources, are ambiguous
• For given interaural level and time differences, there are
infinitely many possible directions of incidence
Image from Stefan Weinzierl, “Handbuch der Audiotechnik”
Typical example: front/back confusions
Blauert Bands
• Special case: sound incidence in the median plane
=> Level and time differences are zero
=> Without any other source of information, the source is not localized
according to its actual direction, but only according to its frequency
content (“Blauert bands”)
Experiment by J. Blauert (1969/70):
• The sound source is in the median plane (in front, above or behind)
• The signal consists of third-octave noise
=> The localization does not depend on the incidence, only on the
frequency band
[Figure: Blauert bands. Probability of the estimated direction of incidence (front, above, back) in % as a function of frequency in kHz. Image: Wikipedia]
Cone of Confusion
How to solve this ambiguity?
1. Thanks to vision: if one sees the sound source, or at least its
rough direction, the ambiguity is resolved
2. Thanks to prior knowledge: if one already has a good
idea of the possible position of the source
3. Thanks to head movements, even small ones: the resulting change in
the interaural time differences is enough
=> For binaural synthesis (3D audio for headphones), this means that head
movements must be measured in real time in order to adapt the synthesis
accordingly (head tracking)
4. Thanks to the effect of distance on level and spectrum (see
below)
5. Thanks to the effect of reflections on the shoulders and the
outer ear (comb filters)
Image in D.R. Begault, 3-D Sound For Virtual Reality And Multimedia
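A single reflection added to the direct sound produces a comb filter whose notches depend on the delay of the reflection, and this delay changes with the direction of incidence. The sketch below lists the notch frequencies for a given delay. It idealizes the reflection as one delayed copy of the direct sound, and the 0.1 ms delay in the example is a hypothetical value chosen for illustration.

```python
def comb_notch_frequencies_hz(delay_s: float, max_frequency_hz: float = 20000.0) -> list:
    """Notch frequencies of (direct sound + one delayed copy).

    Cancellation occurs where the delay equals an odd number of half periods:
    f_k = (2k + 1) / (2 * delay), for k = 0, 1, 2, ...
    """
    notches = []
    k = 0
    while True:
        frequency = (2 * k + 1) / (2.0 * delay_s)
        if frequency > max_frequency_hz:
            break
        notches.append(frequency)
        k += 1
    return notches


if __name__ == "__main__":
    # Hypothetical reflection delay of 0.1 ms (about 3.4 cm of extra path)
    for f in comb_notch_frequencies_hz(1e-4):
        print(f"notch at {f:.0f} Hz")
```

A shorter delay moves the notches upwards, so the position of the notches in the spectrum carries information about the path of the reflection.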
Estimating the distance of the source
In a dry environment (i.e. without reverberation), estimating
the distance relies on two kinds of information:
1. The level of the sound at the ears: the farther the source, the
softer the sound (distance law).
=> But this requires knowing how loud the source is!
For example, a loud whispered voice will be judged as close, and a
very brassy but soft trumpet as far away, even if the sound level at
the ears is the same
=> If we don’t know the source, and if there is no
reverberation, we cannot know how far away the sound source is
2. The spectrum of the sound:
– If the source is close and directive, low frequencies are boosted
– If the source is far away, high frequencies are attenuated by air
absorption (low frequencies travel farther than high frequencies)
=> But this also requires knowing the spectrum of the sound
source. If we don’t know the sound, we cannot know how
its spectrum has been modified by the distance
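The distance law mentioned for the level cue can be made concrete under the usual free-field assumption of a point source, for which the level drops by about 6 dB per doubling of distance. This is a sketch under that assumption; the course itself does not specify which form of the law is meant.

```python
import math


def level_change_db(reference_distance_m: float, distance_m: float) -> float:
    """Level change when moving from reference_distance_m to distance_m.

    Assumes a point source in the free field (1/r law for sound pressure).
    """
    return -20.0 * math.log10(distance_m / reference_distance_m)


if __name__ == "__main__":
    for d in (1.0, 2.0, 4.0, 8.0, 10.0):
        print(f"{d:>4.0f} m: {level_change_db(1.0, d):+.1f} dB relative to 1 m")
```

The same drop of 6 dB at the ears can come from a source that is twice as far away or from a source that simply plays 6 dB softer, which is why the level alone does not reveal the distance of an unknown source.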
Cocktail-Party Effect
• The cocktail-party effect (also called selective attention) is our
ability to focus our attention on a given sound source when two
or more sources are active simultaneously
• It relies on two different mechanisms:
1. Spectral and time cues: the auditory system tries to separate
sources that have different spectra and do not sound at exactly the
same time (i.e. that are not homorhythmic, in music). But this is
harder if the sources are at the same position
2. Spatial cues: if two sounds produce different level and time
differences between the ears, they are spatially separated
by the auditory system and therefore perceived as two distinct sound
sources
=> That’s why a stereo mix generally sounds clearer than a mono
mix, and a surround mix generally sounds clearer than a stereo
mix
Summing Localization
• With two-channel stereophony, the listener receives four pieces of
information, two for each source (each loudspeaker signal reaches
both ears)
• The auditory system is tricked by this unnatural situation and
assumes that there is only one source between the two loudspeakers
=> Phantom source
• This is called summing localization
[Figure: phantom source (sound image) between the two loudspeakers]
Summing Localization
• Summing localization is the basis of stereophony
• It occurs when two or more spatially distinct sources
radiate identical or at least coherent signals with a time
difference smaller than 1.5 ms
• In this case there is a single perceived sound event, whose
localization depends on the time and level differences at
the ears between the signals of the sound sources:
– If the time and level differences are zero, the phantom source is
perceived exactly at the midpoint between the two loudspeakers
– If the time and level differences are not zero, the phantom source
shifts towards the loudspeaker whose signal is louder and/or earlier
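The two ways of shifting a phantom source can be tried out by deriving a pair of loudspeaker signals from one mono signal. The sketch below is only an illustration: the function name and its parameters are hypothetical, the signal is a plain list of samples, and the mapping from a given difference to a perceived position is not computed.

```python
def apply_interchannel_differences(mono, sample_rate, level_diff_db=0.0, time_diff_ms=0.0):
    """Return (left, right) loudspeaker signals derived from a mono signal.

    Positive differences favour the right loudspeaker (right louder and/or
    right earlier); negative differences favour the left loudspeaker.
    """
    delay = round(abs(time_diff_ms) * 1e-3 * sample_rate)  # delay in samples
    gain = 10.0 ** (-abs(level_diff_db) / 20.0)            # gain of the quieter channel

    def render(attenuate, late):
        g = gain if attenuate else 1.0
        pre = delay if late else 0   # silence before the signal (late channel)
        post = delay - pre           # silence after it, so both channels have equal length
        return [0.0] * pre + [g * s for s in mono] + [0.0] * post

    left = render(attenuate=level_diff_db > 0, late=time_diff_ms > 0)
    right = render(attenuate=level_diff_db < 0, late=time_diff_ms < 0)
    return left, right


if __name__ == "__main__":
    # Right channel 6 dB louder and 0.5 ms earlier: phantom source shifts to the right
    left, right = apply_interchannel_differences([1.0, 0.0], 48000, 6.0, 0.5)
    print(len(left), len(right), left[24], right[0])
```

Keeping the time difference below 1.5 ms keeps the example within the range where summing localization applies.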
Summing Localization
Localization of the phantom source based on level differences
(left figure) and time differences (right figure)
(Figure: J. Blauert, “Spatial Hearing”. Dashed plot: speech, head free to move;
solid plot: impulse signals, head immobilized)
Summing Localization
[Figure: position of the phantom source (at the left loudspeaker, in the middle, at the right loudspeaker) as a function of the time difference of the loudspeaker signals (left earlier / right earlier) and of the level difference of the loudspeaker signals (left louder / right louder)]
Time and level differences can be combined with each other
=> Mixed stereophony, near-coincident pairs (e.g. ORTF)
Summing Localization
Sweet spot
[Figure: listener and loudspeakers forming an equilateral triangle, with 60° at each corner]
• The previous information is only valid if the listener is at
the sweet spot (i.e. the listener and the loudspeakers form an
equilateral triangle)
• If this is not the case, extra time and level differences are
introduced and the phantom source moves towards the closer loudspeaker
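The extra time difference outside the sweet spot follows directly from the geometry. The sketch below treats the listener as a single point and uses an assumed loudspeaker spacing of 2 m and a speed of sound of 343 m/s; the function names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at about 20 °C


def sweet_spot(speaker_spacing_m: float = 2.0):
    """Listener position (x, y) completing the equilateral triangle.

    The loudspeakers are at (-spacing/2, 0) and (+spacing/2, 0).
    """
    return 0.0, -speaker_spacing_m * math.sqrt(3.0) / 2.0


def extra_time_difference_ms(listener_x_m, listener_y_m, speaker_spacing_m=2.0):
    """Arrival time of the left signal minus that of the right signal, in ms.

    Positive: the right loudspeaker is closer, so its signal arrives first.
    """
    half = speaker_spacing_m / 2.0
    d_left = math.hypot(listener_x_m + half, listener_y_m)
    d_right = math.hypot(listener_x_m - half, listener_y_m)
    return (d_left - d_right) / SPEED_OF_SOUND_M_S * 1000.0


if __name__ == "__main__":
    x0, y0 = sweet_spot()
    for offset in (0.0, 0.1, 0.3, 0.5):
        dt = extra_time_difference_ms(x0 + offset, y0)
        print(f"{offset:.1f} m to the right: {dt:.3f} ms")
```

With these assumptions, moving 30 cm sideways already adds roughly 0.9 ms, a large part of the 1.5 ms range in which summing localization applies.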
Precedence effect
• If the same (or almost the same) signals reach the listener
from different directions with a delay between them, the
perceived event depends on the delay:
1. If the time delay is less than 1.5 ms, summing
localization applies
2. If the time delay is greater than 1.5 ms and the delayed
sound is quiet enough compared with the first, the auditory system
only perceives the direction of the sound arriving
first. The second sound is not perceived as a distinct event
(= precedence effect, or Haas effect), but the sound is
perceived as louder, farther away and/or wider.
3. If the time delay is greater than 1.5 ms and the delayed
sound is loud enough, the auditory system perceives both
sounds distinctly. The second sound is then called an
echo
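The three cases can be summarized as a small decision rule. This is only a sketch: the function name is hypothetical, and the level threshold separating cases 2 and 3 has to be supplied by the caller, since it depends on the delay and on the signal. The course does not say which case applies at exactly 1.5 ms; the code arbitrarily treats that value as outside summing localization.

```python
SUMMING_LOCALIZATION_LIMIT_MS = 1.5  # delay limit given in the course


def perceived_event(delay_ms: float, delayed_level_db: float, threshold_db: float) -> str:
    """Classify what is perceived when a delayed copy follows a first sound.

    delayed_level_db: level of the delayed sound relative to the first sound.
    threshold_db: level (relative to the first sound) above which the delayed
        sound is heard as a distinct echo; hypothetical input, supplied by the caller.
    """
    if delay_ms < SUMMING_LOCALIZATION_LIMIT_MS:
        return "summing localization"
    if delayed_level_db < threshold_db:
        return "precedence effect"
    return "echo"


if __name__ == "__main__":
    # The threshold of 0 dB used here is an arbitrary illustrative value
    print(perceived_event(0.5, 0.0, 0.0))
    print(perceived_event(20.0, -6.0, 0.0))
    print(perceived_event(80.0, 3.0, 0.0))
```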
Precedence effect
• The level threshold between the Haas effect and the distinct
perception of an echo is called the echo threshold; it
depends on the time delay and on the sound itself:
• The greater the time delay,
the lower the echo
threshold
• The echo threshold is
lower for sounds with
short, sharp transients
Precedence effect
The precedence effect is used everywhere in music
production:
1. Sound reinforcement: delayed loudspeakers are placed at the back
of the room. The listener does not perceive them distinctly and
still perceives the sound as coming from the stage, only louder.
2. Studio mixing:
– To make a mono recording wider, short delays (approx. 10 ms) can
be applied to it; they are not perceived as echoes. The delays
must be different on the left and right sides to decrease the
correlation (see previous part)
– If the time delay is a bit longer (20-50 ms), more envelopment is
generated
– These are the basics of the perception of reverberation:
[Figure: time structure of a reverberation: direct sound, first reflections (more width/envelopment), late reverberation; late intense reflections cause echoes]
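The widening of a mono recording with short delays can be sketched as follows. This is one possible realization, not necessarily the one intended in the course: each output channel is the dry signal plus one delayed copy, and the delay times (9 and 13 ms) and the gain of the delayed copy (0.5) are hypothetical values chosen around the 10 ms mentioned above.

```python
def widen_mono(mono, sample_rate, left_delay_ms=9.0, right_delay_ms=13.0, delayed_gain=0.5):
    """Return (left, right): dry signal plus a differently delayed copy per side."""
    delays = [round(d * 1e-3 * sample_rate) for d in (left_delay_ms, right_delay_ms)]
    length = len(mono) + max(delays)
    channels = []
    for delay in delays:
        out = [0.0] * length
        for i, sample in enumerate(mono):
            out[i] += sample                         # dry signal
            out[i + delay] += delayed_gain * sample  # delayed copy for this side
        channels.append(out)
    return channels[0], channels[1]


if __name__ == "__main__":
    impulse = [1.0] + [0.0] * 99
    left, right = widen_mono(impulse, sample_rate=1000)
    print([i for i, s in enumerate(left) if s != 0.0])
    print([i for i, s in enumerate(right) if s != 0.0])
```

Because the delayed copies arrive at different times on each side, the two channels are no longer identical, which is what reduces their correlation.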
To go further
• J. Blauert, Spatial Hearing, The Psychophysics of
Human Sound Localization, MIT Press
• F. Rumsey, Spatial Audio, Focal Press
• Eberhard Sengpiel’s website, among others
– http://www.sengpielaudio.com/InterchannelLevelDifferencesAndInterchannelTimeDifferences2.pdf
– http://www.sengpielaudio.com/calculator-localisationcurves.htm