Multizone reproduction of speech soundfields a perceptually weighted approach - final

MULTIZONE REPRODUCTION OF SPEECH SOUNDFIELDS: A PERCEPTUALLY WEIGHTED APPROACH
Jacob Donley and Christian Ritz
School of Electrical, Computer and Telecommunications Engineering
ICT Research Institute & Global Challenges
University of Wollongong

2
How can we perceptually enhance independent listening zones in a room?
[Figure: a room containing loudspeakers, a bright zone and a quiet zone]
• Bright Zone: listening to speech or music
• Quiet Zone: no reproduced sound
Known as Multizone Reproduction of Soundfields

3
Aim: derive loudspeaker signals to reproduce the desired soundfield in each zone
• Reproduced soundfield modelled in the (discrete) space ($\mathbf{x} = (x, y)$), time ($n$) and frequency ($k$) domains as:
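$$S^{w}(\mathbf{x}, n, k) = \sum_{l=1}^{L} d_l(n, k, w)\, \frac{j}{4}\, H_0^{(1)}\!\left(k \lVert \mathbf{x}_l - \mathbf{x} \rVert\right)$$
$H_m^{(1)}$ is the $m$th-order Hankel function of the first kind
$d_l(n, k, w)$ are the loudspeaker signals to be derived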
[1] J. Donley and C. Ritz, "An efficient approach to dynamically weighted multizone wideband reproduction of speech soundfields," Proc. IEEE ChinaSIP 2015, pp. 60-64, 12-15 July 2015.
[2] W. Jin, W. B. Kleijn, and D. Virette, "Multizone soundfield reproduction using orthogonal basis expansion," Proc. IEEE ICASSP 2013, pp. 311-315.
Solution is based on a weighted orthogonal basis expansion approach [1,2]
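A minimal sketch of the soundfield model, assuming that $k$ acts as a wavenumber in rad/m and that the observation point is many wavelengths from every loudspeaker. The Hankel function is replaced here by its standard large-argument approximation so that the example needs only the standard library; the function names are hypothetical and this is not the authors' implementation.

```python
import cmath
import math

def hankel0_first_kind_far_field(z):
    """Large-argument approximation of H_0^(1)(z); only reasonable for z >> 1."""
    return math.sqrt(2.0 / (math.pi * z)) * cmath.exp(1j * (z - math.pi / 4.0))

def reproduced_soundfield(point, speaker_positions, speaker_signals, wavenumber):
    """Superpose the 2-D contributions of all loudspeakers at one point.

    point             -- (x, y) observation position in metres
    speaker_positions -- list of (x, y) loudspeaker positions in metres
    speaker_signals   -- list of complex loudspeaker signals d_l for one (n, k)
    wavenumber        -- assumed meaning of k, in rad/m
    """
    total = 0j
    for (xl, yl), d_l in zip(speaker_positions, speaker_signals):
        distance = math.hypot(xl - point[0], yl - point[1])
        green = (1j / 4.0) * hankel0_first_kind_far_field(wavenumber * distance)
        total += d_l * green
    return total

# Example: two loudspeakers 1.5 m from the origin, driven in phase.
field = reproduced_soundfield((0.0, 0.0), [(1.5, 0.0), (-1.5, 0.0)], [1.0, 1.0], 20.0)
print(abs(field))
```

An exact evaluation would use a Bessel-function library instead of the far-field approximation.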

4
http://bit.ly/WeightedMultizone
Weighting method controls leakage into the quiet zone at the cost of quality in the bright zone
• Multizone Occlusion problem:
  • Quiet zone in line with the desired bright zone
  • Difficult to control leakage
• Trade-off:
  • Quality in the bright zone vs quietness in the quiet zone
[Figure: reproduced soundfields for a small weight and a large weight; the weighted actual soundfield function is discrete in space, time and frequency]
How quiet does the quiet zone need to be?
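The trade-off controlled by the weight can be illustrated with a toy problem. This is not the orthogonal basis expansion of [1,2]: it assumes a single loudspeaker gain g, one transfer value to the bright zone and one to the quiet zone (all hypothetical), and minimises |h_bright g - 1|^2 + w |h_quiet g|^2 in closed form.

```python
def weighted_gain(h_bright, h_quiet, weight):
    """Closed-form minimiser of |h_bright*g - 1|^2 + weight*|h_quiet*g|^2."""
    return h_bright.conjugate() / (abs(h_bright) ** 2 + weight * abs(h_quiet) ** 2)

def bright_error_and_leakage(h_bright, h_quiet, weight):
    """Return (bright zone squared error, quiet zone leaked power) for one weight."""
    g = weighted_gain(h_bright, h_quiet, weight)
    bright_error = abs(h_bright * g - 1.0) ** 2
    leakage = abs(h_quiet * g) ** 2
    return bright_error, leakage

# Hypothetical transfer values: the quiet zone receives half the bright zone amplitude.
for w in (0.1, 1.0, 10.0):
    err, leak = bright_error_and_leakage(1.0 + 0j, 0.5 + 0j, w)
    print(f"weight={w:5.1f}  bright error={err:.4f}  leakage={leak:.4f}")
```

In this toy case a larger weight lowers the leakage and raises the bright zone error, which is the behaviour the slide describes.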

5
Case 1: The Hearing Threshold
• Only need to suppress leakage in the quiet zone down to the threshold in quiet
• Possible only if the acoustic contrast between zones is large enough
[Figure: Case 1, the hearing threshold; plotted signal: speech]
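A small sketch of Case 1, assuming Terhardt's widely used analytic approximation of the threshold in quiet (the slides do not say which threshold model was used). The helper names and the 60 dB SPL example level are hypothetical.

```python
import math

def threshold_in_quiet_db_spl(frequency_hz):
    """Terhardt's approximation of the absolute hearing threshold in dB SPL."""
    f_khz = frequency_hz / 1000.0
    return (3.64 * f_khz ** -0.8
            - 6.5 * math.exp(-0.6 * (f_khz - 3.3) ** 2)
            + 1e-3 * f_khz ** 4)

def leakage_is_inaudible(leaked_spl_db, frequency_hz):
    """True if the leaked level sits at or below the threshold in quiet."""
    return leaked_spl_db <= threshold_in_quiet_db_spl(frequency_hz)

def required_contrast_db(bright_spl_db, frequency_hz):
    """Acoustic contrast needed to push bright zone sound below the threshold."""
    return bright_spl_db - threshold_in_quiet_db_spl(frequency_hz)

# Hypothetical bright zone level of 60 dB SPL at 1 kHz.
print(round(threshold_in_quiet_db_spl(1000.0), 2))
print(round(required_contrast_db(60.0, 1000.0), 2))
```

The required contrast grows with the bright zone level, which is why suppressing down to the threshold in quiet is only possible when the achievable contrast is large.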

6
Case 2: Spreading functions corresponding to a local masking signal
• Key idea: a masker in the quiet zone perceptually hides surrounding frequency components leaked from the bright zone
• Benefit: less control via weighting is needed, which improves bright zone quality
[Figure: speech and a 2 kHz masker, with three levels marked]
• Max. SPL: small weight, high bright zone quality
• Min. SPL: large weight, low bright zone quality
• Leaked SPL: masker allowed to remain in the quiet zone
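A sketch of how a masker raises the allowed leakage level, assuming the Schroeder spreading function on the Bark scale (one common choice in perceptual audio coding; the slides do not name the exact function). The 10 dB masker-to-threshold offset and the 70 dB SPL masker level are placeholder assumptions.

```python
import math

def hz_to_bark(frequency_hz):
    """Zwicker's approximation of the critical-band rate in Bark."""
    return (13.0 * math.atan(0.00076 * frequency_hz)
            + 3.5 * math.atan((frequency_hz / 7500.0) ** 2))

def schroeder_spreading_db(delta_bark):
    """Schroeder spreading function in dB; delta_bark = probe minus masker."""
    shifted = delta_bark + 0.474
    return 15.81 + 7.5 * shifted - 17.5 * math.sqrt(1.0 + shifted ** 2)

def masking_threshold_db(masker_hz, masker_spl_db, probe_hz, offset_db=10.0):
    """Level below which a component at probe_hz is hidden by the masker.

    offset_db is an assumed masker-to-threshold offset, not a value from the slides.
    """
    delta = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    return masker_spl_db + schroeder_spreading_db(delta) - offset_db

# Hypothetical 2 kHz masker at 70 dB SPL, probed at neighbouring frequencies.
for probe in (1500.0, 2000.0, 2500.0):
    print(probe, round(masking_threshold_db(2000.0, 70.0, probe), 1))
```

The function is asymmetric: masking extends further towards frequencies above the masker than below it.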

7
Considering masking reduces spatial error in the bright zone and SPL in the quiet zone
Benefit: perceptually optimised trade-off between quality and leakage
• Weights chosen by comparing reproduced speech with spreading functions
[Figure: speech compared with the spreading function and hearing threshold; reduction in spatial error $\epsilon_b(n, k)$]
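One way to picture the weight choice is a per-bin lookup. This is a sketch under stated assumptions, not the authors' procedure: it assumes that for each time-frequency bin a table of candidate weights and the leaked SPL each would produce is available, and that the allowed level is the larger of the hearing threshold and the masking threshold. All names and numbers are hypothetical.

```python
def allowed_leakage_db(hearing_threshold_db, masking_threshold_db):
    """Leakage is hidden if it is below either threshold, so take the larger."""
    return max(hearing_threshold_db, masking_threshold_db)

def select_weight(candidates, allowed_spl_db):
    """Pick the smallest weight whose leaked SPL stays within the allowed level.

    candidates -- list of (weight, leaked_spl_db) pairs for one bin
    Falls back to the largest weight if no candidate is quiet enough.
    """
    feasible = [weight for weight, spl in candidates if spl <= allowed_spl_db]
    if feasible:
        return min(feasible)
    return max(weight for weight, _ in candidates)

# Hypothetical bin: leakage falls as the weight grows.
table = [(0.1, 30.0), (1.0, 18.0), (10.0, 5.0)]
print(select_weight(table, allowed_leakage_db(3.4, 20.0)))  # masker present
print(select_weight(table, allowed_leakage_db(3.4, -100.0)))  # no masker
```

With the masker present a smaller weight suffices, which is where the bright zone quality gain comes from.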

8
Experimental evaluation to validate the proposed perceptual approach
Multizone Setup:
• Full circle of 65 loudspeakers
• Loudspeaker array diameter: 3 m
• Zone diameters: 60 cm (enough space for a human head)
• Zone centres are 1.2 m apart
• Reproduction capable of wideband speech
• Direction of speech causes the Multizone Occlusion Problem ($\theta \approx 15°$)
Figure legend: hearing threshold and spreading function (as used in audio coding standards)
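The array geometry listed above can be sketched as follows. The loudspeaker count, array diameter, zone diameter and centre spacing come from the slide; uniform angular spacing and zone centres placed symmetrically about the array centre on the x-axis are assumptions.

```python
import math

def circular_array_positions(count=65, diameter_m=3.0):
    """Loudspeaker (x, y) positions, assumed equally spaced on a full circle."""
    radius = diameter_m / 2.0
    return [(radius * math.cos(2.0 * math.pi * i / count),
             radius * math.sin(2.0 * math.pi * i / count))
            for i in range(count)]

def zone_centres(separation_m=1.2):
    """Bright and quiet zone centres, assumed symmetric about the origin."""
    half = separation_m / 2.0
    return (-half, 0.0), (half, 0.0)

def zones_fit_inside_array(separation_m=1.2, zone_diameter_m=0.6, array_diameter_m=3.0):
    """Check that the outer edge of each zone lies inside the loudspeaker circle."""
    outer_edge = separation_m / 2.0 + zone_diameter_m / 2.0
    return outer_edge < array_diameter_m / 2.0

speakers = circular_array_positions()
print(len(speakers), zone_centres(), zones_fit_inside_array())
```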

9
Reduced bright zone error from psychoacoustic masking
Mean Squared Error (MSE):
$$\mathrm{MSE} = \frac{1}{M} \sum_{n=1}^{M} \left| Y^{w}(n) - Y(n) \right|^{2}$$
• 10 dB improvement in MSE
• Still high quality speech in the bright zone
[Figure: no masking (large weight) vs with masking (variable weight)]
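The MSE on this slide can be computed as in the sketch below, assuming the reproduced and desired bright zone signals are available as equal-length sample sequences and that results are reported as 10 log10 of the MSE. The sample values are made up for illustration.

```python
import math

def mean_squared_error(reproduced, desired):
    """Average of |reproduced[n] - desired[n]|^2 over all M samples."""
    if len(reproduced) != len(desired) or not desired:
        raise ValueError("signals must be non-empty and of equal length")
    total = sum(abs(r - d) ** 2 for r, d in zip(reproduced, desired))
    return total / len(desired)

def power_to_db(value):
    """Express a power-like quantity (such as the MSE) in dB."""
    return 10.0 * math.log10(value)

# Made-up signals: a small error on every sample.
desired = [0.5, -0.25, 0.125, 0.0]
reproduced = [0.501, -0.251, 0.126, 0.001]
print(round(power_to_db(mean_squared_error(reproduced, desired)), 1))
```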

10
Reduced bright zone spatial error from psychoacoustic masking
[Figure panels: magnitude difference (A, B); phase difference (C, D)]
• Maximum spatial error reduction: 28 dB
• Consequence of smaller weighting: less loudspeaker power (max. reduction = 65%)
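For comparison with the dB figures elsewhere in the deck, the percentage power reduction can be converted to decibels. This is a short arithmetic sketch that assumes the 65% figure refers to loudspeaker power rather than amplitude.

```python
import math

def power_reduction_db(fraction_removed):
    """Change in dB when a fraction of the original power is removed."""
    if not 0.0 <= fraction_removed < 1.0:
        raise ValueError("fraction_removed must be in [0, 1)")
    return 10.0 * math.log10(1.0 - fraction_removed)

# A 65% reduction leaves 35% of the original power.
print(round(power_reduction_db(0.65), 2))  # about -4.56 dB
```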

11
Conclusion: Exploiting perceptual weighting within multizone soundfield reproduction results in significant advantages
• Improved error in bright zones with no perceptual cost in adjacent zones
• MSE of speech: -69.8 dB to -80.3 dB (max)
• Spatial error: -7.4 dB to -31.5 dB (max)
• Reduced loudspeaker power (up to 65%)
• Improved reproduction when the occlusion problem is present
Questions?
