
On the robustness of upper limits for circular auditory motion perception

  • Original Paper
Journal on Multimodal User Interfaces

Abstract

We developed a new rendering system that consists of an image-source reflection model followed by a two-step strategy for modeling moving phantom sound sources (combining a physical propagation model and a spatialization method). This rendering system was used for simulations of pressure signals and acoustical localization cues, and for stimulus generation. We report three perceptual experiments measuring the upper limits, defined as the auditory velocity thresholds beyond which listeners are no longer able to perceptually resolve a smooth circular trajectory. These thresholds were measured in different reverberation conditions using white noise, band-limited white noise and band-limited white noise with a pure tone. Experiment 1 took place in a hemi-anechoic room and compared two conditions: dry and with simulated first-order reflections. Experiment 2 compared the thresholds measured in the hemi-anechoic room (dry and with simulated second-order reflections) and in a reverberant room using two different loudspeaker configurations. Experiment 3 investigated the effect of audio source type in the dry condition at high velocities. No significant effects of reverberation condition, loudspeaker configuration or audio source type were observed, suggesting that the upper limit is robust against reverberation.


References

  1. Devore S, Ihlefeld A, Hancock K, Shinn-Cunningham B, Delgutte B (2009) Accurate sound localization in reverberant environments is mediated by robust encoding of spatial cues in the auditory midbrain. Neuron 62(1):123–134

  2. Giguère C, Abel SM (1993) Sound localization: effects of reverberation time, speaker array, stimulus frequency and stimulus rise/decay. J Acoust Soc Am 94(2):769–776

  3. Rakerd B, Hartmann WM (2005) Localization of noise in a reverberant environment. In: Auditory signal processing: physiology, psychophysics, and models. Springer, Berlin, pp 348–354

  4. Zahorik P (2002) Assessing auditory distance perception using virtual acoustics. J Acoust Soc Am 111(4):1832–1846

  5. Theile G (1980) On the localization in the superimposed soundfield. PhD thesis, Technische Universität Berlin

  6. Féron FX, Frissen I, Boissinot J, Guastavino C (2010) Upper limits of auditory rotational motion perception. J Acoust Soc Am 128(6):3703–3714

  7. Frissen I, Féron FX, Guastavino C (2014) Auditory velocity discrimination in the horizontal plane at very high velocities. Hearing Res 316:94–101

  8. Camier C, Féron FX, Boissinot J, Guastavino C (2015) Tracking moving sounds: perception of spatial figures. In: Ext. abstracts 21st int. conf. on auditory display (ICAD 2015), Graz, 6–10 July 2015

  9. Romblom D, Depalle P, Guastavino C, King R (2016) Diffuse field modeling using physically-inspired decorrelation and B-format microphones: Part 1: Algorithm. J Audio Eng Soc 64(4):177–193

  10. Pulkki V (1997) Virtual sound source positioning using vector base amplitude panning. J Audio Eng Soc 45(6):456–466

  11. Rabenstein R, Spors S (2007) Multichannel sound-field reproduction. In: Springer Handbook on Speech Processing and Speech Communication. Springer, Berlin

  12. Frank M (2013) Phantom sources using multiple loudspeakers in the horizontal plane. PhD thesis, University of Music and Performing Arts, Graz, Austria

  13. Franck A, Gräfe A, Korn T, Strauss M (2007) Reproduction of moving sound sources by wave field synthesis: an analysis of artifacts. In: Proc. 32nd int. conf. AES, Hillerod, Denmark, pp 188–196

  14. Ahrens J, Spors S (2008) Reproduction of moving virtual sound sources with a special attention to the Doppler effect. In: Proc. 124th conv. AES, Amsterdam, Netherlands, 17–20 May 2008

  15. Lee P, Wang J (2009) The simulation of binaural hearing caused by a moving sound source. Comput Struct 87:1102–1110

  16. Ffowcs Williams JE, Hawkings DL (1969) Sound generation by turbulence and surfaces in arbitrary motion. Philosophical Transactions of the Royal Society of London. Series A, Mathematical and Physical Sciences 264(1151):321–342

  17. Andéol G, Savel S, Guillaume A (2015) Perceptual factors contribute more than acoustical factors to sound localization abilities with virtual sources. Front Neurosci 8:451. doi:10.3389/fnins.2014.00451

  18. Camier C, Blais JF, Lapointe R, Berry A (2012) A time-domain analysis of 3D non-uniform moving acoustic sources: application to source identification and absolute quantification via beamforming. In: Proc. 30th BEBEC Berlin Beamforming Conference (BeBeC 2012), 22–23 Feb 2012, Berlin, Germany

  19. Delhommeau K, Micheyl C, Jouvent R (2005) Generalization of frequency discrimination learning across frequencies and ears: implications for underlying neural mechanisms in humans. J Assoc Res Oto 6(2):171–179

  20. Aschoff V (1962) Über das räumliche Hören (on spatial hearing). Arbeitsgem Forsch Landes Nordrh Westfal 138:7–38

  21. Blauert J (1997) Spatial hearing: the psychophysics of human sound localization. MIT Press, Cambridge

  22. Lafarge (2009) Lafarge plâtres commercialisation: memento. Technical report

  23. URSA (2012) Isolation acoustique des cloisons. Ursa, France. http://www.espace-homega.com/assets/dyn-files/d_files/ursa-34r. Accessed Jan 2012

  24. Thomas M, MLS Matlab source codes. http://www.commsp.ee.ic.ac.uk/~mrt102/projects/mls.html. Accessed 25 Feb 2014

  25. Suzuki Y, Asano F, Kim HY, Sone T (1995) An optimum computer-generated pulse signal suitable for the measurement of very long impulse responses. J Acoust Soc Am 97(2):1119–1123

  26. Morse PM, Ingard KU (1987) Theoretical acoustics. Chap. 11. Princeton University Press, Princeton

  27. De Hoop AT (2009) Electromagnetic radiation from moving, pulsed source distributions: the 3D time-domain relativistic Doppler effect. Wave Motion 46:74–77

  28. Quinn B, Fernandes J (1991) A fast technique for the estimation of frequency. Biometrika 78(3):489–497


Acknowledgments

This research was supported by an NSERC Grant (RGPIN 327392-13) and a William Dawson award (McGill University) to C. Guastavino. The authors would like to thank Ilja Frissen and Daniel Steele for their assistance, and François-Xavier Féron and David Romblom for relevant discussions.

Author information


Corresponding author

Correspondence to Cédric Camier.

Appendix: Rendering strategy for moving sources


1.1 A two-step strategy

Figure 2 describes the two-step strategy used to simulate the acoustical emission of moving acoustical monopoles through a circular loudspeaker array toward a listener positioned at the center. The first step uses a numerical implementation of the wave-field equations derived for an acoustical source moving along an arbitrary trajectory. This propagation model is used to compute the acoustical pressure received from the source at the intersection between the circle \(\mathscr {C}\), co-located with the loudspeaker array, and the segment joining the circle center and the source location. The second step consists of a modified VBAP algorithm that renders the source projection on \(\mathscr {C}\) for a listener located at the center of the loudspeaker array. This strategy allows for a faithful reproduction of the acoustical localization cues and of the complete Doppler effect in the vicinity of the array center, combining robustness (no filtering artifacts due to motion) with the computational efficiency of the VBAP algorithm. Numerical simulations of acoustical localization cues for centered and off-center circular trajectories at high velocity are provided. For the sake of concision, only the major features of the strategy are given here.

1.2 The propagation model

1.2.1 Taking the Doppler Effect into account

In Camier et al. [18], the complete vector-based solution of the wave field generated by a point source in arbitrary motion was derived. This work extends the equations developed by Morse and Ingard [26] from the case of rectilinear motion to the case of arbitrary motion. When the source motion is neither rectilinear nor uniform, the Lorentz transformation cannot be applied because the reference frame of the source is no longer Galilean. In order to derive the velocity potential solution, and consequently the pressure wave field due to a moving acoustic source, the authors used the Green formulation, starting from the wave equation written for the acoustic velocity potential. The developments were inspired by De Hoop [27], who solved the Doppler problem in the general case of a wave equation including a wave function excitation. Applied to acoustics, the wave function derived in [27] should drive the velocity potential source term and not the sound pressure source term. This distinction is important because the resulting time–pressure solution differs in the case of moving sources. The formulation proposed by Camier et al. [18] differs from the Ffowcs Williams–Hawkings equations [16] in that the expression of the pressure is expanded in vector form. This formulation is particularly well suited to numerical implementations, especially those dealing with geometrical trajectories at uniform velocities.

Given the mass density \(\rho \) (in kg \(\hbox {m}^{-3}\)), the acoustic wave celerity c (in m \(\hbox {s}^{-1}\)) and the geometry presented in Fig. 10 for a fixed observation point \(\mathbf{O }\), Eq. (1) gives the acoustic pressure \(p(\mathbf{O },t)\) (in Pa), in the time reference t of the observation point, generated by a radiating acoustic monopole described by a source-distribution density time function q(t) (in kg \(\hbox {s}^{-1}\)). Vector quantities are indicated in bold. Position, velocity and acceleration vectors are denoted \(\mathbf{R }(t)\), \(\mathbf{V }(t)\) and \(\mathbf{A }(t)\) respectively, and the superscript \(^+\) refers to the related quantity evaluated at the retarded time \(t-R^+(t)/c\). This definition is recursive because the propagation problem and the delay solution are both motion-dependent. For known geometrical trajectories, the recursive term can be replaced by the corresponding geometrical formulation.

Fig. 10 Geometry of the source moving along an arbitrary trajectory with respect to a fixed observation point

In the case of a circular trajectory, two observations should be noted. First, for a source revolving around the listening position, all velocity contribution factors in the terms \(p_1\) and \(p_2\) of Eq. (1) are equal to zero, showing that no Doppler effect can be observed at the very center of a circular trajectory. Second, given the frequency range of the audio stimuli (above 100 Hz), the range of velocities tested here (below 5 rot\(\cdot \hbox {s}^{-1}\)) and the trajectories of the mirror-image reflections, the second term \(p_2\) is limited to \(-42\) dB relative to the first term \(p_1\). Therefore, \(p_2\) was neglected in our implementations.

$$\begin{aligned} p(\mathbf {O},t) ={}&\underbrace{\frac{q'\big (t-R^+(t)/c\big )\, R^+(t)}{4\pi \Big ( R^+(t)- \frac{1}{c} \mathbf{V}^+(t)\cdot \mathbf{R}^+(t)\Big )^2}}_{p_1} \\&+\underbrace{\frac{q\big (t-R^+(t)/c\big )\, R^+(t)}{4\pi \Big ( R^+(t)- \frac{1}{c} \mathbf{V}^+(t)\cdot \mathbf{R}^+(t)\Big )^3} \left( \frac{\mathbf{V}^+(t)\cdot \mathbf{R}^+(t)}{R^+(t)} + \frac{\mathbf{A}^+(t) \cdot \mathbf{R}^+(t)}{c} - \frac{{V^+(t)}^2}{c} \right) }_{p_2} \end{aligned}$$
(1)
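The recursive definition of the retarded distance \(R^+(t)\) and the first term \(p_1\) of Eq. (1) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes a speed of sound of 343 m \(\hbox {s}^{-1}\), a sinusoidal source function \(q(t)=\sin (2\pi f t)\), a circular trajectory, and \(\mathbf{R }\) taken as the vector from the source to the observation point. The function names are hypothetical.

```python
import math

C_SOUND = 343.0  # assumed speed of sound (m/s); the text does not state a value


def source_state(t, radius, rot_per_s, center=(0.0, 0.0)):
    """Position and velocity of a source revolving counterclockwise on a circle."""
    w = 2.0 * math.pi * rot_per_s
    pos = (center[0] + radius * math.cos(w * t),
           center[1] + radius * math.sin(w * t))
    vel = (-radius * w * math.sin(w * t), radius * w * math.cos(w * t))
    return pos, vel


def retarded_distance(t, obs, radius, rot_per_s, center=(0.0, 0.0), tol=1e-12):
    """Solve R+ = |O - S(t - R+/c)| by fixed-point iteration (subsonic source)."""
    r = math.dist(obs, source_state(t, radius, rot_per_s, center)[0])
    for _ in range(200):
        pos = source_state(t - r / C_SOUND, radius, rot_per_s, center)[0]
        r_new = math.dist(obs, pos)
        if abs(r_new - r) < tol:
            return r_new
        r = r_new
    return r


def p1(t, obs, radius, rot_per_s, freq, center=(0.0, 0.0)):
    """First term of Eq. (1) for q(t) = sin(2*pi*freq*t), with R taken as O - S."""
    r = retarded_distance(t, obs, radius, rot_per_s, center)
    t_e = t - r / C_SOUND  # emission (retarded) time
    pos, vel = source_state(t_e, radius, rot_per_s, center)
    v_dot_r = vel[0] * (obs[0] - pos[0]) + vel[1] * (obs[1] - pos[1])
    dq = 2.0 * math.pi * freq * math.cos(2.0 * math.pi * freq * t_e)
    return dq * r / (4.0 * math.pi * (r - v_dot_r / C_SOUND) ** 2)
```

For an observation point at the center of a centered trajectory, the retarded distance stays equal to the radius and the product \(\mathbf{V }^+\cdot \mathbf{R }^+\) vanishes, which is the absence of Doppler effect noted above. For an off-center trajectory the iteration converges because the source is subsonic.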

1.2.2 Source projection onto the reproduction circle \(\mathscr {C}\)

Given the intersection \(\mathbf{C }(t)\) between the circle \(\mathscr {C} \) and the acoustical path, we compare the pressure \(p_{O}\) propagated to \(\mathbf{O }\) from the source \(\mathbf{S }(t)\) with the pressure propagated to \(\mathbf{O }\) from a projected source placed at \(\mathbf{C }(t)\), whose emitted pressure \(p_C\) is driven by the pressure propagated from the moving source \(\mathbf{S }(t)\).

By neglecting \(p_2\) in Eq. (1), the pressure \(p_{O}=p(\mathbf{O },t)\) generated by \(\mathbf{S }(t)\) and propagated to \(\mathbf{O }\) reads:

$$\begin{aligned} p_{O}=\frac{\displaystyle {q'\big (t-R^+_{S \rightarrow O}(t) /c\big ) R^+_{S \rightarrow O}(t)}}{ \displaystyle {4\pi \bigg (\! R^+_{S \rightarrow O}(t) \!-\! \frac{1}{c}{} \mathbf{V}(t-R^+_{S \rightarrow O}(t) /c)\cdot \mathbf {R}^+_{S \rightarrow O}(t) \!\bigg )^2}}. \end{aligned}$$
(2)

For any source outside \(\mathscr {C}\), introducing the constant radius \(R_C\) of \(\mathscr {C}\), one can write:

$$\begin{aligned} R^+_{S \rightarrow O}(t) = R^+_{S \rightarrow C}(t) + R_C. \end{aligned}$$
(3)

Transferring the constant time delay to the source, Eq. (2) becomes:

$$\begin{aligned}&p\bigg (\mathbf {O},t\!+\!\displaystyle {\frac{R_C}{c}}\bigg )\nonumber \\&\quad =\frac{\displaystyle {q'\big (t\!-\!R^+_{S \rightarrow C}(t) /c\big )\bigg ( R^+_{S \rightarrow C}(t)\!+\!R_C\bigg )}}{ \displaystyle {4\pi \bigg ( R^+_{S \rightarrow C}(t) + R_C \!-\! \frac{1}{c}{} \mathbf{V}(t-R^+_{S \rightarrow C}(t) /c)\cdot \mathbf {R}^+_{S \rightarrow C}(t) \bigg )^2}}.\nonumber \\ \end{aligned}$$
(4)

On the other hand, by developing the pressure \(p_C\) generated by \(\mathbf{S }(t)\) and propagated in \(\mathbf{C }(t)\), and then comparing to Eq. (4), one verifies:

$$\begin{aligned} p\bigg (\mathbf {O},t+\frac{R_C}{c} \bigg )=\displaystyle {\frac{R^+_{S \rightarrow C}}{R^+_{S \rightarrow O}}} p_C. \end{aligned}$$
(5)

Therefore, the acoustical pressure generated by the projected source \(\mathbf{C }(t)\) and propagated to the center is equal to the pressure generated by the source \(\mathbf{S }(t)\) and propagated to the center, for any arbitrary trajectory outside \(\mathscr {C}\), except for a fixed delay. All the components of distance variations and of the Doppler effect are then exactly reproduced at the center. The situation is slightly different at the listener’s ears, which are not located at the center: the projection introduces interaural errors and a frequency shift. Nevertheless, these errors are small. The parametric description is not provided here, but the errors made through this step, added to those made through the second step, are illustrated in the numerical simulation results presented below.
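The geometric relation of Eq. (3) can be checked with a few lines of code. This is a minimal sketch assuming a 2-D frame with \(\mathbf{O }\) at the origin; the function name `project_on_circle` and the sample source position are hypothetical.

```python
import math


def project_on_circle(source, r_c):
    """Intersection C of the reproduction circle (radius r_c, centered at O)
    with the segment joining the center O and the source S."""
    d = math.hypot(source[0], source[1])
    if d < r_c:
        raise ValueError("source must lie on or outside the reproduction circle")
    scale = r_c / d
    return (source[0] * scale, source[1] * scale)


R_C = 1.85  # radius of the loudspeaker array (m), from the text

# Sample source position on an off-center circle (illustrative values).
src = (8.0 + 1.85 * math.cos(1.0), 1.85 * math.sin(1.0))
c_pt = project_on_circle(src, R_C)

r_so = math.hypot(src[0], src[1])  # distance source -> center
r_sc = math.dist(src, c_pt)        # distance source -> projected point
```

Because \(\mathbf{C }\) lies on the segment from \(\mathbf{O }\) to \(\mathbf{S }\), `r_so` equals `r_sc + R_C` for any source outside the circle, so the path from \(\mathbf{C }\) to the center only adds the constant delay \(R_C/c\).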

1.3 Spatialization algorithm

A modified VBAP algorithm is used as a second step to render the projected source \(\mathbf{C }(t)\). For our purpose, the VBAP algorithm [10] was restricted to the horizontal plane (no height). The cross gain function between pairs of loudspeakers was modified, in order to guarantee smooth angle variations at very high velocities. Intensity gains \(g_1\) and \(g_2\) are distributed across a symmetrical sigmoid function defined in Eq. (6) with respect to the normalized angle \(\theta (t)\) between the two closest speakers to \(\mathbf{C }(t)\). A smooth cross-over defined by \(\alpha =2.85\) has been chosen by ear.

$$\begin{aligned} g_1 (t)&= \frac{1}{1+\exp \big (-\alpha \cdot (\theta (t) - 0.5 )\big )} \end{aligned}$$
(6a)
$$\begin{aligned} g_2 (t)&=\big (1-g_1(\theta (t))\big )/\sqrt{(g_1(\theta (t))^2+g_2(\theta (t))^2)}. \end{aligned}$$
(6b)

\(p_{R}\) is then built as a linear combination of the previous terms:

$$\begin{aligned} p_{R}(t)=g_1(t) \cdot p_C(t) + g_2(t) \cdot p_C(t). \end{aligned}$$
(7)
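The sigmoid cross-over can be sketched as follows. Since Eq. (6b) as printed refers to \(g_2\) on both sides, the sketch below relies on one possible reading, stated here as an assumption: the raw weights are the sigmoid \(s\) of Eq. (6a) and its complement \(1-s\), and both are divided by \(\sqrt{s^2+(1-s)^2}\) so that the pair has unit total power. The function name is hypothetical.

```python
import math

ALPHA = 2.85  # cross-over steepness reported in the text


def crossover_gains(theta, alpha=ALPHA):
    """Gains for the two loudspeakers closest to C(t).

    theta is the normalized angle in [0, 1] between the two loudspeakers.
    Assumption: power normalization of the pair (s, 1 - s); this is an
    interpretation of Eq. (6), not a confirmed implementation.
    """
    s = 1.0 / (1.0 + math.exp(-alpha * (theta - 0.5)))
    norm = math.hypot(s, 1.0 - s)
    return s / norm, (1.0 - s) / norm


g1_mid, g2_mid = crossover_gains(0.5)    # equal gains halfway between speakers
g1_end, g2_end = crossover_gains(0.0)    # both speakers still active at theta = 0
```

With \(\alpha =2.85\) the sigmoid does not reach 0 or 1 at the ends of the interval, so under this reading both loudspeakers of the pair remain active even when the projected source coincides with one of them, which is consistent with the smooth cross-over described above.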

Due to the selected VBAP method, time difference errors may occur in addition to those due to the projection: the loudspeakers generate \(p_{R}\) per pair, producing a wave front equally balanced in terms of delay between the two loudspeakers. Additional level difference errors may also occur because of the selected cross-over function. The method and the cross-over function were nevertheless selected to minimize the artifacts that the authors experienced when testing various spatialization algorithms.

1.4 Numerical simulations of high-velocity circular trajectories

In order to evaluate the errors resulting from the rendering strategy compared to a direct acoustical path, we present acoustical localization cues simulated for the two trajectories depicted in Fig. 11a, b. These simulations illustrate how well the presented method reproduces the dynamic sound field produced by a moving virtual monopole in the vicinity of the center of a circular loudspeaker array, in terms of time difference, sound pressure level difference and instantaneous frequency (IF). It should be noted that head-related transfer functions should be considered in future research to relate these differences to binaural localization cues. Trajectory (a) is co-located with the circle of reproduction \(\mathscr {C}\) and with the 1.85-m radius circular loudspeaker array. Trajectory (b) follows a circle of the same radius and is 8 m off-center. This case exemplifies the mirror-image reflection from a wall situated 4 m in front of the listener. Furthermore, to clearly establish the relation between the peculiarity of circular motion and its effect on time and level differences and on IF, we illustrate it with a monochromatic scenario: both cases involve a 500-Hz monopole completing 2 cycles at a velocity of 5 rot\(\cdot \hbox {s}^{-1}\). This high velocity value was chosen to enhance the errors potentially due to motion.

Fig. 11 Geometries of the two trajectory cases a and b
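The 8-m offset of trajectory (b) follows from the image-source construction: mirroring the centered trajectory across a wall 4 m away doubles the distance. A minimal sketch, assuming a 2-D frame with the listener at the origin and the wall on the plane \(x=4\) m (the orientation of the frame and the function name are assumptions):

```python
import math

WALL_X = 4.0   # wall 4 m in front of the listener (assumed to be on the +x side)
RADIUS = 1.85  # trajectory radius (m), from the text


def mirror_across_wall(point, wall_x=WALL_X):
    """Image of a point with respect to the plane x = wall_x."""
    return (2.0 * wall_x - point[0], point[1])


image_center = mirror_across_wall((0.0, 0.0))

# Any point of the centered trajectory maps onto a circle of the same radius
# around the image center.
source = (RADIUS * math.cos(0.7), RADIUS * math.sin(0.7))
image = mirror_across_wall(source)
image_radius = math.dist(image, image_center)
```

The image trajectory is a circle of the same radius centered 8 m from the listener, traversed in the opposite rotational sense.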

Figure 12 displays acoustical localization cue simulations for cases (a) and (b), in gray and black respectively. These cues are computed on the basis of acoustical signals propagated to two “aural positions” (\(\pm 0.1\) m from the center). The Direct results refer to the propagation from the moving source to the aural positions, according to Eq. (4). The Reprod results refer to the addition of the loudspeaker signals computed with the two-step rendering strategy described in the previous section, and propagated to the aural points.

Fig. 12 Simulated acoustical localization cues for case a and case b, in gray and black respectively

Time difference variations are computed with the cross-correlation between left and right signals (a positive time difference corresponds to a positive delay of the right signal compared to the left one). As observed in Fig. 12-(upper middle), results from the proposed rendering strategy and from the direct propagation are very similar. The time difference variations of case (a) fall within a \([-0.58,\ 0.58]\) ms range, corresponding to the 0.2-m distance between the aural positions. The observed sinusoidal temporal evolution, starting from 0, corresponds to the circular motion of the source revolving counterclockwise and starting from a frontal position, as depicted in Fig. 11a. The delay between the Direct and Reprod results is due to the fact that the VBAP method does not compensate for the phase delay per loudspeaker location in the wave front reproduction.
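The \(\pm 0.58\) ms range can be cross-checked from the geometry alone. The sketch below is a quasi-static approximation (path-length difference only, no cross-correlation and no retarded-time effects) and assumes a speed of sound of 343 m \(\hbox {s}^{-1}\) and a frame in which the listener faces \(+x\) with the left aural position at \(+y\); these choices are not taken from the text.

```python
import math

C_SOUND = 343.0   # assumed speed of sound (m/s)
RADIUS = 1.85     # trajectory radius of case (a) (m)
EAR_OFFSET = 0.1  # aural positions at +/-0.1 m from the center

LEFT = (0.0, EAR_OFFSET)    # assumed frame: listener faces +x, left is +y
RIGHT = (0.0, -EAR_OFFSET)


def time_difference_ms(angle):
    """Delay of the right signal relative to the left one, in milliseconds."""
    src = (RADIUS * math.cos(angle), RADIUS * math.sin(angle))
    return 1000.0 * (math.dist(src, RIGHT) - math.dist(src, LEFT)) / C_SOUND


values = [time_difference_ms(2.0 * math.pi * k / 3600) for k in range(3600)]
td_max, td_min = max(values), min(values)
```

The extremes are reached when the source lies on the interaural axis, where the path difference equals the full 0.2 m, giving about 0.58 ms with the assumed speed of sound. The value is zero for a frontal source, in line with the sinusoidal evolution starting from 0.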

Level difference is computed by taking into account the time difference before comparing the time-windowed left and right sound pressure levels. The delay between Direct and Reprod time difference variations also affects the level difference results. Figure 12-(top) displays level difference variations for cases (a) and (b). For case (a), the results from the Direct and Reprod computations are very similar. Except for the time and level difference delays between Direct and Reprod results, the dynamic behaviors of the acoustical localization cues are well reproduced by the moving source rendering strategy. Moreover, as the starting point is randomly positioned and faded in during stimulus presentation, this delay has no consequences on the motion direction detection tests.

In addition, IFs are computed using the Quinn–Fernandes frequency estimator [28], for both left and right signals and for both the Reprod and Direct methods. Figure 12-(lower middle) displays the simulation results for case (a). Once again, except for the small delay, the results are quasi-identical. Pitch shifting due to the Doppler effect is thus well reproduced by the compression waves coming from the loudspeakers for this frequency and at this rotation speed. Because of the symmetry of the left and right locations, the left and right pitch-shifted signals are opposite to each other with respect to the source frequency. The maximal difference observed is 2 % of 500 Hz, which corresponds to almost a halftone in the musical scale and is well above the perceptual thresholds.
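The order of magnitude of the left/right frequency difference for case (a) can be estimated from the classical moving-source Doppler relation \(f = f_0/(1 - v_r/c)\), where \(v_r\) is the component of the source velocity toward the receiver at emission. This is a minimal sketch, not the Quinn–Fernandes estimation used for Fig. 12; the speed of sound (343 m \(\hbox {s}^{-1}\)) and the coordinate frame are assumptions.

```python
import math

C_SOUND = 343.0  # assumed speed of sound (m/s)
F0 = 500.0       # source frequency (Hz), from the text
RADIUS = 1.85    # trajectory radius of case (a) (m)
ROT = 5.0        # rotation rate (rot/s)
EAR = 0.1        # aural positions at +/-0.1 m from the center

LEFT, RIGHT = (0.0, EAR), (0.0, -EAR)  # assumed frame: listener faces +x


def received_frequency(angle, ear):
    """Doppler-shifted frequency for a wave front emitted at the given angle."""
    w = 2.0 * math.pi * ROT
    src = (RADIUS * math.cos(angle), RADIUS * math.sin(angle))
    vel = (-RADIUS * w * math.sin(angle), RADIUS * w * math.cos(angle))
    rx, ry = ear[0] - src[0], ear[1] - src[1]
    v_r = (vel[0] * rx + vel[1] * ry) / math.hypot(rx, ry)  # > 0 when approaching
    return F0 / (1.0 - v_r / C_SOUND)


angles = [2.0 * math.pi * k / 3600 for k in range(3600)]
max_diff = max(abs(received_frequency(a, LEFT) - received_frequency(a, RIGHT))
               for a in angles)
max_diff_percent = 100.0 * max_diff / F0
```

Under these assumptions the maximal left/right difference comes out at about 1.8 % of 500 Hz, of the same order as the 2 % reported above, and the shift vanishes for a receiver placed exactly at the center.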

For case (b), the acoustical localization cues simulated with the moving source rendering strategy are also very close to the direct propagation simulations. The level difference results from the rendering strategy nevertheless show more discrepancies from the direct propagation than in case (a). This is a consequence of the non-linear cross-over gains of the VBAP method presented in Eq. (6). Due to the Doppler-effect gain variation, the amplitude of the wave front produced by the source moving toward \(\mathscr {C}\) is greater than the one produced by the source moving away from \(\mathscr {C}\). This effect is slightly visible in the Direct results but, because of where the amplification occurs, it is amplified by the non-linear gain functions in the Reprod simulations. The asymmetry between incoming and outgoing magnitudes results in a deformation of the level difference curve, which is no longer sinusoidal but becomes closer to a sawtooth curve. The time difference results of case (b) are only very slightly affected by the variation of relative velocities. Indeed, in Eq. (4), the phase terms are only affected by the time compression due to relative velocities. The relative velocities from the source to the left and right locations are very close to each other because the relative distances are close to each other. This subtle difference between left and right relative velocities is smaller than for case (a) because the source is farther away. In addition, the off-center circular trajectory produces high relative velocity values. For the IF of case (b), depicted in Fig. 12-(bottom), this results in large global variations of pitch shifting for both left and right signals (range of 200 Hz) but in small variations between left and right signals.

The comparison of direct propagation and our rendering strategy demonstrates the ability of our rendering strategy to reproduce accurately the behavior of a moving phantom source at high velocity, in terms of time and level differences as well as Doppler effect.


Cite this article

Camier, C., Boissinot, J. & Guastavino, C. On the robustness of upper limits for circular auditory motion perception. J Multimodal User Interfaces 10, 285–298 (2016). https://doi.org/10.1007/s12193-016-0225-8


  • DOI: https://doi.org/10.1007/s12193-016-0225-8
