Prioritizing foreground selection of natural chirp sounds by tempo and spectral centroid

Original Paper · Journal on Multimodal User Interfaces

Abstract

Salience shapes the involuntary perception of a sound scene into foreground and background. Auditory interfaces, such as those used in continuous process monitoring, rely on the prominence of the sounds that are perceived as foreground. We propose to distinguish between the salience of sound events and that of streams, and we introduce a paradigm to study the latter using repetitive patterns of natural chirps. Since streams are the sound objects populating the auditory scene, we suggest using global descriptors of perceptual dimensions to predict their salience and, hence, the organization of these objects into foreground and background. However, many independent features can be used to describe sounds. Based on the results of two experiments, we suggest a parsimonious interpretation of the rules guiding foreground formation: after loudness, tempo and brightness are the dimensions with the highest priority. Our data show that, under equal-loudness conditions, patterns with faster tempo and lower brightness tend to emerge, and that the interaction between tempo and brightness in foreground selection appears to increase with task difficulty. We propose the relations we uncovered as the underpinnings of a computational model of foreground selection and as design guidelines for stream-based sonification applications.
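
To make the two descriptors concrete, the following minimal sketch (illustrative only, not the authors' code; the function names, frame sizes, and detection threshold are assumptions) computes a spectral centroid as a stand-in for brightness and a rough events-per-minute figure as a stand-in for tempo, applied to a synthetic repeating burst pattern in place of the natural chirps used in the experiments.

```python
# Minimal illustrative sketch (not the authors' implementation): two global
# descriptors of a repetitive sound pattern -- spectral centroid ("brightness")
# and a crude tempo estimate -- using only NumPy. All parameters are assumptions.
import numpy as np

def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency of the spectrum, in Hz."""
    magnitude = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(np.sum(freqs * magnitude) / np.sum(magnitude))

def tempo_estimate(signal, sample_rate, frame_len=1024, hop=512, k=1.5):
    """Rough events per minute, counted as upward crossings of an energy threshold."""
    starts = range(0, len(signal) - frame_len, hop)
    env = np.array([np.sum(signal[i:i + frame_len] ** 2) for i in starts])
    thresh = env.mean() + k * env.std()
    env = np.concatenate(([0.0], env))  # count a burst at t = 0 as an onset too
    onsets = np.sum((env[1:] >= thresh) & (env[:-1] < thresh))
    return 60.0 * onsets / (len(signal) / sample_rate)

# Example: a 2 kHz tone burst repeating four times per second for three seconds,
# standing in for one repetitive chirp pattern of the kind described above.
sr = 44100
t = np.arange(3 * sr) / sr
pattern = np.sin(2 * np.pi * 2000.0 * t) * ((t % 0.25) < 0.05)

print(spectral_centroid(pattern, sr))  # close to 2000 Hz
print(tempo_estimate(pattern, sr))     # close to 240 events per minute
```

Under this reading, comparing two equal-loudness patterns on these two values, with faster tempo and lower centroid favored, would be one way to operationalize the priority rules summarized in the abstract.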

Acknowledgments

F. T. thanks J. Blum for the fruitful discussions during this research, as well as F. Grond, the colleagues at the Shared Reality Lab, and the JMUI editors and reviewers for their valuable input.

Author information

Corresponding author

Correspondence to Francesco Tordini.

Additional information

This research has been generously supported by the Networks of Centres of Excellence: Graphics, Animation, and New Media (GRAND), and by the Natural Sciences and Engineering Research Council (NSERC) of Canada, Grant #203568-06.


About this article

Cite this article

Tordini, F., Bregman, A.S. & Cooperstock, J.R. Prioritizing foreground selection of natural chirp sounds by tempo and spectral centroid. J Multimodal User Interfaces 10, 221–234 (2016). https://doi.org/10.1007/s12193-016-0223-x
