Abstract
Salience shapes the involuntary organization of a sound scene into foreground and background. Auditory interfaces, such as those used in continuous process monitoring, rely on the prominence of the sounds perceived as foreground. We propose to distinguish between the salience of sound events and that of streams, and introduce a paradigm for studying the latter using repetitive patterns of natural chirps. Since streams are the sound objects that populate the auditory scene, we suggest using global descriptors of perceptual dimensions to predict their salience, and hence the organization of those objects into foreground and background. However, many independent features can be used to describe sounds. Based on the results of two experiments, we suggest a parsimonious interpretation of the rules guiding foreground formation: after loudness, tempo and brightness are the dimensions with the highest priority. Our data show that, under equal-loudness conditions, patterns with faster tempo and lower brightness tend to emerge as foreground, and that the interaction between tempo and brightness in foreground selection appears to increase with task difficulty. We propose the relations we uncovered both as underpinnings for a computational model of foreground selection and as design guidelines for stream-based sonification applications.
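To make the two descriptors concrete, here is a minimal sketch (not the authors' implementation) of a brightness proxy and a tempo proxy computed on a synthetic burst pattern. The spectral centroid is the magnitude-weighted mean frequency of a frame; the event rate counts energy onsets per second in a repetitive pattern. All parameter values (sample rate, burst length, threshold) are illustrative assumptions, not taken from the paper.

```python
import math

SR = 4000  # sample rate in Hz; illustrative value, not from the paper

def make_pattern(freq_hz, n_bursts, ioi_s, burst_s=0.03):
    # Repeating pattern of short sine bursts: a crude stand-in for the
    # natural chirp patterns used in the experiments.
    sig = [0.0] * int(SR * ioi_s * n_bursts)
    n_burst = int(SR * burst_s)
    for b in range(n_bursts):
        start = int(b * ioi_s * SR)
        for i in range(n_burst):
            sig[start + i] = math.sin(2 * math.pi * freq_hz * i / SR)
    return sig

def spectral_centroid(frame):
    # Brightness proxy: magnitude-weighted mean frequency via a naive DFT.
    n = len(frame)
    num = den = 0.0
    for k in range(1, n // 2):
        re = sum(frame[i] * math.cos(2 * math.pi * k * i / n) for i in range(n))
        im = sum(frame[i] * math.sin(2 * math.pi * k * i / n) for i in range(n))
        mag = math.hypot(re, im)
        num += (k * SR / n) * mag
        den += mag
    return num / den if den else 0.0

def event_rate(sig, frame_len=40, thresh=1.0):
    # Tempo proxy: frame energies, then count rising edges through a
    # fixed threshold and normalize by the signal duration in seconds.
    energies = [sum(x * x for x in sig[i:i + frame_len])
                for i in range(0, len(sig), frame_len)]
    onsets = sum(1 for j in range(1, len(energies))
                 if energies[j] > thresh and energies[j - 1] <= thresh)
    if energies and energies[0] > thresh:
        onsets += 1  # pattern starts on a burst
    return onsets / (len(sig) / SR)

pattern = make_pattern(freq_hz=1000, n_bursts=6, ioi_s=0.25)
burst = pattern[:int(SR * 0.03)]
print(round(spectral_centroid(burst)))  # ~1000 Hz for a 1 kHz burst
print(event_rate(pattern))              # 4.0 events per second
```

A full model would of course use calibrated loudness (e.g. Glasberg and Moore) and robust onset detection rather than a fixed threshold; the sketch only shows how a stream can be reduced to the global descriptors discussed above.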
Acknowledgments
F. T. thanks J. Blum for fruitful discussions during this research, and F. Grond, colleagues at the Shared Reality Lab, and the JMUI editors and reviewers for their valuable input.
Additional information
This research has been generously supported by the Networks of Centres of Excellence: Graphics, Animation, and New Media (GRAND), and by the Natural Sciences and Engineering Research Council (NSERC) of Canada, Grant #203568-06.
Cite this article
Tordini, F., Bregman, A.S. & Cooperstock, J.R. Prioritizing foreground selection of natural chirp sounds by tempo and spectral centroid. J Multimodal User Interfaces 10, 221–234 (2016). https://doi.org/10.1007/s12193-016-0223-x