
Acoustic Cues for the Perceptual Assessment of Surround Sound

  • Conference paper
Speech and Computer (SPECOM 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10458)

Abstract

Speech and audio codecs are used in a wide range of multimedia applications, and the first streaming and cloud-based services now offer multichannel sound. Besides perceptual quality, coding-related research focuses on low bitrates and minimal latency. The IETF-standardized Opus codec provides high perceptual quality, low latency, and the ability to code multiple channels at various audio bandwidths up to Fullband (20 kHz). In a previous perceptual study on Opus-processed 5.1 surround sound (six channels), uncompressed and degraded stimuli were rated on a five-point degradation category scale (DMOS) at total bitrates between 96 and 192 kbit/s. That study revealed that the perceived quality depends on the characteristics of the music. In the current study, we analyze spectral and music-feature differences for the five music stimuli from that study, comparing three coding bitrates against the uncompressed sound, to identify objective causes for the perceptual differences. The results show that samples with annoying audible degradations exhibit larger spectral differences within the LFE (low-frequency effects) channel as well as highly uncorrelated LSPs.
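
The abstract does not say which spectral difference measure the study uses. As an illustration only, the sketch below assumes a log-spectral distance (LSD) computed per channel between one uncompressed frame and its coded counterpart. The function names, the channel labels, and the use of a naive DFT are assumptions made here for a self-contained example; they are not taken from the paper.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude spectrum (bins 0..N/2) of one real-valued frame via a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc))
    return spectrum


def log_spectral_distance(reference, degraded, eps=1e-12):
    """RMS difference in dB between the power spectra of two equal-length frames.

    eps keeps the logarithm finite for bins that carry no energy.
    """
    if len(reference) != len(degraded):
        raise ValueError("frames must have equal length")
    ref = magnitude_spectrum(reference)
    deg = magnitude_spectrum(degraded)
    diffs_db = [10 * math.log10((r * r + eps) / (d * d + eps))
                for r, d in zip(ref, deg)]
    return math.sqrt(sum(v * v for v in diffs_db) / len(diffs_db))


def per_channel_lsd(reference_channels, degraded_channels):
    """Return {channel name: LSD} for two dicts holding one frame per channel."""
    return {name: log_spectral_distance(reference_channels[name],
                                        degraded_channels[name])
            for name in reference_channels}


if __name__ == "__main__":
    # Synthetic frames (hypothetical data): an impulse has a flat spectrum.
    original = [1.0] + [0.0] * 15
    attenuated = [0.5] + [0.0] * 15
    distances = per_channel_lsd({"LFE": original, "C": original},
                                {"LFE": attenuated, "C": original})
    for channel, value in distances.items():
        print(f"{channel}: {value:.2f} dB")
```

In a real analysis the frames would come from the decoded and the uncompressed audio files, an FFT routine would replace the naive DFT, and the distance would be averaged over all frames of a stimulus before channels are compared.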

Acknowledgments

This work was partly carried out within the Transregional Collaborative Research Centre SFB/TRR 62 “Companion Technology for Cognitive Technical Systems” funded by the German Research Foundation (DFG) (www.sfb-trr-62.de).

Author information

Correspondence to Ingo Siegert or Oliver Jokisch.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Siegert, I., Jokisch, O., Lotz, A.F., Trojahn, F., Meszaros, M., Maruschke, M. (2017). Acoustic Cues for the Perceptual Assessment of Surround Sound. In: Karpov, A., Potapova, R., Mporas, I. (eds) Speech and Computer. SPECOM 2017. Lecture Notes in Computer Science, vol 10458. Springer, Cham. https://doi.org/10.1007/978-3-319-66429-3_6

  • DOI: https://doi.org/10.1007/978-3-319-66429-3_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66428-6

  • Online ISBN: 978-3-319-66429-3

  • eBook Packages: Computer Science, Computer Science (R0)
