Abstract
This article reports on a multi-lab subjective listening experiment designed to verify the inter-lab and intra-lab repeatability of test results. An identical set of speech samples representative of contemporary networks was tested by three independent labs using the ITU-T P.835 methodology. The test results were compared in terms of Pearson correlation, RMSE, RMSE*, and the number of opposite pair-wise comparisons. The results quantify the level of inter-lab and intra-lab repeatability when identical test speech samples are used, and confirm that subjective tests are highly repeatable when they strictly follow the requirements of the recommendation. The tests also show that results differ when subject expectations are set differently by a wider set of test speech samples (as was the case in one of the labs).
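As an illustration of the four comparison measures named above, the following is a minimal sketch of how per-condition scores from two labs could be compared. It rests on several assumptions that this abstract does not confirm: RMSE* is taken to be the epsilon-insensitive RMSE, in which differences smaller than a per-condition 95% confidence interval count as zero; the sum is divided by the plain number of conditions; and an "opposite pair-wise comparison" is taken to be any pair of conditions that the two labs rank in opposite order. All function names and the sample scores are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation between two equally long score lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rmse(a, b):
    """Root mean square error between two score lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def rmse_star(a, b, ci95):
    """Epsilon-insensitive RMSE (assumed definition of RMSE*).

    The part of each absolute difference that lies inside the
    per-condition confidence interval is ignored.
    """
    errors = [max(0.0, abs(x - y) - c) for x, y, c in zip(a, b, ci95)]
    return math.sqrt(sum(e * e for e in errors) / len(a))


def opposite_pairs(a, b):
    """Count condition pairs that the two labs rank in opposite order."""
    count = 0
    for i, j in combinations(range(len(a)), 2):
        if (a[i] - a[j]) * (b[i] - b[j]) < 0:
            count += 1
    return count


# Hypothetical per-condition mean scores from two labs (not from the article).
lab_a = [1.8, 2.6, 3.1, 3.9, 4.4]
lab_b = [2.0, 2.5, 3.3, 3.8, 4.5]
ci95_a = [0.15] * len(lab_a)  # hypothetical 95% confidence intervals

if __name__ == "__main__":
    print("Pearson:", pearson(lab_a, lab_b))
    print("RMSE:", rmse(lab_a, lab_b))
    print("RMSE*:", rmse_star(lab_a, lab_b, ci95_a))
    print("Opposite pairs:", opposite_pairs(lab_a, lab_b))
```

A stricter variant of the pair-wise count would include only pairs whose difference is statistically significant in both labs; the sign-based version here is the simplest reading.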
References
Appendix of IEEE Subcommittee on Subjective Measurements IEEE Recommended Practices for Speech Quality Measurements. (1969). IEEE Transactions on Audio and Electroacoustics. Vol 17, pp. 227–246.
European Telecommunications Standards Institute. (2008). Speech processing, transmission and quality aspects (STQ); speech quality performance in the presence of background noise part 3: Background noise transmission—Objective test methods. European Telecommunications Standards Institute, ETSI EG 202 396-3.
European Telecommunications Standards Institute. (2014). Speech and multimedia transmission quality (STQ); speech quality performance in the presence of background noise: Background noise transmission for mobile terminals—Objective test methods. European Telecommunications Standards Institute, ETSI TS 103 106.
Goodman, D. J., & Nash, R. D. (1982). Subjective quality of the same speech transmission conditions in seven different countries. IEEE Transactions on Communications, 30(4), 642–654.
3GPP TR 26.952. (2015). Codec for Enhanced Voice Services (EVS); Performance Characterization.
3GPP TS 26.071. Mandatory speech CODEC speech processing functions; AMR speech Codec; General description.
ITU-T Rec. P.800. (1996). Methods for subjective determination of transmission quality, Series P: Telephone transmission quality, ITU, Geneva, am. 1998.
ITU-T Rec. P.835. (2003). Methods for objective and subjective assessment of quality, Series P: Telephone transmission quality, Telephone Installations, Local Line Networks, ITU, Geneva.
ITU-T Rec. P.863. (2014). Methods for objective and subjective assessment of speech quality, Series P: Terminals and subjective and objective assessment methods, ITU, Geneva.
ITU-T TD12rev1. (2009). Statistical evaluation. Procedure for P.OLQA v.1.0, SwissQual AG (Author: Jens Berger), ITU-T SG12 Meeting, Geneva, Switzerland, March 10–19, 2009.
Pinson, M. H., Janowski, L., Pepion, R., Huynh-Thu, Q., Schmidmer, C., Corriveau, P., et al. (2012). The influence of subjects and environment on audiovisual subjective tests: An international study. IEEE Journal of Selected Topics in Signal Processing, 6(6), 640–651.
Acknowledgements
The authors thank Andrew Catellier and Stephen Voran at the United States Department of Commerce’s Institute for Telecommunication Sciences in Boulder, Colorado, for providing the test premises and test subjects, and for valuable discussions related to this project.
Cite this article
Holub, J., Avetisyan, H. & Isabelle, S. Subjective speech quality measurement repeatability: comparison of laboratory test results. Int J Speech Technol 20, 69–74 (2017). https://doi.org/10.1007/s10772-016-9389-6
DOI: https://doi.org/10.1007/s10772-016-9389-6