Abstract
This paper studies the use of simulated recordings in audio identification experiments. Instead of using actual recordings, we use the measured room impulse response to generate simulated recordings, which greatly reduces the burden of manually recording audio items. By comparing the correlations between actual and simulated recordings, we conclude that this approach is feasible. The audio identification experiments use Moving Picture Experts Group (MPEG) audio signature descriptors to represent the simulated recordings. We also add environmental noises, provided by the European Telecommunications Standards Institute (ETSI), to the simulated recordings to study the performance degradation. Finally, we study whether filtering in the descriptor domain can improve the accuracy. The experimental results show that filtering in the frequency direction yields higher accuracy for items with a signal-to-noise ratio of 10 dB.
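The two signal-level steps named in the abstract, generating a simulated recording from a room impulse response and adding noise at a chosen signal-to-noise ratio, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic impulse response, and the random test signals are all assumptions made for illustration, whereas the paper uses measured impulse responses and ETSI noise recordings.

```python
import numpy as np


def simulate_recording(clean, rir):
    """Convolve a clean signal with a room impulse response (RIR).

    The output is truncated to the length of the clean signal.
    """
    return np.convolve(clean, rir, mode="full")[: len(clean)]


def add_noise_at_snr(signal, noise, snr_db):
    """Mix noise into signal so the signal-to-noise power ratio is snr_db."""
    # Repeat or truncate the noise so it matches the signal length.
    noise = np.resize(noise, signal.shape)
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain that scales the noise power to signal_power / 10^(snr_db / 10).
    gain = np.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    return signal + gain * noise


# Hypothetical stand-ins: random "audio", a synthetic decaying RIR, random noise.
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
rir = np.exp(-np.arange(800) / 100.0) * rng.standard_normal(800)
noise = rng.standard_normal(4000)

simulated = simulate_recording(clean, rir)
noisy = add_noise_at_snr(simulated, noise, snr_db=10.0)

# Check the SNR actually achieved by the mix.
added = noisy - simulated
measured_snr = 10 * np.log10(np.mean(simulated ** 2) / np.mean(added ** 2))
```

In a real experiment, `clean` would be the original audio item, `rir` a measured impulse response of the playback and recording path, and `noise` an environmental noise recording resampled to the same rate.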
Acknowledgments
This work was supported in part by the Ministry of Science and Technology of Taiwan through Grant NSC 102-2221-E-027-076.
Cite this article
You, S.D., Lin, YC. Simulated smart phone recordings for audio identification. J Supercomput 72, 1799–1812 (2016). https://doi.org/10.1007/s11227-015-1533-6
DOI: https://doi.org/10.1007/s11227-015-1533-6