Abstract
Accurate voice humming transcription and efficient indexing and retrieval schemes are essential to a large-scale humming-based audio retrieval system. Although much research has gone into developing such schemes, their performance in terms of precision, recall, and F-measure, among other metrics, is still unsatisfactory. In this paper, we propose a new voice query transcription scheme. It comprises note onset detection using dynamic threshold methods, fundamental frequency (F0) acquisition for each frame, and frequency realignment using K-means. For indexing, we use a popularity-adaptive structure called the frequently accessed index (FAI), which is built from frequently queried tunes. In addition, we propose a semi-supervised relevance feedback and query reformulation scheme based on a genetic algorithm to improve retrieval efficiency. We also extend this work to mobile multimedia environments and develop a mobile audio retrieval system. Experiments show that our system performs satisfactorily in wireless mobile multimedia environments.
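As an illustration of the frequency realignment step named above, the following is a minimal sketch, assuming per-frame F0 values in Hz are already available and the number of distinct notes `k` is known in advance. The function names (`hz_to_semitone`, `kmeans_1d`, `realign`), the conversion to a semitone scale, and the quantile-based initialization are assumptions made for illustration only; the abstract does not specify how the authors apply K-means.

```python
import math


def hz_to_semitone(f0_hz, ref_hz=440.0):
    """Convert a frequency in Hz to a fractional MIDI note number (A4 = 69)."""
    return 69.0 + 12.0 * math.log2(f0_hz / ref_hz)


def kmeans_1d(values, k, iterations=50):
    """Plain Lloyd's K-means on scalar values (assumed variant, not the paper's)."""
    ordered = sorted(values)
    # Deterministic initialization: centroids at evenly spaced quantiles.
    centroids = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        updated = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def realign(f0_frames_hz, k):
    """Snap each frame's pitch to the nearest K-means centroid (in semitones)."""
    pitches = [hz_to_semitone(f) for f in f0_frames_hz]
    centroids = kmeans_1d(pitches, k)
    return [min(centroids, key=lambda c: abs(p - c)) for p in pitches]
```

For example, calling `realign` with `k=2` on frames that waver around 220 Hz and then around 262 Hz should collapse the jittery per-frame estimates into two stable pitch levels, one per hummed note.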
Acknowledgments
This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the NIPA (National IT Industry Promotion Agency) (NIPA-2010-C1090-1031-0004). This research was also supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (2010-0025395). We would especially like to thank Byeong-jun Han, who was as dedicated to this paper as we were.
Cite this article
Rho, S., Hwang, E. & Park, J.H. M-MUSICS: an intelligent mobile music retrieval system. Multimedia Systems 17, 313–326 (2011). https://doi.org/10.1007/s00530-010-0212-y
DOI: https://doi.org/10.1007/s00530-010-0212-y