DOI: 10.1145/2413097.2413104
research-article

A survey of query-by-humming similarity methods

Published: 06 June 2012

Abstract

Similarity search in large databases is of particular interest to many communities, including music, databases, and data mining. Although several solutions proposed in the literature perform well in many application domains, no single method is best for this kind of problem in a Query-By-Humming (QBH) application. In QBH, the goal is to efficiently find the song(s) most similar to a hummed query. In this paper, we provide a brief overview of the representations used to encode music pieces and of the methods that have been proposed for QBH or other similarly defined problems.
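
To make the QBH task concrete, the following is a minimal sketch and not a method taken from the paper. It assumes melodies are given as lists of MIDI pitch numbers, encodes each melody as the sequence of pitch differences between consecutive notes (so that a query hummed in a different key still matches), and ranks songs by plain Levenshtein edit distance. The function names, song titles, and note values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def to_intervals(pitches: Sequence[int]) -> List[int]:
    """Encode a melody as semitone differences between consecutive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def rank_songs(query: Sequence[int],
               database: Dict[str, Sequence[int]]) -> List[Tuple[int, str]]:
    """Return (distance, title) pairs, most similar song first."""
    q = to_intervals(query)
    scored = [(edit_distance(q, to_intervals(pitches)), title)
              for title, pitches in database.items()]
    return sorted(scored)


# Hypothetical toy database: titles and pitches are made up for illustration.
DATABASE = {
    "song_a": [60, 60, 67, 67, 69, 69, 67],
    "song_b": [64, 64, 65, 67, 67, 65, 64, 62],
}

# A "hummed" query: song_a shifted up two semitones, with one wrong note.
QUERY = [62, 62, 69, 69, 71, 70, 69]

if __name__ == "__main__":
    for distance, title in rank_songs(QUERY, DATABASE):
        print(title, distance)
```

This sketch compares whole sequences and scans the database linearly. A practical QBH system would also need subsequence matching, tolerance to timing errors, and some form of indexing to be efficient.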

Published In

ACM Other conferences
PETRA '12: Proceedings of the 5th International Conference on PErvasive Technologies Related to Assistive Environments
June 2012
307 pages
ISBN:9781450313001
DOI:10.1145/2413097
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • U of Tex at Arlington

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. hidden Markov models
  2. query-by-humming
  3. sequence matching

Qualifiers

  • Research-article

Conference

PETRA2012
Sponsor:
  • U of Tex at Arlington

Article Metrics

  • Downloads (Last 12 months): 14
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 17 Feb 2025

Cited By

  • (2024) Improving the Robustness of DTW to Global Time Warping Conditions in Audio Synchronization. Applied Sciences 14:4 (1459). DOI: 10.3390/app14041459. Online publication date: 10-Feb-2024
  • (2024) Sketching With Your Voice: "Non-Phonorealistic" Rendering of Sounds via Vocal Imitation. SIGGRAPH Asia 2024 Conference Papers (1-11). DOI: 10.1145/3680528.3687679. Online publication date: 3-Dec-2024
  • (2022) A Framework for Content-Based Search in Large Music Collections. Big Data and Cognitive Computing 6:1 (23). DOI: 10.3390/bdcc6010023. Online publication date: 23-Feb-2022
  • (2022) Improving Query by Humming System using Frequency-Temporal Attention Network and Partial Query Matching. 2022 9th International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA) (1-6). DOI: 10.1109/ICAICTA56449.2022.9933001. Online publication date: 28-Sep-2022
  • (2021) Hummer: Text Entry by Gaze and Hum. Proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (1-11). DOI: 10.1145/3411764.3445501. Online publication date: 6-May-2021
  • (2021) Automated Machine Learning for Multimedia. Automated Machine Learning and Meta-Learning for Multimedia (97-177). DOI: 10.1007/978-3-030-88132-0_3. Online publication date: 15-Sep-2021
  • (2019) Multimodal Music Information Processing and Retrieval: Survey and Future Challenges. 2019 International Workshop on Multilayer Music Representation and Processing (MMRP) (10-18). DOI: 10.1109/MMRP.2019.00012. Online publication date: Jan-2019
  • (2019) NvPD: novel parallel edit distance algorithm, correctness, and performance evaluation. Cluster Computing. DOI: 10.1007/s10586-019-02962-w. Online publication date: 27-Jul-2019
  • (2018) Content-Based Music Information Retrieval (CB-MIR) and Its Applications toward the Music Industry. ACM Computing Surveys 51:3 (1-46). DOI: 10.1145/3177849. Online publication date: 12-Jun-2018
  • (2018) Discovery of Repeated Melodic Phrases in Folk Singing Recordings. IEEE Transactions on Multimedia 20:6 (1291-1304). DOI: 10.1109/TMM.2017.2771450. Online publication date: Jun-2018
