DOI: 10.1145/1873951.1874100
short-paper

A conditional random field viewpoint of symbolic audio-to-score matching

Published: 25 October 2010

Abstract

We present a new approach to symbolic audio-to-score alignment based on Conditional Random Fields (CRFs). Unlike Hidden Markov Models, these graphical models allow the conditional probability of each state to be computed from several audio frames. The CRF models that we propose exploit this property to take into account the rhythmic information in the musical score. Assuming that the tempo is locally constant, they compare the neighborhood of each frame against several tempo hypotheses.
Experiments on a pop-music database show that this use of contextual information significantly improves alignment accuracy. In particular, the proportion of onsets detected within a 100-ms tolerance window increases by more than 10% when a 1-s neighborhood is considered.
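
The accuracy measure quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes that every score note has one ground-truth onset time and one aligned onset time, both in seconds, and that "inside a 100-ms tolerance window" means an absolute error of at most 100 ms. The function name `onset_accuracy` and the sample values are hypothetical.

```python
def onset_accuracy(reference_onsets, aligned_onsets, tolerance=0.1):
    """Return the fraction of notes whose aligned onset lies within
    `tolerance` seconds of the ground-truth onset.

    Assumption: the two sequences are paired note by note, as in an
    audio-to-score alignment where each score note receives one time.
    """
    if len(reference_onsets) != len(aligned_onsets):
        raise ValueError("onset lists must have the same length")
    if not reference_onsets:
        return 0.0
    # Count the notes whose absolute timing error is within the tolerance.
    hits = sum(
        1
        for ref, est in zip(reference_onsets, aligned_onsets)
        if abs(est - ref) <= tolerance
    )
    return hits / len(reference_onsets)


# Hypothetical onset times (seconds) for four notes.
reference = [0.50, 1.00, 1.50, 2.00]
aligned = [0.52, 1.15, 1.45, 2.30]
accuracy = onset_accuracy(reference, aligned)
print(f"{accuracy:.0%} of onsets fall within 100 ms")
```

In this made-up example the timing errors are 20, 150, 50, and 300 ms, so two of the four onsets count as correctly detected.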


    Published In

    MM '10: Proceedings of the 18th ACM international conference on Multimedia
    October 2010
    1836 pages
    ISBN:9781605589336
    DOI:10.1145/1873951

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. audio/score alignment
    2. conditional random fields
    3. indexing
    4. music information retrieval

    Qualifiers

    • Short-paper

    Conference

    MM '10: ACM Multimedia Conference
    October 25 - 29, 2010
    Firenze, Italy

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 2
    • Downloads (last 6 weeks): 0
    Reflects downloads up to 17 Feb 2025

    Citations

    Cited By

    • (2017) MuEns. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, pp. 4290-4301. DOI: 10.1145/3025453.3025505. Online publication date: 2-May-2017
    • (2017) Models for Music Analysis From a Markov Logic Networks Perspective. IEEE/ACM Transactions on Audio, Speech and Language Processing 25:1, pp. 19-34. DOI: 10.1109/TASLP.2016.2614351. Online publication date: 1-Jan-2017
    • (2014) Audio part mixture alignment based on hierarchical nonparametric Bayesian model of musical audio sequence collection. 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5212-5216. DOI: 10.1109/ICASSP.2014.6854597. Online publication date: May-2014
    • (2014) Ryry: A Real-Time Score-Following Automatic Accompaniment Playback System Capable of Real Performances with Errors, Repeats and Jumps. Active Media Technology, pp. 134-145. DOI: 10.1007/978-3-319-09912-5_12. Online publication date: 2014
    • (2013) Robust on-line algorithm for real-time audio-to-score alignment based on a delayed decision and anticipation framework. 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 191-195. DOI: 10.1109/ICASSP.2013.6637635. Online publication date: May-2013
    • (2012) A music retrieval system using chroma and pitch features based on conditional random fields. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1997-2000. DOI: 10.1109/ICASSP.2012.6288299. Online publication date: Mar-2012
