short-paper

A conditional random field viewpoint of symbolic audio-to-score matching

Authors: Cyril Joder, Slim Essid, Gaël Richard

MM '10: Proceedings of the 18th ACM international conference on Multimedia

Pages 871 - 874

https://doi.org/10.1145/1873951.1874100

Published: 25 October 2010


Abstract

We present a new approach to symbolic audio-to-score alignment based on Conditional Random Fields (CRFs). Unlike Hidden Markov Models, these graphical models allow state conditional probabilities to be computed from several audio frames. The CRF models that we propose exploit this property to take into account the rhythmic information of the musical score. Assuming that the tempo is locally constant, they compare the neighborhood of each frame against several tempo hypotheses.
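The idea of scoring a frame from its neighborhood under several tempo hypotheses can be illustrated with a toy sketch. This is not the authors' model: the symbolic frame labels, the match/mismatch score of +1/-1, the beat grid, the tempo list (in beats per frame), and the names `label_at`, `neighborhood_score`, and `align` are all assumptions made for illustration. The sketch only shows how an observation score can depend on several frames while decoding remains a Viterbi search over monotone score positions.

```python
from bisect import bisect_right


def label_at(onsets, labels, beat):
    """Label sounding at score position `beat` (clamped to the score's ends)."""
    i = bisect_right(onsets, beat) - 1
    return labels[max(0, min(i, len(labels) - 1))]


def neighborhood_score(frames, t, beat, onsets, labels, tempos, half_width):
    """Agreement between the frames around t and the score around `beat`,
    maximized over locally constant tempo hypotheses (beats per frame)."""
    best = float("-inf")
    for tempo in tempos:
        s = 0.0
        for k in range(-half_width, half_width + 1):
            if 0 <= t + k < len(frames):
                expected = label_at(onsets, labels, beat + k * tempo)
                s += 1.0 if frames[t + k] == expected else -1.0
        best = max(best, s)
    return best


def align(frames, onsets, labels, grid, tempos, half_width, max_step=2):
    """Viterbi decoding over monotone score positions taken from `grid`."""
    n_frames, n_states = len(frames), len(grid)
    neg_inf = float("-inf")
    # The alignment is assumed to start at the first grid position.
    delta = [neg_inf] * n_states
    delta[0] = neighborhood_score(frames, 0, grid[0], onsets, labels,
                                  tempos, half_width)
    back = []
    for t in range(1, n_frames):
        new_delta, pointers = [neg_inf] * n_states, [0] * n_states
        for j in range(n_states):
            # A state may be reached from itself or from up to max_step behind.
            lo = max(0, j - max_step)
            prev = max(range(lo, j + 1), key=lambda i: delta[i])
            if delta[prev] == neg_inf:
                continue
            pointers[j] = prev
            new_delta[j] = delta[prev] + neighborhood_score(
                frames, t, grid[j], onsets, labels, tempos, half_width)
        delta = new_delta
        back.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n_states), key=lambda j: delta[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [grid[s] for s in path]


# Toy example: three one-beat notes played at two frames per beat.
onsets, labels = [0.0, 1.0, 2.0], ["C", "E", "G"]
frames = ["C", "C", "E", "E", "G", "G"]
grid = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
path = align(frames, onsets, labels, grid, tempos=[0.25, 0.5, 1.0], half_width=2)
print(path)  # [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
```

In this toy setting a frame-wise score would not distinguish between positions inside the same note, whereas the neighborhood score does because it also checks where the surrounding note changes fall under each tempo hypothesis.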

Experiments on a pop-music database show that this use of contextual information significantly improves alignment accuracy. In particular, the proportion of onsets detected within a 100-ms tolerance window increases by more than 10% when a 1-s neighborhood is considered.
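The onset-based measure mentioned above can be sketched as follows. The abstract does not define it precisely, so this assumes that every score onset has exactly one estimated time and that an onset counts as detected when the absolute error is at most the tolerance (0.1 s here); the function name `onset_detection_rate` and the sample values are hypothetical.

```python
def onset_detection_rate(estimated, reference, tolerance=0.1):
    """Fraction of onsets whose estimated time (in seconds) lies within
    `tolerance` of the reference time. Assumes a one-to-one pairing."""
    if len(estimated) != len(reference):
        raise ValueError("estimated and reference must have the same length")
    if not reference:
        return 0.0
    hits = sum(1 for e, r in zip(estimated, reference) if abs(e - r) <= tolerance)
    return hits / len(reference)


# Made-up onset times in seconds, for illustration only.
reference = [0.0, 1.0, 2.0, 3.0]
estimated = [0.02, 1.15, 1.95, 3.3]
rate = onset_detection_rate(estimated, reference)
print(rate)  # 0.5
```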


Index Terms

1. Applied computing
  1. Arts and humanities
    1. Sound and music computing

Published In

MM '10: Proceedings of the 18th ACM international conference on Multimedia

October 2010

1836 pages

ISBN: 9781605589336

DOI: 10.1145/1873951

General Chairs: Alberto del Bimbo (University of Florence, Italy), Shih-Fu Chang (Columbia University, USA)

Program Chair: Arnold Smeulders (University of Amsterdam, NL)

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MM '10: ACM Multimedia Conference

October 25 - 29, 2010

Firenze, Italy

Sponsor: SIGMM

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Cited By

Maezawa A, Yamamoto K (2017). MuEns. Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, pp. 4290-4301. Online publication date: 2-May-2017. https://dl.acm.org/doi/10.1145/3025453.3025505

Papadopoulos H, Tzanetakis G (2017). Models for Music Analysis From a Markov Logic Networks Perspective. IEEE/ACM Transactions on Audio, Speech and Language Processing, 25:1, pp. 19-34. Online publication date: 1-Jan-2017. https://dl.acm.org/doi/10.1109/TASLP.2016.2614351

Maezawa A, Okuno H (2014). Audio part mixture alignment based on hierarchical nonparametric Bayesian model of musical audio sequence collection. 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5212-5216. Online publication date: May-2014. https://doi.org/10.1109/ICASSP.2014.6854597

Sako S, Yamamoto R, Kitamura T (2014). Ryry: A Real-Time Score-Following Automatic Accompaniment Playback System Capable of Real Performances with Errors, Repeats and Jumps. Active Media Technology, pp. 134-145. Online publication date: 2014. https://doi.org/10.1007/978-3-319-09912-5_12

Yamamoto R, Sako S, Kitamura T (2013). Robust on-line algorithm for real-time audio-to-score alignment based on a delayed decision and anticipation framework. 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 191-195. Online publication date: May-2013. https://doi.org/10.1109/ICASSP.2013.6637635

Sumi K, Arai M, Fujishima T, Hashimoto S (2012). A music retrieval system using chroma and pitch features based on conditional random fields. 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 1997-2000. Online publication date: Mar-2012. https://doi.org/10.1109/ICASSP.2012.6288299
