ABSTRACT
Audio-to-score alignment is a music information retrieval (MIR) task that determines the real-world time at which each note of a score occurs in a corresponding audio recording. Recent studies that synthesize MIDI to audio and then apply audio feature extraction and alignment based on dynamic time warping (DTW) have achieved a mean alignment error of about 10 milliseconds for piano music, but their robustness still needs to be evaluated in real-world scenarios. In this paper, we implemented a standard DTW-based audio-to-score alignment system with audio feature extraction techniques for musical onset enhancement, and evaluated its robustness in one such scenario, namely building MIR databases. With this use in mind, we used three different synthesizers and real-world performance data from CrestMusePEDB to simulate the absence of prior information about audio recording conditions and note velocities. The results show that velocities taken from real-world performances and the choice of synthesizer can severely degrade a DTW-based alignment system, almost doubling the average mean error in most cases. We also made a practical attempt at combining phase-based onset feature extraction with the conventional MIDI-audio alignment framework to align real-world flute recordings, which indicates the potential benefits of combining different types of audio features.
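As a rough illustration of the DTW step mentioned in the abstract, the following minimal sketch aligns two sequences of feature vectors, one from the synthesized score and one from the recording. It is not the system evaluated in the paper: the function name `dtw_path`, the Manhattan frame distance, and the three-step pattern (diagonal, vertical, horizontal) are assumptions chosen for brevity.

```python
from math import inf


def dtw_path(score_feats, audio_feats):
    """Align two sequences of equal-length feature vectors with plain DTW.

    Hypothetical helper for illustration; returns (total_cost, path), where
    path is a list of (score_frame, audio_frame) index pairs.
    """
    n, m = len(score_feats), len(audio_feats)

    def dist(a, b):
        # Manhattan distance between two frames (an assumed choice of metric).
        return sum(abs(x - y) for x, y in zip(a, b))

    # cost[i][j] = minimal accumulated cost of aligning the first i score
    # frames with the first j audio frames.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(score_feats[i - 1], audio_feats[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],  # both sequences advance
                                 cost[i - 1][j],      # only the score advances
                                 cost[i][j - 1])      # only the audio advances

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


# Toy example: the first score frame is held for two audio frames.
score = [[1, 0], [0, 1]]
audio = [[1, 0], [1, 0], [0, 1]]
total_cost, path = dtw_path(score, audio)
```

In a full alignment system, each path entry would be converted to a time by multiplying the frame index by the hop duration, so that score note onsets can be mapped to positions in the recording.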
Index Terms
- Audio Feature Extraction for DTW-based Audio-to-Score Alignment