Chirp Group Delay-Based Onset Detection in Instruments with Fast Attack

Circuits, Systems, and Signal Processing

Abstract

The onset of a musical note is the earliest time at which the note can be reliably detected. Detecting these onsets is challenging in the presence of ornamentation such as vibrato and bending, and when the attack of the note transient is slow. Legacy methods such as spectral difference (spectral flux) and complex-domain functions suffer from false positives, because ornamentation can pose as viable onsets. We propose that this can be addressed by appropriately improving the resolution of the onset strength signal (OSS), which increases true positives, and by smoothing it, which decreases false positives. A peak-picking algorithm that works well with the resulting OSS is also needed. Since onset detection is a low-level process on which many other tasks are built, computational complexity must also be kept low. We propose an onset detection algorithm that combines short-time spectral average-based OSS estimation, chirp group delay-based smoothing, and valley–peak distance-based peak picking. The algorithm performs on par with the state of the art, namely SuperFlux and convolutional neural network-based onset detection, with an average \(\text{F}_{1}\) score of 0.88 across three datasets. Subsets of the IDMT-SMT-Guitar, Guitarset, and Musicnet datasets that fit the scope of the work are used for evaluation. The proposed algorithm is also found to be computationally 300% more efficient than SuperFlux. The positive effect of smoothing an OSS on determining onset locations is established by refining the OSS produced by legacy algorithms, where a consistent improvement in onset detection performance is observed. To provide insight into the performance of the proposed algorithm when different ornamentation styles are present in a recording, results are computed at three levels by selecting different subsets of the IDMT dataset.
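To make the reported \(\text{F}_{1}\) score concrete, the following is a minimal sketch of how onset detections are commonly scored against reference annotations. It is not taken from the article: the function name `onset_f1`, the greedy one-to-one matching, and the 50 ms tolerance window are assumptions chosen for illustration, and the evaluation settings actually used by the authors are not stated in the abstract.

```python
from typing import Sequence, Tuple


def onset_f1(
    reference: Sequence[float],
    detected: Sequence[float],
    tolerance: float = 0.05,  # assumed window in seconds, not from the article
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for onset times given in seconds.

    Each reference onset may be matched by at most one detection that lies
    within +/- tolerance of it. Unmatched detections count as false
    positives and unmatched references as false negatives.
    """
    ref = sorted(reference)
    det = sorted(detected)
    i = j = true_pos = 0

    # Greedy two-pointer matching over the sorted onset lists.
    while i < len(ref) and j < len(det):
        diff = det[j] - ref[i]
        if abs(diff) <= tolerance:
            true_pos += 1
            i += 1
            j += 1
        elif diff < 0:
            j += 1  # detection precedes the window: false positive
        else:
            i += 1  # no detection reached this reference: missed onset

    false_pos = len(det) - true_pos
    false_neg = len(ref) - true_pos
    precision = true_pos / (true_pos + false_pos) if det else 0.0
    recall = true_pos / (true_pos + false_neg) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical annotation and detector output, in seconds.
    annotated = [0.50, 1.00, 1.50, 2.00]
    found = [0.52, 1.20, 1.49, 2.01, 2.60]
    p, r, f = onset_f1(annotated, found)
    print(f"precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

In this hypothetical example, three of the five detections fall within the window of a reference onset, so two detections are false positives and one annotated onset is missed. This is the trade-off the abstract describes: a sharper OSS aims to recover missed onsets (recall), while smoothing aims to suppress spurious peaks caused by ornamentation (precision).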

Data Availability

The datasets used for analysis in the current study are available from the following links: IDMT-SMT-Guitar (https://www.idmt.fraunhofer.de/en/publications/datasets/guitar.html), Guitarset (https://guitarset.weebly.com/), and Musicnet (https://zenodo.org/record/5120004#.YhYWoXVBzMU).

Author information

Corresponding author

Correspondence to S. Johanan Joysingh.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Joysingh, S.J., Vijayalakshmi, P. & Nagarajan, T. Chirp Group Delay-Based Onset Detection in Instruments with Fast Attack. Circuits Syst Signal Process 42, 1639–1662 (2023). https://doi.org/10.1007/s00034-022-02183-4

  • DOI: https://doi.org/10.1007/s00034-022-02183-4
