Abstract
Mathematical Morphology has proven to be a powerful tool for extracting geometric information from greyscale images. In this paper, we demonstrate its application to spectrograms (two-dimensional greyscale images of sound) of music excerpts. The sounds of musical instruments exhibit particular shapes when represented as a spectrogram, and these shapes are determined by the characteristics of the sound. In general, musical sounds contain three different components: the attack component, appearing as vertical lines; the sustain component, appearing as horizontal lines; and the stochastic component, appearing as a landscape of hills and holes. We propose a pipeline of morphological operators to separate these three components. The separated components can then be used to build a new sound similar to the input.
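To make the geometric idea concrete, the following is a minimal sketch of how directional openings can pull horizontal and vertical structures out of a greyscale array. It is not the pipeline proposed in the paper: it assumes a spectrogram stored with rows as frequency bins and columns as time frames, uses plain flat openings from `scipy.ndimage`, and the function name `split_components` and the segment lengths are hypothetical choices made for illustration.

```python
import numpy as np
from scipy import ndimage


def split_components(spec, time_len=9, freq_len=9):
    """Illustrative split of a spectrogram (rows = frequency, columns = time).

    Hypothetical helper, not the authors' method.
    """
    # Opening by a flat horizontal segment keeps structures that extend over
    # at least `time_len` frames along the time axis (sustain-like lines).
    sustain = ndimage.grey_opening(spec, size=(1, time_len))
    # Opening by a flat vertical segment keeps structures that extend over
    # at least `freq_len` bins along the frequency axis (attack-like lines).
    attack = ndimage.grey_opening(spec, size=(freq_len, 1))
    # Whatever neither opening retains is treated here as the residual.
    residual = spec - np.maximum(sustain, attack)
    return sustain, attack, residual


# Toy spectrogram: one horizontal line, one vertical line, one isolated peak.
spec = np.zeros((64, 64))
spec[20, :] = 1.0   # sustained partial
spec[:, 40] = 1.0   # attack
spec[50, 10] = 0.5  # isolated peak standing in for the stochastic part

sustain, attack, residual = split_components(spec)
```

On this toy input, `sustain` retains only the horizontal line, `attack` only the vertical one, and the isolated peak ends up in `residual`. Real spectrograms have curved, noisy ridges, so a practical pipeline needs more than two flat openings.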
This work was partly supported by the chair of I. Bloch in Artificial Intelligence (Sorbonne Université and SCAI).
Notes
1. In the experiments presented in this work, we chose a 10 ms step for time and a \(\frac{44100}{4096}\approx 10.77\) Hz step for frequency. These values are common in music applications.
2. These continuous values are sampled according to the grid, and become \(7\times 3\) in our case.
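The grid arithmetic in these notes can be checked with a few lines of Python. This is a sketch under assumptions: the fraction is read as a 44100 Hz sampling rate divided by a 4096-sample analysis window, and the helper `to_grid` and the extents passed to it are hypothetical, chosen only to show how a continuous extent is rounded onto the grid. They are not the values behind the \(7\times 3\) element mentioned in the second note.

```python
sample_rate = 44100   # Hz, assumed from the numerator of 44100/4096
n_fft = 4096          # analysis window length in samples (assumed)
hop_seconds = 0.010   # 10 ms time step, as stated in the first note

# Time step expressed in samples, and frequency step in Hz.
hop_samples = round(sample_rate * hop_seconds)
freq_step_hz = sample_rate / n_fft


def to_grid(duration_s, bandwidth_hz):
    """Round a continuous extent to (time frames, frequency bins).

    Hypothetical helper for illustration only.
    """
    return round(duration_s / hop_seconds), round(bandwidth_hz / freq_step_hz)


# Illustrative extents only, not taken from the paper.
frames, bins_ = to_grid(0.05, 50.0)
print(hop_samples, round(freq_step_hz, 2), frames, bins_)
```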
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Romero-García, G., Bloch, I., Agón, C. (2024). Mathematical Morphology Applied to Feature Extraction in Music Spectrograms. In: Brunetti, S., Frosini, A., Rinaldi, S. (eds) Discrete Geometry and Mathematical Morphology. DGMM 2024. Lecture Notes in Computer Science, vol 14605. Springer, Cham. https://doi.org/10.1007/978-3-031-57793-2_33
DOI: https://doi.org/10.1007/978-3-031-57793-2_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-57792-5
Online ISBN: 978-3-031-57793-2
eBook Packages: Computer Science (R0)