Abstract
Audio signal classification consists of extracting descriptive features from a sound and using them as input to a classifier, which then assigns a label to each sound class. The features can be classified in a supervised or an unsupervised way. Unsupervised classification is usually more challenging than supervised classification because it must be performed without any a priori knowledge. In this paper, unsupervised classification of audio signals is accomplished by using a Probabilistic Self-Organizing Map (PSOM) with probabilistic labeling. The hybrid unsupervised classifier presented in this work can achieve higher detection rates than those reached by the traditional unsupervised SOM. Real audio recordings of clarinet music are used to show the performance of our proposal.
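To make the general idea concrete, the following is a minimal sketch of a self-organizing map followed by a probabilistic assignment of samples to map units. It is not the PSOM described in the paper: the 1-D map, the linear decay schedules, the equal-prior isotropic Gaussian model and all function and parameter names (`train_som`, `unit_posteriors`, `var`, and so on) are assumptions made purely for illustration.

```python
import numpy as np

def train_som(data, n_units=6, n_iter=500, lr0=0.5, sigma0=2.0, seed=0):
    """Train a 1-D SOM and return its (n_units, n_features) prototypes.

    Illustrative only: decay schedules and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    # Initialise the prototypes from randomly chosen feature vectors.
    idx = rng.choice(len(data), n_units, replace=False)
    weights = data[idx].astype(float)
    grid = np.arange(n_units)  # positions of the units on the 1-D map
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                   # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # shrinking neighbourhood
        x = data[rng.integers(len(data))]
        # Best-matching unit: the prototype closest to the sample.
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # Gaussian neighbourhood on the map, centred on the BMU.
        h = np.exp(-((grid - bmu) ** 2) / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights

def unit_posteriors(data, weights, var=1.0):
    """Return p(unit | x) for every sample.

    Assumes one isotropic Gaussian of variance `var` per prototype and
    equal priors; this is a stand-in, not the paper's labeling rule.
    """
    d2 = np.sum((data[:, None, :] - weights[None, :, :]) ** 2, axis=2)
    log_p = -d2 / (2.0 * var)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)
```

Given a feature matrix `data` with one row per sound frame, `unit_posteriors(data, train_som(data)).argmax(axis=1)` assigns each frame to its most probable unit without using any class labels.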
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cruz, R., Ortiz, A., Barbancho, A.M., Barbancho, I. (2012). Unsupervised Classification of Audio Signals by Self-Organizing Maps and Bayesian Labeling. In: Corchado, E., Snášel, V., Abraham, A., Woźniak, M., Graña, M., Cho, SB. (eds) Hybrid Artificial Intelligent Systems. HAIS 2012. Lecture Notes in Computer Science, vol 7208. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28942-2_6
DOI: https://doi.org/10.1007/978-3-642-28942-2_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28941-5
Online ISBN: 978-3-642-28942-2
eBook Packages: Computer Science (R0)