Abstract
A Content Based Audio Retrieval system is presented, which uses Mel Scaled Frequency Cepstral Coefficients (MFCCs) as frequency-based feature vectors to distinguish between different musical instruments via a neural network. Supervectors were formed from the concatenation of 14-dimensional feature vectors (12 MFCCs, 1 spectral centroid and 1 frame energy feature) taken from 213 time slices over 4-second samples. These feature sets were then used to train a two-layer feedforward neural network to segregate instrument samples into categories based on their audio content. The motivation for this work is to create a system which, when presented with an unknown sample, can categorise it as belonging to one of a set of predefined classes. The system was tested on the recognition of instruments given 4-second monophonic phrases. An overall recognition accuracy of 40% was obtained, with the drum and bass classes being the best categorised.
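The supervector construction described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract gives the dimensions (14 features per slice, 213 slices) but not the concatenation order, so the time-major ordering, the function name `build_supervector`, and the placeholder feature values are all hypothetical rather than the author's implementation.

```python
from typing import List, Sequence

N_MFCC = 12                     # MFCCs per time slice (from the abstract)
N_EXTRA = 2                     # spectral centroid + frame energy (from the abstract)
FRAME_DIM = N_MFCC + N_EXTRA    # 14-dimensional feature vector
N_SLICES = 213                  # time slices per 4-second sample (from the abstract)


def build_supervector(frames: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate per-slice feature vectors into one flat supervector.

    Assumption: slices are joined in time order (slice 0 first); the
    abstract does not specify the ordering.
    """
    if len(frames) != N_SLICES:
        raise ValueError(f"expected {N_SLICES} slices, got {len(frames)}")
    supervector: List[float] = []
    for index, frame in enumerate(frames):
        if len(frame) != FRAME_DIM:
            raise ValueError(
                f"slice {index}: expected {FRAME_DIM} features, got {len(frame)}"
            )
        supervector.extend(float(value) for value in frame)
    return supervector


# Placeholder data standing in for real MFCC/centroid/energy values:
# every feature in slice t is simply set to t.
frames = [[float(t)] * FRAME_DIM for t in range(N_SLICES)]
sv = build_supervector(frames)
print(len(sv))  # 14 * 213 = 2982 inputs per sample
```

Under these assumptions each 4-second sample becomes a single 2982-element input vector, which would fix the size of the network's input layer.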
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
McGrath, D. (2002). Design of a Musical Instrument Classifier System Based on Mel Scaled Cepstral Coefficient Supervectors and a Supervised Two-Layer Feedforward Neural Network. In: O’Neill, M., Sutcliffe, R.F.E., Ryan, C., Eaton, M., Griffith, N.J.L. (eds) Artificial Intelligence and Cognitive Science. AICS 2002. Lecture Notes in Computer Science, vol 2464. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45750-X_29
DOI: https://doi.org/10.1007/3-540-45750-X_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44184-7
Online ISBN: 978-3-540-45750-3
eBook Packages: Springer Book Archive