Design of a Musical Instrument Classifier System Based on Mel Scaled Cepstral Coefficient Supervectors and a Supervised Two-Layer Feedforward Neural Network

McGrath, Declan

doi:10.1007/3-540-45750-X_29

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2464)

Included in the following conference series:

Irish Conference on Artificial Intelligence and Cognitive Science

Abstract

A Content Based Audio Retrieval system is presented, which uses Mel Scaled Frequency Cepstral Coefficients (MFCCs) as frequency-based feature vectors to distinguish between different musical instruments via a Neural Network. Supervectors were formed from the concatenation of 14-dimensional feature vectors (12 MFCCs, 1 spectral centroid and 1 frame energy feature) taken from 213 time slices, over 4-second samples. These feature sets were then used to train a two-layer, feedforward neural network to segregate instrument samples into categories based on their audio content. The motivation for this work is to create a system which, when presented with an unknown sample, can categorise it as belonging to one of a set of predefined classes. The system was tested on the recognition of instruments given 4-second monophonic phrases. An overall recognition accuracy of 40% was obtained, with drum and bass classes being the best categorised.
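The supervector arithmetic described in the abstract can be illustrated with a minimal sketch. Concatenating 213 time slices of 14 features each gives a 213 × 14 = 2982-dimensional input. Only those sizes come from the abstract; the hidden layer width, the number of instrument classes, the tanh and softmax activations, the random weights and the random feature values are all assumptions made for illustration, and "two-layer" is read here as one hidden layer plus one output layer. It is not the author's implementation.

```python
import math
import random

N_FEATURES = 14   # from the abstract: 12 MFCCs + spectral centroid + frame energy
N_SLICES = 213    # from the abstract: time slices per 4-second sample
N_HIDDEN = 8      # assumption: hidden layer size is not given in the abstract
N_CLASSES = 5     # assumption: number of instrument classes is not given


def make_supervector(frames):
    """Concatenate per-slice feature vectors into one flat supervector."""
    if len(frames) != N_SLICES:
        raise ValueError(f"expected {N_SLICES} slices, got {len(frames)}")
    supervector = []
    for frame in frames:
        if len(frame) != N_FEATURES:
            raise ValueError(f"expected {N_FEATURES} features per slice")
        supervector.extend(frame)
    return supervector


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(supervector, w1, b1, w2, b2):
    """Forward pass: tanh hidden layer, softmax output (both assumed)."""
    hidden = [math.tanh(v) for v in dense(supervector, w1, b1)]
    logits = dense(hidden, w2, b2)
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_out, n_in, rng):
    """Untrained placeholder weights; a real system would learn these."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
# Placeholder features standing in for real MFCC, centroid and energy values.
frames = [[rng.random() for _ in range(N_FEATURES)] for _ in range(N_SLICES)]
supervector = make_supervector(frames)

w1, b1 = random_layer(N_HIDDEN, len(supervector), rng)
w2, b2 = random_layer(N_CLASSES, N_HIDDEN, rng)
probs = forward(supervector, w1, b1, w2, b2)
predicted = max(range(N_CLASSES), key=probs.__getitem__)

print(len(supervector))  # 2982
```

With untrained weights the predicted class index is meaningless; the sketch only shows how the supervector dimension follows from the figures in the abstract and how such a vector would pass through a network of this general shape.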

Author information

Authors and Affiliations

Dept. of Information Technology, NUI, Galway
Declan McGrath

Editor information

Editors and Affiliations

Department of Computer Science and Information Systems, University of Limerick, Ireland
Michael O’Neill, Richard F. E. Sutcliffe, Conor Ryan, Malachy Eaton & Niall J. L. Griffith

Cite this paper

McGrath, D. (2002). Design of a Musical Instrument Classifier System Based on Mel Scaled Cepstral Coefficient Supervectors and a Supervised Two-Layer Feedforward Neural Network. In: O’Neill, M., Sutcliffe, R.F.E., Ryan, C., Eaton, M., Griffith, N.J.L. (eds) Artificial Intelligence and Cognitive Science. AICS 2002. Lecture Notes in Computer Science, vol 2464. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45750-X_29

DOI: https://doi.org/10.1007/3-540-45750-X_29
Published: 27 August 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44184-7
Online ISBN: 978-3-540-45750-3
eBook Packages: Springer Book Archive