Audio Coding for Representation in MIDI via Pitch Detection Using Harmonic Dictionaries

Journal of VLSI signal processing systems for signal, image and video technology

Abstract

The search for a flexible and concise alternative representation of digital musical sound motivates the use of the MIDI (Musical Instrument Digital Interface) protocol. The problem then becomes one of automating the conversion from sound to MIDI, which requires processing the musical sound and extracting the information needed to represent it as MIDI data. Our studies have led to algorithms for segmenting the sound and detecting the pitch of individual notes. We describe a novel method for pitch detection using subset selection with dictionaries containing harmonic spectra from samples of musical sounds. Examples demonstrating applicability to monophonic sounds as well as signals with multiple sound sources are given, including detection of objects in a complex background scene.
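The abstract does not spell out the subset-selection procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a greedy, matching-pursuit-style selection, and it builds the dictionary from synthetic harmonic templates with a 1/k amplitude rolloff instead of spectra sampled from real instruments. All function names, the 1 Hz bin width, the note range, and the stopping threshold are hypothetical choices made for illustration.

```python
import math

def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note number (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)

def harmonic_atom(note, n_harmonics=8):
    """Unit-norm sparse spectrum {bin_in_hz: amplitude}.

    Assumption: harmonics fall off as 1/k and are rounded to 1 Hz bins.
    A real dictionary would hold spectra measured from instrument samples.
    """
    atom = {}
    for k in range(1, n_harmonics + 1):
        b = round(k * midi_to_hz(note))
        atom[b] = atom.get(b, 0.0) + 1.0 / k
    norm = math.sqrt(sum(a * a for a in atom.values()))
    return {b: a / norm for b, a in atom.items()}

def dot(u, v):
    """Inner product of two sparse spectra stored as dicts."""
    return sum(a * v.get(b, 0.0) for b, a in u.items())

def detect_notes(spectrum, dictionary, max_notes=4, min_corr=0.1):
    """Greedy subset selection: repeatedly pick the best-matching atom
    and subtract its projection from the residual spectrum."""
    residual = dict(spectrum)
    picked = []
    for _ in range(max_notes):
        note, corr = max(
            ((n, dot(residual, atom)) for n, atom in dictionary.items()),
            key=lambda pair: pair[1],
        )
        if corr < min_corr:  # nothing left that resembles a dictionary entry
            break
        picked.append(note)
        for b, a in dictionary[note].items():
            residual[b] = residual.get(b, 0.0) - corr * a
    return sorted(picked)

# Hypothetical dictionary covering MIDI notes C3 (48) through C6 (84).
DICTIONARY = {n: harmonic_atom(n) for n in range(48, 85)}

# Synthetic two-source spectrum: A4 (69) and E5 (76) sounding together.
mix = {}
for n in (69, 76):
    for b, a in DICTIONARY[n].items():
        mix[b] = mix.get(b, 0.0) + a

print(detect_notes(mix, DICTIONARY))
```

Because the test spectrum is assembled from the dictionary's own atoms, this only illustrates the selection mechanics. Recorded sound would additionally need segmentation, a spectrum estimate per segment, and templates that reflect the actual instruments.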

Cite this article

Sieger, N.J., Tewfik, A.H. Audio Coding for Representation in MIDI via Pitch Detection Using Harmonic Dictionaries. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 20, 45–59 (1998). https://doi.org/10.1023/A:1008074130468
