Deriving Musical Structures from Signal Analysis for Music Audio Summary Generation: “Sequence” and “State” Approach

Conference paper
Computer Music Modeling and Retrieval (CMMR 2003)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2771)

Abstract

In this paper, we investigate the derivation of musical structures directly from signal analysis with the aim of generating visual and audio summaries. From the audio signal, we first derive features: static features (MFCC, chromagram) or the proposed dynamic features. Two approaches are then studied to automatically derive the structure of a piece of music. The sequence approach considers the audio signal as a repetition of sequences of events. Sequences are derived from the similarity matrix of the features by a proposed algorithm based on a 2D structuring filter and pattern matching. The state approach considers the audio signal as a succession of states. Since human segmentation and grouping perform better upon subsequent hearings, this natural approach is followed here using a proposed multi-pass approach that combines time segmentation and unsupervised learning methods. Both the sequence and state representations are then used to create an audio summary using various techniques.
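As a rough illustration of the two representations described in the abstract, the sketch below extracts MFCC features, builds a cosine self-similarity matrix (the raw material of the sequence approach, in which repeated sequences show up as off-diagonal stripes), and clusters frames with k-means as a simplified stand-in for the multi-pass state approach. This is a minimal sketch under assumed tooling (librosa, scikit-learn, and a hypothetical input file song.wav), not the paper's proposed algorithm.

```python
# Minimal, illustrative sketch (not the paper's algorithm):
# MFCC features -> cosine self-similarity matrix (sequence view)
# and k-means frame clustering (a crude stand-in for the state view).
import numpy as np
import librosa
from sklearn.cluster import KMeans


def self_similarity(features):
    """Cosine self-similarity matrix of column-wise feature vectors."""
    norms = np.linalg.norm(features, axis=0, keepdims=True) + 1e-12
    unit = features / norms
    return unit.T @ unit  # shape: (n_frames, n_frames)


def frame_states(features, n_states=4):
    """Assign each frame to one of n_states clusters (illustrative only)."""
    km = KMeans(n_clusters=n_states, n_init=10, random_state=0)
    return km.fit_predict(features.T)  # one label per frame


if __name__ == "__main__":
    y, sr = librosa.load("song.wav", sr=22050)           # hypothetical input file
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # static features, (13, n_frames)
    S = self_similarity(mfcc)     # repeated sequences appear as off-diagonal stripes
    states = frame_states(mfcc)   # coarse per-frame "state" labels
    print(S.shape, np.bincount(states))
```

In the paper, the sequence structure is then extracted from such a similarity matrix with a 2D structuring filter and pattern matching, and the state representation is refined with a multi-pass combination of time segmentation and unsupervised learning rather than a single k-means pass.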

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Peeters, G. (2004). Deriving Musical Structures from Signal Analysis for Music Audio Summary Generation: “Sequence” and “State” Approach. In: Wiil, U.K. (eds) Computer Music Modeling and Retrieval. CMMR 2003. Lecture Notes in Computer Science, vol 2771. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39900-1_14

  • DOI: https://doi.org/10.1007/978-3-540-39900-1_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20922-5

  • Online ISBN: 978-3-540-39900-1

  • eBook Packages: Springer Book Archive
