Music Generation Using Bayesian Networks

Kitahara, Tetsuro

doi:10.1007/978-3-319-71273-4_33

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10536)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Music generation has recently become popular as an application of machine learning. To generate polyphonic music, one must consider both simultaneity (the vertical consistency) and sequentiality (the horizontal consistency). Bayesian networks are well suited to modeling both at once. Here, we present music generation models based on Bayesian networks applied to chord voicing, four-part harmonization, and real-time chord prediction.

This work was supported by JSPS KAKENHI Grant Numbers 16K16180, 16H01744, 16KT0136, and 17H00749.

1 Introduction

Music is widely known as an application domain of machine learning. At the beginning of the 21st century, research concentrated on recognition and analysis tasks such as music transcription and genre classification. Recently, however, the number of studies devoted to music generation has been increasing (e.g., [1]).

When generating polyphonic music, one must consider consistency in two directions: simultaneity (i.e., the vertical or pitch-axis consistency) and sequentiality (i.e., the horizontal or time-axis consistency). Our team has investigated music generation models that consider both simultaneity and sequentiality using Bayesian networks [2,3,4]. Here, we present our models applied to chord voicing [2], four-part harmonization [3], and real-time chord prediction [4].

2 Assumed Music Structure and Fundamental Model

Suppose that a chord progression \(C = [c_1, c_2,{\cdots }, c_N]\) (\(c_i\): chord symbol) exists in a piece of music. Each chord \(c_i\) (e.g., Am) is played with a particular voicing \((a^{(1)}_i, a^{(2)}_i,{\cdots }, a^{(K)}_i)\) (\(a^{(k)}_i\): note name, a.k.a. pitch class), e.g., (C, E, A). As noted in the Introduction, the simultaneous notes \((a^{(1)}_i, a^{(2)}_i,{\cdots }, a^{(K)}_i)\) should be harmonically consistent with one another, and each sequence \(A^{(k)} = [a^{(k)}_1, a^{(k)}_2,{\cdots }, a^{(k)}_N]\) should be temporally smooth. At the same time, a melody \(M=[m_{1,1}, m_{1,2},{\cdots }, m_{2,1}, \cdots ]\) exists, where \(m_{i,j}\) represents the note name of the j-th note in the i-th chord region. The sequences of chords, voicings, and melody notes are considered to have temporal dependencies within each sequence and also to depend on one another, as shown in Fig. 1(a). In practice, this fundamental model is difficult to construct because the number of melody notes varies from one chord region to another. We therefore simplify the model by imposing restrictions on the music structure that are tailored to each music generation task.

3 Chord Voicing

Chord voicing refers to estimating voicings \((A^{(1)}, A^{(2)},{\cdots }, A^{(K)})\) according to a given chord progression C and melody M. Here we assume \(K=4\) for simplicity. To resolve the difficulty caused by variations in the number of melody notes within each chord region, we replace the individual melody notes with a single melody node \(m'_i = (r_{i,0},{\cdots }, r_{i,11})\) (\(0 \le r_{i,p} \le 1\)), where \(r_{i,p}\) is the fraction of the i-th chord region occupied by note name p (with C at index 0). For example, \(m'_i = (0.5, 0, 0.25, 0, 0.25, 0,{\cdots }, 0)\) is given for a melody [E, D, C, C] whose notes have equal duration. The simplified model is shown in Fig. 1(b).
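The construction of the melody node can be illustrated with a minimal Python sketch. The function name `melody_node`, the sharp-based spelling of note names, and the representation of a chord region as (note name, duration) pairs are assumptions made for illustration; they do not come from the paper.

```python
# Note names indexed by pitch class, with C at index 0 (illustrative spelling).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def melody_node(notes):
    """Return the 12-dimensional relative-duration vector of one chord region.

    `notes` is a list of (note_name, duration) pairs; durations may use any
    unit as long as it is the same for every note in the region.
    """
    total = sum(duration for _, duration in notes)
    if total <= 0:
        raise ValueError("the chord region needs a positive total duration")
    vector = [0.0] * 12
    for name, duration in notes:
        # Accumulate the share of the region occupied by this note name.
        vector[NOTE_NAMES.index(name)] += duration / total
    return vector


# The melody [E, D, C, C] with equal durations, as in the text.
region = [("E", 1), ("D", 1), ("C", 1), ("C", 1)]
m_prime = melody_node(region)
print(m_prime[:5])  # [0.5, 0.0, 0.25, 0.0, 0.25]
```

Because the vector always has twelve entries, the melody node has the same form regardless of how many notes the chord region contains.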

This model is applied sequentially from the beginning to the end of a given piece. Given \(c_i\), \(m'_i\), and \((a^{(1)}_{i-1},{\cdots }, a^{(K)}_{i-1})\), the model estimates the i-th chord voicing \((a^{(1)}_i,{\cdots }, a^{(K)}_i)\) together with the next voicing \((a^{(1)}_{i+1},{\cdots }, a^{(K)}_{i+1})\), because each voicing should connect smoothly to the one that follows. The estimate of \((a^{(1)}_{i+1},{\cdots }, a^{(K)}_{i+1})\) is provisional: it is overwritten at the next step, since the procedure is repeated for each increment of i.
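The sequential procedure with a provisional next voicing might be sketched as follows. This is not the trained Bayesian network: the chord dictionary, the two scoring functions, and the use of the next chord symbol and melody vector during the lookahead are all hypothetical stand-ins chosen so that the loop runs on its own.

```python
from itertools import permutations

# Hypothetical chord dictionary: pitch classes with C = 0, ..., B = 11.
CHORD_TONES = {
    "Dm7": (2, 5, 9, 0),
    "G7": (7, 11, 2, 5),
    "Cmaj7": (0, 4, 7, 11),
}


def pc_distance(a, b):
    """Shortest distance between two pitch classes on the 12-tone circle."""
    d = abs(a - b) % 12
    return min(d, 12 - d)


def local_score(melody_vec, voicing):
    # Toy stand-in for the vertical (simultaneity) term: reward a top voice
    # that matches a prominent melody note name.
    return melody_vec[voicing[0]]


def transition_score(prev, cur):
    # Toy stand-in for the horizontal (sequentiality) term: penalise motion.
    if prev is None:
        return 0.0
    return -sum(pc_distance(p, c) for p, c in zip(prev, cur))


def voice_progression(chords, melody_vecs):
    """Pick one voicing per chord, looking one step ahead at every position."""
    voicings, prev = [], None
    for i, chord in enumerate(chords):
        has_next = i + 1 < len(chords)
        next_cands = (list(permutations(CHORD_TONES[chords[i + 1]]))
                      if has_next else [None])
        best, best_score = None, float("-inf")
        for cur in permutations(CHORD_TONES[chord]):
            base = local_score(melody_vecs[i], cur) + transition_score(prev, cur)
            for nxt in next_cands:
                score = base
                if nxt is not None:
                    score += (local_score(melody_vecs[i + 1], nxt)
                              + transition_score(cur, nxt))
                if score > best_score:
                    best, best_score = cur, score
        # Only the current voicing is kept; the provisional next voicing is
        # re-estimated when the loop reaches position i + 1.
        voicings.append(best)
        prev = best
    return voicings


def one_hot(pitch_class):
    vec = [0.0] * 12
    vec[pitch_class] = 1.0
    return vec


chords = ["Dm7", "G7", "Cmaj7"]
melody_vecs = [one_hot(5), one_hot(2), one_hot(4)]  # F, D, E
result = voice_progression(chords, melody_vecs)
print(result)
```

The point of the lookahead is that a voicing is chosen not only for how well it fits the current chord and melody, but also for how smoothly it can lead into some voicing of the following chord.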

An example of chord voicing is shown in Fig. 2. The model was trained on 30 jazz pieces arranged for the electronic organ. In listening tests conducted by music experts, 94.7% of the chord voicings were judged acceptable.

4 Four-Part Harmonization

Here, we focus on harmonization. Unlike in voicing, the sequence of chord symbols is not given and has to be estimated. For simplicity, we adopt the “one chord for one melody note” assumption. Under this assumption, the Bayesian network can be simplified to that shown in Fig. 1(c). Here we assume \(K=3\). This problem is called four-part harmonization because the resulting harmony consists of four voices (i.e., soprano, alto, tenor, and bass). Furthermore, we constructed a Bayesian network in which the chord nodes are removed (Fig. 1(d)) because chord symbols are sometimes too ambiguous.
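To make the "one chord for one melody note" assumption concrete, the following sketch estimates a chord sequence for a melody with the Viterbi algorithm on a simple chain. It covers only the chord and melody nodes, not the alto, tenor, and bass nodes of the model in Fig. 1(c), and the probability tables are invented toy values rather than parameters learned from data.

```python
import math

CHORDS = ["C", "F", "G"]

# Hypothetical toy tables (not learned from data).
TRANSITION = {  # P(next chord | current chord)
    "C": {"C": 0.4, "F": 0.3, "G": 0.3},
    "F": {"C": 0.4, "F": 0.2, "G": 0.4},
    "G": {"C": 0.6, "F": 0.1, "G": 0.3},
}
EMISSION = {  # P(melody note | chord)
    "C": {"C": 0.4, "E": 0.3, "G": 0.3},
    "F": {"F": 0.4, "A": 0.3, "C": 0.3},
    "G": {"G": 0.4, "B": 0.3, "D": 0.3},
}


def log(p):
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def harmonize(melody):
    """Return the most likely chord sequence, one chord per melody note."""
    if not melody:
        return []
    # Initialise with a uniform prior over the first chord.
    scores = {c: log(1 / len(CHORDS)) + log(EMISSION[c].get(melody[0], 0.0))
              for c in CHORDS}
    backpointers = []
    for note in melody[1:]:
        new_scores, pointers = {}, {}
        for c in CHORDS:
            best_prev = max(CHORDS, key=lambda p: scores[p] + log(TRANSITION[p][c]))
            new_scores[c] = (scores[best_prev] + log(TRANSITION[best_prev][c])
                             + log(EMISSION[c].get(note, 0.0)))
            pointers[c] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace the best path backwards from the best final chord.
    path = [max(CHORDS, key=scores.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


print(harmonize(["E", "F", "D", "C"]))  # ['C', 'F', 'G', 'C']
```

In this toy example the final melody note C fits both the C and F chords; the transition table decides in favour of C because G to C is far more probable than G to F.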

Figure 3 shows an example of harmonization using these two models. Our objective quantitative evaluation reveals that the model shown in Fig. 1(d) generates temporally smoother harmonies than the model shown in Fig. 1(c), even though its harmonizations tend to contain slightly more dissonant sounds.

5 Real-Time Chord Prediction

Finally, we apply our Bayesian network to real-time chord prediction. Music experts can often precisely predict the next chord by listening to the current chord, even if they are not familiar with the piece being played. This ability derives from the fact that chord progressions have strong temporal dependencies; experts have learned these dependencies based on their musical experience. They are therefore able to play an accompaniment to a melody that they are listening to for the first time. The goal here is to achieve a computer system that plays such an accompaniment.

Real-time chord prediction can also be achieved through a simplified version of the fundamental model shown in Fig. 1(a). For simplicity, we estimate only chord symbols and determine the voicings through a separately designed rule. The model used here is shown in Fig. 1(e). Given a new melody note, its next note is predicted. At the same time, the most likely next chord is inferred based on the current chord and the predicted next note.
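The two-stage prediction can be sketched in a few lines of Python. The tables below are invented toy values, and scoring a candidate chord as P(next chord | current chord) times P(predicted note | next chord) is one plausible factorization assumed for illustration; the structure actually used is the one in Fig. 1(e).

```python
# Hypothetical toy tables (not learned from data).
NEXT_NOTE = {  # P(next note | current note)
    "C": {"D": 0.5, "E": 0.3, "G": 0.2},
    "D": {"E": 0.5, "C": 0.3, "G": 0.2},
}
NEXT_CHORD = {  # P(next chord | current chord)
    "C": {"F": 0.4, "G": 0.4, "Am": 0.2},
    "G": {"C": 0.7, "Am": 0.2, "F": 0.1},
}
NOTE_GIVEN_CHORD = {  # P(note | chord)
    "C": {"C": 0.4, "E": 0.3, "G": 0.3},
    "F": {"F": 0.4, "A": 0.3, "C": 0.3},
    "G": {"G": 0.35, "B": 0.25, "D": 0.4},
    "Am": {"A": 0.4, "C": 0.3, "E": 0.3},
}


def predict(current_note, current_chord):
    """Predict the next melody note, then the most likely next chord."""
    note_probs = NEXT_NOTE[current_note]
    next_note = max(note_probs, key=note_probs.get)
    # Score each candidate chord by its transition probability and by how
    # well it explains the predicted note.
    scores = {
        chord: prob * NOTE_GIVEN_CHORD[chord].get(next_note, 0.0)
        for chord, prob in NEXT_CHORD[current_chord].items()
    }
    next_chord = max(scores, key=scores.get)
    return next_note, next_chord


print(predict("C", "C"))  # ('D', 'G')
```

Here F and G are equally likely to follow C on their own, and the predicted melody note D breaks the tie in favour of G. Because both lookups are constant-time, a step of this kind can be run each time a new melody note arrives.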

An example of chord prediction is shown in Fig. 4. This figure shows that the model appropriately predicts the chord progression.

6 Conclusion

We have presented Bayesian network models that achieve different music generation tasks: chord voicing, four-part harmonization, and real-time chord prediction. Bayesian networks are flexible models that are suitable for constructing a unified music generation model. In the future, we will apply our model to other types of music generation tasks.

References

1. Hadjeres, G., Pachet, F.: DeepBach: A Steerable Model for Bach Chorales Generation. arXiv:1612.01010 [cs.AI] (2016)
2. Kitahara, T., Katsura, M., Katayose, H., Nagata, N.: Computational model for automatic chord voicing based on Bayesian network. In: ICMPC, pp. 395–398 (2008)
3. Suzuki, S., Kitahara, T.: Four-part harmonization using Bayesian networks: pros and cons of introducing chord nodes. J. New Music Res. 43(3), 331–353 (2014)
4. Kitahara, T., Totani, N., Tokuami, R., Katayose, H.: BayesianBand: jam session system based on mutual prediction by user and system. In: Natkin, S., Dupire, J. (eds.) ICEC 2009. LNCS, vol. 5709, pp. 179–184. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04052-8_17

Author information

Authors and Affiliations

College of Humanities and Sciences, Nihon University, 3-25-40, Sakurajosui, Setagaya-ku, Tokyo, 156-8550, Japan
Tetsuro Kitahara

Corresponding author

Correspondence to Tetsuro Kitahara.

Editor information

Editors and Affiliations

Google Research, Google Inc., Zurich, Switzerland
Yasemin Altun
NASA Ames Research Center, Mountain View, USA
Kamalika Das
Oath, Sunnyvale, USA
Taneli Mielikäinen
Department of Computer Science, University of Bari Aldo Moro, Bari, Italy
Donato Malerba
Institute of Computing Science, Poznan University of Technology, Poznan, Poland
Jerzy Stefanowski
Laboratoire d’ Informatique (LIX), École Polytechnique, Palaiseau, France
Jesse Read
Department of Computer Science, Stanford University, Stanford, USA
Marinka Žitnik
Università degli Studi di Bari Aldo Moro, Bari, Italy
Michelangelo Ceci
Jožef Stefan Institute, Ljubljana, Slovenia
Sašo Džeroski

About this paper

Cite this paper

Kitahara, T. (2017). Music Generation Using Bayesian Networks. In: Altun, Y., et al. Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2017. Lecture Notes in Computer Science, vol 10536. Springer, Cham. https://doi.org/10.1007/978-3-319-71273-4_33

DOI: https://doi.org/10.1007/978-3-319-71273-4_33
Published: 30 December 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-71272-7
Online ISBN: 978-3-319-71273-4
eBook Packages: Computer Science, Computer Science (R0)
