Abstract
In this paper, we propose a deep recurrent neural network (DRNN), based on the Long Short-Term Memory (LSTM) unit, for separating the drum and bass sources from a monaural audio track. Specifically, one DRNN with a total of six hidden layers (three feedforward and three recurrent) is used for each original source to be separated. In this work, we restrict our attention to two challenging sources: drum and bass. Experimental results show the effectiveness of the proposed approach with respect to another state-of-the-art method. Results are expressed in terms of well-known metrics in the field of source separation.
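The abstract does not detail how the per-source network outputs are combined, but separation systems of this kind typically apply a time-frequency soft mask to the mixture spectrogram. The sketch below illustrates that masking step in NumPy; all shapes, variable names, and the random placeholder "network outputs" are illustrative assumptions, not the authors' implementation. The simplified SDR function is likewise only an SNR-style stand-in for the full evaluation metrics.

```python
import numpy as np

# Illustrative soft-mask separation: given magnitude estimates for the two
# sources (random placeholders standing in for the per-source DRNN outputs),
# build a Wiener-like soft mask and apply it to the mixture magnitude.
rng = np.random.default_rng(0)
F, T = 513, 100                      # frequency bins x time frames (assumed)
drum_est = rng.random((F, T))        # placeholder drum-network output
bass_est = rng.random((F, T))        # placeholder bass-network output
mix_mag = drum_est + bass_est        # toy mixture magnitude (additive)

eps = 1e-12                          # guard against division by zero
mask_drum = drum_est / (drum_est + bass_est + eps)
mask_bass = 1.0 - mask_drum          # masks sum to one per TF bin

drum_sep = mask_drum * mix_mag       # separated drum magnitude
bass_sep = mask_bass * mix_mag       # separated bass magnitude

def sdr_db(ref, est):
    """Simplified signal-to-distortion ratio in dB (no projection step)."""
    return 10.0 * np.log10(np.sum(ref ** 2) / (np.sum((ref - est) ** 2) + eps))

# In this additive toy example the masked estimates reconstruct the mixture
# exactly, so the two separated magnitudes sum back to mix_mag.
assert np.allclose(drum_sep + bass_sep, mix_mag)
```

Because the toy mixture is built additively in the magnitude domain, the mask recovers each source essentially exactly; with a real mixture the phases interfere, which is why evaluation relies on metrics such as SDR rather than exact reconstruction.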
Notes
1. Available at: http://medleydb.weebly.com/.
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
Cite this chapter
Scarpiniti, M., Scardapane, S., Comminiello, D., Parisi, R., Uncini, A. (2019). Separation of Drum and Bass from Monaural Tracks. In: Esposito, A., Faundez-Zanuy, M., Morabito, F., Pasero, E. (eds) Neural Advances in Processing Nonlinear Dynamic Signals. WIRN 2017. Smart Innovation, Systems and Technologies, vol 102. Springer, Cham. https://doi.org/10.1007/978-3-319-95098-3_13
DOI: https://doi.org/10.1007/978-3-319-95098-3_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95097-6
Online ISBN: 978-3-319-95098-3
eBook Packages: Intelligent Technologies and Robotics (R0)