Abstract
Although various deep learning models have been devised, classifying in-vehicle noise remains challenging because of the reverberation and the variance in the low-frequency band that arise in the narrow interior space. Given both the impulsive characteristics of vehicle noise and the multi-channel sampling environment, it is essential to learn a disentangled noise representation automatically and to parameterize the conventional beamforming operation. We propose a method that addresses these two hurdles by parameterizing the beamforming operation with a convolutional neural network. Moreover, we improve the structure of the beamforming network by explicitly learning the distance between vehicle noises within the triplet network framework. Experiments on a dataset of 241,958,848 time-series in total, collected by a global motor company, show that the proposed model improves classification accuracy by 5% compared to the latest deep acoustic models. A detailed analysis shows that the proposed method can potentially compensate for the disjointness between the vehicle types used for learning and those used for validation.
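To make the idea of "learning the distance between vehicle noises within the triplet network framework" concrete, the following is a minimal sketch of the standard hinge-style triplet loss on embedding vectors. It is not the authors' implementation: the function names, the Euclidean distance, the margin value, and the 2-D example embeddings are all assumptions chosen for illustration, and the embedding network itself (the convolutional beamforming network) is omitted.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed value; the abstract does not state a margin
) -> float:
    """Standard triplet loss: max(0, d(anchor, positive) - d(anchor, negative) + margin).

    The loss is zero once the negative (different noise class) is farther from
    the anchor than the positive (same noise class) by at least `margin`.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Hypothetical 2-D embeddings, for illustration only.
anchor = [0.0, 0.0]
near = [3.0, 4.0]   # distance 5 from the anchor
far = [6.0, 8.0]    # distance 10 from the anchor

# Same-class sample is near, other-class sample is far: constraint satisfied.
loss_satisfied = triplet_loss(anchor, positive=near, negative=far)

# Roles swapped: the same-class sample is farther away, so the loss is positive.
loss_violated = triplet_loss(anchor, positive=far, negative=near)
```

In a training loop, a loss of this form would be averaged over sampled triplets and minimized with respect to the parameters of the embedding network; how triplets are sampled is a separate design choice that this sketch does not cover.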
Acknowledgments
This work was partly supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korean government (MSIT) (No. 2020-0-01361, Artificial Intelligence Graduate School Program (Yonsei University)) and by Hyundai Motors, Inc.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Bu, SJ., Cho, SB. (2020). Automated Learning of In-vehicle Noise Representation with Triplet-Loss Embedded Convolutional Beamforming Network. In: Analide, C., Novais, P., Camacho, D., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2020. IDEAL 2020. Lecture Notes in Computer Science, vol 12490. Springer, Cham. https://doi.org/10.1007/978-3-030-62365-4_48
DOI: https://doi.org/10.1007/978-3-030-62365-4_48
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-62364-7
Online ISBN: 978-3-030-62365-4
eBook Packages: Computer Science (R0)