Elsevier

Signal Processing

Volume 86, Issue 6, June 2006, Pages 1307-1326
Signal processing for in-car communication systems

https://doi.org/10.1016/j.sigpro.2005.07.040

Abstract

Because of the high level of background noise, communication within a car driving at high or even moderate speed is often difficult. This is especially true if one of the communication partners is the driver and the other is one of the backseat passengers. As a result of the high noise level, the backseat passengers often lean towards the front passengers. Furthermore, all speakers raise their voices. Although both reactions improve the quality of the “communication channel”, they are rather exhausting and uncomfortable for the passengers.

The situation can be improved by using in-car communication systems. These systems record the speech of each passenger by means of a single microphone or with an array of microphones. The recorded signals of the currently speaking passengers are processed by the system and played back via those loudspeakers which are located close to the non-active passengers. Comparable to public address systems, in-car communication systems operate within a closed electro-acoustic loop. Thus, signal processing is required to guarantee stable operation so as to avoid acoustic feedback such as howling or whistling.

In this contribution we describe the basic processing units of an in-car communication system. Those units contain mostly standard algorithms such as beamforming, echo cancellation, and loss control. However, these methods cannot be applied and controlled as in applications like hands-free telephones or preprocessing for speech recognition systems. Here, the problem is that the excitation signals and the distorting components are highly correlated—leading to convergence problems of adaptive algorithms. Furthermore, in-car communication systems have very restrictive demands on the tolerable processing delay.

Introduction

In limousines and vans, communication between passengers in the front and in the rear may be difficult, especially if the car is driven at medium or high speed, which results in a high background noise level. Furthermore, the driver and front passengers speak toward the windshield and are therefore hard to understand for those sitting behind them. To improve speech intelligibility, the passengers start speaking louder and lean or turn toward the listening communication partners. For longer conversations this is usually tiring and uncomfortable.

Another way to improve speech intelligibility within a passenger compartment is to use an in-car communication system [1], [2], often called an intercom system for short. Such systems record the speech of the speaking passengers by means of microphones and improve communication by playing the recorded signals via the loudspeakers located close to the listening passengers. Fig. 1 sketches the structure of a simple car interior communication system intended to support only front-to-rear conversations with one microphone and one loudspeaker.

As Fig. 1 shows, intercom systems operate in a closed electro-acoustic loop. The microphone picks up at least a portion of the loudspeaker signal. If this portion is not sufficiently small, sustained oscillations appear, which can be heard as howling or whistling. The howling margin depends on the output gain of the intercom system as well as on the gains of the analog amplifiers VMic and VLs. For this reason, all gains within the system need to be adjusted carefully.
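The dependence of the howling margin on the gains can be illustrated with a toy model. The sketch below is not taken from the paper: it assumes a single-tap, frequency-independent feedback path with a pure delay, and the names `simulate_loop`, `is_howling`, and `stability_margin_db` are hypothetical. Real feedback paths in a passenger compartment are frequency dependent, so the actual stability condition has to hold per frequency.

```python
import math


def simulate_loop(system_gain, feedback_gain, delay, num_samples=2000):
    """Microphone signal of a toy closed loop excited by a unit impulse.

    Assumed model (not from the paper):
        mic[n]         = source[n] + feedback_gain * loudspeaker[n - delay]
        loudspeaker[n] = system_gain * mic[n]
    """
    mic = [0.0] * num_samples
    loudspeaker = [0.0] * num_samples
    for n in range(num_samples):
        source = 1.0 if n == 0 else 0.0
        fed_back = feedback_gain * loudspeaker[n - delay] if n >= delay else 0.0
        mic[n] = source + fed_back
        loudspeaker[n] = system_gain * mic[n]
    return mic


def is_howling(mic, tail=200):
    """Heuristic: the loop howls if the late response has not decayed."""
    head_peak = max(abs(x) for x in mic[:tail])
    tail_peak = max(abs(x) for x in mic[-tail:])
    return tail_peak >= head_peak


def stability_margin_db(system_gain, feedback_gain):
    """Margin in dB before the loop gain of the toy model reaches one."""
    return -20.0 * math.log10(abs(system_gain * feedback_gain))


if __name__ == "__main__":
    for gain in (2.0, 3.0):
        response = simulate_loop(gain, feedback_gain=0.4, delay=50)
        print(gain, stability_margin_db(gain, 0.4), is_howling(response))
```

In this model the loop is stable only while the product of all gains around the loop stays below one, which is why every gain in the chain, analog or digital, eats into the same margin.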

To improve the stability margin, signal processing such as beamforming, feedback and echo cancellation, adaptive notch filtering, adaptive gain adjustment, equalization, and nonlinear processing can be applied. A few basic processing units are already depicted in Fig. 1.
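The abstract notes that the excitation signal and the distorting components are highly correlated in an intercom system, which causes convergence problems for adaptive algorithms. The following sketch illustrates that effect for feedback cancellation under strong simplifying assumptions that are not from the paper: NLMS is used as a stand-in adaptive algorithm, speech is replaced by first-order autoregressive noise, the loop is opened so that the loudspeaker simply plays a delayed copy of the talker, and the feedback path `true_path` is a hypothetical four-tap filter.

```python
import math
import random


def ar1_noise(rng, num_samples, pole):
    """Unit-variance colored noise, a crude stand-in for speech."""
    scale = math.sqrt(1.0 - pole * pole)
    out, state = [], 0.0
    for _ in range(num_samples):
        state = pole * state + scale * rng.gauss(0.0, 1.0)
        out.append(state)
    return out


def nlms_misalignment(loudspeaker, local_speech, true_path, step=0.02, reg=0.1):
    """Adapt an FIR estimate of true_path; return the final coefficient error norm."""
    taps = len(true_path)
    estimate = [0.0] * taps
    history = [0.0] * taps  # recent loudspeaker samples, newest first
    for x, s in zip(loudspeaker, local_speech):
        history = [x] + history[:-1]
        # Microphone picks up the local talker plus the fed-back loudspeaker signal.
        mic = s + sum(h * v for h, v in zip(true_path, history))
        error = mic - sum(w * v for w, v in zip(estimate, history))
        power = sum(v * v for v in history) + reg
        estimate = [w + step * error * v / power for w, v in zip(estimate, history)]
    return math.sqrt(sum((w - h) ** 2 for w, h in zip(estimate, true_path)))


def compare_cases(num_samples=20000, delay=2, pole=0.9, seed=0):
    rng = random.Random(seed)
    true_path = [0.3, -0.2, 0.1, 0.05]  # hypothetical feedback path
    speech = ar1_noise(rng, num_samples, pole)
    # Intercom-like case: the loudspeaker plays the talker's own speech, delayed.
    played_back = [0.0] * delay + speech[:-delay]
    # Reference case: the loudspeaker plays an unrelated signal of the same color.
    unrelated = ar1_noise(rng, num_samples, pole)
    return {
        "correlated": nlms_misalignment(played_back, speech, true_path),
        "uncorrelated": nlms_misalignment(unrelated, speech, true_path),
    }


if __name__ == "__main__":
    print(compare_cases())
```

When the loudspeaker signal is a delayed copy of the local speech, part of that speech is predictable from the filter input, so the adaptive filter settles on a biased estimate instead of the feedback path. With an unrelated loudspeaker signal, as in a hands-free telephone, the local speech only acts as noise on the adaptation.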

Before describing the signal processing units in more detail in Section 3, we discuss in the next section the boundary conditions that must be fulfilled when designing communication systems for passenger compartments. In contrast to hands-free telephones or speech recognition engines, no methods for evaluating the quality of intercom systems have been standardized or even published yet. Thus, evaluation is not as easy as in other speech and audio applications. However, a few measurements (binaural recordings) as well as subjective tests (performed in a car equipped with an intercom system) are presented at the end of this contribution.

Basics

When designing an intercom system, a variety of boundary conditions and system demands arise. To explain the origin of these demands, a few phenomena, mostly physical or psychoacoustic, are described in this section. Furthermore, models for all important transmission paths are introduced. This allows us to give a first motivation for a few signal processing units, such as feedback cancellation and beamforming.

Signal processing for intercom systems

Fig. 10 sketches the structure of an intercom system intended to support front-to-rear conversations (for the other direction a similar structure is applied). Compared to the basic system depicted in Fig. 1, many more details are now covered. Since driver and front passenger are located at well-defined positions, fixed microphone arrays can point towards each of them, requiring fixed beamformers only. This allows us to start with the echo and feedback cancellation after the beamformer (and to reduce

A real system

Fig. 21 shows the results, in the form of a binaural recording, of a car interior communication system. The system uses eight microphones (two per passenger) and six loudspeaker channels (standard car loudspeakers). To obtain high speech intelligibility, the algorithms described in Section 3 were applied. Especially at higher speeds (90 km/h or more), a clear improvement of the communication quality could be achieved.

Subjective tests in terms of comparison mean opinion scores (CMOS) indicate a clear

Conclusions and outlook

In this paper the basic signal processing components of an in-car communication system have been described. Although most algorithms are already known from other applications such as hands-free telephones, public address systems, or hearing aids, the specific conditions under which intercom systems have to operate require several modifications of the standard algorithms. For this reason the boundary conditions have been described in detail at the beginning of this contribution.

Undoubtedly, in-car

Acknowledgements

The authors would like to thank Markus Buck, Marcus Hennecke, and Hans-Jörg Köpf from the research division of Harman/Becker/Temic for carefully reading and improving the manuscript.

References (36)

  • E. Lleida, E. Masgrau, A. Ortega, Acoustic echo and noise reduction for car cabin communication, Proceedings of...
  • A. Ortega, E. Lleida, E. Masgrau, F. Gallego, Cabin car communication system to improve communication inside a car,...
  • ITU-T Recommendation P.501, Test Signals for Use in Telephonometry, Geneva, Switzerland,...
  • Verband Deutscher Automobilindustrie (VDA), VDA-Spezifikationen für Kfz-Freisprecheinrichtungen, 2003 (in...
  • ITU-T Recommendation G.167, General Characteristics of International Telephone Connections and International Telephone...
  • H. Kuttruff, Room Acoustics, 2000
  • E. Lombard, Le signe de l’élévation de la voix, Ann. Maladies Oreille, Larynx, Nez, Pharynx, 1911
  • J.H.L. Hansen, Morphological constrained feature enhancement with adaptive cepstral compensation (MCE-ACC) for speech recognition in noise and Lombard effect, IEEE Trans. Speech Audio Process., 1994
  • E. Hänsler et al., Acoustic Echo and Noise Control: A Practical Approach, 2004
  • ITU-T Recommendation P.581, Use of Head And Torso Simulator (HATS) for Hands-Free Terminal Testing, Geneva,...
  • N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series, with Engineering Applications, 1949
  • H. Haas, The influence of a single echo on the audibility of speech, J. Audio Eng. Soc., March 1972
  • E. Meyer et al., Über den Einfluss von Schallrückwürfen auf Richtungslokalisation und Lautstärke bei Sprache, Nachrichten der Akademie der Wissenschaften in Göttingen, Math-phys. Kl., 1952
  • K. Shenoi, Digital Signal Processing in Telecommunications, 1995
  • G. Glentis et al., Efficient least squares adaptive algorithms for FIR transversal filtering: a unified view, IEEE Signal Process. Mag., 1999
  • T.I. Laakso et al., Splitting the unit delay: tools for fractional delay filter design, IEEE Signal Process. Mag., 1996
  • Y. Huang et al., Microphone arrays for video camera steering
  • C.H. Knapp et al., The generalized correlation method for estimation of time delay, IEEE Trans. Acoust. Speech Signal Process., 1976