Fast Communication
Receiver-based packet loss concealment for pulse code modulation (PCM G.711) coder
Introduction
Voice-over-IP (VoIP), the transmission of packetized voice over IP networks, is gaining much attention as a possible alternative to the conventional public switched telephone network (PSTN). However, impairments present on IP networks, namely jitter, delay and channel errors, can lead to the loss of packets at the receiving end. This packet loss degrades the speech quality. Model-based coders, especially the International Telecommunication Union (ITU-T) standards G.729-A [2] and G.723.1 [3], have been used extensively for speech coding over IP networks because of their low bit-rate requirements (5.3 to 6.3 kb/s for G.723.1 and 8 kb/s for G.729) and their inherent ability to recover from erasure. Their built-in packet loss concealment makes their quality drop slowly as the amount of packet loss increases. However, because these coders carry internal state (memory), they need a few frames to make the transition from a concealed state to a correct state. Thus, they tend to corrupt a few good packets before recovery, a phenomenon known as "State Error" [6]. On the other hand, pulse code modulation (PCM) [9], although of higher quality than G.729 and G.723.1 during periods of normal operation, has no ability to conceal erasure. This results in a dramatic drop in speech quality during loss periods. Yet PCM-based coders can recover from packet loss more rapidly than model-based coders, since the first speech sample in the first good packet restores speech to its original quality. The low complexity of PCM and its good performance in tandem coding make it a viable alternative to G.729 or G.723.1 for VoIP.
Several approaches have been implemented to address the frame erasure problem in PCM streams. The simplest approach is to play a mute (silence) packet during the erasure period. This method, however, introduces annoying voice clipping, and most subjective tests have shown that it deteriorates the speech quality even at very low packet loss rates [4], [5]. Many other concealment algorithms rely on the quasi-stationary property of speech (not a lot of new information is delivered in the duration of a 10– lost packet). One popular commercial concealment algorithm repeats the speech signal received in the last speech packet. This method performs better than silence substitution, but its quality is still not satisfactory for high-quality applications.
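The two baseline methods described above, silence substitution and packet repetition, can be sketched in a few lines. This is only an illustrative sketch: the function name `conceal_simple`, the `mode` argument, and the convention that a lost packet is represented by `None` are assumptions made for this example and do not come from any of the cited standards.

```python
from typing import List, Optional


def conceal_simple(packets: List[Optional[List[int]]],
                   frame_len: int,
                   mode: str = "repeat") -> List[int]:
    """Rebuild a sample stream from packets; None marks a lost packet.

    mode="mute"   : play silence (zeros) in the erasure period.
    mode="repeat" : replay the samples of the last good packet.
    """
    if mode not in ("mute", "repeat"):
        raise ValueError("mode must be 'mute' or 'repeat'")
    out: List[int] = []
    # Before any packet has arrived there is nothing to repeat,
    # so the fallback is silence in both modes.
    last_good = [0] * frame_len
    for pkt in packets:
        if pkt is not None:
            out.extend(pkt)
            last_good = pkt
        elif mode == "mute":
            out.extend([0] * frame_len)
        else:
            out.extend(last_good)
    return out
```

Silence substitution leaves a gap of zeros, which corresponds to the clipping mentioned above, while repetition fills the gap with the previous packet without aligning it to the pitch of the speech.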
ITU-T has recently standardized (in G.711 Appendix A [1]) a high-quality, low-complexity concealment method for PCM-coded speech. This method relies on waveform substitution. The packet loss concealment (PLC) algorithm first performs pitch detection on a sufficient length of speech samples kept in the history buffer (390 samples of 8 kHz-sampled speech). The concealment unit then places the pointer one pitch period back and copies a speech segment with the duration of the lost packet. This pitch-predicted replica is played in the gap left by the missing speech segment. The algorithm also performs an overlap-add at the transition between the last good samples received and the concealed ones, to ensure a smooth and natural transition and a higher concealment quality. However, this results in an added algorithmic delay of [1]. The algorithm has a very low complexity of 0.5 MIPS. Another standard method is presented in the ANSI standard T1-521-2000 (Appendix B) [7]. This method relies on the well-known linear prediction model to estimate the missing speech waveform, and simply adopts the approach of model-based codecs. It implements a complete analysis to extract the short- and long-term excitation from the previous correctly received speech. The synthesis unit then uses these parameters, along with the most recently received speech samples (as initial conditions for the inverse linear prediction (LP) filter), to synthesize an approximation of the missing speech segment. This method introduces an algorithmic delay of (a half correct packet) to smooth the transition between the last good speech segment and the beginning of the concealed one. It also has a much higher complexity (2.3 MIPS for packet), which is around 5 times the complexity of ITU-T G.711 Appendix A [4], [7]. The resulting concealment quality of this method is comparable to that of ITU-T G.711 Appendix A [4], [7].
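The waveform-substitution idea described above (estimate the pitch from the history buffer, step back one pitch period, and replay that segment across the gap) can be illustrated with the following sketch. It is not the standardized algorithm: the function names, the lag search range of 40 to 120 samples, the 160-sample correlation window and the 80-sample frame length are all assumptions chosen for illustration, and the overlap-add smoothing at the transition is deliberately left out.

```python
import math
from typing import List, Sequence


def estimate_pitch(history: Sequence[float],
                   min_lag: int = 40,
                   max_lag: int = 120,
                   window: int = 160) -> int:
    """Return the lag (in samples) whose delayed window best matches
    the newest `window` samples, using a normalized cross-correlation."""
    n = len(history)
    if n < window + max_lag:
        raise ValueError("history buffer too short for the lag search")
    newest = history[n - window:]
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        delayed = history[n - window - lag:n - lag]
        num = sum(x * y for x, y in zip(newest, delayed))
        den = math.sqrt(sum(y * y for y in delayed))
        score = num / den if den > 0 else 0.0
        if score > best_score:  # strict '>' keeps the shortest lag on ties
            best_lag, best_score = lag, score
    return best_lag


def conceal_frame(history: Sequence[float], frame_len: int = 80) -> List[float]:
    """Fill one lost frame by replaying the last pitch period cyclically."""
    lag = estimate_pitch(history)
    last_period = history[len(history) - lag:]
    return [last_period[i % lag] for i in range(frame_len)]
```

For a perfectly periodic history buffer, `conceal_frame` reproduces the true continuation of the signal. Real speech is only quasi-periodic, which is why a smoothing step such as the overlap-add mentioned above is used at the boundaries.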
In this paper, we present a new receiver-based PLC algorithm for packetized PCM-coded speech. It is designed to work with the conventional sampling rate of and frame sizes of . The proposed algorithm does not require any delay and has an affordable complexity of 1.85 MIPS.
The rest of this paper is organized as follows. In Section 2, the concealment model is described. Section 3 presents the quality assessment test for the new method as well as simulation results confirming the improved performance of the proposed algorithm. We then conclude the paper in Section 4, along with the future work that could be added to the proposed method.
Prediction equation
The new LP-based concealment technique is based on prediction with a sufficiently large-order filter that is capable of accurately modelling the speech:

$$S(n) = \sum_{i=1}^{P} a(i)\,S(n-i) + b(n) \qquad (1)$$

where S(n) is the nth speech sample, P is the prediction order, which was set to 50 as will be explained later, a(i) are the LP coefficients and b(n) is the residual signal.
As can be seen from Eq. (1) the current speech sample S(n) is composed of two components. The first component is the predictable part carrying
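To make the predictable component of Eq. (1) concrete, the sketch below estimates LP coefficients a(i) from past samples with the autocorrelation method (Levinson-Durbin recursion) and then extrapolates new samples with the residual b(n) set to zero. This is a simplified illustration under stated assumptions, not the proposed algorithm: the function names are hypothetical, the treatment of the residual is not shown in this excerpt, and a small order is used in the usage example instead of P = 50.

```python
from typing import List, Sequence


def lp_coefficients(s: Sequence[float], order: int) -> List[float]:
    """Return [a(1), ..., a(P)] such that S(n) is approximated by
    sum_i a(i) * S(n - i), using the autocorrelation method."""
    n = len(s)
    r = [sum(s[t] * s[t - k] for t in range(k, n)) for k in range(order + 1)]
    a = [0.0] * (order + 1)  # a[0] is unused; a[i] holds a(i)
    err = r[0]               # prediction error energy
    for m in range(1, order + 1):
        if err <= 0:
            break  # signal is fully predicted (or silent); stop early
        k = (r[m] - sum(a[i] * r[m - i] for i in range(1, m))) / err
        new_a = a[:]
        new_a[m] = k
        for i in range(1, m):
            new_a[i] = a[i] - k * a[m - i]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:]


def extrapolate(history: Sequence[float],
                coeffs: Sequence[float],
                count: int) -> List[float]:
    """Generate `count` samples from Eq. (1) with b(n) = 0."""
    if len(history) < len(coeffs):
        raise ValueError("need at least P past samples")
    buf = list(history)
    for _ in range(count):
        # coeffs[i] is a(i + 1) and buf[-1 - i] is S(n - (i + 1)).
        buf.append(sum(c * buf[-1 - i] for i, c in enumerate(coeffs)))
    return buf[len(history):]
```

As a usage example, `lp_coefficients([0.9 ** n for n in range(200)], 2)` returns values close to `[0.9, 0.0]`, since that signal obeys S(n) = 0.9 S(n-1). With b(n) = 0 the extrapolated waveform generally decays over time, which hints at why the second component of Eq. (1) cannot simply be dropped.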
Performance of the proposed algorithm
The new algorithm is compared to the ITU-T standard concealment tool G.711-A and to the packet repetition method. The test was performed on a set of speech files from four speakers, two male and two female, referred to in the results as M1, M2, F1 and F2. Ten speech files were used for each speaker, each containing two sentences in English of duration . The format of the files was linear PCM. The files were taken from the ITU-T supplement P.23.
The assessment tool used to
Conclusion and future work
In this paper, we introduced a new concealment algorithm for PCM packetized speech of packet length. The model implemented in , provides very encouraging results for the idea of combining the pitch prediction along with the high-order LP-based prediction to produce the concealed speech segments. The PESQ-MOS scores obtained for the random loss tests prove that the algorithm exhibits a superior high-quality concealment performance in all cases when compared to an existing commercial method
References (10)
- Appendix A: a high quality low-complexity algorithm for packet loss concealment with G.711, ITU-T Recommendation...
- Coding of speech at 8 kb/s using conjugate-structure algebraic-code-excited linear-prediction (CS-ACELP), ITU-T...
- Dual rate speech coder for multimedia communications transmitting at 5.3 and 6.3 kb/s, ITU-T Recommendation G.723.1,...
- et al., A linear prediction based packet loss concealment algorithm for PCM coded speech, IEEE Trans. Speech and Audio Process. (November 2001)
- M. Hassan, A. Nayandoro, Internet telephony: services, technical challenges, and products, IEEE Communication Magazine,...