Abstract
This paper presents and studies objective video quality evaluation techniques for a network where frame losses can be considered independent, for example a best-effort packet-switched network that is not heavily loaded. The total or partial loss of a frame’s information affects the quality of video playback, because the frame cannot be decoded and neither can the frames that depend on it. As a result, for some time the video playback shows errors in the image, which the user perceives as interruptions. In this paper, the total number of decoded frames and the duration of the video playback interruptions are considered important parameters for quantifying video quality. Analytical formulations for both are presented, and the importance of considering them together is highlighted.
References
Bikfalvi A, García-Reinoso J, Vidal I, Valera F, Azcorra A (2011) P2P vs. IP multicast: Comparing approaches to IPTV streaming based on TV channel popularity. Comput Networks 55(6):1310–1325
Borgnat P, Dewaele G, Fukuda K, Abry P, Cho K (2009) Seven years and one day: sketching the evolution of internet traffic. In: INFOCOM 2009. IEEE, pp 711–719
Cheng RS, Lin CH, Chen JL, Chao HC (2012) Improving transmission quality of MPEG video stream by SCTP multi-streaming and differential RED mechanisms. J Supercomputing 62:68–83. doi:10.1007/s11227-011-0624-2
Chih-Heng Ke CKS (2008) An evaluation framework for more realistic simulations of MPEG video transmission. J Inf Sci Eng 24(2):425–440
Cisco Visual Networking Index (2011) Forecast and methodology, 2010–2015. Tech Rep Cisco Systems Inc http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-481360.pdf. Accessed 7 May 2012
Espina F, Morato D (2012) Survey on the current uses of H.264. Tech Rep Public University of Navarre (Spain). https://www.tlm.unavarra.es/~felix/publicaciones/TR/H.264.pdf
Fiedler M, Hossfeld T, Tran-Gia P (2010) A generic quantitative relationship between quality of experience and quality of service. IEEE Netw 24(2):36–41
Heyman D, Lakshman T (1996) Source models for VBR broadcast-video traffic. IEEE/ACM Trans Netw 4(1):40–48
Huynh-Thu Q, Ghanbari M (2008) Scope of validity of PSNR in image/video quality assessment. Electron Lett 44(13):800–801
ISO JTC 1/SC 29 (2000) ISO/IEC 13818-2:2000: information technology – generic coding of moving pictures and associated audio information: video. ISO
ISO JTC 1/SC 29 (2009) ISO/IEC 14496-2:2004: information technology – coding of audio-visual objects – Part 2: visual. ISO
ISO JTC 1/SC 29 (2012) ISO/IEC 14496-10:2010: information technology – coding of audio-visual objects – Part 10: advanced video coding. ISO
ITU-T Study Group 12 (2008) Recommendation P.10/G.100 (2006) Amendment 2: vocabulary for performance and quality of service – new definitions for inclusion in recommendation ITU-T P.10/G.100. ITU-T
ITU-T Study Group 9 (2008) Recommendation J.247 (08/2008): objective perceptual multimedia video quality measurement in the presence of a full reference. ITU-T
Kusuma T, Zepernick HJ (2003) A reduced-reference perceptual quality metric for in-service image quality assessment. In: Mobile future and symposium on trends in communications, 2003. Joint First Workshop on SympoTIC ’03, pp 71–74
Lin CH, Ke CH, Shieh CK, Chilamkurti N (2006) The packet loss effect on MPEG video transmission in wireless networks. In: 20th International Conference on Advanced Information Networking and Applications (AINA 2006), vol 1, pp 565–572
MAWI (Measurement and Analysis on the WIDE Internet) Working Group (2012) 150 megabit ethernet anonymized packet traces without payload: WIDE-TRANSIT link @ Tokyo, Japan. http://mawi.wide.ad.jp/mawi/ditl/ditl2010/. Accessed 7 May 2012
Moving Picture Experts Group (2012) The moving picture experts group (MPEG) home page. http://mpeg.chiariglione.org/. Accessed 7 May 2012
Osama A, Lotfallah MR, Panchanathan S (2006) A framework for advanced video traces: evaluating visual quality for video transmission over lossy networks. EURASIP J Adv Signal Process 2006:042083. doi:10.1155/ASP/2006/42083
Pastrana-Vidal R, Gicquel J (2006) Automatic quality assessment of video fluidity impairments using a no-reference metric. In: Proc. of int. workshop on video processing and quality metrics for consumer electronics
Pastrana-Vidal R, Gicquel J (2007) A no-reference video quality metric based on a human assessment model. In: Third international workshop on video processing and quality metrics for consumer electronics VPQM, vol 7, pp 25–26
Pastrana-Vidal R, Gicquel J, Colomes C, Cherifi H (2004) Frame dropping effects on user quality perception. In: 5th international workshop on image analysis for multimedia interactive services
Pastrana-Vidal RR, Gicquel JC, Colomes C, Cherifi H (2004) Sporadic frame dropping impact on quality perception. In: Rogowitz BE, Pappas TN (eds) Human vision and electronic imaging IX, vol 5292. SPIE, pp 182–193
Reisslein M, Lassetter J, Ratnam S, Lotfallah O, Fitzek FH, Panchanathan S (2002) Traffic and quality characterization of scalable encoded video: a large-scale trace-based study, part 1: overview and definitions. Arizona State Univ., Dept. of Electrical Eng., Tech. Rep. http://trace.eas.asu.edu/publications/p1.pdf. Accessed 7 May 2012
Seeling P, Reisslein M, Kulapala B (2004) Network performance evaluation using frame size and quality traces of single-layer and two-layer video: a tutorial. IEEE Commun Surv Tutor 6(3):58–78
Tionardi L, Hartanto F (2003) The use of cumulative inter-frame jitter for adapting video transmission rate. In: TENCON 2003. conference on convergent technologies for Asia-Pacific region, vol 1, pp 364–368
Van der Auwera G, David PT, Reisslein M (2008) Traffic and quality characterization of single-layer video streams encoded with the H.264/MPEG-4 advanced video coding standard and scalable video coding extension. IEEE Trans Broadcast 54(3):698–718
Varga A, Hornig R (2008) An overview of the OMNeT++ simulation environment. In: Simutools ’08: Proceedings of the 1st international conference on simulation tools and techniques for communications, networks and systems & workshops. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), Brussels, Belgium, pp 1–10
Ziviani A, Wolfinger BE, Rezende JF, Duarte OC, Fdida S (2005) Joint adoption of QoS schemes for MPEG streams. Multimedia Tools Appl 26(1):59–80
Acknowledgements
This work was supported by the Spanish Ministry of Science and Innovation through the research project INSTINCT (TEC-2010-21178-C02-01). The authors would also like to thank the Spanish thematic network FIERRO (TEC2010-12250-E).
Appendix: Cut lengths in a video
In a video flow that suffers frame losses, the cut length can vary from one frame to the total length of the video. The cut length depends on how the directly lost frames and the frames that are non-decodable due to these losses are grouped. The cuts must be understood as experienced by the user; therefore, they must be measured in the presentation order of the video, not in the transmission order in which the losses take place.
For example (Fig. 6), in a video with the IPBBPBB GoP structure in transmission order, if the second B-frame and the second P-frame are lost, the last four frames of the GoP cannot be decoded. In presentation order, however, two disconnected cuts of lengths 1 and 3 are generated. This makes the analytical model of cuts more complex.
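The example above can be reproduced with a small simulation. The following is a minimal sketch, not the authors' method: the function name `presentation_cuts` and its arguments are hypothetical, it handles a single closed GoP that starts with an I-frame, and it assumes that every B-frame is transmitted right after the anchor frame (I or P) that follows it in presentation order.

```python
def presentation_cuts(gop_tx, lost):
    """Return the cut lengths, in presentation order, for one closed GoP.

    gop_tx: frame types in transmission order, e.g. "IPBBPBB" (must start with "I").
    lost:   set of 0-based transmission indices of the directly lost frames.
    """
    decodable = []            # decodability of each frame, by transmission index
    groups = []               # (anchor index, [indices of the B-frames sent after it])
    prev_ok = cur_ok = False  # decodability of the two most recent anchor frames
    for idx, kind in enumerate(gop_tx):
        received = idx not in lost
        if kind == "B":
            # A B-frame needs itself and both surrounding anchor frames.
            ok = received and prev_ok and cur_ok
            groups[-1][1].append(idx)
        else:
            # An I-frame only needs itself; a P-frame also needs the previous anchor.
            ok = received and (kind == "I" or cur_ok)
            prev_ok, cur_ok = cur_ok, ok
            groups.append((idx, []))
        decodable.append(ok)

    # B-frames sent after an anchor frame are displayed before it.
    order = [idx for anchor, bs in groups for idx in bs + [anchor]]

    # Collect the lengths of the runs of non-decodable frames.
    cuts, run = [], 0
    for idx in order:
        if decodable[idx]:
            if run:
                cuts.append(run)
            run = 0
        else:
            run += 1
    if run:
        cuts.append(run)
    return cuts


# Second B-frame (index 3) and second P-frame (index 4) are lost.
print(presentation_cuts("IPBBPBB", {3, 4}))  # → [1, 3]
```

Working on transmission indices and reordering only at the end keeps the loss model (transmission order) separate from the user's view (presentation order), which is the distinction the example illustrates.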
When an I-frame is directly lost, the whole GoP cannot be decoded, creating a single cut of length at least N. In the case of an open GoP, the last B-frames of the previous GoP will be non-decodable and, when considered in presentation order, they will be part of the same single cut, of length N + (M − 1). If we define the control variable \(z=\frac{N_B}{M-1}-N_P\), where z = 1 if the GoP is an open one and z = 0 if it is a closed one, then the cut length will be N + z * (M − 1).
If the I-frame of the next GoP is lost too, all the frames from the next GoP will not be decoded and a single cut of length 2 * N + z * (M − 1) will be generated. In general, when j consecutive I-frames are lost, a single cut of length j * N + z * (M − 1) will be generated.
The loss of a P-frame makes it impossible to decode all the following frames in the GoP. As all these frames are consecutive, both in transmission and presentation order, only one cut is generated. If the lost P-frame is the last, or \(N_P\)th, P-frame of the GoP, the cut will be M + z * (M − 1) frames long. If the lost P-frame is the penultimate, or \((N_P-1)\)th, P-frame of the GoP, the cut will be 2 * M + z * (M − 1) frames long. If the lost P-frame is the first P-frame of the GoP, the cut will be \(N_P\) * M + z * (M − 1) frames long. In general, when the \((N_P+1-i)\)th P-frame of the GoP is lost, a single cut of length i * M + z * (M − 1) will be generated.
If the I-frame from the next GoP is lost too, then all the frames from the next GoP will not be decoded. Again, all these frames are consecutive and result in only one cut of N + i * M + z * (M − 1) frames. In general, when the \((N_P+1-i)\)th P-frame of the GoP is lost and the next j I-frames are lost, a single cut of length j * N + i * M + z * (M − 1) will be generated.
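The cut lengths derived so far for I- and P-frame losses can be collected in a short sketch. The function names `gop_control_variable` and `anchor_cut_length` and the numeric GoP parameters are hypothetical, chosen only for illustration; the sketch assumes M ≥ 2, so that B-frames blocks exist.

```python
def gop_control_variable(n_p, n_b, m):
    """Return z = N_B / (M - 1) - N_P: 1 for an open GoP, 0 for a closed one."""
    if m < 2:
        raise ValueError("M must be at least 2 so that B-frames blocks exist")
    blocks, remainder = divmod(n_b, m - 1)
    z = blocks - n_p
    if remainder or z not in (0, 1):
        raise ValueError("N_B is not consistent with N_P and M")
    return z


def anchor_cut_length(n, m, z, i=0, j=0):
    """Length of the single cut caused by I- and P-frame losses.

    i: the (N_P + 1 - i)th P-frame of a GoP is lost (i = 0 means no P-frame loss).
    j: number of consecutive I-frames lost right after it.
    """
    if i < 0 or j < 0 or (i == 0 and j == 0):
        raise ValueError("at least one I- or P-frame must be lost")
    return j * n + i * m + z * (m - 1)


# Hypothetical open GoP: N = 12, M = 3, N_P = 3, N_B = 8.
z = gop_control_variable(n_p=3, n_b=8, m=3)
print(z)                                   # → 1
print(anchor_cut_length(12, 3, z, j=1))    # one I-frame lost → 14
print(anchor_cut_length(12, 3, z, i=1))    # last P-frame lost → 5
```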
The loss of c consecutive B-frames of a B-frames block does not affect other frames and creates only one cut. If the last B-frame of a B-frames block is lost and the next frame (either an I-frame or a P-frame) is lost too, the non-decoded frames will not be consecutive in presentation order. The B-frames will be in one cut and the frames that cannot be decoded because of the lost I- or P-frame will be in another cut.
The possible cut lengths are summarized in (19), where F is the number of frames in the video.
A similar approach was used for the analytical expression of Q in [29]. In the following, we obtain the analytical expression of \(N_{\rm cut}[c]\), the number of cuts of length c frames.
\(P_{\{I,P,B\}}\) is the probability of an {I, P, B}-frame being directly lost. It is assumed that direct frame losses are mutually independent.
A cut of length (j + 1) * N + z * (M − 1) frames will be generated if j + 1 consecutive I-frames are lost and the previous and next frames (in presentation order) are decoded. The next frame will always be the next I-frame, so this I-frame cannot be lost. The previous frame will always be the last P-frame of the previous GoP, so this P-frame cannot be lost. Moreover, neither the other P-frames nor the I-frame of that previous GoP can be lost, or its last P-frame would not be decodable. Therefore, \(N_{\rm cut}[(j+1) * N + z * (M-1)] = N_G * P_I^{(j+1)} * (1-P_I)^2 (1-P_P)^{N_P}\), where \(N_G = F/N\) is the number of GoPs in the video.
A cut of length i * M + z * (M − 1) frames will be generated if the \((N_P+1-i)\)th P-frame of a GoP is lost and, again, the previous and next frames (in presentation order) are decoded. Recall that \(i = 1, \ldots, N_P\). The next frame will always be the next I-frame, so this I-frame cannot be lost. The previous frame will be a previous P-frame of the GoP or the I-frame of the GoP, so the previous P-frames and the I-frame cannot be lost. Therefore, \(N_{\rm cut}[i * M + z * (M-1)] = N_G * P_P * (1-P_I)^2 (1-P_P)^{N_P-i}\).
A cut of length j * N + i * M + z * (M − 1) frames will be generated if after losing a P-frame the next j I-frames are lost. Then, \(N_{\rm cut}[j * N + i * M + z * (M-1)] = N_G * P_I^j * P_P * (1-P_I)^2 (1-P_P)^{N_P-i}\).
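The three expressions above share the same structure, so they can be evaluated with one function. This is an illustrative sketch: the name `n_cut_anchor` and the numeric values are hypothetical, `j` counts the consecutive lost I-frames (so the pure I-frame case with j + 1 lost I-frames corresponds to `i=0` and `j` set to that count), and direct losses are assumed independent as stated in the text.

```python
def n_cut_anchor(n_g, n_p, p_i, p_p, i=0, j=0):
    """Expected number of cuts of length j*N + i*M + z*(M-1).

    n_g: number of GoPs in the video (N_G = F / N).
    n_p: number of P-frames per GoP (N_P).
    p_i, p_p: direct loss probabilities of I- and P-frames.
    i: the (N_P + 1 - i)th P-frame is lost (i = 0 means no P-frame loss).
    j: number of consecutive lost I-frames.
    """
    if not 0 <= i <= n_p or j < 0 or (i == 0 and j == 0):
        raise ValueError("need 0 <= i <= N_P, j >= 0 and at least one loss")
    lost_p = p_p if i > 0 else 1.0           # the P-frame that starts the cut
    decoded_p = (1.0 - p_p) ** (n_p - i)     # P-frames that must be decoded
    decoded_i = (1.0 - p_i) ** 2             # I-frames before and after the cut
    return n_g * p_i ** j * lost_p * decoded_i * decoded_p


# Hypothetical values: 100 GoPs, N_P = 3, P_I = 0.1, P_P = 0.2.
print(n_cut_anchor(100, 3, 0.1, 0.2, j=1))        # one lost I-frame
print(n_cut_anchor(100, 3, 0.1, 0.2, i=1))        # last P-frame lost
print(n_cut_anchor(100, 3, 0.1, 0.2, i=3, j=2))   # first P-frame and two I-frames lost
```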
To generate a cut of length M − 1 frames, the M − 1 frames of a B-frames block have to be lost and its neighbouring frames in presentation order have to be decoded. These neighbouring frames will be the two previous P-frames or the previous I- and P-frame in transmission order. As the P-frames depend on the I-frame and on the previous P-frames of the GoP, none of the previous P-frames nor the I-frame of the GoP can be lost. Therefore, for the ith B-frames block the number of cuts of length M − 1 frames is \(N_G * P_B^{M-1} * (1-P_I) (1-P_P)^i\), where \(i = 1, \ldots, N_P\).
For an open GoP, the reasoning for B-frames is valid but incomplete, because there are \(N_P+1\) B-frames blocks in the GoP, not only \(N_P\). The frames in this “extra” block depend not only on the I-frame and the \(N_P\) P-frames of the GoP, but on the I-frame of the next GoP too. Then, for the \((N_P+1)\)th B-frames block of an open GoP the number of cuts of length M − 1 frames is \(N_G * P_B^{M-1} * (1-P_I)^2 (1-P_P)^{N_P}\).
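In general, the number of cuts of length M − 1 frames is the sum of the contributions of all the B-frames blocks of the GoP: \(N_{\rm cut}[M-1] = N_G * P_B^{M-1} * (1-P_I) \left[\sum_{i=1}^{N_P} (1-P_P)^i + z * (1-P_I) (1-P_P)^{N_P}\right]\).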
This procedure can be extended to cuts of length c = 1 to c = M − 2 frames, but there is an extra difficulty: in each B-frames block there are now M − c possible loss combinations that generate a cut of length c frames. For example, for c = M − 2 there are two possibilities (see Fig. 7). The first possibility is to lose the first M − 2 B-frames of the block and not the last B-frame. The second possibility is to not lose the first B-frame of the block and to lose the other M − 2 B-frames. In both cases, for the ith B-frames block, the number of cuts of length M − 2 frames is \(N_G * (1-P_B) * P_B^{M-2} * (1-P_I) (1-P_P)^i\).
To express the analytical model for cuts of length \(c = 1, \ldots, M-1\) mathematically, we define σ and δ. σ is the minimum number of B-frames from the same block that should be available to the decoding process (should not be lost) to prevent a cut longer than c frames, when the first lost B-frame is the rth in the block. δ is the contribution to the number of cuts of length c frames of the M − c possible loss combinations of a B-frames block that generate a cut of length c frames. (20) presents the expression for σ and (21) presents the expression for δ.
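In general, the number of cuts of length \(c = 1, \ldots, M-1\) frames follows from replacing the factor \(P_B^{M-1}\) of the M − 1 case by δ: \(N_{\rm cut}[c] = N_G * \delta * (1-P_I) \left[\sum_{i=1}^{N_P} (1-P_P)^i + z * (1-P_I) (1-P_P)^{N_P}\right]\).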
Equation (22) presents the general expression for \(N_{\rm cut}[c]\). It depends on the GoP structure of the video and on the loss probabilities of the frames, \(P_{\{I,P,B\}}\).
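The B-frame case can be sketched numerically. The code below is one reading of the verbal definitions of σ and δ, not a transcription of (20) to (22): it assumes that σ counts the B-frames of the block adjacent to the lost run (0, 1 or 2, depending on whether the run touches the ends of the block) and that δ sums \(P_B^c (1-P_B)^{\sigma}\) over the M − c starting positions r. The function names `sigma`, `delta` and `n_cut_b` and the numeric values are hypothetical.

```python
def sigma(c, r, m):
    """B-frames of the block that must be received so that a run of c lost
    B-frames starting at position r (1-based) is exactly c frames long."""
    if not (1 <= c <= m - 1 and 1 <= r <= m - c):
        raise ValueError("need 1 <= c <= M-1 and 1 <= r <= M-c")
    before = 1 if r > 1 else 0              # B-frame just before the run
    after = 1 if r + c - 1 < m - 1 else 0   # B-frame just after the run
    return before + after


def delta(c, m, p_b):
    """Contribution of the M - c loss combinations of one B-frames block."""
    return sum(p_b ** c * (1.0 - p_b) ** sigma(c, r, m)
               for r in range(1, m - c + 1))


def n_cut_b(c, n_g, n_p, m, z, p_i, p_p, p_b):
    """Expected number of cuts of length c = 1..M-1 caused by B-frame losses."""
    # Block i needs the I-frame and the first i P-frames of the GoP.
    regular_blocks = sum((1.0 - p_p) ** i for i in range(1, n_p + 1))
    # The extra block of an open GoP also needs the I-frame of the next GoP.
    extra_block = z * (1.0 - p_i) * (1.0 - p_p) ** n_p
    return n_g * delta(c, m, p_b) * (1.0 - p_i) * (regular_blocks + extra_block)


# Hypothetical values: M = 4 and P_B = 0.5.
print(delta(3, 4, 0.5))  # whole block lost → 0.125
print(delta(2, 4, 0.5))  # two possibilities, as in the c = M - 2 example → 0.25
```

For c = M − 1 this reduces to \(P_B^{M-1}\), and for c = M − 2 to two terms of \((1-P_B) * P_B^{M-2}\), which agrees with the per-block expressions given in the text.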
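The total number of cuts \(T_{\rm cut}\) can be computed as the sum of the number of cuts of every possible length: \(T_{\rm cut} = \sum_{c} N_{\rm cut}[c]\).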
The proportion of cuts of length c frames, \(P_{\rm cut}[c]\), can be computed by dividing the number of cuts of length c frames, \(N_{\rm cut}[c]\), by the total number of cuts, \(T_{\rm cut}\). The cut length Probability Mass Function \(P_{\rm cut}\) is the set of all possible values of \(P_{\rm cut}[c]\).
The average cut length \(L_{\rm cut}\) can be computed from the cut length Probability Mass Function as its mean value: \(L_{\rm cut} = \sum_{c} c * P_{\rm cut}[c]\).
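As a brief illustration of these last definitions, the sketch below normalises a table of cut counts into the Probability Mass Function and takes its mean. The function names and the counts in `n_cut` are made up for the example and are not results from the paper.

```python
def cut_length_pmf(n_cut):
    """Map each cut length c to P_cut[c] = N_cut[c] / T_cut."""
    t_cut = sum(n_cut.values())  # total number of cuts, T_cut
    if t_cut <= 0:
        raise ValueError("the video has no cuts")
    return {c: n / t_cut for c, n in n_cut.items()}


def average_cut_length(pmf):
    """Mean of the cut length PMF, L_cut = sum over c of c * P_cut[c]."""
    return sum(c * p for c, p in pmf.items())


# Hypothetical counts: N_cut[1] = 6, N_cut[2] = 3, N_cut[5] = 1.
n_cut = {1: 6.0, 2: 3.0, 5: 1.0}
pmf = cut_length_pmf(n_cut)
print(pmf)
print(average_cut_length(pmf))
```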
Espina, F., Morato, D., Izal, M. et al. Analytical model for MPEG video frame loss rates and playback interruptions on packet networks. Multimed Tools Appl 72, 361–383 (2014). https://doi.org/10.1007/s11042-012-1344-1
DOI: https://doi.org/10.1007/s11042-012-1344-1