Analytical model for MPEG video frame loss rates and playback interruptions on packet networks

Multimedia Tools and Applications

Abstract

This paper presents and studies objective video quality evaluation techniques for networks where frame losses can be considered independent, for example a lightly loaded best-effort packet-switched network. The total or partial loss of a frame’s information affects the quality of video playback, because the frame cannot be decoded and the frames that depend on it cannot be decoded correctly either. As a result, for some time the video playback shows image errors, which the user perceives as interruptions. In this paper, the total number of decoded frames and the duration of the video playback interruptions are considered the important parameters for quantifying video quality. Their analytical formulation is presented, and the importance of considering them together is highlighted.

References

  1. Bikfalvi A, García-Reinoso J, Vidal I, Valera F, Azcorra A (2011) P2P vs. IP multicast: Comparing approaches to IPTV streaming based on TV channel popularity. Comput Networks 55(6):1310–1325

  2. Borgnat P, Dewaele G, Fukuda K, Abry P, Cho K (2009) Seven years and one day: sketching the evolution of internet traffic. In: INFOCOM 2009. IEEE, pp 711–719

  3. Cheng RS, Lin CH, Chen JL, Chao HC (2012) Improving transmission quality of MPEG video stream by SCTP multi-streaming and differential RED mechanisms. J Supercomputing 62:68–83. doi:10.1007/s11227-011-0624-2

  4. Chih-Heng Ke CKS (2008) An evaluation framework for more realistic simulations of MPEG video transmission. J Inf Sci Eng 24(2):425–440

  5. Cisco Visual Networking Index (2011) Forecast and methodology, 2010–2015. Tech Rep Cisco Systems Inc http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-481360.pdf. Accessed 7 May 2012

  6. Espina F, Morato D (2012) Survey on the current uses of H.264. Tech Rep Public University of Navarre (Spain). https://www.tlm.unavarra.es/~felix/publicaciones/TR/H.264.pdf

  7. Fiedler M, Hossfeld T, Tran-Gia P (2010) A generic quantitative relationship between quality of experience and quality of service. IEEE Netw 24(2):36–41

  8. Heyman D, Lakshman T (1996) Source models for VBR broadcast-video traffic. IEEE/ACM Trans Netw 4(1):40–48

  9. Huynh-Thu Q, Ghanbari M (2008) Scope of validity of PSNR in image/video quality assessment. Electron Lett 44(13):800–801

  10. ISO JTC 1/SC 29 (2000) ISO/IEC 13818-2:2000: information technology – generic coding of moving pictures and associated audio information: video. ISO

  11. ISO JTC 1/SC 29 (2009) ISO/IEC 14496-2:2004: information technology – coding of audio-visual objects – Part 2: visual. ISO

  12. ISO JTC 1/SC 29 (2012) ISO/IEC 14496-10:2010: information technology – coding of audio-visual objects – Part 10: advanced video coding. ISO

  13. ITU-T Study Group 12 (2008) Recommendation P.10/G.100 (2006) Amendment 2: vocabulary for performance and quality of service – new definitions for inclusion in recommendation ITU-T P.10/G.100. ITU-T

  14. ITU-T Study Group 9 (2008) Recommendation J.247 (08/2008): objective perceptual multimedia video quality measurement in the presence of a full reference. ITU-T

  15. Kusuma T, Zepernick HJ (2003) A reduced-reference perceptual quality metric for in-service image quality assessment. In: Mobile future and symposium on trends in communications, 2003. Joint First Workshop on SympoTIC ’03, pp 71–74

  16. Lin CH, Ke CH, Shieh CK, Chilamkurti N (2006) The packet loss effect on MPEG video transmission in wireless networks. In: 20th international conference on advanced information networking and applications (AINA 2006), vol 1, pp 565–572

  17. MAWI (Measurement and Analysis on the WIDE Internet) Working Group (2012) 150 megabit ethernet anonymized packet traces without payload: WIDE-TRANSIT link @ Tokyo, Japan. http://mawi.wide.ad.jp/mawi/ditl/ditl2010/. Accessed 7 May 2012

  18. Moving Picture Experts Group (2012) The moving picture experts group (MPEG) home page. http://mpeg.chiariglione.org/. Accessed 7 May 2012

  19. Osama A, Lotfallah MR, Panchanathan S (2006) A framework for advanced video traces: evaluating visual quality for video transmission over lossy networks. EURASIP J Adv Signal Process 2006:042083. doi:10.1155/ASP/2006/42083

  20. Pastrana-Vidal R, Gicquel J (2006) Automatic quality assessment of video fluidity impairments using a no-reference metric. In: Proc. of int. workshop on video processing and quality metrics for consumer electronics

  21. Pastrana-Vidal R, Gicquel J (2007) A no-reference video quality metric based on a human assessment model. In: Third international workshop on video processing and quality metrics for consumer electronics VPQM, vol 7, pp 25–26

  22. Pastrana-Vidal R, Gicquel J, Colomes C, Cherifi H (2004) Frame dropping effects on user quality perception. In: 5th international workshop on image analysis for multimedia interactive services

  23. Pastrana-Vidal RR, Gicquel JC, Colomes C, Cherifi H (2004) Sporadic frame dropping impact on quality perception. In: Rogowitz BE, Pappas TN (eds) Human vision and electronic imaging IX, vol 5292. SPIE, pp 182–193

  24. Reisslein M, Lassetter J, Ratnam S, Lotfallah O, Fitzek FH, Panchanathan S (2002) Traffic and quality characterization of scalable encoded video: a large-scale trace-based study, part 1: overview and definitions. Arizona State Univ., Dept. of Electrical Eng., Tech. Rep. http://trace.eas.asu.edu/publications/p1.pdf. Accessed 7 May 2012

  25. Seeling P, Reisslein M, Kulapala B (2004) Network performance evaluation using frame size and quality traces of single-layer and two-layer video: a tutorial. IEEE Commun Surv Tutor 6(3):58–78

  26. Tionardi L, Hartanto F (2003) The use of cumulative inter-frame jitter for adapting video transmission rate. In: TENCON 2003. conference on convergent technologies for Asia-Pacific region, vol 1, pp 364–368

  27. Van der Auwera G, David PT, Reisslein M (2008) Traffic and quality characterization of single-layer video streams encoded with the H.264/MPEG-4 advanced video coding standard and scalable video coding extension. IEEE Trans Broadcast 54(3):698–718

  28. Varga A, Hornig R (2008) An overview of the OMNeT++ simulation environment. In: Simutools ’08: Proceedings of the 1st international conference on simulation tools and techniques for communications, networks and systems & workshops. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), Brussels, Belgium, pp 1–10

  29. Ziviani A, Wolfinger BE, Rezende JF, Duarte OC, Fdida S (2005) Joint adoption of QoS schemes for MPEG streams. Multimedia Tools Appl 26(1):59–80

Acknowledgements

This work was supported by the Spanish Ministry of Science and Innovation through the research project INSTINCT (TEC-2010-21178-C02-01). The authors would also like to thank the Spanish thematic network FIERRO (TEC2010-12250-E).

Author information

Corresponding author

Correspondence to Felix Espina.

Appendix: Cut lengths in a video

In a video flow that suffers frame losses, the cut length can vary from one frame to the total length of the video. The cut length depends on how the directly lost frames and the frames that become non-decodable because of them are grouped. Cuts must be understood as experienced by the user, so they must be measured in the presentation order of the video, not in the transmission order in which the losses take place.

For example (Fig. 6), in a video with the IPBBPBB GoP structure in transmission order, if the second B-frame and the second P-frame are lost, the last four frames of the GoP cannot be decoded. In presentation order, however, two disjoint cuts of sizes 1 and 3 are generated. This makes the analytical model of cuts more complex.

Fig. 6 Left: second B-frame and second P-frame in transmission order are lost; Center: the last B-frames cannot be decoded; Right: in presentation order two cuts of sizes 1 and 3 are generated
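
The Fig. 6 example can be checked with a short script. The following is a minimal Python sketch, assuming a single closed GoP given as a string in transmission order and zero-based positions for the directly lost frames. The function names (`presentation_order`, `undecodable`, `cut_lengths`) are illustrative and do not come from the paper.

```python
def presentation_order(tx):
    """Return transmission-order indices rearranged into presentation order.

    Assumption: each anchor frame (I or P) is transmitted before the
    B-frames that are displayed ahead of it.
    """
    out, pending = [], None
    for idx, kind in enumerate(tx):
        if kind in "IP":
            if pending is not None:
                out.append(pending)  # previous anchor is displayed now
            pending = idx
        else:
            out.append(idx)          # B-frames precede the pending anchor
    if pending is not None:
        out.append(pending)
    return out


def undecodable(tx, lost):
    """Indices that cannot be decoded, either lost or depending on a bad frame."""
    bad, anchors = set(), []
    for idx, kind in enumerate(tx):
        if kind == "I":
            ok = idx not in lost
        elif kind == "P":
            ok = idx not in lost and anchors[-1] not in bad
        else:  # B-frame: needs the two most recent anchors
            ok = idx not in lost and all(a not in bad for a in anchors[-2:])
        if not ok:
            bad.add(idx)
        if kind in "IP":
            anchors.append(idx)
    return bad


def cut_lengths(tx, lost):
    """Lengths of the runs of non-decodable frames in presentation order."""
    bad = undecodable(tx, lost)
    cuts, run = [], 0
    for idx in presentation_order(tx):
        if idx in bad:
            run += 1
        elif run:
            cuts.append(run)
            run = 0
    if run:
        cuts.append(run)
    return cuts


# Second B-frame (index 3) and second P-frame (index 4) are lost.
print(cut_lengths("IPBBPBB", {3, 4}))  # [1, 3]
```

The four non-decodable frames are contiguous in transmission order but split into two runs once the frames are reordered for display, which is the effect the figure illustrates.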

When an I-frame is directly lost the whole GoP cannot be decoded, creating a single cut of at least length N. In the case of an open GoP, the last B-frames from the previous GoP will be non-decodable and they will be part of the same single cut of length N + (M − 1) when considered in presentation order. If we define the control variable \(z=\frac{N_B}{M-1}-N_P\), where z = 1 if the GoP is an open one, and z = 0 if it is a closed GoP, then the cut length will be N + z * (M − 1).

If the I-frame of the next GoP is lost too, all the frames from the next GoP will not be decoded and a single cut of length 2 * N + z * (M − 1) will be generated. In general, when j consecutive I-frames are lost, a single cut of length j * N + z * (M − 1) will be generated.

The loss of a P-frame makes it impossible to decode all the following frames in the GoP. As all these frames are consecutive, both in transmission and presentation order, only one cut is generated. If the lost P-frame is the last or \(N_P\)th P-frame of the GoP, the cut will be M + z * (M − 1) frames long. If the lost P-frame is the penultimate or \((N_P-1)\)th P-frame of the GoP, the cut will be 2 * M + z * (M − 1) frames long. If the lost P-frame is the first P-frame of the GoP, the cut will be \(N_P\) * M + z * (M − 1) frames long. In general, when the \((N_P+1-i)\)th P-frame of the GoP is lost, a single cut of length i * M + z * (M − 1) will be generated.

If the I-frame from the next GoP is lost too, then all the frames from the next GoP will not be decoded. Again, all these frames are consecutive and result in only one cut of N + i * M + z * (M − 1) frames. In general, when the \((N_P+1-i)\)th P-frame of the GoP is lost and the next j I-frames are also lost, a single cut of length j * N + i * M + z * (M − 1) will be generated.
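
The cut lengths caused by I- and P-frame losses can be written as one small function. This is a minimal Python sketch under stated assumptions: the names `gop_flag` and `anchor_cut_length` are hypothetical, and passing `i = 0` to mean "no P-frame lost, only `j` consecutive I-frames" is a convenience of this sketch rather than notation from the paper.

```python
def gop_flag(n_b, n_p, m):
    """Control variable z = N_B / (M - 1) - N_P: 1 for an open GoP, 0 for a closed one."""
    blocks, remainder = divmod(n_b, m - 1)
    z = blocks - n_p
    if remainder != 0 or z not in (0, 1):
        raise ValueError("N_B, N_P and M do not describe a regular GoP")
    return z


def anchor_cut_length(j, i, n, m, z):
    """Cut length j * N + i * M + z * (M - 1).

    j: number of consecutive I-frames lost (after the lost P-frame, if any).
    i: the lost P-frame is the (N_P + 1 - i)th of its GoP; 0 means no P-frame lost.
    """
    if j == 0 and i == 0:
        raise ValueError("at least one I- or P-frame must be lost")
    return j * n + i * m + z * (m - 1)


# Closed GoP IPBBPBB: N = 7, M = 3, N_P = 2, N_B = 4.
z = gop_flag(n_b=4, n_p=2, m=3)
print(z)                                          # 0
print(anchor_cut_length(j=0, i=1, n=7, m=3, z=z)) # 3 (last P-frame lost)
print(anchor_cut_length(j=1, i=0, n=7, m=3, z=z)) # 7 (one I-frame lost)
```

For an open GoP with N = 12, M = 3, N_P = 3 and N_B = 8, `gop_flag` returns 1, and every cut caused by an I- or P-frame loss grows by M − 1 = 2 frames.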

The loss of c consecutive B-frames of a B-frames block does not affect other frames and it creates only one cut. If the last B-frame of the B-frames block is lost and the next frame (either I-frame or P-frame) is lost too, in presentation order the non-decoded frames will not be consecutive. The B-frames will be on one cut and the frames non-decoded from the lost I- or P-frame will be on the other cut.

The possible cut lengths are summarized in (19), where F is the number of frames in the video.

$$ \begin{array}{rcl} c &=& [\, 1 \ldots M-1 ,\: j * N + i * M + z * (M-1) ,\: (j+1) * N + z * (M-1) \,] \\ i &=& 1 \ldots N_P \\ j &=& 0 \ldots N_G-1 \end{array} $$
(19)
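
To make (19) concrete, the set of admissible cut lengths can be enumerated directly. The following is a minimal Python sketch; the function name `possible_cut_lengths` and the example parameters are assumptions chosen for illustration.

```python
def possible_cut_lengths(n, m, n_p, z, n_g):
    """Enumerate the cut lengths listed in (19) for a regular GoP structure."""
    lengths = set(range(1, m))                   # B-frame-only cuts: 1 .. M-1
    for j in range(n_g):                         # j = 0 .. N_G - 1
        for i in range(1, n_p + 1):              # i = 1 .. N_P
            lengths.add(j * n + i * m + z * (m - 1))
        lengths.add((j + 1) * n + z * (m - 1))
    return sorted(lengths)


# Closed GoP with N = 7, M = 3, N_P = 2 and a video of N_G = 2 GoPs.
print(possible_cut_lengths(n=7, m=3, n_p=2, z=0, n_g=2))
# [1, 2, 3, 6, 7, 10, 13, 14]
```

Lengths such as 4 or 5 never appear for this structure, because no combination of lost and dependent frames produces a run of that size in presentation order.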

A similar approach was used for the analytical expression of Q in [29]. In the following we derive the analytical expression of \(N_{\rm cut}[c]\), the number of cuts of length c frames.

\(P_{\{I,P,B\}}\) is the probability of an {I, P, B}-frame being directly lost. It is assumed that direct frame losses are mutually independent.

A cut of length (j + 1) * N + z * (M − 1) frames will be generated if j + 1 consecutive I-frames are lost and the previous and next frames (in presentation order) are decoded. The next frame will always be the next I-frame, so this I-frame cannot be lost. The previous frame will always be the last P-frame of the previous GoP, so this P-frame cannot be lost. Moreover, neither the other P-frames nor the I-frame of that previous GoP can be lost, or its last P-frame would not be decodable. Therefore, \(N_{\rm cut}[(j+1) * N + z * (M-1)] = N_G * P_I^{(j+1)} * (1-P_I)^2 (1-P_P)^{N_P}\), where \(N_G = F/N\) is the number of GoPs in the video.

A cut of length i * M + z * (M − 1) frames will be generated if the \((N_P+1-i)\)th P-frame of a GoP is lost and, again, the previous and next frames (in presentation order) are decoded. Remember that \(i = 1 \ldots N_P\). The next frame will always be the next I-frame, so this I-frame cannot be lost. The previous frame will be a previous P-frame of the GoP or the I-frame of the GoP, so the previous P-frames and the I-frame cannot be lost. Therefore, \(N_{\rm cut}[i * M + z * (M-1)] = N_G * P_P * (1-P_I)^2 (1-P_P)^{N_P-i}\).

A cut of length j * N + i * M + z * (M − 1) frames will be generated if after losing a P-frame the next j I-frames are lost. Then, \(N_{\rm cut}[j * N + i * M + z * (M-1)] = N_G * P_I^j * P_P * (1-P_I)^2 (1-P_P)^{N_P-i}\).

To generate a cut of length M − 1 frames, the M − 1 frames of a B-frames block have to be lost and its neighbouring frames in presentation order have to be decoded. These neighbouring frames will be the two previous P-frames, or the previous I- and P-frame, in transmission order. As the P-frames depend on the I-frame and on the previous P-frames of the GoP, none of the previous P-frames nor the I-frame of the GoP can be lost. Therefore, for the ith B-frames block the number of cuts of length M − 1 frames is \(N_G * P_B^{M-1} * (1-P_I) (1-P_P)^i\), where \(i = 1 \ldots N_P\).

For an open GoP, the reasoning for B-frames is valid but incomplete, because there are \(N_P + 1\) B-frames blocks in the GoP, not only \(N_P\). The frames in this “extra” block depend not only on the I-frame and the \(N_P\) P-frames of the GoP, but also on the I-frame of the next GoP. Then, for the \((N_P+1)\)th B-frames block of an open GoP the number of cuts of length M − 1 frames is \(N_G * P_B^{M-1} * (1-P_I)^2 (1-P_P)^{N_P}\).

In general, the number of cuts of length M − 1 frames is:

$$ N_G * P_B^{M-1} * (1-P_I) \sum\limits_{m=1}^{N_P} (1-P_P)^m + z * N_G * P_B^{M-1} * (1-P_I)^2 (1-P_P)^{N_P} $$

This procedure can be extended to cuts of length c = 1 to c = M − 2 frames, but there is an extra difficulty: in each B-frames block there are now M − c possible loss combinations that generate a cut of length c frames. For example, for c = M − 2 there are two possibilities (see Fig. 7). The first is to lose the first M − 2 B-frames of the block and not lose the last B-frame. The second is to not lose the first B-frame of the block and to lose the other M − 2 B-frames. In both cases, for the ith B-frames block, the number of cuts of length M − 2 frames is \(N_G * (1-P_B) * P_B^{M-2} * (1-P_I) (1-P_P)^i\).

Fig. 7 GoP structure (9, 4) and its two possibilities for generating a cut of two frames

To express the analytical model for cuts of length c = 1 ... M − 1 mathematically, we define σ and δ. σ is the minimum number of B-frames from the same block that must be available to the decoding process (must not be lost) to prevent a cut longer than c, when the first lost B-frame is the rth in the block. δ is the contribution to the number of cuts of length c frames of the M − c possible loss combinations of a B-frames block that generate a cut of length c frames. Equation (20) gives the expression for σ and (21) the expression for δ.

$$ \sigma = \begin{cases} 0 & \text{if $r=1$ and $r+c=M$} \\ 1 & \text{if $(r=1$ and $r+c<M)$ or $(r>1$ and $r+c=M)$} \\ 2 & \text{if $r>1$ and $r+c<M$} \end{cases} $$
(20)
$$ \delta = \sum\limits_{r=1}^{M-c} (1-P_B)^{\sigma} $$
(21)
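
The definitions in (20) and (21) translate almost literally into code. The following is a minimal Python sketch; the function names `sigma` and `delta` mirror the symbols, and the parameter values in the example are arbitrary assumptions.

```python
def sigma(r, c, m):
    """Number of B-frames that must survive around a run of c lost B-frames.

    r is the position (1-based) of the first lost B-frame in a block of
    M - 1 B-frames. A guard frame is needed on the left when r > 1 and on
    the right when the run does not reach the end of the block (r + c < M).
    """
    left_guard = 1 if r > 1 else 0
    right_guard = 1 if r + c < m else 0
    return left_guard + right_guard


def delta(c, m, p_b):
    """Sum over the M - c start positions of (1 - P_B) ** sigma, as in (21)."""
    return sum((1.0 - p_b) ** sigma(r, c, m) for r in range(1, m - c + 1))


# Block of M - 1 = 3 B-frames (M = 4) with P_B = 0.1.
for c in (1, 2, 3):
    print(c, [sigma(r, c, 4) for r in range(1, 4 - c + 1)], delta(c, 4, 0.1))
```

For c = M − 1 the whole block is lost, so there is a single combination with σ = 0 and δ = 1, which agrees with the expression for cuts of length M − 1 frames.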

In general, the number of cuts of length c = 1 ...M − 1 frames is:

$$ N_G * \delta * P_B^c * (1-P_I) \sum\limits_{m=1}^{N_P} (1-P_P)^m + z * N_G * \delta * P_B^c * (1-P_I)^2 (1-P_P)^{N_P} $$

Equation (22) presents the general expression for \(N_{\rm cut}[c]\). It depends on the GoP structure of the video and on the direct loss probabilities of the frames, \(P_{\{I,P,B\}}\).

$$ N_{\rm cut}[c] = \begin{cases} N_G * \delta * P_B^c * (1-P_I) \sum\limits_{m=1}^{N_P} (1-P_P)^m + z * N_G * \delta * P_B^c * (1-P_I)^2 (1-P_P)^{N_P} & \text{for $c = 1 \ldots M-1$} \\ \\ N_G * P_I^j * P_P * (1-P_I)^2 (1-P_P)^{N_P-i} & \text{for $c = j * N + i * M + z * (M-1)$} \\ \\ N_G * P_I^{j+1} * (1-P_I)^2 (1-P_P)^{N_P} & \text{for $c = (j+1) * N + z * (M-1)$} \\ \\ 0 & \text{otherwise} \end{cases} $$
(22)
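
The piecewise expression for \(N_{\rm cut}[c]\) can be evaluated numerically. The following is a minimal Python sketch, not the authors' implementation. It follows the derivation in the text, where a cut of (j + 1) * N + z * (M − 1) frames requires j + 1 lost I-frames and therefore carries the factor \(P_I^{j+1}\). The function name `n_cut`, the argument names and the loss probabilities in the example are assumptions.

```python
def n_cut(c, n, m, n_p, z, n_g, p_i, p_p, p_b):
    """Expected number of cuts of exactly c frames for a regular GoP structure."""
    ok_i, ok_p = 1.0 - p_i, 1.0 - p_p

    if 1 <= c <= m - 1:
        # Cuts made only of B-frames. delta sums over the M - c start
        # positions r; the exponent counts the guard B-frames that must survive.
        delta = sum((1.0 - p_b) ** ((r > 1) + (r + c < m))
                    for r in range(1, m - c + 1))
        per_block = sum(ok_p ** k for k in range(1, n_p + 1))
        base = n_g * delta * p_b ** c
        return base * ok_i * per_block + z * base * ok_i ** 2 * ok_p ** n_p

    rest = c - z * (m - 1)
    if rest <= 0:
        return 0.0
    j, rem = divmod(rest, n)
    if rem == 0:
        # Only I-frames lost: here j counts the lost I-frames (j + 1 in the text).
        return n_g * p_i ** j * ok_i ** 2 * ok_p ** n_p if 1 <= j <= n_g else 0.0
    i, extra = divmod(rem, m)
    if extra == 0 and 1 <= i <= n_p and j <= n_g - 1:
        # The (N_P + 1 - i)th P-frame lost, followed by j lost I-frames.
        return n_g * p_i ** j * p_p * ok_i ** 2 * ok_p ** (n_p - i)
    return 0.0


# Closed GoP IPBBPBB (N = 7, M = 3, N_P = 2), 100 GoPs, illustrative loss rates.
gop = dict(n=7, m=3, n_p=2, z=0, n_g=100, p_i=0.01, p_p=0.02, p_b=0.05)
for c in (1, 2, 3, 4, 6, 7):
    print(c, n_cut(c, **gop))
```

Lengths that are not of one of the three admissible forms, such as c = 4 for this structure, evaluate to zero.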

The total number of cuts \(T_{\rm cut}\) can be computed as:

$$ T_{\rm cut} = \sum\limits_{c=1}^{F} N_{\rm cut}[c] $$
(23)

The proportion of cuts of length c frames, \(P_{\rm cut}[c]\), is computed by dividing the number of cuts of length c frames, \(N_{\rm cut}[c]\), by the total number of cuts \(T_{\rm cut}\). The cut length probability mass function \(P_{\rm cut}\) is the set of all values \(P_{\rm cut}[c]\).

$$ P_{\rm cut}[c] = \frac{ N_{\rm cut}[c] }{ T_{\rm cut} } = \frac{ N_{\rm cut}[c] }{ \sum\limits_{i=1}^{F} N_{\rm cut}[i] } $$
(24)

The average cut length \(L_{\rm cut}\) can be computed from the cut length probability mass function:

$$ L_{\rm cut} = \sum\limits_{c=1}^{F} c * P_{\rm cut}[c] = \frac{ \sum\limits_{c=1}^{F} c * N_{\rm cut}[c] }{ \sum\limits_{c=1}^{F} N_{\rm cut}[c] } = \frac{ \sum\limits_{c=1}^{F} c * N_{\rm cut}[c] }{ T_{\rm cut} } $$
(25)
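
Equations (23) to (25) are simple aggregations over \(N_{\rm cut}[c]\). The following is a minimal Python sketch that takes the cut counts as a dictionary keyed by cut length; the function name `cut_statistics` and the counts in the example are made up for illustration and are not results from the paper.

```python
def cut_statistics(n_cut_by_length):
    """Return (T_cut, P_cut, L_cut) from a mapping {c: N_cut[c]}."""
    t_cut = sum(n_cut_by_length.values())                      # Eq. (23)
    if t_cut <= 0:
        raise ValueError("no cuts: P_cut and L_cut are undefined")
    p_cut = {c: count / t_cut                                  # Eq. (24)
             for c, count in n_cut_by_length.items()}
    l_cut = sum(c * p for c, p in p_cut.items())               # Eq. (25)
    return t_cut, p_cut, l_cut


# Hypothetical counts: 6 cuts of 1 frame, 3 cuts of 2 frames, 1 cut of 7 frames.
t_cut, p_cut, l_cut = cut_statistics({1: 6.0, 2: 3.0, 7: 1.0})
print(t_cut, p_cut, l_cut)
```

Lengths with \(N_{\rm cut}[c] = 0\) can simply be left out of the dictionary, since they contribute to neither sum.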

Cite this article

Espina, F., Morato, D., Izal, M. et al. Analytical model for MPEG video frame loss rates and playback interruptions on packet networks. Multimed Tools Appl 72, 361–383 (2014). https://doi.org/10.1007/s11042-012-1344-1

  • DOI: https://doi.org/10.1007/s11042-012-1344-1
