Detection of Lip Synchronization Artifacts

  • Conference paper
  • In: Multimedia Communications, Services and Security (MCSS 2015)

Abstract

Over 10 billion hours of video are watched on the Internet each month. This, together with high-definition television broadcasting and the rise of high-quality video on demand, makes quality assessment a key task in today's global multimedia market. Automated quality checking currently relies on detecting major audiovisual artifacts. The Monitoring Of Audio Visual quality by key Indicators (MOAVI) subgroup of the Video Quality Experts Group (VQEG) is an open collaborative project for developing no-reference models for monitoring audiovisual service quality. This paper reports on the development of the audiovisual part of the project, namely the detection of lip synchronization (also known as lip sync) artifacts.
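
The abstract names the task but the detector itself sits behind the access wall, so the following is only a rough illustration of how a no-reference lip-sync indicator of this kind can be built: it correlates mouth-region motion, localized with an OpenCV Viola-Jones face cascade, against a crude audio-energy proxy for voice activity, and reports the lag that best aligns the two signals. The file-format assumption (16-bit mono PCM WAV), the lower-face mouth heuristic, the detector parameters, and all function names are illustrative assumptions, not the authors' published method.

```python
# Illustrative, hypothetical sketch of a no-reference lip-sync indicator:
# estimate the audio-video lag that best aligns mouth-motion activity with
# audio activity. Not the algorithm from the paper.
import wave

import cv2
import numpy as np


def mouth_activity(video_path):
    """Per-frame mouth-region motion energy: Viola-Jones face detection plus
    frame differencing over the lower third of the face (a crude heuristic)."""
    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0  # fall back if metadata is absent
    activity, prev = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(gray, 1.3, 5)
        if len(faces) == 0:
            activity.append(0.0)
            prev = None
            continue
        x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest face
        mouth = cv2.resize(gray[y + 2 * h // 3:y + h, x:x + w], (64, 32))
        cur = mouth.astype(np.float32)
        activity.append(float(np.mean(np.abs(cur - prev)))
                        if prev is not None else 0.0)
        prev = cur
    cap.release()
    return np.asarray(activity), fps


def audio_activity(wav_path, fps):
    """RMS audio energy per video frame; a stand-in for a real voice activity
    detector. Assumes a 16-bit mono PCM WAV file."""
    with wave.open(wav_path) as w:
        rate = w.getframerate()
        samples = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)
    hop = int(rate / fps)                      # audio samples per video frame
    n = len(samples) // hop
    frames = samples[:n * hop].astype(np.float64).reshape(n, hop)
    return np.sqrt(np.mean(frames ** 2, axis=1))


def estimate_offset(video_path, wav_path, max_lag=25):
    """Return the lag (in frames and seconds) maximizing the normalized
    cross-correlation between mouth motion and audio energy."""
    v, fps = mouth_activity(video_path)
    a = audio_activity(wav_path, fps)
    n = min(len(v), len(a))
    v = (v[:n] - v[:n].mean()) / (v[:n].std() + 1e-9)
    a = (a[:n] - a[:n].mean()) / (a[:n].std() + 1e-9)
    lags = list(range(-max_lag, max_lag + 1))
    scores = [np.mean(v[max(k, 0):n + min(k, 0)] *
                      a[max(-k, 0):n - max(k, 0)]) for k in lags]
    best = lags[int(np.argmax(scores))]
    return best, best / fps
```

A lag of zero frames indicates nominally synchronized streams; a monitoring system would flag a sustained non-zero lag, with the perceptual threshold taken from the synchronization literature rather than from this sketch.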



Acknowledgments

This work was co-financed by the Polish National Centre for Research and Development (NCBR) as part of EUREKA Project No. C 2012/1-5 MITSU.

Author information

Correspondence to Ignacio Blanco Fernández.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Fernández, I.B., Leszczuk, M. (2015). Detection of Lip Synchronization Artifacts. In: Dziech, A., Leszczuk, M., Baran, R. (eds) Multimedia Communications, Services and Security. MCSS 2015. Communications in Computer and Information Science, vol 566. Springer, Cham. https://doi.org/10.1007/978-3-319-26404-2_2

  • DOI: https://doi.org/10.1007/978-3-319-26404-2_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26403-5

  • Online ISBN: 978-3-319-26404-2

  • eBook Packages: Computer Science, Computer Science (R0)
