Abstract
Challenges such as resource requirements, limited storage space on devices, and limited mobile bandwidth inhibit unconstrained and ubiquitous video consumption. We propose a first-of-its-kind methodology to compress videos that feature human faces. We detect facial landmarks on the fly and compress the video by storing a sequence of distinct frames extracted from it, chosen so that the facial landmarks of each pair of successively stored frames differ significantly. We use a dynamic thresholding technique to decide whether a difference is significant, and we store meta-information for reconstructing the missing frames. To reduce glitches in the decompressed video, we use a morphing technique that smooths the transition between successive frames. We evaluate our technique objectively by measuring the time taken to compress, the entropy per frame, peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and compression ratio. For subjective analysis, we perform a user study that observes user satisfaction at different compression ratios. We also extend our technique to handle videos with multiple faces. Our approach is complementary to existing compression techniques, e.g., JPEG. Combining the two further improves the compression ratio without compromising quality.
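The frame-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the thresholding rule, so the running-mean threshold, the `scale` parameter, and the function names `landmark_distance`, `select_keyframes`, and `psnr` are all assumptions made for illustration. Landmarks are assumed to be given as lists of (x, y) points, one list per frame, from any landmark detector. The `psnr` helper uses the standard definition over flat lists of pixel intensities.

```python
import math


def landmark_distance(a, b):
    """Mean Euclidean distance between corresponding (x, y) landmarks."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def select_keyframes(landmarks_per_frame, scale=1.5):
    """Return the indices of the frames to store.

    Assumed (hypothetical) dynamic threshold: `scale` times the running
    mean of the frame-to-frame landmark distances seen so far. A frame is
    stored when its landmarks have drifted from the last stored frame by
    more than that threshold.
    """
    if not landmarks_per_frame:
        return []
    kept = [0]  # the first frame is always stored
    running_sum, count = 0.0, 0
    for i in range(1, len(landmarks_per_frame)):
        step = landmark_distance(landmarks_per_frame[i - 1], landmarks_per_frame[i])
        running_sum += step
        count += 1
        threshold = scale * (running_sum / count)
        drift = landmark_distance(landmarks_per_frame[kept[-1]], landmarks_per_frame[i])
        if drift > threshold:
            kept.append(i)
    return kept


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB for two equal-length pixel lists."""
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Toy input: one landmark moving one pixel to the right in each frame.
frames = [[(float(x), 0.0)] for x in range(5)]
stored = select_keyframes(frames)
frame_ratio = len(frames) / len(stored)  # frames in / frames stored
```

In this toy input every second frame is stored, because the drift from the last stored frame exceeds the threshold only after two steps. A video in which the face does not move stores only its first frame. The frame-count ratio here is only a rough proxy for the compression ratio reported in the paper, which would also account for the stored meta-information.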
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chhikara, G., Banerjee, R., Naik, V., Subramanyam, A.V., Dey, K. (2019). Use of Facial Landmarks for Adaptive Compression of Videos on Mobile Devices. In: Biswas, S., et al. Communication Systems and Networks. COMSNETS 2018. Lecture Notes in Computer Science(), vol 11227. Springer, Cham. https://doi.org/10.1007/978-3-030-10659-1_4
DOI: https://doi.org/10.1007/978-3-030-10659-1_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-10658-4
Online ISBN: 978-3-030-10659-1
eBook Packages: Computer Science, Computer Science (R0)