Probabilistic Subpixel Temporal Registration for Facial Expression Analysis

  • Conference paper
Computer Vision -- ACCV 2014 (ACCV 2014)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9006)

Abstract

Face images in a video sequence should be registered accurately before any analysis; otherwise, registration errors may be interpreted as facial activity. Subpixel accuracy is crucial for the analysis of subtle actions. In this paper we present PSTR (Probabilistic Subpixel Temporal Registration), a framework that achieves high registration accuracy. Inspired by the human visual system, we develop a motion representation that measures registration errors between subsequent frames, a probabilistic model that learns the registration errors from the proposed motion representation, and an iterative registration scheme that identifies registration failures, thus making PSTR aware of its errors. We evaluate PSTR’s temporal registration accuracy on facial action and expression datasets, and demonstrate its ability to generalise to naturalistic data even when trained with controlled data.
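
To make the notion of subpixel temporal registration concrete, the sketch below estimates a fractional translation between two frames and reports whether the iteration converged. It is not PSTR: it swaps in a generic iterative gradient-based (Lucas-Kanade style) translation estimate in place of the paper's motion representation and probabilistic model, and the convergence flag is only a loose analogy to detecting registration failures. The names `fourier_shift`, `estimate_shift` and `make_blob`, the synthetic Gaussian test image, and all parameter values are assumptions introduced for illustration.

```python
import numpy as np


def fourier_shift(img, dy, dx):
    """Translate a periodic image by (dy, dx) pixels via the Fourier shift theorem."""
    fy = np.fft.fftfreq(img.shape[0])[:, None]  # vertical frequencies (cycles/pixel)
    fx = np.fft.fftfreq(img.shape[1])[None, :]  # horizontal frequencies (cycles/pixel)
    phase = np.exp(-2j * np.pi * (fy * dy + fx * dx))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * phase))


def estimate_shift(ref, cur, max_iter=20, tol=1e-3):
    """Estimate the (dy, dx) translation that maps `ref` onto `cur`.

    Returns (shift, converged). `converged` is False when the update
    never falls below `tol`, which we treat here as a registration failure.
    """
    gy, gx = np.gradient(ref)
    G = np.stack([gy.ravel(), gx.ravel()], axis=1)  # image gradients as a design matrix
    est = np.zeros(2)
    for _ in range(max_iter):
        # Undo the current estimate, then measure what misalignment is left.
        aligned = fourier_shift(cur, -est[0], -est[1])
        residual = (aligned - ref).ravel()
        # Linearised model: residual ~ -G @ delta, solved in the least-squares sense.
        delta, *_ = np.linalg.lstsq(G, -residual, rcond=None)
        est += delta
        if np.linalg.norm(delta) < tol:
            return est, True
    return est, False


def make_blob(size=64, sigma=6.0):
    """Synthetic smooth test image (hypothetical stand-in for a face frame)."""
    y, x = np.mgrid[:size, :size]
    c = size / 2
    return np.exp(-((y - c) ** 2 + (x - c) ** 2) / (2 * sigma ** 2))


ref = make_blob()
cur = fourier_shift(ref, 0.30, -0.45)  # second frame, displaced by a subpixel amount
shift, ok = estimate_shift(ref, cur)
print("estimated shift:", shift, "converged:", ok)
```

The linearisation only holds for small displacements on smooth images, so a sketch like this would need a coarse alignment step first on real video; the Fourier-domain warp also assumes periodic image boundaries.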

Notes

  1. The demo video is available at http://www.youtube.com/user/AffectQMUL.

Acknowledgement

The work of E. Sariyanidi and H. Gunes is partially supported by the EPSRC MAPTRAITS Project (Grant Ref: EP/K017500/1).

Author information

Corresponding author

Correspondence to Evangelos Sariyanidi.

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material (avi 17,790 KB)

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Sariyanidi, E., Gunes, H., Cavallaro, A. (2015). Probabilistic Subpixel Temporal Registration for Facial Expression Analysis. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9006. Springer, Cham. https://doi.org/10.1007/978-3-319-16817-3_21

  • DOI: https://doi.org/10.1007/978-3-319-16817-3_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16816-6

  • Online ISBN: 978-3-319-16817-3

  • eBook Packages: Computer Science, Computer Science (R0)
