Abstract
A standing conversational group (also known as an F-formation) occurs when two or more people sustain a social interaction, such as chatting at a cocktail party. Detecting such interactions in images or videos is of fundamental importance in many contexts, such as surveillance, social signal processing, social robotics and activity classification. This paper presents an approach to this problem that models the socio-psychological concept of an F-formation together with the biological constraints of social attention. Essentially, an F-formation defines constraints on how subjects have to be mutually located and oriented, while the biological constraints define the plausible zone in which people can interact. We develop a game-theoretic framework embedding these constraints, supported by a statistical model of the uncertainty associated with the position and orientation of people. First, we use a novel representation of the affinity between pairs of people, expressed as a distance between distributions over the most plausible oriented region of attention. Additionally, we integrate temporal information over multiple frames to smooth noisy head orientation and pose estimates, resolve ambiguous situations and establish a more precise social context. We do this in a principled way by using recent notions from multi-payoff evolutionary game theory. Experiments on several benchmark datasets consistently show the superiority of the proposed approach over the state of the art and its robustness under severe noise conditions.
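The single-frame idea in the abstract can be illustrated with a minimal sketch, given here only as an aid to intuition and not as the authors' implementation. The abstract does not specify the distance between distributions, the sampling model or the game dynamics, so everything below is an assumption: the Jensen-Shannon divergence as the distance, Gaussian noise on orientation with a uniformly sampled reach, an exponential kernel with a hypothetical bandwidth `sigma`, and discrete-time replicator dynamics as the evolutionary process. The multi-frame, multi-payoff part is not covered.

```python
import numpy as np

def attention_histogram(pos, theta, rng, n=4000, reach=1.5, spread=0.4,
                        bins=20, extent=(-2.0, 6.0)):
    """Sample a person's plausible region of attention and bin it on a 2D grid.
    The sampling model (Gaussian angle noise, uniform reach) is an assumption."""
    angles = theta + rng.normal(0.0, spread, n)   # uncertain orientation
    radii = rng.uniform(0.0, reach, n)            # distance in front of the person
    xs = pos[0] + radii * np.cos(angles)
    ys = pos[1] + radii * np.sin(angles)
    hist, _, _ = np.histogram2d(xs, ys, bins=bins, range=[extent, extent])
    return hist.ravel() / hist.sum()

def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (natural log)."""
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0                              # 0 * log(0) is treated as 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def affinity_matrix(hists, sigma=0.2):
    """Pairwise affinity exp(-JS / sigma); sigma is a hypothetical bandwidth."""
    k = len(hists)
    A = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            A[i, j] = A[j, i] = np.exp(-js_divergence(hists[i], hists[j]) / sigma)
    return A                                      # zero diagonal: no self-affinity

def replicator_dynamics(A, iters=1000, tol=1e-9):
    """Discrete-time replicator dynamics started from the barycenter of the simplex."""
    x = np.full(A.shape[0], 1.0 / A.shape[0])
    for _ in range(iters):
        Ax = A @ x
        new_x = x * Ax / (x @ Ax)
        converged = np.abs(new_x - x).sum() < tol
        x = new_x
        if converged:
            break
    return x

# Toy scene: persons 0 and 1 face each other, person 2 stands far away.
rng = np.random.default_rng(0)
people = [((0.0, 0.0), 0.0), ((1.2, 0.0), np.pi), ((4.0, 4.0), np.pi / 4)]
hists = [attention_histogram(pos, theta, rng) for pos, theta in people]
A = affinity_matrix(hists)
x = replicator_dynamics(A)
group = [int(i) for i in np.flatnonzero(x > 1e-4)]  # support of the converged strategy
print(group)
```

On this toy scene the support of the converged vector should contain only the two people facing each other. In a fuller pipeline the extracted group would be removed and the dynamics rerun on the remaining people; that peeling step is also an assumption and is not described in the abstract.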
H. Hung—Author has been partially supported by the European Commission under contract number FP7-ICT-600877 (SPENCER) and is affiliated with the Delft Data Science consortium.
© 2015 Springer International Publishing Switzerland
Cite this paper
Vascon, S., Mequanint, E.Z., Cristani, M., Hung, H., Pelillo, M., Murino, V. (2015). A Game-Theoretic Probabilistic Approach for Detecting Conversational Groups. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9007. Springer, Cham. https://doi.org/10.1007/978-3-319-16814-2_43
DOI: https://doi.org/10.1007/978-3-319-16814-2_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16813-5
Online ISBN: 978-3-319-16814-2
eBook Packages: Computer Science (R0)