DOI: 10.1145/2072298.2071973
short-paper

Detecting motion synchrony by video tubes

Published: 28 November 2011

Abstract

Motion synchrony, i.e., the coordinated motion of a group of individuals, is an interesting phenomenon in nature and in daily life: fish swim in schools, birds fly in flocks, soldiers march in platoons, and so on. Our goal is to detect motion synchrony that may be present in video data, and to track the group of moving objects as a whole. This opens the door to novel algorithms and applications. To this end, we model individual motions as video tubes in space-time, define motion synchrony by the geometric relation among video tubes, and track a whole set of tubes by dynamic programming. The resulting algorithm is highly efficient in practice. Given a video clip of T frames of resolution X × Y, we show that finding the K spatially correlated video tubes and determining the presence of synchrony can be solved optimally in O(XYTK) time. Preliminary experiments show that our method is both effective and efficient. Typical running times are 30 to 100 VGA-resolution frames per second after feature extraction, and the accuracy of synchrony detection is more than 90% as evaluated on our annotated data set.
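
To make the idea of tracking a tube by dynamic programming concrete, here is a minimal sketch for a single tube. It is not the paper's algorithm: the abstract does not specify the score function, the motion model, or how the K tubes are coupled, so the per-pixel `scores` volume, the `max_step` motion bound, and the function name `track_tube` are all assumptions made for illustration. The sketch runs a plain Viterbi-style recursion over frames, which costs O(XYT) times the size of the motion window, and it does not attempt to reproduce the O(XYTK) bound stated above.

```python
def track_tube(scores, max_step=1):
    """Find the best single tube through a space-time score volume.

    scores[t][y][x] is a hypothetical per-pixel score (higher is better).
    The tube centre may move at most `max_step` pixels per axis between
    consecutive frames (an assumed motion model).
    Returns (total_score, path) where path[t] is the (y, x) centre in frame t.
    """
    T, Y, X = len(scores), len(scores[0]), len(scores[0][0])
    best = [row[:] for row in scores[0]]   # best accumulated score per pixel
    back = []                              # back-pointers, one grid per transition

    for t in range(1, T):
        cur = [[0.0] * X for _ in range(Y)]
        ptr = [[None] * X for _ in range(Y)]
        for y in range(Y):
            for x in range(X):
                # Best predecessor inside the allowed motion window.
                val, ptr[y][x] = max(
                    (best[py][px], (py, px))
                    for py in range(max(0, y - max_step), min(Y, y + max_step + 1))
                    for px in range(max(0, x - max_step), min(X, x + max_step + 1))
                )
                cur[y][x] = scores[t][y][x] + val
        best = cur
        back.append(ptr)

    # Pick the best end point, then follow back-pointers to frame 0.
    total, pos = max((best[y][x], (y, x)) for y in range(Y) for x in range(X))
    path = [pos]
    for ptr in reversed(back):
        pos = ptr[pos[0]][pos[1]]
        path.append(pos)
    path.reverse()
    return total, path


# Toy volume: a single bright pixel moving diagonally across three 3x3 frames.
scores = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
for t in range(3):
    scores[t][t][t] = 1.0
total, path = track_tube(scores)
```

Because the recursion keeps the best accumulated score for every pixel in every frame, the recovered path is globally optimal under the assumed motion bound, which is the property that makes dynamic programming attractive for offline tracking of this kind.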

Cited By

  • (2017) A Branch-and-Bound Framework for Unsupervised Common Event Discovery. International Journal of Computer Vision 123:3 (372-391). DOI: 10.1007/s11263-017-0989-7. Online publication date: 1-Jul-2017
  • (2015) Unsupervised Synchrony Discovery in Human Interaction. Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV) (3146-3154). DOI: 10.1109/ICCV.2015.360. Online publication date: 7-Dec-2015
  • (2014) Dynamic Background Learning through Deep Auto-encoder Networks. Proceedings of the 22nd ACM international conference on Multimedia (107-116). DOI: 10.1145/2647868.2654914. Online publication date: 3-Nov-2014

Published In

MM '11: Proceedings of the 19th ACM international conference on Multimedia
November 2011
944 pages
ISBN:9781450306164
DOI:10.1145/2072298

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. motion analysis
  2. synchronization

Qualifiers

  • Short-paper

Conference

MM '11
MM '11: ACM Multimedia Conference
November 28 - December 1, 2011
Scottsdale, Arizona, USA

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
