ABSTRACT
Temporal synchronization of multiple video recordings of the same dynamic event is a critical task in many computer vision applications, e.g., novel view synthesis and 3D reconstruction. Typically, the temporal alignment is implied by the time-stamp information embedded in the video streams. User-generated videos shot with consumer-grade equipment do not contain this information; hence, such videos must be synchronized using the visual information itself. Previous work in this area has assumed either good-quality data with relatively simple dynamic content or the availability of precise camera geometry.
Our first contribution is a synchronization technique that attempts to establish correspondence between feature trajectories across views in a novel way. It specifically targets the kind of complex content found in consumer-generated sports recordings and does not assume precise knowledge of fundamental matrices or homographies. We evaluate performance on a number of real video recordings and show that our method synchronizes to within 1 second, which is significantly better than previous approaches.
Our second contribution is a robust, unsupervised, view-invariant activity recognition descriptor that exploits recurrence plot theory on spatial tiles. Evaluated on its own, the descriptor is shown to characterize activities from different views under occlusion better than state-of-the-art approaches. We combine this descriptor with our proposed synchronization method and show that it can further refine the synchronization index.
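As a minimal illustration of the recurrence-plot idea that the descriptor builds on (not the descriptor itself, whose details are not given here), the sketch below marks which pairs of frames have similar feature vectors. The function name `recurrence_plot`, the threshold `eps`, and the toy per-frame features are assumptions introduced purely for illustration.

```python
import math


def recurrence_plot(series, eps):
    """Return a binary recurrence matrix R for a sequence of feature vectors.

    R[i][j] is 1 when the vectors at frames i and j lie within Euclidean
    distance eps of each other, and 0 otherwise.
    """
    n = len(series)
    return [
        [1 if math.dist(series[i], series[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical per-frame features: a motion that repeats every 4 frames.
frames = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)] * 2

R = recurrence_plot(frames, eps=0.1)

# Print the matrix; periodic motion shows up as diagonals offset by the period.
for row in R:
    print("".join("#" if v else "." for v in row))
```

Because the matrix depends only on pairwise distances within a single sequence, its structure reflects the dynamics of the motion rather than absolute feature values, which is one intuition for why recurrence-based representations are attractive when views differ.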
Index Terms
- Synchronization of user-generated videos through trajectory correspondence and a refinement procedure