DOI: 10.1145/1178782.1178793
Article

Multiview fusion for canonical view generation based on homography constraints

Published: 27 October 2006

ABSTRACT

Activity and gait recognition are among the applications that require view-specific input. In a real surveillance scenario, it is impractical to assume that the desired canonical view will always be available. We present a framework to generate the canonical view of any translating object in a scene monitored by multiple cameras. The method can recover this view even though none of the cameras sees it individually. The process has two steps. First, the camera and scene geometry are used to identify the sagittal plane of the object, which defines the canonical view. Next, each original view is warped to the canonical view through planar homographies learned from geometric constraints. The warped images are then combined by evidence fusion to recover a shape energy map, from which the final binary silhouette of the object is obtained. Results on various indoor and outdoor sequences demonstrate that the method generates the shape of the object as seen from the canonical view while resolving occlusions.
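
The warp-and-fuse step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the function names are hypothetical, each homography is supplied as a 3x3 nested list that maps canonical-view pixel coordinates back into the source view (the inverse of the view-to-canonical mapping, which is a common way to warp without leaving holes), the evidence fusion is a plain per-pixel average of the warped foreground masks, and the silhouette comes from a fixed threshold. The paper may use a different fusion rule and a different way of estimating the homographies.

```python
def apply_homography(H, x, y):
    """Map the point (x, y) through a 3x3 homography H (nested lists).

    Returns None when the point maps to infinity (w close to zero).
    """
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        return None
    return u / w, v / w


def warp_mask(mask, H_canon_to_view, out_h, out_w):
    """Inverse-warp a binary foreground mask into the canonical frame.

    For every canonical pixel, look up the nearest source pixel it maps to.
    Pixels that fall outside the source image stay at 0.
    """
    src_h, src_w = len(mask), len(mask[0])
    warped = [[0.0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            mapped = apply_homography(H_canon_to_view, c, r)
            if mapped is None:
                continue
            sc, sr = int(round(mapped[0])), int(round(mapped[1]))
            if 0 <= sr < src_h and 0 <= sc < src_w:
                warped[r][c] = float(mask[sr][sc])
    return warped


def fuse_views(masks, homographies, out_h, out_w, threshold=0.5):
    """Fuse warped masks into an energy map, then threshold it.

    Assumption: the energy is the fraction of views that mark a pixel
    as foreground; the threshold value is illustrative only.
    """
    warped = [warp_mask(m, H, out_h, out_w)
              for m, H in zip(masks, homographies)]
    n = len(warped)
    energy = [[sum(w[r][c] for w in warped) / n for c in range(out_w)]
              for r in range(out_h)]
    silhouette = [[1 if e > threshold else 0 for e in row] for row in energy]
    return energy, silhouette
```

With three views in which one view is fully occluded, a pixel supported by the other two views receives an energy of 2/3 and survives a threshold of 0.5, which is how fusion of this kind can tolerate occlusion in a single camera.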

Published in

VSSN '06: Proceedings of the 4th ACM international workshop on Video surveillance and sensor networks
October 2006
230 pages
ISBN: 1595934960
DOI: 10.1145/1178782

Copyright © 2006 ACM

Publisher: Association for Computing Machinery, New York, NY, United States
