DOI: 10.1145/2517351.2517356

FOCUS: clustering crowdsourced videos by line-of-sight

Published: 11 November 2013

Abstract

Crowdsourced video often provides engaging and diverse perspectives not captured by professional videographers. The broad appeal of user-uploaded video has been widely confirmed: it is freely distributed on YouTube, by subscription on Vimeo, and to peers on Facebook/Google+. Unfortunately, user-generated multimedia can be difficult to organize; these services depend on manual "tagging" or machine-mineable viewer comments. While manual indexing can be effective for popular, well-established videos, newer content may be poorly searchable; live video need not apply. We envisage video-sharing services for live user video streams, indexed automatically and in real time, especially by shared content. We propose FOCUS, a system for Hadoop-on-cloud video analytics. FOCUS uniquely leverages visual, 3D model reconstruction and multimodal sensing to decipher and continuously track a video's line-of-sight. Through spatial reasoning on the relative geometry of multiple video streams, FOCUS recognizes shared content even when viewed from diverse angles and distances. In a 70-volunteer user study, FOCUS' clustering correctness is roughly comparable to that of humans.
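
The abstract does not spell out how the spatial reasoning is carried out, so the following is only an illustrative sketch of the general idea, not the FOCUS implementation. It assumes each video stream has already been reduced to a camera position and a line-of-sight direction in a shared coordinate frame, links two streams when their sight lines nearly meet in front of both cameras, and groups linked streams with a simple union-find. The function names, the `max_gap` and `max_range` thresholds, and the sample camera layout are all hypothetical.

```python
import math
from itertools import combinations


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _unit(v):
    n = math.sqrt(_dot(v, v))
    return tuple(x / n for x in v)


def sight_gap(p1, d1, p2, d2, max_range):
    """Gap between two lines of sight at their closest approach.

    Returns math.inf when the lines are parallel, or when the closest
    approach lies behind a camera or farther away than max_range.
    """
    u, v = _unit(d1), _unit(d2)
    w0 = _sub(p1, p2)
    b, d, e = _dot(u, v), _dot(u, w0), _dot(v, w0)
    denom = 1.0 - b * b              # zero when the directions are parallel
    if denom < 1e-9:
        return math.inf
    t = (b * e - d) / denom          # distance along sight line 1
    s = (e - b * d) / denom          # distance along sight line 2
    if not (0 < t <= max_range and 0 < s <= max_range):
        return math.inf
    q1 = tuple(p + t * x for p, x in zip(p1, u))
    q2 = tuple(p + s * x for p, x in zip(p2, v))
    return math.dist(q1, q2)


def cluster_by_line_of_sight(cameras, max_gap=2.0, max_range=30.0):
    """Group camera indices whose sight lines converge (hypothetical thresholds)."""
    parent = list(range(len(cameras)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(cameras)), 2):
        (p1, d1), (p2, d2) = cameras[i], cameras[j]
        if sight_gap(p1, d1, p2, d2, max_range) <= max_gap:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(cameras)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical layout: (position, line-of-sight direction) per stream.
cameras = [
    ((0, 0, 0), (1, 1, 0)),      # 0: looks toward (10, 10, 0)
    ((20, 0, 0), (-1, 1, 0)),    # 1: looks toward (10, 10, 0)
    ((10, -5, 0), (0, 1, 0)),    # 2: looks toward (10, 10, 0)
    ((-50, 0, 0), (0, 1, 0)),    # 3: looks toward (-50, 20, 0)
    ((-40, 10, 0), (-1, 1, 0)),  # 4: looks toward (-50, 20, 0)
]
print(cluster_by_line_of_sight(cameras))
```

The range cap matters in this sketch: without it, any two non-parallel sight lines in the same plane would eventually cross and be treated as sharing a subject, even when the crossing point is far beyond what either camera could plausibly be filming.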

      Published In

      SenSys '13: Proceedings of the 11th ACM Conference on Embedded Networked Sensor Systems
      November 2013
      443 pages
      ISBN:9781450320276
      DOI:10.1145/2517351
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. crowdsourcing
      2. line-of-sight
      3. live video
      4. multi-view stereo

      Qualifiers

      • Research-article

      Acceptance Rates

      SenSys '13 Paper Acceptance Rate 21 of 123 submissions, 17%;
      Overall Acceptance Rate 198 of 990 submissions, 20%

      Cited By

      • (2018) The Minimum Selection of Crowdsourcing Images under the Resource Budget. Symmetry 10:7 (256). DOI: 10.3390/sym10070256. Online publication date: 2-Jul-2018
      • (2018) The crowd as a cameraman. Multimedia Tools and Applications 77:1 (597-629). DOI: 10.1007/s11042-016-4257-6. Online publication date: 1-Jan-2018
      • (2017) VideoMec. Proceedings of the 16th ACM/IEEE International Conference on Information Processing in Sensor Networks (143-154). DOI: 10.1145/3055031.3055089. Online publication date: 18-Apr-2017
      • (2017) Density-aware compressive crowdsensing. Proceedings of the 16th ACM/IEEE International Conference on Information Processing in Sensor Networks (29-39). DOI: 10.1145/3055031.3055081. Online publication date: 18-Apr-2017
      • (2016) Low Bandwidth Offload for Mobile AR. Proceedings of the 12th International on Conference on emerging Networking EXperiments and Technologies (237-251). DOI: 10.1145/2999572.2999587. Online publication date: 6-Dec-2016
      • (2016) Energy-Efficient Aquatic Environment Monitoring Using Smartphone-Based Robots. ACM Transactions on Sensor Networks 12:3 (1-28). DOI: 10.1145/2932190. Online publication date: 26-Jul-2016
      • (2015) Poster. Proceedings of the 21st Annual International Conference on Mobile Computing and Networking (260-262). DOI: 10.1145/2789168.2795175. Online publication date: 7-Sep-2015
      • (2015) Multiway K-Clustered Tensor Approximation. ACM Transactions on Graphics 34:5 (1-15). DOI: 10.1145/2753756. Online publication date: 3-Nov-2015
      • (2015) OverLay. Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services (331-344). DOI: 10.1145/2742647.2742666. Online publication date: 18-May-2015
      • (2015) Samba. Proceedings of the 14th International Conference on Information Processing in Sensor Networks (262-273). DOI: 10.1145/2737095.2737100. Online publication date: 13-Apr-2015