Abstract
We present a new method for taking an urban scene reconstructed from a large Internet photo collection and reasoning about how its appearance changes over time. Our method estimates when individual 3D points in the scene existed, then uses spatial and temporal affinity between points to segment the scene into spatio-temporally consistent clusters. The result of this segmentation is a set of spatio-temporal objects that often correspond to meaningful units, such as billboards, signs, street art, and other dynamic scene elements, along with estimates of when each existed. Our method is robust and scales to scenes with hundreds of thousands of images and billions of noisy, individual point observations. We demonstrate our system on several large-scale scenes and show an application to time-stamping photos. Our work can serve to chronicle a scene over time, documenting its history and discovering dynamic elements in a way that can be easily explored and visualized.
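The two steps named in the abstract (estimating when each 3D point existed, then grouping points by spatial and temporal affinity) can be illustrated with a toy sketch. This is not the authors' algorithm: the `Point` type, the trimmed-range interval estimate, the interval-overlap affinity, and the thresholds `max_dist` and `min_iou` are all assumptions made for illustration, and the quadratic pairwise loop would not scale to the scene sizes the abstract mentions.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Point:
    """Hypothetical 3D point with the timestamps of photos that observed it."""
    xyz: Tuple[float, float, float]
    seen_at: List[float]  # e.g. fractional years; assumed non-empty


def estimate_interval(seen_at: List[float], trim: float = 0.1) -> Tuple[float, float]:
    """Trimmed [start, end] of the observation times.

    Dropping a fraction `trim` of samples from each end is an assumed,
    simple way to tolerate a few wrong timestamps.
    """
    ts = sorted(seen_at)
    k = int(len(ts) * trim)
    return ts[k], ts[len(ts) - 1 - k]


def temporal_iou(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Overlap of two time intervals divided by the length of their span."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = max(a[1], b[1]) - min(a[0], b[0])
    return inter / union if union > 0 else 1.0


def segment(points: List[Point], max_dist: float = 1.0, min_iou: float = 0.5):
    """Group points that are both close in space and overlapping in time.

    Uses union-find over all pairs; returns (clusters of point indices,
    per-point intervals).
    """
    intervals = [estimate_interval(p.seen_at) for p in points]
    parent = list(range(len(points)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            close = math.dist(points[i].xyz, points[j].xyz) <= max_dist
            if close and temporal_iou(intervals[i], intervals[j]) >= min_iou:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values()), intervals


if __name__ == "__main__":
    pts = [
        Point((0.0, 0.0, 0.0), [2005, 2006, 2007, 2008]),   # old billboard
        Point((0.5, 0.0, 0.0), [2005, 2006, 2007, 2008]),   # same billboard
        Point((0.2, 0.0, 0.0), [2010, 2011, 2012]),         # its replacement
        Point((10.0, 0.0, 0.0), [2005, 2006, 2007, 2008]),  # distant sign
    ]
    clusters, intervals = segment(pts)
    print(clusters)
    print(intervals)
```

In this toy scene the third point sits between the first two in space but is kept separate because its time interval does not overlap theirs, which is the kind of distinction a purely spatial clustering would miss.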
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Matzen, K., Snavely, N. (2014). Scene Chronology. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8695. Springer, Cham. https://doi.org/10.1007/978-3-319-10584-0_40
DOI: https://doi.org/10.1007/978-3-319-10584-0_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10583-3
Online ISBN: 978-3-319-10584-0
eBook Packages: Computer Science, Computer Science (R0)