Skip to main content
Log in

Space–time image layout

  • Original Article
  • Published:
The Visual Computer Aims and scope Submit manuscript

Abstract

Cameras are now ubiquitous in our lives. A given activity is often captured by multiple people from different viewpoints resulting in a sizable collection of photograph footage. We present a method that effectively organizes this spatiotemporal content. Given an unorganized collection of photographs taken by a number of photographers, capturing some dynamic event at a number of time steps, we would like to organize the collection into a space–time table. The organization is an embedding of the photographs into clusters that preserve the viewpoint and time order. Our method relies on a self-organizing map (SOM), which is a neural network that embeds the training data (the set of images) into a discrete domain. We introduce BiSOM, which is a variation of SOM that considers two features (space and time) rather than a single one, to layout the given photograph collection into a table. We demonstrate our method on several challenging datasets, using different space and time descriptors.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18

Similar content being viewed by others

Notes

  1. The silhouette index \(-1 \le {s_i}\left( j \right) \le 1\) provides an indication for how well the element i lies within the cluster j. A value of \({s_i}\left( j \right) \) close to positive one means that the datum i is appropriately clustered in the cluster j, conversely, a value close to negative one means that the datum i is unlikely to belong to cluster j.

  2. The Rand index \(0\le R_I \le 1\) is a measure of the similarity between two clustering results, with 0 indicating that the two data clusters do not agree on any pair of points and 1 indicating that the data clusters are exactly the same.

  3. Datasets boats and slides in the courtesy of Dekel et al. [9]

  4. The swapping distance is defined to be the minimum number of swaps, or transpositions, of two adjacent clusters that transforms one permutation into another.

References

  1. Aoki, T., Aoyagi, T.: Self-organizing maps with asymmetric neighborhood function. Neural Comput. 19(9), 2515–2535 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  2. Averbuch-Elor, H., Cohen-Or, D.: Ringit: Ring-ordering casual photos of a temporal event. ACM Trans. Graph. 35(1), 33 (2015)

    MATH  Google Scholar 

  3. Bashyal, S., Venayagamoorthy, G.K.: Recognition of facial expressions using gabor wavelets and learning vector quantization. Eng. Appl. Artif. Intell. 21(7), 1056–1064 (2008)

    Article  Google Scholar 

  4. Brahmachari, A.S., Sarkar, S.: View clustering of wide-baseline n-views for photo tourism. In: Graphics, Patterns and Images (Sibgrapi), 2011 24th SIBGRAPI Conference on, pp. 157–164. IEEE (2011)

  5. Caspi, Y., Irani, M.: Spatio-temporal alignment of sequences. IEEE Trans. Pattern Anal. Mach. Intell. 24(11), 1409–1424 (2002)

    Article  Google Scholar 

  6. Chen, L.P., Liu, Y.G., Huang, Z.X., Shi, Y.T.: An improved som algorithm and its application to color feature extraction. Neural Comput. Appl. 24(7–8), 1759–1770 (2014)

    Article  Google Scholar 

  7. Cormode, G., Muthukrishnan, S.: The string edit distance matching problem with moves. ACM Trans. Algorithms. (TALG) 3(1), 2 (2007)

  8. Moses, Y., Avidan, S., et al.: Space-time tradeoffs in photo sequencing. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 977–984 (2013)

  9. Dekel, T., Moses, Y., Avidan, S.: Photo sequencing. Int. J. Comput. Vis. 110(3), 275–289 (2014)

    Article  MathSciNet  MATH  Google Scholar 

  10. Dexter, E., Pérez, P., Laptev, I.: Multi-view synchronization of human actions and dynamic scenes. In: BMVC, pp. 1–11. Citeseer (2009)

  11. Endo, M., Ueno, M., Tanabe, T.: A clustering method using hierarchical self-organizing maps. J. VLSI Signal Process. Syst. Signal Image Video Technol. 32(1–2), 105–118 (2002)

    Article  MATH  Google Scholar 

  12. Fried, O., DiVerdi, S., Halber, M., Sizikova, E., Finkelstein, A.: IsoMatch: Creating informative grid layouts. In: Computer Graphics Forum, vol. 34, no 2, pp. 155–166. Wiley (2015)

  13. Furukawa, Y., Curless, B., Seitz, S.M., Szeliski, R.: Towards internet-scale multi-view stereo. In: Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pp. 1434–1441. IEEE (2010)

  14. Hubert, L., Arabie, P.: Comparing partitions. J. Classif. 2(1), 193–218 (1985)

    Article  MATH  Google Scholar 

  15. Kiang, M.Y.: Extending the kohonen self-organizing map networks for clustering analysis. Comput. Stat. Data Anal. 38(2), 161–180 (2001)

    Article  MathSciNet  MATH  Google Scholar 

  16. Kohonen, T.: The self-organizing map. Proc. IEEE 78(9), 1464–1480 (1990)

    Article  Google Scholar 

  17. Lee, J.A., Verleysen, M.: Self-organizing maps with recursive neighborhood adaptation. Neural Netw. 15(8), 993–1003 (2002)

    Article  Google Scholar 

  18. Lefebvre, G., Laurent, C., Ros, J., Garcia, C.: Supervised image classification by som activity map comparison. In: Pattern Recognition, 2006. ICPR 2006. 18th International Conference on, vol. 2, pp. 728–731. IEEE (2006)

  19. Ling, H., Jacobs, D.W.: Shape classification using the inner-distance. IEEE Trans. Pattern Anal. Mach. Intell. 29(2), 286–299 (2007)

    Article  Google Scholar 

  20. Mauro, M., Riemenschneider, H., Van Gool, L., Leonardi, R., Brescia, I.: Overlapping camera clustering through dominant sets for scalable 3D reconstruction. In: BMVC, vol. 1, no 2, p. 3 (2013)

  21. Moehrmann, J., Bernstein, S., Schlegel, T., Werner, G., Heidemann, G.: Improving the usability of hierarchical representations for interactively labeling large image data sets. In: Human-Computer Interaction. Design and Development Approaches, pp. 618–627. Springer, Berlin (2011)

  22. Oliva, A., Torralba, A.: Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42(3), 145–175 (2001)

    Article  MATH  Google Scholar 

  23. Ong, S.H., Yeo, N., Lee, K., Venkatesh, Y., Cao, D.: Segmentation of color images using a two-stage self-organizing network. Image Vis. Comput. 20(4), 279–289 (2002)

    Article  Google Scholar 

  24. Quadrianto, N., Song, L., Smola, A.J.: Kernelized sorting. In: Advances in neural information processing systems, pp. 1289–1296 (2009)

  25. Reinert, B., Ritschel, T., Seidel, H.P.: Interactive by-example design of artistic packing layouts. ACM Trans. Graph. (TOG) 32(6), 218 (2013)

    Article  Google Scholar 

  26. Rother, C., Kolmogorov, V., Blake, A.: Grabcut: interactive foreground extraction using iterated graph cuts. ACM Trans. Graph. (TOG) 23(3), 309–314 (2004)

    Article  Google Scholar 

  27. Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987)

    Article  MATH  Google Scholar 

  28. Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 22(8), 888–905 (2000)

    Article  Google Scholar 

  29. Snavely, N., Seitz, S.M., Szeliski, R.: Photo tourism: exploring photo collections in 3D. In: ACM transactions on graphics (TOG), vol. 25, no 3, pp. 835–846. ACM (2006)

  30. Strong, G., Gong, M.: Self-sorting map: An efficient algorithm for presenting multimedia data in structured layouts. IEEE Trans. Multimedia. 16(4), 1045–1058 (2014)

  31. Tomasi, C., Kanade, T.: Shape and motion from image streams under orthography: a factorization method. Int. J. Comput. Vis. 9(2), 137–154 (1992)

    Article  Google Scholar 

  32. Zhou, H., Yuan, Y., Shi, C.: Object tracking using sift features and mean shift. Comput. Vis. Image Underst. 113(3), 345–352 (2009)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Shahar Ben-Ezra.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (mp4 220892 KB)

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Ben-Ezra, S., Cohen-Or, D. Space–time image layout. Vis Comput 34, 417–430 (2018). https://doi.org/10.1007/s00371-016-1347-4

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00371-016-1347-4

Keywords

Navigation