Mental Visual Browsing

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9517)

Abstract

We present a surprisingly easy-to-use video browser that helps users pinpoint a specific video shot they have in mind within a long video. At each interactive iteration, the only user effort required is to click one shot, the one most visually related to the user’s mental target, out of 8 displayed shots. The system then updates the browsing model and displays another 8 shots for the next iteration. The proposed system is underpinned by a theoretically sound Bayesian framework that maintains the probabilities of all the video shots segmented from the long video. This framework guarantees that we can find the target shot in a video of around 1 hour within 3–5 iterations. We believe that our system will perform well in the Video Browser Showdown game of MMM 2016.
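
The click-and-update loop described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the likelihood model or the rule for choosing the next 8 shots, so everything below is an assumption made for illustration: the user's click is modelled as a softmax over negative feature distances to the mental target, the next display is simply the 8 most probable shots, and the names `click_likelihood`, `update_posterior`, `select_display`, and `temperature` are hypothetical rather than taken from the paper.

```python
import math


def click_likelihood(target, displayed, clicked, features, temperature=1.0):
    """P(user clicks `clicked` | mental target is `target`).

    Assumed user model (not from the paper): the user tends to click the
    displayed shot whose feature vector is closest to the target's, with a
    softmax over negative Euclidean distances.
    """
    scores = [-math.dist(features[target], features[s]) / temperature
              for s in displayed]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    return weights[displayed.index(clicked)] / sum(weights)


def update_posterior(prior, displayed, clicked, features, temperature=1.0):
    """Bayes' rule: posterior(t) is proportional to prior(t) * P(click | t)."""
    unnormalized = [
        p * click_likelihood(t, displayed, clicked, features, temperature)
        for t, p in enumerate(prior)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def select_display(posterior, k=8):
    """Assumed display rule: show the k currently most probable shots."""
    ranked = sorted(range(len(posterior)),
                    key=lambda i: posterior[i], reverse=True)
    return ranked[:k]
```

Starting from a uniform prior over all shots, each iteration would call `update_posterior` with the user's click and then `select_display` to pick the next 8 shots. Showing the top-ranked shots is the simplest possible choice here; a real system might instead choose displays that are expected to be more informative.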

Author information

Corresponding author

Correspondence to Xindi Shang.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

He, J., Shang, X., Zhang, H., Chua, TS. (2016). Mental Visual Browsing. In: Tian, Q., Sebe, N., Qi, GJ., Huet, B., Hong, R., Liu, X. (eds) MultiMedia Modeling. MMM 2016. Lecture Notes in Computer Science, vol 9517. Springer, Cham. https://doi.org/10.1007/978-3-319-27674-8_44

  • DOI: https://doi.org/10.1007/978-3-319-27674-8_44

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27673-1

  • Online ISBN: 978-3-319-27674-8

  • eBook Packages: Computer Science, Computer Science (R0)
