DOI: 10.1145/1056808.1057045
Article

Deciphering visual gist and its implications for video retrieval and interface design

Published: 02 April 2005

ABSTRACT

How do people make sense of a video after viewing only a few of its frames? What elements constitute the "visual gist" in their minds? Answers to these questions have implications for both content-based video retrieval and the interface design of digital video libraries (e.g., key-frame selection). A preliminary study with 45 subjects was conducted to explore these issues. After viewing a fast-forward surrogate, the subjects were asked to choose pictures that they thought would "belong to" the video and to think aloud while making their selections. Nine visual gist attributes (e.g., people, objects, and actions) were derived using the grounded theory method, and their frequencies were compared and analyzed.

    • Published in

      CHI EA '05: CHI '05 Extended Abstracts on Human Factors in Computing Systems
      April 2005
      1358 pages
      ISBN: 1595930027
      DOI: 10.1145/1056808

      Copyright © 2005 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 6,164 of 23,696 submissions, 26%
