Object Memorability Prediction using Deep Learning: Location and Size Bias

https://doi.org/10.1016/j.jvcir.2019.01.008

Abstract

Object memorability prediction is the task of estimating the probability that a human recognises the recurrence of an object after a single view. Initial research on object memorability showed that object memorability scores can be predicted from the intrinsic features of an object. Although existing works have proposed several features for the object memorability prediction task, the influence of the Spatial-location and Spatial-size of an object on its memorability has not yet been explored. In this work, the importance of these two characteristics for object memorability prediction is investigated and demonstrated by building a baseline model. Further, a deep learning model is devised for automatic feature learning on these two object characteristics. Experimental results show that the Spatial-location and Spatial-size of an object play a significant role in object memorability prediction and that the proposed models outperform the existing methods.

Introduction

Humans selectively process visual information to perform visual tasks such as object detection, object recognition, and scene analysis. For this purpose, the human visual system selects very few visual candidates. Since most computer vision algorithms are designed to support human visual tasks, it is essential for such algorithms to have information about the visual candidates, or objects, that are vital for human vision. Object memorability is one such promising source of information that may aid in carrying out an intended visual task. Object memorability information may also be used in applications including the creation of educational materials, logos, intelligent advertisements, book covers, and websites. Further, it can be utilised in computational photography, user interface design, video summarisation, thumbnail generation, and more [1].

Object memorability prediction is the task of predicting how well an object sticks in human memory after a single view [1]. It requires an understanding of the intrinsic object features that influence the memorability of an object. In recent years, a significant amount of research has been carried out to shed light on the intrinsic features that influence memorability prediction [2], [3], [4], [5], [6], [7], [8], [9], [10]. However, these studies are limited to the image level, where memorability is predicted for the entire image. The object memorability prediction task is similar to image memorability prediction, except that memorability is predicted for each object within an image [1]. For example, the memorability score for the entire image shown in Fig. 1(a) is 0.57, whereas the memorability scores for the objects within this image are different, as shown in Fig. 1(b). Further, although these objects belong to the same category (Birds), their memorability scores differ from one another.
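To make the notion of a per-object score concrete, the following minimal sketch treats the memorability score as a simple hit rate: the fraction of viewers who recognise an object when it is shown a second time. The function name, the response data, and the scoring rule itself are assumptions for illustration only; the protocol used to build the dataset in [1] is not described here and may differ.

```python
def memorability_score(responses):
    """Fraction of viewers who correctly flagged the repeated object.

    `responses` is a hypothetical list of booleans, one per participant who
    was shown the object a second time: True means the repeat was recognised.
    """
    if not responses:
        raise ValueError("at least one response is required")
    return sum(responses) / len(responses)


# Hypothetical responses for three objects of the same category in one image.
objects = {
    "bird_left": [True, True, False, True],
    "bird_centre": [True, True, True, True],
    "bird_right": [False, True, False, False],
}
scores = {name: memorability_score(r) for name, r in objects.items()}
```

Under this assumed rule, objects from the same category and the same image can still receive different scores, which is the situation illustrated by Fig. 1(b).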

The rest of the paper is organised as follows. The state-of-the-art work on object memorability and its limitations are described in Section 2. A brief discussion of the existing object memorability dataset is presented in Section 3. A detailed study of the influence of the proposed visual characteristics is given in Section 4. The proposed object memorability prediction models are described in Section 5. Performance evaluation of the proposed models is presented in Section 6. Finally, the paper is concluded in Section 7.


Related work and motivation

Recently, Khosla et al. [11] attempted to infer which objects within an image are memorable or forgettable. This is achieved from the image memorability scores alone, using object annotations and predictive models. However, proper evaluation and analysis of such models require ground truth object memorability scores. Dubey et al. [1] made the first attempt to understand object memorability by creating an object memorability dataset. This dataset is used to investigate the relationship

Object memorability dataset

To understand and analyse the memorability of objects in images, a standard dataset is required. Dubey et al. [1] created an object memorability dataset. In this paper, all analyses and experiments are conducted solely on this dataset. The object segments belonging to this dataset are referred to as ground truth object segments throughout this paper. The details of this dataset are given in this section.

Dubey et al. utilised the PASCAL-S dataset [15] to create the object memorability

Relative spatial characteristics

The Spatial-size and Spatial-location of an object in an image are two major visual factors that determine the object's importance [18]. Objects that are larger and closer to the centre are likely to be more important, as they have higher probabilities of being mentioned by annotators [18]. While the relationship between object memorability and its relative spatial characteristics (Spatial-size and Spatial-location) is unclear, image memorability and relative spatial characteristics of objects

Object memorability prediction

Based on the analysis carried out in Section 4, it is observed that Spatial-size and Spatial-location play a crucial role in determining object memorability. Therefore, these two visual factors are incorporated in the proposed models. The first part of this section describes the input data preprocessing for object memorability prediction. The later part describes the proposed deep learning based object memorability prediction models.
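As a rough illustration of the two visual factors, the sketch below computes a relative size and a normalised distance to the image centre from a binary object mask, and shows one plausible way a segment could be kept at its original position and scale instead of being cropped. The function names and both definitions are assumptions made for this example; they are not taken from the paper, whose actual preprocessing is described in its Section 5.1.

```python
import numpy as np


def relative_spatial_features(mask):
    """Return (relative_size, centre_distance) for a binary object mask.

    relative_size   : object pixels divided by image pixels, in [0, 1].
    centre_distance : distance from the object centroid to the image centre,
                      divided by the centre-to-corner distance (0 = centred).
    Both definitions are illustrative assumptions.
    """
    mask = np.asarray(mask, dtype=bool)
    if mask.ndim != 2 or not mask.any():
        raise ValueError("mask must be 2-D and contain at least one object pixel")
    height, width = mask.shape
    rows, cols = np.nonzero(mask)
    relative_size = rows.size / mask.size
    # In pixel-index coordinates the image centre is ((H - 1) / 2, (W - 1) / 2).
    centre = np.array([(height - 1) / 2.0, (width - 1) / 2.0])
    centroid = np.array([rows.mean(), cols.mean()])
    half_diagonal = np.linalg.norm(centre)  # distance from centre to corner (0, 0)
    if half_diagonal > 0:
        centre_distance = float(np.linalg.norm(centroid - centre) / half_diagonal)
    else:
        centre_distance = 0.0  # a 1x1 image has no off-centre positions
    return relative_size, centre_distance


def place_on_canvas(image, mask):
    """Zero out everything except the object, keeping its position and size."""
    image = np.asarray(image)
    mask = np.asarray(mask, dtype=bool)
    return image * mask[..., None] if image.ndim == 3 else image * mask
```

Keeping the segment on a full-size canvas, rather than cropping and rescaling it, is one simple way to let a convolutional model see where the object sits and how much of the frame it occupies.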

Experimental results

All the experiments are conducted on the object memorability dataset created by Dubey et al. [1]. The details of the dataset are explained in Section 3. The proposed models are trained on preprocessed ground truth object segments. The preprocessing ensures that the visual factors Spatial-size and Spatial-location are incorporated, as explained in Section 5.1. To train the proposed models, the dataset is divided into training and testing sets in six folds for cross-validation. Further, proposed models

Conclusion

In this paper, the influence of Spatial-location and Spatial-size of an object segment in predicting its memorability score is analysed. A baseline model is developed to demonstrate the improvement of object memorability prediction due to these two characteristics. Next, a deep CNN model is devised for automatic feature learning on these two object characteristics to predict the object memorability scores. The proposed models exploited the Spatial-location and Spatial-size of an object segment

References (20)

  • R. Dubey et al., What makes an object memorable?
  • W.A. Bainbridge et al., The intrinsic memorability of face photographs, J. Exp. Psychol. Gen. (2013)
  • Y. Baveye et al., Deep learning for image memorability prediction: the emotional bias
  • Z. Bylinskii et al., Intrinsic and extrinsic effects on image memorability, vol. 116 (2015)
  • M. Mancas et al., Memorability of natural scenes: the role of attention
  • P. Isola et al., What makes an image memorable?
  • A. Khosla et al., Image memorability and visual inception
  • A. Khosla, A.S. Raju, A. Torralba, A. Oliva, Understanding and predicting image memorability at a large scale, in:...
  • P. Isola, D. Parikh, A. Torralba, A. Oliva, Understanding the intrinsic memorability of images, in: Advances in NIPS,...
  • J. Kim et al., Relative spatial features for image memorability

There are more references available in the full text version of this article.

This paper has been recommended for acceptance by Zicheng Liu.
