The effect of low-level image features on pseudo relevance feedback
Introduction
Advances in computer and multimedia technologies have allowed the production of digital images and the creation of large repositories for image storage at little cost. This has led to a rapid increase in the size of image collections for many purposes, including digital libraries, medical imaging, art and museum collections, journalism, advertising, home photo archives, and so on. It is therefore now necessary to design automated image retrieval systems that can operate on a large scale.
The traditional image retrieval approach is based on manual image indexing with keywords assigned to images by human indexers during the database creation stage. Relevant images can be retrieved by using the indexed keywords as queries. However, there are some limitations to manual indexing. First, it is a very time-consuming and expensive process, especially when the image collection is very large, e.g., hundreds of thousands of images [24]. Second, different indexers may assign different keywords to the same images, and the same indexer may perform differently under different circumstances or at different times. Finally, during retrieval, users may not be aware of, or may not agree with, the indexed keywords or terms, which can lead to unsatisfactory retrieval results.
Content-Based Image Retrieval (CBIR) [30], which was proposed in the early 1990s, is a technique for automatically indexing images by extracting (low-level) visual features, such as color, texture, and shape. The retrieval of images is based solely upon the indexed image features. Therefore, it is hypothesized that relevant images can be retrieved by calculating the similarity between the low-level image contents through browsing, navigation, query-by-example, and so on. Typically, images are represented as points in a high-dimensional feature space. Then, a metric is used to measure the degree of similarity or dissimilarity between images in this space. Thus, images that lie close to the query in this space are classified as similar to the query and retrieved. Although CBIR introduced automated image feature extraction and indexing, it did not overcome the so-called semantic gap, which is described in greater detail below.
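The query-by-example idea described above can be sketched as follows. This is a minimal illustration, not the system used in this paper: the image names, the three-dimensional feature vectors, and the choice of Euclidean distance as the metric are all assumptions made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(query, database, metric=euclidean):
    """Return image ids ordered from most to least similar to the query."""
    return sorted(database, key=lambda image_id: metric(query, database[image_id]))


# Hypothetical 3-dimensional feature vectors; real descriptors are much larger.
database = {
    "sunset": [0.8, 0.1, 0.1],
    "forest": [0.1, 0.8, 0.1],
    "ocean": [0.2, 0.1, 0.7],
}
query = [0.7, 0.2, 0.1]

# Images closest to the query point come first in the ranking.
ranking = rank_by_similarity(query, database)
```

Any other metric can be passed through the `metric` argument; the ranking logic stays the same.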
The semantic gap is the gap between the computer-extracted and indexed low-level features and the high-level concepts (or semantics) of a user’s queries. In other words, automated CBIR systems do not readily match users’ requests. The notion of similarity in the user’s mind is typically based on high-level abstractions, such as activities, entities/objects, events, or evoked emotions, among others. In this situation, retrieval by similarity using low-level features like color or shape will not be very effective. That is, human similarity judgments do not obey the requirements of the similarity metric used in CBIR systems. In addition, general users usually find it difficult to search or query images by using color, texture, and/or shape features only. They tend to prefer textual or keyword-based queries since these are easier to use and allow their information needs to be represented more intuitively [30].
One method applied to solve the semantic gap problem that affects the retrieval effectiveness of CBIR systems is relevance feedback (RF) [1], [23]. RF is a technique used in traditional text-based information retrieval systems, namely a (supervised active learning) process intended to improve the system performance by refining the results of the original queries. The main idea, which is the so-called user-in-the-loop version, is to ask users to provide positive (relevant) and negative (irrelevant) examples as feedback on the initially retrieved document sets. RF can also be applied in image retrieval so that some retrieved images can be selected to receive positive or negative feedback for query refinement. In principle, RF is based on learning a set of ‘optimal’ feature weights for a query, or moving the query point toward the relevant images [40].
However, the duration of the iterative process required to meet the users’ needs varies depending on the relevance feedback algorithms used, the image database, and the users. In addition, it can be a tedious task to provide relevant and/or irrelevant images to the system. Pseudo relevance feedback (PRF) can be utilized to alleviate these problems. PRF automates the manual part of RF, so that users get improved retrieval performance without an extended interaction. Specifically, it is assumed that a fraction of the top-ranked images in the initial search results are pseudo-positive [37], [38]. One closely related concept is the recursive matching scheme used in facial image retrieval [15].
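The way PRF replaces user judgments can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the image ids, and the cut-off sizes are hypothetical, and treating the bottom of the ranking as pseudo-negative is one possible choice rather than a rule stated in the text.

```python
def pseudo_feedback_sets(ranked_ids, num_positive, num_negative=0):
    """Split an initial ranking into pseudo-positive and pseudo-negative ids.

    No user is consulted: the top of the ranking is assumed to be relevant.
    Taking the bottom of the ranking as pseudo-negative is an assumption of
    this sketch.
    """
    if num_positive + num_negative > len(ranked_ids):
        raise ValueError("feedback sets would overlap")
    positives = ranked_ids[:num_positive]
    negatives = ranked_ids[len(ranked_ids) - num_negative:] if num_negative else []
    return positives, negatives


# Hypothetical initial search result, best match first.
initial_ranking = ["img07", "img02", "img11", "img05", "img09", "img01"]
positives, negatives = pseudo_feedback_sets(initial_ranking, num_positive=2, num_negative=2)
```

The resulting sets play the role that user-selected relevant and irrelevant images play in ordinary RF.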
Although some of the top-retrieved documents may contain noise, PRF has been shown to outperform several other RF algorithms and it has been recognized as the baseline for RF methods [4], [18], [8].
The Rocchio algorithm is the classic algorithm used for the implementation of RF/PRF, modeling a way of incorporating RF information into the vector space model [25]. In particular, it focuses on query vector modification where a new query vector is produced by taking the weighted sum of the original query and the mean vectors of the relevant and irrelevant sets (cf. Section 2.2).
Clearly, RF/PRF performance is heavily dependent on image feature representation. Past studies in the image retrieval literature have shown that various visual features can be extracted for image content analysis and indexing [30], such as combined color and texture features [9] or the bag-of-words method [14]. Although some comparative studies have analyzed the correlation of various low-level features and their application in image retrieval [7], [22], the focus has not been on the relevance feedback problem. In other words, very few studies have compared feature representations in terms of PRF. Therefore, the aim of this paper is to answer the question of which type of image feature representation performs best for PRF, as determined by the Rocchio algorithm. Moreover, the query response (or execution) time for modifying the query vector is critical for interactive image retrieval applications, such as web searching. The computational cost of using feature representations with different dimensionalities is likely to differ. This has also motivated us to further examine the retrieval efficiency of PRF using different image feature representations.
To sum up, the contributions of this paper are two-fold. First, the findings of this study allow us to identify the optimal feature representation(s) that can provide better retrieval effectiveness and efficiency, which has not been done before in PRF. Second, the optimal feature representation(s) can be used as the baseline for future related work using PRF.
The rest of this paper is organized as follows. Section 2 gives an overview of several well-known image feature representation methods, and Section 3 briefly describes the Rocchio algorithm. Section 4 presents the experimental setup and results. Finally, some conclusions are given in Section 5.
Section snippets
Color
Color is the most commonly used visual feature for image retrieval due to the computational efficiency of its extraction. All colors can be represented by variable combinations of the three so-called additive primary colors: red (R), green (G), and blue (B).
A color histogram [33] is one common method used to represent color contents for indexing and retrieval. It shows the proportion of pixels of each color within the image, which is represented by the distribution of the number of pixels for
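As a rough illustration of the color histogram idea, the sketch below counts the proportion of pixels that fall into each quantized RGB bin. The number of bins per channel and the input format (a list of `(r, g, b)` tuples with 8-bit values) are assumptions for this example, not the configuration used in the paper.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Normalized RGB histogram: proportion of pixels in each color bin.

    `pixels` is a list of (r, g, b) tuples with values in 0..255.
    """
    histogram = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Quantize each 8-bit channel value into one of `bins_per_channel` bins.
        r_bin = r * bins_per_channel // 256
        g_bin = g * bins_per_channel // 256
        b_bin = b * bins_per_channel // 256
        index = (r_bin * bins_per_channel + g_bin) * bins_per_channel + b_bin
        histogram[index] += 1
    total = len(pixels)
    if total == 0:
        return histogram
    return [count / total for count in histogram]


# Hypothetical 4-pixel image: two reddish pixels, one blue, one green.
pixels = [(255, 0, 0), (250, 10, 5), (0, 0, 255), (0, 255, 0)]
hist = color_histogram(pixels)
```

With two bins per channel the descriptor has 8 dimensions; finer quantization gives a longer vector and, in turn, a higher cost when comparing images.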
The Rocchio algorithm
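The Rocchio algorithm can be regarded as a query vector modification approach that iteratively reformulates the query vector based on the feedback set in order to move the query toward a topological region of more relevant images and away from irrelevant ones [25]. It reformulates the query $\vec{q}_0$ as a modified query $\vec{q}_m$ by
$$\vec{q}_m = \alpha \vec{q}_0 + \beta \frac{1}{|D_r|} \sum_{\vec{d}_j \in D_r} \vec{d}_j - \gamma \frac{1}{|D_{nr}|} \sum_{\vec{d}_j \in D_{nr}} \vec{d}_j$$
where $\vec{q}_0$ is the original query vector, $D_r$ and $D_{nr}$ are the sets of known relevant and non-relevant images, respectively, and $\alpha$, $\beta$, and $\gamma$ are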
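A minimal sketch of the query modification step, assuming the commonly cited Rocchio form: the weighted original query, plus the weighted mean of the relevant vectors, minus the weighted mean of the non-relevant vectors. The function names and the default weight values are illustrative assumptions, not the parameters used in this paper.

```python
def mean_vector(vectors, dim):
    """Component-wise mean of a list of vectors; a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return the modified query vector.

    The default weights are common textbook values, used here only as an
    assumption for illustration.
    """
    dim = len(query)
    relevant_mean = mean_vector(relevant, dim)
    non_relevant_mean = mean_vector(non_relevant, dim)
    return [
        alpha * q + beta * r - gamma * n
        for q, r, n in zip(query, relevant_mean, non_relevant_mean)
    ]


# Hypothetical 2-dimensional example.
new_query = rocchio(
    query=[1.0, 0.0],
    relevant=[[0.0, 1.0], [0.0, 3.0]],
    non_relevant=[[2.0, 0.0]],
    alpha=1.0,
    beta=0.5,
    gamma=0.25,
)
```

In a PRF setting, `relevant` would hold the feature vectors of the top-ranked images from the initial search, and the modified query would be used to rank the collection again.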
Experiments
This section is made up of two parts. First, we introduce the experimental setup, including the dataset used, the low-level features extracted, the Rocchio parameters employed, and the evaluation metric considered. Second, the experimental results are presented, namely retrieval precision and query response time. In addition, the identified top feature representations are further compared using different Rocchio parameters, and their retrieval performance over the object and scene classes
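The two quantities reported in the experiments, retrieval precision and query response time, could be measured along the following lines. This is a sketch under assumptions: precision at a cut-off k is one common way to compute retrieval precision and may differ from the exact metric used in the paper, and the helper names and sample data are hypothetical.

```python
import time


def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k retrieved images that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for image_id in ranked_ids[:k] if image_id in relevant_ids)
    return hits / k


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Hypothetical ranking and ground truth.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
precision, seconds = timed(precision_at_k, ranked, relevant, 4)
```

Wrapping the query modification step in `timed` in the same way gives a per-query response time that can be compared across feature representations.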
Conclusion
Pseudo relevance feedback (PRF) is an approach for automating the manual component of the traditional relevance feedback process. The top-ranked images from the initial retrieval are used as pseudo positive/negative feedback, even though some of them may be noisy. CBIR systems using PRF can usually provide better retrieval performance than those without PRF.
The Rocchio algorithm is a classic algorithm used in the implementation of PRF, which is based on the query vector modification
Dr. Wei-Chao Lin is an assistant professor at the Department of Computer Science and Information Engineering, Hwa Hsia University of Technology, Taiwan. His research interests are machine learning and artificial intelligence applications.
References (40)
- et al. An improved distance-based relevance feedback strategy for image retrieval. Image Vis. Comput. (2013)
- et al. Bayesian relevance feedback for content-based image retrieval. Pattern Recogn. (2004)
- et al. Image retrieval using color and shape. Pattern Recogn. (1996)
- et al. Component-based LDA face description for image retrieval and MPEG-7 standardisation. Image Vis. Comput. (2005)
- et al. A graph-based relevance feedback mechanism in content-based image retrieval. Knowl. Based Syst. (2015)
- et al. Comparative study of global color and texture descriptors for web image retrieval. J. Visual Commun. Image Represent. (2012)
- et al. Image retrieval: current techniques, promising directions and open issues. J. Visual Commun. Image Represent. (1999)
- et al. An image retrieval scheme with relevance feedback using feature reconstruction and SVM reclassification. Neurocomputing (2014)
- A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. (1986)
- T.-S. Chua, J. Tang, R. Hong, H. Li, Z. Luo, Y. Zheng (2009) NUS-WIDE: a real-world web image database from National...
- Estimation and use of uncertainty in pseudo-relevance feedback. in: ACM SIGIR Conference on Research and Development in Information Retrieval
- What does classifying more than 10,000 image categories tell us? Eur. Conf. Comput. Vis.
- Features for image retrieval: an experimental comparison. Inf. Retrieval
- Improving web image search by bag-based reranking. IEEE Trans. Image Process.
- Visual-textual joint relevance learning for tag-based social image search. IEEE Trans. Image Process.
- Image indexing using color correlograms. IEEE Int. Conf. Comput. Vis. Pattern Recogn.
- Representations of keypoint-based semantic concept detection: a comprehensive study. IEEE Trans. Multimedia
- On the importance of parameter tuning in text categorization
Mr. Zong-Yao Chen is currently a Ph.D. student at the Department of Information Management, National Central University, Taiwan. His research interests cover image processing, data mining and its applications.
Dr. Shih-Wen Ke is an assistant professor at the Department of Information and Computer Engineering, Chung Yuan Christian University, Taiwan. His research covers information retrieval, machine learning, and data mining.
Dr. Chih-Fong Tsai received a Ph.D. from the School of Computing and Technology, University of Sunderland, UK, in 2005. He is now a professor at the Department of Information Management, National Central University, Taiwan. His current research focuses on multimedia information retrieval and data mining.
Dr. Wei-Yang Lin is an associate professor at the Department of Computer Science and Information Engineering, National Chung Cheng University, Taiwan. His research covers computer vision and image processing.