Unsupervised manifold learning using Reciprocal kNN Graphs in image re-ranking and rank aggregation tasks

doi:10.1016/j.imavis.2013.12.009

Image and Vision Computing

Volume 32, Issue 2, February 2014, Pages 120-130

https://doi.org/10.1016/j.imavis.2013.12.009

Highlights

• Presentation of an unsupervised manifold learning algorithm using Reciprocal kNN Graphs
• Presentation of the Reciprocal kNN Graph ReRanking for improving the effectiveness of CBIR systems
• Description of how the Reciprocal kNN Graph algorithm can be used for rank aggregation tasks
• Discussion about the computational complexity and the convergence of the proposed algorithm
• Experimental evaluation considering different datasets, descriptors, and baselines

Abstract

In this paper, we present an unsupervised distance learning approach for improving the effectiveness of image retrieval tasks. We propose a Reciprocal kNN Graph algorithm that considers the relationships among ranked lists in the context of a k-reciprocal neighborhood. The similarity is propagated among neighbors considering the geometry of the dataset manifold. The proposed method can be used both for re-ranking and rank aggregation tasks. Unlike traditional diffusion process methods, which require matrix multiplication operations, our algorithm takes only a subset of ranked lists as input, presenting linear complexity in terms of computational and storage requirements. We conducted an extensive experimental evaluation involving shape, color, and texture descriptors, various datasets, and comparisons with other post-processing approaches. The re-ranking and rank aggregation algorithms yield better effectiveness results than various state-of-the-art algorithms recently proposed in the literature, achieving bull's eye and MAP scores of 100% on the well-known MPEG-7 shape dataset.

Introduction

The development of multimedia technologies for creating and sharing digital contents has triggered an exponential increase of image collections. Traditional search approaches based on image metadata can be unfeasible for large collections, since much human intervention is required for image annotation. Content-Based Image Retrieval (CBIR) systems have emerged as a promising alternative, aiming at retrieving the images that are the most similar to a given query.

The effectiveness of CBIR systems depends strongly on the distance measure adopted. Images are often modelled as high-dimensional points in a Euclidean space, and the distances among them are usually measured by Euclidean distances. In this scenario, CBIR systems often consider only pairwise image analysis, that is, they compute similarity measures considering only pairs of images, ignoring the information encoded in the relations among several images. In contrast, user perception considers the query specification and responses in a given context. In view of that, there has been significant research [44], [43], [13], [26], [14] on improving the distance measures in CBIR systems, replacing pairwise similarities by more global affinity measures that consider the relationships among images. The overall goal of these methods is to mimic human behavior in judging the similarity among objects by taking into account the context of the search process. As previously observed [42], [40], an effective distance measure should describe the relationship between the query and retrieved objects in the context of the whole collection.

Therefore, how to capture and utilize the intrinsic manifold structure of a collection becomes a central problem in the vision and learning community [14]. A common recent approach is manifold learning, mainly based on non-linear dimensionality reduction techniques. The idea is to explicitly construct a new embedding space with a corresponding metric which is more faithful to the manifold structure and hence induces a better distance/similarity measure. The manifold learning algorithms are able to learn distances between data points that correspond to geodesic distances on the data manifold [44]. In other words, the new distances are estimated considering a walk along the geometric structure of the dataset.
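As a rough illustration of measuring distance by walking along the data structure, the sketch below approximates geodesic distances with shortest paths over a symmetric kNN graph. This is a generic Isomap-style construction chosen for illustration only, not the algorithm proposed in this paper; the arc-shaped sample, the value k = 2, and all function names are assumptions.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric kNN graph as an adjacency dict {i: {j: edge_length}}."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            w = math.dist(points[i], points[j])
            graph[i][j] = w
            graph[j][i] = w  # keep the graph undirected
    return graph


def shortest_path_lengths(graph, source):
    """Dijkstra's algorithm: length of the shortest path from source to each node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Hypothetical sample: 16 points on a unit arc from 30 to 330 degrees ("C" shape).
points = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
          for a in range(30, 331, 20)]
graph = knn_graph(points, k=2)

straight = math.dist(points[0], points[-1])                # Euclidean distance between the ends
along = shortest_path_lengths(graph, 0)[len(points) - 1]   # walk along the arc
```

On this sample the two ends of the arc are close in Euclidean terms, while the path that follows the arc is several times longer, which is the kind of difference a manifold-aware distance is meant to capture.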

In this paper, we propose an unsupervised learning algorithm based on the Reciprocal kNN Graph. The proposed algorithm improves the effectiveness of image retrieval through re-ranking and rank aggregation tasks by taking into account the intrinsic geometry of the dataset manifold. The capacity of considering the geometry of the dataset manifold is illustrated in Fig. 1, Fig. 2, Fig. 3. We illustrate the Two-Moon dataset, comparing the Euclidean distance with the proposed Reciprocal kNN Graph. One point is selected as a labeled point (marked with a triangle) in each moon. Next, all other data points are assigned to the closest labeled point, which determines their color. Fig. 1 illustrates the classification computed by the Euclidean distance. Fig. 2 illustrates the ideal classification (with points in red and blue) considering the dataset manifold. The Euclidean distance does not consider the geometric structure of the dataset. As can be observed, the extremities of the moons are misclassified. Fig. 3 illustrates the distances learned by the Reciprocal kNN Graph after only one iteration. We can observe that several points were corrected compared with the Euclidean distance. The arrows in Fig. 3 illustrate how the Reciprocal kNN Graph algorithm iteratively propagates the similarity along the dataset structure considering the connectivity of the dataset: (i) the red points on the left; and (ii) the blue points on the right.
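The Euclidean baseline described above can be sketched in a few lines. The example below is only a toy version of the Two-Moon setting: the moon coordinates, the sampling angles, the choice of labeled points, and the function names are all assumptions made for illustration, not the data used in the figures.

```python
import math


def two_moons(angles_deg):
    """Two interleaved half-circles; the coordinates are an illustrative assumption."""
    upper = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
             for a in angles_deg]
    lower = [(1.0 - math.cos(math.radians(a)), 0.5 - math.sin(math.radians(a)))
             for a in angles_deg]
    return upper, lower


def assign_to_closest_labeled(points, labeled):
    """Give each point the label of its nearest labeled point (Euclidean distance).

    labeled maps the index of a labeled point to its label.
    """
    predicted = []
    for p in points:
        closest = min(labeled, key=lambda i: math.dist(p, points[i]))
        predicted.append(labeled[closest])
    return predicted


angles = range(0, 181, 30)                 # 7 points per moon
upper, lower = two_moons(angles)
points = upper + lower
truth = ["red"] * len(upper) + ["blue"] * len(lower)

# One labeled point in the middle of each moon (indices 3 and 10).
labeled = {3: "red", 10: "blue"}
predicted = assign_to_closest_labeled(points, labeled)

# Indices whose Euclidean assignment disagrees with the moon they belong to.
misassigned = [i for i, (p, t) in enumerate(zip(predicted, truth)) if p != t]
```

In this toy configuration the only misassigned points are one tip of each moon, the tip that reaches toward the other moon's labeled point, which mirrors the kind of error at the extremities discussed above.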

The Reciprocal kNN Graph is mainly based on the information encoded in the top positions of the ranked lists. Given a query image, the ranked lists represent a relevant source of contextual information, since they define relationships not only between pairs of images (as distance functions do), but also among all the images in the ranked list. The way the similarity information is modelled constitutes the essential difference between the Reciprocal kNN Graph approach and existing diffusion-based algorithms: the Reciprocal kNN Graph is based only on the ranked lists, and is therefore independent of any distance (or similarity) scores.

By analyzing the ranked lists, it is expected, for example, that similar images present reciprocal references at the beginning of their ranked lists. It is also expected that images ranked at the top positions of ranked lists are similar to each other. In this way, aiming at redefining the distance between two images, the Reciprocal kNN Graph uses both the reciprocal nearest neighbor references and the graph structure considering all references among images at top positions of ranked lists. This approach represents the main contribution of our method, since it enables exploiting the maximum contextual information available in the ranked lists with low computational effort. Another contribution relies on the efficiency of the Reciprocal kNN Graph algorithm. Unlike other diffusion approaches based on matrix multiplication [3], [42], [44], which present a complexity of O(n³), our algorithm recomputes only the beginning of ranked lists with a constant number of elements, which presents computational and storage requirements of only O(n), where n represents the number of images in the collection.
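The notion of reciprocal references at the top of ranked lists can be illustrated with a minimal sketch. The code below only tests whether two images appear in each other's top-k positions and collects the resulting edges; it is not the full Reciprocal kNN Graph score defined later in the paper, and the ranked lists, the value of k, and the function names are hypothetical.

```python
def knn(ranked_lists, i, k):
    """Top-k neighbors of image i according to its ranked list, excluding i itself."""
    return [j for j in ranked_lists[i] if j != i][:k]


def are_reciprocal(ranked_lists, i, j, k):
    """True when i and j appear in each other's top-k positions."""
    return j in knn(ranked_lists, i, k) and i in knn(ranked_lists, j, k)


def reciprocal_knn_edges(ranked_lists, k):
    """Set of undirected edges (i, j), i < j, between reciprocal k-nearest neighbors."""
    edges = set()
    for i in range(len(ranked_lists)):
        for j in knn(ranked_lists, i, k):
            if i in knn(ranked_lists, j, k):
                edges.add((min(i, j), max(i, j)))
    return edges


# Hypothetical ranked lists for five images (most similar first).
ranked_lists = [
    [0, 1, 2, 3, 4],
    [1, 0, 2, 4, 3],
    [2, 1, 0, 3, 4],
    [3, 4, 2, 1, 0],
    [4, 3, 2, 1, 0],
]

edges = reciprocal_knn_edges(ranked_lists, k=2)
```

Here images 0, 1, and 2 reference each other, as do images 3 and 4, while image 2 appears in the top positions of 3 and 4 without returning the reference, so no edge is created for it. Only the first k entries of each list are inspected, so for a constant k the work grows linearly with the number of images in this sketch.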

We conducted an extensive experimental evaluation involving shape, color, and texture descriptors, different datasets, and comparisons with other post-processing approaches. Experimental results demonstrate the effectiveness of our method. The re-ranking and rank aggregation algorithms yield better effectiveness results than various state-of-the-art algorithms.

This paper is organized as follows: Section 2 discusses related work; Section 3 defines the image re-ranking problem; Section 4 presents our Reciprocal kNN Graph algorithm; Section 5 presents the experimental evaluation; and, finally, Section 6 draws conclusions and presents future work.

Section snippets

Related work

Defining an effective distance measure plays a key role in many multimedia applications, including classification and retrieval tasks. For example, choosing a good distance measure is often critical for building a content-based image retrieval (CBIR) system. In general, aiming at retrieving the most similar images to a given query image, CBIR systems compute a predefined distance measure between the query image and each collection image. Traditional distance measures that consider only

Problem formulation

Let $C = \{img_{1}, img_{2}, \dots, img_{n}\}$ be an image collection, where n is the number of images in the collection. Let $D$ be an image descriptor which defines a distance function between two images img_i and img_j as ρ(img_i, img_j). For simplicity and readability purposes, we use the notation ρ(i,j) for denoting the distance between images img_i and img_j.

Based on the distance function ρ, a ranked list τ_q can be computed in response to a query image img_q. Although the ranked lists contain distance information from
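A minimal sketch of how a ranked list can be derived from a distance function is given below, assuming that τ_q orders the collection by increasing ρ(q, i). The one-dimensional features, the absolute-difference distance, and the helper names are illustrative assumptions rather than a descriptor used in the paper.

```python
def ranked_list(rho, q, n):
    """Indices 0..n-1 sorted by increasing distance rho(q, i) to the query q."""
    return sorted(range(n), key=lambda i: rho(q, i))


def position(tau_q, i):
    """1-based position of image i in the ranked list tau_q."""
    return tau_q.index(i) + 1


# Hypothetical one-dimensional "features", one value per image.
features = [0.0, 1.0, 0.2, 5.0]


def rho(i, j):
    """Toy distance between images i and j (absolute feature difference)."""
    return abs(features[i] - features[j])


n = len(features)
# One ranked list per image, each image taken in turn as the query.
tau = {q: ranked_list(rho, q, n) for q in range(n)}
```

With these values, the list for query 0 starts with image 0 itself, followed by image 2 and then image 1, so once the lists are built the later steps can work from positions alone without referring back to the distance values.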

Reciprocal kNN Graph

In this section, we present the Reciprocal kNN Graph algorithm and its application in re-ranking and rank aggregation tasks. We also discuss convergence and efficiency aspects.

Experimental evaluation

This section demonstrates the effectiveness of the proposed re-ranking and rank aggregation methods in image retrieval tasks. A large set of experiments was conducted considering four datasets and nineteen CBIR descriptors, aiming at analyzing and comparing our method under several aspects.

Conclusions

In this work, we have presented a novel re-ranking and rank aggregation approach that exploits the Reciprocal kNN Graph for improving image retrieval tasks.

The main idea consists in analyzing the reciprocal references at the top positions of ranked lists for performing re-ranking and rank aggregation tasks. The Reciprocal kNN Graph algorithm iteratively propagates the similarity along the dataset structure by taking into account the intrinsic geometry of the dataset manifold.

We conducted a large set of

Acknowledgments

The authors thank AMD, FAEPEX, CAPES, FAPESP, and CNPq for financial support.

References (46)

N. Arica et al. BAS: a perceptual shape descriptor based on the beam angle statistics. Pattern Recogn. Lett. (2003)
D.C.G. Pedronette et al. Exploiting pairwise recommendation and clustering strategies for image re-ranking. Inf. Sci. (2012)
B. Tao et al. Texture recognition and image retrieval using gradient indexing. J. Vis. Commun. Image Represent. (2000)
J. Wang et al. Learning context-sensitive similarity by shortest path propagation. Pattern Recogn. (2011)
J. Almeida et al. BP-tree: an efficient index for similarity search in high-dimensional metric spaces
X. Bai et al. Co-transduction for shape retrieval
S. Belongie et al. Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. (2002)
Y.L. Boureau et al. Learning mid-level features for recognition
P. Brodatz. Textures: A Photographic Album for Artists and Designers (1966)
S.A. Chatzichristofis et al. CEDD: color and edge directivity descriptor: a compact descriptor for image indexing and retrieval
S.A. Chatzichristofis et al. FCTH: fuzzy color and texture histogram - a low level feature for accurate image retrieval
J.C. van Gemert et al. Visual word ambiguity. IEEE Trans. Pattern Anal. Mach. Intell. (2010)
R. Gopalan et al. Articulation-invariant representation of non-planar shapes
S.C. Hoi et al. Semi-supervised distance metric learning for collaborative image retrieval and clustering. ACM Trans. Multimed. Comput. Commun. Appl. (2010)
J. Huang et al. Image indexing using color correlograms
H. Jegou et al. Accurate image search using the contextual dissimilarity measure. IEEE Trans. Pattern Anal. Mach. Intell. (2010)
J. Jiang et al. Unsupervised metric learning by self-smoothing operator
P. Kontschieder et al. Beyond pairwise shape similarity analysis
V. Kovalev et al. Color co-occurrence descriptors for querying-by-example
L.J. Latecki et al. Shape descriptors for non-rigid shapes with a single closed contour
H. Ling et al. Shape classification using the inner-distance. IEEE Trans. Pattern Anal. Mach. Intell. (2007)
H. Ling et al. Balancing deformability and discriminability for shape matching
D. Lowe. Object recognition from local scale-invariant features

Cited by (43)

A BFS-Tree of ranking references for unsupervised manifold learning
2021, Pattern Recognition
Contextual information, defined in terms of the proximity of feature vectors in a feature space, has been successfully used in the construction of search services. These search systems aim to exploit such information to effectively improve ranking results, by taking into account the manifold distribution of features usually encoded. In this paper, a novel unsupervised manifold learning is proposed through a similarity representation based on ranking references. A breadth-first tree is used to represent similarity information given by ranking references and is exploited to discover underlying similarity relationships. As a result, a more effective similarity measure is computed, which leads to more relevant objects in the returned ranked lists of search sessions. Several experiments conducted on eight public datasets, commonly used for image retrieval benchmarking, demonstrated that the proposed method achieves very high effectiveness results, which are comparable or superior to the ones produced by state-of-the-art approaches.
Graph-based selective rank fusion for unsupervised image retrieval
2020, Pattern Recognition Letters
Nowadays, there is a great variety of visual features available for image retrieval tasks. While fusion strategies have been established as a promising alternative, an inherent difficulty in unsupervised scenarios is the task of selecting the features to combine. In this paper, a Graph-based Selective Rank Fusion is proposed. The graph is used to represent the effectiveness estimation of features and the complementarity among them. The selected combinations are defined by the Connected Components of the graph. Highly effective retrieval results were achieved through a comprehensive experimental evaluation considering different public datasets, dozens of features and comparisons with related methods. Relative gains up to +54.73% were obtained in relation to the best isolated feature.
Unsupervised selective rank fusion for image retrieval tasks
2020, Neurocomputing
Several visual features have been developed for content-based image retrieval in the last decades, including global, local and deep learning-based approaches. However, despite the huge advances in feature development and mid-level representations, a single visual descriptor is often insufficient to achieve effective retrieval results in several scenarios. Mainly due to the diverse aspects involved in human visual perception, the combination of different features has become established as a relevant trend in image retrieval. An intrinsic difficulty consists in the task of selecting the features to combine, which is often supported by supervised learning approaches. Therefore, in the absence of labeled data, selecting features in an unsupervised way is a very challenging, although essential task. In this paper, an unsupervised framework is proposed to select and fuse visual features in order to improve the effectiveness of image retrieval tasks. The framework estimates the effectiveness and correlation among features through a rank-based analysis and uses a list of ranker pairs to determine the selected feature combinations. Highly effective retrieval results were achieved through a comprehensive experimental evaluation conducted on 5 public datasets, involving 41 different features and comparison with other methods. Relative gains up to +55% were obtained in relation to the most effective isolated feature.
A framework for speaker retrieval and identification through unsupervised learning
2019, Computer Speech and Language
Speaker recognition is a task of remarkable relevance, with applications in diversified domains. Recently, mainly due to the facilities in audio-visual content acquisition, the capacity of analyzing growing datasets independent of labeled data has become a crucial advantage. This paper presents a speaker recognition approach based on recent unsupervised learning methods, which do not require any labeled data or user intervention. The approach is organized in terms of a framework which exploits a rank-based formulation. The similarity information defined by speaker modeling techniques is encoded in ranked lists, which are used as input by the unsupervised learning algorithms. Vector quantization, Gaussian mixture models and i-vectors are employed as modeling techniques, while the algorithms RL-Sim and ReckNN are used for unsupervised learning tasks. The framework was experimentally evaluated on query-by-example speaker retrieval and speaker identification tasks, both on clean and noisy speech recordings. An experimental evaluation was conducted on three public datasets, different languages, and recordings conditions. Effectiveness gains up to +56% on retrieval measures were obtained through the use of unsupervised learning algorithms over traditional speaker recognition techniques.
An optimized unsupervised manifold learning algorithm for manycore architectures
2019, Information Sciences
Citation Excerpt:
However, for ensuring the complexity of O(n), various efficiency aspects, such as data structure, should be properly considered. Since the focus of [34] was the effectiveness evaluation, only a direct and non-optimized implementation of the method was initially devised. This section describes the algorithm used in [34] for computing the ReckNN method, while next sections present the proposed optimizations, a detailed discussion about efficiency aspects and the evaluation on manycore architectures.
Multimedia data, such as images and videos, has become very popular in people’s daily life as a result of the widespread use of mobile devices. The ever-increasing amount of such data, along with the necessity for real-time retrieval, has led to the development of new methods that can process them in a timely fashion with acceptable accuracy. In this paper, we study the performance of ReckNN, an unsupervised manifold learning algorithm based on the reciprocal neighbourhood and the authority of ranked lists. Most of the related work in this field does not fully investigate optimization strategies, an aspect that is becoming more important with the high availability of manycore machines. In order to address that issue, we fully investigate optimization opportunities in this article and make the following three main contributions. Firstly, we develop an efficient and scalable method for storing and accessing the distances between objects (e.g., video or image) based on dictionaries. Secondly, we employ memoization to speed up the computation of authority scores, leading to a significant performance gain even on single-core architectures. Lastly, we devise and implement several parallelization strategies and show that they are scalable on a 72-core Intel machine. The experimental results with MPEG-7, Corel5k and MediaEval benchmarks show that the optimized ReckNN delivers both efficiency and scalability, highlighting the importance of the proposed optimizations for manycore machines.
Semi-supervised and active learning through Manifold Reciprocal kNN Graph for image retrieval
2019, Neurocomputing
Citation Excerpt:
Moreover, the rank-based approach requires low computational costs and is widely parallelizable, which makes it suitable for online responses and real-world applications. The Manifold Reciprocal kNN Graph [39] is an unsupervised manifold learning algorithm which propagates the similarity among neighbors by considering the geometry of the dataset manifold, with the aim to improve the effectiveness of retrieval tasks without the need of user intervention. The Reciprocal kNN Graph is mainly based on the relationships and information encoded in the top positions of the ranked lists, in the context of a k-reciprocal neighborhood.
A massive and ever growing amount of data collections, including visual and multimedia content are available today. Such content usually possesses additional information, as text or other metadata, to form a rather sparse and noisy, yet rich and diverse source of annotation. Although the text-based retrieval models are well established, they ignore the rich source of information encoded in the visual data. In contrast, the promising content-based retrieval technologies, capable of considering the multimedia content, still face obstacles for mapping the low level features into high level semantic concepts. Supervised approaches based on relevance feedback techniques have been employed for mitigating such gap on visual retrieval tasks. Although often quite effective, such methods rely only on labeled data, which can severely impact the retrieval effectiveness when the number of user interventions is insufficient. In this scenario, the retrieval approaches are ideally suitable for the emerging weakly supervised and active learning technology to semi-autonomously explore data collections by taking into account the relationships among multimedia objects and saving the user’s efforts. In this paper, we discuss a novel semi-supervised learning algorithm for image retrieval tasks. While a manifold learning algorithm uses a reciprocal kNN graph to analyze the unlabeled data, the labeled information obtained through user interactions are represented using similarity sets. Both labeled and unlabeled information are modelled in terms of ranking information to allow a strict link between them. Experimental results obtained on various public datasets and several different visual features have demonstrated the effectiveness of the proposed approach.


^☆: This paper has been recommended for acceptance by Thomas Brox.
