Elsevier

Pattern Recognition

Volume 41, Issue 10, October 2008, Pages 3071-3077
Contrast context histogram—An efficient discriminating local descriptor for object recognition and image matching

https://doi.org/10.1016/j.patcog.2008.03.013

Abstract

In this paper, we propose a new invariant local descriptor, called the contrast context histogram (CCH), for image matching and object recognition. By representing the contrast distributions of a local region, it serves as a distinctive local descriptor of the region. Our experiments demonstrate that contrast-based local descriptors can represent local regions with more compact histogram bins. Because of its high matching accuracy and efficient computation, the CCH has the potential to be used in a number of real-time applications.

Introduction

Invariant local descriptors constructed from images have been proposed as a way to solve many problems, such as image matching [1], object recognition [1], [2], [3], and many other vision-based applications [4], [5], [6], [7]. The idea is to detect local properties of salient image corners that remain invariant under a class of transformations, and then to build discriminating descriptors for these corners. Such descriptors provide robust representations of local image regions, even under partial occlusion. The basic problem is how to find the relevant information required to encode the local signatures.

Descriptors of local features have received considerable attention in recent years. For example, Freeman and Adelson [8] developed steerable filters, in which filters of arbitrary orientation are synthesized from linear combinations of pixel derivatives in particular directions. Belongie et al. [2] proposed a feature description called the shape context, which is a histogram of edge points with respect to a reference point in log–polar coordinates. Lowe [1] introduced the scale-invariant feature transform (SIFT) descriptor, which is invariant to scale and rotation. In this approach, keypoints are detected as scale-space extrema in a series of difference-of-Gaussian (DoG) images. A local descriptor is then built for each keypoint from a weighted histogram of gradient orientations over a patch of pixels in its local neighborhood. Refs. [3], [9] show that SIFT is one of the most effective matching approaches when scale and viewpoint changes occur in images. Various extensions of SIFT have been proposed. For example, Ke and Sukthankar proposed PCA-SIFT [10], which applies principal components analysis (PCA) [11] to a normalized gradient patch, instead of using SIFT's smoothed weighted histograms. The gradient location-orientation histogram (GLOH) [3] computes the SIFT descriptor on a log–polar location grid and then reduces the size of the descriptor with PCA. The primary focus of these extensions is to provide more distinctive and compact descriptors that improve matching accuracy and speed.

In this paper, we propose a novel invariant local descriptor called the contrast context histogram (CCH) for image matching and object recognition. Our primary motivation is to develop a descriptor that is fast to compute, requires fewer histogram bins to represent a local region, and achieves good matching performance. CCH exploits the contrast properties of a local region, instead of storing weighted edge orientation histograms of salient corners as the SIFT approach does. Rotation and linear illumination changes are taken into account to make the CCH robust against geometric and photometric changes. Compared to approaches such as SIFT, PCA-SIFT, and GLOH, which require computing the gradient orientations of all the pixels in a region, CCH is more efficient to compute because it only evaluates the intensity differences between the center pixel and the other pixels in a region. Therefore, CCH is potentially more suitable for real-time applications such as augmented reality [5]. In the experiments, we use CCH descriptors to represent cluttered scenes and objects, and evaluate the method's effectiveness.

The remainder of the paper is organized as follows. In the next section, we describe the construction of CCH descriptors from the salient corners of images. Section 3 discusses the implementation of the CCH approach. The experimental results are reported in Section 4. Finally, in Section 5, we present our conclusions.

The CCH descriptor

The main issue in developing invariant local descriptors is how to represent a region effectively and discriminatively. The color histogram [12] is an option for textural description, but it is sensitive to illumination changes. Instead, we consider a technique that computes the contrast values of points within a region with respect to a salient corner. A contrast value is defined as the difference in intensity between a point and the salient corner. If the brightness of each pixel changes by
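As a minimal sketch of the contrast definition above, the following Python snippet computes the contrast of each pixel in a small patch relative to its center pixel, then averages the positive and negative contrasts separately. The function names, the toy 3×3 patch, the single undivided region, and the choice to count zero contrast as positive are all assumptions made for illustration; this is not the full descriptor from the paper. The final assertion checks one property that follows directly from the definition: adding the same constant to every pixel leaves the contrast values unchanged.

```python
def contrast_values(patch, center):
    """Return the contrast of every pixel relative to the center pixel."""
    cy, cx = center
    ref = patch[cy][cx]
    return [[value - ref for value in row] for row in patch]


def positive_negative_means(contrasts, center):
    """Average positive and negative contrasts separately (center excluded).

    Assumption for this sketch: a contrast of exactly zero counts as positive.
    """
    cy, cx = center
    pos, neg = [], []
    for y, row in enumerate(contrasts):
        for x, c in enumerate(row):
            if (y, x) == (cy, cx):
                continue  # the center has zero contrast with itself
            if c >= 0:
                pos.append(c)
            else:
                neg.append(c)
    pos_mean = sum(pos) / len(pos) if pos else 0.0
    neg_mean = sum(neg) / len(neg) if neg else 0.0
    return pos_mean, neg_mean


# Hypothetical 3x3 intensity patch with the salient corner at its center.
patch = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
center = (1, 1)
contrasts = contrast_values(patch, center)
summary = positive_negative_means(contrasts, center)

# Adding a constant to every pixel leaves the contrast values unchanged.
brighter = [[v + 25 for v in row] for row in patch]
assert contrast_values(brighter, center) == contrasts
```

Only subtractions and running sums are involved, which is the source of the efficiency argument made for contrast-based descriptors.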

Implementation

To compute CCH descriptors from an input image, we first extract the corners from a multi-scale Laplacian pyramid [17] by detecting the Harris corners [18] on each level of the pyramid. As noted in Ref. [17], corners that are invariant to the scale changes of the image can be detected by searching for stable features on Laplacian pyramids in the scale space [19]. A salient corner is selected if its minimal eigenvalue is larger than all the eigenvalues of its neighbors in a 7×7 region. Fig. 1
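The corner selection rule described above can be sketched in plain Python as follows. The sketch assumes that a per-pixel map of minimal eigenvalues (of the 2×2 gradient structure matrix used by Harris-style detectors) is already available for one pyramid level, and that "the eigenvalues of its neighbors" means the neighbors' minimal eigenvalues. The helper names `min_eigenvalue` and `select_salient` and the toy response map are hypothetical; pyramid construction and gradient computation are omitted.

```python
import math


def min_eigenvalue(sxx, sxy, syy):
    """Smaller eigenvalue of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]]."""
    half_trace = (sxx + syy) / 2.0
    radius = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    return half_trace - radius


def select_salient(response, window=7):
    """Return (row, col) points whose response exceeds every neighbor in the window."""
    half = window // 2
    rows, cols = len(response), len(response[0])
    selected = []
    for y in range(rows):
        for x in range(cols):
            value = response[y][x]
            is_peak = True
            # Scan the window, clipped at the image border.
            for ny in range(max(0, y - half), min(rows, y + half + 1)):
                for nx in range(max(0, x - half), min(cols, x + half + 1)):
                    if (ny, nx) != (y, x) and response[ny][nx] >= value:
                        is_peak = False
                        break
                if not is_peak:
                    break
            if is_peak:
                selected.append((y, x))
    return selected


# Toy response map: zeros, two isolated peaks, and one weaker nearby point.
response = [[0.0] * 12 for _ in range(12)]
response[2][2] = min_eigenvalue(4.0, 0.0, 2.0)  # 2.0
response[9][9] = min_eigenvalue(3.0, 1.0, 3.0)  # 2.0
response[2][4] = 1.0  # inside the 7x7 window of (2, 2), so it is suppressed
corners = select_salient(response)
```

The strict comparison means that flat areas and ties produce no corner, which is one possible reading of "larger than all"; a real implementation would typically also apply a minimum response threshold.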

Data set

We evaluated CCH descriptors on the data set used in Ref. [3]. It contains images of various geometric and photometric transformations for different scene types. Fig. 2 shows the following images extracted from the data set: rotation, image blur, JPEG compression, lighting changes, viewpoint changes, and zoom and rotation changes. In the case of rotation, the images were obtained by rotating the camera around its

Conclusion

We have proposed a new invariant descriptor called CCH to describe the local properties of image patches, and shown that it is computationally efficient and highly effective in determining the correspondences between images. It is successful because the positive and negative histogram bins of the contrast values are discriminative properties of local regions, and they can be computed rapidly because their construction involves only simple subtractions. The experimental results suggest that CCH has

Acknowledgments

This research was supported in part by NSC 96-3113-H-001-011 and NSC 96-2752-E-002-007-PAE from the National Science Council, Taiwan. The authors would also like to thank Mr. P. Dunne for his work on polishing the writing.

References (20)

  • D. Lowe, Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vision (2004)
  • S. Belongie et al., Shape matching and object recognition using shape contexts, IEEE Trans. Pattern Anal. Mach. Intell. (2002)
  • K. Mikolajczyk et al., A performance evaluation of local descriptors, IEEE Trans. Pattern Anal. Mach. Intell. (2005)
  • T. Ahonen et al., Face recognition with local binary patterns
  • I. Skrypnyk et al., Scene modeling, recognition and tracking with invariant image features
  • J. Sivic et al., Video google: a text retrieval approach to object matching in videos
  • A. Thayananthan et al., Shape context and chamfer matching in cluttered scenes
  • W. Freeman et al., The design and use of steerable filters, IEEE Trans. Pattern Anal. Mach. Intell. (1991)
  • P. Moreels et al., Evaluation of features detectors and descriptors based on 3D objects
  • Y. Ke et al., PCA-SIFT: a more distinctive representation for local image descriptors

There are more references available in the full text version of this article.

About the Author—CHUN-RONG HUANG received the B.S. degree in electrical engineering from National Cheng Kung University, Taiwan, in 1999. He received the Ph.D. in Electrical Engineering from National Cheng Kung University, Taiwan, in 2005. He is currently a postdoctoral fellow in the Institute of Information Science, Academia Sinica, Taipei, Taiwan. His research interests include computer vision, computer graphics, and medical image processing.

About the Author—CHU-SONG CHEN received a B.S. in control engineering from National Chiao-Tung University, Hsing-Chu, Taiwan, in 1989. He received an M.S. in 1991 and a Ph.D. in 1996, both from the Department of Computer Science and Information Engineering, National Taiwan University. He is now an associate research fellow of the Institute of Information Science, Academia Sinica, and also an adjunct associate professor of the Graduate Institute of Networking and Multimedia, National Taiwan University. His research interests include computer vision, pattern recognition, signal/image processing, and multimedia. Since 2007, Dr. Chen has served as the Secretary-General of the Image Processing and Pattern Recognition (IPPR) Society, Taiwan, which is one of the societies of the International Association of Pattern Recognition (IAPR). He has published more than 70 technical papers, and received the outstanding paper awards of IPPR in 1997, 2001, and 2005.

About the Author—PAU-CHOO CHUNG received the B.S. and M.S. degrees in electrical engineering from National Cheng Kung University, Taiwan, Republic of China, in 1981 and 1983, respectively, and the Ph.D. degree in electrical engineering from Texas Tech University, in 1991. In 1991, she joined the Department of Electrical Engineering, National Cheng Kung University, and has been a full professor since 1996. Since 2001, she has served as the Vice Director, and is currently the Director, of the Center for Research of E-life Digital Technology, National Cheng Kung University. She was selected as Distinguished Professor of National Cheng Kung University in 2005. She also served as the Director of the Electrical Laboratory, National Cheng Kung University, during 2005–2008.

Dr. Chung's research interests include image analysis and pattern recognition, video image analysis, neural networks, telemedicine, and multimedia processing. In particular, she applies most of her research results to medical applications and has received many awards. Dr. Chung has served as a program committee member for many international conferences. She is currently the chair of the IEEE Life Science Systems and Applications Technical Committee. She is also a member of the IEEE Visual Signal Processing and Communication Technical Committee, the IEEE Neural Systems and Applications Technical Committee, and the Multimedia Systems & Applications Technical Committee in the CASS. She currently also serves as an Associate Editor of the Journal of Information Science and Engineering, a Guest Editor of IEEE Transactions on Circuits and Systems-I, and a member of the International Steering Committee of the IEEE Asian Pacific Conference on Circuits and Systems. She was the Chair of the IEEE Computational Intelligence Society, Tainan Chapter (2005–2006), the Secretary General of the Biomedical Engineering Society of the Republic of China (2005–2006), and a delegate to the International Federation for Medical and Biological Engineering (IFMBE). She is one of the co-founders of the Medical Image Standard Association in Taiwan. She is currently a member of the BoG of the CAS Society (2007–2009).

She is a member of the Phi Tau Phi honor society and an IEEE Fellow.
