Multi-spectral dataset and its application in saliency detection
Introduction
Saliency detection has recently become a promising topic [1], [2], [3], [4]. The goal of saliency detection is to extract salient areas from an input image and present the result as a gray-scale image: the whiter a pixel is, the more likely it is to be salient. Since the detected saliency map can be utilized in various applications, such as recognition [5], segmentation [6], and tracking [7], research on this subject has attracted much attention [8], [9], [10].
Generally, methods for saliency detection can be categorized into local-based and global-based schemes [11]. Local-based methods calculate a region’s saliency from its contrast to a small neighborhood [12], [13], [14]. Global-based methods evaluate saliency with respect to the statistical characteristics of the whole image [15], [16]. In either case, saliency detection is mostly conducted on natural images taken by ordinary cameras. These cameras respond to wavelengths from about 390 to 700 nm, known as the visible spectrum [17]; the obtained images are regular RGB images. Information from the electromagnetic spectrum beyond this range is lost during the imaging process. However, the lost bands may also be valuable for vision tasks, because the more supporting information we have, the more rational the resulting decisions. This is not only common sense for humans but is also borne out by other applications in computer vision. For example, after the introduction of the SIFT descriptor [18] for gray-scale images, CSIFT [19], [20] was developed to incorporate the color bands into the descriptor, and more recently MSIFT [21] was presented to include the near-infrared band for a richer descriptor. In face recognition research, early work focused primarily on gray-scale or RGB images; later, light bands beyond the visible spectrum [22] were introduced to mitigate lighting problems. The same is true for boundary detection [23] and tracking [24]: incorporating more cues improves performance. In remote sensing, the number of bands is not limited to one or several, but can reach tens or hundreds [25], [26], [27].
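The local-based scheme mentioned above can be illustrated with a minimal sketch: each pixel's saliency is taken as its contrast against the mean of a small surrounding window. This is a simplified illustration of the general idea, not any specific method from the cited works; the function name and window-based formulation are our own.

```python
import numpy as np

def local_contrast_saliency(img, radius=1):
    """Toy local-contrast saliency: a pixel's saliency is the absolute
    difference between its intensity and the mean intensity of its
    (2*radius+1) x (2*radius+1) neighborhood, normalized to [0, 1]."""
    h, w = img.shape
    padded = np.pad(img.astype(float), radius, mode="edge")
    sal = np.zeros((h, w))
    k = 2 * radius + 1
    for y in range(h):
        for x in range(w):
            window = padded[y:y + k, x:x + k]
            sal[y, x] = abs(float(img[y, x]) - window.mean())
    # Normalize so the result reads as a gray-scale saliency map.
    return sal / sal.max() if sal.max() > 0 else sal
```

On a bright square against a uniform dark background, such a measure responds strongly at the square's boundary and is zero in flat regions, which is the behavior local-contrast methods exploit.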
Considering the success of including light bands beyond the visible range in many applications, in this work we construct a multi-spectral dataset containing both near-infrared (NIR) and regular RGB images. Several datasets containing NIR images have been presented before, for example the PolyU-NIRFD dataset [22] for face recognition and the NIR–RGB dataset [21] for scene categorization. However, these datasets are designed for specific purposes and cannot be readily utilized for saliency detection. To this end, the presented dataset is constructed in the hope of providing a new platform for saliency research.
The rest of this paper is organized as follows. Section 2 presents the proposed multi-spectral dataset. Section 3 introduces the distinguishing properties of the near-infrared band. Section 4 applies the presented dataset to saliency detection. Finally, Section 5 concludes the paper.
Multi-spectral dataset
Since more cues tend to provide richer information, we hope that a camera can capture the NIR and RGB spectra simultaneously. However, most existing datasets contain images captured in the RGB bands only, so the information of all four bands cannot be obtained at the same time. Though the NIR–RGB dataset [21] has images of both bands, each pair is taken consecutively with two cameras, so the contents of the two images in a pair are not identical. When these images are employed, they have to be
NIR spectrum
The NIR spectrum lies between the visible band and the thermal infrared band. It shares properties with both visible and thermal infrared light, yet differs from each. Firstly, unlike thermal infrared, NIR light is reflected by objects in much the same way as visible light. Secondly, like thermal infrared, it is invisible to the human eye and thus reveals an “unseen” characteristic that visible light does not.
To understand the relationship and differences between the RGB
Saliency detection
To demonstrate the effectiveness of the presented dataset, we conduct experiments on saliency detection. Saliency maps are first extracted from the RGB and NIR images. The obtained maps are then combined to produce the final results. The purpose of these experiments is to answer the following two questions: 1) whether or not the incorporation of the NIR band can improve saliency detection performance; 2) which kind of model is best for combining the saliency maps
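The combination step described above can be sketched as the simplest possible regression model: a linear fusion whose per-band weights are fit by least squares against the ground-truth mask. This is only an illustration under our own assumptions (the function names `fit_fusion_weights` and `fuse` are hypothetical); the paper evaluates several regression models, of which a linear model is merely the most basic.

```python
import numpy as np

def fit_fusion_weights(sal_rgb, sal_nir, ground_truth):
    """Fit linear weights w so that w[0]*S_rgb + w[1]*S_nir
    approximates the binary ground-truth mask in a least-squares sense."""
    X = np.stack([sal_rgb.ravel(), sal_nir.ravel()], axis=1)
    y = ground_truth.ravel().astype(float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def fuse(sal_rgb, sal_nir, w):
    """Combine the two saliency maps with the learned weights,
    clipping the result to the valid gray-scale range [0, 1]."""
    fused = w[0] * sal_rgb + w[1] * sal_nir
    return np.clip(fused, 0.0, 1.0)
```

In practice the weights would be learned on a training split of the dataset and applied to held-out image pairs; richer regression models replace the linear map with a more flexible function of the two band-wise saliency values.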
Conclusion
In this work, a multi-spectral dataset is presented to serve as a new platform for saliency research. Different from existing datasets, ours contains pairs of RGB and NIR images, which provides more valuable information for detecting the salient areas of an image. Experiments demonstrate the effectiveness of incorporating the NIR band in saliency detection. We also test several regression models for combining the RGB and NIR bands. Results show that it is not appropriate to employ one
Acknowledgment
This work is supported by the State Key Program of National Natural Science of China (Grant No. 61232010), the National Natural Science Foundation of China (Grant No. 61172143 and 61105012), and the Natural Science Foundation Research Project of Shaanxi Province (Grant No. 2012JM8024).
References (37)
- et al., Assessing the contribution of color in visual attention, Comput. Vis. Image Understand. (2005)
- et al., Selective visual attention enables learning and recognition of multiple objects in cluttered scenes, Comput. Vis. Image Understand. (2005)
- et al., Dynamic visual attention on the sphere, Comput. Vis. Image Understand. (2010)
- et al., A framework for visual-context-aware object detection in still images, Comput. Vis. Image Understand. (2010)
- et al., A computer vision model for visual-object-based attention and eye movements, Comput. Vis. Image Understand. (2008)
- et al., Images as sets of locally weighted features, Comput. Vis. Image Understand. (2012)
- et al., Performance evaluation of local colour invariants, Comput. Vis. Image Understand. (2009)
- et al., Directional binary code with application to PolyU near-infrared face database, Pattern Recogn. Lett. (2010)
- et al., Saliency detection by multiple-instance learning, IEEE Trans. Cybernetics (2013)
- et al., Attentional selection for object recognition – a gentle way, Biol. Motivated Comput. Vis. (2002)
- Unsupervised extraction of visual attention objects in color images, IEEE Trans. Circ. Syst. Video Technol.
- A model of saliency-based visual attention for rapid scene analysis, IEEE Trans. Pattern Anal. Mach. Intell.