Spatial interest pixels (SIPs): useful low-level features of visual media data

Abstract

Visual media data, such as images, are the raw data representation for many important applications. Reducing the dimensionality of raw visual media data is desirable, since high dimensionality degrades both the effectiveness and the efficiency of visual recognition algorithms. We present a comparative study of spatial interest pixels (SIPs), including eight-way (a novel SIP detector), Harris, and Lucas-Kanade, whose extraction is an important step in reducing the dimensionality of visual media data. Through extensive case studies, we show the usefulness of SIPs as low-level features of visual media data. A class-preserving dimension reduction algorithm (based on the generalized singular value decomposition, GSVD) is then applied to further reduce the dimension of SIP-based feature vectors. The experiments show its superiority over PCA.
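To make the use of SIPs as low-level features concrete, here is a minimal sketch of extracting Harris-style interest pixels from a grayscale image with OpenCV. It is illustrative only: the input file name, block size, and threshold are placeholders, not the detectors or parameter settings evaluated in the paper.

```python
# Minimal sketch: Harris-style spatial interest pixels (SIPs) with OpenCV.
# The input image and all parameters below are illustrative placeholders.
import cv2
import numpy as np

img = cv2.imread("face.png", cv2.IMREAD_GRAYSCALE)        # hypothetical input image
gray = np.float32(img)

# Harris corner response at every pixel (neighborhood size, Sobel aperture, k).
response = cv2.cornerHarris(gray, blockSize=2, ksize=3, k=0.04)

# Keep pixels whose response exceeds a fraction of the maximum response.
sips = np.argwhere(response > 0.01 * response.max())      # (row, col) coordinates
print(f"{len(sips)} interest pixels detected")
```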

Notes

  1. We strictly distinguish the term feature from the term feature vector in the context of media-based classification applications. The former refers to color, texture, shape, and pixels, whereas the latter refers to the representation of an image/video instance that is ready to be fed into a classifier.

  2. In the context of image retrieval or 3D computer vision, these are called interest points. We rename them interest pixels to avoid confusion between an image point and a data point (i.e., a feature vector).

  3. http://www.mis.atr.co.jp/~mlyons/jaffe.html

  4. http://cvc.yale.edu/projects/yalefaces/yalefaces.html

  5. http://rvl1.ecn.purdue.edu/~aleix/aleix_face_DB.html

  6. http://pics.psych.stir.ac.uk/

  7. In ten-fold cross validation, the entire dataset is first split into ten pieces, and the test is then run ten times. Each time, nine pieces are used as training data and the remaining piece is used as test data. The final accuracy estimate is the mean of the ten per-run estimates (a minimal sketch follows these notes).

  8. Eigenspace and Fisherspace refer to the reduced spaces obtained via PCA and LDA (either classical or generalized), respectively (a second sketch contrasting the two also follows these notes).
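The following sketch, referenced in note 7, illustrates ten-fold cross validation with scikit-learn; the dataset and classifier are placeholders of ours, not the experimental setup of the paper.

```python
# Minimal sketch of ten-fold cross validation.
# The dataset and classifier are placeholders, not the paper's setup.
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)        # stand-in for SIP-based feature vectors
clf = KNeighborsClassifier(n_neighbors=1)  # stand-in classifier

# Ten folds: train on nine pieces, test on the remaining one, ten times;
# the final estimate is the mean over the ten runs.
scores = cross_val_score(clf, X, y, cv=10)
print("mean accuracy:", scores.mean())
```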
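The second sketch, referenced in note 8, contrasts the two reduced spaces using scikit-learn's classical PCA and LDA; it is illustrative only and does not implement the generalized (GSVD-based) LDA studied in the paper.

```python
# Sketch: Eigenspace (PCA) versus Fisherspace (classical LDA).
# Classical LDA only; the paper's generalized LDA via GSVD also covers the
# undersampled case, which this sketch does not address.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_digits(return_X_y=True)

X_eigen = PCA(n_components=9).fit_transform(X)                              # Eigenspace
X_fisher = LinearDiscriminantAnalysis(n_components=9).fit_transform(X, y)   # Fisherspace

print("Eigenspace:", X_eigen.shape, "Fisherspace:", X_fisher.shape)
```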

Author information

Correspondence to Qi Li.

Appendix

1.1 Generalized discriminant analysis using GSVD

In this Appendix, we will first complete the formulation of the optimization problem (5.10), and then give the proof of Theorem 5.1.

From Eq. 5.9, we have

$$\begin{aligned}
X^{T} H_{b} H_{b}^{T} X &= \begin{bmatrix} \Sigma_{1}^{T} \\ 0 \end{bmatrix} U^{T} U \begin{bmatrix} \Sigma_{1} & 0 \end{bmatrix}
= \begin{bmatrix} \Sigma_{1}^{T}\Sigma_{1} & 0 \\ 0 & 0 \end{bmatrix} \equiv D_{1}, \\
X^{T} H_{w} H_{w}^{T} X &= \begin{bmatrix} \Sigma_{2}^{T} \\ 0 \end{bmatrix} V^{T} V \begin{bmatrix} \Sigma_{2} & 0 \end{bmatrix}
= \begin{bmatrix} \Sigma_{2}^{T}\Sigma_{2} & 0 \\ 0 & 0 \end{bmatrix} \equiv D_{2}.
\end{aligned}$$

Hence

$$\begin{aligned}
S_{b}^{L} &= G^{T} S_{b} G = G^{T} H_{b} H_{b}^{T} G = \widetilde{G} D_{1} \widetilde{G}^{T}, \\
S_{w}^{L} &= G^{T} S_{w} G = G^{T} H_{w} H_{w}^{T} G = \widetilde{G} D_{2} \widetilde{G}^{T},
\end{aligned}$$
(A.11)

where the matrix \(\tilde{G} =(X^{-1} G)^T\).

We will use the above representations for \(S_b^L\) and \(S_w^L\) in the remainder of this appendix for the minimization of \(F\).

We first formulate the optimization problem in Eq. 5.8 as follows:

$$\begin{aligned}
& \text{minimize} \quad \operatorname{trace}\!\left( S_{w}^{L} \right) \\
& \text{subject to} \quad \operatorname{trace}\!\left( S_{b}^{L} \right) = 1.
\end{aligned}$$
(A.12)

Recall by Eq. A.11,

$$\begin{aligned}
\operatorname{trace}\!\left( S_{b}^{L} \right) &= \operatorname{trace}\!\left( \widetilde{G} D_{1} \widetilde{G}^{T} \right) = \operatorname{trace}\!\left( D_{1} \widetilde{G}^{T} \widetilde{G} \right), \\
\operatorname{trace}\!\left( S_{w}^{L} \right) &= \operatorname{trace}\!\left( \widetilde{G} D_{2} \widetilde{G}^{T} \right) = \operatorname{trace}\!\left( D_{2} \widetilde{G}^{T} \widetilde{G} \right).
\end{aligned}$$
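The derivations of Eqs. A.13 and A.14 below use only the diagonal structure of \(D_1\) and \(D_2\). Spelled out (this explicit form is our reading of the standard GSVD ordering, consistent with the sums below, with \(t\) the largest index at which \(D_1\) or \(D_2\) has a nonzero diagonal entry):

$$D_{1} = \operatorname{diag}\bigl(\underbrace{1,\ldots,1}_{r},\; \alpha_{r+1}^{2},\ldots,\alpha_{r+s}^{2},\; 0,\ldots,0\bigr),
\qquad
D_{2} = \operatorname{diag}\bigl(\underbrace{0,\ldots,0}_{r},\; \beta_{r+1}^{2},\ldots,\beta_{r+s}^{2},\; \underbrace{1,\ldots,1}_{t-r-s},\; 0,\ldots,0\bigr).$$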

Let \(u_{ij}\) be the \(ij\)-th entry of the matrix \(\tilde{G}^T \tilde{G}\); then

$$\operatorname{trace}\!\left( S_{b}^{L} \right) = \sum_{i=1}^{r} u_{ii} + \sum_{i=r+1}^{r+s} \alpha_{i}^{2} u_{ii} = 1,$$
(A.13)
$$\operatorname{trace}\!\left( S_{w}^{L} \right) = \sum_{i=r+1}^{r+s} \beta_{i}^{2} u_{ii} + \sum_{i=r+s+1}^{t} u_{ii}.$$
(A.14)

Since \(\alpha_i^2 + \beta_i^2 = 1\), for \( r+1 \le i \le r+s \), we have

$$\begin{aligned}
\operatorname{trace}\!\left( S_{w}^{L} \right) + \operatorname{trace}\!\left( S_{b}^{L} \right)
&= \sum_{i=1}^{r} u_{ii} + \sum_{i=r+1}^{r+s} \left( \alpha_{i}^{2} + \beta_{i}^{2} \right) u_{ii} + \sum_{i=r+s+1}^{t} u_{ii} \\
&= \sum_{i=1}^{t} u_{ii},
\end{aligned}$$

hence \(\mbox{trace} ( S_w^L)=\sum_{i=1}^t u_{ii}-\mbox{trace} ( S_b^L)=\sum_{i=1}^t u_{ii} -1\).

Therefore the original optimization (5.8) is equivalent to the following

$$\begin{aligned}
& \text{minimize} \quad \operatorname{trace}\!\left( S_{w}^{L} \right) = \sum_{i=1}^{t} u_{ii} - 1 \\
& \text{subject to} \quad \operatorname{trace}\!\left( S_{b}^{L} \right) = \sum_{i=1}^{r} u_{ii} + \sum_{i=r+1}^{r+s} \alpha_{i}^{2} u_{ii} = 1.
\end{aligned}$$
(A.15)

Now we begin the proof of Theorem 5.1. First, note that the \(u_{ii}\) are diagonal elements of the positive semi-definite matrix \(\tilde{G}^T \tilde{G}\), and hence nonnegative. Moreover, since \(\tilde{G}^T \tilde{G}\) is positive semi-definite, if \(u_{ii} = 0\) for some \(i\), then \(u_{ij} = u_{ji} = 0\) for every \(j\).

Recall that \(\tilde{G}^T \tilde{G}\) is an \(m \times m\) matrix with \(m\) diagonal entries. However, only the first \(t\) diagonal entries \(\{u_{ii}\}_{i=1}^t\) appear in the optimization problem (5.10), so the last \(m-t\) diagonal entries of \(\tilde{G}^T \tilde{G}\) do not affect it. For simplicity, we set these last \(m-t\) diagonal entries to zero, i.e., \(u_{ii}=0\) for \(i = t+1, \cdots, m\).

For \(\{u_{ii}\}_{i=r+s+1}^t\), any positive value of \(u_{ii}\) with \(r+s+1 \le i \le t\) would increase the objective function in Eq. 5.10 while leaving the constraint unchanged. Hence \(u_{ii}=0\) for \(r+s+1 \le i \le t\), and Theorem 5.1 follows.
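As a side note, the identity \(\operatorname{trace}(S_w^L) + \operatorname{trace}(S_b^L) = \sum_{i=1}^t u_{ii}\) used above is easy to verify numerically. The sketch below is ours; it builds \(D_1\) and \(D_2\) with the diagonal structure assumed in Eqs. A.13 and A.14 and a random \(\widetilde{G}\), with arbitrary placeholder dimensions.

```python
# Numerical sanity check of trace(S_w^L) + trace(S_b^L) = sum_{i=1}^t u_ii.
# D1, D2 follow the assumed GSVD diagonal structure; dimensions are arbitrary
# placeholders, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)
r, s, t, m, ell = 3, 4, 10, 15, 6                 # t >= r + s, m >= t; ell = reduced dim

theta = rng.uniform(0.1, 1.4, size=s)
alpha2, beta2 = np.cos(theta) ** 2, np.sin(theta) ** 2   # alpha_i^2 + beta_i^2 = 1

D1 = np.diag(np.concatenate([np.ones(r), alpha2, np.zeros(m - r - s)]))
D2 = np.diag(np.concatenate([np.zeros(r), beta2, np.ones(t - r - s), np.zeros(m - t)]))

G_tilde = rng.standard_normal((ell, m))           # plays the role of (X^{-1} G)^T
U = G_tilde.T @ G_tilde                           # u_ij = (G_tilde^T G_tilde)_ij

trace_b = np.trace(G_tilde @ D1 @ G_tilde.T)      # trace(S_b^L)
trace_w = np.trace(G_tilde @ D2 @ G_tilde.T)      # trace(S_w^L)

assert np.isclose(trace_b + trace_w, np.diag(U)[:t].sum())
print(trace_b + trace_w, np.diag(U)[:t].sum())
```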

Cite this article

Li, Q., Ye, J. & Kambhamettu, C. Spatial interest pixels (SIPs): useful low-level features of visual media data. Multimed Tools Appl 30, 89–108 (2006). https://doi.org/10.1007/s11042-006-0009-3
