Feature selection for high-dimensional data

  • Original Paper
  • Computational Management Science

Abstract

This paper focuses on feature selection for problems dealing with high-dimensional data. We discuss the benefits of adopting a regularized approach with L1 or L1-L2 penalties in two different applications: microarray data analysis in computational biology and object detection in computer vision. We describe general algorithmic aspects as well as architecture issues specific to the two domains. The very promising results obtained show how the proposed approach can be useful in quite different fields of application.
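
As a rough illustration of the kind of L1-L2 regularized feature selection the abstract refers to, the sketch below minimizes a least-squares loss with combined L1 and L2 penalties by iterative soft-thresholding, in the spirit of the thresholding algorithms in the reference list. It is not the authors' implementation: the function names, the 1/(2n) scaling of the loss, the parameters tau and mu, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t; entries within [-t, t] become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def l1l2_select(X, y, tau, mu=0.0, n_iter=500):
    """Proximal-gradient (iterative soft-thresholding) sketch for

        min_b  ||y - X b||^2 / (2 n) + tau * ||b||_1 + (mu / 2) * ||b||^2

    tau controls sparsity, mu controls the amount of L2 shrinkage.
    Both names and the loss scaling are assumptions of this example.
    """
    n, p = X.shape
    # Lipschitz constant of the gradient of the smooth (quadratic) part.
    lipschitz = np.linalg.norm(X, 2) ** 2 / n + mu
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n + mu * beta
        beta = soft_threshold(beta - grad / lipschitz, tau / lipschitz)
    return beta


# Synthetic "many features, few samples" problem (hypothetical data).
rng = np.random.default_rng(0)
n_samples, n_features = 40, 200
X = rng.standard_normal((n_samples, n_features))
true_beta = np.zeros(n_features)
true_beta[:3] = [3.0, -2.0, 1.5]          # only three relevant features
y = X @ true_beta + 0.01 * rng.standard_normal(n_samples)

beta = l1l2_select(X, y, tau=0.1, mu=0.01)
selected = np.flatnonzero(beta)            # indices of features kept by the model
print("number of selected features:", selected.size)
```

In this formulation a larger tau drives more coefficients exactly to zero, while a positive mu adds ridge-style shrinkage, which in elastic-net-type methods tends to retain correlated features together rather than picking one arbitrarily.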

References

  • Bertero M, Boccacci P (1998) Introduction to inverse problems in imaging. Institute of Physics Publishing, Bristol and Philadelphia

  • Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and Regression Trees. Wadsworth and Brooks, Belmont

  • Candes E, Tao T (2005) The Dantzig selector: statistical estimation when P is much larger than N

  • Chen S, Donoho D, Saunders M (1998) Atomic decomposition by basis pursuit. SIAM J Sci Comput 20(1): 33–61

  • Daubechies I, Defrise M, De Mol C (2004) An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun Pure Appl Math 57: 1413–1457

  • De Mol C, Defrise M (2002) A note on wavelet-based inversion algorithms. Contemp Math 313: 85–96

  • De Mol C, Mosci S, Traskine M, Verri A (2007) Sparsity enforcing and correlation preserving algorithm for microarray data analysis. Technical Report DISI-TR-07-04, DISI, Università di Genova

  • Destrero A, De Mol C, Odone F, Verri A (2007) A regularized approach to feature selection for face detection. Technical Report DISI-TR-07-01, DISI, Università di Genova

  • Destrero A, Odone F, Verri A (2007) A system for face detection and tracking in unconstrained environments. In: Advanced video and signal based surveillance, AVSS, London, 2007. ISBN 978-1-4244-1696-7/07

  • Efron B, Hastie T, Johnstone I, Tibshirani R (2004) Least angle regression. Ann Stat 32: 407–499

  • Engl HW, Hanke M, Neubauer A (1996) Regularization of inverse problems. Math Appl 375

  • Forman G (2003) An extensive empirical study of feature selection metrics for text classification. J Mach Learn Res 3: 1289–1306

  • Gordon GJ, Jensen RV, Hsiao L, Gullans SR, Blumenstock JE, Ramaswamy S, Richards WG, Sugarbaker DJ, Bueno R (2002) Translation of microarray data into clinically relevant cancer diagnostic tests using gene expression ratios in lung cancer and mesothelioma. Cancer Res 62: 4963–4967

  • Guyon I, Elisseeff A (2003) An introduction to variable and feature selection. J Mach Learn Res 3: 1157–1182

  • Hastie T, Tibshirani R, Friedman J (2001) The elements of statistical learning. Springer, Heidelberg

  • Heisele B, Serre T, Mukherjee S, Poggio T (2001) Feature reduction and hierarchy of classifiers for fast object detection in video images. In: IEEE proceedings of CVPR

  • Hoerl AE, Kennard R (1970) Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12: 55–67

  • Kohavi R, John G (1997) Wrappers for feature subset selection. Artif Intell 97(1-2): 273–324

  • Mohan A, Papageorgiou C, Poggio T (2001) Example-based object detection in images by components. IEEE Trans Pattern Anal Mach Intell 23(4): 349–361

  • Osuna E, Freund R, Girosi F (1997) Training support vector machines: an application to face detection. In: IEEE proceedings international conference on computer vision and pattern recognition (CVPR), pp 130–136

  • Schneiderman H, Kanade T (2000) A statistical method for 3D object detection applied to faces and cars. In: IEEE proceedings international conference on computer vision and pattern recognition (CVPR), pp 1746–1759

  • Singh D et al (2002) Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 1: 203–209

  • Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 58: 267–288

  • Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD, Lander ES (1999) Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 286: 531–537

  • Ullman S, Vidal-Naquet M, Sali E (2002) Visual features of intermediate complexity and their use in classification. Nat Neurosci 5(7): 682–687

  • Viola P, Jones MJ (2004) Robust real-time face detection. Int J Comput Vis 57(2): 137–154

  • Werbos P (1988) Backpropagation: past and future. In: Proceedings of the IEEE international conference on neural networks. IEEE Press, pp 343–353

  • Weston J, Elisseeff A, Schölkopf B, Tipping M (2003) Use of the zero-norm with linear models and kernel methods. J Mach Learn Res 3: 1439–1461

  • Yang M-H, Kriegman DJ, Ahuja N (2002) Detecting faces in images: a survey. IEEE Trans Pattern Anal Mach Intell 24(1): 34–58

  • Zhu J, Rosset S, Hastie T, Tibshirani R (2004) 1-norm support vector machines. In: Thrun S, Saul LK, Schölkopf B (eds) Advances in neural information processing systems 16. MIT Press, Cambridge, pp 49–56

  • Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B 67: 301–320

Corresponding author

Correspondence to Augusto Destrero.

About this article

Cite this article

Destrero, A., Mosci, S., De Mol, C. et al. Feature selection for high-dimensional data. Comput Manag Sci 6, 25–40 (2009). https://doi.org/10.1007/s10287-008-0070-7

  • DOI: https://doi.org/10.1007/s10287-008-0070-7
