A VNS-Based Heuristic for Feature Selection in Data Mining

Mucherino, A.; Liberti, L.

doi:10.1007/978-3-642-30671-6_13


Part of the book series: Studies in Computational Intelligence (SCI, volume 434)


Abstract

Selecting the features that describe the samples in a data set is a typical problem in data mining. A crucial issue is to select a maximal set of pertinent features, because limited knowledge of the problem under study often leads one to include features that do not describe the corresponding samples well. The concept of consistent biclustering of a set of data has been introduced to identify such a maximal set. The problem can be modeled as a 0–1 linear fractional program, which is NP-hard. We reformulate this optimization problem as a bilevel program, and we prove that solutions to the original problem can be found by solving the reformulated problem. We also propose a heuristic for solving the bilevel program that is based on the Variable Neighborhood Search (VNS) meta-heuristic. Computational experiments show that the proposed heuristic outperforms previously proposed heuristics for feature selection by consistent biclustering.
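For readers unfamiliar with VNS, the following is a minimal sketch of the generic scheme (shaking, local search, neighborhood change) applied to a 0-1 selection vector. It is not the heuristic proposed in the chapter and does not model the bilevel program or the consistent biclustering constraints. All names (`shake`, `local_search`, `vns`, `toy_score`) and the toy objective, which rewards the number of selected features and penalizes two hypothetical "noisy" features, are assumptions made for illustration only.

```python
import random
from typing import Callable, List

Score = Callable[[List[int]], float]


def shake(x: List[int], k: int, rng: random.Random) -> List[int]:
    """Return a copy of x with k distinct, randomly chosen bits flipped."""
    y = x[:]
    for i in rng.sample(range(len(x)), k):
        y[i] = 1 - y[i]
    return y


def local_search(x: List[int], score: Score) -> List[int]:
    """First-improvement ascent over single-bit flips (maximizes score)."""
    best, best_val = x[:], score(x)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            cand = best[:]
            cand[i] = 1 - cand[i]
            val = score(cand)
            if val > best_val:
                best, best_val = cand, val
                improved = True
                break  # restart the scan from the new incumbent
    return best


def vns(x0: List[int], score: Score, k_max: int = 3,
        max_iter: int = 100, seed: int = 0) -> List[int]:
    """Basic VNS: widen the shaking neighborhood until an improvement is found."""
    rng = random.Random(seed)
    best = local_search(x0, score)
    best_val = score(best)
    for _ in range(max_iter):
        k = 1
        while k <= min(k_max, len(best)):
            cand = local_search(shake(best, k, rng), score)
            val = score(cand)
            if val > best_val:
                best, best_val = cand, val
                k = 1   # improvement: return to the smallest neighborhood
            else:
                k += 1  # no improvement: try a larger neighborhood
    return best


def toy_score(x: List[int]) -> float:
    """Hypothetical objective: count selected features, penalize noisy ones."""
    noisy = {1, 4}  # assumed indices of features that break feasibility
    bad = sum(x[i] for i in noisy)
    return -bad if bad else sum(x)


if __name__ == "__main__":
    print(vns([0] * 6, toy_score))
```

In this toy setting the search should end up selecting every feature except the two assumed noisy ones. A real application would replace `toy_score` with an evaluation of the biclustering consistency conditions, which the abstract does not spell out.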



Author information

Authors and Affiliations

IRISA, University of Rennes, Rennes, France
A. Mucherino
LIX, École Polytechnique, Palaiseau, France
L. Liberti


Corresponding author

Correspondence to A. Mucherino.

Editor information

Editors and Affiliations

Cité Scientifique, University of Lille 1, Bât. M3, Villeneuve d'Ascq, 59655, France
El-Ghazali Talbi


About this chapter

Cite this chapter

Mucherino, A., Liberti, L. (2013). A VNS-Based Heuristic for Feature Selection in Data Mining. In: Talbi, E.-G. (ed.) Hybrid Metaheuristics. Studies in Computational Intelligence, vol 434. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30671-6_13


DOI: https://doi.org/10.1007/978-3-642-30671-6_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30670-9
Online ISBN: 978-3-642-30671-6
eBook Packages: Engineering, Engineering (R0)
