Feature Selection Using Approximate Multivariate Markov Blankets

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9648)

Abstract

In classification tasks, feature selection has become an important research area. In general, the performance of a classifier is intrinsically affected by the existence of irrelevant and redundant features. Markov blanket discovery can be used to identify an optimal subset of features. The Approximate Markov blanket (AMb) is a standard approach to inducing Markov blankets from data. However, this approach considers only pairwise comparisons of features. In this paper, we introduce a multivariate approach to the AMb definition, called the Approximate Multivariate Markov blanket (AMMb), which takes into account interactions among different features of a given subset. To test the AMMb, we consider a backward strategy similar to the Fast Correlation Based Filter (FCBF), which incorporates our proposal. The resulting algorithm, named FCBF\(_{ntc}\), is compared against FCBF, Best First (BF) and Sequential Forward Selection (SFS) and tested on both synthetic and real-world datasets. Results show that including interactions among the features in a subset may yield smaller subsets of features without degrading classification performance.
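For readers unfamiliar with the pairwise criterion that the abstract contrasts with the multivariate proposal, the following is a minimal sketch of the approximate Markov blanket test as it is commonly stated for FCBF: a feature F_i is taken to form an approximate Markov blanket for F_j when SU(F_i, C) >= SU(F_j, C) and SU(F_i, F_j) >= SU(F_j, C), where SU is symmetrical uncertainty and C is the class. The function names and toy data are illustrative assumptions and are not taken from the paper. The sketch does not implement the AMMb itself, whose definition is not given in this abstract.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = hx + hy - hxy
    return 2.0 * mutual_info / (hx + hy)


def is_approx_markov_blanket(f_i, f_j, target):
    """Pairwise test (hypothetical helper): does f_i cover f_j?

    True when f_i is at least as relevant to the class as f_j and
    f_i is at least as correlated with f_j as f_j is with the class.
    """
    su_i_c = symmetrical_uncertainty(f_i, target)
    su_j_c = symmetrical_uncertainty(f_j, target)
    su_i_j = symmetrical_uncertainty(f_i, f_j)
    return su_i_c >= su_j_c and su_i_j >= su_j_c


# Toy data (made up for illustration): f_a copies the class exactly,
# f_b is a noisy copy with the last two values flipped.
target = [0, 0, 1, 1, 0, 1, 0, 1]
f_a = [0, 0, 1, 1, 0, 1, 0, 1]
f_b = [0, 0, 1, 1, 0, 1, 1, 0]

print(is_approx_markov_blanket(f_a, f_b, target))
print(is_approx_markov_blanket(f_b, f_a, target))
```

Because the test only ever looks at two features at a time, it cannot detect a feature that becomes redundant (or relevant) only in combination with several others, which is the limitation the abstract says the multivariate definition is meant to address.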

Acknowledgment

This work has been partially supported by the project TIN2015-64776-C3-2-R. Miguel García-Torres acknowledges the financial support of CONACyT-Paraguay (14-VIN-009). Christian E. Schaerer acknowledges PRONII-CONACyT-Paraguay. Part of the computer time was provided by the Centro Informático Científico de Andalucía (CIC).

Author information

Corresponding author

Correspondence to Miguel García-Torres.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Arias-Michel, R., García-Torres, M., Schaerer, C., Divina, F. (2016). Feature Selection Using Approximate Multivariate Markov Blankets. In: Martínez-Álvarez, F., Troncoso, A., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2016. Lecture Notes in Computer Science, vol 9648. Springer, Cham. https://doi.org/10.1007/978-3-319-32034-2_10

  • DOI: https://doi.org/10.1007/978-3-319-32034-2_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-32033-5

  • Online ISBN: 978-3-319-32034-2

  • eBook Packages: Computer Science (R0)
