Parallelizing Feature Selection

Algorithmica

Abstract

Classification is a key problem in machine learning and data mining. Classification algorithms predict the class of a new instance after being trained on data that represents past experience in classifying instances. However, a large number of features in the training data can hurt the classification capacity of a machine learning algorithm. The Feature Selection problem involves discovering a subset of features such that a classifier built only with this subset attains predictive accuracy no worse than a classifier built from the entire set of features. Several algorithms have been proposed to solve this problem. In this paper we discuss how parallelism can be used to improve the performance of feature selection algorithms. In particular, we present, discuss and evaluate a coarse-grained parallel version of the feature selection algorithm FortalFS. This algorithm performs well compared with other solutions, and it has certain characteristics that make it a good candidate for parallelization. Our parallel design is based on the master-slave design pattern. Promising results show that this approach is able to achieve near-optimum speedups in the context of Amdahl's Law.
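
To make the coarse-grained master-slave idea and the Amdahl's Law bound concrete, here is a minimal Python sketch. It is not FortalFS and is not taken from the paper: the exhaustive enumeration of candidate subsets, the leave-one-out 1-nearest-neighbour scorer, and the names `amdahl_speedup`, `evaluate_subset` and `master` are all assumptions chosen for illustration. The master hands out whole subset evaluations as independent tasks and keeps the best result. Threads are used only to keep the sketch portable; for CPU-bound scoring in CPython, a process pool or separate machines would be needed to see real speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's Law: upper bound on speedup when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def evaluate_subset(subset, rows, labels):
    """Hypothetical scorer: leave-one-out 1-NN accuracy using only `subset` features."""
    correct = 0
    for i, row in enumerate(rows):
        best_j, best_dist = None, float("inf")
        for j, other in enumerate(rows):
            if i == j:
                continue
            dist = sum((row[f] - other[f]) ** 2 for f in subset)
            if dist < best_dist:
                best_dist, best_j = dist, j
        correct += labels[best_j] == labels[i]
    return correct / len(rows)


def master(rows, labels, subset_size, workers=4):
    """Master: generate candidate subsets, farm out evaluations, keep the best."""
    n_features = len(rows[0])
    # Assumption: exhaustive enumeration stands in for the real search strategy.
    candidates = list(combinations(range(n_features), subset_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each task is one full subset evaluation (coarse-grained parallelism).
        scores = list(pool.map(lambda s: evaluate_subset(s, rows, labels), candidates))
    best_score, best_subset = max(zip(scores, candidates), key=lambda pair: pair[0])
    return best_subset, best_score


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 is noise.
    rows = [(0.0, 5.0), (0.1, 1.0), (0.2, 9.0), (1.0, 5.1), (1.1, 0.9), (1.2, 9.1)]
    labels = [0, 0, 0, 1, 1, 1]
    print(master(rows, labels, subset_size=1))
    print(amdahl_speedup(parallel_fraction=0.9, workers=10))
```

Because each subset evaluation is independent, the only serial work in this sketch is candidate generation and the final comparison. Under Amdahl's Law that serial share is what limits the achievable speedup as workers are added.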

Author information

Correspondence to Jerffeson Teixeira de Souza, Stan Matwin or Nathalie Japkowicz.

About this article

Cite this article

de Souza, J., Matwin, S. & Japkowicz, N. Parallelizing Feature Selection. Algorithmica 45, 433–456 (2006). https://doi.org/10.1007/s00453-006-1220-3


  • DOI: https://doi.org/10.1007/s00453-006-1220-3
