poster

A new algorithm for data discretization and feature selection

Authors:
Marcela Xavier Ribeiro, University of São Paulo at São Carlos - Brazil
Agma J. M. Traina, University of São Paulo at São Carlos - Brazil
Caetano Traina, University of São Paulo at São Carlos - Brazil
SAC '08: Proceedings of the 2008 ACM symposium on Applied computing, March 2008, Pages 953–954, https://doi.org/10.1145/1363686.1363905

Published: 16 March 2008

ABSTRACT

Data discretization and feature selection are two important tasks that can be performed before the learning phase of data mining algorithms, and both can significantly reduce the processing effort of the learning algorithm. In this paper, we present a new data preprocessing algorithm called Omega, which performs data discretization and feature selection simultaneously. We ran experiments to evaluate the effect of Omega's preprocessing on the results of the C4.5 algorithm (a well-known decision-tree-based classifier). The results indicate that Omega is well suited to both data discretization and feature selection, making it appropriate for data preprocessing.
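
The abstract does not describe how Omega works internally, so the following is not the Omega algorithm. It is only a minimal sketch, under our own assumptions, of how a single pass can both discretize numeric features and select among them: cut points are placed where the class label changes along a sorted feature, and a feature that ends up with no cut points (a single interval) is dropped. The function names `class_boundary_cuts` and `discretize_and_select`, and the selection rule itself, are hypothetical illustrations.

```python
from bisect import bisect_right


def class_boundary_cuts(values, labels):
    """Return cut points placed midway between adjacent distinct values
    whose observed sets of class labels differ (illustrative rule only)."""
    # Collect the class labels seen at each distinct value.
    labels_at = {}
    for v, y in zip(values, labels):
        labels_at.setdefault(v, set()).add(y)
    distinct = sorted(labels_at)
    cuts = []
    for lo, hi in zip(distinct, distinct[1:]):
        if labels_at[lo] != labels_at[hi]:
            cuts.append((lo + hi) / 2)
    return cuts


def discretize_and_select(rows, labels):
    """Discretize every column and keep only columns with at least one cut."""
    kept = {}
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        cuts = class_boundary_cuts(column, labels)
        if cuts:  # no cuts means one interval, so the feature is dropped
            kept[j] = cuts
    # Replace each kept value by the index of the interval it falls into.
    table = [[bisect_right(kept[j], row[j]) for j in kept] for row in rows]
    return kept, table


# Feature 0 separates the two classes; feature 1 is constant.
rows = [[1.0, 5.0], [2.0, 5.0], [8.0, 5.0], [9.0, 5.0]]
labels = ["a", "a", "b", "b"]
kept, table = discretize_and_select(rows, labels)
print(kept)   # {0: [5.0]}
print(table)  # [[0], [0], [1], [1]]
```

A rule this simple only removes features that carry no class boundary at all; a noisy feature would receive many cut points and be kept, which is the kind of case a real preprocessing algorithm has to handle with a stronger merging or consistency criterion.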

Index Terms

1. Information systems
  1. Information systems applications
    1. Data mining

Published in
SAC '08: Proceedings of the 2008 ACM symposium on Applied computing
March 2008
2586 pages
ISBN: 9781595937537
DOI: 10.1145/1363686
Conference Chairs:
Roger L. Wainwright, University of Tulsa
Hisham M. Haddad, Kennesaw State University
Copyright © 2008 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
data discretization
data pre-processing
feature selection
Qualifiers
- poster
Acceptance Rates
Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%