Article, DOI: 10.1145/1183614.1183736

Combining feature selectors for text classification

Published: 06 November 2006

Abstract

We introduce several methods of combining feature selectors for text classification. Results from a large investigation of these combinations are summarized. Easily constructed combinations of feature selectors are shown to improve peak R-precision and F1 at statistically significant levels.
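
The abstract does not say which combination methods were studied, so the following is only a minimal sketch of one plausible scheme: averaging the ranks that several feature selectors assign to each term and keeping the top k. The function names, the toy scores, and the mean-rank rule itself are assumptions for illustration, not the authors' method. The two helpers at the end show the standard definitions of the evaluation measures named in the abstract, R-precision and F1.

```python
from typing import Dict, List, Sequence, Set


def ranks_from_scores(scores: Dict[str, float]) -> Dict[str, int]:
    """Map each term to its rank (1 = highest score); ties broken alphabetically."""
    ordered = sorted(scores, key=lambda t: (-scores[t], t))
    return {term: i + 1 for i, term in enumerate(ordered)}


def combine_by_mean_rank(selectors: Sequence[Dict[str, float]], k: int) -> List[str]:
    """Keep the k terms with the best (lowest) mean rank across all selectors.

    Assumption: this mean-rank rule is a hypothetical combiner, not one
    confirmed by the paper.
    """
    rank_tables = [ranks_from_scores(s) for s in selectors]
    terms = set(rank_tables[0])
    for table in rank_tables[1:]:
        terms &= set(table)  # only terms scored by every selector
    mean_rank = {
        t: sum(table[t] for table in rank_tables) / len(rank_tables) for t in terms
    }
    return sorted(terms, key=lambda t: (mean_rank[t], t))[:k]


def r_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Precision at rank R, where R is the number of relevant items."""
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for item in ranked[:r] if item in relevant) / r


def f1(predicted: Set[str], actual: Set[str]) -> float:
    """Harmonic mean of precision and recall for one category."""
    tp = len(predicted & actual)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(actual)
    return 2 * precision * recall / (precision + recall)


# Toy scores from two hypothetical selectors (all values are made up).
chi2_scores = {"goal": 9.0, "match": 7.5, "the": 0.2, "tax": 3.1}
info_gain_scores = {"goal": 0.40, "match": 0.10, "the": 0.01, "tax": 0.35}
selected = combine_by_mean_rank([chi2_scores, info_gain_scores], k=2)
```

Ranks are used here instead of raw scores because different selectors produce scores on incomparable scales; other combiners (for example, taking the union or intersection of each selector's top-k list) would fit the same interface.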

      Published In

      CIKM '06: Proceedings of the 15th ACM international conference on Information and knowledge management
      November 2006
      916 pages
      ISBN:1595934332
      DOI:10.1145/1183614
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. feature selection
      2. text classification

      Qualifiers

      • Article

      Conference

      CIKM06: Conference on Information and Knowledge Management
      November 6 - 11, 2006
      Arlington, Virginia, USA

      Acceptance Rates

      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)3
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 05 Mar 2025

      Citations

      Cited By

      • (2024) Ensemble learning-based stability improvement method for feature selection towards performance prediction. Journal of Manufacturing Systems 74(55-67). DOI: 10.1016/j.jmsy.2024.03.001. Online publication date: Jun-2024
      • (2024) A novel framework for multi-label feature selection: integrating mutual information and Pythagorean fuzzy CRADIS. Granular Computing 9:3. DOI: 10.1007/s41066-024-00489-z. Online publication date: 26-Jun-2024
      • (2023) Vote-Based Feature Selection Method for Stratigraphic Recognition in Tunnelling Process of Shield Machine. Chinese Journal of Mechanical Engineering 36:1. DOI: 10.1186/s10033-023-00932-3. Online publication date: 24-Oct-2023
      • (2023) Ensemble Feature Selection Framework for Paddy Yield Prediction in Cauvery Basin using Machine Learning Classifiers. Cogent Engineering 10:2. DOI: 10.1080/23311916.2023.2250061. Online publication date: 29-Aug-2023
      • (2022) Feature selection techniques for microarray datasets: a comprehensive review, taxonomy, and future directions (微阵列数据集的特征选择技术: 综合评述、分类和未来方向). Frontiers of Information Technology & Electronic Engineering 23:10(1451-1478). DOI: 10.1631/FITEE.2100569. Online publication date: 24-Oct-2022
      • (2022) A reliable and efficient machine learning pipeline for American sign language gesture recognition using EMG sensors. Multimedia Tools and Applications 82:15(23833-23871). DOI: 10.1007/s11042-022-14117-y. Online publication date: 17-Nov-2022
      • (2022) A DDoS Detection Method with Feature Set Dimension Reduction. Mobile Internet Security (365-378). DOI: 10.1007/978-981-16-9576-6_25. Online publication date: 22-Jan-2022
      • (2022) Ensemble feature selection for multi‐label text classification: An intelligent order statistics approach. International Journal of Intelligent Systems 37:12(11319-11341). DOI: 10.1002/int.23044. Online publication date: 2-Sep-2022
      • (2021) A Novel Fault Diagnosis Method Based on Ensemble Feature Selection in The Industrial IoT Scenario. 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC) (3324-3329). DOI: 10.1109/SMC52423.2021.9658901. Online publication date: 17-Oct-2021
      • (2020) Data assessment and prioritization in mobile networks for real-time prediction of spatial information using machine learning. EURASIP Journal on Wireless Communications and Networking 2020:1. DOI: 10.1186/s13638-020-01709-1. Online publication date: 11-May-2020
