DOI: 10.1145/2487788.2487903
poster

An effective class-centroid-based dimension reduction method for text classification

Published: 13 May 2013

Abstract

Motivated by the effectiveness of centroid-based text classification techniques, we propose a classification-oriented, class-centroid-based dimension reduction (DR) method called CentroidDR. CentroidDR projects high-dimensional documents into a low-dimensional space spanned by the class centroids. In this space, the centroid-based classifier is essentially CentroidDR followed by a simple linear classifier. Replacing that linear classifier with other classification techniques, such as K-Nearest Neighbor (KNN) classifiers, yields much more effective text classification algorithms. Although CentroidDR is simple, non-parametric, and runs in linear time, preliminary experimental results show that it can improve classifier accuracy and performs better than general DR methods such as Latent Semantic Indexing (LSI).
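The idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out how the projection is computed, so the code below assumes one plausible reading: each document is represented by its cosine similarity to every class centroid, giving one reduced feature per class. Under that assumption, taking the argmax over the reduced features reproduces a centroid-based classifier, and a KNN vote in the reduced space stands in for the replacement classifier. All function names and the toy term-count data are hypothetical and are not taken from the paper.

```python
import numpy as np

def l2_normalize(X):
    # Scale each row to unit length; all-zero rows are left unchanged.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms == 0, 1.0, norms)

def fit_centroids(X, y):
    # One centroid per class: the mean of that class's document vectors.
    classes = np.unique(y)
    centroids = np.vstack([X[y == c].mean(axis=0) for c in classes])
    return classes, l2_normalize(centroids)

def centroid_dr(X, centroids):
    # Assumed projection: cosine similarity of each document to each centroid.
    # Output shape is (n_documents, n_classes).
    return l2_normalize(X) @ centroids.T

def centroid_classify(Z, classes):
    # Centroid-based classifier in the reduced space: pick the most similar centroid.
    return classes[np.argmax(Z, axis=1)]

def knn_classify(Z_train, y_train, Z_test, k=3):
    # Plain majority-vote KNN with Euclidean distance in the reduced space.
    dists = np.linalg.norm(Z_test[:, None, :] - Z_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    preds = []
    for row in nearest:
        labels, counts = np.unique(y_train[row], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Toy term-count matrix: 4 training documents, 4 terms, 2 classes (made-up data).
X_train = np.array([[3, 1, 0, 0],
                    [2, 2, 0, 0],
                    [0, 0, 2, 3],
                    [0, 1, 1, 2]], dtype=float)
y_train = np.array([0, 0, 1, 1])
X_test = np.array([[1, 1, 0, 0],
                   [0, 0, 1, 1]], dtype=float)

classes, centroids = fit_centroids(X_train, y_train)
Z_train = centroid_dr(X_train, centroids)  # 4 terms reduced to 2 features
Z_test = centroid_dr(X_test, centroids)

print(centroid_classify(Z_test, classes))
print(knn_classify(Z_train, y_train, Z_test, k=3))
```

In this sketch the reduced dimensionality equals the number of classes, and both fitting the centroids and projecting a document take a single pass over the term vectors, which is consistent with the linear-time claim in the abstract.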


    Published In

    WWW '13 Companion: Proceedings of the 22nd International Conference on World Wide Web
    May 2013
    1636 pages
    ISBN:9781450320382
    DOI:10.1145/2487788
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Sponsors

    • NICBR: Núcleo de Informação e Coordenação do Ponto BR
    • CGIBR: Comitê Gestor da Internet no Brasil

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. class centroid
    2. dimension reduction
    3. text classification

    Qualifiers

    • Poster

    Conference

    WWW '13
    WWW '13: 22nd International World Wide Web Conference
    May 13 - 17, 2013
    Rio de Janeiro, Brazil

    Acceptance Rates

    WWW '13 Companion paper acceptance rate: 831 of 1,250 submissions (66%)
    Overall acceptance rate: 1,899 of 8,196 submissions (23%)
