DOI: 10.1145/544220.544237
Article

An approach to automatic classification of text for information retrieval

Published: 14 July 2002

ABSTRACT

In this paper, we explore an approach to making better use of semi-structured documents for information retrieval in the domain of biology. Using machine learning techniques, we make the inherent structure of those documents explicit with XML markup. This markup has great potential to improve task performance in specimen identification and the usability of online floras and faunas.
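
The abstract does not say which learning algorithm or tag set the authors use, so the following is only a minimal sketch of the general idea: classify each clause of a semi-structured description and wrap it in an XML element named after the predicted class. It assumes a small multinomial naive Bayes classifier as a stand-in technique. The labels (`leaf`, `flower`, `stem`), the training clauses, and the function names are all hypothetical and invented for illustration.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical training clauses and element labels, invented for illustration.
TRAIN = [
    ("leaves alternate, blades ovate", "leaf"),
    ("leaf margins serrate, petioles short", "leaf"),
    ("flowers white, petals five", "flower"),
    ("petals yellow, sepals persistent", "flower"),
    ("stems erect, branched", "stem"),
    ("stems creeping, rooting at nodes", "stem"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(examples):
    """Collect per-label word counts, label counts, and the vocabulary."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def mark_up(description, model):
    """Split a description on semicolons and tag each clause with its class."""
    root = ET.Element("description")
    for clause in description.split(";"):
        clause = clause.strip()
        if clause:
            ET.SubElement(root, classify(clause, model)).text = clause
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    model = train(TRAIN)
    print(mark_up("Stems erect; leaves ovate, margins serrate; petals white", model))
```

Once clauses carry explicit element names, a retrieval system can restrict a query to one part of a description (for example, only leaf characters), which is the kind of benefit the abstract points to for specimen identification.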

Published in

JCDL '02: Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries
July 2002
448 pages
ISBN: 1581135130
DOI: 10.1145/544220

Copyright © 2002 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

JCDL '02 paper acceptance rate: 69 of 240 submissions, 29%. Overall acceptance rate: 415 of 1,482 submissions, 28%.
