DOI: 10.1145/2505515.2507833
poster

Timeline adaptation for text classification

Published: 27 October 2013

ABSTRACT

In this paper, we address the text classification problem in which the test data come from a period of time different from that of the training data, and present a method for text classification based on temporal adaptation. We first apply lexical chains to the training data to collect semantically related terms into sets (we call these Sem sets). Semantically related terms in the documents are replaced with their representative term. From the results, we identify short terms that are salient for a specific period of time. Finally, we train SVM classifiers by applying a temporal weighting function to each selected short term within the training data, and classify the test data. The temporal weighting function weights each short term in the training data according to the temporal distance between the training and test data. The results using MedLine data show that the method is comparable to the current state-of-the-art biased-SVM method, and that it is especially effective when testing on data far from the training data.
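The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the form of the temporal weighting function, so the exponential decay used here is an assumption, as are the hand-written `SEM_SETS` (in the paper these come from lexical chains), the `decay` parameter, and all function names.

```python
import math
from collections import Counter

# Hypothetical Sem sets: representative term -> semantically related terms.
# In the paper these are built with lexical chains; here they are hand-written.
SEM_SETS = {
    "tumor": {"tumor", "tumour", "neoplasm"},
    "drug": {"drug", "medication", "pharmaceutical"},
}

# Inverted lookup table: term -> representative term.
REPRESENTATIVE = {t: rep for rep, terms in SEM_SETS.items() for t in terms}


def normalize(tokens):
    """Replace each token by its Sem-set representative, if it has one."""
    return [REPRESENTATIVE.get(tok, tok) for tok in tokens]


def temporal_weight(train_year, test_year, decay=0.5):
    """Assumed weighting: exponential decay in the temporal distance."""
    return math.exp(-decay * abs(test_year - train_year))


def weighted_features(tokens, train_year, test_year, short_terms, decay=0.5):
    """Term counts in which only the short terms are scaled by the weight."""
    counts = Counter(normalize(tokens))
    w = temporal_weight(train_year, test_year, decay)
    return {t: c * w if t in short_terms else float(c) for t, c in counts.items()}


# A training document from 2003 prepared for test data from 2005.
features = weighted_features(["tumour", "neoplasm", "therapy"], 2003, 2005, {"tumor"})
```

Under these assumptions, the count of the short term `tumor` shrinks as the training document moves further in time from the test data, while `therapy` keeps its raw count. The resulting feature dictionaries could then be vectorized and passed to any SVM trainer.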


Published in

CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
October 2013
2612 pages
ISBN: 9781450322638
DOI: 10.1145/2505515

Copyright © 2013 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

CIKM '13 Paper Acceptance Rate: 143 of 848 submissions, 17%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
