DOI: 10.1145/1363686.1363893
research-article

Using unlabeled data to handle domain-transfer problem of semantic detection

Published: 16 March 2008

ABSTRACT

Due to their highly domain-specific nature, supervised sentiment classifiers typically require a large amount of newly labeled training data when transferred to another domain. This is the so-called domain-transfer problem. In this work, we attempt to tackle this problem by combining old-domain labeled examples with new-domain unlabeled ones. The basic idea is to use the old-domain-trained classifier to label some informative unlabeled examples in the new domain, and then train the base classifier again. The experimental results demonstrate that the proposed method dramatically boosts the accuracy of the base sentiment classifier on the new domain.
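
The label-then-retrain loop described in the abstract can be illustrated with a minimal self-training sketch. The abstract does not specify the base classifier, how "informative" examples are chosen, or how many rounds are run, so everything below is an assumption made for illustration: a bag-of-words centroid classifier, selection by largest similarity margin, and the hypothetical names `self_train`, `rounds`, and `per_round`. The toy reviews are invented and are not data from the paper.

```python
from collections import Counter
import math


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(t, 0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def train_centroids(examples):
    """Sum the term vectors of each class into one centroid per label."""
    centroids = {}
    for text, label in examples:
        centroids.setdefault(label, Counter()).update(vectorize(text))
    return centroids


def predict(centroids, text):
    """Return (label, margin): the best label and its lead over the runner-up."""
    vec = vectorize(text)
    scores = sorted(((cosine(vec, c), label) for label, c in centroids.items()),
                    reverse=True)
    best_score, best_label = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    return best_label, best_score - runner_up


def self_train(old_labeled, new_unlabeled, rounds=3, per_round=2):
    """Assumed procedure: pseudo-label the most confident new-domain texts, retrain."""
    labeled = list(old_labeled)
    pool = list(new_unlabeled)
    centroids = train_centroids(labeled)  # classifier trained on the old domain only
    for _ in range(rounds):
        if not pool:
            break
        # Rank the remaining new-domain texts by prediction margin (an assumption
        # standing in for the abstract's unspecified notion of "informative").
        scored = sorted(((predict(centroids, t), t) for t in pool),
                        key=lambda item: item[0][1], reverse=True)
        chosen = scored[:per_round]
        labeled.extend((text, label) for (label, _), text in chosen)
        chosen_texts = {text for _, text in chosen}
        pool = [t for t in pool if t not in chosen_texts]
        centroids = train_centroids(labeled)  # train the base classifier again
    return centroids


# Invented toy data: movie reviews (old domain) and product reviews (new domain).
old_labeled = [("great plot excellent acting", "pos"),
               ("boring plot terrible acting", "neg")]
new_unlabeled = ["excellent battery great screen",
                 "terrible battery boring design",
                 "great screen sturdy",
                 "terrible screen flimsy"]

baseline = train_centroids(old_labeled)
adapted = self_train(old_labeled, new_unlabeled)
print(predict(baseline, "flimsy design"))
print(predict(adapted, "flimsy design"))
```

In this toy setting the old-domain classifier shares no vocabulary with "flimsy design" and so has no basis for a decision, while the retrained classifier has picked up new-domain terms from the pseudo-labeled reviews. A margin-based selection rule of this kind can also reinforce early mistakes, which is why the choice of which unlabeled examples to trust matters.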


        Published in

          SAC '08: Proceedings of the 2008 ACM symposium on Applied computing
          March 2008
          2586 pages
          ISBN: 9781595937537
          DOI: 10.1145/1363686

          Copyright © 2008 ACM


          Publisher

          Association for Computing Machinery

          New York, NY, United States


          Acceptance Rates

          Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%
