research-article

Using unlabeled data to handle domain-transfer problem of semantic detection

Authors:
Songbo Tan, Chinese Academy of Sciences, China
Yuefen Wang, Chinese Academy of Geological Sciences, China
Gaowei Wu, Chinese Academy of Sciences, China
Xueqi Cheng, Chinese Academy of Sciences, China

SAC '08: Proceedings of the 2008 ACM symposium on Applied computing, March 2008, Pages 896–903, https://doi.org/10.1145/1363686.1363893

Published: 16 March 2008

ABSTRACT

Because supervised sentiment classifiers are highly domain-specific, they typically require a large amount of new labeled training data when transferred to another domain. This is the so-called domain-transfer problem. In this work, we attempt to tackle this problem by combining old-domain labeled examples with new-domain unlabeled ones. The basic idea is to use the old-domain-trained classifier to label some informative unlabeled examples in the new domain, and then retrain the base classifier. The experimental results demonstrate that the proposed method dramatically boosts the accuracy of the base sentiment classifier on the new domain.
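
The loop described in the abstract can be illustrated with a minimal self-training sketch. This is not the authors' implementation: the abstract does not name the base classifier or define which unlabeled examples count as "informative", so the sketch assumes a nearest-centroid bag-of-words classifier and treats the examples labeled with the largest score margin as the ones to add. All function names, parameters, and the toy review data are hypothetical.

```python
from collections import Counter
import math


def vectorize(text):
    """Bag-of-words term counts for a whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(examples):
    """Assumed base classifier: one summed term-count vector per label."""
    centroids = {}
    for text, label in examples:
        centroids.setdefault(label, Counter()).update(vectorize(text))
    return centroids


def predict(centroids, text):
    """Return (label, margin), where margin = best score - runner-up score."""
    vec = vectorize(text)
    scores = sorted(
        ((cosine(vec, c), label) for label, c in centroids.items()),
        reverse=True,
    )
    best_score, best_label = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    return best_label, best_score - runner_up


def self_train(old_labeled, new_unlabeled, per_round=2, rounds=3):
    """Label new-domain examples with the current model, then retrain.

    Assumption: "informative" is approximated by the largest margin.
    """
    training = list(old_labeled)
    pool = list(new_unlabeled)
    centroids = train_centroids(training)
    for _ in range(rounds):
        if not pool:
            break
        scored = [(predict(centroids, text), text) for text in pool]
        scored.sort(key=lambda item: item[0][1], reverse=True)
        chosen = scored[:per_round]
        # Add the machine-labeled new-domain examples to the training set.
        training.extend((text, label) for (label, _), text in chosen)
        chosen_texts = {text for _, text in chosen}
        pool = [text for text in pool if text not in chosen_texts]
        centroids = train_centroids(training)
    return centroids


# Toy data: old domain = movie reviews, new domain = electronics reviews.
old_labeled = [
    ("great movie wonderful acting", "pos"),
    ("boring movie terrible plot", "neg"),
]
new_unlabeled = [
    "great battery wonderful screen",
    "terrible battery boring screen",
    "reliable battery sharp screen great",
    "flimsy screen terrible battery",
]

base = train_centroids(old_labeled)
model = self_train(old_labeled, new_unlabeled)
print(predict(base, "flimsy screen"))
print(predict(model, "flimsy screen"))
```

On this toy data the old-domain classifier has no vocabulary overlap with "flimsy screen", so its margin is zero, while the retrained model has picked up new-domain terms from the examples it labeled itself. Selecting by margin is only one possible reading of "informative"; a wrong early label can also propagate through later rounds, which is why the number of examples added per round is kept small here.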

Index Terms

Using unlabeled data to handle domain-transfer problem of semantic detection
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
  2. Machine learning
2. Information systems
  1. Information retrieval

Published in
SAC '08: Proceedings of the 2008 ACM symposium on Applied computing
March 2008
2586 pages
ISBN: 9781595937537
DOI: 10.1145/1363686
Conference Chairs:
Roger L. Wainwright, University of Tulsa
Hisham M. Haddad, Kennesaw State University
Copyright © 2008 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
information retrieval
opinion mining
sentiment classification
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%

Article Metrics
- Total Citations: 7
- Total Downloads: 256
- Downloads (Last 12 months): 0
- Downloads (Last 6 weeks): 0
