Abstract
In traditional text classification approaches, the semantic meaning of each class is conveyed by labeled documents. Because labeling documents is often time-consuming and expensive, a promising alternative is to ask users to provide a few keywords that describe each class instead of labeling any documents. However, a short list of keywords may not carry enough information and may therefore lead to an unreliable classifier. Fortunately, a large amount of public data is readily available in web directories such as ODP and Wikipedia. We are interested in exploiting the crowd intelligence contained in such public data to enhance text classification. In this paper, we propose a novel text classification framework called “Knowledge Supervised Learning” (KSL), which uses the knowledge in keywords together with crowd intelligence to learn a classifier without any labeled documents. We design a two-stage risk minimization (TSRM) approach for the KSL problem, which optimizes the expected prediction risk and builds a high-quality classifier. Empirical results verify our claim: our algorithm achieves a Micro-F1 above 0.9 on average, which is much better than the baselines and even comparable to an SVM classifier trained on labeled documents.
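For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of how Micro-F1 is commonly computed for single-label, multi-class predictions. It is not taken from the paper: the function name `micro_f1` and the example labels are assumptions made for illustration, and the paper's exact evaluation protocol may differ.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label, multi-class predictions.

    Illustrative helper (hypothetical name, not from the paper).
    True positives, false positives and false negatives are pooled
    over all classes before precision and recall are computed.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = fn = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp += 1  # correct prediction for class `truth`
        else:
            fp += 1  # class `guess` receives a false positive
            fn += 1  # class `truth` receives a false negative

    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels, purely for illustration.
y_true = ["sports", "sports", "politics", "tech"]
y_pred = ["sports", "politics", "politics", "tech"]
score = micro_f1(y_true, y_pred)
```

In the single-label setting every misclassification counts once as a false positive and once as a false negative, so the pooled precision and recall coincide and Micro-F1 equals plain accuracy.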
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, C., Xue, GR., Yu, Y. (2008). Knowledge Supervised Text Classification with No Labeled Documents. In: Ho, TB., Zhou, ZH. (eds) PRICAI 2008: Trends in Artificial Intelligence. PRICAI 2008. Lecture Notes in Computer Science, vol 5351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89197-0_47
DOI: https://doi.org/10.1007/978-3-540-89197-0_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89196-3
Online ISBN: 978-3-540-89197-0
eBook Packages: Computer Science, Computer Science (R0)