A Novel Text Classification Approach Based on Deep Belief Network

Liu, Tao

doi:10.1007/978-3-642-17537-4_39

Tao Liu

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6443)

Abstract

This paper proposes a novel text classification approach based on a deep belief network (DBN). A DBN builds a deep architecture that learns high-level abstractions of the input data, which can be used to model the semantic correlation among the words of a document. Basic features are first selected by statistical feature selection measures. A DBN with a discriminative fine-tuning strategy is then built on these basic features to learn high-level deep features, and a support vector machine (SVM) is trained on the learned deep features. The proposed method outperforms a traditional SVM-based classifier. Used as a dimension reduction strategy, the DBN also outperforms the traditional latent semantic indexing method. Detailed experiments show how different fine-tuning strategies and network structures affect the performance of the DBN.
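The representation-learning step described in the abstract can be illustrated with a minimal sketch. It assumes binary term-presence vectors as the selected basic features and stacks restricted Boltzmann machines trained greedily with one-step contrastive divergence (CD-1), a common way to pretrain a DBN. The class and function names (`RBM`, `train_dbn`, `transform`), the layer sizes, and the toy documents are all hypothetical. The paper's feature selection measures, its discriminative fine-tuning, and the final SVM are not reproduced here.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Binary restricted Boltzmann machine trained with CD-1 (illustrative only)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        # Small random weights, zero biases.
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def cd1_update(self, v0, lr):
        # Positive phase: hidden activations driven by the data.
        h0 = self.hidden_probs(v0)
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        # Negative phase: one Gibbs step back to the visible layer and up again.
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def train_dbn(data, layer_sizes, epochs=50, lr=0.1, seed=0):
    """Greedy layer-wise pretraining: each RBM is trained on the layer below."""
    rng = random.Random(seed)
    layers, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(inputs[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_update(v, lr)
        layers.append(rbm)
        # Hidden probabilities become the input of the next layer.
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return layers


def transform(layers, v):
    """Map a basic feature vector to the top-layer (deep) features."""
    for rbm in layers:
        v = rbm.hidden_probs(v)
    return v


# Toy term-presence vectors over six hypothetical selected terms.
docs = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
dbn = train_dbn(docs, layer_sizes=[4, 2])
features = [transform(dbn, d) for d in docs]
```

In this sketch each six-dimensional document vector is reduced to two deep features, which is the role the abstract assigns to the DBN as a dimension reduction step. In the approach the abstract describes, such features would then be passed to an SVM after fine-tuning.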

Author information

Authors and Affiliations

Key Laboratory of Data Engineering and Knowledge Engineering, Renmin University of China, MOE, 100872, Beijing, China
Tao Liu
School of Information, Renmin University of China, 100872, Beijing, China
Tao Liu

Authors

Tao Liu

Editor information

Editors and Affiliations

School of Information Technology, Murdoch University, 6150, Murdoch, WA, Australia
Kok Wai Wong
The Australian National University, 0200, Canberra, ACT, Australia
B. Sumudu U. Mendis
School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, Northfields Avenue, 2522, P.O. Box, Wollongong, NSW, Australia
Abdesselam Bouzerdoum

About this paper

Cite this paper

Liu, T. (2010). A Novel Text Classification Approach Based on Deep Belief Network. In: Wong, K.W., Mendis, B.S.U., Bouzerdoum, A. (eds) Neural Information Processing. Theory and Algorithms. ICONIP 2010. Lecture Notes in Computer Science, vol 6443. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17537-4_39

DOI: https://doi.org/10.1007/978-3-642-17537-4_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17536-7
Online ISBN: 978-3-642-17537-4
eBook Packages: Computer Science, Computer Science (R0)
