LF-LDA: A Supervised Topic Model for Multi-Label Documents Classification

Yongjun Zhang, Zijian Wang, Yongtao Yu, Bolun Chen, Jialin Ma, Liang Shi
Copyright: © 2018 | Volume: 14 | Issue: 2 | Pages: 19
ISSN: 1548-3924 | EISSN: 1548-3932 | EISBN13: 9781522542650 | DOI: 10.4018/IJDWM.2018040102
Cite Article

MLA

Zhang, Yongjun, et al. "LF-LDA: A Supervised Topic Model for Multi-Label Documents Classification." IJDWM vol.14, no.2 2018: pp.18-36. http://doi.org/10.4018/IJDWM.2018040102

APA

Zhang, Y., Wang, Z., Yu, Y., Chen, B., Ma, J., & Shi, L. (2018). LF-LDA: A Supervised Topic Model for Multi-Label Documents Classification. International Journal of Data Warehousing and Mining (IJDWM), 14(2), 18-36. http://doi.org/10.4018/IJDWM.2018040102

Chicago

Zhang, Yongjun, et al. "LF-LDA: A Supervised Topic Model for Multi-Label Documents Classification," International Journal of Data Warehousing and Mining (IJDWM) 14, no.2: 18-36. http://doi.org/10.4018/IJDWM.2018040102

Abstract

Text documents are a major data structure in the era of big data. With the explosive growth of data, the number of documents carrying multiple labels has increased dramatically. Popular multi-label classification techniques, which are usually employed to handle multinomial text documents, are sensitive to noise terms in the documents, so there is still considerable room for improvement in multi-label classification of text documents. This article introduces a supervised topic model, named labeled LDA with function terms (LF-LDA), that filters noisy function terms out of text documents and thereby helps improve multi-label classification performance. The article also derives the Gibbs sampling formulas in detail; the derivation can be generalized to other similar topic models. On the RCV1-v2 text data set, the article compares the proposed model with two other state-of-the-art multi-label classifiers, Tuned SVM and labeled LDA, on both the Macro-F1 and Micro-F1 metrics. The results show that LF-LDA outperforms both and has the lowest variance, which indicates the robustness of the LF-LDA classifier.
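The abstract reports results on both Macro-F1 and Micro-F1, and the two can diverge sharply in multi-label settings. The following is a minimal Python sketch of how these metrics are commonly computed for multi-label predictions; it is not taken from the article. The function names (`per_label_counts`, `f1`, `macro_micro_f1`), the labels "A", "B", "C", and the toy documents are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple


def per_label_counts(
    y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]
) -> Dict[str, Tuple[int, int, int]]:
    """Return (true positives, false positives, false negatives) per label."""
    counts = {}
    for label in labels:
        tp = fp = fn = 0
        for true_set, pred_set in zip(y_true, y_pred):
            in_true = label in true_set
            in_pred = label in pred_set
            if in_true and in_pred:
                tp += 1
            elif in_pred:
                fp += 1
            elif in_true:
                fn += 1
        counts[label] = (tp, fp, fn)
    return counts


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(
    y_true: List[Set[str]], y_pred: List[Set[str]], labels: List[str]
) -> Tuple[float, float]:
    counts = per_label_counts(y_true, y_pred, labels)
    # Macro-F1: average the per-label F1 scores, so every label weighs equally.
    macro = sum(f1(*c) for c in counts.values()) / len(labels)
    # Micro-F1: pool the counts over all labels first, so frequent labels dominate.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1(tp, fp, fn)
    return macro, micro


# Hypothetical toy data: four documents, three labels (not from RCV1-v2).
labels = ["A", "B", "C"]
y_true = [{"A", "B"}, {"A"}, {"C"}, {"A"}]
y_pred = [{"A"}, {"A", "B"}, set(), {"A"}]

macro, micro = macro_micro_f1(y_true, y_pred, labels)
print(f"Macro-F1 = {macro:.4f}, Micro-F1 = {micro:.4f}")
```

In this toy example the frequent label "A" is predicted perfectly while the rare labels "B" and "C" are always missed, so Macro-F1 is 1/3 while Micro-F1 is 2/3. Reporting both metrics, as the abstract does, shows performance on rare labels as well as on frequent ones.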
