LF-LDA: A Supervised Topic Model for Multi-Label Documents Classification

Yongjun Zhang, Zijian Wang, Yongtao Yu, Bolun Chen, Jialin Ma, Liang Shi

Source Title: International Journal of Data Warehousing and Mining (IJDWM) 14(2)

ISSN: 1548-3924 | EISSN: 1548-3932 | EISBN13: 9781522542650 | DOI: 10.4018/IJDWM.2018040102

MLA

Zhang, Yongjun, et al. "LF-LDA: A Supervised Topic Model for Multi-Label Documents Classification." IJDWM, vol. 14, no. 2, 2018, pp. 18-36. http://doi.org/10.4018/IJDWM.2018040102

APA

Zhang, Y., Wang, Z., Yu, Y., Chen, B., Ma, J., & Shi, L. (2018). LF-LDA: A Supervised Topic Model for Multi-Label Documents Classification. International Journal of Data Warehousing and Mining (IJDWM), 14(2), 18-36. http://doi.org/10.4018/IJDWM.2018040102

Chicago

Zhang, Yongjun, et al. "LF-LDA: A Supervised Topic Model for Multi-Label Documents Classification," International Journal of Data Warehousing and Mining (IJDWM) 14, no. 2 (2018): 18-36. http://doi.org/10.4018/IJDWM.2018040102

Abstract

Text documents are a major form of data in the era of big data. With the explosive growth of data, the number of multi-label documents has increased dramatically. Popular multi-label classification techniques, which are usually employed to handle multinomial text documents, are sensitive to noise terms in those documents, so there is still considerable room for improvement in multi-label classification of text documents. This article introduces a supervised topic model, named labeled LDA with function terms (LF-LDA), which filters out noisy function terms from text documents and thereby helps to improve multi-label classification performance. The article also derives the Gibbs sampling formulas in detail; the derivation can be generalized to other similar topic models. Using the RCV1-v2 text data set, the article compares the proposed model with two other state-of-the-art multi-label classifiers, Tuned SVM and labeled LDA, on both the Macro-F1 and Micro-F1 metrics. The results show that LF-LDA outperforms both and has the lowest variance, which indicates the robustness of the LF-LDA classifier.
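The two evaluation metrics named in the abstract differ in how they average over labels: Macro-F1 averages the per-label F1 scores, so rare labels count as much as frequent ones, while Micro-F1 pools true positives, false positives, and false negatives across all labels before computing a single F1. The following is a minimal sketch of both, not the article's evaluation code; the function names `f1_from_counts` and `micro_macro_f1` and the label names "A" and "B" are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(
    y_true: List[Set[str]], y_pred: List[Set[str]]
) -> Tuple[float, float]:
    """Return (micro_f1, macro_f1) for multi-label predictions.

    Each document is represented by the set of labels assigned to it.
    """
    # Every label that appears in either the gold or the predicted sets.
    labels = sorted(set().union(*y_true, *y_pred))
    if not labels:
        return 0.0, 0.0

    per_label = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if label in t and label in p)
        fp = sum(1 for t, p in zip(y_true, y_pred) if label not in t and label in p)
        fn = sum(1 for t, p in zip(y_true, y_pred) if label in t and label not in p)
        per_label.append((tp, fp, fn))

    # Macro-F1: unweighted mean of the per-label F1 scores.
    macro = sum(f1_from_counts(*c) for c in per_label) / len(per_label)

    # Micro-F1: pool the counts over all labels, then compute one F1.
    total_tp = sum(c[0] for c in per_label)
    total_fp = sum(c[1] for c in per_label)
    total_fn = sum(c[2] for c in per_label)
    micro = f1_from_counts(total_tp, total_fp, total_fn)

    return micro, macro


# Hypothetical toy data: four documents, two labels.
gold = [{"A", "B"}, {"A"}, {"B"}, {"A"}]
pred = [{"A"}, {"A"}, {"A"}, {"A", "B"}]
micro_f1, macro_f1 = micro_macro_f1(gold, pred)
print(f"Micro-F1 = {micro_f1:.4f}, Macro-F1 = {macro_f1:.4f}")
```

In this toy data the frequent label "A" is predicted well and the rarer label "B" is never predicted correctly, so Macro-F1 comes out lower than Micro-F1. Reporting both, as the article does, shows whether a classifier's performance holds up on infrequent labels.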
