A Probability-Based Close Domain Metric in Lifelong Learning for Multi-label Classification

Pham, Thi-Ngan; Ha, Quang-Thuy; Nguyen, Minh-Chau; Nguyen, Tri-Thanh

doi:10.1007/978-3-030-38364-0_13

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1121)

Included in the following conference series:

International Conference on Computer Science, Applied Mathematics and Applications

Abstract

Lifelong machine learning has recently attracted researchers worldwide because it is effective at solving a current problem by exploiting knowledge from the past. Topic modeling on previous domain knowledge (such as topic modeling with Automatically generated Must-links and Cannot-links, which exploits must-links and cannot-links between pairs of terms) is widely combined with lifelong topic modeling (which employs the models of previous tasks) to produce better topics. This paper proposes a probability-based close domain metric for choosing valuable knowledge learnt from the past, so that the topics produced are more closely associated with the current domain. This knowledge is then used to enrich the features for a multi-label classifier. Several experiments on a hotel review dataset show that the proposed approach improves performance over the baseline.
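
The abstract does not define the metric itself, so the following is only a minimal sketch of the general idea of a probability-based closeness test between domains. It assumes each domain is summarised by a word probability distribution and uses the Jensen-Shannon divergence as a stand-in measure; the paper's actual metric may differ. The names js_divergence and close_domains, the threshold, and the toy distributions are all hypothetical.

```python
import math
from typing import Dict, List

Distribution = Dict[str, float]  # word -> probability, values sum to 1


def kl_divergence(p: Distribution, q: Distribution) -> float:
    """KL(p || q) in bits; assumes q[w] > 0 wherever p[w] > 0."""
    return sum(pw * math.log2(pw / q[w]) for w, pw in p.items() if pw > 0)


def js_divergence(p: Distribution, q: Distribution) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint.

    Assumption: this is a stand-in measure, not the metric from the paper.
    """
    vocab = set(p) | set(q)
    # Mixture distribution; positive wherever p or q is positive.
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def close_domains(current: Distribution,
                  past: Dict[str, Distribution],
                  threshold: float) -> List[str]:
    """Names of past domains within `threshold` of the current one, closest first."""
    scored = [(js_divergence(current, dist), name) for name, dist in past.items()]
    return [name for d, name in sorted(scored) if d <= threshold]


# Toy data (hypothetical): a hotel-review domain and two past domains.
hotel = {"room": 0.5, "staff": 0.3, "price": 0.2}
past = {
    "restaurant": {"staff": 0.4, "price": 0.3, "food": 0.3},
    "laptop": {"battery": 0.6, "screen": 0.4},
}
selected = close_domains(hotel, past, threshold=0.6)
```

In a lifelong setting, only the knowledge learnt from the selected past domains would then be passed on to the topic model for the current domain; the threshold here is an arbitrary illustrative value.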

Author information

Authors and Affiliations

Vietnam National University, Hanoi (VNU), VNU-University of Engineering and Technology (UET), No. 144, Xuan Thuy, Cau Giay, Hanoi, Vietnam
Thi-Ngan Pham, Quang-Thuy Ha, Minh-Chau Nguyen & Tri-Thanh Nguyen
The Vietnamese People’s Police Academy, Hanoi, Vietnam
Thi-Ngan Pham

Corresponding authors

Correspondence to Quang-Thuy Ha or Tri-Thanh Nguyen.

Editor information

Editors and Affiliations

Computer Science and Applications Department LGIPM, University of Lorraine, Metz Cedex 03, France
Hoai An Le Thi
Computer Science and Applications Department LGIPM, University of Lorraine, Metz Cedex 03, France
Hoai Minh Le
Laboratory of Mathematics, National Institute for Applied Sciences, Saint-Étienne-du-Rouvray Cedex, France
Tao Pham Dinh
Department of Information Systems, Wroclaw University of Science and Technology, Wrocław, Poland
Ngoc Thanh Nguyen

About this paper

Cite this paper

Pham, TN., Ha, QT., Nguyen, MC., Nguyen, TT. (2020). A Probability-Based Close Domain Metric in Lifelong Learning for Multi-label Classification. In: Le Thi, H., Le, H., Pham Dinh, T., Nguyen, N. (eds) Advanced Computational Methods for Knowledge Engineering. ICCSAMA 2019. Advances in Intelligent Systems and Computing, vol 1121. Springer, Cham. https://doi.org/10.1007/978-3-030-38364-0_13

DOI: https://doi.org/10.1007/978-3-030-38364-0_13
Published: 20 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-38363-3
Online ISBN: 978-3-030-38364-0
eBook Packages: Intelligent Technologies and Robotics (R0)