
WiseTag: An Ensemble Method for Multi-label Topic Classification

  • Conference paper

Natural Language Processing and Chinese Computing (NLPCC 2018)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11109)


Abstract

Multi-label topic classification aims to assign one or more relevant topic labels to a text. This paper presents the WiseTag system, which performs multi-label topic classification using an ensemble of four single models: a KNN-based model, an Information Gain-based model, a Keyword Matching-based model, and a Deep Learning-based model. These single models are carefully designed to be diverse enough to improve the performance of the ensemble. In NLPCC 2018 Shared Task 6, "Automatic Tagging of Zhihu Questions", the proposed WiseTag system achieves an F1 score of 0.4863 on the test set, ranking fourth among all participating teams.
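The abstract describes fusing per-label predictions from four diverse single models. The paper's actual fusion scheme is not included in this preview; as a purely illustrative sketch, one common way to ensemble such models is a weighted average of per-label scores followed by top-k selection. All function names, weights, and labels below are hypothetical, not taken from the paper:

```python
# Hypothetical sketch: weighted-average ensemble over per-label scores
# from several single models, returning the top-k labels.
def ensemble_predict(model_scores, weights, top_k=5):
    """model_scores: one dict per model, mapping label -> score.
    weights: one float per model.
    Returns the top_k labels ranked by weighted-average score."""
    combined = {}
    for scores, w in zip(model_scores, weights):
        for label, s in scores.items():
            combined[label] = combined.get(label, 0.0) + w * s
    total_w = sum(weights)
    ranked = sorted(combined, key=lambda lb: combined[lb] / total_w, reverse=True)
    return ranked[:top_k]

# Toy scores standing in for the four single models
# (KNN, Information Gain, Keyword Matching, Deep Learning):
knn = {"machine-learning": 0.9, "python": 0.4}
ig = {"machine-learning": 0.7, "statistics": 0.5}
kw = {"python": 0.8}
dl = {"machine-learning": 0.85, "deep-learning": 0.6}
print(ensemble_predict([knn, ig, kw, dl], [1.0, 1.0, 1.0, 1.0], top_k=2))
# -> ['machine-learning', 'python']
```

Equal weights are used here for simplicity; in practice the per-model weights would be tuned on a validation set.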




Author information


Correspondence to Guanqing Liang.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Liang, G., Kao, H., Leung, C.W.-K., He, C. (2018). WiseTag: An Ensemble Method for Multi-label Topic Classification. In: Zhang, M., Ng, V., Zhao, D., Li, S., Zan, H. (eds.) Natural Language Processing and Chinese Computing. NLPCC 2018. Lecture Notes in Computer Science, vol. 11109. Springer, Cham. https://doi.org/10.1007/978-3-319-99501-4_47


  • DOI: https://doi.org/10.1007/978-3-319-99501-4_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99500-7

  • Online ISBN: 978-3-319-99501-4

  • eBook Packages: Computer Science (R0)
