
A Hierarchical Fine-Tuning Approach Based on Joint Embedding of Words and Parent Categories for Hierarchical Multi-label Text Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12397)

Abstract

Many important real-world classification problems involve a large number of categories. Hierarchical multi-label text classification (HMTC), which aims at high accuracy over large sets of closely related categories organized in a hierarchical structure or taxonomy, has therefore become a challenging problem. In this paper, we present a hierarchical fine-tuning deep learning approach for HMTC, in which a joint embedding of words and their parent categories is generated by leveraging both the textual data and the hierarchical relations in the category structure. A fine-tuning technique is applied to the Ordered Neurons LSTM (ONLSTM) network so that classification results at the upper levels help guide classification at the lower ones. Extensive experiments on two benchmark datasets show that the proposed method outperforms state-of-the-art hierarchical and flat multi-label text classification approaches, in particular by reducing computational cost while achieving superior performance.
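
To make the approach concrete, the sketch below illustrates the two ideas the abstract describes: a joint embedding, in which each document's parent category is embedded into the same space as its words, and level-by-level fine-tuning, in which the encoder trained for one level initialises the encoder for the next so that upper-level decisions guide lower-level ones. This is a minimal PyTorch sketch under stated assumptions, not the authors' released implementation (see Note 5): a plain nn.LSTM stands in for the ONLSTM encoder, each level is treated as single-label for simplicity, and all class names, shapes, and hyperparameters are illustrative.

```python
# Minimal sketch of joint word/parent-category embedding plus hierarchical
# fine-tuning. Assumptions: nn.LSTM as a stand-in for ONLSTM; illustrative
# names and hyperparameters; single-label classification per level.
import torch
import torch.nn as nn


class LevelClassifier(nn.Module):
    """Predicts the category at one taxonomy level from a joint embedding
    of the document's words and its (predicted) parent category."""

    def __init__(self, vocab_size, num_parents, num_classes,
                 embed_dim=128, hidden_dim=256):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        # Parent categories are embedded in the same space as words so the
        # two can be concatenated along the sequence axis (joint embedding).
        self.parent_embed = nn.Embedding(num_parents, embed_dim)
        # Stand-in for the ONLSTM encoder of Shen et al. (ICLR 2019).
        self.encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids, parent_ids):
        words = self.word_embed(token_ids)                   # (B, T, E)
        parent = self.parent_embed(parent_ids).unsqueeze(1)  # (B, 1, E)
        joint = torch.cat([parent, words], dim=1)            # prepend parent
        _, (h, _) = self.encoder(joint)
        return self.head(h[-1])                              # (B, C)


def finetune_top_down(level_models, level_loaders, epochs=1, lr=1e-3):
    """Trains one classifier per taxonomy level, top to bottom. Each
    lower-level encoder is initialised from the level above (hierarchical
    fine-tuning); each batch carries the parent ids assigned at that level."""
    prev = None
    for model, loader in zip(level_models, level_loaders):
        if prev is not None:
            model.encoder.load_state_dict(prev.encoder.state_dict())
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        loss_fn = nn.CrossEntropyLoss()
        for _ in range(epochs):
            for token_ids, parent_ids, labels in loader:
                opt.zero_grad()
                loss = loss_fn(model(token_ids, parent_ids), labels)
                loss.backward()
                opt.step()
        prev = model
```

At the top level, parent_ids can simply be a shared "root" index; at lower levels they come from the previous level's predictions (or gold parent labels during training), which is how upper-level results feed the lower-level classifiers.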


Notes

  1. https://www.dmoz-odp.org/.

  2. https://meshb.nlm.nih.gov/treeView.

  3. https://www.loc.gov/aba/cataloging/classification/.

  4. https://en.wikipedia.org/wiki/Portal:Contents/Categories.

  5. github.com/masterzjp/HFT-ONLSTM.


Acknowledgments

This work is partially supported by the National Key R&D Program of China under grants 2018YFC0830605 and 2018YFC0831404.

Author information

Correspondence to Yinglong Ma.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Ma, Y., Zhao, J., Jin, B. (2020). A Hierarchical Fine-Tuning Approach Based on Joint Embedding of Words and Parent Categories for Hierarchical Multi-label Text Classification. In: Farkaš, I., Masulli, P., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2020. ICANN 2020. Lecture Notes in Computer Science, vol. 12397. Springer, Cham. https://doi.org/10.1007/978-3-030-61616-8_60


  • DOI: https://doi.org/10.1007/978-3-030-61616-8_60


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61615-1

  • Online ISBN: 978-3-030-61616-8

  • eBook Packages: Computer Science, Computer Science (R0)
