Abstract
Existing sequence generation models ignore the exposure bias problem when applied to multi-label classification. To address this issue, we propose a novel model that re-encodes the label prediction probability distribution as a label embedding and incorporates the label embedding from the previous step into the current step's LSTM decoding process. This allows each step to make a better prediction based on the overall output of the previous step, rather than only on a locally optimal output. In addition, we propose a scheduled-sampling-based learning algorithm for this model, which effectively incorporates the label embedding into the label generation procedure. Compared with three classical methods and four state-of-the-art methods for multi-label classification, our method obtained the highest F1-score (0.794 on a chemical exposure assessment task and 0.615 on a clinical syndrome differentiation task of traditional Chinese medicine).
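The two ideas in the abstract — feeding the previous step's full prediction distribution back as a soft label embedding, and mixing in the gold label via scheduled sampling during training — can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: all dimensions and parameter matrices are arbitrary, and a single `tanh` projection stands in for the actual LSTM cell.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_LABELS, EMB_DIM, HID_DIM = 5, 8, 16

# Hypothetical parameters, randomly initialized for illustration only.
E = rng.normal(size=(NUM_LABELS, EMB_DIM))      # label embedding matrix
W_h = rng.normal(size=(HID_DIM, EMB_DIM))       # input -> hidden (stand-in for the LSTM cell)
W_out = rng.normal(size=(HID_DIM, NUM_LABELS))  # hidden -> label logits

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def soft_label_embedding(p, E):
    """Weighted sum of label embeddings, using the full predicted
    distribution p instead of only the argmax label."""
    return p @ E

def scheduled_input(p_prev, gold_label, teacher_prob, rng):
    """Scheduled sampling: with probability teacher_prob feed the
    gold label's one-hot embedding, otherwise feed the embedding of
    the model's own previous prediction distribution."""
    if rng.random() < teacher_prob:
        one_hot = np.zeros(NUM_LABELS)
        one_hot[gold_label] = 1.0
        return soft_label_embedding(one_hot, E)
    return soft_label_embedding(p_prev, E)

# One decode step: previous distribution -> embedding -> new distribution.
p_prev = softmax(rng.normal(size=NUM_LABELS))
emb = scheduled_input(p_prev, gold_label=2, teacher_prob=0.5, rng=rng)
h = np.tanh(W_h @ emb)
p_t = softmax(h @ W_out)
```

Because the embedding is a weighted sum over all labels, the next step conditions on the overall shape of the previous prediction rather than on a single sampled label, which is the mechanism the abstract credits with mitigating exposure bias.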
Y. Wang and F. Yan—These authors contributed equally to this work.
Acknowledgement
The research work is partially supported by the Sichuan Major Science and Technology Special Program under Grant (2017GZDZX0002), the Sichuan Science and Technology Program under Grant (2018GZ207), the Sichuan Province Science and Technology Support Program under Grant (2020YFG0299, 2020YFSY0067), and the National Natural Science Foundation of China under Grant (61801058, 61501063).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, Y., Yan, F., Wang, X., Tang, W., Shu, H. (2020). Label Embedding Enhanced Multi-label Sequence Generation Model. In: Zhu, X., Zhang, M., Hong, Y., He, R. (eds) Natural Language Processing and Chinese Computing. NLPCC 2020. Lecture Notes in Computer Science, vol 12431. Springer, Cham. https://doi.org/10.1007/978-3-030-60457-8_18
Print ISBN: 978-3-030-60456-1
Online ISBN: 978-3-030-60457-8