Hierarchical multi-attention networks for document classification

  • Original Article
  • Published in: International Journal of Machine Learning and Cybernetics

Abstract

Research on document classification increasingly employs attention-based deep learning algorithms and has achieved impressive results. Owing to the complexity of documents, classical models, as well as single-attention mechanisms, fail to meet the demands of high-accuracy classification. This paper proposes a method that classifies documents via hierarchical multi-attention networks, which describe a document at both the word-sentence level and the sentence-document level. Furthermore, different attention strategies are applied at the two levels, enabling accurate assignment of attention weights: a soft attention mechanism is applied at the word-sentence level, while a CNN-attention is applied at the sentence-document level. Owing to the distinctiveness of the model, the proposed method delivers higher accuracy than other state-of-the-art methods. In addition, visualizations of the attention weights demonstrate the effectiveness of the attention mechanisms in distinguishing the importance of words and sentences.
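The abstract only outlines the architecture, so the following is a minimal sketch (in PyTorch) of the two-level design it describes: a bidirectional GRU with soft (additive) attention pools words into sentence vectors, and a 1-D convolution scores sentences from their local context before a softmax-weighted sum forms the document vector. All names, layer sizes, and the exact CNN-attention formulation here are illustrative assumptions, not the authors' released implementation.

# Illustrative sketch of a hierarchical multi-attention classifier.
# Assumed details: BiGRU encoders, additive soft attention at the word
# level, and a Conv1d-based attention scorer at the sentence level.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftAttention(nn.Module):
    """Additive (soft) attention pooling over a sequence of hidden states."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.context = nn.Linear(dim, 1, bias=False)

    def forward(self, h):                       # h: (batch, seq, dim)
        u = torch.tanh(self.proj(h))            # (batch, seq, dim)
        a = F.softmax(self.context(u), dim=1)   # (batch, seq, 1)
        return (a * h).sum(dim=1)               # (batch, dim)

class HierMultiAttn(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hid=50, n_classes=5):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.word_rnn = nn.GRU(emb_dim, hid, bidirectional=True, batch_first=True)
        self.word_attn = SoftAttention(2 * hid)
        self.sent_rnn = nn.GRU(2 * hid, hid, bidirectional=True, batch_first=True)
        # CNN-attention (assumed form): a 1-D convolution scores each
        # sentence from its local neighbourhood; softmax yields weights.
        self.attn_conv = nn.Conv1d(2 * hid, 1, kernel_size=3, padding=1)
        self.fc = nn.Linear(2 * hid, n_classes)

    def forward(self, docs):                    # docs: (batch, n_sents, n_words)
        b, s, w = docs.shape
        words = self.emb(docs.view(b * s, w))               # (b*s, w, emb)
        h_w, _ = self.word_rnn(words)                       # (b*s, w, 2*hid)
        sent_vecs = self.word_attn(h_w).view(b, s, -1)      # (b, s, 2*hid)
        h_s, _ = self.sent_rnn(sent_vecs)                   # (b, s, 2*hid)
        scores = self.attn_conv(h_s.transpose(1, 2))        # (b, 1, s)
        weights = F.softmax(scores, dim=-1)                 # attention over sentences
        doc_vec = torch.bmm(weights, h_s).squeeze(1)        # (b, 2*hid)
        return self.fc(doc_vec)

model = HierMultiAttn(vocab_size=20000)
logits = model(torch.randint(1, 20000, (8, 10, 30)))  # 8 docs, 10 sents, 30 words
print(logits.shape)  # torch.Size([8, 5])

The convolutional scorer lets each sentence's weight depend on its neighbours, which captures the intuition of applying CNN-attention at the sentence-document level instead of a purely position-independent soft attention.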





Acknowledgements

This work was supported by the National Statistical Science Research Project of China under Grant No. 2016LY98, the Science and Technology Department of Guangdong Province in China under Grant Nos. 2016A010101020, 2016A010101021 and 2016A010101022, the Characteristic Innovation Projects of Guangdong Colleges and Universities (Nos. 2018KTSCX049 and 2018GKTSCX069), and the Bidding Project of the Laboratory of Language Engineering and Computing of Guangdong University of Foreign Studies (No. LEC2019ZBKT005).

Author information


Corresponding author

Correspondence to Yun Xue.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.



About this article


Cite this article

Huang, Y., Chen, J., Zheng, S. et al. Hierarchical multi-attention networks for document classification. Int. J. Mach. Learn. & Cyber. 12, 1639–1647 (2021). https://doi.org/10.1007/s13042-020-01260-x



  • DOI: https://doi.org/10.1007/s13042-020-01260-x
