
A visual attention-based keyword extraction for document classification


Abstract

Document classification plays an important role in natural language processing, and keyword extraction algorithms show great potential for summarizing entire documents. Attention is the process of selectively concentrating on a discrete aspect of information while ignoring other perceivable information. Inspired by the visual attention mechanism, a new probabilistic keyword extraction algorithm is proposed. An unsupervised, neural-network-based pre-training method is proposed for training the semantic attention based keyword extraction algorithm, which helps extract keywords with rich contextual information from a document. A bidirectional long short-term memory (LSTM) network combined with the proposed semantic keyword extraction algorithm is designed for both topic and sentiment classification tasks. Experiments on four large-scale datasets show that the proposed visual attention based keyword extraction algorithm outperforms the baseline methods. The semantic attention based keyword extraction method is effective at summarizing the content of a document, which is very useful for large-scale document classification.
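The abstract describes attention-weighted keyword selection feeding a bidirectional LSTM classifier. The sketch below is illustrative only and is not the authors' implementation: it assumes a simple learned attention score over word embeddings, a hypothetical top-k keyword cut-off, and PyTorch module names; all hyperparameters are invented for the example.

```python
# Illustrative sketch only -- NOT the paper's method. It shows one way a
# probabilistic attention score over word embeddings could rank "keywords"
# and feed them to a bidirectional LSTM for topic/sentiment classification.
import torch
import torch.nn as nn


class AttentionKeywordClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=100, hidden_dim=128,
                 num_classes=2, top_k=10):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.attention = nn.Linear(embed_dim, 1)   # scores each word
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)
        self.top_k = top_k

    def forward(self, token_ids):
        # token_ids: (batch, seq_len)
        embedded = self.embedding(token_ids)            # (batch, seq_len, embed_dim)
        scores = self.attention(embedded).squeeze(-1)   # (batch, seq_len)
        weights = torch.softmax(scores, dim=-1)         # probabilistic attention
        # Keep only the top-k highest-scoring words as candidate keywords.
        k = min(self.top_k, token_ids.size(1))
        top_idx = weights.topk(k, dim=-1).indices       # (batch, k)
        keywords = torch.gather(
            embedded, 1,
            top_idx.unsqueeze(-1).expand(-1, -1, embedded.size(-1)))
        _, (h_n, _) = self.bilstm(keywords)             # h_n: (2, batch, hidden_dim)
        doc_repr = torch.cat([h_n[0], h_n[1]], dim=-1)  # both directions
        return self.classifier(doc_repr)


# Example: classify one toy "document" of 20 token ids into 4 topics.
model = AttentionKeywordClassifier(vocab_size=5000, num_classes=4)
logits = model(torch.randint(0, 5000, (1, 20)))
print(logits.shape)  # torch.Size([1, 4])
```

In this toy setup the attention weights double as keyword scores, so the same module that selects words for the classifier can also be read off to summarize the document.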



Acknowledgements

This work is supported by the National Natural Science Foundation of China (project 61303094), the Science and Technology Commission of Shanghai Municipality (projects 16511102400 and 16111107800), and the Innovation Program of the Shanghai Municipal Education Commission (14YZ024).

Author information

Corresponding author

Correspondence to Xing Wu.

About this article

Cite this article

Wu, X., Du, Z. & Guo, Y. A visual attention-based keyword extraction for document classification. Multimed Tools Appl 77, 25355–25367 (2018). https://doi.org/10.1007/s11042-018-5788-9

