Attention-Augmented Machine Memory


Abstract

The attention mechanism plays an important role in human perception and cognition. Many machine learning models have been developed to memorize sequential data, such as the Long Short-Term Memory (LSTM) network and its extensions. However, because they lack an attention mechanism, they cannot pay special attention to the important parts of a sequence. In this paper, we present a novel machine learning method called attention-augmented machine memory (AAMM), which seamlessly integrates an attention mechanism into the memory cell of LSTM. As a result, the network can focus on valuable information in the sequences and ignore irrelevant information during learning. We have conducted experiments on two sequence classification tasks, pattern classification and sentiment analysis. The experimental results demonstrate the advantages of AAMM over LSTM and other related approaches. Hence, AAMM can be considered a substitute for LSTM in sequence learning applications.
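The paper's exact formulation is behind the paywalled full text; as a rough illustration of the idea the abstract describes (attention integrated into the LSTM memory cell rather than applied on top of its outputs), the toy NumPy cell below attends over previously stored hidden states and mixes the resulting context vector into the memory update. The weight names, the dot-product attention, and the way the context enters the candidate memory are assumptions made for this sketch only, not the authors' AAMM equations.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class AttentionAugmentedCell:
    """Toy LSTM-style cell whose memory update attends over past hidden states.

    Illustrative sketch only; NOT the exact AAMM formulation from the paper.
    """

    def __init__(self, input_size, hidden_size, rng=None):
        rng = np.random.default_rng(0) if rng is None else rng
        z = input_size + hidden_size
        s = 1.0 / np.sqrt(hidden_size)
        # Standard LSTM gate parameters (forget, input, output, candidate).
        self.Wf, self.bf = rng.normal(0, s, (hidden_size, z)), np.zeros(hidden_size)
        self.Wi, self.bi = rng.normal(0, s, (hidden_size, z)), np.zeros(hidden_size)
        self.Wo, self.bo = rng.normal(0, s, (hidden_size, z)), np.zeros(hidden_size)
        self.Wc, self.bc = rng.normal(0, s, (hidden_size, z)), np.zeros(hidden_size)
        # Extra projection letting the attention context modulate the memory.
        self.Wa = rng.normal(0, s, (hidden_size, hidden_size))

    def step(self, x, h_prev, c_prev, history):
        """One time step; `history` is a list of earlier hidden states."""
        v = np.concatenate([x, h_prev])
        f = sigmoid(self.Wf @ v + self.bf)      # forget gate
        i = sigmoid(self.Wi @ v + self.bi)      # input gate
        o = sigmoid(self.Wo @ v + self.bo)      # output gate
        g = np.tanh(self.Wc @ v + self.bc)      # candidate memory

        if history:
            H = np.stack(history)               # (T, hidden)
            scores = H @ h_prev                 # dot-product attention scores
            alpha = np.exp(scores - scores.max())
            alpha /= alpha.sum()                # softmax over past time steps
            context = alpha @ H                 # attended context vector
            # Mix the attended context into the candidate before writing memory.
            g = np.tanh(g + self.Wa @ context)

        c = f * c_prev + i * g                  # updated memory cell
        h = o * np.tanh(c)                      # new hidden state
        return h, c

# Minimal usage: run the cell over a random 5-step sequence of 8-d inputs.
cell = AttentionAugmentedCell(input_size=8, hidden_size=16)
h, c, history = np.zeros(16), np.zeros(16), []
for x in np.random.default_rng(1).normal(size=(5, 8)):
    h, c = cell.step(x, h, c, history)
    history.append(h)
```

In a real model these parameters would be learned end to end with an autodiff framework; the sketch only shows where an attention step can sit inside the memory update, which is the design point the abstract emphasizes.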



Acknowledgements

This work was partially supported by the Major Project for New Generation of AI under Grant No. 2018AAA0100400, the Joint Fund of the Equipments Pre-Research and Ministry of Education of China under Grant No. 6141A020337, the National Natural Science Foundation of China under Grant No. 61876155, the Natural Science Foundation of Shandong Province, China, under Grant No. ZR201911080230, the Jiangsu Science and Technology Programme (Natural Science Foundation of Jiangsu Province) under Grant Nos. BE2020006-4 and BK20181189, the Project for Graduate Student Education Reformation and Research of Ocean University of China under Grant No. HDJG19001, and the Key Program Special Fund in XJTLU under Grant Nos. KSF-T-06 and KSF-E-26. The authors would like to thank Zhaoyang Niu for his help in the revision of this paper.

Author information

Corresponding author

Correspondence to Guoqiang Zhong.

Ethics declarations

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Conflicts of Interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Lin, X., Zhong, G., Chen, K. et al. Attention-Augmented Machine Memory. Cogn Comput 13, 751–760 (2021). https://doi.org/10.1007/s12559-021-09854-5
