
Reinforcement of BERT with Dependency-Parsing Based Attention Mask

  • Conference paper
Advances in Computational Collective Intelligence (ICCCI 2022)

Abstract

The dot-product attention mechanism is among the most recent attention mechanisms and has shown outstanding performance in BERT. In this paper, we propose a dependency-parsing-based mask that reinforces the padding mask at the multi-head attention units. The padding mask is already used to filter out padding positions; the proposed mask aims to improve BERT's attention filtering. The conducted experiments show that BERT performs better with the proposed mask.
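
To make the idea concrete, below is a minimal, hypothetical sketch (not taken from the paper) of how a dependency-parsing-based mask could be combined with the usual padding mask before scaled dot-product attention. It assumes spaCy for dependency parsing and PyTorch tensors, a simplified one-token-per-position alignment, and illustrative function names (dependency_mask, padding_mask, masked_attention); the actual combination rule used by the authors may differ.

```python
# Illustrative sketch only; not the authors' implementation.
import torch
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def dependency_mask(sentence: str, max_len: int) -> torch.Tensor:
    """(max_len, max_len) boolean mask keeping attention between each token,
    itself and its syntactic head; all other positions are masked."""
    doc = nlp(sentence)
    mask = torch.zeros(max_len, max_len, dtype=torch.bool)
    for tok in doc:
        if tok.i >= max_len:
            break
        mask[tok.i, tok.i] = True              # keep self-attention
        if tok.head.i < max_len:
            mask[tok.i, tok.head.i] = True     # token attends to its head
            mask[tok.head.i, tok.i] = True     # head attends to the token
    return mask

def padding_mask(lengths: torch.Tensor, max_len: int) -> torch.Tensor:
    """(batch, 1, 1, max_len) boolean mask: True on real tokens, False on padding."""
    positions = torch.arange(max_len).unsqueeze(0)           # (1, max_len)
    return (positions < lengths.unsqueeze(1)).unsqueeze(1).unsqueeze(1)

def masked_attention(q, k, v, pad_mask, dep_mask):
    """Scaled dot-product attention with the combined padding AND dependency mask."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5            # (batch, heads, len, len)
    combined = pad_mask & dep_mask                           # broadcasts over batch and heads
    scores = scores.masked_fill(~combined, -1e9)             # block masked positions
    return torch.softmax(scores, dim=-1) @ v
```

Under this sketch, a logical AND of the two masks means a position must be both a non-padding token and a syntactic neighbour to receive attention.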



Author information


Corresponding author

Correspondence to Toufik Mechouma.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Mechouma, T., Biskri, I., Meunier, J.G. (2022). Reinforcement of BERT with Dependency-Parsing Based Attention Mask. In: Bădică, C., Treur, J., Benslimane, D., Hnatkowska, B., Krótkiewicz, M. (eds) Advances in Computational Collective Intelligence. ICCCI 2022. Communications in Computer and Information Science, vol 1653. Springer, Cham. https://doi.org/10.1007/978-3-031-16210-7_9


  • DOI: https://doi.org/10.1007/978-3-031-16210-7_9


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-16209-1

  • Online ISBN: 978-3-031-16210-7

  • eBook Packages: Computer Science, Computer Science (R0)
