
Improved Transformer With Multi-Head Dense Collaboration



Abstract:

Recently, the attention mechanism has boosted the performance of many neural network models in Natural Language Processing (NLP). Among its many variants, Multi-Head Attention (MHA) is a powerful and popular one. MHA, an essential component of the Transformer, helps the model attend to different feature subspaces independently. Despite its success, we conjecture that the heads of existing MHA may not collaborate properly. To validate this assumption and further improve the performance of the Transformer, we study the collaboration problem for MHA in this paper. First, we propose the Single-Layer Collaboration (SLC) mechanism, which helps each attention head improve its attention distribution based on feedback from the other heads. We then extend SLC to the cross-layer Multi-Head Dense Collaboration (MHDC) mechanism, which helps each MHA layer learn its attention distributions using knowledge from the other MHA layers. Both SLC and MHDC are implemented as lightweight modules with very few additional parameters. Equipped with these modules, our new framework, the Collaborative TransFormer (CollFormer), significantly outperforms the vanilla Transformer on a range of NLP tasks, including machine translation, sentence semantic relatedness, natural language inference, sentence classification, and reading comprehension. We also carry out extensive quantitative experiments to analyze the properties of MHDC in different settings. The experimental results validate the effectiveness and universality of both MHDC and CollFormer.
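
The abstract does not give the exact SLC or MHDC formulations. As a rough intuition only, a single-layer collaboration step might let each head's attention scores be influenced by the scores of the other heads before the softmax, for example through a small learned head-to-head mixing matrix. The sketch below is a minimal PyTorch illustration of that idea; the `head_mix` parameter and the mixing step are assumptions for illustration, not the paper's actual SLC/MHDC modules.

```python
# Illustrative sketch only: the exact SLC/MHDC equations are not given in the
# abstract, so the cross-head mixing below is a hypothetical reading of
# "each head improves its attention distribution based on feedback from other heads".
import torch
import torch.nn as nn
import torch.nn.functional as F


class CollaborativeMHA(nn.Module):
    """Multi-head attention with a hypothetical single-layer collaboration step."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)
        # Hypothetical lightweight collaboration matrix (H x H): mixes attention
        # scores across heads, adding only a handful of extra parameters.
        self.head_mix = nn.Parameter(torch.eye(n_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, _ = x.shape

        def split(t: torch.Tensor) -> torch.Tensor:
            # (B, T, d_model) -> (B, n_heads, T, d_head)
            return t.view(B, T, self.n_heads, self.d_head).transpose(1, 2)

        q, k, v = split(self.q_proj(x)), split(self.k_proj(x)), split(self.v_proj(x))
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5      # (B, H, T, T)
        # Collaboration step (assumed): each head's scores become a learned
        # combination of all heads' scores before the softmax.
        scores = torch.einsum("gh,bhij->bgij", self.head_mix, scores)
        attn = F.softmax(scores, dim=-1)
        out = attn @ v                                             # (B, H, T, d_head)
        out = out.transpose(1, 2).contiguous().view(B, T, -1)
        return self.out_proj(out)
```

A cross-layer variant in the spirit of MHDC could, under the same caveat, share or densely combine such mixing signals across MHA layers rather than within a single layer.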
Page(s): 2754 - 2767
Date of Publication: 30 August 2022


