Authors:
Subhadra Vadlamannati¹ and Ryan Solgi²
Affiliations:
¹ Mercer Island High School, 9100 SE 42nd St, Mercer Island, U.S.A.
² Department of Electrical and Computer Engineering, University of California Santa Barbara, Santa Barbara, U.S.A.
Keyword(s):
Neural Networks, Machine Learning, Natural Language Processing, ALIGN, Tensor-Train Decomposition, Vision-Language Modelling.
Abstract:
The transformer architecture has revolutionized Natural Language Processing (NLP) and other machine-learning tasks due to its unprecedented accuracy. However, the extensive memory and parameter requirements of transformers often hinder their practical application. In this work, we study the effect of tensor-train decomposition on compressing transformer vision-language neural networks, namely BERT and ViT, while improving their accuracy. We focus on both embedding-layer compression and partial tensorization of neural networks (PTNN) through an algorithmic approach. Our novel PTNN approach significantly improves the accuracy of existing models by up to 5%, all without the need for post-training adjustments, breaking new ground in the field of tensor decomposition.
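For readers unfamiliar with the technique, the sketch below illustrates the standard TT-SVD procedure that tensor-train compression builds on. It is a generic NumPy illustration, not the authors' implementation; the embedding-table dimensions, the reshape into a 6-way tensor, the rank cap, and the function name are all assumptions made for the example.

```python
import numpy as np

def tt_svd(tensor, max_rank):
    """Generic TT-SVD sketch (not the paper's code): factor a d-way
    tensor into 3-way cores G_k of shape (r_{k-1}, n_k, r_k), with
    each successive SVD truncated to at most max_rank."""
    shape, d = tensor.shape, tensor.ndim
    cores, r_prev = [], 1
    mat = tensor.reshape(shape[0], -1)
    for k in range(d - 1):
        mat = mat.reshape(r_prev * shape[k], -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r = min(max_rank, s.size)                  # truncate the TT rank
        cores.append(u[:, :r].reshape(r_prev, shape[k], r))
        mat = s[:r, None] * vt[:r]                 # carry remainder forward
        r_prev = r
    cores.append(mat.reshape(r_prev, shape[-1], 1))
    return cores

# Toy check: view a hypothetical 4096 x 512 embedding table as a
# 6-way tensor (16, 16, 16, 8, 8, 8) and compress it.
emb = np.random.randn(4096, 512)
cores = tt_svd(emb.reshape(16, 16, 16, 8, 8, 8), max_rank=32)

# Contract the cores back together and measure reconstruction error.
rec = cores[0]
for g in cores[1:]:
    rec = np.tensordot(rec, g, axes=([-1], [0]))
rec = rec.reshape(emb.shape)
print("relative error:", np.linalg.norm(rec - emb) / np.linalg.norm(emb))
```

Storing the small 3-way cores in place of the full table is what yields the parameter savings: each core holds at most max_rank × n_k × max_rank entries, so the total grows linearly rather than multiplicatively in the tensor dimensions.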