Abstract:
The Transformer and its derivatives are widely used in the industrial Internet of Things due to their excellent performance. However, these network models are exceptionally large, incurring significant memory overhead and computational load during training and inference and consuming substantial power. Consequently, existing models cannot be trained or deployed on resource-constrained industrial embedded devices, which limits their participation in collaborative computing and real-time applications. In this paper, we design a plug-and-play lightweight multi-head attention and a lightweight position-wise feed-forward network, and based on these two components we propose the lightweight tensorized transformer and the lightweight tensorized transformer++. The performance of both models is evaluated on real datasets. The experimental results show that these efficient, lightweight tensor-coupled models achieve comparable or even higher performance than the Transformer on real tasks. Furthermore, the numbers of trainable parameters and floating-point operations of the lightweight tensorized transformer and the lightweight tensorized transformer++ are much lower than those of the Transformer, and their training time and power consumption during training are also smaller. These two lightweight models are therefore better suited than the Transformer for deployment on resource-constrained industrial embedded devices.
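The abstract does not give implementation details, but the kind of parameter saving it claims can be illustrated with a minimal sketch of a position-wise feed-forward layer whose weight matrices are replaced by low-rank products. This is an assumption for illustration only: the factorization `W ≈ U @ V`, the rank `r`, and all dimensions below are hypothetical and not the paper's actual tensor decomposition.

```python
import numpy as np

def lowrank_ffn(x, U1, V1, U2, V2):
    """Position-wise FFN with each dense weight W replaced by a low-rank
    product U @ V, cutting per-matrix parameters from d_model*d_ff to
    r*(d_model + d_ff). Illustrative sketch, not the paper's method."""
    h = np.maximum(0.0, x @ U1 @ V1)   # ReLU(x W1), where W1 ~= U1 @ V1
    return h @ U2 @ V2                 # h W2,       where W2 ~= U2 @ V2

# Hypothetical sizes: standard Transformer-base FFN dims, rank r = 32.
d_model, d_ff, r, seq = 512, 2048, 32, 10
rng = np.random.default_rng(0)
U1, V1 = rng.standard_normal((d_model, r)), rng.standard_normal((r, d_ff))
U2, V2 = rng.standard_normal((d_ff, r)), rng.standard_normal((r, d_model))
x = rng.standard_normal((seq, d_model))
y = lowrank_ffn(x, U1, V1, U2, V2)

dense_params = 2 * d_model * d_ff           # two full d_model x d_ff matrices
lowrank_params = 2 * r * (d_model + d_ff)   # two factored U, V pairs
print(y.shape)                              # (10, 512)
print(lowrank_params / dense_params)        # 0.078125 -> ~92% fewer parameters
```

The output shape matches a dense FFN, so under this assumption the factored layer is a drop-in ("plug-and-play") replacement; the parameter ratio shrinks further as the rank `r` decreases.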
Published in: IEEE Transactions on Network Science and Engineering ( Volume: 11, Issue: 3, May-June 2024)