Aggregated Pyramid Vision Transformer: Split-transform-merge Strategy for Image Recognition without Convolutions

Abstract:

This paper proposes a new pure-attention model, the Aggregated Pyramid Vision Transformer (APVT), for computer vision applications. Building on the Vision Transformer (ViT) architecture, APVT adopts the classic pyramid architecture of CNNs and replaces the traditional encoder with a group encoder for feature enhancement. APVT uses the split-transform-merge strategy to refine the group encoder operation. The model is verified on image classification with the CIFAR-10 dataset and on object detection with the COCO 2017 dataset. Experimental results show that APVT achieves excellent performance compared with other Transformer network architectures.
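
The abstract does not give the internals of the group encoder, so the following is only a minimal sketch of what a generic split-transform-merge step over attention could look like, not the authors' implementation. It assumes the token features are split along the channel dimension into equal groups, each group is transformed by its own single-head self-attention, and the results are merged by concatenation plus a residual connection. All names (`split_transform_merge`, `group_weights`, the sizes used) are hypothetical and chosen for illustration; no convolutions are involved.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    # Single-head scaled dot-product self-attention over a (tokens, channels) array.
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def split_transform_merge(tokens, group_weights):
    # Split: divide the channel dimension into len(group_weights) equal groups.
    groups = np.split(tokens, len(group_weights), axis=-1)
    # Transform: each group gets its own attention weights (assumed design).
    transformed = [self_attention(g, *w) for g, w in zip(groups, group_weights)]
    # Merge: concatenate the groups and add a residual connection (assumed design).
    return tokens + np.concatenate(transformed, axis=-1)


# Hypothetical sizes, for illustration only.
rng = np.random.default_rng(0)
num_tokens, dim, num_groups = 16, 32, 4
group_dim = dim // num_groups

tokens = rng.standard_normal((num_tokens, dim))
group_weights = [
    tuple(0.1 * rng.standard_normal((group_dim, group_dim)) for _ in range(3))
    for _ in range(num_groups)
]

out = split_transform_merge(tokens, group_weights)
print(out.shape)  # (16, 32)
```

Because each group attends only over its own slice of the channels, the per-group projection matrices are `group_dim × group_dim` rather than `dim × dim`, which is the usual motivation for grouped transforms. Whether APVT's group encoder follows this exact layout is not stated in the abstract.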

Published in: 2022 IEEE International Conference on Consumer Electronics - Taiwan

Date of Conference: 06-08 July 2022

Date Added to IEEE Xplore: 01 September 2022

DOI: 10.1109/ICCE-Taiwan55306.2022.9869242

Conference Location: Taipei, Taiwan
