
AIA: Attention in Attention Within Collaborate Domains

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13534))

Abstract

Attention mechanisms can effectively improve the performance of mobile networks at a limited computational cost. However, existing attention methods extract importance from only one domain of the network, hindering further performance improvement. In this paper, we propose the Attention in Attention (AIA) mechanism, which integrates One-Dimensional Frequency Channel Attention (1D FCA) with Joint Coordinate Attention (JCA) to collaboratively adjust channel and coordinate weights in the frequency and spatial domains, respectively. Specifically, 1D FCA uses the 1D Discrete Cosine Transform (DCT) to adaptively extract and enhance the necessary channel information in the frequency domain. JCA uses explicit and implicit coordinate information to extract position features and embed them into the frequency channel attention. Extensive experiments on different datasets demonstrate that the proposed AIA mechanism effectively improves accuracy at only a limited computational cost.
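The two gating stages the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: the choice of DCT frequencies, the mean-pooled coordinate descriptors, and the function names (`freq_channel_attention`, `joint_coordinate_attention`, `aia`) are illustrative assumptions, and a real module would learn projection weights rather than gate raw descriptors.

```python
import numpy as np

def dct_basis(n, k):
    # Type-II DCT basis vector of frequency k over a signal of length n
    i = np.arange(n)
    return np.cos(np.pi * k * (2 * i + 1) / (2 * n))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def freq_channel_attention(x, freqs=(0, 1)):
    # 1D-FCA-style gate: project each channel's flattened spatial signal
    # onto a few low-frequency DCT components, then gate the channel.
    C, H, W = x.shape
    flat = x.reshape(C, H * W)                     # treat spatial dims as a 1D signal
    n = H * W
    desc = np.stack([flat @ dct_basis(n, k) for k in freqs], axis=1)  # (C, len(freqs))
    weights = sigmoid(desc.sum(axis=1))            # one scalar gate per channel
    return x * weights[:, None, None]

def joint_coordinate_attention(x):
    # JCA-style gate: pool along each spatial axis to obtain per-row and
    # per-column descriptors, then gate positions along both axes.
    h_gate = sigmoid(x.mean(axis=2))[:, :, None]   # (C, H, 1)
    w_gate = sigmoid(x.mean(axis=1))[:, None, :]   # (C, 1, W)
    return x * h_gate * w_gate

def aia(x):
    # Coordinate attention applied on top of the frequency-gated features,
    # mirroring the "attention in attention" composition.
    return joint_coordinate_attention(freq_channel_attention(x))
```

Both gates only reweight the input, so the output keeps the input's shape; composing them is what lets channel importance (frequency domain) and position importance (spatial domain) be adjusted collaboratively.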

This work was supported in part by the Guangdong Shenzhen Joint Youth Fund under Grant 2021A151511074, in part by NSFC Fund 62176077, in part by the Guangdong Basic and Applied Basic Research Foundation under Grant 2019B1515120055, in part by the Shenzhen Key Technical Project under Grant 2020N046, in part by the Shenzhen Fundamental Research Fund under Grant JCYJ20210324132210025, in part by the Medical Biometrics Perception and Analysis Engineering Laboratory, Shenzhen, China, and in part by the Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies (2022B1212010005).



Author information

Correspondence to Yao Lu or Guangming Lu.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, L., Feng, Q., Lu, Y., Liu, C., Lu, G. (2022). AIA: Attention in Attention Within Collaborate Domains. In: Yu, S., et al. Pattern Recognition and Computer Vision. PRCV 2022. Lecture Notes in Computer Science, vol 13534. Springer, Cham. https://doi.org/10.1007/978-3-031-18907-4_47

  • DOI: https://doi.org/10.1007/978-3-031-18907-4_47

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18906-7

  • Online ISBN: 978-3-031-18907-4

  • eBook Packages: Computer Science, Computer Science (R0)
