E-CapsGan: Generative adversarial network using capsule network as feature encoder

Xiang, Chao; Su, Minglan; Zhang, Chaoying; Wang, Feng; Yang, Mingchuan; Niu, Zhendong

doi:10.1007/s11042-022-12279-3

Published: 28 March 2022

Volume 81, pages 26425–26442, (2022)

Multimedia Tools and Applications

Chao Xiang¹,², Minglan Su², Chaoying Zhang², Feng Wang², Mingchuan Yang² & Zhendong Niu¹ (ORCID: orcid.org/0000-0002-0576-7572)

Abstract

We explore applying the theory of Capsule Networks (CapsNet) to Generative Adversarial Networks (GANs). Traditional Convolutional Neural Networks (CNNs) cannot represent the spatial relationship between parts and the whole, so they lose some of the target’s attribute information, such as direction and posture. The Capsule Network, proposed by Hinton et al. in 2017, overcomes this defect of CNNs. To utilize the attributes of the target as much as possible, we propose E-CapsGan, which applies CapsNet to encode the attribute features of the input image and guide the data generation of the GAN. We explore the application of E-CapsGan in two scenarios. For image generation, we propose E-CapsGan1, which uses CapsNet as an additional attribute feature encoder to obtain image attribute features that guide the GAN. For image compression encoding, we explore E-CapsGan2, which employs CapsNet as the encoder to compress images into vectors and the GAN as the decoder to reconstruct the original images from those vectors. Qualitative and quantitative experiments on multiple datasets demonstrate the superior performance of E-CapsGan1 in image generation and the feasibility of E-CapsGan2 in image compression encoding.
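
The two scenarios in the abstract differ mainly in what is fed to the generator. The following is a minimal sketch of that difference, not the authors' implementation. The abstract gives no architectural details, so everything here is an assumption: the helper names `encode_attributes` and `generator_input`, the toy capsule values, and the choice to concatenate noise with the attribute vector are all hypothetical. Only `squash` follows a published definition, the capsule nonlinearity from Sabour et al. (2017).

```python
import math
import random


def squash(s, eps=1e-9):
    """Capsule squashing nonlinearity from Sabour et al. (2017).

    Keeps the direction of s and maps its length into [0, 1).
    """
    sq_norm = sum(x * x for x in s)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm + eps)
    return [scale * x for x in s]


def encode_attributes(raw_capsules):
    """Hypothetical stand-in for the CapsNet encoder: squash each raw
    capsule vector and flatten the results into one attribute vector."""
    return [v for capsule in raw_capsules for v in squash(capsule)]


def generator_input(attributes, noise_dim=0, rng=None):
    """Build the vector fed to the generator (assumed wiring).

    noise_dim > 0: random noise plus attributes, i.e. the attributes
                   guide generation (our reading of E-CapsGan1).
    noise_dim = 0: the attribute vector alone is the compressed code
                   to be decoded (our reading of E-CapsGan2).
    """
    rng = rng or random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(noise_dim)]
    return noise + attributes


raw = [[3.0, 4.0], [0.1, 0.0]]            # two toy 2-D capsules (made-up values)
attrs = encode_attributes(raw)
z1 = generator_input(attrs, noise_dim=8)  # generation: noise + attributes
z2 = generator_input(attrs)               # compression: attributes only
```

In this sketch a long raw capsule such as `[3.0, 4.0]` keeps its direction while its length is squashed to just under 1, and a short one shrinks toward 0, which is how a capsule's length can act as a presence score while its direction carries attributes such as pose.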

Acknowledgements

This work was supported in part by the National Key R&D Program of China under Grant 2019YFB1406302, the National Natural Science Foundation of China under Grants 61370137 and 61902016, and the Ministry of Education-China Mobile Research Foundation Project under Grant 2016/2-7.

Author information

Authors and Affiliations

School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
Chao Xiang & Zhendong Niu
Research Institute of China Telecom Corporation Limited, Beijing, China
Chao Xiang, Minglan Su, Chaoying Zhang, Feng Wang & Mingchuan Yang

Corresponding author

Correspondence to Zhendong Niu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xiang, C., Su, M., Zhang, C. et al. E-CapsGan: Generative adversarial network using capsule network as feature encoder. Multimed Tools Appl 81, 26425–26442 (2022). https://doi.org/10.1007/s11042-022-12279-3

Received: 28 April 2021
Revised: 18 August 2021
Accepted: 14 January 2022
Published: 28 March 2022
Issue Date: July 2022
DOI: https://doi.org/10.1007/s11042-022-12279-3
