
E-CapsGan: Generative adversarial network using capsule network as feature encoder

Multimedia Tools and Applications

Abstract

We explore using the theory of the Capsule Network (CapsNet) in the Generative Adversarial Network (GAN). Traditional convolutional neural networks (CNNs) cannot model the spatial relationship between parts and the whole, so they lose some of a target's attribute information, such as orientation and pose. The Capsule Network, proposed by Hinton et al. in 2017, overcomes this defect of CNNs. To exploit the target's attributes as fully as possible, we propose E-CapsGan, which applies a CapsNet to encode the attribute features of the input image and guide the data generation of the GAN. We explore the application of E-CapsGan in two scenarios. For image generation, we propose E-CapsGan1, which uses a CapsNet as an additional attribute-feature encoder whose output guides the GAN. For image compression encoding, we explore E-CapsGan2, which employs a CapsNet as the encoder to compress images into vectors and a GAN as the decoder to reconstruct the original images from those vectors. Qualitative and quantitative experiments on multiple datasets demonstrate the superior performance of E-CapsGan1 in image generation and the feasibility of E-CapsGan2 in image compression encoding.
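
To make the encoder's role concrete, the sketch below illustrates the E-CapsGan1 idea: a primary-capsule encoder turns an image into an attribute-feature vector, and that vector is concatenated with noise to guide a DCGAN-style generator. This is a minimal illustration assuming PyTorch and 28x28 grayscale inputs such as MNIST; the class names (CapsuleEncoder, Generator) and all layer sizes are our own illustrative choices, not the authors' implementation. For E-CapsGan2, the same encoder output would instead act as the compressed code from which a generator reconstructs the image.

import torch
import torch.nn as nn
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    # CapsNet squashing non-linearity: keeps each capsule's orientation
    # and maps its length into [0, 1).
    norm_sq = (s * s).sum(dim=dim, keepdim=True)
    return (norm_sq / (1.0 + norm_sq)) * s / torch.sqrt(norm_sq + eps)

class CapsuleEncoder(nn.Module):
    # Simplified stand-in for the CapsNet encoder: a conv layer, primary
    # capsules, squash, then a linear projection to a flat attribute vector.
    def __init__(self, caps_dim=8, n_caps_maps=32, feat_dim=128):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 256, kernel_size=9)                 # 28x28 -> 20x20
        self.primary = nn.Conv2d(256, n_caps_maps * caps_dim,
                                 kernel_size=9, stride=2)             # 20x20 -> 6x6
        self.caps_dim = caps_dim
        self.project = nn.Linear(n_caps_maps * 6 * 6 * caps_dim, feat_dim)

    def forward(self, x):
        h = F.relu(self.conv1(x))
        u = self.primary(h).view(x.size(0), -1, self.caps_dim)        # (B, 1152, 8) capsules
        u = squash(u)
        return self.project(u.flatten(1))                             # (B, feat_dim) attribute features

class Generator(nn.Module):
    # DCGAN-style generator whose input is noise concatenated with the
    # capsule attribute features; the features guide the synthesis.
    def __init__(self, z_dim=100, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + feat_dim, 128 * 7 * 7), nn.ReLU(True),
            nn.Unflatten(1, (128, 7, 7)),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(True),  # 7x7 -> 14x14
            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1), nn.Tanh(),        # 14x14 -> 28x28
        )

    def forward(self, z, caps_feat):
        return self.net(torch.cat([z, caps_feat], dim=1))

# Forward pass: capsule features of a real image condition the generator.
encoder, gen = CapsuleEncoder(), Generator()
real = torch.randn(4, 1, 28, 28)
z = torch.randn(4, 100)
fake = gen(z, encoder(real))                                          # (4, 1, 28, 28)

The discriminator and adversarial losses are omitted here; any standard GAN objective (for example the DCGAN or least-squares loss) could drive training of this sketch.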

References

  1. AlBahar B, Huang JB (2019) Guided image-to-image translation with bi-directional feature transformation[C]. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 9016–9025

  2. Chang S, Liu J (2020) Multi-lane capsule network for classifying images with complex background[J]. IEEE Access 8:79876–79886

  3. Chen Z, Crandall D (2018) Generalized capsule networks with trainable routing procedure[J]. arXiv:1808.08692

  4. Goodfellow IJ, Pouget-Abadie J, Mirza M et al (2014) Generative adversarial networks[J]. Adv Neural Inf Process Syst 3:2672–2680

  5. Gu J, Tresp V (2020) Improving the robustness of capsule networks to image affine transformations[C]. In: IEEE/CVF conference on computer vision and pattern recognition (CVPR). IEEE

  6. Hinton GE, Krizhevsky A, Wang SD (2011) Transforming auto-encoders[C]. In: International conference on artificial neural networks. Springer, Berlin, Heidelberg, p 44–51

  7. Jaiswal A, AbdAlmageed W, Wu Y et al (2018) Capsulegan: Generative adversarial capsule network[C]. In: Proceedings of the European conference on computer vision (ECCV) workshops, p 0–0

  8. Karras T, Laine S, Aila T (2019) A style-based generator architecture for generative adversarial networks[C]. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, p 4401–4410

  9. Kinli F, Ozcan B, Kirac F (2019) Fashion image retrieval with capsule networks[C]. In: Proceedings of the IEEE/CVF international conference on computer vision workshops, p 0–0

  10. Kurach K, Lučić M, Zhai X et al (2019) A large-scale study on regularization and normalization in GANs[C]. In: International conference on machine learning. PMLR, p 3581–3590

  11. LeCun Y (1998) The MNIST database of handwritten digits[J]. http://www.yann.lecun.com/exdb/mnist/

  12. Ledig C, Theis L, Huszár F et al (2017) Photo-realistic single image super-resolution using a generative adversarial network[C]. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4681–4690

  13. Li C, Wang Z, Qi H (2018) Fast-converging conditional generative adversarial networks for image synthesis[C]. In: 2018 25th IEEE International Conference on Image Processing (ICIP). IEEE, p 2132–2136

  14. Liu H, Gu X, Samaras D (2019) Wasserstein gan with quadratic transport cost[C]. In: Proceedings of the IEEE/CVF international conference on computer vision, p 4832–4841

  15. Ma T (2018) Generalization and equilibrium in generative adversarial nets (GANs)(invited talk)[C]. In: Proceedings of the 50th annual ACM SIGACT symposium on theory of computing, p 2–2

  16. Mao X, Li Q, Xie H et al (2017) Least squares generative adversarial networks[C]. In: Proceedings of the IEEE international conference on computer vision, p 2794–2802

  17. Marusaki K, Watanabe H (2020) Capsule GAN using capsule network for generator architecture[J]. arXiv:2003.08047

  18. Mirza M, Osindero S (2014) Conditional generative adversarial nets[J]. arXiv:1411.1784

  19. Miyato T, Kataoka T, Koyama M et al (2018) Spectral normalization for generative adversarial networks[J]. arXiv:1802.05957

  20. Mukhometzianov R, Carrillo J (2018) CapsNet comparative performance evaluation for image classification[J]. arXiv:1805.11195

  21. Nguyen HH, Yamagishi J, Echizen I (2019) Use of a capsule network to detect fake images and videos[J]. arXiv:1910.12467

  22. Parkhi OM, Vedaldi A, Zisserman A et al (2012) Cats and dogs[C]. In: 2012 IEEE conference on computer vision and pattern recognition. IEEE, p 3498–3505

  23. Radford A, Metz L, Chintala S (2015) Unsupervised representation learning with deep convolutional generative adversarial networks[J]. arXiv:1511.06434

  24. Rawlinson D, Ahmed A, Kowadlo G (2018) Sparse unsupervised capsules generalize better[J]. arXiv:1804.06094

  25. Sabour S, Frosst N, Hinton GE (2017) Dynamic routing between capsules[J]. arXiv:1710.09829

  26. Shi H, Wang L, Ding G et al (2018) Data augmentation with improved generative adversarial networks[C]. In: 2018 24th International Conference on Pattern Recognition (ICPR). IEEE, p 73–78

  27. Wang C, Xu C, Wang C et al (2018) Perceptual adversarial networks for image-to-image transformation[J]. IEEE Trans Image Process 27 (8):4066–4079

  28. Wang D, Liu Q (2018) An optimization view on dynamic routing between capsules[J]

  29. Wang X, Yu K, Wu S et al (2018) Esrgan: Enhanced super-resolution generative adversarial networks[C]. In: Proceedings of the European conference on computer vision (ECCV) workshops, p 0–0

  30. Xian W, Sangkloy P, Agrawal V et al (2018) Texturegan: Controlling deep image synthesis with texture patches[C]. In: Proceedings of the IEEE conference on computer vision and pattern recognition, p 8456–8465

  31. Xiao H, Rasul K, Vollgraf R (2017) Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms[J]. arXiv:1708.07747

  32. Yang M, Zhao W, Ye J et al (2018) Investigating capsule networks with dynamic routing for text classification[C]. In: Proceedings of the 2018 conference on empirical methods in natural language processing, p 3110–3119

  33. Zeiler MD, Fergus R (2013) Stochastic pooling for regularization of deep convolutional neural networks[J]. arXiv:1301.3557

  34. Zhang B, Xu X, Yang M et al (2018) Cross-domain sentiment classification by capsule network with semantic rules[J]. IEEE Access 6:58284–58294

  35. Zhu JY, Park T, Isola P et al (2017) Unpaired image-to-image translation using cycle-consistent adversarial networks[C]. In: Proceedings of the IEEE international conference on computer vision, p 2223–2232

  36. Zhu M, Pan P, Chen W et al (2019) Dm-gan: Dynamic memory generative adversarial networks for text-to-image synthesis[C]. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, p 5802–5810

Acknowledgements

This work was supported in part by the National Key R&D Program of China under Grant 2019YFB1406302, by the National Natural Science Foundation of China under Grants 61370137 and 61902016, and by the Ministry of Education-China Mobile Research Foundation Project under Grant 2016/2-7.

Author information

Corresponding author

Correspondence to Zhendong Niu.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xiang, C., Su, M., Zhang, C. et al. E-CapsGan: Generative adversarial network using capsule network as feature encoder. Multimed Tools Appl 81, 26425–26442 (2022). https://doi.org/10.1007/s11042-022-12279-3

  • DOI: https://doi.org/10.1007/s11042-022-12279-3
