Abstract
We explore the use of Capsule Networks (CapsNet) in Generative Adversarial Networks (GANs). Traditional Convolutional Neural Networks (CNNs) cannot model the spatial relationships between parts and wholes, so they lose some of the target's attribute information, such as orientation and pose. The Capsule Network, proposed by Hinton et al. in 2017, addresses this shortcoming of CNNs. To make use of as much of the target's attribute information as possible, we propose E-CapsGan, which applies a CapsNet to encode the attribute features of the input image and uses them to guide data generation in the GAN. We study E-CapsGan in two scenarios. For image generation, we propose E-CapsGan1, which uses the CapsNet as an additional attribute feature encoder whose output guides the GAN. For image compression encoding, we explore E-CapsGan2, which employs the CapsNet as the encoder that compresses images into vectors and the GAN as the decoder that reconstructs the original images from those vectors. Qualitative and quantitative experiments on multiple datasets demonstrate the superior performance of E-CapsGan1 in image generation and the feasibility of E-CapsGan2 for image compression encoding.
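The encoder-guides-generator idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the capsule features are fed to the GAN, so the concatenation in `generator_input` is an assumption, and the helper names and the sizes used (10 capsules of 16 dimensions, 100-dimensional noise, a 28×28 grayscale image) are hypothetical values chosen for illustration. Only `squash` follows a published definition, the standard capsule squashing nonlinearity, which preserves a vector's direction and maps its length into [0, 1).

```python
import math
import random


def squash(s, eps=1e-9):
    """Capsule squashing nonlinearity: keeps direction, maps length into [0, 1)."""
    sq_norm = sum(x * x for x in s)
    norm = math.sqrt(sq_norm)
    # scale = ||s||^2 / (1 + ||s||^2) / ||s||; eps guards the zero vector
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


def encode_attributes(raw_capsules):
    """Hypothetical final encoder stage: squash each raw capsule vector."""
    return [squash(c) for c in raw_capsules]


def generator_input(noise, capsules):
    """Assumed guidance scheme: concatenate noise with the flattened capsules."""
    flat = [x for c in capsules for x in c]
    return list(noise) + flat


def compression_ratio(height, width, channels, num_capsules, capsule_dim):
    """Pixel values per code value when an image is stored as capsule vectors."""
    return (height * width * channels) / (num_capsules * capsule_dim)


rng = random.Random(0)
# Stand-in for raw encoder activations: 10 capsules x 16 dims (illustrative sizes).
raw = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(10)]
caps = encode_attributes(raw)

# E-CapsGan1-style use: attribute vectors accompany the noise vector.
z = [rng.gauss(0.0, 1.0) for _ in range(100)]
g_in = generator_input(z, caps)

# E-CapsGan2-style use: the capsule vectors alone act as the compressed code.
ratio = compression_ratio(28, 28, 1, 10, 16)
```

Under these assumed sizes the generator receives a 260-dimensional input, and a 784-value image is represented by 160 code values. The actual dimensions and conditioning mechanism used in the paper may differ.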
Acknowledgements
This work was supported in part by the National Key R&D Program of China under Grant 2019YFB1406302, the National Natural Science Foundation of China under Grants 61370137 and 61902016, and the Ministry of Education-China Mobile Research Foundation Project under Grant 2016/2-7.
Cite this article
Xiang, C., Su, M., Zhang, C. et al. E-CapsGan: Generative adversarial network using capsule network as feature encoder. Multimed Tools Appl 81, 26425–26442 (2022). https://doi.org/10.1007/s11042-022-12279-3
DOI: https://doi.org/10.1007/s11042-022-12279-3