Abstract
We study image-to-image translation and synthetic image generation. In the fashion industry, there is still no established model that generates popular synthetic images based on users' opinions. This paper combines generative adversarial networks (GANs), deep learning, and user opinions to create popular images. Our proposed model consists of two modules: a popularity module that estimates the intrinsic popularity of images without considering the effects of non-visual factors, and a translation module that converts unpopular images into popular ones. The model also performs multi-dimensional and multi-domain translation. We use ResNet50 as the default deep neural network, replacing its last layer with a fully connected layer, and train the network on a new dataset collected from Instagram. We evaluate the proposed method using FID, LPIPS, and a popularity index in different scenarios. The results show at least 60% and 25% improvements in FID and LPIPS, respectively, for color-to-color image translation, confirming the quality and diversity of the images generated by the proposed method. The evaluation of the popularity score also confirms that content-based translation is more effective than style-based translation in terms of popularity.
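A minimal sketch of the popularity module described above, assuming a PyTorch/torchvision setup (this is an illustration, not the authors' released code): a pretrained ResNet50 backbone whose final classification layer is replaced by a fully connected layer that outputs a single intrinsic-popularity score.

import torch
import torch.nn as nn
from torchvision import models

class PopularityModule(nn.Module):
    # ResNet50 with its last layer swapped for a fully connected scoring head.
    def __init__(self):
        super().__init__()
        self.backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        in_features = self.backbone.fc.in_features  # 2048 for ResNet50
        # Replace the 1000-class ImageNet head with a single-output FC layer.
        self.backbone.fc = nn.Linear(in_features, 1)

    def forward(self, x):
        # x: a batch of images with shape (N, 3, 224, 224)
        return self.backbone(x).squeeze(1)  # (N,) intrinsic-popularity scores

model = PopularityModule().eval()
with torch.no_grad():
    scores = model(torch.randn(2, 3, 224, 224))  # two random test images
print(scores.shape)  # torch.Size([2])

Training such a head as a regressor or ranker on (image, popularity) pairs, for example collected from Instagram, and then using its score to guide a GAN-based translation module corresponds to the overall pipeline the abstract describes.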
Ethics declarations
The authors declare that there are no conflicts of interest regarding the publication of this paper.
About this article
Cite this article
Nezhad, N.M., Mirtaheri, S.L. & Shahbazian, R. Popular image generation based on popularity measures by generative adversarial networks. Multimed Tools Appl 82, 20873–20897 (2023). https://doi.org/10.1007/s11042-022-14090-6