Abstract
We study image-to-image translation and synthetic image generation. In the fashion industry, there is still no established model that generates popular synthetic images based on users' opinions. This paper combines generative adversarial networks (GANs), deep learning, and user opinions to create popular images. Our proposed model consists of two modules: a popularity module that estimates the intrinsic popularity of images without considering the effects of non-visual factors, and a translation module that converts unpopular images into popular ones. The model also performs multi-dimensional and multi-domain translation. We use ResNet50 as the default deep neural network, replacing its last layer with a fully connected layer, and train the network on a new dataset collected from Instagram. We evaluate the proposed method using FID, LPIPS, and a popularity index in different scenarios. The results show at least 60% and 25% improvements in FID and LPIPS, respectively, for color-to-color image translation, confirming the quality and diversity of the images generated by the proposed method. The evaluation of the popularity score also confirms that content-based translation is more effective than style-based translation in terms of popularity.
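A minimal sketch of the popularity module described above, assuming a PyTorch/torchvision setup (this is an illustration, not the authors' released code): a pretrained ResNet50 backbone whose final classification layer is replaced by a fully connected layer that outputs a single intrinsic-popularity score.

import torch
import torch.nn as nn
from torchvision import models

class PopularityModule(nn.Module):
    # ResNet50 with its last layer swapped for a fully connected scoring head.
    def __init__(self):
        super().__init__()
        self.backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        in_features = self.backbone.fc.in_features  # 2048 for ResNet50
        # Replace the 1000-class ImageNet head with a single-output FC layer.
        self.backbone.fc = nn.Linear(in_features, 1)

    def forward(self, x):
        # x: a batch of images with shape (N, 3, 224, 224)
        return self.backbone(x).squeeze(1)  # (N,) intrinsic-popularity scores

model = PopularityModule().eval()
with torch.no_grad():
    scores = model(torch.randn(2, 3, 224, 224))  # two random test images
print(scores.shape)  # torch.Size([2])

Training such a head as a regressor or ranker on (image, popularity) pairs, for example collected from Instagram, and then using its score to guide a GAN-based translation module corresponds to the overall pipeline the abstract describes.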
Ethics declarations
The authors declare that there are no conflicts of interest regarding the publication of this paper.
About this article
Cite this article
Nezhad, N.M., Mirtaheri, S.L. & Shahbazian, R. Popular image generation based on popularity measures by generative adversarial networks. Multimed Tools Appl 82, 20873–20897 (2023). https://doi.org/10.1007/s11042-022-14090-6