
View adjustment: helping users improve photographic composition

  • Regular Paper
  • Published in: Multimedia Systems

Abstract

Image composition has long been a primary concern in professional photography: good composition can greatly enhance the perceived quality of an image, yet most ordinary users find it difficult to compose images quickly. To address this challenge, we propose a new view adjustment method that helps users adjust their views in real time to improve image composition. We establish a composition feature extraction module that extracts composition information from images according to professional photographic composition rules. On top of this module, we build an image aesthetic assessment network to score adjustment results. We also draw on the hyper-net concept to design a parameter generation-prediction network, enabling the application of view adjustment rules. Furthermore, we construct a view adjustment dataset from an existing image cropping dataset; it contains samples of displacement, scale, and rotation adjustments, as well as their combinations, to support model training. In experiments, our model achieves an MAE of 0.0392, an MSE of 0.0037, and an IoU of 0.783. These results indicate that the proposed view adjustment method effectively improves image composition and can guide users in real time to capture well-composed photographs.
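The abstract reports MAE, MSE, and IoU scores for the predicted view adjustments. As a minimal sketch of how such metrics could be computed, assuming the predicted and ground-truth views are axis-aligned rectangles in normalized image coordinates (the paper's exact view parameterization, which also covers rotation, may differ):

```python
# Hedged sketch: evaluation metrics for a predicted view rectangle against
# the ground truth. Boxes are (x1, y1, x2, y2) in normalized coordinates;
# the parameterization here is an illustrative assumption, not the paper's.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def mae_mse(pred, gt):
    """Mean absolute and mean squared error over box coordinates."""
    diffs = [p - g for p, g in zip(pred, gt)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    mse = sum(d * d for d in diffs) / len(diffs)
    return mae, mse

pred = (0.10, 0.10, 0.90, 0.85)
gt   = (0.12, 0.08, 0.92, 0.88)
print(iou(pred, gt))      # close to 1 for a well-adjusted view
print(mae_mse(pred, gt))  # small errors for a well-adjusted view
```

A high IoU (such as the reported 0.783) means the adjusted view overlaps strongly with the expert-chosen view, while MAE and MSE measure coordinate-level deviation.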


Data availability

No datasets were generated or analysed during the current study.


Author information

Authors and Affiliations

Authors

Contributions

Nan Sheng: methodology, software, writing—original draft. Yongzhen Ke: conceptualization, methodology, supervision, project administration, writing—review & editing. Shuai Yang: software, writing—review & editing, formal analysis, visualization. Yong Yang: methodology, writing—review & editing. Liming Chen: resources, validation, visualization.

Corresponding author

Correspondence to Yongzhen Ke.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Communicated by Bing-kun Bao.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Sheng, N., Ke, Y., Yang, S. et al. View adjustment: helping users improve photographic composition. Multimedia Systems 30, 293 (2024). https://doi.org/10.1007/s00530-024-01490-x


  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1007/s00530-024-01490-x
