Sketch-to-photo face generation based on semantic consistency preserving and similar connected component refinement

  • Original article
  • The Visual Computer

Abstract

Sketch-to-photo face generation has recently gained considerable attention in the computer vision and signal processing communities, because sketches drawn with concise lines are easy to obtain and conveniently describe significant facial attributes. Most existing sketch-to-photo methods fail to maintain geometric structure and improve local details simultaneously, which limits their performance. In this work, we propose a two-stage sketch-to-photo generative adversarial network for face generation. In the first stage, we propose a semantic loss to maintain semantic consistency. In the second stage, we define the similar connected component and propose a color refinement loss to generate fine-grained details. Moreover, we introduce a multi-scale discriminator and design a patch-level local discriminator. We also propose a texture loss to enhance the local fidelity of synthesized images. Experiments show that the proposed method generates significantly better results than state-of-the-art methods while preserving facial attributes.
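
To make the second-stage idea more concrete, the following is a minimal sketch of grouping pixels into connected components of similar color and measuring how far each pixel deviates from the mean of its component. The abstract does not state how the similar connected component or the color refinement loss is defined, so the 4-connectivity, the threshold `tol`, the grayscale grid, and both function names are assumptions made for illustration, not the authors' formulation.

```python
from collections import deque


def similar_components(img, tol=10):
    """Label 4-connected regions whose adjacent pixels differ by <= tol.

    `img` is a list of rows of grayscale ints. The similarity rule and
    `tol` are illustrative assumptions, not the paper's definition.
    """
    h, w = len(img), len(img[0])
    labels = [[-1] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] != -1:
                continue
            # Breadth-first flood fill from an unlabeled seed pixel.
            labels[sy][sx] = next_label
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == -1
                            and abs(img[ny][nx] - img[y][x]) <= tol):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label


def component_color_deviation(img, labels, n):
    """Mean absolute deviation of each pixel from its component's mean.

    A hypothetical stand-in for a color refinement penalty: it is zero
    when every component is uniformly colored.
    """
    sums, counts = [0.0] * n, [0] * n
    for row, lrow in zip(img, labels):
        for value, label in zip(row, lrow):
            sums[label] += value
            counts[label] += 1
    means = [s / c for s, c in zip(sums, counts)]
    total = sum(abs(value - means[label])
                for row, lrow in zip(img, labels)
                for value, label in zip(row, lrow))
    return total / (len(img) * len(img[0]))
```

Calling `similar_components` on a small grid returns a label map and the number of components; passing both to `component_color_deviation` gives a single scalar. Because the flood fill compares each pixel only with its immediate neighbor, a smooth gradient can merge into one component, which is one of several design choices a real implementation would need to settle.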

Acknowledgements

This work is supported by the National Key Research and Development Program of China (No. 2019YFC1521104), National Natural Science Foundation of China (No. 61972157), and Fundamental Research Funds for the Central Universities (No. 2021QN1072).

Author information

Corresponding authors

Correspondence to Zhiwen Shao or Lizhuang Ma.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 187 KB)

About this article

Cite this article

Li, L., Tang, J., Shao, Z. et al. Sketch-to-photo face generation based on semantic consistency preserving and similar connected component refinement. Vis Comput 38, 3577–3594 (2022). https://doi.org/10.1007/s00371-021-02188-1

  • DOI: https://doi.org/10.1007/s00371-021-02188-1
