DOI: 10.1145/3503161.3547959

Towards High-Fidelity Face Normal Estimation

Published: 10 October 2022

ABSTRACT

While existing face normal estimation methods have produced promising results on small datasets, they often suffer from severe performance degradation on diverse in-the-wild face images, especially for high-fidelity face normal estimation. Training a high-fidelity face normal estimation model that generalizes well requires a large amount of training data with face normal ground truth. Since collecting such a high-fidelity database is difficult in practice, current methods are prevented from recovering face normals with fine-grained geometric details. To mitigate this issue, we propose a coarse-to-fine framework that estimates the face normal from an in-the-wild image with only a coarse exemplar reference. Specifically, we first train a model on limited training data to estimate the coarse normal of a real face image. Then, we use the estimated coarse normal as an exemplar and devise an exemplar-based normal estimation network to learn a robust mapping from the input face image to the fine-grained normal. In this manner, our method largely alleviates the negative impact of scarce training data and focuses on recovering the high-fidelity normal contained in natural images. Extensive experiments and ablation studies demonstrate the efficacy of our design and reveal its superiority over state-of-the-art methods in terms of both training data requirements and the recovery quality of fine-grained face normals. Our code is available at https://github.com/AutoHDR/HFFNE.
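The abstract describes a two-stage, coarse-to-fine pipeline: a first network predicts a coarse normal map from the face image, and a second, exemplar-conditioned network refines it into a high-fidelity normal map. The sketch below illustrates that idea in PyTorch only at the interface level; the module names, layer sizes, and residual-style refinement are illustrative assumptions, not the authors' actual architecture (see the released code at the repository linked above for the real implementation).

```python
# Minimal, hypothetical sketch of a coarse-to-fine face normal pipeline.
# Stage 1 maps an RGB face image to a coarse normal map; stage 2 takes the
# image plus the coarse normal (used as an exemplar) and outputs a refined map.
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions with ReLU, preserving spatial resolution."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )


class CoarseNormalNet(nn.Module):
    """Stage 1: small encoder-decoder from RGB image to a coarse normal map."""
    def __init__(self):
        super().__init__()
        self.enc = conv_block(3, 32)
        self.down = nn.MaxPool2d(2)
        self.mid = conv_block(32, 64)
        self.dec = conv_block(64 + 32, 32)
        self.out = nn.Conv2d(32, 3, 1)

    def forward(self, img):
        e = self.enc(img)
        m = self.mid(self.down(e))
        m = F.interpolate(m, scale_factor=2, mode="bilinear", align_corners=False)
        d = self.dec(torch.cat([m, e], dim=1))
        # Normals are unit vectors, so normalize the 3-channel output per pixel.
        return F.normalize(self.out(d), dim=1)


class ExemplarRefineNet(nn.Module):
    """Stage 2: refines the normal map given the image and the coarse exemplar."""
    def __init__(self):
        super().__init__()
        self.body = conv_block(3 + 3, 64)  # RGB image concatenated with coarse normal
        self.out = nn.Conv2d(64, 3, 1)

    def forward(self, img, coarse_normal):
        x = torch.cat([img, coarse_normal], dim=1)
        residual = self.out(self.body(x))
        # Predict a residual on top of the coarse estimate, then re-normalize.
        return F.normalize(coarse_normal + residual, dim=1)


if __name__ == "__main__":
    img = torch.rand(1, 3, 128, 128)         # dummy in-the-wild face crop
    coarse = CoarseNormalNet()(img)          # stage 1: coarse exemplar
    fine = ExemplarRefineNet()(img, coarse)  # stage 2: refined normal map
    print(coarse.shape, fine.shape)          # both (1, 3, 128, 128)
```

The two stages can be trained separately: stage 1 on the limited data with ground-truth normals, stage 2 conditioned on stage 1's output, which is the general structure the abstract describes.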


Supplemental Material

MM22-mmfp0865.mp4 (mp4, 160.7 MB)


Published in

MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022, 7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161
Copyright © 2022 ACM

Publisher

Association for Computing Machinery, New York, NY, United States


Qualifiers

research-article

Acceptance Rates

Overall Acceptance Rate: 995 of 4,171 submissions, 24%

