Scattering-based hybrid network for facial attribute classification

  • Research Article
  • Frontiers of Computer Science

Abstract

Face attribute classification (FAC) is a high-profile problem in biometric verification and face retrieval. Although recent research has focused on extracting finer-grained attribute features and exploiting inter-attribute correlations, significant challenges remain. The wavelet scattering transform (WST) is a promising non-learned feature extractor: it has been shown to yield more discriminative representations and to outperform learned representations in certain tasks. Applied to image classification, WST can enhance subtle texture information and provide stability to local deformations. This paper designs a scattering-based hybrid block, termed the WS-SE block, that combines frequency-domain (WST) features with image-domain features through channel attention (Squeeze-and-Excitation, SE). Compared with a CNN, WS-SE achieves more efficient FAC performance and compensates for the model's sensitivity to small-scale affine transforms. In addition, to further exploit the relationships among attribute labels, we propose a learning strategy from a causal view: cause attributes, defined using causality-related information, can be used to infer effect attributes with a high confidence level. Ablation experiments demonstrate the effectiveness of our model, and the hybrid model obtains state-of-the-art results on two public datasets.
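
To make the idea of fusing scattering and image-domain features by channel attention more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It rests on several assumptions: a one-level Haar wavelet modulus stands in for a real WST, random arrays stand in for CNN feature maps and for the learned SE weights, and the function names (`haar_modulus_features`, `se_reweight`, `ws_se_block`) are hypothetical.

```python
import numpy as np

def haar_modulus_features(img):
    """Scattering-like first-order features at half resolution.

    Assumption: a one-level Haar decomposition followed by a modulus,
    used here as a simplified stand-in for a real WST.
    """
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    low = (a + b + c + d) / 4.0            # local average (low-pass)
    horiz = np.abs(a + b - c - d) / 4.0    # |horizontal-edge detail|
    vert = np.abs(a - b + c - d) / 4.0     # |vertical-edge detail|
    diag = np.abs(a - b - c + d) / 4.0     # |diagonal detail|
    return np.stack([low, horiz, vert, diag])  # shape (4, H/2, W/2)

def se_reweight(feats, w1, w2):
    """Squeeze-and-Excitation style channel attention on (C, H, W) features."""
    squeezed = feats.mean(axis=(1, 2))            # squeeze: global average pool
    hidden = np.maximum(w1 @ squeezed, 0.0)       # bottleneck FC + ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # FC + sigmoid, one gate per channel
    return feats * gates[:, None, None], gates    # excite: rescale each channel

def ws_se_block(img, image_feats, w1, w2):
    """Concatenate scattering-like and image-domain channels, then apply SE."""
    wst = haar_modulus_features(img)
    fused = np.concatenate([wst, image_feats], axis=0)
    return se_reweight(fused, w1, w2)

rng = np.random.default_rng(0)
img = rng.random((8, 8))                 # toy grayscale image
image_feats = rng.random((4, 4, 4))      # placeholder for CNN feature maps
C, r = 8, 2                              # channels and SE reduction ratio (assumed)
w1 = rng.normal(size=(C // r, C))        # placeholder for learned SE weights
w2 = rng.normal(size=(C, C // r))
out, gates = ws_se_block(img, image_feats, w1, w2)
print(out.shape, gates.shape)
```

In this sketch the SE gates let the block weight wavelet-derived channels against image-domain channels on a per-input basis. The actual block may combine the two feature streams differently.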

Acknowledgements

This work was supported by the National Key Research and Development Project of China (Grant No. 2018AAA0100802), Opening Foundation of National Engineering Laboratory for Intelligent Video Analysis and Application, and Experimental Center of Artificial Intelligence of Beijing Normal University.

Author information

Corresponding author

Correspondence to Fuqing Duan.

Additional information

Na Liu received the BS degree from Xidian University, China in 2016. She is currently pursuing the PhD degree at Beijing Normal University, China. Her research interests include image processing, face attribute estimation, and deep learning.

Fan Zhang received the BE degree from Beijing University of Posts and Telecommunications, China in 2016. She is currently a PhD candidate at Beijing Normal University, China. Her research interests include image processing and deep learning.

Liang Chang received the bachelor’s and master’s degrees from Wuhan University, China and the PhD degree from the Institute of Automation, Chinese Academy of Sciences, China. She is currently an associate professor with the School of Artificial Intelligence, Beijing Normal University, China. Her research interests include computer vision and machine learning.

Fuqing Duan received the BS and MS degrees in mathematics from Northwest University, China in 1995 and 1998, respectively, and the PhD degree in pattern recognition from the National Laboratory of Pattern Recognition, China in 2006. He is currently a Professor with the School of Artificial Intelligence, Beijing Normal University, China. His current research interests include 3D reconstruction, skull identification, and machine learning and its applications. He has authored more than 80 conference and journal articles on related topics.

Cite this article

Liu, N., Zhang, F., Chang, L. et al. Scattering-based hybrid network for facial attribute classification. Front. Comput. Sci. 18, 183313 (2024). https://doi.org/10.1007/s11704-023-2570-6

  • DOI: https://doi.org/10.1007/s11704-023-2570-6
