Abstract
Decrypting the secret of beauty or attractiveness has been a pursuit of artists and philosophers for centuries. Computational models for attractiveness estimation have been actively explored in the computer vision and multimedia communities, yet with a focus mainly on facial features. In this article, we conduct a comprehensive study of female attractiveness conveyed by single or multiple modalities of cues, that is, face, dressing, and/or voice, and aim to discover how the different modalities individually and collectively affect the human sense of beauty. To investigate the problem extensively, we collect the Multi-Modality Beauty (M2B) dataset, which is annotated with attractiveness levels converted from manual k-wise ratings and with semantic attributes of the different modalities. Inspired by the common consensus that middle-level attribute prediction can assist higher-level computer vision tasks, we manually label many attributes for each modality. Next, we propose a tri-layer Dual-supervised Feature-Attribute-Task (DFAT) network to jointly learn the attribute model and the attractiveness model for single or multiple modalities. To remedy the possible loss of information caused by incomplete manual attributes, we also propose a novel Latent Dual-supervised Feature-Attribute-Task (LDFAT) network, in which latent attributes are combined with manual attributes to contribute to the final attractiveness estimation. Extensive experimental evaluations on the collected M2B dataset demonstrate the effectiveness of the proposed DFAT and LDFAT networks for female attractiveness prediction.
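The tri-layer, dual-supervised idea can be illustrated with a minimal sketch. It is not the DFAT network itself: the abstract does not specify layer sizes, activations, or loss terms, so the sigmoid attribute layer, the linear task output, the squared-error losses, and the weighting parameter `alpha` below are all assumptions chosen only to show how supervision can be applied at both the attribute layer and the task layer.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W_attr, b_attr, w_task, b_task):
    """Feature layer -> attribute layer -> task (attractiveness) output.

    All parameter names are hypothetical; they are not taken from the paper.
    """
    # Middle layer: one sigmoid unit per manually labeled attribute.
    attrs = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W_attr, b_attr)
    ]
    # Top layer: a linear attractiveness score computed from the attributes.
    score = sum(w * a for w, a in zip(w_task, attrs)) + b_task
    return attrs, score


def dual_loss(attrs, score, attr_labels, target, alpha=0.5):
    """Weighted sum of an attribute-level loss and a task-level loss.

    alpha is an assumed trade-off weight, not a value from the source.
    """
    attr_loss = sum((a - t) ** 2 for a, t in zip(attrs, attr_labels)) / len(attrs)
    task_loss = (score - target) ** 2
    return alpha * attr_loss + (1.0 - alpha) * task_loss


# Toy example: two features, two attributes, all-zero attribute weights.
x = [0.0, 0.0]
W_attr = [[0.0, 0.0], [0.0, 0.0]]
b_attr = [0.0, 0.0]
w_task = [1.0, 1.0]
b_task = 0.0

attrs, score = forward(x, W_attr, b_attr, w_task, b_task)
loss = dual_loss(attrs, score, attr_labels=[1.0, 0.0], target=1.0)
```

In this toy setting the task prediction matches its target, so the whole loss comes from the attribute layer. That is the point of dual supervision: the middle layer is trained against attribute labels in addition to the final attractiveness target.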
Index Terms
- Towards decrypting attractiveness via multi-modality cues