Abstract
Matching face images across different modalities is a challenging open problem for various reasons, notably feature heterogeneity and, particularly in the case of sketch recognition, abstraction, exaggeration and distortion. Existing studies have attempted to address this task by engineering invariant features or by learning a common subspace between the modalities. In this paper, we take a different approach and explore learning a mid-level representation within each domain that allows faces in each modality to be compared in a domain-invariant way. In particular, we investigate sketch-photo face matching and go beyond the well-studied viewed sketches to tackle forensic sketches and caricatures, where representations are often symbolic. We approach this by learning a facial attribute model independently in each domain that represents faces in terms of semantic properties. This representation is thus more invariant to heterogeneity and distortions, and more robust to misalignment. Our intermediate-level attribute representation is then integrated synergistically with the original low-level features using canonical correlation analysis (CCA). Our framework shows impressive results on cross-modal matching tasks using forensic sketches, and even more challenging caricature sketches. Furthermore, we create a new dataset with \(\approx\)59,000 attribute annotations for evaluation and to facilitate future research.
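The fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how attributes and low-level features are combined, so the code below assumes one plausible reading: per-modality attribute scores are concatenated with low-level features, and CCA is learned between paired photo and sketch vectors. The data are synthetic, and all names (`fit_cca`, `inv_sqrt`, `make_view`, the dimensions and noise level) are hypothetical rather than taken from the paper.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

def fit_cca(X, Y, k, reg=1e-3):
    """Regularised CCA between paired rows of X and Y (assumed setup)."""
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations; singular vectors give the projections.
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky, full_matrices=False)
    return (mx, Kx @ U[:, :k]), (my, Ky @ Vt.T[:, :k]), s[:k]

# Synthetic stand-in data: a shared identity factor seen through
# different random linear maps in each modality (not real face data).
rng = np.random.default_rng(0)
n, d_latent = 200, 4
z = rng.normal(size=(n, d_latent))

def make_view(dim):
    mixing = rng.normal(size=(d_latent, dim))
    return z @ mixing + 0.05 * rng.normal(size=(n, dim))

# Assumption: attribute scores are simply concatenated with low-level features.
photo = np.hstack([make_view(20), make_view(6)])   # low-level + attributes
sketch = np.hstack([make_view(20), make_view(6)])

(mx, Wx), (my, Wy), corr = fit_cca(photo, sketch, k=d_latent)
P = (photo - mx) @ Wx      # photos in the shared space
S = (sketch - my) @ Wy     # sketches in the shared space

# Match each sketch to its nearest photo in the shared space.
dist = ((S[:, None, :] - P[None, :, :]) ** 2).sum(axis=-1)
pred = dist.argmin(axis=1)
rank1 = float((pred == np.arange(n)).mean())
print("canonical correlations:", np.round(corr, 3))
print("rank-1 matching accuracy on the toy data:", rank1)
```

The matching here is evaluated on the same pairs used to fit the projections, so the printed accuracy says nothing about real sketch-photo performance; it only shows the mechanics of projecting both modalities into one space and comparing them there.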
Notes
1. Attributes are chosen to be properties that differentiate groups of the population, such as male/female, Asian/white, young/old; thus an \(A\)-length attribute code can potentially differentiate \(2^A\) people, providing a highly discriminative representation.
2. Both datasets (especially the forensic one) exhibit limited attribute diversity: some attributes are always on or always off, which renders them meaningless for representation, so these are excluded for convenience.
3. Reference [8] did not release their attributes. Our attributes and corresponding annotations are available at http://www.eecs.qmul.ac.uk/~yzs/heteroface/.
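Notes 1 and 2 can be made concrete with a small sketch, assuming binary attribute annotations stored as tuples. The attribute names and toy annotations are invented for illustration and do not come from the released dataset; `drop_constant_attributes` and `code_capacity` are hypothetical helpers.

```python
from typing import List, Sequence, Tuple

def drop_constant_attributes(
    codes: Sequence[Tuple[int, ...]], names: Sequence[str]
) -> Tuple[List[Tuple[int, ...]], List[str]]:
    """Remove attributes that take the same value for every face (Note 2)."""
    keep = [j for j in range(len(names)) if len({row[j] for row in codes}) > 1]
    kept_names = [names[j] for j in keep]
    kept_codes = [tuple(row[j] for j in keep) for row in codes]
    return kept_codes, kept_names

def code_capacity(num_attributes: int) -> int:
    """Upper bound on distinct binary codes of this length (Note 1)."""
    return 2 ** num_attributes

# Toy annotations (hypothetical): the last attribute is always on.
names = ["male", "young", "glasses", "has_face"]
codes = [
    (1, 1, 0, 1),
    (0, 1, 1, 1),
    (1, 0, 0, 1),
    (0, 0, 1, 1),
]

kept_codes, kept_names = drop_constant_attributes(codes, names)
capacity = code_capacity(len(kept_names))
distinct = len(set(kept_codes))
print(kept_names, capacity, distinct)
```

The \(2^A\) figure is only an upper bound: in practice attributes are correlated and noisy, so the number of identities a code can separate is usually lower than the capacity.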
References
1. Klare, B., Li, Z., Jain, A.: Matching forensic sketches to mug shot photos. In: TPAMI, pp. 639–646 (2011)
2. Khan, Z., Hu, Y., Mian, A.: Facial self similarity for sketch to photo matching. In: Digital Image Computing Techniques and Applications (DICTA), pp. 1–7 (2012)
3. Kiani Galoogahi, H., Sim, T.: Face photo retrieval by sketch example. In: the 20th ACM International Conference on Multimedia, pp. 949–952 (2012)
4. Galoogahi, H., Sim, T.: Face sketch recognition by local radon binary pattern lrbp. In: ICIP, pp. 1837–1840 (2012)
5. Pramanik, S., Bhattacharjee, D.: Geometric feature based face-sketch recognition. In: Pattern Recognition, Informatics and Medical Engineering (PRIME), pp. 409–415 (2012)
6. Bhatt, H.S., Bharadwaj, S., Singh, R., Vatsa, M.: On matching sketches with digital face images. In: Biometrics: Theory Applications and Systems, pp. 1–7 (2010)
7. Choi, J., Sharma, A., Jacobs, D., Davis, L.: Data insufficiency in sketch versus photo face recognition. In: CVPR, pp. 1–8 (2012)
8. Klare, B., Bucak, S., Jain, A., Akgul, T.: Towards automated caricature recognition. In: The 5th IAPR International Conference on Biometrics Compendium, pp. 139–146 (2012)
9. Bhatt, H.S., Bharadwaj, S., Singh, R., Vatsa, M.: Memetic approach for matching sketches with digital face images. Indraprastha Institute of Information Technology Delhi, pp. 1–8 (2012)
10. Tang, X., Wang, X.: Face photo recognition using sketch. In: ICIP, pp. 257–260 (2002)
11. Wang, X., Tang, X.: Face photo-sketch synthesis and recognition. TPAMI 31, 1955–1967 (2009)
12. Sharma, A., Jacobs, D.W.: Bypassing synthesis PLS for face recognition with pose, low-resolution and sketch. In: CVPR, pp. 593–600 (2011)
13. Bhatt, H., Bharadwaj, S., Singh, R., Vatsa, M.: Memetically optimized mcwld for matching sketches with digital face images. IEEE Trans. Inf. Forensics Secur. 7, 1522–1535 (2012)
14. Galoogahi, H., Sim, T.: Inter-modality face sketch recognition. In: ICME (2012)
15. Uhl, R.G., Jr., da Vitoria Lobo, N.: A framework for recognizing a facial image from a police sketch. In: CVPR, pp. 586–593 (1996)
16. Bonnen, K., Klare, B., Jain, A.: Component-based representation in automated face recognition. IEEE Trans. Inf. Forensics Secur. 8, 239–253 (2013)
17. Rasiwasia, N., Costa Pereira, J., Coviello, E., Doyle, G., Lanckriet, G.R., Levy, R., Vasconcelos, N.: A new approach to cross-modal multimedia retrieval. In: Proceedings of the international conference on Multimedia, pp. 251–260 (2010)
18. Gong, Y., Ke, Q., Isard, M., Lazebnik, S.: A multi-view embedding space for modeling internet images, tags, and their semantics. IJCV 106, 210–233 (2014)
19. Wang, K., He, R., Wang, W., Wang, L., Tan, T.: Learning coupled feature spaces for cross-modal matching. In: ICCV, pp. 2088–2095 (2013)
20. Huang, D.A., Wang, Y.C.F.: Coupled dictionary and feature space learning with applications to cross-domain image synthesis and recognition. In: ICCV (2013)
21. Layne, R., Hospedales, T.M., Gong, S.: Person re-identification by attributes. In: BMVC, pp. 1–11 (2012)
22. Lampert, C.H., Nickisch, H., Harmeling, S.: Learning to detect unseen object classes by between-class attribute transfer. In: CVPR, pp. 951–958 (2009)
23. Fu, Y., Hospedales, T.M., Xiang, T., Fu, Z., Gong, S.: Transductive multi-view embedding for zero-shot recognition and annotation. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014, Part II. LNCS, vol. 8690, pp. 584–599. Springer, Heidelberg (2014)
24. Liu, J., Kuipers, B., Savarese, S.: Recognizing human actions by attributes. In: CVPR, pp. 3337–3344 (2011)
25. Fu, Y., Hospedales, T., Xiang, T., Gong, S.: Learning multimodal latent attributes. TPAMI 36, 303–316 (2014)
26. Luo, P., Wang, X., Tang, X.: A deep sum-product architecture for robust facial attributes analysis. In: ICCV, pp. 2864–2871 (2013)
27. Kumar, N., Berg, A.C., Belhumeur, P.N., Nayar, S.K.: Attribute and simile classifiers for face verification. In: ICCV, pp. 365–372 (2009)
28. Johnson, K.E.: Effects of knowledge and development on subordinate level categorization. Cognitive Dev. 13, 515–545 (1998)
29. Yao, B., Khosla, A., Fei-Fei, L.: Combining randomization and discrimination for fine-grained image categorization. In: CVPR, pp. 1577–1584 (2011)
30. Yi, D., Liu, R., Chu, R., Lei, Z., Li, S.Z.: Face matching between near infrared and visible light images. In: Lee, S.-W., Li, S.Z. (eds.) Advances in Biometrics. LNCS, vol. 4642. Springer, Heidelberg (2007)
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ouyang, S., Hospedales, T., Song, YZ., Li, X. (2015). Cross-Modal Face Matching: Beyond Viewed Sketches. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science(), vol 9004. Springer, Cham. https://doi.org/10.1007/978-3-319-16808-1_15
DOI: https://doi.org/10.1007/978-3-319-16808-1_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16807-4
Online ISBN: 978-3-319-16808-1
eBook Packages: Computer Science (R0)