Name-face association with web facial image supervision

  • Special Issue Paper
Multimedia Systems

Abstract

This paper describes methods for automatically associating faces detected in multimedia documents with the names that appear in the surrounding metadata. We consider the task in the image matching (IM) framework, where external Web facial images are automatically retrieved in advance as the gallery face set for each name, and a detected face is assigned to one of the names, or to none of them, according to the association score between the two kinds of faces and a set of constraints. Several important issues are investigated within the IM framework. In collecting Web facial images, beyond the basic scheme that uses only a celebrity name as the query to crawl facial images, a context-assisted image search method is proposed to enhance the relevance and discriminability of the retrieved faces. In constraint formulation, we propose an assigning-thresholding (AT) pipeline that uniformly ensures that the name-face correspondence is strictly one-to-one and sets low-confidence associations as null assignments. In association score computation, we propose methods that jointly consider IM and the well-established graph-based association (GA) method at different stages, aiming to produce more accurate scores that benefit the association. Based on these efforts, we propose two methods: Accu-IM, which performs the association as accurately as possible, and Fast-IM, which performs the association in real time. Extensive experiments on datasets of captioned news images and Web videos demonstrate the advantages of the proposed efforts, both individually and jointly, with consistent gains under different settings when compared with state-of-the-art methods.
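The assigning-thresholding idea can be illustrated with a small sketch. It assumes a precomputed face-by-name score matrix and a single global threshold. The function name `assign_then_threshold`, the exhaustive search used for the assigning step, and the example scores are hypothetical choices made for illustration; the abstract does not specify the paper's actual assignment algorithm or threshold rule.

```python
from itertools import permutations


def assign_then_threshold(scores, threshold):
    """Illustrative assign-then-threshold step (not the paper's implementation).

    scores[i][j] is an assumed association score between detected face i
    and candidate name j (higher means a better match).
    Returns one entry per face: the index of its name, or None for a null assignment.
    """
    n_faces = len(scores)
    n_names = len(scores[0]) if scores else 0
    # Each name index appears once, so no name can be given to two faces.
    # The extra None slots let surplus faces stay unassigned.
    slots = list(range(n_names)) + [None] * n_faces

    best_total, best = float("-inf"), [None] * n_faces
    # Assigning step: brute-force search for the one-to-one mapping with the
    # highest total score. Only practical for a handful of faces and names.
    for candidate in permutations(slots, n_faces):
        total = sum(scores[i][j] for i, j in enumerate(candidate) if j is not None)
        if total > best_total:
            best_total, best = total, list(candidate)

    # Thresholding step: low-confidence associations become null assignments.
    return [
        j if j is not None and scores[i][j] >= threshold else None
        for i, j in enumerate(best)
    ]


# Three detected faces, two candidate names (made-up scores).
example_scores = [
    [0.9, 0.2],
    [0.8, 0.7],
    [0.1, 0.3],
]
print(assign_then_threshold(example_scores, 0.5))  # [0, 1, None]
print(assign_then_threshold(example_scores, 0.8))  # [0, None, None]
```

With the lower threshold, the third face receives a null assignment because both names are already taken. Raising the threshold additionally rejects the second face's match, whose score of 0.7 falls below 0.8. For realistic problem sizes, a polynomial-time assignment solver would replace the brute-force search.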

Notes

  1. http://www.youtube.com/trendsmap.

  2. In fact, his true name is Jan Kraus. He is recognized as Jana Krause since he is known as the host of a famous TV show named Jana Krause.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (61303175, 61303171, 61602463).

Author information

Corresponding author

Correspondence to Hongtao Xie.

About this article

Cite this article

Chen, Z., Zhang, W., Deng, B. et al. Name-face association with web facial image supervision. Multimedia Systems 25, 1–20 (2019). https://doi.org/10.1007/s00530-017-0544-y

  • DOI: https://doi.org/10.1007/s00530-017-0544-y
