Abstract
In this study, we present a method that substantially reduces the number of retrieved images and increases retrieval performance for person queries on broadcast news videos. We propose a multi-modal approach that integrates face and text information. A state-of-the-art face detection algorithm is improved with a skin-color-based method that eliminates false alarms. The pruned set of detected faces is then clustered to group similar faces, and representative faces are selected from each cluster and presented to the user. For six person queries of TRECVID2004, the retrieval rate increases on average from 8% to around 50%, and the number of images that the user has to inspect is reduced from hundreds or thousands to tens.
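The pruning and grouping stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the skin-color model, the face features, or the clustering algorithm, so the rule-based RGB skin test, the 0.3 skin-fraction threshold, the plain k-means routine, and all function names below are hypothetical stand-ins.

```python
from math import dist  # Euclidean distance, Python 3.8+


def is_skin(rgb):
    """Simple rule-based RGB skin test (an illustrative heuristic only)."""
    r, g, b = rgb
    return (r > 95 and g > 40 and b > 20
            and max(rgb) - min(rgb) > 15
            and abs(r - g) > 15 and r > g and r > b)


def skin_fraction(pixels):
    """Fraction of pixels in a candidate face region classified as skin."""
    if not pixels:
        return 0.0
    return sum(is_skin(p) for p in pixels) / len(pixels)


def prune_false_alarms(candidates, min_fraction=0.3):
    """Keep detector outputs whose region holds enough skin-colored pixels.

    The threshold value is an assumption made for this sketch.
    """
    return [c for c in candidates if skin_fraction(c["pixels"]) >= min_fraction]


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points]


def kmeans(points, k, iterations=20):
    """Plain k-means, initialised from the first k points (requires k <= len(points))."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        assignment = assign(points, centroids)
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign(points, centroids), centroids


def representatives(points, assignment, centroids):
    """For each non-empty cluster, the index of the member closest to its centroid."""
    reps = {}
    for j, c in enumerate(centroids):
        members = [i for i, a in enumerate(assignment) if a == j]
        if members:
            reps[j] = min(members, key=lambda i: dist(points[i], c))
    return reps
```

In use, each candidate would be a dictionary holding the pixels of a detected region; the survivors of `prune_false_alarms` would be turned into feature vectors, grouped with `kmeans`, and only the faces indexed by `representatives` would be shown to the user, which is how the number of images to inspect shrinks to a handful per query.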
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
İkizler, N., Duygulu, P. (2005). Person Search Made Easy. In: Leow, WK., Lew, M.S., Chua, TS., Ma, WY., Chaisorn, L., Bakker, E.M. (eds) Image and Video Retrieval. CIVR 2005. Lecture Notes in Computer Science, vol 3568. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11526346_61
DOI: https://doi.org/10.1007/11526346_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27858-0
Online ISBN: 978-3-540-31678-7
eBook Packages: Computer Science (R0)