DOI: 10.1145/1101149.1101155
Article

Multiple instance learning for labeling faces in broadcasting news video

Published: 06 November 2005

Abstract

Labeling faces in news video with their names is an interesting research problem that was previously solved using supervised methods, which demand significant user effort to label training data. In this paper, we investigate a more challenging setting of the problem in which complete label information is not available. Specifically, by exploiting the uniqueness of a face's name, we formulate the problem as a special multi-instance learning (MIL) problem, namely the exclusive MIL (eMIL) problem, so that it can be tackled by a model trained with partial labeling information in the form of anonymity judgments of faces, which require less user effort to collect. We propose two discriminative probabilistic learning methods, Exclusive Density (ED) and Iterative ED, for eMIL problems. Experiments on the face labeling problem show that the proposed approaches outperform traditional MIL algorithms and perform close to supervised methods trained with complete data labels.
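
To make the exclusivity idea concrete, here is a minimal sketch of one way to read it, not the ED or Iterative ED method, which the abstract does not spell out. It assumes a bag is the set of faces that co-occur with a name, that each face already has a score from some classifier, and that a name belongs to at most one face in a bag. The function names, the scores, and the 0.5 threshold are all hypothetical.

```python
from typing import List, Optional


def standard_mil_positives(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Standard MIL reading: any number of instances in a bag may be positive."""
    return [i for i, s in enumerate(scores) if s >= threshold]


def exclusive_positive(scores: List[float], threshold: float = 0.5) -> Optional[int]:
    """Exclusive reading (assumed): at most one instance per bag may be positive.

    Returns the index of the highest-scoring instance if it clears the
    threshold, otherwise None (the named person is judged absent from the bag).
    """
    if not scores:
        return None
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best if scores[best] >= threshold else None


# Hypothetical scores for three faces detected alongside one name.
bag = [0.62, 0.91, 0.40]
print(standard_mil_positives(bag))  # two faces clear the threshold
print(exclusive_positive(bag))      # only the best-scoring face is kept
```

Under this reading, the exclusive rule resolves the ambiguity that a standard MIL decision leaves open when several faces in the same bag look plausible.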

      Published In

      MULTIMEDIA '05: Proceedings of the 13th annual ACM international conference on Multimedia
      November 2005
      1110 pages
      ISBN:1595930442
      DOI:10.1145/1101149

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. face labeling
      2. machine learning
      3. multiple instance learning
      4. news video

      Qualifiers

      • Article

      Conference

      MM05

      Acceptance Rates

      MULTIMEDIA '05 Paper Acceptance Rate 49 of 312 submissions, 16%;
      Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
