Towards optimizing human labeling for interactive image tagging

Published: 19 August 2013

Abstract

Interactive tagging combines human and computer effort to assign descriptive keywords to image content in a semi-automatic way. It avoids the problems of fully automatic tagging and purely manual tagging by striking a compromise between tagging performance and manual cost. However, conventional research on interactive tagging focuses mainly on sample selection and on models for tag prediction. In this work, we investigate interactive tagging from a different angle. We introduce an interactive image tagging framework that makes fuller use of human labeling effort: it can reach a specified tagging performance with less manual labeling, or achieve better tagging performance at a specified labeling cost. In the framework, hashing enables quick clustering of image regions, and a dynamic multiscale clustering labeling strategy lets users label a large group of similar regions at a time. We also employ a tag refinement method so that some inappropriate tags can be corrected automatically. Experiments on a large dataset demonstrate the effectiveness of our approach.
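
The abstract does not say which hashing scheme or splitting rule the framework uses, so the following is only a rough sketch of the general idea, not the authors' method. It assumes sign-of-random-projection hashing over region feature vectors and a coarse-to-fine loop in which a hypothetical `oracle` callback stands in for the user: the oracle tags a whole group when its regions share one tag, and otherwise the group is split on the next hash bit. The names `hash_codes`, `label_coarse_to_fine`, and `oracle` are illustrative assumptions.

```python
import numpy as np


def hash_codes(features, n_bits, seed=0):
    """Sign-of-random-projection hash: one n_bits binary code per row.

    Assumption: this stands in for whatever hashing the paper uses.
    """
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((features.shape[1], n_bits))
    return (features @ planes > 0).astype(np.uint8)


def label_coarse_to_fine(codes, oracle):
    """Label groups of regions, starting coarse and splitting on rejection.

    oracle(members) is a hypothetical stand-in for the user: it returns a
    tag when every region in `members` shares that tag, otherwise None.
    Returns (labels, queries), where queries counts calls to the oracle.
    """
    n_bits = codes.shape[1]
    labels, queries = {}, 0
    pending = [(list(range(len(codes))), 0)]  # (group members, next bit)
    while pending:
        members, bit = pending.pop()
        queries += 1
        tag = oracle(members)
        if tag is not None:
            labels.update((i, tag) for i in members)
            continue
        # Mixed group: split on the next hash bit that separates it.
        parts = None
        while bit < n_bits and parts is None:
            zeros = [i for i in members if codes[i, bit] == 0]
            ones = [i for i in members if codes[i, bit] == 1]
            bit += 1
            if zeros and ones:
                parts = [zeros, ones]
        if parts is None:
            # No remaining bit separates the group: label regions one by one.
            parts = [[i] for i in members]
        pending.extend((part, bit) for part in parts)
    return labels, queries


# Synthetic demo: two tight clusters of 8-dimensional "region features".
rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(5, 0.1, (20, 8)), rng.normal(-5, 0.1, (20, 8))])
truth = ["sky"] * 20 + ["grass"] * 20


def oracle(members):
    tags = {truth[i] for i in members}
    return tags.pop() if len(tags) == 1 else None


codes = hash_codes(feats, n_bits=6)
labels, queries = label_coarse_to_fine(codes, oracle)
print(f"labeled {len(labels)} regions with {queries} group queries")
```

Because a group is only tagged when the oracle accepts it, the final labels always match the oracle's ground truth; what the hashing changes is how many queries are needed. When similar regions tend to share hash bits, large groups are accepted early and the query count stays well below one per region.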

      Published In

      ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 9, Issue 4
      August 2013
      168 pages
      ISSN:1551-6857
      EISSN:1551-6865
      DOI:10.1145/2501643
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 19 August 2013
      Accepted: 01 March 2013
      Revised: 01 April 2012
      Received: 01 July 2011
      Published in TOMM Volume 9, Issue 4

      Author Tags

      1. Interactive image tagging
      2. multiscale cluster labeling

      Qualifiers

      • Research-article
      • Research
      • Refereed
