DOI: 10.1145/1743384.1743398
research-article

Performance measures for multilabel evaluation: a case study in the area of image classification

Published: 29 March 2010

ABSTRACT

With the steadily increasing amount of multimedia documents on the web and at home, the need grows for reliable semantic indexing methods that assign multiple keywords to a document. The performance of existing approaches is often measured with standard evaluation measures from the information retrieval community. In a case study on image annotation, we show the behaviour of 13 different evaluation measures and point out their strengths and weaknesses. For the analysis, data from 19 research groups that participated in the ImageCLEF Photo Annotation Task are used together with several configurations based on random numbers. A recently proposed ontology-based measure, which incorporates structure information, relationships from the ontology, and the agreement between annotators for a concept, was investigated and compared to a hierarchical variant. The results for the hierarchical measure are not competitive. The ontology-based measure assigns good scores to the systems that also ranked well under the other measures, such as the example-based F-measure. For concept-based evaluation, MAP gave stable results with respect to random numbers and the number of annotated labels. The AUC measure shows good evaluation characteristics provided that all annotations contain confidence values.
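To make the distinction between example-based and concept-based evaluation concrete, the following minimal sketch computes an example-based F-measure (averaged over images) and a non-interpolated average precision for a single concept (MAP would be the mean of this value over all concepts). These are common textbook formulations and are not necessarily the exact definitions used in the paper; the function names, the toy labels, and the handling of empty label sets are assumptions made for illustration.

```python
from typing import List, Set


def example_based_f1(truth: List[Set[str]], pred: List[Set[str]]) -> float:
    """Mean over images of the F1 score between true and predicted label sets."""
    scores = []
    for t, p in zip(truth, pred):
        if not t and not p:
            # Assumption: two empty label sets count as perfect agreement.
            scores.append(1.0)
            continue
        overlap = len(t & p)
        # F1 = 2 * |T ∩ P| / (|T| + |P|)
        scores.append(2 * overlap / (len(t) + len(p)))
    return sum(scores) / len(scores)


def average_precision(relevant: List[bool], confidence: List[float]) -> float:
    """Non-interpolated AP for one concept: rank images by confidence,
    then average the precision measured at each relevant image."""
    order = sorted(range(len(confidence)), key=lambda i: confidence[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if relevant[i]:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical toy data: three images with ground-truth and predicted labels.
truth = [{"sky", "clouds"}, {"portrait"}, {"sky", "sea", "beach"}]
pred = [{"sky"}, {"portrait", "indoor"}, {"sky", "sea", "beach"}]
f1 = example_based_f1(truth, pred)

# Hypothetical confidences for the concept "sky" on the same three images.
sky_relevant = [True, False, True]
sky_confidence = [0.9, 0.6, 0.3]
ap_sky = average_precision(sky_relevant, sky_confidence)

print(f"example-based F1: {f1:.4f}")
print(f"AP for 'sky': {ap_sky:.4f}")
```

The example-based score needs only binary label decisions per image, whereas average precision ranks images by confidence for each concept, which is why ranking measures such as MAP and AUC depend on systems supplying confidence values.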


Published in

MIR '10: Proceedings of the international conference on Multimedia information retrieval
March 2010
600 pages
ISBN: 9781605588155
DOI: 10.1145/1743384

Copyright © 2010 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery
New York, NY, United States
