research-article

A reward-and-punishment-based approach for concept detection using adaptive ontology rules

Published: 10 May 2013

Abstract

Although performance improvements have been reported in recent years, semantic concept detection in video remains a challenging problem. Existing concept detection techniques that use ontology rules exploit the static correlations among primitive concepts but not the dynamic spatiotemporal correlations. The proposed method rewards (or punishes) detected primitive concepts using the dynamic spatiotemporal correlations of the given ontology rules, and updates these ontology rules based on the accuracy of detection. The adaptively learned ontology rules significantly improve the overall accuracy of concept detection, as shown by the experimental results.
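The reward-and-punishment loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no formulas, so the `Rule` class, the additive score adjustment, the fixed `step` and `rate` values, and the concept names `goalpost` and `goal` are all hypothetical. The sketch also uses a single static rule weight and leaves out the spatiotemporal correlations that the paper relies on.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical ontology rule: detecting `antecedent` supports `consequent`."""
    antecedent: str
    consequent: str
    weight: float  # assumed rule confidence, kept in [0, 1]


def clamp(x, lo=0.0, hi=1.0):
    return max(lo, min(hi, x))


def adjust_scores(scores, rules, threshold=0.5, step=0.2):
    """Reward the consequent's score if the antecedent is detected, else punish it."""
    adjusted = dict(scores)
    for rule in rules:
        detected = scores[rule.antecedent] >= threshold
        delta = step * rule.weight if detected else -step * rule.weight
        adjusted[rule.consequent] = clamp(adjusted[rule.consequent] + delta)
    return adjusted


def update_rules(rules, adjusted, ground_truth, threshold=0.5, rate=0.1):
    """Strengthen rules whose consequent was classified correctly, weaken the rest."""
    updated = []
    for rule in rules:
        predicted = adjusted[rule.consequent] >= threshold
        correct = predicted == ground_truth[rule.consequent]
        delta = rate if correct else -rate
        updated.append(Rule(rule.antecedent, rule.consequent,
                            clamp(rule.weight + delta)))
    return updated


# Hypothetical detector outputs for one video shot.
rules = [Rule("goalpost", "goal", weight=0.5)]
scores = {"goalpost": 0.8, "goal": 0.45}
adjusted = adjust_scores(scores, rules)
rules = update_rules(rules, adjusted, ground_truth={"goal": True})
```

In this toy setup, a confidently detected `goalpost` raises the `goal` score, and the rule's weight is then increased or decreased depending on whether the adjusted detection agrees with the label.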

    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 9, Issue 2
      May 2013
      144 pages
      ISSN: 1551-6857
      EISSN: 1551-6865
      DOI: 10.1145/2457450

      Copyright © 2013 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 10 May 2013
      • Revised: 1 July 2012
      • Accepted: 1 July 2012
      • Received: 1 December 2011
      Qualifiers

      • research-article
      • Research
      • Refereed
