An Integrated Framework for Semantic Annotation and Adaptation

Multimedia Tools and Applications

Abstract

Tools for interpreting significant events in video and for adapting video clips can effectively support the automatic extraction and distribution of relevant content from video streams. Adaptation adjusts meaningful content, once it has been detected and extracted, to the capabilities and requirements of the user or client. Integrating these two functions is increasingly important because of the growing demand for multimedia data from remote clients with limited resources (PDAs, HCCs, smartphones). In this paper we propose a unified framework for event-based and object-based semantic extraction from video and for semantic on-line adaptation. Two application cases are analyzed and discussed: highlight detection and recognition in soccer videos, and people behavior detection in domotic* applications.

Author information

Corresponding author

Correspondence to M. Bertini.

Additional information

*Domotics is a neologism formed from the Latin word domus (home) and informatics.

Marco Bertini holds a research grant and carries out his research activity at the Department of Systems and Informatics at the University of Florence, Italy. He received an M.S. in electronic engineering from the University of Florence in 1999 and a Ph.D. in 2004. His main research interest is content-based indexing and retrieval of videos. He is the author of more than 25 papers in international conference proceedings and journals, and is a reviewer for international journals on multimedia and pattern recognition.

Rita Cucchiara (Laurea in Ingegneria Elettronica, 1989; Ph.D. in Computer Engineering, University of Bologna, Italy, 1993) is currently Full Professor in Computer Engineering at the University of Modena and Reggio Emilia (Italy). She was formerly Assistant Professor (’93–’98) at the University of Ferrara, Italy, and Associate Professor (’98–’04) at the University of Modena and Reggio Emilia, Italy. She is currently on the faculty staff of Computer Engineering, where she is in charge of the courses on Computer Architectures and Computer Vision.

Her current interests include pattern recognition, video analysis and computer vision for video surveillance, domotics, medical imaging, and computer architecture for managing image and multimedia data.

Rita Cucchiara is the author or co-author of more than 100 papers in international journals and conference proceedings. She currently serves as a reviewer for many international journals in computer vision and computer architecture (e.g. IEEE Trans. on PAMI, IEEE Trans. on Circuit and Systems, Trans. on SMC, Trans. on Vehicular Technology, Trans. on Medical Imaging, Image and Vision Computing, Journal of System architecture, IEEE Concurrency). She has served on the scientific committees of leading international conferences and symposia in computer vision and multimedia (CVPR, ICME, ICPR, ...) and has organized special tracks on computer architecture for vision and on image processing for traffic control. She is on the editorial board of the Multimedia Tools and Applications journal. She is a member of GIRPR (Italian chapter of Int. Assoc. of Pattern Recognition), AixIA (Ital. Assoc. of Artificial Intelligence), ACM and IEEE Computer Society.

Alberto Del Bimbo is Full Professor of Computer Engineering at the Università di Firenze, Italy. Since 1998 he has been the Director of the Master in Multimedia of the Università di Firenze. At present he is Deputy Rector of the Università di Firenze, in charge of Research and Innovation Transfer. His scientific interests are Pattern Recognition, Image Databases, Multimedia and Human Computer Interaction. Prof. Del Bimbo is the author of over 170 publications in international journals and conference proceedings. He is the author of the monograph “Visual Information Retrieval”, on content-based retrieval from image and video databases, published by Morgan Kaufmann. He is a Member of IEEE (Institute of Electrical and Electronics Engineers) and a Fellow of IAPR (International Association for Pattern Recognition). He is presently Associate Editor of Pattern Recognition, Journal of Visual Languages and Computing, Multimedia Tools and Applications Journal, Pattern Analysis and Applications, IEEE Transactions on Multimedia, and IEEE Transactions on Pattern Analysis and Machine Intelligence. He has been Guest Editor of several journal special issues on image databases.

Andrea Prati (Laurea in Computer Engineering, 1998; PhD in Computer Engineering, University of Modena and Reggio Emilia, 2002) is currently an assistant professor at the University of Modena and Reggio Emilia (Italy), Faculty of Engineering, Dipartimento di Scienze e Metodi dell’Ingegneria, Reggio Emilia. During the final year of his PhD studies, he spent six months as a visiting scholar at the Computer Vision and Robotics Research (CVRR) lab at the University of California, San Diego (UCSD), USA, working on a research project on traffic monitoring and management through computer vision. His research interests are mainly in motion detection and analysis, shadow removal techniques, video transcoding and analysis, computer architecture for multimedia and high performance video servers, video surveillance and domotics. He is the author of more than 60 papers in international and national conference proceedings and leading journals, and he serves as a reviewer for many international journals in computer vision and computer architecture. He is a member of IEEE, ACM and IAPR.

About this article

Cite this article

Bertini, M., Cucchiara, R., Bimbo, A.D. et al. An Integrated Framework for Semantic Annotation and Adaptation. Multimed Tools Appl 26, 345–363 (2005). https://doi.org/10.1007/s11042-005-0893-y

  • DOI: https://doi.org/10.1007/s11042-005-0893-y
