research-article

Client-centered multimedia content adaptation

Published: 14 August 2009

Abstract

This article describes the design and implementation of a client-centered multimedia content adaptation system suitable for a mobile environment comprising resource-constrained handheld devices, or clients. The primary contributions of this work are: (1) the overall architecture of the client-centered content adaptation system, (2) a data-driven, multi-level hidden Markov model (HMM)-based approach that performs both video segmentation and video indexing in a single pass, and (3) the formulation and implementation of a video personalization strategy based on the Multiple-choice Multidimensional Knapsack Problem (MMKP). To segment and index video data, a video stream is modeled at both the semantic unit level and the video program level. These models are learned entirely from training data, and no domain-dependent knowledge about the structure of video programs is used. As a result, the system can handle various kinds of videos without the program model having to be redefined manually. The proposed MMKP-based personalization strategy is shown to include more relevant video content in response to the client's request than existing strategies based on the 0/1 knapsack problem and the fractional knapsack problem, and it is capable of satisfying multiple client-side constraints simultaneously. Experimental results on CNN news videos and Major League Soccer (MLS) videos are presented and analyzed.
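To make the MMKP formulation mentioned in the abstract concrete, the following is a minimal sketch of the generic problem, not the authors' implementation. It assumes each video segment forms one group, each group offers several versions of that segment, and exactly one version per group must be chosen so that total relevance is maximized while every client-side resource limit is respected. The function name `solve_mmkp`, the two resource dimensions (seconds, kilobytes), the "omit the segment" option, and all numbers are hypothetical and chosen only for illustration. The search is exhaustive, which is practical only for tiny instances.

```python
from itertools import product


def solve_mmkp(groups, capacity):
    """Exhaustively solve a small MMKP instance.

    groups:   list of groups; each group is a list of (value, costs) items,
              where costs is a tuple with one entry per resource dimension.
    capacity: tuple of resource limits, one per dimension.
    Returns (best_value, best_choice), where best_choice holds the index of
    the item picked in each group, or (None, None) if nothing is feasible.
    """
    best_value, best_choice = None, None
    # Enumerate every way of picking exactly one item from each group.
    for choice in product(*(range(len(g)) for g in groups)):
        picked = [groups[i][j] for i, j in enumerate(choice)]
        # Total usage in each resource dimension for this selection.
        usage = [sum(item[1][d] for item in picked)
                 for d in range(len(capacity))]
        if all(u <= c for u, c in zip(usage, capacity)):
            value = sum(item[0] for item in picked)
            if best_value is None or value > best_value:
                best_value, best_choice = value, choice
    return best_value, best_choice


# Hypothetical data: one group per video segment; each item is
# (relevance, (duration in seconds, size in kilobytes)).
# Index 0 in each group is an assumed "omit the segment" option.
groups = [
    [(0, (0, 0)), (5, (10, 300)), (8, (30, 900))],
    [(0, (0, 0)), (3, (5, 150)), (6, (20, 600))],
    [(0, (0, 0)), (4, (8, 200)), (9, (25, 800))],
]
capacity = (50, 1500)  # assumed client limits: 50 seconds, 1500 kilobytes

best_value, best_choice = solve_mmkp(groups, capacity)
print(best_value, best_choice)
```

Unlike a 0/1 knapsack, which can only include or drop a whole item, each group here offers intermediate versions, and the two capacity entries show how several client-side constraints are enforced at once. Realistic instances would need a heuristic or an integer programming solver rather than enumeration.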

    • Published in

      ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 5, Issue 3
      August 2009
      204 pages
      ISSN:1551-6857
      EISSN:1551-6865
      DOI:10.1145/1556134

      Copyright © 2009 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 14 August 2009
      • Accepted: 1 August 2007
      • Revised: 1 May 2007
      • Received: 1 November 2006

      Qualifiers

      • research-article
      • Research
      • Refereed
