Abstract
Although spotting highlights in a film is trivial for humans, previous studies have not determined whether a computer can be equipped with this capability. In this paper, we present a content-based system that automatically detects highlight scenes and predicts highlight scores in action movies. In particular, the system applies high-level image attributes and an early event detection approach. Unlike current learning-based approaches, which model the relationship between a whole highlight and its corresponding audiovisual features, the proposed system studies the temporal changes in a set of general features during the transition from a nonhighlight scene to a highlight scene. The experimental results indicate that highlight detection is technically feasible. They also provide critical insights into the problem: for example, both audio and visual features are crucial, and the filming style can be captured using high-level image attributes, which further improve the overall detection performance.
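The idea of tracking temporal feature changes can be illustrated with a minimal sketch. It is not the early event detector used by the system; it is a simple stand-in that scores each shot by how far its features depart from the shots just before it. The function name `temporal_change_scores`, the `window` parameter, and the toy feature values are all assumptions made for illustration.

```python
from statistics import fmean

def temporal_change_scores(features, window=2):
    """Score each shot by how far its features depart from the preceding shots.

    features: list of equal-length feature vectors, one per shot
              (hypothetical values such as audio energy or motion intensity).
    window:   number of preceding shots used as the baseline (an assumption).
    """
    scores = []
    for i, current in enumerate(features):
        previous = features[max(0, i - window):i]
        if not previous:
            scores.append(0.0)  # no history yet, so no measurable change
            continue
        # Per-dimension baseline: mean of the preceding shots.
        baseline = [fmean(dim) for dim in zip(*previous)]
        # Mean absolute departure from the baseline across dimensions.
        scores.append(fmean(abs(c - b) for c, b in zip(current, baseline)))
    return scores

# Toy data: [audio energy, motion intensity] per shot, made up for illustration.
shots = [[0.1, 0.2], [0.1, 0.2], [0.9, 0.8], [0.9, 0.8]]
scores = temporal_change_scores(shots, window=2)
```

In this toy input the largest score falls on the third shot, where both features jump, which is the kind of nonhighlight-to-highlight transition the abstract describes. A learned detector would replace the fixed baseline comparison with a trained scoring function.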
Additional information
Communicated by Y. Zhang.
Cite this article
Yeh, MC., Tsai, YW. & Hsu, HC. A content-based approach for detecting highlights in action movies. Multimedia Systems 22, 287–295 (2016). https://doi.org/10.1007/s00530-015-0457-6
DOI: https://doi.org/10.1007/s00530-015-0457-6