Abstract
In this paper we propose a new method for automatic storyboard segmentation of TV news using image retrieval techniques and content manipulation. The framework performs shot boundary detection, global key-frame representation, and image re-ranking based on neighborhood relations and the temporal variance of image locations, in order to construct a unimodal cluster for anchor person detection and differentiation. Finally, anchor shots are used to form video scenes. The entire technique is unsupervised: it learns semantic models and extracts natural patterns from the current video data. An experimental evaluation on a dataset of 50 videos, totaling more than 30 h, demonstrates the pertinence of the proposed method, with gains in recall and precision of more than 5–7% compared with state-of-the-art techniques.
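The final step, forming scenes from anchor shots, can be illustrated with a minimal sketch. It assumes a per-shot boolean flag marking anchor shots is already available and that a new story begins whenever an anchor shot follows a non-anchor shot. Both the function name `group_shots_into_stories` and this boundary rule are illustrative assumptions, not the grouping procedure described in the paper.

```python
from typing import List, Tuple


def group_shots_into_stories(is_anchor: List[bool]) -> List[Tuple[int, int]]:
    """Group consecutive shots into stories.

    Returns a list of (first_shot_index, last_shot_index) pairs.
    Assumption (hypothetical rule, not taken from the paper): a story
    boundary is placed before every anchor shot that directly follows a
    non-anchor shot. Shots before the first anchor form a leading segment.
    """
    stories: List[Tuple[int, int]] = []
    start = 0
    for i, anchor in enumerate(is_anchor):
        # An anchor shot after a report shot opens a new story.
        if anchor and i > 0 and not is_anchor[i - 1]:
            stories.append((start, i - 1))
            start = i
    if is_anchor:
        # Close the last open story at the final shot.
        stories.append((start, len(is_anchor) - 1))
    return stories


if __name__ == "__main__":
    # Seven shots: anchor, report, report, anchor, anchor, report, anchor
    flags = [True, False, False, True, True, False, True]
    print(group_shots_into_stories(flags))
```

Under this rule, consecutive anchor shots stay in the same story, which is one plausible way to handle a presenter shown from two camera angles; other conventions are equally possible.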
Acknowledgments
This work was partially carried out within the framework of the FUI 19 Media4D project, supported by BPI (Banque Publique d’Investissement) France and DGE (Direction Générale des Entreprises).
This work was supported by a grant of the Romanian National Authority for Scientific Research and Innovation, CNCS - UEFISCDI, project number: PN-II-RU-TE-2014-4-0202.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Mocanu, B., Tapu, R., Zaharia, T. (2016). Automatic Segmentation of TV News into Stories Using Visual and Temporal Information. In: Blanc-Talon, J., Distante, C., Philips, W., Popescu, D., Scheunders, P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2016. Lecture Notes in Computer Science, vol 10016. Springer, Cham. https://doi.org/10.1007/978-3-319-48680-2_57
DOI: https://doi.org/10.1007/978-3-319-48680-2_57
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48679-6
Online ISBN: 978-3-319-48680-2
eBook Packages: Computer Science (R0)