Abstract
Video is the most widely used media format. Automating the editing process would impact many areas, from the film industry to social media content. The editing process defines the structure of a video. In this paper, we present a new method to analyze and characterize the structure of 30-second videos. Specifically, we study video structure in terms of sequences of shots. We investigate the relation between what is shown in a video and the sequence of shots used to represent it, and whether it is possible to define editing classes. Labeled data are needed for this purpose, but unfortunately none are available; hence, new data-driven methodologies must be developed to address this issue. In this paper we present Movie Lens, a data-driven approach to discover and characterize editing patterns in the analysis of short movie sequences. The approach relies on the Levenshtein distance, the K-Means algorithm, and a Multilayer Perceptron (MLP). Using the Levenshtein distance and the K-Means algorithm, we indirectly label 30-second movie shot sequences. We then train a Multilayer Perceptron to assess the validity of our approach. Additionally, the MLP helps domain experts assess the semantic concepts encapsulated by the identified clusters. We extracted data from the CineScale dataset, gathering 23,887 shot sequences, each 30 seconds long, from 120 different movies. The accuracy of Movie Lens varies from 93% to 77% as the number of classes considered grows from 4 to 32. We also present a preliminary characterization of the identified classes and their editing patterns in the 16-class scenario, where the approach reaches an overall accuracy of 81%.
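The distance step of the pipeline described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes shot sequences are encoded as strings of shot-type labels (the one-letter coding below is hypothetical), compares them with the Levenshtein edit distance, and builds the pairwise distance matrix that a clustering step such as K-Means could consume.

```python
# Sketch of the Levenshtein-based comparison of shot sequences.
# Assumption (not from the paper): each shot is encoded as one letter,
# e.g. "C" = close-up, "M" = medium shot, "L" = long shot.

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

# Toy 30-second shot sequences (one letter per shot).
sequences = ["CMLMC", "CMLLC", "LLLML", "CCMCC"]

# Pairwise distance matrix, usable as input to a clustering step
# (the paper clusters such sequences with K-Means; medoid-based
# clustering is a common alternative for non-vector data).
matrix = [[levenshtein(s, t) for t in sequences] for s in sequences]

print(levenshtein("CMLMC", "CMLLC"))  # 1: a single substitution
```

In practice one would feed such a distance matrix (or distance-based features) to the clustering stage and use the resulting cluster labels as indirect training targets for the MLP.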
Notes
1. The Multilayer Perceptron for sentence classification can be retrieved from a GitHub repository [32].
References
Argaw, D.M., Heilbron, F.C., Lee, J.Y., Woodson, M., Kweon, I.: The anatomy of video editing: a dataset and benchmark suite for AI-assisted video editing. ArXiv abs/2207.09812 (2022)
Bain, M., Nagrani, A., Brown, A., Zisserman, A.: Condensed movies: story based retrieval with contextual embeddings. CoRR abs/2005.04208 (2020). https://arxiv.org/abs/2005.04208
Bak, H.Y., Park, S.B.: Comparative study of movie shot classification based on semantic segmentation. Appl. Sci. 10(10), 3390 (2020). https://doi.org/10.3390/app10103390
Benini, S., Savardi, M., Balint, K., Kovacs, A., Signoroni, A.: On the influence of shot scale on film mood and narrative engagement in film viewers. IEEE Trans. Affect. Comput. 13(2), 592–603 (2022). https://doi.org/10.1109/taffc.2019.2939251
Berthouzoz, F., Li, W., Agrawala, M.: Tools for placing cuts and transitions in interview video. ACM Trans. Graph. 31, 1–8 (2012). https://doi.org/10.1145/2185520.2335418
Bloemheuvel, S., van den Hoogen, J., Jozinovic, D., Michelini, A., Atzmueller, M.: Multivariate time series regression with graph neural networks. CoRR abs/2201.00818 (2022). https://arxiv.org/abs/2201.00818
Chakraborty, S., Nagwani, N., Dey, L.: Performance comparison of incremental K-means and incremental DBSCAN algorithms. Int. J. Comput. Appl. 27, 975–8887 (2011)
Haldar, R., Mukhopadhyay, D.: Levenshtein distance technique in dictionary lookup methods: an improved approach. CoRR (2011)
Hasan, M.A., Xu, M., He, X., Xu, C.: CAMHID: camera motion histogram descriptor and its application to cinematographic shot classification. IEEE Trans. Circuits Syst. Video Technol. 24(10), 1682–1695 (2014). https://doi.org/10.1109/TCSVT.2014.2345933
He, Z., Gao, S., Xiao, L., Liu, D., He, H., Barber, D.: Wider and deeper, cheaper and faster: tensorized LSTMs for sequence learning (2017)
Jani, K., Chaudhuri, M., Patel, H., Shah, M.: Machine learning in films: an approach towards automation in film censoring. J. Data Inf. Manage. 2(1), 55–64 (2019). https://doi.org/10.1007/s42488-019-00016-9
Juang, B.H., Rabiner, L.: The segmental k-means algorithm for estimating parameters of hidden Markov models. IEEE Trans. Acoust. Speech Signal Process. 38(9), 1639–1641 (1990)
Liberti, L., Lavor, C., Maculan, N., Mucherino, A.: Euclidean distance geometry and applications. SIAM Rev. 56 (2012). https://doi.org/10.1137/120875909
Matsuo, Y., Amano, M., Uehara, K.: Mining video editing rules in video streams, pp. 255–258 (2002). https://doi.org/10.1145/641007.641058
Mogadala, A., Kalimuthu, M., Klakow, D.: Trends in integration of vision and language research: a survey of tasks, datasets, and methods. J. Artif. Int. Res. 71, 1183–1317 (2021). https://doi.org/10.1613/jair.1.11688
Murch, W.: In the Blink of an Eye. Silman-James Press (2001)
Nothelfer, C., DeLong, J., Cutting, J.E.: Shot structure in Hollywood film (2009)
Pardo, A., Heilbron, F.C., Alcázar, J.L., Thabet, A.K., Ghanem, B.: Learning to cut by watching movies. CoRR abs/2108.04294 (2021). https://arxiv.org/abs/2108.04294
Podlesnyy, S.: Towards data-driven automatic video editing (2019)
Qaisar, S.: Sentiment analysis of IMDB movie reviews using long short-term memory (2020). https://doi.org/10.1109/ICCIS49240.2020.9257657
Ramesh, A., et al.: Zero-shot text-to-image generation (2021). https://doi.org/10.48550/ARXIV.2102.12092. https://arxiv.org/abs/2102.12092
Rao, A., Wang, J., Xu, L., Jiang, X., Huang, Q., Zhou, B., Lin, D.: A unified framework for shot type classification based on subject centric lens. CoRR abs/2008.03548 (2020). https://arxiv.org/abs/2008.03548
Ren, J., Shen, X., Lin, Z., Měch, R.: Best frame selection in a short video. In: 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 3201–3210 (2020). https://doi.org/10.1109/WACV45572.2020.9093615
Savardi, M., Kovács, A.B., Signoroni, A., Benini, S.: CineScale: a dataset of cinematic shot scale in movies. Data Brief 36, 107002 (2021)
Savardi, M., Signoroni, A., Migliorati, P., Benini, S.: Shot scale analysis in movies by convolutional neural networks, pp. 2620–2624 (2018). https://doi.org/10.1109/ICIP.2018.8451474
Simões, G., Wehrmann, J., Barros, R., Ruiz, D.: Movie genre classification with convolutional neural networks, pp. 259–266 (2016). https://doi.org/10.1109/IJCNN.2016.7727207
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556 (2014). http://arxiv.org/abs/1409.1556
Soe, T.H.: Automation in video editing: assisted workflows in video editing. In: AutomationXP@CHI (2021)
Svanera, M., Savardi, M., Signoroni, A., Kovács, A.B., Benini, S.: Who is the film’s director? authorship recognition based on shot features. IEEE Multimedia 26(4), 43–54 (2019). https://doi.org/10.1109/MMUL.2019.2940004
Vacchetti, B., Cerquitelli, T.: Cinematographic shot classification with deep ensemble learning. Electronics 11(10), 1570 (2022)
Vacchetti, B., Cerquitelli, T., Antonino, R.: Cinematographic shot classification through deep learning. In: 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), pp. 345–350 (2020). https://doi.org/10.1109/COMPSAC48688.2020.0-222
Walters, A.: Sentence classification. https://github.com/lettergram/sentence-classification
Wang, M., Yang, G.W., Hu, S.M., Yau, S.T., Shamir, A.: Write-a-video: computational video montage from themed text. ACM Trans. Graph. 38(6), 1–13 (2019). https://doi.org/10.1145/3355089.3356520
Wu, H.Y., Santarra, T., Leece, M., Vargas, R., Jhala, A.: Joint attention for automated video editing. In: ACM International Conference on Interactive Media Experiences, pp. 55–64. IMX 2020, Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3391614.3393656
Zhang, X., Li, Y., Han, Y., Wen, J.: AI video editing: a survey (2021). https://doi.org/10.20944/preprints202201.0016.v1
Zhou, H., Hermans, T., Karandikar, A., Rehg, J.: Movie genre classification via scene categorization, pp. 747–750 (2010). https://doi.org/10.1145/1873951.1874068
Zhou, J., Zhang, X.P.: Automatic identification of digital video based on shot-level sequence matching. In: Proceedings of the 13th Annual ACM International Conference on Multimedia, pp. 515–518. MULTIMEDIA 2005, Association for Computing Machinery, New York, NY, USA (2005). https://doi.org/10.1145/1101149.1101265
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Vacchetti, B., Cerquitelli, T. (2023). Movie Lens: Discovering and Characterizing Editing Patterns in the Analysis of Short Movie Sequences. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13804. Springer, Cham. https://doi.org/10.1007/978-3-031-25069-9_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25068-2
Online ISBN: 978-3-031-25069-9