
Movie Lens: Discovering and Characterizing Editing Patterns in the Analysis of Short Movie Sequences

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13804)

Abstract

Video is the most widely used media format. Automating the editing process would impact many areas, from the film industry to social media content. The editing process defines the structure of a video. In this paper, we present a new method to analyze and characterize the structure of 30-second videos. Specifically, we study video structure in terms of sequences of shots. We investigate what relation exists between what is shown in a video and the sequence of shots used to represent it, and whether it is possible to define editing classes. Labeled data are needed for this task, but unfortunately none are available, so new data-driven methodologies must be developed to address this issue. In this paper we present Movie Lens, a data-driven approach to discovering and characterizing editing patterns in the analysis of short movie sequences. The approach relies on the Levenshtein distance, the K-Means algorithm, and a Multilayer Perceptron (MLP). Through the Levenshtein distance and the K-Means algorithm we indirectly label 30-second movie shot sequences. We then train a Multilayer Perceptron to assess the validity of our approach. Additionally, the MLP helps domain experts assess the semantic concepts encapsulated by the identified clusters. We extracted our data from the Cinescale dataset, gathering 23,887 shot sequences, each 30 seconds long, from 120 different movies. The accuracy of Movie Lens varies from 93% to 77% as the number of classes considered grows from 4 to 32. We also present a preliminary characterization of the identified classes and their relative editing patterns in the 16-class scenario, which reaches an overall accuracy of 81%.
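The labeling pipeline described above starts by comparing shot sequences with the Levenshtein (edit) distance. A minimal sketch of that first step is shown below, assuming each shot sequence is encoded as a string of single-character shot-type codes (the codes and sample sequences here are invented for illustration and are not taken from the paper):

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    if len(a) < len(b):
        a, b = b, a  # ensure the inner loop runs over the shorter string
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

# Hypothetical encoding: C = close-up, M = medium shot, L = long shot.
sequences = ["CCML", "CCMM", "LLLM", "LMLM"]

# Pairwise distance matrix, the input one would feed to a clustering step.
dist = [[levenshtein(s, t) for t in sequences] for s in sequences]
```

The resulting symmetric distance matrix is the kind of structure a clustering algorithm such as K-Means (as used in the paper) can operate on to group similar editing patterns before the indirect labels are assigned.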


Notes

  1. The Multilayer Perceptron for sentence classification can be retrieved from a GitHub repository [32].

References

  1. Argaw, D.M., Heilbron, F.C., Lee, J.Y., Woodson, M., Kweon, I.: The anatomy of video editing: a dataset and benchmark suite for AI-assisted video editing. ArXiv abs/2207.09812 (2022)


  2. Bain, M., Nagrani, A., Brown, A., Zisserman, A.: Condensed movies: story based retrieval with contextual embeddings. CoRR abs/2005.04208 (2020). https://arxiv.org/abs/2005.04208

  3. Bak, H.Y., Park, S.B.: Comparative study of movie shot classification based on semantic segmentation. Appl. Sci. 10(10), 3390 (2020). https://doi.org/10.3390/app10103390

  4. Benini, S., Savardi, M., Balint, K., Kovacs, A., Signoroni, A.: On the influence of shot scale on film mood and narrative engagement in film viewers. IEEE Trans. Affect. Comput. 13(2), 592–603 (2022). https://doi.org/10.1109/taffc.2019.2939251


  5. Berthouzoz, F., Li, W., Agrawala, M.: Tools for placing cuts and transitions in interview video. ACM Trans. Graph. 31, 1–8 (2012). https://doi.org/10.1145/2185520.2335418

  6. Bloemheuvel, S., van den Hoogen, J., Jozinovic, D., Michelini, A., Atzmueller, M.: Multivariate time series regression with graph neural networks. CoRR abs/2201.00818 (2022). https://arxiv.org/abs/2201.00818

  7. Chakraborty, S., Nagwani, N., Dey, L.: Performance comparison of incremental K-means and incremental DBSCAN algorithms. Int. J. Comput. Appl. 27 (2011)


  8. Haldar, R., Mukhopadhyay, D.: Levenshtein distance technique in dictionary lookup methods: an improved approach. CoRR (2011)


  9. Hasan, M.A., Xu, M., He, X., Xu, C.: CAMHID: camera motion histogram descriptor and its application to cinematographic shot classification. IEEE Trans. Circuits Syst. Video Technol. 24(10), 1682–1695 (2014). https://doi.org/10.1109/TCSVT.2014.2345933


  10. He, Z., Gao, S., Xiao, L., Liu, D., He, H., Barber, D.: Wider and deeper, cheaper and faster: tensorized LSTMs for sequence learning (2017)


  11. Jani, K., Chaudhuri, M., Patel, H., Shah, M.: Machine learning in films: an approach towards automation in film censoring. J. Data Inf. Manage. 2(1), 55–64 (2019). https://doi.org/10.1007/s42488-019-00016-9


  12. Juang, B.H., Rabiner, L.: The segmental k-means algorithm for estimating parameters of hidden Markov models. IEEE Trans. Acoust. Speech Signal Process. 38(9), 1639–1641 (1990)


  13. Liberti, L., Lavor, C., Maculan, N., Mucherino, A.: Euclidean distance geometry and applications. SIAM Rev. 56(1), 3–69 (2014). https://doi.org/10.1137/120875909

  14. Matsuo, Y., Amano, M., Uehara, K.: Mining video editing rules in video streams, pp. 255–258 (2002). https://doi.org/10.1145/641007.641058

  15. Mogadala, A., Kalimuthu, M., Klakow, D.: Trends in integration of vision and language research: a survey of tasks, datasets, and methods. J. Artif. Int. Res. 71, 1183–1317 (2021). https://doi.org/10.1613/jair.1.11688

  16. Murch, W.: In the Blink of an Eye. Silman-James Press (2001)


  17. Nothelfer, C., DeLong, J., Cutting, J.E.: Shot structure in Hollywood film (2009)


  18. Pardo, A., Heilbron, F.C., Alcázar, J.L., Thabet, A.K., Ghanem, B.: Learning to cut by watching movies. CoRR abs/2108.04294 (2021). https://arxiv.org/abs/2108.04294

  19. Podlesnyy, S.: Towards data-driven automatic video editing (2019)


  20. Qaisar, S.: Sentiment analysis of IMDB movie reviews using long short-term memory (2020). https://doi.org/10.1109/ICCIS49240.2020.9257657

  21. Ramesh, A., et al.: Zero-shot text-to-image generation (2021). https://doi.org/10.48550/ARXIV.2102.12092. https://arxiv.org/abs/2102.12092

  22. Rao, A., Wang, J., Xu, L., Jiang, X., Huang, Q., Zhou, B., Lin, D.: A unified framework for shot type classification based on subject centric lens. CoRR abs/2008.03548 (2020). https://arxiv.org/abs/2008.03548

  23. Ren, J., Shen, X., Lin, Z., Měch, R.: Best frame selection in a short video. In: 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 3201–3210 (2020). https://doi.org/10.1109/WACV45572.2020.9093615

  24. Savardi, M., Kovács, A.B., Signoroni, A., Benini, S.: Cinescale: A dataset of cinematic shot scale in movies. Data Brief 36, 107002 (2021)


  25. Savardi, M., Signoroni, A., Migliorati, P., Benini, S.: Shot scale analysis in movies by convolutional neural networks, pp. 2620–2624 (2018). https://doi.org/10.1109/ICIP.2018.8451474

  26. Simões, G., Wehrmann, J., Barros, R., Ruiz, D.: Movie genre classification with convolutional neural networks, pp. 259–266 (2016). https://doi.org/10.1109/IJCNN.2016.7727207

  27. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556 (2014). http://arxiv.org/abs/1409.1556

  28. Soe, T.H.: Automation in video editing: assisted workflows in video editing. In: AutomationXP@CHI (2021)


  29. Svanera, M., Savardi, M., Signoroni, A., Kovács, A.B., Benini, S.: Who is the film’s director? authorship recognition based on shot features. IEEE Multimedia 26(4), 43–54 (2019). https://doi.org/10.1109/MMUL.2019.2940004


  30. Vacchetti, B., Cerquitelli, T.: Cinematographic shot classification with deep ensemble learning. Electronics 11(10), 1570 (2022)


  31. Vacchetti, B., Cerquitelli, T., Antonino, R.: Cinematographic shot classification through deep learning. In: 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), pp. 345–350 (2020). https://doi.org/10.1109/COMPSAC48688.2020.0-222

  32. Walters, A.: Sentence classification. https://github.com/lettergram/sentence-classification

  33. Wang, M., Yang, G.W., Hu, S.M., Yau, S.T., Shamir, A.: Write-a-video: computational video montage from themed text. ACM Trans. Graph. 38(6), 1–13 (2019). https://doi.org/10.1145/3355089.3356520

  34. Wu, H.Y., Santarra, T., Leece, M., Vargas, R., Jhala, A.: Joint attention for automated video editing. In: ACM International Conference on Interactive Media Experiences, pp. 55–64. IMX 2020, Association for Computing Machinery, New York, NY, USA (2020). https://doi.org/10.1145/3391614.3393656

  35. Zhang, X., Li, Y., Han, Y., Wen, J.: AI video editing: a survey (2021). https://doi.org/10.20944/preprints202201.0016.v1

  36. Zhou, H., Hermans, T., Karandikar, A., Rehg, J.: Movie genre classification via scene categorization, pp. 747–750 (2010). https://doi.org/10.1145/1873951.1874068

  37. Zhou, J., Zhang, X.P.: Automatic identification of digital video based on shot-level sequence matching. In: Proceedings of the 13th Annual ACM International Conference on Multimedia, pp. 515–518. MULTIMEDIA 2005, Association for Computing Machinery, New York, NY, USA (2005). https://doi.org/10.1145/1101149.1101265


Author information


Corresponding author

Correspondence to Bartolomeo Vacchetti.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Vacchetti, B., Cerquitelli, T. (2023). Movie Lens: Discovering and Characterizing Editing Patterns in the Analysis of Short Movie Sequences. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13804. Springer, Cham. https://doi.org/10.1007/978-3-031-25069-9_42


  • DOI: https://doi.org/10.1007/978-3-031-25069-9_42

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25068-2

  • Online ISBN: 978-3-031-25069-9

