
From video pornography to cancer cells: a tensor framework for spatiotemporal description

Multimedia Tools and Applications

Abstract

Spatiotemporal description is a research field with applications in areas such as video indexing, surveillance, and human-computer interfaces. Big Data problems in large databases are now treated with Deep Learning tools, but there is still room for improvement in handcrafted spatiotemporal description. Moreover, some problems involve small datasets for which data augmentation and similar techniques are not valid, or for which an expensive method is not worthwhile. The main contribution of this work is a framework for spatiotemporal representation using orientation tensors, which enables dimension reduction and invariance. This multipurpose framework is called Features As Spatiotemporal Tensors (FASTensor). We evaluate it in two different applications: video pornography classification and cancer cell classification. The latter is also a contribution of this work, since we introduce a new dataset called Melanoma Cancer Cell (MCC). It is a small dataset with inherent difficulties arising from the acquisition process and the particular nature of the motion. The results are competitive, and the descriptors are easy to compute. Finally, our results on the MCC dataset can be used in the analysis of other cancer cell treatments.
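
The abstract does not spell out how FASTensor builds its tensors, so the following is only a minimal sketch of the general idea of an orientation tensor, not the authors' pipeline. It assumes a grayscale video stored as nested lists indexed `[t][y][x]`, estimates the spatiotemporal gradient with central differences, and accumulates the outer products of the gradient vectors into a single 3×3 symmetric matrix. The function names `orientation_tensor` and `descriptor` are hypothetical, and pure Python is used only to keep the example dependency-free.

```python
import math


def orientation_tensor(video):
    """Sum g * g^T over interior voxels, with g = (dI/dx, dI/dy, dI/dt).

    `video` is a nested list indexed as video[t][y][x] (an assumption of
    this sketch). The result is normalized by its Frobenius norm so that
    it does not depend on the number of voxels.
    """
    n_t, n_y, n_x = len(video), len(video[0]), len(video[0][0])
    tensor = [[0.0] * 3 for _ in range(3)]
    for t in range(1, n_t - 1):
        for y in range(1, n_y - 1):
            for x in range(1, n_x - 1):
                # Central-difference estimate of the spatiotemporal gradient.
                g = (
                    (video[t][y][x + 1] - video[t][y][x - 1]) / 2.0,
                    (video[t][y + 1][x] - video[t][y - 1][x]) / 2.0,
                    (video[t + 1][y][x] - video[t - 1][y][x]) / 2.0,
                )
                # Accumulate the outer product g * g^T.
                for i in range(3):
                    for j in range(3):
                        tensor[i][j] += g[i] * g[j]
    norm = math.sqrt(sum(v * v for row in tensor for v in row))
    if norm > 0:
        tensor = [[v / norm for v in row] for row in tensor]
    return tensor


def descriptor(tensor):
    """Flatten the upper triangle of the symmetric 3x3 tensor (6 values)."""
    return [tensor[i][j] for i in range(3) for j in range(i, 3)]


# Toy video: intensity x + 2t, so the gradient is (1, 0, 2) at every voxel.
video = [[[x + 2 * t for x in range(4)] for y in range(4)] for t in range(4)]
features = descriptor(orientation_tensor(video))
```

In this sketch the descriptor always has six values regardless of the video's resolution or length, which illustrates one way a tensor representation can reduce dimension. How FASTensor itself obtains invariance and which features it accumulates are described in the full article, not in the abstract.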


Notes

  1. https://www.github.com/ssig/ssiglib

  2. http://www.image-net.org/

  3. http://host.robots.ox.ac.uk/pascal/VOC/

  4. https://pytorch.org/

  5. https://pytorch.org/docs/stable/torchvision/models.html

  6. https://scikit-image.org/

  7. http://scikit-learn.org/stable/

  8. http://www.numpy.org/

  9. https://www.scipy.org/

  10. http://tiny.cc/mcc-dataset

  11. https://www.nikoninstruments.com/pr_BR/Produtos/Sistemas-de-triagem-de-celulas-vivas/BioStation-IM-Q


Acknowledgements

The authors would like to thank UFMG, FAPEMIG, CAPES and CNPq for funding and NVIDIA for support.

Author information

Corresponding author

Correspondence to Virgínia F. Mota.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mota, V.F., de Oliveira, H.N., Scalzo, S. et al. From video pornography to cancer cells: a tensor framework for spatiotemporal description. Multimed Tools Appl 79, 13919–13949 (2020). https://doi.org/10.1007/s11042-020-08642-x

  • DOI: https://doi.org/10.1007/s11042-020-08642-x
