Abstract
Spatiotemporal description is a research field with applications in areas such as video indexing, surveillance, and human-computer interfaces. Big Data problems in large databases are now commonly treated with Deep Learning tools; however, there is still room for improvement in handcrafted spatiotemporal description. Moreover, some problems involve small datasets for which data augmentation and similar techniques are not valid, or for which an expensive method is not worthwhile. The main contribution of this work is a framework for spatiotemporal representation using orientation tensors, which enables dimension reduction and invariance. This multipurpose framework is called Features As Spatiotemporal Tensors (FASTensor). We evaluate it in two different applications: video pornography classification and cancer cell classification. The latter is also a contribution of this work, since we introduce a new dataset called Melanoma Cancer Cell (MCC). It is a small dataset with inherent difficulties arising from its acquisition process and the particular nature of its motion. The results are competitive, and the descriptor is easy to compute. Finally, our results on the MCC dataset can be used in other cancer cell treatment analyses.
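The abstract does not spell out how the orientation tensors are built, so the following is only a minimal sketch of the generic textbook construction, not the FASTensor pipeline itself. It assumes a tensor is the sum of outer products of feature vectors, normalized by its Frobenius norm; the function names and the sample gradient values are hypothetical.

```python
import math


def orientation_tensor(vectors):
    """Sum the outer products v v^T of equal-length vectors (assumed construction)."""
    n = len(vectors[0])
    tensor = [[0.0] * n for _ in range(n)]
    for v in vectors:
        for i in range(n):
            for j in range(n):
                tensor[i][j] += v[i] * v[j]
    return tensor


def normalize(tensor):
    """Scale the tensor to unit Frobenius norm; an all-zero tensor is returned unchanged."""
    norm = math.sqrt(sum(x * x for row in tensor for x in row))
    if norm == 0.0:
        return tensor
    return [[x / norm for x in row] for row in tensor]


def descriptor(tensor):
    """Flatten the upper triangle of the symmetric tensor into n(n+1)/2 values."""
    n = len(tensor)
    return [tensor[i][j] for i in range(n) for j in range(i, n)]


# Hypothetical spatiotemporal gradients (dx, dy, dt), for illustration only.
gradients = [(1.0, 0.0, 0.5), (0.0, 2.0, -1.0), (-1.0, 1.0, 0.0)]
desc = descriptor(normalize(orientation_tensor(gradients)))
```

Under these assumptions the descriptor length depends only on the vector dimension, not on how many vectors are accumulated, and negating any input vector leaves the tensor unchanged, which gives a simple illustration of the compactness and invariance the abstract mentions.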
Acknowledgements
The authors would like to thank UFMG, FAPEMIG, CAPES and CNPq for funding and NVIDIA for support.
Cite this article
Mota, V.F., de Oliveira, H.N., Scalzo, S. et al. From video pornography to cancer cells: a tensor framework for spatiotemporal description. Multimed Tools Appl 79, 13919–13949 (2020). https://doi.org/10.1007/s11042-020-08642-x
DOI: https://doi.org/10.1007/s11042-020-08642-x