Recurrent Self-Structuring Machine Learning for Video Processing using Multi-Stream Hierarchical Growing Self-Organizing Maps

Multimedia Tools and Applications

Abstract

The emergence of IoT and advanced multimedia information systems has created a proliferation of video sensor data. Although diverse machine learning approaches are used to extract insights from these data, they face limitations in processing and accommodating large volumes of video that are unlabeled and contain previously unseen data structures. This underscores the importance of self-structuring intelligence that can adapt to the nature of the data and learn from multi-modal, spatiotemporal and unstructured data. To address this need, we propose a recurrent self-structuring machine learning approach for video processing using a multi-stream hierarchical recurrent growing self-organizing map (RGSOM) architecture. We designed, implemented and evaluated the approach on a human activity recognition video dataset (the Weizmann dataset), achieving state-of-the-art accuracy of 93.5% in the unsupervised domain. Spatial and temporal data from the video served as separate input feature streams, and RGSOMs self-structured the video data in these streams for visual exploratory analysis and video classification. This study contributes to the literature on self-adaptation techniques for video sensor data processing.
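To make the idea of a recurrent, growing map more concrete, the following is a minimal sketch and not the RGSOM architecture evaluated in the article, whose details are not given in the abstract. It assumes a leaky integration of each node's distance to the input (in the spirit of temporal and recurrent SOM variants) and the standard GSOM growth threshold GT = -D ln(SF). The names `Node`, `step`, `alpha`, `lr` and the simplified growth rule (appending a node at the input rather than growing on a grid) are hypothetical choices made for illustration.

```python
import math

def growth_threshold(dim, spread_factor):
    # GSOM growth threshold GT = -D * ln(SF), with 0 < SF < 1.
    return -dim * math.log(spread_factor)

def sq_dist(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

class Node:
    def __init__(self, weights):
        self.weights = list(weights)
        self.activity = 0.0  # leaky-integrated distance (temporal context)
        self.error = 0.0     # accumulated quantization error

def step(nodes, x, gt, alpha=0.5, lr=0.1):
    # Blend each node's past activity with its distance to the current input.
    for n in nodes:
        n.activity = (1 - alpha) * n.activity + alpha * sq_dist(x, n.weights)
    winner = min(range(len(nodes)), key=lambda i: nodes[i].activity)
    w = nodes[winner]
    w.error += sq_dist(x, w.weights)
    if w.error > gt:
        # Simplified growth (assumption): append a node at the input, reset error.
        nodes.append(Node(x))
        w.error = 0.0
    else:
        # Otherwise move the winner's weights toward the input.
        w.weights = [wi + lr * (xi - wi) for wi, xi in zip(w.weights, x)]
    return winner

# Hypothetical usage: one map per feature stream, fed frame by frame.
spatial_map = [Node([0.0, 0.0]), Node([1.0, 1.0])]
gt = growth_threshold(dim=2, spread_factor=0.5)
for frame_features in [[0.9, 1.0], [0.8, 0.9], [0.1, 0.0]]:
    step(spatial_map, frame_features, gt)
```

In a multi-stream setting, a second map of the same kind would process the temporal features, and the winners from both maps could feed a further map; how the article combines the streams is not specified here, so that part is left out of the sketch.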

Acknowledgements

This work was supported by a La Trobe University Postgraduate Research Scholarship.

Author information

Corresponding author

Correspondence to Naveen Chilamkurti.

About this article

Cite this article

Nawaratne, R., Adikari, A., Alahakoon, D. et al. Recurrent Self-Structuring Machine Learning for Video Processing using Multi-Stream Hierarchical Growing Self-Organizing Maps. Multimed Tools Appl 79, 16299–16317 (2020). https://doi.org/10.1007/s11042-020-08886-7

  • DOI: https://doi.org/10.1007/s11042-020-08886-7
