Modified ResNet-152 Network With Hybrid Pyramidal Pooling for Local Change Detection

Impact Statement:
Visual surveillance is one of the challenging areas of research in computer vision. Recently, demand for visual surveillance has increased rapidly because of its potential applications in public safety. Moving object detection followed by tracking are the two critical steps of any surveillance system, and the output of such a system depends on how accurately the moving objects in the scene are detected. However, detecting moving objects in real-life video scenes is challenging because such scenes have high uncertainty and are affected by many complexities. In this article, we propose a deep learning-based background subtraction technique that can effectively handle different challenging video scenes. Using residual connections among the layers makes the model less complex, and the transfer learning mechanism boosts the model's performance.

Abstract:

In this article, we put forth a unique attempt to detect local changes in challenging video scenes by exploring the capabilities of an encoder-decoder network that employs a modified ResNet-152 architecture with a multi-scale feature extraction (MFE) framework. The proposed encoder consists of a modified ResNet-152 network in which the initial two blocks are frozen and the weights of the last blocks are learned through a transfer learning mechanism. This encoder can reduce the computational complexity and extract both fine- and coarse-scale features. We propose an MFE block that hybridizes a pyramidal pooling architecture (PPA) with various atrous convolutional layers; it uses the high-level features from the encoder network to extract multi-scale features. The use of PPA in the MFE block preserves the maximum value in every pooling area, to retain the contextual relationship between the pixels in the complex video frames that can handle...
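
The two ingredients the abstract names for the MFE block, max pooling at several scales and atrous (dilated) convolution, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, window sizes, dilation rates, kernel, and toy feature map below are all assumptions chosen for illustration, and the atrous convolution is shown in one dimension for brevity.

```python
# Illustrative sketch only; names, sizes, and rates are hypothetical,
# not taken from the paper.

def atrous_conv1d(signal, kernel, rate):
    """'Valid' 1-D cross-correlation with kernel taps spaced `rate` apart."""
    span = (len(kernel) - 1) * rate  # distance between first and last tap
    return [
        sum(k * signal[i + j * rate] for j, k in enumerate(kernel))
        for i in range(len(signal) - span)
    ]

def max_pool2d(fmap, size):
    """Non-overlapping max pooling with a size x size window."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [
            max(fmap[r + dr][c + dc] for dr in range(size) for dc in range(size))
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]

def pyramid_max_pool(fmap, sizes=(1, 2, 4)):
    """Pool the same feature map at several window sizes, one level per size."""
    return {s: max_pool2d(fmap, s) for s in sizes}

# Toy 4 x 4 "feature map" standing in for high-level encoder features.
feature_map = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 3],
    [0, 8, 4, 5],
]
pyramid = pyramid_max_pool(feature_map)

# The same 3-tap kernel covers a wider context as the dilation rate grows.
signal = [1, 2, 3, 4, 5, 6, 7]
dilated = {rate: atrous_conv1d(signal, [1, 0, -1], rate) for rate in (1, 2, 3)}
```

Each pyramid level keeps the maximum of every pooling window, so coarser levels summarize larger regions, while a larger dilation rate widens the receptive field of the convolution without adding kernel weights. How the paper combines these outputs is not stated in the abstract and is left out of the sketch.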
Published in: IEEE Transactions on Artificial Intelligence (Volume: 5, Issue: 4, April 2024)
Page(s): 1599 - 1612
Date of Publication: 31 July 2023
Electronic ISSN: 2691-4581
