Violent actions threaten public security, so intelligent violence detection has become an important and challenging topic in video surveillance, and interest in video-surveillance systems is growing accordingly. Detecting violent or abnormal activities is therefore essential to avert casualties and damage. In this paper, we show that it is possible to build a single network that learns spatio-temporal information for all types of violence rather than treating each concept separately. To construct the proposed violence detection system, we rely on a dynamic frame-skipping strategy to reduce computational complexity. Tracking the regions of interest in each frame further decreases the overall computational cost. In addition, the History of Binary Motion Image over n successive images is used for feature extraction, which facilitates the modeling of human behavior. The largest regions of interest are then extracted to find the maximal component representing the violent action. Finally, a deep neural network consisting of three stacked autoencoders, with a Softmax output layer, is adopted for classification.
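The motion-history and region-extraction steps can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the History of Binary Motion Image is the union of thresholded absolute differences between successive grayscale frames, and that the "biggest region of interest" is the largest 4-connected component of that history. The function names, the threshold value, and the list-of-lists frame representation are all hypothetical choices made for illustration.

```python
from collections import deque


def binary_motion(prev, curr, threshold=30):
    """Binary motion image: 1 where the absolute frame difference exceeds the threshold.

    The threshold of 30 is an assumed value, not one taken from the paper.
    """
    return [[1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]


def motion_history(frames, threshold=30):
    """Union (logical OR) of the binary motion images of n successive frames.

    Assumption: this union stands in for the History of Binary Motion Image.
    """
    height, width = len(frames[0]), len(frames[0][0])
    history = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        motion = binary_motion(prev, curr, threshold)
        for y in range(height):
            for x in range(width):
                history[y][x] |= motion[y][x]
    return history


def largest_region(mask):
    """Return (area, (top, left, bottom, right)) of the largest 4-connected region.

    Returns None when the mask contains no foreground pixel.
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    best = None
    for start_y in range(height):
        for start_x in range(width):
            if not mask[start_y][start_x] or seen[start_y][start_x]:
                continue
            # Breadth-first flood fill over one connected component.
            queue = deque([(start_y, start_x)])
            seen[start_y][start_x] = True
            area = 0
            top, left, bottom, right = start_y, start_x, start_y, start_x
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if best is None or area > best[0]:
                best = (area, (top, left, bottom, right))
    return best
```

In a pipeline of the kind described, the bounding box returned by `largest_region` would be used to crop the motion history before it is passed to the classifier. Frame skipping and the stacked autoencoders are outside the scope of this sketch.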