Knowledge-Based Systems

Volume 222, 21 June 2021, 107006

Robust online rain removal for surveillance videos with dynamic rains

https://doi.org/10.1016/j.knosys.2021.107006

Abstract

Current rain removal techniques for surveillance videos mainly assume consistent rains with invariant extents and types, and are implemented in a batch-mode learning manner. This assumption deviates from the continuously varying nature of practical rains, and the batch mode further makes these techniques infeasible for real long-lasting videos. To alleviate these issues, this study proposes a novel online rain removal approach that represents the practical dynamic rains embedded in surveillance videos. Specifically, we model the rain streaks scattered in each video frame as a patch-wise mixture of Gaussians (P-MoG) distribution, and update its parameters frame by frame. Such a P-MoG modeling manner finely reflects the non-i.i.d. dynamic variations of rains over time. In particular, the P-MoG rain model in each frame is regularized by the rain knowledge learned in previous frames, making the online model adaptable to the not-identically-distributed rains in each frame while being regularized by the not-independently-distributed rains in previous frames. The proposed model is formulated as a concise probabilistic MAP model, which can be readily solved by the EM algorithm. We further embed an affine transformation operator into the proposed model, making it adaptable to a wider range of videos with camera jitter. The superiority of the proposed method, in both accuracy and efficiency, is substantiated by extensive experiments on synthetic and real videos containing static and dynamic rains, as compared with state-of-the-art methods.

Introduction

Nowadays, a tremendous number of surveillance cameras have been installed all over the world, and large amounts of surveillance videos have been and are being collected automatically to facilitate various subsequent video processing tasks, such as object detection [1], [2], traffic congestion classification [3], action recognition [4], [5], and so on. However, such collected surveillance videos often have unsatisfactory quality due to complicated environmental variations, causing difficulties for further processing tasks. One of the most typical issues arises in videos containing rain streaks. In fact, videos captured in the wild inevitably contain rainy frames, where the rain extent dynamically changes and consistently lasts for a long time. Rain streaks in a video form a layer of bright streaks adhered to the clean background, and often cause severe quality corruption and blurry artifacts. Designing effective rain removal techniques to enhance the quality of surveillance videos has thus become a critical issue and has been attracting much research attention recently [6].

Many approaches have been presented for the rain streak removal problem in surveillance videos. Early attempts tried to establish analytic models that extract rain components from videos by discovering specific and distinctive attributes for representing rains. Typically utilized attributes include chromatic [7], [8], photometric [9], [10], [11], and specific temporal/spatiotemporal [12], [13] properties of rains, as well as properties in the frequency domain [8], [14]. In the recent decade, a new trend for handling this task is to directly design a MAP model that decomposes a rainy video into a rain layer and non-rain ones [12], [15], [16], [17]. The key for this modeling task is to design an appropriate prior expression for the rain streaks contained in the video, representing the insightful features underlying this specific subject. An elegant prior design often leads to good performance of a rain removal method. The most frequently used priors along this research line include low-rankness [12], [15], [16], [17], non-local similarity [11], [12], [13], [18], and so on.

However, there are still significant limitations in current rain removal methods for surveillance videos collected from practical scenarios. One typical issue is that most current methods either employ one or more fixed regularization terms with pre-specified parameters for representing rains [7], [11], [13], or assume rains to be i.i.d. across frames with imposed local structures [19]. Such a modeling manner is suitable for encoding rainy videos with a relatively consistent extent of rain over time. Nevertheless, in practically obtained videos, the rain types as well as their extents are often highly diverse and change dynamically over time, due to both inevitable weather changes and illumination variations. Especially in practical long-duration rainy videos, rain streaks across frames exhibit highly dynamic and non-i.i.d. configurations. On the one hand, the rains are not identically distributed and differ evidently over time: in some frames the rain might be pouring, with rain streaks heavily scattered across the frame, while in others the rain might be light, with only tiny rain drops weakly distributed over the frame. On the other hand, the distribution of rain streaks in one frame is generally closely related to that in adjacent frames, and thus the rain distribution is not independent over time. Most current methods for video rain removal have not specifically considered such complicated dynamic rain streak variations, especially their temporally non-i.i.d. distribution across a video sequence, which makes them unable to robustly adapt to practical long-term videos containing rains with dynamic changes over time.

Another critical issue is to realize an online approach for the rain removal task on long-lasting surveillance videos. While surveillance videos are continuously collected in practice, most current methods are designed for an entire video sequence and implemented in a batch-mode learning manner. This makes them suitable only for videos of limited length, and they cannot efficiently handle the continuously incoming video frames that last for a long time in practice. It is thus also important to develop rational online algorithms for the rain removal problem, so that the task can be efficiently implemented on practical consecutive surveillance videos.

To alleviate the aforementioned issues, this study presents a novel online method for the task. The main idea is to formulate the distribution of rain streaks as a patch-wise mixture of Gaussians (P-MoG), whose parameters vary from frame to frame and are dynamically and associatively updated over time. The utilization of P-MoG for encoding rain knowledge is inspired by our previous work [17], and it can finely represent the local structure patterns underlying rain streak distributions. Compared with most previous methods, this new method has several distinctive characteristics. Firstly, it employs an online mode to incrementally implement rain removal in a frame-by-frame manner along a video sequence. Compared with the conventional batch mode, such an online paradigm is more efficient and more applicable to long-lasting videos. Secondly, for each frame, the method learns a specific P-MoG distribution with distinctive parameters, which can well fit the specific configuration of rains in that frame as differentiated from others. This finely encodes the "not identically distributed" property of rain streak distributions over time. Thirdly, when tuning the rain distribution parameters in each frame, the knowledge learned previously on this distribution regularizes them through a KL-divergence term, so that the parameters do not deviate too much from the previous ones, enforcing a relationship between the rain knowledge of the current frame and that learned from previous frames. This finely reflects the "not independently distributed" property of rains over time. Through this novel learning regime, our method can be robustly and efficiently computed on surveillance videos with dynamic rains.
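To make this online learning regime concrete, the following is a minimal sketch in Python. It assumes a simplified one-dimensional, zero-mean, K-component Gaussian mixture over rain-patch values, and a convex blending weight `alpha` that stands in for the KL-divergence regularizer; the function names and these simplifications are our own illustration, not the paper's exact formulation.

```python
import numpy as np

def e_step(x, pi, var):
    """Posterior responsibilities of K zero-mean Gaussians for samples x."""
    # x: (m,) samples; pi: (K,) mixing weights; var: (K,) variances
    log_lik = (-0.5 * (x[:, None] ** 2) / var[None, :]
               - 0.5 * np.log(2 * np.pi * var[None, :]))
    log_post = np.log(pi[None, :]) + log_lik
    log_post -= log_post.max(axis=1, keepdims=True)   # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def m_step(x, post, pi_prev, var_prev, alpha):
    """Update (pi, var), shrunk toward the previous frame's parameters.
    The convex blending below stands in for the KL-divergence regularizer:
    it keeps the new parameters close to those learned from earlier frames."""
    nk = post.sum(axis=0)
    pi_new = nk / nk.sum()
    var_new = (post * x[:, None] ** 2).sum(axis=0) / np.maximum(nk, 1e-8)
    pi = (1 - alpha) * pi_new + alpha * pi_prev
    var = (1 - alpha) * var_new + alpha * var_prev
    return pi / pi.sum(), var

def update_frame(x, pi, var, alpha=0.3, n_iter=5):
    """One frame's EM iterations, warm-started from the previous frame."""
    for _ in range(n_iter):
        post = e_step(x, pi, var)
        pi, var = m_step(x, post, pi, var, alpha)
    return pi, var
```

Refitting the mixture on each new frame captures the "not identical" property, while the warm start and the shrinkage toward the previous frame's parameters mirror the "not independent" regularization.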

In summary, this paper mainly makes the following contributions to rain removal tasks on surveillance videos:

  • An online rain removal method is designed for the task, which can be efficiently implemented on long-lasting surveillance videos. To the best of our knowledge, this is the first online approach proposed for this task.

  • A concise probabilistic model is designed for reflecting the dynamic rains contained in videos. The model contains a likelihood term, which specifically fits the rain shapes in each newly arriving frame, and a KL-divergence regularization term, which keeps the learned rain knowledge from deviating too far from that learned previously (a schematic form of this objective is sketched after this list). In such a modeling manner, the practically non-i.i.d. rain distribution can be finely captured. To the best of our knowledge, this is the first method considering dynamic variations of rain extents/shapes in videos for this task.

  • An entire MAP model is constructed by fully encoding rains as aforementioned, and by integratively formulating the backgrounds and moving objects contained in a surveillance video. A transformation operator is further embedded into the model to make it adaptable to videos with camera jitter. An EM algorithm is readily designed for solving the model, with each step efficiently solvable in closed form. Experiments substantiate the advantage of the method on the task in both time and accuracy.
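For concreteness, the per-frame learning objective described in the second contribution can be schematically written as below. This is a sketch in our own notation, where $\Theta_t$ collects the rain distribution parameters of frame $t$ and $\lambda$ is a trade-off weight; the exact formulation is given in Section 3.

```latex
\min_{\Theta_t} \; -\ln p\big(f(X_t) \,\big|\, \Theta_t\big)
\;+\; \lambda \,\mathrm{KL}\big(p(\cdot \mid \Theta_{t-1}) \,\big\|\, p(\cdot \mid \Theta_t)\big)
```

The first term fits the rains in the current frame, and the second keeps $\Theta_t$ close to the knowledge $\Theta_{t-1}$ distilled from previous frames.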

The paper is organized as follows: Section 2 reviews related research on video/image rain removal. Section 3 presents the proposed online rain removal model and its solution algorithm. Section 4 further extends the method to videos with camera jitters. Experimental results are reported in Section 5, and conclusions are drawn in Section 6.

Section snippets

Related work

In this section, we briefly discuss related work on rain removal for videos and images. Interested readers may refer to [20] for a more comprehensive review.

The P-MoG method

First, we briefly introduce the batch-mode rain removal method, P-MoG, proposed in our previous work [17]. The input video is represented as a tensor $X \in \mathbb{R}^{h \times w \times n}$, where $h, w, n$ denote the height, width and number of frames of the video, respectively. In this paper, the same letter may appear in different typefaces: a bold upper-case character (e.g., $\mathbf{X}$) stands for the unfolded matrix of its tensor form, which is written in italic (e.g., $X$). Let $\mathbf{X} \in \mathbb{R}^{hw \times n}$ denote the unfolded version of $X$ along its 3rd mode.
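As an illustration of this notation, the following minimal NumPy snippet performs the mode-3 unfolding of a video tensor; the helper name `unfold_video` is our own.

```python
import numpy as np

# Mode-3 unfolding used in the notation above: a video tensor of shape
# (h, w, n) becomes an hw-by-n matrix whose t-th column is the vectorized
# t-th frame, matching X in R^{hw x n} in the text.
def unfold_video(X_tensor):
    h, w, n = X_tensor.shape
    return X_tensor.reshape(h * w, n)

X = np.random.rand(120, 160, 50)   # toy video: 50 frames of 120x160
X_mat = unfold_video(X)
assert X_mat.shape == (120 * 160, 50)
```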

Transformed online P-MoG

Due to unpredictable environmental factors in outdoor surveillance systems, camera jitter may occur in the collected video sequences. To better handle such camera jitter, we embed a transformation $\tau$ into our model to align the input video frame $X_t$ as $\tau \circ X_t$, that is:
$$H_t \odot (\tau \circ X_t) = H_t \odot B_t + R_t, \quad f(R_t)_m \sim \prod_{k=1}^{K} \mathcal{N}(0, \Sigma_k)^{z_{mk}^t}, \quad R_t \geq 0, \quad z_m^t \sim \mathrm{Multi}(z_m^t \,|\, \Pi).$$
Similar to the case of a static background, we can formulate a MAP problem as:
$$\min_{\tau, \Sigma, \Pi, H_t, U, v} \; -\ln p\big(f(\tau \circ X_t) \,\big|\, \Sigma, \Pi, H_t, U, v\big) + \mathcal{R}_G^{(t)}(\Sigma, \Pi) + \mathcal{R}_M^{(t)}(H_t) + \mathcal{R}_B^{(t)}(U)$$
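As an illustrative sketch of the alignment step, the snippet below warps a frame by a small affine transform using SciPy. The toy matrix `A` and offset `b` are assumptions for illustration only; in the actual model, $\tau$ is estimated jointly with the other parameters.

```python
import numpy as np
from scipy.ndimage import affine_transform

def warp_frame(frame, A, b):
    """Backward warping: the output value at coordinate o is sampled
    (bilinearly, since order=1) from the input at coordinate A @ o + b."""
    return affine_transform(frame, A, offset=b, order=1, mode='nearest')

frame = np.random.rand(120, 160)          # toy frame
A = np.array([[1.0, 0.01],
              [-0.01, 1.0]])              # small rotation-like jitter
b = np.array([0.5, -0.3])                 # sub-pixel translation
aligned = warp_frame(frame, A, b)
```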

Experiments

In this section, we conduct a series of experiments on rainy surveillance videos under various situations to verify the effectiveness of the proposed method.

Conclusion

In this paper, we have proposed an online rain removal method for surveillance videos with/without camera jitters and with dynamic rains. The method is implemented in an incremental frame-by-frame mode, and is thus efficient compared with current batch-mode methods for the task. Besides, the proposed method adopts an adaptive learning framework that can help fit the dynamic rain variations in wild-taken videos. Through the embedding of an affine transformation, the model can be applicable to

CRediT authorship contribution statement

Lixuan Yi: Methodology, Software, Writing - original draft. Qian Zhao: Conceptualization, Methodology, Writing - review & editing. Wei Wei: Conceptualization, Investigation. Zongben Xu: Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research was supported by the National Key R&D Program of China (2020YFA0713900) and the China NSFC projects (62076196, 11690011, 61721002, U1811461).

References (48)

  • Chen H.-T., et al., Traffic congestion classification for nighttime surveillance videos.

  • Meng D., et al., Improve robustness of sparse PCA by L1-norm maximization, Pattern Recognit. (2012).

  • Zhou X., et al., Moving object detection by detecting contiguous outliers in the low-rank representation, IEEE Trans. Pattern Anal. Mach. Intell. (2013).

  • Joshi K.A., et al., A survey on moving object detection and tracking in video surveillance system, Int. J. Soft Comput. Eng. (2012).

  • Simonyan K., et al., Two-stream convolutional networks for action recognition in videos.

  • Ji S., et al., 3D convolutional neural networks for human action recognition, IEEE Trans. Pattern Anal. Mach. Intell. (2013).

  • Mukhopadhyay S., et al., Combating bad weather part I: Rain removal from video, Synth. Lect. Image Video Multimedia Process. (2014).

  • Zhang X., et al., Rain removal in video by combining temporal and chromatic properties.

  • Liu P., et al., Pixel based temporal analysis using chromatic property for removing rain from videos, Comput. Inf. Sci. (2009).

  • Garg K., et al., When does a camera see rain?

  • Garg K., et al., Detection and removal of rain from videos.

  • Garg K., et al., Vision and rain, Int. J. Comput. Vis. (2007).

  • Chen Y.-L., et al., A generalized low-rank appearance model for spatio-temporally correlated rain streaks.

  • Kim J.-H., et al., Video deraining and desnowing using temporal correlation and low-rank matrix completion, IEEE Trans. Image Process. (2015).

  • Barnum P.C., et al., Analysis of rain and snow in frequency space, Int. J. Comput. Vis. (2010).

  • Jiang T.-X., Huang T.-Z., Zhao X.-L., Deng L.-J., Wang Y., A novel tensor-based video rain streaks removal approach via...

  • Ren W., Tian J., Han Z., Chan A., Tang Y., Video desnowing and deraining based on matrix decomposition, in: Proceedings...

  • Wei W., Yi L., Xie Q., Zhao Q., Meng D., Xu Z., Should we encode rain streaks in video as deterministic or stochastic?...

  • Bossu J., et al., Rain or snow detection in image sequences through use of a histogram of orientation of streaks, Int. J. Comput. Vis. (2011).

  • Santhaseelan V., et al., Utilizing local phase information to remove rain from video, Int. J. Comput. Vis. (2015).

  • Wang H., Li M., Wu Y., Zhao Q., Meng D., A survey on rain removal from video and single image, Sci. China Inform....

  • Garg K., et al., Photometric model of a rain drop, CMU Tech. Rep. (2003).

  • You S., et al., Adherent raindrop detection and removal in video.

  • Brewer N., et al., Using the shape characteristics of rain to identify and remove rain from video.
