Abstract
Designing a valid machine learning model for online prediction of surveillance video is a challenging problem. To address it, a multilayer extreme learning machine (ELM) with object principal trajectory is proposed. To support dynamic semantic representation between adjacent frames, the scheme takes both temporal and spatial characteristics into consideration. After the coordinate distances are calculated by the K-means algorithm, the object regions are separated at the pixel level. The moving trend of each object is then determined from the principal trajectory of the area of interest. Finally, a multilayer ELM is adopted to quantify the new shape characteristics, and this deep neural network helps generate a new frame sequence that is realistic enough to pass as true. The proposed method not only recognizes multiple objects with different movement directions but also effectively identifies subtle semantic features. The whole forecasting process avoids the trial and error caused by user intervention, which makes the model suitable for online environments. Numerical experiments are conducted on two different kinds of surveillance video datasets. The results show that the proposed algorithm performs better than other state-of-the-art methods.
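The region-separation and moving-trend steps described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes foreground pixels are already available as (x, y) coordinates, uses plain Lloyd's K-means with Euclidean distance, and stands in for the "principal trajectory" with the simplest possible proxy, the displacement of each cluster centroid between two adjacent frames. The names `sq_dist`, `kmeans`, `movement_trend`, and `block`, and the deterministic seeding, are hypothetical choices made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two (x, y) points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points, k, iters=20):
    """Lloyd's K-means over (x, y) pixel coordinates; returns k centroids."""
    pts = sorted(points)
    step = len(pts) / k
    # Deterministic seeding (an assumption, chosen for reproducibility):
    # one seed from the middle of each of k equal slices of the sorted points.
    centers = [pts[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iters):
        # Assignment step: attach every pixel to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in pts:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its pixels;
        # an empty cluster keeps its previous centroid.
        updated = [
            (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
            if c else centers[j]
            for j, c in enumerate(clusters)
        ]
        if updated == centers:
            break
        centers = updated
    return centers


def movement_trend(centers_prev, centers_next):
    """Displacement of each centroid to its nearest centroid in the next frame.

    This nearest-centroid matching is a simplified stand-in for the
    principal trajectory named in the abstract.
    """
    trend = []
    for c in centers_prev:
        n = min(centers_next, key=lambda m: sq_dist(c, m))
        trend.append((n[0] - c[0], n[1] - c[1]))
    return trend


def block(x0, y0, size=3):
    """Hypothetical helper: a size-by-size square of foreground pixels."""
    return [(x, y) for x in range(x0, x0 + size) for y in range(y0, y0 + size)]


# Two toy objects moving in different directions between adjacent frames.
frame_a = block(0, 0) + block(10, 10)
frame_b = block(2, 0) + block(10, 7)
print(movement_trend(kmeans(frame_a, 2), kmeans(frame_b, 2)))
```

In this toy setup each object receives its own displacement vector, which mirrors the abstract's claim that objects with different movement directions are handled separately. A real pipeline would still need foreground extraction and a learned predictor such as the multilayer ELM on top of these features.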
Acknowledgements
The work was supported by the National Key Research Project of China under Grant No. 2016YFB1001304, the National Natural Science Foundation of China under Grant 61572229, the JLUSTIRT High-level Innovation Team, and the Fundamental Research Funds for Central Universities under Grant No. 2017TD-19. The authors gratefully acknowledge financial support from the Research Centre for Intelligent Signal Identification and Equipment, Jilin Province.
About this article
Cite this article
Yu, H., Wang, J. & Sun, X. Surveillance video online prediction using multilayer ELM with object principal trajectory. SIViP 13, 1243–1251 (2019). https://doi.org/10.1007/s11760-019-01471-y
DOI: https://doi.org/10.1007/s11760-019-01471-y