Abstract
Multimedia content is increasingly vulnerable to hacking, and traditional security mechanisms are insufficient to protect it against malicious events. This study therefore introduces a novel grey wolf-based YOLO spatiotemporal framework (GW-YSTF) that predicts whether video frames are fake or real after training on video data. Once the data are initialized, a pre-processing function in the hidden layer of the GW-YSTF eliminates noisy features from the input video frames. A feature analysis function then selects the relevant features. Fake video frames are subsequently predicted according to the classes in the trained deepfake video database. The presented model was tested in a Python environment. Its improvement was validated through a comparative analysis against existing models in terms of accuracy, precision, recall, and F-score. The proposed model achieved a video frame prediction accuracy of 99.8%, higher than that of the traditional approaches.
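The four comparison metrics named above can be computed from frame-level predictions as in the following minimal sketch. It is not the authors' evaluation code: the function name `frame_metrics` and the label convention (1 = fake, 0 = real) are assumptions made for illustration, and the F-score is taken to be the F1 score.

```python
from typing import Dict, Sequence


def frame_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for binary frame labels.

    Assumed convention (not from the paper): 1 = fake frame, 0 = real frame.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Confusion-matrix counts, treating "fake" (1) as the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    # Guard each ratio against a zero denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical labels for eight frames, for illustration only.
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    preds = [1, 1, 0, 0, 0, 1, 1, 0]
    print(frame_metrics(truth, preds))
```

Reporting all four values guards against a high accuracy that merely reflects class imbalance between real and fake frames.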
Data availability
Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.
Acknowledgements
None.
Funding
This research received no specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Author information
Contributions
Authors AK, MBR, and GJS have contributed equally to the work.
Ethics declarations
Conflict of interest
The authors declare that they have no potential conflict of interest.
Ethical approval
All applicable institutional and/or national guidelines for the care and use of animals were followed.
Informed consent
For this type of analysis, formal consent is not needed.
About this article
Cite this article
Koteswaramma, A., Rao, M.B. & Suma, G.J. An intelligent adaptive learning framework for fake video detection using spatiotemporal features. SIViP 18, 2231–2241 (2024). https://doi.org/10.1007/s11760-023-02895-3
DOI: https://doi.org/10.1007/s11760-023-02895-3