Self-learning and explainable deep learning network toward the security of artificial intelligence of things

The Journal of Supercomputing

Abstract

The security of the Internet of Things (IoT) has attracted considerable attention in the artificial intelligence community, and Artificial Intelligence of Things (AIoT) systems are now widely used in intelligent surveillance scenarios. However, because of weak model interpretability and high data security risks, developing a robust and explainable deep learning framework for scene understanding under AIoT is extremely difficult. The fusion of IoT and AI poses several further challenges. To address these difficulties, we develop a self-learning and explainable deep learning network toward the security of AIoT. The constructed system handles video collection, upload and display, as well as data analysis and early warning at the embedded device end, and automatically recognizes behaviors in the scene using our visual recognition algorithms. In addition, the cloud computing platform can be controlled through our developed network. Our visual recognition algorithms make three contributions. First, we propose a lightweight reinforcement learning network model that extracts spatial–temporal features of different behavior characteristics. Second, we propose a self-paced learning framework that fuses deep reinforcement learning and transfer learning. Third, we propose a multi-perspective deep transfer learning model to address the weak explainability of the model. The experimental results show that our proposed model provides high interpretability and outperforms the state-of-the-art methods.

Data availability

The data that support the findings of this paper are available from the corresponding author.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (No. 61702226); the 111 Project (B12018); and the Open Fund of Jiangsu Key Laboratory of Image and Video Understanding for Social Safety, Nanjing University of Science and Technology, Nanjing (J2021-7).

Author information

Corresponding author

Correspondence to Bin Wu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wu, B., He, S. Self-learning and explainable deep learning network toward the security of artificial intelligence of things. J Supercomput 79, 4436–4467 (2023). https://doi.org/10.1007/s11227-022-04818-4

  • DOI: https://doi.org/10.1007/s11227-022-04818-4