Making accurate object detection at the edge: review and new approach

Huang, Zhenhua; Yang, Shunzhi; Zhou, MengChu; Gong, Zheng; Abusorrah, Abdullah; Lin, Chen; Huang, Zheng

doi:10.1007/s10462-021-10059-3

Published: 01 September 2021

Artificial Intelligence Review, volume 55, pages 2245–2274 (2022)

Zhenhua Huang¹,²,
Shunzhi Yang¹,
MengChu Zhou³ (ORCID: orcid.org/0000-0002-5408-8752),
Zheng Gong¹,
Abdullah Abusorrah⁴,
Chen Lin⁵ &
Zheng Huang⁶

Abstract

With the development of the Internet of Things (IoT), data are increasingly generated at the edge of a network. Processing tasks at the network edge can effectively address personal privacy leakage and server overloading, and it has therefore attracted a great deal of attention. A number of efficient convolutional neural network (CNN) models have been proposed for this purpose. However, because they require substantial computing and memory resources, none of them can be deployed on such typical edge computing devices as Raspberry Pi 3B+ and 4B+ while meeting the real-time requirements of user tasks. Considering that a traditional machine learning method can precisely locate an object with a highly acceptable computational load, this work reviews state-of-the-art literature and then proposes a CNN with reduced input size for an object detection system that can be deployed on edge computing devices. The system splits an object detection task into object positioning and classification. In particular, this work proposes a CNN model with 44 $\times$ 44-pixel inputs, instead of the much larger inputs (e.g., 224 $\times$ 224 pixels) used in many existing methods, for edge computing devices with slow memory access and limited computing resources. Its overall performance has been verified via a facial expression detection system implemented on Raspberry Pi 3B+ and 4B+. The work makes accurate object detection at the edge possible.
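The split into positioning and classification described in the abstract can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `fixed_locator` and `brightness_classifier` are hypothetical stand-ins for the traditional locator and the small CNN, the nearest-neighbour resize is only one possible way to reach 44 $\times$ 44 pixels, and the layer shape used in `conv_macs` (3 $\times$ 3 kernel, 3 input channels, 32 output channels) is invented for the arithmetic.

```python
"""Sketch of a two-stage detector: locate first, then classify a small crop."""

CNN_INPUT = 44  # classifier input side length, taken from the abstract


def crop(image, box):
    """Return the sub-image inside box = (top, left, height, width)."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def resize_nearest(image, size):
    """Nearest-neighbour resize of a 2D list of pixels to size x size."""
    src_h, src_w = len(image), len(image[0])
    return [
        [image[r * src_h // size][c * src_w // size] for c in range(size)]
        for r in range(size)
    ]


def detect(image, locate, classify, size=CNN_INPUT):
    """Run the locator, then classify each crop after shrinking it."""
    results = []
    for box in locate(image):
        patch = resize_nearest(crop(image, box), size)
        results.append((box, classify(patch)))
    return results


def conv_macs(side, kernel=3, c_in=3, c_out=32):
    """Multiply-accumulates of one stride-1, same-padded conv layer.

    The default layer shape is an illustrative assumption.
    """
    return side * side * kernel * kernel * c_in * c_out


def fixed_locator(image):
    """Hypothetical locator: always reports one 88 x 88 box."""
    return [(10, 20, 88, 88)]


def brightness_classifier(patch):
    """Hypothetical classifier: thresholds the mean pixel value."""
    mean = sum(map(sum, patch)) / (len(patch) * len(patch[0]))
    return "bright" if mean >= 128 else "dark"


if __name__ == "__main__":
    image = [[200] * 200 for _ in range(200)]  # synthetic grey image
    print(detect(image, fixed_locator, brightness_classifier))
    print(conv_macs(224) / conv_macs(44))  # cost ratio of the two input sizes
```

For a convolution whose cost scales with the number of input pixels, moving from 224 $\times$ 224 to 44 $\times$ 44 inputs reduces the work by a factor of $(224/44)^2$, roughly 26, which is the kind of saving the reduced input size is aiming at.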

Notes

https://github.com/tobysunx/face_recognition.
https://github.com/yangshunzhi1994/CNN-RIS.
OpenVINO$^{\mathrm{TM}}$ toolkit: https://docs.openvinotoolkit.org/latest/index.html.

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 61772366 and 62072192, and by the Natural Science Foundation of Shanghai under Grant 17ZR1445900. The Deanship of Scientific Research (DSR) at King Abdulaziz University, Jeddah, Saudi Arabia, has funded this project under Grant FP-51-43.

Author information

Authors and Affiliations

School of Computer Science, South China Normal University, Guangzhou, 510631, China
Zhenhua Huang, Shunzhi Yang & Zheng Gong
Research and Development Department, DataGrand Inc., Shenzhen, 518063, China
Zhenhua Huang
Department of Electrical and Computer Engineering, New Jersey Institute of Technology, Newark, NJ, 07102, USA
MengChu Zhou
Department of Electrical and Computer Engineering, Faculty of Engineering, and Center of Research Excellence in Renewable Energy and Power Systems, King Abdulaziz University, Jeddah, 21481, Saudi Arabia
Abdullah Abusorrah
School of Informatics, Xiamen University, Xiamen, 361000, China
Chen Lin
School of Information Security Engineering, Shanghai Jiaotong University, Shanghai, 200240, China
Zheng Huang

Corresponding authors

Correspondence to Zhenhua Huang, MengChu Zhou or Zheng Gong.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, Z., Yang, S., Zhou, M. et al. Making accurate object detection at the edge: review and new approach. Artif Intell Rev 55, 2245–2274 (2022). https://doi.org/10.1007/s10462-021-10059-3

Accepted: 02 August 2021
Published: 01 September 2021
Issue Date: March 2022
DOI: https://doi.org/10.1007/s10462-021-10059-3
