Abstract
Deep learning performance depends heavily on the quantity and quality of data. Moreover, most deep learning models are developed and tested on servers equipped with high-performance GPUs and large memory capacities. In industrial and other real-world applications, however, good performance must be achieved with limited resources and equipment. This paper therefore proposes FMC (Feature Maximizer Convolution), which aims to improve performance with limited data by extracting features that are as diverse as possible and assigning more weight to the crucial feature maps before passing them on to the next layer. In addition, to maintain real-time performance on limited hardware, DSC (Depthwise Separable Convolution) is used instead of standard convolution to reduce the computational load. The approach is applied to deep learning models on datasets such as COCO, VisDrone, VOC, and xView, and its performance is compared with that of existing networks. Inference experiments are also conducted on the Odroid H3+ edge device. Compared with existing networks, the proposed network has 30% fewer parameters on average and a 5% higher inference speed. On the Odroid H3+, inference time decreased by an average of 2.5 ms, raising throughput from 19 FPS to 20 FPS.
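To make the parameter saving from DSC concrete, the following minimal sketch compares the weight count of one standard convolution layer with that of a depthwise separable one, and converts the reported frame rates into per-frame time. The layer sizes (128 input channels, 256 output channels, 3x3 kernel) and all function names are hypothetical and chosen only for illustration; they are not taken from the paper, and bias terms are ignored. The single-layer ratio shown here is not the network-level 30% figure, which depends on how many layers of the full model are replaced.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def frame_time_ms(fps: float) -> float:
    """Average time per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Hypothetical layer shape, not taken from the paper.
    c_in, c_out, k = 128, 256, 3
    std = standard_conv_params(c_in, c_out, k)
    dsc = depthwise_separable_params(c_in, c_out, k)
    print(f"standard: {std} weights, depthwise separable: {dsc} weights")
    print(f"ratio: {dsc / std:.3f}")  # equals 1/c_out + 1/k**2

    # Per-frame time difference implied by going from 19 FPS to 20 FPS.
    print(f"time saved per frame: {frame_time_ms(19) - frame_time_ms(20):.2f} ms")
```

For this hypothetical layer the depthwise separable form needs roughly 11.5% of the weights of the standard form. The step from 19 FPS to 20 FPS corresponds to about 2.6 ms per frame, which is in line with the reported average improvement of 2.5 ms.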
Acknowledgements
This result was supported by the “Regional Innovation Strategy (RIS)” through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (MOE) (2021RIS-003).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Choi, J., Lee, Y., Jo, K. (2024). Efficient Detection Model Using Feature Maximizer Convolution for Edge Computing. In: Irie, G., Shin, C., Shibata, T., Nakamura, K. (eds) Frontiers of Computer Vision. IW-FCV 2024. Communications in Computer and Information Science, vol 2143. Springer, Singapore. https://doi.org/10.1007/978-981-97-4249-3_10
DOI: https://doi.org/10.1007/978-981-97-4249-3_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-4248-6
Online ISBN: 978-981-97-4249-3
eBook Packages: Computer Science (R0)