MEMA-NAS: Memory-Efficient Multi-Agent Neural Architecture Search

  • Conference paper

Pattern Recognition and Computer Vision (PRCV 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 13022)

Abstract

Object detection is a core computer vision task that aims to localize objects in an image and classify their categories. With the development of convolutional neural networks, deep learning methods have been widely applied to object detection and achieve promising performance compared to traditional methods. However, manually designing a well-performing detection network is inefficient: trial-and-error consumes substantial hardware resources and time, and the process relies heavily on expert knowledge. To design network architectures more efficiently, there has been growing interest in automating the design process through Neural Architecture Search (NAS). In this paper, we propose a Memory-Efficient Multi-Agent Neural Architecture Search (MEMA-NAS) framework for end-to-end object detection networks. Specifically, we introduce multi-agent learning to search the holistic architecture of the detection network. This saves a large amount of GPU memory and allows the architecture of every module of the detection network to be searched simultaneously. To find a better trade-off between precision and computational cost, we add a resource constraint to the search. Experiments on multiple datasets show that MEMA-NAS achieves state-of-the-art results in both search efficiency and precision.
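
The full method is only available in the paper itself, but the two ideas highlighted in the abstract, assigning one search agent per module of the detector so that only the sampled operation has to be instantiated in GPU memory, and penalizing the search objective with a resource term, can be sketched in a few lines. Everything below is an illustrative assumption rather than the authors' implementation: the candidate operations, the FLOP costs, the epsilon-greedy agents, the lam weight, and the evaluate callback are all hypothetical placeholders.

    import random

    # Hypothetical candidate operations for each searchable module of a detector.
    # Names and costs are illustrative placeholders, not taken from the paper.
    SEARCH_SPACE = {
        "backbone_block": ["conv3x3", "conv5x5", "sep_conv3x3", "skip"],
        "fpn_cell": ["conv3x3", "dil_conv3x3", "max_pool", "skip"],
        "head_block": ["conv3x3", "conv1x1", "sep_conv5x5"],
    }
    FLOPS = {  # rough per-operation cost used for the resource penalty
        "conv3x3": 1.0, "conv5x5": 2.3, "sep_conv3x3": 0.4, "sep_conv5x5": 0.9,
        "dil_conv3x3": 1.1, "conv1x1": 0.3, "max_pool": 0.1, "skip": 0.0,
    }

    class Agent:
        """One agent per searchable module; keeps a running score per operation."""
        def __init__(self, ops, eps=0.1):
            self.ops, self.eps = ops, eps
            self.scores = {op: 0.0 for op in ops}
            self.counts = {op: 0 for op in ops}

        def sample(self):
            # Epsilon-greedy sampling: only the chosen operation is instantiated,
            # so the whole super-network never has to sit in GPU memory at once.
            if random.random() < self.eps:
                return random.choice(self.ops)
            return max(self.scores, key=self.scores.get)

        def update(self, op, reward):
            self.counts[op] += 1
            self.scores[op] += (reward - self.scores[op]) / self.counts[op]

    def search(evaluate, steps=200, lam=0.05):
        """evaluate(arch) should return the validation mAP of the sampled network."""
        agents = {m: Agent(ops) for m, ops in SEARCH_SPACE.items()}
        for _ in range(steps):
            arch = {m: agent.sample() for m, agent in agents.items()}
            cost = sum(FLOPS[op] for op in arch.values())
            # The reward couples precision with a resource constraint; lam sets the trade-off.
            reward = evaluate(arch) - lam * cost
            for m, agent in agents.items():
                agent.update(arch[m], reward)
        return {m: max(agent.scores, key=agent.scores.get) for m, agent in agents.items()}

A throwaway call such as search(lambda arch: random.random()) exercises the loop; in a real search, evaluate would briefly train the sampled detector and return its validation mAP, and lam would be tuned to reach the desired precision/FLOPs trade-off.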

Q. Kong and X. Xu—Equal contribution.

Author information

Corresponding author

Correspondence to Xin Xu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Kong, Q., Xu, X., Zhang, L. (2021). MEMA-NAS: Memory-Efficient Multi-Agent Neural Architecture Search. In: Ma, H., et al. (eds.) Pattern Recognition and Computer Vision. PRCV 2021. Lecture Notes in Computer Science, vol 13022. Springer, Cham. https://doi.org/10.1007/978-3-030-88013-2_15

  • DOI: https://doi.org/10.1007/978-3-030-88013-2_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-88012-5

  • Online ISBN: 978-3-030-88013-2

  • eBook Packages: Computer Science, Computer Science (R0)
