
A technical view on neural architecture search

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Thanks to the discovery of innovative and practical neural architectures, deep learning has achieved remarkable success in many fields, such as computer vision, natural language processing, and recommendation systems. To reach high performance, researchers must adjust neural architectures and choose training tricks very carefully. This manual trial-and-error search for the best network configuration consumes substantial human effort. Neural architecture search (NAS) aims to alleviate this issue by configuring neural networks automatically. The rapid development of NAS has recently yielded significant achievements: novel architectures that outperform state-of-the-art handcrafted networks have been discovered on image classification benchmarks. In this paper, we survey NAS from a technical view. By summarizing previous NAS approaches, we draw a picture of NAS for readers, covering the problem definition, search approaches, progress towards practical applications, and possible future directions. We hope this paper helps beginners start their research on NAS.
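At its core, the problem the survey addresses can be framed as black-box optimization over a discrete space of architectures, scored by validation performance. The following is a minimal illustrative sketch of that framing using random search, the simplest baseline; the toy search space, the evaluate() stub, and all names here are our own assumptions, not the paper's method.

```python
# A minimal, hedged sketch of NAS as black-box search: random search over a
# hypothetical toy architecture space. Not the paper's algorithm.
import random

# Toy search space: an operation choice per layer, and a depth choice.
OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]
MAX_DEPTH = 6

def sample_architecture():
    """Sample one candidate: a depth and an operation for each layer."""
    depth = random.randint(2, MAX_DEPTH)
    return [random.choice(OPS) for _ in range(depth)]

def evaluate(arch):
    """Stand-in for the expensive step (train the candidate, return its
    validation accuracy). Here it is a synthetic score so the sketch runs;
    the noise term mimics training variance."""
    score = sum(1.0 for op in arch if op.startswith("conv")) / len(arch)
    return score + random.gauss(0, 0.05)

def random_search(budget=20):
    """Evaluate `budget` sampled architectures; keep the best one seen."""
    best_arch, best_score = None, float("-inf")
    for _ in range(budget):
        arch = sample_architecture()
        score = evaluate(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch, best_score

if __name__ == "__main__":
    arch, score = random_search()
    print(f"best architecture: {arch} (val score ~ {score:.3f})")
```

The search approaches the survey covers (reinforcement learning, evolutionary algorithms, gradient-based methods) replace the uniform sampling above with smarter proposal strategies, and replace full training in evaluate() with cheaper performance estimates.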



Author information

Correspondence to Yi-Qi Hu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Hu, YQ., Yu, Y. A technical view on neural architecture search. Int. J. Mach. Learn. & Cyber. 11, 795–811 (2020). https://doi.org/10.1007/s13042-020-01062-1
