Abstract:
Network pruning is an effective technique to reduce computation costs for deep model deployment on resource-constrained devices. Searching for superior sub-networks from a vast search space through Neural Architecture Search (NAS), which trains a one-shot supernet as a performance estimator, is still time-consuming. In addition to search inefficiency, such solutions focus only on the FLOPs budget and suffer from inferior ranking consistency between supernet-inherited and stand-alone performance. To solve the problems above, we propose a framework, namely DBS. First, we pre-sample sub-networks with similar budget settings as starting points and train these starting points in a supernet with a strict path-wise fair sandwich rule. Second, we train Transformer-based predictors on the performance and budget (FLOPs or latency) of the starting points. Third, we freeze the parameters of the predictors and apply a differentiable budget-aware search over continuous sub-network vectors. Finally, we obtain the derived sub-networks from the optimized vectors via a decoder. We conduct comprehensive experiments on ImageNet with ResNet and MobileNet-V2 under various FLOPs and latency settings, showing consistent improvements over state-of-the-art methods.
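The abstract describes optimizing a continuous sub-network vector against frozen predictors under a budget constraint. The following is a minimal sketch of that idea, not the authors' code: the module names (acc_predictor, budget_predictor) and the penalty-based loss are illustrative assumptions, written in PyTorch-style Python.

```python
# Hypothetical sketch of a differentiable budget-aware search step:
# a continuous architecture vector is optimized by gradient descent
# against frozen accuracy and budget (FLOPs/latency) predictors.
# All names and hyperparameters here are assumptions, not the paper's code.
import torch


def search_subnetwork(acc_predictor: torch.nn.Module,
                      budget_predictor: torch.nn.Module,
                      init_vector: torch.Tensor,
                      budget_target: float,
                      steps: int = 200,
                      lr: float = 0.01,
                      penalty: float = 1.0) -> torch.Tensor:
    """Optimize a continuous sub-network vector with frozen predictors."""
    # Freeze predictor parameters; only the architecture vector is trainable.
    for p in list(acc_predictor.parameters()) + list(budget_predictor.parameters()):
        p.requires_grad_(False)

    arch = init_vector.clone().detach().requires_grad_(True)
    optimizer = torch.optim.Adam([arch], lr=lr)

    for _ in range(steps):
        optimizer.zero_grad()
        pred_acc = acc_predictor(arch)      # predicted stand-alone performance
        pred_cost = budget_predictor(arch)  # predicted FLOPs or latency
        # Maximize predicted accuracy while penalizing budget violations.
        loss = -pred_acc + penalty * torch.relu(pred_cost - budget_target)
        loss.backward()
        optimizer.step()

    # A decoder (not shown) would map the optimized continuous vector
    # back to a discrete sub-network, as described in the abstract.
    return arch.detach()
```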
Published in: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Date of Conference: 14-19 April 2024
Date Added to IEEE Xplore: 18 March 2024