ABSTRACT
Recurrent Neural Networks (RNNs) break a time-series input (or a sentence) into multiple time-steps (or words) and process it one time-step (word) at a time. However, not all of these time-steps (words) need to be processed to determine the final output accurately. Prior work has exploited this intuition by placing an additional predictor in front of the RNN model to prune time-steps that are not relevant. However, these approaches jointly train the predictor and the RNN model, allowing each to learn from the mistakes of the other. In this work we present a method to skip RNN time-steps without retraining or fine-tuning the original RNN model. Using an ideal predictor, we show that even without retraining the original model, we can train a predictor to skip 45% of steps for the SST dataset and 80% of steps for the IMDB dataset without impacting the model accuracy. We show that the decision to skip is not trivial by comparing against 5 different baselines derived from domain knowledge. Finally, we present a case study of the cost and accuracy benefits of realizing such a predictor. This realistic predictor on the SST dataset reduces computation by more than 25% with at most 0.3% loss in accuracy while being 40× smaller than the original RNN model.
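The skipping mechanism the abstract describes can be sketched as follows: a small, separately trained predictor inspects each input time-step and, when it deems the step irrelevant, the (frozen) RNN's state update is skipped entirely and the hidden state carries over unchanged. This is a minimal illustrative sketch, not the paper's implementation; the cell weights, the linear skip predictor `w_skip`, and the threshold are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN, EMBED = 8, 4

# Toy weights for a vanilla RNN cell, standing in for the pretrained,
# unmodified model (which is never retrained or fine-tuned).
W_h = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))
W_x = rng.normal(scale=0.1, size=(HIDDEN, EMBED))

# A much smaller skip predictor (hypothetical: a single linear scorer).
# In the paper's setting it is trained separately, with the RNN frozen.
w_skip = rng.normal(scale=0.1, size=EMBED)

def rnn_step(h, x):
    """One (expensive) RNN state update."""
    return np.tanh(W_h @ h + W_x @ x)

def run_with_skipping(inputs, threshold=0.0):
    """Process a sequence, copying the hidden state through unchanged
    whenever the predictor scores the current time-step as irrelevant.
    Returns the final hidden state and the number of skipped steps."""
    h = np.zeros(HIDDEN)
    skipped = 0
    for x in inputs:
        if w_skip @ x < threshold:   # predictor says: skip this step
            skipped += 1             # h carries over untouched
            continue
        h = rnn_step(h, x)           # normal state update
    return h, skipped
```

Because the RNN weights are untouched, any savings come purely from the fraction of steps the predictor elects to skip, which is why the predictor's own size (40× smaller than the RNN in the paper's SST case study) matters for the net compute reduction.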