Abstract
The rapid evolution of deep neural networks demands that deep learning (DL) frameworks not only execute large computations quickly, but also support straightforward programming models for quickly implementing and experimenting with complex network structures. However, existing frameworks fail to excel in both departments simultaneously, leading to divergent efforts for optimizing performance and for improving usability.
This paper presents JANUS, a system that combines the advantages of both sides by transparently converting an imperative DL program written in Python, the de facto scripting language for DL, into an efficiently executable symbolic dataflow graph. JANUS can convert various dynamic features of Python, including dynamic control flow, dynamic types, and impure functions, into elements of a symbolic dataflow graph. Our experiments show that JANUS achieves fast DL training by exploiting the optimization techniques employed by symbolic graph-based DL frameworks, while maintaining the simple and flexible programmability of imperative DL frameworks.
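To make the conversion concrete, the sketch below illustrates the general idea of tracing an imperative program into a symbolic dataflow graph via operator overloading. This is a minimal toy example, not JANUS's actual implementation; all class and function names here are hypothetical, and it sidesteps the dynamic features (data-dependent control flow, dynamic types, impure functions) that JANUS handles.

```python
# Toy sketch (not JANUS itself): run an imperative Python function once
# on symbolic placeholder values, recording each operation as a graph node.

class Node:
    """A node in the recorded dataflow graph."""
    def __init__(self, op, inputs):
        self.op, self.inputs = op, inputs

    def __add__(self, other):
        return Node("add", [self, lift(other)])

    def __mul__(self, other):
        return Node("mul", [self, lift(other)])

def lift(x):
    """Wrap plain Python values as constant graph nodes."""
    return x if isinstance(x, Node) else Node("const", [x])

def trace(fn, n_args):
    """Trace fn into a graph by calling it on symbolic placeholders.
    Python-level control flow is resolved at trace time, so the graph is
    specialized to the branches actually taken; a full system must
    guard or re-trace when those trace-time assumptions are violated."""
    args = [Node("placeholder", [i]) for i in range(n_args)]
    return fn(*args)

def model(x, w):
    y = x * w          # recorded as a 'mul' node
    return y + 1       # recorded as an 'add' node with a constant input

graph = trace(model, 2)
```

Here `graph` is an `add` node whose first input is the `mul` node, so the imperative function body has been captured as a symbolic graph that a backend could optimize and execute repeatedly without re-running the Python code.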