Abstract
This paper introduces Jittor, a fully just-in-time (JIT) compiled deep learning framework. JIT compilation achieves higher performance while keeping the system highly customizable. Jittor provides classes of NumPy-like operators, which we call meta-operators. A deep learning model built upon these meta-operators is compiled into high-performance CPU or GPU code in real time. To manage meta-operators, Jittor uses a highly optimized way of executing computation graphs, which we call unified graph execution. This approach is as easy to use as dynamic graph execution yet has the efficiency of static graph execution. It also provides other improvements, including operator fusion, cross-iteration fusion, and unified memory.
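To make the meta-operator idea concrete, the following is an illustrative NumPy sketch (not Jittor's actual API) of how a composite operation such as matrix multiplication can be decomposed into simpler broadcast, element-wise, and reduce primitives; it is exactly this kind of decomposition that a JIT compiler can then fuse back into a single high-performance kernel:

```python
import numpy as np

def meta_matmul(a, b):
    """Matrix multiply expressed as broadcast + element-wise + reduce
    primitives, mimicking the meta-operator decomposition style."""
    # Reindex/broadcast step: align the operands on a shared axis.
    a3 = a[:, :, None]   # shape (n, m, 1)
    b3 = b[None, :, :]   # shape (1, m, k)
    # Element-wise step: multiply the broadcast operands.
    prod = a3 * b3       # shape (n, m, k)
    # Reduce step: sum over the shared dimension m.
    return prod.sum(axis=1)  # shape (n, k)

a = np.arange(6, dtype=float).reshape(2, 3)
b = np.arange(12, dtype=float).reshape(3, 4)
assert np.allclose(meta_matmul(a, b), a @ b)
```

A naive interpreter would materialize the intermediate `(n, m, k)` product, but a JIT compiler that sees the whole graph can fuse the multiply and the sum into one loop nest, which is the performance argument behind meta-operators.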
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 61521002). We would like to thank the anonymous reviewers for their helpful comments, and Professor Ralph R. Martin for his useful suggestions and great help in writing this paper.
Cite this article
Hu, SM., Liang, D., Yang, GY. et al. Jittor: a novel deep learning framework with meta-operators and unified graph execution. Sci. China Inf. Sci. 63, 222103 (2020). https://doi.org/10.1007/s11432-020-3097-4