Abstract
Deep learning (DL) systems usually use asynchronous prefetching to improve data-reading performance. However, the efficiency of the data transfer path from hard disk to DRAM is still limited by disk performance. Emerging non-volatile memory (NVRAM) offers a new solution to this problem, but few existing studies have considered it. We propose an efficient data prefetch strategy for DL based on a heterogeneous memory system that combines NVRAM with DRAM. Taking advantage of the large capacity and fast read speed of NVRAM, the strategy uses an asynchronous reading method named sliding NVRAM cache (SNC) to improve the performance of the data transfer paths. A sliding window maps data from disk to NVRAM and continuously updates it, and the strategy largely mitigates the non-ideal write performance of NVRAM. Experiments show that SNC improves the time performance of training diverse deep neural networks by more than 30%.
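The sliding-window idea can be illustrated with a small sketch. This is not the SNC implementation from the paper, only a generic bounded producer-consumer prefetcher written under stated assumptions: an ordinary in-memory queue stands in for the NVRAM region, and `sliding_prefetch`, `read_batch`, `fake_disk_read`, and `window_size` are hypothetical names introduced for illustration.

```python
import queue
import threading
from typing import Callable, Iterator

_DONE = object()  # sentinel marking the end of the dataset


def sliding_prefetch(read_batch: Callable[[int], bytes],
                     num_batches: int,
                     window_size: int) -> Iterator[bytes]:
    """Yield batches in order while a background thread keeps a bounded
    window of upcoming batches filled.

    Assumption: the bounded queue is a stand-in for the NVRAM cache.
    """
    window: queue.Queue = queue.Queue(maxsize=window_size)

    def producer() -> None:
        for i in range(num_batches):
            # put() blocks while the window is full, so the producer
            # never runs more than window_size batches ahead.
            window.put(read_batch(i))
        window.put(_DONE)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = window.get()  # frees a slot, letting the window slide forward
        if item is _DONE:
            return
        yield item


def fake_disk_read(i: int) -> bytes:
    """Hypothetical stand-in for a slow disk read of batch i."""
    return f"batch-{i}".encode()


if __name__ == "__main__":
    for batch in sliding_prefetch(fake_disk_read, num_batches=5, window_size=2):
        print(batch)
```

The bounded queue is what makes the window "slide": each batch the consumer takes frees one slot, and the background reader refills it with the next batch from disk while training on the current batch proceeds.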
Acknowledgement
This work is supported by the National Natural Science Foundation of China under grants No. 61832006 and No. 61672250.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Jiang, W., Liu, P., Jin, H., Peng, J. (2020). An Efficient Data Prefetch Strategy for Deep Learning Based on Non-volatile Memory. In: Yu, Z., Becker, C., Xing, G. (eds) Green, Pervasive, and Cloud Computing. GPC 2020. Lecture Notes in Computer Science, vol 12398. Springer, Cham. https://doi.org/10.1007/978-3-030-64243-3_8
DOI: https://doi.org/10.1007/978-3-030-64243-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64242-6
Online ISBN: 978-3-030-64243-3
eBook Packages: Computer Science, Computer Science (R0)