Abstract
Traditional computing hardware struggles to meet the heavy computational load imposed by rapidly growing Machine Learning (ML) and Artificial Intelligence (AI) workloads, such as Deep Neural Networks operating on Big Data. To obtain hardware that satisfies the low-latency and high-throughput requirements of these algorithms, non-Von Neumann computing architectures such as In-Memory Computing (IMC) have been extensively researched and experimented with over the last five years. This study analyses and reviews works designed to accelerate Machine Learning tasks. We investigate different architectural aspects and directions, provide comparative evaluations, discuss the challenges and limitations of IMC research, and outline possible future directions.
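To make the IMC idea concrete, here is a minimal worked sketch of the standard resistive-crossbar primitive surveyed in this area (an illustration of the general technique, not a formula taken from any single reviewed work): weights are programmed as device conductances G_ij, input activations are applied as row voltages V_i, and Kirchhoff's current law accumulates each column current

\[
I_j = \sum_{i=1}^{m} G_{ij}\, V_i ,
\]

so an m-row crossbar evaluates an entire matrix-vector product in a single read operation, inside the memory array itself, eliminating the processor-memory data movement that bottlenecks Von Neumann designs.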
Acknowledgements
The authors gratefully acknowledge financial support under project DST/INT/Czech/P-12/2019, reg. no. LTAIN19176.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Snášel, V., Dang, T.K., Pham, P.N.H., Küng, J., Kong, L. (2022). In-Memory Computing Architectures for Big Data and Machine Learning Applications. In: Dang, T.K., Küng, J., Chung, T.M. (eds) Future Data and Security Engineering. Big Data, Security and Privacy, Smart City and Industry 4.0 Applications. FDSE 2022. Communications in Computer and Information Science, vol 1688. Springer, Singapore. https://doi.org/10.1007/978-981-19-8069-5_2
DOI: https://doi.org/10.1007/978-981-19-8069-5_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8068-8
Online ISBN: 978-981-19-8069-5