A Near-Data Processing Server Architecture and Its Impact on Data Center Applications

Song, Xiaojia; Xie, Tao; Fischer, Stephen

doi:10.1007/978-3-030-20656-7_5


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11501)

Included in the following conference series:

International Conference on High Performance Computing


Abstract

Existing near-data processing (NDP) techniques have demonstrated their strength for some specific data-intensive applications. However, they might be inadequate for a data center server, which normally needs to run a diverse range of applications, from data-intensive to compute-intensive. How to develop a versatile NDP-powered server that supports various data center applications remains an open question. Further, a good understanding of the impact of NDP on data center applications is still missing. For example, can a compute-intensive application also benefit from NDP? Which type of NDP engine is the better choice, an FPGA-based engine or an ARM-based engine? To address these issues, we first propose a new NDP server architecture that tightly couples each SSD with a dedicated NDP engine to fully exploit the data transfer bandwidth of an SSD array. Based on this architecture, we introduce two NDP servers: ANS (ARM-based NDP Server) and FNS (FPGA-based NDP Server). Next, we implement a single-engine prototype for each of them. Finally, we measure the performance, energy efficiency, and cost/performance ratio of six typical data center applications running on the two prototypes. These measurements yield several new findings.

Acknowledgment

This research was supported by Samsung Memory Solution Laboratory (MSL). We thank our colleagues from MSL who provided insight and expertise that greatly assisted the research. This work is also partially supported by the US National Science Foundation under grant CNS-1813485.

Author information

Authors and Affiliations

San Diego State University, 5500 Campanile Dr, San Diego, CA, 92182, USA
Xiaojia Song & Tao Xie
Samsung Semiconductor, 3655 N 1st St, San Jose, CA, 95134, USA
Stephen Fischer

Corresponding authors

Correspondence to Xiaojia Song, Tao Xie, or Stephen Fischer.

Editor information

Editors and Affiliations

University of Edinburgh, Edinburgh, UK
Michèle Weiland
Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Dresden, Germany
Guido Juckeland
Technical University of Munich, Munich, Germany
Carsten Trinitis
Ohio State University, Columbus, USA
Ponnuswamy Sadayappan

Cite this paper

Song, X., Xie, T., Fischer, S. (2019). A Near-Data Processing Server Architecture and Its Impact on Data Center Applications. In: Weiland, M., Juckeland, G., Trinitis, C., Sadayappan, P. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol 11501. Springer, Cham. https://doi.org/10.1007/978-3-030-20656-7_5

DOI: https://doi.org/10.1007/978-3-030-20656-7_5
Published: 17 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20655-0
Online ISBN: 978-3-030-20656-7
eBook Packages: Computer Science (R0)
