Abstract
Existing near-data processing (NDP) techniques have demonstrated their strength for specific data-intensive applications. However, they may be inadequate for a data center server, which normally needs to run a diverse range of applications, from data-intensive to compute-intensive. How to develop a versatile NDP-powered server that supports various data center applications remains an open question. Further, a good understanding of the impact of NDP on data center applications is still missing. For example, can a compute-intensive application also benefit from NDP? Which type of NDP engine is the better choice: an FPGA-based engine or an ARM-based engine? To address these issues, we first propose a new NDP server architecture that tightly couples each SSD with a dedicated NDP engine to fully exploit the data transfer bandwidth of an SSD array. Based on this architecture, we introduce two NDP servers: ANS (ARM-based NDP Server) and FNS (FPGA-based NDP Server). Next, we implement a single-engine prototype of each. Finally, we measure the performance, energy efficiency, and cost/performance ratio of six typical data center applications running on the two prototypes and report several new findings.
Acknowledgment
This research was supported by Samsung Memory Solution Laboratory (MSL). We thank our colleagues from MSL who provided insight and expertise that greatly assisted the research. This work is also partially supported by the US National Science Foundation under grant CNS-1813485.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Song, X., Xie, T., Fischer, S. (2019). A Near-Data Processing Server Architecture and Its Impact on Data Center Applications. In: Weiland, M., Juckeland, G., Trinitis, C., Sadayappan, P. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol 11501. Springer, Cham. https://doi.org/10.1007/978-3-030-20656-7_5
DOI: https://doi.org/10.1007/978-3-030-20656-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20655-0
Online ISBN: 978-3-030-20656-7
eBook Packages: Computer Science (R0)