Abstract
In the era of big data, storage systems urgently need high-bandwidth, high-concurrency architectures. Owing to its advantages in power consumption, random access rate, and shock resistance, NAND flash memory is widely adopted in enterprise-class storage systems and is gradually replacing traditional hard disks. However, these advantages do not come for free. Several factors, such as out-of-place updates and limited erase/program cycles, have hindered the applicability of flash memory in existing storage systems. Therefore, to fully exploit the advantages of flash memory, this paper proposes Gemini, a high-performance PCIe SSD, and describes the principles of its hardware and software implementation. Gemini features several hardware and software optimizations, including PBFTL (a page-to-block mapping FTL), Dysource (a synchronous-interface flash channel controller with an out-of-order scheduling strategy), a customized I/O stack, scatter/gather DMA, and a multi-queue architecture. In addition, an FPGA-based prototype of Gemini with 2 TB of storage capacity is implemented for verification. In experiments, Gemini achieves a maximum read bandwidth of 3.6 GB/s and a maximum write bandwidth of 1.08 GB/s for 64 KB data accesses. It also provides processing rates of over 580,000 IOPS and 270,000 IOPS for 4 KB random reads and writes, respectively.
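To put the 4 KB IOPS figures on the same scale as the 64 KB bandwidth figures, the quoted rates can be converted into an equivalent data rate. The short sketch below is only illustrative arithmetic on the numbers reported in the abstract. It assumes binary units (1 KB = 1024 bytes, 1 GB = 1024^3 bytes), which the abstract does not specify, and the helper name `iops_to_gb_per_s` is hypothetical.

```python
# Convert the 4 KB random-access IOPS figures quoted in the abstract into an
# equivalent data rate. Assumption: binary units (1 KB = 1024 B, 1 GB = 1024**3 B);
# the abstract does not state which convention it uses.

KB = 1024
GB = 1024 ** 3


def iops_to_gb_per_s(iops: int, request_bytes: int) -> float:
    """Return the data rate in GB/s implied by `iops` requests of `request_bytes` each."""
    return iops * request_bytes / GB


# Figures taken from the abstract (both are reported as lower bounds).
read_4k = iops_to_gb_per_s(580_000, 4 * KB)
write_4k = iops_to_gb_per_s(270_000, 4 * KB)

print(f"4 KB random read : {read_4k:.2f} GB/s")
print(f"4 KB random write: {write_4k:.2f} GB/s")
```

Under these assumptions, the 4 KB random read rate corresponds to roughly 2.2 GB/s and the 4 KB random write rate to roughly 1.0 GB/s, compared with the reported 64 KB maxima of 3.6 GB/s and 1.08 GB/s.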
Acknowledgments
The authors would like to thank all anonymous reviewers for their constructive and insightful suggestions for improving this paper.
Additional information
This work is supported by the National Natural Science Foundation of China under Grant Nos. 61433019, U1435217, 61170288, and 61202121, the National High Technology Research and Development 863 Program of China under Grant No. 2015AA015305, and Research on Co-Designed Virtual Machine based on Dynamic Binary Translation (No. 20114307120013).
About this article
Cite this article
Ou, Y., Xiao, N., Liu, F. et al. Gemini: A Novel Hardware and Software Implementation of High-performance PCIe SSD. Int J Parallel Prog 45, 923–945 (2017). https://doi.org/10.1007/s10766-016-0449-y
DOI: https://doi.org/10.1007/s10766-016-0449-y