Abstract
Low-density parity-check (LDPC) codes, with their excellent error-correction capabilities, are widely used in both data communication and storage to construct reliable cyber-physical systems that are resilient to real-world noise. Fast prototyping of field-programmable gate array (FPGA)-based decoders is essential for achieving high decoding performance while accelerating the development process. This paper proposes a three-level parallel architecture, TLP-LDPC, that achieves high throughput by fully exploiting the characteristics of both LDPC codes and the underlying hardware while scaling effectively to large FPGA platforms. The architecture consists of a low-level decoding unit, a mid-level multi-unit decoding core, and a high-level multi-core decoder. The low-level decoding unit is the basic LDPC computation component; it matches the features of the LDPC algorithm to the specific structures of the FPGA (e.g., look-up tables, LUTs) and eliminates potential data conflicts. The mid-level decoding core integrates input/output and multiple decoding units in a well-balanced pipeline. The high-level multi-core decoder makes full use of board-level resources to improve overall throughput. We write the LDPC decoder in C++ with dedicated pragmas and use high-level synthesis (HLS) tools to implement the TLP-LDPC architecture. Experimental results show that TLP-LDPC achieves 9.63 Gbps end-to-end decoding throughput on a Xilinx Alveo U50 platform, 3.9x higher than existing HLS-based FPGA implementations.
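To make the three-level hierarchy concrete, the following is a minimal C++ sketch in an HLS style. It is not the authors' implementation. It assumes a min-sum check-node update as the per-unit computation (the abstract does not name the decoding algorithm), and the names `Msg`, `decoding_unit`, `decoding_core`, `decoder` and the constants `DEG`, `UNITS`, `CORES` are all hypothetical. The `#pragma HLS` lines are hints for an HLS tool; an ordinary C++ compiler ignores them, possibly with a warning.

```cpp
#include <cstdint>
#include <cstdlib>

// All sizes below are illustrative assumptions, not values from the paper.
constexpr int DEG   = 4;  // edges per check node
constexpr int UNITS = 2;  // decoding units per core
constexpr int CORES = 2;  // cores per decoder

// Messages on the edges of one check node; values assumed in [-127, 127].
struct Msg     { int8_t v[DEG]; };
struct CoreBuf { Msg unit[UNITS]; };
struct DecBuf  { CoreBuf core[CORES]; };

// Low level: one decoding unit. Assumed min-sum check-node update:
// each output is the sign product and minimum magnitude of the OTHER inputs.
Msg decoding_unit(const Msg& in) {
#pragma HLS PIPELINE II=1
    int min1 = 127, min2 = 127, min_idx = -1;
    bool sign = false;  // parity of negative inputs
    for (int i = 0; i < DEG; ++i) {
#pragma HLS UNROLL
        const int mag = std::abs(static_cast<int>(in.v[i]));
        sign ^= (in.v[i] < 0);
        if (mag < min1) { min2 = min1; min1 = mag; min_idx = i; }
        else if (mag < min2) { min2 = mag; }
    }
    Msg out{};
    for (int i = 0; i < DEG; ++i) {
#pragma HLS UNROLL
        // Exclude edge i itself: use the second minimum at the minimum's index.
        const int  mag = (i == min_idx) ? min2 : min1;
        const bool neg = sign ^ (in.v[i] < 0);
        out.v[i] = static_cast<int8_t>(neg ? -mag : mag);
    }
    return out;
}

// Mid level: one core running several decoding units side by side.
CoreBuf decoding_core(const CoreBuf& in) {
    CoreBuf out{};
    for (int u = 0; u < UNITS; ++u) {
#pragma HLS UNROLL
        out.unit[u] = decoding_unit(in.unit[u]);
    }
    return out;
}

// High level: several independent cores replicated across the device.
DecBuf decoder(const DecBuf& in) {
    DecBuf out{};
    for (int c = 0; c < CORES; ++c) {
#pragma HLS UNROLL
        out.core[c] = decoding_core(in.core[c]);
    }
    return out;
}
```

For example, passing the messages {3, -5, 2, 7} to `decoding_unit` yields {-2, 2, -3, -2}. The sketch shows only how the three levels nest; it leaves out the input/output stages, memory layout, and conflict handling that the abstract attributes to the real design.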
Cite this article
Zhang, YF., Sun, L. & Cao, Q. TLP-LDPC: Three-Level Parallel FPGA Architecture for Fast Prototyping of LDPC Decoder Using High-Level Synthesis. J. Comput. Sci. Technol. 37, 1290–1306 (2022). https://doi.org/10.1007/s11390-022-1499-9
DOI: https://doi.org/10.1007/s11390-022-1499-9