Abstract:
Unstructured low-density parity-check (LDPC) codes, which offer better error-correction performance, have received widespread attention. However, their irregular distribution of non-zero elements, with no cyclic or quasi-cyclic structure in the sub-matrices, increases the complexity of decoder implementation. A CUDA-based LDPC decoder design is proposed to support high-throughput parallel decoding of any unstructured LDPC code. By compressing and rearranging the LDPC check matrix and optimizing message storage, an efficient parallel decoding kernel is designed and implemented on the GPU for multi-frame decoding. Results on a GTX 1660 Ti GPU platform show that the throughput of the LLR-BP and NMSA decoding kernels based on the TPMP schedule reaches 78.88 to 360.25 Mbps and 174.38 to 1323.75 Mbps, respectively, realizing efficient parallel decoding for any unstructured LDPC code.