Abstract:
This paper proposes a new software-based fault-tolerance model to ensure the reliability of general-purpose computation on graphics hardware (GPGPU) on CPU-GPU heterogeneous platforms. After analyzing the transient fault occurrence modes and error propagation of GPGPU, fault tolerance is designed on both the CPU side and the GPU side. Based on the characteristics of GPGPU, an optimized fault-tolerance scheme is proposed that reduces computational overhead and improves system interoperability. In addition, the scheme reduces the overhead introduced by the fault-tolerance design while improving the reliability of GPGPU programs. Finally, the feasibility and performance of the proposed model are tested and verified on typical examples.