WANG Zhi-jun, LIANG Li-ping, HONG Qin-zhi, LUO Han-qing, WANG Die, ZHAO Chun. The Architecture of an Unified DSP Plus General-purpose CPU and the Implementation of a 4-core Homogeneous Processor[J]. Microelectronics & Computer, 2014, 31(10): 32-38.

The Architecture of an Unified DSP Plus General-purpose CPU and the Implementation of a 4-core Homogeneous Processor

  • Abstract: This paper proposes a processor architecture that unifies a DSP and a general-purpose CPU, and presents the design and tape-out verification of a homogeneous 4-core processor based on it. The processor is built on a VLIW structure, supports an independently defined DSP instruction set, is compatible with the MIPS 4KC instruction set, and can issue up to 8 instructions in parallel. Without changing the CPU instruction encoding or execution order, it integrates DSP and CPU execution in a single chip structure. This makes it suitable for embedded applications that perform both signal processing and protocol processing for broadband communication and multimedia on one platform. The core issues DSP and CPU instructions in parallel by means of leading and trailing parallel flag bits in the custom DSP instruction word and a dedicated 'paralink' prefix instruction. In the homogeneous 4-core architecture, a global-read, local-write strategy for on-chip data memory lets the cores share on-chip data while keeping hardware overhead under control. Simulation and chip test results show that the proposed unified DSP and CPU architecture is feasible and has advantages in embedded applications such as broadband communication and multimedia.

     
