GUO Jia-le, JIANG Lin, SHAN Rui, CUI Peng-fei, WU Xin. Design of Cluster Memory Structure for Reconfigurable Video Array Processor[J]. Microelectronics & Computer, 2017, 34(9): 116-120, 125.

Design of Cluster Memory Structure for Reconfigurable Video Array Processor

  • Abstract: This paper proposes an efficient parallel-access memory structure for use within a processor cluster. The structure follows a "logically shared, physically distributed" approach in which multiple memory blocks are accessed in parallel, enabling parallel memory access for a 4×4 video array processor. Experimental results show that, in the absence of access conflicts, the structure supports simultaneous read/write operations by 16 lightweight processing elements, with a maximum frequency of 200 MHz and a peak memory access bandwidth of 6.25 GB/s. Finally, an 8×8 two-dimensional discrete cosine transform (DCT) was mapped onto the structure and its performance compared. The cluster memory structure provides this algorithm with a data access bandwidth of 312.2 Msamples/s. Relative to array structures of the same type, the number of execution cycles is reduced by 31.67%, the operating frequency is doubled, and the memory access bandwidth is increased by 192.60%.
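The "logically shared, physically distributed" idea can be illustrated with a small software model. This is only a sketch under stated assumptions and not the paper's hardware design: the abstract does not give the number of banks, the bank depth, or the address mapping, so `NUM_BANKS`, `BANK_WORDS`, `split_address`, and `find_conflicts` are all hypothetical. The model treats memory as one flat address space, splits each address into a bank index and an offset, and reports which banks are targeted by more than one processing element (PE) in the same cycle.

```python
from collections import defaultdict

NUM_BANKS = 16    # assumption: one memory bank per PE in the 4x4 array
BANK_WORDS = 256  # hypothetical bank depth, not taken from the paper


def split_address(addr):
    """Map a flat (logically shared) address to a (bank, offset) pair.

    Assumption: the high-order part of the address selects the bank
    and the low-order part is the offset inside that bank.
    """
    return (addr // BANK_WORDS) % NUM_BANKS, addr % BANK_WORDS


def find_conflicts(requests):
    """Return {bank: [pe_ids]} for every bank requested by more than one PE.

    `requests` maps a PE id to the flat address it accesses in this cycle.
    """
    by_bank = defaultdict(list)
    for pe, addr in sorted(requests.items()):
        bank, _ = split_address(addr)
        by_bank[bank].append(pe)
    return {bank: pes for bank, pes in by_bank.items() if len(pes) > 1}


# Every PE touches a different bank, so all 16 accesses are conflict-free.
local = {pe: pe * BANK_WORDS + 3 for pe in range(NUM_BANKS)}

# PE 0 and PE 1 both target bank 5, so those two accesses would conflict.
clash = {0: 5 * BANK_WORDS, 1: 5 * BANK_WORDS + 7, 2: 2 * BANK_WORDS}
```

In this model, `find_conflicts(local)` is empty, which corresponds to the conflict-free case in which all 16 PEs can read or write at once, while `find_conflicts(clash)` flags bank 5 as shared by PE 0 and PE 1.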

     
