YANG Bing, MAO Zhi-gang, CHEN Xiao, YIN Jie-ming. Speculative Clustered L0 Caches Design in Clustered Processor[J]. Microelectronics & Computer, 2010, 27(7): 15-20.

Speculative Clustered L0 Caches Design in Clustered Processor

Abstract: Clustering is an effective technique for further improving the performance of superscalar processors. It allows a larger instruction window and a wider issue width, but it also places heavy demands on the cache system, which needs lower memory access latency and better scalability. With speculative clustered L0 caches, the processor speculatively accesses a small, simple, fast L0 cache inside each cluster on a memory access, which largely hides the access latency of the lower-level, higher-capacity cache. Simulation results show that, in an 8-cluster (1x8) clustered processor, a 4 kB, 2-way set-associative clustered L0 cache improves processor performance (IPC) by 5.6% on average, and by more than 20% on some benchmark programs.