FLOWER and FaME: A Low Overhead Bit-Level Fault-map and Fault-Tolerance Approach for Deeply Scaled Memories

2020 IEEE International Symposium on High Performance Computer Architecture (HPCA), 2020

Abstract
Maintaining appropriate yields in deeply scaled technologies requires tolerating increasingly high fault rates. These fault rates far exceed what traditional general-purpose approaches such as ECC can handle, particularly when faults accrue over time. Effective fault tolerance at such high fault rates requires detailed bit-level knowledge of the location of faulty cells. We provide a solution to this problem in the form of a space-efficient, bit-level fault map called FLOWER. FLOWER uses Bloom filters to provide detailed fault characterization for a relatively small overhead. We demonstrate how FLOWER can enable improved fault tolerance at high fault rates by enhancing existing fault tolerance proposals, yielding 10–100x improvements. Using in-memory processing, FLOWER can report bit-level faults with high accuracy while maintaining less than 2% performance overhead at 10E-4 fault rates and less than 2% loss of memory density. Using a tuned novel hashing technique called MinCI, FLOWER for memory achieves considerably fewer false positives than disk-level hashing techniques at a fraction of the performance overhead. With PETAL bits, a new technique to protect against errors during in-memory operations, FLOWER can remain resilient against random errors while efficiently targeting predictable errors. Furthermore, we propose a new fault tolerance scheme called FaME, which provides ultra-efficient bit-level sparing by using the FLOWER fault map to identify the location of faults. FLOWER+FaME can achieve 14x longer PCM lifetime with half the area overhead of SECDED ECC.
Keywords
Fault tolerance,Bloom filters,Hash functions,DRAM,Phase Change Memory,Processing in memory
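
The abstract describes a fault map built on Bloom filters. As a rough illustration of that general idea only, the following Python sketch stores faulty bit addresses in a plain Bloom filter. It is not FLOWER itself: the class name `BloomFaultMap`, its parameters, and the use of salted BLAKE2b as the hash family are assumptions made for this example, and it models neither MinCI hashing, PETAL bits, nor in-memory processing.

```python
import hashlib


class BloomFaultMap:
    """Toy Bloom-filter fault map over bit addresses (hypothetical, illustrative only)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits          # size of the filter in bits
        self.num_hashes = num_hashes      # number of hash functions (k)
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, bit_address: int):
        # Assumption: a generic hash family, one salted BLAKE2b digest per hash index.
        data = bit_address.to_bytes(8, "little")
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=i.to_bytes(2, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def mark_faulty(self, bit_address: int) -> None:
        # Record a faulty cell by setting every position its hashes select.
        for pos in self._positions(bit_address):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def maybe_faulty(self, bit_address: int) -> bool:
        # True means "possibly faulty" (false positives can occur);
        # False means "definitely not recorded as faulty".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(bit_address)
        )


fault_map = BloomFaultMap(num_bits=1 << 16, num_hashes=4)
known_faults = [12, 4096, 70001, 123456789]
for addr in known_faults:
    fault_map.mark_faulty(addr)
```

A Bloom filter never reports a recorded fault as healthy, but it can flag a healthy bit as faulty. That false-positive rate is the quantity the abstract says MinCI is tuned to reduce.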