DiscoRD: An Experimental Methodology for Quickly Discovering the Reliable Read Disturbance Threshold of Real DRAM Chips
arXiv:2603.12435v1 Announce Type: new
Abstract: State-of-the-art DRAM read disturbance mitigations rely on the read disturbance threshold (RDT) (e.g., the number of aggressor row activations needed to induce the first read disturbance bitflip) to securely and performance- and energy-efficiently prevent read disturbance bitflips. However, accurately and exhaustively characterizing the RDT of every DRAM row in a chip is time intensive. Rapidly determining RDT is important for enabling secure, performance- and energy-efficient systems. Our goal is to develop and evaluate a reliable and rapid read disturbance testing methodology. To that end, we develop DiscoRD building on the key results of an extensive experimental characterization study using 212 real DDR4 chips whereby we measure the RDT of hundreds of thousands of DRAM rows millions of times.
We develop an empirical model for read disturbance bitflips and evaluate the probability of read-disturbance-induced uncorrectable errors when a read disturbance mechanism is configured using a single $RDT_{min}$ measurement. Using this model we demonstrate that 1) relying on a lightweight error-correcting code (ECC) alone yields relatively high uncorrectable error probability and 2) combining ECC, infrequent memory scrubbing, and configurable read disturbance mitigation mechanisms can greatly reduce the error probability. Building on our observations and analyses, we discuss the RDT of each individual row can be identified more precisely. Our results show that error tolerance, memory scrubbing, online profiling, and run-time configurable read disturbance mitigation techniques are important to enable secure and energy-efficient spatial-variation aware read disturbance mitigations. We hope that DiscoRD drives research that enables us to quantitatively navigate the performance/cost – reliability tradeoff space for read disturbance mitigation techniques.