High Performance Cache Replacement Using Re-Reference Interval Prediction (RRIP)

by Thomas Pinon (Leader / Presentor), Eric Morgan (scribe), James S. Tappert (blogger) January 27, 2026

Introduction

Cache replacement policies play a central role in determining the effectiveness of modern cache hierarchies. Despite decades of architectural evolution, the dominant policy remains Least Recently Used (LRU), largely due to its intuitive appeal and historical simplicity. However, LRU operates off the assumption that recent access implies near-future reuse. While often true, this assumption fails for a wide range of contemporary workloads, including streaming applications, large working sets, and multiprogrammed environments where interference dominates cache behavior.

The paper High Performance Cache Replacement Using Re-reference Interval Prediction (RRIP) introduces RRIP as an alternative replacment policy to LRU. Rather than estimating reuse indirectly through recency, RRIP explicitly predicts how far in the future a cache block will be referenced again. This alternative prediction method enables replacement decisions that more closely approximate optimal behavior, while still remaining implementable in real hardware.

Background and Motivation

Limitations of LRU

LRU assumes that recently accessed data will be reused in the near future. This assumption breaks down in several common scenarios:

Streaming workloads, where data is accessed once and never reused
Large working sets that exceed cache capacity, leading to thrashing
Mixed access patterns, where frequently reused data structures coexist with large scans

In such cases, LRU allows non-temporal data to evict frequently reused cache blocks, significantly degrading performance.

Key Concepts and Terminology

Re-reference Interval Prediction (RRIP): A cache replacement framework that predicts how long it will be before a cache block is reused, rather than relying solely on recency.
Re-reference Prediction Value (RRPV): A small per-cache-line counter that encodes the predicted distance to the next reuse. Larger values indicate farther reuse.
Static RRIP (SRRIP): A policy that inserts new cache lines with a long predicted re-reference interval, providing strong resistance to cache pollution from streaming accesses.
Bimodal RRIP (BRRIP): A variant that usually inserts cache lines with a very long re-reference interval but occasionally inserts them as likely-to-be-reused, improving performance under thrashing workloads.
Dynamic RRIP (DRRIP): A hybrid policy that dynamically selects between using SRRIP or BRIPP using Set Dueling.
Set Dueling: A technique that dedicates a small number of cache sets to competing policies and uses their miss behavior to select the most effective policy at runtime.

Overview of the RRIP Mechanism

RRIP associates each cache block with an RRPV counter that represents its predicted reuse distance. Conceptually:

RRPV = 0 indicates an expected near-immediate reuse
RRPV = Max indicates reuse far in the future or no reuse at all

Replacement Policy

On a cache miss:

The cache searches for a block with the maximum RRPV.
If no such block exists, all RRPVs are incremented until at least one reaches the maximum value.
A block with the maximum RRPV is selected for eviction.

On a cache hit:

The accessed block’s RRPV is decremented, indicating high reuse potential.

Scan Resistance and Adaptivity

SRRIP inserts new cache blocks with long predicted reuse distances, preventing streaming data from displacing frequently reused blocks. DRRIP extends this approach by dynamically choosing between SRRIP and BRRIP based on runtime behavior. Using set dueling, DRRIP adapts to workload characteristics without software intervention.

Evaluation and Results

The authors evaluate RRIP across a wide range of single-core and multi-core workloads. Key findings include:

Performance Improvements
- SRRIP achieves approximately 4–7% performance improvement over LRU
- DRRIP achieves approximately 9–10% improvement over LRU
Hardware Overhead
- Only 2 bits per cache block are required
- Lower complexity than LRU implementations
Robustness
- Strong resistance to cache pollution from scans
- Stable performance across diverse application behaviors
Multicore Benefits
- Reduced inter-core cache interference
- Improved overall throughput and fairness

These results demonstrate that RRIP provides consistent gains with minimal additional hardware cost.

Strengths and Limitations

Strengths

Conceptual Clarity: Explicitly modeling reuse distance aligns more closely with optimal replacement behavior than recency-based policies.
Low Implementation Cost: RRIP requires minimal state and simpler logic than LRU.
Adaptability: DRRIP dynamically adjusts to workload behavior without programmer or operating system involvement.

Limitations

Predictive Approximation: RRPVs provide only a coarse approximation of reuse distance, especially with small counters, and irregular access patterns may still lead to poor predictions.
Evaluation Scope: Some highly pointer-intensive or unpredictable workloads are not extensively explored.

Class Discussion

Where is the data going when it is not recently used? Is there demotion between levels of cache?:
- Yes, when you kick something from the L1, it goes to the L2. Caches are generally inclusive.
Why is this not implemented in higher levels of cache?:
- Latency constraints and hardware overhead. More computational logic for finding near and far values.. Not much gain in efficiency.
What eviction policy is used in L1?:
- NRU is used. Only one bit and very fast.
How often is L3 accessed in workloads such as gaming? Are the performance gains worth it?:"
- Depends on the workload. The paper showed that performance increased in all benchmarks expect photoshop. Performance increases across the board seem to show L3 cache access. L3 is designed to minimize cache misses, so additional hardware is fine because latency is not an issue.
What hardware was used to test this?
- CNP-based x86 simulator, as well as games, and productivity apps.

Conclusion

The RRIP framework illustrates that cache replacement policies can be both simple and highly effective when they directly model reuse behavior. By predicting re-reference intervals instead of relying on recency, RRIP consistently outperforms traditional LRU while maintaining low hardware overhead. The success of DRRIP further demonstrates the importance of adaptive policies in handling modern, diverse workloads.

References

Jaleel, A., et al. High Performance Cache Replacement Using Re-reference Interval Prediction (https://dl.acm.org/doi/10.1145/1815961.1815971)

Generative AI Disclosure

ChatGPT was used to generate a Markdown file template and check spelling and grammar.
Generative AI can be useful tools for tasks such as summarizing or drafting, however, they may give innacurate information confidently and should always have generated information validated

The CS/ECE 4/599 Course Blog