[P] DWARF: O(1) KV cache attention derived from heterodyne receiver physics
DWARF uses a fixed circular buffer (about 1.5GB, always, regardless of context length). The tradeoff is you don’t get full attention over the whole context, but the physics-derived offset set recovers most of what matters.
Core result: a fixed ~1.5GB KV cache at any context length (versus ~52GB for a standard 7B at 100K tokens), achieved by computing attention at 44 physics-derived dyadic offsets rather than all past positions.
Code has been public for two weeks with 500+ clones.
Paper is written and LaTeX-compiled, and available.
GitHub: https://github.com/Lanerra/DWARF
submitted by /u/MariusNocturnum
[link] [comments]
Like
0
Liked
Liked