[D] Unpopular opinion: “context window size” is a red herring if you don’t control what goes in it.

We keep talking about 128k, 200k, 1M-token context windows. But if the model is bad at using what sits in the middle of the window, or we’re stuffing it with noise, a bigger window just means more cost and more confusion. I’d rather have a small, curated context than a huge dump.

Curious if others think the real problem is context formation (what we put in, in what order, and how we compact it), not raw size. What’s your take?
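
To make "what we put in, in what order" a bit more concrete, here is a rough sketch, not a claim about how any particular system does it. Everything in it is an assumption for illustration: the `Chunk` type, the `score` field (some relevance score you already have), the whitespace-based `count_tokens` stand-in for a real tokenizer, and the `build_context` helper. It keeps the highest-scoring chunks that fit a token budget, then places the strongest ones at the edges and the weakest in the middle. Compaction (summarizing or trimming chunks) is left out.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    score: float  # hypothetical relevance score; higher means more relevant


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return len(text.split())


def build_context(chunks: list[Chunk], budget: int) -> list[Chunk]:
    # Step 1 (what goes in): greedily keep the highest-scoring chunks
    # that still fit in the token budget; skip any chunk that would overflow.
    kept: list[Chunk] = []
    used = 0
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        cost = count_tokens(chunk.text)
        if used + cost <= budget:
            kept.append(chunk)
            used += cost

    # Step 2 (in what order): alternate between the front and the back,
    # so the best chunk comes first, the second best comes last,
    # and the weakest kept chunks end up in the middle.
    front: list[Chunk] = []
    back: list[Chunk] = []
    for i, chunk in enumerate(kept):
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]


if __name__ == "__main__":
    candidates = [
        Chunk("a b c", 0.9),
        Chunk("d e", 0.1),
        Chunk("f g h i", 0.5),
        Chunk("j", 0.7),
        Chunk("k l m n o p q r", 0.8),
    ]
    for chunk in build_context(candidates, budget=10):
        print(chunk.score, chunk.text)
```

The edge-first ordering is just one plausible response to the "bad at using the middle" problem; whether it helps depends on the model, so treat it as something to measure rather than a rule.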

submitted by /u/hack_the_developer