Compression Method Matters: Benchmark-Dependent Output Dynamics in LLM Prompt Compression
arXiv:2603.23527v1 Announce Type: new Abstract: Prompt compression is often evaluated by input-token reduction, but its real deployment impact depends on how compression changes output length and total inference cost. We present a controlled replication and extension study of benchmark-dependent output dynamics under aggressive compression, covering 5,400 API calls across three benchmarks and multiple providers. To explain conflicting prior observations, we formalize instruction survival probability (Psi), a structural metric that captures whether task-critical prompt segments remain after truncation. Results […]