Privacy Guard & Token Parsimony by Prompt and Context Handling and LLM Routing
arXiv:2603.28972v1 Announce Type: new

Abstract: The large-scale adoption of Large Language Models (LLMs) forces a trade-off between operational cost (OpEx) and data privacy. Current routing frameworks reduce costs but ignore prompt sensitivity, exposing users and institutions to leakage risks to third-party cloud providers. We formalise the “Inseparability Paradigm”: advanced context management intrinsically coincides with privacy management. We propose a local “Privacy Guard” — a holistic contextual observer powered by an on-premise Small Language Model (SLM) — that performs […]
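The routing trade-off the abstract describes can be sketched as a sensitivity-gated router: a local classifier scores each prompt, sensitive prompts stay on an on-premise model, and non-sensitive ones go to the cheaper cloud endpoint. This is a minimal illustration under stated assumptions — the keyword heuristic below is a stand-in for the paper's on-premise SLM observer, and all names and thresholds are hypothetical, not the authors' implementation.

```python
# Illustrative sketch of sensitivity-gated LLM routing.
# Hypothetical names throughout; the keyword heuristic stands in for the
# on-premise SLM classifier described in the abstract.

SENSITIVE_MARKERS = {"password", "ssn", "diagnosis", "salary", "account number"}

def sensitivity_score(prompt: str) -> float:
    """Fraction of sensitive markers present in the prompt
    (0.0 = public, 1.0 = highly sensitive)."""
    text = prompt.lower()
    hits = sum(marker in text for marker in SENSITIVE_MARKERS)
    return hits / len(SENSITIVE_MARKERS)

def route(prompt: str, threshold: float = 0.0) -> str:
    """Route to the local model if any sensitivity is detected,
    otherwise to the cheaper third-party cloud endpoint."""
    if sensitivity_score(prompt) > threshold:
        return "local-slm"   # data never leaves the premises
    return "cloud-llm"       # lower OpEx, but the provider sees the prompt

print(route("Summarise this public press release"))      # cloud-llm
print(route("My SSN is 123-45-6789, check my account"))  # local-slm
```

In a real deployment the scoring function would be the SLM itself, so the same component that manages context also enforces the privacy boundary — the "inseparability" the abstract formalises.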