TrustLLM-Fin: A Privacy-Centric and Auditable Impact Assessment Framework for Large Language Models in Automated Financial Reporting

The promise of Large Language Models for automating financial reporting is currently hindered by a fundamental disconnect: popular models generate stochastic outputs, while the industry demands absolute precision and privacy. General-purpose LLMs excel at restating guidelines in natural language, yet remain prone to “knowledge overriding,” hallucinated financial figures, and adversarial prompt injections, shortcomings that pose existential threats to the fiduciary duties and regulatory compliance that institutions must uphold under the EU AI Act. We propose TrustLLM-Fin, a domain-specific, privacy-preserving, and auditable impact assessment framework that systematically models the real-world financial disclosure workflow. We identify orthogonal dimensions in the “Trust Surface” of LLM-powered FinTech agents, namely Prompt Safety & Privacy, Factuality & Robustness, and Auditability, and further quantify these abstract governance principles as a numerical “Trust Score” (TS) using the Analytic Hierarchy Process. We evaluate the efficacy of the TrustLLM-Fin framework on a high-fidelity financial dataset constructed from real-world SEC 10-K/10-Q filings. Strikingly, a standard general-purpose LLM scored a TS of only 0.234, owing to a 93.33% success rate for context-embedded adversarial attacks, whereas the Guardrail-Enhanced TrustLLM-Fin scored a TS of 0.710, revealing a critical decoupling between generative fluency and instruction following. Our results show that, in practice, trustworthiness is not an emergent property of scale but requires systemic alignment by design. The proposed framework provides auditors and financial institutions with a pragmatic, extensible means of ensuring that deployed generative agents remain transparent, robust, and auditable in a changing regulatory landscape.
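To make the aggregation step concrete, the sketch below illustrates how an Analytic Hierarchy Process could combine per-dimension scores into a single Trust Score. It is a minimal illustration only: the pairwise comparison matrix, the dimension scores, and the resulting weights are hypothetical placeholders, not the paper's calibrated values.

```python
import numpy as np

# Hypothetical AHP aggregation over the three Trust Surface dimensions:
# [Prompt Safety & Privacy, Factuality & Robustness, Auditability].
# A[i, j] encodes how much more important dimension i is than dimension j
# on Saaty's 1-9 scale, with A[j, i] = 1 / A[i, j]. Illustrative judgments only.
A = np.array([
    [1.0, 2.0, 3.0],
    [1/2, 1.0, 2.0],
    [1/3, 1/2, 1.0],
])

# AHP weights: principal eigenvector of A, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(A)
idx = np.argmax(eigvals.real)
weights = np.abs(eigvecs[:, idx].real)
weights /= weights.sum()

# Standard AHP consistency check (random index RI = 0.58 for a 3x3 matrix).
lambda_max = eigvals.real[idx]
ci = (lambda_max - len(A)) / (len(A) - 1)
assert ci / 0.58 < 0.10, "pairwise judgments are too inconsistent"

# Hypothetical per-dimension scores in [0, 1] for a model under evaluation.
dimension_scores = np.array([0.65, 0.80, 0.70])

# Trust Score as the weight-vector dot product with the dimension scores.
trust_score = float(weights @ dimension_scores)
print(f"weights = {np.round(weights, 3)}, Trust Score = {trust_score:.3f}")
```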
