Retrieval-Augmented Generation Enhanced GPT-4.1 to Support Clinical Trial Informed Consent Review for Data Reuse
Background: Ethical and regulatory frameworks such as the Belmont Report, the Common Rule, and the Declaration of Helsinki require informed consent to ensure participants understand a study’s purpose and can make voluntary decisions about their involvement. Regulations including the General Data Protection Regulation (Regulation (EU) 2016/679) further emphasise that consent must be freely given and revocable without disadvantage. Although informed consent forms (ICFs) are intended to be clear and accessible, they have become increasingly lengthy and complex. Large language models (LLMs) offer potential to navigate and interpret this complexity and have shown promise in biomedical information extraction tasks. However, their susceptibility to hallucinations limits their reliability in high-stakes settings. Retrieval-augmented generation (RAG) can mitigate such errors by grounding model outputs in retrieved source text. This study evaluates the integration of LLMs with RAG for reviewing data reuse language in ICFs and assesses the ability of this combination to interpret complex textual structures.
Methods: We first processed 438 ICFs from different trials, covering multiple countries, languages, and ICF versions. Using expert-curated prompts, we extracted information about data reuse with GPT-4.1. We compared the LLM-generated data reuse outputs with human expert ground truth to evaluate accuracy and the time required to extract information from each ICF. To further validate the workflow, we evaluated an independent set of 488 ICFs spanning additional trials, languages, and regions. For this cohort, we assessed the correctness of the LLM outputs along with the quality of the supporting evidence provided by the model.
Results: Across the 438 ICFs, the system achieved 81.6% accuracy. After prompt optimisation, accuracy rose to 90% in the subsequent evaluation of the additional 488 ICFs. Using a RAG-based approach, the system accurately extracted data reuse information across multiple languages and identified nuanced international regulatory requirements.
Conclusion: This approach has the potential to substantially alleviate administrative burdens by automating labour-intensive processes, while also generating insights that could inform the standardisation of ICF creation. Ultimately, these advancements may help reduce the complexity of ICFs, thereby improving their readability and comprehensibility for participants.
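As an illustration only, the retrieve-then-prompt step and the accuracy comparison described in the Methods could be sketched as follows. This is a minimal sketch under stated assumptions, not the study's implementation: the function names (`retrieve`, `build_prompt`, `accuracy`), the prompt wording, and the token-overlap scoring are all hypothetical. A real RAG pipeline would more likely use embedding-based retrieval, and the call to GPT-4.1 is deliberately left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(passages, query, k=2):
    """Return the k passages sharing the most tokens with the query.

    Token overlap is a simple stand-in (an assumption) for the
    embedding-based retrieval a production RAG system would use.
    """
    query_counts = Counter(tokenize(query))

    def score(passage):
        passage_counts = Counter(tokenize(passage))
        return sum(min(query_counts[w], passage_counts[w]) for w in query_counts)

    # sorted() is stable, so ties keep their original document order.
    return sorted(passages, key=score, reverse=True)[:k]


def build_prompt(question, evidence):
    """Assemble a prompt that restricts the model to the retrieved excerpts.

    The wording is hypothetical; the study's expert-curated prompts
    are not reproduced here.
    """
    context = "\n".join(f"- {passage}" for passage in evidence)
    return (
        "Answer using only the consent form excerpts below, "
        "and quote the excerpt that supports your answer.\n"
        f"{context}\n\nQuestion: {question}"
    )


def accuracy(predictions, ground_truth):
    """Fraction of model outputs that match the expert labels exactly."""
    if not predictions or len(predictions) != len(ground_truth):
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(p == g for p, g in zip(predictions, ground_truth))
    return correct / len(predictions)


if __name__ == "__main__":
    # Toy ICF excerpts, invented purely for illustration.
    icf_passages = [
        "Your samples will be stored for ten years.",
        "Your coded data may be reused for future research.",
        "You may withdraw at any time.",
    ]
    question = "May data be reused for future research?"
    evidence = retrieve(icf_passages, question, k=1)
    print(build_prompt(question, evidence))
    print(accuracy(["yes", "no", "yes", "yes"], ["yes", "no", "no", "yes"]))
```

Restricting the prompt to retrieved excerpts is what lets a reviewer check each answer against its supporting evidence, which is the property the second evaluation cohort assesses.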