GlyRAG: Context-Aware Retrieval-Augmented Framework for Blood Glucose Forecasting
arXiv:2601.05353v1 Announce Type: new
Abstract: Accurate forecasting of blood glucose from CGM is essential for preventing dysglycemic events, thus enabling proactive diabetes management. However, current forecasting models treat blood glucose readings captured using CGMs as a numerical sequence, either ignoring context or relying on additional sensors/modalities that are difficult to collect and deploy at scale. Recently, LLMs have shown promise for time-series forecasting tasks, yet their role as agentic context extractors in diabetes care remains largely unexplored. To address these limitations, we propose GlyRAG, a context-aware, retrieval-augmented forecasting framework that derives semantic understanding of blood glucose dynamics directly from CGM traces without requiring additional sensor modalities. GlyRAG employs an LLM as a contextualization agent to generate clinical summaries. These summaries are embedded by a language model and fused with patch-based glucose representations in a multimodal transformer architecture with a cross translation loss aligining textual and physiological embeddings. A retrieval module then identifies similar historical episodes in the learned embedding space and uses cross-attention to integrate these case-based analogues prior to making a forecasting inference. Extensive evaluations on two T1D cohorts show that GlyRAG consistently outperforms state-of-the art methods, achieving up to 39% lower RMSE and a further 1.7% reduction in RMSE over the baseline. Clinical evaluation shows that GlyRAG places 85% predictions in safe zones and achieves 51% improvement in predicting dysglycemic events across both cohorts. These results indicate that LLM-based contextualization and retrieval over CGM traces can enhance the accuracy and clinical reliability of long-horizon glucose forecasting without the need for extra sensors, thus supporting future agentic decision-support tools for diabetes management.