Biogas Prediction Enhancement of a Swine Farm Bio-Digester Using a Lag-Based Surrogate Machine Learning Model

Estimating biogas production is one of the most important and challenging objectives in anaerobic digestion, owing to the complexity of the process dynamics and the lack of high-quality open-access datasets. This study presents a hybrid modeling framework that combines a mechanistic model, based on ordinary differential equations (ODEs), with a machine learning model. Rather than relying exclusively on experimental data, the proposed approach uses physics-informed synthetic data generation, complemented by lag-based feature engineering to capture the temporal dependencies present in the operational data of a bio-digester. Two configurations were evaluated: a baseline model and an enhanced version incorporating lag features and a simplified temperature profile. Although the enhanced model achieved high predictive performance (R² = 0.97885, RMSE = 131.80 L/d), additional analyses reveal that this performance is partly driven by temporal memory and remains sensitive to noise and feature composition. Rather than presenting the model as a final solution, this work frames it as a step toward practical digital twin implementations, acknowledging the gap that still exists between simulation-based accuracy and real-world reliability.
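
To illustrate what lag-based feature engineering means in general, the following minimal sketch builds lagged copies of a daily biogas series so that each row pairs the current value with its recent history. The function name `make_lag_features`, the choice of lags, and the sample values are hypothetical and chosen only for illustration; they are not taken from the study, which may construct its features differently.

```python
from typing import Dict, List, Sequence


def make_lag_features(
    series: Sequence[float], lags: Sequence[int]
) -> List[Dict[str, float]]:
    """Build one row per time step holding the target and its lagged values.

    Rows for which any requested lag would reach before the start of the
    series are dropped, so no row contains missing values.
    """
    if not lags or min(lags) < 1:
        raise ValueError("lags must be positive integers")
    max_lag = max(lags)
    rows = []
    for t in range(max_lag, len(series)):
        row = {"target": series[t]}
        for k in lags:
            # Value observed k steps before time t.
            row[f"lag_{k}"] = series[t - k]
        rows.append(row)
    return rows


# Hypothetical daily biogas flow in L/d (illustrative numbers only).
biogas = [1000.0, 1100.0, 1050.0, 1200.0, 1150.0]
rows = make_lag_features(biogas, lags=[1, 2])
for row in rows:
    print(row)
```

Because each row contains earlier values of the target itself, a model trained on such features can score well largely by carrying forward recent history, which is consistent with the abstract's caution that part of the reported performance is driven by temporal memory.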
