Spatio-Temporal Data Model for Early Wildfire Detection
Early detection is a key tool for preventing wildfires. Modern machine learning algorithms integrated into large-scale monitoring systems enable automated surveillance of high-risk areas. However, single-image detection methods that ignore inter-frame dependencies and changes between consecutive images often fail to detect smoke plumes at the very early stage and at the larger distances critical for effective response, or they produce an elevated number of false alarms. Biological vision is particularly sensitive to motion cues, and this sensitivity translates well to automated systems. Recent temporal-memory approaches have demonstrated improved performance over purely spatial methods but typically rely on complex, computationally heavy multi-stage architectures.
This study investigates encoding temporal and contextual information into additional image channels as a basis for building data models with higher information content. Several distinct data models were proposed, and corresponding datasets were generated to train standard YOLO architectures. Experimental evaluation compared YOLO models trained on the information-enriched datasets with models trained on standard RGB images, and the best data model for early wildfire smoke detection was selected on this basis. Models trained on data containing spatio-temporal information achieved higher detection accuracy than those trained on standard RGB images, while low inference latency was preserved. The proposed approach shifts the focus to the structure and information content of the data while retaining the efficiency of standard convolutional neural network architectures. It could be applied to other problems requiring high efficiency and real-time operation, where temporal and contextual information can improve detection performance.
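To make the channel-encoding idea concrete, the sketch below shows one plausible realization (a hypothetical illustration, not the study's actual data model, whose exact channel definitions are not specified here): an inter-frame motion channel, computed as the absolute luminance difference between consecutive frames, appended to the RGB frame as a fourth channel.

```python
import numpy as np

# ITU-R BT.601 luminance weights for converting RGB to grayscale.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def build_temporal_input(frame_t: np.ndarray, frame_prev: np.ndarray) -> np.ndarray:
    """Stack an RGB frame with a motion channel derived from the previous frame.

    frame_t, frame_prev: uint8 arrays of shape (H, W, 3).
    Returns a (H, W, 4) uint8 array: the RGB channels of frame_t plus an
    absolute inter-frame luminance difference, which highlights moving
    structures such as drifting smoke plumes.
    """
    gray_t = frame_t.astype(np.float64) @ LUMA_WEIGHTS
    gray_p = frame_prev.astype(np.float64) @ LUMA_WEIGHTS
    motion = np.abs(gray_t - gray_p).astype(np.uint8)
    return np.dstack([frame_t, motion])
```

A detector such as YOLO can then be configured for 4-channel input and trained on these enriched tensors exactly as it would be on RGB images, which is what keeps the inference pipeline as lightweight as a purely spatial model.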