Searching of Training Images with Rich Features Required for Generalization Performance of CNN Models Using Interactive Genetic Algorithms

Selecting training parameters for convolutional neural networks (CNNs) and determining the amount of training data required for reliable generalization remain challenging and often time-consuming tasks, typically relying on manual trial-and-error. While genetic algorithms (GAs) have been applied to hyperparameter tuning, less attention has been given to how the proportion of training data influences generalization performance.
In this study, we propose an interactive GA-based framework that simultaneously optimizes key training parameters and the image usage rate, defined as the proportion of training images used during learning. The approach is implemented within a MATLAB-based environment, allowing parameters to be adjusted dynamically during the optimization process.
Experimental results on datasets including CIFAR-10 and EuroSAT show that the proposed method can achieve classification performance comparable to manually tuned models while using only part of the available training data. In particular, similar accuracy levels were obtained with image usage rates of approximately 70–95%, suggesting that not all training samples contribute equally to model performance.
These findings indicate that incorporating data usage into the optimization process can support more efficient CNN training and provide guidance for selecting both training parameters and data subsets in practical applications.
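
To make the idea of an image usage rate concrete, the following is a minimal Python sketch, not the authors' MATLAB implementation. It assumes a chromosome holding two illustrative training parameters (learning rate and batch size) plus the usage rate, and a simple elitist GA loop. All function names, parameter ranges, and the toy fitness function are hypothetical; in a real run the fitness would be the validation accuracy of a CNN trained on the subset returned by select_subset, and the interactive adjustment of parameters described above is not modeled here.

```python
import random


def select_subset(n_images, usage_rate, seed=0):
    """Return sorted indices of the training images kept at the given usage rate."""
    if not 0.0 < usage_rate <= 1.0:
        raise ValueError("usage_rate must be in (0, 1]")
    k = max(1, round(n_images * usage_rate))
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_images), k))


def random_individual(rng):
    """Hypothetical chromosome: two training parameters plus the image usage rate."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),
        "batch_size": rng.choice([32, 64, 128]),
        "usage_rate": rng.uniform(0.5, 1.0),
    }


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one of the two parents."""
    return {key: (a if rng.random() < 0.5 else b)[key] for key in a}


def mutate(ind, rng, sigma=0.05):
    """Perturb the continuous genes and clamp them to assumed valid ranges."""
    child = dict(ind)
    rate = child["usage_rate"] + rng.gauss(0, sigma)
    child["usage_rate"] = min(1.0, max(0.05, rate))
    lr = child["learning_rate"] * 10 ** rng.gauss(0, 0.2)
    child["learning_rate"] = min(1e-1, max(1e-4, lr))
    return child


def evolve(fitness, generations=10, pop_size=8, seed=0):
    """Elitist GA: keep the better half, refill with mutated offspring."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


def toy_fitness(ind):
    """Placeholder for CNN validation accuracy; peaks at an arbitrary usage rate of 0.8."""
    return -abs(ind["usage_rate"] - 0.8)


if __name__ == "__main__":
    best = evolve(toy_fitness)
    kept = select_subset(50000, best["usage_rate"])
    print(best, len(kept))
```

Because each fitness evaluation would mean training a CNN, a practical version would cache fitness values per individual rather than recomputing them during sorting, as this sketch does for brevity.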
