Clustered random forests with correlated data for optimal estimation and inference under potential covariate shift
arXiv:2503.12634v2 Announce Type: replace-cross Abstract: We develop Clustered Random Forests, a random forests algorithm for clustered data, arising from independent groups that exhibit within-cluster dependence. The leaf-wise predictions for each decision tree making up clustered random forests take the form of a weighted least squares estimator, which leverages correlations between observations for improved prediction accuracy and tighter confidence intervals when performing inference. We show that approximately linear time algorithms exist for fitting classes of clustered random forests, matching […]
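To make the leaf-wise weighted least squares idea concrete, the following is a minimal sketch of a constant prediction within a single leaf, not the authors' implementation. It assumes an exchangeable (equicorrelated) working correlation with a fixed parameter `rho` inside each cluster; that structure, along with the names `exchangeable_precision` and `leaf_wls_mean`, is hypothetical and chosen only for illustration.

```python
import numpy as np


def exchangeable_precision(n, rho):
    """Inverse of the n x n working correlation (1 - rho) * I + rho * 1 1^T.

    Assumption: an exchangeable working correlation; the abstract does not
    specify which weight structure the method uses.
    """
    corr = (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))
    return np.linalg.inv(corr)


def leaf_wls_mean(y, cluster_ids, rho):
    """Weighted least squares estimate of a constant prediction in one leaf.

    Solves for mu minimising sum_i (y_i - mu 1)^T W_i (y_i - mu 1), where
    y_i collects the responses of cluster i that fall in the leaf and W_i
    is the inverse working correlation for that cluster.
    """
    y = np.asarray(y, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    numerator = 0.0
    denominator = 0.0
    for c in np.unique(cluster_ids):
        y_c = y[cluster_ids == c]
        ones = np.ones(len(y_c))
        w = exchangeable_precision(len(y_c), rho)
        numerator += ones @ w @ y_c
        denominator += ones @ w @ ones
    return numerator / denominator


# Hypothetical leaf: three observations from cluster 0, one from cluster 1.
y_leaf = [1.0, 1.0, 1.0, 4.0]
clusters = [0, 0, 0, 1]
unweighted = leaf_wls_mean(y_leaf, clusters, rho=0.0)
weighted = leaf_wls_mean(y_leaf, clusters, rho=0.5)
```

With `rho = 0` the weights are identity matrices and the estimate reduces to the ordinary leaf mean. With positive `rho`, each additional observation from an already represented cluster contributes less new information, so the large cluster is down-weighted relative to the singleton.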