High-dimensional censored MIDAS logistic regression for corporate survival forecasting

arXiv:2502.09740v2 Announce Type: replace-cross
Abstract: This paper addresses the challenge of forecasting corporate distress, a problem marked by three key statistical hurdles: (i) right censoring, (ii) high-dimensional predictors, and (iii) mixed-frequency data. To overcome these complexities, we introduce a novel high-dimensional censored MIDAS (Mixed Data Sampling) logistic regression. Our approach handles censoring through inverse probability weighting and achieves accurate estimation with numerous mixed-frequency predictors by employing a sparse-group penalty. We establish finite-sample bounds for the estimation error, accounting for censoring, MIDAS approximation error, and heavy tails. For statistical inference, we develop a de-sparsified version of the proposed penalized estimator and establish its asymptotic theory, which enables valid statistical inference in high-dimensional settings with censoring. We show that censoring induces a nonstandard variance structure for the de-sparsified estimator, a feature that, to the best of our knowledge, has not been studied in the existing literature. The superior performance of the method is demonstrated through Monte Carlo simulations. Finally, we present an extensive application of our methodology to predict the financial distress of Chinese-listed firms and to identify covariates that are statistically significant for predicting distress. Our novel procedure is implemented in the R package \texttt{Survivalml}.
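
To make the inverse-probability-weighting idea in the abstract concrete, here is a minimal Python sketch under stated assumptions. It is not the paper's estimator and not the `Survivalml` API: the function names (`km_censoring_survival`, `ipcw_weights`, `weighted_logistic_loss`), the Kaplan-Meier estimate of the censoring distribution, and the specific weight formula are illustrative choices, and the sparse-group penalty and MIDAS lag aggregation are omitted.

```python
import numpy as np

def km_censoring_survival(times, events):
    """Kaplan-Meier estimate of G(u) = P(C > u), treating censoring as the 'event'.

    times  : observed times min(T, C)
    events : 1 if the event of interest was observed, 0 if censored
    """
    times = np.asarray(times, dtype=float)
    censored = 1 - np.asarray(events, dtype=int)
    uniq = np.unique(times[censored == 1])  # sorted distinct censoring times
    factors = np.array([
        1.0 - np.sum((times == u) & (censored == 1)) / np.sum(times >= u)
        for u in uniq
    ])
    surv = np.cumprod(factors)

    def G(u, left_limit=False):
        # Count censoring times <= u (or < u for the left limit G(u-)).
        side = "left" if left_limit else "right"
        k = np.searchsorted(uniq, u, side=side)
        return 1.0 if k == 0 else float(surv[k - 1])

    return G

def ipcw_weights(times, events, horizon):
    """Assumed weighting scheme for the binary outcome 1{T <= horizon}:
    events observed by the horizon get 1/G(Y-), subjects still at risk
    after the horizon get 1/G(horizon), and subjects censored before
    the horizon get weight 0. Assumes G(horizon) > 0."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    G = km_censoring_survival(times, events)
    w = np.zeros(len(times))
    for i, (y, d) in enumerate(zip(times, events)):
        if y > horizon:
            w[i] = 1.0 / G(horizon)
        elif d == 1:
            w[i] = 1.0 / G(y, left_limit=True)
    return w

def weighted_logistic_loss(beta, X, labels, weights):
    """Weighted negative log-likelihood of a logistic model (no penalty)."""
    z = X @ beta
    return float(np.mean(weights * (np.logaddexp(0.0, z) - labels * z)))
```

In this sketch, observations censored before the forecast horizon drop out of the loss, and the remaining observations are up-weighted to compensate. A penalized fit would minimize `weighted_logistic_loss` plus a sparse-group penalty over the coefficient groups.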
