Convergence Rate of a Functional Learning Method for Contextual Stochastic Optimization
We consider a stochastic optimization problem involving two random variables: a context variable $X$ and a dependent variable $Y$. The objective is to minimize the expected value of a nonlinear loss functional applied to the conditional expectation $mathbb{E}[f(X, Y,β) mid X]$, where $f$ is a nonlinear function and $β$ represents the decision variables. We focus on the practically important setting in which direct sampling from the conditional distribution of $Y mid X$ is infeasible, and only a stream of i.i.d. observation pairs ${(X^k, Y^k)}_{k=0,1,2,ldots}$ is available. In our approach, the conditional expectation is approximated within a prespecified parametric function class. We analyze a simultaneous learning-and-optimization algorithm that jointly estimates the conditional expectation and optimizes the outer objective, and establish that the method achieves a convergence rate of order $mathcal{O}big(1/sqrt{N}big)$, where $N$ denotes the number of observed pairs.