Sample-Efficient Optimization over Generative Priors via Coarse Learnability
arXiv:2503.06917v4 Announce Type: replace-cross
Abstract: In zeroth-order optimization, we seek to minimize a function $d(cdot)$, which may encode combinatorial feasibility, using only function evaluations. We focus on the setting where solutions must also satisfy qualitative constraints or conform to a complex prior distribution. To address this, we introduce a new framework in which such constraints are represented by an initial generative prior $L(cdot)$, for example, a Large Language Model (LLM). The objective is to find solutions $s$ that minimize $d(s)$ while having high probability under $L(s)$, effectively sampling from a target distribution proportional to $L(s) cdot e^{-T cdot d(s)}$ for a temperature parameter $T$.
While this framework aligns with classical Model-Based Optimization (e.g., the Cross-Entropy method), existing theory is ill-suited for deriving sample complexity bounds in black-box deep generative models. We therefore propose a novel learning assumption, which we term emph{coarse learnability}, where an agent with access to a polynomial number of samples can learn a model whose point-wise density approximates the target within a polynomial factor. Leveraging this assumption, we design an iterative algorithm that employs a Metropolis-Hastings correction to provably approximate the target distribution using a polynomial number of samples. To the best of our knowledge, this is one of the first works to establish such sample-complexity guarantees for model-based optimization with deep generative priors.
We provide two lines of evidence supporting the coarse learnability assumption. Theoretically, we show that maximum likelihood estimation naturally induces the required coverage properties, holding for both standard exponential families and for misspecified models. Empirically, we demonstrate that LLMs can adapt their learned distributions to zeroth-order feedback to solve combinatorial optimization problems.