January 2026

Q-learning with Adjoint Matching

digitado ⋅ January 26, 2026

arXiv:2601.14234v2 Announce Type: replace-cross Abstract: We propose Q-learning with Adjoint Matching (QAM), a novel TD-based reinforcement learning (RL) algorithm that tackles a long-standing challenge in continuous-action RL: efficient optimization of an expressive diffusion or flow-matching policy with respect to a parameterized Q-function. Effective optimization requires exploiting the first-order information of the critic, but it is challenging to do so for flow or diffusion policies because direct gradient-based optimization via backpropagation through their multi-step denoising process is numerically unstable. […]

On Nonasymptotic Confidence Intervals for Treatment Effects in Randomized Experiments

digitado ⋅ January 26, 2026

arXiv:2601.11744v2 Announce Type: replace-cross Abstract: We study nonasymptotic (finite-sample) confidence intervals for treatment effects in randomized experiments. In the existing literature, the effective sample sizes of nonasymptotic confidence intervals tend to be looser than the corresponding central-limit-theorem-based confidence intervals by a factor depending on the square root of the propensity score. We show that this performance gap can be closed, designing nonasymptotic confidence intervals that have the same effective sample size as their asymptotic counterparts. Our approach involves […]

Joint learning of a network of linear dynamical systems via total variation penalization

digitado ⋅ January 26, 2026

arXiv:2511.18737v2 Announce Type: replace-cross Abstract: We consider the problem of joint estimation of the parameters of $m$ linear dynamical systems, given access to single realizations of their respective trajectories, each of length $T$. The linear systems are assumed to reside on the nodes of an undirected and connected graph $G = ([m], \mathcal{E})$, and the system matrices are assumed to either vary smoothly or exhibit a small number of “jumps” across the edges. We consider a total variation penalized […]

AutoSciDACT: Automated Scientific Discovery through Contrastive Embedding and Hypothesis Testing

digitado ⋅ January 26, 2026

arXiv:2510.21935v2 Announce Type: replace-cross Abstract: Novelty detection in large scientific datasets faces two key challenges: the noisy and high-dimensional nature of experimental data, and the necessity of making statistically robust statements about any observed outliers. While there is a wealth of literature on anomaly detection via dimensionality reduction, most methods do not produce outputs compatible with quantifiable claims of scientific discovery. In this work we directly address these challenges, presenting the first step towards a unified pipeline for […]

Flow Matching with Semidiscrete Couplings

digitado ⋅ January 26, 2026

arXiv:2509.25519v2 Announce Type: replace-cross Abstract: Flow models parameterized as time-dependent velocity fields can generate data from noise by integrating an ODE. These models are often trained using flow matching, i.e. by sampling random pairs of noise and target points $(\mathbf{x}_0,\mathbf{x}_1)$ and ensuring that the velocity field is aligned, on average, with $\mathbf{x}_1-\mathbf{x}_0$ when evaluated along a segment linking $\mathbf{x}_0$ to $\mathbf{x}_1$. While these pairs are sampled independently by default, they can also be selected more carefully by matching […]
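
The flow matching objective described in this abstract can be illustrated with a minimal sketch. It is not the paper's method (the semidiscrete coupling is not shown, since the abstract is truncated before describing it). The function names `interpolate` and `flow_matching_loss`, the toy data, and the two candidate velocity fields are all assumptions made for illustration. For simplicity the toy pairs are coupled by a fixed translation, so that the best velocity field is known in advance.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight segment from x0 (t=0) to x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def flow_matching_loss(velocity, pairs, rng):
    """Monte Carlo estimate of the squared error between velocity(x_t, t)
    and the displacement x1 - x0, with t drawn uniformly from [0, 1)."""
    total = 0.0
    for x0, x1 in pairs:
        t = rng.random()
        xt = interpolate(x0, x1, t)
        target = [b - a for a, b in zip(x0, x1)]
        pred = velocity(xt, t)
        total += sum((p - g) ** 2 for p, g in zip(pred, target))
    return total / len(pairs)

# Toy data (hypothetical): Gaussian noise points, each target is the
# noise point translated by the same fixed shift.
rng = random.Random(0)
shift = [2.0, -1.0]
pairs = []
for _ in range(100):
    x0 = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
    x1 = [a + s for a, s in zip(x0, shift)]
    pairs.append((x0, x1))

# Two candidate velocity fields: the constant shift and the zero field.
constant_field = lambda x, t: shift
zero_field = lambda x, t: [0.0, 0.0]

loss_const = flow_matching_loss(constant_field, pairs, rng)
loss_zero = flow_matching_loss(zero_field, pairs, rng)
print(loss_const, loss_zero)
```

In this toy setting the constant field matches every displacement, so its loss is zero up to rounding, while the zero field pays the squared length of the shift on every pair. In practice the velocity field would be a trained neural network rather than a fixed function.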

Differentiable Cyclic Causal Discovery Under Unmeasured Confounders

digitado ⋅ January 26, 2026

arXiv:2508.08450v3 Announce Type: replace-cross Abstract: Understanding causal relationships between variables is fundamental across scientific disciplines. Most causal discovery algorithms rely on two key assumptions: (i) all variables are observed, and (ii) the underlying causal graph is acyclic. While these assumptions simplify theoretical analysis, they are often violated in real-world systems, such as biological networks. Existing methods that account for confounders either assume linearity or struggle with scalability. To address these limitations, we propose DCCD-CONF, a novel framework for […]

Statistical Analysis of Conditional Group Distributionally Robust Optimization with Cross-Entropy Loss

digitado ⋅ January 26, 2026

arXiv:2507.09905v3 Announce Type: replace-cross Abstract: In multi-source learning with discrete labels, distributional heterogeneity across domains poses a central challenge to developing predictive models that transfer reliably to unseen domains. We study multi-source unsupervised domain adaptation, where labeled data are available from multiple source domains and only unlabeled data are observed from the target domain. To address potential distribution shifts, we propose a novel Conditional Group Distributionally Robust Optimization (CG-DRO) framework that learns a classifier by minimizing the worst-case […]

Bayesian Ensembling: Insights from Online Optimization and Empirical Bayes

digitado ⋅ January 26, 2026

arXiv:2505.15638v2 Announce Type: replace-cross Abstract: We revisit the classical problem of Bayesian ensembles and address the challenge of learning optimal combinations of Bayesian models in an online, continual learning setting. To this end, we reinterpret existing approaches such as Bayesian model averaging (BMA) and Bayesian stacking through a novel empirical Bayes lens, shedding new light on the limitations and pathologies of BMA. Further motivated by insights from online optimization, we propose Online Bayesian Stacking (OBS), a method that […]
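
As background for this abstract, classical Bayesian model averaging weights each model by its posterior probability, which is proportional to its prior times its marginal likelihood. The sketch below shows only that standard computation, not the Online Bayesian Stacking method, which the truncated abstract does not describe. The function name `bma_weights` and the log-evidence values are hypothetical.

```python
import math

def bma_weights(log_evidences, log_prior=None):
    """Posterior model probabilities from log marginal likelihoods.

    Uses a uniform prior over models when none is given, and subtracts
    the maximum score before exponentiating for numerical stability.
    """
    k = len(log_evidences)
    if log_prior is None:
        log_prior = [-math.log(k)] * k
    scores = [e + p for e, p in zip(log_evidences, log_prior)]
    top = max(scores)
    unnormalized = [math.exp(s - top) for s in scores]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]

# Hypothetical log marginal likelihoods for three candidate models.
weights = bma_weights([-10.0, -12.0, -30.0])
print(weights)
```

Because the weights depend exponentially on differences in log evidence, a gap of 20 nats leaves the third model with a nearly negligible weight in this example.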

Estimation of discrete distributions in relative entropy, and the deviations of the missing mass

digitado ⋅ January 26, 2026

arXiv:2504.21787v3 Announce Type: replace-cross Abstract: We study the problem of estimating a distribution over a finite alphabet from an i.i.d. sample, with accuracy measured in relative entropy (Kullback-Leibler divergence). While optimal bounds on the expected risk are known, high-probability guarantees remain less well-understood. First, we analyze the classical Laplace (add-one) estimator, obtaining matching upper and lower bounds on its performance and establishing its optimality among confidence-independent estimators. We then characterize the minimax-optimal high-probability risk and show that it […]
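
The Laplace (add-one) estimator mentioned in this abstract adds one pseudo-count to every symbol before normalizing, which keeps the relative entropy finite even for symbols that never appear in the sample. A minimal sketch follows; the function names, the three-letter alphabet, and the sample are hypothetical, and none of the paper's bounds are reproduced here.

```python
import math
from collections import Counter

def laplace_estimate(sample, alphabet):
    """Add-one estimator: (count + 1) / (n + d) for each symbol,
    where n is the sample size and d is the alphabet size."""
    counts = Counter(sample)
    n, d = len(sample), len(alphabet)
    return {a: (counts[a] + 1) / (n + d) for a in alphabet}

def kl_divergence(p, q):
    """Relative entropy KL(p || q) in nats; terms with p[a] == 0 contribute 0."""
    return sum(p[a] * math.log(p[a] / q[a]) for a in p if p[a] > 0)

# Hypothetical setup: symbol "c" has zero true probability and is never observed.
alphabet = ["a", "b", "c"]
true_p = {"a": 0.5, "b": 0.5, "c": 0.0}
sample = ["a", "a", "b", "a"]

q = laplace_estimate(sample, alphabet)   # {"a": 4/7, "b": 2/7, "c": 1/7}
risk = kl_divergence(true_p, q)
print(q, risk)
```

With the empirical frequencies in place of `q`, any symbol that has positive true probability but is missing from the sample would make the divergence infinite, which is the usual motivation for smoothing.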

Clustered random forests with correlated data for optimal estimation and inference under potential covariate shift

digitado ⋅ January 26, 2026

arXiv:2503.12634v2 Announce Type: replace-cross Abstract: We develop Clustered Random Forests, a random forests algorithm for clustered data, arising from independent groups that exhibit within-cluster dependence. The leaf-wise predictions for each decision tree making up clustered random forests take the form of a weighted least squares estimator, which leverages correlations between observations for improved prediction accuracy and tighter confidence intervals when performing inference. We show that approximately linear time algorithms exist for fitting classes of clustered random forests, matching […]
