February 2026

Multi-layer Cross-Attention is Provably Optimal for Multi-modal In-context Learning

digitado ⋅ February 5, 2026

arXiv:2602.04872v1 Announce Type: new Abstract: Recent progress has rapidly advanced our understanding of the mechanisms underlying in-context learning in modern attention-based neural networks. However, existing results focus exclusively on unimodal data; in contrast, the theoretical underpinnings of in-context learning for multi-modal data remain poorly understood. We introduce a mathematically tractable framework for studying multi-modal learning and explore when transformer-like architectures can recover Bayes-optimal performance in-context. To model multi-modal problems, we assume the observed data arises from a latent […]

technocracy

Conditional Counterfactual Mean Embeddings: Doubly Robust Estimation and Learning Rates

digitado ⋅ February 5, 2026

arXiv:2602.04736v1 Announce Type: new Abstract: A complete understanding of heterogeneous treatment effects involves characterizing the full conditional distribution of potential outcomes. To this end, we propose the Conditional Counterfactual Mean Embeddings (CCME), a framework that embeds conditional distributions of counterfactual outcomes into a reproducing kernel Hilbert space (RKHS). Under this framework, we develop a two-stage meta-estimator for CCME that accommodates any RKHS-valued regression in each stage. Based on this meta-estimator, we develop three practical CCME estimators: (1) Ridge […]

technocracy

Causal explanations of outliers in systems with lagged time-dependencies

digitado ⋅ February 5, 2026

arXiv:2602.04667v1 Announce Type: new Abstract: Root-cause analysis in controlled time dependent systems poses a major challenge in applications. Especially energy systems are difficult to handle as they exhibit instantaneous as well as delayed effects and if equipped with storage, do have a memory. In this paper we adapt the causal root-cause analysis method of Budhathoki et al. [2022] to general time-dependent systems, as it can be regarded as a strictly causal definition of the term “root-cause”. Particularly, we […]

technocracy

Targeted Synthetic Control Method

digitado ⋅ February 5, 2026

arXiv:2602.04611v1 Announce Type: new Abstract: The synthetic control method (SCM) estimates causal effects in panel data with a single-treated unit by constructing a counterfactual outcome as a weighted combination of untreated control units that matches the pre-treatment trajectory. In this paper, we introduce the targeted synthetic control (TSC) method, a new two-stage estimator that directly estimates the counterfactual outcome. Specifically, our TSC method (1) yields a targeted debiasing estimator, in the sense that the targeted updating refines the […]

technocracy

A principled framework for uncertainty decomposition in TabPFN

digitado ⋅ February 5, 2026

arXiv:2602.04596v1 Announce Type: new Abstract: TabPFN is a transformer that achieves state-of-the-art performance on supervised tabular tasks by amortizing Bayesian prediction into a single forward pass. However, there is currently no method for uncertainty decomposition in TabPFN. Because it behaves, in an idealised limit, as a Bayesian in-context learner, we cast the decomposition challenge as a Bayesian predictive inference (BPI) problem. The main computational tool in BPI, predictive Monte Carlo, is challenging to apply here as it requires […]

technocracy

Bayesian PINNs for uncertainty-aware inverse problems (BPINN-IP)

digitado ⋅ February 5, 2026

arXiv:2602.04459v1 Announce Type: new Abstract: The main contribution of this paper is to develop a hierarchical Bayesian formulation of PINNs for linear inverse problems, which is called BPINN-IP. The proposed methodology extends PINN to account for prior knowledge on the nature of the expected NN output, as well as its weights. Also, as we can have access to the posterior probability distributions, naturally uncertainties can be quantified. Also, variational inference and Monte Carlo dropout are employed to provide […]

technocracy

Anytime-Valid Conformal Risk Control

digitado ⋅ February 5, 2026

arXiv:2602.04364v1 Announce Type: new Abstract: Prediction sets provide a means of quantifying the uncertainty in predictive tasks. Using held out calibration data, conformal prediction and risk control can produce prediction sets that exhibit statistically valid error control in a computationally efficient manner. However, in the standard formulations, the error is only controlled on average over many possible calibration datasets of fixed size. In this paper, we extend the control to remain valid with high probability over a cumulatively […]
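For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of standard split conformal prediction with held-out calibration data. It is not the paper's anytime-valid method. The function names, the absolute-residual score, and the calibration values are all assumptions made for illustration.

```python
import math


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    This is the usual finite-sample-corrected quantile for split conformal
    prediction. If there are too few calibration points for the requested
    alpha, the threshold is infinite (the prediction set is everything).
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def prediction_interval(point_prediction, qhat):
    """Symmetric interval around a point prediction (assumes |y - f(x)| scores)."""
    return (point_prediction - qhat, point_prediction + qhat)


# Hypothetical absolute residuals |y - f(x)| on held-out calibration data.
calibration_scores = [0.5, 1.2, 0.3, 2.0, 0.8, 1.5, 0.1, 0.9, 1.1]
qhat = conformal_quantile(calibration_scores, alpha=0.25)
interval = prediction_interval(10.0, qhat)
```

The coverage guarantee of this baseline holds on average over the draw of a fixed-size calibration set, which is the limitation the abstract says the paper sets out to address.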

technocracy

A Bandit-Based Approach to Educational Recommender Systems: Contextual Thompson Sampling for Learner Skill Gain Optimization

digitado ⋅ February 5, 2026

arXiv:2602.04347v1 Announce Type: new Abstract: In recent years, instructional practices in Operations Research (OR), Management Science (MS), and Analytics have increasingly shifted toward digital environments, where large and diverse groups of learners make it difficult to provide practice that adapts to individual needs. This paper introduces a method that generates personalized sequences of exercises by selecting, at each step, the exercise most likely to advance a learner’s understanding of a targeted skill. The method uses information about the […]

technocracy

Geometry-Aware Optimal Transport: Fast Intrinsic Dimension and Wasserstein Distance Estimation

digitado ⋅ February 5, 2026

arXiv:2602.04335v1 Announce Type: new Abstract: Solving large scale Optimal Transport (OT) in machine learning typically relies on sampling measures to obtain a tractable discrete problem. While the discrete solver’s accuracy is controllable, the rate of convergence of the discretization error is governed by the intrinsic dimension of our data. Therefore, the true bottleneck is the knowledge and control of the sampling error. In this work, we tackle this issue by introducing novel estimators for both sampling error and […]

technocracy

Provable Target Sample Complexity Improvements as Pre-Trained Models Scale

digitado ⋅ February 5, 2026

arXiv:2602.04233v1 Announce Type: new Abstract: Pre-trained models have become indispensable for efficiently building models across a broad spectrum of downstream tasks. The advantages of pre-trained models have been highlighted by empirical studies on scaling laws, which demonstrate that larger pre-trained models can significantly reduce the sample complexity of downstream learning. However, existing theoretical investigations of pre-trained models lack the capability to explain this phenomenon. In this paper, we provide a theoretical investigation by introducing a novel framework, caulking, […]
