February 2026

Group Contrastive Learning for Weakly Paired Multimodal Data

digitado ⋅ February 5, 2026

arXiv:2602.04021v1 Announce Type: cross Abstract: We present GROOVE, a semi-supervised multi-modal representation learning approach for high-content perturbation data where samples across modalities are weakly paired through shared perturbation labels but lack direct correspondence. Our primary contribution is GroupCLIP, a novel group-level contrastive loss that bridges the gap between CLIP for paired cross-modal data and SupCon for uni-modal supervised contrastive learning, addressing a fundamental gap in contrastive learning for weakly-paired settings. We integrate GroupCLIP with an on-the-fly backtranslating autoencoder […]

The Role of Target Update Frequencies in Q-Learning

digitado ⋅ February 5, 2026

arXiv:2602.03911v1 Announce Type: cross Abstract: The target network update frequency (TUF) is a central stabilization mechanism in (deep) Q-learning. However, its selection remains poorly understood and is often treated merely as another tunable hyperparameter rather than as a principled design decision. This work provides a theoretical analysis of target fixing in tabular Q-learning through the lens of approximate dynamic programming. We formulate periodic target updates as a nested optimization scheme in which each outer iteration applies an inexact […]
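
To make "target fixing" concrete, here is a minimal sketch of tabular Q-learning in which the bootstrap values come from a frozen copy of the Q-table that is refreshed every few steps. It is an illustration under stated assumptions, not the paper's scheme or analysis: the chain environment, the uniform exploration policy, and every name and default (`q_learning_with_target`, `target_update_every`, `alpha`, `gamma`) are hypothetical choices made for this example.

```python
import random


def q_learning_with_target(n_states=5, gamma=0.9, alpha=0.5,
                           target_update_every=10, n_steps=20000, seed=0):
    """Tabular Q-learning on a toy chain with a periodically refreshed target table.

    Assumed toy environment (not from the paper): states 0..n_states-1 in a row,
    action 1 moves right, action 0 moves left (staying put at state 0), and
    reaching the last state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    q_target = [row[:] for row in q]  # frozen copy used only for bootstrapping
    goal = n_states - 1
    state = 0
    for step in range(1, n_steps + 1):
        action = rng.randrange(2)  # uniform random exploration
        next_state = min(state + 1, goal) if action == 1 else max(state - 1, 0)
        done = next_state == goal
        reward = 1.0 if done else 0.0
        # The bootstrap term reads the frozen table, not the live one.
        bootstrap = 0.0 if done else gamma * max(q_target[next_state])
        q[state][action] += alpha * (reward + bootstrap - q[state][action])
        # Outer iteration: refresh the target every `target_update_every` steps.
        if step % target_update_every == 0:
            q_target = [row[:] for row in q]
        state = 0 if done else next_state
    return q
```

Calling `q_learning_with_target(target_update_every=1)` recovers ordinary Q-learning up to a one-step lag, while larger values hold the target fixed for longer between refreshes, which is the knob the abstract calls the update frequency.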

GeoIB: Geometry-Aware Information Bottleneck via Statistical-Manifold Compression

digitado ⋅ February 5, 2026

arXiv:2602.03906v1 Announce Type: cross Abstract: Information Bottleneck (IB) is widely used, but in deep learning, it is usually implemented through tractable surrogates, such as variational bounds or neural mutual information (MI) estimators, rather than directly controlling the MI I(X;Z) itself. The looseness and estimator-dependent bias can make IB “compression” only indirectly controlled and optimization fragile. We revisit the IB problem through the lens of information geometry and propose a Geometric Information Bottleneck (GeoIB) that dispenses with mutual information […]

Multi-layer Cross-Attention is Provably Optimal for Multi-modal In-context Learning

digitado ⋅ February 5, 2026

arXiv:2602.04872v1 Announce Type: new Abstract: Recent progress has rapidly advanced our understanding of the mechanisms underlying in-context learning in modern attention-based neural networks. However, existing results focus exclusively on unimodal data; in contrast, the theoretical underpinnings of in-context learning for multi-modal data remain poorly understood. We introduce a mathematically tractable framework for studying multi-modal learning and explore when transformer-like architectures can recover Bayes-optimal performance in-context. To model multi-modal problems, we assume the observed data arises from a latent […]

Conditional Counterfactual Mean Embeddings: Doubly Robust Estimation and Learning Rates

digitado ⋅ February 5, 2026

arXiv:2602.04736v1 Announce Type: new Abstract: A complete understanding of heterogeneous treatment effects involves characterizing the full conditional distribution of potential outcomes. To this end, we propose the Conditional Counterfactual Mean Embeddings (CCME), a framework that embeds conditional distributions of counterfactual outcomes into a reproducing kernel Hilbert space (RKHS). Under this framework, we develop a two-stage meta-estimator for CCME that accommodates any RKHS-valued regression in each stage. Based on this meta-estimator, we develop three practical CCME estimators: (1) Ridge […]

Causal explanations of outliers in systems with lagged time-dependencies

digitado ⋅ February 5, 2026

arXiv:2602.04667v1 Announce Type: new Abstract: Root-cause analysis in controlled time-dependent systems poses a major challenge in applications. Energy systems are especially difficult to handle, as they exhibit instantaneous as well as delayed effects and, if equipped with storage, have memory. In this paper we adapt the causal root-cause analysis method of Budhathoki et al. [2022] to general time-dependent systems, as it can be regarded as a strictly causal definition of the term “root-cause”. Particularly, we […]

Targeted Synthetic Control Method

digitado ⋅ February 5, 2026

arXiv:2602.04611v1 Announce Type: new Abstract: The synthetic control method (SCM) estimates causal effects in panel data with a single treated unit by constructing a counterfactual outcome as a weighted combination of untreated control units that matches the pre-treatment trajectory. In this paper, we introduce the targeted synthetic control (TSC) method, a new two-stage estimator that directly estimates the counterfactual outcome. Specifically, our TSC method (1) yields a targeted debiasing estimator, in the sense that the targeted updating refines the […]
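
The classical SCM step described in the first sentence can be sketched as a small constrained least-squares problem: find non-negative weights that sum to one so that the weighted controls track the treated unit before treatment. The sketch below covers only that baseline, not the TSC estimator, and solves it with projected gradient descent as one convenient choice. The function names, the step-size rule, and the iteration count are assumptions made for this example.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1} (sort-based)."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0:  # holds for a prefix; the last hit fixes theta
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def fit_synthetic_control(controls_pre, treated_pre, n_iter=5000):
    """Weights minimizing the squared pre-treatment gap, constrained to the simplex.

    controls_pre: one row per pre-treatment period, one column per control unit.
    treated_pre:  the treated unit's outcome in the same periods.
    """
    n_units = len(controls_pre[0])
    # Conservative step size: 1 / (2 * squared Frobenius norm) bounds the curvature.
    step = 1.0 / (2.0 * sum(x * x for row in controls_pre for x in row))
    w = [1.0 / n_units] * n_units
    for _ in range(n_iter):
        residual = [sum(x * wi for x, wi in zip(row, w)) - y
                    for row, y in zip(controls_pre, treated_pre)]
        grad = [2.0 * sum(row[k] * r for row, r in zip(controls_pre, residual))
                for k in range(n_units)]
        w = project_to_simplex([wi - step * g for wi, g in zip(w, grad)])
    return w


def synthetic_outcome(controls_row, weights):
    """Counterfactual outcome for one period as the weighted control average."""
    return sum(x * w for x, w in zip(controls_row, weights))
```

With fitted weights in hand, the estimated effect in a post-treatment period is the treated unit's observed outcome minus `synthetic_outcome` evaluated on that period's control outcomes.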

A principled framework for uncertainty decomposition in TabPFN

digitado ⋅ February 5, 2026

arXiv:2602.04596v1 Announce Type: new Abstract: TabPFN is a transformer that achieves state-of-the-art performance on supervised tabular tasks by amortizing Bayesian prediction into a single forward pass. However, there is currently no method for uncertainty decomposition in TabPFN. Because it behaves, in an idealised limit, as a Bayesian in-context learner, we cast the decomposition challenge as a Bayesian predictive inference (BPI) problem. The main computational tool in BPI, predictive Monte Carlo, is challenging to apply here as it requires […]

Bayesian PINNs for uncertainty-aware inverse problems (BPINN-IP)

digitado ⋅ February 5, 2026

arXiv:2602.04459v1 Announce Type: new Abstract: The main contribution of this paper is a hierarchical Bayesian formulation of PINNs for linear inverse problems, called BPINN-IP. The proposed methodology extends PINNs to account for prior knowledge about the nature of the expected NN output, as well as its weights. Because the posterior probability distributions are accessible, uncertainties can be quantified naturally. In addition, variational inference and Monte Carlo dropout are employed to provide […]

Anytime-Valid Conformal Risk Control

digitado ⋅ February 5, 2026

arXiv:2602.04364v1 Announce Type: new Abstract: Prediction sets provide a means of quantifying the uncertainty in predictive tasks. Using held-out calibration data, conformal prediction and risk control can produce prediction sets that exhibit statistically valid error control in a computationally efficient manner. However, in the standard formulations, the error is only controlled on average over many possible calibration datasets of fixed size. In this paper, we extend the control to remain valid with high probability over a cumulatively […]
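
For orientation, the "standard formulation" the abstract starts from can be sketched as split conformal prediction with a fixed-size calibration set. This shows only that baseline, with absolute residuals as the nonconformity score, and not the anytime-valid extension the paper proposes. The function names and the symmetric-interval form are assumptions made for this example.

```python
import math


def conformal_quantile(calibration_scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores.

    Uses the ceil((n + 1) * (1 - alpha))-th smallest score; if that rank exceeds
    n, no finite threshold gives the guarantee and infinity is returned.
    """
    n = len(calibration_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(calibration_scores)[rank - 1]


def prediction_interval(point_prediction, q_hat):
    """Symmetric interval around a point prediction, assuming scores are |y - y_hat|."""
    return (point_prediction - q_hat, point_prediction + q_hat)
```

The coverage guarantee of this baseline holds on average over random draws of the calibration set, which is exactly the limitation the abstract points to.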
