Dataset Distillation as Pushforward Optimal Quantization
arXiv:2501.07681v3 Announce Type: replace-cross Abstract: Dataset distillation aims to find a synthetic training set such that training on the synthetic data achieves performance similar to training on the real data, at orders of magnitude lower computational cost. Existing methods can be broadly categorized as either bi-level optimization problems, which use neural network training heuristics as the lower-level problem, or disentangled methods, which bypass the bi-level optimization by matching distributions of data. The latter approach has the major advantages […]
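The sketch below illustrates the disentangled, distribution-matching route the abstract contrasts with bi-level optimization, read through the title's "pushforward optimal quantization" lens: encode the data, then quantize the pushforward of the data distribution in feature space. It is a minimal illustration under stated assumptions, not the paper's method: the random-projection encoder `phi`, the use of scikit-learn's `KMeans` as the quantizer, and the mean-decoding step are all hypothetical choices made for the example.

```python
# Hypothetical sketch: dataset distillation as quantization of a pushforward
# measure. Encoder, quantizer, and decoding are illustrative assumptions.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy "real" dataset: n points in d dimensions.
n, d, k = 10_000, 64, 100          # k = size of the distilled synthetic set
X = rng.normal(size=(n, d))

# Assumed encoder phi: pushes the data distribution forward into a
# lower-dimensional feature space (here, a fixed random projection).
W = rng.normal(size=(d, 16)) / np.sqrt(d)
phi = lambda x: x @ W

# Optimal quantization of the pushforward: k-means centroids minimize the
# expected squared distance to the nearest synthetic point in feature space.
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(phi(X))

# One simple way back to data space: average the real points assigned to each
# centroid (an assumed decoding, not necessarily the paper's choice).
X_syn = np.stack([X[km.labels_ == j].mean(axis=0) for j in range(k)])
print(X_syn.shape)  # (k, d): the distilled dataset
```

Note the contrast with bi-level methods: no inner neural network training loop appears anywhere; the synthetic set comes from a single clustering pass over encoded features.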