January 2026

Online Markov Decision Processes with Terminal Law Constraints

digitado ⋅ January 13, 2026

arXiv:2601.07492v1 Announce Type: cross Abstract: Traditional reinforcement learning usually assumes either episodic interactions with resets or continuous operation to minimize average or cumulative loss. While episodic settings have many theoretical results, resets are often unrealistic in practice. The infinite-horizon setting avoids this issue but lacks non-asymptotic guarantees in online scenarios with unknown dynamics. In this work, we move towards closing this gap by introducing a reset-free framework called the periodic framework, where the goal is to find periodic […]

Minimum Wasserstein distance estimator under covariate shift: closed-form, super-efficiency and irregularity

digitado ⋅ January 13, 2026

arXiv:2601.07282v1 Announce Type: cross Abstract: Covariate shift arises when covariate distributions differ between source and target populations while the conditional distribution of the response remains invariant, and it underlies problems in missing data and causal inference. We propose a minimum Wasserstein distance estimation framework for inference under covariate shift that avoids explicit modeling of outcome regressions or importance weights. The resulting W-estimator admits a closed-form expression and is numerically equivalent to the classical 1-nearest neighbor estimator, yielding a […]

Simulated Annealing-based Candidate Optimization for Batch Acquisition Functions

digitado ⋅ January 13, 2026

arXiv:2601.07258v1 Announce Type: cross Abstract: Bayesian Optimization with multi-objective acquisition functions such as q-Expected Hypervolume Improvement (qEHVI) requires efficient candidate optimization to maximize acquisition function values. Traditional approaches rely on continuous optimization methods like Sequential Least Squares Programming (SLSQP) for candidate selection. However, these gradient-based methods can become trapped in local optima, particularly in complex or high-dimensional objective landscapes. This paper presents a simulated annealing-based approach for candidate optimization in batch acquisition functions as an alternative to conventional […]
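For readers unfamiliar with the technique named in this abstract, the following is a minimal, generic sketch of simulated annealing used to maximize a function. It is not the paper's method: the one-dimensional `toy_acquisition` function, the geometric cooling schedule, and every parameter value are assumptions chosen for illustration only, whereas a real batch acquisition function such as qEHVI is optimized jointly over several candidate points.

```python
import math
import random


def toy_acquisition(x):
    # Hypothetical multimodal stand-in for an acquisition function (assumption).
    return math.sin(3 * x) + 0.5 * math.cos(7 * x) - 0.1 * x * x


def simulated_annealing(objective, x0, lower, upper, steps=2000,
                        t0=1.0, cooling=0.995, step_size=0.5, seed=0):
    """Maximize `objective` on [lower, upper] with a basic annealing loop."""
    rng = random.Random(seed)
    x, fx = x0, objective(x0)
    best_x, best_fx = x, fx
    t = t0
    for _ in range(steps):
        # Propose a Gaussian perturbation, clipped to the search bounds.
        cand = min(upper, max(lower, x + rng.gauss(0.0, step_size)))
        f_cand = objective(cand)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature t decreases.
        if f_cand >= fx or rng.random() < math.exp((f_cand - fx) / t):
            x, fx = cand, f_cand
            if fx > best_fx:
                best_x, best_fx = x, fx
        t *= cooling  # geometric cooling schedule (assumption)
    return best_x, best_fx


x0 = -2.5
best_x, best_fx = simulated_annealing(toy_acquisition, x0, lower=-3.0, upper=3.0)
```

Accepting occasional downhill moves at high temperature is what lets the search leave a local optimum, which is the failure mode the abstract attributes to gradient-based optimizers such as SLSQP.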

Tractable Multinomial Logit Contextual Bandits with Non-Linear Utilities

digitado ⋅ January 13, 2026

arXiv:2601.06913v1 Announce Type: cross Abstract: We study the multinomial logit (MNL) contextual bandit problem for sequential assortment selection. Although most existing research assumes utility functions to be linear in item features, this linearity assumption restricts the modeling of intricate interactions between items and user preferences. A recent work (Zhang & Luo, 2024) has investigated general utility function classes, yet its method faces fundamental trade-offs between computational tractability and statistical efficiency. To address this limitation, we propose a computationally […]

Variational decomposition autoencoding improves disentanglement of latent representations

digitado ⋅ January 13, 2026

arXiv:2601.06844v1 Announce Type: cross Abstract: Understanding the structure of complex, nonstationary, high-dimensional time-evolving signals is a central challenge in scientific data analysis. In many domains, such as speech and biomedical signal processing, the ability to learn disentangled and interpretable representations is critical for uncovering latent generative mechanisms. Traditional approaches to unsupervised representation learning, including variational autoencoders (VAEs), often struggle to capture the temporal and spectral diversity inherent in such data. Here we introduce variational decomposition autoencoding (VDA), a […]

Artificial Entanglement in the Fine-Tuning of Large Language Models

digitado ⋅ January 13, 2026

arXiv:2601.06788v1 Announce Type: cross Abstract: Large language models (LLMs) can be adapted to new tasks using parameter-efficient fine-tuning (PEFT) methods that modify only a small number of trainable parameters, often through low-rank updates. In this work, we adopt a quantum-information-inspired perspective to understand their effectiveness. From this perspective, low-rank parameterizations naturally correspond to low-dimensional Matrix Product States (MPS) representations, which enable entanglement-based characterizations of parameter structure. Thereby, we term and measure “Artificial Entanglement”, defined as the entanglement entropy […]

Diffusion Models with Heavy-Tailed Targets: Score Estimation and Sampling Guarantees

digitado ⋅ January 13, 2026

arXiv:2601.06715v1 Announce Type: cross Abstract: Score-based diffusion models have become a powerful framework for generative modeling, with score estimation as a central statistical bottleneck. Existing guarantees for score estimation largely focus on light-tailed targets or rely on restrictive assumptions such as compact support, which are often violated by heavy-tailed data in practice. In this work, we study conventional (Gaussian) score-based diffusion models when the target distribution is heavy-tailed and belongs to a Sobolev class with smoothness parameter $\beta>0$. […]

Physics-constrained Gaussian Processes for Predicting Shockwave Hugoniot Curves

digitado ⋅ January 13, 2026

arXiv:2601.06655v1 Announce Type: cross Abstract: A physics-constrained Gaussian Process regression framework is developed for predicting shocked material states along the Hugoniot curve using data from a small number of shockwave simulations. The proposed Gaussian process employs a probabilistic Taylor series expansion in conjunction with the Rankine-Hugoniot jump conditions between the various shocked material states to construct a thermodynamically consistent covariance function. This leads to the formulation of an optimization problem over a small number of interpretable hyperparameters and […]

Detecting LLM-Generated Text with Performance Guarantees

digitado ⋅ January 13, 2026

arXiv:2601.06586v1 Announce Type: cross Abstract: Large language models (LLMs) such as GPT, Claude, Gemini, and Grok have been deeply integrated into our daily life. They now support a wide range of tasks — from dialogue and email drafting to assisting with teaching and coding, serving as search engines, and much more. However, their ability to produce highly human-like text raises serious concerns, including the spread of fake news, the generation of misleading governmental reports, and academic misconduct. To […]

Deriving Decoder-Free Sparse Autoencoders from First Principles

digitado ⋅ January 13, 2026

arXiv:2601.06478v1 Announce Type: cross Abstract: Gradient descent on log-sum-exp (LSE) objectives performs implicit expectation–maximization (EM): the gradient with respect to each component output equals its responsibility. The same theory predicts collapse without volume control analogous to the log-determinant in Gaussian mixture models. We instantiate the theory in a single-layer encoder with an LSE objective and InfoMax regularization for volume control. Experiments confirm the theory’s predictions. The gradient–responsibility identity holds exactly; LSE alone collapses; variance prevents dead components; decorrelation […]
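The gradient–responsibility identity mentioned in this abstract matches a standard fact about log-sum-exp: the partial derivative of log Σ_j exp(z_j) with respect to z_k is the softmax weight of component k. The sketch below checks that fact numerically with a central finite difference. The input vector `z` and all function names are assumptions for illustration; the paper's actual objective, sign conventions, and regularizers are not visible in this truncated abstract.

```python
import math


def logsumexp(z):
    # Numerically stable log(sum(exp(z_j))).
    m = max(z)
    return m + math.log(sum(math.exp(v - m) for v in z))


def responsibilities(z):
    # Softmax weights: the posterior "responsibility" of each component.
    m = max(z)
    w = [math.exp(v - m) for v in z]
    total = sum(w)
    return [v / total for v in w]


def numerical_gradient(f, z, eps=1e-6):
    # Central finite-difference estimate of the gradient of f at z.
    grad = []
    for k in range(len(z)):
        up, down = list(z), list(z)
        up[k] += eps
        down[k] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


z = [0.5, -1.2, 2.0]  # arbitrary component outputs (assumption)
r = responsibilities(z)
g = numerical_gradient(logsumexp, z)
max_gap = max(abs(a - b) for a, b in zip(r, g))
```

Here `max_gap` should be close to zero, limited only by finite-difference and floating-point error, which is the sense in which a gradient step on an LSE objective weights each component by its responsibility, as in the E-step of EM.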
