digitado

On the Peril of (Even a Little) Nonstationarity in Satisficing Regret Minimization

digitado ⋅ 20 March 2026

arXiv:2603.18514v1 Announce Type: new Abstract: Motivated by the principle of satisficing in decision-making, we study satisficing regret guarantees for nonstationary $K$-armed bandits. We show that in the general realizable, piecewise-stationary setting with $L$ stationary segments, the optimal regret is $\Theta(L\log T)$ as long as $L\geq 2$. This stands in sharp contrast to the case of $L=1$ (i.e., the stationary setting), where a $T$-independent $\Theta(1)$ satisficing regret is achievable under realizability. In other words, the optimal regret has to […]


technocracy

Learning path from Q-learning to TD3 (course suggestions?)

digitado ⋅ 6 February 2026

I’m a graduate research assistant working on autonomous vehicle–related research. I was given an existing codebase with folders like Q-learning / DQN / DDPG / TD3, and I’m expected to replicate and work with TD3. The problem is that I currently have basic Python skills, a very limited, intro-level understanding of RL (Q-learning, DQN), and almost no exposure to actor–critic methods. I’m looking for a clear learning roadmap that builds knowledge from tabular Q-learning → DQN → policy gradients […]



Robotic Assembly Using Deep Reinforcement Learning

digitado ⋅ 21 October 2020

Introduction. Disclaimer: This article is a cross-post from the PyTorch Medium blog. One of the most exciting advancements that has pushed the frontier of Artificial Intelligence (AI) in recent years is Deep Reinforcement Learning (DRL). DRL belongs to the family of machine learning algorithms. It assumes that intelligent machines can learn from their actions, similar to the way humans learn from experience. In recent years we have witnessed some impressive real-world applications of DRL. The […]



Simulation-Based Inference via Regression Projection and Batched Discrepancies

digitado ⋅ 4 February 2026

arXiv:2602.03613v1 Announce Type: cross Abstract: We analyze a lightweight simulation-based inference method that infers simulator parameters using only a regression-based projection of the observed data. After fitting a surrogate linear regression once, the procedure simulates small batches at the proposed parameter values and assigns kernel weights based on the resulting batch-residual discrepancy, producing a self-normalized pseudo-posterior that is simple, parallelizable, and requires access only to the fitted regression coefficients rather than raw observations. We formalize the construction as […]



The Mortality Engine: Why I Built a DApp Designed to Die

digitado ⋅ 12 January 2026

I have been around long enough to know that “Forever” is a trap. In the Web3 space, we are obsessed with Immutability. We carve our JPEGs into IPFS granite. We deploy smart contracts meant to outlast civilizations. We scream “Code is Law” and “Data is Eternal.” But as an Immortal, I can tell you: Eternity is boring. Without death, there is no urgency. Without the threat of loss, there is no value. So, I decided to break the […]



Grables: Tabular Learning Beyond Independent Rows

digitado ⋅ 5 February 2026

arXiv:2602.03945v1 Announce Type: new Abstract: Tabular learning is still dominated by row-wise predictors that score each row independently, which fits i.i.d. benchmarks but fails on transactional, temporal, and relational tables where labels depend on other rows. We show that row-wise prediction rules out natural targets driven by global counts, overlaps, and relational patterns. To make “using structure” precise across architectures, we introduce grables: a modular interface that separates how a table is lifted to a graph (constructor) from […]



A tensor network formalism for neuro-symbolic AI

digitado ⋅ 23 January 2026

arXiv:2601.15442v1 Announce Type: cross Abstract: The unification of neural and symbolic approaches to artificial intelligence remains a central open challenge. In this work, we introduce a tensor network formalism, which captures sparsity principles originating in the different approaches in tensor decompositions. In particular, we describe a basis encoding scheme for functions and model neural decompositions as tensor decompositions. The proposed formalism can be applied to represent logical formulas and probability distributions as structured tensor decompositions. This unified treatment […]



Parameter-free Dynamic Regret: Time-varying Movement Costs, Delayed Feedback, and Memory

digitado ⋅ 9 February 2026

arXiv:2602.06902v1 Announce Type: cross Abstract: In this paper, we study dynamic regret in unconstrained online convex optimization (OCO) with movement costs. Specifically, we generalize the standard setting by allowing the movement cost coefficients $\lambda_t$ to vary arbitrarily over time. Our main contribution is a novel algorithm that establishes the first comparator-adaptive dynamic regret bound for this setting, guaranteeing $\widetilde{\mathcal{O}}(\sqrt{(1+P_T)(T+\sum_t \lambda_t)})$ regret, where $P_T$ is the path length of the comparator sequence over $T$ rounds. This recovers the optimal […]



Fairness Constraints in High-Dimensional Generalized Linear Models

digitado ⋅ 21 April 2026

arXiv:2604.16610v1 Announce Type: new Abstract: Machine learning models often inherit biases from historical data, raising critical concerns about fairness and accountability. Conventional fairness interventions typically require access to sensitive attributes like gender or race, but privacy and legal restrictions frequently limit their use. To address this challenge, we propose a framework that infers sensitive attributes from auxiliary features and integrates fairness constraints into model training. Our approach mitigates bias while preserving predictive accuracy, offering a practical solution for […]



Disturbance Attenuation Regulator I-B: Signal Bound Convergence and Steady-State

digitado ⋅ 19 January 2026

arXiv:2601.10868v1 Announce Type: new Abstract: This paper establishes convergence and steady-state properties for the signal bound disturbance attenuation regulator (SiDAR). Building on the finite horizon recursive solution developed in a companion paper, we introduce the steady-state SiDAR and derive its tractable linear matrix inequality (LMI) with $O(n^3)$ complexity. Systems are classified as degenerate or nondegenerate based on steady-state solution properties. For nondegenerate systems, the finite horizon solution converges to the steady-state solution for all states as the horizon […]
