digitado – Page 173

Robust Exploratory Stopping under Ambiguity in Reinforcement Learning

digitado ⋅ April 17, 2026

arXiv:2510.10260v2. Abstract: We propose and analyze a continuous-time robust reinforcement learning framework for optimal stopping under ambiguity. In this framework, an agent chooses a robust exploratory stopping time motivated by two objectives: robust decision-making under ambiguity and learning about the unknown environment. Here, ambiguity refers to considering multiple probability measures dominated by a reference measure, reflecting the agent’s awareness that the reference measure representing her learned belief about the environment would be erroneous. Using the […]

CLAMP: Contrastive Learning for 3D Multi-View Action-Conditioned Robotic Manipulation Pretraining

digitado ⋅ February 1, 2026

Leveraging pre-trained 2D image representations in behavior cloning policies has achieved great success and has become a standard approach for robotic manipulation. However, such representations fail to capture the 3D spatial information about objects and scenes that is essential for precise manipulation. In this work, we introduce Contrastive Learning for 3D Multi-View Action-Conditioned Robotic Manipulation Pretraining (CLAMP), a novel 3D pre-training framework that utilizes point clouds and robot actions. From the merged point cloud computed from RGB-D images […]

Right to History: A Sovereignty Kernel for Verifiable AI Agent Execution

digitado ⋅ February 25, 2026

arXiv:2602.20214v1. Abstract: AI agents increasingly act on behalf of humans, yet no existing system provides a tamper-evident, independently verifiable record of what they did. As regulations such as the EU AI Act begin mandating automatic logging for high-risk AI systems, this gap carries concrete consequences, especially for agents running on personal hardware, where no centralized provider controls the log. Extending Floridi’s informational rights framework from data about individuals to actions performed on their behalf, […]

Cramér-Rao Bound Analysis and Near-Optimal Performance of the Synchronous Nyquist-Folding Generalized Eigenvalue Method (SNGEM) for Sub-Nyquist Multi-Tone Parameter Estimation

digitado ⋅ January 30, 2026

arXiv:2601.20866v1. Abstract: The synchronous Nyquist folding generalized eigenvalue method (SNGEM) realizes full frequency/amplitude/phase estimation of multitone signals at extreme sub-Nyquist rates by jointly processing the original signals and their time derivatives. In this paper, accurate Cramér-Rao bounds for the amplitude ratio parameter R = A/B = 1/(2πf) are derived for two channels with equal SNR. Monte Carlo simulations confirm that SNGEM achieves machine accuracy in noise-free conditions and closely approaches the derived CRB at all SNR levels, even at 10-20x […]

Anytime Optimal Decision Tree Learning with Continuous Features

digitado ⋅ January 21, 2026

In recent years, significant progress has been made on algorithms for learning optimal decision trees, primarily in the context of binary features. Extending these methods to continuous features remains substantially more challenging due to the large number of potential splits for each feature. Recently, an elegant exact algorithm was proposed for learning optimal decision trees with continuous features; however, the rapidly increasing computational time limits its practical applicability to shallow depths (typically 3 or 4). It relies on […]

An adaptive integrating factor midpoint method for second order evolution equations

digitado ⋅ March 3, 2026

arXiv:2603.00594v1. Abstract: In this paper, we consider the integrating factor midpoint method for wave-type equations and derive optimal order a posteriori error estimates. We first introduce an integrating factor midpoint approximation defined by the piecewise linear approximate solutions, and derive suboptimal order residual-based error estimates using the energy technique. Hence the key is introducing a continuous, piecewise quadratic time reconstruction to establish optimal order error bounds. Based on the reliable a posteriori error control, we […]

Netflix cedes Warner Bros. Discovery to Paramount: “No longer financially attractive”

digitado ⋅ February 27, 2026

Netflix backed out of its deal to acquire Warner Bros. Discovery’s (WBD’s) streaming and movie studios businesses on Thursday night. After increasing its bid for all of WBD by $1 per share on Tuesday, Paramount Skydance is poised to become the new owner of WBD, including Game of Thrones, DC Comics, and other IP, as well as the HBO Max streaming service and cable channels CNN and TBS. Netflix and WBD announced merger intentions on December 5. Netflix […]

What happens to a car when the company behind its software goes under?

digitado ⋅ February 17, 2026

Imagine turning the key or pressing the start button of your car—and nothing happens. Not because the battery is dead or the engine is broken but because a server no longer answers. For a growing number of cars, that scenario isn’t hypothetical. As vehicles become platforms for software and subscriptions, their longevity is increasingly tied to the survival of the companies behind their code. When those companies fail, the consequences ripple far beyond a bad app update and […]

Quiz: Design and Guidance: Object-Oriented Programming in Python

digitado ⋅ April 15, 2026

Test your understanding of the Design and Guidance: Object-Oriented Programming in Python video course. You’ll revisit single responsibility, open-closed, Liskov substitution, interface segregation, and dependency inversion. You’ll also review when to use classes in Python and alternatives to inheritance like composition and dependency injection.

Bayesian Preference Learning for Test-Time Steerable Reward Models

digitado ⋅ February 9, 2026

Reward models are central to aligning language models with human preferences via reinforcement learning (RL). As RL is increasingly applied to settings such as verifiable rewards and multi-objective alignment, RMs are expected to encode more complex and multifaceted preference distributions. However, classifier RMs remain static once trained, limiting their adaptability at test time. We propose Variational In-Context Reward Modeling (ICRM), a novel Bayesian reward modeling objective that enables test-time steerability via in-context preference demonstrations. ICRM casts reward modeling […]
