digitado – Page 284

Flow Matching for Offline Reinforcement Learning with Discrete Actions

digitado ⋅ 9 February 2026

arXiv:2602.06138v1 Announce Type: new Abstract: Generative policies based on diffusion models and flow matching have shown strong promise for offline reinforcement learning (RL), but their applicability remains largely confined to continuous action spaces. To address a broader range of offline RL settings, we extend flow matching to a general framework that supports discrete action spaces with multiple objectives. Specifically, we replace continuous flows with continuous-time Markov chains, trained using a Q-weighted flow matching objective. We then extend our […]

technocracy

Overparameterized Multiple Linear Regression as Hyper-Curve Fitting

digitado ⋅ 26 February 2026

arXiv:2404.07849v2 Announce Type: replace Abstract: This work demonstrates that applying a fixed-effect multiple linear regression (MLR) model to an overparameterized dataset is mathematically equivalent to fitting a hyper-curve parameterized by a single scalar. This reformulation shifts the focus from global coefficients to individual predictors, allowing each to be modeled as a function of a common parameter. We prove that this overparameterized linear framework can yield exact predictions even when the underlying data contains nonlinear dependencies that violate classical […]
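
As background for the abstract above, the snippet below is a minimal sketch of a well-known fact, not the paper's hyper-curve construction: when predictors outnumber samples, a minimum-norm least-squares fit can reproduce the training targets exactly, even if the targets depend nonlinearly on the predictors. The data, sizes, and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical overparameterized setting: more predictors than samples.
n_samples, n_predictors = 5, 8
X = rng.normal(size=(n_samples, n_predictors))

# Hypothetical target with nonlinear dependence on the predictors.
y = np.sin(X[:, 0]) + X[:, 1] ** 2

# np.linalg.lstsq returns the minimum-norm solution when the system
# is underdetermined (more unknown coefficients than equations).
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Largest absolute training residual; near machine precision here
# because X has full row rank, so the fit interpolates the data.
max_residual = float(np.max(np.abs(X @ beta - y)))
print(max_residual)
```

The exact fit holds on the training samples only; it says nothing by itself about predictions at new points.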

technocracy

Local Linearity of LLMs Enables Activation Steering via Model-Based Linear Optimal Control

digitado ⋅ 22 April 2026

arXiv:2604.19018v1 Announce Type: cross Abstract: Inference-time LLM alignment methods, particularly activation steering, offer an alternative to fine-tuning by directly modifying activations during generation. Existing methods, however, often rely on non-anticipative interventions that ignore how perturbations propagate through transformer layers and lack online error feedback, resulting in suboptimal, open-loop control. To address this, we show empirically that, despite the nonlinear structure of transformer blocks, layer-wise dynamics across multiple LLM architectures and scales are well-approximated by locally-linear models. Exploiting this […]

technocracy

The life of a prescription at Amazon Pharmacy

digitado ⋅ 30 September 2024

From pricing estimation and regulatory compliance to inventory management and chatbot assistants, machine learning models help Amazon Pharmacy customers stay healthy and save time and money.

Conversational AI ⋅ Alexandre Alves, Anita Vila ⋅ September 30, 01:32 PM ⋅ October 02, 11:42 AM

Pharmacies play a vital role in ensuring patients' health, but the process of dispensing medications is far more complex than it may appear. At Amazon Pharmacy, we are using artificial […]

technocracy

The Geometry of Efficient Nonconvex Sampling

digitado ⋅ 27 March 2026

arXiv:2603.25622v1 Announce Type: cross Abstract: We present an efficient algorithm for uniformly sampling from an arbitrary compact body $\mathcal{X} \subset \mathbb{R}^n$ from a warm start under isoperimetry and a natural volume growth condition. Our result provides a substantial common generalization of known results for convex bodies and star-shaped bodies. The complexity of the algorithm is polynomial in the dimension, the Poincaré constant of the uniform distribution on $\mathcal{X}$ and the volume growth constant of the set $\mathcal{X}$.
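
For readers unfamiliar with the term, the Poincaré constant named in the abstract is usually defined as follows. This is the standard textbook form, written with the uniform distribution $\pi$ on $\mathcal{X}$; the symbol $C_{\mathrm{PI}}$ is introduced here for illustration, and the paper's own notation and normalization may differ.

```latex
% Poincare inequality for the uniform distribution \pi on \mathcal{X}:
% C_PI is the smallest constant such that, for every smooth f,
\[
  \operatorname{Var}_{\pi}(f)
  \;\le\;
  C_{\mathrm{PI}} \, \mathbb{E}_{\pi}\!\left[ \lVert \nabla f \rVert^{2} \right],
  \qquad
  \operatorname{Var}_{\pi}(f)
  = \mathbb{E}_{\pi}\!\left[ f^{2} \right] - \left( \mathbb{E}_{\pi}[f] \right)^{2}.
\]
```

A small $C_{\mathrm{PI}}$ is one way to quantify the isoperimetry assumption: loosely, the body has no narrow bottlenecks that would slow a random walk moving through it.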

technocracy

Simulating Complex Multi-Turn Tool Calling Interactions in Stateless Execution Environments

digitado ⋅ 29 January 2026

arXiv:2601.19914v1 Announce Type: new Abstract: Synthetic data has proven itself to be a valuable resource for tuning smaller, cost-effective language models to handle the complexities of multi-turn tool calling conversations. While many frameworks and systems for producing synthetic multi-turn tool calling data have been proposed, prior works have frequently assumed that any tool calling interactions will take place in an execution environment that maintains state. When such an environment is available, this is advantageous as it allows for […]

technocracy

LAGS: Low-Altitude Gaussian Splatting with Groupwise Heterogeneous Graph Learning

digitado ⋅ 21 April 2026

arXiv:2604.16910v1 Announce Type: new Abstract: Low-altitude Gaussian splatting (LAGS) facilitates 3D scene reconstruction by aggregating aerial images from distributed drones. However, as LAGS prioritizes maximizing reconstruction quality over communication throughput, existing low-altitude resource allocation schemes become inefficient. This inefficiency stems from their failure to account for image diversity introduced by varying viewpoints. To fill this gap, we propose a groupwise heterogeneous graph neural network (GW-HGNN) for LAGS resource allocation. GW-HGNN explicitly models the non-uniform contribution of different image […]

technocracy

Experimental Demonstration of a Decentralized Electromagnetic Formation Flying Control Using Alternating Magnetic Field Forces

digitado ⋅ 12 January 2026

arXiv:2601.05408v1 Announce Type: new Abstract: Electromagnetic formation flying (EMFF) is challenging due to the complex coupling between the electromagnetic fields generated by each satellite in the formation. To address this challenge, this article uses alternating magnetic field forces (AMFF) to decouple the electromagnetic forces between each pair of satellites. Each satellite’s electromagnetic actuation system is driven by a sum of amplitude-modulated sinusoids, where amplitudes are controlled to achieve desired forces between each pair of satellites. The main contribution […]
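
The abstract does not spell out how the sinusoids decouple the pairwise forces. One plausible reading, assumed here and not confirmed by the text, is that the force between two coils scales with the product of their drive signals, so signals at different frequencies average to zero while signals at a shared frequency leave a net term set by the amplitudes. The sketch below checks only that averaging identity; the frequencies, amplitudes, and names are hypothetical.

```python
import numpy as np

# One-second window sampled uniformly; both hypothetical frequencies
# complete an integer number of cycles in it.
t = np.linspace(0.0, 1.0, 10_000, endpoint=False)
f1, f2 = 5.0, 8.0   # Hz, hypothetical drive frequencies
a, b = 2.0, 3.0     # hypothetical amplitudes set by the controller

signal_a = a * np.sin(2 * np.pi * f1 * t)
signal_b_same = b * np.sin(2 * np.pi * f1 * t)   # shares frequency f1
signal_b_other = b * np.sin(2 * np.pi * f2 * t)  # uses a different frequency

# Time-averaged products: a stand-in for the average interaction term.
same_freq_avg = float(np.mean(signal_a * signal_b_same))    # a * b / 2
cross_freq_avg = float(np.mean(signal_a * signal_b_other))  # approximately 0

print(same_freq_avg, cross_freq_avg)
```

Under this assumption, scaling the amplitudes at a shared frequency changes the average interaction for that pair alone, which matches the abstract's statement that amplitudes are controlled to achieve the desired pairwise forces.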

technocracy

MMUEChange: A Generalized LLM Agent Framework for Intelligent Multi-Modal Urban Environment Change Analysis

digitado ⋅ 12 January 2026

arXiv:2601.05483v1 Announce Type: new Abstract: Understanding urban environment change is essential for sustainable development. However, current approaches, particularly remote sensing change detection, often rely on rigid, single-modal analysis. To overcome these limitations, we propose MMUEChange, a multi-modal agent framework that flexibly integrates heterogeneous urban data via a modular toolkit and a core module, Modality Controller, for cross- and intra-modal alignment, enabling robust analysis of complex urban change scenarios. Case studies include: a shift toward small, community-focused parks in […]

technocracy

MAVRL: Learning Reward Functions from Multiple Feedback Types with Amortized Variational Inference

digitado ⋅ 16 February 2026

Reward learning typically relies on a single feedback type or combines multiple feedback types using manually weighted loss terms. Currently, it remains unclear how to jointly learn reward functions from heterogeneous feedback types such as demonstrations, comparisons, ratings, and stops that provide qualitatively different signals. We address this challenge by formulating reward learning from multiple feedback types as Bayesian inference over a shared latent reward function, where each feedback type contributes information through an explicit likelihood. We introduce […]
