How Much Reasoning Do Retrieval-Augmented Models Add beyond LLMs? A Benchmarking Framework for Multi-Hop Inference over Hybrid Knowledge
arXiv:2602.10210v1 Announce Type: new
Abstract: Large language models (LLMs) continue to struggle with knowledge-intensive questions that require up-to-date information and multi-hop reasoning. Augmenting LLMs with hybrid external knowledge, such as unstructured text and structured knowledge graphs, offers a promising alternative to costly continual pretraining. Consequently, reliable evaluation of their retrieval and reasoning capabilities is critical. However, many existing benchmarks increasingly overlap with LLM pretraining data, meaning that answers or supporting knowledge may already be encoded in model […]
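
The hybrid augmentation the abstract describes can be made concrete with a small sketch: one hop over a structured knowledge graph produces a bridge entity, which seeds a second hop over free-text passages, and the combined evidence is packed into an LLM prompt. Everything below (the toy triples, the passages, the keyword-overlap retrieval, and the function names) is an illustrative assumption, not the paper's actual benchmarking framework.

    # Minimal sketch of multi-hop inference over hybrid knowledge
    # (structured KG triples + unstructured text). All data and
    # heuristics here are hypothetical, for illustration only.

    # Toy structured knowledge: (subject, relation, object) triples.
    KG_TRIPLES = [
        ("Marie Curie", "born_in", "Warsaw"),
        ("Warsaw", "capital_of", "Poland"),
    ]

    # Toy unstructured knowledge: free-text passages.
    PASSAGES = [
        "Marie Curie was a physicist who pioneered research on radioactivity.",
        "Warsaw is the capital and largest city of Poland.",
    ]

    def kg_hop(entity: str) -> list[tuple[str, str, str]]:
        """First hop: triples whose subject matches the query entity."""
        return [t for t in KG_TRIPLES if t[0] == entity]

    def text_hop(query_terms: set[str]) -> list[str]:
        """Second hop: passages sharing any term with the query
        (naive keyword overlap stands in for a real retriever)."""
        return [p for p in PASSAGES if query_terms & set(p.lower().split())]

    def retrieve_hybrid(question: str, seed_entity: str) -> str:
        """Chain a KG hop into a text hop and assemble an LLM prompt."""
        triples = kg_hop(seed_entity)
        # Bridge entities from hop 1 become query terms for hop 2.
        bridge_terms = {obj.lower() for _, _, obj in triples}
        passages = text_hop(bridge_terms | set(question.lower().split()))
        context = "\n".join(
            [f"({s}, {r}, {o})" for s, r, o in triples] + passages
        )
        return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

    if __name__ == "__main__":
        print(retrieve_hybrid("Which country was Marie Curie born in?",
                              "Marie Curie"))

A contamination-aware benchmark of the kind the abstract motivates would test whether the LLM answers correctly only when this retrieved context is supplied, rather than from knowledge already encoded during pretraining.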