March 2026

NeuroVLM-Bench: Evaluation of Vision-Enabled Large Language Models for Clinical Reasoning in Neurological Disorders

digitado ⋅ 27 March 2026

arXiv:2603.24846v1 Announce Type: new Abstract: Recent advances in multimodal large language models enable new possibilities for image-based decision support. However, their reliability and operational trade-offs in neuroimaging remain insufficiently understood. We present a comprehensive benchmarking study of vision-enabled large language models for 2D neuroimaging using curated MRI and CT datasets covering multiple sclerosis, stroke, brain tumors, other abnormalities, and normal controls. Models are required to generate multiple outputs simultaneously, including diagnosis, diagnosis subtype, imaging modality, specialized sequence, and […]

Reaching Beyond the Mode: RL for Distributional Reasoning in Language Models

digitado ⋅ 27 March 2026

arXiv:2603.24844v1 Announce Type: new Abstract: Given a question, a language model (LM) implicitly encodes a distribution over possible answers. In practice, post-training procedures for LMs often collapse this distribution onto a single dominant mode. While this is generally not a problem for benchmark-style evaluations that assume one correct answer, many real-world tasks inherently involve multiple valid answers or irreducible uncertainty. Examples include medical diagnosis, ambiguous question answering, and settings with incomplete information. In these cases, we would like […]

Data-Oriented Modeling for Spacecraft Design

digitado ⋅ 27 March 2026

arXiv:2603.24841v1 Announce Type: new Abstract: Spacecraft development costs remain high despite falling launch costs, in part because Model-Based Systems Engineering (MBSE) tools carry the complexity of the object-oriented programming paradigm: tightly coupled data and logic, mutable state, and rigid class hierarchies that resist integration with discipline-specific analysis tools. This paper presents a data-oriented approach to MBSE that adapts the Entity-Component-System (ECS) architecture from the video game industry. Design data is stored as immutable, format-agnostic components in a generic […]
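
The Entity-Component-System idea mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the names `Mass`, `PowerDraw`, `World`, and `total_mass` are hypothetical and are not taken from the paper, whose own storage design is truncated above. The sketch only shows the general pattern: components are immutable data records, entities are bare identifiers, and logic lives in separate functions that query the data.

```python
from dataclasses import dataclass


# Hypothetical components: plain immutable data with no behavior attached.
@dataclass(frozen=True)
class Mass:
    kg: float


@dataclass(frozen=True)
class PowerDraw:
    watts: float


class World:
    """Hypothetical store mapping component types to per-entity values."""

    def __init__(self):
        self._next_id = 0
        self._components = {}  # component type -> {entity id -> component}

    def create_entity(self, *components):
        # An entity is only an integer id; its data lives in the stores.
        entity = self._next_id
        self._next_id += 1
        for component in components:
            self._components.setdefault(type(component), {})[entity] = component
        return entity

    def query(self, *types):
        """Yield (entity, components...) for entities that have every type."""
        if not types:
            return
        stores = [self._components.get(t, {}) for t in types]
        shared = set(stores[0]).intersection(*stores[1:])
        for entity in sorted(shared):
            yield (entity, *(store[entity] for store in stores))


def total_mass(world):
    # A "system": a function over component data, decoupled from the data itself.
    return sum(mass.kg for _, mass in world.query(Mass))


world = World()
battery = world.create_entity(Mass(12.5), PowerDraw(-40.0))
antenna = world.create_entity(Mass(1.5))
print(total_mass(world))
print(list(world.query(Mass, PowerDraw)))
```

Because the components are frozen dataclasses, a change to a design value would be expressed by storing a new component rather than mutating an existing one, which is one way to read the contrast with mutable object state drawn in the abstract.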

Prune as You Generate: Online Rollout Pruning for Faster and Better RLVR

digitado ⋅ 27 March 2026

arXiv:2603.24840v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has significantly advanced the reasoning capabilities of Large Language Models (LLMs). However, methods such as GRPO and DAPO suffer from substantial computational cost, since they rely on sampling many rollouts for each prompt. Moreover, in RLVR the relative advantage is often sparse: many samples become nearly all-correct or all-incorrect, yielding low within-group reward variance and thus weak learning signals. In this paper, we introduce arrol (Accelerating RLVR […]
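
The weak-signal problem described in the abstract can be seen in a small sketch of a group-relative advantage, the kind of within-group normalization used by GRPO-style methods. This is an illustrative assumption, not the paper's method: the function name `group_relative_advantages` and the `eps` stabilizer are hypothetical, and the rewards are made-up binary values. The sketch covers only the limiting case where every rollout in a group receives the same reward.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each rollout's reward against its own prompt group.

    Hypothetical helper: subtracts the group mean and divides by the
    group's population standard deviation (plus eps to avoid dividing by zero).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# A group with mixed outcomes yields positive and negative advantages.
mixed = group_relative_advantages([1.0, 0.0, 1.0, 0.0])

# A group where every rollout is correct yields all-zero advantages,
# so these rollouts contribute no gradient signal despite their sampling cost.
all_correct = group_relative_advantages([1.0, 1.0, 1.0, 1.0])

print(mixed)
print(all_correct)
```

In the uniform group the mean equals every reward, so each advantage is exactly zero. Rollouts spent on such prompts cost compute without contributing to the update, which is the inefficiency the abstract attributes to sampling many rollouts per prompt.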

Bridging Code Property Graphs and Language Models for Program Analysis

digitado ⋅ 27 March 2026

arXiv:2603.24837v1 Announce Type: new Abstract: Large Language Models (LLMs) face critical challenges when analyzing security vulnerabilities in real-world codebases: token limits prevent loading entire repositories, code embeddings fail to capture interprocedural data flows, and LLMs struggle to generate complex static analysis queries. These limitations force existing approaches to operate on isolated code snippets, missing vulnerabilities that span multiple functions and files. We introduce codebadger, an open-source Model Context Protocol (MCP) server that integrates Joern’s Code […]

WAFT-Stereo: Warping-Alone Field Transforms for Stereo Matching

digitado ⋅ 27 March 2026

arXiv:2603.24836v1 Announce Type: new Abstract: We introduce WAFT-Stereo, a simple and effective warping-based method for stereo matching. WAFT-Stereo demonstrates that cost volumes, a common design used in many leading methods, are not necessary for strong performance and can be replaced by warping with improved efficiency. WAFT-Stereo ranks first on the ETH3D, KITTI, and Middlebury public benchmarks, reducing the zero-shot error by 81% on the ETH3D benchmark, while being 1.8-6.7x faster than competitive methods. Code and model weights are available at […]

DCARL: A Divide-and-Conquer Framework for Autoregressive Long-Trajectory Video Generation

digitado ⋅ 27 March 2026

arXiv:2603.24835v1 Announce Type: new Abstract: Long-trajectory video generation is a crucial yet challenging task for world modeling primarily due to the limited scalability of existing video diffusion models (VDMs). Autoregressive models, while offering infinite rollout, suffer from visual drift and poor controllability. To address these issues, we propose DCARL, a novel divide-and-conquer, autoregressive framework that effectively combines the structural stability of the divide-and-conquer scheme with the high-fidelity generation of VDMs. Our approach first employs a dedicated Keyframe Generator […]

SABER: Spatial Attention, Brain, Extended Reality

digitado ⋅ 27 March 2026

arXiv:2603.24830v1 Announce Type: new Abstract: Tracking moving objects is a critical skill for many everyday tasks, such as crossing a busy street, driving a car or catching a ball. Attention is a key cognitive function that supports object tracking; however, our understanding of the brain mechanisms that support attention is almost exclusively based on evidence from tasks that present stable objects at fixed locations. Accounts of multiple object tracking are also limited because they are largely based on […]

Flow matching on homogeneous spaces

digitado ⋅ 27 March 2026

arXiv:2603.24829v1 Announce Type: new Abstract: We propose a general framework to extend Flow Matching to homogeneous spaces, i.e. quotients of Lie groups. Our approach reformulates the problem as a flow matching task on the underlying Lie group by lifting the data distributions. This strategy avoids the potentially complicated geometry of homogeneous spaces by working directly on Lie groups, which in turn enables us to reduce the problem to a Euclidean flow matching task on Lie algebras. In contrast to […]

A Practical Guide Towards Interpreting Time-Series Deep Clinical Predictive Models: A Reproducibility Study

digitado ⋅ 27 March 2026

arXiv:2603.24828v1 Announce Type: new Abstract: Clinical decisions are high-stakes and require explicit justification, making model interpretability essential for auditing deep clinical models prior to deployment. As the ecosystem of model architectures and explainability methods expands, critical questions remain: Do architectural features like attention improve explainability? Do interpretability approaches generalize across clinical tasks? While prior benchmarking efforts exist, they often lack extensibility and reproducibility, and critically, fail to systematically examine how interpretability varies across the interplay of clinical tasks […]
