January 2026

White-Box Sensitivity Auditing with Steering Vectors

digitado ⋅ January 26, 2026

arXiv:2601.16398v1 Announce Type: new Abstract: Algorithmic audits are essential tools for examining systems for properties required by regulators or desired by operators. Current audits of large language models (LLMs) primarily rely on black-box evaluations that assess model behavior only through input-output testing. These methods are limited to tests constructed in the input space, often generated by heuristics. In addition, many socially relevant model properties (e.g., gender bias) are abstract and difficult to measure through text-based inputs alone. To […]

technocracy

Cite-While-You-Generate: Training-Free Evidence Attribution for Multimodal Clinical Summarization

digitado ⋅ January 26, 2026

arXiv:2601.16397v1 Announce Type: new Abstract: Trustworthy clinical summarization requires not only fluent generation but also transparency about where each statement comes from. We propose a training-free framework for generation-time source attribution that leverages decoder attentions to directly cite supporting text spans or images, overcoming the limitations of post-hoc or retraining-based methods. We introduce two strategies for multimodal attribution: a raw image mode, which directly uses image patch attentions, and a caption-as-span mode, which substitutes images with generated captions […]

technocracy

ResAgent: Entropy-based Prior Point Discovery and Visual Reasoning for Referring Expression Segmentation

digitado ⋅ January 26, 2026

arXiv:2601.16394v1 Announce Type: new Abstract: Referring Expression Segmentation (RES) is a core vision-language segmentation task that enables pixel-level understanding of targets via free-form linguistic expressions, supporting critical applications such as human-robot interaction and augmented reality. Despite the progress of Multimodal Large Language Model (MLLM)-based approaches, existing RES methods still suffer from two key limitations: first, the coarse bounding boxes from MLLMs lead to redundant or non-discriminative point prompts; second, the prevalent reliance on textual coordinate reasoning is unreliable, […]

technocracy

GNSS-based Lunar Orbit and Clock Estimation With Stochastic Cloning UD Filter

digitado ⋅ January 26, 2026

arXiv:2601.16393v1 Announce Type: new Abstract: This paper presents a terrestrial GNSS-based orbit and clock estimation framework for lunar navigation satellites. To enable high-precision estimation under the low-observability conditions encountered at lunar distances, we develop a stochastic-cloning UD-factorized filter and delayed-state smoother that provide enhanced numerical stability when processing precise time-differenced carrier phase (TDCP) measurements. A comprehensive dynamics and measurement model is formulated, explicitly accounting for relativistic coupling between orbital and clock states, lunar time-scale transformations, and signal propagation […]

technocracy

Toward Agentic Software Project Management: A Vision and Roadmap

digitado ⋅ January 26, 2026

arXiv:2601.16392v1 Announce Type: new Abstract: With the advent of agentic AI, Software Engineering is transforming to a new era dubbed Software Engineering 3.0. Software project management (SPM) must also evolve with such transformations to boost successful project completion, while keeping humans at the heart of it. Building on our preliminary ideas of “agentic SPM”, and supporting literature, we present our vision of an “Agentic Project Manager (PM)” as a multi-agent system for SPM 3.0. They will work like […]

technocracy

Cross-Lingual Activation Steering for Multilingual Language Models

digitado ⋅ January 26, 2026

arXiv:2601.16390v1 Announce Type: new Abstract: Large language models exhibit strong multilingual capabilities, yet significant performance gaps persist between dominant and non-dominant languages. Prior work attributes this gap to imbalances between shared and language-specific neurons in multilingual representations. We propose Cross-Lingual Activation Steering (CLAS), a training-free inference-time intervention that selectively modulates neuron activations. We evaluate CLAS on classification and generation benchmarks, achieving average improvements of 2.3% (Acc.) and 3.4% (F1) respectively, while maintaining high-resource language performance. We discover that […]

technocracy

Study of Switched Step-size Based Filtered-x NLMS Algorithm for Active Noise Cancellation

digitado ⋅ January 26, 2026

arXiv:2601.16382v1 Announce Type: new Abstract: While the filtered-x normalized least mean square (FxNLMS) algorithm is widely applied in active noise control systems due to its simple structure and easy implementation, it faces two critical limitations: the fixed step-size causes a trade-off between convergence rate and steady-state residual error, and its performance deteriorates significantly in impulsive noise environments. To address the step-size constraint issue, we propose the switched step-size FxNLMS (SSS-FxNLMS) algorithm. Specifically, we derive the mean-square deviation (MSD) […]

technocracy

VTFusion: A Vision-Text Multimodal Fusion Network for Few-Shot Anomaly Detection

digitado ⋅ January 26, 2026

arXiv:2601.16381v1 Announce Type: new Abstract: Few-Shot Anomaly Detection (FSAD) has emerged as a critical paradigm for identifying irregularities using scarce normal references. While recent methods have integrated textual semantics to complement visual data, they predominantly rely on features pre-trained on natural scenes, thereby neglecting the granular, domain-specific semantics essential for industrial inspection. Furthermore, prevalent fusion strategies often resort to superficial concatenation, failing to address the inherent semantic misalignment between visual and textual modalities, which compromises robustness against cross-modal […]

technocracy

Cognitively-Inspired Tokens Overcome Egocentric Bias in Multimodal Models

digitado ⋅ January 26, 2026

arXiv:2601.16378v1 Announce Type: new Abstract: Multimodal language models (MLMs) perform well on semantic vision-language tasks but fail at spatial reasoning that requires adopting another agent’s visual perspective. These errors reflect a persistent egocentric bias and raise questions about whether current models support allocentric reasoning. Inspired by human spatial cognition, we introduce perspective tokens, specialized embeddings that encode orientation through either (1) embodied body-keypoint cues or (2) abstract representations supporting mental rotation. Integrating these tokens into LLaVA-1.5-13B yields performance […]

technocracy

PolyAgent: Large Language Model Agent for Polymer Design

digitado ⋅ January 26, 2026

arXiv:2601.16376v1 Announce Type: new Abstract: On-demand polymer discovery is essential for various industries, ranging from biomedical to reinforcement materials. Experiments with polymers involve a long trial-and-error process, leading to lengthy procedures and extensive resource use. For these processes, machine learning has accelerated scientific discovery on the property prediction and latent space search fronts. However, laboratory researchers cannot readily access the code and models needed to extract individual structures and properties due to infrastructure limitations. We present a closed-loop polymer structure-property […]
