January 2026

How does information access affect LLM monitors’ ability to detect sabotage?

digitado ⋅ January 30, 2026

arXiv:2601.21112v1. Abstract: Frontier language model agents can exhibit misaligned behaviors, including deception, exploiting reward hacks, and pursuing hidden objectives. To control potentially misaligned agents, we can use LLMs themselves to monitor for misbehavior. In this paper, we study how information access affects LLM monitor performance. While one might expect that monitors perform better when they have access to more of the monitored agents’ reasoning and actions, we find that contemporary systems often perform better with […]

ChunkWise LoRA: Adaptive Sequence Partitioning for Memory-Efficient Low-Rank Adaptation and Accelerated LLM Inference

digitado ⋅ January 30, 2026

arXiv:2601.21109v1. Abstract: Recent advances in low-rank adaptation (LoRA) have enabled efficient fine-tuning of large language models (LLMs) with minimal additional parameters. However, existing LoRA methods apply static rank configurations uniformly across all input tokens, ignoring variation in token complexity and computational requirements. In this work, we propose ChunkWise LoRA, a dynamic and adaptive approach that partitions sequences into variable-length chunks based on token complexity and assigns each chunk a tailored low-rank configuration. Our system introduces […]

SteerEval: A Framework for Evaluating Steerability with Natural Language Profiles for Recommendation

digitado ⋅ January 30, 2026

arXiv:2601.21105v1. Abstract: Natural-language user profiles have recently attracted attention not only for improved interpretability, but also for their potential to make recommender systems more steerable. By enabling direct editing, natural-language profiles allow users to explicitly articulate preferences that may be difficult to infer from past behavior. However, it remains unclear whether current natural-language-based recommendation methods can follow such steering commands. While existing steerability evaluations have shown some success for well-recognized item attributes (e.g., movie genres), […]

The Quiet Contributions: Insights into AI-Generated Silent Pull Requests

digitado ⋅ January 30, 2026

arXiv:2601.21102v1. Abstract: We present the first empirical study of AI-generated pull requests that are ‘silent,’ meaning no comments or discussions accompany them. The absence of any discussion around such silent AI pull requests (SPRs) poses a unique challenge in understanding the rationale for their acceptance or rejection. Hence, we quantitatively study 4,762 SPRs of five AI agents made to popular Python repositories drawn from the AIDev public dataset. We examine SPRs’ impact […]

Magellan: Autonomous Discovery of Novel Compiler Optimization Heuristics with AlphaEvolve

digitado ⋅ January 30, 2026

arXiv:2601.21096v1. Abstract: Modern compilers rely on hand-crafted heuristics to guide optimization passes. These human-designed rules often struggle to adapt to the complexity of modern software and hardware and lead to a high maintenance burden. To address this challenge, we present Magellan, an agentic framework that evolves the compiler pass itself by synthesizing executable C++ decision logic. Magellan couples an LLM coding agent with evolutionary search and autotuning in a closed loop of generation, evaluation on user-provided […]

Responsible AI: The Good, The Bad, The AI

digitado ⋅ January 30, 2026

arXiv:2601.21095v1. Abstract: The rapid proliferation of artificial intelligence across organizational contexts has generated profound strategic opportunities while introducing significant ethical and operational risks. Despite growing scholarly attention to responsible AI, extant literature remains fragmented, often adopting either an optimistic stance emphasizing value creation or an excessively cautious perspective fixated on potential harms. This paper addresses this gap by presenting a comprehensive examination of AI’s dual nature through the lens of strategic information systems. […]

Safety Generalization Under Distribution Shift in Safe Reinforcement Learning: A Diabetes Testbed

digitado ⋅ January 30, 2026

arXiv:2601.21094v1. Abstract: Safe Reinforcement Learning (RL) algorithms are typically evaluated under fixed training conditions. We investigate whether training-time safety guarantees transfer to deployment under distribution shift, using diabetes management as a safety-critical testbed. We benchmark safe RL algorithms on a unified clinical simulator and reveal a safety generalization gap: policies satisfying constraints during training frequently violate safety requirements on unseen patients. We demonstrate that test-time shielding, which filters unsafe actions using learned dynamics models, effectively […]

MapPFN: Learning Causal Perturbation Maps in Context

digitado ⋅ January 30, 2026

arXiv:2601.21092v1. Abstract: Planning effective interventions in biological systems requires treatment-effect models that adapt to unseen biological contexts by identifying their specific underlying mechanisms. Yet single-cell perturbation datasets span only a handful of biological contexts, and existing methods cannot leverage new interventional evidence at inference time to adapt beyond their training data. To meta-learn a perturbation effect estimator, we present MapPFN, a prior-data fitted network (PFN) pretrained on synthetic data generated from a prior over causal […]

Deep Reinforcement Learning for Fault-Adaptive Routing in Eisenstein-Jacobi Interconnection Topologies

digitado ⋅ January 30, 2026

arXiv:2601.21090v1. Abstract: The increasing density of many-core architectures necessitates interconnection networks that are both high-performance and fault-resilient. Eisenstein-Jacobi (EJ) networks, with their symmetric 6-regular topology, offer superior topological properties but challenge traditional routing heuristics under fault conditions. This paper evaluates three routing paradigms in faulty EJ environments: deterministic Greedy Adaptive Routing, the theoretically optimal Dijkstra’s algorithm, and a reinforcement learning (RL)-based approach. Using a multi-objective reward function to penalize fault proximity and reward path efficiency, the […]

Position-invariant Fine-tuning of Speech Enhancement Models with Self-supervised Speech Representations

digitado ⋅ January 30, 2026

arXiv:2601.21084v1. Abstract: Integrating front-end speech enhancement (SE) models with self-supervised learning (SSL)-based speech models is effective for downstream tasks in noisy conditions. SE models are commonly fine-tuned using SSL representations with mean squared error (MSE) loss between enhanced and clean speech. However, MSE is prone to exploiting positional embeddings in SSL models, allowing the objective to be minimised through positional correlations instead of content-related information. This work frames the problem as a general limitation of […]
