February 2026

Hair-Trigger Alignment: Black-Box Evaluation Cannot Guarantee Post-Update Alignment

digitado ⋅ 2 de February de 2026

arXiv:2601.22313v1 Announce Type: new Abstract: Large Language Models (LLMs) are rarely static and are frequently updated in practice. A growing body of alignment research has shown that models initially deemed “aligned” can exhibit misaligned behavior after fine-tuning, such as forgetting jailbreak safety features or re-surfacing knowledge that was intended to be forgotten. These works typically assume that the initial model is aligned based on static black-box evaluation, i.e., the absence of undesired responses to a fixed set of […]

Ver mais

Like 0

Liked Liked

technocracy

SCALAR: Quantifying Structural Hallucination, Consistency, and Reasoning Gaps in Materials Foundation Models

digitado ⋅ 2 de February de 2026

arXiv:2601.22312v1 Announce Type: new Abstract: Large language models are increasingly applied to materials science reasoning, yet their behavior under physically structured distribution shifts remains poorly understood. We introduce SCALAR (Structural Consistency And Logic Across Regimes), a benchmark for evaluating geometric scale generalization and its connection to structural hallucination, consistency, and reasoning in materials foundation models. Given canonical crystal representations, models must reason about derived nanoparticle structures obtained through supercell expansion and geometric truncation across length scales spanning a […]

Ver mais

Like 0

Liked Liked

technocracy

Why Reasoning Fails to Plan: A Planning-Centric Analysis of Long-Horizon Decision Making in LLM Agents

digitado ⋅ 2 de February de 2026

arXiv:2601.22311v1 Announce Type: new Abstract: Large language model (LLM)-based agents exhibit strong step-by-step reasoning capabilities over short horizons, yet often fail to sustain coherent behavior over long planning horizons. We argue that this failure reflects a fundamental mismatch: step-wise reasoning induces a form of step-wise greedy policy that is adequate for short horizons but fails in long-horizon planning, where early actions must account for delayed consequences. From this planning-centric perspective, we study LLM-based agents in deterministic, fully structured […]

Ver mais

Like 0

Liked Liked

technocracy

Stealthy Poisoning Attacks Bypass Defenses in Regression Settings

digitado ⋅ 2 de February de 2026

arXiv:2601.22308v1 Announce Type: new Abstract: Regression models are widely used in industrial processes, engineering and in natural and physical sciences, yet their robustness to poisoning has received less attention. When it has, studies often assume unrealistic threat models and are thus less useful in practice. In this paper, we propose a novel optimal stealthy attack formulation that considers different degrees of detectability and show that it bypasses state-of-the-art defenses. We further propose a new methodology based on normalization […]

Ver mais

Like 0

Liked Liked

technocracy

Exact closed-form Gaussian moments of residual layers

digitado ⋅ 2 de February de 2026

arXiv:2601.22307v1 Announce Type: new Abstract: We study the problem of propagating the mean and covariance of a general multivariate Gaussian distribution through a deep (residual) neural network using layer-by-layer moment matching. We close a longstanding gap by deriving exact moment matching for the probit, GeLU, ReLU (as a limit of GeLU), Heaviside (as a limit of probit), and sine activation functions; for both feedforward and generalized residual layers. On random networks, we find orders-of-magnitude improvements in the KL […]

Ver mais

Like 0

Liked Liked

technocracy

BayesFlow: A Probability Inference Framework for Meta-Agent Assisted Workflow Generation

digitado ⋅ 2 de February de 2026

arXiv:2601.22305v1 Announce Type: new Abstract: Automatic workflow generation is the process of automatically synthesizing sequences of LLM calls, tool invocations, and post-processing steps for complex end-to-end tasks. Most prior methods cast this task as an optimization problem with limited theoretical grounding. We propose to cast workflow generation as Bayesian inference over a posterior distribution on workflows, and introduce textbf{Bayesian Workflow Generation (BWG)}, a sampling framework that builds workflows step-by-step using parallel look-ahead rollouts for importance weighting and a […]

Ver mais

Like 0

Liked Liked

technocracy

ZK-HybridFL: Zero-Knowledge Proof-Enhanced Hybrid Ledger for Federated Learning

digitado ⋅ 2 de February de 2026

arXiv:2601.22302v1 Announce Type: new Abstract: Federated learning (FL) enables collaborative model training while preserving data privacy, yet both centralized and decentralized approaches face challenges in scalability, security, and update validation. We propose ZK-HybridFL, a secure decentralized FL framework that integrates a directed acyclic graph (DAG) ledger with dedicated sidechains and zero-knowledge proofs (ZKPs) for privacy-preserving model validation. The framework uses event-driven smart contracts and an oracle-assisted sidechain to verify local model updates without exposing sensitive data. A built-in […]

Ver mais

Like 0

Liked Liked

technocracy

Coarse-to-Real: Generative Rendering for Populated Dynamic Scenes

digitado ⋅ 2 de February de 2026

arXiv:2601.22301v1 Announce Type: new Abstract: Traditional rendering pipelines rely on complex assets, accurate materials and lighting, and substantial computational resources to produce realistic imagery, yet they still face challenges in scalability and realism for populated dynamic scenes. We present C2R (Coarse-to-Real), a generative rendering framework that synthesizes real-style urban crowd videos from coarse 3D simulations. Our approach uses coarse 3D renderings to explicitly control scene layout, camera motion, and human trajectories, while a learned neural renderer generates realistic […]

Ver mais

Like 0

Liked Liked

technocracy

Conformal Prediction for Generative Models via Adaptive Cluster-Based Density Estimation

digitado ⋅ 2 de February de 2026

arXiv:2601.22298v1 Announce Type: new Abstract: Conditional generative models map input variables to complex, high-dimensional distributions, enabling realistic sample generation in a diverse set of domains. A critical challenge with these models is the absence of calibrated uncertainty, which undermines trust in individual outputs for high-stakes applications. To address this issue, we propose a systematic conformal prediction approach tailored to conditional generative models, leveraging density estimation on model-generated samples. We introduce a novel method called CP4Gen, which utilizes clustering-based […]

Ver mais

Like 0

Liked Liked

technocracy

Prepare Reasoning Language Models for Multi-Agent Debate with Self-Debate Reinforcement Learning

digitado ⋅ 2 de February de 2026

arXiv:2601.22297v1 Announce Type: new Abstract: The reasoning abilities of large language models (LLMs) have been substantially improved by reinforcement learning with verifiable rewards (RLVR). At test time, collaborative reasoning through Multi-Agent Debate (MAD) has emerged as a promising approach for enhancing LLM performance. However, current RLVR methods typically train LLMs to solve problems in isolation, without explicitly preparing them to synthesize and benefit from different rationales that arise during debate. In this work, we propose Self-Debate Reinforcement Learning […]

Ver mais

Like 0

Liked Liked