February 2026

AsynDBT: Asynchronous Distributed Bilevel Tuning for efficient In-Context Learning with Large Language Models

digitado ⋅ 23 de February de 2026

arXiv:2602.17694v1 Announce Type: new Abstract: With the rapid development of large language models (LLMs), an increasing number of applications leverage cloud-based LLM APIs to reduce usage costs. However, since cloud-based models’ parameters and gradients are agnostic, users have to manually or use heuristic algorithms to adjust prompts for intervening LLM outputs, which requiring costly optimization procedures. In-context learning (ICL) has recently emerged as a promising paradigm that enables LLMs to adapt to new tasks using examples provided within […]

Ver mais

Like 0

Liked Liked

technocracy

A Case Study of Selected PTQ Baselines for Reasoning LLMs on Ascend NPU

digitado ⋅ 23 de February de 2026

arXiv:2602.17693v1 Announce Type: new Abstract: Post-Training Quantization (PTQ) is crucial for efficient model deployment, yet its effectiveness on Ascend NPU remains under-explored compared to GPU architectures. This paper presents a case study of representative PTQ baselines applied to reasoning-oriented models such as DeepSeek-R1-Distill-Qwen series (1.5B/7B/14B) and QwQ-32B. We evaluate four distinct algorithms, including AWQ, GPTQ, SmoothQuant, and FlatQuant, to cover the spectrum from weight-only compression to advanced rotation-based methods. Our empirical results reveal significant platform sensitivity. While 4-bit […]

Ver mais

Like 0

Liked Liked

technocracy

Agentic Unlearning: When LLM Agent Meets Machine Unlearning

digitado ⋅ 23 de February de 2026

arXiv:2602.17692v1 Announce Type: new Abstract: In this paper, we introduce textbf{agentic unlearning} which removes specified information from both model parameters and persistent memory in agents with closed-loop interaction. Existing unlearning methods target parameters alone, leaving two critical gaps: (i) parameter-memory backflow, where retrieval reactivates parametric remnants or memory artifacts reintroduce sensitive content, and (ii) the absence of a unified strategy that covers both parameter and memory pathways. We present Synchronized Backflow Unlearning (SBU), a framework that unlearns jointly […]

Ver mais

Like 0

Liked Liked

technocracy

Tethered Reasoning: Decoupling Entropy from Hallucination in Quantized LLMs via Manifold Steering

digitado ⋅ 23 de February de 2026

arXiv:2602.17691v1 Announce Type: new Abstract: Quantized language models face a fundamental dilemma: low sampling temperatures yield repetitive, mode-collapsed outputs, while high temperatures (T > 2.0) cause trajectory divergence and semantic incoherence. We present HELIX, a geometric framework that decouples output entropy from hallucination by tethering hidden-state trajectories to a pre-computed truthfulness manifold. HELIX computes a Unified Truth Score (UTS) combining token-level semantic entropy with Mahalanobis distance from the manifold. When UTS indicates trajectory divergence, graduated steering vectors redirect […]

Ver mais

Like 0

Liked Liked

technocracy

DesignAsCode: Bridging Structural Editability and Visual Fidelity in Graphic Design Generation

digitado ⋅ 23 de February de 2026

arXiv:2602.17690v1 Announce Type: new Abstract: Graphic design generation demands a delicate balance between high visual fidelity and fine-grained structural editability. However, existing approaches typically bifurcate into either non-editable raster image synthesis or abstract layout generation devoid of visual content. Recent combinations of these two approaches attempt to bridge this gap but often suffer from rigid composition schemas and unresolvable visual dissonances (e.g., text-background conflicts) due to their inexpressive representation and open-loop nature. To address these challenges, we propose […]

Ver mais

Like 0

Liked Liked

technocracy

Robust Pre-Training of Medical Vision-and-Language Models with Domain-Invariant Multi-Modal Masked Reconstruction

digitado ⋅ 23 de February de 2026

arXiv:2602.17689v1 Announce Type: new Abstract: Medical vision-language models show strong potential for joint reasoning over medical images and clinical text, but their performance often degrades under domain shift caused by variations in imaging devices, acquisition protocols, and reporting styles. Existing multi-modal pre-training methods largely overlook robustness, treating it as a downstream adaptation problem. In this work, we propose Robust Multi-Modal Masked Reconstruction (Robust-MMR), a self-supervised pre-training framework that explicitly incorporates robustness objectives into masked vision-language learning. Robust-MMR integrates […]

Ver mais

Like 0

Liked Liked

technocracy

AnCoder: Anchored Code Generation via Discrete Diffusion Models

digitado ⋅ 23 de February de 2026

arXiv:2602.17688v1 Announce Type: new Abstract: Diffusion language models offer a compelling alternative to autoregressive code generation, enabling global planning and iterative refinement of complex program logic. However, existing approaches fail to respect the rigid structure of programming languages and, as a result, often produce broken programs that fail to execute. To address this, we introduce AnchorTree, a framework that explicitly anchors the diffusion process using structured, hierarchical priors native to code. Specifically, AnchorTree uses the abstract syntax tree […]

Ver mais

Like 0

Liked Liked

technocracy

IRPAPERS: A Visual Document Benchmark for Scientific Retrieval and Question Answering

digitado ⋅ 23 de February de 2026

arXiv:2602.17687v1 Announce Type: new Abstract: AI systems have achieved remarkable success in processing text and relational data, yet visual document processing remains relatively underexplored. Whereas traditional systems require OCR transcriptions to convert these visual documents into text and metadata, recent advances in multimodal foundation models offer retrieval and generation directly from document images. This raises a key question: How do image-based systems compare to established text-based methods? We introduce IRPAPERS, a benchmark of 3,230 pages from 166 scientific […]

Ver mais

Like 0

Liked Liked

technocracy

Curriculum Learning for Efficient Chain-of-Thought Distillation via Structure-Aware Masking and GRPO

digitado ⋅ 23 de February de 2026

arXiv:2602.17686v1 Announce Type: new Abstract: Distilling Chain-of-Thought (CoT) reasoning from large language models into compact student models presents a fundamental challenge: teacher rationales are often too verbose for smaller models to faithfully reproduce. Existing approaches either compress reasoning into single-step, losing the interpretability that makes CoT valuable. We present a three-stage curriculum learning framework that addresses this capacity mismatch through progressive skill acquisition. First, we establish structural understanding via masked shuffled reconstruction. Second, we apply Group Relative Policy […]

Ver mais

Like 0

Liked Liked

technocracy

Optimal Multi-Debris Mission Planning in LEO: A Deep Reinforcement Learning Approach with Co-Elliptic Transfers and Refueling

digitado ⋅ 23 de February de 2026

arXiv:2602.17685v1 Announce Type: new Abstract: This paper addresses the challenge of multi target active debris removal (ADR) in Low Earth Orbit (LEO) by introducing a unified coelliptic maneuver framework that combines Hohmann transfers, safety ellipse proximity operations, and explicit refueling logic. We benchmark three distinct planning algorithms Greedy heuristic, Monte Carlo Tree Search (MCTS), and deep reinforcement learning (RL) using Masked Proximal Policy Optimization (PPO) within a realistic orbital simulation environment featuring randomized debris fields, keep out zones, […]

Ver mais

Like 0

Liked Liked