February 2026

CodeGuard: Improving LLM Guardrails in CS Education

digitado ⋅ February 4, 2026

arXiv:2602.02509v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly embedded in Computer Science (CS) classrooms to automate code generation, feedback, and assessment. However, their susceptibility to adversarial or ill-intentioned prompts threatens student learning and academic integrity. To cope with this important issue, we evaluate existing off-the-shelf LLMs in handling unsafe and irrelevant prompts within the domain of CS education. We identify important shortcomings in existing LLM guardrails which motivates us to propose CodeGuard, a comprehensive guardrail […]

Precoding-Oriented CSI Feedback Design with Mutual Information Regularized VQ-VAE

digitado ⋅ February 4, 2026

arXiv:2602.02508v1 Announce Type: new Abstract: Efficient channel state information (CSI) compression at the user equipment plays a key role in enabling accurate channel reconstruction and precoder design in massive multiple-input multiple-output systems. A key challenge lies in balancing the CSI feedback overhead with the achievable downlink rate, i.e., maximizing the utility of limited feedback to maintain high system performance. In this work, we propose a precoding-oriented CSI feedback framework based on a vector quantized variational autoencoder, augmented with […]

Learning-augmented smooth integer programs with PAC-learnable oracles

digitado ⋅ February 4, 2026

arXiv:2602.02505v1 Announce Type: new Abstract: This paper investigates learning-augmented algorithms for smooth integer programs, covering canonical problems such as MAX-CUT and MAX-k-SAT. We introduce a framework that incorporates a predictive oracle to construct a linear surrogate of the objective, which is then solved via linear programming followed by a rounding procedure. Crucially, our framework ensures that the solution quality is both consistent and smooth against prediction errors. We demonstrate that this approach effectively extends tractable approximations from the […]

Sparse Adapter Fusion for Continual Learning in NLP

digitado ⋅ February 4, 2026

arXiv:2602.02502v1 Announce Type: new Abstract: Continual learning in natural language processing plays a crucial role in adapting to evolving data and preventing catastrophic forgetting. Despite significant progress, existing methods still face challenges, such as inefficient parameter reuse across tasks, risking catastrophic forgetting when tasks are dissimilar, and the unnecessary introduction of new parameters for each task, which hampers knowledge sharing among similar tasks. To tackle these issues, we propose a Sparse Adapter Fusion Method (SAFM), which dynamically fuses […]

Augmenting Parameter-Efficient Pre-trained Language Models with Large Language Models

digitado ⋅ February 4, 2026

arXiv:2602.02501v1 Announce Type: new Abstract: Training AI models in cybersecurity with help of vast datasets offers significant opportunities to mimic real-world behaviors effectively. However, challenges like data drift and scarcity of labelled data lead to frequent updates of models and the risk of overfitting. To address these challenges, we used parameter-efficient fine-tuning techniques for pre-trained language models wherein we combine compacters with various layer freezing strategies. To enhance the capabilities of these pre-trained language models, in this work […]

UNSO: Unified Newton Schulz Orthogonalization

digitado ⋅ February 4, 2026

arXiv:2602.02500v1 Announce Type: new Abstract: The Newton-Schulz (NS) iteration has gained increasing interest for its role in the Muon optimizer and the Stiefel manifold. However, the conventional NS iteration suffers from inefficiency and instability. Although various improvements have been introduced to NS iteration, they fail to deviate from the conventional iterative paradigm, which could increase computation burden largely due to the matrix products along the long dimension repeatedly. To address this, we consolidate the iterative structure into a […]

ROSA-Tuning: Enhancing Long-Context Modeling via Suffix Matching

digitado ⋅ February 4, 2026

arXiv:2602.02499v1 Announce Type: new Abstract: Long-context capability and computational efficiency are among the central challenges facing today’s large language models. Existing efficient attention methods reduce computational complexity, but they typically suffer from a limited coverage of the model state. This paper proposes ROSA-Tuning, a retrieval-and-recall mechanism for enhancing the long-context modeling ability of pretrained models. Beyond the standard attention mechanism, ROSA-Tuning introduces in parallel a CPU-based ROSA (RWKV Online Suffix Automaton) retrieval module, which efficiently locates historical positions […]

Test-Time Detoxification without Training or Learning Anything

digitado ⋅ February 4, 2026

arXiv:2602.02498v1 Announce Type: new Abstract: Large language models can produce toxic or inappropriate text even for benign inputs, creating risks when deployed at scale. Detoxification is therefore important for safety and user trust, particularly when we want to reduce harmful content without sacrificing the model’s generation quality. Many existing approaches rely on model retraining, gradients, or learned auxiliary components, which can be costly and may not transfer across model families or to truly black-box settings. We introduce a […]

STEMVerse: A Dual-Axis Diagnostic Framework for STEM Reasoning in Large Language Models

digitado ⋅ February 4, 2026

arXiv:2602.02497v1 Announce Type: new Abstract: As Large Language Models (LLMs) achieve significant breakthroughs in complex reasoning tasks, evaluating their proficiency in science, technology, engineering, and mathematics (STEM) has become a primary method for measuring machine intelligence. However, current evaluation paradigms often treat benchmarks as isolated “silos,” offering only monolithic aggregate scores that neglect the intricacies of both academic specialization and cognitive depth. This result-oriented approach fails to distinguish whether model errors stem from insufficient domain knowledge or deficiencies […]

The Hypocrisy Gap: Quantifying Divergence Between Internal Belief and Chain-of-Thought Explanation via Sparse Autoencoders

digitado ⋅ February 4, 2026

arXiv:2602.02496v1 Announce Type: new Abstract: Large Language Models (LLMs) frequently exhibit unfaithful behavior, producing a final answer that differs significantly from their internal chain of thought (CoT) reasoning in order to appease the user they are conversing with. In order to better detect this behavior, we introduce the Hypocrisy Gap, a mechanistic metric utilizing Sparse Autoencoders (SAEs) to quantify the divergence between a model’s internal reasoning and its final generation. By mathematically comparing an internal truth belief, derived […]
