March 2026

Pooling Engram Conditional Memory in Large Language Models using CXL

digitado ⋅ 12 March 2026

arXiv:2603.10087v1 Announce Type: new Abstract: Engram conditional memory has emerged as a promising component for LLMs by decoupling static knowledge lookup from dynamic computation. Since Engram exhibits sparse access patterns and supports prefetching, its massive embedding tables are well-suited for offloading to lower-tier memory. In this paper, we propose using a Compute Express Link (CXL) memory pool for Engram storage. Compared to RDMA, CXL provides the fine-grained and low-latency access required by the minimal and discrete retrieval patterns of Engram. We […]

technocracy

KernelSkill: A Multi-Agent Framework for GPU Kernel Optimization

digitado ⋅ 12 March 2026

arXiv:2603.10085v1 Announce Type: new Abstract: Improving GPU kernel efficiency is crucial for advancing AI systems. Recent work has explored leveraging large language models (LLMs) for GPU kernel generation and optimization. However, existing LLM-based kernel optimization pipelines typically rely on opaque, implicitly learned heuristics within the LLMs to determine optimization strategies. This leads to inefficient trial-and-error and weakly interpretable optimizations. Our key insight is to replace implicit heuristics with expert optimization skills that are knowledge-driven and aware of task […]

Digging Deeper: Learning Multi-Level Concept Hierarchies

digitado ⋅ 12 March 2026

arXiv:2603.10084v1 Announce Type: new Abstract: Although concept-based models promise interpretability by explaining predictions with human-understandable concepts, they typically rely on exhaustive annotations and treat concepts as flat and independent. To circumvent this, recent work has introduced Hierarchical Concept Embedding Models (HiCEMs) to explicitly model concept relationships, and Concept Splitting to discover sub-concepts using only coarse annotations. However, both HiCEMs and Concept Splitting are restricted to shallow hierarchies. We overcome this limitation with Multi-Level Concept Splitting (MLCS), which discovers […]

Categorical Calculus and Algebra for Multi-Model Data

digitado ⋅ 12 March 2026

arXiv:2603.10081v1 Announce Type: new Abstract: Multi-model databases are designed to store, manage, and query data in various models, such as relational, hierarchical, and graph data, simultaneously. In this paper, we provide a theoretical basis for querying categorical databases. We propose two formal query languages: categorical calculus and categorical algebra, by extending relational calculus and relational algebra respectively. We demonstrate the equivalence between these two languages of queries. We propose a series of transformation rules of categorical algebra to […]

Amnesia: Adversarial Semantic Layer Specific Activation Steering in Large Language Models

digitado ⋅ 12 March 2026

arXiv:2603.10080v1 Announce Type: new Abstract: Warning: This article includes red-teaming experiments, which contain examples of compromised LLM responses that may be offensive or upsetting. Large Language Models (LLMs) have the potential to create harmful content, such as generating sophisticated phishing emails and assisting in writing code of harmful computer viruses. Thus, it is crucial to ensure their safe and responsible response generation. To reduce the risk of generating harmful or irresponsible content, researchers have developed techniques such as […]

Large Spikes in Stochastic Gradient Descent: A Large-Deviations View

digitado ⋅ 12 March 2026

arXiv:2603.10079v1 Announce Type: new Abstract: We analyse SGD training of a shallow, fully connected network in the NTK scaling and provide a quantitative theory of the catapult phase. We identify an explicit criterion separating two behaviours: When an explicit function $G$, depending only on the kernel, learning rate $\eta$ and data, is positive, SGD produces large NTK-flattening spikes with high probability; when $G<0$, their probability decays like $(n/\eta)^{-\vartheta/2}$, for an explicitly characterised $\vartheta\in (0,\infty)$. This yields a concrete […]

Stochastic Port-Hamiltonian Neural Networks: Universal Approximation with Passivity Guarantees

digitado ⋅ 12 March 2026

arXiv:2603.10078v1 Announce Type: new Abstract: Stochastic port-Hamiltonian systems represent open dynamical systems with dissipation, inputs, and stochastic forcing in an energy-based form. We introduce stochastic port-Hamiltonian neural networks, SPH-NNs, which parameterize the Hamiltonian with a feedforward network and enforce skew symmetry of the interconnection matrix and positive semidefiniteness of the dissipation matrix. For Itô dynamics we establish a weak passivity inequality in expectation under an explicit generator condition, stated for a stopped process on a compact set. […]
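
The two structural constraints named in this abstract can be illustrated with a small sketch. The abstract does not say how the constraints are enforced, so the constructions below (J = A - A^T for skew symmetry, R = L L^T for positive semidefiniteness) are an assumption: they are a common way to satisfy such constraints by design, not necessarily the authors' method. All names (`A`, `L`, `grad_H`) are hypothetical, and only the deterministic, input-free part of the dynamics is considered.

```python
import random


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def skew_symmetric(a):
    """J = A - A^T satisfies J^T = -J for any square matrix A."""
    n = len(a)
    return [[a[i][j] - a[j][i] for j in range(n)] for i in range(n)]


def positive_semidefinite(l):
    """R = L L^T satisfies x^T R x = |L^T x|^2 >= 0 for any matrix L."""
    return matmul(l, transpose(l))


def quad_form(m, x):
    n = len(x)
    return sum(x[i] * m[i][j] * x[j] for i in range(n) for j in range(n))


rng = random.Random(0)
n = 3
# Unconstrained parameters (hypothetical; in a network these would be learned).
A = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
L = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

J = skew_symmetric(A)          # interconnection matrix
R = positive_semidefinite(L)   # dissipation matrix

# Stand-in for the gradient of a learned Hamiltonian at some state.
grad_H = [rng.uniform(-1, 1) for _ in range(n)]

# With dx/dt = (J - R) grad_H and no input or noise, the energy rate is
# grad_H^T (J - R) grad_H = -grad_H^T R grad_H, which cannot be positive.
energy_rate = quad_form(J, grad_H) - quad_form(R, grad_H)
print(energy_rate)
```

The skew-symmetric term contributes nothing to the energy rate, so dissipation alone determines its sign; this is the deterministic intuition behind a passivity guarantee, not the paper's stochastic statement.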

TASER: Task-Aware Spectral Energy Refine for Backdoor Suppression in UAV Swarms Decentralized Federated Learning

digitado ⋅ 12 March 2026

arXiv:2603.10075v1 Announce Type: new Abstract: As backdoor attacks in UAV-based decentralized federated learning (DFL) grow increasingly stealthy and sophisticated, existing defenses have likewise escalated in complexity. Yet these defenses, which rely heavily on outlier detection, remain vulnerable to carefully crafted backdoors. In UAV-DFL, the lack of global coordination and limited resources further render outlier-based defenses impractical. Against this backdrop, gradient spectral analysis offers a promising alternative. While prior work primarily leverages low-frequency coefficients for pairwise comparisons, it neglects […]

Marginals Before Conditionals

digitado ⋅ 12 March 2026

arXiv:2603.10074v1 Announce Type: new Abstract: We construct a minimal task that isolates conditional learning in neural networks: a surjective map with K-fold ambiguity, resolved by a selector token z, so H(A | B) = log K while H(A | B, z) = 0. The model learns the marginal P(A | B) first, producing a plateau at exactly log K, before acquiring the full conditional in a sharp, collective transition. The plateau has a clean decomposition: height = log […]
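
The two entropy values quoted in this abstract can be checked on a toy instance. The sketch below is an assumption about what such a task might look like, not the authors' dataset: each value of B has K equally likely preimages A, and a selector z picks one of them. The sizes `K` and `NUM_B` and the encoding `a = b * K + z` are hypothetical.

```python
import math
from collections import Counter


def conditional_entropy(pairs):
    """H(Y | X) in nats for a list of equally weighted (x, y) samples."""
    joint = Counter(pairs)
    marginal = Counter(x for x, _ in pairs)
    n = len(pairs)
    h = 0.0
    for (x, _y), count in joint.items():
        h -= (count / n) * math.log(count / marginal[x])
    return h


K = 4       # hypothetical ambiguity: each B has K candidate answers
NUM_B = 8   # hypothetical number of distinct B values

# The map A -> B (b = a // K) is surjective and K-to-one; z selects the preimage.
samples = [(b, z, b * K + z) for b in range(NUM_B) for z in range(K)]

h_a_given_b = conditional_entropy([(b, a) for b, z, a in samples])
h_a_given_bz = conditional_entropy([((b, z), a) for b, z, a in samples])

print(h_a_given_b, math.log(K))  # both equal log K
print(h_a_given_bz)              # zero once the selector is known
```

A model that ignores z can do no better than a loss of log K, which matches the plateau height the abstract describes; using z removes the remaining uncertainty.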

Why LLMs Fail: A Failure Analysis and Partial Success Measurement for Automated Security Patch Generation

digitado ⋅ 12 March 2026

arXiv:2603.10072v1 Announce Type: new Abstract: Large Language Models (LLMs) show promise for Automated Program Repair (APR), yet their effectiveness on security vulnerabilities remains poorly characterized. This study analyzes 319 LLM-generated security patches across 64 Java vulnerabilities from the Vul4J benchmark. Using tri-axis evaluation (compilation, security via PoV tests, functionality via test suites), the analysis reveals that only 24.8% of patches achieve full correctness, while 51.4% fail both security and functionality. The dominant failure mode is semantic misunderstanding: LLMs produce […]
