March 2026

Training Language Models via Neural Cellular Automata

digitado ⋅ March 12, 2026

arXiv:2603.10055v1 Announce Type: new Abstract: Pre-training is crucial for large language models (LLMs), as it is when most representations and capabilities are acquired. However, natural language pre-training has problems: high-quality text is finite, it contains human biases, and it entangles knowledge with reasoning. This raises a fundamental question: is natural language the only path to intelligence? We propose using neural cellular automata (NCA) to generate synthetic, non-linguistic data for pre-pre-training LLMs, that is, training on synthetic data and then on natural language. NCA data exhibits rich […]


Quantization of Ricci Curvature in Information Geometry

digitado ⋅ March 12, 2026

arXiv:2603.10054v1 Announce Type: new Abstract: In 2004, while studying the information geometry of binary Bayesian networks (bitnets), the author conjectured that the volume-averaged Ricci scalar computed with respect to the Fisher information metric is universally quantized to positive half-integers: in (1/2)Z. This paper resolves the conjecture after 20 years. We prove it for tree-structured and complete-graph bitnets via a universal Beta function cancellation mechanism, and disprove it in general by exhibiting explicit loop counterexamples. We extend the program […]


Cluster-Aware Attention-Based Deep Reinforcement Learning for Pickup and Delivery Problems

digitado ⋅ March 12, 2026

arXiv:2603.10053v1 Announce Type: new Abstract: The Pickup and Delivery Problem (PDP) is a fundamental and challenging variant of the Vehicle Routing Problem, characterized by tightly coupled pickup–delivery pairs, precedence constraints, and spatial layouts that often exhibit clustering. Existing deep reinforcement learning (DRL) approaches either model all nodes on a flat graph, relying on implicit learning to enforce constraints, or achieve strong performance through inference-time collaborative search at the cost of substantial latency. In this paper, we propose CAADRL […]


OmniGuide: Universal Guidance Fields for Enhancing Generalist Robot Policies

digitado ⋅ March 12, 2026

arXiv:2603.10052v1 Announce Type: new Abstract: Vision-language-action (VLA) models have shown great promise as generalist policies for a large range of relatively simple tasks. However, they demonstrate limited performance on more complex tasks, such as those requiring complex spatial or semantic understanding, manipulation in clutter, or precise manipulation. We propose OMNIGUIDE, a flexible framework that improves VLA performance on such tasks by leveraging arbitrary sources of guidance, such as 3D foundation models, semantic-reasoning VLMs, and human pose models. We show […]


Where Do Flow Semantics Reside? A Protocol-Native Tabular Pretraining Paradigm for Encrypted Traffic Classification

digitado ⋅ March 12, 2026

arXiv:2603.10051v1 Announce Type: new Abstract: Self-supervised masked modeling shows promise for encrypted traffic classification by masking and reconstructing raw bytes. Yet recent work reveals these methods fail to reduce reliance on labeled data despite costly pretraining: under frozen encoder evaluation, accuracy drops from greater than 0.9 to less than 0.47. We argue the root cause is inductive bias mismatch: flattening traffic into byte sequences destroys protocol-defined semantics. We identify three specific issues: 1) field unpredictability, random fields like […]


Geometrically Explicit Cosserat-Rod Modeling with Piecewise Linear Strain for Complex Rod Systems

digitado ⋅ March 12, 2026

arXiv:2603.10050v1 Announce Type: new Abstract: This paper presents a geometrically explicit formulation for Cosserat rods that unifies configuration-space and strain-based representations within a single modeling framework. The proposed method uses nodal configurations on the Lie group SE(3) as generalized coordinates, while internal strains are reconstructed via a piecewise-linear parameterization. This hybrid design preserves the geometric rigor of Lie-group formulations and retains the locality, simplicity, and computational efficiency characteristic of strain-parameterized rod models. The formulation naturally avoids shear and […]


InFusionLayer: a CFA-based ensemble tool to generate new classifiers for learning and modeling

digitado ⋅ March 12, 2026

arXiv:2603.10049v1 Announce Type: new Abstract: Ensemble learning is a well-established body of machine learning methods that enhance predictive performance by combining multiple algorithms/models. Combinatorial Fusion Analysis (CFA) provides methods and practices for combining multiple scoring systems, using the rank-score characteristic (RSC) function and cognitive diversity (CD), including ensemble methods and model fusion. However, there is no general-purpose Python tool available that incorporates these techniques. In this paper we introduce InFusionLayer, a machine learning architecture inspired by […]


Revisiting Sharpness-Aware Minimization: A More Faithful and Effective Implementation

digitado ⋅ March 12, 2026

arXiv:2603.10048v1 Announce Type: new Abstract: Sharpness-Aware Minimization (SAM) enhances generalization by minimizing the maximum training loss within a predefined neighborhood around the parameters. However, its practical implementation approximates this as gradient ascent(s) followed by applying the gradient at the ascent point to update the current parameters. This practice can be justified as approximately optimizing the objective by neglecting the (full) derivative of the ascent point with respect to the current parameters. Nevertheless, a direct and intuitive understanding of […]
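
The standard practical SAM update that the abstract describes (one ascent step, then a descent step that uses the gradient taken at the ascent point) can be sketched as follows. This is a minimal illustration of the conventional approximation only, not the revised implementation the paper proposes. The toy quadratic loss, the function names, and the values of `lr` and `rho` are all assumptions chosen for the example.

```python
import math


def loss(w):
    # Hypothetical toy loss: an anisotropic quadratic bowl.
    return 0.5 * (w[0] ** 2 + 10.0 * w[1] ** 2)


def grad(w):
    # Analytic gradient of the toy loss above.
    return [w[0], 10.0 * w[1]]


def sam_step(w, lr=0.05, rho=0.1):
    """One step of the usual SAM approximation (single ascent step)."""
    g = grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # Zero gradient: no ascent direction, so leave the parameters unchanged.
        return list(w)
    # Ascent: move to the approximate worst point in a radius-rho neighborhood.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent: apply the gradient from the ascent point to the CURRENT parameters.
    # The derivative of w_adv with respect to w is neglected here.
    g_adv = grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


w0 = [1.0, 1.0]
w1 = sam_step(w0)
print(loss(w0), loss(w1))
```

The descent step is applied to `w`, not to `w_adv`; the perturbed point is only used to evaluate the gradient.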


Toward Epistemic Stability: Engineering Consistent Procedures for Industrial LLM Hallucination Reduction

digitado ⋅ March 12, 2026

arXiv:2603.10047v1 Announce Type: new Abstract: Hallucinations in large language models (LLMs) are outputs that are syntactically coherent but factually incorrect or contextually inconsistent. They are persistent obstacles in high-stakes industrial settings such as engineering design, enterprise resource planning, and IoT telemetry platforms. We present and compare five prompt engineering strategies intended to reduce the variance of model outputs and move toward repeatable, grounded results without modifying model weights or creating complex validation models. These methods include: (M1) Iterative […]


Gated Adaptation for Continual Learning in Human Activity Recognition

digitado ⋅ March 12, 2026

arXiv:2603.10046v1 Announce Type: new Abstract: Wearable sensors in Internet of Things (IoT) ecosystems increasingly support applications such as remote health monitoring, elderly care, and smart home automation, all of which rely on robust human activity recognition (HAR). Continual learning systems must balance plasticity (learning new tasks) with stability (retaining prior knowledge), yet AI models often exhibit catastrophic forgetting, where learning new tasks degrades performance on earlier ones. This challenge is especially acute in domain-incremental HAR, where on-device models […]
