February 2026

Support Tokens, Stability Margins, and a New Foundation for Robust LLMs

digitado ⋅ February 27, 2026

arXiv:2602.22271v1 Announce Type: new Abstract: Self-attention is usually described as a flexible, content-adaptive way to mix a token with information from its past. We re-interpret causal self-attention transformers, the backbone of modern foundation models, within a probabilistic framework, much like how classical PCA is extended to probabilistic PCA. However, this re-formulation reveals a surprising and deeper structural insight: due to a change-of-variables phenomenon, a barrier constraint emerges on the self-attention parameters. This induces a highly structured geometry on […]

technocracy

Prior Knowledge-enhanced Spatio-temporal Epidemic Forecasting

digitado ⋅ February 27, 2026

arXiv:2602.22270v1 Announce Type: new Abstract: Spatio-temporal epidemic forecasting is critical for public health management, yet existing methods often struggle with insensitivity to weak epidemic signals, over-simplified spatial relations, and unstable parameter estimation. To address these challenges, we propose the Spatio-Temporal priOr-aware Epidemic Predictor (STOEP), a novel hybrid framework that integrates implicit spatio-temporal priors and explicit expert priors. STOEP consists of three key components: (1) Case-aware Adjacency Learning (CAL), which dynamically adjusts mobility-based regional dependencies using historical infection patterns; […]

technocracy

CQSA: Byzantine-robust Clustered Quantum Secure Aggregation in Federated Learning

digitado ⋅ February 27, 2026

arXiv:2602.22269v1 Announce Type: new Abstract: Federated Learning (FL) enables collaborative model training without sharing raw data. However, shared local model updates remain vulnerable to inference and poisoning attacks. Secure aggregation schemes have been proposed to mitigate these attacks. In this work, we aim to understand how these techniques are implemented in quantum-assisted FL. Quantum Secure Aggregation (QSA) has been proposed, offering information-theoretic privacy by encoding client updates into the global phase of multipartite entangled states. Existing QSA protocols, […]

technocracy

AutoQRA: Joint Optimization of Mixed-Precision Quantization and Low-rank Adapters for Efficient LLM Fine-Tuning

digitado ⋅ February 27, 2026

arXiv:2602.22268v1 Announce Type: new Abstract: Quantization followed by parameter-efficient fine-tuning has emerged as a promising paradigm for downstream adaptation under tight GPU memory constraints. However, this sequential pipeline fails to leverage the intricate interaction between quantization bit-width and LoRA rank. Specifically, a carefully optimized quantization allocation with low quantization error does not always translate to strong fine-tuning performance, and different bit-width and rank configurations can lead to significantly varying outcomes under the same memory budget. To address this […]
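The trade-off between bit-width and rank under a fixed memory budget can be illustrated with simple arithmetic. The sketch below is not the paper's method: `adapted_layer_bytes` is a hypothetical helper, and it assumes a single linear layer with 16-bit LoRA factors while ignoring quantization scales, zero-points, activations, and optimizer state.

```python
def adapted_layer_bytes(d_in, d_out, bits, rank, adapter_bits=16):
    """Approximate bytes for one quantized linear layer plus its LoRA adapter.

    Assumptions (not from the source): the base weight is stored at `bits`
    bits per entry, the LoRA factors A (rank x d_in) and B (d_out x rank)
    are stored at `adapter_bits`, and quantization metadata is ignored.
    """
    base = d_in * d_out * bits / 8
    lora = rank * (d_in + d_out) * adapter_bits / 8
    return base + lora


# Two configurations of a 4096 x 4096 layer with the same footprint:
high_bits_low_rank = adapted_layer_bytes(4096, 4096, bits=4, rank=8)
low_bits_high_rank = adapted_layer_bytes(4096, 4096, bits=3, rank=136)
print(high_bits_low_rank, low_bits_high_rank)  # 8519680.0 8519680.0
```

Both configurations cost the same number of bytes under these assumptions, which is why a search over bit-width and rank jointly, rather than one after the other, is a meaningful problem.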

technocracy

Data-Driven Supervision of a Thermal-Hydraulic Process Towards a Physics-Based Digital Twin

digitado ⋅ February 27, 2026

arXiv:2602.22267v1 Announce Type: new Abstract: The real-time supervision of production processes is a common challenge across several industries. It targets the monitoring and predictive maintenance of process components in order to ensure safety and uninterrupted production and to maintain a high level of efficiency. The rise of advanced tools for the simulation of physical systems, in addition to data-driven machine learning models, offers the possibility to design numerical tools dedicated to efficient system monitoring. In that respect, the digital twin concept presents an […]

technocracy

WaveSSM: Multiscale State-Space Models for Non-stationary Signal Attention

digitado ⋅ February 27, 2026

arXiv:2602.22266v1 Announce Type: new Abstract: State-space models (SSMs) have emerged as a powerful foundation for long-range sequence modeling, with the HiPPO framework showing that continuous-time projection operators can be used to derive stable, memory-efficient dynamical systems that encode the past history of the input signal. However, existing projection-based SSMs often rely on polynomial bases with global temporal support, whose inductive biases are poorly matched to signals exhibiting localized or transient structure. In this work, we introduce WaveSSM, a […]

technocracy

Entropy-Controlled Flow Matching

digitado ⋅ February 27, 2026

arXiv:2602.22265v1 Announce Type: new Abstract: Modern vision generators transport a base distribution to data through time-indexed measures, implemented as deterministic flows (ODEs) or stochastic diffusions (SDEs). Despite strong empirical performance, standard flow-matching objectives do not directly control the information geometry of the trajectory, allowing low-entropy bottlenecks that can transiently deplete semantic modes. We propose Entropy-Controlled Flow Matching (ECFM): a constrained variational principle over continuity-equation paths enforcing a global entropy-rate budget $\frac{d}{dt} H(\mu_t) \ge -\lambda$. ECFM is a convex […]

technocracy

Sustainable LLM Inference using Context-Aware Model Switching

digitado ⋅ February 27, 2026

arXiv:2602.22261v1 Announce Type: new Abstract: Large language models have become central to many AI applications, but their growing energy consumption raises serious sustainability concerns. A key limitation in current AI deployments is the reliance on a one-size-fits-all inference strategy where most systems route every request to the same large model, regardless of task complexity, leading to substantial and unnecessary energy waste. To address this issue, we propose a context-aware model switching approach that dynamically selects an appropriate language […]

technocracy

Code World Models for Parameter Control in Evolutionary Algorithms

digitado ⋅ February 27, 2026

arXiv:2602.22260v1 Announce Type: new Abstract: Can an LLM learn how an optimizer behaves and use that knowledge to control it? We extend Code World Models (CWMs), LLM-synthesized Python programs that predict environment dynamics, from deterministic games to stochastic combinatorial optimization. Given suboptimal trajectories of $(1{+}1)$-$\text{RLS}_k$, the LLM synthesizes a simulator of the optimizer’s dynamics; greedy planning over this simulator then selects the mutation strength $k$ at each step. On LeadingOnes and OneMax, CWM-greedy performs within 6% of […]
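For readers unfamiliar with the optimizer being controlled, here is a minimal sketch of (1+1)-RLS_k in its standard textbook form: flip exactly k distinct bits and keep the offspring if it is not worse. It is not the paper's code world model or planner. The names `run_rls` and `k_policy` are hypothetical, the fixed-k policy in the usage lines is only a placeholder for a learned controller, and the stopping test assumes the optimum has fitness n, which holds for OneMax and LeadingOnes.

```python
import random


def onemax(x):
    """Number of one-bits in the bit string."""
    return sum(x)


def leading_ones(x):
    """Length of the prefix of consecutive one-bits."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count


def rls_k_step(x, k, fitness, rng):
    """One (1+1)-RLS_k step: flip exactly k distinct positions chosen
    uniformly at random, and accept the offspring if it is not worse."""
    y = list(x)
    for i in rng.sample(range(len(x)), k):
        y[i] ^= 1
    return y if fitness(y) >= fitness(x) else x


def run_rls(n, k_policy, fitness, max_steps, seed=0):
    """Run RLS_k from a random start; k_policy(x) picks k at each step.
    Stops early once fitness reaches n (assumed to be the optimum)."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    for step in range(max_steps):
        if fitness(x) == n:
            return x, step
        x = rls_k_step(x, k_policy(x), fitness, rng)
    return x, max_steps


# Usage: a fixed mutation strength k = 1 on OneMax with n = 20.
best, steps = run_rls(20, lambda x: 1, onemax, max_steps=5000)
print(onemax(best), steps)
```

A parameter-control method replaces the constant `lambda x: 1` with a policy that chooses k from the current state, which is the role the abstract assigns to greedy planning over the synthesized simulator.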

technocracy

Orthogonal Weight Modification Enhances Learning Scalability and Convergence Efficiency without Gradient Backpropagation

digitado ⋅ February 27, 2026

arXiv:2602.22259v1 Announce Type: new Abstract: Recognizing the substantial computational cost of backpropagation (BP), non-BP methods have emerged as attractive alternatives for efficient learning on emerging neuromorphic systems. However, existing non-BP approaches still face critical challenges in efficiency and scalability. Inspired by neural representations and dynamic mechanisms in the brain, we propose a perturbation-based approach called LOw-rank Cluster Orthogonal (LOCO) weight modification. We find that low-rank is an inherent property of perturbation-based algorithms. Under this condition, the orthogonality constraint […]
