March 2026

Learning to Rank Caption Chains for Video-Text Alignment

digitado ⋅ March 26, 2026

Direct preference optimization (DPO) is an effective technique to train language models to generate preferred over dispreferred responses. However, this binary "winner-takes-all" approach is suboptimal for vision-language models whose response quality is highly dependent on visual content. In particular, a response may still be faithful to the visual inputs even if it is less preferable than an alternative. The standard Bradley-Terry DPO formulation lacks this nuance, upweighting winning responses without sufficient regard for whether the "losing" response still […]
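
For readers unfamiliar with the objective this excerpt criticizes, here is a minimal sketch of the standard Bradley-Terry DPO loss for a single preference pair. It is not the paper's proposed method. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions; the inputs are assumed to be summed log-probabilities of each response under the trained policy and under a frozen reference model.

```python
import math


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    All names and the default beta are hypothetical, for illustration only.
    """
    # Log-ratio of policy to reference for each response.
    chosen_logratio = policy_logp_chosen - ref_logp_chosen
    rejected_logratio = policy_logp_rejected - ref_logp_rejected

    # Bradley-Terry margin: only the difference between the two responses matters.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

Because the loss depends only on the margin between the two responses, it carries no signal about how good the rejected response is in absolute terms, which is the "winner-takes-all" behavior the excerpt describes.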

technocracy

Gateway Security Won’t Be Enough for MCP-Powered AI

digitado ⋅ March 26, 2026

As AI systems become agentic and interact directly with enterprise tools through Model Context Protocol (MCP), gateway-based security models may no longer be sufficient. Policy enforcement must move closer to where capability execution occurs. I was tempted to start by saying that AI operations and AI interactions must be secure, even from local threats. But that is obvious. Instead, it is worth looking at a familiar story from the history of enterprise security. The Rise and Fall of […]

technocracy

Reinforcement learning for quantum processes with memory

digitado ⋅ March 26, 2026

In reinforcement learning, an agent interacts sequentially with an environment to maximize a reward, receiving only partial, probabilistic feedback. This creates a fundamental exploration-exploitation trade-off: the agent must explore to learn the hidden dynamics while exploiting this knowledge to maximize its target objective. While extensively studied classically, applying this framework to quantum systems requires dealing with hidden quantum states that evolve via unknown dynamics. We formalize this problem via a framework where the environment maintains a hidden quantum […]

technocracy

Phase 2 Calibration: Fixing Gating and Reward Scoring Together

digitado ⋅ March 26, 2026

I didn’t add per‑category OOD thresholds because it was academically elegant. I added them because my baseline runs were telling me the same story over and over: some prompt categories were systematically getting mis-gated by a single global uncertainty threshold. When that happens, you don’t just waste compute—you route the wrong jobs into the wrong generation strategy, and your downstream scoring starts making “confident” decisions on top of the wrong substrate. Phase 2 calibration in this codebase is […]

technocracy

Layer-Specific Lipschitz Modulation for Fault-Tolerant Multimodal Representation Learning

digitado ⋅ March 26, 2026

Modern multimodal systems deployed in industrial and safety-critical environments must remain reliable under partial sensor failures, signal degradation, or cross-modal inconsistencies. This work introduces a mathematically grounded framework for fault-tolerant multimodal representation learning that unifies self-supervised anomaly detection and error correction within a single architecture. Building upon a theoretical analysis of perturbation propagation, we derive Lipschitz- and Jacobian-based criteria that determine whether a neural operator amplifies or attenuates localized faults. Guided by this theory, we propose a two-stage […]

technocracy

An Explainable Ensemble Learning Framework for Crop Classification with Optimized Feature Pyramids and Deep Networks

digitado ⋅ March 26, 2026

Agriculture is increasingly challenged by climate change, soil degradation, and resource depletion, and hence requires advanced data-driven crop classification and recommendation solutions. This work presents an explainable ensemble learning paradigm that fuses optimized feature pyramids, deep networks, self-attention mechanisms, and residual networks for bolstering crop suitability predictions based on soil characteristics (e.g., pH, nitrogen, potassium) and climatic conditions (e.g., temperature, rainfall). With a dataset comprising 3,867 instances and 29 features from the Ethiopian Agricultural Transformation Agency and NASA, […]

technocracy

SIGMA: Structure-Invariant Generative Molecular Alignment for Chemical Language Models via Autoregressive Contrastive Learning

digitado ⋅ March 26, 2026

Linearized string representations serve as the foundation of scalable autoregressive molecular generation; however, they introduce a fundamental modality mismatch where a single molecular graph maps to multiple distinct sequences. This ambiguity leads to \textit{trajectory divergence}, where the latent representations of structurally equivalent partial graphs drift apart due to differences in linearization history. To resolve this without abandoning the efficient string formulation, we propose Structure-Invariant Generative Molecular Alignment (SIGMA). Rather than altering the linear representation, SIGMA enables the model […]

technocracy

Two-Time-Scale Learning Dynamics: A Population View of Neural Network Training

digitado ⋅ March 26, 2026

arXiv:2603.19808v2. Population-based learning paradigms, including evolutionary strategies, Population-Based Training (PBT), and recent model-merging methods, combine fast within-model optimisation with slower population-level adaptation. Despite their empirical success, a general mathematical description of the resulting collective training dynamics remains incomplete. We introduce a theoretical framework for neural network training based on two-time-scale population dynamics. We model a population of neural networks as an interacting agent system in which network parameters evolve through fast noisy gradient updates […]

technocracy

Self-Aware Markov Models for Discrete Reasoning

digitado ⋅ March 26, 2026

arXiv:2603.16661v2. Standard masked discrete diffusion models face limitations in reasoning tasks due to their inability to correct their own mistakes on the masking path. Since they rely on a fixed number of denoising steps, they are unable to adjust their computation to the complexity of a given problem. To address these limitations, we introduce a method based on learning a Markov transition kernel that is trained on its own outputs. This design enables tokens […]

technocracy

From Reachability to Learnability: Geometric Design Principles for Quantum Neural Networks

digitado ⋅ March 26, 2026

arXiv:2603.03071v2. Classical deep networks are effective because depth enables adaptive geometric deformation of data representations. In quantum neural networks (QNNs), however, depth or state reachability alone does not guarantee this feature-learning capability. We study this question in the pure-state setting by viewing encoded data as an embedded manifold in $\mathbb{C}P^{2^n-1}$ and analysing infinitesimal unitary actions through Lie-algebra directions. We introduce Classical-to-Lie-algebra (CLA) maps and the criterion of almost Complete Local Selectivity (aCLS), which combines […]
