digitado – Page 494

Convex Dominance in Deep Learning I: A Scaling Law of Loss and Learning Rate

digitado ⋅ 6 February 2026

Deep learning has a non-convex loss landscape, and its optimization dynamics are hard to analyze or control. Nevertheless, the dynamics can be empirically convex-like across various tasks, models, optimizers, hyperparameters, etc. In this work, we examine the applicability of convexity and Lipschitz continuity in deep learning, in order to precisely control the loss dynamics via the learning rate schedules. We illustrate that deep learning quickly becomes weakly convex after a short period of training, and the loss is predictable […]

technocracy

Stochastic Matching Bandits with Rare Optimization Updates

digitado ⋅ 30 January 2026

arXiv:2509.04194v2. Abstract: We introduce a bandit framework for stochastic matching under the multinomial logit (MNL) choice model. In our setting, $N$ agents on one side are assigned to $K$ arms on the other side, where each arm stochastically selects an agent from its assigned pool according to unknown preferences and yields a corresponding reward over a horizon $T$. The objective is to minimize regret by maximizing the cumulative revenue from successful matches. A naive approach […]

Demystifying Group Relative Policy Optimization: Its Policy Gradient is a U-Statistic

digitado ⋅ 24 March 2026

arXiv:2603.01162v3. Abstract: Group relative policy optimization (GRPO), a core methodological component of DeepSeekMath and DeepSeek-R1, has emerged as a cornerstone for scaling reasoning capabilities of large language models. Despite its widespread adoption and the proliferation of follow-up works, the theoretical properties of GRPO remain less studied. This paper provides a unified framework to understand GRPO through the lens of classical U-statistics. We demonstrate that the GRPO policy gradient is inherently a U-statistic, allowing us to […]

Computationally sufficient statistics for Ising models

digitado ⋅ 16 February 2026

arXiv:2602.12449v1. Abstract: Learning Gibbs distributions using only sufficient statistics has long been recognized as a computationally hard problem. On the other hand, computationally efficient algorithms for learning Gibbs distributions rely on access to full sample configurations generated from the model. For many systems of interest that arise in physical contexts, expecting a full sample to be observed is not practical, and hence it is important to look for computationally efficient methods that solve the learning […]

Statistical-Neural Interaction Networks for Interpretable Mixed-Type Data Imputation

digitado ⋅ 21 January 2026

arXiv:2601.12380v1. Abstract: Real-world tabular databases routinely combine continuous measurements and categorical records, yet missing entries are pervasive and can distort downstream analysis. We propose Statistical-Neural Interaction (SNI), an interpretable mixed-type imputation framework that couples correlation-derived statistical priors with neural feature attention through a Controllable-Prior Feature Attention (CPFA) module. CPFA learns head-wise prior-strength coefficients $\{\lambda_h\}$ that softly regularize attention toward the prior while allowing data-driven deviations when nonlinear patterns appear to be present in the data. […]

Scenario-Adaptive Evaluation of Trustworthy Fine-Tuned Text Models Across Knowledge-Grounded Generation and Misinformation Detection

digitado ⋅ 11 May 2026

Large language models (LLMs) increasingly require robust evaluation under realistic instruction-following conditions, particularly for fine-tuned task-specific adapters operating in multilingual environments. This study proposes a scenario-adaptive evaluation framework for assessing the reliability of fine-tuned text models across two application regimes: misinformation detection (disinfo) and knowledge-grounded factual biography generation (heroes). The framework integrates automated generation of balanced risk-oriented scenarios, bilingual evaluation in English and Ukrainian, the LLM-as-a-Judge paradigm, and multidimensional robustness analysis through the Alignment Robustness Index (ARI). Six […]

Multi-Agent Cooperative Learning for Robust Vision-Language Alignment under OOD Concepts

digitado ⋅ 16 January 2026

arXiv:2601.09746v1. Abstract: This paper introduces a novel Multi-Agent Cooperative Learning (MACL) framework to address cross-modal alignment collapse in vision-language models when handling out-of-distribution (OOD) concepts. Four core agents, including image, text, name, and coordination agents, collaboratively mitigate modality imbalance through structured message passing. The proposed framework enables multi-agent feature space name learning, incorporates a context exchange enhanced few-shot learning algorithm, and adopts an adaptive dynamic balancing mechanism to regulate inter-agent contributions. Experiments on the VISTA-Beyond […]

Mosaic Learning: A Framework for Decentralized Learning with Model Fragmentation

digitado ⋅ 4 February 2026

Decentralized learning (DL) enables collaborative machine learning (ML) without a central server, making it suitable for settings where training data cannot be centrally hosted. We introduce Mosaic Learning, a DL framework that decomposes models into fragments and disseminates them independently across the network. Fragmentation reduces redundant communication across correlated parameters and enables more diverse information propagation without increasing communication cost. We theoretically show that Mosaic Learning (i) attains a state-of-the-art worst-case convergence rate, and (ii) leverages parameter correlation in […]

Privately Learning Decision Lists and a Differentially Private Winnow

digitado ⋅ 7 February 2026

We give new differentially private algorithms for the classic problems of learning decision lists and large-margin halfspaces in the PAC and online models. In the PAC model, we give a computationally efficient algorithm for learning decision lists with minimal sample overhead over the best non-private algorithms. In the online model, we give a private analog of the influential Winnow algorithm for learning halfspaces with mistake bound polylogarithmic in the dimension and inverse polynomial in the margin. As an […]

GeMA: Learning Latent Manifold Frontiers for Benchmarking Complex Systems

digitado ⋅ 17 March 2026

Benchmarking the performance of complex systems such as rail networks, renewable generation assets and national economies is central to transport planning, regulation and macroeconomic analysis. Classical frontier methods, notably Data Envelopment Analysis (DEA) and Stochastic Frontier Analysis (SFA), estimate an efficient frontier in the observed input-output space and define efficiency as distance to this frontier, but rely on restrictive assumptions on the production set and only indirectly address heterogeneity and scale effects. We propose Geometric Manifold Analysis (GeMA), […]
