digitado – Page 449

Celo2: Towards Learned Optimization Free Lunch

digitado ⋅ 22 February 2026

Learned optimizers are powerful alternatives to hand-designed update rules like Adam, yet they have seen limited practical adoption since they often fail to meta-generalize beyond their training distribution and incur high meta-training cost. For instance, prior work, VeLO, scaled meta-training to 4,000 TPU months ($\sim$10$\times$ GPT-3 compute) to meta-train a general-purpose optimizer, but it failed to generalize beyond 600M-parameter tasks. In this work, we present a surprising finding: by crafting a simple normalized optimizer architecture and augmenting […]

technocracy

Microservices and Multi-Runtime Architectures

digitado ⋅ 23 January 2024

Software developers increasingly adopt the microservices architecture, a server-side solution where interconnected services function autonomously. This enables distinct teams to work on separate services without interrupting the overall workflow, a level of flexibility rarely seen in alternative architectural approaches. Additionally, the next-generation approach, multi-runtime architecture, is gaining more attention. In this blog post, we explain both concepts, as well as their benefits and limitations compared to the monolithic architecture. What are microservices? Microservices are a method in software development where […]

technocracy

Information Geometry Description of Inferential Scattering

digitado ⋅ 14 April 2026

We investigate the geometrical structure underlying the notion of Inferential Scattering, which was formulated by E. T. Jaynes in the 1980s using the language of equilibrium statistical mechanics. We show that inferential scattering can be naturally defined on a dually flat Riemannian manifold equipped with dual coordinate systems, a differential-geometric structure that occupies a central place in information geometry. We find that the evolution of the system on the dually flat manifold can be expressed as the […]

technocracy

Uncertainty-Aware Surrogate-based Amortized Bayesian Inference for Computationally Expensive Models

digitado ⋅ 19 January 2026

arXiv:2505.08683v3. Abstract: Bayesian inference typically relies on a large number of model evaluations to estimate posterior distributions. Established methods like Markov Chain Monte Carlo (MCMC) and Amortized Bayesian Inference (ABI) can become computationally challenging. While ABI enables fast inference after training, generating sufficient training data still requires thousands of model simulations, which is infeasible for expensive models. Surrogate models offer a solution by providing approximate simulations at a lower computational cost, allowing the generation of […]

technocracy

Contact-Grounded Policy: Dexterous Visuotactile Policy with Generative Contact Grounding

digitado ⋅ 9 March 2026

arXiv:2603.05687v1. Abstract: Contact-Grounded Policy (CGP) enables fine-grained, contact-rich dexterous manipulation by grounding multi-point contacts through predicting the actual robot state and tactile feedback, and by using a learned contact-consistency mapping to convert these predictions into controller-executable targets for a compliance controller. CGP supports both dense tactile arrays and vision-based tactile sensors mounted on the hand. We collect demonstrations via teleoperation in both simulation and on a physical robot, and evaluate CGP across multiple dexterous manipulation […]

technocracy

FTimeXer: Frequency-aware Time-series Transformer with Exogenous variables for Robust Carbon Footprint Forecasting

digitado ⋅ 7 April 2026

arXiv:2604.02347v1. Abstract: Accurate and up-to-date forecasting of the power grid’s carbon footprint is crucial for effective product carbon footprint (PCF) accounting and informed decarbonization decisions. However, the carbon intensity of the grid exhibits high non-stationarity, and existing methods often struggle to effectively leverage periodic and oscillatory patterns. Furthermore, these methods tend to perform poorly when confronted with irregular exogenous inputs, such as missing data or misalignment. To tackle these challenges, we propose FTimeXer, a frequency-aware […]

technocracy

Leveraging Mutation Analysis for LLM-based Repair of Quantum Programs

digitado ⋅ 21 January 2026

arXiv:2601.12273v1. Abstract: In recent years, Automated Program Repair (APR) techniques specifically designed for quantum programs have been proposed. However, existing approaches often suffer from low repair success rates or poor understandability of the generated patches. In this study, we construct a framework in which a large language model (LLM) generates code repairs along with a natural language explanation of the applied repairs. To investigate how the contextual information included in prompts influences APR performance for […]

technocracy

IGA-LWP: An Iterative Gradient-based Adversarial Attack for Link Weight Prediction

digitado ⋅ 9 January 2026

arXiv:2601.04259v1. Abstract: Link weight prediction extends classical link prediction by estimating the strength of interactions rather than merely their existence, and it underpins a wide range of applications such as traffic engineering, social recommendation, and scientific collaboration analysis. However, the robustness of link weight prediction against adversarial perturbations remains largely unexplored. In this paper, we formalize the link weight prediction attack problem as an optimization task that aims to maximize the prediction error on a set […]

technocracy

Equivariant Evidential Deep Learning for Interatomic Potentials

digitado ⋅ 11 February 2026

Uncertainty quantification (UQ) is critical for assessing the reliability of machine learning interatomic potentials (MLIPs) in molecular dynamics (MD) simulations, identifying extrapolation regimes and enabling uncertainty-aware workflows such as active learning for training dataset construction. Existing UQ approaches for MLIPs are often limited by high computational cost or suboptimal performance. Evidential deep learning (EDL) provides a theoretically grounded single-model alternative that determines both aleatoric and epistemic uncertainty in a single forward pass. However, extending evidential formulations from scalar […]

technocracy

Monkey Jump: MoE-Style PEFT for Efficient Multi-Task Learning

digitado ⋅ 10 January 2026

Mixture-of-experts variants of parameter-efficient fine-tuning enable per-token specialization, but they introduce additional trainable routers and expert parameters, increasing memory usage and training cost. This undermines the core goal of parameter-efficient fine-tuning. We propose Monkey Jump, a method that brings mixture-of-experts-style specialization to parameter-efficient fine-tuning without introducing extra trainable parameters for experts or routers. Instead of adding new adapters as experts, Monkey Jump treats the adapters already present in each Transformer block (such as query, key, value, up, and […]
