digitado – Page 103

LATMiX: Learnable Affine Transformations for Microscaling Quantization of LLMs

digitado ⋅ 23 February 2026

arXiv:2602.17681v1. Abstract: Post-training quantization (PTQ) is a widely used approach for reducing the memory and compute costs of large language models (LLMs). Recent studies have shown that applying invertible transformations to activations can significantly improve quantization robustness by reducing activation outliers; however, existing approaches are largely restricted to rotation or Hadamard-based transformations. Moreover, most studies focused primarily on traditional quantization schemes, whereas modern hardware increasingly supports the microscaling (MX) data format. Attempts to combine both […]


technocracy

Inside the AI That Translates 200 Languages, Even the Ones With Almost No Data

digitado ⋅ 9 March 2026

Authors: NLLB Team, Marta R. Costa-jussà, James Cross, Onur Çelebi, Maha Elbayad, Kenneth Heafield, Kevin Heffernan, Elahe Kalbassi, Janice Lam, Daniel Licht, Jean Maillard, Anna Sun, Skyler Wang, Guillaume Wenzek, Al Youngblood, Bapi Akula, Loic Barrault, Gabriel Mejia Gonzalez, Prangthip Hansanti, John Hoffman, Semarley Jarrett, Kaushik Ram Sadagopan, Dirk Rowe, Shannon Spruit, Chau Tran, Pierre Andrews, Necip Fazil Ayan, Shruti Bhosale, Sergey Edunov, Angela Fan, Cynthia Gao, Vedanuj Goswami, Francisco Guzmán, Philipp Koehn, Alexandre Mourachko, Christophe Ropers […]


technocracy

A graph neural network based chemical mechanism reduction method for combustion applications

digitado ⋅ 25 March 2026

arXiv:2603.22318v1. Abstract: Direct numerical simulations of turbulent reacting flows involving millions of grid points and detailed chemical mechanisms with hundreds of species and thousands of reactions are computationally prohibitive. To address this challenge, we present two data-driven chemical mechanism reduction formulations based on graph neural networks (GNNs) with message-passing transformer layers that learn nonlinear dependencies among species and reactions. The first formulation, GNN-SM, employs a pre-trained surrogate model to guide reduction across a broad range […]


technocracy

Multi-Partner Project: COIN-3D — Collaborative Innovation in 3D VLSI Reliability

digitado ⋅ 22 January 2026

arXiv:2601.14347v1. Abstract: As semiconductor manufacturing advances from the 3-nm process toward the sub-nanometer regime and transitions from FinFETs to gate-all-around field-effect transistors (GAAFETs), the resulting complexity and manufacturing challenges continue to increase. In this context, 3D chiplet-based approaches have emerged as key enablers to address these limitations while exploiting the expanded design space. Specifically, chiplets help address the lower yields typically associated with large monolithic designs. This paradigm enables the modular design of heterogeneous systems […]


technocracy

Safety-Critical Reinforcement Learning with Viability-Based Action Shielding for Hypersonic Longitudinal Flight

digitado ⋅ 5 February 2026

arXiv:2602.03968v1. Abstract: This paper presents a safety-critical reinforcement learning framework for nonlinear dynamical systems with continuous state and input spaces operating under explicit physical constraints. Hard safety constraints are enforced independently of the reward through action shielding and reachability-based admissible action sets, ensuring that unsafe behaviors are never intentionally selected during learning or execution. To capture nominal operation and recovery behavior within a single control architecture, the state space is partitioned into safe and unsafe […]


technocracy

Terence McKenna: Esalen 1989 Lecture

digitado ⋅ 5 March 2025

Ayahuasca is a visionary “brain cocktail” brewed from two secret jungle plants. Part medicine, part spiritual telephone, it connects humans to nature’s hidden wisdom. This ancient partnership suggests we are students of a grand, green symbiosis—perhaps even guided by chatty extraterrestrial spores whispering cosmic secrets.


technocracy

CogFormer: Learn All Your Models Once

digitado ⋅ 20 March 2026

Simulation-based inference (SBI) with neural networks has accelerated and transformed cognitive modeling workflows. SBI enables modelers to fit complex models that were previously difficult or impossible to estimate, while also allowing rapid estimation across large numbers of datasets. However, the utility of SBI for iterating over varying modeling assumptions remains limited: changing parameterizations, generative functions, priors, and design variables all necessitate model retraining and hence diminish the benefits of amortization. To address these issues, we pilot a meta-amortized […]


technocracy

Decoupled Continuous-Time Reinforcement Learning via Hamiltonian Flow

digitado ⋅ 16 February 2026

Many real-world control problems, ranging from finance to robotics, evolve in continuous time with non-uniform, event-driven decisions. Standard discrete-time reinforcement learning (RL), based on fixed-step Bellman updates, struggles in this setting: as time gaps shrink, the $Q$-function collapses to the value function $V$, eliminating action ranking. Existing continuous-time methods reintroduce action information via an advantage-rate function $q$. However, they enforce optimality through complicated martingale losses or orthogonality constraints, which are sensitive to the choice of test processes. These […]


technocracy

Do Neural Networks Lose Plasticity in a Gradually Changing World?

digitado ⋅ 11 February 2026

arXiv:2602.09234v1. Abstract: Continual learning has become a trending topic in machine learning. Recent studies have discovered an interesting phenomenon called loss of plasticity, referring to neural networks gradually losing the ability to learn new tasks. However, existing plasticity research largely relies on contrived settings with abrupt task transitions, which often do not reflect real-world environments. In this paper, we propose to investigate a gradually changing environment, and we simulate this by input/output interpolation and task […]


technocracy

Sigmas and Student

digitado ⋅ 21 January 2026

I saw something yesterday saying that the Japanese bond market had experienced a six standard deviation move. This brought to mind a post I’d written eight years ago. All probability statements depend on a model. And if your probability model says an event was six standard deviations from the mean, it’s more likely that your model is wrong than that you’ve actually seen something that rare. I expand on this idea here. How likely is it […]
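To make the point about model dependence concrete, here is a minimal sketch that compares the probability of a move of at least six standard deviations under two models. The choice of a Student t distribution with 3 degrees of freedom as the heavy-tailed alternative is an assumption made for illustration (it has a simple closed-form CDF); the excerpt does not say which model the post uses, and the function names are hypothetical.

```python
import math


def normal_two_sided_tail(k: float) -> float:
    """P(|Z| > k) for a standard normal Z, via the complementary error function."""
    return math.erfc(k / math.sqrt(2.0))


def t3_two_sided_tail(k: float) -> float:
    """P(|T| > k standard deviations) for a Student t with 3 degrees of freedom.

    Assumption for illustration: the t(3) distribution has variance 3, so a
    move of k standard deviations is x = k * sqrt(3) in t units. The t(3) CDF
    is F(x) = 1/2 + (u / (1 + u^2) + atan(u)) / pi with u = x / sqrt(3),
    which means u equals k here.
    """
    u = k
    return 1.0 - 2.0 * (u / (1.0 + u * u) + math.atan(u)) / math.pi


if __name__ == "__main__":
    k = 6.0
    p_normal = normal_two_sided_tail(k)
    p_t3 = t3_two_sided_tail(k)
    print(f"normal model: P(|move| > {k} sd) = {p_normal:.3e}")
    print(f"t(3) model:   P(|move| > {k} sd) = {p_t3:.3e}")
    print(f"ratio t(3) / normal = {p_t3 / p_normal:.3e}")
```

Under the normal model the event is roughly a two-in-a-billion occurrence, while under the heavier-tailed model it is several orders of magnitude more likely, which is why a reported six-sigma move says more about the model than about the event.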
