Optimizing LoRA target module selection for efficient fine-tuning
Fine-tuning a large language model (LLM) on a specific task traditionally means updating all of its billions of parameters, with the attendant costs in GPU resources and time. Low-rank adaptation (LoRA) is a more efficient alternative that freezes the original model weights and instead injects lightweight, trainable matrices into specific model sublayers, or “modules”. These matrices, commonly referred to as “adapters”, effectively modify the modules’ weights, enabling not only efficient fine-tuning but also on-demand model serving, which dramatically lowers […]
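As a concrete illustration of what “choosing target modules” looks like in practice, here is a minimal sketch using the Hugging Face PEFT library. The checkpoint name, rank, and the choice of attention query/value projections as targets are illustrative assumptions, not a recommendation from this article.

```python
# Minimal sketch: attach LoRA adapters only to selected sublayers ("target modules").
# Assumes the `transformers` and `peft` libraries; "facebook/opt-125m" is used purely
# as a small example checkpoint.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# The base weights stay frozen; trainable low-rank matrices are injected only
# into the modules listed below (here, the attention query and value projections).
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update
    lora_alpha=16,                        # scaling factor applied to the update
    target_modules=["q_proj", "v_proj"],  # which sublayers receive adapters
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only a small fraction of parameters are trainable
```

Changing the `target_modules` list is the central knob this article is concerned with: different module selections trade off adapter size, training cost, and downstream quality.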