January 2026

Partially observable Matsuzawa. Can any RL algorithm generalize in this way?

digitado ⋅ 19 January 2026

Fully observable Matsuzawa puzzles are grid worlds where an agent must pick up coins in a particular order, travel down a long hallway, then pick up coins in order again. The secondary chamber has its coins in exactly the same locations as in the primary chamber. https://i.imgur.com/5nvi0oe.png Coins must be picked up in the order of their face number. Coins in the secondary chamber can be picked up only when no coins remain in the primary chamber. Reward […]
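
The two pickup rules in the excerpt can be sketched as a single predicate. This is a minimal illustration, not the post's environment: the function name `can_pick_up`, the use of sets of remaining face numbers, and the reading of "in the order of their face number" as ascending order are all assumptions.

```python
def can_pick_up(coin, chamber, primary_left, secondary_left):
    """Return True if `coin` may be picked up right now.

    coin           -- face number of the coin the agent is standing on
    chamber        -- "primary" or "secondary" (hypothetical labels)
    primary_left   -- set of face numbers still present in the primary chamber
    secondary_left -- set of face numbers still present in the secondary chamber
    Assumption: "in order" means lowest remaining face number first.
    """
    if chamber == "primary":
        return coin in primary_left and coin == min(primary_left)
    # Secondary coins are locked until the primary chamber is empty.
    return (
        not primary_left
        and coin in secondary_left
        and coin == min(secondary_left)
    )
```

Under these assumptions, `can_pick_up(1, "secondary", {3}, {1, 2, 3})` is false because coin 3 is still in the primary chamber, while the same call with an empty primary set is true.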

technocracy

Prime gaps and Gapcoin

digitado ⋅ 19 January 2026

The previous post looked at tightly clustered primes. This post looks at the opposite, large gaps between primes. Riecoin is a cryptocurrency that uses finding prime clusters as its proof of work task. Gapcoin uses finding prime gaps as its proof of work task. There’s some nuance to defining prime gaps. It’s trivial to produce a gap of any size. For example, [n! + 2, n! + n] is an interval of length n − 1 that contains no […]
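
The factorial construction in the excerpt is easy to check numerically: for 2 ≤ k ≤ n, k divides n! + k, so every integer in [n! + 2, n! + n] is composite. The sketch below is only an illustration; the helper names `is_prime`, `factorial_run`, and `divisor_witnesses` are not from the post.

```python
from math import factorial


def is_prime(m: int) -> bool:
    """Trial-division primality test, adequate for small demonstrations."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    d = 3
    while d * d <= m:
        if m % d == 0:
            return False
        d += 2
    return True


def factorial_run(n: int) -> list[int]:
    """Return the n - 1 consecutive integers n! + 2, ..., n! + n."""
    f = factorial(n)
    return [f + k for k in range(2, n + 1)]


def divisor_witnesses(n: int) -> dict[int, int]:
    """Map each n! + k to k, a nontrivial divisor proving it composite."""
    f = factorial(n)
    return {f + k: k for k in range(2, n + 1)}


if __name__ == "__main__":
    run = factorial_run(8)
    print(len(run), all(not is_prime(m) for m in run))
```

The run only gives a lower bound on the gap: the primes on either side of it may lie further out than n! + 1 and n! + n + 1.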

Hippotorch: Hippocampus-inspired episodic memory for sparse-reward problems

digitado ⋅ 19 January 2026

I’ve been working on a replay buffer replacement inspired by how the hippocampus consolidates memories during sleep. The problem: In sparse-reward tasks with long horizons (e.g., T-maze variants), the critical observation arrives at t=0 but the decision happens 30+ steps later. Uniform replay treats all transitions equally, so the rare successes get drowned out. The approach: Hippotorch uses a dual encoder to embed experiences, stores them in an episodic memory with semantic indices, and periodically runs a […]

FLUX.2-klein-4B Pure C Implementation

digitado ⋅ 19 January 2026

On 15th January Black Forest Labs, a lab formed by the creators of the original Stable Diffusion, released black-forest-labs/FLUX.2-klein-4B – an Apache 2.0 licensed 4 billion parameter version of their FLUX.2 family. Salvatore Sanfilippo (antirez) decided to build a pure C and dependency-free implementation to run the model, with assistance from Claude Code and Claude Opus 4.5. Salvatore shared this note on Hacker News: Something that may be interesting for the reader of this […]

Learning Deterministic Finite-State Machines from the Prefixes of a Single String is NP-Complete

digitado ⋅ 19 January 2026

It is well known that computing a minimum DFA consistent with a given set of positive and negative examples is NP-hard. Previous work has identified conditions on the input sample under which the problem becomes tractable or remains hard. In this paper, we study the computational complexity of the case where the input sample is prefix-closed. This formulation is equivalent to computing a minimum Moore machine consistent with observations along its runs. We show that the problem is […]
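
To make "consistent with observations along its runs" concrete, the sketch below checks whether a given Moore machine reproduces the labels observed on every prefix of one string. It only verifies a candidate machine; finding a minimum one is the hard problem the abstract studies. The function name and the dictionary encoding of the machine are illustrative assumptions, not the paper's notation.

```python
def is_consistent(delta, output, start, word, labels):
    """Check a Moore machine against labels observed on the prefixes of `word`.

    delta  -- dict mapping (state, symbol) to the next state
    output -- dict mapping each state to its output label
    start  -- initial state
    word   -- the single input string
    labels -- labels[i] is the observed label of the prefix word[:i],
              so len(labels) must equal len(word) + 1
    """
    if len(labels) != len(word) + 1:
        raise ValueError("need one label per prefix, including the empty one")
    state = start
    if output[state] != labels[0]:
        return False
    for symbol, label in zip(word, labels[1:]):
        state = delta[(state, symbol)]
        if output[state] != label:
            return False
    return True


# Two-state machine that accepts strings with an even number of 'a's.
parity_delta = {(0, "a"): 1, (1, "a"): 0}
parity_output = {0: True, 1: False}
```

With this encoding, a positive/negative sample that is prefix-closed corresponds to the accept/reject labels read off along the run of the string.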

A Theory of Diversity for Random Matrices with Applications to In-Context Learning of Schrödinger Equations

digitado ⋅ 18 January 2026

We address the following question: given a collection $\{\mathbf{A}^{(1)}, \dots, \mathbf{A}^{(N)}\}$ of independent $d \times d$ random matrices drawn from a common distribution $\mathbb{P}$, what is the probability that the centralizer of $\{\mathbf{A}^{(1)}, \dots, \mathbf{A}^{(N)}\}$ is trivial? We provide lower bounds on this probability in terms of the sample size $N$ and the dimension $d$ for several families of random matrices which arise from the discretization of linear Schrödinger operators with random potentials. When combined with recent work […]
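
For readers unfamiliar with the term, the standard definition of the centralizer can be written out as follows. The symbol $\mathcal{C}$, the choice of the complex field, and the notation $\mathbf{I}_d$ for the identity are assumptions made here for illustration; the abstract does not fix them.

```latex
\[
\mathcal{C}\bigl(\mathbf{A}^{(1)}, \dots, \mathbf{A}^{(N)}\bigr)
  = \bigl\{ \mathbf{B} \in \mathbb{C}^{d \times d} :
      \mathbf{B}\mathbf{A}^{(i)} = \mathbf{A}^{(i)}\mathbf{B}
      \ \text{for all } i = 1, \dots, N \bigr\},
\]
\[
\text{trivial centralizer:} \qquad
\mathcal{C}\bigl(\mathbf{A}^{(1)}, \dots, \mathbf{A}^{(N)}\bigr)
  = \{ c\,\mathbf{I}_d : c \in \mathbb{C} \}.
\]
```

Scalar multiples of the identity commute with every matrix, so they always belong to the centralizer; "trivial" means nothing else does.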

Prime clusters and Riecoin

digitado ⋅ 18 January 2026

Prime clusters are sets of primes that appear as close together as is generally possible. There is one pair of consecutive prime numbers, 2 and 3, but there cannot be any more: in any larger pair of consecutive numbers, one of the pair will be even. But there are a lot of twin primes, perhaps infinitely many, and so a prime cluster of size two is a pair of primes whose difference is 2. How close together can a […]
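
Both observations in the excerpt, a single pair of consecutive primes and many twin primes, can be checked with a small sieve. This is a minimal sketch; the function names `primes_up_to`, `twin_primes`, and `consecutive_prime_pairs` are chosen here for illustration.

```python
from math import isqrt


def primes_up_to(limit: int) -> list[int]:
    """Sieve of Eratosthenes: all primes p with p <= limit."""
    if limit < 2:
        return []
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [i for i, flag in enumerate(sieve) if flag]


def twin_primes(limit: int) -> list[tuple[int, int]]:
    """Pairs (p, p + 2) with both members prime and p + 2 <= limit."""
    primes = primes_up_to(limit)
    prime_set = set(primes)
    return [(p, p + 2) for p in primes if p + 2 in prime_set]


def consecutive_prime_pairs(limit: int) -> list[tuple[int, int]]:
    """Pairs (p, p + 1) with both members prime and p + 1 <= limit."""
    primes = primes_up_to(limit)
    prime_set = set(primes)
    return [(p, p + 1) for p in primes if p + 1 in prime_set]


if __name__ == "__main__":
    print(twin_primes(20))
    print(consecutive_prime_pairs(1000))
```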

Life, Machine Learning, and the Search for Habitability: Predicting Biosignature Fluxes for the Habitable Worlds Observatory

digitado ⋅ 18 January 2026

Future direct-imaging flagship missions, such as NASA’s Habitable Worlds Observatory (HWO), face critical decisions in prioritizing observations due to extremely stringent time and resource constraints. In this paper, we introduce two advanced machine-learning architectures tailored for predicting biosignature species fluxes from exoplanetary reflected-light spectra: a Bayesian Convolutional Neural Network (BCNN) and our novel model architecture, the Spectral Query Adaptive Transformer (SQuAT). The BCNN robustly quantifies both epistemic and aleatoric uncertainties, offering reliable predictions under diverse observational conditions, whereas […]

[Project Review] Attempting Multi-Warehouse VRP with Heterogeneous Fleet (REINFORCE). Stuck on the “Efficiency vs. Effectiveness” trade-off

digitado ⋅ 18 January 2026

Hi everyone, I am an RL novice working on my first “real” project: a solver for the Multi-Warehouse Vehicle Routing Problem (MWVRP). My background is limited (I’ve essentially only read the DeepMDV paper and some standard VRP literature), so I am looking for a sanity check on my approach, as well as recommendations for papers or codebases that tackle similar constraints. The Problem Setting: I am modeling a supply chain with: Multiple Depots & Heterogeneous Fleet (Vans, Medium […]

Learning Relativistic Geodesics and Chaotic Dynamics via Stabilized Lagrangian Neural Networks

digitado ⋅ 18 January 2026

Lagrangian Neural Networks (LNNs) can learn arbitrary Lagrangians from trajectory data, but their unusual optimization objective leads to significant training instabilities that limit their application to complex systems. We propose several improvements that address these fundamental challenges, namely, a Hessian regularization scheme that penalizes unphysical signatures in the Lagrangian’s second derivatives with respect to velocities, preventing the network from learning unstable dynamics, activation functions that are better suited to the problem of learning Lagrangians, and a physics-aware coordinate […]
