Q-learning + Shannon entropy for classifying 390K integer sequences (OEIS)

I recently posted some info on a full “intelligence engine” we’ve been working on: a reinforcement learning framework that uses Q-learning with entropy-based exploration control to classify structured datasets. I’ve been running it across multiple domains and have just released the datasets publicly.

The most interesting one: I ran it against the entire OEIS (Online Encyclopedia of Integer Sequences) — 390,952 sequences. The agent classifies each sequence by information-theoretic properties: Shannon entropy of term values, growth dynamics, periodicity, convergence behavior, and structural patterns.
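The framework itself isn’t public, but the first of those properties is easy to reproduce. Here’s a minimal pure-Python sketch of Shannon entropy over a sequence’s term values (the function name and tolerance are mine, not from the released code):

```python
from collections import Counter
from math import log2

def term_entropy(seq):
    """Shannon entropy (in bits) of the distribution of term values."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# A constant sequence carries zero entropy; a sequence of n distinct
# terms hits the maximum of log2(n) bits.
term_entropy([1, 1, 1, 1])  # 0 bits
term_entropy([1, 2, 3, 4])  # 2 bits
```

Note this measures the value distribution only, not ordering — periodicity and growth need their own signals.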

The same framework, with no shared state between domains, also classified 9,673 genes from Neurospora crassa by expression entropy across 97 experimental conditions.

What’s interesting is what emerged independently across domains. Low-entropy patterns in mathematics (fundamental constants, convergent sequences) have structural parallels to constitutive genes in biology (always expressed, essential machinery). High-entropy patterns (irregular, chaotic sequences) parallel condition-specific genes. Nobody told the agent these should be related. Same framework, different data, analogous categories.

Some details on the setup:

  • Q-learning with Elo-based pairwise preference learning
  • 36 signal categories for mathematics, 30 for biology
  • 187K learning steps on math, 105K on biology
  • Pure Python, zero external dependencies, runs on consumer hardware
  • Also running on 7 programming languages, cybersecurity, and a couple other domains (those datasets aren’t public yet)
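The post doesn’t detail how the Elo-based preference learning works internally, but the standard Elo update it presumably builds on is simple — a sketch, with my own function name and default K-factor:

```python
def elo_update(r_winner, r_loser, k=32):
    """Standard Elo update: expected score follows from the rating gap."""
    expected_w = 1 / (1 + 10 ** ((r_loser - r_winner) / 400))
    delta = k * (1 - expected_w)
    return r_winner + delta, r_loser - delta

# Two classifications start equal; after one "wins" a pairwise
# comparison, the ratings move apart symmetrically.
a, b = elo_update(1000.0, 1000.0)  # a = 1016.0, b = 984.0
```

The appeal for preference learning is that only pairwise outcomes are needed, never absolute scores.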

Released the classified datasets on Codeberg under CC-BY-4.0: https://codeberg.org/SYNTEX/multi-domain-datasets

The OEIS classification includes per-sequence: entropy, growth class (exponential/polynomial/constant/oscillating), periodicity, monotonicity, and growth ratios. 131 MB uncompressed, 16 MB gzipped.
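A growth class like the ones above can be approximated from successive-term ratios alone. This is my own rough heuristic, not the released classifier’s logic:

```python
def classify_growth(seq, tol=0.05):
    """Crude growth class from successive-term comparisons and ratios."""
    if len(set(seq)) == 1:
        return "constant"
    pairs = list(zip(seq, seq[1:]))
    # Both increases and decreases present -> oscillating.
    if any(b > a for a, b in pairs) and any(b < a for a, b in pairs):
        return "oscillating"
    ratios = [b / a for a, b in pairs if a != 0]
    # Near-constant ratio above 1 -> roughly exponential.
    if ratios and min(ratios) > 1 + tol and max(ratios) - min(ratios) < tol * max(ratios):
        return "exponential"
    return "polynomial"

classify_growth([1, 2, 4, 8, 16])   # "exponential" (ratio ~2)
classify_growth([1, 4, 9, 16, 25])  # "polynomial" (ratios decay toward 1)
```

Real sequences need more care (zeros, sign changes, short prefixes), but this is the general shape of a ratio-based growth signal.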

The framework itself is proprietary, but the data is open. If anyone wants to poke at the classifications or has ideas for what else to do with 390K entropy-classified sequences, I’m interested to hear.

submitted by /u/entropiclybound