[P] A new open-source MLP symbolic distillation and analysis tool
Hey folks! I built a tool that turns neural networks into readable math formulas – SDHCE
I’ve been working on a small project called SDHCE (Symbolic Distillation via Hierarchical Concept Extraction) and wanted to share it here.
The core idea: after you train a neural network, SDHCE extracts a human-readable concept hierarchy directly from the weights – no extra data needed. It then checks whether that hierarchy alone can reproduce the network’s predictions. If it can, you get a compact symbolic formula at the end that you could implement by hand and throw the network away.
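To make the extract-then-verify loop concrete, here is a minimal sketch of the fidelity check described above. All names (`net_predict`, `symbolic_predict`, `fidelity`) are assumptions for illustration, not the actual SDHCE API; the network is stood in for by a simple decision rule so the snippet is self-contained.

```python
import numpy as np

# Hypothetical sketch (names are assumptions, not the SDHCE API): a distilled
# formula is only worth keeping if it reproduces the network's predictions.

def net_predict(X):
    # Stand-in for a trained network: class 1 iff 2*x0 - x1 > 0.5.
    return (2 * X[:, 0] - X[:, 1] > 0.5).astype(int)

def symbolic_predict(X):
    # Candidate distilled formula extracted from the weights.
    return (2 * X[:, 0] - X[:, 1] > 0.5).astype(int)

def fidelity(X):
    # Fraction of inputs where the formula matches the network.
    return float(np.mean(net_predict(X) == symbolic_predict(X)))

X = np.random.default_rng(0).normal(size=(100, 2))
print(fidelity(X))  # 1.0 -> the formula fully reproduces the network
```

Only when fidelity is exact can you actually delete the model and keep the formula.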
The naming works through “concept arithmetic” – instead of just concatenating layer names, it traces every path back to the raw input features, sums the signed contributions, and cancels out opposing signals. So if two paths pull petal_length in opposite directions, it just disappears from the name rather than cluttering it.
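A rough sketch of what that cancellation could look like (the function name and data layout here are my own illustration, not the tool's internals): each path back to the inputs contributes a signed weight per raw feature, the contributions are summed, and features whose net contribution cancels to zero drop out of the name.

```python
from collections import defaultdict

# Hypothetical sketch of "concept arithmetic" (names are assumptions): sum the
# signed per-feature contributions of every path, drop features that cancel.

def name_concept(paths, eps=1e-9):
    # paths: list of dicts {feature_name: signed_contribution}, one per path.
    total = defaultdict(float)
    for path in paths:
        for feat, w in path.items():
            total[feat] += w
    parts = []
    for feat, w in sorted(total.items()):
        if abs(w) <= eps:
            continue  # opposing signals cancelled: feature disappears
        parts.append(("+" if w > 0 else "-") + feat)
    return " ".join(parts) or "neutral"

# Two paths pull petal_length in opposite directions, so it drops out:
paths = [
    {"petal_length": 0.8, "petal_width": 0.5},
    {"petal_length": -0.8, "sepal_length": 0.3},
]
print(name_concept(paths))  # "+petal_width +sepal_length"
```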
It also handles arbitrary interval granularity (low/mid/high, or finer splits like low/mid_low/mid/mid_high/high) without you having to manually name anything.
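The automatic labeling could be sketched like this (the labeling scheme below is an assumption inferred from the low/mid/high examples in the post, not SDHCE's actual code):

```python
# Hypothetical sketch of automatic interval naming (scheme is an assumption):
# 3 bins -> low/mid/high, 5 bins -> low/mid_low/mid/mid_high/high, with a
# generic numbered fallback for other granularities.

def interval_labels(n):
    named = {
        1: ["all"],
        2: ["low", "high"],
        3: ["low", "mid", "high"],
        5: ["low", "mid_low", "mid", "mid_high", "high"],
    }
    if n in named:
        return named[n]
    # Fallback: numbered bins between the extremes.
    return ["low"] + [f"bin_{i}" for i in range(2, n)] + ["high"]

print(interval_labels(3))  # ['low', 'mid', 'high']
print(interval_labels(5))  # ['low', 'mid_low', 'mid', 'mid_high', 'high']
```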
Tested on Iris so far – the 4-layer network distilled down to exactly 2 concepts that fully reproduced all predictions. The formula fits in a text file.
Code + analyses here: https://github.com/MateKobiashvili/SDHCE-and-analyses
Feedback welcome – especially on whether the concept naming holds up on messier datasets.
TL;DR: Tool that extracts a readable symbolic formula from a trained neural net, verifies it reproduces the network exactly, and lets you delete the model and keep just the formula.
submitted by /u/stron44