Here is llama.cpp with PrimeVHT2 and llama-turbo with PrimeVHT2. PrimeVHT2 is the basis of the algorithm used in the unreleased llama-turbo.
Hi, I’ll just leave this here for you guys to check out. It is llama.cpp with PrimeVHT2 integration, which is like TurboQuant except that it works and is better, reaching a maximum of 0.9987. One repo is pure llama.cpp with PrimeVHT2 and the other is llama-turbo with PrimeVHT2. PrimeVHT2 is the basis for the unreleased llama.cpp turbo algorithm.

https://github.com/nihilistau/llama-cpp-vht2

https://github.com/nihilistau/llama-PrimeVHT2

# PrimePE / Position_Is_Arithmetic — Session Context v3

## Date: April 5, 2026 | Updated: VHT2 banded compression validated + […]