NPG: New Random Generator, 3x Faster & Stronger than PCG64

The NumPy library in Python, like many other systems, relied for a long time on the Mersenne Twister PRNG (pseudo-random number generator). It was slow and did not mimic randomness well enough, failing some statistical tests. In addition, it could be cracked, raising security issues. It was recently replaced by PCG64, which addresses some of these issues. Nowadays, PCG64 is widely adopted.

Meanwhile, I have worked on this topic for over two decades. My efforts culminated in my new book “Breakthroughs on the Digit Distribution of Math Constants”, available for free, here (see review here). For a long time, I have used and built random generators on an industrial scale, in AI (alternatives to deep neural networks, xLLM), heavy simulations, and high-performance computing. I have also studied randomness since my PhD in math and my postdoc in computational statistics at the University of Cambridge. Some of my solutions are published in my GenAI book (Elsevier).

Introducing NPG

Today, I am happy to release NPG (non-periodic generator), surpassing predecessors on all fronts: speed, statistical randomness, and security. As a potential use, LLMs rely on deep neural networks, which rely on stochastic gradient descent and other probabilistic systems, all of them using random numbers on a massive scale, whether called deterministic or not. However, my PRNG can be used in any context, AI or not.

The secure version, NPGS, is suitable for cryptography: it is just as fast and as strong (statistically speaking) as the non-secure NPG aimed at simulations. Both work well in a parallel implementation without generating cross-correlated bit streams, unlike other PRNGs.

Non-periodic PRNGs (with a truly infinite period) are thought to be slow and crackable. This mainstream opinion comes from experts unfamiliar with the current state of alternative models. My NPG is not only much faster than periodic PRNGs, but it also beats them in statistical quality. The speed comes from using almost all the bits generated at any time, without loss due to modulo operations. In its simplest form (which works just as well), there are only 2 operations at each iteration: an addition and a bit shift, not even a multiplication. Hard to beat that!

Differences between NPG and standard PRNGs

Some of the main differences are as follows.

  • The state of NPG (in both the secure and non-secure versions) does not have a fixed size. As in many cryptographically secure PRNGs, it also depends on the iteration number. I built a wrapper around it so that the state size appears static to the end-user, hiding the complexity in the backend.
  • There is no external API call. NPG requires exact addition and multiplication for very large integers. I do it with an HPC library written in C, but it can be done from scratch with fast multiplication.
  • The seed is set to 1. There is no benefit in using a large seed. The end-user specifies some parameters when calling an instance of NPG; these parameters play the role of the seed, but they are a different component of the system. Of course, NPG leads to full replicability if you use the same parameters.
  • NPG is a byproduct of number theory research and thus benefits from proved results. Other PRNGs are created from scratch without knowing beforehand what could go wrong. In NPG, there are subtle patterns predicted by the theory; even if you ignore them, it still surpasses PCG64 and passes the tests. I removed them to further increase the quality, with no impact on speed.
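To make the point about exact arithmetic on very large integers concrete, here is a minimal sketch using Python's built-in arbitrary-precision `int`. It is not the NPG recurrence, which is not spelled out here; the values `a` and `b` and the helper `low_bits` are hypothetical names chosen for illustration only.

```python
# Illustration only: exact big-integer arithmetic with Python's built-in int.
# This is NOT the NPG algorithm; a, b and low_bits are hypothetical names.

def low_bits(x: int, k: int) -> str:
    """Return the k least-significant bits of x as a zero-padded binary string."""
    return format(x & ((1 << k) - 1), f"0{k}b")

a = 3 ** 200          # a very large integer, stored exactly
b = (a << 64) + a     # one bit shift and one addition, with no overflow or rounding

print(a.bit_length(), b.bit_length())
print(low_bits(b, 16))
```

Because nothing is reduced modulo a fixed word size, no bits are discarded by the arithmetic itself. A compiled big-integer library would typically be faster than pure Python for very large operands.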

Performance

Table 8.2 shows that NPG is about 3 times faster than PCG64. Here N is the number of iterations in NPG (producing about N^3/6 total bits), while a and b are the NPG substitutes for the seed.
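To get a feel for the N^3/6 estimate and for the PCG64 side of such a comparison, the sketch below converts N into an approximate bit count and times NumPy's PCG64 producing that many raw bits. The timing harness, the seed, and the function names are my assumptions for illustration; this is not the benchmark behind Table 8.2, and results vary by machine.

```python
import time
import numpy as np

def approx_total_bits(n: int) -> int:
    """Approximate bit count after n iterations, using the N^3/6 estimate."""
    return n ** 3 // 6

def time_pcg64(n_bits: int, seed: int = 12345) -> float:
    """Seconds taken by NumPy's PCG64 to produce at least n_bits raw bits."""
    bitgen = np.random.PCG64(seed)
    n_words = -(-n_bits // 64)      # ceiling division: 64 bits per raw word
    start = time.perf_counter()
    bitgen.random_raw(n_words)
    return time.perf_counter() - start

n = 1000
bits = approx_total_bits(n)
elapsed = time_pcg64(bits)
print(f"N={n}: about {bits:,} bits ({bits / 8e6:.1f} MB)")
print(f"PCG64 baseline: {elapsed:.4f} s")
```

For N = 1000 the estimate gives roughly 167 million bits, or about 21 MB of output.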

Table 8.1 shows the failures observed when testing the quality of the random bits. Only PCG64 and its upgraded version PCG64DXSM failed some tests, depending on the seed, the number of bits generated, and other elements.
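For readers who want to try a basic check on their own bit streams, here is a minimal sketch of the monobit (frequency) test, one of the simplest randomness tests. It is not one of the tests behind Table 8.1 and is far too weak to reproduce the failures reported there; the seed and sample size are arbitrary choices of mine.

```python
import math
import numpy as np

def monobit_z(bits: np.ndarray) -> float:
    """Z-score of the number of ones in a 0/1 array under the fair-coin hypothesis."""
    n = bits.size
    ones = int(bits.sum())
    return (2 * ones - n) / math.sqrt(n)

rng = np.random.Generator(np.random.PCG64(2024))
raw = rng.bytes(125_000)                                  # 1,000,000 bits
bits = np.unpackbits(np.frombuffer(raw, dtype=np.uint8))  # expand bytes into 0/1 values

z = monobit_z(bits)
p_value = math.erfc(abs(z) / math.sqrt(2))                # two-sided p-value
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
```

A very small p-value would suggest an imbalance between zeros and ones. Passing this test says little about quality; serious evaluation relies on full test batteries.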

The plots below show a specific example of PCG64 failure: the red bits. The blue bits come from NPG, the other colors from PCG64. Only the red sample is abnormal.

Final thoughts

NumPy recently upgraded its random generator environment. I thought it was just a cosmetic change, only to realize later that it now relies on PCG64, replacing the Mersenne Twister. You lose replicability on legacy code even if using the same seed. You may not be aware of these unannounced changes, thus the need for a solution you have full control over. NPG fills that gap, besides the other benefits: speed and better randomness.
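As a minimal sketch of how to see which generator your NumPy code actually uses, and how to name the algorithm explicitly instead of relying on a default, consider the following. The seed value is arbitrary; the calls shown are standard NumPy ones.

```python
import numpy as np

seed = 42

# Legacy interface: RandomState is built on the Mersenne Twister (MT19937).
legacy = np.random.RandomState(seed)

# Newer interface: default_rng picks the default bit generator for you.
modern = np.random.default_rng(seed)

# Explicit choice: the algorithm is named in the code, not left to a default.
pinned_pcg = np.random.Generator(np.random.PCG64(seed))

print(type(modern.bit_generator).__name__)   # which bit generator is in use
print(legacy.random_sample(3))               # legacy stream
print(modern.random(3))                      # default stream
print(pinned_pcg.random(3))                  # explicitly pinned stream
```

The legacy and default streams differ even with the same seed, so code moved from one interface to the other will not reproduce earlier results.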

To further discuss NPG and its benefits, or PCG64 failures, email me at vincent@bondingAI.io or contact me on LinkedIn, here. To avoid missing future announcements, sign up for my newsletter, here.

About the Author

Vincent Granville is a pioneering GenAI scientist, co-founder at BondingAI.io, the LLM 2.0 platform for hallucination-free, secure, in-house, lightning-fast Enterprise AI at scale with zero weight and no GPU. He is also an author (Elsevier, Wiley), publisher, and successful entrepreneur with a multi-million-dollar exit. Vincent’s past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. He completed a postdoc in computational statistics at the University of Cambridge.