Checking for Randomness: Replacing Test Batteries with a Single Test

In cybersecurity applications where replicability is critical, or when building pseudo-random number generators, it is typical to perform a large number of various tests to check if a sequence of bits is random enough for practical purposes. This is also true in scientific research, to assess whether or not the digits of π or other constants mimic randomness well enough, as discussed in my previous article, here.

Technically, one could use the Kolmogorov-Smirnov (KS) distance between two joint (multivariate) distributions: the empirical one computed on the digits in question versus a pure uniform distribution in high dimensions, assuming the digit sequence is long enough.  I implemented the multivariate KS distance in the context of data synthesis, to compare the full statistical distributions between the real data, and its synthetic version generated by NoGAN (one of the xLLM agents for synthetic data). See details here.  It is even adjusted for dimension. This is the only fully multivariate version on the market, avoiding the drawbacks caused by using combinations of univariate or bivariate tests. Also, it has its own Python library.

A Single Comprehensive Test of Randomness

Here I propose a simpler alternative to multivariate KS, to test for randomness in bit sequences. More specifically, the methodology tests whether a sequence of digits constitute a normal number, using a much simplified and faster version of Weyl’s criterion.

Convergence of orange and blue curves to zero indicates randomness in the sampled data

The new test does what is usually performed with a battery of randomness tests such as NIST or Dieharder, including frequency, run and spectral tests. Other non-standard tests are discussed in my book on chaos and dynamical systems, available here.

In this paper, I apply the new test to special irrational math constants whose digits are believed to behave randomly, empirically confirming the conjecture. Conversely, the test flags the digits of any rational number as non-random: after all, they all have a period, not compatible with randomness. The paper contains interesting mathematics related to Chebyshev polynomials and quadratic dynamical systems where the state space consists of matrices with eigenvalues on the unit circle.

Download your copy of the paper

Available as technical paper #62, here, with illustrations and Python code using the Gmpy2 library for high-performance computing. This paper is included in my new eBook about the digit distribution of math constants, available here. Links are clickable in the eBook (PDF). To no miss future articles, subscribe to my AI newsletter, here.

About the Author

Towards Better GenAI: 5 Major Issues, and How to Fix Them

Vincent Granville is a pioneering GenAI scientist, co-founder at BondingAI.io, the LLM 2.0 platform for hallucination-free, secure, in-house, lightning-fast Enterprise AI at scale with zero weight and no GPU. He is also author (Elsevier, Wiley), publisher, and successful entrepreneur with multi-million-dollar exit. Vincent’s past corporate experience includes Visa, Wells Fargo, eBay, NBC, Microsoft, and CNET. He completed a post-doc in computational statistics at University of Cambridge.

Liked Liked