Efficiently computing multiple modular inverses at once

Suppose you have a large prime number M and you need to find the inverse of several numbers mod M.  Montgomery’s trick is a way to combine the computation of the inverses to take less time than computing the inverses individually. Peter Montgomery (1947–2020) came up with this trick in 1985.

We will illustrate Montgomery’s trick by inverting three numbers—a, b, and c—though the trick extends to any number of numbers. It is commonly used in cryptography.

Modular inverses are much slower to calculate than modular products, so doing fewer of the former and more of the latter is a good tradeoff. Montgomery’s method only calculates one modular inverse, regardless of how many numbers need to be inverted.

The idea is to directly invert the product of all the numbers and use multiplication to find the inverses of the individual numbers. In our case, we compute

xab
y = cy = abc
x−1 = cy−1
b−1 = ax−1
a−1 = bx−1

To show that this actually saves time, we’ll run some Python code to invert three random numbers modulo a very large prime, much larger than occurs in practice. The reason is to make the computation time longer and easier to demonstrate. In practice, Montgomery’s trick saves a little time off of a lot of calculations. Here we’ll save a lot of time off a handful of calculations.

import sys
import time
from secrets import randbelow

# extend the default maximum integer size
sys.set_int_max_str_digits(100000)

# the 32nd Mersenne prime
M = 2**756839 - 1

def simple(a, b, c, M):
    return [pow(x, -1, M) for x in [a, b, c]]

def montgomery(a, b, c, M):
    x = a*b % M
    y = x*c % M
    yinv = pow(y, -1, M)
    cinv = x*yinv % M
    xinv = c*yinv % M
    binv = a*xinv % M
    ainv = b*xinv % M
    return [ainv, binv, cinv]
    
a = randbelow(M)
b = randbelow(M)
c = randbelow(M)

start = time.perf_counter()
result = simple(a, b, c, M)
elapsed = time.perf_counter() - start
print(elapsed)

start = time.perf_counter()
result = montgomery(a, b, c, M)
elapsed = time.perf_counter() - start
print(elapsed)

When we ran this, the direct approach took 121.8 seconds, and Montgomerty’s trick took 47.6 seconds.

Related posts

The post Efficiently computing multiple modular inverses at once first appeared on John D. Cook.

Liked Liked