| Modular numbers |
This document describes the use and implementation of the modulars within the numerix package.
Moduli are implemented in modulus.hpp. An object of type modulus is declared as follows:
modulus<C,V> p;
V is an implementation variant. The default variant, named modulus_naive, is implemented in modulus_naive.hpp and supports a type C with an Euclidean division. More variants are described below. Modulars are implemented in modular.hpp. An object of type modular can be used as follows:
modular<modulus<C, V>, W> x, y, z;
modular<modulus<C, V>, W>::set_modulus (p);
x = y * z;
The variant W is used to define the storage of the modulus. Default storage is modular_local that allows the current modulus to be changed at any time. The default modulus is 0. For the integer type the default variant is set to modulus_integer_naive, so that modular integers can be used as follows:
#include<numerix/integer.hpp>
#include<numerix/modular.hpp>
#include<numerix/modulus_integer_naive.hpp>
typedef modulus<integer> M;
M p (9973);
modular<M>::set_modulus (p);
modular<M> a (1), b (3);
c = a * b;
cout << c << "\n";
If the modulus is known to be fixed at compilation time then the following declaration is preferable for the sake of efficiency:
fixed_value <int, 3> w;
modulus<integer> p (w);
If the modulus is not intended to be changed and known at compilation time then the variant modular_fixed can be used.
From now on a div p and a mod p represent the quotient and the remainder in the long division of a by p respectively. Throughout this section C denotes a genuine integer C type. We write n for the bit-size of C, and p for a modulus that fits in C.
The default variant, modulus_int_naive<m>, is defined in modular_int.hpp. Each modular product performs one product and one integer division. The parameter m is the maximum bit-size of the modulus allowed by the variant. This parameter can be set to n. Best performances are obtained with m = n - 1. Note that we necessarily have m⩽n - 1 if C is a signed integer type. By convention the modulus 0 actually means 2m.
In the variant modulus_int_preinverse<m> a suitable inverse of the modulus is pre-computed and stored as an element of type C, this yields faster computations when doing several operations with the same modulus. The behavior of the parameter m is as in the preceding naive variant. The best performances are attained for when m⩽n - 2.
Within the implementation of this variant the following auxiliary quantities are needed:
By convention we let r = n if p is zero. Note that q<2n, hence fits in C.
Let a be an integer such that 0 ≤a<p2. The remainder a mod p can be obtained as follows:
Compute c = a - b p. From the definition of q we have:
0≤
| 2s + t |
| p |
It follows that 0≤c<p + a1p/2t + a0.
In general we can always take t = r and s = r - 1, so that h = 3. For when r + 1≤n, it is better to take t = r + 1 and s = r - 1 so that h≤2. Finally, for when r + 2≤n one can take t = r + 3 and s = r - 2 so that h≤1. These settings are summarized in the following table:
| r + 2≤n | r + 1≤n | r≤n | |
| s | r - 2 | r - 1 | r - 1 |
| t | r + 3 | r + 1 | r |
| h | 1 | 2 | 3 |
In these three cases remark that the integer a1q has size at most 2n.
It is clear that the case r + 2≤n leads to the fastest algorithm since step 3 reduces to one substraction/comparison that can be implemented as min(c,c - p) within type C alone. A special attention must be drawn to when r = 1, that corresponds to p = 2: we must suppose that s = r - 2 = - 1 is indeed computed within an unsigned int type, hence is sufficiently large to ensure a1 = 0 and c = a, hence the correctness of the algorithm.
Let us now turn to the computation of the numerical inverse
q of p. The strategy presented here consists in
performing one long division in C followed by one
Newton iteration. Let w = 2r -
1/p be the numerical inverse of p,
normalized so that 1/2 ≤w<1 holds. Let
x0 denote the initial approximation, and
x1 be its Newton iterate
x1 = 2x0 - p
x
/2r - 1. If 0≤w -
x0<2 - s0
for some positive integer s0 then it is
classical that 0≤w - x1<2
- s1 with s1 =
2s0 - 1.
2
0
Under the assumption that s0≤n/2, we can compute a suitable x0 as the floor part of 2s0w by the only use of operations within base type C. In other words we calculate q0 = 2s0 + r - 1 div p = 2s0x0 as follows:
The computation of q1 = 2s1 + r - 1 div p, that is the floor part of 2s1w, then continues as follows:
| 2 |
| 0 |
| 2 |
| 0 |
Finally, q is deduced in this way:
Let b be a modulo p integer that is to be involved in several modular products. Then one can compute β = 2rb div p < 2r just once and obtain a speed-up within the products by means of the follows algorithm:
The computation of β can be done as follows from q: