Computer Science Department – Rutgers University Spring 2018
CS 205: Diophantine Equations 16:198:205
In a variety of contexts, we may be interested in solving a system of equations with the additional restriction that the values of the variables are restricted to be integers. Consider as an example, the problem of finding integer solutions x, y to the following equation:
7x+23y = 10. (1)
In the usual approach, we might take the solution x = (10 − 23y)/7, but this does not guarantee that x will be an integer unless 10 − 23y is divisible by 7. The restriction to integer solutions adds an interesting wrinkle to the problem.
But consider the following approach – we can essentially reduce the problem by looking at arithmetic modulus some base. For any given value N, if the two sides are equal, we must also have
(7x+23y)≡10 modN. (2) Various choices of N can lead us to different observations. Consider, for instance, taking N = 2. In this case, since
7≡1 mod2and23≡1 mod2,itcanbeshownthat
(7x+23y)≡1x+1y≡x+y≡0 mod2. (3)
Thistellsusthatforanyintegersolution(x,y),x+ymustbecongruentto0 mod2,i.e.,x+ymustbeeven.This is information we didn’t have before, but it isn’t necessarily the most informative. We know that either x and y are both even, or x and y are both odd.
Can we find an N that gives us more information? Consider N = 7. In this case, we have (7x+23y)≡0x+2y≡2y≡3 mod7. (4)
We’ve reduced the original equation down to a single linear congruence in one variable: 2y ≡ 3 mod 7. What does this say about y? As in the proof of Fermat’s Little Theorem, we can make use of the idea of the function 2n mod 7 as a bijective map from the numbers S = {0, 1, 2, 3, 4, 5, 6} to itself. Because it is a bijection, it has an inverse and a moment or two of thought will suggest that the ’inverse’ value of this function for 3 is 5. That is, 3 ≡ 2 ∗ 5 mod 7. By the uniqueness of the inverse, the conclusion we must draw from this is that for any integer solution (x, y) to the original equation, we must have
y≡5 mod7. (5) This tells us something very specific about just y alone – y must yield a remainder of 5 when divided by 7, or for
some value of k in the integers, we have
y = 5 + 7k. (6) We can take a similar approach for the x term: taking N = 23, we have that
(7x+23y)≡7x+0y≡7x≡10 mod23. (7)
Leftwiththerelationthat7x≡10 mod23,whatcanwesayaboutx?Againwecanarguethatthemap7n mod23 is a bijective map from the set {0, 1, 2, . . . , 22} to itself, hence 10 has some unique inverse under this map. But how can we find it? Note at this point that we have, via Fermat’s Little Theorem, that
723−1 ≡ 1 mod 23. (8) 1
Computer Science Department – Rutgers University
Spring 2018
Massaging this a little, we can write it in the following way:
10∗7∗721 ≡10∗1 mod23,
or
7∗(10∗721)≡10 mod23.
. Comparing this with the original congruence we are interested in, we see that
x ≡ 10 ∗ 721 mod 23
(9)
(10)
(11)
will satisfy the congruence 7x ≡ 10 mod 23. But what does this say about x? We can either compute it directly and take the remainder when divided by 23, or do some kind of computational simplification or expansion like the following (so as to be able to do it with pencil and paper if needed):
x ≡ 10 ∗ 720 ∗ 7 ≡ 10∗4910 ∗7 ≡ 10∗310 ∗7 ≡ 10∗39 ∗3∗7 ≡ 30∗273 ∗7 ≡ 7 ∗ 43 ∗ 7
≡ 49 ∗ 43
≡ 3 ∗ 43
≡ 12 ∗ 16
≡ 24 ∗ 8 ≡1∗8
≡8 mod23
(12)
Hence we’ve arrived at x ≡ 8 mod 23, which checks out, as 7∗8 = 10+2∗23. At this point, we have arrived at the following conclusion: For any integer solution (x, y) to the original equation, we must have
y = 5 + 7k
x = 8+23m.
y≡5 mod7 x≡8 mod23.
(13) We can also think of this as specifying the ‘form’ of x and y relative to the given bases – in particular, for some
integers k, m, we have that
But we have arrived here by effectively decoupling the two variables – we started with a relationship between both variables together, and reduced it to relationships involving a single variable at a time. What values of k and m are allowed? At this point, we can return to the original equation to re-couple to two variables:
10 = 7x+23y
= 7(8 + 23m) + 23(5 + 7k) = 56 + 161m + 115 + 161k = 171 + 161(m + k),
(15)
(14)
2
Computer Science Department – Rutgers University Spring 2018
or, simplifying:
m + k = −1. (16) That is, for x and y of those forms, any choice of m,k where m+k = −1 will suffice. Taking for instance k = −1−m,
we have:
or
y = 5+7(−1−m) x = 8+23m,
y = −2 − 7m x = 8+23m,
(17)
(18)
For any integer m, the values
are integer solutions to the equation 7x + 23y = 10. Further, it can be argued that any solution must be of this form, by the uniqueness of the inverses used to solve the linear congruences.
Taking, as an example, m = 0, we get x = 8, y = −2, and 7 ∗ 8 − 2 ∗ 23 = 56 − 46 = 10, verifying this as a solution.
Generalizing
Let P, Q be prime numbers, and M some integer. We want to derive integer solutions to the equation
P x + Qy = M. (19)
From this, we can decouple the original equation into two congruences: Px≡M modQ
(20)
(21)
(22)
(23)
(24)
We have, by Fermat’s Little Theorem that
and hence that
or
We get from this, and the uniqueness of inverses (mod Q,P) that we must have
x ≡ M ∗ P Q−2 mod Q y ≡ M ∗ QP −2 mod P.
Qy≡M modP
PQ−1≡1 modQ QP−1≡1 modP,
M∗P∗PQ−2≡M∗1 modQ M∗Q∗QP−2≡M∗1 modP,
P[M∗PQ−2]≡M modQ Q[M∗QP−2]≡M modP.
3
Computer Science Department – Rutgers University Spring 2018
Note, with actual values for P,Q,M, we could reduce the above by calculating the actual remainders when divided by P, Q. But nevertheless the congruences must hold. We get then that for some integers k, m, we have
M = P x + Qy
= P (M ∗ P Q−2 + kQ) + Q(M ∗ QP −2 + mP ) = M ∗ P Q−1 + P Qk + M ∗ QP −1 P Qm
= M[PQ−1 + QP−1] + PQ[k + m].
a relationship between k and m:
We can then solve this effectively to get a simple relationship between k and m, namely 1 − [PQ−1 + QP−1]
x = M ∗ P Q−2 + kQ y = M ∗ QP −2 + mP.
(25) However again, we have decoupled the variables from their original relationship. Substituting back in, we can derive
k+m=M PQ . (27)
The above is interesting, because while we know that k and m must be integers (and therefore their sum must be an integer), how can we be certain that for any given value of P or Q, the fraction above divides cleanly? The picture becomes a little more clear noting the following algebraic relationship:
1−PQ−1 +QP−1=PQ−1 −1QP−1 −1−PQ−1QP−1. (28)
Noting that in the above, QP −1 − 1 must be divisible by P (why?) and that P Q−1 − 1 must be divisible by Q (why?), we get that 1 − (P Q−1 + QP −1) must be divisible by P Q. Hence, the formula for k + m above yields an integer. Solving for m in terms of k, and plugging in to the above formulas for x and y, we get that
(26)
For P,Q prime and integer M, any integer solutions x,y to P x + Qy = M
must satisfy for some integer k,
x = M ∗ P Q−2 + kQ PQ−1 −1
y = −M ∗ Q Again, it is worth noting that P Q−1 − 1 must be divisible by Q.
(29)
(30)
− P k.
This gives us a nice explicit parameterization of integer solutions in terms of an integer parameter m. Note, the values above may be quite large (depending on P, Q) but can be potentially reduced by computing the mods explicitly in Eq. (24).
Generalizing Further Still
One way of synthesizing the above is the following:
4
Computer Science Department – Rutgers University
Spring 2018
For any prime P, the congruence
is uniquely solved by
Ax≡D modP
x ≡ DAP−2 mod P.
(31) (32)
IfA̸≡0 modP,thenthissolutionisunique.
This utilizes Fermat’s Little Theorem to generate the multiplicative inverse of A mod P, multiplying it by D to generate the modular equivalent of D/A. Above, this generated the initial congruences for x and y, and by plugging these into the original equation again, we were able to solve for the full integer solutions. The thing that stops the previous solution from generalizing completely to equations like
Ax + By = C (33) is the possibility that A, B are not primes, and hence Fermat’s Little Theorem does not hold (directly). Hence, it is
worth considering congruences of the form
Ax≡1 modN (34) for non-prime bases N. If this can be solved for x, we can generate solutions to the more general form needed to
engineer solutions as above.
For any integers A,N, let x0 be a solution to
Ax0≡1 modN (35) if such a solution exists. In that case, the congruence
is uniquely solved by
Ax≡D modN (36)
x ≡ Dx0 mod N. (37)
Some experimentation will very quickly reveal that not all congruences of this form are solvable at all.
2x≡1 mod4 (38)
hasnosolutions,sinceifxiseven,i.e.,x=2k,then2x=4k≡0 mod4,andifxisodd,i.e.,x=2k+1,then 2x=4k+2≡2 mod4. Hencethereisnowaytogetamodulusof1inthisinstance.
But some are solvable:
5x≡1 mod6 (39)
hasarelativelyimmediatesolutionofx≡5 mod6,since5∗5=25=1+4∗6.Sowhatdifferentiatesthetwocases? It is worth unpacking this into the ‘form’ notion of modular congruences. In the two cases above, we are asserting that there are integers k and m such that
1. Since x − 2k must be an integer, this implies that 2 evenly divides 1, which it does not. There are no integer 5
2x − 4k = 1 5x − 6m = 1.
(40) In the first case, it quickly becomes clear where the problem is: we can factor out a 2 yielding the equation 2(x−2k) =
Computer Science Department – Rutgers University Spring 2018
solutions x,k that could possibly satisfy this equation. This generalizes immediately: if there is any shared (non- trivial) factor between A and N , then the equation Ax − N k = 1 can have no integer solutions for x, k.
IfGCD(A,N)̸=1,thenAx≡1 modNhasnointegersolutions.
In this second case, a quick eyeballing reveals a solution at x = −1,m = −1, which corresponds to the solution above,since−1≡5 mod6.Inthiscase,wehavethatGCD(5,6)=1,inwhichcasetheaboveresultdoesnotapply. Can anything be concluded in this case, generally? Yes.
IfGCD(A,N)=1,thenAx≡1 modNhasintegersolutions.
Consider the equation 5x − 6m = 1. Previously, we would knock out the m-variable term by taking mods with respect to 6 to cancel that term. But in this case, that leads back to the problem we are trying to solve. So consider taking mods relative to a divisor of 6. In particular, we have
1≡(5x−6m) mod2
≡1∗x−0∗m mod2 (41) ≡x mod2.
Hence any such x must satisfy x ≡ 1 mod 2, i.e., x must be odd. Additionally, we have
1≡(5x−6m) mod3
≡2∗x−0∗m mod3 (42) ≡ 2x mod 3.
And we can use the previous solution to solve this via x ≡ 23−2 ≡ 2 mod 3. Hence, we have the following decomposition based on the factorization 6 = 2 ∗ 3:
x = 2+3(2j +1) = 5+6j, (44) 5x≡1 mod6 =⇒ x≡5 mod6, (45)
k = 2j + 1 for some j. Hence, we have
or
x≡1 mod2 x≡2 mod3.
(43) Wecanrecombinetheseintoasolutionforx,notingthatifx=2+3kforsomek,thenx≡0+1k≡1 mod2,or
precisely the result found by inspection.
But how can we generalize this? Let inv(A,N) be the multiplicative inverse of A mod N, i.e., if it exists,
A∗inv(A,N)≡1 modN. (46)
In the case that it exists, what can be said about it? Unpacking the above congruence, we have effectively that for some integer k,
A ∗ inv(A, N ) + N ∗ k = 1. (47) Our standard trick at this point is to reduce the equation by taking mods with respect to one of the coefficients. In
this case, modding with respect to N recovers the original congruence we wish to solve. The alternative then is to
6
Computer Science Department – Rutgers University Spring 2018
mod with respect to A, i.e.
we recover from this that
or
inv(A,N)=1−N∗(inv(N modA,A)+Aj)=1−N∗inv(N modA,A)−Nj (51) AA
1≡(A∗inv(A,N)+N∗k) modA
≡N∗k modA (48) ≡(N modA)∗k modA,
where N mod A is taken to be the remainder when N is divided by A. We get from the above that k must be congruent to inv(N mod A, A). Taking k to be of the form
k=inv(N modA,A)+Aj, (49)
1 = A ∗ inv(A, N ) + N ∗ k =A∗inv(A,N)+N∗(inv(N modA,A)+Aj),
(50)
Note, in the above, (1 − N ∗ inv(N mod A)) must be divisible by A since
N∗inv(N modA,A)≡N∗inv(N,A)≡1 modA. (52)
The above yields the following recursive relationship for the inverse function:
inv(A,N)≡1−N∗inv(N modA,A) modN. (53)
A
Since N mod A must satisfy 0 < N mod A < A, this in some sense represents a reduction of the original problem to a ‘strictly smaller’ problem - going from parameters (A,N) to parameters (N mod A,A). In this way, we may repeatedly reduce the problem to one more easily solved. A base case to work from here is the simple case that if A=1,thenAx≡1 modN issolvedbyx=1,i.e,
inv(1,N)≡1 modN. (54) Applying this to the previous example of 5x ≡ 1 mod 6, we have
inv(5,6)≡1−6∗inv(6 mod5,5) mod6 5
≡1−6∗inv(1,5) mod6 5
≡1−6∗1 mod6 5
≡−5 mod6 5
≡−1 mod6 ≡5 mod6.
(55)
This recovers the result indicated previously, but with a more general computational approach that can be applied more broadly.
7
Computer Science Department - Rutgers University
Spring 2018
If GCD(A,N) = 1, then the congruence Ax ≡ 1 mod N is solved by x≡inv(A,N) modN,
(56)
(57)
mod N, and in the
where 1
inv(A,N) = 1−N∗inv(N A
mod A,A)
ifA≡1 modN modN ifA̸≡1 modN
mod N.
This, combined with the previous commentary, will generally allow you to solve congruences same way as in the previous section, solve linear equations in integer variables.
GCD and Euclidean Algorithm
There remain two mathematical questions to process, or three depending on how motivated you are. • How can you determine whether or not the GCD of A and N is 1?
• How can you be sure that this process for computing the inverse, repeatedly reducing the arguments via mods, will eventually terminate in the base case, i.e., A = 1?
• What is the big-O of this process?
As it turns out, all three of these questions can be addressed essentially at the same time, via the Euclidean
Algorithm.
Given integers A,N, the greatest common divisor of A and N may be calculated via
N GCD(A, N ) =
if A = 0 else.
section - these two algorithms are really performing different versions of the same computation.
In the case of GCD(5, 6), we get GCD(5, 6) = GCD(1, 5) = GCD(0, 1) = 1. For larger numbers, we have that GCD(15, 21) = GCD(6, 15) = GCD(3, 6) = GCD(0, 3) = 3. The algorithm essentially hinges on the idea that this process of repeatedly reducing via mods terminates at the GCD of the two numbers. In this way, we get immediately that the previous algorithm for computing inv will terminate with a base case of A = 1 in the case that the GCD is 1, because this process terminates at the GCD.
But why is this so? Consider expressing N as
N=A∗k+(N modA), (59)
GCD(N
Note, this mimics the same ‘argument reduction via mods’ approach as taken in computing the inverse in the previous
for some integer k. In this case, we see that
mod A, A)
N −A∗k = (N mod A), (60)
(58)
8
Computer Science Department - Rutgers University Spring 2018
in which case we get relatively immediately that any common factor of N and A must divide N mod A - in particular, the greatest common divisor of N and A must divide (N mod A), and further still
GCD(A,N) divides (N mod A) and A =⇒ GCD(A,N) divides GCD(N mod A,A). (61) Similarly however, since
A∗k+(N modA)=N,
we must get that any common divisor of A and N mod A must divide N - in particular, the greatest common divisor
of A and (N mod A) must divide N. As above,
GCD(N mod A, A) divides N and A =⇒ GCD(N mod A, A) divides GCD(A, N ). (63)
If X divides Y and Y divides X it must be (up to sign differences) that X = Y , i.e.,
GCD(N mod A,A) divides GCD(A,N). (64)
The only exception to this is the base case when A = 0, since we cannot divide by 0 and hence N mod A will not exist.
This repeated mod process therefore must terminate (since the size of the arguments is being reduced each time) and must terminate in the GCD of both of the original numbers. If the GCD is 1, then the process for computing the inverse will terminate as desired.
This addresses the first two questions above. For the third, the big-O of this process, the overall computational complexity is governed by the length of this sequence of mods. Once the sequence is computed, computing the inverse is simply a matter of evaluating a constant number of operations for each mod in the sequence.
How long can this sequence run before terminating? This is actually an incredibly fast process, as the mod operation rapidly reduces the size of the relevant arguments. In fact, you can argue that N mod A ≤ N/2: if A ≥ N/2, thenN modA=N−A≤N/2. IfA