COMPLETE SOLUTION OF THE DIOPHANTINE
EQUATION $X^2 + 1 = dY^4$ AND A RELATED FAMILY OF
QUARTIC THUE EQUATIONS

Chen Jian Hua
The Electric Power Testing and Research Institute of Hubei Province
Wuhan 430077, P. R. China

AND

Paul M. Voutier
Department of Mathematics, City University
Northampton Square, London EC1V 0HB, UK

arXiv:1401.5450v1 [math.NT] 21 Jan 2014

In this paper, we use the method of Thue and Siegel, based on explicit
Padé approximations to algebraic functions, to completely solve a family
of quartic Thue equations. From this result, we can also solve the
diophantine equation in the title. We prove that this equation has at
most one solution in positive integers when $d \geq 3$. Moreover, when such
a solution exists, it is of the form $(u, \sqrt{v})$ where $(u,v)$ is the
fundamental solution of $X^2 + 1 = dY^2$.

1. Introduction
The goal of this paper is the solution of the diophantine equation
\[ X^2 + 1 = dY^4. \tag{1.1} \]
Many people have studied this equation throughout the history of
diophantine equations. Previously, the best known result was
Theorem A (Ljunggren [8]). If the fundamental unit of the quadratic field
$\mathbb{Q}(\sqrt{d})$ is not also the fundamental unit of the ring
$\mathbb{Z}[\sqrt{d}]$, then equation (1.1) has at most two possible
solutions in positive integers, and these can be found by a finite algorithm.
His proof is exceedingly complicated (see Theorem 10 and the surrounding
text on p. 271 of [9]).
In another article [3], the first author proved that, when $d > 0$ is large
enough, equation (1.1) has at most one solution in positive integers. More
precisely, we have
Theorem B (Chen [3]). Let $d > 0$ and put $\epsilon = u + v\sqrt{d}$ where
$(u,v)$ is the least positive solution of the Pell equation
\[ X^2 + 1 = dY^2. \]
If $\epsilon > 5 \times 10^7$, then the equation (1.1) has at most one
possible solution in positive integers. Moreover, if $v = k^2 l$ where $l$ is
square-free, then this solution, if it exists, is given by
\[ x + y^2\sqrt{d} = \epsilon^{l}. \]
Theorem B was proved by making use of the theory of linear forms in
logarithms of algebraic numbers and some algebraic number theory.
Other special cases of (1.1) are known. For $d = 2$, Ljunggren has shown
that (1.1) has precisely two solutions, $(1,1)$ and $(239,13)$, in positive
integers. Cohn has also contributed to the study of these equations; see
[5, Theorem 7], for example.
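As a quick numerical illustration of the case $d = 2$ (not part of any proof), the following sketch searches a finite range of $y$ for solutions of (1.1). The function name and the cut-off `y_max` are our own choices; a finite search can only confirm the two known solutions, not exclude larger ones.

```python
from math import isqrt

def solutions_x2_plus_1_eq_dy4(d, y_max):
    """Return all (x, y) with 1 <= y <= y_max, x >= 1 and x*x + 1 == d * y**4."""
    found = []
    for y in range(1, y_max + 1):
        target = d * y**4 - 1          # this must be a perfect square x^2
        x = isqrt(target)
        if x > 0 and x * x == target:
            found.append((x, y))
    return found

# Search y <= 1000 for d = 2 (the bound 1000 is arbitrary).
print(solutions_x2_plus_1_eq_dy4(2, 1000))
```

Within this range the search returns only the two pairs named above, since $239^2 + 1 = 57122 = 2 \cdot 13^4$.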
In [2], the first author used the method of Thue and Siegel to give a new
proof that $(1,1)$ and $(239,13)$ are the only solutions in positive integers
of (1.1) for $d = 2$. In that paper, Padé approximations were used to find an
effective improvement of Liouville's theorem which was, in turn, used to
solve $X^2 + 1 = 2Y^4$ via a Thue equation.
Here we extend the method of [2]. We will establish an effective
improvement of Liouville's theorem on the measure of irrationality for certain
algebraic numbers of degree 4, and then use such a result to solve a family
of Thue equations. In this way, we will prove:
Theorem C. Let $t \geq 1$ be a rational integer with $t \neq 3$. Consider the
Thue equation
\[ P_t(X,Y) = X^4 - tX^3Y - 6X^2Y^2 + tXY^3 + Y^4 = \pm 1. \tag{1.2} \]
For $t = 1$, if $(x,y)$ is an integer solution of (1.2) then
\[ (x,y) \in \{ (-2,1), (-1,-2), (-1,0), (0,\pm 1), (1,0), (1,2), (2,-1) \}. \]
For $t = 4$, if $(x,y)$ is an integer solution of (1.2) then
\[ (x,y) \in \{ (-3,2), (-2,-3), (-1,0), (0,\pm 1), (1,0), (2,3), (3,-2) \}. \]
For $t = 2$ or $t \geq 5$, if $(x,y)$ is an integer solution of (1.2) then
$(x,y) = (\pm 1,0)$ or $(0,\pm 1)$.
We exclude $t = 3$ since $P_3(X,Y) = (X^2 + XY - Y^2)(X^2 - 4XY - Y^2)$
and hence these Thue equations are quite easy to solve.
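The statement of Theorem C and the factorisation of $P_3$ can be spot-checked by direct evaluation. The sketch below is only a finite search in a box $|x|, |y| \leq 30$ (the bound and all function names are our own choices), so it illustrates the theorem rather than proving anything.

```python
def P(t, x, y):
    """Evaluate the quartic form P_t(x, y) of equation (1.2)."""
    return x**4 - t * x**3 * y - 6 * x**2 * y**2 + t * x * y**3 + y**4

def small_solutions(t, bound):
    """All integer (x, y) with |x|, |y| <= bound and P_t(x, y) = +1 or -1."""
    rng = range(-bound, bound + 1)
    return sorted((x, y) for x in rng for y in rng if abs(P(t, x, y)) == 1)

def P3_factored(x, y):
    """The product (x^2 + xy - y^2)(x^2 - 4xy - y^2), claimed to equal P_3."""
    return (x * x + x * y - y * y) * (x * x - 4 * x * y - y * y)

for t in (1, 2, 4, 5):
    print(t, small_solutions(t, 30))

# The factorisation of P_3 agrees with the quartic form on a small grid.
print(all(P(3, x, y) == P3_factored(x, y)
          for x in range(-10, 11) for y in range(-10, 11)))
```

Since $P_t$ is homogeneous of even degree, solutions occur in pairs $(x,y)$, $(-x,-y)$, which is visible in the printed lists.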
Finally, in Section 4, we use this result to prove the following one which
concerns (1.1).
Theorem D. Let $(u,v)$ be the fundamental solution of the Pell equation
$X^2 + 1 = dY^2$. If $d \geq 3$, then the equation (1.1) has at most one
solution in positive integers. If this solution $(x,y)$ exists, we have
$v = y^2$.
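Theorem D reduces (1.1) to a test on the fundamental solution of $X^2 + 1 = dY^2$. A minimal sketch of that test follows, assuming a naive search for the fundamental solution; the function names and the cut-off `y_max` are hypothetical, and a result of `None` when the cut-off is reached is inconclusive.

```python
from math import isqrt

def fundamental_negative_pell(d, y_max=10**6):
    """Smallest positive (u, v) with u*u + 1 == d*v*v, or None if none has v <= y_max.

    Plain search over v; y_max is an arbitrary cut-off chosen for this sketch.
    """
    for v in range(1, y_max + 1):
        target = d * v * v - 1
        u = isqrt(target)
        if u > 0 and u * u == target:
            return (u, v)
    return None

def quartic_solution(d, y_max=10**6):
    """For d >= 3: the solution (x, y) of x^2 + 1 = d*y^4 given by Theorem D, if any."""
    sol = fundamental_negative_pell(d, y_max)
    if sol is None:
        return None                    # no fundamental solution found below the cut-off
    u, v = sol
    y = isqrt(v)
    return (u, y) if y * y == v else None   # a solution exists only if v is a square

for d in (5, 13, 53):
    print(d, fundamental_negative_pell(d), quartic_solution(d))
```

For example, $d = 53$ has fundamental solution $(182, 25)$, and $25 = 5^2$ gives $182^2 + 1 = 53 \cdot 5^4$, whereas for $d = 13$ the fundamental solution $(18, 5)$ has $v = 5$, which is not a square.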
We note that Lettl and Pethő [7] have independently proved Theorem C
using lower bounds for linear forms in two logarithms in the same manner
as Thomas [13]. Actually, they also consider the case when the constant
is $\pm 4$, but this can be reduced to studying the Thue equations above (see
Lemma 1 of [7]).
As our proof is completely different, based on the explicit construction
of “good” rational approximations to certain algebraic numbers by hyper-
geometric methods, and this appears to be the first time that such methods
are used to solve a family of Thue equations, we feel that there is reason to
present our own proof here.
2. Preliminaries
We start with some notation.
Notation. For positive integers $n$ and $r$, we put
\[ X_{n,r}(X) = {}_2F_1(-r, -r-1/n, 1-1/n, X), \]
where ${}_2F_1$ denotes the classical hypergeometric function.
We use $X^*_{n,r}$ to denote the homogeneous polynomials derived from these
polynomials, so that
\[ X^*_{n,r}(X,Y) = Y^r X_{n,r}(X/Y). \]
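Because the first parameter $-r$ is a non-positive integer, the hypergeometric series terminates and $X_{n,r}$ is a polynomial of degree $r$ with rational coefficients. The following sketch computes these coefficients exactly; the function names are ours, and we assume $n \geq 2$ so that the parameter $1 - 1/n$ is non-zero.

```python
from fractions import Fraction

def X_coeffs(n, r):
    """Coefficients c_0..c_r of X_{n,r}(X) = 2F1(-r, -r - 1/n; 1 - 1/n; X), n >= 2."""
    a, b, c = Fraction(-r), Fraction(-r) - Fraction(1, n), 1 - Fraction(1, n)
    coeffs = [Fraction(1)]
    for k in range(r):
        # ratio of consecutive terms of the hypergeometric series
        coeffs.append(coeffs[-1] * (a + k) * (b + k) / ((c + k) * (k + 1)))
    return coeffs

def X_star(n, r, x, y):
    """Homogeneous form X*_{n,r}(x, y) = y^r X_{n,r}(x/y), evaluated without division."""
    return sum(ck * x**k * y**(r - k) for k, ck in enumerate(X_coeffs(n, r)))

print(X_coeffs(4, 1))       # coefficients of 1 + (5/3) X
print(X_coeffs(4, 2))       # coefficients of 1 + 6 X + (15/7) X^2
print(X_star(4, 1, 2, 3))   # 3 + (5/3)*2 = 19/3
```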
We now present the following important lemma of Thue.
Lemma 2.1 (Thue [14]). Let $P(X)$ be a polynomial of degree $n \geq 2$ and
assume that there is a quadratic polynomial $U(X)$ with non-zero discriminant
such that
\[ U(X)P''(X) - (n-1)U'(X)P'(X) + \frac{n(n-1)}{2}U''(X)P(X) = 0. \tag{2.1} \]
We write
\[ Y_1(X) = 2U(X)P'(X) - nU'(X)P(X), \]
\[ h = \frac{n^2-1}{4}\left( U'(X)^2 - 2U(X)U''(X) \right) \quad\text{and}\quad \lambda = \frac{h}{n^2-1}. \]
Let us define two families of polynomials $A_r(X)$ and $B_r(X)$ by the initial
conditions
\[ A_0(X) = \frac{2h}{3}, \qquad A_1(X) = \frac{2(n+1)}{3}\left( U(X)P'(X) - \frac{n-1}{2}U'(X)P(X) \right), \]
\[ B_0(X) = \frac{2hX}{3}, \qquad B_1(X) = XA_1(X) - \frac{2(n+1)U(X)P(X)}{3}, \]
and, for $r \geq 1$, by the recurrence equations
\[ \lambda(n(r+1)-1)A_{r+1}(X) = \left( r+\frac{1}{2} \right) Y_1(X)A_r(X) - (nr+1)P^2(X)A_{r-1}(X), \]
\[ \lambda(n(r+1)-1)B_{r+1}(X) = \left( r+\frac{1}{2} \right) Y_1(X)B_r(X) - (nr+1)P^2(X)B_{r-1}(X). \tag{2.2} \]
(i) For any root $\beta$ of $P(X)$,
\[ \beta A_r(X) - B_r(X) = C_r(X) = (X-\beta)^{2r+1}S_r(X), \]
where $S_r(X)$ is a polynomial.
(ii) Put
\[ z(X) = \frac{1}{2}\left( \frac{Y_1(X)}{2n\sqrt{\lambda}} + P(X) \right), \quad u(X) = \frac{1}{2}\left( \frac{Y_1(X)}{2n\sqrt{\lambda}} - P(X) \right) \quad\text{and}\quad w(X) = \frac{z(X)}{u(X)}. \]
Then
\[ (\sqrt{\lambda})^r A_r(X) = a(X)X^*_{n,r}(z,u) - b(X)X^*_{n,r}(u,z) \quad\text{and} \]
\[ (\sqrt{\lambda})^r B_r(X) = c(X)X^*_{n,r}(z,u) - d(X)X^*_{n,r}(u,z), \]
where
\[ a(X) = \frac{(n-1)\sqrt{\lambda}}{2P(X)} A_1(X) - \left( \frac{Y_1(X)}{4\sqrt{\lambda}P(X)} - \frac{1}{2} \right) A_0(X), \]
\[ b(X) = \frac{(n-1)\sqrt{\lambda}}{2P(X)} A_1(X) - \left( \frac{Y_1(X)}{4\sqrt{\lambda}P(X)} + \frac{1}{2} \right) A_0(X), \]
\[ c(X) = \frac{(n-1)\sqrt{\lambda}}{2P(X)} B_1(X) - \left( \frac{Y_1(X)}{4\sqrt{\lambda}P(X)} - \frac{1}{2} \right) B_0(X) \quad\text{and} \]
\[ d(X) = \frac{(n-1)\sqrt{\lambda}}{2P(X)} B_1(X) - \left( \frac{Y_1(X)}{4\sqrt{\lambda}P(X)} + \frac{1}{2} \right) B_0(X). \]
These results can be found in Thue [14, Theorem and equations 35–47] or
Chudnovsky [4] (see, in particular, Lemma 7.1 and the remarks that follow
(pages 364–366)).
We have added two extra hypotheses, requiring that the degree of $P(X)$
be at least two and that the discriminant of $U(X)$ be non-zero. Since $h$ is
equal to $(n^2-1)/4$ times the discriminant of $U(X)$ and since $\lambda$ is a
multiple of $h$, we need these conditions to ensure that we do not fall into
degenerate cases with the $A_i$'s and $B_i$'s.
Also notice that there are some differences in notation between the
lemma above, which is similar to Chudnovsky's [4], and that of Thue. In
particular, here, Thue's $\alpha$ and $F$ are replaced by $\beta$ and $P$,
$n$ and $r$ are switched from [14], our $A_i$'s and $B_i$'s are $2(n-1)/3$
times Thue's and we label Thue's $U_r(z,y)$ as $X^*_{n,r}(z,u)$. Also, what
we call $Y_1(X)$ and $u(X)$ correspond to $2H(x)$ and $y(x)$, respectively,
in Thue's paper.
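To make the hypotheses of Lemma 2.1 concrete, here is a minimal sketch that checks the differential equation (2.1) exactly and part (i) numerically for $r = 1$. The choices $n = 4$, $P(X) = X^4 - tX^3 - 6X^2 + tX + 1$ with $t = 1$, and $U(X) = X^2 + 1$ are our own assumptions for illustration, as are all helper names; polynomials are stored as ascending coefficient lists.

```python
from fractions import Fraction

# Polynomials are coefficient lists in ascending order: [c0, c1, ...].
def padd(p, q):
    m = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(m)]

def pscale(c, p):
    return [c * a for a in p]

def pmul(p, q):
    out = [0] * max(len(p) + len(q) - 1, 0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def pderiv(p):
    return [i * a for i, a in enumerate(p)][1:]

def peval(p, x):
    acc = 0
    for a in reversed(p):          # Horner's rule
        acc = acc * x + a
    return acc

n, t = 4, 1
P = [1, t, -6, -t, 1]              # X^4 - t X^3 - 6 X^2 + t X + 1
U = [1, 0, 1]                      # X^2 + 1 (assumed choice for this sketch)
dP, dU = pderiv(P), pderiv(U)
ddP, ddU = pderiv(dP), pderiv(dU)

# Left-hand side of (2.1); it should be the zero polynomial.
lhs = padd(padd(pmul(U, ddP), pscale(-(n - 1), pmul(dU, dP))),
           pscale(Fraction(n * (n - 1), 2), pmul(ddU, P)))

# h is a constant: (n^2 - 1)/4 times the discriminant of U.
h = pscale(Fraction(n * n - 1, 4),
           padd(pmul(dU, dU), pscale(-2, pmul(U, ddU))))[0]
lam = h / (n * n - 1)

# A_1 and B_1 from the initial conditions of the lemma.
k = Fraction(2 * (n + 1), 3)
A1 = pscale(k, padd(pmul(U, dP), pscale(-Fraction(n - 1, 2), pmul(dU, P))))
B1 = padd(pmul([0, 1], A1), pscale(-k, pmul(U, P)))

# A root beta of P in (0, 1), by bisection (P(0) > 0 > P(1) when t = 1).
lo, hi = 0.0, 1.0
for _ in range(80):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if peval(P, mid) > 0 else (lo, mid)
beta = (lo + hi) / 2

# C_1(X) = beta*A_1(X) - B_1(X) should vanish to order 2r + 1 = 3 at X = beta.
C1 = padd(pscale(beta, A1), pscale(-1, B1))
residuals = [abs(peval(c, beta)) for c in (C1, pderiv(C1), pderiv(pderiv(C1)))]
print(lhs, h, lam, residuals)
```

With these choices $h = -15$ and $\lambda = -1$, so $\sqrt{\lambda}$ is imaginary; the three residuals are zero up to floating-point rounding.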
We now give two lemmas which will give us the remainder of our
approximations in a nice form that can be easily bounded from above.
Lemma 2.2. Let $m$ and $n$ be positive integers and suppose $\alpha$ is a real
number. Define
\[ p_m(X) = \sum_{\nu=0}^{m} \binom{m-\alpha}{m-\nu}\binom{n+\alpha}{\nu} X^{\nu} \quad\text{and}\quad q_n(X) = \sum_{\nu=0}^{n} \binom{m-\alpha}{\nu}\binom{n+\alpha}{n-\nu} X^{\nu}. \]
Given a complex number $x$, we let $C$ denote the straight line from $1$ to
$x$. If $0$ is not on $C$, then
\[ x^{\alpha}q_n(x) - p_m(x) = \alpha \binom{m-\alpha}{m}\binom{n+\alpha}{n} \int_C (t-x)^m (1-t)^n t^{\alpha-m-1}\,dt. \]
Proof. Put
\[ r(x) = x^{\alpha}q_n(x) - p_m(x). \]
It is a routine matter to verify that, when $k = 0,1,\ldots,m$,
\[ \frac{r^{(k)}(x)}{k!} = \sum_{\nu=0}^{n} \binom{m-\alpha}{\nu}\binom{n+\alpha}{n-\nu}\binom{\nu+\alpha}{k} x^{\alpha+\nu-k} - \sum_{\nu=k}^{m} \binom{m-\alpha}{m-\nu}\binom{n+\alpha}{\nu}\binom{\nu}{k} x^{\nu-k}. \]
Note that
\[ \binom{n+\alpha}{n-\nu}\binom{\nu+\alpha}{k} = \binom{n-k+\alpha}{n-\nu}\binom{n+\alpha}{k}. \]
Thus, using the famous Vandermonde formula
\[ \sum_{\nu=0}^{k} \binom{x}{\nu}\binom{y}{k-\nu} = \binom{x+y}{k}, \]
where $x$ and $y$ are real numbers, we get
\[ \sum_{\nu=0}^{n} \binom{m-\alpha}{\nu}\binom{n+\alpha}{n-\nu}\binom{\nu+\alpha}{k} = \binom{n+\alpha}{k} \sum_{\nu=0}^{n} \binom{m-\alpha}{\nu}\binom{n-k+\alpha}{n-\nu} = \binom{n+\alpha}{k}\binom{m+n-k}{n}. \]
We also have
\[ \binom{n+\alpha}{\nu+k}\binom{\nu+k}{k} = \binom{n+\alpha}{k}\binom{n-k+\alpha}{\nu}. \]
Applying this identity along with Vandermonde's formula again, we get
\[ \sum_{\nu=k}^{m} \binom{m-\alpha}{m-\nu}\binom{n+\alpha}{\nu}\binom{\nu}{k} = \binom{n+\alpha}{k} \sum_{\nu=0}^{m-k} \binom{m-\alpha}{m-k-\nu}\binom{n-k+\alpha}{\nu} = \binom{n+\alpha}{k}\binom{m+n-k}{n}. \]
Hence $r^{(k)}(1)/k! = 0$ for $k = 0,1,\ldots,m$.
Further computation shows
\[ \binom{m-\alpha}{\nu}\binom{n+\alpha}{n-\nu}\binom{\nu+\alpha}{m+1} = \binom{n}{\nu} \frac{(m-\alpha)\cdots(m-\alpha-\nu+1)(n+\alpha)\cdots(\alpha+\nu-m)}{(m+1)!\,n!} = (-1)^{\nu-m} \frac{\alpha}{m+1} \binom{m-\alpha}{m}\binom{n+\alpha}{n}\binom{n}{\nu}. \]
Thus
\[ \frac{r^{(m+1)}(x)}{(m+1)!} = \sum_{\nu=0}^{n} \binom{m-\alpha}{\nu}\binom{n+\alpha}{n-\nu}\binom{\nu+\alpha}{m+1} x^{\alpha+\nu-m-1} = (-1)^m \frac{\alpha}{m+1} \binom{m-\alpha}{m}\binom{n+\alpha}{n} x^{\alpha-m-1}(1-x)^n. \]
Expanding $r(x)$ into its Taylor series with remainder centred around
$x_0 = 1$, we have
\[ r(x) = r(1) + r'(1)(x-1) + \cdots + \frac{1}{m!}r^{(m)}(1)(x-1)^m + \frac{1}{m!}\int_1^x r^{(m+1)}(t)(x-t)^m\,dt. \]
Hence
\[ r(x) = \alpha \binom{m-\alpha}{m}\binom{n+\alpha}{n} \int_C t^{\alpha-m-1}(t-x)^m(1-t)^n\,dt, \]
and the lemma follows. □
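The identity of Lemma 2.2 can be checked numerically for real $x > 0$, where the path $C$ is an interval of the real line. The sketch below compares both sides using Simpson's rule; the sample values of $m$, $n$, $\alpha$, $x$, the number of steps and the function names are all our own choices.

```python
def binom(a, k):
    """Generalised binomial coefficient C(a, k) for real a and integer k >= 0."""
    out = 1.0
    for i in range(k):
        out *= (a - i) / (i + 1)
    return out

def p_poly(m, n, alpha, x):
    return sum(binom(m - alpha, m - v) * binom(n + alpha, v) * x**v
               for v in range(m + 1))

def q_poly(m, n, alpha, x):
    return sum(binom(m - alpha, v) * binom(n + alpha, n - v) * x**v
               for v in range(n + 1))

def remainder_integral(m, n, alpha, x, steps=2000):
    """Right-hand side of the lemma for real x > 0, by Simpson's rule (steps even)."""
    def integrand(t):
        return (t - x)**m * (1 - t)**n * t**(alpha - m - 1)
    step = (x - 1) / steps
    total = integrand(1.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(1 + i * step)
    integral = total * step / 3
    return alpha * binom(m - alpha, m) * binom(n + alpha, n) * integral

m, n, alpha, x = 2, 3, 0.25, 2.0
lhs = x**alpha * q_poly(m, n, alpha, x) - p_poly(m, n, alpha, x)
rhs = remainder_integral(m, n, alpha, x)
print(lhs, rhs)
```

The two printed values should agree to many decimal places; the remainder is small because it vanishes to order $m + 1$ at $x = 1$.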
Lemma 2.3. Let $n \geq 2$ and $r$ be positive integers and let $\beta$,
$\lambda$, $a(X)$, $b(X)$, $c(X)$, $C_r(X)$, $d(X)$, $u(X)$, $w(X)$,
$X^*_{n,r}(X,Y)$ and $z(X)$ be as in Lemma 2.1. Put
\[ R_{n,r}(x) = \frac{\Gamma(r+1+1/n)}{r!\,\Gamma(1/n)} \int_1^x (1-t)^r (t-x)^r t^{-(r+1-1/n)}\,dt, \]
where the integration path is the straight line from $1$ to $x$.
If $x$ is a non-zero complex number such that $w(x)$ is not a negative
number or zero, then
\[ (\sqrt{\lambda})^r C_r(x) = \left\{ \beta\left( a(x)w(x)^{1/n} - b(x) \right) - \left( c(x)w(x)^{1/n} - d(x) \right) \right\} X^*_{n,r}(u(x),z(x)) - \left( \beta a(x) - c(x) \right) u(x)^r R_{n,r}(w(x)). \]
Proof. Letting $\alpha = 1/n$ and taking both degrees in Lemma 2.2 equal to
$r$, it is easy to verify that
\[ p_r(x) = \binom{r-1/n}{r} X_{n,r}(x) \quad\text{and}\quad q_r(x) = \binom{r-1/n}{r} x^r X_{n,r}(x^{-1}). \]
Thus from Lemma 2.2 and the definition of $R_{n,r}(x)$, we have
\[ x^{1/n}x^r X_{n,r}(1/x) - X_{n,r}(x) = \frac{\Gamma(r+1+1/n)}{r!\,\Gamma(1/n)} \int_1^x (1-t)^r (t-x)^r t^{1/n-r-1}\,dt = R_{n,r}(x). \]
Substituting $x = w$ into this expression and multiplying by $u^r$, we have
$w^{1/n}X^*_{n,r}(u,z) = X^*_{n,r}(z,u) + u^r R_{n,r}(w)$. So replacing
$X^*_{n,r}(z,u)$ by $w^{1/n}X^*_{n,r}(u,z) - u^r R_{n,r}(w)$ in the
expressions for $(\sqrt{\lambda})^r A_r(X)$ and $(\sqrt{\lambda})^r B_r(X)$ in
Lemma 2.1(ii) and then applying the relation in Lemma 2.1(i), our result
follows. □
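The two relations between $p_r$, $q_r$ and $X_{n,r}$ used at the start of this proof are identities between polynomials with rational coefficients, so they can be verified exactly for small parameters. A minimal sketch, with helper names of our own choosing and assuming $n \geq 2$:

```python
from fractions import Fraction

def binom(a, k):
    """Generalised binomial coefficient C(a, k) for rational a and integer k >= 0."""
    out = Fraction(1)
    for i in range(k):
        out = out * (a - i) / (i + 1)
    return out

def X_coeffs(n, r):
    """Coefficients c_0..c_r of X_{n,r}(X) = 2F1(-r, -r - 1/n; 1 - 1/n; X), n >= 2."""
    a, b, c = Fraction(-r), Fraction(-r) - Fraction(1, n), 1 - Fraction(1, n)
    coeffs = [Fraction(1)]
    for k in range(r):
        coeffs.append(coeffs[-1] * (a + k) * (b + k) / ((c + k) * (k + 1)))
    return coeffs

def check_pq(n, r):
    """True if p_r = C(r-1/n, r) X_{n,r}(x) and q_r = C(r-1/n, r) x^r X_{n,r}(1/x)."""
    alpha = Fraction(1, n)
    lead = binom(r - alpha, r)
    X = X_coeffs(n, r)
    p = [binom(r - alpha, r - v) * binom(r + alpha, v) for v in range(r + 1)]
    q = [binom(r - alpha, v) * binom(r + alpha, r - v) for v in range(r + 1)]
    # x^r X_{n,r}(1/x) has the coefficients of X_{n,r} in reverse order.
    return p == [lead * c for c in X] and q == [lead * c for c in reversed(X)]

print(all(check_pq(n, r) for n in range(2, 7) for r in range(1, 9)))
```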
We now give a generalization of a result of Baker [1, Lemma 3], itself
an improvement of a result of Siegel [11, Hilfssatz 5], in a form which is
suitable for our needs here.
Lemma 2.4. Suppose $j = \pm 1$ and $n$ and $r$ are positive integers. We define
\[ \mu_n = \prod_{p \mid n,\ p\ \mathrm{prime}} p^{1/(p-1)}. \]
Then the coefficients of the polynomial
\[ \binom{2r}{r}\, {}_2F_1(-r, -r+j/n, -2r, n\mu_n X) \]
are algebraic integers.
Proof. Let
\[ p(X) = \binom{2r}{r}\, {}_2F_1(-r, -r+j/n, -2r, n\mu_n X). \]
By definition of the hypergeometric function, we have
\[ p(X) = \sum_{s=0}^{r} \frac{l_s}{s!}\, n^{-s} \binom{2r-s}{r} (-n\mu_n X)^s = \sum_{s=0}^{r} \frac{l_s}{s!}\, \mu_n^s \binom{2r-s}{r} (-X)^s \]
where
\[ l_s = \prod_{k=r-s+1}^{r} (kn - j). \]
Defining
\[ \sigma_s = \prod_{p \mid n,\ p\ \mathrm{prime}} p^{[s/(p-1)]}, \]
for non-negative $s$, from Lemma 4.1 of [4], we see that $l_s\sigma_s/s!$ is a
rational integer. Since
\[ \frac{\mu_n^s}{\sigma_s} = \prod_{p \mid n,\ p\ \mathrm{prime}} p^{s/(p-1)-[s/(p-1)]} \]
and $s/(p-1) - [s/(p-1)]$ is a non-negative rational number,
$\mu_n^s/\sigma_s$ is an algebraic integer. Therefore,
\[ \frac{l_s\sigma_s}{s!} \cdot \frac{\mu_n^s}{\sigma_s} \binom{2r-s}{r} = \frac{l_s}{s!}\, \mu_n^s \binom{2r-s}{r} \]
is an algebraic integer and the lemma follows. □
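For $n = 4$ the only prime divisor is $2$, so $\mu_4 = 2$ is a rational integer and Lemma 2.4 predicts that the scaled polynomial has ordinary integer coefficients. The sketch below checks this for a few small $r$; it covers only the case where $\mu_n$ is rational, and the function name and the range of $r$ are our own choices.

```python
from fractions import Fraction
from math import comb

def scaled_coeffs(n, r, j, mu):
    """Coefficients of C(2r, r) * 2F1(-r, -r + j/n; -2r; n*mu*X), for rational mu."""
    a, b, c = Fraction(-r), Fraction(-r) + Fraction(j, n), Fraction(-2 * r)
    term, coeffs = Fraction(comb(2 * r, r)), []
    for s in range(r + 1):
        coeffs.append(term)
        if s < r:
            # next hypergeometric term, with the argument scaled by n*mu
            term = term * (a + s) * (b + s) / ((c + s) * (s + 1)) * n * mu
    return coeffs

# n = 4: mu_4 = 2^(1/(2-1)) = 2, so every coefficient should be an integer.
integral = all(c.denominator == 1
               for r in range(1, 9) for j in (1, -1)
               for c in scaled_coeffs(4, r, j, 2))
print(integral)
print(scaled_coeffs(4, 2, 1, 2))
```

For $r = 2$, $j = 1$ this gives the polynomial $6 - 42X + 42X^2$, in agreement with the formula for $p(X)$ in the proof, where $l_1 = 7$ and $l_2 = 21$.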
Lemma 2.5. Let $w = e^{i\varphi}$, $0 < \varphi < \pi$ and put
$\sqrt{w} = e^{i\varphi/2}$. Let $n$ and $r$ be positive integers. Define
$R_{n,r}(x)$ as in Lemma 2.3. Then
\[ |R_{n,r}(w)| \leq \frac{\Gamma(r+1+1/n)}{r!\,\Gamma(1/n)}\, \varphi \left| 1-\sqrt{w} \right|^{2r}. \]
Proof. By Cauchy's theorem,
\[ R_{n,r}(w) = \frac{\Gamma(r+1+1/n)}{r!\,\Gamma(1/n)} \int_C ((1-t)(t-w))^r t^{1/n-r-1}\,dt, \]
where
\[ C = \{ t \mid t = e^{i\theta},\ 0 \leq \theta \leq \varphi \}. \]
Put
\[ f(t) = \frac{(1-t)(t-w)}{t} \quad\text{and}\quad g(t) = t^{1/n-1}. \]
Define
\[ F(\theta) = \left| f\left( e^{i\theta} \right) \right|^2, \]
so
\[ F(\theta) = 4(1-\cos\theta)(1-\cos(\theta-\varphi)) \quad\text{for } 0 \leq \theta \leq \varphi. \]
A simple calculation shows that
\[ F'(\theta) = -16 \sin\left( \theta - \frac{\varphi}{2} \right) \sin\left( \frac{\theta}{2} \right) \sin\left( \frac{\varphi-\theta}{2} \right). \]
The only values of $0 \leq \theta \leq \varphi$ with $F'(\theta) = 0$ are
$\theta = 0$, $\varphi/2$ and $\varphi$. It is easy to check that
\[ F(\theta) \leq F(\varphi/2) = 4\left( 1-\cos\frac{\varphi}{2} \right)^2 = \left| 1-\sqrt{w} \right|^4, \]
and hence
\[ \left| \int_C f(t)^r g(t)\,dt \right| \leq \int_0^{\varphi} \left| f\left( e^{i\theta} \right) \right|^r \left| g\left( e^{i\theta} \right) \right| d\theta \leq \varphi \left| 1-\sqrt{w} \right|^{2r}. \]
The lemma follows. □
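The bound of Lemma 2.5 can be compared with a direct numerical evaluation of $R_{n,r}(w)$ along the straight line from $1$ to $w$, as in the definition in Lemma 2.3. This is only a sketch: the quadrature rule (Simpson), the step count, the sample parameters and the function names are our own choices, and complex powers use Python's principal branch, which is valid here because the segment does not meet the non-positive real axis for $0 < \varphi < \pi$.

```python
import cmath
from math import gamma, factorial

def R(n, r, x, steps=4000):
    """R_{n,r}(x) by Simpson's rule along the straight line from 1 to x (steps even)."""
    def integrand(s):
        t = 1 + s * (x - 1)                       # point on the segment, 0 <= s <= 1
        return (1 - t)**r * (t - x)**r * t**(1 / n - r - 1) * (x - 1)
    total = integrand(0.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i / steps)
    const = gamma(r + 1 + 1 / n) / (factorial(r) * gamma(1 / n))
    return const * total / (3 * steps)

def bound(n, r, phi):
    """Right-hand side of Lemma 2.5 for w = exp(i*phi)."""
    const = gamma(r + 1 + 1 / n) / (factorial(r) * gamma(1 / n))
    return const * phi * abs(1 - cmath.exp(1j * phi / 2))**(2 * r)

for n, r, phi in [(4, 1, 1.0), (4, 3, 2.0), (3, 2, 0.5)]:
    w = cmath.exp(1j * phi)
    print(n, r, phi, abs(R(n, r, w)), bound(n, r, phi))
```

In each printed row the computed absolute value should not exceed the bound.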
Lemma 2.6. Let $u$ and $z$ be complex numbers with $w = z/u = e^{i\varphi}$
where $0 < \varphi < \pi$. Then
\[ \left| X^*_{n,r}(z,u) \right| \leq 4|u|^r \frac{\Gamma(1-1/n)\,r!}{\Gamma(r+1-1/n)} \left| 1+\sqrt{w} \right|^{2r-2} \quad\text{and} \]
\[ \left| X^*_{n,r}(u,z) \right| \leq 4|z|^r \frac{\Gamma(1-1/n)\,r!}{\Gamma(r+1-1/n)} \left| 1+\sqrt{w} \right|^{2r-2}. \]
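As a numerical illustration of the first inequality of Lemma 2.6, one can take $u = 1$ and $z = w$, so that $X^*_{n,r}(z,u) = X_{n,r}(w)$, and compare the two sides for a few sample values. The sketch below does this in floating point; the sample ranges and function names are our own, and we assume $n \geq 2$.

```python
import cmath
from math import gamma, factorial

def X_value(n, r, w):
    """X_{n,r}(w) = 2F1(-r, -r - 1/n; 1 - 1/n; w), summed term by term (n >= 2)."""
    term, total = 1.0 + 0j, 0j
    for k in range(r + 1):
        total += term
        # ratio of consecutive terms; it vanishes at k = r, ending the series
        term *= (-r + k) * (-r - 1 / n + k) / ((1 - 1 / n + k) * (k + 1)) * w
    return total

def upper_bound(n, r, w):
    """Right-hand side of the first inequality of the lemma with |u| = 1."""
    const = gamma(1 - 1 / n) * factorial(r) / gamma(r + 1 - 1 / n)
    return 4 * const * abs(1 + cmath.sqrt(w))**(2 * r - 2)

ok = all(abs(X_value(n, r, cmath.exp(1j * phi)))
         <= upper_bound(n, r, cmath.exp(1j * phi))
         for n in (3, 4) for r in range(1, 7) for phi in (0.3, 1.5, 3.0))
print(ok)
```

Here `cmath.sqrt` returns the principal square root, which equals $e^{i\varphi/2}$ for $0 < \varphi < \pi$, matching the convention for $\sqrt{w}$ in Lemma 2.5.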
Proof. Recall that $X^*_{n,r}(z,u) = u^r\, {}_2F_1(-r, -r-1/n, 1-1/n, w)$.
We noted in the proof of Lemma 2.3 that
\[ \sum_{k=0}^{r} \binom{r-1/n}{r-k}\binom{r+1/n}{k} w^k = \binom{r-1/n}{r}\, {}_2F_1(-r, -r-1/n, 1-1/n, w). \]
So, by the binomial theorem and Cauchy's residue theorem,
\[ {}_2F_1(-r, -r-1/n, 1-1/n, w) = (-1)^r \frac{\Gamma(1-1/n)\,r!}{2\pi i\,\Gamma(r+1-1/n)} \int_C t^{-r-1}(1-t)^{r-1/n}(1-wt)^{r+1/n}\,dt, \]
where $C$ is any path which encircles the origin once in the positive sense.
We now focus on bounding the absolute value of this integral from above.
Put
\[ f(t) = \frac{(1-t)(1-wt)}{t} \quad\text{and}\quad g(t) = \frac{(1-t)^{1-1/n}(1-wt)^{1+1/n}}{t^2}. \]
Now
\[ F(\theta) = \left| f\left( e^{i\theta} \right) \right|^2 = 4(1-\cos\theta)(1-\cos(\theta+\varphi)) \quad\text{for } 0 \leq \theta < 2\pi. \]
We have
\[ F'(\theta) = -16 \sin\left( \theta + \frac{\varphi}{2} \right) \sin\left( \frac{\theta}{2} \right) \sin\left( \frac{-\varphi-\theta}{2} \right), \]
so that the only values of $0 \leq \theta < 2\pi$ for which $F'(\theta) = 0$
are $\theta = 0$, $\pi - \varphi/2$, $2\pi - \varphi$ and $2\pi - \varphi/2$.
It is easy to check that if $0 < \varphi < \pi$ then
\[ F(\theta) \leq F(\pi - \varphi/2) = \left| 1+\sqrt{w} \right|^4. \]
Note that
\[ \left| g\left( e^{i\theta} \right) \right| \leq 4. \]
Thus, by Cauchy's theorem and the Cauchy-Schwarz inequality, we get
\[ \left| \int_C t^{-r-1}(1-t)^{r-1/n}(1-wt)^{r+1/n}\,dt \right| \leq \int_0^{2\pi} F(\theta)^{(r-1)/2} \left| g\left( e^{i\theta} \right) \right| d\theta \leq 8\pi \left| 1+\sqrt{w} \right|^{2(r-1)}. \]
Therefore,
\[ \left| {}_2F_1(-r, -r-1/n, 1-1/n, w) \right| \leq \frac{\Gamma(1-1/n)\,r!}{2\pi\,\Gamma(r+1-1/n)}\, 8\pi \left| 1+\sqrt{w} \right|^{2r-2}. \]