Open Access

Variants of the Selberg sieve, and bounded intervals containing many primes

Research in the Mathematical Sciences 2014, 1:12

DOI: 10.1186/s40687-014-0012-7

Received: 18 July 2014

Accepted: 19 July 2014

Published: 17 October 2014

The Erratum to this article has been published in Research in the Mathematical Sciences 2015 2:15

Abstract

For any $m \ge 1$, let $H_m$ denote the quantity $\liminf_{n\to\infty} (p_{n+m} - p_n)$. A celebrated recent result of Zhang showed the finiteness of $H_1$, with the explicit bound $H_1 \le 70{,}000{,}000$. This was then improved by us (the Polymath8 project) to $H_1 \le 4680$, and then by Maynard to $H_1 \le 600$; Maynard also established for the first time a finiteness result for $H_m$ for $m \ge 2$, and specifically that $H_m \ll m^3 e^{4m}$. Assuming in addition the Elliott-Halberstam conjecture, Maynard obtained the bound $H_1 \le 12$, improving upon the previous bound $H_1 \le 16$ of Goldston, Pintz, and Yıldırım, as well as the bound $H_m \ll m^3 e^{2m}$.

In this paper, we extend the methods of Maynard by generalizing the Selberg sieve further and by performing more extensive numerical calculations. As a consequence, we obtain the bound $H_1 \le 246$ unconditionally and $H_1 \le 6$ under the assumption of the generalized Elliott-Halberstam conjecture. Indeed, under the latter conjecture, we show the stronger statement that for any admissible triple $(h_1,h_2,h_3)$, there are infinitely many $n$ for which at least two of $n+h_1, n+h_2, n+h_3$ are prime, and also obtain a related disjunction asserting that either the twin prime conjecture holds or the even Goldbach conjecture is asymptotically true if one allows an additive error of at most 2, or both. We also modify the ‘parity problem’ argument of Selberg to show that the $H_1 \le 6$ bound is the best possible that one can obtain from purely sieve-theoretic considerations. For larger $m$, we use the distributional results obtained previously by our project to obtain the unconditional asymptotic bound $H_m \ll m e^{(4 - \frac{28}{157})m}$, or $H_m \ll m e^{2m}$ under the assumption of the Elliott-Halberstam conjecture. We also obtain explicit upper bounds for $H_m$ when $m = 2,3,4,5$.

Keywords

Selberg sieve Elliott-Halberstam conjecture Prime gaps

Background

For any natural number $m$, let $H_m$ denote the quantity
\[ H_m := \liminf_{n\to\infty} (p_{n+m} - p_n), \]
where $p_n$ denotes the $n$th prime. The twin prime conjecture asserts that $H_1 = 2$; more generally, the Hardy-Littlewood prime tuples conjecture [1] implies that $H_m = H(m+1)$ for all $m \ge 1$, where $H(k)$ is the diameter of the narrowest admissible $k$-tuple (see the ‘Outline of the key ingredients’ section for a definition of this term). Asymptotically, one has the bounds
\[ \left(\tfrac{1}{2} + o(1)\right) k \log k \le H(k) \le (1 + o(1))\, k \log k \]

as $k \to \infty$ (see Theorem 17 below); thus, the prime tuples conjecture implies that $H_m$ is comparable to $m \log m$ as $m \to \infty$.

Until very recently, it was not known if any of the H m were finite, even in the easiest case m=1. In the breakthrough work of Goldston et al. [2], several results in this direction were established, including the following conditional result assuming the Elliott-Halberstam conjecture EH[ 𝜗] (see Claim 8 below) concerning the distribution of the prime numbers in arithmetic progressions:

Theorem 1(GPY theorem).

Assume the Elliott-Halberstam conjecture EH[ 𝜗] for all 0<𝜗<1. Then, H1≤16.

Furthermore, it was shown in [2] that any result of the form EH[1/2+2ϖ] for some fixed 0<ϖ<1/4 would imply an explicit finite upper bound on $H_1$ (with this bound equal to 16 for ϖ>0.229855). Unfortunately, the only results of the type EH[𝜗] that are known come from the Bombieri-Vinogradov theorem (Theorem 9), which only establishes EH[𝜗] for 0<𝜗<1/2.

The first unconditional bound on H1 was established in a breakthrough work of Zhang [3]:

Theorem 2(Zhang’s theorem).

H1≤70,000,000.

Zhang’s argument followed the general strategy from [2] on finding small gaps between primes, with the major new ingredient being a proof of a weaker version of EH[1/2+2ϖ], which we call MPZ[ϖ,δ] (see Claim 10 below). It was quickly realized that Zhang’s numerical bound on $H_1$ could be improved. By optimizing many of the components in Zhang’s argument, we were able (Polymath, DHJ: New equidistribution estimates of Zhang type, submitted), [4] to improve Zhang’s bound to
\[ H_1 \le 4{,}680. \]

Very shortly afterwards, a further breakthrough was obtained by Maynard [5] (with related work obtained independently in an unpublished work of Tao), who developed a more flexible ‘multidimensional’ version of the Selberg sieve to obtain stronger bounds on $H_m$. This argument worked without using any equidistribution results on primes beyond the Bombieri-Vinogradov theorem, and among other things was able to establish finiteness of $H_m$ for all $m$, not just for $m=1$. More precisely, Maynard established the following results.

Theorem 3(Maynard’s theorem).

Unconditionally, we have the following bounds:

(i) H1≤600

(ii) $H_m \le C m^3 e^{4m}$ for all $m \ge 1$ and an absolute (and effective) constant $C$

Assuming the Elliott-Halberstam conjecture EH[ 𝜗] for all 0<𝜗<1, we have the following improvements:

(iii) H1≤12

(iv) H2≤600

(v) $H_m \le C m^3 e^{2m}$ for all $m \ge 1$ and an absolute (and effective) constant $C$

For a survey of these recent developments, see [6].

In this paper, we refine Maynard’s methods to obtain the following further improvements.

Theorem 4.

Unconditionally, we have the following bounds:

(i) H1≤246

(ii) H2≤398,130

(iii) H3≤24,797,814

(iv) H4≤1,431,556,072

(v) H5≤80,550,202,480

(vi) $H_m \le C m \exp\left(\left(4 - \frac{28}{157}\right) m\right)$ for all $m \ge 1$ and an absolute (and effective) constant $C$

Assume the Elliott-Halberstam conjecture EH[ 𝜗] for all 0<𝜗<1. Then, we have the following improvements:

(vii) H2≤270

(viii) H3≤52,116

(ix) H4≤474,266.

(x) H5≤4,137,854.

(xi) $H_m \le C m e^{2m}$ for all $m \ge 1$ and an absolute (and effective) constant $C$

Finally, assume the generalized Elliott-Halberstam conjecture GEH[ 𝜗] (see Claim 12 below) for all 0<𝜗<1. Then,

(xii) H1≤6

(xiii) H2≤252

In the ‘Outline of the key ingredients’ section, we will describe the key propositions that will be combined together to prove the various components of Theorem 4. As with Theorem 1, the results in (vii)-(xiii) do not require EH[ 𝜗] or GEH[ 𝜗] for all 0<𝜗<1, but only for a single explicitly computable 𝜗 that is sufficiently close to 1.

Of these results, the bound in (xii) is perhaps the most interesting, as the parity problem [7] prohibits one from achieving any better bound on H1 than 6 from purely sieve-theoretic methods; we review this obstruction in the ‘The parity problem’ section. If one only assumes the Elliott-Halberstam conjecture EH[ 𝜗] instead of its generalization GEH[ 𝜗], we were unable to improve upon Maynard’s bound H1≤12; however, the parity obstruction does not exclude the possibility that one could achieve (xii) just assuming EH[ 𝜗] rather than GEH[ 𝜗], by some further refinement of the sieve-theoretic arguments (e.g. by finding a way to establish Theorem 20(ii) below using only EH[ 𝜗] instead of GEH[ 𝜗]).

The bounds (ii)-(vi) rely on the equidistribution results on primes established in our previous paper. However, the bound (i) uses only the Bombieri-Vinogradov theorem, and the remaining bounds (vii)-(xiii) of course use either the Elliott-Halberstam conjecture or a generalization thereof.

A variant of the proof of Theorem 4(xii), which we give in ‘Additional remarks’ section, also gives the following conditional ‘near miss’ to (a disjunction of) the twin prime conjecture and the even Goldbach conjecture:

Theorem 5(Disjunction).

Assume the generalized Elliott-Halberstam conjecture GEH[ 𝜗] for all 0<𝜗<1. Then, at least one of the following statements is true:

(a) (Twin prime conjecture) H1=2.

(b) (near-miss to even Goldbach conjecture) If n is a sufficiently large multiple of 6, then at least one of n and n−2 is expressible as the sum of two primes, similarly with n−2 replaced by n+2. (In particular, every sufficiently large even number lies within 2 of the sum of two primes.)

We remark that a disjunction in a similar spirit was obtained in [8], which established (prior to the appearance of Theorem 2) that either $H_1$ was finite or that every interval $[x, x + x^{\varepsilon}]$ contained the sum of two primes if $x$ was sufficiently large depending on $\varepsilon > 0$.

There are two main technical innovations in this paper. The first is a further generalization of the multidimensional Selberg sieve introduced by Maynard and Tao, in which the support of a certain cutoff function F is permitted to extend into a larger domain than was previously permitted (particularly under the assumption of the generalized Elliott-Halberstam conjecture). As in [5], this largely reduces the task of bounding H m to that of efficiently solving a certain multidimensional variational problem involving the cutoff function F. Our second main technical innovation is to obtain efficient numerical methods for solving this variational problem for small values of the dimension k, as well as sharpened asymptotics in the case of large values of k.

The methods of Maynard and Tao have been used in a number of subsequent applications [9]-[21]. The techniques in this paper should be able to be used to obtain slight numerical improvements to such results, although we did not pursue these matters here.

1.1 Organization of the paper

The paper is organized as follows. After some notational preliminaries, we recall in the ‘Distribution estimates on arithmetic functions’ section the known (or conjectured) distributional estimates on primes in arithmetic progressions that we will need to prove Theorem 4. Then, in the section ‘Outline of the key ingredients’, we give the key propositions that will be combined together to establish this theorem. One of these propositions, Lemma 18, is an easy application of the pigeonhole principle. Two further propositions, Theorem 19 and Theorem 20, use the prime distribution results from the ‘Distribution estimates on arithmetic functions’ section to give asymptotics for certain sums involving sieve weights and the von Mangoldt function; they are established in the ‘Multidimensional Selberg sieves’ section. Theorems 22, 24, 26, and 28 use the asymptotics established in Theorems 19 and 20, in combination with Lemma 18, to give various criteria for bounding H m , which all involve finding sufficiently strong candidates for a variety of multidimensional variational problems; these theorems are proven in the ‘Reduction to a variational problem’ section. These variational problems are analysed in the asymptotic regime of large k in the ‘Asymptotic analysis’ section, and for small and medium k in the ‘The case of small and medium dimension’ section, with the results collected in Theorems 23, 25, 27, and 29. Combining these results with the previous propositions gives Theorem 16, which, when combined with the bounds on narrow admissible tuples in Theorem 17 that are established in the ‘Narrow admissible tuples’ section, will give Theorem 4. (See also Table 1 for more details of the logical dependencies between the key propositions.)
Table 1. Results used to prove various components of Theorem 16

Theorem 16      Results used
(i)             Theorems 9, 26, and 27
(ii)-(vi)       Theorems 11, 24, and 25
(vii)-(xi)      Theorems 22 and 23
(xii)           Theorems 28 and 29
(xiii)          Theorems 26 and 27

Note that Theorems 22, 24, 26, and 28 are in turn proven using Theorems 19 and 20 and Lemma 18.

Finally, in the ‘The parity problem’ section, we modify an argument of Selberg to show that the bound H1≤6 may not be improved using purely sieve-theoretic methods, and in the ‘Additional remarks’ section, we establish Theorem 5 and make some miscellaneous remarks.

1.2 Notation

The notation used here closely follows the notation in our previous paper.

We use $|E|$ to denote the cardinality of a finite set $E$, and $1_E$ to denote the indicator function of a set $E$; thus, $1_E(n) = 1$ when $n \in E$ and $1_E(n) = 0$ otherwise.

All sums and products will be over the natural numbers $\mathbb{N} := \{1, 2, 3, \dots\}$ unless otherwise specified, with the exception of sums and products over the variable $p$, which will be understood to be over primes.

The following important asymptotic notation will be in use throughout the paper.

Definition 6(Asymptotic notation).

We use $x$ to denote a large real parameter, which one should think of as going off to infinity; in particular, we will implicitly assume that it is larger than any specified fixed constant. Some mathematical objects will be independent of $x$ and referred to as fixed; but unless otherwise specified, we allow all mathematical objects under consideration to depend on $x$ (or to vary within a range that depends on $x$, e.g. the summation parameter $n$ in the sum $\sum_{x \le n \le 2x} f(n)$). If $X$ and $Y$ are two quantities depending on $x$, we say that $X = O(Y)$ or $X \ll Y$ if one has $|X| \le CY$ for some fixed $C$ (which we refer to as the implied constant), and $X = o(Y)$ if one has $|X| \le c(x) Y$ for some function $c(x)$ of $x$ (and of any fixed parameters present) that goes to zero as $x \to \infty$ (for each choice of fixed parameters). We use $X \lessapprox Y$ to denote the estimate $X \ll x^{o(1)} Y$, $X \sim Y$ to denote the estimate $Y \ll X \ll Y$, and $X \approx Y$ to denote the estimate $Y \lessapprox X \lessapprox Y$. Finally, we say that a quantity $n$ is of polynomial size if one has $n = O(x^{O(1)})$.

If asymptotic notation such as $O()$ or $\ll$ appears on the left-hand side of a statement, this means that the assertion holds true for any specific interpretation of that notation. For instance, the assertion $\sum_{n = O(N)} |\alpha(n)| \ll N$ means that for each fixed constant $C > 0$, one has $\sum_{|n| \le CN} |\alpha(n)| \ll N$.

If $q$ and $a$ are integers, we write $a|q$ if $a$ divides $q$. If $q$ is a natural number and $a \in \mathbb{Z}$, we use $a\ (q)$ to denote the residue class
\[ a\ (q) := \{ a + nq : n \in \mathbb{Z} \} \]
and let $\mathbb{Z}/q\mathbb{Z}$ denote the ring of all such residue classes $a\ (q)$. The notation $b = a\ (q)$ is synonymous to $b \in a\ (q)$. We use $(a,q)$ to denote the greatest common divisor of $a$ and $q$, and $[a,q]$ to denote the least common multiple (see endnote a). We also let
\[ (\mathbb{Z}/q\mathbb{Z})^{\times} := \{ a\ (q) : (a,q) = 1 \} \]

denote the primitive residue classes of $\mathbb{Z}/q\mathbb{Z}$.

We use the following standard arithmetic functions:
(i) $\varphi(q) := |(\mathbb{Z}/q\mathbb{Z})^{\times}|$ denotes the Euler totient function of $q$.

(ii) $\tau(q) := \sum_{d|q} 1$ denotes the divisor function of $q$.

(iii) $\Lambda(q)$ denotes the von Mangoldt function of $q$; thus, $\Lambda(q) = \log p$ if $q$ is a power of a prime $p$, and $\Lambda(q) = 0$ otherwise.

(iv) $\theta(q)$ is defined to equal $\log q$ when $q$ is a prime, and $\theta(q) = 0$ otherwise.

(v) $\mu(q)$ denotes the Möbius function of $q$; thus, $\mu(q) = (-1)^k$ if $q$ is the product of $k$ distinct primes for some $k \ge 0$, and $\mu(q) = 0$ otherwise.

(vi) $\Omega(q)$ denotes the number of prime factors of $q$ (counting multiplicity).

We recall the elementary divisor bound
\[ \tau(n) \lessapprox 1 \tag{1} \]
whenever $n \ll x^{O(1)}$, as well as the related estimate
\[ \sum_{n \ll x} \frac{\tau(n)^C}{n} \ll \log^{O(1)} x \tag{2} \]

for any fixed $C > 0$ (see, e.g. [Lemma 1.5]).

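The Dirichlet convolution $\alpha \star \beta : \mathbb{N} \to \mathbb{C}$ of two arithmetic functions $\alpha, \beta : \mathbb{N} \to \mathbb{C}$ is defined in the usual fashion as
\[ \alpha \star \beta(n) := \sum_{d|n} \alpha(d)\, \beta\!\left(\frac{n}{d}\right) = \sum_{ab = n} \alpha(a) \beta(b). \]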
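The arithmetic functions and the convolution just defined can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names (`factorize`, `big_omega`, `dirichlet`) are our own and do not come from the paper, and the trial-division approach is chosen for clarity rather than speed.

```python
import math

def factorize(n):
    """Prime factorization of n >= 1 as a dict {prime: exponent} (trial division)."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def phi(q):
    """Euler totient: number of primitive residue classes mod q."""
    result = q
    for p in factorize(q):
        result -= result // p
    return result

def tau(q):
    """Divisor function: number of d dividing q."""
    return math.prod(e + 1 for e in factorize(q).values())

def mangoldt(q):
    """Lambda(q) = log p if q is a power of a prime p, else 0."""
    f = factorize(q)
    return math.log(next(iter(f))) if len(f) == 1 else 0.0

def theta(q):
    """theta(q) = log q if q is prime, else 0."""
    return math.log(q) if factorize(q) == {q: 1} else 0.0

def mobius(q):
    """mu(q) = (-1)^k for a product of k distinct primes, else 0."""
    f = factorize(q)
    return 0 if any(e > 1 for e in f.values()) else (-1) ** len(f)

def big_omega(q):
    """Number of prime factors of q, counted with multiplicity."""
    return sum(factorize(q).values())

def dirichlet(alpha, beta, n):
    """Dirichlet convolution (alpha * beta)(n) = sum over d | n of alpha(d) beta(n/d)."""
    return sum(alpha(d) * beta(n // d) for d in range(1, n + 1) if n % d == 0)
```

Two standard identities make convenient sanity checks: convolving μ with the constant function 1 gives the indicator of n=1, and convolving Λ with 1 gives log n.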

Distribution estimates on arithmetic functions

As mentioned in the introduction, a key ingredient in the Goldston-Pintz-Yıldırım approach to small gaps between primes comes from distributional estimates on the primes, or more precisely on the von Mangoldt function Λ, which serves as a proxy for the primes. In this work, we will also need to consider distributional estimates on more general arithmetic functions, although we will not prove any new such estimates in this paper, relying instead on estimates that are already in the literature.

More precisely, we will need averaged information on the following quantity:

Definition 7(Discrepancy).

For any function $\alpha : \mathbb{N} \to \mathbb{C}$ with finite support (that is, $\alpha$ is non-zero only on a finite set) and any primitive residue class $a\ (q)$, we define the (signed) discrepancy $\Delta(\alpha; a\ (q))$ to be the quantity
\[ \Delta(\alpha; a\ (q)) := \sum_{n = a\ (q)} \alpha(n) - \frac{1}{\varphi(q)} \sum_{(n,q)=1} \alpha(n). \tag{3} \]
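As a concrete illustration of Definition 7, the following sketch evaluates the discrepancy directly from (3) for the function Λ restricted to [x,2x]. The function names and the choices x=1000, q=7 are illustrative assumptions, not taken from the paper.

```python
import math
from math import gcd

def mangoldt(n):
    """Lambda(n) = log p if n is a power of a prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # n itself is prime

def phi(q):
    """Euler totient, by direct count of primitive residues."""
    return sum(1 for r in range(1, q + 1) if gcd(r, q) == 1)

def discrepancy(alpha, support, a, q):
    """Delta(alpha; a (q)) for alpha supported on the finite range `support`."""
    in_class = sum(alpha(n) for n in support if n % q == a % q)
    coprime = sum(alpha(n) for n in support if gcd(n, q) == 1)
    return in_class - coprime / phi(q)

# Illustrative parameters (assumptions): Lambda restricted to [x, 2x], modulus q = 7.
x, q = 1000, 7
support = range(x, 2 * x + 1)
deltas = {a: discrepancy(mangoldt, support, a, q) for a in range(1, q) if gcd(a, q) == 1}
```

Summing (3) over all primitive classes a (q) gives zero identically, so the entries of `deltas` measure how unevenly the mass of Λ is spread among the classes rather than its total size.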

For any fixed 0<𝜗<1, let EH[ 𝜗] denote the following claim:

Claim 8(Elliott-Halberstam conjecture, EH[ 𝜗]).

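If $Q \lessapprox x^{\vartheta}$ and $A \ge 1$ is fixed, then
\[ \sum_{q \le Q} \sup_{a \in (\mathbb{Z}/q\mathbb{Z})^{\times}} \left| \Delta(\Lambda 1_{[x,2x]}; a\ (q)) \right| \ll x \log^{-A} x. \tag{4} \]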

In [22], it was conjectured that EH[ 𝜗] held for all 0<𝜗<1. (The conjecture fails at the endpoint case 𝜗=1; see [23],[24] for a more precise statement.) The following classical result of Bombieri [25] and Vinogradov [26] remains the best partial result of the form EH[ 𝜗]:

Theorem 9(Bombieri-Vinogradov theorem).

[25],[26] EH[ 𝜗] holds for every fixed 0<𝜗<1/2.

In [2], it was shown that any estimate of the form EH[ 𝜗] with some fixed 𝜗>1/2 would imply the finiteness of H1. While such an estimate remains unproven, it was observed by Motohashi and Pintz [27] and by Zhang [3] that a certain weakened version of EH[ 𝜗] would still suffice for this purpose. More precisely (and following the notation of our previous paper), let ϖ,δ>0 be fixed, and let MPZ[ ϖ,δ] be the following claim:

Claim 10(Motohashi-Pintz-Zhang estimate, MPZ[ ϖ,δ]).

Let $I \subset [1, x^{\delta}]$ and $Q \lessapprox x^{1/2 + 2\varpi}$. Let $P_I$ denote the product of all the primes in $I$, and let $S_I$ denote the square-free natural numbers whose prime factors lie in $I$. If the residue class $a\ (P_I)$ is primitive (and is allowed to depend on $x$), and $A \ge 1$ is fixed, then
\[ \sum_{q \le Q;\ q \in S_I} \left| \Delta(\Lambda 1_{[x,2x]}; a\ (q)) \right| \ll x \log^{-A} x, \tag{5} \]

where the implied constant depends only on the fixed quantities (A,ϖ,δ), but not on a.

It is clear that EH[1/2+2ϖ] implies MPZ[ϖ,δ] whenever ϖ,δ≥0. The first non-trivial estimate of the form MPZ[ϖ,δ] was established by Zhang [3], who (essentially) obtained MPZ[ϖ,δ] whenever $0 \le \varpi, \delta < \frac{1}{1{,}168}$. In [Theorem 2.17], we improved this result to the following.

Theorem 11.

MPZ[ ϖ,δ] holds for every fixed ϖ,δ≥0 with 600ϖ+180δ<7.

In fact, a stronger result was established, in which the moduli q were assumed to be densely divisible rather than smooth, but we will not exploit such improvements here. For our application, the most important thing is to get ϖ as large as possible; in particular, Theorem 11 allows one to get ϖ arbitrarily close to $\frac{7}{600} \approx 0.01167$.

In this paper, we will also study the following generalization of the Elliott-Halberstam conjecture:

Claim 12(Generalized Elliott-Halberstam conjecture, GEH[ 𝜗]).

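Let $\varepsilon > 0$ and $A \ge 1$ be fixed. Let $N, M$ be quantities such that $x^{\varepsilon} \ll N, M \ll x^{1-\varepsilon}$ with $NM \sim x$, and let $\alpha, \beta : \mathbb{N} \to \mathbb{R}$ be sequences supported on $[N,2N]$ and $[M,2M]$, respectively, such that one has the pointwise bound
\[ |\alpha(n)| \ll \tau(n)^{O(1)} \log^{O(1)} x; \quad |\beta(m)| \ll \tau(m)^{O(1)} \log^{O(1)} x \tag{6} \]
for all natural numbers $n, m$. Suppose also that $\beta$ obeys the Siegel-Walfisz type bound
\[ \left| \Delta(\beta 1_{(\cdot, r) = 1}; a\ (q)) \right| \ll \tau(qr)^{O(1)} M \log^{-A} x \tag{7} \]
for any $q, r \ge 1$, any fixed $A$, and any primitive residue class $a\ (q)$. Then for any $Q \lessapprox x^{\vartheta}$, we have
\[ \sum_{q \le Q} \sup_{a \in (\mathbb{Z}/q\mathbb{Z})^{\times}} \left| \Delta(\alpha \star \beta; a\ (q)) \right| \ll x \log^{-A} x. \tag{8} \]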

In [28, Conjecture 1], it was essentially conjectured (see endnote b) that GEH[𝜗] was true for all 0<𝜗<1. This is stronger than the Elliott-Halberstam conjecture:

Proposition 13.

For any fixed 0<𝜗<1, GEH[ 𝜗] implies EH[ 𝜗].

Proof.

(Sketch) As this argument is standard, we give only a brief sketch. Let $A > 0$ be fixed. For $n \in [x, 2x]$, we have Vaughan’s identity (see endnote c) [29]
\[ \Lambda(n) = \mu_{<} \star L(n) - \mu_{<} \star \Lambda_{<} \star 1(n) + \mu_{\ge} \star \Lambda_{\ge} \star 1(n), \]
where $L(n) := \log(n)$ and
\[ \Lambda_{\ge}(n) := \Lambda(n) 1_{n \ge x^{1/3}}, \quad \Lambda_{<}(n) := \Lambda(n) 1_{n < x^{1/3}} \tag{9} \]
\[ \mu_{\ge}(n) := \mu(n) 1_{n \ge x^{1/3}}, \quad \mu_{<}(n) := \mu(n) 1_{n < x^{1/3}}. \tag{10} \]

By decomposing each of the functions $\mu_{<}$, $\mu_{\ge}$, $1$, $\Lambda_{<}$, $\Lambda_{\ge}$ into $O(\log^{A+1} x)$ functions supported on intervals of the form $[N, (1 + \log^{-A} x) N]$, and discarding those contributions which meet the boundary of $[x, 2x]$ (cf. [3],[28],[30],[31]), and using GEH[𝜗] (with $A$ replaced by a much larger fixed constant $A'$) to control all remaining contributions, we obtain the claim (using the Siegel-Walfisz theorem; see, e.g. [32, Satz 4] or [33, Th. 5.29]).
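Vaughan’s identity as used in the sketch above can be checked numerically for a small parameter. The following is a hedged sketch: the cutoff is passed as an integer `U` playing the role of $x^{1/3}$ (so x = U³ is exact and no floating-point cube root is needed), and the helper names are our own.

```python
import math

def mangoldt(n):
    """Lambda(n) = log p if n is a power of a prime p, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def conv(f, g):
    """Dirichlet convolution of two arithmetic functions."""
    return lambda n: sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)

def vaughan_rhs(n, U):
    """Right-hand side of Vaughan's identity with cutoff U (standing in for x^(1/3))."""
    mu_lo = lambda d: mobius(d) if d < U else 0
    mu_hi = lambda d: mobius(d) if d >= U else 0
    lam_lo = lambda d: mangoldt(d) if d < U else 0.0
    lam_hi = lambda d: mangoldt(d) if d >= U else 0.0
    one = lambda d: 1
    return (conv(mu_lo, math.log)(n)
            - conv(conv(mu_lo, lam_lo), one)(n)
            + conv(conv(mu_hi, lam_hi), one)(n))

U = 5          # illustrative cutoff (assumption)
x = U ** 3     # so that x^(1/3) = U exactly
max_error = max(abs(mangoldt(n) - vaughan_rhs(n, U)) for n in range(x, 2 * x + 1))
```

The identity is exact, so `max_error` only reflects floating-point rounding in the logarithms.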

By modifying the proof of the Bombieri-Vinogradov theorem, Motohashi [34] established the following generalization of that theorem:

Theorem 14(Generalized Bombieri-Vinogradov theorem).

[34] GEH[ 𝜗] holds for every fixed 0<𝜗<1/2.

One could similarly describe a generalization of the Motohashi-Pintz-Zhang estimate MPZ[ϖ,δ], but unfortunately, the arguments in [3] or Theorem 11 do not extend to this setting unless one is in the ‘Type I/Type II’ case in which $N, M$ are constrained to be somewhat close to $x^{1/2}$, or if one has ‘Type III’ structure to the convolution $\alpha \star \beta$, in the sense that it can be refactored as a convolution involving several ‘smooth’ sequences. In any event, our analysis would not be able to make much use of such incremental improvements to GEH[𝜗], as we only use this hypothesis effectively in the case when 𝜗 is very close to 1. In particular, we will not directly use Theorem 14 in this paper.

Outline of the key ingredients

In this section, we describe the key subtheorems used in the proof of Theorem 4, with the proofs of these subtheorems mostly being deferred to later sections.

We begin with a weak version of the Dickson-Hardy-Littlewood prime tuples conjecture [1], which (following Pintz [35]) we refer to as DHL[k;j]. Recall that for any $k \in \mathbb{N}$, an admissible $k$-tuple is a tuple $\mathcal{H} = (h_1, \dots, h_k)$ of $k$ increasing integers $h_1 < \dots < h_k$ which avoids at least one residue class $a_p\ (p) := \{ a_p + np : n \in \mathbb{Z} \}$ for every prime $p$. For instance, (0,2,6) is an admissible 3-tuple, but (0,2,4) is not, since it occupies every residue class modulo 3.
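The admissibility condition can be tested mechanically. A minimal sketch follows, using the observation that only primes $p \le k$ need to be examined, because $k$ integers cannot occupy more than $k$ residue classes; the function name `is_admissible` is our own.

```python
def is_admissible(h):
    """Return True if the tuple h misses at least one residue class mod every prime p.

    Only primes p <= len(h) are checked: a k-tuple cannot cover all p classes when p > k.
    """
    k = len(h)
    for p in range(2, k + 1):
        if all(p % d for d in range(2, p)):          # p is prime
            if len({x % p for x in h}) == p:         # every class mod p is hit
                return False
    return True

examples = {(0, 2, 6): is_admissible((0, 2, 6)),
            (0, 2, 4): is_admissible((0, 2, 4))}
```

For the two examples in the text, (0,2,6) passes because it only hits the classes 0 and 2 modulo 3, while (0,2,4) fails at the prime 3.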

For any $k \ge j \ge 2$, we let DHL[k;j] denote the following claim:

Claim 15(Weak Dickson-Hardy-Littlewood conjecture, DHL[ k;j]).

For any admissible $k$-tuple $\mathcal{H} = (h_1, \dots, h_k)$, there exist infinitely many translates $n + \mathcal{H} = (n + h_1, \dots, n + h_k)$ of $\mathcal{H}$ which contain at least $j$ primes.

The full Dickson-Hardy-Littlewood conjecture is then the assertion that DHL[k;k] holds for all $k \ge 2$. In our analysis, we will focus on the case when $j$ is much smaller than $k$; in fact, $j$ will be of the order of $\log k$.

For any $k$, let $H(k)$ denote the minimal diameter $h_k - h_1$ of an admissible $k$-tuple; thus for instance, $H(3) = 6$. It is clear that for any natural numbers $m \ge 1$ and $k \ge m+1$, the claim DHL[k;m+1] implies that $H_m \le H(k)$ (and the claim DHL[k;k] would imply that $H_{k-1} = H(k)$). We will therefore deduce Theorem 4 from a number of claims of the form DHL[k;j]. More precisely, we have
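For very small $k$, the quantity $H(k)$ can be found by exhaustive search. The sketch below is a brute-force illustration only (the names `is_admissible` and `narrowest_diameter` are our own); it is in no way the method needed for the large values of $k$ appearing later, and its cost grows rapidly with $k$.

```python
from itertools import combinations

def is_admissible(h):
    """True if h misses a residue class mod every prime p <= len(h)."""
    k = len(h)
    for p in range(2, k + 1):
        if all(p % d for d in range(2, p)):          # p is prime
            if len({x % p for x in h}) == p:
                return False
    return True

def narrowest_diameter(k):
    """Brute-force H(k) for small k >= 2.

    By translation invariance we may assume h_1 = 0, so a tuple of diameter d
    is (0, interior..., d) with k - 2 interior points chosen from 1..d-1.
    """
    d = k - 1
    while True:
        for interior in combinations(range(1, d), k - 2):
            if is_admissible((0, *interior, d)):
                return d
        d += 1

small_values = {k: narrowest_diameter(k) for k in range(2, 7)}
```

The search recovers $H(3) = 6$ from the text, realized for example by the tuple (0,2,6).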

Theorem 16.

Unconditionally, we have the following claims:

(i) DHL[50;2].

(ii) DHL[35,410;3].

(iii) DHL[1,649,821;4].

(iv) DHL[75,845,707;5].

(v) DHL[3,473,955,908;6].

(vi) DHL[k;m+1] whenever $m \ge 1$ and $k \ge C \exp\left(\left(4 - \frac{28}{157}\right) m\right)$ for some sufficiently large absolute (and effective) constant $C$.

Assume the Elliott-Halberstam conjecture EH[𝜗] for all 0<𝜗<1. Then, we have the following improvements:

(vii) DHL[54;3].

(viii) DHL[5,511;4].

(ix) DHL[41,588;5].

(x) DHL[309,661;6].

(xi) DHL[k;m+1] whenever $m \ge 1$ and $k \ge C \exp(2m)$ for some sufficiently large absolute (and effective) constant $C$.

Assume the generalized Elliott-Halberstam conjecture GEH[𝜗] for all 0<𝜗<1. Then

(xii) DHL[3;2].

(xiii) DHL[51;3].

Theorem 4 then follows from Theorem 16 and the following bounds on H(k) (ordered by increasing value of k):

Theorem 17(Bounds on H(k)).

(xii) H(3)=6.

(i) H(50)=246.

(xiii) H(51)=252.

(vii) H(54)=270.

(viii) H(5,511)≤52,116.

(ii) H(35,410)≤398,130.

(ix) H(41,588)≤474,266.

(x) H(309,661)≤4,137,854.

(iii) H(1,649,821)≤24,797,814.

(iv) H(75,845,707)≤1,431,556,072.

(v) H(3,473,955,908)≤80,550,202,480.

(vi), (xi) In the asymptotic limit $k \to \infty$, one has $H(k) \le k \log k + k \log\log k - k + o(k)$, with the bounds on the decay rate $o(k)$ being effective.

We prove Theorem 17 in the ‘Narrow admissible tuples’ section. In the opposite direction, an application of the Brun-Titchmarsh theorem gives $H(k) \ge \left(\frac{1}{2} + o(1)\right) k \log k$ as $k \to \infty$ (see [4, §3.9] for this bound, as well as some slight refinements).

The proof of Theorem 16 follows the Goldston-Pintz-Yıldırım strategy that was also used in all previous progress on this problem (e.g. [2],[3],[5],[27]), namely that of constructing a sieve function adapted to an admissible $k$-tuple with good properties. More precisely, we set
\[ w := \log\log\log x \]
and
\[ W := \prod_{p \le w} p, \]
and observe the crude bound
\[ W \ll (\log\log x)^{O(1)}. \tag{11} \]

We have the following simple ‘pigeonhole principle’ criterion for DHL[k;m+1] (cf. [Lemma 4.1], though the normalization here is slightly different):

Lemma 18(Criterion for DHL).

Let $k \ge 2$ and $m \ge 1$ be fixed integers and define the normalization constant
\[ B := \frac{\varphi(W)}{W} \log x. \tag{12} \]
Suppose that for each fixed admissible $k$-tuple $(h_1, \dots, h_k)$ and each residue class $b\ (W)$ such that $b + h_i$ is coprime to $W$ for all $i = 1, \dots, k$, one can find a non-negative weight function $\nu : \mathbb{N} \to \mathbb{R}^{+}$ and fixed quantities $\alpha > 0$ and $\beta_1, \dots, \beta_k \ge 0$, such that one has the asymptotic upper bound
\[ \sum_{\substack{x \le n \le 2x \\ n = b\ (W)}} \nu(n) \le (\alpha + o(1))\, B^{-k} \frac{x}{W}, \tag{13} \]
the asymptotic lower bound
\[ \sum_{\substack{x \le n \le 2x \\ n = b\ (W)}} \nu(n)\, \theta(n + h_i) \ge (\beta_i - o(1))\, B^{1-k} \frac{x}{\varphi(W)} \tag{14} \]
for all $i = 1, \dots, k$, and the key inequality
\[ \frac{\beta_1 + \dots + \beta_k}{\alpha} > m. \tag{15} \]

Then, DHL[ k;m+1] holds.

Proof.

Let $(h_1, \dots, h_k)$ be a fixed admissible $k$-tuple. Since it is admissible, there is at least one residue class $b\ (W)$ such that $(b + h_i, W) = 1$ for all $h_i$. For an arithmetic function $\nu$ as in the lemma, we consider the quantity
\[ N := \sum_{\substack{x \le n \le 2x \\ n = b\ (W)}} \nu(n) \left( \sum_{i=1}^{k} \theta(n + h_i) - m \log 3x \right). \]
Combining (13) and (14), we obtain the lower bound
\[ N \ge (\beta_1 + \dots + \beta_k - o(1))\, B^{1-k} \frac{x}{\varphi(W)} - (m\alpha + o(1))\, B^{-k} \frac{x}{W} \log 3x. \]

From (12) and the crucial condition (15), it follows that $N > 0$ if $x$ is sufficiently large.

On the other hand, the sum
\[ \sum_{i=1}^{k} \theta(n + h_i) - m \log 3x \]

can be positive only if $n + h_i$ is prime for at least $m+1$ indices $i = 1, \dots, k$. We conclude that, for all sufficiently large $x$, there exists some integer $n \in [x, 2x]$ such that $n + h_i$ is prime for at least $m+1$ values of $i = 1, \dots, k$.

Since (h1,…,h k ) is an arbitrary admissible k-tuple, DHL[ k;m+1] follows.
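The pigeonhole step in this proof rests on the fact that each term $\theta(n+h_i)$ is at most $\log 3x$ once $n \le 2x$ and $h_i \le x$, so the bracketed sum can only be positive when more than $m$ of the terms are non-zero. The sketch below checks this on a small range; the parameters x=1000, the tuple (0,2,6), and m=1 are illustrative assumptions, and the helper names are our own.

```python
import math

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, math.isqrt(n) + 1))

def theta(n):
    """theta(n) = log n if n is prime, else 0."""
    return math.log(n) if is_prime(n) else 0.0

def excess(n, h, m, x):
    """The quantity sum_i theta(n + h_i) - m log(3x) from the proof of Lemma 18."""
    return sum(theta(n + hi) for hi in h) - m * math.log(3 * x)

def count_primes(n, h):
    return sum(is_prime(n + hi) for hi in h)

# Illustrative parameters (assumptions, chosen so that every h_i <= x).
x, h, m = 1000, (0, 2, 6), 1
hits = [n for n in range(x, 2 * x + 1) if excess(n, h, m, x) > 0]
```

Every `n` in `hits` has at least `m + 1` primes among the shifts `n + h_i`; the lemma supplies a non-negative weight ν under which such `n` must carry positive mass.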

The objective is then to construct non-negative weights $\nu$ whose associated ratio $\frac{\beta_1 + \dots + \beta_k}{\alpha}$ has provable lower bounds that are as large as possible. Our sieve majorants will be a variant of the multidimensional Selberg sieves used in [5]. As with all Selberg sieves, the $\nu$ are constructed as the square of certain (signed) divisor sums. The divisor sums we will use will be finite linear combinations of products of ‘one-dimensional’ divisor sums. More precisely, for any fixed smooth compactly supported function $F : [0, +\infty) \to \mathbb{R}$, define the divisor sum $\lambda_F : \mathbb{N} \to \mathbb{R}$ by the formula
\[ \lambda_F(n) := \sum_{d|n} \mu(d)\, F(\log_x d) \tag{16} \]
where $\log_x$ denotes the base $x$ logarithm
\[ \log_x n := \frac{\log n}{\log x}. \tag{17} \]

One should think of $\lambda_F$ as a smoothed out version of the indicator function of numbers $n$ which are ‘almost prime’ in the sense that they have no prime factors less than $x^{\varepsilon}$ for some small fixed $\varepsilon > 0$ (see Proposition 14 for a more rigorous version of this heuristic).
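A small numerical sketch of the divisor sum (16) may help fix ideas. The cutoff `bump` below is one convenient smooth function supported on [0,1] with value 1 at 0; it is our own illustrative choice and not a cutoff used in the paper, as are the parameter x=100 and the helper names.

```python
import math

def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def bump(t):
    """A smooth function on [0, +inf) supported on [0, 1], with bump(0) = 1 (illustrative)."""
    return math.exp(1.0 - 1.0 / (1.0 - t * t)) if 0.0 <= t < 1.0 else 0.0

def lambda_F(n, x, F=bump):
    """Divisor sum (16): sum over d | n of mu(d) * F(log d / log x)."""
    return sum(mobius(d) * F(math.log(d) / math.log(x))
               for d in range(1, n + 1) if n % d == 0)

x = 100
at_prime = lambda_F(101, x)        # 101 is a prime >= x
at_small_factor = lambda_F(202, x) # 202 = 2 * 101 has the small prime factor 2
```

For a prime $p \ge x$ the only divisors are 1 and $p$, and $F(\log_x p) = 0$, so `at_prime` equals `bump(0) = 1`; for 202 the contribution of the divisor 2 nearly cancels that of the divisor 1, so `at_small_factor` is close to zero.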

The functions $\nu$ we will use will take the form
\[ \nu(n) = \left( \sum_{j=1}^{J} c_j\, \lambda_{F_{j,1}}(n + h_1) \cdots \lambda_{F_{j,k}}(n + h_k) \right)^{2} \tag{18} \]

for some fixed natural number $J$, fixed coefficients $c_1, \dots, c_J \in \mathbb{R}$ and fixed smooth compactly supported functions $F_{j,i} : [0, +\infty) \to \mathbb{R}$ with $j = 1, \dots, J$ and $i = 1, \dots, k$. (One can of course absorb the constant $c_j$ into one of the $F_{j,i}$ if one wishes.) Informally, $\nu$ is a smooth restriction to those $n$ for which $n + h_1, \dots, n + h_k$ are all almost prime.

Clearly, $\nu$ is a (positive-definite) linear combination of functions of the form
\[ n \mapsto \prod_{i=1}^{k} \lambda_{F_i}(n + h_i)\, \lambda_{G_i}(n + h_i) \]
for various smooth functions $F_1, \dots, F_k, G_1, \dots, G_k : [0, +\infty) \to \mathbb{R}$. The sum appearing in (13) can thus be decomposed into linear combinations of sums of the form
\[ \sum_{\substack{x \le n \le 2x \\ n = b\ (W)}} \prod_{i=1}^{k} \lambda_{F_i}(n + h_i)\, \lambda_{G_i}(n + h_i). \tag{19} \]
Also, since from (16) we clearly have
\[ \lambda_F(n) = F(0) \tag{20} \]
when $n \ge x$ is prime and $F$ is supported on $[0,1]$, the sum appearing in (14) can be similarly decomposed into linear combinations of sums of the form
\[ \sum_{\substack{x \le n \le 2x \\ n = b\ (W)}} \theta(n + h_i) \prod_{1 \le i' \le k;\ i' \ne i} \lambda_{F_{i'}}(n + h_{i'})\, \lambda_{G_{i'}}(n + h_{i'}). \tag{21} \]
To estimate the sums (21), we use the following asymptotic, proven in the ‘Multidimensional Selberg sieves’ section. For each compactly supported $F : [0, +\infty) \to \mathbb{R}$, let
\[ S(F) := \sup \{ x \ge 0 : F(x) \ne 0 \} \tag{22} \]

denote the upper range of the support of $F$ (with the convention that $S(0) = 0$).

Theorem 19(Asymptotic for prime sums).

Let $k \ge 2$ be fixed, let $(h_1, \dots, h_k)$ be a fixed admissible $k$-tuple, and let $b\ (W)$ be such that $b + h_i$ is coprime to $W$ for each $i = 1, \dots, k$. Let $1 \le i_0 \le k$ be fixed, and for each $1 \le i \le k$ distinct from $i_0$, let $F_i, G_i : [0, +\infty) \to \mathbb{R}$ be fixed smooth compactly supported functions. Assume one of the following hypotheses:

(i) (Elliott-Halberstam) There exists a fixed $0 < \vartheta < 1$ such that EH[𝜗] holds and such that
\[ \sum_{1 \le i \le k;\ i \ne i_0} \left( S(F_i) + S(G_i) \right) < \vartheta. \tag{23} \]
(ii) (Motohashi-Pintz-Zhang) There exists fixed $0 \le \varpi < 1/4$ and $\delta > 0$ such that MPZ[ϖ,δ] holds and such that
\[ \sum_{1 \le i \le k;\ i \ne i_0} \left( S(F_i) + S(G_i) \right) < \frac{1}{2} + 2\varpi \tag{24} \]
and
\[ \max_{1 \le i \le k;\ i \ne i_0} \left\{ S(F_i), S(G_i) \right\} < \delta. \tag{25} \]
Then, we have
\[ \sum_{\substack{x \le n \le 2x \\ n = b\ (W)}} \theta(n + h_{i_0}) \prod_{1 \le i \le k;\ i \ne i_0} \lambda_{F_i}(n + h_i)\, \lambda_{G_i}(n + h_i) = (c + o(1))\, B^{1-k} \frac{x}{\varphi(W)} \tag{26} \]
where
\[ c := \prod_{1 \le i \le k;\ i \ne i_0} \left( \int_0^1 F_i'(t_i)\, G_i'(t_i)\, dt_i \right). \]

Here of course $F'$ denotes the derivative of $F$.

To estimate the sums (19), we use the following asymptotic, also proven in the ‘Multidimensional Selberg sieves’ section.

Theorem 20(Asymptotic for non-prime sums).

Let $k \ge 1$ be fixed, let $(h_1, \dots, h_k)$ be a fixed admissible $k$-tuple, and let $b\ (W)$ be such that $b + h_i$ is coprime to $W$ for each $i = 1, \dots, k$. For each fixed $1 \le i \le k$, let $F_i, G_i : [0, +\infty) \to \mathbb{R}$ be fixed smooth compactly supported functions. Assume one of the following hypotheses:

(i) (Trivial case) One has
\[ \sum_{i=1}^{k} \left( S(F_i) + S(G_i) \right) < 1. \tag{27} \]
(ii) (Generalized Elliott-Halberstam) There exists a fixed $0 < \vartheta < 1$ and $i_0 \in \{1, \dots, k\}$ such that GEH[𝜗] holds, and
\[ \sum_{1 \le i \le k;\ i \ne i_0} \left( S(F_i) + S(G_i) \right) < \vartheta. \tag{28} \]
Then, we have
\[ \sum_{\substack{x \le n \le 2x \\ n = b\ (W)}} \prod_{i=1}^{k} \lambda_{F_i}(n + h_i)\, \lambda_{G_i}(n + h_i) = (c + o(1))\, B^{-k} \frac{x}{W}, \tag{29} \]
where
\[ c := \prod_{i=1}^{k} \left( \int_0^1 F_i'(t_i)\, G_i'(t_i)\, dt_i \right). \tag{30} \]

A key point in (ii) is that no upper bound on $S(F_{i_0})$ or $S(G_{i_0})$ is required (although, as we will see in the ‘The generalized Elliott-Halberstam case’ section, the result is a little easier to prove when one has $S(F_{i_0}) + S(G_{i_0}) < 1$). This flexibility in the $F_{i_0}, G_{i_0}$ functions will be particularly crucial to obtain part (xii) of Theorem 16 and Theorem 4.

Remark 21.

Theorems 19 and 20 can be viewed as probabilistic assertions of the following form: if $n$ is chosen uniformly at random from the set $\{ x \le n \le 2x : n = b\ (W) \}$, then the random variables $\theta(n + h_i)$ and $\lambda_{F_j}(n + h_j)\, \lambda_{G_j}(n + h_j)$ for $i, j = 1, \dots, k$ have mean $(1 + o(1)) \frac{W}{\varphi(W)}$ and $\left( \int_0^1 F_j'(t)\, G_j'(t)\, dt + o(1) \right) B^{-1}$, respectively, and furthermore, these random variables enjoy a limited amount of independence, except for the fact (as can be seen from (20)) that $\theta(n + h_i)$ and $\lambda_{F_i}(n + h_i)\, \lambda_{G_i}(n + h_i)$ are highly correlated. Note though that we do not have asymptotics for any sum which involves two or more factors of $\theta$, as such estimates are of a difficulty at least as great as that of the twin prime conjecture (which is equivalent to the divergence of the sum $\sum_n \theta(n)\, \theta(n+2)$).

Theorems 19 and 20 may be combined with Lemma 18 to reduce the task of establishing estimates of the form DHL[k;m+1] to that of establishing certain variational problems. For instance, in the ‘Proof of Theorem 22’ section, we reprove the following result of Maynard ([5, Proposition 4.2]):

Theorem 22(Sieving on the standard simplex).

Let $k \ge 2$ and $m \ge 1$ be fixed integers. For any fixed compactly supported square-integrable function $F : [0,+\infty)^k \to \mathbb{R}$, define the functionals
$$ I(F) := \int_{[0,+\infty)^k} F(t_1,\dots,t_k)^2 \, dt_1 \dots dt_k \tag{31} $$
and
$$ J_i(F) := \int_{[0,+\infty)^{k-1}} \left( \int_0^\infty F(t_1,\dots,t_k)\, dt_i \right)^2 dt_1 \dots dt_{i-1}\, dt_{i+1} \dots dt_k \tag{32} $$
for $i=1,\dots,k$, and let $M_k$ be the supremum
$$ M_k := \sup \frac{\sum_{i=1}^k J_i(F)}{I(F)} \tag{33} $$
over all square-integrable functions $F$ that are supported on the simplex
$$ \mathcal{R}_k := \{ (t_1,\dots,t_k) \in [0,+\infty)^k : t_1 + \dots + t_k \le 1 \} $$
and are not identically zero (up to almost everywhere equivalence, of course). Suppose that there is a fixed $0 < \vartheta < 1$ such that EH[$\vartheta$] holds and such that
$$ M_k > \frac{2m}{\vartheta}. $$

Then, DHL[k;m+1] holds.
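
To make the functionals in (31)-(33) concrete, the following is a minimal numerical sketch for the case $k=2$ only. It is not the method used in the paper to bound $M_k$; it simply evaluates the ratio in (33) for a given test function with a crude midpoint rule. The names `simplex_ratio_k2`, `indicator`, and `linear` are hypothetical, and $F$ is assumed to vanish outside the simplex $t_1 + t_2 \le 1$.

```python
def simplex_ratio_k2(F, n=200):
    """Approximate (J_1(F) + J_2(F)) / I(F) for k = 2 by a midpoint rule.

    F is assumed to vanish outside the simplex t1 + t2 <= 1.
    """
    h = 1.0 / n
    pts = [(a + 0.5) * h for a in range(n)]
    # Tabulate F on the midpoint grid of [0, 1]^2.
    vals = [[F(t1, t2) for t2 in pts] for t1 in pts]
    I = sum(v * v for row in vals for v in row) * h * h
    # J_1 integrates out t1 first, then squares and integrates over t2.
    J1 = sum((sum(vals[a][b] for a in range(n)) * h) ** 2 for b in range(n)) * h
    # J_2 integrates out t2 first, then squares and integrates over t1.
    J2 = sum((sum(vals[a][b] for b in range(n)) * h) ** 2 for a in range(n)) * h
    return (J1 + J2) / I


def indicator(t1, t2):
    """Indicator function of the simplex R_2."""
    return 1.0 if t1 + t2 <= 1.0 else 0.0


def linear(t1, t2):
    """The function 1 - t1 - t2 on R_2, extended by zero."""
    return max(0.0, 1.0 - t1 - t2)
```

A direct calculation gives the exact ratios $4/3$ for the indicator and $6/5$ for the linear function, so both test functions only give trivial lower bounds on $M_2$; the sketch is meant to illustrate the definitions, not to produce competitive bounds.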

Parts (vii)-(xi) of Theorem 16 (and hence Theorem 4) are then immediate from the following results, proven in the ‘Asymptotic analysis’ and ‘The case of small and medium dimension’ sections, and ordered by increasing value of k:

Theorem 23(Lower bounds on $M_k$).

(vii) $M_{54} > 4.00238$.

(viii) $M_{5{,}511} > 6$.

(ix) $M_{41{,}588} > 8$.

(x) $M_{309{,}661} > 10$.

(xi) One has $M_k \ge \log k - C$ for all $k \ge C$, where $C$ is an absolute (and effective) constant.

For the sake of comparison, in ([5], Proposition 4.3), it was shown that $M_5 > 2$, $M_{105} > 4$, and $M_k \ge \log k - 2\log\log k - 2$ for all sufficiently large $k$. As remarked in that paper, the sieves used on the bounded gap problem prior to the work in [5] would essentially correspond, in this notation, to the choice of functions $F$ of the special form $F(t_1,\dots,t_k) := f(t_1 + \dots + t_k)$, which severely limits the size of the ratio in (33) (in particular, the analogue of $M_k$ in this special case cannot exceed 4, as shown in [36]).

In the converse direction, in Corollary 37, we will also show the upper bound $M_k \le \frac{k}{k-1}\log k$ for all $k \ge 2$, which shows in particular that the bounds in (vii) and (xi) of the above theorem cannot be significantly improved. We remark that Theorem 23(vii) and the Bombieri-Vinogradov theorem also give a weaker version DHL[54;2] of Theorem 16(i).
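
The interplay between the lower bound in Theorem 23(vii) and the upper bound $\frac{k}{k-1}\log k$ can be checked with a few lines of arithmetic. The sketch below is only an illustration; the helper names `upper_bound` and `smallest_k_allowing` are hypothetical. It uses the fact that the Bombieri-Vinogradov theorem gives EH[$\vartheta$] only for $\vartheta < 1/2$, so that for $m=1$ the condition $M_k > 2m/\vartheta$ of Theorem 22 forces $M_k > 4$.

```python
import math


def upper_bound(k):
    """The upper bound (k / (k - 1)) * log k on M_k quoted from Corollary 37."""
    return k / (k - 1) * math.log(k)


def smallest_k_allowing(target):
    """Smallest k >= 2 for which the upper bound on M_k exceeds `target`."""
    k = 2
    while upper_bound(k) <= target:
        k += 1
    return k
```

With these definitions, `smallest_k_allowing(4.0)` evaluates to 51, so Theorem 22 combined with Bombieri-Vinogradov alone cannot reach any $k \le 50$, and `upper_bound(54)` lies only slightly above the lower bound 4.00238 of Theorem 23(vii).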

We also have a variant of Theorem 22 which can accept inputs of the form MPZ[$\varpi,\delta$]:

Theorem 24(Sieving on a truncated simplex).

Let $k \ge 2$ and $m \ge 1$ be fixed integers. Let $0 < \varpi < 1/4$ and $0 < \delta < 1/2$ be such that MPZ[$\varpi,\delta$] holds. For any $\alpha > 0$, let $M_k^{[\alpha]}$ be defined as in (33), but where the supremum now ranges over all square-integrable $F$ that are supported in the truncated simplex
$$ \{ (t_1,\dots,t_k) \in [0,\alpha]^k : t_1 + \dots + t_k \le 1 \} \tag{34} $$
and are not identically zero. If
$$ M_k^{\left[\frac{\delta}{1/4+\varpi}\right]} > \frac{m}{1/4+\varpi}, $$

then DHL[k;m+1] holds.

In the ‘Asymptotic analysis’ section, we will establish the following variant of Theorem 23, which when combined with Theorem 11, allows one to use Theorem 24 to establish parts (ii)-(vi) of Theorem 16 (and hence Theorem 4):

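Theorem 25(Lower bounds on $M_k^{[\alpha]}$).

(ii) There exist $\delta,\varpi > 0$ with $600\varpi + 180\delta < 7$ and $M_{35{,}410}^{\left[\frac{\delta}{1/4+\varpi}\right]} > \frac{2}{1/4+\varpi}$.

(iii) There exist $\delta,\varpi > 0$ with $600\varpi + 180\delta < 7$ and $M_{1{,}649{,}821}^{\left[\frac{\delta}{1/4+\varpi}\right]} > \frac{3}{1/4+\varpi}$.

(iv) There exist $\delta,\varpi > 0$ with $600\varpi + 180\delta < 7$ and $M_{75{,}845{,}707}^{\left[\frac{\delta}{1/4+\varpi}\right]} > \frac{4}{1/4+\varpi}$.

(v) There exist $\delta,\varpi > 0$ with $600\varpi + 180\delta < 7$ and $M_{3{,}473{,}955{,}908}^{\left[\frac{\delta}{1/4+\varpi}\right]} > \frac{5}{1/4+\varpi}$.

(vi) For all $k \ge C$, there exist $\delta,\varpi > 0$ with $600\varpi + 180\delta < 7$, $\varpi \ge \frac{7}{600} - \frac{C}{\log k}$, and $M_k^{\left[\frac{\delta}{1/4+\varpi}\right]} \ge \log k - C$ for some absolute (and effective) constant $C$.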

The implication is clear for (ii)-(v). For (vi), observe that from Theorem 25(vi), Theorem 11, and Theorem 24, we see that DHL[k;m+1] holds whenever $k$ is sufficiently large and
$$ m \le (\log k - C)\left(\frac{1}{4} + \frac{7}{600} - \frac{C}{\log k}\right), $$
which is in particular implied by
$$ m \le \frac{\log k}{4 - \frac{28}{157}} - C' $$

for some absolute constant $C'$, giving Theorem 16(vi).
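
The constant $4 - \frac{28}{157}$ is just the reciprocal of $\frac{1}{4} + \frac{7}{600}$, the limiting value of $\frac14 + \varpi$ allowed by the constraint $600\varpi + 180\delta < 7$ as $\delta \to 0$. The following short check, using exact rational arithmetic, makes this explicit; the variable names are chosen here for illustration only.

```python
from fractions import Fraction

# Limiting value of varpi permitted by 600*varpi + 180*delta < 7 as delta -> 0.
varpi_limit = Fraction(7, 600)

# The factor 1/4 + varpi multiplying log k in the condition on m.
level = Fraction(1, 4) + varpi_limit

# log k has to grow like (1 / level) * m, which is the exponent in the H_m bound.
growth = 1 / level
```

Here `level` equals $157/600$ and `growth` equals $600/157 = 4 - 28/157$, matching the exponent that appears in the asymptotic bound for $H_m$ in the abstract.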

Now we give a more flexible variant of Theorem 22, in which the support of F is enlarged, at the cost of reducing the range of integration of the J i .

Theorem 26(Sieving on an epsilon-enlarged simplex).

Let $k \ge 2$ and $m \ge 1$ be fixed integers, and let $0 < \varepsilon < 1$ be fixed also. For any fixed compactly supported square-integrable function $F : [0,+\infty)^k \to \mathbb{R}$, define the functionals
$$ J_{i,1-\varepsilon}(F) := \int_{(1-\varepsilon)\cdot\mathcal{R}_{k-1}} \left( \int_0^\infty F(t_1,\dots,t_k)\, dt_i \right)^2 dt_1 \dots dt_{i-1}\, dt_{i+1} \dots dt_k $$
for $i=1,\dots,k$, and let $M_{k,\varepsilon}$ be the supremum
$$ M_{k,\varepsilon} := \sup \frac{\sum_{i=1}^k J_{i,1-\varepsilon}(F)}{I(F)} $$
over all square-integrable functions $F$ that are supported on the simplex
$$ (1+\varepsilon)\cdot\mathcal{R}_k = \{ (t_1,\dots,t_k) \in [0,+\infty)^k : t_1 + \dots + t_k \le 1+\varepsilon \} $$

and are not identically zero. Suppose that there is a fixed $0 < \vartheta < 1$ such that one of the following two hypotheses holds:

(i) EH[$\vartheta$] holds, and $1+\varepsilon < \frac{1}{\vartheta}$.

(ii) GEH[$\vartheta$] holds, and $\varepsilon < \frac{1}{k-1}$.

If
$$ M_{k,\varepsilon} > \frac{2m}{\vartheta}, $$

then DHL[k;m+1] holds.

We prove this theorem in the ‘Proof of Theorem 26’ section. We remark that due to the continuity of $M_{k,\varepsilon}$ in $\varepsilon$, the strict inequalities in (i) and (ii) of this theorem may be replaced by non-strict inequalities. Parts (i) and (xiii) of Theorem 16, and a weaker version DHL[4;2] of part (xii), then follow from Theorem 9 and the following computations, proven in the ‘Bounding $M_{k,\varepsilon}$ for medium $k$’ and ‘Bounding $M_{4,\varepsilon}$’ sections:

Theorem 27(Lower bounds on $M_{k,\varepsilon}$).

(i) $M_{50,1/25} > 4.0043$.

(xii’) $M_{4,0.168} > 2.00558$.

(xiii) $M_{51,1/50} > 4.00156$.

We remark that computations in the proof of Theorem 27(xii’) are simple enough that the bound may be checked by hand, without use of a computer. The computations used to establish the full strength of Theorem 16(xii) are however significantly more complicated.
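
As a sanity check on how the numerical values in Theorem 27 feed into Theorem 26, the sketch below computes, for each listed bound, the smallest $\vartheta$ for which the inequality $M_{k,\varepsilon} > 2m/\vartheta$ can hold, and tests the two support hypotheses. It assumes $m=1$ throughout (this is stated in the text for the DHL[4;2] case and is taken as an assumption for the other two entries); the names `BOUNDS`, `min_theta`, `eh_support_ok`, and `geh_support_ok` are hypothetical.

```python
from fractions import Fraction

# (label, k, epsilon, lower bound on M_{k,epsilon}) as listed in Theorem 27.
BOUNDS = [
    ("i", 50, Fraction(1, 25), 4.0043),
    ("xii'", 4, Fraction(168, 1000), 2.00558),
    ("xiii", 51, Fraction(1, 50), 4.00156),
]


def min_theta(M, m=1):
    """Infimum of the theta for which M > 2m/theta holds, namely 2m/M."""
    return 2 * m / M


def eh_support_ok(eps, theta):
    """Hypothesis (i) of Theorem 26: 1 + eps < 1/theta."""
    return 1 + eps < 1 / theta


def geh_support_ok(k, eps):
    """Hypothesis (ii) of Theorem 26: eps < 1/(k - 1)."""
    return eps < Fraction(1, k - 1)
```

Under these assumptions, the entries for $k=50$ and $k=51$ need $\vartheta$ only slightly below $1/2$, whereas the $k=4$ entry needs $\vartheta$ very close to 1. For $k=50$ the value $\varepsilon = 1/25$ exceeds $1/(k-1) = 1/49$, so only hypothesis (i) is applicable there, while $\varepsilon = 0.168 < 1/3$ is compatible with hypothesis (ii) when $k=4$.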

In fact, we may enlarge the support of F further. We give a version corresponding to part (ii) of Theorem 26; there is also a version corresponding to part (i), but we will not give it here as we will not have any use for it.

Theorem 28(Going beyond the epsilon enlargement).

Let $k \ge 2$ and $m \ge 1$ be fixed integers, let $0 < \vartheta < 1$ be a fixed quantity such that GEH[$\vartheta$] holds, and let $0 < \varepsilon < \frac{1}{k-1}$ be fixed also. Suppose that there is a fixed non-zero square-integrable function $F : [0,+\infty)^k \to \mathbb{R}$ supported in $\frac{k}{k-1}\cdot\mathcal{R}_k$, such that for $i=1,\dots,k$, one has the vanishing marginal condition
$$ \int_0^\infty F(t_1,\dots,t_k)\, dt_i = 0 \tag{35} $$
whenever $t_1,\dots,t_{i-1},t_{i+1},\dots,t_k \ge 0$ are such that
$$ t_1 + \dots + t_{i-1} + t_{i+1} + \dots + t_k > 1+\varepsilon. $$
Suppose that we also have the inequality
$$ \frac{\sum_{i=1}^k J_{i,1-\varepsilon}(F)}{I(F)} > \frac{2m}{\vartheta}. $$

Then DHL[k;m+1] holds.

This theorem is proven in the ‘Proof of Theorem 28’ section. Theorem 16(xii) is then an immediate consequence of Theorem 28 and the following numerical fact, established in the ‘Three-dimensional cutoffs’ section.

Theorem 29(A piecewise polynomial cutoff).

Set $\varepsilon := \frac{1}{4}$. Then, there exists a piecewise polynomial function $F : [0,+\infty)^3 \to \mathbb{R}$ supported on the simplex
$$ \frac{3}{2}\cdot\mathcal{R}_3 = \left\{ (t_1,t_2,t_3) \in [0,+\infty)^3 : t_1 + t_2 + t_3 \le \frac{3}{2} \right\} $$
and symmetric in the $t_1,t_2,t_3$ variables, such that $F$ is not identically zero and obeys the vanishing marginal condition
$$ \int_0^\infty F(t_1,t_2,t_3)\, dt_3 = 0 $$
whenever $t_1,t_2 \ge 0$ with $t_1 + t_2 > 1+\varepsilon$ and such that
$$ \frac{3 \int_{t_1+t_2 \le 1-\varepsilon} \left( \int_0^\infty F(t_1,t_2,t_3)\, dt_3 \right)^2 dt_1\, dt_2}{\int_{[0,\infty)^3} F(t_1,t_2,t_3)^2\, dt_1\, dt_2\, dt_3} > 2. $$

There are several other ways to combine Theorems 19 and 20 with equidistribution theorems on the primes to obtain results of the form DHL[k;m+1], but all of our attempts to do so either did not improve the numerology or else were numerically infeasible to implement.

Multidimensional Selberg sieves

In this section, we prove Theorems 19 and 20. A key asymptotic used in both theorems is the following:

Lemma 30(Asymptotic).

Let $k \ge 1$ be a fixed integer, and let $N$ be a natural number coprime to $W$ with $\log N = O(\log^{O(1)} x)$. Let $F_1,\dots,F_k,G_1,\dots,G_k : [0,+\infty) \to \mathbb{R}$ be fixed smooth compactly supported functions. Then,
$$ \sum_{\substack{d_1,\dots,d_k,d_1',\dots,d_k' \\ [d_1,d_1'],\dots,[d_k,d_k'],W,N \text{ coprime}}} \prod_{j=1}^k \frac{\mu(d_j)\mu(d_j') F_j(\log_x d_j) G_j(\log_x d_j')}{[d_j,d_j']} = (c + o(1)) B^{-k} \frac{N^k}{\varphi(N)^k} \tag{36} $$
where $B$ was defined in (12), and
$$ c := \prod_{j=1}^k \int_0^\infty F_j'(t_j) G_j'(t_j)\, dt_j. $$

The same claim holds if the denominators $[d_j,d_j']$ are replaced by $\varphi([d_j,d_j'])$.

Such asymptotics are standard in the literature (see, e.g. [37] for some similar computations). In older literature, it is common to establish these asymptotics via contour integration (e.g. via Perron’s formula), but we will use the Fourier analytic approach here. Of course, both approaches ultimately use the same input, namely the simple pole of the Riemann zeta function at s=1.

Proof.

We begin with the first claim. For $j=1,\dots,k$, the functions $t \mapsto e^t F_j(t)$, $t \mapsto e^t G_j(t)$ may be extended to smooth compactly supported functions on all of $\mathbb{R}$, and so we have Fourier expansions
$$ e^t F_j(t) = \int_{\mathbb{R}} e^{-it\xi} f_j(\xi)\, d\xi \tag{37} $$
and
$$ e^t G_j(t) = \int_{\mathbb{R}} e^{-it\xi} g_j(\xi)\, d\xi $$

for some fixed functions $f_j, g_j : \mathbb{R} \to \mathbb{C}$ that are smooth and rapidly decreasing in the sense that $f_j(\xi), g_j(\xi) = O((1+|\xi|)^{-A})$ for any fixed $A > 0$ and all $\xi \in \mathbb{R}$ (here the implied constant is independent of $\xi$ and depends only on $A$).

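We may thus write
$$ F_j(\log_x d_j) = \int_{\mathbb{R}} \frac{f_j(\xi_j)}{d_j^{\frac{1+i\xi_j}{\log x}}}\, d\xi_j $$
and
$$ G_j(\log_x d_j') = \int_{\mathbb{R}} \frac{g_j(\xi_j')}{(d_j')^{\frac{1+i\xi_j'}{\log x}}}\, d\xi_j' $$
for all $d_j, d_j' \ge 1$. We note that
$$ \sum_{d_j,d_j'} \frac{|\mu(d_j)\mu(d_j')|}{[d_j,d_j']\, d_j^{1/\log x} (d_j')^{1/\log x}} = \prod_p \left( 1 + \frac{2}{p^{1+1/\log x}} + \frac{1}{p^{1+2/\log x}} \right) \le \exp(O(\log\log x)). $$
Therefore, if we substitute the Fourier expansions into the left-hand side of (36), the resulting expression is absolutely convergent. Thus, we can apply Fubini’s theorem, and the left-hand side of (36) can be rewritten as
$$ \int_{\mathbb{R}} \dots \int_{\mathbb{R}} K(\xi_1,\dots,\xi_k,\xi_1',\dots,\xi_k') \prod_{j=1}^k f_j(\xi_j) g_j(\xi_j')\, d\xi_j\, d\xi_j', \tag{38} $$
where
$$ K(\xi_1,\dots,\xi_k,\xi_1',\dots,\xi_k') := \sum_{\substack{d_1,\dots,d_k,d_1',\dots,d_k' \\ [d_1,d_1'],\dots,[d_k,d_k'],W,N \text{ coprime}}} \prod_{j=1}^k \frac{\mu(d_j)\mu(d_j')}{[d_j,d_j']\, d_j^{\frac{1+i\xi_j}{\log x}} (d_j')^{\frac{1+i\xi_j'}{\log x}}}. $$
This latter expression factorizes as an Euler product
$$ K = \prod_{p \nmid WN} K_p, $$
where the local factors $K_p$ are given by
$$ K_p(\xi_1,\dots,\xi_k,\xi_1',\dots,\xi_k') := 1 + \frac{1}{p} \sum_{\substack{d_1,\dots,d_k,d_1',\dots,d_k' \\ [d_1,\dots,d_k,d_1',\dots,d_k'] = p \\ [d_1,d_1'],\dots,[d_k,d_k'] \text{ coprime}}} \prod_{j=1}^k \frac{\mu(d_j)\mu(d_j')}{d_j^{\frac{1+i\xi_j}{\log x}} (d_j')^{\frac{1+i\xi_j'}{\log x}}}. \tag{39} $$
We can estimate each Euler factor as
$$ K_p(\xi_1,\dots,\xi_k,\xi_1',\dots,\xi_k') = \left( 1 + O\left(\frac{1}{p^2}\right) \right) \prod_{j=1}^k \frac{\left(1 - p^{-1-\frac{1+i\xi_j}{\log x}}\right)\left(1 - p^{-1-\frac{1+i\xi_j'}{\log x}}\right)}{1 - p^{-1-\frac{2+i\xi_j+i\xi_j'}{\log x}}}. \tag{40} $$
Since
$$ \prod_{p : p > w} \left( 1 + O\left(\frac{1}{p^2}\right) \right) = 1 + o(1), $$
we have
$$ K(\xi_1,\dots,\xi_k,\xi_1',\dots,\xi_k') = (1 + o(1)) \prod_{j=1}^k \frac{\zeta_{WN}\left(1 + \frac{2+i\xi_j+i\xi_j'}{\log x}\right)}{\zeta_{WN}\left(1 + \frac{1+i\xi_j}{\log x}\right) \zeta_{WN}\left(1 + \frac{1+i\xi_j'}{\log x}\right)} $$
where the modified zeta function $\zeta_{WN}$ is defined by the formula
$$ \zeta_{WN}(s) := \prod_{p \nmid WN} \left( 1 - \frac{1}{p^s} \right)^{-1} $$

for $\Re(s) > 1$.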

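For $\Re(s) \ge 1 + \frac{1}{\log x}$, we have the crude bounds
$$ |\zeta_{WN}(s)|, |\zeta_{WN}(s)|^{-1} \le \prod_p \left( 1 + \frac{1}{p^{1+1/\log x}} + O\left(\frac{1}{p^2}\right) \right) \ll \exp\left( \sum_p \frac{1}{p^{1+1/\log x}} \right) \le \exp(\log\log x + O(1)) \ll \log x. $$
Thus,
$$ K(\xi_1,\dots,\xi_k,\xi_1',\dots,\xi_k') = O(\log^{3k} x). $$
Combining this with the rapid decrease of $f_j, g_j$, we see that the contribution to (38) outside of the cube $\{\max(|\xi_1|,\dots,|\xi_k|,|\xi_1'|,\dots,|\xi_k'|) \le \sqrt{\log x}\}$ (say) is negligible. Thus, it will suffice to show that
$$ \int_{-\sqrt{\log x}}^{\sqrt{\log x}} \dots \int_{-\sqrt{\log x}}^{\sqrt{\log x}} K(\xi_1,\dots,\xi_k,\xi_1',\dots,\xi_k') \prod_{j=1}^k f_j(\xi_j) g_j(\xi_j')\, d\xi_j\, d\xi_j' = (c + o(1)) B^{-k} \frac{N^k}{\varphi(N)^k}. $$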
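When $|\xi_j| \le \sqrt{\log x}$, we see from the simple pole of the Riemann zeta function $\zeta(s) = \prod_p \left(1 - \frac{1}{p^s}\right)^{-1}$ at $s=1$ that
$$ \zeta\left( 1 + \frac{1+i\xi_j}{\log x} \right) = (1 + o(1)) \frac{\log x}{1+i\xi_j}. $$
For $-\sqrt{\log x} \le \xi_j \le \sqrt{\log x}$, we see that
$$ 1 - \frac{1}{p^{1+\frac{1+i\xi_j}{\log x}}} = 1 - \frac{1}{p} + O\left( \frac{\log p}{p \sqrt{\log x}} \right). $$
Since $\log WN \ll \log^{O(1)} x$, this gives
$$ \prod_{p | WN} \left( 1 - \frac{1}{p^{1+\frac{1+i\xi_j}{\log x}}} \right) = \frac{\varphi(WN)}{WN} \exp\left( O\left( \sum_{p | WN} \frac{\log p}{p \sqrt{\log x}} \right) \right) = (1 + o(1)) \frac{\varphi(WN)}{WN}, $$
since the sum is maximized when $WN$ is composed only of primes $p \ll \log^{O(1)} x$. Thus,
$$ \zeta_{WN}\left( 1 + \frac{1+i\xi_j}{\log x} \right) = \frac{(1 + o(1)) B \varphi(N)}{(1+i\xi_j) N}, $$
similarly with $1+i\xi_j$ replaced by $1+i\xi_j'$ or $2+i\xi_j+i\xi_j'$. We conclude that
$$ K(\xi_1,\dots,\xi_k,\xi_1',\dots,\xi_k') = (1 + o(1)) B^{-k} \frac{N^k}{\varphi(N)^k} \prod_{j=1}^k \frac{(1+i\xi_j)(1+i\xi_j')}{2+i\xi_j+i\xi_j'}. \tag{41} $$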
Therefore, it will suffice to show that
$$ \int_{\mathbb{R}} \dots \int_{\mathbb{R}} \prod_{j=1}^k \frac{(1+i\xi_j)(1+i\xi_j')}{2+i\xi_j+i\xi_j'} f_j(\xi_j) g_j(\xi_j')\, d\xi_j\, d\xi_j' = c, $$
since the errors caused by the $1+o(1)$ multiplicative factor in (41) or the truncation $|\xi_j|, |\xi_j'| \le \sqrt{\log x}$ can be seen to be negligible using the rapid decay of $f_j, g_j$. By Fubini’s theorem, it suffices to show that
$$ \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{(1+i\xi)(1+i\xi')}{2+i\xi+i\xi'} f_j(\xi) g_j(\xi')\, d\xi\, d\xi' = \int_0^{+\infty} F_j'(t) G_j'(t)\, dt $$
for each $j=1,\dots,k$. But from dividing (37) by $e^t$ and differentiating under the integral sign, we have
$$ F_j'(t) = -\int_{\mathbb{R}} (1+i\xi) e^{-t(1+i\xi)} f_j(\xi)\, d\xi, $$

and the claim then follows from Fubini’s theorem.

Finally, suppose that we replace $[d_j,d_j']$ with $\varphi([d_j,d_j'])$. An inspection of the above argument shows that the only change that occurs is that the $\frac{1}{p}$ term in (39) is replaced by $\frac{1}{p-1}$; but this modification may be absorbed into the $1 + O\left(\frac{1}{p^2}\right)$ factor in (40), and the rest of the argument continues as before.
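
The passage from the exact local factor (39) to the approximation (40) can be checked numerically in the simplest case $k=1$. The sketch below is an illustration only, with hypothetical function names; the complex parameters `a` and `b` stand for $\frac{1+i\xi}{\log x}$ and $\frac{1+i\xi'}{\log x}$ and are assumed to have non-negative real part. For $k=1$ the pairs $(d,d')$ with $[d,d'] = p$ are $(p,1)$, $(1,p)$, and $(p,p)$.

```python
def local_factor(p, a, b):
    """Exact K_p from (39) for k = 1.

    The three pairs (p, 1), (1, p), (p, p) contribute
    -p^(-a), -p^(-b) and +p^(-a-b) respectively, all divided by p.
    """
    return 1 + (-p ** (-a) - p ** (-b) + p ** (-a - b)) / p


def euler_approx(p, a, b):
    """Main factor on the right-hand side of (40) for k = 1."""
    return (1 - p ** (-1 - a)) * (1 - p ** (-1 - b)) / (1 - p ** (-1 - a - b))


def scaled_error(p, a, b):
    """p^2 * |K_p / approx - 1|, which (40) asserts stays bounded in p."""
    return p * p * abs(local_factor(p, a, b) / euler_approx(p, a, b) - 1)
```

For instance, evaluating `scaled_error(p, 0.1 + 0.3j, 0.1 - 0.7j)` over a range of primes `p` gives values that stay bounded, consistent with the $1 + O(1/p^2)$ factor; when `a = b = 0` the two expressions agree exactly, both being $1 - 1/p$.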

4.1 The trivial case

We can now prove the easiest case of the two theorems, namely case (i) of Theorem 20; a closely related estimate also appears in ([5], Lemma 6.2). We may assume that $x$ is sufficiently large depending on all fixed quantities. By (16), the left-hand side of (29) may be expanded as
$$ \sum_{d_1,\dots,d_k,d_1',\dots,d_k'} \left( \prod_{i=1}^k \mu(d_i)\mu(d_i') F_i(\log_x d_i) G_i(\log_x d_i') \right) S(d_1,\dots,d_k,d_1',\dots,d_k') \tag{42} $$
where
$$ S(d_1,\dots,d_k,d_1',\dots,d_k') := \sum_{\substack{x \le n \le 2x \\ n = b\ (W) \\ n + h_i = 0\ ([d_i,d_i'])\ \forall i}} 1. $$
By hypothesis, $b+h_i$ is coprime to $W$ for all $i=1,\dots,k$, and $|h_i - h_j| < w$ for all distinct $i,j$. Thus, $S(d_1,\dots,d_k,d_1',\dots,d_k')$ vanishes unless the $[d_i,d_i']$ are coprime to each other and to $W$. In this case, $S(d_1,\dots,d_k,d_1',\dots,d_k')$ is summing the constant function 1 over an arithmetic progression in $[x,2x]$ of spacing $W[d_1,d_1']\dots[d_k,d_k']$, and so
$$ S(d_1,\dots,d_k,d_1',\dots,d_k') = \frac{x}{W[d_1,d_1']\dots[d_k,d_k']} + O(1). $$
By Lemma 30, the contribution of the main term $\frac{x}{W[d_1,d_1']\dots[d_k,d_k']}$ to (29) is $(c + o(1)) B^{-k} \frac{x}{W}$; note that the restriction of the integrals in (30) to $[0,1]$ instead of $[0,+\infty)$ is harmless since $S(F_i), S(G_i) < 1$ for all $i$. Meanwhile, the contribution of the $O(1)$ error is then bounded by
$$ O\left( \sum_{d_1,\dots,d_k,d_1',\dots,d_k'} \prod_{i=1}^k |F_i(\log_x d_i)|\, |G_i(\log_x d_i')| \right). $$
By the hypothesis in Theorem 20(i), we see that for $d_1,\dots,d_k,d_1',\dots,d_k'$ contributing a non-zero term here, one has
$$ [d_1,d_1']\dots[d_k,d_k'] \lessapprox x^{1-\varepsilon} $$

for some fixed $\varepsilon > 0$. From the divisor bound (1), we see that each choice of $[d_1,d_1']\dots[d_k,d_k']$ arises from $\lessapprox 1$ choices of $d_1,\dots,d_k,d_1',\dots,d_k'$. We conclude that the net contribution of the $O(1)$ error to (29) is $\lessapprox x^{1-\varepsilon}$, and the claim follows.
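
The counting step behind the formula for $S$, namely that a progression of spacing $q$ meets $[x,2x]$ in $x/q + O(1)$ points, can be illustrated by brute force. The sketch below is a toy check with hypothetical names and small parameters chosen for illustration: it takes $W = 210$ (so $w = 7$), shifts $(0,2)$, the residue $b = 11$, and the moduli $11$ and $13$ standing in for $[d_1,d_1']$ and $[d_2,d_2']$.

```python
def count_progression(x, W, b, shifts, moduli):
    """Count n in [x, 2x] with n = b (W) and n + h_i = 0 (q_i) for each i, by brute force."""
    return sum(
        1
        for n in range(x, 2 * x + 1)
        if n % W == b % W and all((n + h) % q == 0 for h, q in zip(shifts, moduli))
    )
```

With these toy values the combined modulus is $210 \cdot 11 \cdot 13 = 30030$, and `count_progression(10**6, 210, 11, (0, 2), (11, 13))` differs from $10^6/30030$ by at most 1, as the $O(1)$ error term predicts.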

4.2 The Elliott-Halberstam case

Now we show case (i) of Theorem 19. For the sake of notation, we take $i_0 = k$, as the other cases are similar. We use (16) to rewrite the left-hand side of (26) as
$$ \sum_{d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}'} \left( \prod_{i=1}^{k-1} \mu(d_i)\mu(d_i') F_i(\log_x d_i) G_i(\log_x d_i') \right) \tilde{S}(d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}') \tag{43} $$
where
$$ \tilde{S}(d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}') := \sum_{\substack{x \le n \le 2x \\ n = b\ (W) \\ n + h_i = 0\ ([d_i,d_i'])\ \forall i=1,\dots,k-1}} \theta(n+h_k). $$
As in the previous case, $\tilde{S}(d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}')$ vanishes unless the $[d_i,d_i']$ are coprime to each other and to $W$, and so the summand in (43) vanishes unless the modulus $q_{W,d_1,\dots,d_{k-1}'}$ defined by
$$ q_{W,d_1,\dots,d_{k-1}'} := W[d_1,d_1']\dots[d_{k-1},d_{k-1}'] \tag{44} $$
is square-free. In that case, we may use the Chinese remainder theorem to concatenate the congruence conditions on $n$ into a single primitive congruence condition
$$ n + h_k = a_{W,d_1,\dots,d_{k-1}'}\ (q_{W,d_1,\dots,d_{k-1}'}) $$
for some $a_{W,d_1,\dots,d_{k-1}'}$ depending on $W,d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}'$, and conclude using (3) that
$$ \tilde{S}(d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}') = \frac{1}{\varphi(q_{W,d_1,\dots,d_{k-1}'})} \sum_{x+h_k \le n \le 2x+h_k} \theta(n) + \Delta\left( 1_{[x+h_k,2x+h_k]}\theta;\ a_{W,d_1,\dots,d_{k-1}'}\ (q_{W,d_1,\dots,d_{k-1}'}) \right). \tag{45} $$
From the prime number theorem, we have
$$ \sum_{x+h_k \le n \le 2x+h_k} \theta(n) = (1 + o(1)) x $$
and this expression is clearly independent of $d_1,\dots,d_{k-1}'$. Thus, by Lemma 30, the contribution of the main term in (45) is $(c + o(1)) B^{1-k} \frac{x}{\varphi(W)}$. By (11) and (12), it thus suffices to show that for any fixed $A$ we have
$$ \sum_{d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}'} \left( \prod_{i=1}^{k-1} |F_i(\log_x d_i)|\, |G_i(\log_x d_i')| \right) \left| \Delta\left( 1_{[x+h_k,2x+h_k]}\theta;\ a\ (q) \right) \right| \ll x \log^{-A} x, \tag{46} $$

where $a = a_{W,d_1,\dots,d_{k-1}'}$ and $q = q_{W,d_1,\dots,d_{k-1}'}$. For future reference, we note that we may restrict the summation here to those $d_1,\dots,d_{k-1}'$ for which $q_{W,d_1,\dots,d_{k-1}'}$ is square-free.
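
The concatenation of the congruence conditions into a single primitive class $n + h_k = a\ (q)$ can be made concrete with a small Chinese remainder computation. The following sketch is an illustration with hypothetical function names and toy parameters; `moduli` plays the role of the pairwise coprime quantities $[d_i,d_i']$, which are assumed coprime to $W$. The three-argument `pow` with exponent $-1$ computes a modular inverse and requires Python 3.8 or later.

```python
from math import gcd


def crt(residues, moduli):
    """Combine n = r_i (m_i) for pairwise coprime m_i into one class n = r (m)."""
    r, m = 0, 1
    for r_i, m_i in zip(residues, moduli):
        # Solve r + m*t = r_i (m_i) for t, using the inverse of m modulo m_i.
        t = ((r_i - r) * pow(m, -1, m_i)) % m_i
        r, m = r + m * t, m * m_i
    return r % m, m


def residue_for_last_shift(W, b, shifts, moduli):
    """Return (a, q) with n + h_k = a (q), given n = b (W) and
    n + h_i = 0 (moduli[i]) for the first k - 1 shifts."""
    residues = [b] + [-h for h in shifts[:-1]]
    n0, q = crt(residues, [W] + list(moduli))
    return (n0 + shifts[-1]) % q, q
```

For the toy data $W = 210$, $b = 11$, shifts $(0,2,6)$ and moduli $(11,13)$, the call `residue_for_last_shift(210, 11, (0, 2, 6), (11, 13))` returns a class $a$ modulo $q = 30030$ with $\gcd(a,q) = 1$, which is the primitivity used when applying (3).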

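From the hypotheses of Theorem 19(i), we have
$$ q_{W,d_1,\dots,d_{k-1}'} \lessapprox x^{\vartheta} $$
whenever the summand in (43) is non-zero, and each choice $q$ of $q_{W,d_1,\dots,d_{k-1}'}$ is associated to $O(\tau(q)^{O(1)})$ choices of $d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}'$. Thus, this contribution is
$$ \ll \sum_{q \lessapprox x^{\vartheta}} \tau(q)^{O(1)} \sup_{a \in (\mathbb{Z}/q\mathbb{Z})^\times} \left| \Delta\left( 1_{[x+h_k,2x+h_k]}\theta;\ a\ (q) \right) \right|. $$
Using the crude bound
$$ \left| \Delta\left( 1_{[x+h_k,2x+h_k]}\theta;\ a\ (q) \right) \right| \ll \frac{x}{q} \log^{O(1)} x $$
and (2), we have
$$ \sum_{q \lessapprox x^{\vartheta}} \tau(q)^{C} \sup_{a \in (\mathbb{Z}/q\mathbb{Z})^\times} \left| \Delta\left( 1_{[x+h_k,2x+h_k]}\theta;\ a\ (q) \right) \right| \ll x \log^{O(1)} x $$
for any fixed $C > 0$. By the Cauchy-Schwarz inequality, it suffices to show that
$$ \sum_{q \lessapprox x^{\vartheta}} \sup_{a \in (\mathbb{Z}/q\mathbb{Z})^\times} \left| \Delta\left( 1_{[x+h_k,2x+h_k]}\theta;\ a\ (q) \right) \right| \ll x \log^{-A} x $$
for any fixed $A > 0$. However, since $\theta$ only differs from $\Lambda$ on powers $p^j$ of primes with $j > 1$, it is not difficult to show that
$$ \left| \Delta\left( 1_{[x+h_k,2x+h_k]}\theta;\ a\ (q) \right) - \Delta\left( 1_{[x+h_k,2x+h_k]}\Lambda;\ a\ (q) \right) \right| \lessapprox \sqrt{\frac{x}{q}}, $$

so the net error in replacing $\theta$ here by $\Lambda$ is $\lessapprox x^{1-(1-\vartheta)/2}$, which is certainly acceptable. The claim now follows from the hypothesis EH[$\vartheta$], thanks to Claim 8.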
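
Two inputs to this argument are easy to observe numerically: the prime number theorem asymptotic $\sum_{x \le n \le 2x} \theta(n) = (1+o(1))x$, and the fact that $\theta$ and $\Lambda$ differ only on proper prime powers, which contribute on the order of $\sqrt{x}$. The sketch below computes both sums with a simple sieve, ignoring the harmless shift by $h_k$; the function name is hypothetical and the residue-class structure is not modelled.

```python
import math


def theta_and_lambda_sums(x):
    """Return (sum of theta(n), sum of Lambda(n)) over x <= n <= 2x via a sieve."""
    N = 2 * x
    is_prime = bytearray([1]) * (N + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(N) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = bytearray(len(range(p * p, N + 1, p)))
    theta_sum = 0.0
    lambda_sum = 0.0
    for p in range(2, N + 1):
        if not is_prime[p]:
            continue
        if p >= x:
            theta_sum += math.log(p)  # theta(n) = log n when n is prime
        # Lambda(p^j) = log p for every prime power p^j in [x, 2x].
        q = p
        while q <= N:
            if q >= x:
                lambda_sum += math.log(p)
            q *= p
    return theta_sum, lambda_sum
```

For $x = 10^5$, the first sum is close to $x$, while the difference between the two sums is positive and small compared with $2\sqrt{2x}$, in line with the square-root size of the error incurred when replacing $\theta$ by $\Lambda$.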

4.3 The Motohashi-Pintz-Zhang case

Now we show case (ii) of Theorem 19. We repeat the arguments from the ‘The Elliott-Halberstam case’ section, with the only difference being in the derivation of (46). As observed previously, we may restrict $q_{W,d_1,\dots,d_{k-1}'}$ to be square-free. From the hypotheses in Theorem 19(ii), we also see that
$$ q_{W,d_1,\dots,d_{k-1}'} \lessapprox x^{\vartheta} $$
and that all the prime factors of $q_{W,d_1,\dots,d_{k-1}'}$ are at most $x^{\delta}$. Thus, if we set $I := [1,x^{\delta}]$, we see (using the notation from Claim 10) that $q_{W,d_1,\dots,d_{k-1}'}$ lies in $\mathcal{S}_I$ and is thus a factor of $P_I$. If we then let $\mathcal{A} \subset \mathbb{Z}/P_I\mathbb{Z}$ denote all the primitive residue classes $a\ (P_I)$ with the property that $a = b\ (W)$, and such that for each prime $w < p \le x^{\delta}$, one has $a + h_i = 0\ (p)$ for some $i=1,\dots,k$, then we see that $a_{W,d_1,\dots,d_{k-1}'}$ lies in the projection of $\mathcal{A}$ to $\mathbb{Z}/q_{W,d_1,\dots,d_{k-1}'}\mathbb{Z}$. Each $q \in \mathcal{S}_I$ is equal to $q_{W,d_1,\dots,d_{k-1}'}$ for $O(\tau(q)^{O(1)})$ choices of $d_1,\dots,d_{k-1}'$. Thus, the left-hand side of (46) is
$$ \ll \sum_{q \in \mathcal{S}_I : q \lessapprox x^{\vartheta}} \tau(q)^{O(1)} \sup_{a \in \mathcal{A}} \left| \Delta\left( 1_{[x+h_k,2x+h_k]}\theta;\ a\ (q) \right) \right|. $$
Note from the Chinese remainder theorem that for any given $q$, if one lets $a$ range uniformly in $\mathcal{A}$, then $a\ (q)$ is uniformly distributed among $O(\tau(q)^{O(1)})$ different moduli. Thus, we have
$$ \sup_{a \in \mathcal{A}} \left| \Delta\left( 1_{[x+h_k,2x+h_k]}\theta;\ a\ (q) \right) \right| \ll \frac{\tau(q)^{O(1)}}{|\mathcal{A}|} \sum_{a \in \mathcal{A}} \left| \Delta\left( 1_{[x+h_k,2x+h_k]}\theta;\ a\ (q) \right) \right|, $$
and so it suffices to show that
$$ \sum_{q \in \mathcal{S}_I : q \lessapprox x^{\vartheta}} \frac{\tau(q)^{O(1)}}{|\mathcal{A}|} \sum_{a \in \mathcal{A}} \left| \Delta\left( 1_{[x+h_k,2x+h_k]}\theta;\ a\ (q) \right) \right| \ll x \log^{-A} x $$
for any fixed $A > 0$. We see it suffices to show that
$$ \sum_{q \in \mathcal{S}_I : q \lessapprox x^{\vartheta}} \tau(q)^{O(1)} \left| \Delta\left( 1_{[x+h_k,2x+h_k]}\theta;\ a\ (q) \right) \right| \ll x \log^{-A} x $$

for any given $a \in \mathcal{A}$. But this follows from the hypothesis MPZ[$\varpi,\delta$] by repeating the arguments of the ‘The Elliott-Halberstam case’ section.

4.4 Crude estimates on divisor sums

To proceed further, we will need some additional information on the divisor sums $\lambda_F$ (defined in (16)), namely that these sums are concentrated on ‘almost primes’; results of this type have also appeared in [38].

Proposition 14(Almost primality).

Let $k \ge 1$ be fixed, let $(h_1,\dots,h_k)$ be a fixed admissible $k$-tuple, and let $b\ (W)$ be such that $b+h_i$ is coprime to $W$ for each $i=1,\dots,k$. Let $F_1,\dots,F_k : [0,+\infty) \to \mathbb{R}$ be fixed smooth compactly supported functions, and let $m_1,\dots,m_k \ge 0$ and $a_1,\dots,a_k \ge 1$ be fixed natural numbers. Then,
$$ \sum_{x \le n \le 2x : n = b\ (W)} \prod_{j=1}^k \left( |\lambda_{F_j}(n+h_j)|^{a_j} \tau(n+h_j)^{m_j} \right) \ll B^{-k} \frac{x}{W}. \tag{47} $$
Furthermore, if $1 \le j_0 \le k$ is fixed and $p_0$ is a prime with $p_0 \le x^{\frac{1}{10k}}$, then we have the variant
$$ \sum_{x \le n \le 2x : n = b\ (W)} \prod_{j=1}^k \left( |\lambda_{F_j}(n+h_j)|^{a_j} \tau(n+h_j)^{m_j} \right) 1_{p_0 | n+h_{j_0}} \ll \frac{\log_x p_0}{p_0} B^{-k} \frac{x}{W}. \tag{48} $$
As a consequence, we have
$$ \sum_{x \le n \le 2x : n = b\ (W)} \prod_{j=1}^k \left( |\lambda_{F_j}(n+h_j)|^{a_j} \tau(n+h_j)^{m_j} \right) 1_{p(n+h_{j_0}) \le x^{\varepsilon}} \ll \varepsilon B^{-k} \frac{x}{W}, \tag{49} $$

for any $\varepsilon > 0$, where $p(n)$ denotes the least prime factor of $n$.

The exponent $\frac{1}{10k}$ can certainly be improved here, but for our purposes, any fixed positive exponent depending only on $k$ will suffice.

Proof.

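The strategy is to estimate the alternating divisor sums $\lambda_{F_j}(n+h_j)$ by non-negative expressions involving prime factors of $n+h_j$, which can then be bounded combinatorially using standard tools.

We first prove (47). As in the proof of Lemma 30, we can use Fourier expansion to write
$$ F_j(\log_x d) = \int_{\mathbb{R}} \frac{f_j(\xi)}{d^{\frac{1+i\xi}{\log x}}}\, d\xi $$
for some rapidly decreasing $f_j : \mathbb{R} \to \mathbb{C}$ and all natural numbers $d$. Thus,
$$ \lambda_{F_j}(n) = \int_{\mathbb{R}} \left( \sum_{d | n} \frac{\mu(d)}{d^{\frac{1+i\xi}{\log x}}} \right) f_j(\xi)\, d\xi, $$
which factorizes using Euler products as
$$ \lambda_{F_j}(n) = \int_{\mathbb{R}} \prod_{p | n} \left( 1 - \frac{1}{p^{\frac{1+i\xi}{\log x}}} \right) f_j(\xi)\, d\xi. $$
The function $s \mapsto p^{-\frac{s}{\log x}}$ has a magnitude of $O(1)$ and a derivative of $O(\log_x p)$ when $\Re(s) > 1$, and thus
$$ 1 - \frac{1}{p^{\frac{1+i\xi}{\log x}}} = O\left( \min\left( (1+|\xi|) \log_x p, 1 \right) \right). $$
From the rapid decrease of $f_j$ and the triangle inequality, we conclude that
$$ |\lambda_{F_j}(n)| \ll \int_{\mathbb{R}} \left( \prod_{p | n} O\left( \min\left( (1+|\xi|) \log_x p, 1 \right) \right) \right) \frac{d\xi}{(1+|\xi|)^A} $$
for any fixed $A > 0$. Thus, noting that $\prod_{p | n} O(1) \ll \tau(n)^{O(1)}$, we have
$$ |\lambda_{F_j}(n)|^{a_j} \ll \tau(n)^{O(1)} \int_{\mathbb{R}} \dots \int_{\mathbb{R}} \left( \prod_{p | n} \prod_{l=1}^{a_j} \min\left( (1+|\xi_l|) \log_x p, 1 \right) \right) \frac{d\xi_1 \dots d\xi_{a_j}}{(1+|\xi_1|)^A \dots (1+|\xi_{a_j}|)^A} $$
for any fixed $a_j, A$. However, we have
$$ \prod_{i=1}^{a_j} \min\left( (1+|\xi_i|) \log_x p, 1 \right) \le \min\left( (1+|\xi_1|+\dots+|\xi_{a_j}|) \log_x p, 1 \right), $$
and so
$$ |\lambda_{F_j}(n)|^{a_j} \ll \tau(n)^{O(1)} \int_{\mathbb{R}} \dots \int_{\mathbb{R}} \left( \prod_{p | n} \min\left( (1+|\xi_1|+\dots+|\xi_{a_j}|) \log_x p, 1 \right) \right) \frac{d\xi_1 \dots d\xi_{a_j}}{(1+|\xi_1|+\dots+|\xi_{a_j}|)^A}. $$
Making the change of variables $\sigma := 1+|\xi_1|+\dots+|\xi_{a_j}|$, we obtain
$$ |\lambda_{F_j}(n)|^{a_j} \ll \tau(n)^{O(1)} \int_1^\infty \left( \prod_{p | n} \min(\sigma \log_x p, 1) \right) \frac{d\sigma}{\sigma^A} $$
for any fixed $A > 0$. In view of this bound and the Fubini-Tonelli theorem, it suffices to show that
$$ \sum_{x \le n \le 2x : n = b\ (W)} \prod_{j=1}^k \left( \tau(n+h_j)^{O(1)} \prod_{p | n+h_j} \min(\sigma_j \log_x p, 1) \right) \ll B^{-k} \frac{x}{W} (\sigma_1+\dots+\sigma_k)^{O(1)} $$
for all $\sigma_1,\dots,\sigma_k \ge 1$. By setting $\sigma := \sigma_1+\dots+\sigma_k$, it suffices to show that
$$ \sum_{x \le n \le 2x : n = b\ (W)} \prod_{j=1}^k \left( \tau(n+h_j)^{O(1)} \prod_{p | n+h_j} \min(\sigma \log_x p, 1) \right) \ll B^{-k} \frac{x}{W} \sigma^{O(1)} \tag{50} $$

for any $\sigma \ge 1$.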

To proceed further, we factorize $n+h_j$ as a product
$$ n + h_j = p_1 \dots p_r $$
of primes $p_1 \le \dots \le p_r$ in increasing order and then write
$$ n + h_j = d_j m_j $$
where $d_j := p_1 \dots p_{i_j}$ and $i_j$ is the largest index for which $p_1 \dots p_{i_j} < x^{\frac{1}{10k}}$, and $m_j := p_{i_j+1} \dots p_r$. By construction, we see that $0 \le i_j < r$, $d_j \le x^{\frac{1}{10k}}$. Also, we have
$$ p_{i_j+1} \ge \left( p_1 \dots p_{i_j+1} \right)^{\frac{1}{i_j+1}} \ge x^{\frac{1}{10k(i_j+1)}}. $$
Since $n \le 2x$, this implies that
$$ r = O(i_j+1) $$
and so
$$ \tau(n+h_j) \le 2^{O(1+\Omega(d_j))}, $$
where we recall that $\Omega(d_j) = i_j$ denotes the number of prime factors of $d_j$, counting multiplicity. We also see that
$$ p(m_j) \ge x^{\frac{1}{10k(1+\Omega(d_j))}} \ge x^{\frac{1}{10k(1+\Omega(d_1 \dots d_k))}} =: R, $$
where $p(n)$ denotes the least prime factor of $n$. Finally, we have that
$$ \prod_{p | n+h_j} \min(\sigma \log_x p, 1) \le \prod_{p | d_j} \min(\sigma \log_x p, 1), $$
and we see that the $d_1,\dots,d_k,W$ are coprime. We may thus estimate the left-hand side of (50) by
$$ \ll \sum_{*} \left( \prod_{j=1}^k 2^{O(1+\Omega(d_j))} \prod_{p | d_j} \min(\sigma \log_x p, 1) \right) \sum_{**} 1 $$

where the outer sum $\sum_{*}$ is over $d_1,\dots,d_k \le x^{\frac{1}{10k}}$ with $d_1,\dots,d_k,W$ coprime, and the inner sum $\sum_{**}$ is over $x \le n \le 2x$ with $n = b\ (W)$ and $n + h_j = 0\ (d_j)$ for each $j$, with $p\left(\frac{n+h_j}{d_j}\right) \ge R$ for each $j$.
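
The decomposition $n + h_j = d_j m_j$ into a small part built from the smallest prime factors and a rough cofactor can be sketched as follows. This is an illustration only: the function name is hypothetical, and the parameter `threshold` stands in for $x^{\frac{1}{10k}}$, which is far too small to be interesting for any $x$ one could factor by trial division.

```python
def split_small_part(n, threshold):
    """Write n = d * m, where d is the longest product of the smallest prime
    factors of n (with multiplicity) that stays below `threshold`."""
    primes = []
    rest, p = n, 2
    while p * p <= rest:
        while rest % p == 0:
            primes.append(p)
            rest //= p
        p += 1
    if rest > 1:
        primes.append(rest)
    d = 1
    for q in primes:  # trial division yields the primes in increasing order
        if d * q >= threshold:
            break
        d *= q
    return d, n // d
```

For example, with $n = 30030 = 2\cdot3\cdot5\cdot7\cdot11\cdot13$ and `threshold = 50`, the sketch returns $d = 30$ and $m = 1001$, so that $\Omega(d) = 3$ and the least prime factor $7$ of $m$ is at least $50^{1/4}$, mirroring the bound $p(m_j) \ge x^{\frac{1}{10k(1+\Omega(d_j))}}$.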

We bound the inner sum $\sum_{**} 1$ using a Selberg sieve upper bound. Let $G$ be a smooth function supported on $[0,1]$ with $G(0) = 1$, and let $d = d_1 \dots d_k$. We see that
$$ \sum_{**} 1 \le \sum_{\substack{x \le n \le 2x \\ n + h_i \equiv 0\ (d_i) \\ n \equiv b\ (W)}} \prod_{i=1}^k \left( \sum_{\substack{e | n+h_i \\ (e,dW) = 1}} \mu(e) G(\log_R e) \right)^2, $$
since the product is $G(0)^{2k} = 1$ if $p\left(\frac{n+h_j}{d_j}\right) \ge R$, and non-negative otherwise. The right-hand side may be expanded as
$$ \sum_{\substack{e_1,\dots,e_k,e_1',\dots,e_k' \\ (e_i e_i', dW) = 1\ \forall i}} \left( \prod_{i=1}^k \mu(e_i)\mu(e_i') G(\log_R e_i) G(\log_R e_i') \right) \sum_{\substack{x \le n \le 2x \\ n + h_i \equiv 0\ (d_i [e_i,e_i']) \\ n \equiv b\ (W)}} 1. $$
As in the ‘The trivial case’ section, the inner sum vanishes unless the $e_i e_i'$ are coprime to each other and $dW$, in which case it is
$$ \frac{x}{dW [e_1,e_1'] \dots [e_k,e_k']} + O(1). $$
The $O(1)$ term contributes $\lessapprox R^k \lessapprox x^{1/10}$, which is negligible. By Lemma 30, if $\Omega(d) \ll \log^{1/2} x$, then the main term contributes
$$ \ll \left( \frac{d}{\varphi(d)} \right)^k \frac{x}{dW} (\log R)^{-k} \ll 2^{\Omega(d)} B^{-k} \frac{x}{dW}. $$
We see that this final bound applies trivially if $\Omega(d) \gg \log^{1/2} x$. The bound (50) thus reduces to
$$ \sum_{*} \prod_{j=1}^k \frac{2^{O(1+\Omega(d_j))}}{d_j} \prod_{p | d_j} \min(\sigma \log_x p, 1) \ll \sigma^{O(1)}. \tag{51} $$
Ignoring the coprimality conditions on the $d_j$ for an upper bound, we see this is bounded by
$$ \prod_{w < p \le x^{\frac{1}{10k}}} \left( 1 + \frac{O(\min(\sigma \log_x(p), 1))}{p} \sum_{j \ge 0} \frac{O(1)^j}{p^j} \right)^k \le \exp\left( O\left( \sum_{p \le x} \frac{\min(\sigma \log_x(p), 1)}{p} \right) \right). $$
But from Mertens’ theorem, we have
$$ \sum_{p \le x} \frac{\min(\sigma \log_x p, 1)}{p} = O(1 + \log \sigma), $$

and the claim (47) follows.
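
The Mertens-type sum used in the last step can be evaluated directly for moderate $x$. The sketch below, with hypothetical function names, computes $\sum_{p \le x} \min(\sigma \log_x p, 1)/p$ by sieving; it is meant only to illustrate that the sum grows at most logarithmically in $\sigma \ge 1$, which is what makes the exponential of it of size $\sigma^{O(1)}$.

```python
import math


def primes_up_to(N):
    """List of primes <= N via the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (N + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(N) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, N + 1, p)))
    return [p for p in range(2, N + 1) if sieve[p]]


def truncated_mertens_sum(x, sigma):
    """Sum over primes p <= x of min(sigma * log_x p, 1) / p."""
    log_x = math.log(x)
    return sum(min(sigma * math.log(p) / log_x, 1.0) / p for p in primes_up_to(x))
```

For $x = 10^6$ and $\sigma \in \{1, 2, 4, 8\}$, the computed values stay below $1 + \log\sigma$: primes up to $x^{1/\sigma}$ contribute a bounded amount, while the primes between $x^{1/\sigma}$ and $x$ contribute roughly $\log\sigma$.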

The proof of (48) is a minor modification of the argument above used to prove (47). Namely, the variable $d_{j_0}$ is now replaced by $[d_{j_0},p_0] < x^{1/5k}$, which upon factoring out $p_0$ has the effect of multiplying the upper bound for (51) by $O\left( \frac{\sigma \log_x p_0}{p_0} \right)$ (at the negligible cost of deleting the prime $p_0$ from the sum $\sum_{p \le x}$), giving the claim; we omit the details.

Finally, (49) follows immediately from (47) when $\varepsilon > \frac{1}{10k}$, and from (48) and Mertens’ theorem when $\varepsilon \le \frac{1}{10k}$.

Remark 32.

As in [38], one can use Proposition 14, together with the observation that the quantity $\lambda_F(n)$ is bounded whenever $n = O(x)$ and $p(n) \ge x^{\varepsilon}$, to conclude that whenever the hypotheses of Lemma 18 are obeyed for some $\nu$ of the form (18), then there exists a fixed $\varepsilon > 0$ such that for all sufficiently large $x$, there are $\gg \frac{x}{\log^k x}$ elements $n$ of $[x,2x]$ such that $n+h_1,\dots,n+h_k$ have no prime factor less than $x^{\varepsilon}$, and that at least $m$ of the $n+h_1,\dots,n+h_k$ are prime.

4.5 The generalized Elliott-Halberstam case

Now we show case (ii) of Theorem 20. For the sake of notation, we shall take $i_0 = k$, as the other cases are similar; thus, we have
$$ \sum_{i=1}^{k-1} \left( S(F_i) + S(G_i) \right) < \vartheta. \tag{52} $$

The basic idea is to view the sum (29) as a variant of (26), with the role of the function $\theta$ now being played by the product divisor sum $\lambda_{F_k}\lambda_{G_k}$, and to repeat the arguments in the ‘The Elliott-Halberstam case’ section. To do this, we rely on Proposition 14 to restrict $n+h_i$ to the almost primes.

We turn to the details. Let $\varepsilon > 0$ be an arbitrary fixed quantity. From (49) and Cauchy-Schwarz, one has
$$ \sum_{\substack{x \le n \le 2x \\ n = b\ (W)}} \left( \prod_{i=1}^k \lambda_{F_i}(n+h_i)\lambda_{G_i}(n+h_i) \right) 1_{p(n+h_k) \le x^{\varepsilon}} = O\left( \varepsilon B^{-k} \frac{x}{W} \right) $$
with the implied constant uniform in $\varepsilon$, so by the triangle inequality and a limiting argument as $\varepsilon \to 0$, it suffices to show that
$$ \sum_{\substack{x \le n \le 2x \\ n = b\ (W)}} \left( \prod_{i=1}^k \lambda_{F_i}(n+h_i)\lambda_{G_i}(n+h_i) \right) 1_{p(n+h_k) > x^{\varepsilon}} = (c_\varepsilon + o(1)) B^{-k} \frac{x}{W} \tag{53} $$
where $c_\varepsilon$ is a quantity depending on $\varepsilon$ but not on $x$, such that
$$ \lim_{\varepsilon \to 0} c_\varepsilon = \prod_{i=1}^k \int_0^1 F_i'(t) G_i'(t)\, dt. $$
We use (16) to expand out $\lambda_{F_i}, \lambda_{G_i}$ for $i=1,\dots,k-1$, but not for $i=k$, so that the left-hand side of (29) becomes
$$ \sum_{d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}'} \left( \prod_{i=1}^{k-1} \mu(d_i)\mu(d_i') F_i(\log_x d_i) G_i(\log_x d_i') \right) S(d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}') \tag{54} $$
where
$$ S(d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}') := \sum_{\substack{x \le n \le 2x \\ n = b\ (W) \\ n + h_i = 0\ ([d_i,d_i'])\ \forall i=1,\dots,k-1}} \lambda_{F_k}(n+h_k)\lambda_{G_k}(n+h_k) 1_{p(n+h_k) > x^{\varepsilon}}. $$
As before, the summand in (54) vanishes unless the modulus $q_{W,d_1,\dots,d_{k-1}'}$ defined in (44) is square-free, in which case we have the analogue
$$ S(d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}') = \frac{1}{\varphi(q)} \sum_{\substack{x+h_k \le n \le 2x+h_k \\ (n,q) = 1}} \lambda_{F_k}(n)\lambda_{G_k}(n) 1_{p(n) > x^{\varepsilon}} + \Delta\left( 1_{[x+h_k,2x+h_k]} \lambda_{F_k}\lambda_{G_k} 1_{p(\cdot) > x^{\varepsilon}};\ a\ (q) \right) \tag{55} $$
of (45). Here we have put $q = q_{W,d_1,\dots,d_{k-1}'}$ and $a = a_{W,d_1,\dots,d_{k-1}'}$ for convenience. We thus split
$$ S = S_1 - S_2 + S_3, $$
where
$$ S_1(d_1,\dots,d_{k-1},d_1',\dots,d_{k-1}') = \frac{1}{\varphi(q)} \sum_{x+h_k \le n \le 2x+h_k} \lambda_{F_k}(n)\lambda_{G_k}(n) 1_{p(n) > x^{\varepsilon}}, $$
(56)
S 2 d 1 , , d k 1 , d 1 , , d k 1 = 1 φ ( q ) x + h k n 2 x + h k ; ( n , q ) > 1 λ F