You are currently browsing the monthly archive for mars 2015.

What I claim is that, letting p_n being the nth prime

\frac{\lvert\{p_n\leq N\mid p_{n+1}-p_n\leq \log^{1-\epsilon} N\}\rvert}{N/\log N}\rightarrow 0

in other words the proportion of the gaps between primes up to N which are smaller than \log ^{1-\epsilon} N is tending to 0. Thus although we know that there are infinitely many bounded gaps, we see here that they are almost all of size at least logarithmic.

To prove it, we recall a classical application of Selberg’s sieve:

\pi_m(x):=\lvert\{p \leq x\mid p+m\text{ is also a prime}\}\rvert \ll \frac{x}{\log^2 x}\prod_{p\mid m}(1+1/p)

where the implied constant is absolute (does not depend on m).

Now this product is obviously less than than \prod_{p\leq \omega(N)}(1+1/p) which by Mertens’s theorem and the obvious bound \omega (n)\ll \log n implies that is at most \log\log N. But now the numerator of our fraction is obviously

\displaystyle{\sum_{m\leq \log^{1-\epsilon} N}\pi_m(N)\ll\frac{N}{\log^2 N} \sum_{m\leq \log^{1-\epsilon} N} \log\log m\ll \frac{N}{\log^2 N}\log^{1-\epsilon/2}N}

The last term is simply \frac{N}{\log^{1+\epsilon/2}N} which is indeed negligible compared to the number of primes until N, whence the result.

Notice that we could have argued a bit differently: indeed, \sum_{m\leq M}\prod_{p\mid m}(1+1/p)=\sum_{d}[M/d]\frac{\mu(d)^2}{d}\ll M (instead of M\log\log M previously). Thus, the number of primes until N such that the next prime is at distance at most M is bounded by \frac{N}{(\log N)^2}M, which is o(1) as soon as M=o(\log N).

As a conclusion, gaps of size p_{n+1}-p_n=o(\log p_n) have density 0.

There are between 1 and N about \sqrt{N} perfect squares, which makes this set S\subset [N] a quite sparse set, of density N^{-1/2}. Given that \lvert S\rvert^2 \sim N, it could be reasonable to think that S+S, with some luck, cover all of [N] or at least have positive density. In fact it’s not the case.

To estimate the size of S+S, notice first that it contains all the primes congruent to 1 mod 4, which is a well-known theorem of Fermat, which is about one half of the primes, so that \lvert 2S\rvert \gg \frac{N}{2\log N}. This already a huge lot bigger than the squares alone.

In fact there are much more sums of two squares than these primes, because all their products are also sums of two squares. In fact, it contains all the products of such primes. A similar heuristics to the one leading to the prime number theorem suggests that the density of the numbers having no prime factor congruent to 3 mod 4 should be

\prod_{\substack{p\equiv 3[4]\\p\leq N}} (1-1/p), and the log of this product should be roughly

-\sum_{\substack{p\equiv 3[4]\\p\leq N}}1/p\sim -1/2\log\log N

so we get a density  \frac{\lvert S+S\rvert}{N}\sim 1/\sqrt{\log N}. While this indeed is no proof, it should be believable. In general, a number n is a sum of two squares if and only if it is of the form n=ab^2 where a is product of primes congruent to 1 mod 4 and b is a product of primes congruent to 3 mod 4. Thus, the total number of sums of two squares seem to be bounded above (using the union bound) by something like

\sum'_b \lvert 2S\cap [N/b^2]\rvert

where the \sum' denotes a sum restricted over number whose prime factors are all 3 mod 4. But this should be of the order of

\sum'_b \frac{N/b^2}{\sqrt{\log N}}\leq \sum_b1/b^2N/\sqrt{\log N}

and the sum is bounded, so certainly \lvert 2S\rvert\leq C\frac{N}{\sqrt{\log N}}. In fact, Landau has proven in 1908 that the number of sums of two squares until capital N is asymptotic to

\frac{1}{\sqrt{2}}\prod_{p\equiv 3[4]}(1-p^{-2})^{-1/2}\frac{N}{\sqrt{\log N}}

which fits quite well with what the heuristics indicate (the product over the restricted primes corresponds perfectly to the restricted sum).

Repartition of the sums of squares in residue classes

We see that the primes are analogous to the sums of two squares: indeed, the primes are the numbers which have no  factor, and the sums of two squares are the numbers which have no factor congruent to 3 mod 4 (plus other ones of course, the products of such numbers by 9, 49, 121 etc. of which we saw that it simply multiplied the size of the set by a constant). The same heuristic provides a density which then turns out to be correct. So why not, as Dirichlet did for the primes, ask for the repartition of the sums of squares in residue classes?

A sum of two squares n=a^2+b^2 can only be congruent to 0,1 or 2 mod 4, because squares are 0 or 1 mod 4. Let’s show that these values are not equiprobable in fact. First notice that n can be written as n=2^kn' where n'\equiv 1 [4] is also a sum of two squares. So given the number k_1 of n\equiv 1[4] until N which are sums of two squares, we get k_2=1/2 k_1 and k_0=(1/4+1/8+\cdots)k_1=1/2 k_1. At least, the probabilities of being even or odd are the same. Another instance is that the sums of two squares are twice as dense in n\equiv 1[4] than in n\equiv 1[2]. Also notice that everybody in n\equiv 3[9] has v_3(n)=1 so nobody there can be a sum of two squares. The final word is by Prachar. Putting B= \frac{1}{\sqrt{2}}\prod_{p\equiv 3[4]}(1-p^{-2})^{-1/2} the constant appearing above, he found

\sum_{n\leq x,n\equiv a[q]}1_{S+S}(n)\sim B_q\frac{x}{\sqrt{\log x}}


B_q=Bq^{-1}\frac{(4,q)}{(2,q)}\prod_{p\equiv 3[4],p\mid q}(1+p^{-1})

when a\equiv 1\text{ mod } (4,q);  thus the density is the same as the general density in APs, except when the modulus q is divisible by 4 (which causes a multiplication by 2 of the density) and/or it has factors congruent to 3 mod 4.

Additive quadruples in the squares

Recall that by Cauchy-Schwarz, \lvert S+S\rvert \geq \frac{\lvert S\rvert^2}{Q_4(S)} where Q_4 is the number of additive quadruples, i.e. of quadruples of square a,b,c,d \in S with a+b=c+d, which is also the number of quadruples a,b,c,d\in [\sqrt{N}]^4 with a^2+b^2=c^2+d^2. Thus, we have at least Q_4(S)\geq N/\sqrt{\log N} additive quadruples.

We can try to do better. Indeed, rewrite the equation as

(a-c)(a+c)=(d-b)(d+b),\quad -N\leq a,b,c,d\geq N

(we replaced \sqrt{N} by N here). It amounts to

uv=wz,\quad -2N\leq u,v,w,z\geq 2N,\quad u+v,w+z \text{ even}

We’ll count the positive solutions here, the other solutions are easily obtained from them. We partition such quadruples depending on the gcd d=(u,w). So u=du_1,w=dw_1,(u_1,w_1)=1. Then v=w_1e,z=u_1e for some 1\leq e\leq \min (2N/w_1,2N/u_1). So the number of add quadruples should be


By Cauchy-Schwarz, the inner sum ought to be

\sum_{u_1,w_1\leq 2N/d}1_{(u_1,w_1)=1}min(\cdots)\ll \sqrt{6/\pi^2N^2/d^2}\sqrt{\sum \min(N/u_1,N/w_1)}

Now the second sum is not much bigger than

\sum_{1\leq u_1\leq w_1\leq N/d}N/u_1=\sum N/u_1(N/d-u_1)\sim N^2/d\log N

so the total number of additive quadruples should be

\ll \sum_d 1/d^{3/2}N^2/\log N\ll N^2\log N. Returning to the previous definition, we have the inequalities

N\sqrt{\log N}\ll Q_4(S)\ll N\log N

Notice that in a random set of this density you would expect less solutions: of the order of magnitude of N. So if a,b,c, are squares, a+b-c has increased probability to be a square.

Anyway the heuristics that Q_4(A)\sim \alpha^4 N^3 for A\subset [N] of density \alpha implies that that \lvert A+A\rvert\geq N which is not the case here.

Sums of two squares in \mathbb{Z}/n\mathbb{Z}

We know that in \mathbb{Z}/p\mathbb{Z}, there are \frac{p+1}{2} squares (including 0; here p\neq 2 is a prime). This suffices to show, as a simple case of Cauchy-Davenport, that everybody is congruent mod p to a sum of two squares. This of course doesn’t extend to non-prime n, as for instance not everybody is a sum of two squares modulo 4. Let us denote by \rho_a(d)=\lvert\{ (b,c)\in [d]^2\mid a=b^2+c^2\}\rvert the number of representations of a as sum of two squares mod d. Then we can see that for d=p a prime congruent to 1 mod 4 (so that -1 is a square mod p), \rho_0(p)=2p-1, while for any other residue a\neq 0 [p], we must have \rho_0(p)=p-1. On the other hand \rho_0(p)=1 for p\equiv 3[4], and hence \rho_a(p)=p+1 for all other a.

By the Chinese Remainder theorem, we know that \rho_a(d)=\prod_{p}\rho_a(p^{v_p(d)}). But it is far from obvious to estimate the \rho(p^\alpha). In fact it happens that

\rho_{\beta}(p^{\alpha})=(1-\chi_4(p^{\alpha})\sum_{j\geq 0}1_{p^j\mid \beta}\chi_4(p^j)

where \chi_4 is the only non-trivial character mod 4, which is also Legendre symbol mod 4.