The squarefree numbers form a set which is well known to have density 6/\pi^2 in the integers; in other words, their characteristic function \mu^2 satisfies \sum_{n\leq x}\mu^2(n)=\frac{6}{\pi^2}x+o(x). In this set all your dream theorems about the count of linear patterns are true and easy.

Hardy-Littlewood squarefree tuples

The Hardy-Littlewood conjecture consists, in its strongest form, in an asymptotic, for every \mathcal{H}=\{h_1,\ldots,h_k\}, of the form

\displaystyle{\lvert\{n\leq x\mid n+h_1,\ldots,n+h_k\text{ are all prime}\}\rvert=(c(\mathcal{H})+o(1))\frac{x}{\log^kx}}

where c(\mathcal{H}) is a constant possibly equal to 0. It is quite far from being proven, although interesting steps towards it have been made in the last ten years, by Goldston-Pintz-Yildirim and Maynard.

In the squarefree numbers, the analogous problem is rather easy and was settled by Mirsky in 1947. He proved that

\displaystyle{\lvert\{n\leq x\mid n+h_1,\ldots,n+h_k\text{ are all squarefree}\}\rvert=c(\mathcal{H})x+O(x^{2/3+\epsilon})}

where

\displaystyle{c(\mathcal{H})=\prod_p\left(1-\frac{\nu_{\mathcal{H}}(p^2)}{p^2}\right)}

and \nu_{\mathcal{H}}(p^2) is the cardinality of the projection of the set \mathcal{H} in \mathbb{Z}/p^2\mathbb{Z}, that is, the number of forbidden classes modulo p^2 for n.
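To get a feeling for the statement, here is a minimal numerical sketch. It assumes that the constant is the product c(\mathcal{H})=\prod_p(1-\nu_{\mathcal{H}}(p^2)/p^2), truncated at an arbitrary prime bound, and that all shifts are nonnegative; the function names are mine and purely illustrative.

```python
from math import isqrt


def squarefree_flags(limit):
    """flags[m] is True iff m is squarefree, for 0 <= m <= limit (0 is not)."""
    flags = [True] * (limit + 1)
    flags[0] = False
    for a in range(2, isqrt(limit) + 1):
        for m in range(a * a, limit + 1, a * a):
            flags[m] = False
    return flags


def primes_up_to(bound):
    """Sieve of Eratosthenes; assumes bound >= 2."""
    sieve = [True] * (bound + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, isqrt(bound) + 1):
        if sieve[p]:
            for m in range(p * p, bound + 1, p):
                sieve[m] = False
    return [p for p, is_prime in enumerate(sieve) if is_prime]


def nu(shifts, p):
    """Number of distinct classes of the shifts modulo p^2."""
    return len({h % (p * p) for h in shifts})


def singular_constant(shifts, prime_bound=1000):
    """Truncated product of (1 - nu(p^2)/p^2) over primes p <= prime_bound."""
    c = 1.0
    for p in primes_up_to(prime_bound):
        c *= 1.0 - nu(shifts, p) / (p * p)
    return c


def count_squarefree_tuples(shifts, x):
    """Number of 1 <= n <= x with n+h squarefree for every h (shifts >= 0)."""
    flags = squarefree_flags(x + max(shifts))
    return sum(all(flags[n + h] for h in shifts) for n in range(1, x + 1))


if __name__ == "__main__":
    x = 100000
    for shifts in [(0,), (0, 1), (0, 1, 2, 3)]:
        print(shifts, count_squarefree_tuples(shifts, x) / x,
              singular_constant(shifts))
```

For \mathcal{H}=\{0,1,2,3\} the four shifts cover every class modulo 4, so the factor at p=2 vanishes and the count is 0, which is the case "c(\mathcal{H}) possibly equal to 0".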

I show a quick way of deriving it with a much worse error term O(x/\log x). Fix w, a large number; ultimately, w=C\log x should be ok. Fix W=\prod_{p\leq w}p. Then the number X_w of n\leq x such that none of n+h_1,\ldots,n+h_k has a square factor p^2 with p\leq w is easy to compute: in [W^2], or any interval of length W^2, there are by the Chinese remainder theorem exactly
\displaystyle{ \prod_{p\leq w}(p^2-\nu_{\mathcal{H}}(p^2)) }
of them. Thus in [x], there are this many of them

\displaystyle{X_w=\frac{x}{W^2}\prod_{p\leq w}(p^2-\nu_{\mathcal{H}}(p^2))+O(W^2)}.
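The Chinese remainder count can be checked by brute force for a few small primes. The sketch below counts the residues n modulo \prod_{p\leq w}p^2 that avoid all forbidden classes and compares with \prod_{p\leq w}(p^2-\nu_{\mathcal{H}}(p^2)); the helper names and the sample shifts are illustrative assumptions, not anything from Mirsky's paper.

```python
from math import prod


def nu(shifts, p):
    """Number of distinct classes of the shifts modulo p^2."""
    return len({h % (p * p) for h in shifts})


def admissible_residues(shifts, primes):
    """Brute-force count of n modulo prod(p^2) such that no n+h is
    divisible by p^2 for any p in primes."""
    modulus = prod(p * p for p in primes)
    return sum(
        all((n + h) % (p * p) != 0 for h in shifts for p in primes)
        for n in range(modulus)
    )


def crt_prediction(shifts, primes):
    """Count predicted by the Chinese remainder theorem."""
    return prod(p * p - nu(shifts, p) for p in primes)


if __name__ == "__main__":
    shifts, primes = (0, 2, 5), [2, 3, 5]
    print(admissible_residues(shifts, primes), crt_prediction(shifts, primes))
```

The brute force is only feasible for tiny w, since the period grows like \exp(2w(1+o(1))); that growth is exactly why w has to stay logarithmic in x.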

There are also n for which some n+h_i is not squarefree, but which escape our sieving by primes up to w. These are the n such that at least one of the n+h_i has a divisor p^2 with p>w prime. Now there are at most x/p^2 multiples of p^2 in [x]. So at most
\displaystyle{ \sum_{w<p\leq \sqrt{x+\max h_i}}k(x+\max h_i)p^{-2}=o(x/w) }
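should be removed. Let’s try to balance the error terms O(x/w) and O(W^2)=\exp(2w(1+o(1))); take w=\frac{1}{2}(\log x-\log\log x).
This way both seem to be O(x/\log x). To be safe against the o(1) in the exponent, one can instead take w=c\log x with a fixed c<1/2: then W^2=x^{2c+o(1)}=o(x/\log x), while x/w\ll x/\log x.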

Green-Tao type theorem

We can also easily get asymptotics of the form

\displaystyle{\sum_{n\in\mathbb{Z}^d\cap K}\prod_{i\in[t]}\mu^2(\psi_i(n))=\text{Vol}(K)\left(\prod_p\beta_p+o(1)\right)}

where K\subset [0,N]^d is a convex body and \Psi=(\psi_1,\ldots,\psi_t) is a system of affine-linear forms in d variables, no two of which are affinely related. Moreover \beta_p is again the proportion of non-forbidden vectors of residues a\in(\mathbb{Z}/p^2\mathbb{Z})^d, that is, those for which no \psi_i(a) vanishes modulo p^2. One way to do it would be to simply use the Hardy-Littlewood type asymptotics proved above coordinate by coordinate.

We can also prove it quite easily directly. First observe that the set of n where at least one of the \psi_i vanishes has size O(N^{d-1}), because the zero set of an affine-linear form is an affine hyperplane. So we restrict the set of n to the ones where the forms don’t vanish. Next we use the well-known identity \mu^2(m)=\sum_{a^2\mid m}\mu(a) on each factor \mu^2(\psi_i(n)), writing a_i for the corresponding variable. As \psi_i(n)=O(N), the divisors a_i have to satisfy a_i\ll \sqrt{N}. Then we partition the sum into one where a_1 is restricted to be at most N^{\delta} (we fix some small \delta >0), and the remaining one. Expanding only the factor \mu^2(\psi_1(n)), we remark that the remaining one equals

\displaystyle{  \sum_{N^{\delta}\leq a_1\ll \sqrt{N}}\mu(a_1)\sum_{n:a_1^2\mid \psi_1(n)}\mu^2(\psi_2(n))\cdots\mu^2(\psi_t(n))}

Now the number of n in the inner sum is O(N^d/a_1^2+N^{d-1}). Given that \sum_{a_1\geq N^{\delta}} a_1^{-2}\ll N^{-\delta}, and that the terms N^{d-1} contribute O(N^{d-1/2}) in total over a_1\ll\sqrt{N}, we can bound this sum by O(N^{d-\delta}) as soon as \delta\leq 1/2. We proceed similarly for i=2,\ldots,t, so that the sum to estimate is

\displaystyle{  \sum_{a_1,\ldots,a_t\leq N^\delta}\prod_i\mu(a_i)\lvert\{n\in\mathbb{Z}^d\cap K\mid \forall i \quad \psi_i(n)\equiv 0\mod a_i^2\}\rvert}

Quite gross volume packing arguments indicate that the number of integral points (lattice points) to estimate above is \text{Vol}(K)\alpha_{\Psi}(a_1^2,\ldots,a_t^2)+O(N^{d-1+2t\delta}), where

\displaystyle{ \alpha_{\Psi}(k_1,\ldots,k_t)=\mathbb{E}_{n\in [\text{lcm}(k_i)]^d}\prod_i1_{k_i\mid\psi_i(n)}}


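is the density of zeroes modulo the a_i^2. Summing the error term over the at most N^{t\delta} tuples (a_1,\ldots,a_t) costs O(N^{d-1+3t\delta}). Moreover we can extend the sum over the a_i beyond N^\delta at a cost which is again smaller than N^d by a power of N. Hence, choosing \delta small enough in terms of t, we get a power saving: there is a constant c>0 such that

\displaystyle{\sum_{n\in\mathbb{Z}^d\cap K}\prod_{i\in[t]}\mu^2(\psi_i(n))=\text{Vol}(K)\prod_p\beta_p+O(N^{d-c}).}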

Indeed by multiplicativity it is easy to transform \sum_{a_1,\ldots,a_t}\alpha_{\Psi}(a_1^2,\ldots,a_t^2)\prod_i\mu(a_i) into a product over primes.
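As a sanity check of the shape of the main term, here is a small sketch for an illustrative system of my own choosing: d=2, the forms n_1, n_2, n_1+n_2, and K the box [1,N]^2. It computes each \beta_p by brute force over (\mathbb{Z}/p^2\mathbb{Z})^2, truncates the product at p\leq 19 (an arbitrary cutoff), and compares with the empirical density. All names are hypothetical.

```python
from itertools import product
from math import isqrt

# Each form is (coefficients, constant term): here n1, n2 and n1 + n2.
FORMS = [((1, 0), 0), ((0, 1), 0), ((1, 1), 0)]
SMALL_PRIMES = [2, 3, 5, 7, 11, 13, 17, 19]  # truncation of the Euler product


def evaluate(form, point):
    """Value of an affine-linear form at an integer point."""
    coeffs, const = form
    return sum(c * x for c, x in zip(coeffs, point)) + const


def local_density(forms, p, dim):
    """beta_p: proportion of residue vectors modulo p^2 at which
    no form vanishes modulo p^2 (brute force)."""
    q = p * p
    good = sum(
        all(evaluate(f, a) % q != 0 for f in forms)
        for a in product(range(q), repeat=dim)
    )
    return good / q ** dim


def squarefree_flags(limit):
    """flags[m] is True iff m is squarefree, for 0 <= m <= limit (0 is not)."""
    flags = [True] * (limit + 1)
    flags[0] = False
    for a in range(2, isqrt(limit) + 1):
        for m in range(a * a, limit + 1, a * a):
            flags[m] = False
    return flags


def empirical_density(forms, side, dim):
    """Fraction of points of {1..side}^dim where every form is squarefree."""
    points = list(product(range(1, side + 1), repeat=dim))
    limit = max(abs(evaluate(f, pt)) for f in forms for pt in points)
    flags = squarefree_flags(limit)
    hits = sum(all(flags[abs(evaluate(f, pt))] for f in forms) for pt in points)
    return hits / len(points)


def truncated_product(forms, dim, primes=SMALL_PRIMES):
    """Product of beta_p over the given primes."""
    result = 1.0
    for p in primes:
        result *= local_density(forms, p, dim)
    return result


if __name__ == "__main__":
    print(empirical_density(FORMS, 200, 2), truncated_product(FORMS, 2))
```

For this particular system one can also compute \beta_p by hand, namely (p^2-1)(p^2-2)/p^4, so the brute force is only there to keep the sketch generic; the truncation at p\leq 19 slightly overestimates the full product.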