Prime Counting Function and Chebyshev Bounds

Published:

The distribution of primes plays a central role in number theory. The famous mathematician Gauss conjectured that the number of primes between \(1\) and \(n\) is roughly \(n/\log n\), with this estimation becoming increasingly accurate as \(n\to \infty\). We denote the number of primes up to \(n\) by \(\pi(n)\). Mathematically, Gauss’s conjecture is equivalent to the claim

\[\lim_{n\to\infty}\frac{\pi(n)}{n/\log n}=1\]

This conjecture, now known as the Prime Number Theorem, is notoriously difficult to prove—the first proof appeared nearly a century after it was stated.

While proving the exact asymptotic is hard, establishing that \(\pi(n)\in O\left( \frac{n}{\log n} \right)\) is much simpler. In this post, I present bounds showing that for \(n\ge 12\),

\[(1+2\log 4)\frac{n}{\log n}\ge \pi(n)\ge \frac{\log 2}{2}\frac{n}{\log n}\]

This technique originates from nineteenth-century mathematician Pafnuty Chebyshev, who used a more sophisticated version to prove tighter bounds:

\[1.1\frac{n}{\log n}\ge \pi(n)\ge 0.92\frac{n}{\log n}\]

for sufficiently large \(n\).

The Lower Bound

Suppose \(n\) is even, i.e., \(n=2k\). Consider the prime factorization of the binomial coefficient \(\binom{2k}{k}\). Since all of its prime factors lie between \(1\) and \(2k\), we can write

\[\binom{2k}{k}=p_1^{a_1}\cdots p_{\pi(2k)}^{a_{\pi(2k)}}\]

Claim: For every prime \(p_i\), we have \(p_i^{a_i}\leq 2k\).

Proof: The key observation is the simple identity

\[\lfloor a+b\rfloor-\lfloor a\rfloor-\lfloor b\rfloor\leq1\]

Recall that the largest exponent of a prime \(p_i\) dividing \(r!\) is

\[\left\lfloor\frac{r}{p_i}\right\rfloor+\left\lfloor\frac{r}{p_i^2}\right\rfloor+\cdots\]

Since \(\binom{2k}{k}=\frac{(2k)!}{(k!)^2}\), the exponent \(a_i\) satisfies

\[\begin{align} a_i=&\left\lfloor\frac{2k}{p_i}\right\rfloor-2\left\lfloor\frac{k}{p_i}\right\rfloor+\left\lfloor\frac{2k}{p_i^2}\right\rfloor-2\left\lfloor\frac{k}{p_i^2}\right\rfloor+\cdots \\ &\leq 1+1+\cdots \end{align}\]

The sum contains as many \(1\)s as there are powers \(p_i^j \leq 2k\). Therefore \(a_i \leq \log_{p_i}(2k)\), which implies \(p_i^{a_i}\leq 2k\).

(Q.E.D.)

Combining the factorization with this bound, we obtain

\[\binom{2k}{k}=p_1^{a_1}\cdots p_{\pi(2k)}^{a_{\pi(2k)}}\leq (2k)^{\pi(2k)}\]

Since \(\binom{2k}{k}\ge 2^k\) (which follows from the binomial theorem), we have

\[(2k)^{\pi(2k)}\ge 2^k\implies \pi(2k)\ge \frac{\log 2}{2}\cdot\frac{2k}{\log 2k}\]

An analogous argument applies when \(n\) is odd by considering \(\binom{2k+1}{k+1}\). This establishes the lower bound:

\[\boxed{\pi(n)\ge \frac{\log 2}{2}\cdot\frac{n}{\log n}}\]

The Upper Bound

Claim: For all \(n\ge 2\),

\[\prod_{p\leq n\mid p\ \text{prime}}p\leq 4^n\]

Proof: We proceed by induction. The base case holds trivially. The step from \(n=2k-1\) to \(n=2k\) is immediate since no new primes appear. Thus we need only establish the step from \(n=2k\) to \(n=2k+1\):

\[\begin{align} \prod_{p \leq 2k+1\mid p\ \text{prime}}p &= \prod_{p \leq k+1\mid p\ \text{prime}}p \cdot\prod_{k+2\leq p \leq 2k+1\mid p\ \text{prime}}p\\ &\leq 4^{k+1}\cdot \binom{2k+1}{k}\quad (1) \end{align}\]

By the inductive hypothesis, the first product satisfies \(\prod_{p \leq k+1} p \leq 4^{k+1}\). For the second product, observe that every prime between \(k+2\) and \(2k+1\) divides \(\binom{2k+1}{k}\), so their product is at most \(\binom{2k+1}{k}\).

The binomial coefficient satisfies \(\binom{2k+1}{k}<4^k\). Substituting into (1) yields

\[\prod_{p \leq 2k+1\mid p\ \text{prime}}p \leq 4^{k+1}\cdot 4^k=4^{2k+1}\quad (\text{Q.E.D.})\]

To convert this product bound into a bound on \(\pi(n)\), fix \(2\leq m\leq n\) and consider the primes between \(m\) and \(n\). There are \(\pi(n)-\pi(m)\) such primes, each at least \(m\), so

\[m^{\pi(n)-\pi(m)}\leq \prod_{m\leq p \leq n\mid p\ \text{prime}}p\leq \prod_{1\leq p \leq n\mid p\ \text{prime}}p\leq 4^n\]

Taking logarithms,

\[(\pi(n)-\pi(m))\log m\leq n\log 4\]

Rearranging,

\[\pi(n)\leq \pi(m)+\frac{n\log 4}{\log m}\leq m+\frac{n\log 4}{\log m}\]

Setting \(m=\frac{n}{\log n}\) gives

\[\pi(n)\leq \frac{n}{\log n}\left(1+\frac{\log 4\cdot\log n}{\log(n/\log n)}\right)\]

For \(n\ge 12\), one can verify that \(\frac{\log n}{\log(n/\log n)}\leq 2\) (and in fact much tighter bounds hold for large \(n\)). Therefore

\[\boxed{\pi(n)\leq \left(1+2\log 4\right)\frac{n}{\log n}}\]