Probabilistic approach to prime counting

Probabilistic approach to prime counting
Chris De Corte
chrisdecorte@yahoo.com
KAIZY BVBA
Beekveldstraat 22
9300 Aalst, Belgium
February 2, 2014

1

The goal of this document is to share with the mathematical community the ﬁnding of a new
mathematical relationship with regard to prime counting that originated out of our probabilistic approach to the occurrence of primes.
One of the side results will be a new formula for the natural logarithmic function as well.

2

CONTENTS

CONTENTS

Contents
1 Key-Words

4

2 Introduction

4

3 Derivation of formula’s

4

3.1

Lemma: Probability of ﬁnding a prime . . . . . . . . . . . . . . . . . . . . . .

4

3.2

Theorem: Probabilistic prime counting formula . . . . . . . . . . . . . . . . .

6

3.3

Corollary: new formula for natural logarithm . . . . . . . . . . . . . . . . . .

7

4 Method and techniques

7

4.1

Probabilistic Prime Counting formula . . . . . . . . . . . . . . . . . . . . . .

7

4.2

New formula for natural logarithm . . . . . . . . . . . . . . . . . . . . . . . .

8

5 Discussions

10

6 Acknowledgments

10

7 References

11

3

3

1

DERIVATION OF FORMULA’S

Key-Words

Prime counting, Logarithmic Integral, Riemann, Gauss, Euler-Mascheroni.

2

Introduction

The following document originated out of our interest for primes.
We remain surprised by the fact that the young Riemann was at the origin of such a complex
formula as a proposition for the prime-counting function [1]
π(x) =

x
ln(x) − 1

(1)

or others [1] using the logarithmic integral:
x

li(x) =
2

dt
ln(t)

(2)

Nowhere did we ever see a theoretical explanation where this first formula came from. For
this reason, we started our quest to derive a (better) formula based on theoretical grounds.
The question was: how?

3
3.1

Derivation of formula’s
Lemma: Probability of finding a prime

The probability (Pr) to find a next prime immediately after an existing prime pi can be
represented by:
x=pi

(1 − 1/pi )

P r(pi+1 ) =

(3)

i=2

Proof:
Suppose that we have a classical x-axis which starts at 0 and which has nodes that count up
according to the natural numbers. We will see every natural number as a node on this x-axis.
4

3.1

Lemma: Probability of finding a prime

3


Suppose our task is to touch every natural number with a graph. The easiest way to do this
would be to create a sinus like function with a period T=2. This graph would immediately go
through 0, 1, 2, ... hence through all the natural numbers. We call it the 1-graph. There is no
way to find a vacant spot any more on this x-axis now. So, if a vacant spot would represent a
chance of a possible prime number, there would be no chance of finding a prime number any
longer.
The above is the same as saying that every natural number is divisible by 1.
We will now delete our 1-graph and start with the new challenge to touch all the nodes from
2 up on the x-axis using a combination of different graphs than the 1-graph. Remember: our
challenge is to touch all the nodes on the x-axis without using the 1-graph. So, we will try to
fulfill the fundamental theorem of arithmetic [1] using pi − graph s only.
So, immediately after 1 but before 2: what is our chance of finding a prime? This will be
100% since all the nodes are still free and the next number that we will find will only be
divisible by 1 or itself. This is because there are no natural numbers in between. The next
number we find is 2 and 2 is a prime.
The most logic first graph in this new challenge will be the 2-graph and it will go through:
0, 2, 4, ... The probability of finding vacant spots on our x-axis (from 2 onward) will now be
represented by the formula (1 − 1/2). So, the number of vacant nodes along the x-axis immediately diminishes with 1/2. These vacant spots can temporarily be interpreted as possible
prime candidates over a short distance along the x-axis. And it seems to be true as the next
vacant spot (prime) is node 3.
The next logical graph that we will add will be the 3-graph and will go through 0, 3, 6, ... One
would at first think that the number of possible vacancies along the x-axis would immediately
diminish with an extra 1/3 but that is not completely correct since certain points (like 6, 12,
18, ...) would be touched by the 2-graph as well as by the 3-graph and we should not double
count them. The correct number of vacant spots on our x-axis (from 3 onward) will now be
represented by the formula (1 − 1/2).(1 − 1/3).
From 3 onward the chance of finding a next prime will hence be 33%.
If we continue with the same reasoning, the next vacant spot (prime) we find is node 5. So, the
next logical graph will have to go through 0, 5, 10, ... Now, we must watch out not to overdo
with avoiding the double counting (1/2x3, 1/2x5 and 1/3x5) hence we will have to correct
for a triple counting with adding a factor for 1/(2x3x5). The correct number of vacant spots
(from 5 onward) on our x-axis will however simplify to the formula (1−1/2).(1−1/3).(1−1/5).
Our new formula will be valid till we find the next vacancy: prime 7.
We are now ready to believe that the probability for finding a vacancy (read a prime) immediately after x = pi will be represented by:
x=pi

(1 − 1/pi )

P r(pi+1 ) =

(4)

i=2

5

3.2

Theorem: Probabilistic prime counting formula

3


In the tables, we call this probability the total probability. The individual contribution to this
probability by each different prime pi , we will call the individual probability and is simply
calculated as (1 − 1/pi ).
We can mention already that the contribution of this total probability over a distance ∆x to
the prime counting will then be represented by:
x=pi

(1 − 1/pi )

∆x.

(5)

i=2

3.2

Theorem: Probabilistic prime counting formula

The probabilistic prime counting formula can be represented as follows:
x x=pi

(1 − 1/pi ).dx with α ≈ 1.7810292

π(x = pi ) = α.
2

(6)

i=2

α can be very closely approximated as:
α = eγ where γ ≈ 0.57721 is the Euler − M ascheroniconstant

(7)

Proof (including an heuristic argument)
In order to derive the probabilistic prime counting formula, we have to multiply each new
probability of finding a prime with the distance over which this probability is valid (until the
next prime) and sum up all these values. This can be approximated as:
x=pi

π(x = pi ) =

∆(x)

x=pi

{[

{P r(pi+1 ).∆(x)} =

(1 − 1/pi )].(pi − pi−1 )}

(8)

i=2

i=2

This is like calculating a surface over an irregular step function and can be represented with
an integral symbol. This method is an integration by parts.
However after initial tests on the first 50 million primes (see Method and Techniques),
We came to the conclusion that the above probabilistic prime counting function should
be corrected with a constant α. Therefore, if we refer to table 1, we can check that α =
50847534/28549683. As a conclusion, we come to formula 6.

6

3.3

3.3

Corollary: new formula for natural logarithm

4

METHOD AND TECHNIQUES

Corollary: new formula for natural logarithm

We can now compare formula 2 with formula 6 and conclude the following new formula for
the natural logarithm:
ln(x = pi ) =

β
x=pi
i=2 (1

− 1/pi )

with β = 0.5614575

(9)

β can be very closely approximated as:
β = e−γ where γ ≈ 0.57721 is the Euler − M ascheroniconstant

(10)

Proof
The proof is trivial if we explain that we introduce β = 1/α. We must mention however that
we have optimized β in this case to come as close to li(x) as possible (using a slightly different
α. Therefore, if we refer to table 1, we can check that β = 28549683/50849229.

4

Method and techniques

We have developed a C++ program to verify the above logic for the first 50.7 million primes.
The result is stored in 1000 separate files of each around 7 Megabytes. A summary of this
information can be found in tables 1 and 2.

4.1

Probabilistic Prime Counting formula

We will now explain the calculation steps done and refer to the respective column in both
tables where this information can be found.
In the first column of both file types we have put a counter that basically will count the
primes.
In the second column, we have listed the first 50.7 million primes.
In a third calculation we have calculated the individual probability of the prime of that row
using the formula:
Ind P rob(pi ) = (1 − 1/pi )

(11)

Because of space reasons, we have not listed this calculation in table 1 but it can be found in
the third column of the separate files. The calculation is however trivial.

7

4.2

New formula for natural logarithm

4


In a next calculation, we calculated what we call the total probability of all the primes until
that prime according to the formula:
x=pi

(1 − 1/pi )

T ot P rob(pi ) =

(12)

i=2

The result of this calculation can be found in column 3 of table 1 and in column 4 of the
separate files.
In the next column (My Pure), we calculated formula 6 but with α = 1. So, it comes to
calculating formula 7. The reason for this is that we are then able to calculate the results for
all the primes first and then estimate α afterwards based on the last prime calculated. We
multiplied the previous probabilistic value with the difference of this prime with the previous
prime (∆x) and added this value to all the previous values (integration by parts).
In the next column (My Integr), we did exactly the same as in the previous column but we
corrected α = 1.7810292 from prime 2 onward.
In the next column, we simulated the logarithmic integral (formula 2) in a similar discrete
fashion:
li(pi ) = li(pi−1 ) +

pi − pi−1
ln(pi )

(13)

As mentioned before, for mostly unknown reasons (see later), we came to the conclusion that
the difference between my pure probabilistic integral and the real prime count seems to be
constant and comes to a factor 1.7810292 if we want to correct for primes upto 1 billion.
To illustrate the difference between my pure probabilistic integral and the logarithmic integral, we came to a factor 1.7810785 or 1/0.5614575.
We will use the first constant to optimize our prime count formula and the second to demonstrate our formula for the natural logarithm.
The results are demonstrated in the table 1.

4.2


We were also eager to check our new formula for the natural logarithm and therefore we have
added 2 more columns to our separate tables. One in which we approximate the logarithm
based on our formula: we dividend our β with the total probability of all the primes until
that prime. Another column in which we calculated the natural logarithm the standard way.
A summary of this can be seen in table 2.

8

4.2


4


Table 1: In this table, we calculate the probability for a new prime immediately after another one
(”My Integr”) and compare it with the logarithmic integral (”li(x)”) and with the perfect
prime count (”Count”). My integr has been corrected with a factor α = 50847534/28549683
versus My pure. We can also deduct β = 28549683/50849229.

Count

P rime

T ot prob

M y pure

M y integr

li(x)

1

2

0.5

1

1

1

10

29

0.1579472231019522

6.82

11.37

11.77

100

541

0.0887498837753298

59

105

106

1000

7919

0.06246659294666621

569

1012

1015

10000

104729

0.04855897117165969

5632

10030

10037

100000

1299709

0.03988065693618553

56197

100088

100107

1000000

15485863

0.03391365652866189

561668

1000346

1000408

10000000

179424673

0.02954214659629333

5614822

10000162

10000496

20000000

373587883

0.02844462187012779

11230033

20001015

20001621

30000000

573259391

0.02784067067191051

16844593

30000712

30001582

40000000

776531401

0.02742791846692972

22459541

40001097

40002229

50000000

982451653

0.02711633109429194

28074042

50000689

50002077

50847534

999999937

0.02709315486988111

28549683

50847819

50849229

Table 2: In this table we calculate the natural logarithm of the respective prime using our new formula
(”My ln(x)”) and compare it with the values using the standard formula (”ln(x)”).

Count

P rime

M y ln(x)

ln(x)

1

2

1.122915132600837

0.6931471805599453

10

29

3.554716286072388

3.367295829986474

100

541

6.326290721932071

6.293419278846481

1000

7919

8.988125329322656

8.977020214210413

10000

104729

11.56238595574076

11.55913134035915

100000

1299709

14.078443271354

14.07765095122063

1000000

15485863

16.55550075604224

16.55543810118943

10000000

179424673

19.00530702704133

19.00526602879025

20000000

373587883

19.73861944320852

19.73866383070949

30000000

573259391

20.16681181703334

20.16684886160037

40000000

776531401

20.47029441834519

20.47034763888541

50000000

982451653

20.70551374918884

20.70556169235444

50847534

999999937

20.72322581098074

20.72326577394641

9

6

5

ACKNOWLEDGMENTS

Discussions

The question whether the lower part of the integral in formula 6 needs to start at 2 or maybe
at 1 or 0 as well as the fact that the first value ”1” in table 1 for ”MyIntegr” is the correct
one can be answered as follows: In the range between 1 and before 2, the chance of finding
a prime according to our logic is 100 percent. And our challenge only started at 1. So, the
value for ”MyIntegr” could then be represented by the formula: π(x) = (x − 1). There is
no need to have a correction factor here since this formula suits us fine as the starting point
π(1) = 0 and end point π(2) = 1 are correct according to our logic and according to the facts.
The question of why we had to correct our probabilities with a factor α from 2 onward remains
open. And to be honest, we can’t give a good mathematical answer. By using probabilities,
we assumed that the probability of finding a prime is flat over some range ∆(x). However,
this flat probability might not be the case over some longer distance. Immediately after a
prime, we know that the next number will not be a prime since it will be even. However,
there is a sort of higher chance that the number after this even number might be a prime. We
could explain this by assuming that a bigger part of our periodic waves are still in the phase
of going up or turning to come down but just not yet in the phase of touching the x-axis
again. So we are actually always moving before the bigger waterfall of prime multiples that
come down on us.
It is logic from our side to try to optimize α according to our needs. In the first part, we
wanted to optimize it for getting a better prime counting formula, in the second part, we
wanted to optimize it to come as close as possible to the natural logarithm.

6

Acknowledgments

It was only after the writing of this article that we found out that earlier work of Harald
Cram´r had similarities with the content of this article. However, what the Cram´r model
e
e
didn’t have was our ”infinitesimal” intake of using different probabilities for each distance
∆(x) = pi − pi−1 as we are doing in formula 8 and 6. Instead, it seems, Cram´r used the same
e
√
probability for all p ≤ x.
I would like to thank this publisher, his professional staff and his volunteers for all the effort
they take in reading all the papers coming to them and especially I would like to thank this
reader for reading my paper till the end.
I would like to thank my wife for supporting me during the countless hours I spend behind
my desk.
10

7

7

REFERENCES

References
1. Richard Crandall and Carl Pomerance (2005). Prime Numbers. Springer, second edition. Page 1,10.

11

Probabilistic approach to prime counting

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Viewers also liked

Viewers also liked (14)

Similar to Probabilistic approach to prime counting

Similar to Probabilistic approach to prime counting (20)

More from Chris De Corte

More from Chris De Corte (17)

Probabilistic approach to prime counting