OKlibrary  0.2.1.6
CountingProgressions.hpp File Reference

On counting arithmetic progressions in the primes. More...

Go to the source code of this file.


Detailed Description

On counting arithmetic progressions in the primes.

Todo:
Connections
Todo:
The distribution of arithmetic progressions amongst primes
  • The task is to find a nice (thus very likely approximated) law for the values in the list ln_arithprog_primes_c(k,n) (see ComputerAlgebra/NumberTheory/Lisp/PrimeNumbers.mac) for fixed k >= 1.
  • One can consider the densities ln_arithprog_primes_c(k,n) / create_list(i,i,1,n).
  • Hard to believe that there is nothing in the literature / on the Internet: We should enter for example ln_arithprog_primes_c(3,30) = [0,0,0,1,2,2,3,5,7,9,11,11,13,16,17,20,23,24,26,30,32,36,40,44,46,49,53,56,59,64] into that database of integer sequences and see whether there is information in it.
  • Yes, this sequence is A125505 in http://www.research.att.com/~njas/sequences/Seis.html (see http://www.research.att.com/~njas/sequences/A125505).
  • There it is only listed for n=64; this we can easily extend, and perhaps we should do so.
  • And apparently for k >= 4 there is nothing entered there --- we should change this.
    1. Say, up to k=10.
    2. For k=10 for example the first 315 values are 0, and then at least until index 3000 the value is constant 1; for such sequences we need a compressed representation.
  • The best internet resource seems to be http://primes.utm.edu/top20/page.php?id=14 .
  • Ploted via e.g.
    plot2d([discrete,create_list(i,i,1,1000),ln_arithprog_primes_c(3,1000)]);
       
  • For k = 1,2 this is trivial.
  • For k >= 3 regression is to be performed; most powerful is using R, but for initial considerations also simple_linear_regression (use 'load("stats")') can be used.
Todo:
The Grosswald-Hagis-formula
  • In [Grosswald, Hagis, 1979, Arithmetic progressions consisting only of primes, Mathematics of Computation, 33(148):1343-1352] the Hardy-Littlewood estimations have been specialised for arithmetic progressions.
  • The conjecture is that nhyp_arithprog_primes_hg(k,n) is asymptotically equal to C_k / (2*(k-1)) * n^2 / (log n)^(k-2). This is proven for k <= 4.
  • This estimations is gh_nhyp_arithprog_primes_hg(k,n).
  • As stated in "Improving the Grosswald-Hagis estimation" in ComputerAlgebra/RamseyTheory/Lisp/GreenTao/plans/Asymptotics.hpp, one would expect that using logarithmic integrals should yield a more precise formula.
Todo:
Computing the Grosswald-Hagis coefficients C_k
  • We have C_2 = 1.
  • The following data has been computed by
    > GrosswaldHagisFormula-O3-DNDEBUG 3,20 1000000000
    Precision in bits: 416
    The first prime number not taken into account: 1000000007
    
    > GrosswaldHagisFormula-O3-DNDEBUG 11,20 4000000000
    Precision in bits: 416
    The first prime number not taken into account: 4000000007
    
    > GrosswaldHagisFormula-O3-DNDEBUG 3,20 100000000000
    Precision in bits: 448
    The first prime number not taken into account: 100000000003
    
       
  • Computation of C_3:
    C_3 = 1.3203236317546366339
    Finite and infinite part: 1.5, 0.88021575450309108924
    1 - first remaining factor: 9.99999988000000108e-19
    
    C_3 = 1.3203236316942413054
    Finite and infinite part: 1.5, 0.88021575446282753695
    1 - first remaining factor: 9.9999999996e-23
       
    while GH give the value 1.32032; a guess is C_3 = 1.320323631... .
  • Computation of C_4:
    C_4 = 2.8582485961147147258
    Finite and infinite part: 4.5, 0.63516635469215882796
    1 - first remaining factor: 2.999999966000000288e-18
    
    C_4 = 2.8582485957224816583
    Finite and infinite part: 4.5, 0.63516635460499592406
    1 - first remaining factor: 2.9999999999e-22
       
    while GH give the value 2.85825; let's guess C_4 = 2.85824859... .
  • Computation of C_5:
    C_5 = 4.1511808643862090045
    Finite and infinite part: 6.591796875, 0.62974951187133007712
    1 - first remaining factor: 5.999999936000000507e-18
    
    C_5 = 4.1511808632468886479
    Finite and infinite part: 6.591796875, 0.62974951169849095933
    1 - first remaining factor: 5.99999999984e-22
       
    while GH give the value 4.15118; let's guess C_5 = 4.15118086... .
  • Computation of C_6:
    C_6 = 10.131794954669182916
    Finite and infinite part: 24.71923828125, 0.40987488527728368616
    1 - first remaining factor: 9.999999900000000735e-18
    
    C_6 = 10.131794950034614014
    Finite and infinite part: 24.71923828125, 0.40987488508979534818
    1 - first remaining factor: 9.9999999998e-22
       
    while GH give the value 10.1318; let's guess C_6 = 10.1317949... .
  • Computation of C_7:
    C_7 = 17.298612323552886198
    Finite and infinite part: 33.392508824666341146, 0.51803871384394953143
    1 - first remaining factor: 1.4999999860000000945e-17
    
    C_7 = 17.298612311683576106
    Finite and infinite part: 33.392508824666341146, 0.5180387134885012435
    1 - first remaining factor: 1.49999999998e-21
       
    while GH give the value 17.2986; let's guess C_7 = 17.2986123... .
  • Computation of C_8:
    C_8 = 53.971948352406135059
    Finite and infinite part: 146.09222610791524251, 0.36943751074433075237
    1 - first remaining factor: 2.0999999818000001113e-17
    
    C_8 = 53.971948300560721615
    Finite and infinite part: 146.09222610791524251, 0.36943751038944935433
    1 - first remaining factor: 2.099999999986e-21
       
    while GH give the value 53.9720; let's guess C_8 = 53.9719483... .
  • Computation of C_9:
    C_9 = 148.55162885563210856
    Finite and infinite part: 639.15348922212918599, 0.23241933488687408369
    1 - first remaining factor: 2.7999999776000001218e-17
    
    C_9 = 148.55162866536732961
    Finite and infinite part: 639.15348922212918599, 0.23241933458919163
    1 - first remaining factor: 2.8e-21
       
    while GH give the value 148.552; let's guess C_9 = 148.551628... .
  • Computation of C_10:
    C_10 = 336.034327307194497
    Finite and infinite part: 2796.2965153468151887, 0.12017120697427801113
    1 - first remaining factor: 3.5999999736000001242e-17
    
    C_10 = 336.03432675383279702
    Finite and infinite part: 2796.2965153468151887, 0.12017120677638708757
    1 - first remaining factor: 3.600000000024e-21
       
    while GH give the value 336.034; let's guess C_10 = 336.034326... .
  • Computation of C_11:
    C_11 = 511.42228312047417728
    C_11 = 511.42228230839231811
    Finite and the infinite part: 2884.6653988745989106, 0.17728998396414178893
    1 - first remaining factor: 2.8124999953125000046e-18
    
    C_11 = 511.42228206774875224
    Finite and infinite part: 2884.6653988745989106, 0.17728998388072013248
    1 - first remaining factor: 4.50000000006e-21
       
    while GH give the value 511.422; let's guess C_11 = 511.422282... .
  • Computation of C_12:
    C_12 = 1312.3197146277008806
    Finite and infinite part: 13882.452232084007257, 0.094530828753350440844
    1 - first remaining factor: 5.499999967000000099e-17
    
    C_12 = 1312.3197120808120353
    Finite and the infinite part: 13882.452232084007257, 0.094530828569889420933
    1 - first remaining factor: 3.4374999948437500039e-18
    
    C_12 = 1312.3197113260945073
    Finite and infinite part: 13882.452232084007257, 0.094530828515524564108
    1 - first remaining factor: 5.50000000011e-21
       
    while GH give the value 1312.32; let's guess C_12 = 1312.31971... .
  • Computation of C_13:
    C_13 = 2364.598970504069348
    Finite and infinite part: 13428.850937459748114, 0.17608349228957677085
    1 - first remaining factor: 6.5999999648000000693e-17
    
    C_13 = 2364.5989649971453789
    Finite and the infinite part: 13428.850937459748114, 0.17608349187949522368
    1 - first remaining factor: 4.1249999945000000027e-18
    
    C_13 = 2364.5989633652830187
    Finite and infinite part: 13428.850937459748114, 0.17608349175797608791
    1 - first remaining factor: 6.600000000176e-21
       
    while GH give the value 2364.60; let's guess C_13 = 2364.59896... .
  • Computation of C_14:
    C_14 = 7820.6000583800047652
    Finite and infinite part: 70011.873897902124282, 0.11170390996511158496
    1 - first remaining factor: 7.7999999636000000273e-17
    
    C_14 = 7820.6000368550459882
    Finite and the infinite part: 70011.873897902124282, 0.11170390965766432525
    1 - first remaining factor: 4.8749999943125000011e-18
    
    C_14 = 7820.6000304765722324
    Finite and infinite part: 70011.873897902124282, 0.11170390956655872557
    1 - first remaining factor: 7.80000000026e-21
       
    while GH give the value 7820.61; let's guess C_14 = 7820.60003... . So here we have a descrepancy.
  • Computation of C_15:
    C_15 = 22938.908728604769022
    Finite and infinite part: 365009.82172812513753, 0.062844634207379343627
    1 - first remaining factor: 9.0999999635999999727e-17
    
    C_15 = 22938.908654946451496
    Finite and the infinite part: 365009.82172812513753, 0.062844634005581164122
    1 - first remaining factor: 5.6874999943124999989e-18
    
    C_15 = 22938.908633119341404
    Finite and infinite part: 365009.82172812513753, 0.062844633945782471615
    1 - first remaining factor: 9.100000000364e-21
       
    while GH give the value 22939; let's guess C_15 = 22938.9086... .
  • Computation of C_16:
    C_16 = 55651.462823015330355
    Finite and infinite part: 1902993.9143221524098, 0.029244162266718764193
    1 - first remaining factor: 1.0499999964999999906e-16
    
    C_16 = 55651.4626168225131
    Finite and the infinite part: 1902993.9143221524098, 0.029244162158366963537
    1 - first remaining factor: 6.5624999945312499963e-18
    
    C_16 = 55651.462555721561157
    Finite and infinite part: 1902993.9143221524098, 0.029244162126259161466
    1 - first remaining factor: 1.050000000049e-20
       
    while GH give the value 55651; let's guess C_16 = 55651.462... .
  • Computation of C_17:
    C_17 = 91555.111732881423894
    Finite and infinite part: 1539516.4947250383578, 0.059470042735224739834
    1 - first remaining factor: 1.1999999967999999826e-16
    
    C_17 = 91555.111345203124141
    Finite and the infinite part: 1539516.4947250383578, 0.059470042483406522178
    1 - first remaining factor: 7.4999999949999999932e-18
    
    C_17 = 91555.111230322724999
    Finite and infinite part: 1539516.4947250383578, 0.059470042408785432027
    1 - first remaining factor: 1.200000000064e-20
       
    while GH give the value 91555; let's guess C_17 = 91555.111....
  • Computation of C_18:
    C_18 = 256474.86146297656364
    Finite and infinite part: 8527979.2287552010855, 0.030074517606489678389
    1 - first remaining factor: 1.3599999972799999735e-16
    
    C_18 = 256474.86023216558505
    Finite and the infinite part: 8527979.2287552010855, 0.030074517462163461642
    1 - first remaining factor: 8.4999999957499999896e-18
    
    C_18 = 256474.85986744035674
    Finite and infinite part: 8527979.2287552010855, 0.030074517419395389801
    1 - first remaining factor: 1.3600000000816e-20
       
    while GH give the value 256480; let's guess C_18 = 256474.85... . So here we have a descrepancy.
  • Computation of C_19:
    C_19 = 510992.01391519899563
    Finite and infinite part: 6579820.4950027289514, 0.077660479385917816586
    1 - first remaining factor: 1.5299999979599999633e-16
    
    C_19 = 510992.01115644361769
    Finite and the infinite part: 6579820.4950027289514, 0.077660478966642643345
    1 - first remaining factor: 9.5624999968124999857e-18
    
    C_19 = 510992.01033894385383
    Finite and infinite part: 6579820.4950027289514, 0.07766047884239916824
    1 - first remaining factor: 1.530000000102e-20
       
    while GH give the value 510990; let's guess C_19 = 510992.01... .
  • Computation of C_20:
    C_20 = 1900972.5998672649484
    Finite and infinite part: 38473077.653099090942, 0.049410463519653916442
    1 - first remaining factor: 1.7099999988599999521e-16
    
    C_20 = 1900972.5883968371188
    Finite and the infinite part: 38473077.653099090942, 0.049410463221512241035
    1 - first remaining factor: 1.0687499998218749981e-17
    
    C_20 = 1900972.5849978144616
    Finite and infinite part: 38473077.653099090942, 0.04941046313316415805
    1 - first remaining factor: 1.7100000001254e-20
       
    while GH give the value 1901000; let's guess C_20 = 1900972.5... .
Todo:
Using curve-fitting
  • The role model for curve-fitting is implemented by "fit_greentao" in OKlib/Statistics/R/GreenTao.R.
  • One can also consider n_arithprog_primes_nc[k,n] (the non-cumulative data, i.e., as list the difference list of the above list):
    plot2d([discrete,create_list(i,i,1,1000),create_list(n_arithprog_primes_nc[3,n],n,1,1000)]);
       
    Though it seems that the accumulated data is easier to handle (since being smoother).
  • Perhaps it is more appropriate to consider only changes here, that is, skipping n-values where no new arithmetic progression is added). This is plotted in non-cumulative resp. cumulative form by
    plot2d([discrete, sizes_strata_indmon_ohg(arithprog_primes_ohg(3,1000))]);
    plot2d([discrete, sizes_cstrata_indmon_ohg(arithprog_primes_ohg(3,1000))]);
       
  • At the C++-level we have Applications/RamseyTheory/CountProgressions_GreenTao.cpp.
Todo:
k=3
  • Using Applications/RamseyTheory/CountProgressions_GreenTao.cpp and linear regression in R:
    > f = fit_greentao(3,1000)
    Number of observations (changes) =  995
    Max nhyp =  40510
    Coefficients: 17.11290 -2.962008 21.49815 -57.41431 ;  0.3300809
    Residual range: -37.03781 45.99563
    
    > f(100)
      578.0767
    
    > f = fit_greentao(3,10000)
    Number of observations (changes) =  9995
    Max nhyp =  3091531
    Coefficients: 321.6195 -3.338598 30.12935 -101.7892 ;  0.3300809
    Residual range: -715.0495 930.5577
    
    f(100)
      790.0139
       
    (where the correct value for f(100) is 579).
  • So f_3(n) = 321.6195 + 0.3300809*n^2/log(n) * (1 + -3.338598/log(n) + 30.12935/log(n)^2 - 101.7892/log(n)^3) is a good model.
  • For N=30000 we obtain:
    > f = fit_greentao(3,30000)
    Number of observations (changes) =  29995
    Max nhyp =  25000740
    Coefficients: 843.0986 -3.324593 30.58643 -107.2370 ;  0.3300809
    Residual range: -3259.890 3230.183
    
    f(100)
      1289.139
       
    So f_3(n) = 843.0986 + 0.3300809*n^2/log(n) * (1 + -3.324593/log(n) + 30.58643/log(n)^2 - 107.2370/log(n)^3) is a good model.
  • For N=100000 we need a C++ program which doesn't store the progressions.
Todo:
k=4
  • TO BE UPDATED:
    > f = fit_greentao(4,20000)
    Number of observations (changes) =  19975
    Max nhyp =  1462656
    Coefficients: 165.4101 0.3400277 0.684767 -4.955831
    Residual range: -499.0341 410.999
    
    > f = fit_greentao(4,40000)
    Number of observations (changes) =  39975
    Max nhyp =  5148933
    Coefficients: -549.0156 0.4599897 -1.662170 6.5372
    Residual range: -1115.498 923.5618
       
Todo:
k=5
  • TO BE UPDATED:
    > f = fit_greentao(5,40000)
    Number of observations (changes) =  39347
    Max nhyp =  462282
    Coefficients: -219.7298 0.5031083 -2.683063 10.54289
    Residual range: -230.2833 290.1797
    
    > f = fit_greentao(5,80000)
    Number of observations (changes) =  79347
    Max nhyp =  1545857
    Coefficients: -84.72713 0.4143196 -0.8405665 0.9811276
    Residual range: -448.2709 539.9275
       
Todo:
k=6
  • TO BE UPDATED:
    > f = fit_greentao(6,80000)
    Number of observations (changes) =  70976
    Max nhyp =  234774
    Coefficients: -113.8172 0.8496358 -4.188096 14.99478
    Residual range: -168.2007 163.1515
    
    80000 - 70976
      9024
    
    > f = fit_greentao(6,160000)
    Number of observations (changes) =  150810
    Max nhyp =  749499
    Coefficients: -102.2112 0.7783336 -2.665443 6.869895
    Residual range: -322.2387 365.8506
    
    160000 - 150810
      9190
       
Todo:
k=7
  • TO BE UPDATED:
    > f = fit_greentao(7,160000)
    Number of observations (changes) =  59909
    Max nhyp =  78058
    Coefficients: -3.696913 0.8696752 -0.3139054 -13.02317
    Residual range: -159.1105 169.4958
    
    160000 -  59909
      100091
    
    > f = fit_greentao(7,500000)
    Number of observations (changes) =  298388
    Max nhyp =  497046
    Coefficients: 703.3813 -0.3935023 32.41404 -224.6832
    Residual range: -681.7677 457.4785
    
    500000 - 298388
      201612
    
    > f = fit_greentao(7,1000000)
    Number of observations (changes) =  736449
    Max nhyp =  1558942
    Coefficients: -1097.584 1.736234 -23.11872 137.8766
    Residual range: -529.4176 1079.641
    
    1000000 -  736449
      263551
       
Todo:
k=8
  • TO BE UPDATED:
    > f = fit_greentao(8,1000000)
    Number of observations (changes) =  230866
    Max nhyp =  268082
    Coefficients: -170.0498 2.566231 -13.61017 54.1059
    Residual range: -189.4731 201.5056
    
    1000000 - 230866
      769134
    
    > f = fit_greentao(8,2000000)
    Number of observations (changes) =  649644
    Max nhyp =  812685
    Coefficients: 186.8815 3.010198 -22.80165 96.40528
    Residual range: -882.6667 537.2372
    
    2000000 - 649644
      1350356
    
    > f = fit_greentao(8,4000000)
    TO BE UPDATED:
    Number of observations (changes) =  1781803
    Max nhyp =  2491439
    Coefficients: -675.2275 6.202912 -31.85938 883.982
    Residual range: -863.7256 977.2554
    
    4000000 -  1781803
      2218197
    
    > f = fit_greentao(8,8000000)
    TO BE UPDATED:
    Number of observations (changes) =  4688545
    Max nhyp =  7728990
    Non-linear model nhyp = a * n^b:
               a            b
    4.218958e-05 1.631506e+00
    Residual range:  -10123.48 6262.802
    
    8000000 - 4688545
      3311455
       

Definition in file CountingProgressions.hpp.