OKlibrary  0.2.1.6
AdditiveNumberTheory.hpp File Reference

On investigations into additive number theory. More...

Go to the source code of this file.


Detailed Description

On investigations into additive number theory.

Todo:
Connections
Todo:
Some further considerations
  • A natural conjecture here is that for every k there exists some index i_k >= 1 such that for every j >= i_k there exists an arithmetic progression of length k ending with p_j. For k <= 5 we can already conjecture the smallest such i_k, and this should also be possible for k=6, while then it becomes more difficult.
  • Of interest is also http://www.research.att.com/~njas/sequences/Sindx_Pri.html#primes_AP which gives an overview. It contains for given k the smallest starting term, smallest difference and smallest end term for arithmetical progressions of primes of length k. However I do not understand these sequences (strange explanations).
Todo:
Finding the first arithmetic progression
  • Fundamental is to consider k -> how many first primes are needed to get an progression of length k; this is greentao_1(k).
  • See http://users.cybercity.dk/~dsl522332/math/aprecords.htm for current information around this subject; the sequence is available as follows (showing p_i instead of i, that is, the (unranked) prime numbers themselves):
    greentaod1ur;
     [2,3,7,23,29,157,907,1669,1879,2089,
     249037,262897,725663,36850999,173471351,198793279,4827507229,17010526363,83547839407,572945039351,
     6269243827111]
       
  • The ranked data is available via:
    block([L:[]],for k:1 thru inf unless not integerp(greentao([k])) do L : endcons(greentao([k]),L), L);
     [1,2,4,9,10,37,155,263,289,316,
      21966,23060,58464,2253121,9686320,11015837,227225515,755752809,3466256932,22009064470,
      220525414079]
       
  • Additional data (upper bounds):
    1. At http://users.cybercity.dk/~dsl522332/math/aprecords.htm data upper bounds are available also for 22 <= k <= 26 (again, this is the unranked data):
      11410337850553 + 475180·19#·n (108201410428753)
      403185216600637 + 9523·23#·n (449924511422857)
      515486946529943 + 136831·23#·n (1217585417914253)
      6171054912832631 + 366384·23#·n (8132758706802551)
      43142746595714191 + 23681770·23#·n (175223597495211691)
           
      (the numbers in brackets are the end-values, in which we are interest; however, it is not known that these are the smallest possible end-values).
    2. See "Better algorithms" in Structures/NumberTheory/PrimeNumbers/plans/RankPrimes.hpp for thoughts on a better algorithm, and on the ranked data.
  • Plotting the (ranked, precise) data (using R) suggests that (log,log(log))-transformation for (x,y) might be appropriate (that is, a model y = exp(a * x^b)):
    y = c(4,9,10,37,155,263,289,316,21966,23060,58464,2253121,9686320,11015837,227225515,755752809,3466256932,22009064470,220525414079)
    x = 3:21
    plot(log(x),log(log(y)))
    
    m0 = lm(log(log(y)) ~ log(x))
    summary(m0)
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -1.48793    0.11902  -12.50 5.37e-10 ***
    log(x)       1.54569    0.04925   31.38  < 2e-16 ***
    Residual standard error: 0.1191 on 17 degrees of freedom
    Multiple R-squared: 0.983,      Adjusted R-squared: 0.982
    F-statistic: 984.9 on 1 and 17 DF,  p-value: < 2.2e-16
    
    exp(coefficients(m0)[1])
     0.2258395
    coefficients(m0)[2]
     1.545693
    s = c(a = 0.2258395, b = 1.545693)
    m = nls(y ~ exp(a * x^b), start = s)
    
    summary(m)
      Estimate Std. Error t value Pr(>|t|)
    a 0.084184   0.002663   31.61   <2e-16 ***
    b 1.884508   0.010393  181.32   <2e-16 ***
    Residual standard error: 273300000 on 17 degrees of freedom
    Number of iterations to convergence: 18
    Achieved convergence tolerance: 7.111e-06
    
    plot(x,log(y))
    lines(x,log(predict(m)))
    lines(x,exp(predict(m0)),col="blue")
    
    y/predict(m)
     [1]  2.0522259  2.8563990  1.7419151  3.1481344  5.7470262  3.7987529
     [7]  1.4561912  0.4981426  9.7285376  2.5797764  1.4869497 11.7370109
    [13]  9.3188835  1.7663568  5.4843977  2.4815698  1.4003467  0.9899872
    [19]  1.0000489
       
    Explicitly: the model is greentao_1(k) ~ exp(0.084184*k^1.884508).
Todo:
The conjecture from [Granville 2006]
  • The conjecture is implemented by approxgv_grt1ur(k), and it yields quite good approximations:
    for k : 1 thru 21 do print(k, float(greentaod1ur[k]/approxgv_grt1ur(k)));
    
    1 2.289488896354126
    2 1.965659777448311
    3 2.020885909330544
    4 2.468550619322147
    5 1.019801665249097
    6 1.635675606605423
    7 2.574684204425131
    8 1.201613382006138
    9 0.322264451025868
    10 0.08072794540477299
    11 2.062519736581183
    12 0.445862252087117
    13 0.2417209471089714
    14 2.319918131851896
    15 1.991459630418673
    16 0.4025117578670004
    17 1.670907399187739
    18 0.9772800505078132
    19 0.7748870219193731
    20 0.8355720914865566
    21 1.402142105791055
       
  • For ranked data:
    for k : 3 thru 21 do print(k, round_fdd(greentaod1(k)/approxgv_grt1Li_hp(k,30), 3));
    
    3 1.563
    4 1.535
    5 0.796
    6 1.265
    7 2.003
    8 1.131
    9 0.37
    10 0.11
    11 1.924
    12 0.477
    13 0.269
    14 2.199
    15 1.914
    16 0.423
    17 1.63
    18 0.978
    19 0.783
    20 0.841
    21 1.385
       
  • What is the corresponding *direct* approximation for the ranked numbers?
  • This as model for a linear regression of the ranked data, using rank(p) ~ p/log(p) resp. p/(log(p)-1) (optimising on the factor which in the gv-model is chosen as exp(1-gamma), or using that factor while optimising on an optional outer factor):
    y = c(4,9,10,37,155,263,289,316,21966,23060,58464,2253121,9686320,11015837,227225515,755752809,3466256932,22009064470,220525414079)
    x = 3:21
    
    s = c(a = 1.5262)
    mgvr = nls(y ~ (x/2*a)^(x/2) / (x/2*(log(x/2)+log(a))), start = s)
    > summary(mgvr)
    Parameters:
      Estimate Std. Error t value Pr(>|t|)
    a  1.58024    0.00225   702.4   <2e-16 ***
    
    y/predict(mgvr)
     [1] 1.4189310 2.0736544 1.1071981 1.6212498 2.3319017 1.2150958
     [7] 0.3741266 0.1060549 1.7867252 0.4283643 0.2351793 1.8705521
    [13] 1.5884116 0.3427037 1.2919336 0.7584994 0.5944557 0.6255983
    [19] 1.0094899
    
    s = c(a = 1.5262)
    mgvr2 = nls(y ~ (x/2*a)^(x/2) / (x/2*(log(x/2)+log(a))-1), start = s)
    summary(mgvr2)
    Parameters:
      Estimate Std. Error t value Pr(>|t|)
    a  1.57485    0.00227   693.8   <2e-16 ***
    
    y/predict(mgvr2)
     [1] 0.31887457 1.17447699 0.78879713 1.28351643 1.96101952 1.06252655
     [7] 0.33622554 0.09726638 1.66467365 0.40417970 0.22422941 1.79925882
    [13] 1.53953065 0.33437827 1.26804070 0.74845611 0.58944254 0.62310082
    [19] 1.00963502
    
    s = c(b = 1)
    mgvr3 = nls(y ~ b*(x/2*1.5262051)^(x/2) / (x/2*(log(x/2)+log(1.5262051))-1), start = s)
    summary(mgvr3)
    Parameters:
      Estimate Std. Error t value Pr(>|t|)
    b  1.37258    0.02089   65.72   <2e-16 ***
    
    y/predict(mgvr3)
     [1] 0.20391699 0.86692693 0.60148029 1.00097550 1.55934574 0.86031118
     [7] 0.27700774 0.08150497 1.41837262 0.35010064 0.19742792 1.61013672
    [13] 1.40015594 0.30904249 1.19092347 0.71428235 0.57158839 0.61394061
    [19] 1.01076035
       
    Not too bad for the larger k-values; looks better than the above y ~ exp(a * x^b) model. Perhaps mgvr3 is most sensible.
  • But, as asked in the previous point, one should see to adapt the Greenville-approach directly to the ranked case.
  • Fitting the unranked data:
    y = c(7,23,29,157,907,1669,1879,2089,249037,262897,725663,36850999,173471351,198793279,4827507229,17010526363,83547839407,572945039351,6269243827111)
    x = 3:21
    
    s = c(a = 1.5262)
    mgvu = nls(y ~ (x/2*a)^(x/2), start = s)
    > summary(mgvu)
    Parameters:
      Estimate Std. Error t value Pr(>|t|)
    a 1.574839   0.002108     747   <2e-16 ***
    
    y/predict(mgvu)
     [1] 1.92799913 2.31843766 0.94288222 1.48876878 2.30697236 1.05991619
     [7] 0.27983853 0.06900926 1.73568105 0.36936935 0.19713457 1.86255671
    [13] 1.57397090 0.31317862 1.27983584 0.73690120 0.57519747 0.61059163
    [19] 1.00866581
       
  • None of these fitting-attempts seems to reveal much; so the original model is to be preferred.
Todo:
The first arithmetic progression allowing a missing number
  • greentao_2(2,k) has still a "mostly number theoretical touch".
  • greentao_2(2,0) = 0, greentao_2(2,1) = 2, greentao(2,2) = 3.
  • greentao_2(2,3) = 7.
  • greentao_2(2,4) = 14.
  • greentao_2(2,5) = 31.
  • This sequence is apparently not in that "online encyclopedia" (also not after applying unrank_primes to it, obtaining [3,5,17,43,127]).

Definition in file AdditiveNumberTheory.hpp.