OKlibrary  0.2.1.6
general.hpp File Reference

General plans for the Maxima test system.



Detailed Description

General plans for the Maxima test system.

Bug:
Error not detected
  • With ComputerAlgebra/Cryptology/Lisp/testobjects/Conversions.mac and with Maxima 5.19.0 (Ecl) we get the following error (this is just an example; the functions referred to no longer exist):
    (%i49) okltest_binl2dnf_l(binl2dnf_l)
    Maxima encountered a Lisp error:
    
     FRAME-STACK overflow at size 2304. Stack can probably be resized.
    
    Automatically continuing.
    To reenable the Lisp debugger set *debugger-hook* to nil.
    Evaluation took 0.1560 seconds (0.1660 elapsed)
       
    However, this error is not detected by the buildsystem; the system just continues.
  • On the other hand, with CLisp we get
    
    *** - Program stack overflow. RESET
    
    [../src/eval.d:573] reset() found no driver frame (sp=0xff8c9400-0xff8c2d80)
    Exiting on signal 6
    make: *** [run_maxima] Aborted
       
    and so here the buildsystem detects the error.
Todo:
How to calibrate floating-point efficiency
  • Examples, contrasting Maxima version 5.21.1 with 5.22.1:
    
    testf1(n) := float(apply("+",create_list(log(i)*(-1)^i,i,1,n)));
    testf1c(n):=block([fpprec:30],bfloat(apply("+",create_list(log(i)*(-1)^i,i,1,n))));
    testf1f(n) := apply("+",create_list(float(log(i))*(-1)^i,i,1,n));
    
    Maxima 5.21.1:
    
    (%i24) testf1(10000);
    Evaluation took 79.0520 seconds (125.7410 elapsed)
    (%o24) 4.830986538632788
    (%i25) testf1c(10000);
    Evaluation took 75.6320 seconds (141.3070 elapsed)
    (%o25) 4.8309865386327771337329138576b0
    (%i31) testf1f(10000);
    Evaluation took 0.3380 seconds (0.4770 elapsed)
    (%o31) 4.830986538632788
    
    (%i26) testf1(20000);
    Evaluation took 301.3270 seconds (546.0920 elapsed)
    (%o26) 5.177547628912792
    (%i27) testf1c(20000);
    Evaluation took 326.1730 seconds (542.8570 elapsed)
    (%o27) 5.17754762891278624677437887665b0
    (%i32) testf1f(20000);
    Evaluation took 0.6810 seconds (1.0060 elapsed)
    (%o32) 5.177547628912792
    
    (%i28) testf1(40000);
    Evaluation took 1244.4280 seconds (2258.0720 elapsed)
    (%o28) 5.524114969192852
    (%i29) testf1c(40000);
    Evaluation took 1256.1750 seconds (2077.7460 elapsed)
    (%o29) 5.52411496919276345877464646714b0
    (%i33) testf1f(40000);
    Evaluation took 1.3840 seconds (2.1210 elapsed)
    (%o33) 5.524114969192852
    
    Maxima 5.22.1:
    
    (%i19) testf1(10000);
    Evaluation took 68.2900 seconds (123.3390 elapsed)
    (%o19) 4.83098653863282
    (%i20) testf1c(10000);
    Evaluation took 76.8890 seconds (123.0450 elapsed)
    (%o20) 4.8309865386327771337329138576b0
    (%i26) testf1f(10000);
    Evaluation took 0.3280 seconds (0.6130 elapsed)
    (%o26) 4.83098653863282
    
    (%i21) testf1(20000);
    Evaluation took 289.1520 seconds (468.5160 elapsed)
    (%o21) 5.17754762891287
    (%i22) testf1c(20000);
    Evaluation took 286.7340 seconds (559.2420 elapsed)
    (%o22) 5.17754762891278624677437887665b0
    (%i27) testf1f(20000);
    Evaluation took 0.7100 seconds (1.2810 elapsed)
    (%o27) 5.17754762891287
    
    (%i23) testf1(40000);
    Evaluation took 1175.0270 seconds (2232.6280 elapsed)
    (%o23) 5.524114969192786
    (%i24) testf1c(40000);
    Evaluation took 1206.8040 seconds (1911.2210 elapsed)
    (%o24) 5.52411496919276345877464646714b0
    (%i28) testf1f(40000);
    Evaluation took 1.3790 seconds (2.1250 elapsed)
    (%o28) 5.524114969192786
    
       
    There are differences between the two versions, but no clear picture. (The huge speed gap to testf1f is expected: testf1 and testf1c first build the exact symbolic sum of the log-terms and convert only at the end, while testf1f sums machine floats directly.)
  • For comparison, a C++ implementation:
    // File SumLog.cpp
    // Computes sum_{i=1..n} log(i)*(-1)^i in double precision,
    // for comparison with the Maxima functions above.
    #include <iostream>
    #include <sstream>
    #include <cmath>
    #include <iomanip>
    int main(const int argc, const char* const argv[]) {
      if (argc != 2) return 1;
      // Read n from the command line:
      std::stringstream s;
      s << argv[1];
      unsigned int n;
      s >> n;
      if (not s) return 2;
      // Alternating sum; e flips sign each step, starting with -1 for i=1:
      double sum = 0;
      for (struct {unsigned int i; int e;} l = {0,-1}; l.i < n; ++l.i, l.e*=-1)
        sum += std::log(l.i+1) * l.e;
      std::cout << "n=" << n << ": " << std::setprecision(16) << sum << "\n";
    }
    
    > g++ -O3 SumLog.cpp -o SumLog
    > ./SumLog 10000
    n=10000: 4.830986538632788
    > ./SumLog 20000
    n=20000: 5.177547628912792
    > ./SumLog 40000
    n=40000: 5.524114969192852
       
  • The question is: what are good values for oklib_float_comparison_factor and oklib_float_comparison_exponent? (A possible shape of the comparison is sketched below.)
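    A minimal sketch of what such a comparison could look like, assuming a
    relative-error test governed by these two parameters (the predicate
    close_float_p is hypothetical, not the OKlibrary implementation):

    /* x and y count as equal if they agree relatively up to
       factor * 10^(-exponent): */
    close_float_p(x, y, factor, exponent) :=
      is(abs(x - y) <= factor * 10^(-exponent) * max(abs(x), abs(y)))$

    Calibrating then means choosing factor and exponent such that
    version-dependent deviations as above (agreement in roughly the first
    13 digits) still pass, while genuine errors fail.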
Todo:
Create milestones.
Todo:
DONE Improving assert
  • DONE (now caught by errcatch; see the sketch below) There are some errors which our test-system does not notice.
    1. One finds them by searching for "error" in the OKlibBuilding log.
    2. Apparently these are Lisp errors.
    3. Perhaps one could catch "MACSYMA-QUIT"?!
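    A minimal sketch of the errcatch-based solution (okltest_f and f are
    placeholder names):

    /* If evaluating the test term signals any error (including Lisp
       errors), errcatch returns the empty list; then the error message
       is printed, and error() aborts, so that the failure is noticed: */
    if errcatch(okltest_f(f)) = [] then (
      errormsg(),
      error("Test okltest_f(f) failed."))$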
Todo:
Outline of the test system
  • DONE (we roll our own) Ask on the Maxima mailing list, whether they have a system in use.
    1. Apparently, they only have a system where they put expressions and expected values into files. That's insufficient.
  • Compare with the C++ test system; see TestSystem/plans/TestSystem.hpp.
    1. Similar to the C++ test-system, we have generic test functions, which take as argument the function to be tested.
    2. Likely only functions are to be tested.
    3. So we could just use, as for the C++ test-system, sub-directories "tests" and "testobjects", containing the generic test functions and the test instantiations (i.e., expressions evaluating the test function on the function to be tested), respectively (see the sketch at the end of this todo).
  • DONE (at least at this time, we just rely on Maxima evaluating all expressions in the file, and do not have our own mechanism for running tests) Execution of the tests:
    1. Like with the C++ system, in the testobjects-files one finds instructions for loading the "testobjects" into a global list (provided via dynamic binding when running the tests).
      1. These "testobjects" perhaps are just the respective function calls, unevaluated, while executing the tests means evaluating these terms.
      2. So we need one function "install_testokl(t)", which stores the term t, unevaluated, on a global list "testobjects_testokl".
    2. However, just writing the expressions into the testobject-file is easier, and seems to do the job as well?!
    3. The files in the "tests"-directories get loaded with oklib_load_all(), but not the testobjects-files.
    4. But the mechanics of running tests (how to find out about errors, how to get more precise information, etc.) is not clear yet. See the next point about "Assert".
  • Asserts:
    1. DONE (when using the buildsystem, the file with the error is printed, while when calling tests directly the caller knows the filenames anyway; so the apparent inability of Maxima to output better filenames is not so harmful here) In ComputerAlgebra/TestSystem/Lisp/Asserts.mac we have implemented first assert-functions, which seem to work (and so the discussions below are perhaps outdated).
      1. One problem is that for file-names only the basename is printed.
      2. In the context of the test-system this seems less harmful, since the backtrace also shows the oklib_load function calls.
      3. However it would be nice to tell the system to print filenames using (some part of) the path.
    2. DONE (only for floating-point comparisons we use a special assert, but otherwise we do not write special asserts --- too much trouble) As with the C++ system, we have some special "asserts" for conditions, which also provide error-messages.
    3. DONE (in the case of an error there is no return-value, since we just abort the test-function) Every test-function returns just true in case of success, while otherwise false is returned --- though the return value likely is not much of use, but the real output is the side effect that an error-message is printed, using "error" (this should cause abortion).
    4. DONE (apparently it doesn't work when called by ourselves) The "backtrace()"-call is useful here: In case of an error not just the error message is printed, but also the trace of function calls.
    5. DONE (yet the simple system suffices) Perhaps we create a macro for this error-output (similar to the C++-macro).
    6. DONE (this could be considered later) Is it possible to provide information about the file etc. where the error-message was issued? Seems not to be possible. So perhaps some global variables are set, and in case of an error a maxima session is opened? For that we need to evaluate each term with "errcatch", and printing actively the error-message with "errormsg()".
      • With "errcatch(t,true)" the test term is evaluated, and [true] is returned if no error occurred, [] otherwise.
      • Via "errormsg()" then the error message is printed, and also the term t should be displayed.
      • For this to be visible it is perhaps needed that an interactive session is started.
      • So we want just batch-processing without output if no error occurs.
      • Seems difficult; perhaps the error-output is stored in a file?
      • This would then be the error-output and the testobject-term.
      • It seems not possible to redirect the output of the backtrace-function?
      • Perhaps we use "load" first to load the testobject-files (without output), and then via "batch" the testobjects are processed, without (much) output in case of no error, while we have all the above error-information in case of an error.
      • We issue then an error inside the batch-file, and so the make-process notices the error and halts.
      • If an error occurred, perhaps with "trace(all)" everything is traced, and the error term is re-evaluated? "trace" seems to be more informative than "backtrace".
    7. DONE (yes, at least for now it's sufficient). On the other hand, just using the error-function seems to provide enough information?!?
    8. DONE (it's called "oklib_test_level") There is a global variable "test_level" for the test-level.
    9. DONE (yet no need for that) And perhaps also "log_level".
    10. DONE (for now that must suffice) Since we don't have namespaces, we need naming-conventions. Perhaps "okltest_" as generic prefix.
    11. DONE Each test-function has one argument, the function to be tested.
  • DONE (actually, it is now "oklib new_check"; later we should, perhaps under "full" and "extensive", also run the tests in the full oklib-environment, i.e., calling oklib_load_all() first; and then also calling the tests twice (within the same environment)) "oklib check" is also responsible for the maxima-tests, via a sub-goal (so that also only the maxima-tests can be involved).
    1. After loading all testobject-files, a function "run_testokl" is called which evaluates the terms in "testobjects_testokl".
    2. A complication arises for functions to be tested which require special contexts. Best to avoid this. However if needed, then the testobject should just also contain this context.
    3. "Contexts" seem just to refer to "facts" etc. It could be that a special environment is needed, with special variables and functions defined; but again this should be provided by the testobject.
    4. Preparing the environment:
      1. Since we run the test with a fresh Maxima, we don't need to use "kill(all)" at the beginning.
      2. However we should have the possibility to run the tests several times, to see whether there are harmful side-effects.
      3. So all "main" functions of the OKlibrary shouldn't change the global environment (these "main" functions include all test-functions; perhaps "main" here means "testable").
    5. As usual all testobjects for the calling directory level are executed.
    6. Likely we should not provide a mechanism for running only tests when needed (too complicated).
      1. Just run always all respective tests (and "basic" tests really should run quickly).
      2. But, as usual, only those belonging to the current directory level.
    7. DONE (basically it's done like that, but without any special include-hierarchies; yet for every level we need a generic makefile) Perhaps actually simplest is that "oklib check" gathers all testobject-files (by "find"), writes "oklib_load"-instructions accordingly into a file "okltest", and then just runs Maxima, using the option "--batch=okltest".
      1. More precisely, the manual set-up would just create an include-hierarchy, parallel to the include-hierarchy for normal Maxima-code.
      2. And then running the tests at a specific level happens via batch-processing the include-file at the wished level.
      3. Should we use "oklib_load" or "oklib_include"? Perhaps oklib_load, since repeated inclusion is unlikely, and if it happens, then there should be a reason.
      4. Now the build-system just simulates creation of these include-files (this is more intelligent than the above simple method, lumping all includes into one file).
      5. The main target is "check-maxima", with subtargets "prepare_tests_maxima" and "run_tests_maxima".
      6. prepare_tests_maxima recursively creates the include-files, in system_directories/tests.
      7. run_tests_maxima runs the appropriate include-file.
  • DONE (error messages are output to a file, and in this way we detect the presence of an error) First rough "script" for running the tests
    1. OKplatform> (for F in $(find OKsystem/OKlib/ComputerAlgebra -path "*/testobjects/*.mac"); do oklib --maxima --batch=${F} --very-quiet; if [[ $? != 0 ]]; then exit 1; fi; done)
           
      would work, except that Maxima apparently returns exit code 0 in any case; it seems impossible to get a reaction to the error this way.
    2. A possibility would be to set an environment-variable via "system". But apparently using for example system("export OKLIBMAXIMA=1") is not transported to the outer world, and thus is unusable.
    3. Or we use a different configuration file, as discussed above; but still the problem how to get informed about the error.
    4. Is it really necessary to write to a file via "system" ???
    5. It seems that we should add the capability to the assert-function that in case of error and oklib_automatic_test=true (by default it's false) something is written to a file.
    6. For batch-running testobject-files, before the run the file is deleted, and if it has been created then the buildsystem issues an error.
    7. This assumes that every testobject-file is run on its own --- seems alright.
    8. The output of the batch-runs is all redirected into one log-file. One needs to look at it from time to time.
    9. Perhaps automatically one should parse the output for "error" etc.
    10. According to the e-mail reply by Mike Hansen to my request at the Maxima-mailing list (27.2.2008) we could use the Sage interface.
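    As a sketch of the conventions agreed above (square is a hypothetical
    example; assert is the assert-function from
    ComputerAlgebra/TestSystem/Lisp/Asserts.mac):

    /* In tests/Example.mac: a generic test function, with prefix
       "okltest_", taking the function to be tested as its single
       argument; a failing assert aborts the test function, while
       otherwise true is returned: */
    okltest_square(f) := (
      assert(f(0) = 0),
      assert(f(-3) = 9),
      true)$

    /* In testobjects/Example.mac: the test instantiation is just the
       expression evaluating the test function on the tested function: */
    okltest_square(lambda([x], x*x));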
Todo:
Testing the demos
  • Also the demos-files need to be run, via oklib_load, to see whether they still function correctly.
  • Since they contain asserts, running them also constitutes testing.
  • A problem is that some demos take a long time to run.
  • So demos need to be qualified as "basic tests", "full tests", or "extensive tests"; the problem is how to do this.
    1. The first line of a demos-file shall be
      if oklib_test_demos then
       if oklib_test_demos_level < 2 then return()$
      
      (where "2" here is the test-level of this file ("extensive" in this case)).
    2. Normally, oklib_test_demos=false.
    3. For running the demos in test-mode, oklib_test_demos=true, and then oklib_test_demos_level decides whether to run the demos-file or not.
  • And should these "tests" be included in the normal maxima-test?
  • First we create a special target "maxima_check_demos", and then we'll decide.
  • This target, as usual, loads all demos-files (.mac-files in demos-directories) below the given directory.
  • So again a special maxima-init-file is created by the process, which contains all respective load-instructions.
  • Again the question is whether we do this recursively.
  • And again this appears to be superior.
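    Putting the above together, a sketch (the initialisation would be done
    by the test system, while the guard stands in the demos-file):

    /* Hypothetical initialisation on the side of the test system, here
       running the demos-tests at level 1: */
    oklib_test_demos : true$
    oklib_test_demos_level : 1$

    /* Guard of a demos-file qualified as "extensive" (level 2), which
       is thus skipped in this run: */
    if oklib_test_demos then
     if oklib_test_demos_level < 2 then return()$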
Todo:
Improving the test system
  • It would be better to check whether all testobject-expressions actually evaluate to true (since tests might be broken, and for example simply nothing might be computed; see the sketch at the end of this todo):
    1. I remember that there is a Maxima function which, like "batch", executes all expressions in a file and checks whether each evaluates to true?
    2. Apparently there is an undocumented feature of batch, namely it can be called as "batch(filename, 'test)", in which case it expects the file to be organised as a list of pairs of expressions
      expression;
      expected_value$
           
      However, this would be clumsy (since our expected value is always just "true"); furthermore an error-log is written to some file, which is likely (too) hard to control.
  • For oklib_test_level>=1 we must additionally run the tests with oklib_test_level decreased by one, and with oklib_monitor=true and oklib_monitor_level=0,1 (at least).
  • return-expressions in functions to be tested:
    1. If the function to be tested doesn't use a block, but uses a return-expression, then this just terminates the assert-evaluation itself.
    2. How can this be avoided?
  • DONE (now handled in Buildsystem/MasterScript/SpecialProcessing/plans/general.hpp) We have the problem that the created file maxima-init.mac clashes with other such files created when running "oklib --maxima":
    1. DONE (this is now handled by the userdir-directory) Is it possible to use for the test-runs a different initialisation file? Ask on the Maxima mailing list.
    2. See "Improve locality" in Buildsystem/MasterScript/SpecialProcessing/plans/general.hpp.
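    A minimal sketch of such a check, assuming the testobject-terms are
    available as a list of unevaluated terms (names are hypothetical):

    /* Evaluate each term under errcatch; fail unless the result is
       exactly [true], which catches both errors and tests which
       silently compute nothing: */
    run_and_check_testobjects(terms) := (
      for t in terms do
        if errcatch(ev(t)) # [true] then
          error("Testobject failed: ", t),
      true)$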
Todo:
Handling floating and bigfloating numbers
  • It seems that assert_float_equal and assert_bfloat_equal work fine; however, especially assert_bfloat_equal needs to be properly tested (a possible start is sketched below).
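    A hypothetical way of exercising assert_bfloat_equal at higher
    precision (assuming it simply takes the two values to be compared;
    the exact signature and tolerance semantics need to be checked
    against ComputerAlgebra/TestSystem/Lisp/Asserts.mac):

    /* The rounding error of the bfloat square root should lie within
       the comparison tolerance: */
    block([fpprec : 30],
      assert_bfloat_equal(sqrt(bfloat(2))^2, bfloat(2)))$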

Definition in file general.hpp.