This website will give you bits and pieces of mostly
computer-related and mathematical information that you might not find
anywhere else. Any suggestions or comments, just email me at
*christian.bau@cbau.freeserve.co.uk*,
but don't expect any reply too soon.

You might check this webpage from time to time, as its contents is going to change as time allows.

A document describing the Extended Meissel-Lehmer algorithm for counting the number of primess p ≤ N, plus a preliminary version of an implementation. This is very much work in progress. This document describes the algorithm. PrimeCount.cp implements the algorithm and PrimeCount.h is the header file. main.cp is a simple program calling this function and counting the prime numbers ≤ 10^18.

Counting all prime numbers up to 10^15 takes less than 60 seconds on a Macintosh running at 733 MHz, counting the primes up to 10^16 takes 282 seconds, and counting all primes up to 10^17 takes less than 24 minutes, but significant optimisations especially for large values of N are still possible. The next steps will be cleaning up the code, using a variable size sieve to avoid slowdown for N > 10^18, extending the code to work for values N > 2^64, some fine tuning for maximising the speed of the sieve, and investigating some ideas to avoid sieving prime numbers greater than N^(2/9).

The code has been run on CodeWarrior 7 and ProjectBuilder / gcc on MacOS X and CodeWarrior 6 for Windows. For maximum performance, it is recommended that you try out the tuning section at the start of the program to find settings that optimise speed on your implementation. Especially the cache size setting and the option for hand-tuned division could have a major impact on the execution time.

The x86 processors have some unique problems implementing Java floating point arithmetic exactly as required by the Java standard. This document describes the problem, the implementation and related optimisations.

Sophisticated (I hope) analysis of "restrict pointers" in C99, identifying some problems in the C99 Standard, and analysing how "restrict pointers" can help in an optimising compiler.

Analysis what can be done to optimise floatingpoint comparisons, especially on a PowerPC. These optimisations are not very difficult, yet they will not be found in many optimising compilers. Not a big winner in terms of speed, but every bit helps.

Algorithms for conversion between signed or unsigned 64 bit integer and double precision or single precision floating point. Many people believe that this is very complicated, but between 12 and 18 PowerPC instructions should be enough.

If you are interested in performance tuning of software, the most important thing is measuring. The High Precision Timer page describes an extremly high performance, high precision timer for the Macintosh. Source code included.

A C++ package that implements very fast 128 bit floating point arithmetic on the Macintosh. Source is in MultiPrecisionQD.h and MultiPrecisionQD.c, description of the mathematics involved at MultiPrecisionQD.pdf. Warning: The mathematics is slighlty difficult.