Congratulations. You managed to find the website of Christian Bau.

This website will give you bits and pieces of mostly computer-related and mathematical information that you might not find anywhere else. Any suggestions or comments, just email me at, but don't expect any reply too soon.

You might check this webpage from time to time, as its contents is going to change as time allows.


Prime numbers

A document describing the Extended Meissel-Lehmer algorithm for counting the number of primess p ≤ N, plus a preliminary version of an implementation. This is very much work in progress. This document describes the algorithm. PrimeCount.cp implements the algorithm and PrimeCount.h is the header file. main.cp is a simple program calling this function and counting the prime numbers ≤ 10^18.

Counting all prime numbers up to 10^15 takes less than 60 seconds on a Macintosh running at 733 MHz, counting the primes up to 10^16 takes 282 seconds, and counting all primes up to 10^17 takes less than 24 minutes, but significant optimisations especially for large values of N are still possible. The next steps will be cleaning up the code, using a variable size sieve to avoid slowdown for N > 10^18, extending the code to work for values N > 2^64, some fine tuning for maximising the speed of the sieve, and investigating some ideas to avoid sieving prime numbers greater than N^(2/9).

The code has been run on CodeWarrior 7 and ProjectBuilder / gcc on MacOS X and CodeWarrior 6 for Windows. For maximum performance, it is recommended that you try out the tuning section at the start of the program to find settings that optimise speed on your implementation. Especially the cache size setting and the option for hand-tuned division could have a major impact on the execution time.


Optimisations for Java "strict" floating point operations on x86 processors

The x86 processors have some unique problems implementing Java floating point arithmetic exactly as required by the Java standard. This document describes the problem, the implementation and related optimisations.


Analysis of "restrict pointers" in the C99 Standard

Sophisticated (I hope) analysis of "restrict pointers" in C99, identifying some problems in the C99 Standard, and analysing how "restrict pointers" can help in an optimising compiler.


Optimisations for floating point comparisons

Analysis what can be done to optimise floatingpoint comparisons, especially on a PowerPC. These optimisations are not very difficult, yet they will not be found in many optimising compilers. Not a big winner in terms of speed, but every bit helps.


Conversion from 64 bit integer to floating point

Algorithms for conversion between signed or unsigned 64 bit integer and double precision or single precision floating point. Many people believe that this is very complicated, but between 12 and 18 PowerPC instructions should be enough.


A High Precision Timer for the Macintosh

If you are interested in performance tuning of software, the most important thing is measuring. The High Precision Timer page describes an extremly high performance, high precision timer for the Macintosh. Source code included.


Multiprecision arithmetic for the Macintosh

A C++ package that implements very fast 128 bit floating point arithmetic on the Macintosh. Source is in MultiPrecisionQD.h and MultiPrecisionQD.c, description of the mathematics involved at MultiPrecisionQD.pdf. Warning: The mathematics is slighlty difficult.