I’ve got a fixed point class (10.22) and I have a need of a pow, a sqrt, an exp and a log function.
Alas I have no idea where to even start on this. Can anyone provide me with some links to useful articles or, bettter yet, provide me with some code?
I’m assumign that once I have an exp function then it becomes relatively easy to implement pow and sqrt as they just become.
pow( x, y ) => exp( y * log( x ) )
sqrt( x ) => pow( x, 0.5 )
Its just those exp and log functions that I’m finding difficult (as though I remember a few of my log rules, I can’t remember much else about them).
Presumably, btw, there would also be a faster method for sqrt and pow so any pointers on that front would be appreciated even if its just to say use the methods i outline above 🙂
Please note: This HAS to be cross platform and in pure C/C++ code so I cannot use any assembler optimisations.
Practice As Follows
A very simple solution is to use a decent table-driven approximation. You don’t actually need a lot of data if you reduce your inputs correctly.
exp(a)==exp(a/2)*exp(a/2), which means you really only need to calculate
1 < x < 2. Over that range, a runga-kutta approximation would give reasonable results with ~16 entries IIRC.
sqrt(a) == 2 * sqrt(a/4) which means you need only table entries for
1 < a < 4. Log(a) is a bit harder:
log(a) == 1 + log(a/e). This is a rather slow iteration, but log(1024) is only 6.9 so you won't have many iterations.
You'd use a similar "integer-first" algorithm for pow:
pow(x,y)==pow(x, floor(y)) * pow(x, frac(y)). This works because
pow(double, int) is trivial (divide and conquer).
 For the integral component of
log(a), it may be useful to store a table
1, e, e^2, e^3, e^4, e^5, e^6, e^7 so you can reduce
log(a) == n + log(a/e^n) by a simple hardcoded binary search of a in that table. The improvement from 7 to 3 steps isn't so big, but it means you only have to divide once by
e^n instead of
n times by
And for that last
log(a/e^n) term, you can use
log(a/e^n) = log((a/e^n)^8)/8 - each iteration produces 3 more bits
by table lookup. That keeps your code and table size small. This is typically code for embedded systems, and they don't have large caches.
That's stil not to smart on my side.
log(a) = log(2) + log(a/2). You can just store the fixed-point value
log2=0.30102999566, count the number of leading zeroes, shift
a into the range used for your lookup table, and multiply that shift (integer) by the fixed-point constant
log2. Can be as low as 3 instructions.
e for the reduction step just gives you a "nice"
log(e)=1.0 constant but that's false optimization. 0.30102999566 is just as good a constant as 1.0; both are 32 bits constants in 10.22 fixed point. Using 2 as the constant for range reduction allows you to use a bit shift for a division.
You still get the trick from edit 2,
log(a/2^n) = log((a/2^n)^8)/8. Basically, this gets you a result
(a + b/8 + c/64 + d/512) * 0.30102999566 - with b,c,d in the range [0,7].
a.bcd really is an octal number. Not a surprise since we used 8 as the power. (The trick works equally well with power 2, 4 or 16.)
Still had an open end.
pow(x, frac(y) is just
pow(sqrt(x), 2 * frac(y)) and we have a decent
1/sqrt(x). That gives us the far more efficient approach. Say
frac(y)=0.101 binary, i.e. 1/2 plus 1/8. Then that means
(x^1/2 * x^1/8). But
x^1/2 is just
(sqrt(sqrt(sqrt(x))). Saving one more operation, Newton-Raphson
NR(x) gives us
1/sqrt(x) so we calculate
1.0/(NR(x)*NR((NR(NR(x))). We only invert the end result, don't use the sqrt function directly.