You can implement a faster (than repeated addition) multiply function using bit shifts and addition to perform "long multiplication" in binary.
unsigned long long mul_ull(unsigned long long a, unsigned long long b)
{
unsigned long long product = 0;
unsigned int shift = 0;
while (b)
{
if (b & 1)
{
product += a << shift;
}
shift++;
b >>= 1;
}
return product;
}
EDIT: Alternative implementation of the above using single bit shifts and addition:
unsigned long long mul_ull(unsigned long long a, unsigned long long b)
{
unsigned long long product = 0;
while (b)
{
if (b & 1)
{
product += a;
}
a <<= 1;
b >>= 1;
}
return product;
}
In practice, whether or not this is faster than repeated addition depends on any optimizations done by the compiler. An optimizing compiler could analyze the repeated addition and replace it with a multiplication. An optimizing compiler could also analyze the code of the mul_ull
function above and replace it with a multiplication, but that may be harder for the optimizer to spot. So in practice, it is perfectly reasonable for the repeated addition algorithm to end up faster than the bit-shift and addition algorithm after optimization!
Also, the above implementations of the mul_ull
functions will tend to perform better if the second parameter b
is the smaller of the numbers being multiplied when the one of the numbers is much larger than the other (as is typical for a factorial calculation). The execution time is roughly proportional to the log of b
(when b
is non-zero) but also depends on the number of 1-bits in the binary value of b
. So for the factorial calculation, put the old running product in the first parameter a
and the new factor in the second parameter b
.
A factorial function using the above multiplication function:
unsigned long long factorial(unsigned int n)
{
unsigned long long fac = 1;
unsigned int i;
for (i = 2; i <= n; i++)
{
fac = mul_ull(fac, i);
}
return fac;
}
The above factorial
function is likely to produce an incorrect result for n
> 20 due to arithmetic overflow. 66 bits are required to represent 21! but unsigned long long
is only required to be 64 bits wide (and that is typically the actual width for most implementations).
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…