Provide bn_umul_hilo().
The bignum code needs to be able to multiply two words, producing a
double-word result. Some architectures do not have native support for
this, hence a pure C version is required. bn_umul_hilo() provides this
functionality.
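As an illustration of the contract only, the sketch below uses 32-bit
words so that uint64_t can stand in for a native double word. The name
umul_hilo32() and its signature are assumptions made for this example
and are not taken from the tree; the pure C version has to produce the
same h:l pair without relying on a wider type.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the tree): multiply two 32-bit words and
 * return the 64-bit product as a high word and a low word.
 */
static void
umul_hilo32(uint32_t a, uint32_t b, uint32_t *out_h, uint32_t *out_l)
{
	uint64_t r = (uint64_t)a * b;	/* widen before multiplying */

	*out_h = (uint32_t)(r >> 32);	/* upper half of the product */
	*out_l = (uint32_t)r;		/* lower half of the product */
}
```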
There are currently two implementations, both of which are branch-free.
The first uses bitwise operations for the carry, while the second uses
accumulators. The accumulator version uses fewer instructions, but
requires more variables/registers and seems to be slower, at least on
amd64/i386. It may be faster on architectures that have more registers
available. Further testing can be performed and one of the two
implementations can be removed at a later date.
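A minimal sketch of the two approaches, assuming 64-bit words split
into 32-bit halves. The function names, the fixed uint64_t word type
and the exact statement order are assumptions made for illustration;
this is not the committed code and may differ from it in detail.

```c
#include <stdint.h>

#define LO32(x)	((x) & 0xffffffffULL)
#define HI32(x)	((x) >> 32)

/* Variant 1 (hypothetical name): carry recovered with bitwise operations. */
static void
umul_hilo_bitwise(uint64_t a, uint64_t b, uint64_t *out_h, uint64_t *out_l)
{
	uint64_t al = LO32(a), ah = HI32(a), bl = LO32(b), bh = HI32(b);
	uint64_t h, l, x, c1, c2;

	h = ah * bh;
	l = al * bl;

	/* h:l += (ah * bl) << 32 */
	x = ah * bl;
	h += x >> 32;
	x <<= 32;
	c1 = l | x;
	c2 = l & x;
	l += x;
	/* Carry out of the top bit of l + x, computed without a branch. */
	h += ((c1 & ~l) | c2) >> 63;

	/* h:l += (al * bh) << 32 */
	x = al * bh;
	h += x >> 32;
	x <<= 32;
	c1 = l | x;
	c2 = l & x;
	l += x;
	h += ((c1 & ~l) | c2) >> 63;

	*out_h = h;
	*out_l = l;
}

/* Variant 2 (hypothetical name): 32-bit columns summed in accumulators. */
static void
umul_hilo_accum(uint64_t a, uint64_t b, uint64_t *out_h, uint64_t *out_l)
{
	uint64_t al = LO32(a), ah = HI32(a), bl = LO32(b), bh = HI32(b);
	uint64_t p0 = al * bl, p1 = ah * bl, p2 = al * bh, p3 = ah * bh;
	uint64_t acc0, acc1, acc2, acc3;

	/* Each accumulator holds one 32-bit column plus the carry into it. */
	acc0 = LO32(p0);
	acc1 = HI32(p0) + LO32(p1) + LO32(p2);
	acc2 = HI32(acc1) + HI32(p1) + HI32(p2) + LO32(p3);
	acc3 = HI32(acc2) + HI32(p3);

	*out_l = (acc1 << 32) | acc0;
	*out_h = (acc3 << 32) | LO32(acc2);
}
```

In this sketch the bitwise variant gets by with a handful of
temporaries, while the accumulator variant keeps four partial products
and four accumulators live at once, which is where the extra register
pressure comes from.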
ok tb@