Provide and use crypto_ro{l,r}_u{32,64}().
Various code in libcrypto needs bitwise rotation - rather than defining
different versions across the code base, provide a common set that can
be reused. Any sensible compiler optimises these to a single instruction
where the architecture supports it, which means we can ditch the inline
assembly.
On the chance that we need to provide a platform specific versions, this
follows the approach used in BN where a MD crypto_arch.h header could be
added in the future, which would then provide more specific versions of
these functions.
ok tb@