Demacro sha256.
Replace the macros with static inline functions and write out the
variable rotations instead of trying to outsmart the compiler. Pull the
message schedule update up so that it completes before the round
begins. Use rotate right rather than transposed rotate left.
Overall this is more readable and more closely follows the specification.
On some platforms (e.g. aarch64) there is no notable change in
performance, while on others there is a significant improvement (more than
25% on arm).
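
For illustration only, a minimal sketch of what rotate-right based static
inline helpers can look like when written directly from FIPS 180-4. The
names rotr32, Sigma0, Sigma1, sigma0, sigma1 and msg_schedule_next are
hypothetical and are not necessarily the identifiers used in this change.

```c
#include <stdint.h>

/* Hypothetical helper: rotate right; n must be in the range 1..31. */
static inline uint32_t
rotr32(uint32_t x, unsigned int n)
{
	return (x >> n) | (x << (32 - n));
}

/* FIPS 180-4 section 4.1.2, expressed with rotate right. */
static inline uint32_t
Sigma0(uint32_t x)
{
	return rotr32(x, 2) ^ rotr32(x, 13) ^ rotr32(x, 22);
}

static inline uint32_t
Sigma1(uint32_t x)
{
	return rotr32(x, 6) ^ rotr32(x, 11) ^ rotr32(x, 25);
}

static inline uint32_t
sigma0(uint32_t x)
{
	return rotr32(x, 7) ^ rotr32(x, 18) ^ (x >> 3);
}

static inline uint32_t
sigma1(uint32_t x)
{
	return rotr32(x, 17) ^ rotr32(x, 19) ^ (x >> 10);
}

/*
 * Hypothetical helper: compute W[t] from W[t-16], W[t-15], W[t-7] and
 * W[t-2], so the schedule word is complete before the round uses it.
 */
static inline uint32_t
msg_schedule_next(uint32_t w16, uint32_t w15, uint32_t w7, uint32_t w2)
{
	return sigma1(w2) + w7 + sigma0(w15) + w16;
}
```

With fixed rotation counts like these, compilers can typically reduce
each rotr32() call to a single rotate instruction where the target has
one.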
ok miod@ tb@