timecounting: use full 96-bit product when computing elapsed time
The timecounting subsystem computes elapsed time by multiplying the
difference between two counter values (32 bits at most) by a scale
factor (64 bits) and storing the result in a struct bintime (128 bits).
Under normal circumstances it is sufficient to do this with 64-bit
multiplication, like this:
	struct bintime bt;
	bt.sec = 0;
	bt.frac = th->tc_scale * tc_delta(th);
However, if tc_delta() exceeds 1 second's worth of counter ticks, that
multiplication overflows. The result is that the monotonic clock appears
to jump backwards.
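To illustrate the wraparound, here is a minimal standalone sketch. It
is not kernel code: scale_delta_64() is a hypothetical helper, and the
example assumes the scale is roughly 2^64 divided by the counter
frequency, so a made-up 2 Hz counter has a scale of 2^63.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: the cheap path that
 * keeps just the low 64 bits of scale * delta.
 */
static uint64_t
scale_delta_64(uint64_t scale, uint32_t delta)
{
	/* Unsigned arithmetic wraps modulo 2^64; whole seconds are lost. */
	return scale * delta;
}
```

With scale = 2^63, a delta of 3 ticks (1.5 seconds) produces the same
fraction as a delta of 1 tick (0.5 seconds): the whole second has
been silently dropped.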
When can this happen? In practice, I have seen it when trying to
compile LLVM on an EdgeRouter Lite when using an SD card as the
backing disk. The box gets stuck in swap, the hardclock(9) is
delayed, and we appear to "lose time".
To avoid this overflow we need to compute the full 96-bit product of
the delta and the scale.
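One way to form a 32x64-bit product without a 128-bit type is to split
the scale into two 32-bit halves and add the two partial products.
The sketch below shows that approach under stated assumptions: struct
bt128 and full_product() are hypothetical stand-ins, not the names or
the exact code used in this commit.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct bintime: seconds plus 2^-64 units. */
struct bt128 {
	uint64_t sec;
	uint64_t frac;
};

/* Compute delta * scale as a 96-bit value split across sec and frac. */
static void
full_product(uint32_t delta, uint64_t scale, struct bt128 *bt)
{
	/* Each partial product is 32 x 32 bits and fits in 64 bits. */
	uint64_t hi = (uint64_t)delta * (scale >> 32);
	uint64_t lo = (uint64_t)delta * (scale & 0xffffffffULL);
	uint64_t frac = hi << 32;

	/* hi is weighted by 2^32, so its top 32 bits are whole seconds. */
	bt->sec = hi >> 32;
	bt->frac = frac + lo;
	/* Propagate the carry if adding the low partial product wrapped. */
	if (bt->frac < frac)
		bt->sec++;
}
```

With the same made-up scale of 2^63 (a 2 Hz counter), a delta of 5
ticks yields sec = 2 and frac = 2^63, that is, 2.5 seconds, where the
64-bit multiplication would have reported only 0.5 seconds.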
This commit adds TIMECOUNT_TO_BINTIME(), a function for computing that
full product, to sys/time.h. The patch puts the new function to use
in lib/libc/sys/microtime.c and sys/kern/kern_tc.c.
(The commit also reorganizes some of our high resolution bintime code
so that we always read the timecounter first.)
Doing the full 96-bit multiplication is between 0% and 15% slower than
doing the cheaper 64-bit multiplication on amd64. Measuring a precise
difference is extremely difficult because the computation is already
quite fast.
I would guess that the cost is slightly higher than that on 32-bit
platforms. Nobody ever volunteered to test, so this remains a guess.
Thread: https://marc.info/?l=openbsd-tech&m=163424607918042&w=2
6 month bump: https://marc.info/?l=openbsd-tech&m=165124251401342&w=2
Committed after 9 months without review.