artulab.com Git - openbsd/commit

amd64: simplify TSC synchronization testing

Computing a per-CPU TSC skew value is error-prone, especially on
multisocket machines and VMs.  My best guess is that larger latencies
appear to the current skew measurement test as TSC desync, and so the
TSC is demoted to a kernel timecounter on these machines or marked
non-monotonic.

This patch eliminates per-CPU TSC skew values.  Instead of trying to
measure and correct for TSC desync we only try to detect desync, which
is less error-prone.  This approach should allow a wider variety of
machines to use the TSC as a timecounter when running OpenBSD.

In the new sync test, both CPUs repeatedly try to detect whether their
TSC is trailing the other CPU's TSC.  The upside to this approach is
that it yields no false positives.  The downside to this approach is
that it takes more time than the current skew measurement test.  Each
test round takes 1ms, and we run up to two rounds per CPU, so this
patch slows boot down by 2ms per AP.

If any CPU fails the sync test, the TSC is marked non-monotonic and a
different timecounter is activated.  The TC_USER flag remains intact.
There is no middle ground where we fall back to only using the TSC in
the kernel.

Before running the test, we check for the IA32_TSC_ADJUST register and
reset it if necessary.  This is a trivial way to work around firmware
bugs that desync the TSC before we reach the kernel.  Unfortunately,
at the moment this register appears to only be available on Intel
processors.  I cannot find an equivalent but differently-named MSR for
AMD processors.

Because there is no per-CPU skew value, there is also no concept of
TSC drift anymore.

Miscellaneous notes:

- This patch adds a new timecounter utility function, tc_reset_quality().
  Used after sync test failure to mark the TSC non-monotonic.

- I have left TSC_DEBUG enabled for now.  Unsure if we should leave it
  enabled for release or not.  If we disable it we no longer run the
  sync test after failing it once.  Running the test even after failure
  provides information about the desync on every CPU.

- Taking 1ms per test round is fairly conservative.  We can experiment
  with and discuss shorter test rounds.  My main goal with a relatively
  long test round is ensuring VMs actually run the test.  It would be
  bad if a hypervisor interrupted the test for so long that it concealed
  desync.

- The use of two test rounds is mostly a diagnostic tool: it would be
  very strange if a CPU passed the first round but failed the second.
  If we ever saw this in the wild it would indicate something odd.

- Most of the desync seen in test reports is on Ryzen CPUs.  I
  believe, but cannot prove, that this is due to a widespread
  firmware bug on AMD motherboards.  Hopefully AMD and/or the
  downstream vendors fix it.

- Fixing TSC desync by writing the TSC directly with WRMSR is very
  difficult.  The TSC is a moving target incrementing very quickly and
  compensating for WRMSR overhead is non-trivial.  We can experiment
  with this, but my confidence is low that we can make it work reliably.

Prompted by deraadt@ and kettenis@ in 2021. Shepherded along by
deraadt@ throughout.  Reprompted by Yuichiro Naito several times.
With input from Yuichiro Naito, naddy@, sthen@, dv@, and deraadt@.

Tested by florian@, gnezdo@, sthen@, Josh Rickmar, dv@, Mohamed Aslan,
Hrvoje Popovski, Yuichiro Naito, semarie@, mlarkin@, asou@, jmatthew@,
Renato Aguiar, and Timo Myyra.

Patch v1: https://marc.info/?l=openbsd-tech&m=164330092208035&w=2
Patch v2: https://marc.info/?l=openbsd-tech&m=164558519712957&w=2
Patch v3: https://marc.info/?l=openbsd-tech&m=165698681018991&w=2
Patch v4: https://marc.info/?l=openbsd-tech&m=165835507113680&w=2
Patch v5: https://marc.info/?l=openbsd-tech&m=165923705118770&w=2

"just commit it" deraadt@

author	cheloha <cheloha@openbsd.org>
	Fri, 12 Aug 2022 02:20:36 +0000 (02:20 +0000)
committer	cheloha <cheloha@openbsd.org>
	Fri, 12 Aug 2022 02:20:36 +0000 (02:20 +0000)
commit	78156938567f79506a923cf635bd525907207a76
tree	9c60130b8ad2e78585a79a0efb889b822a7245f7	tree \| snapshot
parent	cb33a3f65fa74ec2591df1744951d81177ec9bd3	commit \| diff

sys/arch/amd64/amd64/cpu.c		diff \| blob \| history
sys/arch/amd64/amd64/tsc.c		diff \| blob \| history
sys/arch/amd64/include/cpu.h		diff \| blob \| history
sys/arch/amd64/include/cpuvar.h		diff \| blob \| history
sys/kern/kern_tc.c		diff \| blob \| history
sys/sys/timetc.h		diff \| blob \| history