Perform mitigations for Intel L1TF screwup. There are three options:
(1) Future cpus which don't have the bug, (2) cpu's with microcode
containing a L1D flush operation, (3) stuffing the L1D cache with fresh
data and expiring old content. This stuffing loop is complicated and
interesting, no details on the mitigation have been released by Intel so
Mike and I studied other systems for inspiration. Replacement algorithm
for the L1D is described in the tlbleed paper. We use a 64K PA-linear
region filled with trapsleds (in case there is L1D->L1I data movement).
The TLBs covering the region are loaded first, because TLB loading
apparently flows through the D cache. Before performing vmlaunch or
vmresume, the cachelines covering the guest registers are also flushed.
with mlarkin, additional testing by pd, handy comments from the
kettenis and guenther peanuts