tb [Mon, 28 Jun 2021 14:01:38 +0000 (14:01 +0000)]
Garbage collect loop index i which is no longer used after usage tweak.
jca [Mon, 28 Jun 2021 13:47:46 +0000 (13:47 +0000)]
Fix base-gcc -Wno-error=uninitialized
base-gcc always errored out when -Werror was passed and -Wuninitialized
triggered, even when -Wno-error=uninitialized was passed.
Deemed correct by Miod
espie [Mon, 28 Jun 2021 11:25:14 +0000 (11:25 +0000)]
remove old "paranoid" option, I'm pretty sure nobody uses it.
refactor the code into figuring out simple updates: if we don't have
any @execs but just @tags, we can probably do something simpler wrt
temporary files and temporary filenames, which should speed up texlive
updates significantly.
(the tempfile code is not there yet, just the check for safe updates)
mpi [Mon, 28 Jun 2021 11:19:01 +0000 (11:19 +0000)]
Make anonymous object reference counting independent from the KERNEL_LOCK().
- Use atomic operations for increment/decrement
- Rewrite the loop from uao_swap_off() to only keep a reference to the
next item in the list.
ok jmatthew@
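
As a rough illustration of lock-free reference counting with atomic
increment/decrement, here is a minimal userland sketch using C11
<stdatomic.h>. The struct and function names are hypothetical and are not
the kernel's uao code, which uses its own atomic primitives.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object; stands in for any reference-counted structure. */
struct obj {
	atomic_uint refs;
};

/* A new object starts with one reference held by its creator. */
static void
obj_init(struct obj *o)
{
	atomic_init(&o->refs, 1);
}

/* Take an additional reference; no lock is needed. */
static void
obj_ref(struct obj *o)
{
	atomic_fetch_add_explicit(&o->refs, 1, memory_order_relaxed);
}

/*
 * Drop a reference.  Returns true when the caller dropped the last one
 * and is therefore responsible for freeing the object.
 */
static bool
obj_unref(struct obj *o)
{
	return atomic_fetch_sub_explicit(&o->refs, 1,
	    memory_order_acq_rel) == 1;
}
```

Because atomic_fetch_sub returns the previous value, exactly one caller
sees 1 and becomes the one that tears the object down.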
mpi [Mon, 28 Jun 2021 11:04:14 +0000 (11:04 +0000)]
Enable dt(4).
ok kettenis@
kettenis [Mon, 28 Jun 2021 09:35:09 +0000 (09:35 +0000)]
Implement copyin32().
ok deraadt@
bluhm [Mon, 28 Jun 2021 08:55:06 +0000 (08:55 +0000)]
Also show the time spent in userland when analyzing the kernel stack
in flame graph. Only when both kernel and userland are displayed,
the whole picture of system activity becomes clear. Fixes a parsing
bug in the flame graph tool where userland time was interpreted as
invalid kernel stack.
OK kn@
kettenis [Sun, 27 Jun 2021 21:39:55 +0000 (21:39 +0000)]
Make sure __bss_start is aligned on an 8-byte boundary. This makes sure
zeroing out .bss doesn't overrun and overwrite the ELF symbol table.
ok patrick@
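
The arithmetic behind this kind of overrun can be sketched as follows. This
is a model only: it assumes a clearing loop that writes 8 bytes per store
until the pointer reaches the end symbol, and an end address that is itself
8-byte aligned. The function names are hypothetical and the real loop lives
in assembly.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align (align is a power of two). */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}

/*
 * Model a loop that zeroes [start, end) using 8-byte stores and return
 * how many bytes past end the last store reaches.
 */
static uintptr_t
zero_overrun(uintptr_t start, uintptr_t end)
{
	uintptr_t p;

	for (p = start; p < end; p += 8)
		;	/* each iteration stands for one 8-byte store */
	return p - end;
}
```

With a start of 0x1004 and an end of 0x1010 the model writes 4 bytes beyond
the end; with the start rounded up to 0x1008 it stops exactly at the end.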
kettenis [Sun, 27 Jun 2021 20:36:57 +0000 (20:36 +0000)]
Using the MI mplock should be fine on riscv64.
jsing [Sun, 27 Jun 2021 19:23:51 +0000 (19:23 +0000)]
Track the sigalgs used by ourselves and our peer.
Move the sigalg pointer from SSL_HANDSHAKE_TLS13 to SSL_HANDSHAKE, naming
it our_sigalg, adding an equivalent peer_sigalg. Adjust the TLSv1.3 code
that records our signature algorithm. Add code to record the signature
algorithm used by our peer.
Needed for upcoming API additions.
ok tb@
jsing [Sun, 27 Jun 2021 19:16:59 +0000 (19:16 +0000)]
Have ssl3_send_client_verify() pass *pkey to called functions.
ssl3_send_client_verify() already has a pointer to the EVP_PKEY for the
certificate - pass this as an argument to the functions that it calls,
rather than duplicating code/variable declarations.
jsing [Sun, 27 Jun 2021 18:15:35 +0000 (18:15 +0000)]
Change ssl_sigalgs_from_value() to perform sigalg list selection.
Rather than passing in a sigalg list at every call site, pass in the
appropriate TLS version and have ssl_sigalgs_from_value() perform the
sigalg list selection itself. This allows the sigalg lists to be made
internal to the sigalgs code.
ok tb@
jsing [Sun, 27 Jun 2021 18:09:07 +0000 (18:09 +0000)]
Rename ssl_sigalg() to ssl_sigalg_from_value().
This makes the code more self-documenting and avoids the ambiguity between
ssl_sigalg the struct and ssl_sigalg the function.
ok tb@
jsing [Sun, 27 Jun 2021 17:59:17 +0000 (17:59 +0000)]
Change ssl_sigalgs_build() to perform sigalg list selection.
Rather than doing sigalg list selection at every call site, pass in the
appropriate TLS version and have ssl_sigalgs_build() perform the sigalg
list selection itself. This reduces code duplication, simplifies the
calling code and is the first step towards internalising the sigalg lists.
ok tb@
schwarze [Sun, 27 Jun 2021 17:57:13 +0000 (17:57 +0000)]
add a style message about overlong text lines,
trying very hard to avoid false positives,
not at all trying to catch as many cases as possible;
feature originally suggested by tb@,
OK tb@ kn@ jmc@
jsing [Sun, 27 Jun 2021 17:50:06 +0000 (17:50 +0000)]
Tidy some comments and simplify some code.
ok tb@
jsing [Sun, 27 Jun 2021 17:45:16 +0000 (17:45 +0000)]
Keep sigalg initialiser order consistent - key type, then hash.
This matches the order that sigalgs are specified in.
ok tb@
jsing [Sun, 27 Jun 2021 17:13:23 +0000 (17:13 +0000)]
Add test coverage for TLSv1.3 client hellos.
This is a little bit clunky due to the number of things that vary (largely
thanks to middlebox compatibility mode, along with the versions and key
share extensions), however it works and can be improved at a later date.
jsing [Sun, 27 Jun 2021 16:55:46 +0000 (16:55 +0000)]
Add test coverage for DTLSv1.2 client hellos.
jsing [Sun, 27 Jun 2021 16:54:55 +0000 (16:54 +0000)]
Improve test coverage for SSL_OP_NO_DTLSv1.
jsing [Sun, 27 Jun 2021 16:54:14 +0000 (16:54 +0000)]
Correct handling of SSL_OP_NO_DTLSv1.
When converting to TLS flags, we need to also include SSL_OP_NO_TLSv1,
otherwise the TLS equivalent of SSL_OP_NO_DTLSv1 is TLSv1.0 only, which
does not work so well when we try to switch back to DTLS versions.
jsing [Sun, 27 Jun 2021 16:40:25 +0000 (16:40 +0000)]
Teach hexdump() how to identify differing bytes.
This allows differences between the received data and the test data to be
more readily identified.
jsing [Sun, 27 Jun 2021 16:36:53 +0000 (16:36 +0000)]
More appropriately set cipher_list_len when AES acceleration is available.
jsing [Sun, 27 Jun 2021 16:33:30 +0000 (16:33 +0000)]
Tweak some data types and sprinkle some const.
schwarze [Sun, 27 Jun 2021 15:53:33 +0000 (15:53 +0000)]
In addition to 2-byte and 3-byte UTF-8 sequences, correctly identify all
4-byte UTF-8 sequences and not just some of them, to keep them together
and avoid passing them on byte by byte, helping tools like tmux(1).
While here, also do all the range tests with < and > rather than &
for uniformity and readability, and add some comments.
Input and OK jca@ and nicm@.
Soeren at Soeren dash Tempel dot net originally reported the bug
and provided an incomplete patch that was used as a starting point,
and he also tested this final patch.
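
For illustration, classifying a UTF-8 lead byte with plain < comparisons
might look like the sketch below. It follows the standard UTF-8 lead byte
ranges; the function name is hypothetical and this is not the committed
code.

```c
#include <stddef.h>

/*
 * Return the expected length (1 to 4) of the UTF-8 sequence starting
 * with lead byte c, or 0 if c cannot start a valid sequence.
 */
static size_t
utf8_seq_len(unsigned char c)
{
	if (c < 0x80)
		return 1;	/* ASCII */
	if (c < 0xc2)
		return 0;	/* continuation byte or overlong lead */
	if (c < 0xe0)
		return 2;	/* 0xc2-0xdf */
	if (c < 0xf0)
		return 3;	/* 0xe0-0xef */
	if (c < 0xf5)
		return 4;	/* 0xf0-0xf4, up to U+10FFFF */
	return 0;		/* 0xf5-0xff are never valid */
}
```

A caller can use the returned length to hold the bytes of one character
together before passing them on, instead of forwarding them one by one.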
kettenis [Sun, 27 Jun 2021 15:02:25 +0000 (15:02 +0000)]
Add Hart State Management functions. These will be needed to spin up
the secondary cores. From FreeBSD.
ok mlarkin@
jsg [Sun, 27 Jun 2021 04:52:01 +0000 (04:52 +0000)]
reuse armv7 installboot for riscv64
ok deraadt@
visa [Sun, 27 Jun 2021 04:33:40 +0000 (04:33 +0000)]
Create DMA maps with 64-bit capability when appropriate.
OK kettenis@
visa [Sun, 27 Jun 2021 04:32:31 +0000 (04:32 +0000)]
Use config register to determine if 64-bit DMA is available.
Suggested by and OK kettenis@
jsg [Sun, 27 Jun 2021 01:58:51 +0000 (01:58 +0000)]
match on sifive,fu540-c000-gem
used by the hifive unmatched device tree in mainline linux and u-boot
ok visa@
jmc [Sat, 26 Jun 2021 18:03:45 +0000 (18:03 +0000)]
make usage less chatty; ok mlarkin
jmc [Sat, 26 Jun 2021 18:02:48 +0000 (18:02 +0000)]
make SYNOPSIS match usage; ok ajacoutot
kettenis [Sat, 26 Jun 2021 17:38:40 +0000 (17:38 +0000)]
For some reason the riscv64 locore.S ended up with the copyright license
from the arm64 locore.S. But the code was clearly copied from FreeBSD's
riscv64 locore.S. The license is the same, but the author/attribution isn't.
Fix this.
ok deraadt@
tb [Sat, 26 Jun 2021 17:36:28 +0000 (17:36 +0000)]
Fix .Xr order. From mandoc -Tlint.
deraadt [Sat, 26 Jun 2021 15:42:58 +0000 (15:42 +0000)]
delete extra explanations in the usage: messages which are described
far better in the manual pages
ok jmc
kettenis [Sat, 26 Jun 2021 14:50:25 +0000 (14:50 +0000)]
Make lazy binding work on riscv64.
prompted by deraadt@
kettenis [Sat, 26 Jun 2021 14:47:54 +0000 (14:47 +0000)]
Build ld.so with --march=rv64imac on riscv64 to be absolutely sure that
ld.so doesn't use the FP registers.
ok deraadt@
kettenis [Sat, 26 Jun 2021 14:46:48 +0000 (14:46 +0000)]
Use AFLAGS when building syscall stubs. Drop AINC which isn't used.
ok deraadt@
visa [Sat, 26 Jun 2021 10:47:59 +0000 (10:47 +0000)]
cad: Implement 64-bit DMA mode
This lets the driver utilize 64-bit DMA on hardware that supports it.
Currently, riscv64 does not constrain DMA-reachable memory to the 32-bit
range. This caused memory errors with cad(4) on machines that have RAM
above 4GB in the physical address space.
Prompted by Mickael Torres
OK kettenis@
kettenis [Sat, 26 Jun 2021 09:24:51 +0000 (09:24 +0000)]
Add riscv64 support. From Mickael Torres.
ok matthieu@, jsg@
kettenis [Sat, 26 Jun 2021 09:23:24 +0000 (09:23 +0000)]
Add powerpc64 and riscv64 to the list of architectures that have DRM.
ok matthieu@, deraadt@, jsg@
matthieu [Sat, 26 Jun 2021 06:54:00 +0000 (06:54 +0000)]
Revert last change, which is under an #ifdef __linux__ block so not used.
noticed by jsg@
jsg [Sat, 26 Jun 2021 02:02:47 +0000 (02:02 +0000)]
riscv64 struct cpu_info has ci_idepth
jsg [Sat, 26 Jun 2021 00:48:28 +0000 (00:48 +0000)]
sync
jsg [Sat, 26 Jun 2021 00:43:28 +0000 (00:43 +0000)]
add /dev/dri/card0 and /dev/dri/renderD128
ok deraadt@
jsg [Sat, 26 Jun 2021 00:38:38 +0000 (00:38 +0000)]
add /dev/dri/
ok deraadt@
dlg [Fri, 25 Jun 2021 23:48:30 +0000 (23:48 +0000)]
let pfsync_request_update actually retry when it overfills a packet.
a continue in the middle of a do { } while (0) loop is effectively
a break, it doesn't restart the loop.
without the retry, the code leaked update messages which in turn
made pool_destroy in pfsync destroy trip over a kassert cos items
were still out.
found by and fix tested by hrvoje popovski
ok sashan@
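
The C behaviour described above can be demonstrated in isolation. The two
hypothetical functions below count how often a loop body runs when it asks
for a retry with continue; they are a standalone sketch, not the pfsync
code.

```c
/*
 * "continue" inside do { } while (0) jumps to the loop condition,
 * which is 0, so the body runs exactly once.
 */
static int
attempts_do_while0(void)
{
	int tries = 0;

	do {
		tries++;
		if (tries < 3)
			continue;	/* acts like break here */
	} while (0);

	return tries;
}

/* An explicit retry loop: "continue" really restarts the body. */
static int
attempts_for_retry(void)
{
	int tries = 0;

	for (;;) {
		tries++;
		if (tries < 3)
			continue;	/* try again */
		break;
	}

	return tries;
}
```

The first function returns 1 and the second returns 3, which is the
difference between silently giving up and actually retrying.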
krw [Fri, 25 Jun 2021 20:40:23 +0000 (20:40 +0000)]
Move unused eficall.h files to the Attic.
patrick [Fri, 25 Jun 2021 19:55:22 +0000 (19:55 +0000)]
Clean up and remove debug prints, and add a few more relevant prints for
when things go wrong.
matthieu [Fri, 25 Jun 2021 19:27:40 +0000 (19:27 +0000)]
basic radeondrm / X support for riscv64. Ok kettenis@
- add wscons devices
- build radeondrm and add MD uvm bits to support it.
krw [Fri, 25 Jun 2021 19:24:53 +0000 (19:24 +0000)]
Replace instances of the magic number '64' with a nice #define
BLOCKALIGNMENT. This will make it more obvious where this
512-byte block count could/should be converted to a disk sector
count.
No functional change.
matthieu [Fri, 25 Jun 2021 19:22:51 +0000 (19:22 +0000)]
add SIZE_MAX. ok kettenis@
kettenis [Fri, 25 Jun 2021 18:55:26 +0000 (18:55 +0000)]
Make sure we translate prefetchable mmio space as well.
From Mickael Torres.
krw [Fri, 25 Jun 2021 17:49:49 +0000 (17:49 +0000)]
1) Finish eliminating all uses of EFI_CALL() used in the tree, allowing for the
removal of eficall.h files.
2) Allow booting from 4k-byte sector devices.
3) Don't leak memory after successful i/o.
The end result is that riscv64 efidev.c and efipxe.c are identical to the
arm64/armv7 versions, efirng.c is identical to the amd64/arm64 versions and
efiboot.c has only the arm64 -> riscv64 changes.
ok kettenis@
patrick [Fri, 25 Jun 2021 17:41:22 +0000 (17:41 +0000)]
While it seems like we can choose any I/O virtual address for peripheral
devices, this isn't really the case. It depends on the bus topology of
how devices are connected. In the case of PCIe, devices are assigned
addresses (in PCI BARs) from the PCI address spaces. Now if we take an
address from one of these address spaces for our IOVA, transfers from
a PCI device to that address will terminate inside of the PCI bus.
This is because from the PCI bus's point of view, the address we chose
is part of its address space. To make sure we don't allocate addresses
from there, reserve the PCI addresses in the IOVA.
Note that smmu(4) currently gives each device its own IOVA. So the PCI
addresses will be reserved only in IOVA from PCI devices, and only the
addresses concerning the PCI bus it is connected to will be reserved.
All other devices behind an smmu(4) will not have any changes to their
IOVA.
ok kettenis@
krw [Fri, 25 Jun 2021 17:27:07 +0000 (17:27 +0000)]
Allow (w)hole disk allocation for GPT disks. Use fdisk -A when Apple APFS ISC
partition is detected. Otherwise the normal big hammer fdisk -ig.
Only create EFI SYS boot partition on GPT disks that are the ROOTDISK.
ok kettenis@ deraadt@
jsg [Fri, 25 Jun 2021 13:41:09 +0000 (13:41 +0000)]
add linux style memory barriers for risc-v to drm
based on linux operation to rvwmo mapping table in
the rvwmo appendix of the risc-v unprivileged isa spec
ok kettenis@
visa [Fri, 25 Jun 2021 13:29:40 +0000 (13:29 +0000)]
Remove an unused struct.
jsg [Fri, 25 Jun 2021 13:25:53 +0000 (13:25 +0000)]
use weaker fences for riscv64 membar
Fences are described in 'RISC-V Unprivileged ISA'; the syntax is
'fence predecessor,successor'.
"Any combination of device input (I), device output (O), memory reads (R),
and memory writes (W) may be ordered with respect to any combination
of the same."
Previously "fence" was used for membar_* which is short for
"fence iorw,iorw". Change this to more specific fences based on the
text in membar_sync(9) with store -> w, load -> r.
build test by and ok kettenis@
patrick [Fri, 25 Jun 2021 12:40:29 +0000 (12:40 +0000)]
Save quite a bit of space by removing the existence of PTEDs. The
dynamics of SMMU are a bit different to regular MMU usage, as we do
not need P->V lists or ref/mod emulation (with page access upgrade).
While in the future we might want to save cacheability modes, it is
not necessary right now. Our PTED construct, which holds that kind
of information, is not needed. With these gone, we save around 93%
of smmu(4)'s previous memory overhead.
Discussed with drahn@ kettenis@
claudio [Fri, 25 Jun 2021 09:25:48 +0000 (09:25 +0000)]
The network flush code only operates on peerself (like all the other
network commands). Instead of passing the peer as argument to the tree
walker just default to peerself in network_flush_upcall().
OK benno@
claudio [Fri, 25 Jun 2021 09:23:26 +0000 (09:23 +0000)]
Do the multiprotocol check first for the IPv4 case. So it is the same
everywhere.
OK benno@
djm [Fri, 25 Jun 2021 06:30:22 +0000 (06:30 +0000)]
fix decoding of X.509 subject name; from Leif Thuresson via bz3327
ok markus@
dtucker [Fri, 25 Jun 2021 06:20:39 +0000 (06:20 +0000)]
Use better language to refer to the user. From l1ving via github
PR#250, ok jmc@
jsg [Fri, 25 Jun 2021 05:22:02 +0000 (05:22 +0000)]
sync set sizes with latest snapshot
initial sizes were from arm64
jsg [Fri, 25 Jun 2021 04:51:52 +0000 (04:51 +0000)]
sync set sizes with latest snapshot
initial sizes were from arm64
dtucker [Fri, 25 Jun 2021 03:38:17 +0000 (03:38 +0000)]
Replace SIGCHLD/notify_pipe kludge with pselect.
Previously sshd's SIGCHLD handler would wake up select() by writing a
byte to notify_pipe. We can remove this by blocking SIGCHLD, checking
for child terminations then passing the original signal mask through
to pselect. This ensures that the pselect will immediately wake up if
a child terminates between wait()ing on them and the pselect.
In -portable, for platforms that do not have pselect the kludge is still
there but is hidden behind a pselect interface.
Based on other changes for bz#2158, ok djm@
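
The block/check/pselect pattern described above can be sketched in userland
C as follows. This is a simplified model, not the sshd code: the function
names are hypothetical, error handling is omitted, and it assumes SIGCHLD
was not already blocked by the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Empty handler: SIGCHLD only needs to interrupt pselect(). */
static void
sigchld_handler(int sig)
{
	(void)sig;
}

/* Install the handler once at startup. */
static int
install_sigchld(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sigchld_handler;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGCHLD, &sa, NULL);
}

/*
 * Reap exited children, then wait for fd activity or a timeout.
 * Returns what pselect() returns; -1 with errno EINTR means a child
 * terminated after the reaping loop and the caller should loop again.
 */
static int
wait_for_event(int nfds, fd_set *rfds, const struct timespec *timeout,
    int *nreaped)
{
	sigset_t block, orig;
	int r;

	/* 1. Block SIGCHLD so it cannot slip in unnoticed. */
	sigemptyset(&block);
	sigaddset(&block, SIGCHLD);
	sigprocmask(SIG_BLOCK, &block, &orig);

	/* 2. Collect children that have already exited. */
	*nreaped = 0;
	while (waitpid(-1, NULL, WNOHANG) > 0)
		(*nreaped)++;

	/* 3. pselect() installs orig atomically while it sleeps. */
	r = pselect(nfds, rfds, NULL, NULL, timeout, &orig);

	sigprocmask(SIG_SETMASK, &orig, NULL);
	return r;
}
```

A SIGCHLD that arrives between step 2 and step 3 stays pending while the
signal is blocked and is delivered as soon as pselect() swaps in the
original mask, so the wakeup is not lost.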
deraadt [Fri, 25 Jun 2021 01:36:04 +0000 (01:36 +0000)]
minimalistic diff to use %ld instead of %d for ptrdiff printing
deraadt [Fri, 25 Jun 2021 01:35:13 +0000 (01:35 +0000)]
Pull in support from a future clang for __GCC_HAVE_SYNC_COMPARE_AND_SWAP_x
defines because we need it now
from https://reviews.llvm.org/D91784
ok mlarkin kettenis
cheloha [Thu, 24 Jun 2021 22:43:31 +0000 (22:43 +0000)]
alarm(3): remove superfluous pointer
The pointer `itp' doesn't serve any purpose here, remove it.
Since we're changing these lines, we may as well rename `it' to `itv'
to match the existing `oitv'.
Thread: https://marc.info/?l=openbsd-tech&m=162380665115598&w=2
ok millert@
jmc [Thu, 24 Jun 2021 21:11:40 +0000 (21:11 +0000)]
trim usage to match the man page;
remove -DSEEALSO, as suggested by millert
ok millert
ian [Thu, 24 Jun 2021 18:40:59 +0000 (18:40 +0000)]
Add Buttonville and Peterborough (ON), both I've flown into.
mlarkin [Thu, 24 Jun 2021 18:05:02 +0000 (18:05 +0000)]
Update the name of RNO (name changed in 1994), also add:
MEV - Minden-Tahoe airport, Minden, Nevada, USA
CXP - Carson airport, Carson City, Nevada, USA
TKF - Truckee Tahoe airport, California, USA
I have landed at all three.
jsg [Thu, 24 Jun 2021 13:27:45 +0000 (13:27 +0000)]
add some aarch64 bits missed in Makefile.in 1.6
ok deraadt@ drahn@
claudio [Thu, 24 Jun 2021 13:03:31 +0000 (13:03 +0000)]
Simplify the multiprotocol handling by moving the while loops out of the
switch statement. This way common code is referenced only once.
OK sthen@
claudio [Thu, 24 Jun 2021 10:04:05 +0000 (10:04 +0000)]
aspath_deflate() used to free the passed-in data, but given the way aspaths
are now processed in the Adj-RIB-Out this is no longer needed: the passed-in
pointer is still referenced and must not be freed.
Adjust the mrt code similar to how up_generate_attr() uses aspath_deflate().
OK sthen@
kettenis [Thu, 24 Jun 2021 09:34:17 +0000 (09:34 +0000)]
Add support for the 64-bit prefetchable memory window.
ok patrick@
claudio [Thu, 24 Jun 2021 09:26:18 +0000 (09:26 +0000)]
Fix add-path capability encoding, the length was not correctly calculated
because it included two extra bytes (copy-paste error from graceful restart).
semarie [Thu, 24 Jun 2021 07:21:59 +0000 (07:21 +0000)]
unveil: cleanup code. no intended functional change.
return early for simple conditions instead of navigating inside
if-branches.
with and ok claudio@
deraadt [Thu, 24 Jun 2021 05:41:43 +0000 (05:41 +0000)]
repair missing dependencies against bfd.h for riscv64
ok jsg drahn
deraadt [Wed, 23 Jun 2021 23:57:43 +0000 (23:57 +0000)]
sync
kettenis [Wed, 23 Jun 2021 22:39:31 +0000 (22:39 +0000)]
Adjust test. You're not supposed to change errno in a signal handler and
count on it being observable in the normal program flow after the signal
handler returns. Such code would break code that sets errno to 0 and
looks at its value later. With the recent futex(2) changes this particular
aspect of the test no longer passed.
ok deraadt@, bluhm@
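
The usual convention that follows from this is for a signal handler to save
errno on entry and restore it on exit. A minimal sketch, with hypothetical
names and a deliberately artificial errno change inside the handler:

```c
#include <errno.h>
#include <signal.h>

static volatile sig_atomic_t got_signal;

/* Handler that leaves errno exactly as the interrupted code had it. */
static void
on_signal(int sig)
{
	int save_errno = errno;

	(void)sig;
	errno = EINTR;	/* stand-in for a call in the handler that sets errno */
	got_signal = 1;
	errno = save_errno;
}
```

Code that sets errno to 0, gets interrupted by this handler, and inspects
errno afterwards still sees 0.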
patrick [Wed, 23 Jun 2021 19:46:13 +0000 (19:46 +0000)]
The first page of the I/O virtual address space is reserved so that
it is easier to spot misconfiguration or wrong behaviour where NULL is
used as address. Right now that page is not part of the IOVA at all,
so when we reserve regions, like PCI I/O space, which can cover that
page as well, extent(9) will panic. Instead, include it in the IOVA
but reserve it right away. This way that page can be reserved twice.
espie [Wed, 23 Jun 2021 16:51:15 +0000 (16:51 +0000)]
help the debugger look in ports for external parts like PadWalker
and Readline.
feedback and okay afresh1@
cheloha [Wed, 23 Jun 2021 16:10:45 +0000 (16:10 +0000)]
rtsock: revert from timeout_set_flags(9) to timeout_set_proc(9); ok mvs@
kettenis [Wed, 23 Jun 2021 15:32:40 +0000 (15:32 +0000)]
titmp(4)
kettenis [Wed, 23 Jun 2021 15:26:10 +0000 (15:26 +0000)]
Enable titmp(4).
ok deraadt@
kettenis [Wed, 23 Jun 2021 15:25:39 +0000 (15:25 +0000)]
Add titmp(4), a driver for the TI TMP451 temperature sensor.
ok deraadt@
cheloha [Wed, 23 Jun 2021 14:12:59 +0000 (14:12 +0000)]
adb(4/macppc): fix adb_cuda_tickle() prototype
Timeout callback functions are of type void (*)(void *).
adb_cuda_tickle() needs a void pointer for a first parameter.
ok mpi@
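
A small sketch of what the void (*)(void *) callback shape implies for a
driver function: take a void pointer and convert it back to the driver's
own state inside the function. The softc layout, the callback and the
dispatcher here are hypothetical and only model the calling convention,
not timeout(9) itself.

```c
/* Hypothetical driver state. */
struct example_softc {
	int ticks;
};

/* The callback type described above. */
typedef void (*timeout_fn)(void *);

/* Matches the type exactly; the argument is converted inside. */
static void
example_tickle(void *arg)
{
	struct example_softc *sc = arg;

	sc->ticks++;
}

/* Stand-in for the code that eventually fires the timeout. */
static void
fire_timeout(timeout_fn fn, void *arg)
{
	fn(arg);
}
```

Declaring the callback with a typed pointer parameter instead would make
the function pointer types incompatible, and calling through the mismatched
type is undefined behaviour in C.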
claudio [Wed, 23 Jun 2021 14:09:01 +0000 (14:09 +0000)]
In unveil_add_vnode() refactor code around the indexes i and j. In one
place the wrong index is used resulting in re-evaluating all unveil nodes.
Also loop over all but the last (just added vnode) -- again there is
no need to re-evaluate the cover of the just added unveil.
OK anton@ semarie@
kettenis [Wed, 23 Jun 2021 13:39:12 +0000 (13:39 +0000)]
Make sure the bus is idle before starting a transfer.
ok deraadt@
krw [Wed, 23 Jun 2021 13:07:13 +0000 (13:07 +0000)]
The value of -l should be treated as a 512-byte block count.
Tweak man page.
tobhe [Wed, 23 Jun 2021 12:21:23 +0000 (12:21 +0000)]
Use print_host() to log destination, netmask and gateway. Add pretty
printing for route flags.
ok markus@
tobhe [Wed, 23 Jun 2021 12:11:40 +0000 (12:11 +0000)]
Factor out vroute_addr().
ok markus@
dv [Wed, 23 Jun 2021 11:24:01 +0000 (11:24 +0000)]
btrace(8): init and update timespec for BEGIN/END event
BEGIN and END use a fake dt(4) event, so in order to use the nsecs
var or time() it needs a timespec set. Init for BEGIN and update
at END.
ok mpi@
tb [Wed, 23 Jun 2021 11:12:33 +0000 (11:12 +0000)]
Garbage collect prototype for ssl_parse_serverhello_tlsext() which
was removed in t1_lib.c r1.141.
dlg [Wed, 23 Jun 2021 06:53:51 +0000 (06:53 +0000)]
augment the global pf state list with its own locks.
before this, things that iterated over the global list of pf states
had to take the net, pf, or pf state locks. in particular, the
ioctls that dump the state table took the net and pf state locks
before iterating over the states and using copyout to export them
to userland. when we tried replacing the use of rwlocks with mutexes
under the pf locks, this blew up because you can't sleep when holding
a mutex and there's a sleeping lock used inside copyout.
this diff introduces two locks around the global state list: a mutex
that protects the head and tail of the list, and an rwlock that
protects the links between elements in the list. inserts on the
state list only occur during packet handling and can be done by
taking the mutex and putting the state on the tail before releasing
the mutex. iterating over states is only done from thread/process
contexts, so we can take a read lock, then the mutex to get a
snapshot of the head and tail pointers, and then keep the read lock
to iterate between the head and tail pointers. because it's a read
lock we can then take other sleeping locks (eg, the one inside
copyout) without (further) gymnastics. the pf state purge code takes
the rwlock exclusively and the mutex to remove elements from the
list.
this allows the ioctls and purge code to loop over the list
concurrently and largely without blocking the creation of states
when pf is processing packets.
pfsync also iterates over the state list when doing bulk sends,
which the state purge code needs to be careful around.
ok sashan@
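
The two-lock scheme described above can be modelled in userland with
pthreads. This is a sketch under assumptions: a singly linked list instead
of the kernel's queue macros, a pthread mutex and rwlock standing in for
the kernel primitives, and hypothetical names throughout.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>

struct state {
	struct state *next;
	int id;
};

struct state_list {
	pthread_mutex_t mtx;	/* protects head and tail */
	pthread_rwlock_t rwl;	/* protects the links between elements */
	struct state *head, *tail;
};

static void
list_init(struct state_list *l)
{
	pthread_mutex_init(&l->mtx, NULL);
	pthread_rwlock_init(&l->rwl, NULL);
	l->head = l->tail = NULL;
}

/* Packet path: append at the tail while holding only the mutex. */
static void
list_insert_tail(struct state_list *l, struct state *s)
{
	s->next = NULL;
	pthread_mutex_lock(&l->mtx);
	if (l->tail != NULL)
		l->tail->next = s;
	else
		l->head = s;
	l->tail = s;
	pthread_mutex_unlock(&l->mtx);
}

/* Iterator: read lock, snapshot head/tail under the mutex, then walk. */
static int
list_sum(struct state_list *l)
{
	struct state *s, *head, *tail;
	int sum = 0;

	pthread_rwlock_rdlock(&l->rwl);
	pthread_mutex_lock(&l->mtx);
	head = l->head;
	tail = l->tail;
	pthread_mutex_unlock(&l->mtx);

	for (s = head; s != NULL; s = s->next) {
		sum += s->id;	/* sleeping here is fine: no mutex held */
		if (s == tail)
			break;	/* never look past the snapshot */
	}
	pthread_rwlock_unlock(&l->rwl);
	return sum;
}

/* Purge path: write lock plus mutex to unlink an element. */
static struct state *
list_remove_head(struct state_list *l)
{
	struct state *s;

	pthread_rwlock_wrlock(&l->rwl);
	pthread_mutex_lock(&l->mtx);
	s = l->head;
	if (s != NULL) {
		l->head = s->next;
		if (l->head == NULL)
			l->tail = NULL;
	}
	pthread_mutex_unlock(&l->mtx);
	pthread_rwlock_unlock(&l->rwl);
	return s;
}
```

Stopping at the snapshotted tail is what lets inserts proceed while an
iterator is running: elements appended later are simply not visited in
that pass.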
dlg [Wed, 23 Jun 2021 05:51:27 +0000 (05:51 +0000)]
pf_purge_expired_states can check the time once instead of for every state.
dlg [Wed, 23 Jun 2021 05:43:53 +0000 (05:43 +0000)]
pfsync_undefer_notify needs to be careful before dereferencing state keys.
pfsync_undefer_notify uses the state keys to look up the address
family, which is used to figure out if it should call ipv4 or ipv6
functions. however, the pf state purge code can unlink a state from
the trees (ie, the state keys get removed) while the pfsync defer
code is holding a reference to it and expects to be able to send
the deferred packet in the future. we can test if the state keys
are set by checking if the timeout state is PFTM_UNLINK or not.
this currently relies on both pf_remove_state and pfsync_undefer_notify
being called with the NET_LOCK held. this probably needs to be
rethought later but is good enough for now.
found the hard way on a production firewall at work.
dlg [Wed, 23 Jun 2021 04:16:32 +0000 (04:16 +0000)]
rework pf_state_expires to avoid confusion around state->timeout.
i'm going to make it so pf_purge_expired_states() can gather states
largely without sharing a lock with pfsync or actual packet processing
in pf. if pf or pfsync unlink a state while pf_purge_expired_states
is looking at it, we can race with some checks and fall over a
KASSERT.
i'm fixing this by having the caller of pf_state_expires read
state->timeout first, do its checks, and then pass the value as
an argument into pf_state_expires. this means there's a consistent
view of the state->timeout variable across all the checks that
pf_purge_expired_states in particular does. if pf/pfsync does change
the timeout while pf_purge_expired_states is looking at it, the
worst thing that happens is that it doesn't get picked as a candidate
for purging in this pass and will have to wait for the next sweep.
ok sashan@ as part of a bigger diff
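
The read-once idea can be sketched like this: the caller loads the shared
field a single time and every later check, including the helper, works on
that local copy. The struct, the constant and the expiry rule are
hypothetical simplifications, not pf's actual logic.

```c
#include <stdatomic.h>

#define TM_UNLINKED	(-1)	/* hypothetical "already unlinked" marker */

struct st {
	_Atomic int timeout;	/* may be changed by another context */
	long created;
};

/* Uses the caller's snapshot; never re-reads s->timeout. */
static long
st_expires(struct st *s, int timeout)
{
	return s->created + timeout;
}

/* Returns nonzero if the state should be purged in this pass. */
static int
st_is_purge_candidate(struct st *s, long now)
{
	/* Read the shared field exactly once. */
	int timeout = atomic_load_explicit(&s->timeout,
	    memory_order_relaxed);

	if (timeout == TM_UNLINKED)
		return 1;
	return st_expires(s, timeout) <= now;
}
```

If another context changes the field after the load, both checks still
agree with each other; at worst the state is picked up on the next sweep.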