try pf.c r1.1143 again: move pf_purge out from under the kernel lock
this also avoids holding NET_LOCK too long.
the main change is done by running the purge tasks in systqmp instead
of systq. the pf state list was recently reworked so iteration over
the states can be done without blocking insertions.
however, scanning a lot of states can still take a lot of time, so
this also makes the state list scanner yield if it has spent too
much time running.
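
a minimal userland sketch of the yielding idea, not the actual pf.c
code: the scanner keeps a cursor, checks a clock after each entry, and
stops once a time budget is used up so the caller can reschedule it and
resume from the cursor later. every name here (scan_cursor, scan_slice,
fake_clock, the budget units) is hypothetical and only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical cursor: remembers where the previous slice stopped */
struct scan_cursor {
	size_t next;		/* index of the next entry to examine */
};

/* hypothetical clock hook so the budget check can be driven in tests */
typedef uint64_t (*clock_fn)(void);

/* fake clock for illustration: advances one tick per call */
static uint64_t fake_ticks;

static uint64_t
fake_clock(void)
{
	return fake_ticks++;
}

/*
 * examine entries from cur->next onward, counting those whose expiry
 * is at or before "now". stop at the end of the array or once "budget"
 * clock ticks have elapsed since this slice started. *done is set to 1
 * only when the whole array has been covered.
 */
static size_t
scan_slice(const int *expiry, size_t n, int now, struct scan_cursor *cur,
    clock_fn clk, uint64_t budget, int *done)
{
	uint64_t start = clk();
	size_t expired = 0;

	assert(cur->next <= n);

	while (cur->next < n) {
		if (expiry[cur->next] <= now)
			expired++;
		cur->next++;

		/* yield: out of budget, the caller resumes from cur later */
		if (clk() - start >= budget)
			break;
	}

	*done = (cur->next == n);
	return expired;
}
```

a caller would keep invoking scan_slice() with the same cursor until
*done is set, rescheduling itself between slices instead of looping.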
the other purge tasks for source nodes, rules, and fragments have
been moved to their own timeout/task pair to simplify the time
accounting.
in my environment, before this change pf purges often took 10 to
50ms. the softclock thread runs alongside them often took a similar
amount of time, presumably because they ended up spinning waiting
for each other. after this change the pf_purges are more like 6 to
12ms, and don't block softclock. most of the variability in the runs
now seems to come from contention on the net lock.
tested by me sthen@ chris@
ok sashan@ kn@ claudio@
the diff was backed out because it made things a bit more racy,
but sashan@ has squashed those races this week. let's try it again.