PF_ANCHOR_STACK_MAX is insufficient protection against stack overflow.
On amd64 the stack overflows for anchor rules nested to a depth of ~30.
The tricky thing is that the 'safe' depth varies depending on the kind
of packet processed by pf_match_rule(). For example, for a local
outbound TCP packet the stack overflows when the recursion in
pf_match_rule() reaches depth 24.
Instead of lowering PF_ANCHOR_STACK_MAX to 20 and hoping it will
be enough on all platforms and for all packets, I'd like to stop
calling pf_match_rule() recursively. This commit brings back the
pf_anchor_stackframe array we used to have back in 2017. It also
revives patrick@'s idea to pre-allocate the stack frame arrays
per CPU.
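
For illustration only, here is a minimal sketch of walking nested
anchors with an explicit frame array instead of recursion. It is not
the pf code: sketch_rule, sketch_frame, sketch_match() and
SKETCH_STACK_MAX are hypothetical names, and the caller-supplied
stack array merely stands in for a pre-allocated per-CPU array.

```c
#include <stddef.h>

#define SKETCH_STACK_MAX 64	/* hypothetical limit, not pf's value */

/* Hypothetical rule: a plain rule, or an anchor holding child rules. */
struct sketch_rule {
	int			 match;		/* 1 if the rule matches */
	const struct sketch_rule *children;	/* non-NULL for an anchor */
	size_t			 nchildren;
};

/* One saved position per anchor level; replaces a recursive call. */
struct sketch_frame {
	const struct sketch_rule *rules;
	size_t			 nrules;
	size_t			 next;
};

/*
 * Count matching rules without recursion.  "stack" must have room for
 * SKETCH_STACK_MAX frames.  Returns -1 when anchors nest deeper than
 * that, instead of overflowing the C stack.
 */
int
sketch_match(const struct sketch_rule *rules, size_t nrules,
    struct sketch_frame *stack)
{
	size_t depth = 0;
	int matches = 0;

	stack[0].rules = rules;
	stack[0].nrules = nrules;
	stack[0].next = 0;

	for (;;) {
		struct sketch_frame *f = &stack[depth];
		const struct sketch_rule *r;

		if (f->next == f->nrules) {
			if (depth == 0)
				break;		/* top-level ruleset done */
			depth--;		/* leave the anchor */
			continue;
		}
		r = &f->rules[f->next++];
		if (r->children != NULL) {
			if (depth + 1 == SKETCH_STACK_MAX)
				return -1;	/* nesting too deep */
			depth++;		/* step into the anchor */
			stack[depth].rules = r->children;
			stack[depth].nrules = r->nchildren;
			stack[depth].next = 0;
		} else if (r->match)
			matches++;
	}
	return matches;
}
```

In this sketch the depth limit is enforced by the size of the frame
array, so hitting it is a clean error return and does not depend on
how much C stack a given packet type consumes.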
OK kn@