From: dlg
Date: Sat, 26 Feb 2022 02:15:45 +0000 (+0000)
Subject: have another go at fixing assert "sc->sc_dev == NUM" failed.
X-Git-Url: http://artulab.com/gitweb/?a=commitdiff_plain;h=1c9104c31d3f30112714e8d264aa072933855a4f;p=openbsd

have another go at fixing assert "sc->sc_dev == NUM" failed.

claudio figured it out. his clue was that multiple concurrent calls to
tunopen (or tapopen) will share a vnode. because tunopen can sleep,
multiple programs can be inside tunopen for the same tun interface at
the same time, all with references against the same vnode.

at the same time as this, another thread/program can call VOP_REVOKE
via tun_clone_destroy (eg, ifconfig tun1 destroy does this). VOP_REVOKE
marks a vnode as bad, which in turn means that subsequent open()s of a
tun interface will get a brand new vnode.

so multiple threads holding references to a vnode can be sleeping in
tun_dev_open on the interface cloner lock. one thread wins and takes
ownership of the tun interface. another thread can then destroy that
tun interface, which calls VOP_REVOKE, which calls tun_dev_close to
tear down the vnode's association with the tun interface and mark the
vnode as bad. the thread that called tun_clone_destroy then creates
another instance of the interface by calling tun_clone_create
immediately.

one of the original threads with the old vnode reference wakes up and
takes ownership of the new tun_softc. however, because the vnode is
bad, all the vnode ops have been replaced with the deadfs ops. the
close() op on the old vnode is now a nop from the point of view of tun
interfaces. the old vnode is no longer associated with tun and tap and
will now never call tun_dev_close (via tunclose or tapclose), which in
turn means sc_dev won't get cleared.

another thread can now call tun_clone_destroy against the new instance
of tun_softc. this instance has sc_dev set, so it tries to revoke it,
but there's no vnode associated with it because the old vnode reference
is dead.
because this second call to VOP_REVOKE couldn't find a vnode, it can't
call tunclose against it, so sc_dev is still set and this KASSERT
fires.

claudio and i came up with the following, which is to have tun_dev_open
check the state of the vnode associated with the current open call
after all the sleeping and potential tun_clone_destroy and
tun_clone_create calls. if the vnode has been made bad/dead after all
the sleeping, it returns with ENXIO.

Reported-by: syzbot+5e13201866c43afbfbf6@syzkaller.appspotmail.com
ok claudio@ visa@
---

diff --git a/sys/net/if_tun.c b/sys/net/if_tun.c
index 6b974dbe9f4..e1a10a35412 100644
--- a/sys/net/if_tun.c
+++ b/sys/net/if_tun.c
@@ -1,4 +1,4 @@
-/*	$OpenBSD: if_tun.c,v 1.235 2022/02/22 01:15:02 guenther Exp $	*/
+/*	$OpenBSD: if_tun.c,v 1.236 2022/02/26 02:15:45 dlg Exp $	*/
 /*	$NetBSD: if_tun.c,v 1.24 1996/05/07 02:40:48 thorpej Exp $	*/
 
 /*
@@ -373,10 +373,19 @@ tun_dev_open(dev_t dev, const struct if_clone *ifc, int mode, struct proc *p)
 	struct ifnet *ifp;
 	int error;
 	u_short stayup = 0;
+	struct vnode *vp;
 	char name[IFNAMSIZ];
 	unsigned int rdomain;
 
+	/*
+	 * Find the vnode associated with this open before we sleep
+	 * and let something else revoke it. Our caller has a reference
+	 * to it so we don't need to account for it.
+	 */
+	if (!vfinddev(dev, VCHR, &vp))
+		panic("%s vfinddev failed", __func__);
+
 	snprintf(name, sizeof(name), "%s%u", ifc->ifc_name, minor(dev));
 	rdomain = rtable_l2(p->p_p->ps_rtableid);
 
@@ -413,6 +422,12 @@ tun_dev_open(dev_t dev, const struct if_clone *ifc, int mode, struct proc *p)
 		}
 	}
 
+	/* Has tun_clone_destroy torn the rug out under us? */
+	if (vp->v_type == VBAD) {
+		error = ENXIO;
+		goto done;
+	}
+
 	if (sc->sc_dev != 0) {
 		/* aww, we lost */
 		error = EBUSY;