tighten up the tx start/completion handling of the producer and consumer.
the hypervisor obvious snoops the descriptor rings like crazy, and
it can run and complete transmit of packets as soon as the ownership
bit is set on the descriptor and before the txh register is updated
with the producer index. txintr would only process tx completions
if the producer and consumer indexes the driver maintains were
different, but would then go and pop every packet the hardware said
was done off the ring.
this changes txintr so it will only iterate over packets between
the driver consumer and producer indexes. also, have the start code
update the producer before flipping the ownership bit in the ring.
this keeps the start and intr code in sync.