Before pmap7.c rev 1.35 and pmap.h rev 1.44, DMA'able memory with the
BUS_DMA_COHERENT flag was mapped as device memory, which does not use the
store buffer. It is now mapped as normal inner and outer non-cacheable
memory, which does.
While we drain the CPU store buffer for this case, on Cortex-A9 systems we
also need to explicitly drain the PL310 L2 cache's store buffer. PL310
revisions r3p2 and later drain it automatically once data has been present
in the store buffer for 256 cycles. On i.MX6 the PL310 is rev r3p1, which
does not have this behaviour. This issue is i.MX6 erratum ERR055199 and
PL310 erratum 769419.
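
For illustration only, a minimal sketch of what an explicit PL310 store
buffer drain can look like. It is not the code in this commit. It assumes
the drain is done by writing the PL310 Cache Sync register (offset 0x730),
and the helper name and the base pointer argument are hypothetical. On real
hardware the CPU store buffer would be drained first (dsb) and base would
point at the mapped PL310 registers.

```c
#include <stdint.h>

/* PL310 Cache Sync register offset, relative to the controller base. */
#define PL310_CACHE_SYNC	0x730

/*
 * Hypothetical helper, not taken from the tree: request a cache sync,
 * which is assumed here to drain the PL310 store buffer.
 */
static inline void
pl310_drain_sketch(volatile uint32_t *base)
{
	/* Any write to Cache Sync starts the operation. */
	base[PL310_CACHE_SYNC / 4] = 0;

	/* Bit 0 is assumed to read as 1 while the sync is in progress. */
	while (base[PL310_CACHE_SYNC / 4] & 1)
		continue;
}
```

With plain memory standing in for the register block the loop exits
immediately, so the sketch can be exercised off-target.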
This change restores I/O performance with a USB flash drive attached to
my cubox. Raw reads go from 3 MB/s to 19 MB/s, for example.
Based on code written by patrick@ some time ago.
ok kettenis@ patrick@