Video ram bandwidth + coder suggestions

Posts: 1004
Joined: Tue Aug 01, 2006 9:21 am
Location: Halmstad, Sweden


Post by shoggoth »

Evil of Dead Hackers Society recently made some interesting benchmarks:
Well not a demo, but I threw together a little test program to see real numbers.

It draws a normal 320x240 8-bit tunnel effect and then displays it in various ways. These are the results:

Tunnel results
Fastram -> ST-ram (c2p): ~50 FPS
Fastram -> SV-ram (c2p): ~66 FPS
Fastram -> SV-ram (move16): ~90 FPS
SV-ram direct render: ~62 FPS

Quite surprising that even a Fastram->SV-ram c2p is faster than rendering directly to SV-ram. Fastram->SV-ram with move16 wins by a big margin, and that's good because replacing the c2p pass with a move16 pass is almost no work at all.
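For a rough sense of scale, the frame rates above can be turned into bytes displayed per second. This is only a back-of-the-envelope sketch in Python, assuming one byte per pixel for the 320x240 8-bit frame. Each frame also includes the time spent drawing the tunnel, so these figures are at best a lower bound on the raw copy bandwidth, not a measurement of it. The function name is made up for illustration.

```python
# Rough effective throughput for the tunnel test: bytes displayed per second.
# Assumption: 320x240 frame at 8 bits (1 byte) per pixel, as stated in the post.
FRAME_BYTES = 320 * 240  # 76800 bytes per frame

# Approximate FPS figures quoted in the results above.
RESULTS = {
    "Fastram -> ST-ram (c2p)": 50,
    "Fastram -> SV-ram (c2p)": 66,
    "Fastram -> SV-ram (move16)": 90,
    "SV-ram direct render": 62,
}


def throughput_mb_per_s(fps, frame_bytes=FRAME_BYTES):
    """Return bytes displayed per second, in units of 10**6 bytes."""
    return fps * frame_bytes / 1e6


if __name__ == "__main__":
    for name, fps in RESULTS.items():
        print(f"{name:28s} ~{throughput_mb_per_s(fps):.2f} MB/s")
```

By this estimate the move16 path shows about 6.9 MB/s of pixels, against about 5.1 MB/s for c2p into SV-ram.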

If you'd like to try this program, here's a link:

Just be aware that it's probably not 100% bug-free and it's not made to look pretty (screen tearing etc). Sources are not included; they are a mess. Also, it will crash on anything less than a 68040 (it uses the move16 instruction), and it will certainly not look correct if you don't have a SuperVidel. The tests will take a while: each test draws 1000 frames to get a good average.

Anders Eriksson
A note about the "SV-ram direct render" result - this code uses byte transfers to SV-ram, which is a natural thing to do in e.g. a texture mapper or other byte-oriented code. Since SV-ram isn't cached by the CPU, this is by nature highly inefficient. So if you render a large frame with byte-oriented code, it is probably a good idea to render it in Fastram and then copy it to SV-ram with move16 transfers. On the other hand, if your code already uses word or longword transfers, performance may be quite a bit better than the "SV-ram direct render" benchmark above. I guess more benchmarks are needed for that scenario! Anyway - huge thanks to Evil for providing these tests!
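To put numbers on the byte vs. longword point, here is a small Python sketch that only counts how many separate writes one 320x240 8-bit frame needs at each transfer width, and shows four pixels being grouped into one big-endian longword (the 68k byte order). It says nothing about how long each write to SV-ram actually takes; that isn't given in this thread. The function names are hypothetical.

```python
from typing import List

FRAME_BYTES = 320 * 240  # one 8-bit 320x240 frame


def writes_per_frame(bytes_per_write, frame_bytes=FRAME_BYTES):
    """Count the writes needed to store one frame at a given transfer width."""
    if frame_bytes % bytes_per_write:
        raise ValueError("frame size must be a multiple of the transfer width")
    return frame_bytes // bytes_per_write


def pack_longwords(pixels: bytes) -> List[int]:
    """Group 8-bit pixels four at a time into big-endian 32-bit values."""
    if len(pixels) % 4:
        raise ValueError("pixel count must be a multiple of 4")
    return [int.from_bytes(pixels[i:i + 4], "big") for i in range(0, len(pixels), 4)]


if __name__ == "__main__":
    # byte, word, longword, and one 16-byte move16 line
    for width in (1, 2, 4, 16):
        print(f"{width:2d}-byte writes per frame: {writes_per_frame(width)}")
    print([hex(v) for v in pack_longwords(bytes([1, 2, 3, 4, 5, 6, 7, 8]))])
```

A byte-oriented renderer issues 76800 uncached writes per frame, a longword one 19200, and a move16 copy 4800 line transfers.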

IMPORTANT NOTE: It has become evident that the current SV firmware has some issues with move16 transfers at > 66MHz. This is being worked on and will be fixed in a later firmware release. Until this issue is sorted out, it is recommended to use 4 * move.l (or some movem.l construct) instead; it won't give move16 performance, but it will prepare the code for it. Stay tuned and thanks for your patience!
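The idea of "preparing the code" can be sketched as a copy loop that already works in 16-byte lines, with each line moved as four longwords. The model below is Python rather than 68k assembly, purely to show the loop structure. It assumes the length and destination offset are multiples of 16 bytes, since move16 transfers whole 16-byte lines. The helper name is hypothetical and this is not anyone's actual copy routine.

```python
LINE_BYTES = 16      # one move16 transfers a 16-byte line
LONGWORD_BYTES = 4   # one move.l transfers 4 bytes


def copy_in_lines(src: bytes, dst: bytearray, dst_offset: int = 0) -> int:
    """Copy src into dst in 16-byte lines; return the number of lines copied."""
    if len(src) % LINE_BYTES or dst_offset % LINE_BYTES:
        raise ValueError("length and offset must be multiples of 16 bytes")
    if dst_offset + len(src) > len(dst):
        raise ValueError("destination too small")
    for line in range(0, len(src), LINE_BYTES):
        # Stand-in for 4 x move.l (a0)+,(a1)+ ;
        # a single move16 (a0)+,(a1)+ would replace this inner loop later.
        for k in range(line, line + LINE_BYTES, LONGWORD_BYTES):
            dst[dst_offset + k:dst_offset + k + LONGWORD_BYTES] = src[k:k + LONGWORD_BYTES]
    return len(src) // LINE_BYTES


if __name__ == "__main__":
    frame = bytes(range(256)) * 300      # 76800 bytes, one 320x240 8-bit frame
    screen = bytearray(len(frame))
    print(copy_in_lines(frame, screen), "lines copied")
```

With the buffers laid out this way, switching to move16 later only changes the inner step, not the surrounding loop or the buffer alignment.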
Ain't no space like PeP-space.

Posts: 175
Joined: Mon Aug 03, 2009 9:08 am
Location: Göteborg, Sweden

Re: Video ram bandwidth + coder suggestions

Post by instream »

I have made some changes in the VHDL code to the move16 write handling so now it works for me at 95MHz in both Mikro's quake 1.03 (patched by pep to use move16) and in Evl's tunneldemo. Just need to test on a few more close SVs before unleashing it. :D

We used to assert the TAn signal every other cycle when the CPU writes using move16, to try to get rid of the ugly overshoots and ringing on the CT60 signals. But it seems that this is not enough above ~75MHz. So now the last 3 longwords get 3 cycles each to stabilize instead of 2. This adds 3 clock cycles in total to a move16 write, which should give slightly lower performance than before (maybe 10%). But in Evl's tunneldemo I got the same FPS as before, 83 FPS at 95MHz. So in this case the 060 was able to continue executing instructions while its bus unit waited for the move16 to complete. For pure block copying, where you execute lots of move16 (a0)+,(a1)+ in a row, I guess you will get somewhat lower performance though.
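The arithmetic behind those figures, as a small Python check. The 3 extra cycles follow directly from 3 longwords each getting 1 more cycle. The total length of a move16 write is not stated here, so the baseline in the example is an assumption: a slowdown of about 10% would correspond to a write of roughly 30 cycles, which is inferred from the "maybe 10%" guess, not a measured number. Function names are hypothetical.

```python
def added_cycles(longwords_slowed=3, old_cycles=2, new_cycles=3):
    """Extra cycles per move16 write when some longwords get a longer slot."""
    return longwords_slowed * (new_cycles - old_cycles)


def relative_slowdown(baseline_cycles, extra_cycles):
    """Fractional increase in the duration of one move16 write."""
    return extra_cycles / baseline_cycles


if __name__ == "__main__":
    extra = added_cycles()
    print("extra cycles per move16 write:", extra)
    # ASSUMPTION: a 30-cycle baseline, chosen only because it matches "maybe 10%".
    print(f"slowdown with a 30-cycle baseline: {relative_slowdown(30, extra):.0%}")
```

This is the worst case for back-to-back move16 copies; when the CPU has other work to overlap with the bus write, as in the tunnel test, the extra cycles can disappear from the frame rate entirely.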
