Quake 2 on Falcon030
Re: Quake 2 on Falcon030
I'm working to a budget now within DSP memory and it's a very tight squeeze. Testing the first map forced me to adjust the sizes of various buffers a few times to get stuff to fit.
Fortunately it's possible to shrink the size of geometry batches (vertices and 3d face edges) to free up some memory for global 2d edges and draw surfaces.
The budget is currently 1000 vertices per batch, 500 edges per batch, 2000 global edges and 600 global surfaces. This consumes all of the DSP ram at least briefly. Some ram is freed up again before drawing since batching has stopped by then - but using smaller batches means less gets freed up at the end.
The main limiting factor is global edges + draw surfaces since those accumulate and persist until the scene is drawn. There isn't any sensible way to deliver these in batches since all need to be present at once for the hidden surface scanning to work. Even drawing the scene in tiles will not help here. So they just need to be packed as small as possible and if the limit gets hit then no more get sent and some holes appear in distant scenery.
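A minimal sketch of the overflow behaviour described above, assuming a fixed-capacity array for the global edges. The names (`EdgePool`, `edge_pool_add`, `MAX_GLOBAL_EDGES`) and the edge fields are hypothetical; only the 2000-edge limit comes from the post, and this is not the actual DSP code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_GLOBAL_EDGES 2000  /* budget quoted in the post */

/* Hypothetical 2D edge record; the real layout is not given. */
typedef struct {
    int x0, y0, x1, y1;
} Edge2D;

typedef struct {
    Edge2D edges[MAX_GLOBAL_EDGES];
    size_t count;    /* edges accepted this frame */
    size_t dropped;  /* edges rejected after the pool filled up */
} EdgePool;

/* Start a new frame with an empty pool. */
void edge_pool_reset(EdgePool *pool)
{
    assert(pool != NULL);
    pool->count = 0;
    pool->dropped = 0;
}

/* Accumulate one edge. Once the budget is exhausted the edge is
 * simply not stored, which is what would leave holes in the scene. */
bool edge_pool_add(EdgePool *pool, Edge2D e)
{
    assert(pool != NULL);
    if (pool->count >= MAX_GLOBAL_EDGES) {
        pool->dropped++;
        return false;
    }
    pool->edges[pool->count++] = e;
    return true;
}
```

Counting the dropped edges is an addition for illustration: it makes it easy to see how often a map actually hits the limit.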
I think though the Falcon will be too slow to render scenes which overflow in this way so it's not a big deal. Scanning 600 faces is already a lot of work for it to do even without textures.
Vertices are currently taking 8 words each (tx,ty,tz,pad,px,py,pz,pad), distributed over x:y memory so each takes 4 words of address space. This will get packed down to (tx,ty,px,py,z,pad) at the very least, which is a 25% saving. A couple of other things can perhaps be packed, e.g. spans from 2 words to 1, but at higher processing cost. Won't do that unless it is needed.
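The packing arithmetic above can be checked with a few lines of C. This is only a worked calculation using the figures from the post (1000 vertices per batch, 8 words shrinking to 6); the function names are made up for the example.

```c
#include <assert.h>

/* Words of DSP memory needed for `count` vertices at `words_each` words. */
unsigned long vertex_words(unsigned long count, unsigned words_each)
{
    assert(words_each > 0);
    return count * words_each;
}

/* Saving in percent when a record shrinks from old_words to new_words. */
unsigned saving_percent(unsigned old_words, unsigned new_words)
{
    assert(old_words > 0 && new_words <= old_words);
    return (old_words - new_words) * 100u / old_words;
}
```

For a 1000-vertex batch this gives 8000 words before packing and 6000 after, so 2000 words come back. With the data split across x:y memory, the address range per vertex drops from 4 locations to 3.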
So in general it is a tight fit but it does fit for the kinds of maps that seem to be manageable, and the type of damage for bigger maps is limited to distant details and probably offset by the huge cost of drawing such big scenes.
Will post an update when I get the scan convertor to work.
d:m:l
Home: http://www.leonik.net/dml/sec_atari.py
AGT project https://bitbucket.org/d_m_l/agtools (source) https://bitbucket.org/d_m_l/agtools/downloads?tab=tags
BadMooD p/l: http://www.youtube.com/playlist?list=PL ... oOGiLtcniv
Quake II p/l: http://www.youtube.com/playlist?list=PL ... 5nMm10m0UM
Re: Quake 2 on Falcon030
Now after fixing a couple of bugs the scan convertor seems to be working - in the sense that it ticks over from frame to frame and builds surfaces and spans, and best of all, doesn't crash. I don't know what the data looks like yet and if it is correct. So that will be next. Seems like a step forward though.
d:m:l
Re: Quake 2 on Falcon030
I think the hidden surface removal process is now beginning to work on the DSP. It has taken quite a lot of debugging and hassle but probably the worst part is over for the single most complicated part of the project. The screenshots show it's starting to do the right thing.
There are strange glitches and the scene flickers on/off at some angles so there's still work to do.
None of this new code is optimized. I'm trying to match the C code 1:1, so it is also currently quite slow. It's also running debug checks on the edge and surface buffers on every scanline to detect faults, which takes a huge amount of processing time.
d:m:l
- FedePede04
- Atari God
- Posts: 1215
- Joined: Fri Feb 04, 2011 12:14 am
- Location: Denmark
Re: Quake 2 on Falcon030
Hi Doug
you are a true wizard
and i love to follow the process of your work.
Atari will rule the world, long after man has disappeared
sometime my English is a little weird, Google translate is my best friend

Re: Quake 2 on Falcon030
FedePede04 wrote: Hi Doug, you are a true wizard and i love to follow the process of your work.
Thanks Peter! It's been fun hacking at it so far. Might start getting more interesting soon, after a bunch more fixes. It's always a lot easier to work on something that isn't broken.

d:m:l
Re: Quake 2 on Falcon030
Finally got it working. It's still pretty slow but now that the bugs are out I can work on the planned optimizations (of which there are many - hopefully the speed will improve enough before they run out!).
https://www.youtube.com/watch?v=ZPQVd2t ... e=youtu.be
The most important optimization will be removing the linear list scan for inserting edges into the active edge table. Q2 reduced this a bit by pre-sorting the pending lists on each scanline so it inserts a sorted list into a sorted list. But I don't even want to be doing that if it can be avoided, so I have a couple of other methods to try... I haven't profiled it yet but I expect much of the time is lost in list maintenance.
The DSP code is also not using internal fast ram or decent addressing modes, so it's about 3x slower than it needs to be because of that alone.
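A minimal sketch of the "sorted list into a sorted list" insertion that the post attributes to Q2, assuming a singly linked active edge table with a tail sentinel. The `Edge` struct and the function names are hypothetical; this is neither the Q2 source nor the DSP code, just an illustration of why pre-sorting the pending edges turns many list scans into one pass.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical active-edge node: x is the edge's position on the
 * current scanline, next links the list in ascending x order. */
typedef struct Edge {
    int x;
    struct Edge *next;
} Edge;

/* Initialise an empty table: head -> tail. The tail is a sentinel whose
 * x is larger than any real edge, so searches need no NULL check. */
void aet_init(Edge *head, Edge *tail)
{
    assert(head != NULL && tail != NULL);
    head->x = INT_MIN;
    head->next = tail;
    tail->x = INT_MAX;
    tail->next = NULL;
}

/* Merge a pending list (sorted by ascending x, NULL-terminated) into
 * the table. Because both lists are sorted, the search for each
 * insertion point resumes where the previous one stopped, so the
 * whole merge is a single pass over the table. */
void aet_merge_sorted(Edge *head, Edge *pending)
{
    Edge *prev = head;
    assert(head != NULL);
    while (pending != NULL) {
        Edge *next_pending = pending->next;
        assert(pending->x < INT_MAX);  /* must sort before the sentinel */
        /* Advance to the last node whose x is <= the new edge's x. */
        while (prev->next->x <= pending->x)
            prev = prev->next;
        pending->next = prev->next;
        prev->next = pending;
        prev = pending;
        pending = next_pending;
    }
}
```

The cost is still linear in the length of the table per scanline, which is the part the post wants to get rid of; the sketch only shows the baseline being improved on.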
d:m:l
- Scarlettkitten
- Captain Atari
- Posts: 261
- Joined: Thu Mar 19, 2009 11:42 am
- Location: Northamptonshire, UK
Re: Quake 2 on Falcon030
Just wow, impressive 

My musical dribbles https://soundcloud.com/sophierosemusix
- calimero
- Fuji Shaped Bastard
- Posts: 2422
- Joined: Thu Sep 15, 2005 10:01 am
- Location: STara Pazova, Serbia
Re: Quake 2 on Falcon030
Crazy poo!
Now I can really imagine Tron like game!
What are the differences in the engine between Quake 1 and Quake 2?
If the Quake 2 player works so fast on a 16MHz 030, how many more CPU cycles would be needed for drawing textures (e.g. a second 16MHz 030, or a 32MHz 030)? Or does drawing textures cost way more than 100% of what you have made so far?
VBL counter - is it how many VBLs are needed to draw the screen?
using Atari since 1986. ・ http://wet.atari.org ・ http://milan.kovac.cc/atari/software/ ・ Atari Falcon030/CT63/SV ・ Atari STe ・ Atari Mega4/MegaFile30/SM124 ・ Amiga 1200/PPC ・ Amiga 500 ・ C64 ・ ZX Spectrum ・ RPi ・ MagiC! ・ MiNT 1.18 ・ OS X
Re: Quake 2 on Falcon030
calimero wrote: Crazy poo! Now I can really imagine Tron like game! What are the differences in the engine between Quake 1 and Quake 2?
For software rasterization, not big changes - they use a similar set of algorithms. There were some improvements though, especially to the map and lightmaps, and with nice transparency using fat lookups. The maps are also more complex. The target machines were bigger and faster (I had something like a 450MHz PIII when it was released, with an early NVidia TNT videocard).
Under 3D acceleration Q2 uses colour lightmaps instead of mono under software. I was hoping to make use of the colour lightmaps on Falcon too if the project got far enough - it should look interesting. At the very least it's possible to use the colour lightmaps for flat shading tones.
Q2 also had significant architectural improvements over Q1, which don't matter much for this project.

calimero wrote: If the Quake 2 player works so fast on a 16MHz 030, how many more CPU cycles would be needed for drawing textures (e.g. a second 16MHz 030, or a 32MHz 030)? Or does drawing textures cost way more than 100% of what you have made so far?
Almost all of the compute time so far is geared towards preventing polygon spans smaller than 1 pixel as early as possible and preventing any overdraw. So providing these stages can be optimized well enough, the chances of doing something interesting with those polygon spans are fair-to-good. The 68030 should be able to fill the number of pixels needed since it is a constant number at all times - the varying cost comes from hidden surface removal and final span count (setup time per span).
I have prototyped a filling technique that should work in principle on the Falcon and be cheap enough for a chunky display at least - but I won't know the total cost until it is tried for real.

calimero wrote: VBL counter - is it how many VBLs are needed to draw the screen?
Yes, kind of - it's the number of VBLs needed for a frame-compute, including drawing. In this case though drawing takes a tiny amount of time since it is just some dots (yellow for span left edges, blue for right edges - or vice versa, I don't remember which way round I did it).
However before attempting any kind of filling, I'm going to work on the existing stages to try to make them faster. It's still a bit too slow for my liking, and I think filling needs more room.
d:m:l
Re: Quake 2 on Falcon030
So to summarize progress on this thing - it has reached a kind of milestone where some of the evil bits have been prototyped and it still looks viable to me on 16MHz base platform, albeit needing still more effort.
game engine (simplified, 1-player):
seems to work ok, although depends on FPU currently
map size (geometry):
seems viable for at least some maps. Tests have been done using maps with 10,000 faces and 20,000 edges and while there are certainly slow spots, the map size has a relatively small impact on speed, compared with the varying cost of 'nearby stuff'. at about 500k for map storage it should be ok.
map size (textures):
unknown - could become a problem later. but I intend to drop the top mipmap, which will cut the size of the surface cache by at least half. if that goes well it might buy enough to use colour lightmaps later.
scene size (geometry):
has been a bit problematic, but I think it is mostly solved by batching. I can now also see a way to cut the cost of vertex and edge storage significantly (by changing the algorithms which use them) which should free up enough DSP ram for the bad cases too.
BSP tree:
this is a bit problematic. it is quite well optimized and still costs too much. i'll revisit this and try to use DSP to help with the frustum culling, which probably is most of the cost.
geometry (transform/projection):
the DSP easily digests the vertex processing cost, it's hardly registering as a portion of total time. it's also easy to optimize it much more so it's not going to be an issue. it costs more to transmit the vertices than it does to process them, and the transmit size can probably still be cut in half if needed.
geometry (face clipping):
seems to be ok, but could do with speeding up. plenty of room to do so, and some improvements can be made to the original algorithms, e.g. to cache all clipped edges which are shared by polygons - which saves DSP ram and avoids divides.
currently there is poor overlap between CPU and DSP which needs some thought. changing the way this part works could be beneficial. in principle the CPU-side reindexing of vertices and edges could be overlapped with DSP face clipping.
hidden surface removal:
a really expensive process for our old Atari, but DSP is coping ok. needs a rewrite and some changes to the algorithm to suit the machine better. considering shift-sentinels or a BTree for the active edge table, to flatten edge insertion cost. the DSP structures need to be reorganized to avoid using the terrible (r0+n0) addressing mode, which costs 2x as much as the others and is currently used absolutely everywhere in my code. the code also needs a lot of work to start running routines from internal ram, which will speed up access to main ram.
filling (flat colour):
don't see any problems with this. that's mainly why I haven't bothered to do it yet. better spend time on the bad things first.
filling (textures):
successful texture addressing prototype on PC using integer arithmetic only, not yet tried on Falcon. i do know it will work with 23-bit fixedpoint though since I was careful to work with that in the prototype.
still some problems with texture plane setup - needs some planning, and figuring out which chip should do it. It can be done by CPU, FPU or DSP but each has its pros and cons...
not yet clear if Q1/Q2 share texture planes across adjacent faces (I think it would make sense to do so) - if not then it could also be used to cut setup time on F30 since setup can be done only the first time a new texture plane is visited on each frame.
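The idea of doing texture plane setup "only the first time a new texture plane is visited on each frame" can be sketched with a frame stamp. Everything here is an assumption for illustration: the `TexPlane` struct, its fields, and the placeholder setup routine are hypothetical, and the post does not say how (or on which chip) the real setup would be done.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical texture-plane record. The gradient fields stand in for
 * whatever per-plane setup the renderer actually needs. */
typedef struct {
    unsigned setup_frame;  /* frame number of the last setup */
    float ustepx, ustepy;  /* example outputs of the setup */
    int setup_count;       /* instrumentation for this example only */
} TexPlane;

/* Placeholder for the expensive setup work (assumed, not from the post). */
void texplane_setup(TexPlane *tp)
{
    tp->ustepx = 1.0f;
    tp->ustepy = 0.0f;
    tp->setup_count++;
}

/* Run setup only the first time a plane is visited in a given frame.
 * Frame numbers start at 1 so a zero-initialised plane is never
 * mistaken for "already set up". */
void texplane_visit(TexPlane *tp, unsigned frame)
{
    assert(tp != NULL && frame != 0);
    if (tp->setup_frame != frame) {
        texplane_setup(tp);
        tp->setup_frame = frame;
    }
}
```

Every face that shares the plane calls `texplane_visit` before drawing; only the first call in a frame pays for the setup, and no per-frame clearing pass is needed.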
Those are just the main notes - there are lots of smaller details. Clearly though it is going to get faster with a bit more work, and we can try out a filled version.
I have cold/flu this week so probably won't be doing a lot - but will post if anything changes...
d:m:l
Re: Quake 2 on Falcon030
How are the coordinates for each face handled in Q1 & Q2? The routines I used in the past for perspective correct mapping always needed mapping coordinates besides the normal face vertices, or else they would stretch out the texture to fit the corners, just like an affine mapper. So for anything other than square faces, they needed extra mapping coordinates for the texture. But I think there is some way to do it without this?
ST / STFM / STE / Mega STE / Falcon / TT030 / Portfolio / 2600 / 7800 / Jaguar / 600xl / 130xe
Re: Quake 2 on Falcon030
Zamuel_a wrote: How are the coordinates for each face handled in Q1 & Q2? The routines I used in the past for perspective correct mapping always needed mapping coordinates besides the normal face vertices, or else they would stretch out the texture to fit the corners, just like an affine mapper. So for anything other than square faces, they needed extra mapping coordinates for the texture. But I think there is some way to do it without this?
Yes, the Q engines use a clever technique which builds on the 'quasi-raytracing' methods which are present in Wolf3D and Doom. Although not explicitly tracing rays, they intercept lines in texture space (Doom used a tangent table to achieve this effect without actually casting rays).
The Q engines treat polygon faces as bounded areas of a plane equation in 3D space. So the texture for a surface is mapped to the plane equation of the surface - and not to vertices. If you move the vertices of a face around, the texture will not move - it is part of the wall 'plane' in 3D space, not the face itself.
This in turn means you can query the intercept coordinates for any pixel on the screen, given the equation of the surface known to be under that pixel.
This in turn means you can find the u,v for any pixel on screen using a simple equation.
u = uoff + (screen_x * ustepx) + (screen_y * ustepy);
v = voff + (screen_x * vstepx) + (screen_y * vstepy);
z = zoff + (screen_x * zstepx) + (screen_y * zstepy);
[apologies for mistakes, typing from memory here]
Even on a chip with a fast multiplier, this might seem expensive until you realize that it removes the need to store or process texture coordinates completely, within polygons or within spans. This is a bit unhelpful if you are drawing a simple cube, but is very helpful when you have a crap-ton of faces to draw.
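The three equations above translate directly into a small C function. This is a sketch only: the struct and function names are hypothetical, floats are used for readability rather than the fixed-point a Falcon version would need, and the formulas are taken as written in the post (which was typed from memory). In perspective-correct mappers the interpolated quantities are usually u/z, v/z and 1/z, with a final divide to recover the texel coordinates; that step is left out here because the post does not show it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical container for the per-surface gradients named in the
 * post. Each quantity is linear in screen space. */
typedef struct {
    float uoff, ustepx, ustepy;
    float voff, vstepx, vstepy;
    float zoff, zstepx, zstepy;
} SurfaceGradients;

typedef struct {
    float u, v, z;
} TexSample;

/* Evaluate the three linear equations from the post at one pixel.
 * No per-vertex or per-span texture coordinates are needed: the
 * surface's gradients and the pixel position are enough. */
TexSample surface_sample(const SurfaceGradients *g, int screen_x, int screen_y)
{
    TexSample s;
    float x = (float)screen_x;
    float y = (float)screen_y;
    assert(g != NULL);
    s.u = g->uoff + x * g->ustepx + y * g->ustepy;
    s.v = g->voff + x * g->vstepx + y * g->vstepy;
    s.z = g->zoff + x * g->zstepx + y * g->zstepy;
    return s;
}
```

In a span loop only the start of the span needs the full evaluation; stepping one pixel to the right just adds `ustepx`, `vstepx` and `zstepx`.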

TBH It's easier to make sense of it by following the source to the drawing routines. The real trick is understanding the importance of the BSP and its 3D hyperplanes being at the core of all the geometry. Faces are just little pieces of that BSP, plane-based world.
d:m:l
Re: Quake 2 on Falcon030
It seems the reason for the slowdown in places is simply because I changed the polygon batch size to 1 for debugging, and forgot to put it back to the default of 128 before recording the vid. So it was going through the whole batching cycle for every face. doh.
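To put a number on that, here is the batch-count arithmetic as a tiny C helper. The function name is made up, and the 600-face scene is only an illustrative figure borrowed from the surface budget mentioned earlier in the thread.

```c
#include <assert.h>

/* Number of batch cycles needed for `faces` faces when each batch
 * holds up to `batch_size` faces (ceiling division). */
unsigned batch_cycles(unsigned faces, unsigned batch_size)
{
    assert(batch_size > 0);
    return (faces + batch_size - 1) / batch_size;
}
```

With a batch size of 1, a 600-face scene goes through the batching cycle 600 times; with the default of 128 it takes 5.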
d:m:l
Re: Quake 2 on Falcon030
Between sneezes I managed to get a quick profile from base1.bsp in the hotspots and have a decent picture of where time is spent on the DSP. It's reasonably close to what I thought but not exactly. Some interesting findings as usual.
(Eero - the dsp profiler is such a nice tool to have)
- sorting time is a significant portion of total time (sorting occurs within R_AETGenerateSpans, R_AETStepActive, R_AETInsert_r1)
- surface sorting (R_AETGenerateSpans) seems to be a bigger deal than edge sorting, which I didn't really expect. might be worth changing the representation of the surface stack.
- vertices don't take long to process compared with time transmitting them, but still one of the bigger DSP costs internally
- it takes about 3 VBLs worst case to do scan conversion on DSP currently
- face clipping takes a bit less time than I expected compared with the rest
Used cycles:
44.31% command_base (idle)
15.28% R_AETGenerateSpans
10.31% dbg_recover_surface
5.75% R_XFormProjectVertices
4.78% R_AETStepActive
4.43% R_Edge2DAddToViewport_
4.23% R_AETInsert_r1
3.40% R_SubmitFaceGeometry
2.02% R_SpanEmit
1.88% R_IndexedWoundEdge3DAddToFrustum
0.81% R_LinkGlobalEdge
0.60% R_SpanEmit_r6_y0
0.54% R_SpanEmit_r7_y0
0.28% R_AETRemoveInactive
0.26% R_AETAddPending
0.20% R_AETRemove_r1
0.18% R_Line2DIntersectY
0.13% R_Edge2DAddToViewport
0.13% R_Edge2DCacheCull
0.12% R_ClippedEdge2DAddToViewport
0.11% R_VerticalEdge2DAddToViewport
0.09% R_ScanConvertGET
0.07% R_EdgeIntersectZ
0.06% R_RecoverSurfaces
0.02% R_BeginFrame
0.00% R_BeginScan
0.00% R_BeginGeometry
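A quick way to read the table above is to add up the three routines named as containing sorting and re-express them as a share of busy (non-idle) DSP time. The helpers below are just that arithmetic, with hypothetical names; since those routines do more than sort, the result is an upper bound on sorting cost rather than a measurement of it.

```c
#include <assert.h>
#include <stddef.h>

/* Sum a list of percentages. */
double sum_percent(const double *p, size_t n)
{
    double total = 0.0;
    size_t i;
    assert(p != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += p[i];
    return total;
}

/* Re-express a share of all cycles as a share of busy cycles,
 * given the idle percentage. */
double share_of_busy(double share, double idle)
{
    assert(idle >= 0.0 && idle < 100.0);
    return share * 100.0 / (100.0 - idle);
}
```

R_AETGenerateSpans, R_AETStepActive and R_AETInsert_r1 come to about 24.3% of all cycles. With 44.31% idle, that is roughly 43.6% of the time the DSP is actually working.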
d:m:l
Re: Quake 2 on Falcon030
Bless you!
Re: Quake 2 on Falcon030
mfro wrote: Bless you!
hehe

Here's an updated vid with the debug stuff removed this time and filling added:
https://www.youtube.com/watch?v=t7OFlxi ... M&index=12
d:m:l
-
- Atari Super Hero
- Posts: 895
- Joined: Thu Sep 11, 2003 10:49 pm
- Location: UK
Re: Quake 2 on Falcon030
Top stuff Doug!
Re: Quake 2 on Falcon030
just awesome DML
Mega ST 1 / 7800 / Portfolio / Lynx II / Jaguar / TT030 / Mega STe / 800 XL / 1040 STe / Falcon030 / 65 XE / 520 STm / SM124 / SC1435
SDrive / PAK68/3 / Lynx Multi Card / LDW Super 2000 / XCA12 / SkunkBoard / CosmosEx / SatanDisk / UltraSatan / USB Floppy Drive Emulator / Eiffel / SIO2PC / Crazy Dots / PAM Net / AT Speed C16
Hatari / Steem SSE / Aranym / Saint
http://260ste.appspot.com/
Re: Quake 2 on Falcon030
Simply stunning. Awesome progress. Thanks Doug.


Re: Quake 2 on Falcon030
Very impressive!
ST / STFM / STE / Mega STE / Falcon / TT030 / Portfolio / 2600 / 7800 / Jaguar / 600xl / 130xe
-
- Captain Atari
- Posts: 400
- Joined: Sat Jul 25, 2009 3:35 pm
Re: Quake 2 on Falcon030
WOW
Re: Quake 2 on Falcon030
Doug, please report for the nearest cloning station 
Btw. I really liked pastel colors in the video. Can you explain how you assigned colors?

Atari: FireBee, Falcon030 + CT60e + SuperVidel + SvEthlana, TT, 520ST + 4MB ST RAM + 8MB TT RAM + CosmosEx + SC1435, 1040STFM + UltraSatan + SM124, 1040STE 4MB ST RAM + 8MB TT RAM + CosmosEx + NetUSBee + SM144 + SC1224, 65XE + U1MB + VBXE + SIDE2, Jaguar, Lynx II, 2 x Portfolio (HPC-006)
Adam Klobukowski [adamklobukowski@gmail.com]
Re: Quake 2 on Falcon030
Thanks for the comments everyone once again. I'm working on a correctly-coloured/lit (but not textured) version for the next video.

AdamK wrote: Btw. I really liked pastel colors in the video. Can you explain how you assigned colors?
In fact it's a bit sad. Here it is:

Code: Select all
; get original face index
move.w (a5,d6.l*2),d7
; mix it up a bit
mulu.w #4913,d7
; constrain 565 colour to muddy pastels, otherwise i got a lot of seriously bright pink which hurts my eyes
and.w #%0011100111100111,d7
add.w #%0011100111100111,d7
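For readers who don't speak 68k, here is a C rendering of the same three steps. It is a sketch based only on the listing above: the function and macro names are made up, and the face index is taken as a plain parameter rather than fetched from the table the assembly reads through a5.

```c
#include <assert.h>
#include <stdint.h>

#define PASTEL_MASK 0x39E7u  /* %0011100111100111 from the listing */

/* Scramble the face index, keep only the low bits of each RGB565
 * channel, then add the same constant as a brightness offset. */
uint16_t face_colour(uint16_t face_index)
{
    /* mulu.w gives a 32-bit product; and.w/add.w then use the low word */
    uint16_t c = (uint16_t)((uint32_t)face_index * 4913u);
    c &= PASTEL_MASK;
    c = (uint16_t)(c + PASTEL_MASK);
    /* red ends up in 7..14 out of 31, i.e. never dark, never saturated */
    assert((c >> 11) >= 7 && (c >> 11) <= 14);
    return c;
}
```

After the mask, red and blue are at most 7 and green at most 15, so adding the same constant cannot carry from one channel into the next. Each channel lands in a narrow mid-range band, which is where the muddy pastel look comes from.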
d:m:l
Re: Quake 2 on Falcon030
Please excuse my ignorance, but why are you doing and.w twice?
Atari: FireBee, Falcon030 + CT60e + SuperVidel + SvEthlana, TT, 520ST + 4MB ST RAM + 8MB TT RAM + CosmosEx + SC1435, 1040STFM + UltraSatan + SM124, 1040STE 4MB ST RAM + 8MB TT RAM + CosmosEx + NetUSBee + SM144 + SC1224, 65XE + U1MB + VBXE + SIDE2, Jaguar, Lynx II, 2 x Portfolio (HPC-006)
Adam Klobukowski [adamklobukowski@gmail.com]
Re: Quake 2 on Falcon030
AdamK wrote: Please excuse my ignorance, but why are you doing and.w twice?
I had to look twice there before replying - my eyes don't seem to be working properly.

It's masking off MSBs to limit the range and adding an offset to brighten it.
d:m:l