Quake 2 on Falcon030

Postby dml » Sat Aug 23, 2014 9:02 am

I'm working to a budget now within DSP memory and it's a very tight squeeze. Testing the first map forced me to adjust the sizes of various buffers a few times to get stuff to fit.

Fortunately it's possible to shrink the size of geometry batches (vertices and 3d face edges) to free up some memory for global 2d edges and draw surfaces.

The budget is currently 1000 vertices per batch, 500 edges per batch, 2000 global edges and 600 global surfaces. This consumes all of the DSP ram at least briefly. Some ram is freed up again before drawing since batching has stopped by then - but using smaller batches means less gets freed up at the end.

The main limiting factor is global edges + draw surfaces since those accumulate and persist until the scene is drawn. There isn't any sensible way to deliver these in batches since all need to be present at once for the hidden surface scanning to work. Even drawing the scene in tiles will not help here. So they just need to be packed as small as possible and if the limit gets hit then no more get sent and some holes appear in distant scenery.
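A minimal sketch of the drop-on-overflow behaviour described above, written in C rather than DSP assembly. Only the 2000-edge limit comes from the post; the `Edge2D` fields and the names `GlobalEdgePool`, `pool_begin_frame` and `pool_push` are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_GLOBAL_EDGES 2000  /* budget quoted in the post */

/* Hypothetical 2D edge record; the real DSP layout is not given. */
typedef struct {
    int x_start;   /* x at the first scanline */
    int x_step;    /* x increment per scanline */
    int y_top;     /* first scanline covered */
    int y_bottom;  /* last scanline covered */
} Edge2D;

typedef struct {
    Edge2D edges[MAX_GLOBAL_EDGES];
    size_t count;    /* edges accepted this frame */
    size_t dropped;  /* edges rejected because the pool was full */
} GlobalEdgePool;

/* Reset at the start of a frame: edges persist only until the scene is drawn. */
static void pool_begin_frame(GlobalEdgePool *pool)
{
    assert(pool != NULL);
    pool->count = 0;
    pool->dropped = 0;
}

/* Returns false once the budget is exhausted; the caller simply carries on,
   which is what leaves holes in distant scenery. */
static bool pool_push(GlobalEdgePool *pool, Edge2D edge)
{
    assert(pool != NULL);
    if (pool->count >= MAX_GLOBAL_EDGES) {
        pool->dropped++;
        return false;
    }
    pool->edges[pool->count++] = edge;
    return true;
}
```

Counting the dropped edges is not something the post mentions; it is included here only because it makes an overflow easy to spot while testing a map.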

I think though the Falcon will be too slow to render scenes which overflow in this way so it's not a big deal. Scanning 600 faces is already a lot of work for it to do even without textures.

Vertices are currently taking 8 words each (tx,ty,tz,pad,px,py,pz,pad) distributed over x:y memory, so each takes 4 words of address space. This will get packed down to (tx,ty,px,py,z,pad) at the very least, which is a 25% saving. A couple of other things can perhaps be packed, e.g. spans from 2 words to 1, but at higher processing cost. Won't do that unless it is needed.
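The vertex arithmetic can be checked with a few lines of C. This is only a worked calculation using the figures from the posts (1000 vertices per batch, 8 words now, 6 words after packing); the function names are invented for the example.

```c
#include <assert.h>

enum {
    VERTS_PER_BATCH    = 1000, /* budget quoted earlier in the thread */
    WORDS_PER_VERT_NOW = 8,    /* tx,ty,tz,pad,px,py,pz,pad */
    WORDS_PER_VERT_NEW = 6     /* tx,ty,px,py,z,pad */
};

/* Total DSP words used by one vertex batch. */
static long batch_words(int verts, int words_per_vert)
{
    assert(verts >= 0 && words_per_vert > 0);
    return (long)verts * words_per_vert;
}

/* With the data split evenly across X and Y memory, N words occupy
   N/2 addresses in each bank. */
static long batch_addresses(int verts, int words_per_vert)
{
    assert(words_per_vert % 2 == 0);
    return batch_words(verts, words_per_vert) / 2;
}

/* Saving in whole percent when going from old_words to new_words. */
static int saving_percent(int old_words, int new_words)
{
    assert(old_words > 0 && new_words <= old_words);
    return (old_words - new_words) * 100 / old_words;
}
```

With these numbers a full batch drops from 8000 words to 6000, i.e. from 4000 to 3000 addresses per bank.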

So in general it is a tight fit but it does fit for the kinds of maps that seem to be manageable, and the type of damage for bigger maps is limited to distant details and probably offset by the huge cost of drawing such big scenes.

Will post an update when I get the scan convertor to work.

Postby dml » Sun Aug 24, 2014 1:44 pm

Now after fixing a couple of bugs the scan convertor seems to be working - in the sense that it ticks over from frame to frame and builds surfaces and spans, and best of all, doesn't crash. I don't know what the data looks like yet and if it is correct. So that will be next. Seems like a step forward though.

Postby dml » Mon Aug 25, 2014 5:53 am

I think the hidden surface removal process is now beginning to work on the DSP. It has taken quite a lot of debugging and hassle but probably the worst part is over for the single most complicated part of the project. The screenshots show it's starting to do the right thing.

There are strange glitches and the scene flickers on/off at some angles so there's still work to do.

None of this new code is optimized. I'm trying to match the C code 1:1, so it is currently quite slow. It's also running debug checks on the edge and surface buffers on every scanline to detect faults, which takes a huge amount of processing time.

Postby FedePede04 » Mon Aug 25, 2014 7:57 am

Hi Doug
you are a true wizard :D
and i love to follow the process of your work.

Postby dml » Mon Aug 25, 2014 9:55 am

FedePede04 wrote:Hi Doug
you are a true wizard :D
and i love to follow the process of your work.


Thanks Peter! It's been fun hacking at it so far. Might start getting more interesting soon, after a bunch more fixes. It's always a lot easier to work on something that isn't broken :)

Postby dml » Mon Aug 25, 2014 11:39 pm

Finally got it working. It's still pretty slow but now that the bugs are out I can work on the planned optimizations (of which there are many - hopefully the speed will improve enough before they run out!).

https://www.youtube.com/watch?v=ZPQVd2t ... e=youtu.be

The most important optimization will be removing the linear list scan for inserting edges into the active edge table. Q2 reduced this a bit by pre-sorting the pending lists on each scanline so it inserts a sorted list into a sorted list. But I don't even want to be doing that if it can be avoided, so I have a couple of other methods to try... I haven't profiled it yet but I expect much of the time is lost in list maintenance.
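To make the difference concrete, here is a rough C sketch of the two approaches mentioned: a linear scan per inserted edge, and a merge of a pre-sorted pending list into the sorted active list. It is not id's code and not the DSP routine; the `Edge` struct and both function names are assumptions, and a real edge record carries much more state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical active-edge node, sorted by current x. */
typedef struct Edge {
    int x;              /* current x on this scanline (sort key) */
    struct Edge *next;
} Edge;

/* Naive insert: walk from the head for every new edge.
   Returns the (possibly new) head of the list. */
static Edge *aet_insert_linear(Edge *head, Edge *e)
{
    assert(e != NULL);
    Edge **link = &head;
    while (*link != NULL && (*link)->x <= e->x)
        link = &(*link)->next;
    e->next = *link;
    *link = e;
    return head;
}

/* Merge: both lists are already sorted by x, so the walk through the
   active list resumes where the previous insert stopped instead of
   restarting from the head. */
static Edge *aet_merge_sorted(Edge *active, Edge *pending)
{
    Edge *head = active;
    Edge **link = &head;
    while (pending != NULL) {
        Edge *e = pending;
        pending = pending->next;
        while (*link != NULL && (*link)->x <= e->x)
            link = &(*link)->next;
        e->next = *link;
        *link = e;
        link = &e->next;
    }
    return head;
}
```

The merge touches each active edge at most once per scanline, whereas repeated linear inserts can rescan the front of the list for every pending edge. Both are still list walks, which is presumably why other structures are being considered.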

The DSP code is also not using internal fast ram or decent addressing modes, so it's about 3x slower than it needs to be because of that alone.

Postby Scarlettkitten » Tue Aug 26, 2014 12:27 am

Just wow, impressive 8)

Postby calimero » Tue Aug 26, 2014 6:52 pm

Crazy poo!
Now I can really imagine Tron like game! :D

What are the differences in the engine between Quake 1 and Quake 2?

If the Quake 2 player runs this fast on a 16MHz 030, how many more CPU cycles would drawing textures need (e.g. a second 16MHz 030, or a 32MHz 030)? Or would texturing cost well over 100% more than what you have made so far?

VBL counter: is it how many VBLs are needed to draw the screen?

Postby dml » Wed Aug 27, 2014 8:52 am

calimero wrote:Crazy poo!
Now I can really imagine Tron like game! :D
What are the differences in the engine between Quake 1 and Quake 2?


For software rasterization, not big changes - they use a similar set of algorithms. There were some improvements though and especially to the map and lightmaps, and with nice transparency using fat lookups. The maps are also more complex. The target machines were bigger and faster (I had something like a 450MHz PIII when it was released, with an early NVidia TNT videocard)

Under 3D acceleration Q2 uses colour lightmaps instead of mono under software. I was hoping to make use of the colour lightmaps on Falcon too if the project got far enough - it should look interesting. At the very least it's possible to use the colour lightmaps for flat shading tones.

Q2 also had significant architectural improvements over Q1, which don't matter much for this project. :)

calimero wrote:If the Quake 2 player runs this fast on a 16MHz 030, how many more CPU cycles would drawing textures need (e.g. a second 16MHz 030, or a 32MHz 030)? Or would texturing cost well over 100% more than what you have made so far?


Almost all of the compute time so far is geared towards preventing polygon spans smaller than 1 pixel as early as possible and preventing any overdraw. So providing these stages can be optimized well enough, the chances of doing something interesting with those polygon spans is fair-to-good. The 68030 should be able to fill the number of pixels needed since it is a constant number at all times - the varying cost comes from hidden surface removal and final span count (setup time per span).

I have prototyped a filling technique that should work in principle on the Falcon and be cheap enough for a chunky display at least - but I won't know the total cost until it is tried for real.

calimero wrote:VBL counter: is it how many VBLs are needed to draw the screen?


Yes, kind of - it's the number of VBLs needed for a frame-compute, including drawing. In this case though drawing takes a tiny amount of time since it is just some dots (yellow for span left edges, blue for right edges - or vice versa, I don't remember which way round I did it).

However before attempting any kind of filling, I'm going to work on the existing stages to try to make them faster. It's still a bit too slow for my liking, and I think filling needs more room.

Postby dml » Wed Aug 27, 2014 9:34 am

So to summarize progress on this thing - it has reached a kind of milestone where some of the evil bits have been prototyped and it still looks viable to me on 16MHz base platform, albeit needing still more effort.

game engine (simplified, 1-player):

seems to work ok, although depends on FPU currently

map size (geometry):

seems viable for at least some maps. Tests have been done using maps with 10,000 faces and 20,000 edges and while there are certainly slow spots, the map size has a relatively small impact on speed, compared with the varying cost of 'nearby stuff'. at about 500k for map storage it should be ok.

map size (textures):

unknown - could become a problem later. but I intend to drop the top mipmap, which will cut the size of the surface cache by at least half. if that goes well it might buy enough to use colour lightmaps later.

scene size (geometry):

has been a bit problematic, but I think it is mostly solved by batching. I can now also see a way to cut the cost of vertex and edge storage significantly (by changing the algorithms which use them) which should free up enough DSP ram for the bad cases too.

BSP tree:

this is a bit problematic. it is quite well optimized and still costs too much. i'll revisit this and try to use DSP to help with the frustum culling, which probably is most of the cost.

geometry (transform/projection):

the DSP easily digests the vertex processing cost, it's hardly registering as a portion of total time. it's also easy to optimize it much more so it's not going to be an issue. it costs more to transmit the vertices than it does to process them, and the transmit size can probably still be cut in half if needed.

geometry (face clipping):

seems to be ok, but could do with speeding up. plenty of room to do so, and some improvements can be made to the original algorithms, e.g. to cache all clipped edges which are shared by polygons - which saves DSP ram and avoids divides.
currently there is poor overlap between CPU and DSP which needs some thought. changing the way this part works could be beneficial. in principle the CPU-side reindexing of vertices and edges could be overlapped with DSP face clipping.

hidden surface removal:

a really expensive process for our old Atari, but DSP is coping ok. needs a rewrite and some changes to the algorithm to suit the machine better. considering shift-sentinels or a BTree for the active edge table, to flatten edge insertion cost. the DSP structures need to be reorganized to avoid using the terrible (r0+n0) addressing mode, which costs 2x as much as the others and is currently used absolutely everywhere in my code. the code also needs a lot of work to start running routines from internal ram, which will speed up access to main ram.

filling (flat colour):

don't see any problems with this. that's mainly why I haven't bothered to do it yet. better spend time on the bad things first.

filling (textures):

successful texture addressing prototype on PC using integer arithmetic only, not yet tried on Falcon. i do know it will work with 23-bit fixedpoint though since I was careful to work with that in the prototype.

still some problems with texture plane setup - needs some planning, and figuring out which chip should do it. It can be done by CPU, FPU or DSP but each has its pros and cons...

not yet clear if Q1/Q2 share texture planes across adjacent faces (I think it would make sense to do so) - if not then it could also be used to cut setup time on F30 since setup can be done only the first time a new texture plane is visited on each frame.
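The "setup only the first time a plane is visited on each frame" idea from the last item can be sketched with a frame stamp. This is a guess at one way to do it, not the author's design: the `TexPlane` struct, its fields, and `texplane_needs_setup` are all hypothetical names, and the actual gradient calculation is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-plane texture gradients, named after the u/v step
   terms used elsewhere in the thread. */
typedef struct {
    float uoff, ustepx, ustepy;
    float voff, vstepx, vstepy;
    unsigned setup_frame;  /* frame number of the last setup, 0 = never */
} TexPlane;

/* Returns true the first time a plane is seen in a given frame and
   stamps it, so later faces on the same plane skip the setup. */
static bool texplane_needs_setup(TexPlane *plane, unsigned frame)
{
    assert(plane != NULL);
    assert(frame != 0);  /* 0 is reserved for "never set up" */
    if (plane->setup_frame == frame)
        return false;
    plane->setup_frame = frame;
    return true;
}
```

A caller would wrap the expensive gradient setup in `if (texplane_needs_setup(plane, frame)) { ... }`. The same stamping trick would fit the shared clipped-edge cache mentioned under face clipping, though that is also an assumption.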


Those are just the main notes - there are lots of smaller details. Clearly though it is going to get faster with a bit more work, and we can try out a filled version.

I have cold/flu this week so probably won't be doing a lot - but will post if anything changes...

Postby Zamuel_a » Wed Aug 27, 2014 9:51 am

How are the coordinates for each face handled in Q1 & Q2? The routines I used in the past for perspective correct mapping always needed mapping coordinates besides the normal face vertices or else they would stretch out the texture to fit the corners, just like an affine mapper, so if there was anything other than just square faces, then it needed extra mapping coordinates for the texture. But I think there is some way to do it without this?

Postby dml » Wed Aug 27, 2014 10:08 am

Zamuel_a wrote:How are the coordinates for each face handled in Q1 & Q2? The routines I used in the past for perspective correct mapping always needed mapping coordinates besides the normal face vertices or else they would stretch out the texture to fit the corners, just like an affine mapper, so if there was anything other than just square faces, then it needed extra mapping coordinates for the texture. But I think there is some way to do it without this?


Yes, the Q engines use a clever technique which builds on the 'quasi-raytracing' methods present in Wolf3D and Doom. Although they don't explicitly trace rays, they intercept lines in texture space (Doom used a tangent table to achieve this effect without actually casting rays).

The Q engines treat polygon faces as bounded areas of a plane equation in 3D space. So the texture for a surface is mapped to the plane equation of the surface - and not to vertices. If you move the vertices of a face around, the texture will not move - it is part of the wall 'plane' in 3D space, not the face itself.

This in turn means you can query the intercept coordinates for any pixel on the screen, given the equation of the surface known to be under that pixel.

This in turn means you can find the u,v for any pixel on screen using a simple equation.

u = uoff + (screen_x * ustepx) + (screen_y * ustepy);
v = voff + (screen_x * vstepx) + (screen_y * vstepy);
z = zoff + (screen_x * zstepx) + (screen_y * zstepy);

[apologies for mistakes, typing from memory here]

Even on a chip with a fast multiplier, this might seem expensive until you realize that it removes the need to store or process texture coordinates completely, within polygons or within spans. This is a bit unhelpful if you are drawing a simple cube, but is very helpful when you have a crap-ton of faces to draw. :) All you need to do is run that equation at the start of each span, and you can advance along the span using +ustepx,+vstepx.
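A small C sketch of the span stepping described above, using the same term names as the equations in the post. It uses floats for readability (the prototype mentioned earlier is integer-only), and it leaves out the z term and any perspective divide, so treat it as an illustration of "evaluate once per span, then add per pixel" and nothing more. `TexGradients`, `tex_at` and `tex_span` are names invented here.

```c
#include <assert.h>
#include <stddef.h>

/* Per-surface gradients, named after the terms in the post. */
typedef struct {
    float uoff, ustepx, ustepy;
    float voff, vstepx, vstepy;
} TexGradients;

/* Evaluate the plane equation directly at one screen position. */
static void tex_at(const TexGradients *g, int sx, int sy, float *u, float *v)
{
    assert(g != NULL && u != NULL && v != NULL);
    *u = g->uoff + (float)sx * g->ustepx + (float)sy * g->ustepy;
    *v = g->voff + (float)sx * g->vstepx + (float)sy * g->vstepy;
}

/* Walk a horizontal span: one full evaluation at the left edge, then
   one add per axis per pixel. Writes count (u,v) pairs to the outputs. */
static void tex_span(const TexGradients *g, int x0, int y, int count,
                     float *out_u, float *out_v)
{
    assert(out_u != NULL && out_v != NULL && count >= 0);
    float u, v;
    tex_at(g, x0, y, &u, &v);
    for (int i = 0; i < count; i++) {
        out_u[i] = u;
        out_v[i] = v;
        u += g->ustepx;
        v += g->vstepx;
    }
}
```

Note that nothing in `tex_span` depends on the polygon's vertices: the gradients belong to the surface plane, so every span on that surface reuses them.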

TBH It's easier to make sense of it by following the source to the drawing routines. The real trick is understanding the importance of the BSP and its 3D hyperplanes being at the core of all the geometry. Faces are just little pieces of that BSP, plane-based world.

Postby dml » Wed Aug 27, 2014 11:08 am

It seems the reason for the slowdown in places is simply because I changed the polygon batch size to 1 for debugging, and forgot to put it back to the default of 128 before recording the vid. So it was going through the whole batching cycle for every face. doh.
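For a sense of scale, the number of batch cycles is just the face count divided by the batch size, rounded up. The helper below is a hypothetical illustration; the face count used in the note after it is an arbitrary example, not a measurement from the thread.

```c
#include <assert.h>

/* Number of batch cycles needed to submit `faces` faces when each
   batch holds at most `batch_size` faces (rounds up). */
static unsigned batch_cycles(unsigned faces, unsigned batch_size)
{
    assert(batch_size > 0);
    return (faces + batch_size - 1) / batch_size;
}
```

For example, 600 faces need 600 cycles at a batch size of 1 but only 5 at the default of 128, so any fixed per-batch overhead is paid 120 times more often.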

Postby dml » Wed Aug 27, 2014 2:13 pm

Between sneezes I managed to get a quick profile from base1.bsp in the hotspots and have a decent picture of where time is spent on the DSP. It's reasonably close to what I thought but not exactly. Some interesting findings as usual.

(Eero - the dsp profiler is such a nice tool to have ;) )

- sorting time is a significant portion of total time (sorting occurs within R_AETGenerateSpans, R_AETStepActive, R_AETInsert_r1)
- surface sorting (R_AETGenerateSpans) seems to be a bigger deal than edge sorting, which I didn't really expect. might be worth changing the representation of the surface stack.
- vertices don't take long to process compared with time transmitting them, but still one of the bigger DSP costs internally
- it takes about 3 VBLs worst case to do scan conversion on DSP currently
- face clipping takes a bit less time than I expected compared with the rest

Used cycles:
44.31% command_base (idle)
15.28% R_AETGenerateSpans
10.31% dbg_recover_surface
5.75% R_XFormProjectVertices
4.78% R_AETStepActive
4.43% R_Edge2DAddToViewport_
4.23% R_AETInsert_r1
3.40% R_SubmitFaceGeometry
2.02% R_SpanEmit
1.88% R_IndexedWoundEdge3DAddToFrustum
0.81% R_LinkGlobalEdge
0.60% R_SpanEmit_r6_y0
0.54% R_SpanEmit_r7_y0
0.28% R_AETRemoveInactive
0.26% R_AETAddPending
0.20% R_AETRemove_r1
0.18% R_Line2DIntersectY
0.13% R_Edge2DAddToViewport
0.13% R_Edge2DCacheCull
0.12% R_ClippedEdge2DAddToViewport
0.11% R_VerticalEdge2DAddToViewport
0.09% R_ScanConvertGET
0.07% R_EdgeIntersectZ
0.06% R_RecoverSurfaces
0.02% R_BeginFrame
0.00% R_BeginScan
0.00% R_BeginGeometry
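Since 44.31% of the cycles are idle, it can help to rescale the other rows to a share of busy time. The helper below is a hypothetical convenience, not part of the profiler; it only rearranges the percentages listed above.

```c
#include <assert.h>

/* Convert a share of total cycles into a share of non-idle cycles.
   Both inputs and the result are percentages. */
static double busy_share(double pct_of_total, double idle_pct)
{
    assert(idle_pct >= 0.0 && idle_pct < 100.0);
    assert(pct_of_total >= 0.0 && pct_of_total <= 100.0 - idle_pct);
    return pct_of_total * 100.0 / (100.0 - idle_pct);
}
```

On that basis R_AETGenerateSpans is roughly 27% of busy DSP time, the three sorting routines together (15.28 + 4.78 + 4.23) come to roughly 44%, and the debug-only dbg_recover_surface accounts for about 18.5%.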

Postby mfro » Wed Aug 27, 2014 2:15 pm

Bless you!

Postby dml » Wed Aug 27, 2014 5:21 pm

mfro wrote:Bless you!


hehe :)

Here's an updated vid with the debug stuff removed this time and filling added:

https://www.youtube.com/watch?v=t7OFlxi ... M&index=12

Postby EvilFranky » Wed Aug 27, 2014 5:34 pm

Top stuff Doug!

Postby Cyprian » Wed Aug 27, 2014 5:49 pm

just awesome DML

Postby Anima » Wed Aug 27, 2014 7:04 pm

Simply stunning. Awesome progress. Thanks Doug.

:cheers:

Postby Zamuel_a » Wed Aug 27, 2014 10:19 pm

Very impressive!

Postby kristjanga » Wed Aug 27, 2014 11:48 pm

WOW

Postby AdamK » Thu Aug 28, 2014 5:15 am

Doug, please report for the nearest cloning station ;)

Btw. I really liked pastel colors in the video. Can you explain how you assigned colors?

Postby dml » Thu Aug 28, 2014 10:24 am

Thanks for the comments everyone once again. I'm working on a correctly-coloured/lit (but not textured) version for the next video.

AdamK wrote:Btw. I really liked pastel colors in the video. Can you explain how you assigned colors?


In fact it's a bit sad. Here it is: :)


; get original face index
   move.w      (a5,d6.l*2),d7
; mix it up a bit
   mulu.w      #4913,d7
; constrain 565 colour to muddy pastels, otherwise i got a lot of seriously bright pink which hurts my eyes
   and.w      #%0011100111100111,d7
   add.w      #%0011100111100111,d7


Postby AdamK » Thu Aug 28, 2014 11:33 am

Please excuse my ignorance, but why are you doing and.w twice?

Postby dml » Thu Aug 28, 2014 12:06 pm

AdamK wrote:Please excuse my ignorance, but why are you doing and.w twice?


I had to look twice there before replying - my eyes don't seem to be working properly :) but in fact it is aNd, aDd.

It's masking off MSBs to limit the range and adding an offset to brighten it.
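For anyone who finds the assembly hard to read, here is the same mask-then-add trick as a C sketch. The constants are taken from the 68030 snippet (the binary mask equals 0x39E7); the function name `face_colour_565` and the `PASTEL_MASK` macro are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PASTEL_MASK 0x39E7u  /* %0011100111100111: R=00111 G=001111 B=00111 */

/* Map a face index to an RGB565 colour, mirroring the 68030 snippet:
   multiply to scramble, mask off each channel's top bits, then add
   the same constant as a brightness offset. */
static uint16_t face_colour_565(uint16_t face_index)
{
    uint32_t mixed = (uint32_t)face_index * 4913u;     /* mulu.w #4913,d7 */
    uint16_t low   = (uint16_t)(mixed & PASTEL_MASK);  /* and.w (low word) */
    uint16_t out   = (uint16_t)(low + PASTEL_MASK);    /* add.w */

    /* Red was masked to 0..7 and then offset by 7, so it cannot carry
       out of its 5-bit field. The same holds for green and blue. */
    assert(((out >> 11) & 0x1Fu) >= 7u && ((out >> 11) & 0x1Fu) <= 14u);
    return out;
}
```

After the mask each channel is limited to its low bits (0..7 for red and blue, 0..15 for green), and adding the same constant moves those ranges up to 7..14 and 15..30. Every colour therefore lands in the middle of the brightness range, which is where the muddy pastel look comes from.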

