Quake 2 on Falcon030


dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

calimero wrote: I am not sure how you manage to get the same speed as mono lighting but these screenshots are more than amazing!!
Trickery! Always and only cheating :) But in a good, realtime way.
dhedberg
Atari God
Posts: 1388
Joined: Mon Aug 30, 2010 8:36 am

Re: Quake 2 on Falcon030

Post by dhedberg »

Jaw-dropping! I think you're onto something big here! :)
Daniel, New Beat - http://newbeat.atari.org.
Like demos? Have a look at our new Falcon030 demo It's that time of the year again, or click here to feel the JOY.
Atari030
Atari Super Hero
Posts: 784
Joined: Mon Feb 27, 2012 6:14 am
Location: Melbourne, Australia

Re: Quake 2 on Falcon030

Post by Atari030 »

Holy bloody moley, that looks stunning.
TheNameOfTheGame
Fuji Shaped Bastard
Posts: 2612
Joined: Mon Jul 23, 2012 8:57 pm
Location: Almost Heaven, West Virginia

Re: Quake 2 on Falcon030

Post by TheNameOfTheGame »

It's incredible what Doug is coaxing out of this machine. Looks stunning! I never would have believed it possible if I hadn't seen it here. :cheers:
DarkLord
Ultimate Atarian
Posts: 5790
Joined: Mon Aug 16, 2004 12:06 pm
Location: Prestonsburg, KY - USA

Re: Quake 2 on Falcon030

Post by DarkLord »

dml wrote:
calimero wrote: I am not sure how you manage to get the same speed as mono lighting but these screenshots are more than amazing!!
Trickery! Always and only cheating :) But in a good, realtime way.
Anytime something comes up that someone doesn't understand in our
D&D sessions - someone always points out that it's just "magic". I think
Doug's work falls under that category. :)
Welcome To DarkForce! http://www.darkforce.org "The Fuji Lives.!"
Atari SW/HW based BBS - Telnet:darkforce-bbs.dyndns.org 1040
calimero
Fuji Shaped Bastard
Posts: 2639
Joined: Thu Sep 15, 2005 10:01 am
Location: Serbia

Re: Quake 2 on Falcon030

Post by calimero »

dml wrote:
calimero wrote: I am not sure how you manage to get the same speed as mono lighting but these screenshots are more than amazing!!
Trickery! Always and only cheating :) But in a good, realtime way.
Witchcrafting! :D
using Atari since 1986. http://wet.atari.org http://milan.kovac.cc/atari/software/ ・ Atari Falcon030/CT63/SV ・ Atari STe ・ Atari Mega4/MegaFile30/SM124 ・ Amiga 1200/PPC ・ Amiga 500 ・ C64 ・ ZX Spectrum ・ RPi ・ MagiC! ・ MiNT 1.18 ・ OS X
AnthonyJ
Captain Atari
Posts: 165
Joined: Sat Jan 26, 2013 8:16 am

Re: Quake 2 on Falcon030

Post by AnthonyJ »

dml wrote:These screenies were taken from the Falcon build using colour lighting. This demonstrates the most significant difference between Q1 and Q2 map formats, since Q1 maps only encoded mono lighting data.
Looks great. It's amazing what a difference the coloured lighting makes - really recognisable as Q2 now. Or maybe that's just because I spent too long playing it with the OpenGL rendering :)
dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

AnthonyJ wrote: Looks great. It's amazing what a difference the coloured lighting makes - really recognisable as Q2 now. Or maybe that's just because I spent too long playing it with the OpenGL rendering :)
:cheers:

Since posting that I think I have a way now to combine coloured dynamic lights with the static coloured lightmaps in realtime.

I don't think filtering of coloured lights is an option on the CPU, however. It can be done but it's too expensive. The two options available are A) soften the hard edges within the lightmap source data or B) perform all lightmap+texture combining using a bit of dedicated DSP code (upload the lightmap palette once, and 4 corner samples per surface tile, then exchange each texture pixel for a filtered pixel in surface cache format).
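
As a rough illustration of option B, here is a minimal C sketch of the filtering step for one colour channel: a bilinear blend of the 4 corner samples across a tile. The 16-texel tile size and the function name are assumptions for the example, not taken from the engine, and the real thing would live in DSP code rather than C.

```c
#include <assert.h>

enum { TILE = 16 };   /* assumed tile size: one light sample per 16 texels */

/* Bilinear blend of four corner light levels (top-left, top-right,
   bottom-left, bottom-right) at texel (x, y) inside the tile.
   Integer only; the result is in the same units as the corners. */
static int light_at(int tl, int tr, int bl, int br, int x, int y)
{
    int top, bottom;
    assert(x >= 0 && x < TILE && y >= 0 && y < TILE);
    top    = tl * (TILE - x) + tr * x;   /* horizontal blend, scaled by TILE */
    bottom = bl * (TILE - x) + br * x;
    /* vertical blend, then remove both TILE scale factors */
    return (top * (TILE - y) + bottom * y) / (TILE * TILE);
}
```

For coloured lighting the same blend would run once per channel, and the filtered light value then selects the lit pixel that replaces the plain texture pixel.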

Actually there are other difficulties, mainly to do with understanding all the different kinds of lights in the IdTech2 engine, and how they work. There are several different paths in the code for handling lights (and potentially 4 different styles of lightmap being composited!), for static, point and dynamic lights. So I'm starting with the static part and the surface cache itself and working back through the other areas. I think some of them aren't really necessary anyway until some game AI takes control over some of them e.g. flashing/flickering lights in the map.
dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

The most annoying problem I have run into so far has nothing to do with code - it is the fact that textures are separate files and assume long filenames. This doesn't work too well on the Falcon under TOS and results in truncated names, causing the wrong texture to be loaded and silently messing up the map.

This also means some maps are currently loading which possibly shouldn't, due to the extra ram which would be needed to load all the textures uniquely, without name-aliasing.

(While the minimum spec for Q2 was indeed 16MB of ram - not far from the Falcon's 14MB - it obscures an important hidden detail - that on a Windows platform this refers to *physical* memory - there's an abundance of virtual memory behind that, which we don't have under TOS)

I'll therefore need to batch-rename the textures to some sort of hash for TOS, since the Windows 8.3 scheme is useless. (It is context-sensitive and just adds a digit for each non-unique name. This makes it impossible to uniquely identify files in an absolute manner. The correct solution is probably a hash with low collision probability - or alternatively a translation table. I prefer a hash because it's less stuff to move around.)
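
A minimal sketch of the hash idea in C, assuming 32-bit FNV-1a over the lower-cased texture path and a fixed .WAL extension (the stock Q2 wall texture format). The function names are made up for the example; any hash with a good spread would do, and 32 bits fits neatly into the 8 characters TOS allows.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>

/* 32-bit FNV-1a over the lower-cased name, so the result does not
   depend on the case used by the map or the host filesystem.
   The choice of FNV-1a is an assumption for this sketch. */
static uint32_t tex_name_hash(const char *name)
{
    uint32_t h = 2166136261u;                /* FNV offset basis */
    assert(name != NULL);
    for (; *name; ++name) {
        h ^= (uint32_t)tolower((unsigned char)*name);
        h *= 16777619u;                      /* FNV prime */
    }
    return h;
}

/* Writes an 8.3 name (8 hex digits + ".WAL") into out, 13 bytes with NUL. */
static void tex_name_to_83(const char *name, char out[13])
{
    snprintf(out, 13, "%08lX.WAL", (unsigned long)tex_name_hash(name));
}
```

The point is that the short name depends only on the long name itself, not on which other files happen to be in the directory, which is the property the Windows short-name scheme lacks. Collisions are still possible in principle, so a one-off check over the whole texture set at conversion time would be sensible.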
Eero Tamminen
Fuji Shaped Bastard
Posts: 3999
Joined: Sun Jul 31, 2011 1:11 pm

Re: Quake 2 on Falcon030

Post by Eero Tamminen »

dml wrote:(While the minimum spec for Q2 was indeed 16MB of ram - not far from the Falcon's 14MB - it obscures an important hidden detail - that on a Windows platform this refers to *physical* memory - there's an abundance of virtual memory behind that, which we don't have under TOS)
With the Mercurial version of Hatari you can use TT-ram with Falcon emulation & TOS v4. I.e. memory optimizations could be left for later.
dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Eero Tamminen wrote: With the Mercurial version of Hatari you can use TT-ram with Falcon emulation & TOS v4. I.e. memory optimizations could be left for later.
Hi!

Well I'm having some trouble with the new Hatari, so have not been experimenting with TT ram yet. Some weird things going on with emulation - crashes and flakiness which do not occur in previous versions or on two types of real CPU. I don't quite understand what's going on yet because it doesn't appear focused in one place. Still looking into it. :-z
dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Got a bit more time to play today, and did a few things:

- dynamic lights working - but slow. some problems with the idea that depend more on the fast PC bus than I first thought. entire walls need rebuilt in the cache each frame if a single pixel is lit, not just the affected pixels. hmm. easiest hacks are to de-rez the wall while it is being lit, and/or skip lighting updates on odd frames.

- profiled the code in Hatari and took note of all the cache misses. nearly all of them happening in the face drawing code, because it's flipping between 3 different (large) jobs.

- split one of those jobs (face setup math) into a separate pre-pass. still doesn't quite fit in the cache but is measurable at about 13% of total time. can probably save 10% of total time by doing this stage on the DSP.

- should be able to split the surface cache from face drawing and do this in a pre-pass also, allowing the surface drawing to fit in the cache properly and speed it up.


Used cycles:
49.07% _R_RecoverSurfacesTex_DSP56k <- to get speeded up, by killing cache misses
13.16% _R_RecoverSurfaceIDs_DSP56k <- to mostly disappear
7.58% _R_ReindexFaceVertices_HeadCPU
6.12% _R_RecursiveWorldNode_BodyCPUDSP
4.89% _R_XFormProjectVertices_BodyCPUD
4.03% imat_invmulvec_Vd456_Ma0_Rd123 <- to disappear (68030 matrix math)
2.52% _R_SubmitFaceGeometry_BodyCPUDSP
2.38% _R_ProcessBSPVisitQueue_HeadCPU

Instruction cache misses:
39.21% _R_RecoverSurfacesTex_DSP56k <- to get reduced a lot
20.34% _R_RecoverSurfaceIDs_DSP56k <- to mostly disappear
10.97% imat_invmulvec_Vd456_Ma0_Rd123 <- to disappear (68030 matrix math)
4.87% _ClipBrushes
4.81% ivec_dotsum_Va0_Nd4567_sumd7 <- to disappear (68030 vector math)
1.34% _R_ReindexFaceVertices_HeadCPU
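
The two hacks from the first item (de-rez the wall while it is lit, skip lighting updates on odd frames) could look roughly like this in C. All names are hypothetical, and the limit of 4 mip levels assumes standard Quake-style textures.

```c
#include <assert.h>

/* Hypothetical per-surface state; none of these names come from the engine. */
typedef struct {
    int base_mip;   /* mip level the surface would normally use (0 = finest) */
    int dlit;       /* nonzero while a dynamic light touches the surface */
} surf_t;

enum { MAX_MIP = 3 };   /* assumed: 4 mip levels, 0..3 */

/* De-rez: a lit surface is rebuilt one mip coarser, which cuts the
   number of texels to rebuild to roughly a quarter. */
static int surf_build_mip(const surf_t *s)
{
    int mip;
    assert(s->base_mip >= 0 && s->base_mip <= MAX_MIP);
    mip = s->base_mip + (s->dlit ? 1 : 0);
    return mip > MAX_MIP ? MAX_MIP : mip;
}

/* Frame skipping: lit surfaces are only rebuilt on even frames;
   unlit surfaces never need a dynamic rebuild. */
static int surf_needs_rebuild(const surf_t *s, unsigned frame)
{
    return s->dlit && (frame & 1u) == 0;
}
```

Both tricks trade lighting quality for cache rebuild time, and they stack: a lit wall rebuilt at a coarser mip on every second frame costs a fraction of the full-resolution, every-frame rebuild.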
dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Minor update:

- project is in a decent position to begin optimizing the last areas properly, but shouldn't begin until precision issues are sorted out as it will just become 10x harder to fix those after optimization

- precision issues cause sparkles/glitches in textures, more serious problems on distant textures (e.g. skybox) and faulty clipping of lines on distant geometry (skybox, again)

- texture sparkles are caused by precision loss in texture math, which has around 5 separate stages. with each stage being different from the method used in PC games, the error just builds up until it looks ugly. the most fundamental difference is the change from floating point division to fixedpoint quadratic solution, but the other stages were all converted to fixedpoint and each of those introduces additional errors. it's therefore necessary to isolate each stage to debug the sources of error.

- the quadratic bit is hardwired in now on the Falcon and no point in using the PC version for precision debugging with so many necessary differences now in the other stages - so i'll have to ignore that and work on the other areas, and whatever error is left in the end must be coming from the quadratic part.

- the other parts i can build in a configurable way either float or fixed, so each one can be debugged. greatly complicates the code but it can be trimmed away later.

- each fixedpoint stage must maximize the utilization of fixedpoint bits at each step, especially when stored in memory structures or transferred to DSP, since those transfers effectively truncate the available range. while maximizing use of those bits, they must also not get clipped at the upper limit - this has already caused one texture bug (in one map!) which needed diagnosed. finding the ranges at all these stages is quite laborious. it's certainly ok as it is, but needs to be made better before more optimization is done.

- once the fixedpoint ranges have been optimized for storage and transfer, the use of those terms in calculation also needs to be improved so bits are preserved as far as possible. this is also laborious.

- skybox probably needs a more general solution, such as adjusting the camera near/far planes into a different relative range


There is one final problem with texturing precision - even if I make it 'perfect' there will still be a problem, because I noticed in the original Q2 sources they clamp the texture range on each span so it doesn't wander off the texture due to precision issues. And that's with floating point. Maybe it doesn't happen often but it's important to know about it (would not be fun to chase a fault that was always there on the PC!). Anyway I'm going to have to think about that one and try to solve it some other way - e.g. by manipulating the texture plane to be more conservative before it is used.
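
For reference, the per-span clamp described above amounts to something like this in 16.16 fixed point. This is a sketch only: the type and function names are hypothetical and not copied from the Q2 sources.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fix16_t;   /* 16.16 fixed point, assumed for this sketch */

/* Keep a texture coordinate inside [0, extent] so that accumulated
   precision error at a span end cannot read outside the texture. */
static fix16_t clamp_texcoord(fix16_t v, fix16_t extent)
{
    assert(extent >= 0);
    if (v < 0)
        return 0;
    if (v > extent)
        return extent;
    return v;
}
```

Doing this per span costs a couple of compares at each end; making the texture plane slightly conservative up front would move that cost to once per face instead.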
Eero Tamminen
Fuji Shaped Bastard
Posts: 3999
Joined: Sun Jul 31, 2011 1:11 pm

Re: Quake 2 on Falcon030

Post by Eero Tamminen »

dml wrote: - hmm. easiest hacks are to de-rez the wall while it is being lit, and/or skip lighting updates on odd frames.
Like current PC games do for shadows? :-)
dml wrote:- each fixedpoint stage must maximize the utilization of fixedpoint bits at each step, especially when stored in memory structures or transferred to DSP, since those transfers effectively truncate the available range. while maximizing use of those bits, they must also not get clipped at the upper limit - this has already caused one texture bug (in one map!) which needed diagnosed. finding the ranges at all these stages is quite laborious. it's certainly ok as it is, but needs to be made better before more optimization is done.
Couldn't range finding be automated? E.g. have your code track the ranges for different values/passes and output them at suitable intervals. Then just play few potentially problematic levels.

If you can plug in rest of the Quake code, you could just record some demos and let playback automatically find you the ranges (similarly how BM scripting automatically finds and profiles slowest frames with Hatari :)).
dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Eero Tamminen wrote: Like current PC games do for shadows? :-)
I suppose so - it's an old trick but it does work better if your fps is high to begin with :) maybe not so great here.
Eero Tamminen wrote: Couldn't range finding be automated? E.g. have your code track the ranges for different values/passes and output them at suitable intervals. Then just play few potentially problematic levels.

If you can plug in rest of the Quake code, you could just record some demos and let playback automatically find you the ranges (similarly how BM scripting automatically finds and profiles slowest frames with Hatari :)).
This is probably a better way to do it via Hatari. I did automate it partly by having the code track abs max bounds in one structure and I checked it with a few levels. Still there are several other places it needs done and it's dull work :)
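
A minimal sketch of that kind of tracking in C, assuming one tracker per value/stage and signed storage. The struct and function names are invented for the example. `range_headroom` reports how many spare bits remain before the largest value seen would clip in a word of the given width (e.g. 16 for a 68030 word, 24 for a DSP word).

```c
#include <assert.h>
#include <stdint.h>

/* One tracker per fixed-point term being studied (hypothetical layout). */
typedef struct {
    const char *label;     /* name to print alongside the result */
    int32_t     max_abs;   /* largest magnitude seen so far */
} range_track_t;

/* Record one observed value. */
static void range_note(range_track_t *r, int32_t v)
{
    int32_t a;
    assert(v != INT32_MIN);          /* its magnitude is not representable */
    a = v < 0 ? -v : v;
    if (a > r->max_abs)
        r->max_abs = a;
}

/* How many times the tracked maximum could double and still fit in a
   signed word of word_bits bits. Returns 0 if nothing was recorded. */
static int range_headroom(const range_track_t *r, int word_bits)
{
    int64_t limit, v;
    int spare = 0;
    assert(word_bits >= 2 && word_bits <= 32);
    limit = ((int64_t)1 << (word_bits - 1)) - 1;
    v = r->max_abs;
    while (v > 0 && v * 2 <= limit) {
        v *= 2;
        ++spare;
    }
    return spare;
}
```

A non-zero headroom means the term could be shifted left by that many bits before storage or transfer to gain precision without clipping, at least for the levels that were played while tracking.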
dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Yesterday evening made some changes which split the drawing into 3 phases now. Originally they were all done in one piece of code.

- per-face texture plane setup
- per-face surface cache updates
- per-face drawing

None of these stages yet fit in the 68030 cache but two of them are getting close (face setup under 500 bytes, drawing under 350 bytes). One of them - surface cache - still involves a bunch of C code and not really close to fitting yet. Will need separate work.

With some effort though two stages should fit fully inside the CPU, and the surface cache should at least have the outerloop in the cache, and intermittent updates with the inner loops inside the cache. With additional splitting (separate event discovery from event execution, use separate event list for each dedicated mip size/routine) it might be possible to get everything inside the CPU.

Once all of that is done, we'll be a lot closer to knowing how well the Falcon can cope with this kind of engine :) But there is still quite a lot to do between now and that endpoint.
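
The split can be pictured as replacing one loop that does three jobs per face with three loops that each do one job, so each loop body has a better chance of staying inside the 68030's 256-byte instruction cache. A toy C sketch, with hypothetical names and trivial stand-ins for the real work:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-face record; the flags stand in for real results. */
typedef struct {
    int setup_done;   /* texture plane setup finished */
    int cached;       /* surface cache entry is up to date */
    int drawn;        /* face has been drawn */
} face_t;

static void face_setup(face_t *f) { f->setup_done = 1; }
static void face_cache(face_t *f) { assert(f->setup_done); f->cached = 1; }
static void face_draw(face_t *f)  { assert(f->cached); f->drawn = 1; }

/* Three passes over the visible face list instead of one pass that
   alternates between three large routines. */
static void draw_faces_split(face_t *faces, size_t n)
{
    size_t i;
    for (i = 0; i < n; ++i) face_setup(&faces[i]);
    for (i = 0; i < n; ++i) face_cache(&faces[i]);
    for (i = 0; i < n; ++i) face_draw(&faces[i]);
}
```

The cost of this arrangement is that each pass has to leave its results in memory for the next one, so the per-face record grows; the bet is that this is cheaper than refilling the instruction cache three times per face.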
Mindthreat
Captain Atari
Posts: 279
Joined: Tue Dec 16, 2014 4:39 am

Re: Quake 2 on Falcon030

Post by Mindthreat »

dml wrote:Yesterday evening made some changes which split the drawing into 3 phases now. Originally they were all done in one piece of code.

- per-face texture plane setup
- per-face surface cache updates
- per-face drawing

None of these stages yet fit in the 68030 cache but two of them are getting close (face setup under 500 bytes, drawing under 350 bytes). One of them - surface cache - still involves a bunch of C code and not really close to fitting yet. Will need separate work.

With some effort though two stages should fit fully inside the CPU, and the surface cache should at least have the outerloop in the cache, and intermittent updates with the inner loops inside the cache. With additional splitting (separate event discovery from event execution, use separate event list for each dedicated mip size/routine) it might be possible to get everything inside the CPU.

Once all of that is done, we'll be a lot closer to knowing how well the Falcon can cope with this kind of engine :) But there is still quite a lot to do between now and that endpoint.
Awesome, can't wait to see the results - keep up the great work :D
Atari-related YouTube Videos Here: - https://www.youtube.com/channel/UCh7vFY ... VqA/videos
Atari ramblings on Twitter Here: https://twitter.com/mindthreat
dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Mindthreat wrote: Awesome, can't wait to see the results - keep up the great work :D
:)

The last three texture matrix*vector transforms, two plane distance calcs, four projection muls and a single projection divide have all been moved now to the DSP, so there's no math left for the 030 to do per face. And it still works as before. It's all a big mess now of different compile-time paths for doing basically the same thing but it should be possible now to optimize the face drawing properly, after a cleanup.
dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

More changes last night, although only got half an hour or so of coding so limited progress.

- got 2 of the 3 per-face DSP processes working as iterators using a generic iterator mode. the 3rd is the actual drawing step and will try to get that using the same approach. saves DSP code space and limits opportunity for bugs in otherwise similar code.

- removed some unnecessary stuff that might have been slowing things down also

Took a few screengrabs using the current version to show FPS at a decent portion of the default 320x screen area (albeit, while standing still - no real surfcache activity) and that texturing accuracy has survived, maybe even a bit better than before. Getting quite close to BadMooD for simpler maps.

Some i.f16 fixedpoint stuff became i.f24 just because it is DSP, which improved accuracy. Still needs more fixes though to get closer to the floating point version - glitches amplify with distance. Very busy today so won't waste time cropping/resizing the grabs.
(Attached screengrabs: grab0135.png, grab0136.png, grab0137.png, grab0138.png, grab0139.png, grab0140.png)
I think it should be mostly downhill from here - at least for optimizing the scenery drawing. All the hard bits are solved - they just need to be carefully streamlined. Sky is still needing work, not had time to look at it yet.
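
To put a number on the i.f16 versus i.f24 difference, here is a small C sketch that steps a texture coordinate along a span using a quantized per-pixel delta and reports how far it drifts from the exact value. It reads i.f16 / i.f24 as 16 and 24 fractional bits, which is an assumption about the notation; the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Round x to a fixed-point value with frac_bits fractional bits. */
static int64_t to_fixed(double x, int frac_bits)
{
    double scale;
    assert(frac_bits >= 0 && frac_bits < 32);
    scale = (double)((int64_t)1 << frac_bits);
    return (int64_t)(x * scale + (x >= 0.0 ? 0.5 : -0.5));
}

/* Convert a fixed-point value back to a double. */
static double from_fixed(int64_t v, int frac_bits)
{
    return (double)v / (double)((int64_t)1 << frac_bits);
}

/* Absolute drift after adding the quantized delta 'steps' times,
   compared with the exact product delta * steps. */
static double step_error(double delta, int steps, int frac_bits)
{
    int64_t step = to_fixed(delta, frac_bits);
    int64_t acc = 0;
    double err;
    int i;
    for (i = 0; i < steps; ++i)
        acc += step;
    err = from_fixed(acc, frac_bits) - delta * (double)steps;
    return err < 0.0 ? -err : err;
}
```

With a delta of 1/3 over 320 steps, the drift is about 0.0016 with 16 fractional bits and about 0.000006 with 24. That is one stage in isolation; with several stages chained, and larger multipliers at distance, the 16-bit error is what shows up as sparkles.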
calimero
Fuji Shaped Bastard
Posts: 2639
Joined: Thu Sep 15, 2005 10:01 am
Location: Serbia

Re: Quake 2 on Falcon030

Post by calimero »

These "stills" look really beautiful and amazing!
It is really mindblowing that the F030 can process Quake maps and display so many different textures with astonishing accuracy. A zillion times better than any demo so far.

I also notice that you have two new YT videos in the last few days:
https://www.youtube.com/watch?v=QCvx2O5M69E
https://www.youtube.com/watch?v=vPsY4P8bnVw

Is there any topic on atari-forum regarding these videos?
Where can I download them?
using Atari since 1986. http://wet.atari.org http://milan.kovac.cc/atari/software/ ・ Atari Falcon030/CT63/SV ・ Atari STe ・ Atari Mega4/MegaFile30/SM124 ・ Amiga 1200/PPC ・ Amiga 500 ・ C64 ・ ZX Spectrum ・ RPi ・ MagiC! ・ MiNT 1.18 ・ OS X
GokMasE
Captain Atari
Posts: 323
Joined: Sun Mar 02, 2003 11:16 pm
Location: Sweden

Re: Quake 2 on Falcon030

Post by GokMasE »

Ah, interesting find!

I followed the ST-Ray thread quite closely and seeing the YT video now was a real treat - very nice indeed :)
Considering the 64 colour palette used, the FPS from this 8MHz machine is pretty impressive.

Awesome work Doug!


The original thread is located here: http://www.atari-forum.com/viewtopic.ph ... &start=350

Seems like the news about this video never made it into there - yet ;)


Regards,

/Joakim
dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Thanks!

Yes, I didn't post anywhere about the new vids because the projects are quite old now - it was a sort of cleanout before I forgot. I actually did more ST stuff than Falcon overall but none of it really got finished into any kind of proper demo etc... only some test binaries linked off various AF threads. There are still some other ST projects kicking around which need similarly cleaned out :)

(There is a thread for the PCS6 vid somewhere here too and that same test binary is linked towards the end).
AnthonyJ
Captain Atari
Posts: 165
Joined: Sat Jan 26, 2013 8:16 am

Re: Quake 2 on Falcon030

Post by AnthonyJ »

dml wrote:(There is a thread for the PCS6 vid somewhere here too and that same test binary is linked towards the end).
That would be this one: http://www.atari-forum.com/viewtopic.php?f=16&t=25798
dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

This weekend I found a number of old backup CDs containing project stuff & tools going back to '99-ish.

I also found CDs & licenses for the Nichimen Graphics 3D modelling tools we were using on some of those projects (NWorld, Nendo, Mirai). That company died and the products disappeared from the market but they were v. good at the time. Written in LISP and fully hackable/extensible via the LISP engine & console. I remember you could record everything you did as a script and play it back, and the model would be rebuilt by the script in realtime. Good for tutorials! They were among the earliest tools using Subdivision Surfaces and edgeloop modelling (instead of NURBS - which were bad) - tech derived from R&D done by Symbolics Corp some years before...

I have a feeling that stuff won't work on Win7 though - pretty much no chance since it was probably made for NT-era machines - but I can maybe get a VM set up to see if the old project data can be loaded up.

Also found some Maya plugins I did for game model optimization & portal generation, some old POVRay projects and other goodies which will take time to get through.

Not sure what any of that has to do with this project, except that having familiar tools & datapaths is always a good basis for trying things!
dml
Fuji Shaped Bastard
Posts: 3991
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

So I got the per-polygon-span math down to about 20 instructions (40 cycles) plus an additional 10 or so preamble for the pixel part. There's another 10 or so to account for span position/size info being sent back to the CPU. So about 40 ops per span excluding syncs. Should be possible to get it nearer 30 then it hits some limits.

The mapper consumes about 18 ops per pixel excluding the sync, so the per-span cost is equivalent to extending each span by around 2-3 pixels in real terms (count = n+3). Optimized, maybe n+2. That's not too bad.

The per-span overhead imposes a limit for usable polygon count because it adds up rapidly when the spans are very short, but it's more than good enough for scenes/objects made to suit the Falcon.

This overhead is not so present in the flat-filled version because it happens while the CPU is drawing pixels. In the textured version the overhead is bigger (more math to do) and the DSP is busy during pixel drawing so it can only be hidden behind transfers for span start/count etc. That trick only goes so far but it's just about good enough in this case.
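
The span arithmetic above can be written down as a tiny cost model. This C sketch just plugs in the rough figures quoted in this post (about 40 ops per span, about 18 ops per pixel, syncs excluded); the names are made up and the numbers are estimates, not measurements.

```c
#include <assert.h>

/* Rough figures from the post: per-span setup and per-pixel mapper cost. */
enum { SPAN_OPS = 40, PIXEL_OPS = 18 };

/* Total DSP ops for one span of the given length. */
static long span_cost_ops(int pixels)
{
    assert(pixels >= 0);
    return SPAN_OPS + (long)PIXEL_OPS * pixels;
}

/* Share of the span's cost that is fixed overhead, in percent (rounded down). */
static int span_overhead_percent(int pixels)
{
    return (int)(100L * SPAN_OPS / span_cost_ops(pixels));
}
```

40 / 18 is a little over 2, which is where the "n+3 pixels, maybe n+2 optimized" figure comes from. For a 16-pixel span the overhead is about 12% of the total; for a 2-pixel span it is over half, which is why lots of tiny polygons hurt.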
