Quake 2 on Falcon030


dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

I have replaced the 68k BSP traversal algorithm with a CPU:DSP hybrid with a local stack on both sides - one stack (CPU) manages node and sidecode recursion, and the other stack (DSP) manages clipflags and sidecodes. The clipflags are now a bit harder to obtain on the CPU side but I don't think anything needs them there so it should be ok. This has created a new bug which causes faces to disappear when the camera moves, because the camera is stale on the DSP side when the BSP part is done - it lags behind by one frame. This will be fixed.
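For readers unfamiliar with stack-based traversal, here is a minimal single-CPU sketch of a front-to-back BSP walk with an explicit local stack instead of recursion. It is not the Falcon code: the CPU/DSP split is not shown, and the `bsp_node_t` layout, the leaf encoding and the function name are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node layout (not the Q2 on-disk format). */
typedef struct {
    float normal[3];  /* splitting plane normal */
    float dist;       /* plane distance from the origin */
    int   child[2];   /* >= 0: node index, < 0: leaf encoded as -(leaf + 1) */
} bsp_node_t;

#define BSP_STACK_MAX 64

/* Walk the tree front to back as seen from 'eye', writing the visited
   leaf indices to 'out'. Returns the number of leaves written. */
int bsp_walk_front_to_back(const bsp_node_t *nodes, int root,
                           const float eye[3], int *out, int max_out)
{
    int stack[BSP_STACK_MAX];
    int sp = 0, count = 0;

    stack[sp++] = root;
    while (sp > 0) {
        int n = stack[--sp];
        if (n < 0) {                       /* leaf reached */
            if (count < max_out)
                out[count++] = -(n + 1);
            continue;
        }
        const bsp_node_t *node = &nodes[n];
        float d = node->normal[0] * eye[0] + node->normal[1] * eye[1]
                + node->normal[2] * eye[2] - node->dist;
        int side = (d < 0.0f);             /* sidecode: 0 = eye in front, 1 = behind */

        assert(sp + 2 <= BSP_STACK_MAX);
        stack[sp++] = node->child[side ^ 1]; /* far side pushed first, popped last */
        stack[sp++] = node->child[side];     /* near side is visited next */
    }
    return count;
}
```

With two nodes splitting on x = 0 and x = 10 and the eye at x = 5, the leaf containing the eye comes out first, followed by the leaves further away.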

While it is not quite finished, it looks like it is working properly and shows something like a 25-40% speedup in the slowest areas, and a 10-20% speedup in other areas. Definitely worthwhile. I'm still looking for ways to optimize it further on both sides.

I'm also figuring out how to draw the ingame objects during BSP traversal but this will need more experiments and new code in the renderer first.

I think I should start by just drawing them on top of the map without sorting, to make sure they draw correctly at all, and then deal with sorting separately. Drawing on top of the map also means drawing more objects than should technically be visible, because nothing gets occluded. I'm not yet sure how much occlusion will come for free since this engine was not designed to occlude via BSP (unlike Doom).

I also notice that I haven't yet implemented anything for AreaPortals, so closed doors still try to draw map geometry on the other side, where the Q2 engine does not. I don't think implementing that will offer a big gain for most maps because the PVS works well enough provided the joining corridors are not straight, but I suppose it was added for a good reason. I haven't figured out how this works yet but it looks simple enough.
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

So the engine is currently trying to execute the following services:

* attempt to synchronize server state with clients (null network)
* server iterate game entities...
  - perform player control
  - perform AI tick (door open/close behaviour, pickup spinning etc.)
  - perform player & AI movement clipping / collision detection
  - perform draw event (null operation)

* locate the player in the map (bsp search)
* unpack PVS changes when player location changes
* generate PVS for current frame
* traverse BSP to implement...
  - viewcone culling
  - generate clipflags to accelerate edges (not yet used - all edges currently clipped)
  - surface backface removal
  - scene sorting
* draw skybox (null operation)
* transform, clip all face edges
* rasterize scenery
  - scan-convert face edges into draw surfaces
  - fill surfaces

* draw player overlays (null operation)
* draw player model (null operation)

The order of events for drawing non-scenery items is incorrect for the Falcon, and these steps are currently 'null operation'. Once I have figured out how to integrate them with the drawing pass, the order will change.

There is a lot of detail missing here e.g. surface cache management, weapon firing etc. some of which is present but disabled or invisible.
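As an illustration of the "unpack PVS" step, here is a minimal sketch of expanding one visibility row, assuming the usual Quake-family run-length scheme as I understand it (a non-zero byte is copied as-is, a zero byte is followed by a count of zero bytes). The function names are made up for the example, and the incremental "unpack only on location change" logic is not shown.

```c
#include <assert.h>
#include <string.h>

/* Expand one run-length-compressed visibility row into 'out', which must
   hold 'row_bytes' bytes. Returns the number of input bytes consumed. */
size_t pvs_decompress(const unsigned char *in, unsigned char *out,
                      size_t row_bytes)
{
    const unsigned char *start = in;
    size_t written = 0;

    assert(in != NULL && out != NULL);
    while (written < row_bytes) {
        if (*in) {                        /* literal byte: 8 visibility bits */
            out[written++] = *in++;
            continue;
        }
        size_t run = in[1];               /* zero byte, then the run length */
        in += 2;
        if (run > row_bytes - written)    /* never write past the row */
            run = row_bytes - written;
        memset(out + written, 0, run);
        written += run;
    }
    return (size_t)(in - start);
}

/* One bit per cluster: non-zero if 'cluster' is potentially visible. */
int pvs_cluster_visible(const unsigned char *row, int cluster)
{
    return (row[cluster >> 3] >> (cluster & 7)) & 1;
}
```

Since most clusters cannot see most others, rows are largely zero bytes, which is why this simple scheme compresses well.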

So there is multiplayer capability within reach. There is no single-player AI component. The only way to obtain that is to plug the Falcon code into the Q2 game project as a software rasterizer. This is possible but quite distant from the current short term goals (drawing the world state as effectively as possible at 16MHz).

Working with the full Q2 game project in Hatari is difficult because it does not provide TT ram, and 14MB has been problematic for booting up unmodified Q2 as a startpoint for anything.
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

A few random notes on next steps:

- Running out of DSP memory for a single program - adding more code means smaller buffers. Will need to look at using overlays to load pieces of code on demand and make more efficient use of fast P: memory < $200.
- There are still plenty of optimization opportunities across the code but still trying to focus on raising the speed of the slowest areas only.
- Collision detection is becoming a bottleneck. It gets worse when the scenery nearby is more complex.
- Undecided on whether to start on speeding up collision, drawing objects or on texturing next. Maybe some very basic work on objects then texturing in that order, leaving collision code until later.
- Code is getting unwieldy, untidy - needs some refactoring, esp. boundary between Atari/PC parts, and also between DSP & non-DSP versions of same stuff.
calimero
Fuji Shaped Bastard
Posts: 2639
Joined: Thu Sep 15, 2005 10:01 am
Location: Serbia

Re: Quake 2 on Falcon030

Post by calimero »

dml wrote:- Undecided on whether to start on speeding up collision, drawing objects or on texturing next. Maybe some very basic work on objects then texturing in that order, leaving collision code until later.
I am personally most curious how much texture you will manage to put on screen on 16MHz... 8)
dml wrote:- Code is getting unwieldy, untidy - needs some refactoring, esp. boundary between Atari/PC parts, and also between DSP & non-DSP versions of same stuff.
do you plan separate 060 version? :) :angel:
using Atari since 1986. http://wet.atari.org http://milan.kovac.cc/atari/software/ ・ Atari Falcon030/CT63/SV ・ Atari STe ・ Atari Mega4/MegaFile30/SM124 ・ Amiga 1200/PPC ・ Amiga 500 ・ C64 ・ ZX Spectrum ・ RPi ・ MagiC! ・ MiNT 1.18 ・ OS X
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

calimero wrote:I am personally most curious how much texture you will manage to put on screen on 16MHz... 8)
Exactly 160x120 pixels worth :)

In fact I'm also curious. 90% of this so far has been guesswork followed by experiments. I'll stop when it stops me.
calimero wrote: do you plan separate 060 version? :) :angel:
I'm sure that the 060 issue (whatever it is) won't be complicated. It's just one of those things that isn't so easy to find from just looking at the source. It's actually easier to fix things like that blind with a bit more information (exact machine spec and TOS or MiNT for example!).

I expect it is either related to changing video mode with SV installed, or some other early initialization thing. If it is not related to SV then it will probably do the same thing on my 040, and I'll find it sooner (since it is nearer than my 060, which is stored for house-moving early next year!)

I'm not sure how well it will suit 060 since it is aimed at a base machine and variants on the base machine, relying a lot on DSP, but if I can get some extras in there - like full colour lightmaps - it might be more interesting to watch on 060.

When I finally get 060+SV set up I might spend time on a version of the engine only for that spec...
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

calimero wrote: I am personally most curious how much texture you will manage to put on screen on 16MHz... 8)
I estimated that the per-pixel cost for Quake should be (hopefully) no worse than the per-pixel cost of the liquid shader in BadMooD. The difference is that BadMooD doesn't fill more than a portion of the image with that shader at a time - unless you look straight down into the liquid. :)

Since the number of pixels needing filled is fixed, that sets a cap on the cost.

I also changed the way Q2 works by waiting until spans are generated and linked to surfaces, before deciding which surfaces need texture calculations. This limits texture calculations to surfaces with visible pixels only (compared with calculating for all input surfaces, which is a much bigger number).
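A rough sketch of that deferral, assuming spans are kept in a per-surface linked list: span generation links each span to its surface, and a later pass only does the expensive texture setup for surfaces whose list is non-empty. All types and names here are hypothetical, and `setup_texture` is just a stand-in for the real per-surface work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical span and surface records. */
typedef struct span_s {
    int x, y, width;
    struct span_s *next;
} span_t;

typedef struct {
    span_t *spans;      /* spans that survived edge processing, or NULL */
    int     tex_ready;  /* set once texture setup has been done */
} surf_t;

/* Link a freshly generated span to the surface that owns it. */
void surf_add_span(surf_t *s, span_t *sp)
{
    sp->next = s->spans;
    s->spans = sp;
}

/* Stand-in for the expensive part (texture plane / gradient calculation). */
static void setup_texture(surf_t *s)
{
    s->tex_ready = 1;
}

/* Run texture setup only for surfaces that ended up with visible pixels.
   Returns how many surfaces needed the work. */
int setup_visible_surfaces(surf_t *surfs, int count)
{
    int done = 0;
    for (int i = 0; i < count; i++) {
        if (surfs[i].spans == NULL)
            continue;           /* fully hidden: skip the texture maths */
        setup_texture(&surfs[i]);
        done++;
    }
    return done;
}
```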

There are other costs involved though which could present problems:

- calculating the surface texture plane from the current viewpoint, for each visible surface
- updating the surface cache fast enough to keep up with player movement / view changes

So we'll just have to see how much of a problem these really are, and what the solutions need to look like.
DarkLord
Ultimate Atarian
Posts: 5789
Joined: Mon Aug 16, 2004 12:06 pm
Location: Prestonsburg, KY - USA

Re: Quake 2 on Falcon030

Post by DarkLord »

dml wrote:
When I finally get 060+SV set up I might spend time on a version of the engine only for that spec...
And all the '060/SV users in the crowd will rise up on that day and cheer heartily! :cheers:
Welcome To DarkForce! http://www.darkforce.org "The Fuji Lives.!"
Atari SW/HW based BBS - Telnet:darkforce-bbs.dyndns.org 1040
kristjanga
Captain Atari
Posts: 400
Joined: Sat Jul 25, 2009 3:35 pm

Re: Quake 2 on Falcon030

Post by kristjanga »

Objects then texturing sounds about right :) but hey, I can not code so i just agree with what you say. :)
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Quick update. Last night's changes seem to have made a big difference to speed. Best of all, it's now even faster on a real F030 than in Hatari. I had already seen a good speedup in Hatari but didn't try F030 until tonight.

There are still some slow areas (like the smashed wall & rubble geometry at the start of base1), and some new problems are showing up...

The FPU performance of the collision detection is beginning to show on real hardware (FPU is artificially very fast in Hatari) so the framerate suddenly drops if the player's feet touch complex geometry. Took me a while to figure out what was going on there but it makes sense now. In all other areas the F030 HW now seems to be quicker and maintains a decent framerate in most areas of many maps.

Currently the best way to avoid the FPU slowdown is to avoid maps with lots of detail in the floors (like craters and smashed up tiles etc.). These are mainly single-player maps. Multiplayer maps don't seem to be much affected - mainly because those maps were designed to stop players getting snagged in the geometry during competitions.

The collision detection needs rewritten but I don't have time for it just now. It can at least be improved though given time.


There is one very nice thing about Q2 though which makes this less of a concern for multiplayer mode:

The server doesn't need to run on the same 16MHz machine, and the server does all of the collision detection and game object management. The client only needs to draw stuff quickly and handle player input and network packets.
viking272
Atari Super Hero
Posts: 961
Joined: Mon Oct 13, 2008 12:50 pm
Location: west of London, UK

Re: Quake 2 on Falcon030

Post by viking272 »

dml wrote:The server doesn't need to run on the same 16MHz machine, and the server does all of the collision detection and game object management. The client only needs to draw stuff quickly and handle player input and network packets.
Ker-ching! Music to many! :)
As ever great work Doug.
ctirad
Captain Atari
Posts: 312
Joined: Sun Jul 15, 2012 9:44 pm

Re: Quake 2 on Falcon030

Post by ctirad »

Could the FPU code benefit from an FPU clocked at a higher speed, or is the bottleneck elsewhere?
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

ctirad wrote:Could the FPU code benefit from an FPU clocked at a higher speed, or is the bottleneck elsewhere?
Yes a faster FPU helps for sure. But I'll have a go at optimizing it later anyway. The collision detection still uses the original C code, I haven't been near it yet.
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

New vid. Latest code + several maps, including a few heavy ones:

https://www.youtube.com/watch?v=amr0JBi0xdk
AnthonyJ
Captain Atari
Posts: 165
Joined: Sat Jan 26, 2013 8:16 am

Re: Quake 2 on Falcon030

Post by AnthonyJ »

dml wrote:The server doesn't need to run on the same 16MHz machine, and the server does all of the collision detection and game object management. The client only needs to draw stuff quickly and handle player input and network packets.
It's been a while since I touched Q2, but I'm pretty sure that standard Q2 also runs PMove for prediction of entities too, which I guess runs the collision detection that you're referring to. You could turn off prediction, but my memory is that you didn't really want to do that unless you couldn't avoid it. That was in the modem era though, so perhaps these days you can cope with it a little better.
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Yes that could well be the case - I think the original Q1 had only the camera angle client-local, with all movement server side. And then it was upgraded to movement prediction for QuakeWorld - that was probably inherited by Quake2 but I haven't checked.

In any case it may be that the network speed is now always better than the rendering speed, in which case movement prediction can be sidestepped for the Falcon (maybe).
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Textures...

'first light' on the Falcon, albeit first version very slow and FPU-only.
grab0027.png
However it does mean that:

1) the map and texture data is loading correctly
2) the face attributes are correctly formatted
3) the DSP and CPU agree on the identity of faces
4) the texture math is correct for the surface plane calculations
5) the texture function is correct (prototype in fpu/68k)
6) the surface cache is being correctly filled and lightmaps generated
7) it fits in ram, at least enough for testing
EvilFranky
Atari Super Hero
Posts: 926
Joined: Thu Sep 11, 2003 10:49 pm
Location: UK

Re: Quake 2 on Falcon030

Post by EvilFranky »

Awesome! :)
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

...in colour...
grab0028.png
EvilFranky
Atari Super Hero
Posts: 926
Joined: Thu Sep 11, 2003 10:49 pm
Location: UK

Re: Quake 2 on Falcon030

Post by EvilFranky »

Woohaa!!! :mrgreen: :cheers:
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

EvilFranky wrote:Woohaa!!! :mrgreen: :cheers:
:)

There's a lot of work needed now to make it fast, but having this as a simple startpoint helps a lot. I'll start on the float-free version soon.
Scarlettkitten
Captain Atari
Posts: 262
Joined: Thu Mar 19, 2009 11:42 am
Location: Northamptonshire, UK

Re: Quake 2 on Falcon030

Post by Scarlettkitten »

Well done DML this is looking amazing :cheers:
My musical dribbles 🎶 https://sophie-rose.bandcamp.com
Mega ST4, 520STM.
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Scarlettkitten wrote:Well done DML this is looking amazing :cheers:
:cheers:

It's so slow just now that I couldn't record a video in Hatari - partly because it is slow to start with (about 1fps even with Hatari's fast FPU) and partly because Hatari seems to go into ultra-low-gear while recording videos. Not sure why - the overhead for recording should be pretty fixed, but everything gets 10x slower and 1fps drops to 0.01fps..... I have no explanation for that :) just have to live with it until it gets speeded up...


I started taking a look at the non-FPU texture solution during lunch and finally got all the values used by the inner part to fit in 23 bits for the sake of DSP. It was mostly 23bit already but one term was greedy with 32bits and I needed to fix it before going any further.

It is mostly working well but there are a couple of things bugging me. One is that it stops working completely at some distance from the camera (mostly outside views), definitely caused by z range compression. I'll probably fiddle with it a bit more to see if that can be pushed out further but otherwise I'll probably switch to another mapping scheme or somehow recalibrate the z-range for greater distances. It is at least a tidy cutoff though. Within that range it does work properly and then it just fails past that distance.
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Yep there is a simple mip-like transform that allows the z-range to be expanded arbitrarily, at the cost of precision loss very near the eye (i.e. with your face in a wall), so it will be easy to divide the scene into very near and far content and just change some constants to handle each. The same transform magically cleaned up some of the speckle/texture overrun problems which appeared at distance - which is nice :)
calimero
Fuji Shaped Bastard
Posts: 2639
Joined: Thu Sep 15, 2005 10:01 am
Location: Serbia

Re: Quake 2 on Falcon030

Post by calimero »

so you start at 1 FPS :) ...
now apply some of your magic :)

original C code textures slow everything down by 15x?
dml
Fuji Shaped Bastard
Posts: 3987
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

calimero wrote: original C code textures slow everything down by 15x?
Not quite the original code, but it is very very slow, yes.

In Doom/BadMooD each column or row of surface had its own depth perspective 1/z.

In Quake each *pixel* has its own 1/z (not quite, but effectively for the view) in order to render faces at any angle.

M. Abrash sidestepped it one way, using a parallel divide on the FPU and linear interpolating every 16 pixels, because even the Pentium wasn't fast enough for a divide per pixel. If you stand very near a wall, where it recedes sharply from the eye, and move slowly forwards you can see the ripples where the texture changes velocity every 16 pixels.
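The 16-pixel trick described above can be sketched like this: u/z, v/z and 1/z are stepped linearly across the span, a true divide is done only at every 16th pixel, and the texture coordinates in between are linearly interpolated. This is a plain C illustration of the idea only (no FPU/integer overlap, no fixed point), and the function and parameter names are assumptions rather than anything taken from the Quake source.

```c
#include <assert.h>

#define SUBDIV 16   /* pixels between true perspective divides */

/* Compute texture coordinates for 'count' pixels of one span.
   uoz, voz, ooz are u/z, v/z and 1/z at the first pixel; duoz, dvoz, dooz
   are their per-pixel steps. Note the last run evaluates the gradients one
   pixel past the end of the span, so 1/z must stay positive there. */
void span_uv_subdivided(float uoz, float voz, float ooz,
                        float duoz, float dvoz, float dooz,
                        int count, float *u_out, float *v_out)
{
    assert(count > 0);
    float z  = 1.0f / ooz;               /* exact divide at the span start */
    float u0 = uoz * z, v0 = voz * z;
    int done = 0;

    while (done < count) {
        int n = count - done;
        if (n > SUBDIV)
            n = SUBDIV;

        /* Step to the end of this run and do one divide there. */
        uoz += duoz * n;
        voz += dvoz * n;
        ooz += dooz * n;
        z = 1.0f / ooz;
        float u1 = uoz * z, v1 = voz * z;

        /* Linear interpolation for the pixels in between. */
        float du = (u1 - u0) / n, dv = (v1 - v0) / n;
        for (int i = 0; i < n; i++) {
            u_out[done + i] = u0 + du * i;
            v_out[done + i] = v0 + dv * i;
        }
        u0 = u1;
        v0 = v1;
        done += n;
    }
}
```

The coordinates are exact at each 16-pixel boundary and drift in between, and the slope changes at every boundary, which is where the visible ripples come from.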

I'll need to sidestep it a different way because we're way lacking in cycle power for that kind of thing.


The timeouts on AF now are getting quite extreme.
