Quake 2 on Falcon030


nemodhs
Atari freak
Posts: 51
Joined: Sat Aug 31, 2013 2:29 pm

Re: Quake 2 on Falcon030

Post by nemodhs »

dml wrote:...and a quick capture at 160x80 pixels using the new improved mapper:

https://dl.dropboxusercontent.com/u/129 ... 160x80.avi

Starting to look like realtime yet? :wink:
8O Awesome.
Atari030
Atari Super Hero
Posts: 784
Joined: Mon Feb 27, 2012 6:14 am
Location: Melbourne, Australia

Re: Quake 2 on Falcon030

Post by Atari030 »

That's just incredible, fantastic.
Scarlettkitten
Captain Atari
Posts: 262
Joined: Thu Mar 19, 2009 11:42 am
Location: Northamptonshire, UK

Re: Quake 2 on Falcon030

Post by Scarlettkitten »

Amazing 8)
My musical dribbles 🎶 https://sophie-rose.bandcamp.com
Mega ST4, 520STM.
calimero
Fuji Shaped Bastard
Posts: 2639
Joined: Thu Sep 15, 2005 10:01 am
Location: Serbia

Re: Quake 2 on Falcon030

Post by calimero »

I watched it like 10 times :)
It is crazy FAST!

These pauses are what you mentioned, dml: "The surface cache itself is tremendously slow, so turning a corner or waving the camera around can cause delays of a second". Otherwise it is amazingly fast! Pure joy to watch.
using Atari since 1986. http://wet.atari.org http://milan.kovac.cc/atari/software/ ・ Atari Falcon030/CT63/SV ・ Atari STe ・ Atari Mega4/MegaFile30/SM124 ・ Amiga 1200/PPC ・ Amiga 500 ・ C64 ・ ZX Spectrum ・ RPi ・ MagiC! ・ MiNT 1.18 ・ OS X
Zamuel_a
Atari God
Posts: 1291
Joined: Wed Dec 19, 2007 8:36 pm
Location: Sweden

Re: Quake 2 on Falcon030

Post by Zamuel_a »

This is very impressive!
ST / STFM / STE / Mega STE / Falcon / TT030 / Portfolio / 2600 / 7800 / Jaguar / 600xl / 130xe
DarkLord
Ultimate Atarian
Posts: 5790
Joined: Mon Aug 16, 2004 12:06 pm
Location: Prestonsburg, KY - USA

Re: Quake 2 on Falcon030

Post by DarkLord »

Just googled the recommended requirements (for a PC) to run Quake 2...

Interesting! :)
Quake2 reqs.jpeg
Welcome To DarkForce! http://www.darkforce.org "The Fuji Lives.!"
Atari SW/HW based BBS - Telnet:darkforce-bbs.dyndns.org 1040
Atari030
Atari Super Hero
Posts: 784
Joined: Mon Feb 27, 2012 6:14 am
Location: Melbourne, Australia

Re: Quake 2 on Falcon030

Post by Atari030 »

250 MIPs vs 20 MIPs, DSP inclusive. Fair effort I'd say.
Eero Tamminen
Fuji Shaped Bastard
Posts: 3999
Joined: Sun Jul 31, 2011 1:11 pm

Re: Quake 2 on Falcon030

Post by Eero Tamminen »

Atari030 wrote:250 MIPs vs 20 MIPs, DSP inclusive. Fair effort I'd say.
While I know you're partly joking, I really need to comment on this. :-)

While Douglas' work is incredible, what is running on Falcon differs somewhat from normal Quake, besides it still being just the rendering part.

On PC, Quake is run at higher resolution and its game engine requires higher FPS (at least in Doom the timertick was 35 FPS whereas in BadMood Douglas had to lower it to 12 so that engine would work OK at Atari FPS).

Memory bandwidth differences are also relevant, not just instruction speed. PC has graphics card so screen refresh doesn't eat system memory bandwidth.
Atari030
Atari Super Hero
Posts: 784
Joined: Mon Feb 27, 2012 6:14 am
Location: Melbourne, Australia

Re: Quake 2 on Falcon030

Post by Atari030 »

Mostly joking. :wink:

It's a huge effort, I get what he's doing and I would have considered it near impossible. A medal would be due IMHO. :cheers:
dml
Fuji Shaped Bastard
Posts: 3988
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Thanks for all the comments!

I'm currently trying to get textures working more generally/completely - it so far works for small, less detailed areas with the surface cache pre-filled (e.g. I run around the room a bit to let it see most of the walls first), but it breaks for larger scenes. Partly because there are no mipmaps to limit detail vs scene depth - uses up ram too quickly. And some bugs.

I'll try to fix the bugs first, then the mipmaps and then look at speed again.
Eero Tamminen wrote:
Atari030 wrote:250 MIPs vs 20 MIPs, DSP inclusive. Fair effort I'd say.
On PC, Quake is run at higher resolution and its game engine requires higher FPS (at least in Doom the timertick was 35 FPS whereas in BadMood Douglas had to lower it to 12 so that engine would work OK at Atari FPS).

Memory bandwidth differences are also relevant, not just instruction speed. PC has graphics card so screen refresh doesn't eat system memory bandwidth.
One of the biggest differences is actually in the type of computation involved - Quake 1 and 2 engines are largely floating point problems. They depend not just on 100+ MIPs but also 100+ MFlops.

IIRC the 16MHz 68882 measures only 0.34MFlops (340kFlops!). The first experiment shows this result quite well:

https://www.youtube.com/watch?v=J7KCzRt ... 5nMm10m0UM


Part of what makes decent performance possible on a Falcon is replacing all of the floating point computation with something that approximates it cheaply without causing grievous damage to all of the math in the process. More so with texturing than the rest, but the float stuff is widespread throughout the engine - basically assumed everywhere as the minimum spec. Floating point goes down close to the pixel level - it's used for sorting polygon spans by depth and computing perspective correction every 16th pixel along a polygon line.
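To illustrate the kind of replacement described above, here is a minimal C sketch of perspective correction done in 16.16 fixed point with one divide per 16-pixel subspan and linear interpolation in between. It is not the code from this project: the type `fx16`, the helper `fx_div`, and the function `span_u_coords` are hypothetical names chosen for the example, and a real implementation would pick its precision much more carefully.

```c
#include <stdint.h>

/* Hypothetical 16.16 fixed-point type and helper (illustrative names only). */
typedef int32_t fx16;
#define FX_ONE 65536

static fx16 fx_div(fx16 a, fx16 b)
{
    /* (a / b) in 16.16; widen first so the scaled numerator cannot overflow */
    return (fx16)(((int64_t)a * FX_ONE) / b);
}

/*
 * Write `count` texture u-coordinates (16.16) for one horizontal span.
 * uoz = u/z and ooz = 1/z are linear in screen space, so they are stepped
 * with plain adds. The true u = (u/z) / (1/z) is only computed at the ends
 * of each 16-pixel subspan; pixels in between are linearly interpolated.
 */
void span_u_coords(fx16 uoz, fx16 ooz, fx16 duoz, fx16 dooz,
                   int count, fx16 *out)
{
    fx16 u0 = fx_div(uoz, ooz);
    int i = 0;

    while (i < count) {
        int n = (count - i < 16) ? (count - i) : 16;
        fx16 uoz1 = uoz + duoz * n;
        fx16 ooz1 = ooz + dooz * n;
        fx16 u1 = fx_div(uoz1, ooz1);   /* one divide per subspan */
        fx16 du = (u1 - u0) / n;
        fx16 u = u0;

        for (int k = 0; k < n; k++) {
            out[i + k] = u;
            u += du;
        }
        i += n;
        uoz = uoz1;
        ooz = ooz1;
        u0 = u1;
    }
}
```

The same scheme applies to the v coordinate. The point of the sketch is only that the expensive part (the divide) happens once per 16 pixels, which is the part an FPU-less machine has to approximate cheaply.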

Doom didn't use floats at all - it went to some lengths to avoid using them because it was aimed at 386. However much of it operated in 2D with quite a few tricks and shortcuts.

I haven't managed to remove all of the float stuff from drawing yet - there is a bit left before each textured face can be rendered but I might be able to cut it in half again before it's done. And I'm hoping to dodge it completely on the smallest faces. There is at least no float activity left in the rest of the graphics pipeline, or within each face - which would otherwise bring the project to an end pretty fast :)
dml
Fuji Shaped Bastard
Posts: 3988
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Have begun replacing the surface cache with something that will run faster and allow new capabilities later on.

One of the things I've been doing alongside this is a DSP programming guide containing useful tricks and some less obvious things I learned or used while developing BM and this Q2 derivative. This won't deal much with Falcon audio DSP programming - interrupts, DMA, crossbar etc - it is more aimed at general purpose co-processing and writing efficient algorithms with it. The kind of thing I had to use a lot of in both of these projects.

The guide is far from done but I update it as things go along - it will be shared when I can't think of much more to add :)
mikro
Hardware Guru
Posts: 4725
Joined: Sat Sep 10, 2005 11:11 am
Location: Kosice, Slovakia

Re: Quake 2 on Falcon030

Post by mikro »

dml wrote:The guide is far from done but I update it as things go along - it will be shared when I can't think of much more to add :)
That's very kind of you. Can't wait! :)
dml
Fuji Shaped Bastard
Posts: 3988
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

mikro wrote:
dml wrote:The guide is far from done but I update it as things go along - it will be shared when I can't think of much more to add :)
That's very kind of you. Can't wait! :)
maybe some other experienced DSP coders can then patch it with good tips that i missed ;)
dml
Fuji Shaped Bastard
Posts: 3988
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

After spending a bit of time with the texturing routines I had two minor breakthroughs which will probably result in better texture fillrate.

So I expect the following will likely become possible soon:

- 192x120 resolution with textures (fullscreen+overscan chunky mode on RGB)
- 16bit surfaces (a bit like BadMooD - better lighting and less fuzz. currently all texture and lighting is 8bit)
- colour lightmaps, like HW/OpenGL (software Q2 used monochrome lighting)

I was able to drop some of the perspective correction math from 48 to 24 bits, and render all z distances from the same table using slightly different encodings. This allows much more compact/parallel code. Some (maybe all) of the remaining FPU math can also be DSP'd or turned into 030 fixedpoint.
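The post does not say how the single table or its encodings work, so the following is only a guess at the general idea: a small reciprocal table indexed by a normalised mantissa, with a shift restoring the scale, so that every depth is served by the same 256 entries. All names (`recip_table`, `recip_init`, `recip16`) and the table size are assumptions made for this sketch.

```c
#include <stdint.h>

/* One shared table: entry i holds 2^24 / (256 + i), i.e. the reciprocal of
 * a mantissa normalised into [256, 512). Size and scaling are assumptions. */
static uint32_t recip_table[256];

void recip_init(void)
{
    for (int i = 0; i < 256; i++)
        recip_table[i] = (1u << 24) / (uint32_t)(256 + i);
}

/* Approximate 2^16 / z for z >= 1 using only shifts and one table lookup. */
uint32_t recip16(uint32_t z)
{
    int e = 0;          /* z is approximately m * 2^e */
    uint32_t m = z;

    while (m >= 512) { m >>= 1; e++; }
    while (m < 256)  { m <<= 1; e--; }

    uint32_t r = recip_table[m - 256];   /* 2^24 / m */
    int shift = e + 8;                   /* 2^16/z = (2^24/m) * 2^-(e+8) */

    return (shift >= 0) ? (r >> shift) : (r << -shift);
}
```

Call `recip_init()` once before use. Near and far depths differ only in the shift, which is one way a 24-bit word could cover the whole z range without a divide.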
dml
Fuji Shaped Bastard
Posts: 3988
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Gradually recovering from a bad cold.

The DSP texturing routine has been optimized, and the instruction count per pixel has dropped from 26+sync to 17+sync. It may be possible to reach 15 but that will be near the limit for this one. When I began this I was aiming for around 20.

A quick test in Hatari shows that the CPU can use this sequence to draw pixels:

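Code:

	move.w		(a4),d7			; fetch next texel index (presumably supplied by the DSP); upper word of d7 must already be zero
	move.w		(a5,d7.l*2),(a1)+	; read 16-bit texel at a5 + d7*2 and write it to the screen
...which is around 23 cycles per pixel in TC/RGB mode (20.2c with Videl bus turned off, and 28c in Hatari 1.8)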

The theoretical optimal sequence is probably this write-buffer alternator:

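Code:

	add.l		a0,a3			; a3 = base (a0) + offset fetched on the previous step
	move.w		(a2),a4			; fetch next offset into a4 (sign-extended to 32 bits)
	move.w		(a3),(a1)+		; copy texel to the screen
	add.l		a0,a4			; same three steps with a3/a4 swapped
	move.w		(a2),a3
	move.w		(a4),(a1)+
...at 20.7 cycles per pixel in TC/RGB mode (18.2c with Videl off, or 28c in Hatari 1.8)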

I don't know if the latter will be possible because it's getting really difficult to optimize the DSP side further - but I'd be happy with 23 clocks for now. It's already a 50% improvement per pixel.
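As a rough feel for what these per-pixel costs mean, the small C sketch below turns cycles per pixel into an upper bound on fill-only frame rate. The 16 MHz figure is the nominal Falcon030 CPU clock; the function names are made up for the example, and the bound ignores span setup, the surface cache, geometry and everything else the frame needs.

```c
/* Upper bound on pixels per second from the inner-loop cost alone. */
double pixels_per_second(double cpu_hz, double cycles_per_pixel)
{
    return cpu_hz / cycles_per_pixel;
}

/* Upper bound on frames per second if every pixel is drawn exactly once. */
double max_fill_fps(double cpu_hz, double cycles_per_pixel,
                    int width, int height)
{
    return pixels_per_second(cpu_hz, cycles_per_pixel)
           / ((double)width * (double)height);
}
```

For example, `max_fill_fps(16e6, 23.0, 192, 120)` comes out at about 30, and the 20.7-cycle variant at about 33.5, so the difference between the two sequences is a few frames per second at best on a full 192x120 screen.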
Mindthreat
Captain Atari
Posts: 279
Joined: Tue Dec 16, 2014 4:39 am

Re: Quake 2 on Falcon030

Post by Mindthreat »

This is one of those things where I want to pull out the Falcon, remove the top cover to show all techie friends alike that it is indeed stock Falcon hardware under the hood. Boot it up and then watch their jaws drop! :D

I love this poo! :D
Atari-related YouTube Videos Here: - https://www.youtube.com/channel/UCh7vFY ... VqA/videos
Atari ramblings on Twitter Here: https://twitter.com/mindthreat
dml
Fuji Shaped Bastard
Posts: 3988
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Here's a dump of some of my notes on the project. For the technically minded it may give some insight to the planning involved in each area :)



- Need special routine for drawing transparent or otherwise tiled (water, lava) textures. Need to read up on this part of Q2 first to make sure it's done right. e.g. rules for various texture sizes, flags.
- FPU code for per-face camera UV plane transforms to be turned into 68030, then possibly DSP. This is one of the most significant remaining costs in texture mode, responsible for large variation in framerate depending on where the camera is looking. In the F030 engine it only needs to happen on about 2/3rds of all faces and is shared/cached between them. UV plane transforms can probably be run in parallel with the polygon scanner, or at least done in a uniform batch.
- FPU code for per-face uz, vz, z setup to be turned into 68030, maybe DSP. This must happen separately for every face - no sharing. It is the other main bottleneck.
- Per-span setup cost to be shrunk/amortized behind 68030 pixel tower.
- Upgrade surface cache to 16bits (from 8), initially with mono lighting but later colour. Allocator has already been replaced with a more efficient version (close to Carmack’s) and the drawing code shortcuts all surface cache activity if the page pointer is valid immediately before drawing. No jumping out of line unless necessary.
- Replace surface cache block filler with something fast enough e.g. 68k+tables or DSP. This is very much a 2D blitting / image processing problem and there are lots of ways to speed it up. Current version is plain C code.
- Track minz per edge, then per face to select mipmap level on DSP side.
- Use miplevel to select correct rendering routine for near vs far perspective correction methods. Can also be used to select texture vs approx RGB fillmode for tiny faces.
- Look at dynamic lightmaps & control logic for surface refresh. Need fast dist. formula approx/table/whatever.
- Implement skybox renderer - probably not optimized until later. Doesn’t need p-correction due to large distances involved but not sure there is much benefit in doing yet another kind of routine if main one is fast and works.
- Make skybox emitter conditional on encountering sky faces during BSP queue processing, to remove unnecessary scanning work.
- General final pass at all of the 68k+DSP code in the drawing areas. Review i-cache misses. Review DSP sync points.
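The two mipmap items in the list above (track minz per edge and per face, then pick a miplevel from it) could be sketched in C roughly as follows. This is an illustration, not the planned DSP code: the function names and the `base_z` threshold are assumptions, and the only fact relied on is that Quake-style textures carry 4 mip levels, each half the size of the previous one.

```c
#include <stdint.h>

#define NUM_MIPLEVELS 4   /* Quake-style textures store 4 mip levels */

/* Per-face minimum depth from the per-edge minimum depths. */
uint32_t face_minz(const uint32_t *edge_minz, int num_edges)
{
    uint32_t m = edge_minz[0];
    for (int i = 1; i < num_edges; i++)
        if (edge_minz[i] < m)
            m = edge_minz[i];
    return m;
}

/* Each doubling of distance past base_z drops one mip level.
 * base_z is a hypothetical tuning constant; it must be non-zero. */
int select_miplevel(uint32_t minz, uint32_t base_z)
{
    int level = 0;
    uint32_t limit = base_z;

    while (level < NUM_MIPLEVELS - 1 && minz >= limit) {
        limit <<= 1;
        level++;
    }
    return level;
}
```

Using the nearest point of the face keeps the closest texels sharp, at the cost of some extra detail (and surface cache RAM) on the far side of large faces.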

Doors/BModels: Use SB solid state path. To be done in stages starting with C code to effectively CSG the objects into the BSP tree as they move, and if visible. Fortunate that not many of these move at once.

Pickups/player MD2s: Use SB transparent state path to prevent scenery fragmentation. Push objects as final scene pass, in reverse BSP order so they paint back-front. Assign sortkeys from BSP leaves (owners) + quantized Z in low 12 bits of sortkey to help inter-object sorting. Let the spanbuffer transparency logic sort out the mess.
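A possible shape for the sortkey described above, as a C sketch: the BSP leaf order goes in the high bits and a 12-bit quantized Z in the low bits, so a plain integer compare orders objects first by leaf and then by depth within a leaf. The function name, the `z_max` parameter, and the clamping behaviour are assumptions for the example.

```c
#include <stdint.h>

#define Z_BITS 12
#define Z_MASK ((1u << Z_BITS) - 1u)

/*
 * leaf_order: position of the owning BSP leaf in the traversal order.
 * z, z_max:   object depth and the depth that maps to the top of the
 *             12-bit range (z_max must be non-zero).
 */
uint32_t make_sortkey(uint32_t leaf_order, uint32_t z, uint32_t z_max)
{
    uint32_t qz;

    if (z >= z_max)
        qz = Z_MASK;                                     /* clamp far objects */
    else
        qz = (uint32_t)(((uint64_t)z << Z_BITS) / z_max); /* 0..4095 */

    return (leaf_order << Z_BITS) | qz;
}
```

Because the leaf order occupies the upper bits, two objects in different leaves never swap places whatever their depths are; the quantized Z only breaks ties inside one leaf.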

Other: DSP code is now too large, competition over internal fast p: memory for the hot parts. Break it into overlays and transmit before each phase. Split into 4 phases: BSP, geometry, scanning, texel blitting. Place hot parts of each phase in low p: memory.
alexh
Fuji Shaped Bastard
Posts: 3107
Joined: Wed Oct 20, 2004 1:52 pm
Location: UK - Oxford

Re: Quake 2 on Falcon030

Post by alexh »

All interesting stuff. I too enjoy reading about your progress.

When you say "FPU code to be turned into 68030 / maybe DSP" are we talking about routines which you previously implemented as 68882 FPU ASM?

Or are we talking about routines which are currently floating point math in C code compiled for the 68030 + 68882? (unlikely)

I remember you saying that FPU operations take hundreds of cycles but low-bandwidth high precision calculations are ok.

How do you partition this? Presumably small changes to the 68030, DSP or FPU code can unbalance this partitioning greatly?
Principal ASIC Engineer
520 ST, 4160 STfm, 4160 STe, MegaST2, MegaSTe 4, Falcon060, Jaguar
Thalion Webshrine
Atari Forum Wiki
dml
Fuji Shaped Bastard
Posts: 3988
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

alexh wrote: When you say "FPU code to be turned into 68030 / maybe DSP" are we talking about routines which you previously implemented as 68882 FPU ASM?
Yes it's code which I had converted from C into 882 ASM for polygon texture plane calculations (and which runs much faster in Hatari than it does on real HW).

alexh wrote: Or are we talking about routines which are currently floating point math in C code compiled for the 68030 + 68882?
There is some of that too but not in the graphics part. Collision detection etc. But I'm mostly ignoring that area for now.

alexh wrote: I remember you saying that FPU operations take hundreds of cycles but low-bandwidth high precision calculations are ok.
How do you partition this?
In the case of polygon setup, there is quite a lot of math happening before each face can be drawn - currently on a real 882 that takes longer than it does to draw a small polygon with a texture on it. It's less of an issue if the polygon is really big since it is a constant overhead. However it's still too expensive for most polygons being drawn so I'll need to redo it.

The math has 2 parts - one part is per geometric 3D plane (which can be shared among more than one face) and the other part is working out the screenspace origin and row/column increments, per face. I'll probably need to rework both of them before it's done.
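The sharing of the per-plane part can be pictured with a small frame-stamped cache, sketched here in C. Everything in it is hypothetical (the struct layout, `MAX_PLANES`, the function names, and the placeholder maths); it only shows how several faces on the same plane would pay for the plane setup once per frame, while the per-face origin and increments would still be computed for each face.

```c
#include <stdint.h>

#define MAX_PLANES 1024            /* hypothetical limit for the sketch */

typedef struct {
    uint32_t frame_stamp;          /* frame in which this entry was computed */
    int32_t  u_axis[3];            /* camera-space texture axes (placeholders) */
    int32_t  v_axis[3];
} PlaneSetup;

static PlaneSetup plane_cache[MAX_PLANES];
static uint32_t current_frame = 1; /* stamps start at 0, so all entries are stale */
unsigned plane_setups_done;        /* counter, only here to show the saving */

/* Placeholder for the expensive per-plane transform. */
static void compute_plane_setup(int plane, PlaneSetup *out)
{
    for (int i = 0; i < 3; i++) {
        out->u_axis[i] = plane;    /* dummy values, not real maths */
        out->v_axis[i] = -plane;
    }
    plane_setups_done++;
}

/* Call once at the start of every frame to invalidate all entries. */
void begin_frame(void)
{
    current_frame++;
}

/* Returns the cached setup, computing it at most once per frame. */
const PlaneSetup *get_plane_setup(int plane)
{
    PlaneSetup *p = &plane_cache[plane];

    if (p->frame_stamp != current_frame) {
        compute_plane_setup(plane, p);
        p->frame_stamp = current_frame;
    }
    return p;
}
```

A frame stamp avoids clearing the whole cache each frame: bumping one counter makes every entry stale at once.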
Mindthreat
Captain Atari
Posts: 279
Joined: Tue Dec 16, 2014 4:39 am

Re: Quake 2 on Falcon030

Post by Mindthreat »

As a side note, and I'm not sure if you're in the game development business outside of doing this hobby stuff... but you're exactly the type of person I would hire in a second if I was currently running an active dev getup targeted at squeezing every ounce of performance out of a dedicated platform.

Phenomenal work and I enjoy your interesting tech notes. Thanks! :cheers:
Atari-related YouTube Videos Here: - https://www.youtube.com/channel/UCh7vFY ... VqA/videos
Atari ramblings on Twitter Here: https://twitter.com/mindthreat
dml
Fuji Shaped Bastard
Posts: 3988
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Mindthreat wrote:As a side note, and I'm not sure if you're in the game development business outside of doing this hobby stuff...
I was, once upon a time :) Thanks for following progress.
dml
Fuji Shaped Bastard
Posts: 3988
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Just now got mipmaps working and replaced the surface cache processor with some temporary 68k to fill square lit blocks fast. While the juddering/pauses have not completely disappeared it's already a lot better and it's possible to move around again. Feeling a bit more confident about that side of it now.

Also quickly tested truecolour (mono, but not palette based) lighting, and it does look better. Tried RGB lightmaps but didn't get it to work - needs more time. It will be slower anyway without some tables or DSP help. Something for later.

Working with 192x96 resolution currently, and direct 16bit surfaces for speed.
Zamuel_a
Atari God
Posts: 1291
Joined: Wed Dec 19, 2007 8:36 pm
Location: Sweden

Re: Quake 2 on Falcon030

Post by Zamuel_a »

Is it possible to include the light maps? They of course take a lot of extra RAM and combining them with the texture also takes some extra time. I think they are stored as 1/16 of the texture size and then scaled up with bilinear filtering before being combined with the texture. Since it's not exactly realtime it works on the Falcon too I guess, but every time a new map needs to be calculated it seems that it would take a lot of time so it halts the engine, but I guess you found a clever way of doing it :wink:
ST / STFM / STE / Mega STE / Falcon / TT030 / Portfolio / 2600 / 7800 / Jaguar / 600xl / 130xe
dml
Fuji Shaped Bastard
Posts: 3988
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Post by dml »

Zamuel_a wrote:Is it possible to include the light maps? They of course take a lot of extra RAM and combining them with the texture also takes some extra time. I think they are stored as 1/16 of the texture size and then scaled up with bilinear filtering before being combined with the texture. Since it's not exactly realtime it works on the Falcon too I guess, but every time a new map needs to be calculated it seems that it would take a lot of time so it halts the engine, but I guess you found a clever way of doing it :wink:
Yes it is already using the lightmaps - the surface cache is largely about combining repeatable textures with unique lightmaps to make a unique texture surface for some part of the world.

I've been optimizing that combine step so it doesn't pause when you look at a new surface. It's still quite slow but improved a lot - so it's possible to move around the map now in realtime with textures. Previously it was difficult until an area had been 'seen' by the camera first.
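For readers unfamiliar with the combine step, here is a simplified C sketch of building one lit surface from a tiling texture and a coarse lightmap with one light sample per 16 texels, interpolated bilinearly across each 16x16 block. It is an assumption-laden illustration rather than this engine's code: greyscale 8-bit texels stand in for the real palette or 16-bit lookup, and the function name and parameter layout are invented for the example.

```c
#include <stdint.h>

#define BLOCK 16   /* one lightmap sample per 16 texels at mip level 0 */

/*
 * tex:   tw x th greyscale texels, tiled across the surface
 * light: (bw+1) x (bh+1) samples in 0..255, one per block corner
 * out:   (bw*16) x (bh*16) lit surface
 */
void build_surface(const uint8_t *tex, int tw, int th,
                   const uint8_t *light, int bw, int bh,
                   uint8_t *out)
{
    int sw = bw * BLOCK;   /* surface width in texels */
    int lw = bw + 1;       /* lightmap width in samples */

    for (int by = 0; by < bh; by++) {
        for (int bx = 0; bx < bw; bx++) {
            /* four corner samples of this block */
            int l00 = light[by * lw + bx];
            int l10 = light[by * lw + bx + 1];
            int l01 = light[(by + 1) * lw + bx];
            int l11 = light[(by + 1) * lw + bx + 1];

            for (int y = 0; y < BLOCK; y++) {
                /* interpolate down the left and right edges (scaled by 16) */
                int left  = l00 * (BLOCK - y) + l01 * y;
                int right = l10 * (BLOCK - y) + l11 * y;

                for (int x = 0; x < BLOCK; x++) {
                    /* interpolate across the row, back to 0..255 */
                    int l = (left * (BLOCK - x) + right * x) >> 8;
                    int sx = bx * BLOCK + x;
                    int sy = by * BLOCK + y;
                    int t = tex[(sy % th) * tw + (sx % tw)];

                    out[sy * sw + sx] = (uint8_t)((t * l) >> 8);
                }
            }
        }
    }
}
```

The result is unique per surface even though the texture repeats, which is why it has to be cached, and why filling a block quickly matters whenever the camera turns towards something it has not seen yet.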
Zamuel_a
Atari God
Posts: 1291
Joined: Wed Dec 19, 2007 8:36 pm
Location: Sweden

Re: Quake 2 on Falcon030

Post by Zamuel_a »

dml wrote:I've been optimizing that combine step so it doesn't pause when you look at a new surface. It's still quite slow but improved a lot - so it's possible to move around the map now in realtime with textures. Previously it was difficult until an area had been 'seen' by the camera first.
Very impressive that you got it to work. Do you filter the lightmap when you scale it or just combine it as it is? (sharp shadows wouldn't look so good).
ST / STFM / STE / Mega STE / Falcon / TT030 / Portfolio / 2600 / 7800 / Jaguar / 600xl / 130xe
