Quake 2 on Falcon030

DrTypo
Atari freak
Posts: 73
Joined: Sat Apr 09, 2011 12:57 pm
Location: Paris, France

Re: Quake 2 on Falcon030

Postby DrTypo » Tue Jun 16, 2015 5:39 pm

I tested the engine on a regular 14MB Falcon.
It runs at about the same speed as the previous version I tried but with a lot less noticeable texture popping. That weekend was well spent! ;)
The outdoor lighting in IKDM4 is really nice, with its soft shadows.

bear
Atari freak
Posts: 53
Joined: Fri Jul 02, 2004 4:44 pm

Postby bear » Tue Jun 16, 2015 7:04 pm

The test works well on my Falcon. It was very interesting testing the map where there's a moving door/portal, and walking into it. Some cool moiré effects appear when looking from different angles! :) Good overall speed and nice mipmapping too.

ctirad
Captain Atari
Posts: 287
Joined: Sun Jul 15, 2012 9:44 pm

Postby ctirad » Tue Jun 16, 2015 7:11 pm

dml wrote:The engine displays a fixed 256x128 window always. It changes to low res (320x2xx) first so there's no way to actually run it 640x480... in fact it may bomb out if you start it from a greedy video mode.


Don't you use any special mode with fewer Hz derived from the 32MHz system clock, like in the title screen of BadMood? The CT2 accelerates the system clock to 50MHz and thus the video sync is accelerated as well. Some monitors tend to display such a mode as a small window in the center of the screen.

dml
Fuji Shaped Bastard
Posts: 3474
Joined: Sat Jun 30, 2012 9:33 am

Postby dml » Tue Jun 16, 2015 8:17 pm

DrTypo wrote:I tested the engine on a regular 14MB Falcon.
It runs at about the same speed as the previous version I tried but with a lot less noticeable texture popping. That weekend was well spent! ;)


That's reassuring. I was in 2 minds as to whether I should release the new surface cache or the old one - the framerate is more even in the old one but OTOH it looks worse and overall actually costs a bit more due to poorer caching. When preparing all surfaces together some extra tricks are possible and still some overlap with DSP is achieved.

DrTypo wrote:The outdoor lighting in IKDM4 is really nice, with its soft shadows.


Credit there goes to -> Iikka "Fingers" Keranen. Amazing map designs!

Postby dml » Tue Jun 16, 2015 8:23 pm

bear wrote:The test works well on my Falcon. It was very interesting testing the map where there's a moving door/portal, and walking into it. Some cool moiré effects appear when looking from different angles! :) Good overall speed and nice mipmapping too.


:D

Yeah those are supposed to be teleport/portal thingies with some parallax transparency effects across them (not quite working well with transparency off!).

The texturemapping can break down at very near-z and extreme tangential angles to the viewer. It is mostly managed well in the main parts of the map but when you walk through liquids/portals etc. (and occasionally some less common cases in walls) it becomes noticeable. One of the compromises from lack of uberfast floating point... I'll try to improve some of these things further but it's probably getting close now to what I can reasonably manage with it.

Postby dml » Tue Jun 16, 2015 8:28 pm

ctirad wrote:Don't you use any special mode with fewer Hz derived from the 32MHz system clock, like in the title screen of BadMood? The CT2 accelerates the system clock to 50MHz and thus the video sync is accelerated as well. Some monitors tend to display such a mode as a small window in the center of the screen.


Yes, it's possible to do a lot more with the video mode setup - I've barely touched it on this project. So many other problems and bugs got priority. :D

It should be possible to get some kind of fullscreen chunky mode going for example - but unless both axes can be doubled in hardware (which IIRC isn't an option) it will mean drawing more pixels - either duplicating scanlines or doubling-up pixels. This doesn't cost a huge amount but still isn't free. Since FPS is borderline already, I am not sure yet what is the best way to go with it. Might become more obvious with time.

One possible advantage of doubling-up pixels is that it buys more DSP time per pixel - so you can maybe even do some extra effects :)
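
To put a rough shape on the "doubling-up pixels" option: here is a minimal sketch in C, assuming a 16-bit-per-pixel chunky buffer (the thread doesn't say which pixel format is in use) and a hypothetical helper name, double_pixels. Each source pixel is packed into both halves of one 32-bit word, so the widening costs one wider store per pixel rather than two separate ones.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen one scanline by 2x horizontally.
 * src holds n 16-bit pixels; dst receives n 32-bit words, each holding
 * the same pixel twice. Because both halves are identical, the pixel
 * order in memory does not depend on byte order. */
void double_pixels(const uint16_t *src, uint32_t *dst, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        uint32_t p = src[i];
        dst[i] = (p << 16) | p;
    }
}
```

Duplicating scanlines would be the vertical counterpart: copy the finished line once more instead of widening each pixel.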

Atari030
Atari Super Hero
Posts: 614
Joined: Mon Feb 27, 2012 6:14 am
Location: Melbourne, Australia

Postby Atari030 » Wed Jun 17, 2015 4:26 am

dml wrote:
Atari030 wrote:It runs like a cut cat on the CT2b (128mb @ 68882). Running 640x480 it only displays 320x240(?) in the middle of the screen. But super fast.


I'm very very surprised that it runs at all on that thing (**scratches head!**) but not complaining either :-D (...does it have a boosted DSP? This would help explain it working...)

The engine displays a fixed 256x128 window always. It changes to low res (320x2xx) first so there's no way to actually run it 640x480... in fact it may bomb out if you start it from a greedy video mode. It does suck a lot of ram.

It may be possible in future to resize the rendering window but I hardcoded some math (oops) so the size is truly hardwired for now unless I rewrite that bit again...


That would explain the resolution. I briefly ran it @ 16MHz, still good but not as smooth. DSP is running @ 50MHz. Speaking of which, mine can be pretty flaky but it ran without a hiccup.

Postby dml » Sun Jun 21, 2015 8:35 pm

AF has been unreachable for nearly a week.

I have been re-optimizing a number of areas, so the engine is now running measurably faster since the test release. There are still several more to do but I'll spread these out among other changes.

Before the next release I'll also try to redo and enable the MMU+cache optimizations using the new Hatari 1.9 stats for reference.

One of the rendering bugs will also get some attention soon (it was difficult to fix before optimizing parts very near it because fixing it involves adding more ops in a sensitive place).

Postby dml » Mon Jun 22, 2015 7:44 pm

1) Managed to shave some cycles off the per-polygon-span DSP-side code...

2) ...and also fixed a very annoying bug which escaped me for quite a long time. Turned out to be a shift/mul happening in the wrong order, sometimes causing overflow. It resulted in corrupt textures on view-tangential surfaces - usually walls which appear for the first time from around corners.
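
For anyone wondering what that class of bug looks like, here is a minimal sketch in C. It is not the actual DSP code: the 16.16 format and the function names are assumptions for illustration. Multiplying first and shifting afterwards lets the 32-bit product wrap before the shift can rescale it; pre-shifting the operands keeps the product in range at the cost of some precision.

```c
#include <stdint.h>

typedef int32_t fx16;   /* 16.16 fixed point (assumed format) */

/* Wrong order: multiply at full scale, then shift. The product is
 * truncated to 32 bits first, so large operands wrap around.
 * Unsigned math is used here so the wrap is well defined in C. */
fx16 fx_mul_shift_last(fx16 a, fx16 b)
{
    return (fx16)(((uint32_t)a * (uint32_t)b) >> 16);
}

/* Safer order: drop 8 fractional bits from each operand before the
 * multiply. Valid for non-negative inputs below 128.0 in this sketch;
 * the low 8 bits of precision in each operand are lost. */
fx16 fx_mul_shift_first(fx16 a, fx16 b)
{
    return (a >> 8) * (b >> 8);
}
```

With a = 3.0 (0x30000) and b = 2.0 (0x20000) the first version wraps to 0 while the second gives 6.0 (0x60000).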

Postby dml » Tue Jun 23, 2015 7:55 am

Here is a new build with a new map (ikdm2) added to the existing 3.

https://dl.dropboxusercontent.com/u/129 ... ublic2.zip

This build has a few improvements:

- 1 less CPU instruction (and several DSP) per polygon span - slightly quicker per textured poly
- bug in texture math fixed - which caused corrupt texturing of large surfaces at view-tangential angles
- flatshade-at-tangential-angles logic removed, since it was mainly there to hide the texturing bug
- corrected spawn point camera orientation (DrTypo reported this one)
- light levels changed (was too saturated - shading was getting lost)
- included both mono and colour lighting builds this time

known problems:

- removing the flatshade-at-tangential-angles logic made the sparkles a bit worse, since it was hiding those too. but the sparkles need separate attention so this is ok. (it may also slow things down in some places since it is now texturing more polys on average).
- colour lighting build is slow to start - must colour-reduce a huge lightmap texture array on each run
- colour lightmaps unfiltered (there was a prefilter step used for the videos but it's turned off for fixage)
- .ttp size is about 340k when it should be 160k (with gcc libs removed) but there's a bug in my posix/gemdos mapping so the lightweight build only seems to work in Hatari for now, not real HW. will fix for another time.
- current sky is not a match for ikdm2, but so what?

notes:
- ikdm4 probably looks best in colour (bluish lighting tints inside, orange outside)
Last edited by dml on Tue Jun 23, 2015 8:25 am, edited 1 time in total.

jvas
Captain Atari
Posts: 450
Joined: Fri Jan 28, 2005 4:30 pm
Location: Budapest, Hungary

Postby jvas » Tue Jun 23, 2015 8:16 am

Works fine in Hatari! What is the difference between the two executables?

Edit: ok, I guess one is "color" and the other is "mono" :)

Postby dml » Tue Jun 23, 2015 8:24 am

jvas wrote:Works fine in Hatari! What is the difference between the two executables?

Edit: ok, I guess one is "color" and the other is "mono" :)


yep - that's it! :D

Postby dml » Tue Jun 23, 2015 8:34 am

There are some optimizations still missing from this build ^^^ so I may as well note those also:

- haven't revised the hardware PMMU/d-cache opts yet, so that is still turned off. was busy fixing the texturing math.
- some further optimization of the CPU-side drawing loop is likely possible.
- the main pixel inner loop is unrolled @ 8 pixels, and fillrate definitely goes up on simpler scenery if I unroll @ 16 pixels, but then the surface outer loop doesn't fit inside the CPU any more and performance drops on complicated scenes. so it's currently still @ 8 pixels for more balanced fps.
- using a modulo pair table would allow unrolling by odd amounts, to *exactly* optimize for i-cache. didn't try it, but I don't see why not.
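
One possible reading of the "modulo pair table" idea, sketched in C. This is a guess at the technique, not the engine's code: a table indexed by span length holds a precomputed (quotient, remainder) pair for an arbitrary unroll factor, so an odd unroll count never needs a divide at draw time. UNROLL, MAX_SPAN, split_table, and fill_span are all hypothetical names, and a flat fill stands in for the textured inner loop.

```c
#define UNROLL   5      /* deliberately not a power of two */
#define MAX_SPAN 256    /* matches the 256-pixel window width from the thread */

typedef struct {
    unsigned char blocks;   /* number of full UNROLL-pixel blocks */
    unsigned char rem;      /* leftover pixels after the blocks */
} SpanSplit;

static SpanSplit split_table[MAX_SPAN + 1];

/* Precompute (len / UNROLL, len % UNROLL) once, at startup. */
void build_split_table(void)
{
    unsigned len;
    for (len = 0; len <= MAX_SPAN; len++) {
        split_table[len].blocks = (unsigned char)(len / UNROLL);
        split_table[len].rem    = (unsigned char)(len % UNROLL);
    }
}

/* Flat-fill stand-in for the real inner loop. len must be <= MAX_SPAN. */
void fill_span(unsigned char *dst, unsigned char colour, unsigned len)
{
    SpanSplit s = split_table[len];
    while (s.blocks--) {
        dst[0] = colour; dst[1] = colour; dst[2] = colour;
        dst[3] = colour; dst[4] = colour;
        dst += UNROLL;
    }
    while (s.rem--)
        *dst++ = colour;
}
```

The unroll factor could then be tuned to whatever count makes the loop body fill the i-cache, rather than being restricted to 8 or 16.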

I also think that the BSP algo would run faster if I switched it to use bounding spheres instead of AABBs since a lot of the dead time in the BSP routine is frustum culling with complicated view-dependent point-plane patterns for AABBs. This is quite a big change and needs lengthy map preprocessing so I didn't bother to try it, but I think it is the case and could be done in future if needed. probably more appropriate for a forked version!
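
A rough sketch of why spheres are cheaper to cull than AABBs, in C. Everything here is an assumption for illustration (the Q14 normal format, the struct layouts, the function names); it is not the engine's data layout. The sphere test is one dot product and one compare per plane, while the box test first has to pick a corner according to the signs of the plane normal, which is the view-dependent part.

```c
#include <stdint.h>

#define FX_ONE 16384   /* normals stored as Q14 (assumed format) */

typedef struct { int32_t nx, ny, nz, d; } Plane;    /* inside when n.p + d >= 0; d is pre-scaled by FX_ONE */
typedef struct { int32_t cx, cy, cz, r; } Sphere;
typedef struct { int32_t min[3], max[3]; } Aabb;

/* Sphere: one dot product, then compare against -radius. */
int sphere_outside_plane(const Plane *p, const Sphere *s)
{
    int64_t dist = (int64_t)p->nx * s->cx + (int64_t)p->ny * s->cy
                 + (int64_t)p->nz * s->cz + p->d;
    return dist < -(int64_t)s->r * FX_ONE;
}

/* AABB: choose the corner furthest along the normal, which depends on
 * the sign of each normal component, then do the dot product. */
int aabb_outside_plane(const Plane *p, const Aabb *b)
{
    int32_t px = p->nx >= 0 ? b->max[0] : b->min[0];
    int32_t py = p->ny >= 0 ? b->max[1] : b->min[1];
    int32_t pz = p->nz >= 0 ? b->max[2] : b->min[2];
    int64_t dist = (int64_t)p->nx * px + (int64_t)p->ny * py
                 + (int64_t)p->nz * pz + p->d;
    return dist < 0;
}
```

The trade-off is that a sphere usually bounds a node more loosely than a box does, so fewer nodes get rejected per test; whether it wins overall depends on the node shapes in the map.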

Postby jvas » Tue Jun 23, 2015 8:38 am

FYI: I presented your demo last Friday in our retro computer club to the "amiga guys", and this was their reaction:
"Is it 16MHz? Hmmm...." and then they silently watched me walking around the level :)

Postby dml » Tue Jun 23, 2015 8:46 am

jvas wrote:FYI: I presented your demo last Friday in our retro computer club to the "amiga guys", and this was their reaction:
"Is it 16MHz? Hmmm...." and then they silently watched me walking around the level :)


:D They might not be inviting you back!

Postby jvas » Tue Jun 23, 2015 9:00 am

I pay for being there :)

Postby DrTypo » Tue Jun 23, 2015 11:50 am

I briefly tested the new public build in Hatari.

Lightmaps seem to be a bit messed up. Here is a screenshot from ikdm4 in color mode.

lightmap.jpg


It also happens in mono mode. It seems to me that something doesn't use the proper dimensions.

Postby dml » Tue Jun 23, 2015 12:19 pm

DrTypo wrote:I briefly tested the new public build in Hatari.
Lightmaps seem to be a bit messed up. Here is a screenshot from ikdm4 in color mode.


Thanks. I haven't tried it in the latest Hatari - it has been receiving quite a lot of changes recently and I have mostly been standing back and waiting for it to settle out.

I haven't seen this sort of issue before - while it does look like a lightmap stride issue, I can't think of a cause which would be machine-sensitive (since it works on earlier Hataris and on my Falcon). I'll look later but it may just be a temporary emulation problem...

Postby DrTypo » Tue Jun 23, 2015 1:11 pm

dml wrote:Thanks. I haven't tried it in the latest Hatari .


Ah, my sentence was not very clear: I tried your new public build of Q2 engine in Hatari 1.8.0.
Tonight I'll try with my Falcon...

Postby dml » Tue Jun 23, 2015 1:16 pm

DrTypo wrote:Ah, my sentence was not very clear: I tried your new public build of Q2 engine in Hatari 1.8.0.
Tonight I'll try with my Falcon...


I have a build of 1.8 (one of them anyway!) so I'll try it later - it seemed ok on a relatively recent (but not latest) 1.9 build but there have been many changes relating to the emulation code, caches and other stuff and I know this project has been on the edge of working/not-working both on Hatari and real HW for a while so it's not a big surprise to me if some Hataris work and others do not. I did have some settings to build 'Hatari-safe' versions for example, just in case timings were too wobbly.

I'll see if it's just a timing thing later - but in the past any lightmap stride problem has been FPU-rounding related and I need to be a bit careful with any 'fixes' related to those cases...

AxisOxy
Atariator
Posts: 23
Joined: Tue Jun 23, 2015 2:00 pm

Postby AxisOxy » Tue Jun 23, 2015 2:40 pm

Hey Douglas,

First of all, thumbs up for your fantastic work on Doom and Quake 2 on a stock Falcon. The results are kind of insane.
And I know what I'm talking about. I did something very similar for AGA-Amigas some time ago.
I still work a bit on them from time to time.
Funnily I was also focusing on Doom and Quake 2. Perhaps they are just the sweet-spot for this kind of hardware.
My stuff is far away from the things you do in terms of performance. But still ahead of everything I have ever seen on Amiga.

I'd like to share some of the things I stumbled upon during the development. Perhaps it helps a bit on squeezing more out of your engine.
Looks like I had a little bit different approach than you. If I followed it right, you based your engine on the original code and optimized it.
Like porting from float to int and porting parts from C to ASM/DSP-ASM.
I started a complete new engine from scratch with only the fileformat as shared base.
And even for the fileformat I used a converter on the PC, which improved loading times and also enabled some more aggressive optimizations of the data-structures.

A big bottleneck in my engine was the BSP traversal and culling of the node bounding boxes. Moving from an AABB tree to a minimum-fit sphere tree helped a lot with that. It cut the number of dot products needed for the culling by about 50%. Another thing that helped was removing the recursion from the BSP traversal. The alternative is to use a loop with a stack. Michael Abrash wrote an article about that technique in his "Zen of Graphics Programming" book. Dunno why they didn't use the technique in Quake, perhaps it just didn't matter on Pentium-class CPUs.
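
A minimal sketch of the loop-with-a-stack idea in C, for readers who haven't seen it. The node layout, the names, and the fixed stack depth are assumptions for illustration, not code from either engine. The near child is followed directly and only the far child is pushed, so leaves come out front to back without any function calls.

```c
#include <stddef.h>

#define BSP_MAX_DEPTH 64   /* assumed upper bound on tree depth */

typedef struct BspNode {
    int nx, ny, nz, d;            /* split plane n.p + d = 0 (unused in leaves) */
    const struct BspNode *front;  /* a leaf has both children NULL */
    const struct BspNode *back;
    int leaf_id;
} BspNode;

/* Writes leaf ids in front-to-back order as seen from (ex,ey,ez).
 * Returns the number of ids written. Subtrees deeper than
 * BSP_MAX_DEPTH are silently skipped in this sketch. */
size_t bsp_walk_front_to_back(const BspNode *root, int ex, int ey, int ez,
                              int *out, size_t out_cap)
{
    const BspNode *stack[BSP_MAX_DEPTH];
    size_t sp = 0, n = 0;
    const BspNode *node = root;

    while (node != NULL) {
        if (node->front == NULL && node->back == NULL) {
            if (n < out_cap)
                out[n++] = node->leaf_id;
            node = (sp > 0) ? stack[--sp] : NULL;
            continue;
        }
        {
            long long side = (long long)node->nx * ex + (long long)node->ny * ey
                           + (long long)node->nz * ez + node->d;
            const BspNode *near_child = (side >= 0) ? node->front : node->back;
            const BspNode *far_child  = (side >= 0) ? node->back  : node->front;

            if (far_child != NULL && sp < BSP_MAX_DEPTH)
                stack[sp++] = far_child;      /* visit later */
            node = near_child ? near_child
                              : ((sp > 0) ? stack[--sp] : NULL);
        }
    }
    return n;
}
```

Culling slots in naturally: a rejected node is simply not followed or pushed.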

Another thing I stumbled upon is the fact that there are a lot of "unnecessary" points in the polygons. Like 3 or more points along the same edge, which is caused by splitting the polygons for the BSP generation. But I couldn't get rid of them. My problem was that removing them worked fine when rendering the edges in float, but produced ugly artifacts (gaps) when rendering them in int. I guess that's caused by T-junctions. But normally everything that works fine in float should also be possible in int, if you are careful enough about the normalization of your values.

Those are the first things I remember from my work on the Quake renderer. I guess there will be more to come.

So, greetings from the Amiga-Scene,
Axis/Oxyron

Mindthreat
Captain Atari
Posts: 213
Joined: Tue Dec 16, 2014 4:39 am

Postby Mindthreat » Wed Jun 24, 2015 4:22 am

jvas wrote:FYI: I presented your demo last Friday in our retro computer club to the "amiga guys", and this was their reaction:
"Is it 16MHz? Hmmm...." and then they silently watched me walking around the level :)


Truly, it's hard for anyone to believe this is running on a stock Falcon, and anyone not impressed with it, whichever side of the fence they sit on, would be an ass lol
"My attempt at trying to create cool things for the Atari Jaguar:" - http://www.RISCGames.com

Postby dml » Wed Jun 24, 2015 1:20 pm

AxisOxy wrote:Hey Douglas,


Hi! It's always good to meet another traveler on the same road :)

Thanks for the comments and summary of some of your own experiences here...

AxisOxy wrote:And I know what I'm talking about. I did something very similar for AGA-Amigas some time ago.
I still work a bit on them from time to time.
Funnily I was also focusing on Doom and Quake 2. Perhaps they are just the sweet-spot for this kind of hardware.


I think that's probably true - I had a sort of checklist of reasons to focus on Q2 and it covered quite a lot of ground. I think most of them held firm as well (so far, anyway!).

On the one hand the original engines were a special kind of genius to begin with but they also left plenty of rich pickings for explorers ;-) Not just code optimizations but interchangeable techniques as well. We'll discuss one here (spheres!) and a couple of other things I did in my own version.

AxisOxy wrote:I'd like to share some of the things I stumbled upon during the development.
Perhaps it helps a bit on squeezing more out of your engine.


Cool! I'll offer up what I can here also.

AxisOxy wrote:Looks like I had a little bit different approach than you. If I followed it right, you based your engine on the original code and optimized it.
Like porting from float to int and porting parts from C to ASM/DSP-ASM.


That's mostly true, but the process overall was quite complicated and happened in differently-shaped stages. I'll try to summarize it a bit so it makes more sense.

There are a few reasons I wanted to start with the original code (or at least, a correctly functioning reference version, original or otherwise!). I had some quite painful experiences with the earlier Doom project because the rendering part had been created before I got anywhere near ID's source, and worked quite differently from the original. Remember all those cool tricks mapmakers were able to exploit to produce interesting effects? Yep - lots of those failed to work in my version because the techniques didn't match, the bugs didn't match and figuring out the differences was in some cases just too much trouble for me. By the end it got pretty close but still not 100%, and a lot of time was burned getting it sensible.

Doom is internally quite convoluted, compared with Q2 (which I think is more elegant/simple relative to what it outputs even if it does require careful study!) so with hindsight it probably wouldn't have been so bad. I think only the lightmap dimension estimation needs to be 100% right (a lot of people had trouble with this it seems) and the rest can be reproduced more or less independently - at least it seems manageable without lots of divergence. But I approached the Q2 source pretty cold - I had last visited Q1 source in 1997 and didn't remember much detail - and adapting the existing code was a good way to catch up again. I only remembered a few important properties about the renderer that were enough to make me think it might translate well.


Anyway I planned to start with ID's source but immediately ran into problems with RAM - both physical machines and emulation. So I had to scrap that and adopt an alternate engine as the reference (that was Alexey Goloshubin's PolyEngine). This is a much leaner, simplified framework which I could run natively without all the data and the same early demand for RAM.

Unfortunately I discovered that PolyEngine contained a number of quite serious bugs and spent nearly as much time diagnosing and fixing those as I could have spent reducing the memory & data footprint of ID's original. But that kind of thing always happens to me :) One of the bugs actually turned out to be the lightmap extents calculation being wrong, causing every 11th map to have 1 or 2 skewed lightmaps. Just enough not to notice until too late :D Other bugs were more insidious... particularly those involving clipping.... :cry: But I can't complain because it got the whole process started and was a win overall.
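
A sketch of why the extents calculation is so touchy, in C. It follows the Quake-family convention as far as I know it (one lightmap sample per 16 texels, with the face's texture-space minimum rounded down and its maximum rounded up to that grid); the integer inputs and all the names here are assumptions for illustration, since the original tools work from floats.

```c
/* Floor and ceiling division by 16 that behave correctly for negatives. */
static int floor_div16(int v) { return (v >= 0) ? v / 16 : -((-v + 15) / 16); }
static int ceil_div16(int v)  { return (v >= 0) ? (v + 15) / 16 : -((-v) / 16); }

typedef struct {
    int tex_min;   /* grid-aligned origin of the face in texture space */
    int extent;    /* grid-aligned size in texels */
    int lm_size;   /* lightmap samples along this axis */
} LmAxis;

/* min_st and max_st are the smallest and largest texture coordinates
 * of the face along one axis, already rounded to whole texels. */
LmAxis lightmap_axis(int min_st, int max_st)
{
    LmAxis a;
    int bmin = floor_div16(min_st);
    int bmax = ceil_div16(max_st);
    a.tex_min = bmin * 16;
    a.extent  = (bmax - bmin) * 16;
    a.lm_size = (a.extent >> 4) + 1;
    return a;
}
```

A face spanning 0..32 texels gets 3 samples, but 0..33 gets 4. If a rounding error nudges a coordinate across a 16-texel boundary, the computed row width no longer matches the stored data and every following row is read with the wrong stride, which shows up as a skewed lightmap.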

So between the ID source and PolyEngine, I cobbled together a lean-ish reference engine that would display maps correctly as Q2 did (i.e. without the bugs, glitches, and with an actual BSP algorithm driving the scene, which PolyEngine did not use - it used a flat list of clusters only, seemingly aimed more at hardware).

From there I started converting all of the render-side float work (and related data) into fixedpoint, and made sure the changes could be confirmed against the original at each step. I didn't include the texture transform math in this conversion, since it would need to be replaced by a different solution, so it remained as floats until later on. I did however implement the per-pixel steps using integers only, as a sub-project of the whole thing. I won't detail that here because it would take too long but it was extremely useful to have a C reference version for that!
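
The convert-then-confirm workflow can be sketched like this in C. The 16.16 format, the tolerance check, and every name here are assumptions for illustration rather than the engine's code: the float routine stays around as the reference and the fixed-point version is compared against it after each conversion step.

```c
#include <stdint.h>

typedef int32_t fx16;      /* 16.16 fixed point (assumed format) */
#define FX_ONE 65536

fx16  fx_from_float(float f) { return (fx16)(f * (float)FX_ONE); }
float fx_to_float(fx16 v)    { return (float)v / (float)FX_ONE; }

/* Widen to 64 bits so the product cannot overflow before rescaling. */
fx16 fx_mul(fx16 a, fx16 b)  { return (fx16)(((int64_t)a * b) / FX_ONE); }

/* Float reference version, kept for comparison. */
float dot3_ref(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Fixed-point replacement under test. */
fx16 dot3_fx(const fx16 a[3], const fx16 b[3])
{
    return fx_mul(a[0], b[0]) + fx_mul(a[1], b[1]) + fx_mul(a[2], b[2]);
}

/* Returns 1 when the fixed-point result is within tol of the reference. */
int dot3_matches(const float a[3], const float b[3], float tol)
{
    fx16 fa[3], fb[3];
    float diff;
    int i;
    for (i = 0; i < 3; i++) {
        fa[i] = fx_from_float(a[i]);
        fb[i] = fx_from_float(b[i]);
    }
    diff = fx_to_float(dot3_fx(fa, fb)) - dot3_ref(a, b);
    if (diff < 0.0f)
        diff = -diff;
    return diff <= tol;
}
```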

Once I was sure that the main parts (vertex pipeline, clipping, spans) were all float-free, I started to review the algorithms/functional blocks being used and began to replace these with alternative solutions. This is where it starts to look a lot more like a 'from scratch' project. I'll explain a couple of specific cases below.

Once all the functional blocks had been replaced with more Falcon-friendly methods I started to create 68k versions of those to see how they would map to the CPU cache etc. and occasionally would go back and redo the C reference if necessary to keep the two versions closely related.

When that seemed settled, I'd start on a 68k/DSP hybrid implementation and in some cases a pure-DSP implementation where it made sense (most are hybrid). In a few cases I changed the technique again and would rewrite all 3 versions (ouch) but that was rare :)

By the time all that was done, the engine had quite a different structure and methods from either ID's or PolyEngine, and I did eventually lose sync with the reference C version of the renderer (because it was becoming a lot of work to maintain it) so it is now quite far behind the Falcon implementation - has no transparency support etc. I keep it around mainly to help debug the loader when adding a new format, and will refer to it again when I start working properly on the dynamic models.

So yes - this one is not a from-cold, from-scratch engine, but OTOH there is no part of it left (the renderer) which is from any other engine - most of the final blocks are essentially from scratch (err... except for a small number of functions e.g. R_PushDLights, which I lifted from Q2 and kept pretty much as is :). The whole process was slow and incremental but has resulted in something quite different in structure from ID's.

AxisOxy wrote:I started a complete new engine from scratch with only the fileformat as shared base.
And even for the fileformat I used a converter on the PC, which improved loading times and also enabled some more aggressive optimizations of the data-structures.


Well that would have been a tough but rewarding task :)

I think I did everything I could with the loader without actually preprocessing the maps offline. I thought of a number of convert steps which could have helped but decided to see how far I could get without it. It's now at the point though where further progress would need to involve an offline step (or a lot of waiting at runtime! already the case with coloured lightmaps...).

The new loader is C++ with a specialization per format - and once loaded, the map data is split into that which is needed by the game layer (e.g. collision stuff) and that which is needed by the renderer, since optimizing for each is different. So there is already separation between what's in the BSP file and what is in memory. This could be a startpoint for offline preprocessing later on.

AxisOxy wrote:A big bottleneck in my engine was the BSP traversal and culling of the node bounding boxes. Moving from an AABB tree to a minimum-fit sphere tree helped a lot with that. It cut the number of dot products needed for the culling by about 50%. Another thing that helped was removing the recursion from the BSP traversal. The alternative is to use a loop with a stack. Michael Abrash wrote an article about that technique in his "Zen of Graphics Programming" book. Dunno why they didn't use the technique in Quake, perhaps it just didn't matter on Pentium-class CPUs.


You're dead right about the BSP as a bottleneck - I spent quite a lot of time getting it to work sensibly.

I mentioned a few posts back that AABBs could perhaps be dumped for spheres but wasn't sure about the change in culling effectiveness on typical node shapes. Hadn't done the experiment. Sounds like it really has been a win for you though - I will have to try it now :)

(I did try it in Doom but the gain was limited because AABBs were more often optimal in 2D for those maps)

The Falcon's DSP is very good at dotproducts so that part isn't too much of an issue. Dynamically ordering the AABB min/max points however is very painful and it also involves (slowly) transmitting 6 words per node box, (vs 4 per sphere), which makes a mockery of DSP-side performance. I even tried some nasty SMC on the DSP to do the reordering but it just moved cost around and didn't help overall.

Removing the BSP recursion - I did exactly the same thing. It uses a data stack with one child on the stack and the other always in a register. However I had to do a number of other things with the BSP algorithm to get it to cache properly and still cover all the necessary aspects.

- The original BSP algorithm implements multiple duties at once. All mixed up: Depth-ordering (priority keys), backface culling, PVS intersections and drawing related work. I split these duties up into separate blocks of work, each narrowing the results of the previous (and outputting a finer grain than the previous). This allowed each block to fit in the tiny i-cache. The priority keys are essential later when trying to draw models without a zbuffer :-/ so I needed this side of things not to get lost.
- The DSP is better at dotproducts and math than the CPU, but can't access main RAM, so I brewed an algorithm which had them working synchronously, each with their own mirror of the BSP stack. This allows each chip to predict to some extent what the other will do next, in order to limit stalls.
- The original algorithm implements backface removal by testing the 'side' flag of each face, in relation to the current node 'side' being visited. This involves looking up faces and testing words etc. per face which I didn't like (not to mention the drawing part) so I rewired the node face lists into pairs of winding pools and index the correct winding pool with the active node side. This gets all the visible faces at once without any conditions (admittedly not many at once - but still helped quite a bit).
- The 68030 can cache aligned longword writes, so I fiddle the PMMU to make nodes non-cacheable while the stack remains cacheable, so the pushes get buffered and the pops get read from the d-cache. This offers a small speed advantage.
- I ended up with 3 different in-memory representations for planes. One float (really for the un-converted collision code), one storing 15bit normals for the BSP algorithm, and one storing 23bit normals, for precise texture transforms. A bit wasteful but was necessary for speed.
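
The winding-pool item in the list above can be sketched roughly like this in C. The struct layout and names are guesses for illustration only. Each node keeps two pre-sorted face lists, one per side of its split plane, and the side the eye is on indexes straight into the right list, so there is no per-face flag test.

```c
#include <stddef.h>

typedef struct {
    const unsigned short *face_ids;   /* indices into a global face array */
    size_t count;
} FacePool;

typedef struct {
    FacePool pool[2];   /* [0]: faces seen from the front side, [1]: from the back */
} NodeFaces;

/* plane_dist is the signed distance of the eye from the node's split
 * plane; the comparison result (0 or 1) selects the pool directly. */
const FacePool *faces_facing_eye(const NodeFaces *node, long plane_dist)
{
    return &node->pool[plane_dist < 0];
}
```

The plane distance is already computed during traversal to pick the near child, so the pool lookup adds no extra math.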

AxisOxy wrote:Another thing I stumbled upon is the fact that there are a lot of "unnecessary" points in the polygons. Like 3 or more points along the same edge, which is caused by splitting the polygons for the BSP generation. But I couldn't get rid of them. My problem was that removing them worked fine when rendering the edges in float, but produced ugly artifacts (gaps) when rendering them in int. I guess that's caused by T-junctions. But normally everything that works fine in float should also be possible in int, if you are careful enough about the normalization of your values.


I was aware of the extra colinear points - but didn't try to remove them for that very reason. In fact I've been juggling with the idea of introducing extra points to seal existing t-junctions. I still have problems with texture sparkles and most are caused by texture addressing imprecision (because the pixel colours are wild) - but some structures when viewed at great distances appear a bit like sky colour and might be t-juncts opening up. So we'll see what happens there....

I suppose having 3 options isn't bad either - 1) optimize colinear points away, 2) leave it as is, 3) close existing t-juncts with new points. Which one suits will depend on the use case!
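
For option 1, detecting the redundant points is the easy part. A minimal sketch in C, assuming integer vertex coordinates and hypothetical names: a point is dropped when the cross product of the edges to its two neighbours is exactly zero. As noted above, actually removing such points can open gaps where a neighbouring polygon still has a vertex there, so treat this as finding candidates only.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { int32_t x, y, z; } Vec3i;

/* Exact test: b is colinear with a and c when (b-a) x (c-a) is zero. */
int is_colinear(Vec3i a, Vec3i b, Vec3i c)
{
    int64_t ux = (int64_t)b.x - a.x, uy = (int64_t)b.y - a.y, uz = (int64_t)b.z - a.z;
    int64_t vx = (int64_t)c.x - a.x, vy = (int64_t)c.y - a.y, vz = (int64_t)c.z - a.z;
    return (uy * vz - uz * vy) == 0
        && (uz * vx - ux * vz) == 0
        && (ux * vy - uy * vx) == 0;
}

/* Copies the polygon loop in[0..n-1] to out, skipping every point that
 * lies on the line through its two original neighbours.
 * Returns the number of points kept. */
size_t strip_colinear(const Vec3i *in, size_t n, Vec3i *out)
{
    size_t i, kept = 0;
    for (i = 0; i < n; i++) {
        Vec3i prev = in[(i + n - 1) % n];
        Vec3i next = in[(i + 1) % n];
        if (!is_colinear(prev, in[i], next))
            out[kept++] = in[i];
    }
    return kept;
}
```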



AxisOxy wrote:Those are the first things I remember from my work on the Quake renderer. I guess there will be more to come.
So, greetings from the Amiga-Scene,
Axis/Oxyron


Greetings straight back from the Atari-Scene. Thanks again for chipping in - I'm definitely thinking seriously about the bounding spheres again and the whole offline thing :)

calimero
Fuji Shaped Bastard
Posts: 2310
Joined: Thu Sep 15, 2005 10:01 am
Location: STara Pazova, Serbia

Postby calimero » Wed Jun 24, 2015 3:47 pm

just to say hi to Axis!
- C64 demos that you worked on are AMAZING! Brilliant and mind blowing code :)

btw I did not know that you code on Amiga too...
using Atari since 1986. http://wet.atari.org http://milan.kovac.cc/atari/software/ ・ Atari Falcon030/CT63/SV ・ Atari STe ・ Atari Mega4/MegaFile30/SM124 ・ Amiga 1200/PPC ・ Amiga 500 ・ C64 ・ ZX Spectrum ・ RPi ・ MagiC! ・ MiNT 1.18 ・ OS X

Postby dml » Wed Jun 24, 2015 4:18 pm

calimero wrote:- C64 demos that you worked on are AMAZING! Brilliant and mind blowing code :)


^^^ what he said :) those C64 demos = crazy-impressive coding indeed :)

