Bad Mood : Falcon030 'Doom'
So further updates/news on the Bad Mood project are being moved over here, from the previous thread:
http://www.atari-forum.com/viewtopic.ph ... 19#p225019
For now Bad Mood is still a development project, so the 680x0 forum seems to be the place for it, at least until it turns into something playable.
The current aim is to speed up the rendering (3D scene view) as much as possible, before trying to bolt it onto the Doom engine itself (which is a bit of an unknown performance-wise, so for now I'm going to assume it will run fast enough @16MHz with the graphics layer stripped away and some minor mods!).
I have been doing experiments in the background as time permits and have reached a few conclusions on rendering performance. I expect to have some (good) news soon.
d:m:l
Home: http://www.leonik.net/dml/sec_atari.py
AGT project https://bitbucket.org/d_m_l/agtools
BadMooD: https://bitbucket.org/d_m_l/badmood
Quake II p/l: http://www.youtube.com/playlist?list=PL ... 5nMm10m0UM
Re: Bad Mood : Falcon030 'Doom'
Here are some notes on experiments I did today with textured rendering of floors and ceilings (a.k.a. 'visplanes') in the Bad Mood engine.
Visplane rendering incurs a high cost in the Bad Mood engine - the highest single cost for any typical view (except transparent walls, which need to be redone completely and are best ignored for now, and turbulent lava stuff, which I'll deal with another time).
Visplane rendering is expensive for a few reasons:-
- the visplane textures rotate, 'roto-zoomer' style (while the wall textures do not), so the texture addressing unit is much more complicated than for walls.
- more complicated texture addressing means more cycles per pixel, more ops per pixel, and poor loop unrolling in the CPU cache
- there are often more 'spans' making up visplane surfaces than there are wall 'columns', each one needing some CPU setup before drawing (so lots of short spans == bad)
- the setup for each span involves several exchanges between CPU/DSP to acquire du/dv affine gradients and get them formatted for drawing, again adding bus accesses and pressuring the CPU cache on each new span
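To make the addressing difference concrete, here is a minimal C sketch of the two inner loops. It is an illustration only, not the engine's actual code: the 16.16 fixed-point format, the 64x64 texture layout and all function names are assumptions.

```c
#include <stdint.h>
#include <assert.h>

#define TEX_SIZE 64                 /* assumed 64x64 tile */
#define TEX_MASK (TEX_SIZE - 1)

/* Rotated floor span: u and v both step on every pixel (16.16 fixed
   point), so each texel needs a two-component address. */
void draw_floor_span(uint8_t *dst, int count, const uint8_t *tex,
                     uint32_t u, uint32_t v, uint32_t du, uint32_t dv)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        uint32_t tx = (u >> 16) & TEX_MASK;
        uint32_t ty = (v >> 16) & TEX_MASK;
        dst[i] = tex[ty * TEX_SIZE + tx];
        u += du;                    /* unsigned wrap-around tiles the texture */
        v += dv;
    }
}

/* Wall column: only v steps, and the texels of one texture column are
   read in memory order, which is much kinder to the data cache. */
void draw_wall_column(uint8_t *dst, int dst_stride, int count,
                      const uint8_t *tex_column, uint32_t v, uint32_t dv)
{
    assert(count >= 0);
    for (int i = 0; i < count; i++) {
        dst[i * dst_stride] = tex_column[(v >> 16) & TEX_MASK];
        v += dv;
    }
}
```

The floor loop does roughly twice the address arithmetic per pixel, and its reads jump around the texture whenever the view is rotated.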
The best demoscene tip I had received so far (from Mikro) was to implement some kind of DSP-based texture unit, using the DSP host port as a sort of 'texel server'. I had used this in the past for Gouraud shading tricks but hadn't tried actually putting a whole texture on the DSP (!). I was also concerned that the host port is 3 separate byte-wide ports - so reading words from it probably costs twice as much as reading bytes, which might even the score a bit with normal CPU texturing. However, that may not be the case in the Falcon (I will measure all this stuff separately in 'Nimbench' at some point). It could well be mapped as words on the host side, and bytes on the DSP side at 2x the clock rate? Will come back to it!
Anyway, I ran 2 new experiments using the DSP as a texture unit, to compare against the original method from the last release (listed first as the baseline), using RGB as the reference display mode.
1) original 'CPU-only' based visplane texturing
2) DSP+CPU based texel server, feeding 8bit texels to CPU, which then 'lights' and draws the pixel
3) DSP based texel server, feeding 16bit pixels to CPU, which just get drawn directly (no lighting)
I used the built-in sampling profiler to measure the relative cost of each method, and results are below. Note that the profiler sucks CPU time so the indicated FPS can be as much as 1fps lower than it is in a normal build.
So any kind of DSP-based texture unit is faster than the current CPU-only one. That's useful.
The hybrid DSP+CPU version has the advantage that it can be implemented relatively easily and lighting still works as before. However the DSP-only solution is the fastest of all, and that's a bit of a dilemma...
Making the DSP-only method work with dynamic lighting is *hard* - the lighting is currently done with 64-level fog tables and there isn't space on the DSP for that. It's also too expensive to do the arithmetic explicitly per texel, so that's not an option. One way that might work is to reindex the textures to use no more than 32 or 64 colours each (from the 256c palette) - most of the textures don't vary much in colour, so this could work. It would allow the lighting/fog table to fit in the DSP, assuming it is re-uploaded for each new texture.
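The reindexing idea can be sketched in C as below. This is a hypothetical helper, not code from the engine: it remaps a texture to compact local colour indices and reports how many colours it really uses, so the per-texture light table size can be checked. A full 256-colour table at 64 light levels needs 256 x 64 = 16384 entries, while a 64-colour texture needs only 64 x 64 = 4096.

```c
#include <stdint.h>
#include <assert.h>

#define LIGHT_LEVELS 64   /* 64-level fog tables, as described in the post */

/* Rewrites texels in place to compact local indices 0..n-1.
   local_to_global[i] receives the original palette index of local colour i.
   Returns n, the number of distinct colours the texture uses. */
int reindex_texture(uint8_t *texels, int count, uint8_t local_to_global[256])
{
    int global_to_local[256];
    int used = 0;

    assert(count >= 0);
    for (int i = 0; i < 256; i++)
        global_to_local[i] = -1;          /* -1: colour not seen yet */

    for (int i = 0; i < count; i++) {
        int g = texels[i];
        if (global_to_local[g] < 0) {
            global_to_local[g] = used;
            local_to_global[used] = (uint8_t)g;
            used++;
        }
        texels[i] = (uint8_t)global_to_local[g];
    }
    return used;
}

/* Entries needed for a light table covering only this texture's colours. */
int light_table_entries(int colours)
{
    return colours * LIGHT_LEVELS;
}
```

A texture that reports more than 64 colours would need colour reduction first, which this sketch does not attempt.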
Other possibilities involve mipmaps and encoding fog tables directly into the mipmaps, but that doesn't help with other dynamic lighting effects, and the whole thing becomes quite complex to manage and to change in future.
So this is current status - there's a considerable speedup on the table - pretty much guaranteed at this point, but the biggest speedup would involve quite a lot of work.
I'm going to spend a bit longer thinking about it all before writing any new code in case there is some way to simplify it further.
Results (sampling profiler, RGB mode):
Method:                      VP render time:   Wall render time:   Framerate:
------------------------------------------------------------------------------
CPU only:                    99.7ms            41.0ms              5.02fps
DSP+CPU (8bit, indexed):     56.6ms            41.4ms              6.42fps
DSP (16bit, non-indexed):    42.6ms            41.5ms              7.07fps
 ^ with reduced exchanges:   41.5ms            41.4ms              7.14fps
 ^ with profiler disabled:   n/a               n/a                 8.04fps
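For reading the table, the framerates convert to frame times and relative gains with two one-line formulas. The helper names below are made up for illustration.

```c
#include <assert.h>

/* Frame time in milliseconds for a given framerate. */
double frame_ms(double fps)
{
    assert(fps > 0.0);
    return 1000.0 / fps;
}

/* Relative framerate gain in percent when going from old_fps to new_fps. */
double speedup_percent(double old_fps, double new_fps)
{
    assert(old_fps > 0.0);
    return (new_fps / old_fps - 1.0) * 100.0;
}
```

With the profiled figures, going from 5.02fps to 7.14fps is a gain of about 42%. The 8.04fps figure was taken with the profiler disabled, so it is not directly comparable with the profiled 5.02fps baseline.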


d:m:l
Re: Bad Mood : Falcon030 'Doom'
The table in my last post has been updated with an optimized version, which removes the redundant CPU/DSP exchanges needed for CPU-based texturing on each new span of pixels, and a version with the sampling profiler disabled (with timings missing, obviously).
d:m:l
- calimero
Re: Bad Mood : Falcon030 'Doom'
Wow! It is quite an achievement!
From 5.02 to 7.14 - that is a gain of over 40%!
Is there any sense in rendering the walls the same way (using the DSP)?
using Atari since 1986. ・ http://wet.atari.org ・ http://milan.kovac.cc/atari/software/ ・ Atari Falcon030/CT63/SV ・ Atari STe ・ Atari Mega4/MegaFile30/SM124 ・ Amiga 1200/PPC ・ Amiga 500 ・ C64 ・ ZX Spectrum ・ RPi ・ MagiC! ・ MiNT 1.18 ・ OS X
Re: Bad Mood : Falcon030 'Doom'
calimero wrote: Is there any sense in rendering the walls the same way (using the DSP)?
There is, to a lesser degree, yes, but I would quickly have problems with it - the DSP is already busy at that time generating visplanes. So a change like that would need a lot of rework - could be painful. But the walls can be sped up, yes - even without the DSP. There are some details about the wall rendering which make it suitable for other kinds of speedup.
I did a quick test already with 1-pix textures and the walls were much faster - so they are very sensitive to texel reuse (and probably texel skipping) through the CPU data cache. This is because wall columns are not rotated, and the textures are oriented such that texels are always scanned in memory order (unlike floors).
I think the minimum that can be done is to use MipMaps and always select a Mip which has a >= 1:1 pixel/texel ratio, so there is always some pixel stretching. The Mip selection can be adjusted easily with a config value, so the degree of stretching trades performance against detail.
Even without any stretching, MipMaps would eliminate texel skipping - which the data cache hates, and this is probably a major cost in the wall drawing just now since a lot of the drawn surface area of walls is scaling the textures down, not up (notice the framerate can rise quite a lot if you walk right up to a blank wall).
I also have some 68030 PMMU trickery to try on the display memory to prevent accidental cache pollution through writing, and some other stuff I haven't mentioned yet - such as selective cache control for du:dv texel rates. Doom has a nice feature that every texel is exactly the same 'size' in the world at a given distance - there is no texture scaling anywhere. So you know exactly how big each texel is at every given scene depth. This means you can tell when texel stretching or skipping is going to occur for each new line of pixels and you can plan ahead a little.
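Because texel size depends only on depth, a mip level can be picked per column or per row before drawing starts. The C sketch below assumes one texel covers one world unit at mip level 0, that `focal` is the projection distance in pixels, and that the surface faces the viewer, so the texel step per pixel at level 0 is depth / focal. The function name and the `bias` config value are hypothetical.

```c
#include <stdint.h>
#include <assert.h>

/* Picks the lowest mip level whose texel step per pixel is <= 1
   (a pixel/texel ratio of at least 1:1), then applies a config bias.
   Each mip level halves the texel step, so level L is fine as soon as
   depth <= focal * 2^L. */
int select_mip(uint32_t depth, uint32_t focal, int bias, int max_level)
{
    int level = 0;

    assert(focal > 0 && max_level >= 0 && max_level < 16);
    while (level < max_level &&
           (uint64_t)depth > ((uint64_t)focal << level))
        level++;

    level += bias;                 /* positive bias: more stretching, less detail */
    if (level < 0) level = 0;
    if (level > max_level) level = max_level;
    return level;
}
```

With `focal` at 160, anything up to depth 160 uses the full-size texture, depths 161 to 320 use the half-size one, and so on. No texels are ever skipped, only repeated.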
So, I'm not done with the performance thing yet - not for a while.

I'm currently working out the encoding of a texture format for the DSP which will look at least as good as, or better than, the current version does. I figure the texture and its light/fog tables can be interleaved into the same 4k of memory for a single 64x64 floor tile, which will make a 'DSP only' visplane texture route viable for Bad Mood (in its basic form it won't allow any lighting or depth-cue). This will take time because I need to reformat the textures into a 2nd form, possibly cached on disk, and write some new DSP code etc. etc. So there may not be any interesting updates until I get that part working.
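One way such an interleave could look, purely as a guess at a workable layout and not the format actually being designed: the DSP56001 uses 24-bit words, so each of the 4096 words for a 64x64 tile can hold an 8-bit local colour index in its low byte and one 16-bit light table entry in its upper 16 bits. If the texture is reindexed to 64 colours, a 64-level table has 64 x 64 = 4096 entries, exactly one per word. The C sketch below models the 24-bit words in `uint32_t`; all names are hypothetical.

```c
#include <stdint.h>
#include <assert.h>

#define TILE_TEXELS   4096   /* 64x64 floor tile */
#define LOCAL_COLOURS 64     /* assumed: texture reindexed to <= 64 colours */
#define LIGHT_LEVELS  64     /* 64-level fog table */

/* One 24-bit word: 16-bit lit pixel in bits 8..23, colour index in bits 0..7. */
static uint32_t pack_word(uint8_t colour, uint16_t lit)
{
    return ((uint32_t)lit << 8) | colour;
}

/* light_table holds LIGHT_LEVELS * LOCAL_COLOURS entries, light level major. */
void interleave_tile(uint32_t *words, const uint8_t *texels,
                     const uint16_t *light_table)
{
    for (int i = 0; i < TILE_TEXELS; i++) {
        assert(texels[i] < LOCAL_COLOURS);
        words[i] = pack_word(texels[i], light_table[i]);
    }
}

/* Fetch a texel, then look up its lit 16-bit pixel in the same block. */
uint16_t lit_pixel(const uint32_t *words, int texel_pos, int light)
{
    assert(texel_pos >= 0 && texel_pos < TILE_TEXELS);
    assert(light >= 0 && light < LIGHT_LEVELS);
    uint32_t colour = words[texel_pos] & 0xFF;
    return (uint16_t)(words[light * LOCAL_COLOURS + colour] >> 8);
}
```

The cost is two memory reads per texel on the DSP side instead of one, which is why this is only a starting point for comparison.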
BTW on the subject of texture formats - another thing which can be done with Bad Mood is to adopt truecolour textures. It does rely on indexed textures but they don't all have to share the same 256 colour palette. Each texture can have its own palette (or palette group), effectively. So at some point we could rework (or lift, from the Jag version?) truecolour versions of the Doom texture set and use those instead of the old PC ones...
d:m:l
Re: Bad Mood : Falcon030 'Doom'
How do you do the floor and ceiling mapping? About the time when DOOM and Duke 3D were hot I started to make a DOOM engine myself on PC. I didn't finish it, but the technique I used was different from what they used in DOOM. I didn't use a BSP tree but sectors and portals instead. To draw the walls I used a perspective mapper, but with one perspective division per line instead of per pixel, since you can't look up and down. This is probably what you already do too.
For the floor and ceiling I used a floor mapper routine that works in about the same way as mode 7 on SNES. It's very simple and you can put all divisions and multiplications in look up tables so that you don't need any during runtime. I found one explanation here: http://gamedev.stackexchange.com/questi ... -in-pygame
As you can see, the routine is very simple and can easily map the entire screen. Instead of having the y and x FOR loops you just do a normal polygon render and use this as the horizontal line routine. One test I did once was to build the entire 3D engine from just the floor mapper. I had different tiles for floors and walls, so when the floor mapper found a wall tile, I didn't draw the floor, but instead a wall strip like in Wolfenstein. The end result is a Wolfenstein engine with floors and ceilings that is quite fast. It would have been interesting to try this on Atari to see how fast it would be. Maybe even an ST could handle it. You don't need a raycast engine, and as long as you can draw 320x100 pixels (or maybe 160x50 with c2p) with a decent framerate, this could work. The ceiling could be a copy of the floor with a different palette to speed things up.
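A floor mapper of this kind can be sketched in C as follows. This is a generic mode 7 style setup written for illustration, not the code from the linked explanation or from either engine; the screen size, the handedness of the `right` vector and all names are assumptions. All divisions happen once when the table is built, so per-row setup needs only multiplies and adds.

```c
#include <assert.h>

#define SCREEN_W 320
#define SCREEN_H 200
#define HORIZON  (SCREEN_H / 2)

typedef struct { float dist, step; } RowLut;       /* per-row lookup entry */
typedef struct { float u, v, du, dv; } RowSetup;   /* start and gradient   */

static RowLut row_lut[SCREEN_H];

/* Build once: distance to the floor for each row below the horizon, and
   the world-space size of one pixel at that distance. */
void build_row_lut(float cam_height, float focal)
{
    for (int y = 0; y < SCREEN_H; y++) {
        int dy = y - HORIZON;
        row_lut[y].dist = (dy > 0) ? cam_height * focal / (float)dy : 0.0f;
        row_lut[y].step = (dy > 0) ? cam_height / (float)dy : 0.0f;
    }
}

/* Per row: texture coordinate at the left screen edge plus the du/dv
   gradient. (dir_x, dir_y) is the unit view direction. */
RowSetup setup_floor_row(int y, float cam_x, float cam_y,
                         float dir_x, float dir_y)
{
    assert(y > HORIZON && y < SCREEN_H);
    float d = row_lut[y].dist;
    float s = row_lut[y].step;
    float right_x = dir_y;            /* assumed handedness of 'right' */
    float right_y = -dir_x;
    RowSetup r;
    r.du = right_x * s;
    r.dv = right_y * s;
    r.u = cam_x + dir_x * d - r.du * (SCREEN_W / 2);
    r.v = cam_y + dir_y * d - r.dv * (SCREEN_W / 2);
    return r;
}
```

The returned `u, v, du, dv` feed straight into an affine span loop, one span per screen row.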
ST / STFM / STE / Mega STE / Falcon / TT030 / Portfolio / 2600 / 7800 / Jaguar / 600xl / 130xe
Re: Bad Mood : Falcon030 'Doom'
Zamuel_a wrote: How do you do the floor and ceiling mapping? For the floor and ceiling I used a floor mapper routine that works in about the same way as mode 7 on SNES. It's very simple and you can put all divisions and multiplications in look up tables so that you don't need any during runtime. I found one explanation here:
Well, it's very similar, except I don't need a lookup table for the DSP version - there isn't much space for it and the perspective calc is absorbed in parallel with drawing time. For a CPU-only solution I would use the LUT of course.
Zamuel_a wrote: About the time when DOOM and Duke 3D were hot I started to make a DOOM engine myself on PC. I didn't finish it, but the technic I used was different from what they used in DOOM. I didn't use a BSP tree but sectors and portals instead. To draw the walls I used a perspective mapper but with one perspective division per line instead of per pixel since you can't look up and down. This is probibly what you already do to.
Yes, I was quite into portals at the time and made an engine with that too for Atari - but didn't finish it. I did use them in a commercial game project, and wrote a Maya plugin to build the 'portal maps' automatically out of CSG primitives.
Zamuel_a wrote: I had different tiles for floors and walls, so when the floor mapper found a wall tile, I didn't draw the floor, but instead a wall strip like in Wolfenstein. The end result is a wolfenstein engine with floors and ceilings that are quite fast. Had been interessting to try this on Atari to see how fast it would be. Maybe even an ST could handle it. You don't need a raycast engine and aslong you can draw 320x100 pixels (or maybe 160x50 with c2p) with a decent framerate, then this could work. The ceiling could be a copy of the floor with a different palette to speed things up.
Yes there's no need for 'raycasting' as such. Bad Mood doesn't really raycast either - it's all affine mapping with a perspective calculation per row or column. The perspective calc cost is just hidden in a different way.
BTW you should try your project on Atari and see how well your technique works.

d:m:l
- Eero Tamminen
Re: Bad Mood : Falcon030 'Doom'
dml wrote: I also have some 68030 PMMU trickery to try on the display memory to prevent accidental cache pollution through writing,
Note that although the Hatari WinUAE CPU core has preliminary MMU emulation and some simpler things work with it, it is still missing some bits of full MMU (exception) emulation, and the CPU cycle information in the Hatari profiler will be (even more) bogus for that variant of the WinUAE CPU core.
Re: Bad Mood : Falcon030 'Doom'
Eero Tamminen wrote: Note that although Hatari WinUAE CPU core has preliminary MMU emulation and some simpler things work with it, is still missing some bits from full MMU (exception) emulation and CPU cycles information in Hatari profiler will be (even more) bogus for that variant of the WinUAE CPU core.
I won't be using Hatari for the MMU tests - it's a very specific optimization and definitely best done on real kit. It's also not clear if it will have any effect with write-allocation disabled (the default case on Falcon IIRC - although I will be trying to combine the two in some parts of the BM code).
d:m:l
Re: Bad Mood : Falcon030 'Doom'
dml wrote: Yes there's no need for 'raycasting' as such. Bad Mood doesn't really raycast either
I remember when Doom came out and I tried to find information about how it was done, everyone said that it was a raycast engine, similar to Wolfenstein, but more advanced. I found one "good" text that tried to explain it, and they said that in Doom every line was raycasted instead of every block as in Wolfenstein. I haven't looked into the Doom source code, but I can't think that it is a raycasting engine; more likely it is a "real" 3D engine, except that they don't calculate everything in all axes. Doing a pixel-precise raycaster would be very slow, I think.
Re: Bad Mood : Falcon030 'Doom'
Meanwhile, on other retro platforms, developments move apace...
No pressure Doug!
http://www.youtube.com/watch?v=Y7h3H-_8N_o
"Where teh feck is teh Hash key on this Mac?!"
- DarkLord
Re: Bad Mood : Falcon030 'Doom'
CiH wrote: Meanwhile, on other retro platforms, developments move apace... No pressure Doug! http://www.youtube.com/watch?v=Y7h3H-_8N_o
Ewwww......

Welcome To DarkForce! http://www.darkforce.org "The Fuji Lives.!"
Atari SW/HW based BBS - Telnet:darkforce-bbs.dyndns.org 1040
Re: Bad Mood : Falcon030 'Doom'
CiH wrote: Meanwhile, on other retro platforms, developments move apace... No pressure Doug!
Hahaha.

d:m:l
Re: Bad Mood : Falcon030 'Doom'
Zamuel_a wrote: I found one "good" text that tried to explain it and they said that in doom every line was raycasted instead of every block in wolfenstein. I haven't looked into the Doom source code, but I can't think that it is a raycasting engine, but instead a "real" 3d engine, except that they don't calculate everything in all axis. Doing a pixel precise raycaster would be very slow I think.
It's a 2D raycasting engine, as I understand it.
For each vertical pixel row, it fires a single ray out from the eye, colliding it very quickly with the world using the 2D BSP tree. At each point where it crosses a plane (which is basically a height change) it can find the location in 3D space, calculate how much new visible floor / wall / ceiling it defines and rasterise the visible space. Very clever technique - it very much decouples the cost of rendering the world from its size; the span generation is pretty much proportional to the number of rays and not much else. (Hence why Doom scales nearly linearly with adjusting the vertical resolution.) Every pixel is written exactly once and no more, so no messing about with complex Z algorithms.
The only drawback is you're restricted to a 2D world. You can't rotate in the roll or pitch axes; looking up or down causes perspective distortion (hence why Doom doesn't even do it).
Re: Bad Mood : Falcon030 'Doom'
Firing a ray for each row would be very slow. Wolfenstein 3D is fast since you know the world is made of 64x64 pixel blocks, so you can more or less jump 64 pixels and don't have to shoot the ray for each potential pixel. If you do a "real" raycaster for each pixel, then even a Wolf 3D engine would be terribly slow on a Pentium 2 computer (I tried this once). In Doom you don't know where a wall is, since the world is not made up from blocks like in Wolf 3D, so you don't know where to "shoot", but would have to test each pixel and that would be very slow, so I can't believe they do that.
Re: Bad Mood : Falcon030 'Doom'
The way Doom maps are processed allows them to be drawn implicitly without casting rays into the BSP, or sorting polygons or z-buffering etc. It is pretty clever. It certainly could be raycasted because it provides everything that would be needed to do that - but it's cheaper not to. Raycasting is used in the engine but for other stuff - for indexing textures, collision detection and AI interactions.
The engine walks down the BSP tree in view-dependent order (near node first), dealing with each convex 'ssector' (sub-sector or physical node) one at a time, front to back, with only front-facing 'ssegs' (sub-segments) from each ssector drawn as walls, effectively as complete polygons made up of individual columns. However each column is first clipped against an occlusion buffer for the entire image, and itself updates the occlusion buffer. In this way no pixel is drawn twice, and the engine knows when to stop drawing (all pixels occluded) and even when to stop walking the BSP. It tracks scene coverage to the pixel, very cheaply.
This does impose a limit though of one 'window' per sector join - each column has simple miny/maxy occlusion tracking, and can only 'close off' the scene from the top/bottom of the image inwards. You won't ever see one window above another (unless it's faked using a transparent wall texture with holes in the texture)...
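The per-column miny/maxy tracking described above might look like the C sketch below. It follows the description in the post rather than either engine's source; the struct layout, view size and function names are assumptions.

```c
#include <assert.h>

#define VIEW_W 320
#define VIEW_H 200

typedef struct {
    short top[VIEW_W];      /* first still-open row of each column */
    short bottom[VIEW_W];   /* last still-open row; closed when top > bottom */
    int open_columns;       /* columns that can still receive pixels */
} Occlusion;

void occlusion_reset(Occlusion *o)
{
    for (int x = 0; x < VIEW_W; x++) { o->top[x] = 0; o->bottom[x] = VIEW_H - 1; }
    o->open_columns = VIEW_W;
}

static int is_open(const Occlusion *o, int x) { return o->top[x] <= o->bottom[x]; }

/* Clips [*y0, *y1] to the open window of column x; returns 1 if any rows remain. */
static int clip_to_window(const Occlusion *o, int x, int *y0, int *y1)
{
    if (*y0 < o->top[x]) *y0 = o->top[x];
    if (*y1 > o->bottom[x]) *y1 = o->bottom[x];
    return *y0 <= *y1;
}

/* Solid wall: draw the clipped rows, then close the column completely. */
int occlusion_solid(Occlusion *o, int x, int *y0, int *y1)
{
    assert(x >= 0 && x < VIEW_W);
    if (!is_open(o, x)) return 0;
    int visible = clip_to_window(o, x, y0, y1);
    o->top[x] = 1; o->bottom[x] = 0;
    o->open_columns--;
    return visible;
}

/* Upper wall piece: closes the window from the top down to *y1. */
int occlusion_upper(Occlusion *o, int x, int *y0, int *y1)
{
    assert(x >= 0 && x < VIEW_W);
    if (!is_open(o, x) || !clip_to_window(o, x, y0, y1)) return 0;
    o->top[x] = (short)(*y1 + 1);
    if (!is_open(o, x)) o->open_columns--;
    return 1;
}

/* Lower wall piece: closes the window from the bottom up to *y0. */
int occlusion_lower(Occlusion *o, int x, int *y0, int *y1)
{
    assert(x >= 0 && x < VIEW_W);
    if (!is_open(o, x) || !clip_to_window(o, x, y0, y1)) return 0;
    o->bottom[x] = (short)(*y0 - 1);
    if (!is_open(o, x)) o->open_columns--;
    return 1;
}

/* The BSP walk can stop as soon as this returns 1. */
int occlusion_done(const Occlusion *o) { return o->open_columns == 0; }
```

Since each column holds a single open interval, the window can only shrink inwards from the top and bottom, which is exactly the 'one window per sector join' limit.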
I expect Doom and BadMood don't process upper/lower wall segments the same way - BadMood treats them as separate 'sub walls', added in sequence to the occlusion buffer. Doom might be processing them both at once as a kind of 'inverse wall' where the window between ssectors fills the occlusion buffer for upper/lower walls together. I'll probably have to check that sometime.
The floors (visplanes) are just area fills (actually gaps) between wall top/bottoms, but they have to be scan converted into the opposite axis before rasterization. There are no explicit floor/ceiling primitives involved - just a clever way of tracking what's left after the walls are generated. The visplane conversion is messy compared with everything else for a few reasons, but the method does work and is cheaper than using primitives.
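The axis flip can be shown with a deliberately naive C sketch: given the vertical gap left in each column, it emits the horizontal runs that a span renderer would draw. This version scans every row against every column, which is far too slow for real use (a practical converter tracks span starts incrementally as columns are added), and the names are hypothetical.

```c
#include <assert.h>

typedef struct { short y, x0, x1; } Span;   /* one horizontal run, inclusive */

/* top[x]..bottom[x] is the gap left in column x (empty when top > bottom).
   Writes up to max_spans spans to out and returns the total number found. */
int columns_to_spans(const short *top, const short *bottom,
                     int width, int height, Span *out, int max_spans)
{
    int count = 0;

    assert(width >= 0 && height >= 0);
    for (int y = 0; y < height; y++) {
        int start = -1;                        /* -1: not inside a run */
        for (int x = 0; x <= width; x++) {     /* x == width flushes the last run */
            int inside = (x < width) && top[x] <= y && y <= bottom[x];
            if (inside && start < 0) {
                start = x;
            } else if (!inside && start >= 0) {
                if (count < max_spans) {
                    out[count].y = (short)y;
                    out[count].x0 = (short)start;
                    out[count].x1 = (short)(x - 1);
                }
                count++;
                start = -1;
            }
        }
    }
    return count;
}
```

An irregular gap shape quickly produces many short spans, each needing its own setup, which is the cost discussed earlier in the thread.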
So the BSP divides and sorts the scene into front-back order, allowing the occlusion buffer to work properly - that's its main duty. It's also used for casting rays by the game engine to find obstacles between entities - line of sight etc. But these tests are relatively rare compared with rendering occlusion tests. The BSP walk does depend on good visibility testing for each ssector, to avoid wasted work processing ssegs which are out of view. I don't remember what the Doom engine does for this, but I use 3 methods at once: 1) fast octant cull, 2) viewcone cull of node bounding box, 3) occlusion buffer test using 2D poly representing node bounding box.
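The near-node-first walk itself is short. The C sketch below shows only the ordering; the node layout and the side convention are assumptions, and the three culling tests are reduced to a comment because they depend on engine data not shown in the thread.

```c
#include <assert.h>

typedef struct BspNode {
    int x, y, dx, dy;            /* partition line: a point and a direction */
    struct BspNode *child[2];    /* subtrees on either side of the line */
    int ssector;                 /* >= 0 marks a leaf (sub-sector index) */
} BspNode;

/* Which side of the partition line the viewer is on (sign of a 2D cross
   product). The mapping of sign to child index is an assumed convention. */
static int point_side(const BspNode *n, int px, int py)
{
    long long cross = (long long)(px - n->x) * n->dy
                    - (long long)(py - n->y) * n->dx;
    return cross > 0 ? 0 : 1;
}

/* Visits leaves nearest-first and records their ssector indices in order.
   Writes at most max entries; returns the running count of leaves visited. */
int walk_front_to_back(const BspNode *n, int px, int py,
                       int *order, int count, int max)
{
    if (!n) return count;
    if (n->ssector >= 0) {
        if (count < max) order[count] = n->ssector;
        return count + 1;
    }
    int near_side = point_side(n, px, py);
    count = walk_front_to_back(n->child[near_side], px, py, order, count, max);
    /* A renderer would cull here (bounding box against the view cone and
       the occlusion buffer) and stop once every column is closed. */
    count = walk_front_to_back(n->child[near_side ^ 1], px, py, order, count, max);
    return count;
}
```

Calling it from the root with the viewer position yields the sub-sectors in the order their walls should be fed to the occlusion buffer.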
Transparent objects, sprites etc. are tracked and occluded in the same way but not drawn until the walls and floors are done, after which they are drawn in reverse order, back to front.
d:m:l
Home: http://www.leonik.net/dml/sec_atari.py
AGT project https://bitbucket.org/d_m_l/agtools
BadMooD: https://bitbucket.org/d_m_l/badmood
Quake II p/l: http://www.youtube.com/playlist?list=PL ... 5nMm10m0UM
-
- Captain Atari
- Posts: 400
- Joined: Sat Jul 25, 2009 3:35 pm
Re: Bad Mood : Falcon030 'Doom'
CiH wrote: Meanwhile, on other retro platforms, developments move apace...
whoa... well... I never...
My Stuff: FB/Falcon CT63 CTPCI ATI RTL8139 USB 512MB 30GB HDD CF HxC_SD/ TT030 68882 4+32MB 520MB Nova/ 520STFM 4MB Tos206 SCSI
Shared SCSI Bus:ScsiLink ethernet, 9GB HDD,SD-reader @ http://phsw.atari.org
My Atari stuff that are no longer for sale due to them over 30 years old - click here for list
Re: Bad Mood : Falcon030 'Doom'
Dio wrote: It's a 2D raycasting engine, as I understand it.
With respect to Dio's post on raycasting - I have to admit some bias on techniques because I began the current engine some time before I had a copy of the Doom source code, so I'm partly influenced by the way I had to do things to suit the Falcon, DSP etc. and by deriving methods logically from scraps of information on the net (e.g. I wouldn't try to stick the scene BSP on the DSP because its size is pretty arbitrary and would prevent or limit reuse for any other task). It's likely BM diverges from Doom in a number of ways, and there is more than one way to render those scenes from the BSP correctly and quickly, with pretty much the same termination conditions. I have also seen multiple, different accounts of how it works...
The main objection I had to literal raycasting is the 5-20 intersections per display column, and 'work' at potentially many of those interfaces (sector windows - upper/lower walls) which can be cheaply dealt with in scan conversion order per-surface.
However, if you had a really fast implementation of that ray casting test, simply to find the first contact (and which sseg, ssector owns the contact), it would be a viable way to generate a visible surface & column list from which to do the rest. The other bits still need done (y occlusion tables) but many of the stages are separable.
I suppose it's something I should try at some point just to see what the cost trade looks like. I fear it's something that would need a nice big CPU cache because of the duty switching & tracking involved at each BSP split. It would also have to be entirely CPU-side I think and might not involve *literally* flinging entire rays into the BSP root every time - possibly some kind of incremental scan across hyperplanes, spawning sub-rays on each split (marching cubes kind of thing), if it happened to reduce duty switching.
Anyway I should be careful re: what I say about what Doom does/doesn't do without proper study of the source again after all these years, and many of these comments keep me thinking of possibilities so..

d:m:l
Home: http://www.leonik.net/dml/sec_atari.py
AGT project https://bitbucket.org/d_m_l/agtools
BadMooD: https://bitbucket.org/d_m_l/badmood
Quake II p/l: http://www.youtube.com/playlist?list=PL ... 5nMm10m0UM
Re: Bad Mood : Falcon030 'Doom'
How fast can it be if the original Doom source is used but with the graphics stuff optimized for the Falcon? The first time I played it was on a 20MHz 386SX computer and it ran at an OK speed in low detail mode. Is the Falcon so much slower than a 386 with more or less the same clock speed? The MCGA mode on PC is of course much better than the bitplane modes on the Falcon, but in highcolor that shouldn't matter so much, unless it's very slow to draw stuff in that mode.
ST / STFM / STE / Mega STE / Falcon / TT030 / Portfolio / 2600 / 7800 / Jaguar / 600xl / 130xe
Re: Bad Mood : Falcon030 'Doom'
Zamuel_a wrote: How fast can it be if the original Doom source is used but with the graphics stuff optimized for the Falcon? First time I played it was on a 20Mhz 386SX computer and it runned at ok speed in low detail mode.
That's sort of what I'm trying to do by bolting BM onto the Doom code - if that works out (BM would be a sort of Falcon-optimized graphics layer).
However it's difficult to speculate on the performance of a 100% direct port of the original sources without DSP etc. since nobody has tried it (If anyone did try it, it would have happened while porting PMDOOM, in which case we'd already have a Falcon030 'Doom'), and I expect quite a lot of the C code would need redone for Falcon, as it was for x86...
In any case it's a different project. I'll stick with the BM implementation for now and incorporate any improvements and methods that seem to make sense for the Falcon (including anything I scrape from Doom sources too).
d:m:l
Home: http://www.leonik.net/dml/sec_atari.py
AGT project https://bitbucket.org/d_m_l/agtools
BadMooD: https://bitbucket.org/d_m_l/badmood
Quake II p/l: http://www.youtube.com/playlist?list=PL ... 5nMm10m0UM
Re: Bad Mood : Falcon030 'Doom'
I started a long ramble on some progress on DSP texturing over the last couple of days which ended up being a bit of a ramble on the DSP generally, so here's the short version.
My tests showed there are between 7 and 10 'instruction cycles' (each one of those being 2x 32MHz clock cycles) free & available for use on the DSP, for every single texel/pixel the CPU can copy onto the screen from the DSP port.
So any texture mapping implementation has to fit into that, before the CPU needs to be slowed down.
I know this has been done already in at least one Falcon demo (albeit without texel lighting) in just 7 operations, but I had to figure out a way to get lighting in there as well to make it useful for BadMood.
I say between 7 and 10 because it varies depending on the bus load, display size blah blah... this is a bit of a worry. I don't want the game crashing/locking up mid way due to an optimization. Sometimes I was able to get 10 ops, other times only 9 - at one point there was only time for 8 when I reduced the display to a small window.
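Taking those figures at face value, the budget arithmetic can be sketched as below. The function names are made up for illustration, and the pixel rates are only what the quoted numbers imply (32MHz clock, 2 clocks per instruction cycle), not separate measurements.

```python
DSP_CLOCK_HZ = 32_000_000       # DSP clock quoted in the post
CLOCKS_PER_INSTRUCTION = 2      # one 'instruction cycle' = 2 clock cycles


def implied_pixel_rate(ops_available):
    """CPU copy rate (pixels/sec) implied by a budget of ops_available per pixel."""
    return DSP_CLOCK_HZ / CLOCKS_PER_INSTRUCTION / ops_available


def padding_needed(ops_needed, ops_available):
    """Instruction cycles the CPU must wait per pixel so the DSP can keep up."""
    return max(0, ops_needed - ops_available)
```

For example, a budget of 10 ops corresponds to the CPU copying 1.6 million pixels per second and a budget of 8 ops to 2 million; a routine needing 8 ops fits budgets of 8, 9 and 10 with no padding, but would need 1 instruction cycle of padding per pixel if the budget ever dropped to 7.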
Anyway I managed a full implementation of texture addressing + uv wrapping + texture + lighting lookups in just 8 ops (still to be properly verified with a real texture), and I don't think I can do any better than that - it's already a densely packed, unreadable mess of parallel moves and addressing tricks.
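For anyone unfamiliar with what those stages involve, here is a plain, readable sketch of the per-texel work in Python. It is emphatically not the DSP routine: the 16.16 fixed-point format, power-of-two texture sizes, the `colormap[light][index]` layout and all names are assumptions chosen to keep the example short.

```python
FRAC_BITS = 16  # assumed 16.16 fixed-point texture coordinates


def shade_texel(texture, width, height, colormap, u, v, light):
    """Fetch one lit texel.

    texture  : flat list of palette indices, row-major, width * height entries
    width, height : power-of-two sizes, so wrapping is a single AND mask
    colormap : colormap[light][index] -> shaded colour (hypothetical layout)
    u, v     : fixed-point texture coordinates
    """
    tx = (u >> FRAC_BITS) & (width - 1)   # uv wrapping in u
    ty = (v >> FRAC_BITS) & (height - 1)  # uv wrapping in v
    index = texture[ty * width + tx]      # texture addressing + texture lookup
    return colormap[light][index]         # lighting lookup


def draw_column(texture, width, height, colormap, u, v0, dv, light, count):
    """Step v down a wall column at constant u, returning the shaded colours."""
    out = []
    v = v0
    for _ in range(count):
        out.append(shade_texel(texture, width, height, colormap, u, v, light))
        v += dv
    return out
```

Written this way it is two shifts, two masks, a multiply-add and two table lookups per texel, which gives some idea of why squeezing it into 8 DSP operations needs parallel moves and addressing-mode tricks.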
Now it will either work reliably *or* the CPU will need artificial padding to slow it down and allow the DSP to keep up.
Will find out before next week and report.
There is another problem, which may prevent me using pure DSP texturing anyway - at least for now. Texture state changes - number of textures needing uploaded per frame. It's another solvable thing but more work, for later. Another time....
Busy for the rest of the week but will post any interesting progress if any is made...
d:m:l
Home: http://www.leonik.net/dml/sec_atari.py
AGT project https://bitbucket.org/d_m_l/agtools
BadMooD: https://bitbucket.org/d_m_l/badmood
Quake II p/l: http://www.youtube.com/playlist?list=PL ... 5nMm10m0UM
-
- Captain Atari
- Posts: 400
- Joined: Sat Jul 25, 2009 3:35 pm
Re: Bad Mood : Falcon030 'Doom'
(edit) Why on earth does this "thread" not show up in "View active topics" ??? 

Last edited by kristjanga on Wed Feb 06, 2013 1:11 am, edited 1 time in total.
- DarkLord
- Ultimate Atarian
- Posts: 5271
- Joined: Mon Aug 16, 2004 12:06 pm
- Location: Prestonsburg, KY - USA
- Contact:
Re: Bad Mood : Falcon030 'Doom'
kristjanga wrote: Why on earth does this threat not show up in "View active topics" ???
I've seen that happen with other message threads that were very active. Not sure why.
BTW, I think you meant "thread" instead of "threat". While Doug is a tour-de-force, I don't really
consider him a threat...

Welcome To DarkForce! http://www.darkforce.org "The Fuji Lives.!"
Atari SW/HW based BBS - Telnet:darkforce-bbs.dyndns.org 1040
Re: Bad Mood : Falcon030 'Doom'
kristjanga wrote: Why on earth does this threat not show up in "View active topics" ???
DarkLord wrote: I've seen that happen with other messages threads that were very active. Not sure why.
BTW, I think you meant "thread" instead of "threat". While Doug is a tour-de-force, I don't really consider him a threat...
My overly long posts could be considered some kind of threat to forum storage!

d:m:l
Home: http://www.leonik.net/dml/sec_atari.py
AGT project https://bitbucket.org/d_m_l/agtools
BadMooD: https://bitbucket.org/d_m_l/badmood
Quake II p/l: http://www.youtube.com/playlist?list=PL ... 5nMm10m0UM