
Bad Mood : Falcon030 'Doom'
-
- Atari God
- Posts: 1223
- Joined: Wed Nov 20, 2002 11:22 pm
- Location: France
Re: Bad Mood : Falcon030 'Doom'

-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
Eero Tamminen wrote: Don't catch exact point when the player dies / timedemo ends, like they do with Doom II timedemo. There's some credits screen shown for a while before profiling ends. Could you suggest a symbol which isn't called, or doesn't change during normal gameplay, but does when timedemo ends?
Hi, sorry I forgot about this. The demo/game state management area is quite hard to follow because it runs as a sort of service. There is a symbol D_AdvanceDemo which IIRC is called every time the attract mode switches state, e.g. from one title/splash screen to another, or in and out of demo mode. It won't happen during the demo, but will at the beginning and end (e.g. before loading, after death).
Home: http://www.leonik.net/dml/sec_atari.py
AGT project https://bitbucket.org/d_m_l/agtools
BadMooD: https://bitbucket.org/d_m_l/badmood
Quake II p/l: http://www.youtube.com/playlist?list=PL ... 5nMm10m0UM
-
- Fuji Shaped Bastard
- Posts: 3999
- Joined: Sun Jul 31, 2011 1:11 pm
Re: Bad Mood : Falcon030 'Doom'
dml wrote: The demo/game state management area is quite hard to follow because it runs as a sort of service. There is a symbol D_AdvanceDemo which IIRC is called every time the attract mode switches state e.g. from one title/splash screen to another, or in and out of demo mode. It won't happen during the demo, but will at the beginning and end (e.g. before loading, after death).
Thanks, that seems to work fine!
With Doom I, the first startup seems to take ~2.5 min while it generates/caches the data. With a real Atari hard disk that's probably a bit longer. There are no resource loads during normal gameplay.
As for the latest worst-frame profiles for Doom I, here's the thinking part (gcc 2.x, TIMERBASE=1)...
CPU side:
Code: Select all
Time spent in profile = 0.23188s.
Visits/calls:
- max = 1057, in _PIT_CheckLine at 0x32564, on line 2281
- 6332 in total
Executed instructions:
- max = 4333, in _R_DrawColumn+90 at 0x3d51c, on line 3329
- 379634 in total
Used cycles:
- max = 63476, in _PIT_CheckLine+780 at 0x32870, on line 2429
- 3719900 in total
Instruction cache misses:
- max = 3072, in _PIT_CheckLine+778 at 0x3286e, on line 2428
- 91763 in total
...
Executed instructions:
13.85% 13.88% 35.48% 52567 52682 134712 _P_CheckPosition
12.37% 12.39% 12.39% 46959 47024 47024 _R_PointInSubsector
11.48% 11.62% 14.51% 43564 44115 55070 _BM_P_CrossBSPNode
10.43% 10.45% 76.12% 39612 39688 288978 _P_RunThinkers
8.09% 8.10% 8.10% 30713 30733 30733 _BM_A_Mux3x2
7.51% 7.52% 8.50% 28494 28559 32274 _R_DrawColumn
4.79% 4.81% 6.11% 18187 18257 23198 _PIT_CheckLine
4.40% 4.40% 4.55% 16689 16703 17255 _V_DrawPatch
4.29% 4.30% 5.27% 16293 16333 19992 _PIT_CheckThing
2.09% 2.10% 10.61% 7937 7955 40261 _R_DrawVisSprite
1.94% 1.94% 17.48% 7370 7370 66362 _P_LookForPlayers
1.66% 1.66% 16.17% 6296 6314 61384 _BM_P_CheckSight
1.47% 1.48% 1.48% 5595 5622 5622 _P_UpdateSpecials
1.38% 1.38% 38.83% 5230 5237 147427 _P_TryMove
1.15% 1.15% 31.57% 4354 4354 119835 _P_Move
1.05% 1.05% 52.69% 3972 3979 200044 _P_SetMobjState
0.83% 3139 copy16_d
0.76% 0.76% 3.89% 2895 2895 14766 _P_SetThingPosition
0.59% 0.59% 33.47% 2223 2223 127071 _A_Chase
0.55% 0.55% 0.55% 2091 2091 2091 _P_UnsetThingPosition
0.54% 0.54% 25.63% 2037 2044 97292 _P_NewChaseDir
...
Instruction cache misses:
19.17% 19.23% 41.08% 17594 17648 37696 _P_CheckPosition
12.80% 12.83% 13.85% 11748 11776 12707 _PIT_CheckLine
8.81% 8.85% 85.22% 8085 8122 78203 _P_RunThinkers
5.66% 5.70% 5.70% 5198 5232 5232 _R_PointInSubsector
5.51% 5.94% 6.36% 5052 5453 5840 _BM_P_CrossBSPNode
4.31% 4.32% 10.68% 3953 3962 9802 _BM_P_CheckSight
3.75% 3.75% 46.63% 3440 3443 42790 _P_TryMove
3.38% 3.41% 3.57% 3101 3125 3276 _PIT_CheckThing
2.70% 2.70% 60.14% 2474 2477 55183 _P_SetMobjState
2.67% 2.68% 4.54% 2454 2463 4162 _R_DrawVisSprite
2.62% 2.62% 38.50% 2403 2403 35329 _P_Move
2.59% 2.59% 13.59% 2374 2374 12471 _P_LookForPlayers
2.00% 2.00% 3.41% 1832 1832 3133 _P_SetThingPosition
1.64% 1.68% 1.83% 1509 1540 1679 _R_DrawColumn
1.56% 1.56% 42.28% 1431 1431 38793 _A_Chase
1.35% 1.35% 31.62% 1240 1243 29014 _P_NewChaseDir
DSP side:
Code: Select all
Used cycles:
84.77% 6306918 command_base
7.23% 537714 ALGO_P_CrossBSPNode
5.55% 412866 P_CrossSubsector_body
1.05% 1.05% 1.05% 78400 78428 78428 InterceptVectorsUF
0.84% 62308 Divs48_Real
...
Visits/calls:
22.25% 22.25% 627 627 InterceptVectorsUF
20.40% 20.40% 575 575 TestLineSegVectorBisection
14.51% 409 P_CrossSubsector_body
12.46% 351 VECOP_return
12.21% 12.21% 344 344 InterceptVectors
11.99% 338 Divs48_Real
3.09% 87 ALGO_P_CrossBSPNode
3.09% 87 command_base
Rendering part, CPU side:
Code: Select all
Time spent in profile = 0.20246s.
Visits/calls:
- max = 188, in R_AdvanceSurface_NMip1 at 0x53f58, on line 2149
- 2945 in total
Executed instructions:
- max = 4769, in R_VisPlaneShaderQuickMip+202 at 0x537b6, on line 1743
- 351945 in total
Used cycles:
- max = 74208, in R_VisPlaneShaderQuickMip+200 at 0x537b4, on line 1742
- 3247900 in total
Instruction cache misses:
- max = 646, in R_AddSpriteSpans+48 at 0x52c18, on line 1069
- 27882 in total
...
Executed instructions:
18.39% 64717 R_VisPlaneShaderQuickMip
17.38% 61151 R_SpriteColumnShader_Masked2
12.18% 42858 R_DrawTSurface_Masked1
6.31% 22195 R_AdvanceSurface_NMip1
5.52% 5.53% 5.53% 19416 19472 19472 stream_texture
4.79% 16841 R_AdvanceSurface_NMip2
4.45% 4.45% 4.45% 15657 15675 15675 R_ViewTestSpriteLines
3.32% 11687 R_StackTransparentSurface
3.24% 11405 R_BSPHyperPlane
3.14% 3.15% 3.30% 11058 11092 11617 R_AddSpriteSpans
2.32% 2.33% 2.33% 8170 8190 8190 _BM_A_Mux1x2
2.07% 2.10% 2.42% 7269 7392 8510 stack_visplane_area
1.99% 7002 R_AdvanceSurface_NMip3
1.81% 6353 R_AddLine_loop
1.40% 4928 R_AdvanceSurface_TMip2
1.06% 1.06% 1.06% 3741 3745 3745 add_wall_segment
0.99% 0.99% 0.99% 3478 3478 3478 R_SetSubSectorLuma
0.91% 3198 build_ssector
0.87% 0.88% 0.88% 3079 3099 3099 init_stategroups
0.63% 0.63% 2.83% 2230 2230 9952 get_ssector
0.59% 0.59% 21.69% 2067 2094 76330 R_FlushDeferredSurfaces
0.58% 2026 R_DrawSurface_NMip
...
Used cycles:
22.89% 743488 R_VisPlaneShaderQuickMip
12.70% 412512 R_SpriteColumnShader_Masked2
9.07% 9.11% 9.11% 294652 295724 295724 stream_texture
7.78% 252612 R_DrawTSurface_Masked1
4.94% 160588 R_AdvanceSurface_NMip1
4.84% 4.85% 4.85% 157072 157456 157456 R_ViewTestSpriteLines
3.76% 122264 R_AdvanceSurface_NMip2
3.62% 3.64% 3.85% 117520 118112 124936 R_AddSpriteSpans
3.52% 114196 R_StackTransparentSurface
3.46% 112384 R_BSPHyperPlane
2.71% 2.72% 2.72% 87888 88192 88192 _BM_A_Mux1x2
...
Instruction cache misses:
21.14% 21.20% 22.14% 5893 5910 6173 R_AddSpriteSpans
6.71% 6.74% 6.74% 1870 1879 1879 R_ViewTestSpriteLines
5.26% 1467 build_ssector
4.81% 1342 R_AddLine_loop
4.17% 4.22% 21.30% 1163 1177 5938 R_FlushDeferredSurfaces
3.29% 916 R_AddOverlappingSprites
3.17% 3.17% 29.20% 883 883 8141 R_SubSectorTryFlush
DSP side:
Code: Select all
Used cycles:
34.84% 2263440 command_base
21.71% 21.71% 21.71% 1410012 1410012 1410012 VPRenderSpanQuickMip
9.03% 586500 R_VPLoadTexture
7.57% 9.40% 9.40% 491538 610408 610408 R_DoColumnPerspCorrect
5.50% 5.50% 5.50% 357400 357400 357400 extract_subvisplane
3.23% 209658 R_DoColumnTextureUV
3.19% 207464 R_ViewTestAddLine
1.89% 123080 R_VPRenderPlane
1.78% 115800 project_node
1.07% 1.14% 3.92% 69232 73784 254728 AddTransWall
1.06% 1.08% 4.10% 69104 70218 266220 AddUpperWall
1.03% 67080 R_CheckBBoxPair
1.03% 1.06% 2.76% 66964 68926 179106 AddLowerWall
0.99% 1.02% 7.01% 64062 66386 455574 AddMidWall
0.99% 2.27% 2.27% 64056 147554 147554 R_DoColumnConstantClone
0.76% 0.76% 0.76% 49500 49500 49500 R_BufferSurface
0.68% 44090 R_ViewTestSpriteLine
If you have more or less finished the optimizations for the timedemo-compatible parts of the thinking phase, I could switch the worst-frame profiling to TIMERBASE=3 again.
Btw. I get one error on console when BM starts:
Code: Select all
InitTextureserror: could not map flat [F1_START] via BM API
With Doom II, BM tries to access the following nonexistent directory: BMC/SPR/VILE. The BMC/FLT subdirectory created under the cache directory is empty even after running both Doom I & Doom II.
-
- Fuji Shaped Bastard
- Posts: 3999
- Joined: Sun Jul 31, 2011 1:11 pm
Re: Bad Mood : Falcon030 'Doom'
CPU side:
Code: Select all
Time spent in profile = 111.73971s.
Visits/calls:
- max = 175998, in copy16_d at 0x56e10, on line 14504
- 2174274 in total
Executed instructions:
- max = 1025620, in R_VisPlaneShaderQuickMip+202 at 0x537b6, on line 12643
- 195219745 in total
Used cycles:
- max = 16201124, in R_VisPlaneShaderQuickMip+200 at 0x537b4, on line 12642
- 1792583640 in total
Instruction cache misses:
- max = 441287, in _P_RunThinkers+268 at 0x3a96a, on line 8172
- 24581582 in total
...
Executed instructions:
8.88% 17333744 R_AdvanceSurface_TMip0
8.06% 8.16% 8.70% 15740734 15925570 16978783 _BM_P_CrossBSPNode
7.05% 13762875 R_VisPlaneShaderQuickMip
6.74% 6.75% 30.01% 13162760 13184062 58578535 _P_RunThinkers
6.65% 12989756 R_AdvanceSurface_NMip0
5.14% 10039907 R_SpriteColumnShader_Masked2
3.77% 7367649 R_AdvanceSurface_NMip1
3.73% 7285190 R_AdvanceSurface_TMip1
3.15% 3.16% 3.29% 6152144 6163795 6420717 _R_DrawColumn
3.11% 3.11% 3.11% 6080012 6079945 6079945 * _BM_A_Mux3x2
3.04% 3.04% 3.17% 5927874 5933514 6197347 _R_PointInSubsector
2.99% 3.00% 3.25% 5842231 5851897 6342972 _V_DrawPatch
2.93% 2.94% 6.89% 5720589 5733546 13456957 _P_CheckPosition
2.28% 4441648 R_DrawTSurface_Masked1
1.98% 1.99% 2.17% 3863784 3877200 4232881 stream_texture
1.40% 2725760 R_AdvanceSurface_NMip2
1.26% 1.27% 10.94% 2469461 2473636 21365567 _P_LookForPlayers
1.20% 1.21% 4.57% 2346797 2353235 8930069 _R_DrawVisSprite
1.13% 1.13% 1.13% 2197346 2198606 2198606 _BM_A_Mux2x2
1.10% 1.10% 9.90% 2145626 2155984 19327954 _BM_P_CheckSight
1.08% 2110282 R_BSPHyperPlane
1.00% 1957029 R_StackTransparentSurface
0.95% 0.95% 1.02% 1856451 1861708 1999475 _P_UpdateSpecials
0.91% 0.91% 0.97% 1767765 1771830 1886364 R_ViewTestSpriteLines
0.87% 0.88% 1.03% 1702978 1709257 2013488 _PIT_CheckLine
0.85% 1659082 R_AdvanceSurface_TMip2
0.83% 0.83% 0.83% 1610563 1611663 1611663 _BM_A_Mux1x2
0.74% 1442033 R_AddLine_loop
0.71% 0.71% 0.76% 1391610 1395026 1484345 _PIT_CheckThing
0.70% 0.70% 0.78% 1365293 1368782 1525302 R_AddSpriteSpans
0.68% 0.68% 0.71% 1323970 1326014 1385895 init_stategroups
0.67% 1300430 R_VisPlaneShaderWarp
0.61% 0.61% 19.64% 1181773 1184537 38349732 _P_SetMobjState
0.51% 0.51% 0.54% 993652 995646 1057567 R_SetSubSectorLuma
0.51% 986931 copy16_d
0.50% 0.51% 0.55% 979733 997206 1074135 stack_visplane_area
Used cycles:
8.98% 160970528 R_VisPlaneShaderQuickMip
8.67% 8.81% 9.36% 155427272 157942540 167785476 _BM_P_CrossBSPNode
6.72% 120376872 R_AdvanceSurface_TMip0
5.87% 5.89% 31.41% 105182088 105642172 563053292 _P_RunThinkers
5.16% 92466824 R_AdvanceSurface_NMip0
3.78% 67752036 R_SpriteColumnShader_Masked2
3.27% 3.29% 3.47% 58631892 58898244 62237536 stream_texture
3.14% 3.15% 3.29% 56208744 56447780 58915972 _R_DrawColumn
3.07% 3.08% 7.34% 55006164 55260180 131661912 _P_CheckPosition
3.00% 3.00% 3.00% 53787584 53808236 53808236 _BM_A_Mux3x2
2.96% 53007324 R_AdvanceSurface_NMip1
2.83% 50811876 R_AdvanceSurface_TMip1
2.46% 2.47% 2.76% 44138360 44331724 49414868 _V_DrawPatch
1.99% 2.00% 2.13% 35639496 35786720 38231024 _R_PointInSubsector
1.75% 1.76% 11.22% 31353176 31526736 201122240 _BM_P_CheckSight
1.60% 1.61% 1.77% 28647508 28776152 31788204 _PIT_CheckLine
1.49% 26661392 R_DrawTSurface_Masked1
1.29% 1.30% 4.67% 23197192 23314416 83791336 _R_DrawVisSprite
1.28% 1.29% 1.36% 22951200 23054480 24347164 _P_UpdateSpecials
1.23% 1.24% 12.27% 22061980 22157364 219986268 _P_LookForPlayers
...
Instruction cache misses:
10.30% 10.35% 59.69% 2531672 2543028 14673456 _P_RunThinkers
7.45% 7.48% 15.42% 1832323 1838568 3790031 _P_CheckPosition
6.95% 7.51% 7.72% 1709433 1846350 1896794 _BM_P_CrossBSPNode
5.47% 5.49% 13.24% 1345721 1349484 3255528 _BM_P_CheckSight
4.70% 4.71% 4.89% 1154917 1157890 1202419 _PIT_CheckLine
3.30% 3.31% 17.24% 811840 814022 4237675 _P_LookForPlayers
3.17% 3.18% 3.33% 779144 780871 817637 R_AddSpriteSpans
2.95% 2.95% 40.84% 724927 726329 10038799 _P_SetMobjState
2.85% 2.86% 4.70% 700489 703285 1156491 _R_DrawVisSprite
2.67% 2.69% 2.74% 657187 660850 673259 _R_PointInSubsector
1.95% 1.96% 19.60% 479552 481052 4817158 _P_TryMove
1.69% 1.71% 1.77% 415062 421135 434763 _R_DrawColumn
1.69% 1.71% 2.11% 414785 419617 517605 _V_DrawPatch
1.68% 1.68% 3.90% 412976 413595 957796 _V_CopyRect
1.37% 1.37% 19.14% 337239 337814 4704346 _A_Look
1.34% 1.35% 2.38% 330298 330818 584228 _P_SetThingPosition
1.33% 1.34% 1.36% 326692 328700 333499 R_ViewTestSpriteLines
1.32% 1.32% 14.57% 324315 325183 3581684 _P_CheckSight
1.27% 1.27% 17.72% 311809 312287 4355208 _A_Chase
1.23% 1.23% 1.27% 301649 303345 311789 _PIT_CheckThing
1.17% 1.17% 7.02% 286676 287519 1725232 _STlib_drawNum
1.13% 1.14% 1.15% 278497 279436 282632 _PIT_AddLineIntercepts_L
1.10% 1.10% 3.61% 269294 270222 886508 _P_PathTraverse
1.00% 1.00% 15.51% 244598 245307 3813141 _P_Move
DSP side:
Code: Select all
Used cycles:
48.27% 1730650824 command_base
18.06% 19.34% 19.34% 647582434 693331682 693331682 R_DoColumnPerspCorrect
8.90% 8.90% 8.90% 319216842 319216842 319216842 VPRenderSpanQuickMip
5.76% 206496458 ALGO_P_CrossBSPNode
3.44% 123435498 R_VPLoadTexture
2.33% 83666106 P_CrossSubsector_body
1.73% 61959168 R_DoColumnTextureUV
1.65% 1.65% 1.65% 59319566 59319566 59319566 extract_subvisplane
0.90% 0.93% 15.58% 32381858 33226524 558545954 AddMidWall
0.90% 0.91% 1.89% 32100864 32462482 67821952 AddLowerWall
0.88% 31536670 R_ViewTestAddLine
0.81% 0.81% 0.81% 28867582 28867582 28867582 VPRenderSpanWarp
0.80% 0.80% 0.80% 28712952 28720104 28720104 InterceptVectorsUF
0.66% 0.00% 0.00% 23546370 95358 95358 * Divs48_Real
0.57% 20292740 R_VPRenderPlane
0.56% 20206338 project_node
-
- Fuji Shaped Bastard
- Posts: 3999
- Joined: Sun Jul 31, 2011 1:11 pm
Re: Bad Mood : Falcon030 'Doom'
I got this when I also started worst-frame profiling from the first A_Chase() call (instead of LineAttack) in the Doom I timedemo:
Code: Select all
CPU side:
Time spent in profile = 0.29958s.
...
Executed instructions:
61.20% 411147 R_SpriteColumnShader_Masked2
28.03% 188337 R_AdvanceSurface_NMip0
5.58% 5.58% 5.58% 37477 37497 37497 _BM_A_Mux3x2
3.04% 20454 R_StackTransparentSurface
...
Visits/calls:
47.36% 287 R_AdvanceSurface_NMip0
9.90% 60 init_font
2.31% 14 R_BSPHyperPlane
1.82% 1.82% 11 11 _BM_A_Mux3x2
1.82% 11 _audio_mux_asm
1.82% 1.82% 11 11 _frame_event
1.82% 3.63% 11 22 _subframe_block
1.82% 7.26% 11 44 _audio_mux_frame
1.65% 10 R_AddLine_loop
1.49% 9 R_AddLine_invisible
1.49% 4.29% 9 26 render_wall
1.32% 1.32% 8 8 cache_resource
1.32% 1.32% 8 8 add_wall_segment
1.32% 1.32% 8 8 update_dirty_sector
1.16% 7 R_TransparentSurfaceLoop
1.16% 7 R_TransparentSurfaceNext
1.16% 7 R_DrawTSurface_Masked2
1.16% 7 R_StackTransparentSurface
1.16% 7 R_SpriteColumnShader_Masked2
0.66% 4 R_RenderBSPNode
0.66% 4 ssector_node
0.66% 4 R_PopBSPNode
...
DSP side:
Used cycles:
64.52% 6201732 command_base
29.17% 29.88% 29.88% 2803770 2872548 2872548 R_DoColumnPerspCorrect
2.58% 247768 R_DoColumnTextureUV
1.40% 1.41% 4.15% 134764 135578 398976 AddTransWall
0.83% 2.68% 2.68% 79974 257714 257714 R_DoColumnConstantClone
0.70% 0.70% 30.59% 66828 66926 2940288 AddMidWall
Code: Select all
$054e8e : jmp $54e92(pc,d0.w*2) 0.33% (2192, 26304, 0)
$054e92 : bra.s $54eca 0.12% (838, 6760, 14)
$054e94 : bra.s $54ebe 0.07% (474, 3812, 5)
$054e96 : bra.s $54eb2 0.06% (376, 3016, 4)
$054e98 : bra.s $54ea6 0.08% (504, 4056, 12)
$054e9a : move.b (a2,d4.w),d0 1.94% (13056, 156840, 0)
$054e9e : adda.l d6,a0 1.94% (13056, 156, 45)
$054ea0 : move.w (a5,d0.w*2),(a0) 1.94% (13056, 209372, 37)
$054ea4 : addx.l d3,d4 1.94% (13056, 28, 8)
$054ea6 : move.b (a2,d4.w),d0 2.02% (13560, 162892, 0)
$054eaa : adda.l d6,a0 2.02% (13560, 196, 53)
$054eac : move.w (a5,d0.w*2),(a0) 2.02% (13560, 217428, 48)
$054eb0 : addx.l d3,d4 2.02% (13560, 32, 10)
$054eb2 : move.b (a2,d4.w),d0 2.07% (13936, 167412, 0)
$054eb6 : adda.l d6,a0 2.07% (13936, 212, 57)
$054eb8 : move.w (a5,d0.w*2),(a0) 2.07% (13936, 223052, 22)
$054ebc : addx.l d3,d4 2.07% (13936, 16, 2)
$054ebe : move.b (a2,d4.w),d0 2.14% (14410, 173204, 0)
$054ec2 : adda.l d6,a0 2.14% (14410, 64, 14)
$054ec4 : move.w (a5,d0.w*2),(a0) 2.14% (14410, 231000, 12)
$054ec8 : addx.l d3,d4 2.14% (14410, 28, 6)
$054eca : dbra d1,$54e9a 2.27% (15248, 130976, 0)
$054ece : move.w d4,d0 0.33% (2192, 8768, 23)
-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
Eero Tamminen wrote: Hm, it seems there might still be something to optimize before looking more into TIMERBASE=3.
I don't think this one is an optimization problem TBH, at least not anymore. The main optimization which had been applied there was turning pixel-testing for transparency into solid runs and skips. Having done that, and with the code caching fully, there isn't really anything else left to do with it beyond tweaks. Further optimizations will result in features being removed (such as lighting of sprites).
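To illustrate the "solid runs and skips" idea mentioned above, here is a minimal Python sketch. It is not BadMood's code or data layout; the `TRANSPARENT` marker, the function names and the list-based column are all assumptions made for the example. The point is only that the transparency test moves out of the inner loop: it is done once when the column is encoded, and drawing then copies whole runs.

```python
# Hypothetical illustration: names and data layout are assumptions,
# not taken from BadMood.
TRANSPARENT = None


def encode_runs(column):
    """Turn a column of pixels into (start_row, [pixels...]) solid runs."""
    runs = []
    start = None
    for y, px in enumerate(column):
        if px is TRANSPARENT:
            if start is not None:          # a solid run just ended
                runs.append((start, column[start:y]))
                start = None
        elif start is None:                # a solid run begins here
            start = y
    if start is not None:                  # run reaching the column end
        runs.append((start, column[start:]))
    return runs


def draw_runs(dest, runs):
    """Copy each solid run in one go; transparent gaps are simply skipped."""
    for start, pixels in runs:
        dest[start:start + len(pixels)] = pixels


column = [None, 1, 2, None, None, 3, None]
runs = encode_runs(column)
framebuffer_column = [0] * len(column)
draw_runs(framebuffer_column, runs)
print(runs)
print(framebuffer_column)
```

Once the data is in this form, the remaining cost is essentially the fill itself, which matches the remark that only tweaks are left.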
The primary problem is the number of objects, and their proximity, demanded by the game at that point in time. Unlike walls, sprites don't influence the occlusion buffers. They are clipped by this system but don't contribute to it. So overdraw occurs and it's not avoidable. This isn't specific to BadMood; it's a generic problem with sprites in these engines.
The second problem is the fact that the scaled sprite system (and now the walls) rely on a datacache to buy back cycles on large fills. Hatari has no datacache emulation, so it is not giving accurate figures here, and the cost of sprites rises dramatically in the profiler as sprites get bigger. We won't be able to profile that accurately until Hatari supports the 68030 datacache.
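A rough way to picture the overdraw point is the toy model below. It is a sketch under heavy assumptions: one occlusion value per screen column, made-up wall and sprite extents, and a `render` function that exists only for this example. Walls fill the occlusion buffer, sprites are clipped against it but never write to it, so overlapping sprites get drawn on top of each other.

```python
# Toy model only: the per-column buffer and all numbers are assumptions.
def render(width, walls, sprites):
    """walls/sprites: (x0, x1, depth), x1 exclusive, smaller depth = nearer.

    Returns (column writes performed, columns that end up showing a sprite).
    """
    wall_depth = [float("inf")] * width    # occlusion buffer, walls only
    for x0, x1, depth in walls:
        for x in range(x0, x1):
            wall_depth[x] = min(wall_depth[x], depth)

    writes = [0] * width
    # Sprites are drawn far to near and clipped against walls,
    # but they never update wall_depth, so they cannot hide each other.
    for x0, x1, depth in sorted(sprites, key=lambda s: -s[2]):
        for x in range(x0, x1):
            if depth < wall_depth[x]:
                writes[x] += 1

    drawn = sum(writes)
    visible = sum(1 for w in writes if w)
    return drawn, visible


drawn, visible = render(
    width=16,
    walls=[(0, 4, 5)],
    sprites=[(2, 10, 8), (4, 12, 6), (6, 14, 4)],
)
print(drawn, visible, drawn - visible)
```

With three overlapping sprites, more than half of the column writes in this example are overdraw, and the ratio only gets worse as more objects crowd together near the viewer.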
-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
Eero Tamminen wrote: E.g. stream_texture and stack_visplane were higher on the worst frame, but in general worst frame and whole timedemo average are getting fairly close in what's the most expensive functionality.
These two are a rough indicator of scene complexity - number of state changes relating to map sectors - steps, ponds, ledges, doorways etc. stack_visplane tends to happen when any state changes for floor or ceiling between two sectors in draw order. stream_texture counts the number of unique texture state changes in the scene for floors and ceilings.
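For anyone wanting to track these two indicators across profiles, a small Python sketch follows. It makes no claim about what the individual columns mean; it just splits each row into its percentages, counts and symbol. The four sample rows are copied from the "Executed instructions" listings earlier in the thread (the worst-frame rendering profile and the 111.7 s profile). Those two profiles do not cover the same code, so the ratio is only a rough hint.

```python
import re

# One profile row: one or more percentages, one or more counts,
# an optional '*' flag, then the symbol name.
ROW = re.compile(r"^\s*((?:\d+\.\d+%\s+)+)((?:\d+\s+)+)(?:\*\s+)?(\S+)\s*$")


def parse_rows(text):
    """Return {symbol: ([percentages], [counts])} for every matching row."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            percents = [float(p.rstrip("%")) for p in m.group(1).split()]
            counts = [int(c) for c in m.group(2).split()]
            rows[m.group(3)] = (percents, counts)
    return rows


WORST_FRAME = """\
5.52% 5.53% 5.53% 19416 19472 19472 stream_texture
2.07% 2.10% 2.42% 7269 7392 8510 stack_visplane_area
"""

WHOLE_RUN = """\
1.98% 1.99% 2.17% 3863784 3877200 4232881 stream_texture
0.50% 0.51% 0.55% 979733 997206 1074135 stack_visplane_area
"""

worst = parse_rows(WORST_FRAME)
whole = parse_rows(WHOLE_RUN)
# Compare the first percentage column of each symbol.
ratios = {name: worst[name][0][0] / whole[name][0][0] for name in worst}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x the whole-run share")
```

Rows with a single percentage and count (such as the `copy16_d` lines) parse the same way, just with one-element lists.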
-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
Eero Tamminen wrote: With Doom I, the first startup seems to take ~2.5 min while it generates/caches the data. With a real Atari hard disk that's probably a bit longer. There are no resource loads during normal gameplay.
The first run will take a long time if the cache is empty. It has to do all the mipmapping/recolouring stuff, which is especially slow if there are HD textures available on disk which map to the WAD.
Eero Tamminen wrote: QuickMip stuff is high both on CPU & DSP side. Flushing with stuff it calls seems to be causing i-cache misses.
QuickMip is a version of the floor/ceiling shader for the most common cases (normal floors, mipmapped). The sky and liquids have separate versions. It's approximately fill-limited in most cases except maps with tons of tiny sectors.
Eero Tamminen wrote: If you have more or less finished the optimizations for the timedemo-compatible parts of the thinking phase, I could switch the worst-frame profiling to TIMERBASE=3 again.
TBH it's going to be tricky to auto-profile anything in that department because most such optimizations break demo replay coherence. Even if we get demo recording working on the Falcon, such a demo will immediately be invalidated by some of the changes I would make for optimization.
I'm not sure there is an answer to this one, except perhaps profiling only the level startpoints (which are not representative at all).
Eero Tamminen wrote: Btw. I get one error on console when BM starts:
Code: Select all
InitTextureserror: could not map flat [F1_START] via BM API
Is this something known?
I have seen it, but I'm not sure yet where the fault is. It seems like Doom starts counting 'flat' texture indices in the WAD from the F1_START marker instead of the first actual flat. Whether this is an existing bug or not I'm not sure. But it is a bug now because it's being fed to BM as a texture translation mapping.
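A small Python sketch of how a marker can end up with a flat index. As far as I recall, the original Doom source numbers flats from the lump after F_START up to the lump before F_END, which sweeps in nested markers such as F1_START. Treat that as an assumption to verify against the source. The lump list below is a made-up miniature directory, and `is_marker` is a hypothetical helper, not part of any BM API.

```python
# Hypothetical miniature WAD directory (names only), for illustration.
LUMPS = ["PLAYPAL", "F_START", "F1_START", "FLOOR0_1", "FLOOR0_3",
         "F1_END", "F_END"]


def number_flats(names):
    """Number everything strictly between F_START and F_END, markers included."""
    first = names.index("F_START") + 1
    last = names.index("F_END") - 1
    return {name: i for i, name in enumerate(names[first:last + 1])}


def is_marker(name):
    """Assumed convention: nested section markers end in _START or _END."""
    return name.endswith("_START") or name.endswith("_END")


flat_index = number_flats(LUMPS)
print(flat_index)

# What a texture translation table would probably want: real flats only.
real_flats = [name for name in flat_index if not is_marker(name)]
print(real_flats)
```

In this model F1_START gets index 0, so anything that maps every flat index to a texture will ask for a flat called F1_START, which is what the console message looks like. Skipping marker lumps when building the mapping would avoid the lookup without changing the index numbering.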
Eero Tamminen wrote: With Doom II, BM tries to access the following nonexistent directory: BMC/SPR/VILE. The BMC/FLT subdirectory created under the cache directory is empty even after running both Doom I & Doom II.
Not sure what's going on there - VILE?? is a sprite, not a directory. I did fix a bug recently relating to string8 conversion to filenames but it only affects HD texture loading, and there are none for the VILE sprite. I'll look later.
The FLT\ cache directory will remain empty for now - I haven't got around to caching the floor textures. They are still generated at runtime (and slow down loading accordingly).
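The "slow first run, fast later runs" behaviour is the usual load-or-generate cache pattern. Here is a minimal Python sketch of it, assuming nothing about BM's real cache format: `load_or_build`, the `build` callback and the file names are all hypothetical. Note that the parent directory is created before the write, which is the step a cache writer must not skip.

```python
import tempfile
from pathlib import Path


def load_or_build(cache_root, rel_path, build):
    """Return cached bytes if present, otherwise build, store and return them."""
    path = Path(cache_root) / rel_path
    if path.exists():
        return path.read_bytes()           # cache hit: no generation cost
    data = build()                         # cache miss: the expensive step
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return data


calls = []


def build_texture():
    """Stand-in for the expensive mipmapping/recolouring work."""
    calls.append("build")
    return b"converted texture data"


with tempfile.TemporaryDirectory() as root:
    first = load_or_build(root, "SPR/EXAMPLE.BMT", build_texture)   # generates
    second = load_or_build(root, "SPR/EXAMPLE.BMT", build_texture)  # cached

print(len(calls), first == second)
```

Resources that are not written to the cache (the floor textures, per the post) take the `build` path on every run, which is why they keep slowing down loading.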
-
- Fuji Shaped Bastard
- Posts: 3999
- Joined: Sun Jul 31, 2011 1:11 pm
Re: Bad Mood : Falcon030 'Doom'
Whole timedemo, CPU side:
Code: Select all
Time spent in profile = 76.79584s.
...
Executed instructions:
7.82% 7.90% 8.32% 10604911 10708703 11276857 _BM_P_CrossBSPNode
6.50% 8813693 R_AdvanceSurface_TMip2
6.09% 6.10% 6.29% 8256622 8265670 8528958 _R_PointInSubsector
5.52% 7482941 R_AdvanceSurface_NMip0
5.37% 5.38% 33.27% 7278744 7290796 45099937 _P_RunThinkers
5.26% 7137557 R_AdvanceSurface_TMip0
4.52% 6121373 R_VisPlaneSkyShader
4.27% 4.28% 13.05% 5784277 5798608 17687692 _P_CheckPosition
2.99% 4055421 R_AdvanceSurface_TMip1
2.70% 3660966 R_AdvanceSurface_TMip3
2.56% 3476590 R_VisPlaneShaderWarp
2.56% 2.56% 2.69% 3470122 3477338 3645132 _R_DrawColumn
2.21% 2.21% 2.21% 2994077 2995857 2995857 _BM_A_Mux3x2
2.13% 2.14% 2.31% 2892354 2899677 3133794 _PIT_CheckThing
2.12% 2.13% 2.30% 2880416 2884693 3118471 _V_DrawPatch
2.05% 2778391 R_VisPlaneShaderQuickMip
1.86% 2517742 R_BSPHyperPlane
1.79% 1.80% 1.96% 2427000 2435859 2658515 stream_texture
1.37% 1.37% 1.37% 1856122 1857162 1857162 _BM_A_Mux2x2
1.35% 1827082 R_AdvanceSurface_NMip3
1.32% 1.33% 1.43% 1796108 1799312 1932754 _P_UpdateSpecials
1.28% 1737562 R_DrawTSurface_Masked1
1.05% 1.05% 3.81% 1425594 1428978 5167834 _R_DrawVisSprite
1.03% 1390094 R_AddLine_loop
1.02% 1.02% 1.10% 1381492 1384795 1486430 R_ViewTestSpriteLines
0.98% 0.98% 1.07% 1332894 1335343 1450196 R_AddSpriteSpans
0.97% 0.97% 0.97% 1310742 1311602 1311602 _BM_A_Mux1x2
0.88% 0.89% 1.00% 1196467 1200011 1351137 _PIT_CheckLine
0.84% 1134760 R_SpriteColumnShader_Masked2
0.82% 0.84% 0.88% 1109655 1134982 1187881 stack_visplane_area
0.77% 0.77% 9.14% 1041176 1044253 12385409 _BM_P_CheckSight
0.70% 0.70% 8.55% 943806 945574 11598169 _P_LookForPlayers
0.67% 0.67% 0.70% 902900 904503 943594 R_SetSubSectorLuma
0.59% 804014 R_AdvanceSurface_NMip2
0.58% 787953 build_ssector
0.58% 0.58% 13.79% 782750 785441 18702401 _P_Move
0.55% 0.55% 1.37% 747034 749028 1862292 get_ssector
0.54% 0.54% 0.56% 732802 733997 757066 init_stategroups
0.52% 0.53% 15.06% 708808 712134 20413976 _P_TryMove
...
Used cycles:
8.46% 8.57% 9.01% 104172448 105610296 111000388 _BM_P_CrossBSPNode
4.99% 61526524 R_AdvanceSurface_TMip2
4.77% 4.79% 34.50% 58736536 59000952 425097356 _P_RunThinkers
4.65% 4.67% 12.89% 57287348 57550596 158786028 _P_CheckPosition
4.31% 53157716 R_AdvanceSurface_NMip0
4.07% 4.09% 4.29% 50120808 50337108 52875856 _R_PointInSubsector
4.02% 49518132 R_AdvanceSurface_TMip0
3.27% 40232000 R_VisPlaneSkyShader
3.02% 37174892 R_VisPlaneShaderWarp
2.99% 3.00% 3.18% 36830660 37007772 39150112 stream_texture
2.70% 2.71% 2.89% 33295392 33448176 35646288 _PIT_CheckThing
2.60% 2.61% 2.74% 31992468 32135896 33744216 _R_DrawColumn
2.47% 30435508 R_VisPlaneShaderQuickMip
2.30% 28365352 R_AdvanceSurface_TMip1
2.15% 2.15% 2.15% 26488644 26515700 26515700 _BM_A_Mux3x2
2.10% 25865264 R_AdvanceSurface_TMip3
2.02% 24861936 R_BSPHyperPlane
1.81% 1.81% 1.91% 22239928 22316944 23568652 _P_UpdateSpecials
1.77% 1.77% 1.98% 21774956 21867144 24337900 _V_DrawPatch
1.38% 1.38% 1.38% 17039628 17055436 17055436 _BM_A_Mux2x2
1.38% 1.39% 1.52% 17022900 17094636 18742504 _PIT_CheckLine
1.22% 14984996 R_AddLine_loop
1.21% 1.22% 10.27% 14919440 14978924 126586288 _BM_P_CheckSight
1.16% 1.16% 1.24% 14244816 14311724 15263664 R_ViewTestSpriteLines
1.14% 1.15% 3.96% 14078120 14144984 48844176 _R_DrawVisSprite
1.14% 1.14% 1.14% 14063676 14076748 14076748 _BM_A_Mux1x2
1.14% 1.14% 1.24% 14027448 14081064 15298956 R_AddSpriteSpans
1.08% 13284316 R_AdvanceSurface_NMip3
1.01% 1.15% 1.19% 12464728 14200136 14708300 stack_visplane_area
0.96% 0.97% 14.18% 11853008 11906824 174710764 _P_Move
0.89% 0.90% 15.06% 10988412 11044436 185555400 _P_TryMove
0.88% 10886872 R_DrawTSurface_Masked1
0.88% 10822052 build_ssector
0.76% 9308440 copy16_d
0.69% 0.69% 9.57% 8509012 8545492 117898520 _P_LookForPlayers
0.65% 8046744 copy256
0.64% 7831256 copy256_d
0.62% 7659756 R_SpriteColumnShader_Masked2
0.60% 0.61% 1.68% 7425596 7459856 20669880 get_ssector
0.59% 0.59% 0.62% 7295684 7328064 7697088 R_SetSubSectorLuma
0.56% 6889360 R_StackTransparentSurface
0.54% 0.54% 27.05% 6685372 6709656 333230124 _P_SetMobjState
...
Instruction cache misses:
12.75% 12.78% 23.73% 2450727 2456834 4561713 _P_CheckPosition
7.22% 7.25% 57.58% 1387437 1393988 11070116 _P_RunThinkers
5.71% 6.08% 6.23% 1097846 1168064 1196894 _BM_P_CrossBSPNode
4.71% 4.73% 4.81% 904836 910239 924926 _R_PointInSubsector
3.85% 3.85% 3.98% 739745 740924 764672 R_AddSpriteSpans
3.80% 3.81% 4.02% 730967 732681 772390 _PIT_CheckLine
3.44% 3.44% 9.68% 660550 661850 1861822 _BM_P_CheckSight
2.82% 2.84% 2.91% 542833 546494 559311 _PIT_CheckThing
2.53% 2.53% 28.67% 485599 487135 5511377 _P_TryMove
2.22% 2.23% 26.84% 427518 428860 5161334 _P_Move
2.22% 2.23% 3.67% 427153 428803 706091 _R_DrawVisSprite
1.97% 1.97% 44.96% 378401 378930 8645156 _P_SetMobjState
1.89% 363658 build_ssector
1.74% 1.74% 10.54% 334245 335136 2026920 _P_LookForPlayers
1.62% 310587 R_AddLine_loop
1.54% 1.55% 1.57% 296006 297601 302548 R_ViewTestSpriteLines
1.32% 1.33% 1.38% 253037 256562 265374 _R_DrawColumn
1.21% 1.21% 22.89% 232003 232519 4401117 _P_NewChaseDir
1.19% 1.19% 2.76% 229030 229495 530899 _V_CopyRect
1.16% 1.17% 1.44% 222748 225022 276277 _V_DrawPatch
1.11% 1.12% 29.57% 214284 214563 5684766 _A_Chase
1.07% 1.07% 6.69% 205551 206452 1286290 R_FlushDeferredSurfaces
1.05% 1.05% 8.94% 201342 201612 1718419 R_SubSectorTryFlush
0.96% 185309 R_BSPHyperPlane
0.95% 183156 R_AddOverlappingSprites
0.91% 0.92% 1.80% 175809 176221 345651 _P_SetThingPosition
0.83% 0.83% 0.84% 159586 160244 162389 add_wall_segment
0.80% 0.80% 4.89% 153352 153803 939280 _STlib_drawNum
0.73% 0.74% 12.96% 141276 141577 2491036 _A_Look
0.72% 0.73% 10.42% 139266 139741 2003112 _P_CheckSight
DSP side:
Code: Select all
Used cycles:
48.29% 1189949724 command_base
18.24% 19.48% 19.48% 449379294 479969092 479969092 R_DoColumnPerspCorrect
4.87% 119998916 ALGO_P_CrossBSPNode
3.39% 83597630 R_VPRenderSky
3.15% 77630598 R_VPLoadTexture
2.94% 2.94% 2.94% 72481156 72481156 72481156 VPRenderSpanWarp
2.62% 64641630 P_CrossSubsector_body
2.21% 2.21% 2.21% 54484914 54484914 54484914 VPRenderSpanQuickMip
1.88% 1.88% 1.88% 46356406 46356406 46356406 extract_subvisplane
1.69% 41575386 R_ViewTestAddLine
1.44% 35372808 R_DoColumnTextureUV
0.99% 1.02% 4.96% 24319178 25056370 122239306 AddLowerWall
0.99% 24276688 project_node
0.89% 0.89% 0.89% 21873920 21884536 21884536 InterceptVectorsUF
0.73% 0.77% 15.81% 18077728 19017240 389545590 AddMidWall
0.66% 0.00% 0.00% 16230648 51224 51224 * Divs48_Real
0.65% 16102746 R_CheckBBoxPair
0.57% 14036102 R_VPRenderPlane
0.56% 0.58% 1.70% 13900012 14369390 41824068 AddUpperWall
After the first A_Chase() call there are a few disk reads, not through load/read_resource() though:
Code: Select all
- 0x3dcba: _R_RenderPlayerView (return = 0x22e70)
- 0x40a02: _R_DrawMasked (return = 0x3de2c)
- 0x403d4: _R_DrawPSprite (return = 0x40a92)
- 0x3ff54: _R_DrawVisSprite (return = 0x4054e)
- 0x41152: _W_CacheLumpNum (return = 0x3ff72)
- 0x608dc: ___read (return = 0x4122e)
GEMDOS 0x3F Fread(64, 3400, 0x3f2c8c)
...
GEMDOS 0x3F Fread(64, 1156, 0x3f3f90)
...
GEMDOS 0x3F Fread(64, 8860, 0x3f442c)
...
GEMDOS 0x3F Fread(64, 10368, 0x3f66e0)
...
GEMDOS 0x3F Fread(64, 10668, 0x3f9680)
...
GEMDOS 0x3F Fread(64, 1668, 0x3fc784)
...
GEMDOS 0x3F Fread(64, 8180, 0x3fd1a4)
...
GEMDOS 0x3F Fread(64, 2684, 0x3ff3cc)
...
GEMDOS 0x3F Fread(64, 8204, 0x3fff14)
...
GEMDOS 0x3F Fread(64, 2308, 0x401f38)
The VILE warnings stuff is from BM trying to do:
Code: Select all
GEMDOS 0x3D Fopen("BMC\SPR\VILE\1.BMT", read-only)
No GEMDOS dir '/home/linuxdoom/autoprofile/BMC/SPR/VILE'
GEMDOS 0x42 Fseek(9525352, 65, 0)
GEMDOS 0x3F Fread(65, 3120, 0x96fc70)
GEMDOS 0x3C Fcreate("BMC\SPR\VILE\1.BMT", 0x0)
No GEMDOS dir '/home/linuxdoom/autoprofile/BMC/SPR/VILE'
[EDIT] fixed typos.
-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
Eero Tamminen wrote: Interestingly, about the same amount of DSP is free / looping in command base as with Doom I. Here CrossBSPNode is higher than QuickMip, which I guess is expected due to more complex levels.
More complex levels (and/or more objects on the map using their eyes) will cause this. It's the enemy count which determines the number of rays cast, and the map which determines how the ray gets chopped up. So it can get quite bad if both are increased together.
Eero Tamminen wrote: After the first A_Chase() call there are a few disk reads, not through load/read_resource() though:
Code: Select all
- 0x3dcba: _R_RenderPlayerView (return = 0x22e70)
- 0x40a02: _R_DrawMasked (return = 0x3de2c)
- 0x403d4: _R_DrawPSprite (return = 0x40a92)
- 0x3ff54: _R_DrawVisSprite (return = 0x4054e)
- 0x41152: _W_CacheLumpNum (return = 0x3ff72)
- 0x608dc: ___read (return = 0x4122e)
GEMDOS 0x3F Fread(64, 3400, 0x3f2c8c)
Any comments on how to catch the names of these sprites?
A bit more tricky than before, if the files come from the local cache. They pass through here in that case... it takes a few instructions until a4 is loaded with the address of the texturedef (name). You could set a breakpoint there or we could introduce a special label to make it easier to catch.
Code: Select all
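*-------------------------------------------------------*
D_TextureCacheIn:
*-------------------------------------------------------*
movem.l d2-d7/a0-a6,-(sp)	; save registers on the stack
*-------------------------------------------------------*
move.w cache_entry(a0),d0	; d0 = cache_entry field of the structure at a0
move.l resourcedef_table,a4	; a4 = pointer stored at resourcedef_table
move.l (a4,d0.w*4),a4	; a4 = longword entry d0 of that table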
Eero Tamminen wrote: The VILE warnings stuff is from BM trying to do:
Code: Select all
GEMDOS 0x3D Fopen("BMC\SPR\VILE\1.BMT", read-only)
No GEMDOS dir '/home/linuxdoom/autoprofile/BMC/SPR/VILE'
GEMDOS 0x42 Fseek(9525352, 65, 0)
GEMDOS 0x3F Fread(65, 3120, 0x96fc70)
GEMDOS 0x3C Fcreate("BMC\SPR\VILE\1.BMT", 0x0)
No GEMDOS dir '/home/linuxdoom/autoprofile/BMC/SPR/VILE'
But it's not creating the VILE subdirectory first.
That just looks wrong to me. VILE1 (etc) is a sprite lump, not a directory. Likely a bug in filename/path handling, possibly the non-zero-terminated strings again...
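One possibility that may be worth checking, and which nothing in the thread confirms: Doom sprite lumps are named with a frame letter counted up from 'A', and as far as I know the Doom II arch-vile has enough frames to run past 'Z' into '[', '\' and ']'. A lump called VILE\1 used directly as a file name would then read as directory VILE plus file 1, which matches the trace. The Python sketch below shows a hedged workaround. `cache_filename` and the replacement characters are arbitrary choices for illustration, not anything BM does. It also trims at the first NUL, because an 8-byte lump name is not terminated when all eight bytes are used.

```python
# Arbitrary, hypothetical substitutions for characters that are unsafe
# in a GEMDOS/FAT file name.
REPLACEMENTS = {"[": "(", "\\": "^", "]": ")"}


def cache_filename(lump_name: bytes) -> str:
    """Convert a raw 8-byte lump name field into a path-safe base name."""
    raw = lump_name[:8].split(b"\0", 1)[0]     # may have no terminator at all
    name = raw.decode("ascii")
    return "".join(REPLACEMENTS.get(ch, ch) for ch in name)


for raw in (b"VILE\\1\0\0", b"VILE[1\0\0", b"TROOA1\0\0", b"FLOOR0_1"):
    print(raw, "->", cache_filename(raw) + ".BMT")
```

If the cause turns out to be something else (for example the string8 conversion mentioned earlier), the NUL handling part of the sketch is still the relevant bit.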
-
- Fuji Shaped Bastard
- Posts: 3999
- Joined: Sun Jul 31, 2011 1:11 pm
Re: Bad Mood : Falcon030 'Doom'
Here's worst frame for Doom II timedemo.
Rendering, CPU side:
Code: Select all
Time spent in profile = 0.24069s.
...
Executed instructions:
22.46% 105599 R_SpriteColumnShader_Masked2
14.07% 66163 R_AdvanceSurface_NMip0
9.58% 45062 R_VisPlaneSkyShader
6.87% 32305 R_AdvanceSurface_TMip3
6.53% 6.54% 6.54% 30712 30732 30732 _BM_A_Mux3x2
3.96% 18630 R_AdvanceSurface_TMip2
3.81% 17928 R_VisPlaneShaderWarp
3.50% 16478 R_DrawTSurface_Masked1
2.75% 2.76% 2.76% 12944 12978 12978 stream_texture
2.71% 12728 R_BSPHyperPlane
2.55% 11984 R_AdvanceSurface_NMip3
2.00% 9403 R_StackTransparentSurface
1.85% 1.86% 1.95% 8719 8726 9167 R_AddSpriteSpans
1.49% 6992 R_AddLine_loop
1.47% 6920 R_VisPlaneShaderQuickMip
1.35% 1.38% 1.38% 6365 6481 6481 stack_visplane_area
1.33% 1.33% 1.33% 6257 6261 6261 R_ViewTestSpriteLines
0.93% 4372 R_AdvanceSurface_TMip4
0.84% 3946 R_DrawSurface_NMip
0.84% 0.84% 0.84% 3931 3931 3931 R_SetSubSectorLuma
0.82% 3857 build_ssector
0.76% 0.76% 0.76% 3569 3569 3569 add_wall_segment
0.65% 0.65% 0.65% 3079 3079 3079 init_stategroups
0.65% 3053 R_AdvanceSurface_NMip2
0.50% 0.50% 0.50% 2351 2351 2351 _BM_A_Mux2x2
...
Instruction cache misses:
17.40% 17.41% 18.16% 4898 4901 5113 R_AddSpriteSpans
6.28% 1767 build_ssector
5.79% 5.80% 5.80% 1629 1632 1632 R_ViewTestSpriteLines
5.37% 1511 R_AddLine_loop
3.89% 3.95% 23.06% 1096 1113 6493 R_FlushDeferredSurfaces
3.56% 3.56% 31.76% 1001 1001 8942 R_SubSectorTryFlush
...
Visits/calls:
5.70% 178 R_AddLine_loop
5.67% 177 R_AddLine_invisible
4.77% 149 R_AdvanceSurface_TMip3
3.59% 112 R_AdvanceSurface_NMip3
3.52% 110 R_BSPHyperPlane
3.40% 106 R_AdvanceSurface_NMip0
3.17% 4.84% 99 151 render_wall
2.72% 2.72% 85 85 cache_resource
2.72% 85 R_AdvanceSurface_TMip2

DSP side:
Code: Select all
Used cycles:
35.83% 2766732 command_base
22.17% 23.97% 23.97% 1711970 1851238 1851238 R_DoColumnPerspCorrect
7.59% 585804 R_VPRenderSky
5.45% 5.45% 5.45% 421038 421038 421038 VPRenderSpanWarp
5.06% 390570 R_VPLoadTexture
4.02% 310650 R_ViewTestAddLine
2.77% 2.77% 2.77% 213642 213642 213642 extract_subvisplane
2.68% 207096 R_DoColumnTextureUV
2.63% 2.63% 2.63% 203100 203100 203100 VPRenderSpanQuickMip
1.71% 1.73% 6.59% 132116 133522 508670 AddLowerWall
1.63% 126206 project_node
1.07% 1.09% 3.86% 82708 83830 297706 AddUpperWall
1.02% 78882 R_CheckBBoxPair
0.90% 0.93% 18.23% 69774 71846 1408152 AddMidWall
0.76% 0.81% 2.36% 58788 62528 181910 AddTransWall
0.63% 0.63% 0.63% 48632 48632 48632 R_BufferSurface
0.51% 39696 R_VPRenderPlane
Thinking, CPU side:
Code: Select all
Time spent in profile = 0.25277s.
...
Executed instructions:
17.87% 18.05% 21.41% 73464 74219 88016 _BM_P_CrossBSPNode
13.47% 13.49% 14.37% 55373 55451 59090 _R_PointInSubsector
8.52% 8.53% 69.72% 35024 35083 286663 _P_RunThinkers
8.51% 8.52% 26.65% 34997 35042 109582 _P_CheckPosition
6.64% 6.65% 6.65% 27306 27326 27326 _BM_A_Mux3x2
5.30% 5.30% 13.80% 21774 21801 56725 _P_PathTraverse
4.11% 4.12% 4.12% 16906 16937 16937 _PIT_CheckThing
3.78% 3.80% 4.70% 15555 15615 19308 _PIT_AddLineIntercepts_L
3.61% 3.62% 4.64% 14841 14879 19067 _V_DrawPatch
2.05% 2.06% 2.06% 8440 8475 8475 _P_UpdateSpecials
1.97% 1.97% 2.09% 8093 8107 8587 _PIT_CheckLine
1.57% 1.57% 22.98% 6464 6471 94487 _BM_P_CheckSight
1.57% 1.57% 1.57% 6452 6452 6452 _R_DrawColumn
1.27% 1.27% 19.10% 5226 5233 78544 _P_LookForPlayers
1.19% 1.20% 2.78% 4908 4935 11419 _R_DrawVisSprite
1.14% 1.14% 1.14% 4702 4702 4702 _BM_A_Mux2x2
1.13% 1.14% 30.78% 4627 4674 126537 _P_TryMove
1.12% 1.12% 1.12% 4609 4616 4616 _P_PointOnDivlineSide
1.11% 1.12% 26.10% 4574 4601 107320 _P_Move
1.10% 1.10% 2.22% 4514 4521 9137 _PIT_AddThingIntercepts
0.82% 0.82% 53.70% 3358 3358 220784 _P_SetMobjState
0.68% 0.69% 1.52% 2810 2830 6233 _PTR_ShootTraverse
0.55% 2280 copy16_d
0.53% 0.54% 23.13% 2198 2205 95088 _P_NewChaseDir
DSP side:
Code: Select all
Used cycles:
76.56% 6209388 command_base
10.86% 880410 ALGO_P_CrossBSPNode
7.06% 572658 P_CrossSubsector_body
1.89% 1.89% 1.89% 153374 153478 153478 InterceptVectorsUF
1.28% 103930 Divs48_Real
1.26% 102284 ALGO_P_LineIntercept
0.74% 0.74% 0.74% 59662 59662 59662 TestLineSegVectorBisection
-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
Framerate reading when facing a wall is 62 FPS... maintains about 20-25 FPS for most of the map, drops to 12 FPS if you try to look from the nukeage/slime area across the courtyard.
Anyone want to try making an Atari game *without* the Doom game code?

-
- Fuji Shaped Bastard
- Posts: 3999
- Joined: Sun Jul 31, 2011 1:11 pm
Re: Bad Mood : Falcon030 'Doom'
dml wrote: Framerate reading when facing a wall is 62 FPS... maintains about 20-25 FPS for most of the map, drops to 12 FPS if you try to look from the nukeage/slime area across the courtyard.
Anyone want to try making an Atari game *without* the Doom game code?
By designing a "doom" level suitable for Falcon/BadMood and adding their own game logic on top of that? Maybe it's time to publish the code to a more public HG server [1], and find out what happens?

[1] does e.g. atariforge support Mercurial?
-
- Nature
- Posts: 1447
- Joined: Tue Aug 01, 2006 9:21 am
- Location: Halmstad, Sweden
Re: Bad Mood : Falcon030 'Doom'

-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
shoggoth wrote: Imagine a multiplayer game without bots, then. No AI to think about there, "just" some clever networking code.
That could work.
-
- Fuji Shaped Bastard
- Posts: 3999
- Joined: Sun Jul 31, 2011 1:11 pm
Re: Bad Mood : Falcon030 'Doom'
I was checking "doomu.wad" timedemo behavior in BM against Linux PrBoom and there are few differences:
* Pink monsters aren't transparent/invisible in BM like they're in PrBoom.
* After player goes down the stairs, BM plays the demo wrong.
BM cannot play "doom1.wad" (shareware) WAD timedemo, says it's for different version, although PrBoom plays it fine.
(PrBoom plays "doom2.wad" timedemo completely wrong though).

-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
Eero Tamminen wrote: I was checking "doomu.wad" timedemo behavior in BM against Linux PrBoom and there are few differences:
* Pink monsters aren't transparent/invisible in BM like they're in PrBoom.
I haven't quite tied up replacement shaders for sprites but it will happen. I wanted to get the floors done first (liquids) and that's finished now.
Eero Tamminen wrote: * After player goes down the stairs, BM plays the demo wrong.
Yes. This is a consequence of changing the original code, and involving DSP in game vector calcs. Any numerical differences at all will accumulate and cause replay to drift, because the demo only records player input actions and everything else must 'simulate' as it did on the original machine.
Some of the more severe drift cases are conditioned by the TIMEBASE_CONTROL flag but there are essential optimizations (like P_CheckSight) which remain on, and don't match the original 100%.
You can get old behaviour back by enabling the ORIGINAL_VERSION flag - if it still works - but even then the Linux1.10 distro isn't properly compatible with demos recorded from other engines (even Dos Doom) and the demos still have small desync problems (especially on maps with those flying skulls). As it is I had to hack the demo format version number to get the standard demos to load at all.
This isn't really fixable but playing PC demos won't be part of the final release anyway.
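To make the drift mechanism concrete, here is a minimal C sketch. It is not BadMood or Doom code: the two multiply routines, the `replay` function and the damping constant are all illustrative assumptions. The same recorded inputs are replayed through a tiny simulation twice, once with a truncating 16.16 fixed-point multiply and once with a rounding one, standing in for an "optimized" path that is almost but not exactly equivalent.

```c
#include <stdint.h>

typedef int32_t fixed_t;                     /* 16.16 fixed point */
typedef fixed_t (*mul_fn)(fixed_t, fixed_t);

#define FRICTION 0xE800                      /* illustrative damping factor (~0.906) */

/* Reference multiply: truncates the low 16 bits. */
fixed_t mul_trunc(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * b) >> 16);
}

/* Hypothetical "optimized" multiply: rounds to nearest instead. */
fixed_t mul_round(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * b + 0x8000) >> 16);
}

/* Replay recorded inputs. Only the inputs are stored in a demo, so the
 * final position depends entirely on the arithmetic used each tick. */
fixed_t replay(const fixed_t *inputs, int count, mul_fn mul)
{
    fixed_t pos = 0, vel = 0;
    for (int i = 0; i < count; i++) {
        vel = mul(vel + inputs[i], FRICTION);   /* apply input, then damping */
        pos += vel;
    }
    return pos;
}
```

Replaying the same four small inputs through both variants already gives different end positions. In the real game such a difference eventually flips a collision or line-of-sight result, after which the recorded inputs no longer fit what is on screen.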
Eero Tamminen wrote: BM cannot play "doom1.wad" (shareware) WAD timedemo, says it's for different version, although PrBoom plays it fine.
(PrBoom plays "doom2.wad" timedemo completely wrong though).
The doom game code has a version check for the demo lump. I suspect PrBoom has bypassed it, whereas I just changed the version number. Both will suffer from the same problems though - only demos recorded by *that* engine will be sure to work. Changes to game code cause desync.
It's also worth noting that some engine ports - ZDoom, Boom etc. have some significant modifications from the original game which add new features and change the WAD spec.
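The demo version check discussed above can be sketched roughly as follows. This is a hedged illustration in C rather than the actual Doom or BadMood code: it assumes the recording engine's version sits in the first byte of the demo lump, and the names `check_demo_version`, `expected` and `strict` are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { DEMO_OK, DEMO_EMPTY, DEMO_WRONG_VERSION } demo_status;

/* Assumption: the first byte of the demo lump holds the version of the
 * engine that recorded it. `expected` and `strict` are illustrative
 * parameters, not names from the Doom or BadMood source. */
demo_status check_demo_version(const uint8_t *lump, size_t len,
                               uint8_t expected, int strict)
{
    if (lump == NULL || len == 0)
        return DEMO_EMPTY;
    if (strict && lump[0] != expected)
        return DEMO_WRONG_VERSION;   /* "Demo is from a different game version!" */
    return DEMO_OK;                  /* accepted, but playback may still desync */
}
```

Changing `expected` and passing `strict = 0` correspond to the two workarounds mentioned (changing the version number versus bypassing the check). Neither makes a foreign demo replay correctly; they only let it load.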
-
- Fuji Shaped Bastard
- Posts: 3999
- Joined: Sun Jul 31, 2011 1:11 pm
Re: Bad Mood : Falcon030 'Doom'
I also tried latest FreeDoom:
http://www.nongnu.org/freedoom/download.html
While the "Doom II" replacement WAD gives this from timedemo and trying to play myself:
Code: Select all
Demo is from a different game version!
Error: Z_Malloc: failed on allocation of 127436 bytes
The "Ultimate Doom" replacement nearly works:
http://savannah.nongnu.org/download/fre ... latest.zip
It loaded with couple of warnings:
Code: Select all
WARNING: have to clip 4 chars from 'FLOOR4_8=�.bmp' base!
WARNING: have to clip 4 chars from 'TLITE6_5]�.bmp' base!
WARNING: have to clip 4 chars from 'TLITE6_1=�.bmp' base!
WARNING: have to clip 4 chars from 'FLOOR5_1M�.bmp' base!
WARNING: have to clip 4 chars from 'TLITE6_4M�.bmp' base!
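And the timedemo started running. However, after a while the part around the screen got messed up and eventually demo playback stopped to this:
Code: Select all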
Error: NetUpdate: netbuffer->numtics > BACKUPTICS
Btw. Should (Doom II) WAD overlays to work?
I tried couple of old Doom II PWADs with doom2.wad and BM got stuck here:
Code: Select all
GEMDOS 0x42 Fseek(952923136, 67, 0)
Finalizing costs for 11 non-returned functions:
- 0x4d5ba: read_resource_header (return = 0x516da)
- 0x51622: D_CacheRegisterSpritesMarkedSet (return = 0x4ed08)
- 0x4ece0: W_InitCacheSpriteDefs (return = 0x4eb60)
- 0x4eafc: W_InitCacheDefs (return = 0x4addc)
- 0x4acd8: _BM_E_OpenWADs (return = 0x24dd4)
- 0x247c6: _D_DoomMain (return = 0x4a53a)
- 0x4a47c: _BM_S_AppEntryPoint (return = 0x4a6f6)
- 0x4a6b4: _BM_S_EntryPoint (return = 0x4a446)
- 0x4a24e: _main (return = 0x61d26)
In both cases there was invalid Fseek() offset. This was from the second (earth.wad):
Code: Select all
GEMDOS 0x42 Fseek(605905920, 67, 0)
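Then I tried smallest Doom II WAD I found, "mtfactor.wad" and while that loaded fine, it gave this when tried to start a new level:
Code: Select all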
Zone: used memory: 0x91210
Zone: free memory: 0x6edf0
Error: Z_Malloc: failed on allocation of 138456 bytes
-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
Eero Tamminen wrote: Based on PrBoom output, the shareware WAD has different timedemo, so it would be nice to have it working.
It does work - it's just that you have an old version of shareware WAD. Get the v1.9 revision.
I don't really want to try to support all the older versions of the WADs (1.666, 1.7, 1.8 etc) because it's extra trouble and there are 1.9 versions of all of them now, including a free patch for the shareware one. It might just be a case of disabling the version check, but then again it might not.
Eero Tamminen wrote: I also tried latest FreeDoom:
http://www.nongnu.org/freedoom/download.html
While the "Doom II" replacement WAD gives this from timedemo and trying to play myself:
Code: Select all
Demo is from a different game version!
Error: Z_Malloc: failed on allocation of 127436 bytes
I assume that's an open-source version of the Doom II IWAD or something? I don't know much about that but it looks like it's falling foul of the version check as well.
There's probably an argument for supporting open-source replacements but I really don't know what's hiding in there. The fact it doesn't pass a v1.9 version check is not encouraging.

Eero Tamminen wrote: The "Ultimate Doom" replacement nearly works:
http://savannah.nongnu.org/download/fre ... latest.zip
It loaded with couple of warnings:
Code: Select all
WARNING: have to clip 4 chars from 'FLOOR4_8=�.bmp' base!
WARNING: have to clip 4 chars from 'TLITE6_5]�.bmp' base!
WARNING: have to clip 4 chars from 'TLITE6_1=�.bmp' base!
WARNING: have to clip 4 chars from 'FLOOR5_1M�.bmp' base!
WARNING: have to clip 4 chars from 'TLITE6_4M�.bmp' base!
Those warnings are caused by a bug in the cache file handling which I mentioned recently, and is fixed. Should be in the repo now.
Eero Tamminen wrote: And the timedemo started running. However, after a while the part around the screen got messed up and eventually demo playback stopped to this:
Code: Select all
Error: NetUpdate: netbuffer->numtics > BACKUPTICS
The corruption is another bug which I mentioned, and fixed today but I think it hasn't been checked in yet. I was working on yet *another* bug which was a Hatari/real-hardware divergence thing. :-z
Eero Tamminen wrote: It would be nice to have support for FreeDoom, at least the Ultimate version replacement so that everybody can get full WADs.
Yes probably should be supported. Hopefully it's just a version check thing and not a format change.
Eero Tamminen wrote: Btw. Should (Doom II) WAD overlays to work?
All PWADs should work if their associated IWADs also work. I have found plenty which don't work but they tend to fall into categories:
- weird, inconsistent bugs in maps which Doom tolerates but BM doesn't know what to do with. Fixed some of these recently but some are confounding.
- PWAD is ZDoom (or other) custom format, won't load - warnings + crashes
- map is way too complex, causes DSP buffer overflow (seems to be rare now - haven't seen it recently - famous last words)
- nonstandard texture sizes, or other weird problems with resources that BM doesn't like
- other weird corner cases
Eero Tamminen wrote: I tried couple of old Doom II PWADs with doom2.wad and BM got stuck here:
I think there may be flakiness problems atm with BadMood. Perhaps try again after I get some fixes checked in.
Eero Tamminen wrote: In both cases there was invalid Fseek() offset. This was from the second (earth.wad):
Code: Select all
GEMDOS 0x42 Fseek(605905920, 67, 0)
Ok I'll keep that in mind. I have seen something like this but when I inspected the WADs they were not standard ones, they had nonstandard stuff inside.
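One way to fail with a clear message instead of seeking to an impossible offset is to validate each WAD directory entry against the file length before using it. The C sketch below is only an illustration under stated assumptions: `read_le32` and `lump_in_bounds` are hypothetical names, and nothing here is taken from the BadMood loader. It relies on the standard WAD layout, where directory fields are 32-bit little-endian integers, which need byte-wise assembly on a big-endian 68030.

```c
#include <stdint.h>

/* WAD files store integers little-endian; the 68030 is big-endian, so
 * fields are assembled byte by byte here rather than read directly. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical check: does a directory entry lie fully inside the file?
 * Returns 1 if it does, 0 otherwise. Written to avoid integer overflow. */
int lump_in_bounds(uint32_t filepos, uint32_t size, uint32_t file_size)
{
    if (filepos > file_size)
        return 0;
    return size <= file_size - filepos;
}
```

An entry that fails the check could be reported by lump name and skipped, which would also help identify which nonstandard content a given PWAD carries.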
Eero Tamminen wrote: Then I tried smallest Doom II WAD I found, "mtfactor.wad" and while that loaded fine, it gave this when tried to start a new level:
Code: Select all
Zone: used memory: 0x91210
Zone: free memory: 0x6edf0
Error: Z_Malloc: failed on allocation of 138456 bytes
Z_Malloc failing probably means 1mb isn't enough for Doom to allocate level data for things etc. I could raise it a bit further if this is a problem. But it can also happen if the WAD contains stuff that the game or BM doesn't understand - like floor textures which aren't 64x64, or empty directory entries for things that should be textures etc..
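The two resource problems named above (floor textures that aren't 64x64, and empty directory entries) can be detected from the lump size alone. A minimal C sketch, assuming flats are stored as raw 64x64 blocks with one byte per texel; `check_flat_lump` and the status names are hypothetical, not part of BM.

```c
#include <stdint.h>

#define FLAT_WIDTH  64
#define FLAT_HEIGHT 64
#define FLAT_BYTES  (FLAT_WIDTH * FLAT_HEIGHT)   /* 4096 bytes, one byte per texel */

typedef enum { FLAT_OK, FLAT_EMPTY, FLAT_BAD_SIZE } flat_status;

/* Hypothetical pre-load check for a floor/ceiling texture lump. */
flat_status check_flat_lump(uint32_t lump_size)
{
    if (lump_size == 0)
        return FLAT_EMPTY;        /* empty directory entry (e.g. a marker) */
    if (lump_size != FLAT_BYTES)
        return FLAT_BAD_SIZE;     /* not a standard 64x64 flat */
    return FLAT_OK;
}
```

Rejecting such lumps up front would at least separate "WAD contains something unexpected" from a genuine out-of-memory condition.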
Eero Tamminen wrote: Are there any nice PWADs for Doom I (In case they would need less RAM)?
There should be plenty but I really haven't had time to go and look at them this year. There's a site with '100 best WADs of all time' or suchlike, although a few of those I found were ZDoom format... they are by date so just rewind beyond the ZDoom incept date and it should be fine.

-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
-
- Fuji Shaped Bastard
- Posts: 3999
- Joined: Sun Jul 31, 2011 1:11 pm
Re: Bad Mood : Falcon030 'Doom'
dml wrote: It does work - it's just that you have an old version of shareware WAD. Get the v1.9 revision.
I thought I had, it was correct size and all... But I loaded another version and with that the timedemo works fine. Good, I'll provide new profiles for that next week.
dml wrote: Before the next checkin, I'll raise the ram limit for the game code to 1.5mb (from 1mb) and try to remove the WAD version check. See if this makes any difference.
Great, thanks! Hopefully I'll have time tomorrow to check these.
According to FreeDoom readme, its levels may have some Boom stuff:
Code: Select all
Levels should be in Boom format; you may exceed the limits of Vanilla
Doom and use Boom features; however, do not use features that are not
supported by Boom 2.02 and compatible ports. Levels should be in Doom's
original format, not in ``Hexen'' format.
It is sensible to also heed the following guidelines:
...
* Do not use tricks that exploit Doom's software renderer; some source
ports, especially those that use hardware accelerated rendering, may
not render it properly. Examples of tricks to avoid include those used
to simulate 3D bridges and ``deep water'' effects.
* Boom removes almost all of the limits on rendering; however, do not
make excessively complicated scenes. It is desirable that Freedoom
levels should be playable on old or low-powered hardware.
* Always test in http://www.teamtnt.com/boompubl/boom2.htm[Boom]
itself rather than a derivative such as PrBoom. This ensures that
your levels really are Boom-compatible rather than using any extra
features.
...
=== Graphics
* Graphics should be the same color and size as the originals to
remain compatible with PWADs (otherwise, they may end up looking
like a mess). They cannot use the Doom font.
* Textures should be the same dimensions as the originals. They
should be similar but not identical (to avoid IP infringement) --
...
* Sprites should be roughly the same size and shape, but different to
the originals.
...

I may look a bit into timedemo version checks too. I already peeked at the PrBoom code, but I need to check a bit what all those variables mean.
-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
Eero Tamminen wrote: I thought I had, it was correct size and all... But I loaded another version and with that the timedemo works fine. Good, I'll provide new profiles for that next week.
It's always possible I broke something but I made the same mistake a few times this week - having the wrong version of the DOOM2 wad sitting on the Falcon CFLASH and wondering why the demo wouldn't start during tests...
dml wrote: Before the next checkin, I'll raise the ram limit for the game code to 1.5mb (from 1mb) and try to remove the WAD version check. See if this makes any difference.
Eero Tamminen wrote: Great, thanks! Hopefully I'll have time tomorrow to check these.
Done. Version checks removed, ram limit raised. My only checkouts now are gametick optimization related.
Eero Tamminen wrote: According to FreeDoom readme, its levels may have some Boom stuff:
Hmm. Limits in some cases aren't a problem because BM doesn't have them (or they are higher, where limits apply). However one of the 'mods' involved changing the coordinate storage system and that definitely won't work. I have also seen WADs with 'meta directory' structures, which confuse the loader. Won't be easy to guess what sort of things will cause problems without going through it all and checking it.
* Do not use tricks that exploit Doom's software renderer; some source
ports, especially those that use hardware accelerated rendering, may
not render it properly. Examples of tricks to avoid include those used
to simulate 3D bridges and ``deep water'' effects.
These probably wouldn't work in BM anyway. They are exploitations of unintended behaviour.
* Boom removes almost all of the limits on rendering; however, do not
make excessively complicated scenes. It is desirable that Freedoom
levels should be playable on old or low-powered hardware.
Fine

* Graphics should be the same color and size as the originals to
remain compatible with PWADs (otherwise, they may end up looking
like a mess). They cannot use the Doom font.
* Textures should be the same dimensions as the originals. They
should be similar but not identical (to avoid IP infringement) --
...
* Sprites should be roughly the same size and shape, but different to
the originals.
Also fine.
Eero Tamminen wrote: I may look a bit into timedemo version checks too. I already peeked at the PrBoom code, but I need to check a bit what all those variables mean.
Ok, that could be useful. I tried briefly to get my head around it but it just looks like a mess so I left it alone.
I suspect most of the original (Id) version changes were due to desync problems: bugfixes to game code broke existing demos, which then had to be re-recorded. This is why each WAD version has different demos.

-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
Currently, when things get slow, the adaptive behaviour will act to make it even slower. This is only true because the cost of game code relative to rendering is so high.
If the cost of ticking had been relatively low (which it seems to be in the commercially released/rewritten/optimized/whatever PC and Jag versions) the adaptive behaviour would be ideal and exactly what you'd want. The game could catch up with realtime with impunity. In our case that isn't happening. OTOH, the adaptive stuff is mainly there for synchronized network games anyway (all clients tick synchronously so they must all try to chase realtime with reasonable accuracy).
So I'll probably try an experiment: fix the TICRATE at some sensible multiple of the average framerate (say, 6fps x2 = 12Hz, or x3 = 18Hz) and change the game loop to enforce exactly 2 (or 3) ticks per render regardless of what's happening. When the framerate drops, action will slow down, but there will no longer be an 'exponential slowdown'. This would also require an artificial cap on FPS to stop the game running too fast in corner cases, but that's already true anyway with the adaptive system.
Another approach would be to cap the number of ticks at some low value and allow it to be slightly adaptive but within limits - and work purely with time deltas instead of absolute time but the code is complicated so that might be harder to get working as intended. It would also be less effective as a pressure valve.
It's worth some experiments to see what works best. Probably will have a go in the evening.
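The difference between the adaptive, capped and fixed policies can be seen with a small model. The C sketch below is an assumption-laden illustration, not the BadMood game loop: the function names are invented, and the render and tick costs used with it are placeholders rather than measurements. Each frame costs one render plus some number of game ticks, and the adaptive policy asks for as many ticks as real time elapsed during the previous frame.

```c
/* All costs fed to these functions are illustrative placeholders. */

/* Duration of one frame: one render plus `ticks` game ticks. */
int frame_ms(int render_ms, int tick_ms, int ticks)
{
    return render_ms + tick_ms * ticks;
}

/* Adaptive policy: run as many ticks as real time elapsed during the
 * previous frame, clamped to [1, cap]. Pass a large cap for "uncapped". */
int next_ticks(int last_frame_ms, int ticrate_hz, int cap)
{
    int ticks = last_frame_ms * ticrate_hz / 1000;
    if (ticks < 1) ticks = 1;
    if (ticks > cap) ticks = cap;
    return ticks;
}

/* Simulate `frames` frames and return the duration of the last one. */
int simulate(int render_ms, int tick_ms, int ticrate_hz, int cap, int frames)
{
    int ticks = 1, ms = 0;
    for (int i = 0; i < frames; i++) {
        ms = frame_ms(render_ms, tick_ms, ticks);
        ticks = next_ticks(ms, ticrate_hz, cap);
    }
    return ms;
}
```

With a 35 Hz tick rate and a hypothetical 30 ms tick, every extra millisecond of frame time demands slightly more than a millisecond of extra ticking, so the uncapped loop never catches up and each frame is longer than the last. Capping the tick count, or fixing it outright (which is just `frame_ms` with a constant `ticks`), bounds the frame time at the price of slowed-down action.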
-
- Atari God
- Posts: 1223
- Joined: Wed Nov 20, 2002 11:22 pm
- Location: France
Re: Bad Mood : Falcon030 'Doom'
-
- Fuji Shaped Bastard
- Posts: 3991
- Joined: Sat Jun 30, 2012 9:33 am
Re: Bad Mood : Falcon030 'Doom'
dma wrote: By the way, while being in the toilet the other day (which is of no importance here) i was thinking and wondering if your DOOM engine could run on standard 4mb Falcon, with specifically designed WADs (which would then have some size limits on their various contents)? In the prospective of using your engine for a Falcon specific game.
The engine/viewer will run in 4MB (or can do, with a bit of cleanup - it definitely did fit in the past). With the *current* Doom game attached to it however it won't fit.
The compiled Doom executable is 500k, and Doom gamestate wants approx 1MB -> 1.5MB for its own stuff. The framebuffers require another 256k, and TOS/GEM itself uses something, I forget how much (this can be overcome via AUTO folder but that's not ideal for HD booting a game!). So that's nearly 3MB gone before BMEngine gets to load any data, and it's not counting BM's own static storage either which is currently still on the large side, about 1MB, maybe more.
Starting from scratch for the game would improve chances. Some additional stuff can be done with texture formats, per-texture lighting tables, acceleration info for masked sprites etc. to keep memory use down, if required. I haven't done much in that direction given that fitting Doom in 4MB looks so far from the mark as it is.
So if somebody started on such a project, I'd squeeze BMEngine for the smaller footprint.
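The memory budget above can be written down as a small calculation. This C sketch only restates the figures from the post; TOS/GEM usage is not given there, so it is left as a parameter, and `remaining_kb` is a hypothetical name.

```c
/* Figures in KB taken from the post above. TOS/GEM usage is not stated
 * there, so it is passed in as a parameter. */
#define DOOM_EXE_KB      500
#define FRAMEBUFFERS_KB  256
#define BM_STATIC_KB     1024   /* "about 1MB, maybe more" */

/* KB left for WAD/level data after the fixed costs; a negative result
 * means the configuration does not fit at all. */
int remaining_kb(int total_kb, int gamestate_kb, int tos_kb)
{
    return total_kb - (DOOM_EXE_KB + gamestate_kb + FRAMEBUFFERS_KB
                       + BM_STATIC_KB + tos_kb);
}
```

On a 4MB machine with the 1.5MB gamestate limit, this leaves well under 1MB for data even before TOS/GEM is subtracted, which is why dropping the Doom executable and gamestate is where a Falcon-specific game would gain the most room.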