Out of curiosity

Zamuel_a wrote: I am impressed with the mip mapping. It's more or less seamless and makes the view much better than what the original version had.
Zamuel_a wrote: Out of curiosity: how much speed gain do you get with the DSP compared to running it only on the 68030? Is it only a few fps or would it be unplayable if it hadn't been for the DSP?
I don't know the answer. If I started again, I might be able to do a 68030-only version which comes close, but it's hard to say whether it would match it or be faster.
Code:
Used cycles:
26.49% 26.58% 36.89% 405388 406856 564564 _PIT_AddLineIntercepts_L
22.02% 22.12% 74.74% 336960 338552 1143848 _P_PathTraverse
9.38% 9.44% 9.44% 143612 144416 144416 _P_InterceptVector.constprop.6
6.75% 6.78% 8.02% 103288 103728 122752 _PIT_AddThingIntercepts
5.89% 5.90% 5.90% 90188 90352 90352 _BM_A_Mux3x2
5.69% 5.74% 92.22% 87128 87816 1411348 _P_Ticker
3.29% 3.29% 3.33% 50408 50408 50904 _P_CheckPosition
1.69% 1.78% 3.93% 25936 27204 60076 _P_SetThingPosition
1.41% 1.41% 1.41% 21568 21568 21568 _BM_A_Mux2x2
1.39% 1.39% 1.39% 21336 21336 21336 _V_DrawPatch320
1.36% 1.36% 6.27% 20844 20844 96024 _PTR_ShootTraverse
1.21% 1.21% 1.21% 18568 18568 18568 _P_UpdateSpecials
1.10% 1.10% 1.43% 16824 16824 21956 _PTR_AimTraverse
0.81% 12400 _bcopy
0.74% 11320 R_StackTransparentSurface
0.71% 0.71% 0.71% 10804 10804 10804 _P_LineOpening
0.68% 0.68% 0.68% 10396 10396 10396 _V_CopyRect
0.52% 0.53% 7.89% 8020 8184 120708 _audio_mux_frame
Code:
Used cycles:
14.81% 395652 R_VPShaderLiquid_2x1
12.03% 321452 R_Column_TMip_2x1
8.97% 239508 R_AddLine_loop
8.09% 216164 R_Column_NMip_2x1
5.84% 155924 R_BSPHyperPlane
4.65% 4.67% 4.67% 124340 124780 124780 R_ViewTestSpriteLines
4.32% 115420 R_VPShaderStdDSP_2x1
3.59% 3.59% 3.59% 95772 95772 95772 R_VPCommitZoneFlush
3.57% 3.60% 4.05% 95268 96240 108196 R_VisPlaneStreamTexture
3.07% 81920 R_ProcessSubSector
3.06% 81808 R_AddOverlappingSprites
2.46% 2.46% 2.46% 65624 65736 65736 _BM_A_Mux1x2
2.01% 2.01% 27.12% 53820 53820 724536 R_FlushDeferredSurfaces
1.94% 1.94% 2.35% 51880 51880 62904 R_AddSpriteSpans
1.75% 46632 R_PrepareSubSector
1.66% 44456 R_VPShaderSkyCyl_2x1
Code:
Used cycles:
20.56% 1539933336 R_AdvanceSurface_NMip_2x1
17.07% 1278051612 R_AdvanceSurface_TMip_2x1
11.05% 827067668 R_VPShaderStdDSP2x1
7.97% 8.03% 8.03% 597143144 601144376 601144376 R_VisPlaneStreamTexture
5.02% 376016724 R_SCShader_Masked_2x1
3.93% 3.96% 8.66% 294605212 296410300 648242664 _P_RunThinkers
2.57% 2.58% 2.58% 192318256 193565396 193565396 R_ViewTestSpriteLines
2.27% 169738876 R_BSPHyperPlane
1.69% 1.70% 1.73% 126788636 127667024 129322296 _P_CheckPosition
1.56% 117157312 R_SCShader_MaskSk_2x1
1.36% 101962028 R_StackTransparentSurface
1.34% 100119720 R_AddLine_loop
1.30% 1.46% 1.46% 97609532 109489880 109489880 stack_visplane_area
1.05% 78575292 R_ProcessSubSector
1.03% 77251240 R_AddOverlappingSprites
0.96% 71964992 R_VPShaderSkyCyl2x1
0.95% 0.96% 1.32% 71509332 71988312 98752332 R_AddSpriteSpans
0.86% 0.86% 0.86% 64155316 64576992 64580136 _P_UpdateSpecials
0.85% 0.86% 0.86% 63526656 64635100 64635100 _BM_P_CrossBSPNode
0.85% 0.85% 0.85% 63353556 63787968 63787968 R_SetSubSectorLuma
0.77% 57691184 R_WCShader_MaskPM_2x1
0.77% 0.78% 0.78% 57685232 58075412 58075412 add_wall_segment
0.75% 56535244 R_DrawSurface_NoTileMip2x1
0.63% 0.63% 42.20% 46847760 47297948 3159832536 R_FlushDeferredSurfaces
0.57% 0.57% 0.57% 42564256 42748140 42748140 _V_DrawPatch
0.56% 41652812 R_VPShaderLiquid2x1
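The rows in the listings above come in two shapes: one percentage and one count, or three percentages followed by three counts, with the symbol name last. A minimal sketch of a row parser follows, in Python. The names `ProfileRow` and `parse_row` are made up for this illustration and are not part of Hatari or profile.sh, and the column meanings are deliberately left uninterpreted because the thread does not spell them out.

```python
from typing import NamedTuple, Tuple


class ProfileRow(NamedTuple):
    symbol: str
    percents: Tuple[float, ...]  # percentage columns, in the order printed
    counts: Tuple[int, ...]      # raw count columns, in the order printed


def parse_row(line: str) -> ProfileRow:
    """Split one profile row into percentage, count and symbol columns."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"too few columns: {line!r}")
    symbol = fields[-1]
    # Some rows carry a lone "*" marker before the symbol; skip it.
    values = [f for f in fields[:-1] if f != "*"]
    percents = tuple(float(v[:-1]) for v in values if v.endswith("%"))
    counts = tuple(int(v) for v in values if not v.endswith("%"))
    # Accept only the two layouts seen in the listings: 1+1 or 3+3 columns.
    if len(percents) != len(counts) or len(percents) not in (1, 3):
        raise ValueError(f"unexpected column layout: {line!r}")
    return ProfileRow(symbol, percents, counts)


row = parse_row("5.69% 5.74% 92.22% 87128 87816 1411348 _P_Ticker")
print(row.symbol, row.percents[-1], row.counts[-1])
```

Rejecting any other column layout means a row damaged in copy and paste (for example two counts run together) raises an error instead of being silently misread.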
Eero Tamminen wrote: Thanks for the profiles, they're interesting!
Btw. Especially when listing worst frame profiles, it would be interesting to see also the timing for that frame, shown by the profiler. I.e. how bad the worst frame really is.
I actually have a problem right now with normal profiling - it doesn't seem to end automatically when the demo expires. Not really sure what's going on there, because worst-frame profiling quits properly at the end of the demo. I tried fixing the breakpoints in normal profile mode to match worst-frame mode in the script/config, but it didn't help. So I have to quit normal profiles by hand. This makes their total time a bit bogus.
dml wrote: I actually have a problem right now with normal profiling - it doesn't seem to end automatically when the demo expires. Not really sure what's going on there because worst-frame profiling quits properly at the end of the demo. I tried fixing the breakpoints in normal profile mode to match worst-frame mode, in the script/config but didn't help. So I have to quit normal profiles by hand. This makes their total time a bit bogus.
Could you enable your hg server? In the profile.sh I have, normal profiles are missing the following breakpoints, which are there for worst-frame profiling: pc=BM_E_PlayerDeath, pc=_I_Quit.
Ok, there clearly isn't anything loaded from disk anymore during the worst frame, and the thinking part is nearly 2x the rendering speed.
dml wrote: Worst-frame is ok, but the duration shown is quite short (it's just for the single frame).
Knowing the total demo replay time in normal profiling mode is helpful because it gives an accurate measurement of an optimization's +/- effect...
For reference, the worst-think time was:
Code:
Time spent in profile = 0.09540s
Worst-render time was:
Code:
Time spent in profile = 0.16652s.
Not too bad really, considering I picked a nearly-worst-case map
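To put the two worst-frame figures side by side, here is a small Python sketch that pulls the duration out of a "Time spent in profile" line and compares them. The helper name `profile_seconds` is invented for this example, and it assumes the summary line always has the form shown in this thread.

```python
import re

# Matches summary lines such as "Time spent in profile = 0.09540s".
TIME_RE = re.compile(r"Time spent in profile = (\d+\.\d+)s")


def profile_seconds(line: str) -> float:
    """Return the profile duration in seconds from a summary line."""
    match = TIME_RE.search(line)
    if match is None:
        raise ValueError(f"no profile time found in {line!r}")
    return float(match.group(1))


worst_think = profile_seconds("Time spent in profile = 0.09540s")
worst_render = profile_seconds("Time spent in profile = 0.16652s.")
ratio = worst_render / worst_think

print(f"worst think:  {worst_think * 1000:.1f} ms")
print(f"worst render: {worst_render * 1000:.1f} ms")
print(f"render / think: {ratio:.2f}x")
```

With these two numbers the worst render frame takes a bit over 1.7 times as long as the worst think frame, which fits the "nearly 2x" remark in the thread.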
Eero Tamminen wrote: Could you enable your hg server? In profile.sh I have, normal profiles are missing following breakpoints, which are there for worst frame profiling: pc=BM_E_PlayerDeath, pc=_I_Quit.
These are the ones I added myself - but I may not have done it correctly. I'll start the server in a few minutes (didn't notice it was down).
Eero Tamminen wrote: Ok, there clearly isn't anything loaded from disk anymore during worst frame, and thinking part is nearly 2x of rendering speed.
Something else that might be interesting, is checking the worst thinking vs. rendering times for different quality / screen size settings. With higher settings you should get clearly higher worst rendering time, but it's also interesting to see whether that has any effect on what's worst thinking frame (e.g. due to different timings or different memory allocation patterns).
Could you add to repository BM config files which set up most demanding and least demanding rendering? I could automate profiling of both.
This can probably be done via the default.cfg, so I'll prepare two or three different settings.
Code:
Used cycles:
54.41% 54.74% 76.07% 731968 736452 1023372 _P_PathTraverse
7.85% 7.90% 7.90% 105568 106224 106224 _PIT_AddThingIntercepts
6.71% 6.73% 6.73% 90272 90548 90548 _BM_A_Mux3x2
5.82% 5.85% 92.89% 78356 78768 1249784 _P_Ticker
Eero Tamminen wrote: Regarding 4MB limit and this in the known-issues file:
Code:
- add 2D fastpath (Laurent's compiled sprites)
Do compiled sprites add a lot to memory consumption?
They would, because most source graphics are stored as 8-bit and compiled sprites take approx. 16 bits per pixel or so (assuming a register move per pixel - but it could encode to a smaller or larger size).
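As a rough feel for what that means in memory terms, the Python sketch below compares 8 bits per pixel against the approximate 16 bits per pixel mentioned above. The 64x64 sprite size is a made-up example, and the estimate ignores transparent pixels and any per-sprite overhead, so it only illustrates the ratio and is not a measurement.

```python
def sprite_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Estimate storage for one sprite, rounded up to whole bytes."""
    return (width * height * bits_per_pixel + 7) // 8


# Hypothetical 64x64 sprite, chosen only for illustration.
raw = sprite_bytes(64, 64, 8)        # 8-bit source graphics
compiled = sprite_bytes(64, 64, 16)  # ~16 bits per pixel when compiled

print(f"raw: {raw} bytes, compiled: {compiled} bytes, "
      f"ratio: {compiled / raw:.1f}x")
```

Under these assumptions a compiled sprite takes about twice the space of its 8-bit source, which is why it matters for the 4MB target.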
Code:
Time spent in profile = 0.39641s.
...
Used cycles:
72.97% 73.36% 76.89% 4640472 4665580 4889532 _W_CheckNumForName
11.96% 12.00% 12.20% 760816 763036 775544 _V_DrawPatch
5.72% 363540 clearlongs
2.59% 2.60% 2.60% 164612 165628 165628 _BM_A_Mux1x2
1.22% 1.23% 1.23% 77728 78304 78304 _BM_V_Sync
0.56% 0.56% 0.56% 35640 35752 35752 _strupr
Code:
Time spent in profile = 0.24426s.
...
Used cycles:
14.99% 587236 R_Column_TMip_1x1
14.51% 568484 R_Column_NMip_1x1
5.61% 219992 R_BSPHyperPlane
5.54% 217064 R_SCShader_Masked_1x1
4.68% 183232 R_VPShaderStdDSP_1x1
4.34% 170076 R_AddLine_loop
4.25% 4.26% 4.26% 166404 166880 166880 R_VPCommitZoneFlush
3.89% 152264 R_ProcessSubSector
3.71% 145252 R_VPShaderSkyCyl_1x1
3.67% 3.68% 4.19% 143720 144192 164172 R_AddSpriteSpans
3.47% 136084 R_AddOverlappingSprites
2.80% 2.81% 2.81% 109824 110304 110304 _BM_A_Mux1x2
2.20% 2.20% 40.96% 86164 86328 1604876 R_FlushDeferredSurfaces
1.88% 73636 R_PrepareSubSector
1.84% 72008 R_StackTransparentSurface
1.79% 1.79% 1.79% 70076 70076 70076 add_wall_segment
1.59% 1.60% 7.10% 62228 62792 278188 R_FlushVisPlaneZones
1.33% 52160 R_AddLine_end
1.24% 1.25% 5.31% 48748 48796 208172 get_ssector
1.22% 1.23% 1.23% 47692 48264 48264 R_VisPlaneStreamTexture
1.13% 44384 R_DrawSurface_NoTileMip1x1
1.12% 1.12% 1.12% 44052 44052 44052 R_ViewTestSpriteLines
1.03% 40184 R_AddSurface_loop
0.85% 33424 build_ssector
...
Instruction cache misses:
7.81% 7.83% 9.04% 5042 5055 5837 R_AddSpriteSpans
7.53% 4863 R_AddOverlappingSprites
6.12% 3955 R_ProcessSubSector
6.08% 3927 R_AddLine_loop
5.56% 5.56% 23.27% 3591 3594 15027 R_FlushDeferredSurfaces
4.85% 3135 R_PrepareSubSector
4.18% 4.18% 7.54% 2702 2698 4870 * get_ssector
4.16% 2685 R_AddSurface_loop
4.13% 4.13% 4.13% 2669 2669 2669 add_wall_segment
3.40% 3.42% 3.42% 2199 2210 2210 R_VPCommitZoneFlush
3.30% 2130 R_BSPHyperPlane
3.28% 2117 render_wall_solid
2.64% 1708 ignore_lower
2.36% 1527 R_Column_TMip_1x1
2.27% 1464 add_lower
2.23% 1439 R_Column_NMip_1x1
2.16% 1396 lower_texture
2.10% 1358 build_ssector
1.96% 1.97% 33.13% 1265 1270 21396 R_SubSectorTryFlush
Code:
Used cycles:
31.77% 2490240 command_base
25.00% 27.61% 27.61% 1959366 2163814 2163814 R_DoColumnPerspCorrect
6.18% 484060 R_ViewTestAddLine
4.13% 323710 VPRenderSpanQuickMipLOD_SMC
3.65% 285802 R_DoColumnTextureUV
3.58% 280728 R_VPRenderSky
3.32% 4.13% 30.77% 260210 323974 2411698 AddLowerWall
3.31% 259678 project_node
2.76% 215986 R_VPZoneCommitCF
2.10% 2.10% 2.10% 164536 164536 164536 VPZoneCommitDefrag_internal
1.68% 1.68% 1.68% 131662 131662 131662 R_VPZoneCommitFlush
1.31% 1.31% 1.31% 103056 103056 103056 R_BufferSurface
1.22% 95676 R_VPLoadTexture
1.03% 80760 R_CheckBBoxPair
0.74% 58174 R_VPRenderPlane
0.73% 0.73% 1.68% 57358 57358 131736 R_SetupSurface
0.70% 0.70% 0.70% 54582 54582 54582 VPRenderSpanWarp
0.69% 0.69% 0.69% 54160 54160 54160 divs_x1_a
0.62% 0.74% 3.05% 48546 57986 239056 AddMidWall
Eero Tamminen wrote: Could the above be because I visited the options menu at the start of the game?
I think so, yes
dml wrote: I think it is mostly working now on 4mb with tonight's changes but is failing on the final switch at level end. It seems to alloc (and fail to find) a very large chunk for some reason and the size seems to vary. Looks wrong, so will investigate. Apart from that just more optimization work to do. The memory managers are now joined and seem ok like that.
If you give profile.sh the "-s" option, it asks Hatari to provide stack traces for Fread() calls and redirects the output to the "hatari.log" file. If you want backtraces for allocations, just change the Fread() backtrace breakpoint to something appropriate for allocations.
Eero Tamminen wrote: If you give profile.sh "-s" option, it asks Hatari to provide stack traces to Fread() calls and redirects the output to "hatari.log" file. If you want backtraces to allocations, just change the Fread() backtrace breakpoint to something appropriate for allocations.
That's useful
dml wrote: Have posted a short update on progress: http://devilsdoorbell.com/2014/03/17/progress-on-beta/
Contributions list says "MIDI support: closed". Is the Doom -> MIDI music playback code already done? I didn't see it in the commits yet.
Eero Tamminen wrote: Contributions list says "MIDI support: closed". Is the Doom -> MIDI music playback code already done? I didn't see it in the commits yet.
No, it just means the position is closed, i.e. somebody is already looking at it
dml wrote: A brief (2mb) vid recorded from Hatari showing freelook working with skies, using HD sky textures. Previously the sky would be locked to a fixed horizon when the player looks up, would tile vertically and just spoiled the effect.
FPS looks choppy and colours/contrast look strange in the AVI recording - but the idea is clear enough.
https://dl.dropboxusercontent.com/u/129 ... elook2.AVI
Hi doug