dml wrote: I recommend filtering out symbols like R_SwitchSurface[*] and MARKER as these are hiding or breaking up real functions in the profile view. I still have some things to tidy up.

As you know what the bmengine symbols are for, I think it's better if you add that filtering.

MARKER seems to be most of R_DrawTSurface_Masked2().
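
To illustrate why such labels break up real functions, here is a rough sketch of what that filtering could look like in a post-processing script. It is not how the Hatari profiler tools do it; the function names, the pattern list and all addresses below are made up for the example. The idea is only that an address is attributed to the nearest preceding symbol, so dropping internal labels lets the enclosing function collect those counts.

```python
import bisect
import fnmatch

# Hypothetical patterns for labels that are not real function entry points.
IGNORED = ["R_SwitchSurface*", "MARKER"]

def filter_symbols(symbols, patterns=IGNORED):
    """Return the (address, name) pairs whose name matches none of the patterns."""
    return [(addr, name) for addr, name in symbols
            if not any(fnmatch.fnmatchcase(name, p) for p in patterns)]

def resolve(symbols, addr):
    """Map an address to the nearest preceding symbol name, or None."""
    ordered = sorted(symbols)
    addrs = [a for a, _ in ordered]
    i = bisect.bisect_right(addrs, addr) - 1
    return ordered[i][1] if i >= 0 else None

# Made-up addresses, for illustration only.
symbols = [
    (0x1000, "R_DrawTSurface_Masked2"),
    (0x1040, "MARKER"),
    (0x1100, "R_SwitchSurface_T8"),
    (0x1200, "D_GeneratePostBuffer"),
]

filtered = filter_symbols(symbols)
print(resolve(symbols, 0x1050))   # attributed to the internal label
print(resolve(filtered, 0x1050))  # attributed to the enclosing function
```

With the unfiltered list, an address just after MARKER is charged to MARKER; with the filtered list, it falls back to the function that starts before it.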
Here's another profile from Doom1, from longer gameplay and another level (and of course C-code built with gcc v2.x):
Time spent in profile = 0.49573s.
Visits/calls:
- max = 592, in R_BSPHyperPlane_RHS at 0x4ad82, on line 6692
- 6823 in total
Executed instructions:
- max = 20008, in D_GeneratePostBuffer+78 at 0x496d2, on line 5909
- 1099371 in total
Used cycles:
- max = 161216, in D_GeneratePostBuffer+110 at 0x496f2, on line 5923
- 7952784 in total
Instruction cache misses:
- max = 1080, in _P_RunThinkers+376 at 0x35d42, on line 3340
- 68909 in total
...
Executed instructions:
32.33% 32.36% 35.22% 355372 355800 387240 D_GeneratePostBuffer
11.51% 11.52% 11.92% 126483 126625 131015 D_PatchRender_8_8
8.08% 88782 R_SwitchSurface_T8
7.95% 7.95% 8.04% 87349 87369 88376 _subframe_block
5.24% 5.25% 10.00% 57643 57743 109918 _P_RunThinkers
3.69% 40515 R_VisPlaneShader
2.72% 29918 MARKER
2.25% 24702 R_DrawSurface_NW
2.21% 24249 R_SwitchSurface_T4
2.10% 2.10% 2.50% 23039 23099 27469 turbo_memclr
1.91% 1.91% 1.91% 20948 20986 20986 _R_DrawColumn
1.76% 19377 R_SwitchSurface_T2
1.62% 1.63% 2.35% 17858 17934 25835 _V_DrawPatch
1.18% 1.18% 1.18% 12944 12994 12994 stream_texture
1.04% 11465 R_VisPlaneSkyShader
0.95% 0.95% 0.95% 10407 10466 10466 stack_visplane_area
0.85% 0.85% 2.76% 9354 9361 30347 _R_DrawMaskedColumn
0.75% 0.75% 0.75% 8241 8248 8248 _R_PointInSubsector
0.56% 6102 R_StackTransparentSurface
0.55% 0.56% 4.08% 6089 6151 44806 _P_SetMobjState
0.51% 0.51% 1.13% 5620 5651 12425 _P_UpdateSpecials
DSP side:
Used cycles:
67.72% 10770918 command_base
13.74% 14.54% 14.54% 2184912 2312488 2312488 R_DoColumnPerspCorrect
6.10% 6.10% 6.10% 969762 969762 969762 VPRenderSpanDT
2.46% 391488 R_VPLoadTexture
1.27% 1.27% 1.27% 201962 201962 201962 extract_subvisplane
1.04% 165012 R_ViewTestAddLine
1.02% 162458 R_DoColumnTextureUV
0.93% 148550 R_VPRenderSky
0.57% 89928 ALGO_P_CrossBSPNode
0.53% 84776 R_VPRenderPlane
I need separate Hatari counters for that, but I could e.g. use the DSP instruction counter to determine the worst rendering part and the CPU instruction counter to determine the worst thinking part.