dml wrote:Looks like there are different versions of the Jag WAD already, and related conversion tools. I'll have to dig through it all to find out what's been going on and what resources are available now.
http://www.doomworld.com/vb/wads-mods/5 ... onversion/
MrMaddog wrote:I'll have to check it out... Since the TC wad works with vanilla Doom (split into episodes) it should be playable on the Falcon version of Doom, wouldn't that be a kicker?
dml wrote:There is no code/rendering profiler in BM. Adding one would have been a good move: seeing the percentage impact of the various kinds of drawing, and their relative use, would guide sensible optimisation effort. A prime-number-based TimerC event sampler, coupled with a task index updated in the main code, would be enough to build a decent picture over a few seconds, and it would be quite easy to implement.
Eero Tamminen wrote:No need to implement that in BM: the Hatari debugger includes a profiler, and it supports both the CPU and the DSP side:
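As a rough illustration of the sampling idea described above, here is a minimal Python simulation (not BadMood code). The task names and tick counts are made up, and the "task index" is modelled simply as whichever task is active at the sampled tick. It also shows why a prime sampling period helps: a period that shares a factor with the frame length keeps landing on the same few spots.

```python
from bisect import bisect_right
from collections import Counter
from itertools import accumulate


def sample_profile(frame, period, n_samples):
    """Sample a repeating frame of (task_name, ticks) every `period` ticks.

    Returns the fraction of samples that landed in each task.
    """
    names = [name for name, _ in frame]
    # Cumulative end tick of each task within one frame.
    ends = list(accumulate(ticks for _, ticks in frame))
    frame_len = ends[-1]
    hits = Counter()
    for i in range(1, n_samples + 1):
        phase = (i * period) % frame_len          # position inside the frame
        hits[names[bisect_right(ends, phase)]] += 1
    return {name: hits[name] / n_samples for name in names}


# Hypothetical frame: 50% walls, 30% floors, 20% sprites.
FRAME = [("walls", 500), ("floors", 300), ("sprites", 200)]

prime_result = sample_profile(FRAME, period=997, n_samples=1000)
aliased_result = sample_profile(FRAME, period=500, n_samples=1000)
print(prime_result)
print(aliased_result)
```

With the prime period the sampled shares match the real ones. With a period of 500 the samples alternate between two fixed points in the frame, so sprites never show up at all.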
http://hg.tuxfamily.org/mercurialroot/h ... e_debugger
If you load DSP symbol address information to the debugger, it can also show how many times certain symbols get called, or a trace of how the symbols get called.
dml wrote:Thanks for that. It will be very useful - especially the DSP profiler which is something I didn't have access to before. I think the original DSP debugger for the Falcon showed cycle times (or stalls) next to DSP instructions, taking external memory sources into account. That was pretty useful because you could tell quickly if relocating code or buffers into local/low memory would yield an instant win without inspecting the addresses. A small thing, but really quite valuable.
The CPU profiler is probably really useful for getting a good view of whole program performance. I'd probably use a mix of profiling techniques for BadMood - finding concurrency bottlenecks can involve a bit more than symbols and an address sampling profiler can be quite useful there. Optimizing fragments is also sometimes best done with a separate profiling harness.
dml wrote:I tried a quick test on BadMood using 256 colour + c2p (chunky pixels -> atari bitplanes) instead of 16bit truecolour, before I go too far down the optimization route.
It's clearly not a win.
More than 25% of total time is now c2p, and the main rendering code is only a little faster with 8bit pixels. It could be sped up further for 8bit, but that's not going to offset the extra cost whatever happens.
For the time being 16bit chunky is still the way to go on the 030 for BadMood.
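For readers unfamiliar with c2p, here is a deliberately naive Python reference of what the conversion has to do. It is only a sketch to show the work involved, not the routine used in the test above and nothing like an optimized 030 implementation. It assumes the Atari interleaved bitplane layout, where each group of 16 pixels is stored as one 16-bit word per plane with the leftmost pixel in the most significant bit.

```python
def c2p_16(pixels, planes=8):
    """Convert 16 chunky pixels into `planes` 16-bit bitplane words.

    pixels: sequence of 16 colour indices (one int per pixel).
    Returns a list of words, plane 0 first.
    """
    if len(pixels) != 16:
        raise ValueError("expected exactly 16 pixels")
    words = [0] * planes
    for x, pix in enumerate(pixels):
        for p in range(planes):
            if (pix >> p) & 1:
                # Leftmost pixel (x == 0) goes into bit 15 of each plane word.
                words[p] |= 0x8000 >> x
    return words


print([hex(w) for w in c2p_16([1] + [0] * 15)])
```

Done this way, every 16-pixel group costs 16 x 8 single-bit moves, which is why c2p is a significant extra pass compared with writing 16bit chunky pixels straight to the screen.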
calimero wrote:hm... evil or kalms / dhs, or maybe mikro, or the amiga coders from tbl should have a lot of experience with c2p.
if I am not mistaken, 25% of CPU is far too much for c2p alone! I think there are much faster techniques today (mainly from the amiga world)
but do not take my writing too seriously; best talk to people at http://dhs.nu/bbs-scene/ first
dml wrote:I have played a little now with the Hatari debugger and profiler so I get the general idea. I had trouble getting HiSoft symbols imported (had to hand-reformat the text file) but it worked once I had done that.
dml wrote:I can probably use the Hatari profiler to find out what percentage load the DSP is carrying as well, since idle time will show as excess activity in the command processing loop. If the DSP is idle a lot of the time it could be used more with some reorganization. I see there is still quite a bit of CPU-side math for walls which doesn't seem like it has to be done by the CPU.
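A small Python sketch of the estimate described above: treat the cycles spent in the command processing loop as idle time and everything else as useful work. The symbol names and cycle counts here are hypothetical, and the input is assumed to be a per-symbol cycle total taken from a profiler run. Since the loop also does some real dispatch work, the result is best read as a lower bound on the DSP load.

```python
def dsp_load(cycles_by_symbol, idle_symbols):
    """Return the fraction of profiled cycles spent outside the idle symbols."""
    total = sum(cycles_by_symbol.values())
    if total == 0:
        raise ValueError("profile contains no cycles")
    idle = sum(cycles_by_symbol.get(sym, 0) for sym in idle_symbols)
    return (total - idle) / total


# Hypothetical per-symbol cycle totals from one profiling run.
profile = {"cmd_loop": 600, "wall_calc": 300, "tex_map": 100}
load = dsp_load(profile, idle_symbols={"cmd_loop"})
print(f"DSP busy roughly {load:.0%} of the time")
```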
dml wrote:I guess I knew all this at the time and just got tired picking at areas when nothing specific stood out.
Eero Tamminen wrote:If you show what the HiSoft symbols text format looks like, or better, send an example (to address oak at helsinkinet fi), I can write a simple awk or python script that converts it to Hatari format and include it with Hatari.
Eero Tamminen wrote:If there's something where the debugger / profiler could help, just ask and I'll look at how feasible it is. An emulator can in theory access all the information in the emulated system, so a lot is possible, if not always practical.
Eero Tamminen wrote:I know that feeling... (from looking into Qt GUI toolkit memory usage)
dml wrote:I can already do a lot of cross-dev work without shuffling things between PC/Atari on floppies/flash cards etc.
Cyprian wrote:lotek, that c2p is not optimized for the 030:
As far as I remember, the best one for the 030 and Falcon was written by Kalms and optimized by Mikro:
"Chunky to planar routines by Kalms / TBL and Mikro / Mystic Bytes "
dml wrote:It would be great to get a complete disassembly/dump of the program with the incidence 'counts' and cycle times immediately beside the code, either left or right side - doesn't matter. This would make for excellent analysis and some nice Python tools could be made to work on that information.
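To give an idea of the kind of Python tool that could work on such a dump, here is a minimal sketch. The line format (hex address, execution count, total cycles, instruction text, separated by whitespace) is invented here purely for illustration and is not Hatari's actual output; the sample listing is made up as well.

```python
def rank_hotspots(dump_text, top=3):
    """Rank instructions by total cycles in a hypothetical annotated dump.

    Each non-empty line: <hex address> <count> <cycles> <instruction...>
    Returns (address, instruction, count, share_of_cycles) tuples.
    """
    rows = []
    for line in dump_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        addr, count, cycles, instr = line.split(None, 3)
        rows.append((int(cycles), int(count), int(addr, 16), instr))
    total = sum(row[0] for row in rows)
    if total == 0:
        return []
    rows.sort(key=lambda row: row[0], reverse=True)
    return [(f"${addr:06x}", instr, count, cycles / total)
            for cycles, count, addr, instr in rows[:top]]


# Made-up sample in the hypothetical format described above.
SAMPLE = """
# addr   count cycles instruction
0001a2c0 120   480    move.w (a0)+,d0
0001a2c2 120   1440   muls.w d1,d0
0001a2c4 120   480    add.l d0,d2
"""

for addr, instr, count, share in rank_hotspots(SAMPLE):
    print(f"{addr}  {share:6.1%}  x{count}  {instr}")
```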
Code: Select all
b pc = text