ScummVM/Falcon060 pre-release


Eero Tamminen
Fuji Shaped Bastard
Posts: 3674
Joined: Sun Jul 31, 2011 1:11 pm

Re: ScummVM/Falcon060 pre-release

Post by Eero Tamminen »

Teki wrote: Tue Apr 16, 2024 11:08 am - alice not working says no scumm game found
The only "alice" I found in the ScummVM games list is this: https://wiki.scummvm.org/index.php?titl ... ive_Museum ?

It's listed as supported, but there's no demo for it, and this says the support level is "Untested": https://www.scummvm.org/compatibility/

Maybe your game version does not match any of the detected files?

Code:

$ grep '"alice"' engines/director/detection_tables.h
        { "alice",				"Alice: An Interactive Museum" },
	MACGAME1_l("alice", "", "Alice", "e54ec74aeb4355b0acd46320327c1bed", 271740, Common::JA_JPN, 200),
	MACGAME1_l("alice", "Digipak", "Alice", "e54ec74aeb4355b0acd46320327c1bed", 274018, Common::JA_JPN, 200),
	MACGAME1("alice", "", "Alice", "3b61149c922f0fd815ca29686e4f554a", 304458, 400),
	WINGAME1t("alice", "", "ALICE_W/ALICE.EXE", "da6b3cb75f548d5c79ef831320b97035", 684733, 400),
	MACGAME1_l("alice", "Hybrid", "Alice", "3b61149c922f0fd815ca29686e4f554a", 304486, Common::JA_JPN, 404),
	WINGAME1t_l("alice", "Hybrid", "ALICE_W/ALICE.EXE", "ea9c19490428c8ef13934d3c159e1950", 684733, Common::JA_JPN, 404),
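
A quick way to check a local copy against these entries is to compare its size first and only then look at checksums. This is a minimal sketch, assuming the number in each entry is a file size in bytes and the hex string is an MD5. The grep output does not show which part of the file ScummVM hashes (whole file, only some leading bytes, or the resource fork for the Mac entries), so the `limit` argument is only there to try different guesses. `size_matches` and `md5_of` are made-up helper names, not ScummVM functions.

```python
import hashlib
import os

# Sizes in bytes, copied from the detection table entries above,
# keyed by lower-cased file name.
KNOWN_SIZES = {
    "alice": {271740, 274018, 304458, 304486},
    "alice.exe": {684733},
}


def size_matches(path):
    """Return (matches, size): does the file size equal any table entry?"""
    name = os.path.basename(path).lower()
    size = os.path.getsize(path)
    return size in KNOWN_SIZES.get(name, set()), size


def md5_of(path, limit=None):
    """MD5 hex digest of the whole file, or of its first `limit` bytes.

    Which of the two the detector uses is an assumption to verify.
    """
    with open(path, "rb") as f:
        data = f.read() if limit is None else f.read(limit)
    return hashlib.md5(data).hexdigest()
```

If `size_matches("ALICE_W/ALICE.EXE")` already reports no match, the copy is a version the table does not know about, and the checksum does not matter.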
Teki wrote: Tue Apr 16, 2024 11:08 am - buzz explore the jungle is working good
- dragonsphere says not fully supported then stays black at loading
Yes, a game not working is expected when it's not (fully) supported: https://wiki.scummvm.org/index.php?title=Dragonsphere

(At least that's been my experience with all games that have not been stated to be fully supported.)
Teki wrote: Tue Apr 16, 2024 11:08 am - spyfox 1 working good
- spyfox 2 working but intro sound stuttering
- torins passage working but in intro scenes background sometimes slears (all with single buffering)
If you want to avoid tearing, you need to use triple buffering mode (mentioned in the Atari backend readme).

Re: ScummVM/Falcon060 pre-release

Post by Eero Tamminen »

mikro wrote: Tue Apr 16, 2024 8:39 pm Spy Fox 2 on the other hand, that's quite interesting. Its demo is perfectly playable as well (incl. the intro). However the full version's intro seems to be doing something terrible. Even 060 + SuperVidel + direct rendering + native 11025 Hz sample rate + Eero's patch can't make the playback not stutter. If you have some time Eero, try to run the full version of Spy Fox, whether it differs in the profiler somehow (I certainly hope so!)
I did not wait for the intro to finish, but here's a quick profile of the first nearly 10 minutes:

Code:

Time spent in profile = 570.84345s.
...
Used cycles:
  67.31%  68.37%  70.15%  12328281008 12523038014 12848665451   Scumm::AkosRenderer::byleRLEDecode(Scumm::BaseCostumeRenderer::ByleRLEData&)
   6.21%   6.28%   7.06%  1137351272 1150597218 1292549631   Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
   3.64%   3.69%   3.99%   667121204 676727822 730372948   Audio::RateConverter_Impl<false, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   2.67%                   489390776                       c2p1x1_8_rect_start
   2.06%                   376549702                       c2p1x1_8_rect_pix16
   1.55%                   284445408                       ROM_TOS
I.e. the bulk of the cycles (about 70%) is spent in byleRLEDecode().

The data it processes seems to fit OKish into cache:

Code:

Data cache hits:
  74.80%  74.99%  76.53%   667168886 668900395 682580315   Scumm::AkosRenderer::byleRLEDecode(Scumm::BaseCostumeRenderer::ByleRLEData&)
   2.89%   2.90%   4.55%    25807047  25854567  40587950   Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
But making its code fit better in cache could help:

Code:

Instruction cache misses:
  69.94%  75.97%  79.17%   320754569 348431584 363101102   Scumm::AkosRenderer::byleRLEDecode(Scumm::BaseCostumeRenderer::ByleRLEData&)
   8.78%                    40286150                       ROM_TOS
   5.41%   5.72%   5.86%    24811325  26232596  26860618   Audio::RateConverter_Impl<false, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   1.83%   1.96%   7.88%     8413899   8983797  36138646   Scumm::Gdi::resetBackground(int, int, int)
I'll post the annotated assembly later on.
mikro
Hardware Guru
Posts: 4185
Joined: Sat Sep 10, 2005 11:11 am
Location: Kosice, Slovakia

Re: ScummVM/Falcon060 pre-release

Post by mikro »

Hmm... so is it again about the amount of data being processed? You probably can't see it in Hatari but it's not only about audio -- the animation itself has very bad framerate, so no wonder sound is crap, too. But I have no explanation for it, both the demo and full version are 640x480, both use the same sample rate (11025 Hz), it doesn't seem that much is happening on the full version's intro screen either... weird.

Re: ScummVM/Falcon060 pre-release

Post by Eero Tamminen »

mikro wrote: Wed Apr 17, 2024 6:27 am Hmm... so is it again about the amount of data being processed? You probably can't see it in Hatari but it's not only about audio -- the animation itself has very bad framerate, so no wonder sound is crap, too. But I have no explanation for it, both the demo and full version are 640x480, both use the same sample rate (11025 Hz), it doesn't seem that much is happening on the full version's intro screen either... weird.
Looking at the demo intro, it seems to be done like the rest of the game's animation, but the full version looks more like video playback. Maybe the encoding for the latter is crap in the sense that each frame is a full 640x480 image, instead of encoding just the changed part of the screen?

As to the disassembly and differences in it...

byleRLEDecode() usage in demo startup:

Code:

# <instructions percentage>% (<sum of instructions>, <sum of cycles>, <sum of i-cache misses>, <sum of d-cache hits>)
...
Scumm::AkosRenderer::byleRLEDecode(Scumm::BaseCostumeRenderer::ByleRLEData&):
$010e4590  adda.w    #$fff0,sp            0.00% (1913, 5739, 0, 0)
$010e4594  movem.l   d2-d7/a2-a6,-(sp)    0.00% (1913, 66955, 2, 0)
$010e4598  movea.l   $40(sp),a3           0.00% (1913, 19126, 1915, 1913)
$010e459c  movea.l   $44(sp),a2           0.00% (1913, 7660, 2, 1912)
$010e45a0  move.l    $4(a2),d6            0.00% (1913, 7660, 2, 1912)
$010e45a4  movea.l   $48(a3),a6           0.00% (1913, 21045, 4, 0)
$010e45a8  movea.l   $18(a2),a4           0.00% (1913, 19136, 1913, 1910)
$010e45ac  move.b    $27(a2),d3           0.00% (1913, 7675, 3, 1910)
$010e45b0  move.b    $26(a2),d4           0.00% (1913, 7651, 3, 1913)
$010e45b4  move.l    $5e(a3),d7           0.00% (1913, 21073, 3, 1910)
$010e45b8  move.l    $8(a2),d2            0.00% (1913, 20868, 1913, 1035)
$010e45bc  add.l     $34(a2),d2           0.00% (1913, 6815, 3, 1906)
$010e45c0  move.l    (a2),d0              0.00% (1913, 3825, 3, 1913)
$010e45c2  moveq     #$7,d1               0.00% (1913, 3823, 0, 0)
$010e45c4  and.l     d0,d1                0.00% (1913, 3829, 3, 0)
$010e45c6  moveq     #$7f,d5              0.00% (1913, 3823, 0, 0)
$010e45c8  not.b     d5                   0.00% (1913, 17199, 1913, 0)
$010e45ca  asr.l     d1,d5                0.00% (1913, 1913, 0, 0)
$010e45cc  movea.l   $2a(a3),a1           0.00% (1913, 21046, 3, 1913)
$010e45d0  clr.l     d1                   0.00% (1913, 1919, 3, 0)
$010e45d2  move.b    $1a(a3),d1           0.00% (1913, 21049, 3, 0)
$010e45d6  move.l    d1,-(sp)             0.00% (1913, 7652, 0, 0)
$010e45d8  move.l    d6,-(sp)             0.00% (1913, 22956, 1913, 0)
$010e45da  move.w    $7572(a1),d1         0.00% (1913, 16945, 0, 34)
$010e45de  andi.w    #$7,d1               0.00% (1913, 5773, 0, 0)
$010e45e2  andi.l    #$ffff,d1            0.00% (1913, 11478, 0, 0)
$010e45e8  sub.l     d1,d0                0.00% (1913, 17217, 1913, 0)
$010e45ea  move.l    d0,-(sp)             0.00% (1913, 7652, 0, 0)
$010e45ec  move.l    a1,-(sp)             0.00% (1913, 5739, 0, 0)
$010e45ee  lea.l     $1119f3a.l,a5        0.00% (1913, 5739, 0, 0)
$010e45f4  jsr (a5)                       0.00% (1913, 40173, 3826, 0)
$010e45f6  movea.l   d0,a1                0.00% (1913, 3820, 0, 0)
$010e45f8  adda.w    #$10,sp              0.00% (1913, 19124, 1913, 0)
$010e45fc  tst.b     d3                   0.00% (1913, 3827, 1, 0)
$010e45fe  bne.w     $10e475c             0.00% (1913, 7652, 1, 0)
$010e4602  moveq     #$ff,d1              0.00% (1913, 3825, 0, 0)
$010e4604  move.l    d1,$30(sp)           0.00% (1913, 13391, 1, 0)
$010e4608  move.b    (a6),d0              0.00% (1913, 34438, 1915, 0)
$010e460a  clr.l     d4                   0.00% (1913, 1913, 0, 0)
$010e460c  move.b    d0,d4                0.00% (1913, 3827, 1, 0)
$010e460e  clr.l     d1                   0.00% (1913, 3825, 0, 0)
$010e4610  move.b    $25(a2),d1           0.00% (1913, 21044, 1, 0)
$010e4614  asr.l     d1,d4                0.00% (1913, 1915, 1, 0)
$010e4616  move.b    d0,d3                0.00% (1913, 3825, 0, 0)
$010e4618  and.b     $24(a2),d3           0.00% (1913, 19124, 1913, 1913)
$010e461c  bne.w     $10e4740             0.00% (1913, 15057, 1062, 0)
$010e4620  lea.l     $2(a6),a0            0.01% (122142, 367597, 320, 0)
$010e4624  move.l    a0,$34(sp)           0.01% (122142, 855006, 327, 0)
$010e4628  move.b    $1(a6),d3            0.01% (122142, 1397237, 122013, 99889)
$010e462c  move.b    $1c(a3),d0           0.62% (5058781, 18340128, 10864, 4602078)
$010e4630  cmpi.b    #$ff,d0              0.62% (5058781, 19780308, 10975, 0)
$010e4634  beq.b     $10e463e             0.62% (5058781, 14836171, 732051, 0)
[...]
$010e463e  tst.b     $28a(a3)             0.62% (5058781, 22878894, 36617, 4561521)
$010e4642  beq.w     $10e4772             0.62% (5058781, 55002718, 5106487, 0)
[...]
$010e465c  adda.l    $32(a3),a4           0.30% (2489282, 11446276, 36307, 4742377)
$010e4660  adda.l    $44(a3),a1           0.30% (2489282, 11969081, 36357, 2169240)
$010e4664  addq.l    #$1,d6               0.30% (2489282, 4697846, 36356, 0)
$010e4666  subq.w    #$1,d7               0.62% (5058781, 10059787, 138, 0)
$010e4668  bne.w     $10e471e             0.62% (5058781, 90107919, 10116038, 0)
$010e466c  move.l    $14(a2),d0           0.01% (85189, 938036, 977, 0)
$010e4670  subq.l    #$1,d0               0.01% (85189, 87125, 968, 0)
$010e4672  move.l    d0,$14(a2)           0.01% (85189, 596316, 977, 0)
$010e4676  beq.w     $10e4752             0.01% (85189, 764664, 84958, 0)
$010e467a  move.l    $5e(a3),$38(sp)      0.01% (83276, 1554946, 32, 38681)
$010e4680  move.l    $4(a2),d6            0.01% (83276, 96462, 17, 80736)
$010e4684  move.l    $8(a2),d2            0.01% (83276, 331719, 17, 82124)
$010e4688  movea.l   $34(a2),a6           0.01% (83276, 839662, 83282, 81518)
$010e468c  move.l    (a2),$30(sp)         0.01% (83276, 581198, 27, 82966)
$010e4690  move.b    $1b(a3),d0           0.01% (83276, 162789, 20, 72987)
$010e4694  move.l    $20(a2),d1           0.01% (83276, 389826, 19, 73415)
$010e4698  cmpi.b    #$ff,d0              0.01% (83276, 832690, 83288, 0)
$010e469c  beq.b     $10e46aa             0.01% (83276, 749460, 83314, 0)
[...]
$010e46aa  move.l    $30(sp),d0           0.01% (83276, 249869, 26, 83273)
$010e46ae  add.l     d1,d0                0.01% (83276, 166525, 4, 0)
$010e46b0  move.l    d0,(a2)              0.01% (83276, 416399, 21, 0)
$010e46b2  bmi.w     $10e4752             0.01% (83276, 83426, 21, 0)
$010e46b6  movea.w   $2e(a2),a0           0.01% (83276, 916327, 83290, 83237)
$010e46ba  cmpa.l    d0,a0                0.01% (83276, 83272, 2, 0)
$010e46bc  ble.w     $10e4752             0.01% (83276, 333084, 11, 0)
$010e46c0  movea.w   #$80,a1              0.01% (83276, 333091, 9, 0)
$010e46c4  moveq     #$7,d5               0.01% (83276, 166547, 6, 0)
$010e46c6  and.l     d0,d5                0.01% (83276, 166543, 0, 0)
$010e46c8  movea.l   d5,a0                0.01% (83276, 749388, 83284, 0)
$010e46ca  move.l    a1,d5                0.01% (83276, 83274, 2, 0)
$010e46cc  move.l    a0,d7                0.01% (83276, 166567, 19, 0)
$010e46ce  asr.l     d7,d5                0.01% (83276, 166529, 2, 0)
$010e46d0  movea.l   $2a(a3),a0           0.01% (83276, 916070, 32, 83276)
$010e46d4  clr.l     d7                   0.01% (83276, 83322, 26, 0)
$010e46d6  move.b    $7b2a(a0),d7         0.01% (83276, 1665320, 83294, 0)
$010e46da  muls.l    d1,d7                0.01% (83276, 249842, 38, 0)
$010e46de  movea.l   d7,a4                0.01% (83276, 166508, 12, 0)
$010e46e0  adda.l    $18(a2),a4           0.01% (83276, 333855, 42, 83165)
$010e46e4  move.l    a4,$18(a2)           0.01% (83276, 582849, 46, 0)
$010e46e8  movea.l   $30(a2),a1           0.01% (83276, 824826, 82200, 82507)
$010e46ec  move.w    $3a(sp),d7           0.01% (83276, 332447, 11, 83259)
$010e46f0  add.l     a6,d2                0.01% (83276, 166529, 7, 0)
$010e46f2  add.l     a1,d1                0.01% (83276, 166539, 0, 0)
$010e46f4  move.l    d1,$30(a2)           0.01% (83276, 582928, 15, 0)
$010e46f8  clr.l     d1                   0.01% (83276, 48540, 5396, 0)
$010e46fa  move.b    $1a(a3),d1           0.01% (83276, 250098, 4, 83237)
$010e46fe  move.l    d1,-(sp)             0.01% (83276, 416339, 6, 0)
$010e4700  move.l    d6,-(sp)             0.01% (83276, 249840, 8, 0)
$010e4702  move.w    $7572(a0),d1         0.01% (83276, 749505, 20, 0)
$010e4706  andi.w    #$7,d1               0.01% (83276, 257943, 1832, 0)
$010e470a  andi.l    #$ffff,d1            0.01% (83276, 497832, 789, 0)
$010e4710  sub.l     d1,d0                0.01% (83276, 167320, 770, 0)
$010e4712  move.l    d0,-(sp)             0.01% (83276, 415611, 8, 0)
$010e4714  move.l    a0,-(sp)             0.01% (83276, 249832, 4, 0)
$010e4716  jsr (a5)                       0.01% (83276, 1748798, 166568, 0)
$010e4718  movea.l   d0,a1                0.01% (83276, 748860, 83286, 0)
$010e471a  adda.w    #$10,sp              0.01% (83276, 249937, 117, 0)
$010e471e  subq.b    #$1,d3               0.62% (5056868, 5140112, 64, 0)
$010e4720  bne.w     $10e462c             0.62% (5056868, 51917082, 4596222, 0)
$010e4724  movea.l   $34(sp),a6           0.07% (581901, 6402704, 1853, 0)
$010e4728  move.b    (a6),d0              0.07% (581901, 9520225, 582045, 102429)
$010e472a  clr.l     d4                   0.07% (581901, 581899, 8, 0)
$010e472c  move.b    d0,d4                0.07% (581901, 1165650, 1865, 0)
$010e472e  clr.l     d1                   0.07% (581901, 1161944, 22, 0)
$010e4730  move.b    $25(a2),d1           0.07% (581901, 2933871, 1903, 490178)
$010e4734  asr.l     d1,d4                0.07% (581901, 1074286, 1871, 0)
$010e4736  move.b    d0,d3                0.07% (581901, 1161910, 22, 0)
$010e4738  and.b     $24(a2),d3           0.07% (581901, 5807459, 581993, 581880)
$010e473c  beq.w     $10e4620             0.07% (581901, 3176030, 123641, 0)
$010e4740  lea.l     $1(a6),a0            0.06% (461672, 1845653, 1764, 0)
$010e4744  move.l    a0,$34(sp)           0.06% (461672, 3231728, 1805, 0)
$010e4748  bra.w     $10e462c             0.06% (461672, 9392385, 1043816, 0)
[...]
$010e4752  movem.l   (sp)+,d2-d7/a2-a6    0.00% (1913, 71037, 3, 13391)
$010e4756  adda.w    #$10,sp              0.00% (1913, 5739, 0, 0)
$010e475a  rts                            0.00% (1913, 43047, 3828, 954)
[...]
$010e4772  movea.w   $28(a2),a0           0.62% (5058781, 17677289, 32184, 4691864)
$010e4776  cmp.l     a0,d6                0.62% (5058781, 9721781, 206, 0)
$010e4778  blt.w     $10e465c             0.62% (5058781, 20756996, 99746, 0)
$010e477c  movea.w   $2c(a2),a0           0.62% (5058781, 23322294, 12760, 4599177)
$010e4780  cmp.l     a0,d6                0.62% (5058781, 9672647, 12534, 0)
$010e4782  bge.w     $10e465c             0.62% (5058781, 20234678, 12782, 0)
$010e4786  movea.l   (a2),a0              0.62% (5058781, 13214086, 224, 4588234)
$010e4788  tst.l     a0                   0.62% (5058781, 10356803, 100555, 0)
$010e478a  blt.w     $10e465c             0.62% (5058781, 20147638, 13708, 0)
$010e478e  movea.w   $2e(a2),a6           0.62% (5058781, 20241026, 13776, 5058009)
$010e4792  cmpa.l    a0,a6                0.62% (5058781, 10103483, 178, 0)
$010e4794  ble.w     $10e465c             0.62% (5058781, 20234647, 13836, 0)
$010e4798  move.b    d5,d0                0.62% (5058781, 10737742, 99829, 0)
$010e479a  and.b     (a1),d0              0.62% (5058781, 45529339, 728, 0)
$010e479c  bne.w     $10e465c             0.62% (5058781, 15188865, 13266, 0)
$010e47a0  tst.w     d4                   0.62% (5058781, 10130105, 13244, 0)
$010e47a2  beq.w     $10e465c             0.62% (5058781, 22178178, 312898, 0)
$010e47a6  clr.l     d0                   0.31% (2569499, 5137279, 82, 0)
$010e47a8  move.w    d4,d0                0.31% (2569499, 5730534, 85411, 0)
$010e47aa  move.w    $64(a3,d0.l),$3a(sp) 0.31% (2569499, 25137037, 1365, 2160984)
$010e47b0  move.b    $d(a3),d1            0.31% (2569499, 4875068, 1041, 2258243)
$010e47b4  cmpi.b    #$1,d1               0.31% (2569499, 9967173, 928, 0)
$010e47b8  beq.w     $10e4870             0.31% (2569499, 10787920, 85417, 0)
$010e47bc  cmpi.b    #$2,d1               0.31% (2569499, 10277613, 352, 0)
$010e47c0  beq.w     $10e494a             0.31% (2569499, 10277642, 323, 0)
$010e47c4  move.l    $2a(a3),d0           0.31% (2569499, 28261065, 632, 2570012)
$010e47c8  cmpi.b    #$3,d1               0.31% (2569499, 8309616, 86486, 0)
$010e47cc  beq.b     $10e482e             0.31% (2569499, 5139160, 577, 0)
$010e47ce  movea.l   d0,a0                0.31% (2569499, 5138396, 76, 0)
$010e47d0  cmpi.b    #$2,$7b2a(a0)        0.31% (2569499, 33404772, 1439, 0)
$010e47d6  beq.w     $10e489c             0.31% (2569499, 8307349, 86270, 0)
$010e47da  move.b    $3b(sp),(a4)         0.31% (2569499, 19891255, 971, 2249725)
$010e47de  adda.l    $32(a3),a4           0.31% (2569499, 4289367, 747, 4902268)
$010e47e2  adda.l    $44(a3),a1           0.31% (2569499, 14306728, 668, 1936347)
$010e47e6  addq.l    #$1,d6               0.31% (2569499, 4505712, 60, 0)
$010e47e8  bra.w     $10e4666             0.31% (2569499, 10963249, 108679, 0)
[...]
Scumm::AkosRenderer::paintCelByleRLE(int, int):
And in the first ~10 min of the full version:

Code:

Scumm::AkosRenderer::byleRLEDecode(Scumm::BaseCostumeRenderer::ByleRLEData&):
$010e4590  adda.w    #$fff0,sp            0.00% (1588, 4764, 0, 0)
$010e4594  movem.l   d2-d7/a2-a6,-(sp)    0.00% (1588, 55580, 0, 0)
$010e4598  movea.l   $40(sp),a3           0.00% (1588, 15880, 1588, 1588)
$010e459c  movea.l   $44(sp),a2           0.00% (1588, 6352, 0, 1588)
$010e45a0  move.l    $4(a2),d6            0.00% (1588, 6352, 0, 1588)
$010e45a4  movea.l   $48(a3),a6           0.00% (1588, 17468, 0, 0)
$010e45a8  movea.l   $18(a2),a4           0.00% (1588, 15880, 1588, 1588)
$010e45ac  move.b    $27(a2),d3           0.00% (1588, 6500, 0, 1458)
$010e45b0  move.b    $26(a2),d4           0.00% (1588, 6222, 0, 1588)
$010e45b4  move.l    $5e(a3),d7           0.00% (1588, 31760, 0, 0)
$010e45b8  move.l    $8(a2),d2            0.00% (1588, 16440, 1588, 1308)
$010e45bc  add.l     $34(a2),d2           0.00% (1588, 6093, 0, 1585)
$010e45c0  move.l    (a2),d0              0.00% (1588, 3173, 0, 1588)
$010e45c2  moveq     #$7,d1               0.00% (1588, 3176, 0, 0)
$010e45c4  and.l     d0,d1                0.00% (1588, 3176, 0, 0)
$010e45c6  moveq     #$7f,d5              0.00% (1588, 3176, 0, 0)
$010e45c8  not.b     d5                   0.00% (1588, 14292, 1590, 0)
$010e45ca  asr.l     d1,d5                0.00% (1588, 1587, 0, 0)
$010e45cc  movea.l   $2a(a3),a1           0.00% (1588, 6348, 0, 3176)
$010e45d0  clr.l     d1                   0.00% (1588, 3176, 0, 0)
$010e45d2  move.b    $1a(a3),d1           0.00% (1588, 6373, 0, 1585)
$010e45d6  move.l    d1,-(sp)             0.00% (1588, 7937, 0, 0)
$010e45d8  move.l    d6,-(sp)             0.00% (1588, 19056, 1588, 0)
$010e45da  move.w    $7572(a1),d1         0.00% (1588, 14292, 0, 0)
$010e45de  andi.w    #$7,d1               0.00% (1588, 4764, 0, 0)
$010e45e2  andi.l    #$ffff,d1            0.00% (1588, 9528, 0, 0)
$010e45e8  sub.l     d1,d0                0.00% (1588, 14292, 1588, 0)
$010e45ea  move.l    d0,-(sp)             0.00% (1588, 6352, 0, 0)
$010e45ec  move.l    a1,-(sp)             0.00% (1588, 4764, 0, 0)
$010e45ee  lea.l     $1119f3a.l,a5        0.00% (1588, 4764, 0, 0)
$010e45f4  jsr (a5)                       0.00% (1588, 33348, 3176, 0)
$010e45f6  movea.l   d0,a1                0.00% (1588, 3164, 0, 0)
$010e45f8  adda.w    #$10,sp              0.00% (1588, 15856, 1588, 0)
$010e45fc  tst.b     d3                   0.00% (1588, 3180, 4, 0)
$010e45fe  bne.w     $10e475c             0.00% (1588, 9136, 352, 0)
$010e4602  moveq     #$ff,d1              0.00% (1414, 2824, 0, 0)
$010e4604  move.l    d1,$30(sp)           0.00% (1414, 9898, 4, 0)
$010e4608  move.b    (a6),d0              0.00% (1414, 25452, 1414, 0)
$010e460a  clr.l     d4                   0.00% (1414, 1414, 0, 0)
$010e460c  move.b    d0,d4                0.00% (1414, 2828, 0, 0)
$010e460e  clr.l     d1                   0.00% (1414, 2828, 0, 0)
$010e4610  move.b    $25(a2),d1           0.00% (1414, 15554, 0, 0)
$010e4614  asr.l     d1,d4                0.00% (1414, 1414, 0, 0)
$010e4616  move.b    d0,d3                0.00% (1414, 2828, 0, 0)
$010e4618  and.b     $24(a2),d3           0.00% (1414, 14140, 1414, 1414)
$010e461c  bne.w     $10e4740             0.00% (1414, 8449, 399, 0)
$010e4620  lea.l     $2(a6),a0            0.03% (1098697, 3309300, 12210, 0)
$010e4624  move.l    a0,$34(sp)           0.03% (1098697, 7690948, 12303, 0)
$010e4628  move.b    $1(a6),d3            0.03% (1098697, 12515653, 1092064, 899241)
$010e462c  move.b    $1c(a3),d0           1.60% (69118580, 242684154, 387798, 63823834)
$010e4630  cmpi.b    #$ff,d0              1.60% (69118580, 271220853, 389204, 0)
$010e4634  beq.b     $10e463e             1.60% (69118580, 165265543, 6195757, 0)
[...]
$010e463e  tst.b     $28a(a3)             1.60% (69118580, 559662048, 1488686, 28290537)
$010e4642  beq.w     $10e4772             1.60% (69118580, 713077139, 70800038, 0)
[...]
$010e465c  adda.l    $32(a3),a4           0.71% (30425583, 136270927, 1474545, 58680909)
$010e4660  adda.l    $44(a3),a1           0.71% (30425583, 134803843, 1475634, 28273832)
$010e4664  addq.l    #$1,d6               0.71% (30425583, 60278363, 1475371, 0)
$010e4666  subq.w    #$1,d7               1.60% (69118579, 136642480, 2148, 0)
$010e4668  bne.w     $10e471e             1.60% (69118579, 1229377482, 141046695, 0)
$010e466c  move.l    $14(a2),d0           0.01% (294410, 3243883, 5398, 0)
$010e4670  subq.l    #$1,d0               0.01% (294410, 305156, 5378, 0)
$010e4672  move.l    d0,$14(a2)           0.01% (294410, 2060842, 5415, 0)
$010e4676  beq.w     $10e4752             0.01% (294410, 2648040, 294244, 0)
$010e467a  move.l    $5e(a3),$38(sp)      0.01% (292823, 5630379, 136, 120077)
$010e4680  move.l    $4(a2),d6            0.01% (292823, 1269680, 70, 170747)
$010e4684  move.l    $8(a2),d2            0.01% (292823, 1049228, 59, 292819)
$010e4688  movea.l   $34(a2),a6           0.01% (292823, 3050716, 292881, 273799)
$010e468c  move.l    (a2),$30(sp)         0.01% (292823, 2030853, 122, 292812)
$010e4690  move.b    $1b(a3),d0           0.01% (292823, 423525, 108, 272888)
$010e4694  move.l    $20(a2),d1           0.01% (292823, 1397411, 58, 253894)
$010e4698  cmpi.b    #$ff,d0              0.01% (292823, 2928029, 292879, 0)
$010e469c  beq.b     $10e46aa             0.01% (292823, 2635333, 292967, 0)
[...]
$010e46aa  move.l    $30(sp),d0           0.01% (292823, 878631, 84, 292810)
$010e46ae  add.l     d1,d0                0.01% (292823, 585537, 4, 0)
$010e46b0  move.l    d0,(a2)              0.01% (292823, 1464188, 102, 0)
$010e46b2  bmi.w     $10e4752             0.01% (292823, 293314, 79, 0)
$010e46b6  movea.w   $2e(a2),a0           0.01% (292823, 3226670, 292875, 291170)
$010e46ba  cmpa.l    d0,a0                0.01% (292823, 292802, 2, 0)
$010e46bc  ble.w     $10e4752             0.01% (292823, 1171212, 39, 0)
$010e46c0  movea.w   #$80,a1              0.01% (292823, 1171268, 40, 0)
$010e46c4  moveq     #$7,d5               0.01% (292823, 585652, 37, 0)
$010e46c6  and.l     d0,d5                0.01% (292823, 585596, 12, 0)
$010e46c8  movea.l   d5,a0                0.01% (292823, 2635005, 292867, 0)
$010e46ca  move.l    a1,d5                0.01% (292823, 292808, 10, 0)
$010e46cc  move.l    a0,d7                0.01% (292823, 585708, 102, 0)
$010e46ce  asr.l     d7,d5                0.01% (292823, 585540, 12, 0)
$010e46d0  movea.l   $2a(a3),a0           0.01% (292823, 2013886, 115, 461546)
$010e46d4  clr.l     d7                   0.01% (292823, 461703, 103, 0)
$010e46d6  move.b    $7b2a(a0),d7         0.01% (292823, 5855619, 292924, 0)
$010e46da  muls.l    d1,d7                0.01% (292823, 878548, 169, 0)
$010e46de  movea.l   d7,a4                0.01% (292823, 585428, 4, 0)
$010e46e0  adda.l    $18(a2),a4           0.01% (292823, 1174446, 168, 292378)
$010e46e4  move.l    a4,$18(a2)           0.01% (292823, 2049473, 194, 0)
$010e46e8  movea.l   $30(a2),a1           0.01% (292823, 2883786, 286647, 287538)
$010e46ec  move.w    $3a(sp),d7           0.01% (292823, 1166642, 54, 292726)
$010e46f0  add.l     a6,d2                0.01% (292823, 585544, 39, 0)
$010e46f2  add.l     a1,d1                0.01% (292823, 585595, 12, 0)
$010e46f4  move.l    d1,$30(a2)           0.01% (292823, 2049741, 56, 0)
$010e46f8  clr.l     d1                   0.01% (292823, 385587, 42855, 0)
$010e46fa  move.b    $1a(a3),d1           0.01% (292823, 879535, 23, 292674)
$010e46fe  move.l    d1,-(sp)             0.01% (292823, 1463967, 20, 0)
$010e4700  move.l    d6,-(sp)             0.01% (292823, 878541, 26, 0)
$010e4702  move.w    $7572(a0),d1         0.01% (292823, 2635477, 70, 0)
$010e4706  andi.w    #$7,d1               0.01% (292823, 929193, 14806, 0)
$010e470a  andi.l    #$ffff,d1            0.01% (292823, 1742157, 8832, 0)
$010e4710  sub.l     d1,d0                0.01% (292823, 594392, 8752, 0)
$010e4712  move.l    d0,-(sp)             0.01% (292823, 1455386, 30, 0)
$010e4714  move.l    a0,-(sp)             0.01% (292823, 878492, 14, 0)
$010e4716  jsr (a5)                       0.01% (292823, 6149369, 585780, 0)
$010e4718  movea.l   d0,a1                0.01% (292823, 2632943, 292855, 0)
$010e471a  adda.w    #$10,sp              0.01% (292823, 878904, 439, 0)
$010e471e  subq.b    #$1,d3               1.60% (69116992, 69409902, 1108, 0)
$010e4720  bne.w     $10e462c             1.60% (69116992, 735410341, 69000163, 0)
$010e4724  movea.l   $34(sp),a6           0.06% (2619353, 22137417, 25410, 838688)
$010e4728  move.b    (a6),d0              0.06% (2619353, 43657779, 2619997, 365009)
$010e472a  clr.l     d4                   0.06% (2619353, 2619375, 32, 0)
$010e472c  move.b    d0,d4                0.06% (2619353, 5264145, 25466, 0)
$010e472e  clr.l     d1                   0.06% (2619353, 5213325, 114, 0)
$010e4730  move.b    $25(a2),d1           0.06% (2619353, 13550176, 25632, 2138710)
$010e4734  asr.l     d1,d4                0.06% (2619353, 4787216, 25553, 0)
$010e4736  move.b    d0,d3                0.06% (2619353, 5213126, 90, 0)
$010e4738  and.b     $24(a2),d3           0.06% (2619353, 26039094, 2619725, 2619266)
$010e473c  beq.w     $10e4620             0.06% (2619353, 18137082, 1136107, 0)
$010e4740  lea.l     $1(a6),a0            0.04% (1522070, 6087970, 14170, 0)
$010e4744  move.l    a0,$34(sp)           0.04% (1522070, 10654534, 14264, 0)
$010e4748  bra.w     $10e462c             0.04% (1522070, 32722676, 3636508, 0)
[...]
$010e4752  movem.l   (sp)+,d2-d7/a2-a6    0.00% (1587, 55283, 2, 11511)
$010e4756  adda.w    #$10,sp              0.00% (1587, 4761, 0, 0)
$010e475a  rts                            0.00% (1587, 39816, 3175, 336)
$010e475c  andi.w    #$ff,d4              0.00% (174, 522, 0, 0)
$010e4760  move.l    a6,$34(sp)           0.00% (174, 1218, 0, 0)
$010e4764  moveq     #$ff,d1              0.00% (174, 0, 0, 0)
$010e4766  move.l    d1,$30(sp)           0.00% (174, 2262, 174, 0)
$010e476a  subq.b    #$1,d3               0.00% (174, 0, 0, 0)
$010e476c  bne.w     $10e462c             0.00% (174, 2426, 238, 0)
$010e4770  bra.b     $10e4724             0.00% (55, 495, 55, 0)
$010e4772  movea.w   $28(a2),a0           1.60% (69118580, 240816313, 1202964, 64106082)
$010e4776  cmp.l     a0,d6                1.60% (69118580, 132116482, 2226, 0)
$010e4778  blt.w     $10e465c             1.60% (69118580, 278423867, 723943, 0)
$010e477c  movea.w   $2c(a2),a0           1.60% (69116232, 310562507, 400998, 63779712)
$010e4780  cmp.l     a0,d6                1.60% (69116232, 133338203, 398386, 0)
$010e4782  bge.w     $10e465c             1.60% (69116232, 276529564, 416975, 0)
$010e4786  movea.l   (a2),a0              1.60% (68859963, 423047059, 6808, 28049100)
$010e4788  tst.l     a0                   1.60% (68859963, 99890446, 719520, 0)
$010e478a  blt.w     $10e465c             1.60% (68859963, 275113797, 405059, 0)
$010e478e  movea.w   $2e(a2),a6           1.60% (68859963, 275535892, 405116, 68847482)
$010e4792  cmpa.l    a0,a6                1.60% (68859963, 137309764, 2196, 0)
$010e4794  ble.w     $10e465c             1.60% (68859963, 275434594, 405622, 0)
$010e4798  move.b    d5,d0                1.60% (68859963, 140335936, 721624, 0)
$010e479a  and.b     (a1),d0              1.60% (68859963, 619621214, 10050, 17064)
$010e479c  bne.w     $10e465c             1.60% (68859963, 206994910, 408517, 0)
$010e47a0  tst.w     d4                   1.60% (68859963, 138118705, 407814, 0)
$010e47a2  beq.w     $10e465c             1.60% (68859963, 288333114, 3946525, 0)
$010e47a6  clr.l     d0                   0.90% (38692997, 77359034, 1216, 0)
$010e47a8  move.w    d4,d0                0.90% (38692997, 79440437, 305497, 0)
$010e47aa  move.w    $64(a3,d0.l),$3a(sp) 0.90% (38692997, 370022105, 18401, 34408403)
$010e47b0  move.b    $d(a3),d1            0.90% (38692997, 63456740, 13732, 35204934)
$010e47b4  cmpi.b    #$1,d1               0.90% (38692997, 151292720, 14610, 0)
$010e47b8  beq.w     $10e4870             0.90% (38692997, 156582106, 307014, 0)
$010e47bc  cmpi.b    #$2,d1               0.90% (38692997, 154767692, 4814, 0)
$010e47c0  beq.w     $10e494a             0.90% (38692997, 154767470, 4730, 0)
$010e47c4  move.l    $2a(a3),d0           0.90% (38692997, 176338925, 4930, 73484457)
$010e47c8  cmpi.b    #$3,d1               0.90% (38692997, 153069412, 323098, 0)
$010e47cc  beq.b     $10e482e             0.90% (38692997, 77389804, 7523, 0)
$010e47ce  movea.l   d0,a0                0.90% (38692997, 77377956, 1332, 0)
$010e47d0  cmpi.b    #$2,$7b2a(a0)        0.90% (38692996, 503025329, 20264, 0)
$010e47d6  beq.w     $10e489c             0.90% (38692996, 118249215, 319360, 0)
$010e47da  move.b    $3b(sp),(a4)         0.90% (38692996, 541406570, 16894, 0)
$010e47de  adda.l    $32(a3),a4           0.90% (38692996, 61321008, 10606, 73858584)
$010e47e2  adda.l    $44(a3),a1           0.90% (38692996, 169949052, 9121, 35563263)
$010e47e6  addq.l    #$1,d6               0.90% (38692996, 74254424, 1234, 0)
$010e47e8  bra.w     $10e4666             0.90% (38692996, 157585778, 456566, 0)
[...]
Scumm::AkosRenderer::paintCelByleRLE(int, int):
In both cases the cycles seem to be spent in the same places in the function, so I assume they use the same RLE decoding "features".
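
To compare the two listings without eyeballing them, the annotated lines can be parsed and ranked. This is a rough sketch, assuming the line layout shown in the header comment of the first listing (instruction count, cycles, i-cache misses, d-cache hits). `parse` and `hottest` are made-up names, and the sample lines are copied from the full-version listing above.

```python
import re

# One annotated line looks like:
# $010e4668  bne.w     $10e471e    1.60% (69118579, 1229377482, 141046695, 0)
LINE_RE = re.compile(
    r"^\$(?P<addr>[0-9a-f]+)\s+(?P<instr>.*?)\s+\d+\.\d+%\s+"
    r"\((?P<count>\d+),\s*(?P<cycles>\d+),\s*(?P<imiss>\d+),\s*(?P<dhit>\d+)\)$"
)


def parse(lines):
    """Return one dict per annotated instruction; other lines are skipped."""
    rows = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # "[...]", symbol names, header comments
        count = int(m.group("count"))
        if count == 0:
            continue  # never executed, nothing to normalise
        cycles = int(m.group("cycles"))
        imiss = int(m.group("imiss"))
        rows.append({
            "addr": m.group("addr"),
            "instr": " ".join(m.group("instr").split()),
            "count": count,
            "cycles": cycles,
            "imiss": imiss,
            "cycles_per_exec": cycles / count,
            "imiss_per_exec": imiss / count,
        })
    return rows


def hottest(rows, key="cycles", top=3):
    """Rows with the largest value of `key`, biggest first."""
    return sorted(rows, key=lambda r: r[key], reverse=True)[:top]


SAMPLE = """\
$010e4642  beq.w     $10e4772             1.60% (69118580, 713077139, 70800038, 0)
$010e4668  bne.w     $10e471e             1.60% (69118579, 1229377482, 141046695, 0)
$010e4720  bne.w     $10e462c             1.60% (69116992, 735410341, 69000163, 0)
$010e479a  and.b     (a1),d0              1.60% (68859963, 619621214, 10050, 17064)
[...]
"""

if __name__ == "__main__":
    for r in hottest(parse(SAMPLE.splitlines())):
        print(f"${r['addr']}  {r['instr']:<20} "
              f"{r['cycles_per_exec']:5.1f} cycles/exec  "
              f"{r['imiss_per_exec']:4.2f} i-misses/exec")
```

On the sample, the `bne.w` at $010e4668 comes out on top, at roughly 18 cycles and 2 i-cache misses per execution. The other two loop branches sit at about 1 miss per execution.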

However, while in the (time-wise much shorter) demo intro this is called 1913 times, in the full version it's called only 1588 times, but still uses much more cycles in the loops, which indicates it indeed decoding much larger pieces.

=> Add some debug prints to find out the frame sizes used in decoding?

Post by Eero Tamminen »

Here's just the demo intro without the logo stuff (1+ min):

Code: Select all

Time spent in profile = 84.91363s.
...
Used cycles:
  33.63%  34.10%  37.02%   916321649 929067112 1008713652   Scumm::AkosRenderer::byleRLEDecode(Scumm::BaseCostumeRenderer::ByleRLEData&)
  21.58%  21.90%  23.63%   587874370 596653327 643885251   Audio::RateConverter_Impl<false, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   7.55%   7.69%   9.23%   205585719 209458149 251511182   Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
   4.14%   4.21%   9.76%   112816028 114807611 265898950   Scumm::ScummEngine::resetActorBgs()
   3.43%                    93545294                       c2p1x1_8_rect_start
   3.41%   3.46%   3.46%    93019087  94389690  94399666   void Scumm::Wiz::decompressWizImage<1>(unsigned char*, int, int, unsigned char const*, Common::Rect const&, int, unsigned char const*, unsigned char const*, unsigned char)
   2.61%                    71228163                       c2p1x1_8_rect_pix16
   1.55%                    42157294                       ROM_TOS
And here's finally the profile of the full intro for the full version (~21 mins on an emulated 32MHz Falcon):

Code: Select all

Time spent in profile = 1270.58971s.
...
Used cycles:
  71.35%  72.48%  74.48%  29085981163 29545919171 30361602798   Scumm::AkosRenderer::byleRLEDecode(Scumm::BaseCostumeRenderer::ByleRLEData&)
   5.87%   5.96%   6.67%  2393518968 2431424373 2719421376   Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
   3.03%   3.07%   3.32%  1234594521 1253403138 1352664581   Audio::RateConverter_Impl<false, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   2.51%                  1024843876                       c2p1x1_8_rect_start
   1.94%                   789061269                       c2p1x1_8_rect_pix16
   1.53%                   623603897                       ROM_TOS
...
Instruction cache misses:
  72.95%  79.57%  83.37%   720327019 785709721 823232439   Scumm::AkosRenderer::byleRLEDecode(Scumm::BaseCostumeRenderer::ByleRLEData&)
   9.03%                    89200824                       ROM_TOS
...
Data cache hits:
  73.39%  73.65%  76.00%  1161308764 1165391712 1202619070   Scumm::AkosRenderer::byleRLEDecode(Scumm::BaseCostumeRenderer::ByleRLEData&)
   3.44%   3.46%   5.39%    54444875  54781075  85276198   Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
Callgraph:
spy2-intro-callgraph.pdf
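To put the two profiles above side by side, the cycle counts can be converted to wall-clock time. This is only a rough sketch: it assumes the emulated CPU clock is exactly 32 MHz (as mentioned above) and that cycles map linearly to time. The helper name `cycles_to_seconds` is made up for illustration and is not part of Hatari's tooling.

```python
# Assumed emulated CPU clock; the post above mentions a 32 MHz Falcon.
CPU_HZ = 32_000_000


def cycles_to_seconds(cycles: int, cpu_hz: int = CPU_HZ) -> float:
    """Convert a profiler cycle count to seconds at the given clock."""
    return cycles / cpu_hz


# First count column for byleRLEDecode() in the two "Used cycles" lists.
demo_cycles = 916_321_649
full_cycles = 29_085_981_163

demo_secs = cycles_to_seconds(demo_cycles)
full_secs = cycles_to_seconds(full_cycles)

print(f"demo intro: {demo_secs:.1f}s of 84.9s in byleRLEDecode()")
print(f"full intro: {full_secs:.1f}s of 1270.6s in byleRLEDecode()")
```

Under those assumptions the decoder alone accounts for roughly half a minute of the demo intro, but about 15 minutes of the full intro.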

Post by Eero Tamminen »

Last active <40 instructions in "byleRLEDecode()" account for 45%(!) of all the instructions executed during the full intro:

Code: Select all

Scumm::AkosRenderer::byleRLEDecode(Scumm::BaseCostumeRenderer::ByleRLEData&):
$010e4590  adda.w    #$fff0,sp            0.00% (4773, 14319, 0, 0)
...
$010e4628  move.b    $1(a6),d3            0.03% (2715589, 30957109, 2703025, 2222714)
$010e462c  move.b    $1c(a3),d0           1.68% (153874699, 1072481069, 906194, 67068228)
$010e4630  cmpi.b    #$ff,d0              1.68% (153874699, 528769621, 901098, 0)
$010e4634  beq.b     $10e463e             1.68% (153874699, 372861331, 14651979, 0)
[...]
$010e463e  tst.b     $28a(a3)             1.68% (153874699, 1685638081, 3413991, 965939)
$010e4642  beq.w     $10e4772             1.68% (153874699, 1528692747, 157948290, 0)
...
$010e4772  movea.w   $28(a2),a0           1.68% (153874699, 624649188, 2829045, 130321360)
$010e4776  cmp.l     a0,d6                1.68% (153874699, 281595257, 4712, 0)
$010e4778  blt.w     $10e465c             1.68% (153874699, 621167148, 1862192, 0)
$010e477c  movea.w   $2c(a2),a0           1.67% (152904549, 1682874944, 946867, 0)
$010e4780  cmp.l     a0,d6                1.67% (152904549, 154745624, 926310, 0)
$010e4782  bge.w     $10e465c             1.67% (152904549, 612042689, 1046196, 0)
$010e4786  movea.l   (a2),a0              1.66% (151511179, 377284117, 6044, 139887453)
$010e4788  tst.l     a0                   1.66% (151511179, 298827841, 1740034, 0)
$010e478a  blt.w     $10e465c             1.66% (151511179, 605225664, 939466, 0)
$010e478e  movea.w   $2e(a2),a6           1.66% (151511179, 606275803, 939949, 151481138)
$010e4792  cmpa.l    a0,a6                1.66% (151511179, 302070225, 4930, 0)
$010e4794  ble.w     $10e465c             1.66% (151511179, 606032781, 940363, 0)
$010e4798  move.b    d5,d0                1.66% (151511179, 309578920, 1744388, 0)
$010e479a  and.b     (a1),d0              1.66% (151511179, 1363611826, 22056, 0)
$010e479c  bne.w     $10e465c             1.66% (151511179, 455458858, 947978, 0)
$010e47a0  tst.w     d4                   1.66% (151511179, 303949495, 946454, 0)
$010e47a2  beq.w     $10e465c             1.66% (151511179, 634953830, 8887023, 0)
$010e47a6  clr.l     d0                   0.90% (82653116, 165245811, 2776, 0)
$010e47a8  move.w    d4,d0                0.90% (82653116, 170437582, 760064, 0)
$010e47aa  move.w    $64(a3,d0.l),$3a(sp) 0.90% (82653116, 873078134, 42047, 60130794)
$010e47b0  move.b    $d(a3),d1            0.90% (82653116, 165019383, 31569, 71494906)
$010e47b4  cmpi.b    #$1,d1               0.90% (82653116, 319471337, 32776, 0)
$010e47b8  beq.w     $10e4870             0.90% (82653116, 335117072, 763308, 0)
$010e47bc  cmpi.b    #$2,d1               0.90% (82653116, 330602822, 11148, 0)
$010e47c0  beq.w     $10e494a             0.90% (82653116, 330602761, 10093, 0)
$010e47c4  move.l    $2a(a3),d0           0.90% (82653116, 909188176, 19241, 82653116)
$010e47c8  cmpi.b    #$3,d1               0.90% (82653116, 253413845, 798574, 0)
$010e47cc  beq.b     $10e482e             0.90% (82653116, 165311886, 17659, 0)
$010e47ce  movea.l   d0,a0                0.90% (82653116, 165288202, 2658, 0)
$010e47d0  cmpi.b    #$2,$7b2a(a0)        0.90% (82653116, 1074531500, 46527, 0)
$010e47d6  beq.w     $10e489c             0.90% (82653116, 253346798, 791322, 0)
$010e47da  move.b    $3b(sp),(a4)         0.90% (82653116, 1156408292, 37445, 0)
$010e47de  adda.l    $32(a3),a4           0.90% (82653116, 131612434, 24115, 157540579)
$010e47e2  adda.l    $44(a3),a1           0.90% (82653116, 363875660, 20177, 75768610)
$010e47e6  addq.l    #$1,d6               0.90% (82653116, 158418989, 2688, 0)
$010e47e8  bra.w     $10e4666             0.90% (82653116, 337523850, 1115112, 0)
[...]
Scumm::AkosRenderer::paintCelByleRLE(int, int):
(No idea whether it's part of that function's C++ code, or something the compiler inlines there.)

Hatari profiler callgraph (in previous post) does not show what's calling "Scumm::AkosRenderer::byleRLEDecode()", probably because the callers' portion of the cycles is too small (i.e. they get stripped away by the graph simplification).

I think it gets called from the 2 functions following "byleRLEDecode()":
  • Scumm::AkosRenderer::paintCelByleRLE(int, int)
  • Scumm::AkosRenderer::drawLimb(Scumm::Actor const*, int)
As the first one has exactly the same call count, and the second is the only other function that includes code with the same run count.
mikro
Hardware Guru
Hardware Guru
Posts: 4185
Joined: Sat Sep 10, 2005 11:11 am
Location: Kosice, Slovakia
Contact:

Re: ScummVM/Falcon060 pre-release

Post by mikro »

You know how hard it is to convince me to optimise engine stuff, but I'll take a look to see whether I spot something suspicious. I doubt the RLE decoder is called more than in the demo by chance, it really points to a bigger data set to decode.

Post by Eero Tamminen »

mikro wrote: Thu Apr 18, 2024 7:16 am You know how hard it is to convince me to optimise engine stuff, but I'll take a look to see whether I spot something suspicious. I doubt the RLE decoder is called more than in the demo by chance, it really points to a bigger data set to decode.
SCI and SCUMM engines are the ones with most games (supported by the full ScummVM Atari version), so I think optimizing them could be warranted. I'm hoping some others would get involved in this too though, as the amount of code is not that large:
https://github.com/mikrosk/scummvm/blob ... s.cpp#L490

I.e. one may be able to experiment with the C++ code just by building that single file and checking generated m68k asm.

Btw. looking at the SCUMM engine code, the "byleRLEDecode()" call sequence is the following:

Code: Select all

-> Actor::drawActorCostume()
  -> BaseCostumeRenderer::drawCostume()
    -> AkosRenderer::drawLimb()
      -> AkosRenderer::paintCelByleRLE()
        -> AkosRenderer::byleRLEDecode()
With only "drawActorCostume()" being called from more than one place in the SCUMM engine code.

Post by mikro »

I have taken a quick look and the conclusion is that there's really nothing wrong going on. To me it looks like there's some kind of RLE-packed stream for each 'actor' which is depacked on the fly: the more actors, the more CPU drag. For instance in the full version when you see the smelly bag, everything suddenly works great: that's because there are only two flies and some other object. On the other hand, the clouds are moving in every frame, so they have to be depacked and drawn in 640x480 every frame (and yet, the rectangle updates work for every 'actor', there aren't forced fullscreen updates).

This also explains why SuperVidel is of little help: there's not much blitting/c2p'ing but very much depacking & writing by hand.
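
A toy illustration of why the cost scales with the decoded area: a minimal run-length decoder, assuming a made-up (count, colour) byte-pair format. This is not the AKOS/byleRLE format that ScummVM actually decodes (that one also handles masking, scaling and clipping); it is only a sketch of the general idea that every output pixel is handled by the CPU, one at a time.

```python
def rle_decode(packed: bytes, transparent: int = 0) -> tuple[list[int], int]:
    """Decode (count, colour) pairs; return the pixels and the number of
    opaque pixels that a software renderer would have to store."""
    pixels: list[int] = []
    writes = 0
    for i in range(0, len(packed) - 1, 2):
        count, colour = packed[i], packed[i + 1]
        for _ in range(count):          # the loop runs once per output pixel
            pixels.append(colour)
            if colour != transparent:
                writes += 1             # opaque pixel: one byte store
    return pixels, writes


# A tiny sprite versus a large one built from 100 repeated runs.
small, small_writes = rle_decode(bytes([3, 7, 2, 0]))
large, large_writes = rle_decode(bytes([200, 7, 55, 0] * 100))

print(len(small), small_writes)
print(len(large), large_writes)
```

The packed input for the large case is only 400 bytes, yet the inner loop runs 25500 times, which is the kind of work that neither a blitter nor a faster c2p routine can take over.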

Btw, there is a similar (but shorter) routine ClassicCostumeRenderer::proc3 ... and this one got rewritten into ARM assembly apparently for speed reasons. I wouldn't dare to do the same for AkosRenderer::byleRLEDecode, that function is a mess.

Post by Eero Tamminen »

mikro wrote: Thu Apr 18, 2024 7:38 pm I have taken a quick look and the conclusion is that there's really nothing wrong going on.
...
Btw, there is a similar (but shorter) routine ClassicCostumeRenderer::proc3 ... and this one got rewritten into ARM assembly apparently for speed reasons. I wouldn't dare to do the same for AkosRenderer::byleRLEDecode, that function is a mess.
Ok, so there's nothing that could realistically be done for "byleRLEDecode()".

(Maybe somebody notices an issue in the GCC-produced assembly shown above that eventually gets fixed in GCC upstream, but such an improvement is going to take years to appear.)

At least on our platform "byleRLEDecode()" seems to be a more common SCUMM engine bottleneck than "proc3()".

ARM assembly is there probably because "proc3()" is the main bottleneck in DOTT:
https://www.atari-forum.com/viewtopic.p ... c3#p458397

And visible also in SOTI:
https://www.atari-forum.com/viewtopic.p ... 37#p444337

And Atlantis:
https://www.atari-forum.com/viewtopic.p ... 38#p444238

Hm. SOTI & Atlantis data is from a year ago, i.e. from much older ScummVM & GCC versions.

=> I guess I'll need to profile them again with the new version to verify how much of a bottleneck "proc3()" is currently for them...

(Now that Gunnar is here, maybe somebody on Amiga side would be interested in adding m68k asm for that...)

Post by mikro »

Yeah, that's interesting, isn't it. We have never had hard performance issues with proc3(). I think the problem here is that the RLE depacker is used for something which hadn't been used before in earlier Scumm games.

Post by Eero Tamminen »

mikro wrote: Fri Apr 19, 2024 9:17 am We have never had hard performance issues with proc3(). I think the problem here is that the RLE depacker is used for something which hadn't been used before in earlier Scumm games.
Err... I mentioned both issues in my ScummVM perf issues list last summer: https://www.atari-forum.com/viewtopic.p ... 74#p449474

And profiles listed "proc3()" as main bottleneck for DOTT already a year ago: https://www.atari-forum.com/viewtopic.p ... 98#p445998

Whereas "byleRLEDecode()" showed as main bottleneck for Dig even earlier: https://www.atari-forum.com/viewtopic.p ... 08#p445008

Back then there were more pressing issues (SCI engine memory alloc/free handling, Adlib emu being enabled by default etc) though.

PS. In ScummVM 2.6 the RLE decode function was named "codec1_genericDecode()": https://www.atari-forum.com/viewtopic.p ... 77#p445177

Post by Eero Tamminen »

The Secret of Monkey Island

I tried SOMI with the latest ScummVM binary (HE-engine enabled full version).

While the Amiga version works as well as before, there's some problem with the DOS EGA demo & full VGA versions.

Game animation works OK if the mouse is not moved. But whenever the mouse is moved, everything freezes for several seconds.

Additionally, OPL emulation is enabled for DOS SOMI versions (and Paula emulation for the Amiga one, but that's to be expected), and there is no option in the GUI to disable it. Does the user now need to set "music_driver=null" manually in "scummvm.ini" for this game?
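
If a manual override does turn out to be necessary, it could be scripted along these lines. This is just a sketch: only the "music_driver=null" setting comes from the question above; the "[monkey-ega]" target name and the other keys in the sample text are assumptions, so check the section names in your own "scummvm.ini" first.

```python
import configparser
import io

# Hypothetical excerpt of a scummvm.ini; the "[monkey-ega]" target name
# and the sample keys are assumptions made for this example.
INI_TEXT = """\
[scummvm]
gfx_mode=default

[monkey-ega]
gameid=monkey
description=The Secret of Monkey Island (EGA/DOS)
"""


def set_music_driver(ini_text: str, target: str, driver: str) -> str:
    """Return ini_text with music_driver=<driver> set for one game target."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str            # keep key case exactly as written
    parser.read_string(ini_text)
    parser[target]["music_driver"] = driver
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


patched = set_music_driver(INI_TEXT, "monkey-ega", "null")
print(patched)
```

Setting the key per game target rather than in the global "[scummvm]" section keeps other games on their default music driver.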

SOMI Amiga demo

Intro cycles go to memset() calls done by mixCallback():

Code: Select all

Time spent in profile = 82.96485s.
...
Used cycles:
  52.65%                  1401530824                       set256
   7.71%   7.82%   7.95%   205175906 208214281 211674777   int Audio::Paula::readBufferIntern<true>(short*, int)
   5.10%   5.18%  70.52%   135785685 137839782 1877278981   Audio::MixerImpl::mixCallback(unsigned char*, unsigned int)
   3.97%   7.64%  82.06%   105682828 203267967 2184488234   OSystem_Atari::update()
   2.48%                    65898836                       AtariMixerManager::update() [clone .part.0]
   2.18%                    57954508                       OSystem_Atari::delayMillis(unsigned int)
Play cycles go to Paula emulation called by rate converting, and to costume renderer "drawLimb()" -> "mainRoutine()" -> "proc3()":

Code: Select all

Time spent in profile = 58.71333s.
...
Executed instructions:
  49.44%  49.84%  50.53%   192306434 193884790 196546504   int Audio::Paula::readBufferIntern<true>(short*, int)
  10.18%  10.26%  61.20%    39613163  39909217 238053545   Audio::RateConverter_Impl<true, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   9.56%   9.62%   9.62%    37172077  37404351  37407669   Scumm::ClassicCostumeRenderer::proc3(Scumm::BaseCostumeRenderer::ByleRLEData&)
   5.50%   5.49%   6.03%    21390200  21349920  23455389 * Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
   3.85%                    14987153                       c2p1x1_8_rect_start
   3.02%                    11740230                       c2p1x1_8_rect_pix16
   2.01%   3.45%  66.82%     7819469  13427821 259919687   OSystem_Atari::update()
   1.69%   1.71%   2.59%     6591396   6654370  10077438   DefaultTimerManager::checkTimers(unsigned int)
   1.37%                     5342409                       OSystem_Atari::delayMillis(unsigned int)
SOMI DOS EGA demo

Intro:

Code: Select all

Time spent in profile = 160.11666s.
...
Used cycles:
  74.55%  75.72%  75.72%  3829697463 3889773071 3890247793   OPL::MAME::OPL_CALC_CH(OPL::MAME::fm_opl_channel*)
  13.26%                   681179328                       OPL::MAME::YM3812UpdateOne(OPL::MAME::fm_opl_f*, short*, int)
   3.25%   3.30%  93.49%   166982655 169710039 4803014855   Audio::RateConverter_Impl<false, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   1.65%                    84838864                       ROM_TOS
SOMI DOS VGA

During first few minutes of startup, "proc3()" uses most cycles after OPL emu:

Code: Select all

Time spent in profile = 373.83317s.
...
Used cycles:
  55.49%  56.35%  56.36%  6655716248 6758807432 6759554296   OPL::MAME::OPL_CALC_CH(OPL::MAME::fm_opl_channel*)
  17.21%  17.48%  17.48%  2063649399 2096612546 2096842425   Scumm::ClassicCostumeRenderer::proc3(Scumm::BaseCostumeRenderer::ByleRLEData&)
   9.36%                  1122526966                       OPL::MAME::YM3812UpdateOne(OPL::MAME::fm_opl_f*, short*, int)
   2.32%   2.36%  68.88%   278838050 283257693 8261813460   Audio::RateConverter_Impl<false, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   2.24%                   268377866                       set256
   1.78%   1.81%   2.18%   213512182 217258489 261149806   Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
   1.62%                   194527307                       ROM_TOS
Play cycles go just to OPL emu audio conversion:

Code: Select all

Time spent in profile = 265.86727s.
...
Used cycles:
  74.50%  75.70%  75.71%  6355374944 6457421679 6458238121   OPL::MAME::OPL_CALC_CH(OPL::MAME::fm_opl_channel*)
  12.57%                  1071963142                       OPL::MAME::YM3812UpdateOne(OPL::MAME::fm_opl_f*, short*, int)
   3.14%   3.19%  92.50%   267885003 271978031 7890276471   Audio::RateConverter_Impl<false, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   1.95%   1.98%   1.98%   166679694 169173370 169193855   Scumm::ClassicCostumeRenderer::proc3(Scumm::BaseCostumeRenderer::ByleRLEData&)
   1.56%                   133049196                       ROM_TOS
Callgraph:
somi-dos-play.png

Post by Eero Tamminen »

Indiana Jones and the Fate of Atlantis (demo)

Perf is good as expected, but the game idles (calls delayMillis()) less than I would have thought.

Over half of the intro cycles go to mixCallback() calling memset(), nearly 15% goes to "proc3()", <1% to idling:

Code: Select all

Time spent in profile = 70.17284s.
...
Used cycles:
  47.18%                  1062184404                       set256
  14.11%  14.30%  14.31%   317779176 322068537 322103552   Scumm::ClassicCostumeRenderer::proc3(Scumm::BaseCostumeRenderer::ByleRLEData&)
   4.56%   4.65%  54.89%   102658362 104707012 1235856915   Audio::MixerImpl::mixCallback(unsigned char*, unsigned int)
   2.83%                    63808160                       ROM_TOS
   1.76%   1.79%   2.49%    39668836  40318249  56037766   Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
   1.43%   1.47%   2.71%    32278053  33045012  60950561   Scumm::ScummEngine::resetActorBgs()
   1.38%   3.14%  61.09%    31034703  70645386 1375449118   OSystem_Atari::update()
   1.37%                    30890839                       AtariMixerManager::update() [clone .part.0]
   0.87%                    19540914                       c2p1x1_8_rect_start
   0.84%                    18883497                       OSystem_Atari::delayMillis(unsigned int)
Game play cycles go even more to mixCallback()'s memset() calls:

Code: Select all

Time spent in profile = 487.79532s.
...
Used cycles:
  59.25%                  9273358412                       set256
   6.02%   6.10%   6.11%   941986736 955407900 955525532   Scumm::ClassicCostumeRenderer::proc3(Scumm::BaseCostumeRenderer::ByleRLEData&)
   5.76%   5.86%  69.35%   901737678 916664369 10854224604   Audio::MixerImpl::mixCallback(unsigned char*, unsigned int)
   2.18%                   340643681                       ROM_TOS
   1.74%   3.96%  76.54%   272417549 619721427 11979587826   OSystem_Atari::update()
   1.73%                   271422080                       AtariMixerManager::update() [clone .part.0]
   1.38%   1.40%   1.92%   215991200 219365448 300502396   Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
   1.06%                   165122767                       OSystem_Atari::delayMillis(unsigned int)
(There are no sounds, so I assume it's mixing silence.)

Post by mikro »

Right, thanks for reminders. :-)

Still, all of the games above work perfectly fine on CT60 while the HE ones are beyond help due to 640x480. But good to know / remember, maybe one day I'll take a look at least at proc3().

delayMillis() & memset() are still TODO, I have some plans for those but busy with other stuff...

Post by Eero Tamminen »

The Secret Of Monkey Island
mikro wrote: Sat Apr 20, 2024 12:05 am Right, thanks for reminders. :-)

Still, all of the games above work perfectly fine on CT60
EGA version of SOMI had several audio options, so I tested them all.

Mouse is not really working at all when one of these is selected:
- Adlib
- Amiga Audio
- Creative Music System
- FM-Towns
- PC-98
- Sega-CD
- STMIDI (uses OPL instead)

(Amiga one is a bit odd because while slow, the Amiga version of SOMI was actually playable.)

Did you have one of those options enabled when you tried (DOS) SOMI on CT60?

Only the following audio options gave a working game:
- No Music
- PC-Speaker
- IBM PCjr

Bear Stormin' (demo)

Although I've written down that "Bear Stormin'" would be in same category with SOMI and "Passport to Adventure", that one does use MIDI when audio is set to "STMIDI".

However, I cannot seem to be able to control the plane in the demo any more. If I remember correctly, keys on the left of keyboard were used for that, but now e.g. "q" key shows game settings, "m" toggles music and few keys control volume, but others do not seem to do anything.

Does anybody know/remember how the plane was (supposed to be) controlled?

Post by mikro »

I usually don't change anything when it comes to music emulation (except when connected to MT-32 Pi, then I choose ST MIDI), therefore -> default (which I unfortunately don't know what it is, IIRC my first ScummVM feature request was this, i.e. show *what* is going to be used by "default").

But I never tested EGA (or Amiga for that matter). In general, I don't like / trust those non-DOS/VGA versions very much. EGA is emulated in a very expensive way (640x400 downscaled to 320x200, IIRC) and Amiga versions usually have incomplete implementations due to lesser testing.

Post by Eero Tamminen »

Slow audio options

Most of those heavy SOMI "Audio" alternatives (Adlib, Amiga, FM-Towns, PC-98, Sega-CD, STMIDI) map to OPL emulation:

Code: Select all

Used cycles:
  80.00%  81.33%  81.34%  1710985308 1739339861 1739587641   OPL::MAME::OPL_CALC_CH(OPL::MAME::fm_opl_channel*)
  11.81%                   252622212                       OPL::MAME::YM3812UpdateOne(OPL::MAME::fm_opl_f*, short*, int)
   2.87%   2.90%  97.38%    61351113  62123843 2082744014   Audio::RateConverter_Impl<false, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   1.59%                    34091347                       ROM_TOS
And "Creative Music System" maps to DosBox emu:

Code: Select all

Used cycles:
  82.54%  83.82%  87.37%  1271228441 1290993944 1345612066   DOSBoxCMS::update(int, short*, int) [clone .part.0]
   4.17%   4.25%  96.21%    64301561  65532103 1481852566   Audio::RateConverter_Impl<true, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   3.23%   3.28%   3.28%    49821525  50488814  50495634   DOSBoxCMS::envelope(int, int)
   1.57%                    24124124                       ROM_TOS
(Above profiles are from the LucasFilm logo to the game startup screen, where I try in vain for nearly a minute to get the game to react to any mouse input.)

With those audio options, even quitting game with Ctrl-Q takes nearly 1/2 minute, after ScummVM has printed "silencing the mixer".

Fast audio options

With the fast audio options, quitting takes only a couple of seconds, and audio generation is not among the largest bottlenecks...

IBM PCjr:

Code: Select all

Time spent in profile = 45.98794s.
...
Used cycles:
  17.63%  30.29%  56.91%   260185428 446921638 839689485   OSystem_Atari::update()
  10.70%  10.86%  10.86%   157875584 160230204 160249219   Scumm::ClassicCostumeRenderer::proc3(Scumm::BaseCostumeRenderer::ByleRLEData&)
   9.70%   9.81%  17.98%   143143997 144781616 265281479   DefaultTimerManager::checkTimers(unsigned int)
   9.60%                   141714654                       OSystem_Atari::delayMillis(unsigned int)
   8.03%   8.19%   8.19%   118430025 120791018 120807182   virtual thunk to OSystem_Atari::getMillis(bool)
   6.93%                   102187753                       AtariMixerManager::update() [clone .part.0]
   5.43%   5.53%   8.08%    80098342  81541472 119212014   Audio::RateConverter_Impl<true, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   5.32%   5.42%   6.35%    78443416  79988427  93676994   Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
   5.24%                    77259681                       AtariMixerManager::update()
   2.35%                    34635917                       c2p1x1_8_rect_start
   1.79%                    26399225                       c2p1x1_8_rect_pix16
   1.58%                    23304052                       ROM_TOS
   1.19%   1.20%   1.20%    17517283  17645395  17647568   Scumm::Gdi::drawStripEGA(unsigned char*, int, unsigned char const*, int) const
   0.99%   1.01%   2.44%    14657070  14955503  35947032   Scumm::ScummEngine::resetActorBgs()
   0.82%   0.83%   0.83%    12075323  12311956  12313375   Scumm::Player_V2::squareGenerator(int, int, int, int, short*, unsigned int)
   0.81%                    11891846                       copy256
   0.77%   0.77%   1.90%    11296198  11388138  27974857   Scumm::Player_V2::generatePCjrSamples(short*, unsigned int)
PC-Speaker:

Code: Select all

Time spent in profile = 58.08039s.
...
Used cycles:
  18.04%  31.00%  59.10%   336251347 577734939 1101330212   OSystem_Atari::update()
  10.78%  10.94%  10.94%   200804725 203789051 203813193   Scumm::ClassicCostumeRenderer::proc3(Scumm::BaseCostumeRenderer::ByleRLEData&)
  10.61%  10.78%  19.10%   197679422 200842256 355935338   DefaultTimerManager::checkTimers(unsigned int)
   9.83%                   183138005                       OSystem_Atari::delayMillis(unsigned int)
   8.21%   8.35%   8.35%   153048986 155519446 155540307   virtual thunk to OSystem_Atari::getMillis(bool)
   7.09%                   132034193                       AtariMixerManager::update() [clone .part.0]
   5.69%   5.75%   8.37%   105945446 107212607 155893225   Audio::RateConverter_Impl<true, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   5.36%                    99837566                       AtariMixerManager::update()
   4.36%   4.44%   5.31%    81200822  82708680  98861008   Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
   1.98%                    36985591                       c2p1x1_8_rect_start
   1.58%                    29399719                       ROM_TOS
   1.51%                    28095314                       c2p1x1_8_rect_pix16
   1.24%   1.26%   2.00%    23133718  23527679  37316678   Scumm::Player_V2::generateSpkSamples(short*, unsigned int)
   1.01%   1.03%   2.48%    18749435  19169130  46209269   Scumm::ScummEngine::resetActorBgs()
   0.81%   0.82%   0.82%    15047065  15311697  15313924   Scumm::Gdi::drawStripEGA(unsigned char*, int, unsigned char const*, int) const
   0.73%                    13679873                       copy256
   0.44%   0.44%   0.44%     8224388   8263321   8264133   Scumm::Player_V2::squareGenerator(int, int, int, int, short*, unsigned int)
(Above two profiles start from the LucasFilm logo, and end when the fortune teller speaks.)

Callgraph:
somi-pc-speaker-callgraph.pdf
Those options don't sound that good, but I think some music is better than none. Sadly they are not available with VGA DOS version(s), only with LucasFilm EGA DOS version(s).

Post by Eero Tamminen »

mikro wrote: Sat Apr 20, 2024 1:40 pm But I never tested EGA (or Amiga for that matter). In general, I don't like / trust those non-DOS/VGA versions very much. EGA is emulated in a very expensive way (640x400 downscaled to 320x200, IIRC) and Amiga versions usually have incomplete implementations due to lesser testing.
Only the EGA version has those additional audio options. With the VGA version the only option is setting "audio" to "no music".

With that, most of its cycles go to mixCallback() memset()s:

Code: Select all

Time spent in profile = 40.92444s.
...
Used cycles:
  43.66%                   573336548                       set256
  11.94%  12.13%  12.13%   156795060 159272092 159295973   Scumm::ClassicCostumeRenderer::proc3(Scumm::BaseCostumeRenderer::ByleRLEData&)
   7.98%   8.11%   9.30%   104761657 106525966 122080932   Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
   4.24%   4.32%  51.16%    55629488  56715489 671760066   Audio::MixerImpl::mixCallback(unsigned char*, unsigned int)
   3.46%                    45366300                       c2p1x1_8_rect_start
   2.64%                    34685260                       c2p1x1_8_rect_pix16
   1.63%                    21352844                       ROM_TOS
   1.40%   3.11%  56.34%    18366358  40869002 739810054   OSystem_Atari::update()
   1.32%                    17289975                       AtariMixerManager::update() [clone .part.0]
   1.03%   1.04%   1.04%    13526802  13626526  13627709   Scumm::Gdi::drawStripBasicH(unsigned char*, int, unsigned char const*, int, bool) const
   0.99%   1.01%   2.59%    12987849  13288193  34011715   Scumm::ScummEngine::resetActorBgs()
   0.83%                    10956226                       OSystem_Atari::delayMillis(unsigned int)
(This is without the logo part, just walk to fortune teller.)

Callgraph:
somi-vga-nomusic-callgraph.pdf
Because delayMillis() is <1%, it does not seem to idle, unlike the EGA version which on average delayed its work by ~10%.
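
The idle comparison can be made a bit more concrete by turning the delayMillis() shares into seconds. A small sketch using the durations and percentages from the three SOMI profiles above, assuming that the share of used cycles maps linearly to the share of wall-clock time (the helper name `idle_seconds` is made up for this example):

```python
# (profile duration in seconds, delayMillis() share of used cycles in %)
# taken from the two EGA profiles and the VGA profile above.
PROFILES = {
    "EGA, IBM PCjr": (45.98794, 9.60),
    "EGA, PC-Speaker": (58.08039, 9.83),
    "VGA, no music": (40.92444, 0.83),
}


def idle_seconds(duration_s: float, idle_percent: float) -> float:
    """Time spent in delayMillis(), assuming cycles map linearly to time."""
    return duration_s * idle_percent / 100.0


for name, (duration, percent) in PROFILES.items():
    print(f"{name}: {idle_seconds(duration, percent):.2f}s idle ({percent}%)")

# Average idle share of the two EGA runs, i.e. the "~10%" quoted above.
ega_avg = (PROFILES["EGA, IBM PCjr"][1] + PROFILES["EGA, PC-Speaker"][1]) / 2
print(f"EGA average idle share: {ega_avg:.2f}%")
```

So the EGA runs idle for several seconds each, while the VGA run idles for well under a second over a similar stretch of gameplay.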
mikro
Hardware Guru
Posts: 4185
Joined: Sat Sep 10, 2005 11:11 am
Location: Kosice, Slovakia

Re: ScummVM/Falcon060 pre-release

Post by mikro »

I'll take a look at all the SOMI versions. Three issues are standing out:

- ST MIDI mapping to OPL in EGA SOMI (why? was it because EGA = DOS = no need for native MIDI for this version?)
- DOS EGA demo & full VGA versions: whenever mouse is moved, everything freezes for several seconds
- inability to switch off the adlib emulator ("No music") -- which one was it? You mention VGA full having only No music and EGA has No music, too in your list.
Eero Tamminen
Fuji Shaped Bastard
Posts: 3674
Joined: Sun Jul 31, 2011 1:11 pm

Re: ScummVM/Falcon060 pre-release

Post by Eero Tamminen »

mikro wrote: Sat Apr 20, 2024 7:48 pm I'll take a look at all the SOMI versions. Three issues are standing out:
- ST MIDI mapping to OPL in EGA SOMI (why? was it because EGA = DOS = no need for native MIDI for this version?)
Could it be because this is an audio setting, not a music one (the SOMI DOS versions do not have a music settings tab)?
mikro wrote: Sat Apr 20, 2024 7:48 pm - DOS EGA demo & full VGA versions: whenever mouse is moved, everything freezes for several seconds
This happens only with audio options that use the OPL/DOSBox emulator (like "STMIDI").

With "no music", "PC-Speaker" and "IBM PCjr" there's no problem with SOMI though.
mikro wrote: Sat Apr 20, 2024 7:48 pm - inability to switch off the adlib emulator ("No music") -- which one was it? You mention VGA full having only No music and EGA has No music, too in your list.
Sorry, that was just my earlier confusion (due to there being no "music" tab in the options of the SOMI DOS versions).
Eero Tamminen
Fuji Shaped Bastard
Posts: 3674
Joined: Sun Jul 31, 2011 1:11 pm

Re: ScummVM/Falcon060 pre-release

Post by Eero Tamminen »

I also tried the SOMI Amiga version. It lists the same audio override options as the SOMI EGA version, but regardless of which option I select (tried "STMIDI" & "Adlib"), "Paula" emulation is used:

Code: Select all

Time spent in profile = 50.81826s.
...
Used cycles:
  50.93%  51.73%  52.52%   830452486 843421598 856360701   int Audio::Paula::readBufferIntern<true>(short*, int)
   8.96%   9.11%   9.11%   146088517 148570648 148588785   Scumm::ClassicCostumeRenderer::proc3(Scumm::BaseCostumeRenderer::ByleRLEData&)
   7.30%   7.41%  60.46%   119007263 120798294 985883736   Audio::RateConverter_Impl<true, true, false>::convert(Audio::AudioStream&, short*, unsigned int, unsigned short, unsigned short)
   4.56%   4.63%   5.36%    74280645  75518370  87430396   Scumm::ScummEngine::drawStripToScreen(Scumm::VirtScreen*, int, int, int, int)
   3.98%   6.69%  71.08%    64835819 109011922 1158906146   OSystem_Atari::update()
   2.34%   2.38%   4.22%    38111660  38856834  68755585   DefaultTimerManager::checkTimers(unsigned int)
   2.16%                    35253502                       OSystem_Atari::delayMillis(unsigned int)
   1.97%                    32095096                       c2p1x1_8_rect_start
   1.82%   1.84%   1.84%    29726045  30067227  30071004   virtual thunk to OSystem_Atari::getMillis(bool)
   1.59%                    25907525                       ROM_TOS
   1.57%                    25556595                       AtariMixerManager::update() [clone .part.0]
   1.50%                    24487672                       c2p1x1_8_rect_pix16
   1.01%                    16396384                       AtariMixerManager::update()
It's a bit slower than the EGA version with the PC-speaker / PCjr options, but still completely playable despite the music not quite keeping up (on a 32 MHz emulated Falcon).
Last edited by Eero Tamminen on Sun Apr 21, 2024 11:03 am, edited 1 time in total.
Eero Tamminen
Fuji Shaped Bastard
Posts: 3674
Joined: Sun Jul 31, 2011 1:11 pm

Re: ScummVM/Falcon060 pre-release

Post by Eero Tamminen »

The Amiga version looks better than the EGA version (more colors), the music is better with Paula than with PC-speaker emulation, and quite a few games seem to have Amiga versions on the ScummVM demos page: https://www.scummvm.org/demos/#livingbooks

=> If other games have the same problem (STMIDI defaulting to OPL emu), I'll look into their Amiga versions.

PS. On a quick look, the Paula emulation did not seem to have any easy optimizations, especially as the code is originally from (Win)UAE:
Eero Tamminen
Fuji Shaped Bastard
Posts: 3674
Joined: Sun Jul 31, 2011 1:11 pm

Re: ScummVM/Falcon060 pre-release

Post by Eero Tamminen »

I noticed a few issues when testing additional games from the ScummVM demos page: https://www.scummvm.org/demos/

Crash

I got a ScummVM crash (under plain EmuTOS):

Code: Select all

[00700555] The game in 'realm-wi' seems to be an unknown game variant.

Please report the following data to the ScummVM team at
https://bugs.scummvm.org/ along with the name of the game you tried to add and
its version, language, etc.:

Matched game IDs for the sci engine: roomzero-fallback

  {"resmap.000", 0, "279264c38faa7840cb550d1d2acd72c2", 2359},
  {"resmap.001", 0, "5b1b4a5007090343c07a9a41315078f7", 580},
  {"resmap.002", 0, "7e89d6fd5ac47b2ee4bab41d15eb7133", 568},
  {"resmap.003", 0, "5eeb74cdb5883c2eb8036fedcac93652", 2026},
  {"resmap.004", 0, "19a3c16860f219560ab2a81007c9d6e5", 208},
  {"resmap.005", 0, "f8f26267e6e499cd87539f2ebb3a851e", 193},
  {"resmap.006", 0, "41524f59539027a1fa6ab767e81aab98", 676},
  {"resmap.007", 0, "bbec1cad2bf0450301ee0368a003cdec", 997},
  {"ressci.000", 0, "1b53bc006bd3a04b7922b3cbaf8a2aca", 392576},
  {"ressci.001", 0, "ef59584c004702d6066ffe0898f027e9", 1022480},
  {"ressci.002", 0, "5bad7140bbd7c752a6c9487ebd8db560", 772490},
  {"ressci.003", 0, "2765fc6ae6610ce0a25b0b1ba61e7611", 3d8391},
  {"ressci.004", 0, "224053feef179a53378cfe308f7451e5", 377552},
  {"ressci.005", 0, "fc29df2a18b78f7f7e216e2c857eae56", .21461},
  {"ressci.006", 0, "4a42fd733c1db18e8caa992de10f2495", 254951},
  {"ressci.007", 0, "1fed006e4c967b8043819b6cde69c4e2", 1092654},

Panic: Illegal Instruction
sr=0300 pc=02845cac
This happened when trying to add the demo for an unsupported game: https://downloads.scummvm.org/frs/demos ... emo-en.zip

Freeze

Another issue was ScummVM no longer accepting input after I had left it with the main menu "Add Game" file selector open for several hours.

When I got back to the computer, ScummVM did not react to any input. The profiler showed it still calling the Atari update etc. though...

Unrecognized games

I guess these are not recognized because they use too high a color depth, although the wiki does not list the color depth for the latter:
* https://wiki.scummvm.org/index.php?title=Blade_Runner
* https://wiki.scummvm.org/index.php?title=A_Golden_Wake

But ScummVM not recognizing the demo for LSL 6 was a bit unexpected: https://downloads.scummvm.org/frs/demos ... emo-en.zip

Especially as it recognized all the other LSL demos on the demos page, and LSL 6 is supported: https://wiki.scummvm.org/index.php?titl ... it_Larry_6

EDIT: the demo list says "DOS Non-Engine Demo". Does "Non-Engine" mean "Will not be supported by ScummVM"?
Eero Tamminen
Fuji Shaped Bastard
Posts: 3674
Joined: Sun Jul 31, 2011 1:11 pm

Re: ScummVM/Falcon060 pre-release

Post by Eero Tamminen »

The crash is reproducible, so I ran it with profiling enabled to get a backtrace:

Code: Select all

Panic: Illegal Instruction
sr=0300 pc=02845cac

D0-3:31343631 025e6600 02af7b3c 02af79a4
D4-7:00000001 00000001 0000000a 02dc00cb
A0-3:02d0d8a0 00007fc6 02af7bbc 02af8d14
A4-7:0100f8e4 0235f51c 02af7ddc 00007fc4
 USP:02af7920

basepage=01000224
text=01000324 data=02a1abd4 bss=02a3d18e
Crash at text+01845988

*** Press any key to continue ***
Finalizing costs for 6 non-returned functions:
- 1. 0xe56176: virtual thunk to OSystem_Atari::logMessage(LogMessageType::Type, char const*) -0x1b976e
- 2. 0x235f9bc: GUI::LauncherDialog::doGameDetection(Common::Path const&) +0x4a0
- 3. 0x235ff78: GUI::LauncherSimple::handleCommand(GUI::CommandSender*, unsigned int, unsigned int) -0xad4 (GUI::LauncherDialog::addGame() +0x44)
- 4. 0x23ab9a2: GUI::DropdownButtonWidget::handleMouseUp(int, int, int, int) +0x2ce
- 5. 0x23538f2: GUI::Dialog::handleMouseUp(int, int, int, int) +0xe8
- 6. 0x235895e: GUI::GuiManager::processEvent(Common::Event const&, GUI::Dialog*) [clone .part.0] +0x376
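The "Crash at text+..." line in the panic output is just the PC minus the start of the program's text segment, which is the number to look up in the symbol table or a disassembly. A minimal sketch of that arithmetic, using the register and basepage values from the panic above (the helper name textOffset is made up for illustration):

```cpp
#include <cassert>
#include <cstdint>

// Offset of a crash PC inside the program's text segment.
// textBase and dataBase are the "text=" and "data=" values from the panic
// output; the PC is expected to lie between them.
inline std::uint32_t textOffset(std::uint32_t pc, std::uint32_t textBase,
                                std::uint32_t dataBase) {
    assert(pc >= textBase && pc < dataBase);  // PC must be inside text
    return pc - textBase;
}
```

With pc=0x02845cac, text=0x01000324 and data=0x02a1abd4, textOffset() gives 0x01845988, matching the offset EmuTOS printed.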
Some additional debugging shows the first function in the backtrace calling the actual implementation, which may be missing from the backtrace because it was called with BRA, not JSR:

Code: Select all

OSystem_Atari::logMessage(LogMessageType::Type, char const*)
That function does a NatFeats call at the end, which is an illegal instruction when NatFeats is not enabled...

=> ScummVM NatFeats detection code is broken? Or detection variable gets overwritten?

After enabling NatFeats, the illegal instruction panic disappeared, but ScummVM did not print the message through NatFeats either (i.e. a second time).

And it crashed on another issue slightly later:

Code: Select all

...
  {"ressci.006", 0, "4a42fd733c1db18e8caa992de10f2495", 254951},
  {"ressci.007", 0, "1fed006e4c967b8043819b6cde69c4e2", 1092654},

ERROR: invalid NF ID 786 requested
GEMDOS 0x3E Fclose(2) at PC 0x25F1CA4
WARN : Address Error reading at address $19, PC=$25e46b6 addr_e3=25e46b6 op_e3=4e75

Panic: Address Error
Another issue with the log function is that it uses a static 1KB buffer for the messages, but does not check that the printed message actually fits there. It should be using e.g. "snprintf(str, sizeof(str), ...)" instead.
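A minimal sketch of what I mean by bounded formatting. This is not the actual ScummVM code: the function name formatLogLine and its prefix/message parameters are hypothetical, it only illustrates how snprintf() keeps the write inside the buffer and lets the caller detect truncation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Format "prefix + message + newline" into a fixed-size buffer without ever
// writing past its end. Returns true if the whole line fit, false if it was
// truncated (the buffer is still NUL-terminated in that case).
inline bool formatLogLine(char *buf, std::size_t size,
                          const char *prefix, const char *message) {
    assert(buf != nullptr && size > 0);
    // snprintf() writes at most size-1 characters plus the terminating NUL
    // and returns the length the full output would have needed.
    const int needed = std::snprintf(buf, size, "%s%s\n", prefix, message);
    if (needed < 0) {          // encoding error: leave an empty string
        buf[0] = '\0';
        return false;
    }
    return static_cast<std::size_t>(needed) < size;
}
```

With a static "char str[1024]" the call would be formatLogLine(str, sizeof(str), ...), and an over-long message (like a long list of detection entries) gets cut off instead of overwriting whatever follows the buffer.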

PS. NatFeats printing is intended for additional debug output like assert()s. Using it to replicate something already going to the console is not that useful, because you get that already with the "--conout 2" or "--trace os_base" Hatari options.