ljbk wrote: just made a quick search through some of my sources and i found 2 instructions that pair:
lsr.w #2,dn takes 10 cycles but goes to 12 if followed by a nop
move.b 0(an,dn),dn takes 14 cycles but goes to 16 if followed by a nop
One after the other, they take 24 cycles and the order does not matter !
rule of thumb explanation: lsr does lots of work later... (shifting) but address register indirect with index has to work internally earlier to add an+dn.
with what we learned previously:
the move.b has 2 words, so it's prefetched entirely.
As the lsr does not access the bus to write out any result, the move can start executing as soon as the lsr is finished (the glue can't block the CPU, as the CPU makes no access to RAM).
My conclusion: an uneven instruction (cycle count not a multiple of 4) whose result is not written to RAM, followed by an uneven instruction with an opcode no longer than 32 bit, saves 4 cycles in total (1 nop) in execution time.
About EXG pairing: EXG is 16 bit, so during its execution there is a prefetch, a RAM access and the 4 cycle alignment. So another rule to add: the instruction has to last longer than the prefetch (e.g. a 1 word instruction that lasts more than 8 cycles).
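To make the arithmetic behind that conclusion explicit, here is a rough Python sketch. It is only a toy model and rests on one assumption: an instruction executed in isolation is rounded up to the next multiple of 4 cycles, while a run of instructions that pair is rounded up only once, at the end. The function names are made up for illustration; nothing here has been measured on hardware.

```python
def round_up_to_bus_slot(cycles, slot=4):
    """Round a cycle count up to the next multiple of `slot` (assumed 4 on the ST)."""
    return -(-cycles // slot) * slot

def unpaired_total(nominal):
    """Every instruction is padded to a 4 cycle boundary on its own."""
    return sum(round_up_to_bus_slot(c) for c in nominal)

def paired_total(nominal):
    """The whole run pairs, so only the sum is padded (assumption of this model)."""
    return round_up_to_bus_slot(sum(nominal))

# lsr.w #2,dn (10 cycles) and move.b 0(an,dn),dn (14 cycles)
pair = [10, 14]
print("unpaired:", unpaired_total(pair))
print("paired:  ", paired_total(pair))
```

With the nominal figures 10 and 14, the model gives 12+16 when each instruction is padded separately and 24 when they pair, which is the 4 cycle (1 nop) gain described above. Whether a given run really pairs still has to be checked on the machine.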
if I follow my reasoning and do:
32 bit, uneven, 14 cycles
16 bit, prefetched during move, even 8 cycles
16 bit, prefetched during move, uneven, 10 cycles
does the subq effectively execute in uneven cycles so that this code uses 32 cycles and not 36?
No time to test... your turn
PS: something I wrote down in my timing sheet is movem R->M: Motorola says it takes 8+5n or 8+10n cycles for W or L respectively (n = number of registers), but I noted that it's 8+4n and 8+8n on Atari.
4n is more logical, as we write 16 bit words. So why is Motorola saying 5n?
Is the glue forcing the CPU to "speed up" or is the documentation wrong?
Is the ST's movem write faster than on other MC68000 architectures?
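To show how far apart the two sets of movem R->M figures drift, here is a small Python sketch that simply evaluates both formulas as quoted in this post. The figures themselves are taken from the post and are not verified here; the function names are hypothetical.

```python
def movem_r2m_documented(n, long=False):
    """Figures attributed to the Motorola documentation in this post: 8+5n (W), 8+10n (L)."""
    return 8 + (10 if long else 5) * n

def movem_r2m_noted(n, long=False):
    """Figures noted on the Atari in this post: 8+4n (W), 8+8n (L)."""
    return 8 + (8 if long else 4) * n

# Compare both formulas for a few register counts.
for n in (1, 4, 8, 15):
    for long in (False, True):
        size = "L" if long else "W"
        doc = movem_r2m_documented(n, long)
        noted = movem_r2m_noted(n, long)
        print(f"n={n:2d} .{size}: documented {doc:3d}, noted {noted:3d}, diff {doc - noted}")
```

The noted figures are always a multiple of 4, so they fit the 4 cycle alignment without any padding, whereas 8+5n is only a multiple of 4 when n is.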