Here comes the explanation: the program exploits the fact that during ADDX.L -(Ay),-(Ax) the reads happen in descending address order. For example, consider...
Code:
LEA $2004, A0
LEA $1004, A1
When executing ADDX.L -(A0),-(A1) with these register values, the CPU will first read the least significant word (LSW, 16 bits) of the source operand from address $2002, then the most significant word (MSW) from $2000. This differs from e.g. "ADD.L -(A0),D0", where the CPU will first read the MSW from $2000, then the LSW from $2002. This is by no means a new finding: it is documented in YACHT ("Yet Another Cycle Hunting Table") and probably also in Motorola's patent US 4,325,121. However, emulators did not emulate the order of bus accesses correctly for ADDX.L (and SUBX.L). In the case of Hatari, that was true until I reported this on the mailing list.
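To make the access order concrete, here is a minimal Python sketch of the two source-operand read sequences described above. It is only a model, not emulator code: the function name `source_read_order` and its `lsw_first` flag are made up for illustration, and the destination reads and the writes of ADDX.L are deliberately left out.

```python
def source_read_order(operand_address, lsw_first):
    """Return the two word addresses read for a long operand, in bus order.

    operand_address is the value of An after the -(An) predecrement by 4.
    lsw_first is a hypothetical flag: True models ADDX.L/SUBX.L -(Ay),-(Ax),
    False models an instruction such as ADD.L -(An),Dn.
    """
    msw = operand_address        # most significant word sits at the lower address
    lsw = operand_address + 2    # least significant word follows it
    return [lsw, msw] if lsw_first else [msw, lsw]


A0 = 0x2004  # same value as loaded by LEA $2004,A0

addx_reads = source_read_order(A0 - 4, lsw_first=True)
add_reads = source_read_order(A0 - 4, lsw_first=False)

print("ADDX.L -(A0),-(A1):", [f"${a:04X}" for a in addx_reads])
print("ADD.L  -(A0),D0   :", [f"${a:04X}" for a in add_reads])
```

Both sequences touch the same two words; only the order on the bus differs, which is invisible to a program as long as no access faults.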
Thus, if you force the CPU to take a bus error exception, a different fault address will be in the stack frame on a real 68000 compared to an emulated one. The rest of the program is only "smoke and mirrors" to disguise this difference. It is left as an exercise to the reader to find out how the code reacts differently to those different exception stack frames.
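A rough Python sketch of why the fault address differs, under two assumptions that are not stated in the post: that both words of the source operand are inaccessible, and that the inaccurate emulator reads the MSW first. The names `BusError`, `read_word` and `fault_address` are hypothetical, and the model ignores everything else in the exception stack frame.

```python
class BusError(Exception):
    """Raised by the model when an access hits an unmapped address."""

    def __init__(self, address):
        super().__init__(f"bus error at ${address:04X}")
        self.address = address


def read_word(address, unmapped):
    # Model of a 16-bit bus read: fault if the address is not mapped.
    if address in unmapped:
        raise BusError(address)
    return 0  # the data value does not matter for this sketch


def fault_address(word_addresses, unmapped):
    """Return the address of the first faulting access, or None if none faults."""
    for address in word_addresses:
        try:
            read_word(address, unmapped)
        except BusError as err:
            return err.address
    return None


# Assumption: the whole long at $2000-$2003 is inaccessible.
unmapped = range(0x2000, 0x2004)

real_68000 = fault_address([0x2002, 0x2000], unmapped)  # LSW first
emulated = fault_address([0x2000, 0x2002], unmapped)    # MSW first (assumed)

print(f"real 68000 fault address: ${real_68000:04X}")
print(f"emulated fault address:   ${emulated:04X}")
```

If only one of the two words were inaccessible, both orders would report the same address in this model, so the difference shows up only when the first access in either order faults.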
There are still more subtle differences between emulators and the real CPU in how bus error and address error exceptions are processed.