alien wrote:So you seem to be saying that every prefetch is programmed into the instruction (as I seem to have figured out will call).
That's interesting, and would indicate that some instructions would always have the option of pairing with whatever instruction that followed.
Hmm, I don’t understand how you reach this conclusion.
But I don't understand why it would forbid the same instruction from pairing with itself.
Let’s see if we can agree on a few terms and concepts (to avoid semantic misunderstanding):
Glue splits bus access into slots. Every two CPU cycles (every 250 ns), bus ownership rotates between Shifter and the CPU. This means that the start of any two bus accesses performed by the CPU must be separated by a multiple of 500 ns (4 CPU cycles). If the CPU attempts a “misaligned” bus access, Glue will insert wait states and force the alignment.
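To make the slot rule concrete, here is a minimal sketch in Python. It assumes (purely for illustration) that cycle 0 is the start of a CPU slot, so CPU slots begin at cycles 0, 4, 8, and so on; the name `wait_states` is made up for this example.

```python
SLOT_CYCLES = 4  # CPU slots recur every 4 CPU cycles (500 ns), per the text

def wait_states(cycle):
    """Cycles the bus arbiter would add before a CPU bus access at `cycle`.

    Assumption: cycle 0 is the start of a CPU slot.
    """
    return (-cycle) % SLOT_CYCLES

# 68000 timings are always even, so the result is either 0 or 2.
for start in (0, 2, 4, 6, 8, 10):
    print(start, wait_states(start))
```

An access attempted on an aligned cycle goes through at once, and one attempted two cycles late is held for two more cycles.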
A couple of notes about this:
This always happens, regardless of whether Shifter needs data or not. In other words, the Shifter still receives its bus slots even during vertical and horizontal blank.
There are additional wait states for floppy/hard disk DMA, Blitter, STE DMA sound, and when accessing slow devices (ACIAs are the slowest ones).
Bus access is exactly the same for instruction prefetch and for memory access.
The cycle-by-cycle order (including prefetch cycles) of a specific instruction is fixed in microcode.
Let’s call “penalty” the additional 2 cycles inserted by GLUE when an instruction attempts a misaligned bus access. An instruction that takes 6 cycles (or any count that is not a multiple of 4) will generate a penalty when preceded and followed by a NOP (or by most other instructions). This is because the following NOP will attempt a misaligned prefetch. Note that the penalty is actually applied to the NOP, not to the 6-cycle instruction, but this doesn’t matter.
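A small simulation may help here. This is only a toy model under stated assumptions: each instruction is a list of ("bus", n) and ("idle", n) steps, the sequence starts on an aligned cycle, and `run`, `NOP` and `SIX` are names invented for the example. `SIX` stands for a hypothetical 6-cycle instruction that prefetches first and then idles for two cycles.

```python
def run(program):
    """Total cycles for a list of instructions in this toy model."""
    t = 0
    for instr in program:
        for kind, cycles in instr:
            if kind == "bus":
                t += (-t) % 4  # wait states force the access onto a slot
            t += cycles
    return t

NOP = [("bus", 4)]                 # one prefetch, 4 cycles
SIX = [("bus", 4), ("idle", 2)]    # hypothetical 6-cycle instruction

nominal = 4 + 6 + 4                # cycle counts without any wait states
total = run([NOP, SIX, NOP])
print(total, total - nominal)      # total cycles, penalty
```

In this model the sequence takes 16 cycles instead of 14, and the two extra cycles are inserted in front of the second NOP's prefetch.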
We call it pairing when two instructions that each generate a penalty when bracketed by NOPs avoid the penalty when one directly follows the other. There is only one way that pairing can occur (as I said in a previous message):
For two instructions to pair, the first instruction must have all its bus cycles aligned to a four-cycle boundary, and the second one must have all its bus cycles misaligned (always relative to the first cycle of the instruction).
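The pairing rule can be checked with a toy model. The sketch below assumes two hypothetical 6-cycle instructions, `FIRST` (bus access aligned, then two idle cycles) and `SECOND` (two idle cycles, then the bus access); they are not real opcodes, and `run` and `penalty` are names made up for the example.

```python
def run(program):
    """Total cycles for a list of instructions in this toy model."""
    t = 0
    for instr in program:
        for kind, cycles in instr:
            if kind == "bus":
                t += (-t) % 4  # wait states force the access onto a slot
            t += cycles
    return t

def penalty(instrs):
    """Extra cycles when `instrs` run bracketed by NOPs."""
    nominal = sum(c for i in instrs for _, c in i) + 8  # plus two NOPs
    return run([NOP] + instrs + [NOP]) - nominal

NOP = [("bus", 4)]
FIRST = [("bus", 4), ("idle", 2)]    # hypothetical: bus access aligned
SECOND = [("idle", 2), ("bus", 4)]   # hypothetical: bus access misaligned

print(penalty([FIRST]), penalty([SECOND]))   # each alone
print(penalty([FIRST, SECOND]))              # in pairing order
print(penalty([SECOND, FIRST]))              # in the reverse order
```

In this model each instruction alone costs 2 extra cycles, the pair in the right order costs none, and the reverse order costs 4, so the order matters.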
So no instruction can pair with itself (well, it is possible that there are exceptions, see below). Let’s take the case of the “EXG” instruction. It might have one of two possible behaviors (but the behavior is fixed: it is either always one or always the other):
1 – The prefetch is executed in cycles 0-3, and the bus is idle during the last two cycles.
2 – The bus is idle during the first two cycles and the prefetch is executed in cycles 2-5.
It seems that the actual behavior is the first one. But either way it cannot pair with itself, because either all its bus accesses fall on a 4-cycle boundary or none of them do.
Note that this holds as long as the two EXGs are bracketed by NOPs. Two EXGs together might still pair if properly matched with the preceding and following instructions. But the pairing will be between each EXG and the other instruction, not between the two EXGs.
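The same kind of toy model shows why neither candidate behavior lets EXG pair with itself. `EXG_1` and `EXG_2` encode the two behaviors listed above; the step-list representation and the `run` helper are assumptions of this sketch, not a description of the real microcode.

```python
def run(program):
    """Total cycles for a list of instructions in this toy model."""
    t = 0
    for instr in program:
        for kind, cycles in instr:
            if kind == "bus":
                t += (-t) % 4  # wait states force the access onto a slot
            t += cycles
    return t

NOP = [("bus", 4)]
EXG_1 = [("bus", 4), ("idle", 2)]   # behavior 1: prefetch in cycles 0-3
EXG_2 = [("idle", 2), ("bus", 4)]   # behavior 2: prefetch in cycles 2-5

nominal = 4 + 6 + 6 + 4             # NOP, EXG, EXG, NOP without wait states
for exg in (EXG_1, EXG_2):
    total = run([NOP, exg, exg, NOP])
    print(total, total - nominal)   # total cycles, penalty
```

Under either behavior the model gives 24 cycles, that is 4 cycles of penalty, so the two EXGs effectively cost 16 cycles rather than 12.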
As I said, it is possible that there are instructions with “weird” behavior. For example, an instruction might have two or more bus accesses that are misaligned relative to each other. Or it might have extra idle cycles both at the start and at the end. Again, I have no idea whether such instructions exist or not. One possible candidate is move d8(Ax,Xx),d8(Ax,Xx).
But... this example doesn't seem to match a 2 word always rule:
move.l code(pc), (a0)
nop ; if prefetch is 0 instructions we should hit here
nop ; if prefetch is 1 instruction we should hit here
illegal ; If prefetch is 2 instructions we should hit here
We hit on the second illegal corresponding to 1 instruction prefetch.
No, this doesn’t mean the prefetch is one word only. It just means that the last prefetch performed by the move is executed after the write. This is something we already know. See my “prefetch rules” in my answer to Leonard’s question. Actually, you already suggested the same conclusion when you answered Leonard.
Now I still don't get why 2 exchanges following each other don't end up always being 12 cycles. After all the GLUE's not involved, and by the time they execute they're both in the prefetch buffer.
It doesn’t matter exactly which instructions are prefetched (this only matters for self-modifying code prefetch tricks). The EXG still must perform a one-word prefetch (see my prefetch rules). So GLUE will delay the EXG if it attempts a misaligned prefetch.