Prefetch

A place to discuss current and future developments for STeem

Moderator: Moderator Team

Steven Seagal
Fuji Shaped Bastard
Posts: 2018
Joined: Sun Dec 04, 2005 9:12 am
Location: Undisclosed

Prefetch

Post by Steven Seagal »

The following is part of development notes for Steem 3.5 (http://ataristeven.t15.org/Steem_35_coming_soon1.htm).
I copy it here where it belongs. It discusses some timing imprecisions and hacks.
Only the little table can't be properly reproduced.


Prefetch Timing
Prefetch for a microprocessor means loading the next instruction (or more) while an instruction is being executed.

To learn more about prefetch on the 68000, check this article by ijor:

http://pasti.fxatari.com/68kdocs/68kPrefetch.html

Before this article, very little was known about prefetch on the M68000; the concept was still mysterious.

Steem authors had to try things out and see if they worked with games and demos.

They did amazing work: so many demos already work in Steem 3.2 because the timings are globally very good. Unfortunately, the prefetch timings weren't well placed in general. They came at the start of an instruction, even before the operands of the current instruction were gathered. That doesn't make sense once you know that operands and instructions follow the "PC" (program counter) and that the next instruction is fetched while the current one is being executed (all directly commanded by microcode), but there really was much confusion about prefetch (and about "extra prefetch").

Code: Select all

PC ->           PC ->       PC ->       PC ->           PC ->
Instruction 1   Operand 1   Operand 2   Instruction 2   Operand 1
In the table above, it's clear that instruction 2 can be fetched only after operands 1 & 2 of instruction 1 have been fetched.

Steem authors suspected that something was wrong, as indicated by a comment in MOVE, where the premature fetch must be compensated later in the instruction, with strange things like INSTRUCTION_TIME(10-4);

A consequence of this premature fetch timing is that memory was read too late in emulation. In particular, when reading the shifter counter, Steem was 4 cycles off. To make up for that, so that programs work, the value returned is different from what a real ST would return at the same cycle. In other words, it's a hack compensating for the wrong timings of almost all instructions! Steem authors weren't aware of that; they just returned the value that worked.
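
To make the 4-cycle offset concrete, here is a minimal sketch, not Steem code. LineClock and both function names are made up for illustration, and the "counter" simply reports the cycle at which it is read, which is a big simplification of the real shifter counter. It only shows that counting the fetch before the read moves the read 4 cycles later, while the instruction's total time stays the same.

```cpp
#include <cassert>

// Hypothetical stand-in for the emulator's position on the scanline.
struct LineClock {
    int cycles = 0;                              // cycles elapsed so far
    void fetch_timing() { cycles += 4; }         // stands in for FETCH_TIMING
    int read_counter() const { return cycles; }  // "what cycle am I read at?"
};

// Old ordering: the 4 fetch cycles are counted before the operand is read.
int read_after_timing(LineClock& clk) {
    clk.fetch_timing();
    return clk.read_counter();
}

// Ordering that follows the real bus sequence: read first, count the fetch after.
int read_before_timing(LineClock& clk) {
    int seen = clk.read_counter();
    clk.fetch_timing();
    return seen;
}
```

Starting both variants at cycle 100, the first one reads at cycle 104 and the second at cycle 100, and both end at cycle 104. That difference is what the returned value had to compensate for.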

Note that the exact same problems seem to plague Hatari (AFAIK, v1.6.2)!

Hopefully, the "read SDP" hack will be removed in Steem v3.5 (depending on... timing).

It remains to be seen if other parts of emulation were dependent on bad timings.
In the CIA we learned that ST ruled
Steem SSE: http://sourceforge.net/projects/steemsse
Steven Seagal
Fuji Shaped Bastard
Posts: 2018
Joined: Sun Dec 04, 2005 9:12 am
Location: Undisclosed

Re: Prefetch

Post by Steven Seagal »

To illustrate, here is the original Steem code for "MOVE.B", before any "SSE" fix.

Code: Select all

void m68k_0001(){  //move.b
  INSTRUCTION_TIME(4); // I don't think this should be here, does move read on cycle 0?
  m68k_GET_SOURCE_B;
  if((ir&BITS_876)==BITS_876_000){
    SR_CLEAR(SR_V+SR_C+SR_N+SR_Z);
    m68k_dest=&(r[PARAM_N]);
    m68k_DEST_B=m68k_src_b;
    SR_CHECK_Z_AND_N_B;
  }else if((ir&BITS_876)==BITS_876_001){
    m68k_unrecognised();
  }else{   //to memory
    bool refetch=0;
    switch(ir&BITS_876){
    case BITS_876_010:
//      INSTRUCTION_TIME(8-4-4);
      abus=areg[PARAM_N];
      break;
    case BITS_876_011:
//      INSTRUCTION_TIME(8-4-4);
      abus=areg[PARAM_N];
      areg[PARAM_N]++;
      if(PARAM_N==7)areg[7]++;
      break;
    case BITS_876_100:
//      INSTRUCTION_TIME(8-4-4);
      areg[PARAM_N]--;
      if(PARAM_N==7)areg[7]--;
      abus=areg[PARAM_N];
      break;
    case BITS_876_101:
      INSTRUCTION_TIME(12-4-4);
      abus=areg[PARAM_N]+(signed short)m68k_fetchW();
      pc+=2;
      break;
    case BITS_876_110:
      INSTRUCTION_TIME(14-4-4);
      m68k_iriwo=m68k_fetchW();pc+=2;
      if(m68k_iriwo&BIT_b){  //.l
        abus=areg[PARAM_N]+(signed char)LOBYTE(m68k_iriwo)+(int)r[m68k_iriwo>>12];
      }else{         //.w
        abus=areg[PARAM_N]+(signed char)LOBYTE(m68k_iriwo)+(signed short)r[m68k_iriwo>>12];
      }
      break;
    case BITS_876_111:
      if (SOURCE_IS_REGISTER_OR_IMMEDIATE==0) refetch=true;
      switch (ir & BITS_ba9){
        case BITS_ba9_000:
          INSTRUCTION_TIME(12-4-4);
          abus=0xffffff & (unsigned long)((signed long)((signed short)m68k_fetchW()));
          pc+=2;
          break;
        case BITS_ba9_001:
          INSTRUCTION_TIME(16-4-4);
          abus=m68k_fetchL() & 0xffffff;
          pc+=4;
          break;
        default:
          m68k_unrecognised();
      }
    }
    SR_CLEAR(SR_Z+SR_N+SR_C+SR_V);
    if(!m68k_src_b){
      SR_SET(SR_Z);
    }
    if(m68k_src_b&MSB_B){
      SR_SET(SR_N);
    }

    m68k_poke_abus(m68k_src_b);
    FETCH_TIMING;  // move fetches after instruction
    if (refetch) prefetch_buf[0]=*(lpfetch-MEM_DIR);
  }
}
INSTRUCTION_TIME counts CPU cycles. FETCH_TIMING counts 4 cycles (rounded).
m68k_GET_SOURCE_B means that we read the source, the rest depends on the kind of destination (bits "876").
When we know where, we write to 'abus' (the address bus) with a 'poke', or to m68k_DEST_B if it's a register.
The SR_... macros concern the Status Register.
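
To see how the numbers in the listing add up, here is a minimal sketch of the cycle bookkeeping for MOVE.B with a data-register source. CycleCounter and move_b_from_dn are names made up for illustration, not Steem's, and the "(rounded)" part of FETCH_TIMING is deliberately left out. The point is only that the up-front 4, the (N-4-4) term in each case and the final fetch sum back to the documented total N.

```cpp
#include <cassert>

// Hypothetical cycle accumulator; Steem's real macros and variables may differ.
struct CycleCounter {
    int cycles = 0;
    void instruction_time(int n) { cycles += n; }  // stands in for INSTRUCTION_TIME(n)
    void fetch_timing() { cycles += 4; }           // stands in for FETCH_TIMING, rounding ignored
};

// Cycles counted along the "to memory" path of the listing for a data-register
// source, given the documented total for the destination mode (8, 12, 14 or 16).
int move_b_from_dn(int documented_total) {
    assert(documented_total >= 8);
    CycleCounter c;
    c.instruction_time(4);                         // the questioned timing at the top
    c.instruction_time(documented_total - 4 - 4);  // the "(N-4-4)" term in each case
    c.fetch_timing();                              // prefetch counted at the end
    return c.cycles;
}
```

For example, move_b_from_dn(12) gives 4 + 4 + 4 = 12, and for the (An) destinations the middle term is 8-4-4 = 0, which is why those lines are commented out in the original.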

Comments were by Steem authors as well.
The first comment, "I don't think this should be here, does move read on cycle 0?", is right:
MOVE has no other business than moving things and prefetching next opcode.
Why would it "idle" for 4 cycles before doing anything meaningful? Every cycle is precious. As soon as it can, MOVE reads, then it writes, with the prefetch generally coming at the end (as found by ijor, MOVE does this in various ways).
In this case the problem was visible, but the timing needed to be there at the start because "read shifter counter", used by a lot of other instructions (CMP, TST...), depended on it.
That's why you can fix prefetch in Steem only if you fix "read shifter counter" at the same time.
It's evident now but I only understood it this week! Before that, all my attempts miserably failed and I thought, maybe, prefetch on the M68000 still wasn't what we thought.
Steven Seagal
Fuji Shaped Bastard
Posts: 2018
Joined: Sun Dec 04, 2005 9:12 am
Location: Undisclosed

Re: Prefetch

Post by Steven Seagal »

I just forgot to present a typical instruction with early "fetch timing". MOVE is exceptional in that Steem authors were aware that it fetches at the end, and so the early timing was more obvious.
So here's one, this time with the mod:

Code: Select all

void                              m68k_tst_b(){
#if !(defined(STEVEN_SEAGAL) && defined(SS_CPU_LINE_4_TIMINGS))
  FETCH_TIMING;
#endif
  BYTE x=m68k_read_dest_b();
#if defined(STEVEN_SEAGAL) && defined(SS_CPU_LINE_4_TIMINGS)
  FETCH_TIMING;
#endif
  PREFETCH_IRC;
  SR_CLEAR(SR_N+SR_Z+SR_V+SR_C);
  if(!x)SR_SET(SR_Z);
  if(x&MSB_B)SR_SET(SR_N);
}
The FETCH_TIMING directive comes before the "read". Now imagine if we're reading the shifter counter. What "line cycle" will "it" (our emulation) think it's on? It will think that it's 4 cycles later, and that data is vital for all programs (demos essentially) that sync on it.
Now imagine that we didn't have the INSTRUCTION_TIME(4) at the start of MOVE above, despite our knowledge that MOVE fetches at the end. Then there would be a difference between MOVE and TST, and ST programs would receive wrong values for one of the instructions (in emulation, as opposed to a true ST).
Since this is (one of) the forums where I brag, a word on the mods:
SS_CPU_LINE_4_TIMINGS is defined only for debugging, there's one such macro for each M68K "line". Normally, macro FETCH_TIMING does nothing anymore and the cycles are counted at macro PREFETCH_IRC, which I added myself in every instruction already for v3.3. Before v3.5, this macro just "prefetched" the value at PC in CPU register IRC as explained by ijor. In v3.5, it also counts cycles.
Since much depends on where exactly those PREFETCH_IRC are placed, some bugs (regressions) are possible at this stage: this is a very big change!
But essentially it works: the value returned by "read shifter counter" is now hopefully correct. In practice, besides the almost certain regressions to fix, it doesn't change anything that I know of.
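
Why the placement matters can be shown with a minimal sketch. This is not the Steem macro: Cpu, prefetch_irc and write_word are made-up names, memory is a plain word array, and each bus access is assumed to cost 4 cycles. It compares a write followed by a prefetch (the "nw np" order the MOVE table further down gives for an (An) destination) with a prefetch followed by a write (the "np nw" order it gives for -(An)), when the write hits the very word about to be prefetched.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical CPU state, reduced to what the sketch needs.
struct Cpu {
    std::vector<std::uint16_t> mem;  // word-indexed memory, for brevity
    std::uint32_t pc = 0;            // byte address of the next word to prefetch
    std::uint16_t irc = 0;           // prefetched word
    int cycles = 0;
};

// Load the word at PC into IRC and count the bus access.
void prefetch_irc(Cpu& cpu) {
    cpu.irc = cpu.mem.at(cpu.pc / 2);
    cpu.cycles += 4;
}

void write_word(Cpu& cpu, std::uint32_t addr, std::uint16_t value) {
    cpu.mem.at(addr / 2) = value;
    cpu.cycles += 4;
}

// "nw np": the write lands before the prefetch, so IRC sees the new word.
// The CPU is taken by value so each scenario starts from the same state.
std::uint16_t write_then_prefetch(Cpu cpu, std::uint16_t value) {
    write_word(cpu, cpu.pc, value);
    prefetch_irc(cpu);
    return cpu.irc;
}

// "np nw": the prefetch happens first, so IRC keeps the old word.
std::uint16_t prefetch_then_write(Cpu cpu, std::uint16_t value) {
    prefetch_irc(cpu);
    write_word(cpu, cpu.pc, value);
    return cpu.irc;
}
```

Both orders cost the same 8 cycles here, yet IRC ends up holding different words. A program relying on that behaviour would break if the prefetch sat on the wrong side of the write.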
Dio
Captain Atari
Posts: 451
Joined: Thu Feb 28, 2008 3:51 pm

Re: Prefetch

Post by Dio »

The biggest problem with debugging prefetch is that there's no automatic testing to demonstrate that it works identically to a real ST, just lots of applications that have to be run.

If that existed, actually fixing the problems wouldn't be the hardest thing in the world. But without that, it's all stabbing in the dark a bit.
Steven Seagal
Fuji Shaped Bastard
Posts: 2018
Joined: Sun Dec 04, 2005 9:12 am
Location: Undisclosed

Re: Prefetch

Post by Steven Seagal »

Yes, no choice but to wait for bug reports!
And you must see it, check this subtle one (scroll down):
drag_prefetch_bug.png
It depends on prefetch timing. I knew it because that's what I was working on, but without that, guesses would go to a "shifter tricks" problem instead.
danorf
Atari maniac
Posts: 84
Joined: Tue Feb 12, 2013 1:18 pm
Location: Behind a computer

Re: Prefetch

Post by danorf »

Hello,

For the 68000 CPU, the exact number of cycles used by instructions, the instruction pairing possibilities and the cycle-by-cycle data bus usage can be fully deduced from two documents:
- The user manual
- Motorola patent No. 4325121 - Two level control store for microprogrammed data processor.

In my former company, we had a working tool (an "extended" cycles table) made from these two documents (we also had a more or less corrected and completed transcript of patent No. 4325121).

I still have those documents in my possession, lurking somewhere on a USB stick.

As this company no longer does anything with Motorola processors, I will inquire about the possibility of disseminating these documents.

For example, the MOVE section is attached.
In the following table, data bus usage is described like this:
n : nop : data bus is not used
p : Program fetch; read from next consecutive location in program memory
W : Write MSW onto data bus when using long word
w : Write one word onto data bus (LSW if long word operation)
R : Read MSW from data bus when using long word
r : Read one word from data bus (LSW if long word operation)
Steven Seagal
Fuji Shaped Bastard
Posts: 2018
Joined: Sun Dec 04, 2005 9:12 am
Location: Undisclosed

Re: Prefetch

Post by Steven Seagal »

This table seems coherent with ijor's article, for the -(An) case it's clear ("class 0")

Code: Select all

<ea>,-(An) :      |                 |               |                           
  .B or .W :      |                 |               |                           
    Dn            |  8(1/1)         |               |                  np nw	  
    An            |  8(1/1)         |               |                  np nw    
    (An)          | 12(2/1)         |            nr |                  np nw    
    (An)+         | 12(2/1)         |            nr |                  np nw    
    -(An)         | 14(2/1)         | n          nr |                  np nw    
    (d16,An)      | 16(3/1)         |      np    nr |                  np nw    
    (d8,An,Xn)    | 18(3/1)         | n    np    nr |                  np nw    
    (xxx).W       | 16(3/1)         |      np    nr |                  np nw    
    (xxx).L       | 20(4/1)         |   np np    nr |                  np nw    
    #<data>       | 12(2/1)         |      np       |                  np nw    
  .L :            |                 |               |                           
    Dn            | 12(1/2)         |               |                  np nw nW 
    An            | 12(1/2)         |               |                  np nw nW 
    (An)          | 20(3/2)         |         nR nr |                  np nw nW 
    (An)+         | 20(3/2)         |         nR nr |                  np nw nW 
    -(An)         | 22(3/2)         | n       nR nr |                  np nw nW 
    (d16,An)      | 24(4/2)         |      np nR nr |                  np nw nW 
    (d8,An,Xn)    | 26(4/2)         | n    np nR nr |                  np nw nW 
    (xxx).W       | 24(4/2)         |      np nR nr |                  np nw nW 
    (xxx).L       | 28(5/2)         |   np np nR nr |                  np nw nW 
    #<data>       | 20(3/2)         |   np np       |                  np nw nW 
For <ea>,(xxx).L it also seems to indicate 2 prefetches at the end ("class 2"), except that the table would be incomplete at the top ('n' instead of 'np')? Because according to ijor the behavior is the same regardless of transfer size (byte, word or long).

Code: Select all

<ea>,(xxx).L :    |                 |               |                           
  .B or .W :      |                 |               |                           
    Dn            | 16(3/1)         |               |   np np    nw np		      
    An            | 16(3/1)         |               |   np np    nw np		      
    (An)          | 20(4/1)         |            nr |      np    nw np n        
    (An)+         | 20(4/1)         |            nr |      np    nw np n        
    -(An)         | 22(4/1)         | n          nr |      np    nw np n        
    (d16,An)      | 24(5/1)         |      np    nr |      np    nw np n        
    (d8,An,Xn)    | 26(5/1)         | n    np    nr |      np    nw np n        
    (xxx).W       | 24(5/1)         |      np    nr |      np    nw np n        
    (xxx).L       | 28(6/1)         |   np np    nr |      np    nw np n        
    #<data>       | 20(4/1)         |      np       |   np np    nw np		      
  .L :            |                 |               |                           
    Dn            | 20(3/2)         |               |   np np nW nw np		      
    An            | 20(3/2)         |               |   np np nW nw np		      
    (An)          | 28(5/2)         |         nR nr |      np nW nw np np       
    (An)+         | 28(5/2)         |         nR nr |      np nW nw np np       
    -(An)         | 30(5/2)         | n       nR nr |      np nW nw np np       
    (d16,An)      | 32(6/2)         |      np nR nr |      np nW nw np np       
    (d8,An,Xn)    | 34(6/2)         | n    np nR nr |      np nW nw np np       
    (xxx).W       | 32(6/2)         |      np nR nr |      np nW nw np np       
    (xxx).L       | 36(7/2)         |   np np nR nr |      np nW nw np np       
    #<data>       | 28(5/2)         |   np np       |   np np nW nw np        
danorf
Atari maniac
Posts: 84
Joined: Tue Feb 12, 2013 1:18 pm
Location: Behind a computer

Re: Prefetch

Post by danorf »

Sorry,

This was a cut & paste error: I missed the last column of letters! (You can tell because the exec time doesn't correspond to the number of letters in the last 2 columns x 2 cycles; yes, each letter "takes" 2 cycles.)

Code: Select all

-------------------------------------------------------------------------------
                  |    Exec Time    |               Data Bus Usage             
       MOVE       |      INSTR      |  1st OP (ea)  |          INSTR           
------------------+-----------------+---------------+--------------------------
<ea>,Dn :         |                 |               |                          
  .B or .W :      |                 |               |                          
    Dn            |  4(1/0)         |               |               np		     
    An            |  4(1/0)         |               |               np		     
    (An)          |  8(2/0)         |            nr |               np		     
    (An)+         |  8(2/0)         |            nr |               np		     
    -(An)         | 10(2/0)         | n          nr |               np		     
    (d16,An)      | 12(3/0)         |      np    nr |               np		     
    (d8,An,Xn)    | 14(3/0)         | n    np    nr |               np		     
    (xxx).W       | 12(3/0)         |      np    nr |               np		     
    (xxx).L       | 16(4/0)         |   np np    nr |               np		     
    #<data>       |  8(2/0)         |      np       |               np		     
  .L :            |                 |               |                          
    Dn            |  4(1/0)         |               |               np		     
    An            |  4(1/0)         |               |               np		     
    (An)          | 12(3/0)         |         nR nr |               np		     
    (An)+         | 12(3/0)         |         nR nr |               np		     
    -(An)         | 14(3/0)         | n       nR nr |               np		     
    (d16,An)      | 16(4/0)         |      np nR nr |               np		     
    (d8,An,Xn)    | 18(4/0)         | n    np nR nr |               np		     
    (xxx).W       | 16(4/0)         |      np nR nr |               np		     
    (xxx).L       | 20(5/0)         |   np np nR nr |               np		     
    #<data>       | 12(3/0)         |   np np       |               np		     
<ea>,(An) :       |                 |               |                          
  .B or .W :      |                 |               |                          
    Dn            |  8(1/1)         |               |            nw np         
    An            |  8(1/1)         |               |            nw np         
    (An)          | 12(2/1)         |            nr |            nw np         
    (An)+         | 12(2/1)         |            nr |            nw np         
    -(An)         | 14(2/1)         | n          nr |            nw np         
    (d16,An)      | 16(3/1)         |      np    nr |            nw np         
    (d8,An,Xn)    | 18(3/1)         | n    np    nr |            nw np         
    (xxx).W       | 16(3/1)         |      np    nr |            nw np         
    (xxx).L       | 20(4/1)         |   np np    nr |            nw np         
    #<data>       | 12(2/1)         |      np       |            nw np         
  .L :            |                 |               |                          
    Dn            | 12(1/2)         |               |         nW nw np		     
    An            | 12(1/2)         |               |         nW nw np		     
    (An)          | 20(3/2)         |         nR nr |         nW nw np		     
    (An)+         | 20(3/2)         |         nR nr |         nW nw np		     
    -(An)         | 22(3/2)         | n       nR nr |         nW nw np		     
    (d16,An)      | 24(4/2)         |      np nR nr |         nW nw np		     
    (d8,An,Xn)    | 26(4/2)         | n    np nR nr |         nW nw np		     
    (xxx).W       | 24(4/2)         |      np nR nr |         nW nw np		     
    (xxx).L       | 28(5/2)         |   np np nR nr |         nW nw np		     
    #<data>       | 20(3/2)         |   np np       |         nW nw np		     
<ea>,(An)+ :      |                 |               |                          
  .B or .W :      |                 |               |                          
    Dn            |  8(1/1)         |               |            nw np         
    An            |  8(1/1)         |               |            nw np         
    (An)          | 12(2/1)         |            nr |            nw np         
    (An)+         | 12(2/1)         |            nr |            nw np         
    -(An)         | 14(2/1)         | n          nr |            nw np         
    (d16,An)      | 16(3/1)         |      np    nr |            nw np         
    (d8,An,Xn)    | 18(3/1)         | n    np    nr |            nw np         
    (xxx).W       | 16(3/1)         |      np    nr |            nw np         
    (xxx).L       | 20(4/1)         |   np np    nr |            nw np         
    #<data>       | 12(2/1)         |      np       |            nw np         
  .L :            |                 |               |                          
    Dn            | 12(1/2)         |               |         nW nw np         
    An            | 12(1/2)         |               |         nW nw np         
    (An)          | 20(3/2)         |         nR nr |         nW nw np         
    (An)+         | 20(3/2)         |         nR nr |         nW nw np         
    -(An)         | 22(3/2)         | n       nR nr |         nW nw np         
    (d16,An)      | 24(4/2)         |      np nR nr |         nW nw np         
    (d8,An,Xn)    | 26(4/2)         | n    np nR nr |         nW nw np         
    (xxx).W       | 24(4/2)         |      np nR nr |         nW nw np         
    (xxx).L       | 28(5/2)         |   np np nR nr |         nW nw np         
    #<data>       | 20(3/2)         |   np np       |         nW nw np         
<ea>,-(An) :      |                 |               |                          
  .B or .W :      |                 |               |                          
    Dn            |  8(1/1)         |               |                  np nw   
    An            |  8(1/1)         |               |                  np nw   
    (An)          | 12(2/1)         |            nr |                  np nw   
    (An)+         | 12(2/1)         |            nr |                  np nw   
    -(An)         | 14(2/1)         | n          nr |                  np nw   
    (d16,An)      | 16(3/1)         |      np    nr |                  np nw   
    (d8,An,Xn)    | 18(3/1)         | n    np    nr |                  np nw   
    (xxx).W       | 16(3/1)         |      np    nr |                  np nw   
    (xxx).L       | 20(4/1)         |   np np    nr |                  np nw   
    #<data>       | 12(2/1)         |      np       |                  np nw   
  .L :            |                 |               |                          
    Dn            | 12(1/2)         |               |                  np nw nW
    An            | 12(1/2)         |               |                  np nw nW
    (An)          | 20(3/2)         |         nR nr |                  np nw nW
    (An)+         | 20(3/2)         |         nR nr |                  np nw nW
    -(An)         | 22(3/2)         | n       nR nr |                  np nw nW
    (d16,An)      | 24(4/2)         |      np nR nr |                  np nw nW
    (d8,An,Xn)    | 26(4/2)         | n    np nR nr |                  np nw nW
    (xxx).W       | 24(4/2)         |      np nR nr |                  np nw nW
    (xxx).L       | 28(5/2)         |   np np nR nr |                  np nw nW
    #<data>       | 20(3/2)         |   np np       |                  np nw nW
<ea>,(d16,An) :   |                 |               |                          
  .B or .W :      |                 |               |                          
    Dn            | 12(2/1)         |               |      np    nw np         
    An            | 12(2/1)         |               |      np    nw np		     
    (An)          | 16(3/1)         |            nr |      np    nw np		     
    (An)+         | 16(3/1)         |            nr |      np    nw np		     
    -(An)         | 18(3/1)         | n          nr |      np    nw np		     
    (d16,An)      | 20(4/1)         |      np    nr |      np    nw np		     
    (d8,An,Xn)    | 22(4/1)         | n    np    nr |      np    nw np		     
    (xxx).W       | 20(4/1)         |      np    nr |      np    nw np		     
    (xxx).L       | 24(5/1)         |   np np    nr |      np    nw np		     
    #<data>       | 16(3/1)         |      np       |      np    nw np		     
  .L :            |                 |               |                          
    Dn            | 16(2/2)         |               |      np nW nw np		     
    An            | 16(2/2)         |               |      np nW nw np		     
    (An)          | 24(4/2)         |         nR nr |      np nW nw np         
    (An)+         | 24(4/2)         |         nR nr |      np nW nw np         
    -(An)         | 26(4/2)         | n       nR nr |      np nW nw np         
    (d16,An)      | 28(5/2)         |      np nR nr |      np nW nw np         
    (d8,An,Xn)    | 30(5/2)         | n    np nR nr |      np nW nw np         
    (xxx).W       | 28(5/2)         |      np nR nr |      np nW nw np         
    (xxx).L       | 32(6/2)         |   np np nR nr |      np nW nw np         
    #<data>       | 24(4/2)         |   np np       |      np nW nw np		     
<ea>,(d8,An,Xn) : |                 |               |                          
  .B or .W :      |                 |               |                          
    Dn            | 14(2/1)         |               | n    np    nw np		     
    An            | 14(2/1)         |               | n    np    nw np		     
    (An)          | 18(3/1)         |            nr | n    np    nw np		     
    (An)+         | 18(3/1)         |            nr | n    np    nw np		     
    -(An)         | 20(3/1)         | n          nr | n    np    nw np		     
    (d16,An)      | 22(4/1)         |      np    nr | n    np    nw np		     
    (d8,An,Xn)    | 24(4/1)         | n    np    nr | n    np    nw np		     
    (xxx).W       | 22(4/1)         |      np    nr | n    np    nw np		     
    (xxx).L       | 26(5/1)         |   np np    nr | n    np    nw np		     
    #<data>       | 18(3/1)         |      np       | n    np    nw np		     
  .L :            |                 |               |                          
    Dn            | 18(2/2)         |               | n    np nW nw np		     
    An            | 18(2/2)         |               | n    np nW nw np		     
    (An)          | 26(4/2)         |         nR nr | n    np nW nw np         
    (An)+         | 26(4/2)         |         nR nr | n    np nW nw np         
    -(An)         | 28(4/2)         | n       nR nr | n    np nW nw np         
    (d16,An)      | 30(5/2)         |      np nR nr | n    np nW nw np         
    (d8,An,Xn)    | 32(5/2)         | n    np nR nr | n    np nW nw np         
    (xxx).W       | 30(5/2)         |      np nR nr | n    np nW nw np         
    (xxx).L       | 34(6/2)         |   np np nR nr | n    np nW nw np         
    #<data>       | 26(4/2)         |   np np       | n    np nW nw np		     
<ea>,(xxx).W :    |                 |               |                          
  .B or .W :      |                 |               |                          
    Dn            | 12(2/1)         |               |      np    nw np		     
    An            | 12(2/1)         |               |      np    nw np		     
    (An)          | 16(3/1)         |            nr |      np    nw np		     
    (An)+         | 16(3/1)         |            nr |      np    nw np		     
    -(An)         | 18(3/1)         | n          nr |      np    nw np		     
    (d16,An)      | 20(4/1)         |      np    nr |      np    nw np		     
    (d8,An,Xn)    | 22(4/1)         | n    np    nr |      np    nw np		     
    (xxx).W       | 20(4/1)         |      np    nr |      np    nw np		     
    (xxx).L       | 24(5/1)         |   np np    nr |      np    nw np		     
    #<data>       | 16(3/1)         |      np       |      np    nw np		     
  .L :            |                 |               |                          
    Dn            | 16(2/2)         |               |      np nW nw np		     
    An            | 16(2/2)         |               |      np nW nw np		     
    (An)          | 24(4/2)         |         nR nr |      np nW nw np         
    (An)+         | 24(4/2)         |         nR nr |      np nW nw np         
    -(An)         | 26(4/2)         | n       nR nr |      np nW nw np         
    (d16,An)      | 28(5/2)         |      np nR nr |      np nW nw np         
    (d8,An,Xn)    | 30(5/2)         | n    np nR nr |      np nW nw np         
    (xxx).W       | 28(5/2)         |      np nR nr |      np nW nw np         
    (xxx).L       | 32(6/2)         |   np np nR nr |      np nW nw np         
    #<data>       | 24(4/2)         |   np np       |      np nW nw np		     
<ea>,(xxx).L :    |                 |               |                          
  .B or .W :      |                 |               |                          
    Dn            | 16(3/1)         |               |   np np    nw np		     
    An            | 16(3/1)         |               |   np np    nw np		     
    (An)          | 20(4/1)         |            nr |      np    nw np np	     
    (An)+         | 20(4/1)         |            nr |      np    nw np np	     
    -(An)         | 22(4/1)         | n          nr |      np    nw np np	     
    (d16,An)      | 24(5/1)         |      np    nr |      np    nw np np	     
    (d8,An,Xn)    | 26(5/1)         | n    np    nr |      np    nw np np	     
    (xxx).W       | 24(5/1)         |      np    nr |      np    nw np np	     
    (xxx).L       | 28(6/1)         |   np np    nr |      np    nw np np	     
    #<data>       | 20(4/1)         |      np       |   np np    nw np		     
  .L :            |                 |               |                          
    Dn            | 20(3/2)         |               |   np np nW nw np		     
    An            | 20(3/2)         |               |   np np nW nw np		     
    (An)          | 28(5/2)         |         nR nr |      np nW nw np np      
    (An)+         | 28(5/2)         |         nR nr |      np nW nw np np      
    -(An)         | 30(5/2)         | n       nR nr |      np nW nw np np      
    (d16,An)      | 32(6/2)         |      np nR nr |      np nW nw np np      
    (d8,An,Xn)    | 34(6/2)         | n    np nR nr |      np nW nw np np      
    (xxx).W       | 32(6/2)         |      np nR nr |      np nW nw np np      
    (xxx).L       | 36(7/2)         |   np np nR nr |      np nW nw np np      
    #<data>       | 28(5/2)         |   np np       |   np np nW nw np         
                                                                               
-------------------------------------------------------------------------------
                  |    Exec Time    |               Data Bus Usage             
      MOVEA       |      INSTR      |  1st OP (ea)  |          INSTR           
------------------+-----------------+---------------+--------------------------
<ea>,An :         |                 |               |                          
  .W :            |                 |               |                          
    Dn            |  4(1/0)         |               |               np		     
    An            |  4(1/0)         |               |               np		     
    (An)          |  8(2/0)         |            nr |               np         
    (An)+         |  8(2/0)         |            nr |               np         
    -(An)         | 10(2/0)         | n          nr |               np         
    (d16,An)      | 12(3/0)         |      np    nr |               np         
    (d8,An,Xn)    | 14(3/0)         | n    np    nr |               np         
    (xxx).W       | 12(3/0)         |      np    nr |               np         
    (xxx).L       | 16(4/0)         |   np np    nr |               np         
    #<data>       |  8(2/0)         |      np       |               np		     
  .L :            |                 |               |                          
    Dn            |  4(1/0)         |               |               np		     
    An            |  4(1/0)         |               |               np		     
    (An)          | 12(3/0)         |         nR nr |               np         
    (An)+         | 12(3/0)         |         nR nr |               np         
    -(An)         | 14(3/0)         | n       nR nr |               np         
    (d16,An)      | 16(4/0)         |      np nR nr |               np         
    (d8,An,Xn)    | 18(4/0)         | n    np nR nr |               np         
    (xxx).W       | 16(4/0)         |      np nR nr |               np         
    (xxx).L       | 20(5/0)         |   np np nR nr |               np         
    #<data>       | 12(3/0)         |   np np       |               np		     
By the way, don't worry too much about the accuracy of this table; years of use have proved it. :mrgreen: (at least when I don't forget a column when cutting & pasting! :oops:)
This table is not made of empirical measurements; it is only an application of the manufacturer (Motorola) documentation I named earlier (the patent is definitely a must-read if you really want to understand how the M68000 works).
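
The rows of the table can be cross-checked mechanically. Here is a minimal sketch, assuming the usual legend for this notation (it is not repeated in this part of the thread): every bus access (np, nr, nR, nw, nW) takes 4 cycles, a lone n is a 2-cycle internal step, np/nr/nR count as reads and nw/nW as writes. The function names are made up for the illustration.

```python
# Access type of each symbol (assumed legend, see the paragraph above).
BUS_READS = {"np", "nr", "nR"}
BUS_WRITES = {"nw", "nW"}

def summarize(bus_usage):
    """Return (cycles, reads, writes) for a bus-usage string such as 'n nR nr np'."""
    cycles = reads = writes = 0
    for symbol in bus_usage.split():
        if symbol == "n":            # internal step, no bus access (assumed 2 cycles)
            cycles += 2
        elif symbol in BUS_READS:    # bus read (assumed 4 cycles)
            cycles += 4
            reads += 1
        elif symbol in BUS_WRITES:   # bus write (assumed 4 cycles)
            cycles += 4
            writes += 1
        else:
            raise ValueError(f"unknown symbol: {symbol!r}")
    return cycles, reads, writes

def check_row(row):
    """Check one row laid out as 'mode | cycles(reads/writes) | 1st OP | INSTR'."""
    _mode, exec_time, first_op, instr = (field.strip() for field in row.split("|"))
    cycles, _, counts = exec_time.partition("(")
    reads, writes = counts.rstrip(")").split("/")
    expected = (int(cycles), int(reads), int(writes))
    return summarize(first_op + " " + instr) == expected

# Two rows copied from the table above.
print(check_row("-(An)      | 30(5/2) | n       nR nr | np nW nw np np"))
print(check_row("(d8,An,Xn) | 14(3/0) | n    np    nr | np"))
```

Under those assumptions, the execution time and the read/write counts in the second column follow directly from the bus-usage columns, which makes a forgotten column easy to spot.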

As soon as I get a positive (I hope) response, I will publish the entire table.

Re: Prefetch

Post by Steven Seagal »

Alright, then it's in agreement with the article I mentioned for MOVE.
A full table would be excellent documentation: the article focuses on prefetch behaviour, to help emulate prefetch tricks, and doesn't give the full timing order the way the table does.
In the CIA we learned that ST ruled
Steem SSE: http://sourceforge.net/projects/steemsse
Dio
Captain Atari
Captain Atari
Posts: 451
Joined: Thu Feb 28, 2008 3:51 pm

Re: Prefetch

Post by Dio »

It is possible to generate some (perhaps all) of the same documentation, and to test it automatically, on a real ST. I've done some of it; the bus error timing in particular is quite revealing.

(One point to be slightly wary of is that the microcode in the patent may not be the same as the shipping chip in minor cases).

Still very useful extra data to have though.
danorf
Atari maniac
Atari maniac
Posts: 84
Joined: Tue Feb 12, 2013 1:18 pm
Location: Behind a computer

Re: Prefetch

Post by danorf »

Dio wrote:(One point to be slightly wary of is that the microcode in the patent may not be the same as the shipping chip in minor cases).
In very, very few cases, if I remember well (DBcc comes to mind when I think of this).

But the table I own has been tested (and mostly corrected) against a stock 68000.
User avatar
Nyh
Atari God
Atari God
Posts: 1533
Joined: Tue Oct 12, 2004 2:25 pm
Location: Netherlands

Re: Prefetch

Post by Nyh »

I have a demo for you that uses prefetch tricks. It doesn't show the correct result in Steem because move.w Dn,-(An) is not correctly implemented.

Hans Wessels

Re: Prefetch

Post by Steven Seagal »

Funny, I would say it's correct in Steem SSE (4 pixels).
Are there real ST pics?

Edit: I see it was different in v3.2; is that what you mean? Surely it's better in SSE since v3.3, when I fixed prefetch except for the timing placement, which I fixed in v3.5.
Please confirm; I'm anxious to add this to my brag pages! :)

Re: Prefetch

Post by Steven Seagal »

No reply after the bold claim? :(

Then here are pics of this demo in Steem 3.2 then the SSE build:

[Image: the demo in Steem 3.2]

[Image: the demo in the Steem SSE build]

I'm not sure SSE is correct, but the rasters are 4 pixels wide and it's prettier.
There's also a shift between STF & STE, and between wake-up states, if you play around in Steem SSE 3.5!

Re: Prefetch

Post by Nyh »

Steven Seagal wrote:No reply after the bold claim? :(

I'm not sure SSE is correct, but the rasters are 4 pixels wide and it's prettier.
There's also a shift between STF & STE, and between wake-up states, if you play around in Steem SSE 3.5!
Sorry. I am very busy at the moment. If I can browse this forum once a week I am happy.

The prefetch timing is an interesting subject. I used it for a demo, as you can see. STeem SSE does the right thing, displaying 4-pixel rasters. I didn't complete this demo because of the STeem problems, and instead released a version that worked fine on STeem using the add.w Dn,-(An) instruction, which happened to work correctly in STeem.

Thanks for the good work!
Hans Wessels

Re: Prefetch

Post by Steven Seagal »

Thx for confirming, I wasn't sure if you had seen something else incorrect in new versions of Steem.
At least now I know that placing all those 'PREFETCH_IRC' macros had a practical use too.
add.w Dn,-(An) worked because it's one of the rare instructions with 'extra prefetch' in v3.2, probably done for demos like Anomaly.
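
To illustrate why the placement of the prefetch matters for rasters: the cycle at which a write reaches the bus depends on how many accesses come before it in the sequence. This is only a rough sketch, assuming 4 cycles per bus access and 2 cycles per lone n in the table notation; write_offsets is a made-up helper, and the two short sequences are hypothetical orderings rather than rows quoted from the table.

```python
def write_offsets(bus_usage):
    """Return the cycle offsets (from instruction start) at which writes begin."""
    offsets = []
    cycle = 0
    for symbol in bus_usage.split():
        if symbol in ("nw", "nW"):
            offsets.append(cycle)
        # Assumed costs: 2 cycles for an internal 'n', 4 for any bus access.
        cycle += 2 if symbol == "n" else 4
    return offsets

# Bus usage of the 'An 20(3/2)' row from the table earlier in this thread.
print(write_offsets("np np nW nw np"))  # [8, 12]

# Two hypothetical orderings of an 8-cycle instruction with one write:
print(write_offsets("nw np"))  # [0]  write first, prefetch last
print(write_offsets("np nw"))  # [4]  prefetch first, write last
```

Both orderings last 8 cycles, but the write lands 4 cycles apart, which is the kind of difference that moves a colour change on screen.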