Fast 15q16 decoding in m68k

simonsunnyboy
Moderator
Posts: 5082
Joined: Wed Oct 23, 2002 4:36 pm
Location: Friedrichshafen, Germany

Postby simonsunnyboy » Mon Nov 11, 2013 4:24 pm

Hi all,

I'm slowly working on some stuff that requires fixed-point arithmetic. I keep my data in 15q16 format, that is, 15 bits + sign for the integer part and 16 bits for the fraction.
Now I want to quickly convert this to a plain int16_t.

The naive way is to do lsr.l #16,d0 ... would simply doing swap d0 do the trick as well? Or is there another method I can't think of yet?
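
For readers who think in C first, here is a minimal sketch of the conversion being asked about. The names `q15_16_t`, `q_make` and `q_to_int16` are made up for illustration and are not from the post; the sketch assumes the usual two's complement layout with the integer part in the upper 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type name: a 15q16 value held in a signed 32-bit integer. */
typedef int32_t q15_16_t;

/* Build a 15q16 value from a signed integer part and a raw 16-bit fraction
   (fraction / 65536 is added on top of the integer part). */
static q15_16_t q_make(int16_t ipart, uint16_t frac)
{
    return (q15_16_t)((int32_t)ipart * 65536L + (int32_t)frac);
}

/* Keep the upper 16 bits, drop the fraction. Working on the unsigned bit
   pattern avoids relying on how the compiler shifts negative values. */
static int16_t q_to_int16(q15_16_t v)
{
    uint16_t hi = (uint16_t)(((uint32_t)v >> 16) & 0xFFFFu);
    if (hi & 0x8000u)
        return (int16_t)((int32_t)hi - 65536L);
    return (int16_t)hi;
}

/* Small self-check of the sketch. */
void q_self_check(void)
{
    assert(q_to_int16(q_make(3, 0x8000u)) == 3);    /*  3.5 ->  3 */
    assert(q_to_int16(q_make(-2, 0x8000u)) == -2);  /* -1.5 -> -2 */
    assert(q_to_int16(q_make(-32768, 0)) == -32768);
}
```

Note that dropping the lower 16 bits rounds toward negative infinity, so -1.5 becomes -2 rather than -1.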

Regards,
ssb
Simon Sunnyboy/Paradize - http://paradize.atari.org/

Stay cool, stay Atari!

1x2600jr, 1x1040STFm, 1x1040STE 4MB+TOS2.06+SatanDisk, 1xF030 14MB+FPU+NetUS-Bee

AtariZoll
Fuji Shaped Bastard
Posts: 2978
Joined: Mon Feb 20, 2012 4:42 pm

Postby AtariZoll » Mon Nov 11, 2013 7:55 pm

There is no lsr.l #16: an immediate shift count only goes from 1 to 8, so you would need to do lsr.l #8 twice in a row. But on the 68000 a shift by 8 bits is slow.
If you need it inside a loop, the fastest would be:

   moveq #0,d1        ; clear d1 (all 32 bits, so the upper 16 bits stay 0)

loop:
   swap d0
   move.w d0,d1       ; result in d1, 16 bits only; skip this step if the following code uses d0 as a 16-bit word
   ...                ; process the result
   ...                ; branch back to loop
Famous Schrodinger's cat hypothetical experiment says that cat is dead or alive until we open box and see condition of poor animal, which deserved better logic. Cat is always in some certain state - regardless from is observer able or not to see what the state is.

FedePede04
Atari God
Posts: 1211
Joined: Fri Feb 04, 2011 12:14 am
Location: Denmark

Postby FedePede04 » Mon Nov 11, 2013 9:10 pm

AtariZoll wrote: There is no lsr.l #16 ... you need to do 2x lsr.l #8 in order. [...]

Can't you shift by more than 8 if you put the shift count in another register, like this?

Code: Select all

   moveq   #16,d0          ; shift count in a register
   move.l  #value,d1       ; value = the number that needs to be shifted
   lsr.l   d0,d1
Atari will rule the world, long after man has disappeared

sometime my English is a little weird, Google translate is my best friend :)

sarnau
Atari nerd
Posts: 44
Joined: Tue Sep 07, 2010 4:22 am

Postby sarnau » Tue Nov 12, 2013 4:45 am

On a 68000 the cycle count for lsr depends on the number of bits shifted. To shift down by 16 bits, this is many times faster:

Code: Select all

   clr.w d0
   swap d0


For an asr, you would add an ext.l d0.
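
To see why the two sequences behave like a logical and an arithmetic shift, here is a small C model of the register contents. It is only a sketch: the function names are invented for illustration, and a 32-bit unsigned value stands in for d0.

```c
#include <assert.h>
#include <stdint.h>

/* swap Dn: exchange the upper and lower 16-bit halves of the register */
static uint32_t m68k_swap(uint32_t d)
{
    return (d << 16) | (d >> 16);
}

/* clr.w Dn: clear the lower 16 bits, leave the upper 16 bits alone */
static uint32_t m68k_clr_w(uint32_t d)
{
    return d & 0xFFFF0000ul;
}

/* ext.l Dn: copy bit 15 into bits 16-31 */
static uint32_t m68k_ext_l(uint32_t d)
{
    return (d & 0x8000ul) ? (d | 0xFFFF0000ul) : (d & 0x0000FFFFul);
}

/* clr.w d0 / swap d0: same register contents as a logical shift right by 16 */
static uint32_t shift16_logical(uint32_t d)
{
    return m68k_swap(m68k_clr_w(d));
}

/* swap d0 / ext.l d0: same register contents as an arithmetic shift right by 16 */
static uint32_t shift16_arith(uint32_t d)
{
    return m68k_ext_l(m68k_swap(d));
}

/* Small self-check of the model. */
void shift16_self_check(void)
{
    assert(shift16_logical(0xFFFE8000ul) == 0x0000FFFEul);
    assert(shift16_arith(0xFFFE8000ul) == 0xFFFFFFFEul);
    assert(shift16_arith(0x00038000ul) == 0x00000003ul);
}
```

In both variants the lower word is the same; they only differ in what ends up in bits 16-31.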

Postby simonsunnyboy » Tue Nov 12, 2013 4:12 pm

sarnau wrote: On a 68000 the cycle count for lsr depends on the number of bits shifted. To shift down by 16 bits, this is many times faster:

Code: Select all

   clr.w d0
   swap d0


For an asr, you would add an ext.l d0.


I'm not 100% sure if I understood it correctly: will this keep the sign information in the MSB? Or is this the mentioned sign extension I have to do?

What I want to achieve is basically a cast from int32_t to int16_t, throwing away the 16 LSBs...

Postby simonsunnyboy » Tue Nov 12, 2013 4:20 pm

FedePede04 wrote:
Can't you shift by more than 8 if you put the shift count in another register, like this?

Code: Select all

   moveq   #16,d0          ; shift count in a register
   move.l  #value,d1       ; value = the number that needs to be shifted
   lsr.l   d0,d1


AHCC actually generates such code:

Code: Select all

moveq #16,d1
asr.l d1,d0


But the shift takes more cycles the more bits it shifts, so I'd like to avoid that.

Postby AtariZoll » Tue Nov 12, 2013 5:28 pm

Using 2 registers instead of 1? It can be a little faster, yes. As I see it, the code with moveq #16 is faster by 4 T-states (44 cycles versus 48 for two lsr.l #8).

Of course, I did not recommend it for this case, I just mentioned it. Swap is what you need. In ASM there is no specific int size definition. It all depends on the CPU instructions used: some operate on 32 bits, some on 16, some on 8. It all comes with practice. If you are not sure, clear the upper bits. So the best would be: clr.w d1 ; swap d1 - after that the upper 16 bits will be 0.

seedy1812
Atari User
Posts: 34
Joined: Tue May 18, 2010 2:04 pm

Postby seedy1812 » Tue Nov 12, 2013 10:04 pm

Looking at both methods suggested:

Code: Select all

moveq #16,d1               ==  4
asr.l     d1,d0   << 8+2n  == 40   (n = 16)
                           == 44 cycles

swap d0                    ==  4
ext.l d0                   ==  4
                           ==  8 cycles

Both should give you a sign-extended 32-bit result.

If you do not need the top 16 bits, then just do a swap.
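
The cycle arithmetic for the variants discussed in this thread can be written down as a tiny C sketch. The per-instruction figures are the plain 68000 register timings as I know them (moveq, swap, ext.l and clr.w on a data register at 4 cycles each, long shifts at 8 + 2n); bus wait states and anything ST-specific are ignored, and the function names are made up for illustration.

```c
#include <assert.h>

/* 68000 timings for register operands (assumed; no wait states). */
enum { CYC_MOVEQ = 4, CYC_SWAP = 4, CYC_EXT_L = 4, CYC_CLR_W = 4 };

/* Long shift on a data register: 8 + 2n cycles for a shift by n bits. */
static unsigned shift_long_cycles(unsigned n)
{
    return 8u + 2u * n;
}

/* moveq #16,d1 / asr.l d1,d0 */
static unsigned cycles_moveq_shift(void)
{
    return CYC_MOVEQ + shift_long_cycles(16u);
}

/* lsr.l #8,d0 / lsr.l #8,d0 */
static unsigned cycles_two_shift8(void)
{
    return 2u * shift_long_cycles(8u);
}

/* swap d0 / ext.l d0 */
static unsigned cycles_swap_ext(void)
{
    return CYC_SWAP + CYC_EXT_L;
}

/* clr.w d0 / swap d0 */
static unsigned cycles_clr_swap(void)
{
    return CYC_CLR_W + CYC_SWAP;
}

/* Small self-check of the arithmetic. */
void cycles_self_check(void)
{
    assert(cycles_moveq_shift() == 44u);
    assert(cycles_two_shift8() == 48u);
    assert(cycles_swap_ext() == 8u);
    assert(cycles_clr_swap() == 8u);
}
```

Under these assumptions both swap-based sequences cost the same, and either is more than five times cheaper than a real 16-bit shift.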


Postby AtariZoll » Wed Nov 13, 2013 12:03 am

ext.l d0 is not OK: if bit 15 is 1, it will set bits 16-31 to 1.
clr.w d0; swap d0 takes the same time. Whether clearing bits 16-31 is necessary depends on the code that follows.

Postby simonsunnyboy » Wed Nov 13, 2013 4:17 pm

I will not need the ext instruction according to my M68k Programmer's Reference Manual.

It states on p.98: "Extend the sign bit of a data register from a byte to a word or from a word to long operand [....]"

That is the opposite direction of what I want to do. I have a long and want to strip it down to a word in M68K speak.

Postby simonsunnyboy » Wed Nov 13, 2013 5:22 pm

Seems the clr/swap combination is what I am looking for. I tried the following short code in AHCC and it gives 0 fails, so the m68k routine should be equivalent to the C implementation:

Code: Select all

#include <stdint.h>
#include <stdio.h>
#include <tos.h>

int16_t my_cast(int32_t in);
static int16_t internal_cast(int32_t v);

/* C reference: shift down by 16, then keep the low 16 bits */
static int16_t internal_cast(int32_t v)
{
   return (int16_t)(v >> 16);
}

void __asm__ my_asm(void)
{
   EXPORT my_cast
   
my_cast:
   clr.w d0
   swap d0
   rts
}

int main(int argc, char **argv)
{
   int16_t in, res_c, res_asm;
   int32_t i, tst;
   int fail = 0;
   
   /* 32-bit loop counter so all 65536 values are covered without overflowing int16_t */
   for(i = -32768L; i <= 32767L; i++)
   {
      in = (int16_t)i;
      tst = (int32_t)((uint32_t)in << (uint32_t)16); /* generate q16.16 notation for given integer */
      
      res_c = internal_cast(tst);
      res_asm = my_cast(tst);
      
      if(res_c != res_asm)
      {
         /* explicit casts: %x expects an unsigned int, %lx an unsigned long */
         printf("Loop: %04x  Input: %08lx  Internal: %04x  Asm: %04x\n",
                (unsigned)(uint16_t)in, (unsigned long)tst,
                (unsigned)(uint16_t)res_c, (unsigned)(uint16_t)res_asm);
         fail++;
      }
#if 0      
      else
      {
         printf("%d = %d\n",res_c, res_asm);
      }
#endif      

   }
   printf(" -- fails: %d \n", fail);
   Bconin(2);
   
   return 0;
}


...and I learned the AHCC way of writing inline code as a side effect. The printf() seemed to have some endianness problem in the display; most likely that was the %x format being handed a 32-bit argument when it expects a plain unsigned int, hence the explicit casts and %08lx above. But this is off-topic here.

Postby AtariZoll » Wed Nov 13, 2013 8:18 pm

Ext just copies bit 15 to all of bits 16-31. If your data in bits 0-15 is lower than 32768, it is OK for the purpose, since bits 16-31 will all be 0. Ext is used a lot in ASM,
but it needs careful usage. You cannot strip a register, it is always 32 bits :D If you want to save the result to memory, you just use
move.w d0,someAddress
and there is no need for anything except swap. Only the lower 16 bits will be saved.
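
As a rough C picture of that last point (a sketch only, with invented function names and a 32-bit unsigned value standing in for d0), a word-sized store after swap simply ignores whatever is left in the upper half of the register:

```c
#include <assert.h>
#include <stdint.h>

/* Model of: swap d0 / move.w d0,someAddress */
static void store_integer_part(uint32_t d0, uint16_t *dest)
{
    d0 = (d0 << 16) | (d0 >> 16);      /* swap d0: the old fraction is now in bits 16-31 */
    *dest = (uint16_t)(d0 & 0xFFFFul); /* move.w writes the lower word only */
}

/* Convenience wrapper for illustration: return the word that would be stored. */
static uint16_t stored_word(uint32_t d0)
{
    uint16_t w = 0;
    store_integer_part(d0, &w);
    return w;
}

/* Small self-check of the model. */
void store_self_check(void)
{
    assert(stored_word(0xFFFE8000ul) == 0xFFFEu); /* bit pattern of -2 as a word */
    assert(stored_word(0x00038000ul) == 0x0003u);
}
```

So when the result goes straight to a 16-bit memory location, neither clr.w nor ext.l changes what gets stored.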