fastest way to copy a memory block in 68060 assembly

Post by rmd »

Hello,
I'm looking for a code snippet of the fastest way to copy a 64KB block in 68060 assembly. (when source and destination don't overlap)
thanks!

Post by swapd0 »

something like this.

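rept 64*1024/(13*4)          ; 1260 passes of 52 bytes; the last 16 bytes need a separate copy
movem.l (a0),d0-d7/a2-a6     ; load 13 longwords from the source
movem.l d0-d7/a2-a6,(a1)     ; store them to the destination
lea 13*4(a0),a0              ; advance the source by 52 bytes
lea 13*4(a1),a1              ; advance the destination by 52 bytes
endr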
User avatar
rmd
Atari maniac
Atari maniac
Posts: 92
Joined: Fri Dec 15, 2017 11:30 am
Location: Berlin, Zermany
Contact:

Re: fastest way to copy a memory block in 68060 assembly

Post by rmd »

:cheers:
I found a version here too: https://chromium.googlesource.com/nativ ... k/memcpy.S

Post by JeanMars »

Hi,

How does the rept(movem.l ...) based routine compare, in terms of clock cycles, to the rept( move.l (a0)+,(a1)+ ...) one?
Estimating CPU cycles is a bit too far back in my memory, but just out of curiosity, if someone has already done the maths I'd appreciate it.

Thanks,
Jean

Post by JeanMars »

OK, found some 68k clock cycles here:
http://oldwww.nvg.ntnu.no/amiga/MC680x0 ... mmove.HTML
http://oldwww.nvg.ntnu.no/amiga/MC680x0 ... mpetc.HTML

It's been a long time for me, but I would say:
move.l (a0)+,(a1)+: 20n
movem.l pair: (12+8n) to load + (8+8n) to store = 20+16n
n being the number of long words moved.
So for n = 13 (the number of registers in the movem.l method): 228 cycles
For the move.l method: 260 cycles

But it's been a while, so I'm not sure about the maths here.

Post by chlu600 »

I've never done anything with the 68060 CPU. But why not use the instruction cache?
A small loop will easily fit into the cache, and after that it just reads and writes.

Perhaps I’ve missed something?

Post by JeanMars »

Yep, sure. Considering the amount of memory to move, you need a loop counter anyway, otherwise the code size would be too large.
I don't know how big the 68060 instruction cache is (BTW why 68060?), but the idea is to get as close as possible to the cache size and loop over n rept'ed movem.l pairs.

Post by thomas3 »

swapd0 wrote:something like this.

rept 64*1024/(13*4)
movem.l (a0),d0-d7/a2-a6
movem.l d0-d7/a2-a6,(a1)
lea 13*4(a0),a0
lea 13*4(a1),a1
endr
You can lose all the a0 leas and almost all the a1 leas by using (a0)+ and then addressing to d(a1).
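
A minimal sketch of this suggestion, assuming the same register list as swapd0's loop and a group of three blocks chosen only for illustration. movem.l allows postincrement only when loading from memory, so the read side advances a0 by itself, while the write side uses growing displacements from a1 and needs just one lea per group:

```asm
        ; illustrative only: three 52-byte blocks per group
        movem.l (a0)+,d0-d7/a2-a6       ; read 13 longwords, a0 advances by 52
        movem.l d0-d7/a2-a6,(a1)        ; write at a1+0
        movem.l (a0)+,d0-d7/a2-a6
        movem.l d0-d7/a2-a6,52(a1)      ; write at a1+52
        movem.l (a0)+,d0-d7/a2-a6
        movem.l d0-d7/a2-a6,104(a1)     ; write at a1+104
        lea     156(a1),a1              ; one lea per group instead of one per block
```

The displacement in d(a1) is a signed 16-bit value, so a group can cover up to 32767 bytes before a1 has to be adjusted.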

Post by rmd »

JeanMars wrote:(BTW why 68060?)
Because Silly Venture :wink:

Post by JeanMars »

Hi,
thomas3 wrote: You can lose all the a0 leas and almost all the a1 leas by using (a0)+ and then addressing to d(a1).
? Don't get it.
BTW, in my cycle calculation I forgot to include these lea instructions, so it's 228+8+8=244 (movem.l) vs 260 (move.l).
Well, not that much better, assuming I'm correct on the cycles, which is pretty risky considering how long it has been since I did this kind of thing :-)

Also, these timings are for the 68000, not the 68060, and I don't know if the addresses are even aligned.

Post by rmd »

JeanMars wrote: Also here it's 68000, not 68060, and I don't know if the addresses are even aligned.
Yes, the addresses will be aligned.

Post by OL »

Hello

You should be able to go faster: in theory the 68060 can execute two instructions at the same time, but I think the issue is linked to memory access with move.l. On the CT60 the limit looks to be about 50 MB/s at 100 MHz; on the V4 it is more than 100 MB/s.

Olivier

Post by evil »

Hello rmd,

In my experience move16 is the fastest on the 060. It requires the source and destination buffers to be aligned to 16 bytes, but apart from that it is very straightforward.

So I did a little test to see how it stacks up against movem.l and movem.l with scrambled source data.
Each test copies 8k of data per loop iteration. This isn't completely optimal for movem.l, so it can do a little better than shown here.
Also, unlike on the 68000, we have a cache and don't want to rept everything, as it won't fit.
Hopefully I didn't make some fatal mistake :) Just lower the loop count to #8-1 for a 64k copy.

First try, movem.l

;		movem.l linear copy about 488kbyte / 60Hz VBL, 66 MHz CPU
copy_movem:	
		move.l	buf1addr,a5
		move.l	buf2addr,a6
		
		moveq	#61-1,d0
.loop:
		movem.l	(a5)+,d1-a4		;d1-d7/a0-a4 = 12 registers = 48 bytes
		movem.l	d1-a4,(a6)

.q:		set	48
		rept	169
		movem.l	(a5)+,d1-a4		;169*48 = 8112 bytes
		movem.l	d1-a4,.q(a6)
.q:		set	.q+48
		endr

		movem.l	(a5)+,d1-d7/a0		;32 bytes = 8k per loop
		movem.l	d1-d7/a0,.q(a6)

		lea	8192(a6),a6	
		
		dbra	d0,.loop
Second try, movem.l with the source data scrambled backwards, so we can use (an)+ for source and -(an) for destination.

;		movem.l scrambled copy about 496kbyte / 60Hz VBL, 66 MHz CPU
copy_movem_scrambled:
		move.l	buf1addr,a5
		move.l	buf2addr,a6
		add.l	#1024*496,a6
		
		moveq	#62-1,d0
.loop:
		rept	170
		movem.l	(a5)+,d1-a4		;12 registers, 170*48 = 8160 bytes
		movem.l	d1-a4,-(a6)
		endr
		
		movem.l	(a5)+,d1-d7/a0		;32 bytes = 8k per loop
		movem.l	d1-d7/a0,-(a6)

		dbra	d0,.loop
And finally, the nice and clean move16

;		move16 about 520kbyte/ 60Hz VBL, 66 MHz CPU
copy_16:	
		move.l	buf1addr,a0
		move.l	buf2addr,a1

		moveq	#65-1,d0
.loop:
		rept	512
		move16	(a0)+,(a1)+		;8k per loop
		endr
		
		dbra	d0,.loop
It is worth noting that the test program didn't shut off all of the OS, so the OS Timer C and VBL handlers were still running. Being naughtier and switching them off will gain some numbers for each method.
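
Following the remark about lowering the loop count, a 64 KB version of the move16 routine might look like the sketch below. It keeps the assumptions of the test code: buf1addr and buf2addr hold 16-byte-aligned addresses. The label copy_16_64k and the trailing rts are additions for illustration only.

```asm
;		hypothetical 64 KB variant of copy_16 (label and rts are illustrative)
copy_16_64k:
		move.l	buf1addr,a0		;source, assumed 16-byte aligned
		move.l	buf2addr,a1		;destination, assumed 16-byte aligned

		moveq	#8-1,d0			;8 passes of 8k = 64k
.loop:
		rept	512
		move16	(a0)+,(a1)+		;16 bytes per instruction, 8k per loop
		endr

		dbra	d0,.loop
		rts				;d0, a0 and a1 are modified on return
```

Each move16 (a0)+,(a1)+ is 4 bytes long, so the unrolled body takes 512*4 = 2048 bytes, which fits in the 8 KB instruction cache of the 68060.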

Post by leonard »

I didn't know anything about the 68060 (except that movep does not exist anymore :)). I didn't even know the "move16" instruction existed. Sounds very nice!
Leonard/OXYGENE.

Post by rmd »

Amazing! So if I want to inline the move16 routine, which registers are clobbered: "d0", "a0", "a1"?

Post by evil »

rmd wrote:
Amazing! So if I want to inline the move16 routine, which registers are clobbered: "d0", "a0", "a1"?
In this case yes, but you can easily change that around.

Post by rmd »

evil wrote:
In this case yes, but you can easily change that around.
thanks!

Post by thomas3 »

evil wrote: Second try, movem.l with the source data scrambled backwards, so we can use (an)+ for source and -(an) for destination.
The most obvious gfx optimisation I never thought of, part #4822 of an eternally ongoing series.......

Post by tommo »

Do you want to copy a screen from Fast RAM to ST-RAM on a CT60 in GFA?

The Fast RAM of the CT60 is about 10x faster than ST-RAM 8)
With a 66 MHz CPU and LONG alignment I can read or write about 50 MB/s in Fast RAM
using "movem.l", with the TOS interrupts still running normally.

The GFA 3.6 TT interpreter does not like the 060's multiple caches being on!
I have not tried it after compilation.

If you are interested in the inner workings, find a "68060 um pdf";
the instruction timing tables are in there too.