GCC on ATARI - how much faster could it be?

Post by ThorstenOtto »

I haven't compiled it yet for anything but Linux. But you can find the branch at https://github.com/th-otto/m68k-atari-m ... int/gcc-13

Update: I've now updated my scripts for macos, and all compilers are freshly recompiled (but only gcc 4.6.4, gcc-7.5.0 and gcc-13.2.0 have fastcall support so far). Note that these scripts run on GitHub runners and the binaries are built for macos-11 and above (universal x86_64/M1). Binaries can be found at https://tho-otto.de/snapshots/crossmint/macos/

For older macos versions, compilation is currently going on ;)

These were compiled on macos 10.13, and should work on 10.9 and above:
gcc-13.2.0-mint-20230908-bin-macos.tar.xz
gcc-13.2.0-mintelf-20230908-bin-macos.tar.xz

Post by PeyloW »

Lovely, thank you. Now I have yet another good reason to experiment with my dev setup instead of completing the project :D.

Post by medmed »

Hi,

Is there some sort of benchmark for -mfastcall? I suppose that, to benefit from it, we have to recompile all the libraries?

Thank you very much
M.Medour - 1040STF, Mega STE + Spektrum card, Milan 040 + S3Video + ES1371.

Post by ThorstenOtto »

IIRC @mfro did some benchmarks some time ago with gcc 4.6.4. And yes, all code (including all libraries) should be recompiled. Alternatively, if you have libraries you can't or don't want to recompile, you can declare the exported functions in their headers with __attribute__((cdecl)), but that is even more work. Also, that attribute is only valid for compilers that support fastcall.
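
The header approach could look roughly like the sketch below. Only `__attribute__((cdecl))` comes from the post; `LEGACY_ABI`, `legacy_sum`, `call_legacy_demo` and the `__FASTCALL__` feature-test macro are assumptions made for illustration, so check what your compiler actually predefines before relying on that name.

```c
#include <assert.h>

/* Declarations for a library that was built WITHOUT -mfastcall.
 * Assumption: a fastcall-capable compiler predefines __FASTCALL__ when
 * -mfastcall is active. Verify the macro name for your toolchain. */
#if defined(__FASTCALL__)
#  define LEGACY_ABI __attribute__((cdecl)) /* keep stack-based argument passing */
#else
#  define LEGACY_ABI                        /* attribute not needed here */
#endif

/* Hypothetical exported function of the prebuilt library. */
LEGACY_ABI int legacy_sum(int a, int b);

/* Stand-in definition so the sketch links on its own; in practice this
 * code would live inside the prebuilt library. */
LEGACY_ABI int legacy_sum(int a, int b)
{
    return a + b;
}

/* A caller that may itself be compiled with -mfastcall. */
int call_legacy_demo(void)
{
    int r = legacy_sum(40, 2);
    assert(r == 42); /* sanity check on the stand-in */
    return r;
}
```

Guarding the attribute behind a macro keeps the same header usable with compilers that lack fastcall support, where the attribute is not valid.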

Post by medmed »

Thanks Thorsten. That is a huge amount of work again :)

Post by medmed »

Just to let you know that I've run into some strange behavior:

I built my code on my OSX x64 setup with m68k-atari-mint-gcc 9.1, and the applications were fine.
Then I copied the same libraries to my OSX ARM environment, and that is where the problems started:
- It has m68k-atari-mint-gcc 12.x. The build went through without any problems, but when I tried to open a file that uses libav, the application got stuck!
-> The application hangs in avformat_alloc_context();
- I installed m68k-atari-mint-gcc 13.2 -> same behavior
- I installed m68k-atari-mint-gcc 9.5 -> everything is fine

So it looks like there are at least some issues with m68k-atari-mint-gcc >= 12. I haven't tried m68k-atari-mint-gcc 10.x/11.x. All of these tests were done with m68k-atari-mint-gcc on OSX/ARM.

EDIT: I'll run more tests to be sure. Maybe it's a PEBKAC again :)
Last edited by medmed on Sun May 12, 2024 1:57 am, edited 1 time in total.

Post by mfro »

ThorstenOtto wrote: Sat May 11, 2024 1:22 pm IIRC @mfro did some benchmarks some time ago with gcc 4.6.4...
That wasn't me. @peylow was the one who provided the benchmark numbers.

Post by ThorstenOtto »

medmed wrote: Sat May 11, 2024 5:26 pm So it looks like there are at least some issues with m68k-atari-mint-gcc >= 12
Sorry, but I don't think I can help much there. The only more or less working OSX environment I have is macos 10.13 running in VirtualBox, and that does not support ARM.

The first thing I would try is to use the same compiler on both the x64 and ARM machines, and see if they produce something different.

Post by medmed »

Yes, I will. Thanks. I think I'll repeat the whole process first, just to be sure I haven't forgotten to copy some files from the gcc tarball or whatever else...

Post by PeyloW »

mfro wrote: Sat May 11, 2024 5:57 pm
ThorstenOtto wrote: Sat May 11, 2024 1:22 pm IIRC @mfro did some benchmarks some time ago with gcc 4.6.4...
That wasn't me. @peylow was the one who provided the benchmark numbers.
I think this refers to the very rudimentary benchmarking bundled in libcmini: https://github.com/freemint/libcmini/tr ... ests/bench

I'd take the numbers with a grain of salt. Inlining will always be faster, but when you cannot inline, or the compiler chooses not to, fastcalls will be a tiny bit faster. The question is how much of a bottleneck function call overhead is for your use-case, compared with what the individual functions do.

For my current project, where I have written a C++ stdlib-streams-like system with many virtual functions, it cuts loading/decoding of assets from about 75s to 65s. I cannot claim this is a normal use-case.
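
To get a rough feel for how much call overhead matters in a given workload, a small harness along the following lines can help. This is only a sketch and not the libcmini benchmark: all names (`get_called`, `get_inlined`, `sum_called`, `sum_inlined`, `time_sum`, `sample`) are made up for illustration, and `clock()` resolution on the target may be too coarse for small repeat counts.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Tiny accessor forced out of line: every use pays the call overhead. */
__attribute__((noinline)) static unsigned get_called(const unsigned *p)
{
    return *p;
}

/* Same accessor, but the compiler is free to inline it. */
static inline unsigned get_inlined(const unsigned *p)
{
    return *p;
}

unsigned sum_called(const unsigned *data, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += get_called(&data[i]);
    return sum;
}

unsigned sum_inlined(const unsigned *data, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += get_inlined(&data[i]);
    return sum;
}

/* CPU seconds needed to run fn over the data `reps` times. */
double time_sum(unsigned (*fn)(const unsigned *, size_t),
                const unsigned *data, size_t n, unsigned reps)
{
    volatile unsigned sink = 0; /* keeps the calls from being optimized away */
    assert(fn != NULL);
    clock_t start = clock();
    for (unsigned r = 0; r < reps; r++)
        sink += fn(data, n);
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Small sample input for trying the functions out. */
const unsigned sample[4] = { 1, 2, 3, 4 };
```

Comparing `time_sum(sum_called, ...)` against `time_sum(sum_inlined, ...)`, once with and once without -mfastcall, gives an upper bound on what the calling convention can buy for very small functions.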

Post by mikro »

The preliminary results Thorsten and I have obtained are more than promising -- for instance, my ScummVM build was reduced by nearly 2 MB (!) in size (out of ~24 MB). That alone is a fantastic improvement.

Post by Eero Tamminen »

I would expect SCI engine perf to be impacted the most, as it calls very small functions / methods so often that they show up high in perf profiles.

As the SCI engine (along with the SCUMM engine) covers the most games, that could have a noticeable impact.

Post by PeyloW »

mikro wrote: Wed May 22, 2024 8:22 pm The preliminary results Thorsten and I have obtained are more than promising -- for instance, my ScummVM build was reduced by nearly 2 MB (!) in size (out of ~24 MB). That alone is a fantastic improvement.
~8% binary size, that is pretty awesome!

One change I also made, which works for some arguments even when they are not passed in registers, is that 8-bit and 16-bit values are no longer extended and passed as 32 bits on the stack; they are passed as 16 bits. That is good if you pass around bools, or if you use -mshort.

Post by ThorstenOtto »

Oh, I must have missed that one. Can you tell me which change is responsible for that? But that would be plain wrong, since it would interfere with assembler code that expects 32-bit values on the stack when compiled without -mshort.

BTW, I just recently identified another problem: when calling nested functions, you use register A2 as the chain register (that is, a register used to tell the called function where the local variables of the calling function are). But A2 is a callee-saved register and must not be clobbered by a call. A0 and A1 cannot be used either, since they might be used for arguments. Although nested functions are rarely used, that will crash mintlib: https://github.com/freemint/mintlib/blo ... _fp.c#L200. For that reason I currently disallow nested functions with -mfastcall; maybe there are better ways to solve this.
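
For readers who have not met the static chain before, the sketch below shows what it carries. It is not the mintlib code: `struct frame`, `add_base_explicit`, `outer_explicit` and `outer_nested` are hypothetical names. The first variant passes the pointer to the caller's locals as an ordinary argument; the GNU C nested function does the same thing implicitly, and that hidden pointer is what travels in the chain register.

```c
#include <assert.h>

/* Explicit version: the "chain" is an ordinary pointer argument. */
struct frame { int base; };

static int add_base_explicit(const struct frame *chain, int x)
{
    assert(chain != 0);
    return chain->base + x;
}

int outer_explicit(int base, int x)
{
    struct frame f = { base };
    return add_base_explicit(&f, x);
}

#if defined(__GNUC__) && !defined(__clang__)
/* GNU C nested function: add_base() reads `base` from the enclosing
 * function. The compiler hands it a hidden pointer to outer_nested()'s
 * locals, and that pointer is passed in the static chain register. */
int outer_nested(int base, int x)
{
    int add_base(int v) { return base + v; }
    return add_base(x);
}
#else
/* Compilers without nested functions: fall back to the explicit form. */
int outer_nested(int base, int x)
{
    return outer_explicit(base, x);
}
#endif
```

Because the hidden pointer has to survive until the nested function starts running, the register chosen for it must not collide with argument registers or with registers the caller expects to be preserved.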

Post by PeyloW »

It's done with this define in m68k.h:

Code:

#define PARM_BOUNDARY ((TARGET_SHORT || (TARGET_FASTCALL && TUNE_68000_10)) ? 16 : 32)
I think the original only checks TARGET_SHORT, but I made it also check for fastcall on a plain 68000 target, which has a 16-bit bus. I guess this is a lie for the Falcon too ;).

As for the chain register, it is in the same file:

Code:

#define STATIC_CHAIN_REGNUM (TARGET_FASTCALL ? A2_REG : A0_REG)
#define M68K_STATIC_CHAIN_REG_NAME (TARGET_FASTCALL ? REGISTER_PREFIX "a2" : REGISTER_PREFIX "a0")
I guess the more proper fix here is to not use a register for the chain, but instead define a plain STATIC_CHAIN and pass it on the stack.

With nested functions being non-standard, and C++ lambdas being the better option for my own use-case, I never tried the gcc nested functions myself.

Post by ThorstenOtto »

Your setting does not seem to work:

Code:

void test2(int, int, int, short, short);

short x, y;
void test(int n)
{
	test2(1, 2, 3, x, y);
}
Produces:

Code:

_test:
        move.w _y,%a0
        move.l %a0,-(%sp)
        move.w _x,%a0
        move.l %a0,-(%sp)
        moveq #3,%d2
        moveq #2,%d1
        moveq #1,%d0
        jsr _test2
        addq.l #8,%sp
        rts
So that still pushes a 32-bit value for each of the last two (short) parameters.
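
The stack usage at stake can be worked out with a small model. The sketch below is not GCC's implementation; `slot_bytes`, `stack_bytes` and `short_args` are hypothetical helpers that simply round every stack argument up to the parameter boundary (16 or 32 bits, the two values PARM_BOUNDARY can take).

```c
#include <assert.h>
#include <stddef.h>

/* Bytes one stack argument occupies when slots are rounded up to the
 * given parameter boundary (in bits). */
size_t slot_bytes(size_t arg_bytes, unsigned boundary_bits)
{
    assert(boundary_bits == 16 || boundary_bits == 32);
    size_t unit = boundary_bits / 8;
    return (arg_bytes + unit - 1) / unit * unit;
}

/* Total bytes pushed for a list of stack arguments. */
size_t stack_bytes(const size_t *arg_bytes, size_t count, unsigned boundary_bits)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += slot_bytes(arg_bytes[i], boundary_bits);
    return total;
}

/* The two short arguments of test2() that end up on the stack
 * (a short is 2 bytes on m68k). */
const size_t short_args[2] = { 2, 2 };
```

With a 32-bit boundary the model gives 8 bytes for the two shorts, which matches the `addq.l #8,%sp` in the listing; with a 16-bit boundary it would give 4 bytes.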

And yes, nested functions are rarely used (I was a bit baffled that one was used in mintlib, but that code originally came from gmp). But the same mechanism might also be used for other things (C++ lambdas might be such a case, I haven't checked that), so it should not produce invalid code.

Post by mikro »

mikro wrote: Mon Apr 29, 2024 4:18 pm Now that I have the whole "Thorsten Otto's collection" ;) (mintlib, gemlib, fdlibm, g++ and his original binutils) I can confirm that ScummVM builds and links with -mfastcall added to CXXFLAGS. No other libraries are needed.

I won't have time to look at it this week, but if you can't wait, Thorsten, just make sure you run "./backends/platform/atari/build-release030.sh" and publish that ZIP for Eero. Then he will hopefully tell us whether this whole effort wasn't totally pointless. ;)
Oh look, it took me only six weeks. In the end, Thorsten did everything right except for one thing... he forgot to add "-mfastcall" to ASFLAGS, too. ;)

Anyway, I have managed to prepare two otherwise identical builds, with and without -mfastcall support. Eero, when/if you find some time, try to profile the sh*t out of them, i.e. pick a few favourites of yours (also e.g. the UI), run them in both versions and please do tell us that this whole effort has brought something more than an executable that is ~2 MB smaller. :)

https://mikro.naprvyraz.sk/private/scum ... stcall.zip
https://mikro.naprvyraz.sk/private/scum ... i-full.zip

Post by ThorstenOtto »

mikro wrote: Sun Jun 09, 2024 9:50 pm forgot to add "-mfastcall" to ASFLAGS, too. ;)
Oh. That is probably because I'm used to using $(CFLAGS) everywhere.
mikro wrote: do tell us that this whole effort has brought something more than an executable that is ~2 MB smaller. :)
If nothing else, that is already a step forward ;)

Using -flto can also help, but I don't know how well that works when linking such a large executable.

Post by mikro »

Unfortunately, I didn't have good results with -flto, IIRC the executable was even slightly larger. But maybe I used it wrong.

Post by chicane »

Following this thread with interest as I make heavy use of GCC :D

Sorry if I've missed some context here, but my understanding is that -flto (link time optimisation) is more about improving performance than reducing code size. It's certainly helped me in the past from a performance standpoint.

Post by ThorstenOtto »

Typically, a code size reduction is a side effect of using -flto. Essentially, it makes the linker invoke the compiler again, asking it to compile the whole program as if it were a single source file. That makes new optimizations possible, like inlining functions from different modules. IIRC, experiments in EmuTOS reduced the code size by ~10k for the 192k ROMs. But it has the bad side effect that linking can take extremely long and use lots of memory (several GB).

Post by Eero Tamminen »

mikro wrote: Sun Jun 09, 2024 9:50 pm Anyway, I have managed to prepare two otherwise identical builds, with and without -mfastcall support. Eero, when/if you find some time, try to profile the sh*t out of them, i.e. pick a few favourites of yours (also e.g. the UI), run them in both versions and please do tell us that this whole effort has brought something more than an executable that is ~2 MB smaller. :)

https://mikro.naprvyraz.sk/private/scum ... stcall.zip
https://mikro.naprvyraz.sk/private/scum ... i-full.zip
I haven't forgotten about this, I just haven't had time for it yet. Hopefully later this week. Info will go to the ScummVM thread.

Post by medmed »

Hi,

Just wanted to know whether it would make sense to build FreeMiNT / fVDI / XaAES with -mfastcall?
Or maybe that's already the case?
Or is it a bad idea?

Post by ThorstenOtto »

Bad idea, because they use lots of assembler interfaces, and those would have to be adjusted.

Post by medmed »

Ah ok - I understand. Many thanks :)