ao486

https://github.com/MiSTer-devel/Main_MiSTer/wiki

olin
Obsessive compulsive Atari behavior
Posts: 106
Joined: Tue Nov 21, 2017 8:57 pm

Re: ao486 Performance Technical Discussion

Post by olin »

calvinmorrow wrote: Memory Read is a MOV (%ebx), %eax...
I might be completely wrong on this, but to me it looks like a memory write: the first parameter is the destination (the memory address stored in register ebx) and the second parameter is the source, the value of register eax. My knowledge of asm is rusty, so please correct me if I'm wrong. It would also make more sense to me that reading takes fewer cycles than writing.
calvinmorrow
Atariator
Posts: 17
Joined: Thu Oct 31, 2019 6:17 pm

Re: ao486 Performance Technical Discussion

Post by calvinmorrow »

In the (more common?) Intel syntax that would be correct, but apparently GCC uses AT&T by default.

From: https://www.ibiblio.org/gferg/ldp/GCC-I ... HOWTO.html
GCC uses AT&T asm syntax. This is a little bit different from the regular
Intel format. The main differences are:

* AT&T syntax uses the opposite order for source and destination operands,
source followed by destination.
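
To make the operand-order point concrete, here is a minimal sketch using GCC extended inline assembly in AT&T syntax. The names `load_word` and `store_word` are made up for illustration and are not taken from the linked gist, and the plain-C fallback for non-x86 targets is an assumption added only so the snippet compiles elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Memory READ: AT&T order is "source, destination", so the memory
 * operand (%1) comes first and the register (%0) receives the value. */
static uint32_t load_word(const uint32_t *addr)
{
    assert(addr != NULL);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t value;
    __asm__ volatile("movl (%1), %0" : "=r"(value) : "r"(addr) : "memory");
    return value;
#else
    return *(const volatile uint32_t *)addr; /* assumed portable fallback */
#endif
}

/* Memory WRITE: the register (%1) is the source and comes first,
 * the memory operand (%0) is the destination and comes second. */
static void store_word(uint32_t *addr, uint32_t value)
{
    assert(addr != NULL);
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    __asm__ volatile("movl %1, (%0)" : : "r"(addr), "r"(value) : "memory");
#else
    *(volatile uint32_t *)addr = value; /* assumed portable fallback */
#endif
}
```

In Intel syntax the same two instructions would be written `mov eax, [ebx]` (read) and `mov [ebx], eax` (write), which is where the confusion above comes from.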
olin
Obsessive compulsive Atari behavior
Posts: 106
Joined: Tue Nov 21, 2017 8:57 pm

Re: ao486 Performance Technical Discussion

Post by olin »

calvinmorrow wrote:.. but apparently gcc uses AT&T by default..
Thanks, was not aware of that.
calvinmorrow
Atariator
Posts: 17
Joined: Thu Oct 31, 2019 6:17 pm

Re: ao486 Performance Technical Discussion

Post by calvinmorrow »

I posted the C code and assembly I'm using here: https://gist.github.com/calvinmorrow/90 ... dd9af1ca64

After getting the code back to my regular Linux machine, I recompiled it and ran it for a quick sanity check. There are some minor differences: no -march=i486 since I'm on 64-bit, the operation count is increased 10x, and I had to disable the leal ... wrong register error or something on x86-64.

The main reason for running locally vs ao486 was to double check that my code produced the result I would have expected, that memory reads would be slightly faster than memory writes. On my machine that seems to be the case, which only makes the ao486 result stand out more.

gcc -o bench -O2 -funroll-loops bench.c 
./bench

1000000000 NOP Operations in 88 Milliseconds
1000000000 MEMREAD Operations in 131 Milliseconds
1000000000 MEMWRITE Operations in 259 Milliseconds
1000000000 ADD Operations in 257 Milliseconds
1000000000 SUBTRACT Operations in 252 Milliseconds
JimDrew
Atari Super Hero
Posts: 865
Joined: Mon Nov 04, 2013 5:23 pm

Re: ao486 Performance Technical Discussion

Post by JimDrew »

Wow.. that's a weird assembler.

What are you using for your timer when you are doing your tests? I have a LOT of experience with the PC and assembly code. I wrote FUSION-PC, which is a Mac emulator for the PC. It was 1.7 million lines of assembly code. I also wrote PCx, which was the first Intel Pentium based PC emulator for the Amiga and that was 1.6 million lines of 68K assembly. I would be happy to help with speed improvements. I found that making things native (like the video and video BIOS) made a huge improvement to the overall speed of my emulation.
I am the flux ninja
calvinmorrow
Atariator
Posts: 17
Joined: Thu Oct 31, 2019 6:17 pm

Re: ao486 Performance Technical Discussion

Post by calvinmorrow »

My replies are a bit slow atm because most of them are getting stuck awaiting moderator approval (new account and links probably). I posted the code, url inbound shortly. Would certainly welcome the help and any improvements!
calvinmorrow
Atariator
Posts: 17
Joined: Thu Oct 31, 2019 6:17 pm

Re: ao486 Performance Technical Discussion

Post by calvinmorrow »

JimDrew wrote:What are you using for your timer when you are doing your tests?
Originally I started down the route of trying to use the RDTSC instruction so I could get an accurate cycle count while performing the operational loop. I actually had a working (on my PC) implementation, only to find out that the instruction wasn't added until the 586.

At the moment I'm using the clock() function from C's time.h and trying to have a large enough opcode loop to minimize the impact of the lack of clock/timer accuracy. My brief search didn't turn up many good timer options on the 486 architecture.
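
For anyone who wants to reproduce the approach, this is a rough sketch of timing a loop with `clock()`. It is not the code from the gist; `time_loop_ms` and `nominal_tick_ms` are hypothetical helper names, and the loop body is just a placeholder for the operation under test.

```c
#include <assert.h>
#include <time.h>

/* Run a placeholder loop `iterations` times and return the elapsed
 * CPU time in milliseconds as reported by clock(). */
static double time_loop_ms(unsigned long iterations)
{
    volatile unsigned long sink = 0; /* volatile keeps the loop from being removed */
    unsigned long i;
    clock_t start, stop;

    assert(iterations > 0);

    start = clock();
    for (i = 0; i < iterations; i++)
        sink += i; /* stand-in for the operation being measured */
    stop = clock();

    return (double)(stop - start) * 1000.0 / CLOCKS_PER_SEC;
}

/* Nominal length of one clock() unit in milliseconds. */
static double nominal_tick_ms(void)
{
    return 1000.0 / CLOCKS_PER_SEC;
}
```

Note that `nominal_tick_ms()` is only the unit `clock()` reports in; how often the value actually updates depends on the platform and can be much coarser, which is why the loop count has to be large enough that the total run time dwarfs a single tick.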
Sorgelig
Ultimate Atarian
Posts: 6348
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: ao486 Performance Technical Discussion

Post by Sorgelig »

Besides the performance, I think something is wrong with either interrupt disabling or, specifically, keyboard controller disabling.
If you press any key while the PC is booting and a RAM expander (QEMM, for example) is loading, it will most likely crash. DOS4GW apps are also affected by this issue. It only happens during the QEMM loading procedure (and when DOS4GW loads or unloads).
calvinmorrow
Atariator
Posts: 17
Joined: Thu Oct 31, 2019 6:17 pm

Re: ao486 Performance Technical Discussion

Post by calvinmorrow »

Sorgelig wrote:Besides the performance, I think something is wrong with either interrupt disabling or, specifically, keyboard controller disabling.
If you press any key while the PC is booting and a RAM expander (QEMM, for example) is loading, it will most likely crash. DOS4GW apps are also affected by this issue. It only happens during the QEMM loading procedure (and when DOS4GW loads or unloads).
Possibly related, but a lot of the development tools provided with FreeDOS hang under DOS (also tried DOS 6.22) on ao486. Trying to run GCC, NASM, MASM, or a handful of other programs seemed to hang the core with a reset being the only option. I was only able to get those tools to run (and the programs they compiled) under Windows 95 with the FreeDOS VHD mounted as a second drive.

Those same tools ran under a DOS VM with a modern processor, even when restricted to presenting a 486 CPU.
JimDrew
Atari Super Hero
Posts: 865
Joined: Mon Nov 04, 2013 5:23 pm

Re: ao486 Performance Technical Discussion

Post by JimDrew »

The BIGGEST problem I have seen with a LOT of various programs (like MASM, which is what I use) is the missing CPU instructions (CMPXCHG8B, MOV to/from control register, etc.) and especially the lack of an FPU! The lack of an FPU kills a TON of stuff because every 486 and later had an FPU built in, so almost everything used the FPU, either deliberately or unknowingly through a library call that relies on it.

There are definitely interrupt issues. There is something wrong with the interrupt controller hardware. This seems to affect everything in the system. It's like once an interrupt occurs the entire system is held off for some period of time. This results in some recursion that hangs the system.
I am the flux ninja
Sorgelig
Ultimate Atarian
Posts: 6348
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: ao486 Performance Technical Discussion

Post by Sorgelig »

JimDrew wrote:because every 486 and later had a FPU built-in
Not every one. Most i486SX had no FPU.
You are free to contribute to the open-source project. Currently you only suck money out of MiSTer. So you are definitely not the one who can complain.
calvinmorrow
Atariator
Posts: 17
Joined: Thu Oct 31, 2019 6:17 pm

Re: ao486 Performance Technical Discussion

Post by calvinmorrow »

I've been trying to formulate a hypothesis as to why memory reads would be considerably slower than writes in ao486. The best idea I have at the moment (based on my limited understanding) is a potential issue in the TLB that would manifest as TLB thrashing.

At the moment I'm combing through the memory management code trying to get a grasp on how it operates with particular attention to the TLB. If anyone knows what else I should be paying attention to I'm certainly open to pointers, otherwise I'm going to do my best to get as good of an understanding as possible and then try to decide how best to run some tests.
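
One way I could imagine testing the TLB idea from software is to compare reads that stay on one address with reads that land on a different page every time. The following is only a sketch under that assumption: `timed_reads_ms` and `PAGE_BYTES` are hypothetical names, nothing here comes from the ao486 source, and the timing relies on `clock()` with all its accuracy limits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define PAGE_BYTES 4096u /* standard x86 small-page size */

/*
 * Perform `reads` byte loads from `buf`, advancing by `stride` bytes after
 * each load and wrapping at `buf_len`.
 *   stride == 0          -> the same address is read every time
 *   stride == PAGE_BYTES -> every load touches a different page
 * Returns the elapsed CPU time in milliseconds. The sum of the bytes read
 * is stored in *checksum so the result of the loads is actually used.
 */
static double timed_reads_ms(const volatile uint8_t *buf, size_t buf_len,
                             size_t stride, unsigned long reads,
                             uint32_t *checksum)
{
    size_t offset = 0;
    uint32_t sum = 0;
    unsigned long i;
    clock_t start, stop;

    assert(buf != NULL && checksum != NULL);
    assert(buf_len > 0 && stride <= buf_len);

    start = clock();
    for (i = 0; i < reads; i++) {
        sum += buf[offset];
        offset += stride;
        if (offset >= buf_len) /* wrap without a costly division */
            offset -= buf_len;
    }
    stop = clock();

    *checksum = sum;
    return (double)(stop - start) * 1000.0 / CLOCKS_PER_SEC;
}
```

If a run with `stride = PAGE_BYTES` over a buffer spanning many pages is much slower than a run with `stride = 0`, TLB misses are a plausible contributor. If the same-address run is already slow, the cause is more likely somewhere else in the read path.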
JimDrew
Atari Super Hero
Posts: 865
Joined: Mon Nov 04, 2013 5:23 pm

Re: ao486 Performance Technical Discussion

Post by JimDrew »

We really didn't use the SX versions in the U.S. We basically went from the 386, briefly to the 486, and straight to the Pentium (math bug and all). There were a slew of programs that required the FPU to run by the time the 486 was popular.

I am certainly not complaining. I was pointing out that the lack of the FPU is probably the biggest issue with the core's compatibility, followed by the interrupt controller. I am not "sucking money out of MiSTer". I am selling 95%+ of my products to universities and students who are using the DE-10 as an educational tool. It's the reason why I am working with Terasic on the SDRAM reliability differences between different DE-10 Nano boards.
I am the flux ninja
Sorgelig
Ultimate Atarian
Posts: 6348
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: ao486 Performance Technical Discussion

Post by Sorgelig »

calvinmorrow wrote:I've been trying to formulate a hypothesis as to why memory reads would be considerably slower than writes in ao486. The best idea I have at the moment (based on my limited understanding) is a potential issue in the TLB that would manifest as TLB thrashing.

At the moment I'm combing through the memory management code trying to get a grasp on how it operates with particular attention to the TLB. If anyone knows what else I should be paying attention to I'm certainly open to pointers, otherwise I'm going to do my best to get as good of an understanding as possible and then try to decide how best to run some tests.
With a sequential read of memory, the cache doesn't help because the data is not in the cache yet. So in this case it hits the DDR3 latency and is therefore slow. Pre-fetching the data would probably speed it up. Or switch to SDRAM.
calvinmorrow
Atariator
Posts: 17
Joined: Thu Oct 31, 2019 6:17 pm

Re: ao486 Performance Technical Discussion

Post by calvinmorrow »

Sorgelig wrote: With a sequential read of memory, the cache doesn't help because the data is not in the cache yet. So in this case it hits the DDR3 latency and is therefore slow. Pre-fetching the data would probably speed it up. Or switch to SDRAM.
The read test I ran dereferenced the same RAM address 100000000 times, at 3-4x the write latency, rather than doing a sequential read.

I don't think prefetching has been implemented, as far as I can tell, but with it enabled we should be able to get roughly the same performance as the memory writes (or slightly better), correct? I was expecting reads to run closer to 6-8 seconds rather than almost 30.
Sorgelig
Ultimate Atarian
Posts: 6348
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: ao486 Performance Technical Discussion

Post by Sorgelig »

calvinmorrow wrote:
Sorgelig wrote: With a sequential read of memory, the cache doesn't help because the data is not in the cache yet. So in this case it hits the DDR3 latency and is therefore slow. Pre-fetching the data would probably speed it up. Or switch to SDRAM.
The read test I ran dereferenced the same RAM address 100000000 times, at 3-4x the write latency, rather than doing a sequential read.

I don't think prefetching has been implemented, as far as I can tell, but with it enabled we should be able to get roughly the same performance as the memory writes (or slightly better), correct? I was expecting reads to run closer to 6-8 seconds rather than almost 30.
No, pre-fetch is not implemented.
After a more precise look at the memory access, I think DDR3 is not used effectively. Although the address translator can use bursts of up to 256 words, it doesn't look like it does. The clients use only 4-word bursts of 32 bits, which means only a 2x burst on the 64-bit DDR3 bus. Pretty ineffective. It needs either an external cache, like on old mainboards, or SDRAM.
With the 128MB SDRAM module it's now possible to switch without losing memory size.
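
The burst arithmetic above can be written out as a small sketch. The helper names `burst_beats` and `burst_efficiency` are hypothetical, and the per-request setup cost is a free parameter here, not a measured ao486 or DDR3 figure.

```c
#include <assert.h>

/* Number of bus beats needed to move a burst of `words` words of
 * `word_bits` each over a bus that is `bus_bits` wide (rounded up). */
static unsigned burst_beats(unsigned words, unsigned word_bits,
                            unsigned bus_bits)
{
    assert(bus_bits > 0);
    return (words * word_bits + bus_bits - 1) / bus_bits;
}

/* Fraction of a request spent actually transferring data, given an
 * assumed fixed setup cost (in bus cycles) paid once per request. */
static double burst_efficiency(unsigned beats, unsigned setup_cycles)
{
    assert(beats + setup_cycles > 0);
    return (double)beats / (double)(beats + setup_cycles);
}
```

`burst_beats(4, 32, 64)` gives 2, which is the "2x burst" mentioned above. Whatever the real setup cost per request is, `burst_efficiency` shows why such short bursts pay it over and over for very little data.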
Lroby74
Captain Atari
Posts: 170
Joined: Sun Sep 04, 2016 8:35 pm

Re: ao486 Performance Technical Discussion

Post by Lroby74 »

I would like an option to change the amount of RAM available to the ao486 core. Currently it is 64MB, which is too much for DOS games like Aladdin (even with EMS RAM enabled via EMM386 it is too much). I would like 1MB, 2MB, 4MB, 8MB, 16MB, 32MB and 64MB options for better compatibility with games.
Last edited by Lroby74 on Mon Nov 04, 2019 7:04 am, edited 1 time in total.
kitrinx
Captain Atari
Posts: 192
Joined: Wed Sep 26, 2018 6:03 am

Re: ao486 Performance Technical Discussion

Post by kitrinx »

JimDrew wrote:We really didn't use the SX versions in the U.S. We basically went from the 386, briefly to the 486, and straight to the Pentium (math bug and all). There were a slew of programs that required the FPU to run by the time the 486 was popular.

I am certainly not complaining. I was pointing out that the lack of the FPU is probably the biggest issue with the core's compatibility, followed by the interrupt controller. I am not "sucking money out of MiSTer". I am selling 95%+ of my products to universities and students who are using the DE-10 as an educational tool. It's the reason why I am working with Terasic on the SDRAM reliability differences between different DE-10 Nano boards.
That's not true at all, you must have been out of the country for 5 years or so. I had a Gateway 2000 486SX 25MHz as my first 486. It wasn't until years later that I upgraded to a 486DX 100. I knew a lot of other people with the same CPU at the time.
Poobah
Obsessive compulsive Atari behavior
Posts: 133
Joined: Wed Aug 03, 2005 11:45 am
Location: Ohio, USA

Re: ao486 Performance Technical Discussion

Post by Poobah »

kitrinx wrote:
JimDrew wrote:We really didn't use the SX versions in the U.S. We basically went from the 386, briefly to the 486, and straight to the Pentium (math bug and all). There were a slew of programs that required the FPU to run by the time the 486 was popular.

I am certainly not complaining. I was pointing out that the lack of the FPU is probably the biggest issue with the core's compatibility, followed by the interrupt controller. I am not "sucking money out of MiSTer". I am selling 95%+ of my products to universities and students who are using the DE-10 as an educational tool. It's the reason why I am working with Terasic on the SDRAM reliability differences between different DE-10 Nano boards.
That's not true at all, you must have been out of the country for 5 years or so. I had a Gateway 2000 486SX 25MHz as my first 486. It wasn't until years later that I upgraded to a 486DX 100. I knew a lot of other people with the same CPU at the time.
That was my experience as well, at least in central New York, 486sx25 chips everywhere, which was odd, as they weren't all that much faster than a 386dx-40, given the price delta...
US based seller MiSTer Expansion Boards and Atari items
https://www.legacypixels.com/
ExCyber
Retro freak
Posts: 13
Joined: Sun Aug 25, 2019 3:16 am

Re: ao486 Performance Technical Discussion

Post by ExCyber »

Poobah wrote:
kitrinx wrote:
JimDrew wrote:We really didn't use the SX versions in the U.S. We basically went from the 386, briefly to the 486, and straight to the Pentium (math bug and all). There were a slew of programs that required the FPU to run by the time the 486 was popular.

I am certainly not complaining. I was pointing out that the lack of the FPU is probably the biggest issue with the core's compatibility, followed by the interrupt controller. I am not "sucking money out of MiSTer". I am selling 95%+ of my products to universities and students who are using the DE-10 as an educational tool. It's the reason why I am working with Terasic on the SDRAM reliability differences between different DE-10 Nano boards.
That's not true at all, you must have been out of the country for 5 years or so. I had a Gateway 2000 486SX 25MHz as my first 486. It wasn't until years later that I upgraded to a 486DX 100. I knew a lot of other people with the same CPU at the time.
That was my experience as well, at least in central New York, 486sx25 chips everywhere, which was odd, as they weren't all that much faster than a 386dx-40, given the price delta...
This disconnect might have to do with how the market was segmented at the time. Brand new systems were routinely sold based on CPUs one or two generations old because each new generation typically launched with only a high-end chip targeted at business users. Going down the price scale was basically going back in time. So if you were doing stuff like CAD or software development, you could probably afford and seriously benefit from a 486DX ca. 1990 and a Pentium in 1993. But if you were a home user on a budget, you might very well have bought a 486SX-based system as late as 1995.
JimDrew
Atari Super Hero
Posts: 865
Joined: Mon Nov 04, 2013 5:23 pm

Re: ao486 Performance Technical Discussion

Post by JimDrew »

Poobah wrote:That was my experience as well, at least in central New York, 486sx25 chips everywhere, which was odd, as they weren't all that much faster than a 386dx-40, given the price delta...
That's precisely why they were not popular in the U.S. Gateway immediately dropped them after releasing systems with them and switched to non-SX and the Pentium.
I am the flux ninja
BBond007
Captain Atari
Posts: 466
Joined: Wed Feb 28, 2018 3:23 am

Re: ao486 Performance Technical Discussion

Post by BBond007 »

Poobah wrote: That was my experience as well, at least in central New York, 486sx25 chips everywhere, which was odd, as they weren't all that much faster than a 386dx-40, given the price delta...
The 486sx was commonly paired with the (P24T) "Pentium Overdrive" socket which contained an extra row of pins unused by the 486sx CPU.

The pretext was that people would buy a lower-cost 486sx entry-level system with the promise it could later be upgraded to a (not yet available) Pentium CPU variant.

Intel just never made the P24T a competitive option as far as price...
bhamadicharef
Atariator
Posts: 23
Joined: Tue Jul 18, 2017 8:31 am
Location: Singapore

Re: ao486 Performance Technical Discussion

Post by bhamadicharef »

Note that there is also the 586, with quite a lot of work done from what we can read at
https://opencores.org/projects/v586 ... something to keep in mind, as some modules could be re-used.
Brahim HAMADI CHAREF:: Singapore
softtest1
Atari User
Posts: 34
Joined: Tue Apr 30, 2019 6:37 pm

Re: ao486 Performance Technical Discussion

Post by softtest1 »

There is also Zet: http://zet.aluzina.org/index.php/Zet_processor

It seems less mature than the other cores, but maybe it has some code bits that could be useful.
Glaurung
Atari freak
Posts: 66
Joined: Sat Mar 30, 2019 6:22 am

Re: ao486 Performance Technical Discussion

Post by Glaurung »

softtest1 wrote:There is also Zet: http://zet.aluzina.org/index.php/Zet_processor

It seems less mature than the other cores, but maybe it has some code bits that could be useful.
So this is a 486?

It can boot successfully MS-DOS 6.22, FreeDOS 1.1 and run Microsoft Windows 3.0 and other MS-DOS games.