Routine to measure read sector time (DMA/FDC Programming)


DrCoolZic
Fuji Shaped Bastard
Posts: 2261
Joined: Mon Oct 03, 2005 7:03 pm
Location: France

Post by DrCoolZic »

ijor wrote: I can't reproduce the behavior, I constantly get the first header on all tracks.
This is amazing. I tried it on two different systems and I am getting the same result.
Can anybody else try?
I did note that apparently you are selecting/deselecting the drive without disabling interrupts. That's too bad. That might explain everything, because you would get all sorts of strange behavior.
True, I do not disable interrupts. However, I am setting the "flock" variable, which is supposed to turn off the floppy VBL check routine (_flopvbl) to keep the DMA free of disturbance.
But you may be referring only to the drive select/deselect action. Originally I was selecting/deselecting by accessing the PSG directly from my C code, but I quickly found that it did not work: I was setting bits and they were being changed behind my back. Looking at TOS, I found that this has to be done in a critical section where interrupts are disabled. I therefore decided to use the Giaccess function, which does it for me, so I should be safe on this one.
For further tests, please do as I asked. Reduce the program to the bare minimum for reproducing the problem. Make sure you are still getting the problem with the reduced version.
I will do that. But first I wanted to make sure I was not doing something obviously wrong.
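The drive select/deselect step discussed above can be sketched as a pure computation of the new PSG port A value. This is only a sketch: the bit layout (bit 0 = side, bit 1 = drive A, bit 2 = drive B, all active low) is the commonly documented ST layout, and the function names are hypothetical, not taken from the library in this thread.

```c
#include <stdint.h>

/* Floppy bits of PSG port A (register 14), all active low:
   bit 0 = side select, bit 1 = drive A select, bit 2 = drive B select. */
#define PSG_FLOPPY_MASK 0x07u

/* Return the new port A value that selects a drive (0 = A, 1 = B) and a
   side (0 or 1) while leaving the five upper bits of port_a untouched. */
uint8_t psg_select_value(uint8_t port_a, int drive, int side)
{
    unsigned want = (unsigned)(((drive + 1) << 1) | (side & 1)); /* active-high form */
    want = ~want & PSG_FLOPPY_MASK;                              /* lines are active low */
    return (uint8_t)((port_a & ~PSG_FLOPPY_MASK) | want);
}

/* Deselect both drives: all three floppy bits go high. */
uint8_t psg_deselect_value(uint8_t port_a)
{
    return (uint8_t)(port_a | PSG_FLOPPY_MASK);
}
```

On the real machine the value would be read with Giaccess(0, 14) and written back with Giaccess(value, 14 | 0x80). Note that this is two separate calls, so the read-modify-write as a whole is presumably still not atomic unless interrupts are masked around it.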
ijor
Hardware Guru
Posts: 4013
Joined: Sat May 29, 2004 7:52 pm

Post by ijor »

DrCoolZic wrote:This is amazing. I tried it on two different systems and I am getting the same result?
This is precisely where reducing the program to the minimum could help a lot. Remove everything, including the read and write test, and leave the read address stuff only. You can easily do that with conditionals. You might be doing something different than me when testing, which could alter the behavior.

It is always a good idea to isolate problematic code as much as possible. You might find that the reduced version actually works fine. Then you could get some clues about the problem yourself.
Originally I was selecting/deselecting directly accessing the PSG from my C code... I therefore decided to use the Giaccess function
In your test above you still seem to be accessing the PSG directly at some point. I guess you replaced some, but not all, of the PSG accesses in your program. I can't say, of course, whether this is the problem or not.

Post by DrCoolZic »

Stripped-down version, reduced to the bare minimum, attached.

Source also sent to you by PM.

Post by ijor »

DrCoolZic wrote:Stripped-down version, reduced to the bare minimum, attached.
Source also sent to you by PM.
Sorry, I'm still getting correct behavior.

Will check the sources later. In the meantime, do you have a digital camera? Would you mind posting a picture of the output you get?

Post by DrCoolZic »

ijor wrote:Sorry, I'm still getting correct behavior.
I found the problem! In fact, as you mentioned, the program seems to work correctly and the problem comes from the diskette used for the test!

Here is what I found. To run the test I had just formatted a diskette on an Atari STE (so I could use it on a PC). My assumption was that this would create all tracks with 9 sectors starting at sector 1. To make sure the FD was fine I verified it with DC (but only the first track, because flux mode takes a lot of space), and it did look OK on track 0.

But in fact the diskette is formatted as:
t0: 1 2 3 4 5 6 7 8 9
t1: 6 7 8 9 1 2 3 4 5
t2: 2 3 4 5 6 7 8 9 1
t3: 7 8 9 1 2 3 4 5 6
...
Just what my program returns! (By the way, you did ask whether the FD was formatted with weird sector numbers.)

I am almost 99% sure that an FD formatted with TOS 1.4 had all tracks starting with sector 1, but apparently with TOS 1.6 (and maybe above) this is no longer the case. Although it should be transparent to the user or program, why would Atari format tracks like that? I can only think of an optimization when moving from track to track.
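The layout listed above for tracks 0 to 2 is consistent with each track being rotated by a fixed amount relative to the previous one. Here is a minimal sketch that reproduces it, assuming the shift is constant (5 sector IDs per track); the function name and the constant-shift rule are assumptions drawn from those three rows, not from TOS sources.

```c
/* Fill out[0..sectors-1] with the sector IDs in physical order for one
   track, assuming the first ID advances by `shift` on every track. */
void skewed_layout(int track, int sectors, int shift, int *out)
{
    int i;
    int start = (track * shift) % sectors;   /* zero-based ID of the first sector */

    for (i = 0; i < sectors; i++)
        out[i] = (start + i) % sectors + 1;  /* sector IDs are 1-based */
}
```

With sectors = 9 and shift = 5 this gives 1..9 for track 0, 6 7 8 9 1 2 3 4 5 for track 1 and 2 3 4 5 6 7 8 9 1 for track 2. A routine that reports "the first header after the index pulse" would therefore see 1, 6, 2 on those tracks, which is what the test program printed.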

This is really stupid on my part. I should have checked that before, but I was so sure about the Atari format (I even reformatted to be sure) that I did not bother checking!

Side note: what I have implemented so far seems to work, with one exception: when I quit my test program the system is "confused" and sometimes bombs or does not work on the next execution. I know that the BIOS keeps information on the current diskette and position, and therefore I have tried to put the FDC back in the same "state" as when the test program starts. To do that I save the track number (and also the sector number, though this one should not matter) from the FDC. Before quitting I restore the track number in the FDC and seek to this track.

Is there anything else I should or could do so that it works correctly afterwards?

Post by ijor »

DrCoolZic wrote:I found the problem! In fact, as you mentioned, the program seems to work correctly and the problem comes from the diskette used for the test!
I see, glad that you solved the problem.
Although it should be transparent to the user or program, why would Atari format tracks like that? I can only think of an optimization when moving from track to track.
Yes, it might make reading faster. See the thread about reading disks on the PC for a long discussion of this topic.
When I quit my test program the system is "confused" and sometimes bombs or does not work on the next execution.
The culprit there might be more difficult to find. I will check the sources again for anything suspicious.

Post by DrCoolZic »

OK, I now have my basic library working in alpha test with most of the basic functions.
By the way, to be sure of getting the first sector address after the index pulse you have to be relatively quick.
I mentioned about 2 ms, but that is for a track formatted with 9 sectors. For a track with 11 sectors the post-index gap goes from 60 bytes to only 10 bytes, or about 320 µs, and here you had better be fast!
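The deadline figures above follow directly from the byte time. A small sketch of the arithmetic, assuming the nominal double-density rate of one byte every 32 µs (the rate implied by "10 bytes or about 320 µs"); the names are made up for illustration.

```c
/* Nominal time for the FDC to assemble one byte, in microseconds
   (8 bits of 4 us each). Real drives deviate slightly from this. */
#define BYTE_TIME_US 32UL

/* Time available before the first address field when the post-index
   gap is gap_bytes long. */
unsigned long gap_time_us(unsigned long gap_bytes)
{
    return gap_bytes * BYTE_TIME_US;
}
```

A 60-byte gap gives 1920 µs (the "about 2 ms" quoted above) and a 10-byte gap gives 320 µs.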

But now back to the very beginning of the thread: how do you measure the time it takes to "receive" a character when reading a sector or a track? I thought of two relatively simple ways:
1) Set up a timer to interrupt every xx µs. Each time it interrupts, measure how many characters have been received by comparing the current DMA address with the start address, and store the result in an array.
2) Set up a "clock" in a timer. During a simple read track/sector we loop waiting for the INTRQ that indicates termination of the command. Inside each pass of the loop I get the current DMA address to measure the number of bytes received, and stuff a "timing" array with the clock value (elapsed time since the previously received character).

I have decided to try solution 2, and the code looks like this:

Code: Select all

int fd_timed_track(SelDrive drive, int track, SelSide side, char* buffer, char* timing) {
	int time = 100000;
	BYTE* mfp_gpio = MFP_GPIO;
	BYTE* dma = DMA_HIGH;
	int saddr = (int)buffer;
	int caddr;
	
	fd_select(drive, side);						// set drive and side
	fd_goto_track(track);						// seek to track
	dma_set_fdc(buffer, 20, READ_MODE);			// set DMA reg

	fdc_set_command(0xE0, READ_MODE);   // read track cmd
	while (--time) {
		if (!(*mfp_gpio & 0x20)) break;
		caddr = ((*dma & 0xFF) << 16) | ((*(dma+2) & 0xFF) << 8 ) | ((*(dma+4) & 0xFF));
		// fprintf(stdout, "%x\n", caddr - saddr);
		timing[caddr - saddr] = 0xFF;
	}
...
So far I have no timer; I just set to $FF each location of the timing buffer (initialized to zero) indexed by the number of characters received. Note that dma_set_fdc sets the DMA starting address to buffer, caddr is obviously the current value of the DMA address, and its difference with saddr is the number of bytes received.

If I uncomment the print statement (making the loop slow), it shows that the number of bytes received stays at 0 for a few iterations (until the transfer begins), followed by only a few more iterations with nbytes increasing. Therefore only a few bytes of the timing buffer are set to $FF, which is what is expected, and all seems to work OK.

However, if I remove the print statement the loop is fast, so I was expecting many bytes (if not all) of the timing buffer to be set, but ... nothing is set! It seems that the DMA address goes from the start to the end address without going through intermediate values, and this does not make any sense.
To get an idea of the loop time I print the number of iterations needed to reach the INTRQ and get between 7000 and 11000. As a read track should take around 200 ms, this gives an approximate loop time of roughly 18 to 29 µs (200 ms divided by 11000 and by 7000) ...

Are these queries too fast for the DMA to handle?
Is it possible to query the DMA continuously without problems?
Does this make sense?

By the way, I tested on an Atari STE.

Post by DrCoolZic »

Please ignore the problems described above.

I had a compiler problem plus a misinterpretation of some results.

Reading the DMA value now seems to work as expected. One obvious thing to note: the addresses returned by the DMA are always multiples of 16. Of course this makes sense, as the DMA buffers 16 bytes (in one of its two 16-byte buffers) and then transfers them using a DMA cycle on the 68000 bus.

Therefore there is plenty of time between two increments: about 32 µs * 16 = 512 µs!
So there is really no need to optimize the inner loop, and the timer can be set to a bigger value. I am thinking of about 10 µs resolution, which makes about 0.6 µs on a per-byte basis (about 2%).
I will keep you posted on progress, and sorry for the fuss.

Post by DrCoolZic »

Here is the latest status

Let's first review my understanding of the mechanisms involved in transferring bytes from the FDC to memory through the DMA controller. Suppose that you want to read a track. You first have to prepare the DMA for the transfer by providing it with a buffer address and a count. The address must point to a buffer big enough to contain all the data from the FDC (about 6500 bytes in the case of a read track), and the count should indicate the maximum number of 512-byte chunks the DMA will have to transfer (for a read track we can use 20). Internally the DMA has two 16-byte buffers that are used alternately. This allows one buffer to be filled while the other one waits to be transferred to memory.
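As a quick check of the count value, here is a sketch of the rounding involved, assuming the DMA count is expressed in 512-byte blocks as described above. The helper name is hypothetical.

```c
#define DMA_BLOCK_BYTES 512UL

/* Smallest number of 512-byte blocks that covers `bytes` bytes. */
unsigned long dma_block_count(unsigned long bytes)
{
    return (bytes + DMA_BLOCK_BYTES - 1) / DMA_BLOCK_BYTES;
}
```

For about 6500 bytes this gives 13 blocks, so a count of 20 leaves a comfortable margin. The count is only an upper bound on what the DMA will transfer, so the buffer itself should be at least as large as what the FDC can actually deliver.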

The attached figure shows the sequence of events when reading a track through the DMA. Note that the scale is wildly inaccurate, but this is acceptable as the purpose of the diagram is just to show the sequence of events.
After the FDC receives the read track command, it waits for the index pulse. Shortly after the drive has reached the index pulse the read can start, and the FDC raises DRQ to indicate that one byte has been assembled and is ready to be fetched. When the DMA receives this signal it starts a fetch cycle to grab the byte from the FDC over the private bus that joins the DMA and the FDC. The byte is then stored in one of the DMA FIFOs and the FDC lowers DRQ until the next byte is assembled. When this happens a new transfer takes place, and this goes on until 16 bytes have been stored in the DMA FIFO. At this point the DMA issues a bus request to the 68000 to indicate that it wants to perform a DMA transfer. When the bus request is granted by the 68000, the DMA takes control of the system bus and transfers the 16 bytes from the FIFO into the memory pointed to by its address register. At the end of the transfer the DMA releases the bus to the processor and increments its internal address register by 16.
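The sequence above can be modelled on a host machine to see why the address register only ever moves in steps of 16. This is a deliberately simplified sketch: it uses a single FIFO instead of two, ignores bus arbitration, and all type and function names are invented for the illustration.

```c
#include <stdint.h>

#define FIFO_SIZE 16

typedef struct {
    uint32_t address;          /* DMA address register */
    int      fill;             /* bytes currently waiting in the FIFO */
    uint8_t  fifo[FIFO_SIZE];
    uint8_t *memory;           /* simulated RAM, indexed by address */
} DmaModel;

void dma_init(DmaModel *d, uint8_t *memory, uint32_t start)
{
    d->address = start;
    d->fill = 0;
    d->memory = memory;
}

/* One DRQ from the FDC: latch the byte, and once 16 bytes have arrived
   burst them to memory and advance the address register by 16. */
void dma_on_drq(DmaModel *d, uint8_t byte)
{
    int i;

    d->fifo[d->fill++] = byte;
    if (d->fill == FIFO_SIZE) {
        for (i = 0; i < FIFO_SIZE; i++)
            d->memory[d->address + (uint32_t)i] = d->fifo[i];
        d->address += FIFO_SIZE;
        d->fill = 0;
    }
}
```

After 15 calls to dma_on_drq the address is unchanged and memory is still empty; the 16th call makes the address jump by 16 and all 16 bytes appear at once. That is the behaviour a polling loop sees when it reads the address register.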

Here are roughly the timing values involved in the transfer. The time between the reception of the command and the IP can be anything up to 200 ms (one revolution). A byte is assembled by the FDC every 32 µs (8 bits of 4 µs each). Therefore 16 bytes are stored in the DMA for a burst transfer roughly every 512 µs.

Now let's get back to writing a routine to measure the time it takes for a byte to be assembled by the FDC. Measuring this time can be very useful, for example to find out whether a bit width variation has been used for protection purposes.
For that we are going to use one of the 68901 MFP timers. Usually timer A is a good choice. The MFP is connected to a 2.4576 MHz crystal and offers several prescalers. It will become obvious later on that the best choice is a prescale of 10. This gives a frequency of 245.76 kHz and a period of 4.0690104167 µs (4069 ns is a good integer approximation). As the timer register is 8 bits wide, an overflow happens every 256 ticks, or about every 1042 µs.
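The timer figures above can be reproduced with integer arithmetic. A small sketch, with hypothetical helper names; the only inputs are the 2.4576 MHz crystal and the prescale value quoted in the post.

```c
#include <stdint.h>

#define MFP_CLOCK_HZ 2457600ULL   /* MFP timer crystal, 2.4576 MHz */

/* Length of one timer tick in nanoseconds (truncated). */
uint32_t mfp_tick_ns(uint32_t prescale)
{
    return (uint32_t)(((uint64_t)prescale * 1000000000ULL) / MFP_CLOCK_HZ);
}

/* Time for the 8-bit counter to go through all 256 values, in microseconds. */
uint32_t mfp_wrap_us(uint32_t prescale)
{
    return (uint32_t)(((uint64_t)prescale * 256ULL * 1000000ULL) / MFP_CLOCK_HZ);
}

/* Whole timer ticks that fit in `us` microseconds. */
uint32_t mfp_ticks_in_us(uint32_t prescale, uint32_t us)
{
    return (uint32_t)(((uint64_t)us * MFP_CLOCK_HZ) / ((uint64_t)prescale * 1000000ULL));
}
```

With a prescale of 10 the tick is 4069 ns, the counter wraps after about 1041 µs (1041.67 before truncation), and a nominal 512 µs chunk lasts about 125 ticks, close to the middle of the 8-bit range.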
When the read track command executes we need to set up a loop that waits for INTRQ to be raised by the FDC, indicating the end of the command. This signal is polled in the loop from bit 5 of the MFP GPIO register. Based on the above information, we see that in order to get the transfer time for the characters we can look at when the address changes in the DMA address register. This change happens every 16 received characters, or at the nominal rate every 512 µs.
The pseudo-code looks like this:

Code: Select all

Prepare the FDC (select, seek, etc)
Prepare the DMA (read mode, buffer address, count)
Prepare the timer (reset, set pre-scale to 10, start)
Loop {
	Has dma address changed ?
		Yes Get the timer time and store
			Store the new address
	Has the FDC raised the INTRQ ?
		Yes break
}
The loop consists of reading the DMA address and the MFP GPIO register, plus two tests to check whether the address has changed or the command has terminated. The loop must take less than 512 µs in order not to miss an address change, and this is of course not a problem. But the precision of the measurement is directly related to the loop time (the time taken when both tests fail). If the loop takes for example 102 µs, the precision will be 102/512 or about 20% for 16 bytes, or 1.2% per byte, which is not acceptable. It is therefore important to optimize this loop. For example, I started with a loop time of about 110 µs, optimized it to about 35 µs (still not acceptable) and got down to about 15 µs (the bare minimum). Now this is acceptable, as it represents a worst-case precision of 15/512 (about 3%) on a chunk of 16 bytes, or about 0.2% per byte. Although you try to shorten the loop as much as possible, do not forget to handle the timer overflow (but if you are smart it should not affect the loop time).
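The polling loop and the overflow handling can be tried out on a host with a simulated drive. The sketch below is not the library code from this thread: the Sim structure and its three accessors are stand-ins for the real DMA address, MFP timer and INTRQ reads, and it assumes an 8-bit down-counting timer with a 4069 ns tick and a 15 µs loop time.

```c
#include <stddef.h>
#include <stdint.h>

/* Simulated hardware (host-only stand-in for the real registers). */
typedef struct {
    uint32_t now_us;   /* simulated elapsed time */
    uint32_t end_us;   /* time at which INTRQ is raised */
} Sim;

static uint32_t sim_dma_address(Sim *s)
{
    s->now_us += 15;                     /* cost of one loop iteration */
    return 16u * (s->now_us / 512u);     /* +16 bytes every 512 us */
}

static uint8_t sim_timer(const Sim *s)
{
    uint32_t ticks = (s->now_us * 1000u) / 4069u;  /* 4069 ns per tick */
    return (uint8_t)(0u - ticks);                  /* 8-bit down-counter */
}

static int sim_intrq(const Sim *s)
{
    return s->now_us >= s->end_us;
}

/* Store the elapsed ticks for each 16-byte chunk; returns chunks recorded. */
size_t time_chunks(Sim *s, uint8_t *timing, size_t max)
{
    size_t n = 0;
    uint32_t last_addr = sim_dma_address(s);
    uint8_t last_time = sim_timer(s);

    for (;;) {
        uint32_t addr = sim_dma_address(s);
        if (addr != last_addr) {
            uint8_t now = sim_timer(s);
            if (n < max)
                timing[n++] = (uint8_t)(last_time - now); /* modulo 256: survives one wrap */
            last_time = now;
            last_addr = addr;
        }
        if (sim_intrq(s))
            break;
    }
    return n;
}
```

The 8-bit subtraction is the "smart" part: as long as a chunk lasts less than 256 ticks, the difference is correct even when the counter wrapped in between, so no extra test is needed inside the loop. With end_us = 2048 the simulation records four chunks of 125 or 126 ticks, and the third one spans a counter wrap.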
During the discussion about the timer I mentioned that the prescale of 10 (a tick of about 4 µs) was a good choice. Indeed, if you look at the values in the loop, 512 µs is about 128 ticks of 4 µs, and this happens to be just the median value ($80) of a byte, which makes it very convenient to store the time of each transferred 16-byte chunk in an array of bytes. Note also that 4 µs provides a precision which is in line with the loop time.
If you do not take any more care you will notice that at regular intervals the values measured and stored are completely off. For example, you get one value much bigger than normal and the next one shorter to compensate. It does not take too long to infer that the problem comes from the processor being diverted to do other things, which in turn means that it is processing interrupts. You can work around the problem by post-processing the values and replacing the two wrong values with their mean. However, a better solution is to enter a critical section at the beginning of the loop by turning off all interrupts (note that as far as I know this requires assembly code!).
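The post-processing workaround mentioned above (replacing a long value and the short value that follows it with their mean) could look like the sketch below. The detection rule, a value above nominal + tol immediately followed by one below nominal - tol, is an assumption made for the example, as are the names.

```c
#include <stddef.h>
#include <stdint.h>

/* Replace each "too long, then too short" pair with the mean of the two.
   Returns the number of pairs corrected. */
size_t smooth_pairs(uint8_t *t, size_t n, uint8_t nominal, uint8_t tol)
{
    size_t fixed = 0;
    size_t i;

    for (i = 0; i + 1 < n; i++) {
        int too_long  = t[i] > nominal + tol;
        int too_short = t[i + 1] + tol < nominal;

        if (too_long && too_short) {
            unsigned sum = (unsigned)t[i] + t[i + 1];
            t[i]     = (uint8_t)(sum / 2);
            t[i + 1] = (uint8_t)(sum - sum / 2);  /* keeps the total unchanged */
            fixed++;
            i++;                                  /* skip the value just fixed */
        }
    }
    return fixed;
}
```

Note that this hides any real timing variation that happens to fall on the same two chunks, which is one more reason to prefer masking interrupts during the measurement.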

Now everything works fine, except for one thing: you do not get the timing for the first chunk of data transferred, as the first address increment happens only after 16 bytes have been transferred. This may be acceptable, but I have tried to come up with a way to measure this first chunk. The idea that I explored was to find a way to identify when the first transfer occurs. As already mentioned, this can't be inferred by reading the DMA address, as it changes only every 16 bytes. But the DMA has a status register that reflects the state of the FDC DRQ signal in bit 3. In the Atari HW documentation it is explicitly said that it is a bad idea (i.e. don't do it) to query the status register during a DMA transfer. This makes sense, as the DMA has two sides: one toward the FDC to transfer bytes, and one toward the 68000 to read and write DMA registers (not to mention the DMA transfers). Consequently it is probably too much load for the DMA to transfer a byte to/from the FDC while the 68000 tries to read or write some internal register.
However, it would be nice to get the time of the first DRQ by reading the DMA status. As a matter of fact, as the transfer has not yet started, we are not perturbing the DMA too much, and it works great.
This means that just before the loop already described we need to add another tight loop that just checks for the first DRQ (just after the IP). At the end of this loop we store the current time, which corresponds to the time of transfer of the first byte.
The problem is that it does not return the expected timing. If you look at the transfer diagram you can see that between the first DRQ and the first address change we have the transfer of the 16 bytes (this is good), but we also have some time to get the bus granted and released by the 68000, plus the time to burst-transfer the bytes from the FIFO to memory. Only after that is the DMA address register changed. We therefore have some extra time that accumulates for this first chunk transfer. I am now trying to see whether this is a constant that introduces a predictable extra time (in which case it would be easy to compensate for) or not.

A PDF version of this document is also attached.

Post by DrCoolZic »

Last update on the subject.
-----------------------------------
I have now completed my FD library, and I am able to perform all low- and high-level access to the floppy drive as planned.

I am also able to measure the timing for reading a track, a sector or a piece of a sector. By coding some pieces in assembly, I was able to reduce the time-measurement loop (as described above) to less than 15 µs, and this gives me pretty good precision.

Thanks to all the people who have helped.
leonard
Moderator
Posts: 665
Joined: Thu May 23, 2002 10:48 pm

Post by leonard »

Hey, it's cool to see people doing good research on Atari today!

I didn't look deeply into the thread, only at the C code you posted, and I may have noticed a problem. For example:

BYTE* mfp_gpio = MFP_GPIO;

I guess you should use

volatile BYTE* mfp_gpio = MFP_GPIO;

I'm pretty sure it actually works fine because Atari C compilers don't optimise aggressively, but you have to use the "volatile" keyword simply because MFP_GPIO is a hardware register (its content can change, and the compiler has to know that in order to generate working code).
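The point about volatile can be illustrated with a generic polling helper. This is a sketch with a made-up name, written so that it also runs on a host against an ordinary variable; on the ST the pointer would be the MFP GPIO register address.

```c
#include <stdint.h>

/* Poll a register until all bits in `mask` read as zero, or until the
   tries run out. Returns the number of tries left (0 means timeout).
   Because reg points to a volatile object, the compiler has to perform
   a real read on every pass instead of reusing an earlier value. */
long wait_until_clear(volatile const uint8_t *reg, uint8_t mask, long tries)
{
    while (tries > 0 && (*reg & mask) != 0)
        tries--;
    return tries;
}
```

Without the qualifier, an optimising compiler is allowed to read the register once, keep the value in a CPU register, and turn the loop into one that never sees the hardware change.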

Hope this helps with your great library.

Cheers,
Leonard/OXYGENE.

Post by DrCoolZic »

Hey Leonard,
This is very interesting information. You are perfectly right, I should use volatile on any memory-mapped I/O location. Here is an explanation of volatile:

Code: Select all

The compiler assumes that, at any point in the program, a volatile variable can be accessed by an unknown process that uses or modifies its value. Therefore, regardless of the optimizations specified on the command line, the code for each assignment to or reference of a volatile variable must be generated even if it appears to have no effect. If volatile is used alone, int is assumed. The volatile type specifier can be used to provide reliable access to special memory locations. Use volatile with data objects that may be accessed or altered by signal handlers, by concurrently executing programs, or by special hardware such as memory-mapped I/O control registers. You can declare a variable as volatile for its lifetime, or you can cast a single reference to be volatile. 
Indeed, just yesterday I completed a key-disk protection analyzer (see the http://www.atari-forum.com/viewtopic.ph ... 2&start=75 thread for more info) that already finds 18 of the 23 protection mechanisms that I have identified, so yes, the lib seems to work OK.
However, the lib works great with the Lattice C compiler but fails immediately with the Pure C compiler: it gives a bus error on the first access to a HW register. It might be related either to not using the volatile keyword or to the fact that Lattice C aligns things differently from Pure C. I will have to ask Nyh, who seems to be pretty knowledgeable about Pure C. I will try to use volatile on all HW accesses to see if it helps for Pure C, and keep you posted.
I am also considering publishing the lib if there is some interest, as it allows FD applications to be written very quickly.
I am also writing a document on FD programming (already 34 pages!), if there is any interest.

By the way, here is an updated diagram showing that the transfer between the DMA and system memory is performed as 8 transfers of 16 bits, and not 16 as shown previously.

Post by DrCoolZic »

I started to modify the code and I have a doubt:

I use:

Code: Select all

volatile WORD* dma = (volatile WORD*)0xFF8606;
...
*dma = 0x90; /* set mode reg */
*(dma - 1) = count; /* set count value */
...
I expect that declaring dma as a pointer to volatile short also applies to "(dma-1)". Right?
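In standard C the answer is yes: the qualifier belongs to the pointed-to type, and pointer arithmetic does not change that type, so dma - 1 is still a pointer to volatile WORD. On a modern host compiler this can be checked with C11 _Generic. The sketch below uses an ordinary array in place of the real registers, and the macro and function names are invented for the demonstration; the compilers discussed in this thread predate C11.

```c
typedef short WORD;

/* 1 when expr has type "pointer to volatile WORD", 0 otherwise (C11). */
#define POINTS_TO_VOLATILE_WORD(expr) \
    _Generic((expr), volatile WORD *: 1, default: 0)

static WORD fake_regs[2];   /* host stand-in for two adjacent registers */

int offset_pointer_is_volatile(void)
{
    volatile WORD *dma = &fake_regs[1];

    *dma = 0x90;         /* access through dma: volatile */
    *(dma - 1) = 20;     /* access through dma - 1: volatile as well */
    return POINTS_TO_VOLATILE_WORD(dma - 1);
}

int plain_pointer_is_volatile(void)
{
    WORD *plain = &fake_regs[1];

    return POINTS_TO_VOLATILE_WORD(plain - 1);
}
```

The first function returns 1 and the second returns 0, which shows that the offset pointer keeps the qualifier while an unqualified pointer never acquires it.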

Another question:
I am using some "system variables" as defined in the Hitchhiker's Guide to the BIOS. Obviously some variables might be changed outside my program during an interrupt, and therefore I guess I should also declare them with the volatile qualifier?
Currently, to make things readable, I use:

Code: Select all

#define	flock		*((WORD*)0x43E)		/* floppy lock variable */
....
....
flock = 1;
...
I guess I should use:

Code: Select all

#define	flock		*((volatile WORD*)0x43E)		/* floppy lock variable */
....
....
flock = 1;
...

Post by DrCoolZic »

Now this is getting interesting!

As you suggested, I have qualified all pointers to HW registers with the volatile keyword, as this seems to be the right thing to do.
So I now have for example:

Code: Select all

void dma_set_mode(DmaTransMode rw) {
	volatile WORD* wptr = DMA_MODE;
	*wptr = 0x90 | (rw * 0x100);
}


BYTE* dma_get_address(void) {
	volatile BYTE* ptr = DMA_HIGH;			/* DMA_HIGH address */
	long p;
		
	p = ((*ptr & 0xFF) << 16) | ((*(ptr+2) & 0xFF) << 8 ) | ((*(ptr+4) & 0xFF));
	return (BYTE*)p;
}
And ....
It does not work anymore! It works fine in Steem, but on real HW it returns random errors.
I suspected that this might come from the flock variable and therefore I changed:

Code: Select all

#define	flock		*((volatile WORD*)0x43E)		/* floppy lock variable */
Back to

Code: Select all

#define	flock		*((WORD*)0x43E)		/* floppy lock variable */
but this made no change.
Therefore the conclusion seems to be that the volatile qualifier on HW register pointers breaks the code?

I turned on the debug flag, recompiled everything and looked at the assembly code generated with volatile.
Then I recompiled with volatile removed.
On this short example it seems that the code around the I/O access is the same. I will therefore recheck all the places where I have placed a volatile keyword.
Jean

Post by ijor »

Hi Jean,

Sorry for not being able to help here. I never use "C" for this kind of low level stuff. I always use 100% assembler.

Post by DrCoolZic »

ijor wrote:Sorry for not being able to help here. I never use "C" for this kind of low level stuff. I always use 100% assembler.
Hi Jorge,
No problem. I am not in a hurry, as I have a fully working version.
It is just that it does not make sense, and I do not like that. Unclear things can mask a real problem...

As I mentioned, I was able to write a first prototype of a protection analyzer program that runs on the Atari in one day, which is nice. I am now fixing problems and adding basic features, and then I should be able to start testing my many diskettes. As you know I also have a program that analyzes the output of the Discovery Cartridge, but this one is very "delicate", as it works at the level of FD flux transitions and ... it breaks easily ....

Post by ijor »

DrCoolZic wrote:It is just that it does not make sense and I do not like that. Unclear things can mask a real problem...
I understand :)

And it is precisely because of this that I don't like to use C in these cases. You don't know if the problem is in your code, in the logic, in the compiler, debugger, etc.

Post by leonard »

You all know I love hardcore assembler optimization, but I really like the simplicity and efficiency of C code. Unfortunately I have never used a 68k-based C compiler (but these seem very poor at optimizing).

Your volatile problem is very strange. I guess this has nothing to do with volatile; maybe something else (with or without optimisation)?

For example, I don't understand the two listings you posted: are they supposed to be the same routine (except for using volatile or not)? I really don't understand why the second routine uses an "AND" instruction (the first one doesn't). I'm pretty sure these routines don't come from the same C source code.
Leonard/OXYGENE.

Post by DrCoolZic »

leonard wrote:For example, I don't understand the two listings you posted: are they supposed to be the same routine (except for using volatile or not)? I really don't understand why the second routine uses an "AND" instruction (the first one doesn't). I'm pretty sure these routines don't come from the same C source code.
After your initial post I hunted for all the places where I was accessing the HW registers. To every declaration found I added the volatile qualifier. I recompiled and, as mentioned, it did not work. I could not believe it and therefore made a lot of tests (as usual I was doing other things at the same time!).
Finally I decided to replace all the volatile qualifiers with VOLATILE. Now if I define VOLATILE as volatile I get the second listing, which has an AND. If I recompile with VOLATILE defined as nothing I get the first listing, without the AND.
I promise that, apart from adding the volatile, I have not changed this code for the past two months. I know it does not make much sense ...
The C source is the following (it cannot be simpler):

Code: Select all

BYTE fdc_get_reg(WORD reg_num) {
	VOLATILE WORD* wptr;
	register BYTE value;
	
	wptr = DMA_MODE;	
	*wptr = reg_num;
	wptr = DMA_DATA;
	value = *wptr;
	return value;
}
Where:

Code: Select all

#define BYTE	unsigned char
#define WORD	short
#define LONG	long
#define VOLATILE
For the bus violation with Pure C, here is a picture of where I am getting stuck. (I did not try volatile on Pure C.)
Remember that even though I wrote a small piece of 68000 assembly code, this is still mostly Chinese to me ... I would have felt more at home if it were '86 ASM!
Lautreamont
Obsessive compulsive Atari behavior
Posts: 103
Joined: Fri Jan 27, 2006 9:11 pm
Location: Friceland

Post by Lautreamont »

Sorry but I only roughly read the thread. I only hope I won't introduce more mess.

Code: Select all

p = ((*ptr & 0xFF) << 16) | ((*(ptr+2) & 0xFF) << 8 ) | ((*(ptr+4) & 0xFF));
You tried to shift an 8-bit variable 16 bits to the left.
The compiler will usually do what you expect and use a long to do it.
But when you add "volatile", you won't get a long to do the operation.
Although it looks nice to read, it can be dangerous.

Nyh carefully avoided that kind of instruction in the demo he posted.
Use it as a reference. EDIT: Pure C (demo) coding

Post by DrCoolZic »

Lautreamont wrote:

Code: Select all

p = ((*ptr & 0xFF) << 16) | ((*(ptr+2) & 0xFF) << 8 ) | ((*(ptr+4) & 0xFF));
You tried to shift an 8-bit variable 16 bits to the left.
The compiler will usually do what you expect and use a long to do it.
But when you add "volatile", you won't get a long to do the operation.
Thanks for your feedback. Any information is always welcome, as it is easy to make mistakes!

p is declared as a long. I originally wrote the p = ... line as several lines:

Code: Select all

	register BYTE* ptr = DMA_HIGH;
	register long p = 0L;
	(BYTE)p = *ptr;
	p <<= 8;
	(BYTE)p = *(ptr + 2);
	p <<= 8;
	(BYTE)p = *(ptr + 4);
But I believe it is safe and more elegant to write it as above.
The conversion rules used here are that the arithmetic expression is converted to long because of the target, and in turn the operands of the arithmetic expression are converted to long.
What is not shown here is that BYTE is in fact declared as an unsigned char. This means that during the conversion of the byte to a long it is not sign-extended, and therefore the conversion is perfectly safe. Actually I don't remember why I added the & 0xFF.
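One caveat on the conversion rules: in standard C the type of the assignment target does not influence how the operands are evaluated. An unsigned char operand is promoted to int before the shift, so on a compiler where int is 16 bits wide a shift by 16 is not well defined. Whether that applies to the compilers used in this thread is an assumption to verify against their settings. A sketch that does not depend on the width of int, with a hypothetical function name:

```c
#include <stdint.h>

/* Assemble a 24-bit DMA address from its three byte registers.
   Each byte is widened to 32 bits BEFORE shifting, so the result does
   not depend on whether int is 16 or 32 bits wide. */
uint32_t dma_address_from_bytes(uint8_t high, uint8_t mid, uint8_t low)
{
    return ((uint32_t)high << 16) | ((uint32_t)mid << 8) | (uint32_t)low;
}
```

On a pre-C99 compiler the same idea would be written with unsigned long casts. Reading the three registers into separate variables first, one statement per read, also fixes the order in which the hardware is accessed.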

Post by DrCoolZic »

I remember why I used & 0xFF.
I was using the same pointer to access the DMA mode/counter and the addresses. The DMA mode/counter/data are accessed as WORD, and therefore the pointer was declared as:
WORD* ptr = DMA_MODE;
and therefore later on it was safer to AND the value read with 0xFF.

Post by ijor »

leonard wrote:You all know I love hardcore assembler optimization, but I really like the simplicity and efficiency of C code.
Here the issue is not about optimization, but about having full control of the code. It is just my personal preference, I agree it doesn't have to be like that. But personally, when coding low level hardware stuff, I like to be in total control of the CPU.

I also (again, a personal preference) find it much more difficult to debug "C" code that works at this level. A "C" debugger is usually not suited for this, you need a low level monitor/debugger. And while it is possible to use say, Devpac with Lattice C generated code, it is not as easy as a pure assembler development.

Post by ijor »

DrCoolZic wrote:

Code: Select all

$00167EC0 CMPA.L $00000000,A7
That line above checking for stack overflow doesn't make much sense. It is either a compiler bug, or your code is somehow overwriting that instruction??? Under User mode, it will provoke a Bus Error exception. So it seems, in addition, that somehow you called that routine from User mode.

Post by ijor »

Lautreamont wrote:

Code: Select all

p = ((*ptr & 0xFF) << 16) | ((*(ptr+2) & 0xFF) << 8 ) | ((*(ptr+4) & 0xFF));
You tried to shift an 8-bit variable 16 bits to the left.
The compiler will usually do what you expect and use a long to do it.
But when you add "volatile", you won't get a long to do the operation.
It seems safe to me in this regard. If the compiler is ANSI compliant it should promote the 8-bit value to integer before doing the shift (actually, before the AND). This holds regardless of whether ptr is volatile or not.
DrCoolZic wrote:p is declared as a long. I originally wrote the p = ... line as several lines
I think Lautreamont wasn't talking about p being 8 bits, but about ptr being a pointer to an 8-bit variable. Again, I think it doesn't matter.

What is not safe, however, is the order of the OR operations. The compiler is free to reorder the three reads. That is in theory; no compiler I've seen reorders them.

Anyway, what is wrong here is the design. You can't read the DMA address like that when DMA is running. This is not an atomic operation, and the hardware doesn't guarantee that you are not reading across two different DMA transfers.
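A common general-purpose mitigation for this kind of non-atomic multi-register read is to repeat the read until two consecutive results agree. The thread does not say whether this is sufficient for the ST DMA chip, so take it as a sketch of the technique only. It is shown here against a fake register source so that it can run on a host; FakeDma and both function names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-only stand-in: returns a scripted sequence of address values,
   repeating the last one once the script is exhausted. */
typedef struct {
    const uint32_t *values;
    size_t count;
    size_t next;
} FakeDma;

static uint32_t fake_read(FakeDma *d)
{
    uint32_t v = d->values[d->next];
    if (d->next + 1 < d->count)
        d->next++;
    return v;
}

/* Read until two consecutive reads agree. Returns 1 and stores the value
   in *out on success, 0 if no stable pair was seen within max_tries. */
int read_stable_address(FakeDma *d, int max_tries, uint32_t *out)
{
    uint32_t prev = fake_read(d);

    while (max_tries-- > 0) {
        uint32_t cur = fake_read(d);
        if (cur == prev) {
            *out = cur;
            return 1;
        }
        prev = cur;
    }
    return 0;
}
```

Since the address only changes about every 512 µs, two back-to-back reads that agree are very likely to belong to the same DMA burst, but this lowers the risk of a torn value rather than eliminating it.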