Cycle counts and phasing: tables and tester

Dio
Captain Atari
Posts: 451
Joined: Thu Feb 28, 2008 3:51 pm

Re: Cycle counts and phasing: tables and tester

Postby Dio » Mon Apr 27, 2009 11:57 am

No, I don't like the table way of doing things. It implies that you don't understand why the hardware is doing something. Better to work it out.

You're right about the PC+4; I wasn't clear on my thinking there: it's not the address of the next prefetch, but the address of the next candidate for decoding.

My code currently has a distinction between a fetch (which only replaces the 'extension' part of the queue) and a prefetch (which moves the extension slot to the decode slot and then fetches the extension). Can't remember exactly why I did this but it did make sense for the way the hardware worked. At the moment, my best guess is that the hardware also has a two-type-of-fetch system, one of which causes the 'candidate decode' PC to update to point at the address which contains the contents of the decode slot, and one of which doesn't. Very unclear exactly as to how this works in the hardware though - seems hard to believe they'd have extra registers or an offset.
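
For concreteness, the fetch/prefetch distinction described above could be sketched roughly like this. It is only a minimal illustration: the names `PrefetchQueue`, `read_word`, `queue_fetch` and `queue_prefetch` are invented here, and the `next_addr` field is an assumption about how an emulator might track the next word to read, not a claim about the real chip.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-slot prefetch queue; all names are made up for this sketch. */
typedef struct {
    uint16_t decode;     /* word that is the next candidate for decoding */
    uint16_t extension;  /* word that follows it in the instruction stream */
    uint32_t next_addr;  /* address of the next word to read (assumption) */
} PrefetchQueue;

/* Read a big-endian 16-bit word from a byte buffer standing in for memory. */
uint16_t read_word(const uint8_t *mem, uint32_t addr)
{
    assert((addr & 1u) == 0);  /* 68000 word accesses must be word-aligned */
    return (uint16_t)((mem[addr] << 8) | mem[addr + 1]);
}

/* "fetch": only the extension slot is replaced. */
void queue_fetch(PrefetchQueue *q, const uint8_t *mem)
{
    q->extension = read_word(mem, q->next_addr);
    q->next_addr += 2;
}

/* "prefetch": the extension slot moves to the decode slot,
   then the extension slot is refilled. */
void queue_prefetch(PrefetchQueue *q, const uint8_t *mem)
{
    q->decode = q->extension;
    queue_fetch(q, mem);
}
```

Starting from an empty queue, two calls to `queue_prefetch` leave the first word in the decode slot and the second in the extension slot; a following `queue_fetch` then replaces only the extension slot.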

ijor
Hardware Guru
Posts: 3854
Joined: Sat May 29, 2004 7:52 pm

Postby ijor » Mon Apr 27, 2009 9:12 pm

Dio wrote: No, I don't like the table way of doing things. It implies that you don't understand why the hardware is doing something. Better to work it out.


I don't think that tables are (necessarily) a consequence of not understanding the hardware. Emulators use tables, most of the time, because they are the most efficient way to emulate the hardware's parallel processing. In a way, you could say that tables are the software equivalent of hardware parallel processing.
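
As a rough illustration of what a table-driven approach looks like (not taken from any particular emulator), here is a minimal sketch. The enum and function names are invented for the example, and the cycle figures are the byte/word effective-address times as I recall them from Motorola's published 68000 timing tables, so check them against the manual before relying on them.

```c
#include <assert.h>

/* Hypothetical addressing-mode index, invented for this sketch. */
typedef enum {
    EA_DN,       /* Dn         */
    EA_AN,       /* An         */
    EA_IND,      /* (An)       */
    EA_POSTINC,  /* (An)+      */
    EA_PREDEC,   /* -(An)      */
    EA_DISP16,   /* d16(An)    */
    EA_INDEX,    /* d8(An,Xn)  */
    EA_ABS_W,    /* xxx.W      */
    EA_ABS_L,    /* xxx.L      */
    EA_COUNT
} EaMode;

/* Extra cycles to calculate and read a byte/word source operand.
   Values recalled from the Motorola timing tables (assumption: verify). */
static const int ea_cycles_word[EA_COUNT] = {
    0, 0, 4, 4, 6, 8, 10, 8, 12
};

/* Total cycles for MOVE.W <ea>,Dn: 4 base cycles plus the source EA time. */
int move_w_to_dn_cycles(EaMode src)
{
    assert((int)src >= 0 && (int)src < EA_COUNT);
    return 4 + ea_cycles_word[src];
}
```

With these figures, `move_w_to_dn_cycles(EA_IND)` gives 8 and `move_w_to_dn_cycles(EA_PREDEC)` gives 10. The pre-decrement step overlaps part of the bus activity, and the table simply records the net result.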

Of course, there are cases where tables are not the most efficient way. But sometimes even knowing every single detail about the hardware won't help. Sometimes hardware does things "just because". Sometimes it is because of bugs in the silicon; sometimes it is because you are trying to emulate a very minor detail that the hardware designers didn't care about at all. Consider, e.g., the "undocumented/illegal" opcodes of the 6502. Some of them have really weird behaviour, and if you asked the 6502 designers, they would say that they had no idea at the time and didn't even consider what would happen if somebody used them.

Anyway, this is going off-topic. And if you don't want to use tables, it is, of course, completely up to you.

Dio wrote: My code currently has a distinction between a fetch (which only replaces the 'extension' part of the queue) and a prefetch (which moves the extension slot to the decode slot and then fetches the extension). Can't remember exactly why I did this but it did make sense for the way the hardware worked. At the moment, my best guess is that the hardware also has a two-type-of-fetch system, ...


This doesn't make much sense to me. Consider the two following code sequences:

Code:

    nop
    move    (A0),D0
    nop

    nop
    move    128(A0),D0


Both sequences have three words of code. In both cases, when the first NOP is executed, the last word of the code sequence is being pre-fetched. In one case it is an opcode, in the other it is an extension word. At this point, the next (to the first NOP) opcode hasn't been decoded yet (it is being decoded). The CPU has no way to know at this point if it is fetching an opcode or an extension word. It must fetch, in both cases, exactly the same way.

The only case where the CPU might perform a different kind of fetch is when fetching a second extension word (the third word of an instruction), as in, say, MOVE ABS.L,D0, because here the CPU does know that it is not fetching an opcode.

Dio wrote: seems hard to believe they'd have extra registers or an offset.


Of course it has extra internal registers that are not visible in the programmer's model; (almost) every CPU has them. And that is part of the problem. Not sure what you mean by an offset.

Anyway, none of this is relevant to the two cases I mentioned in the previous message, one using indirect mode and the other pre-decrement. There are no extension words in either of those two cases.

Dio
Captain Atari
Posts: 451
Joined: Thu Feb 28, 2008 3:51 pm

Postby Dio » Tue Apr 28, 2009 9:03 am

To be clear, the tables I don't like are the ones where the whole thing is just measured. Certainly they have many useful places in implementations.

What I meant by hidden registers or offsets is that the PC stacked on bus / address error has to come from somewhere. The obvious presumption is that it reflects the actual value the 68000's PC register holds internally at the point of the BE. But then the current tester results seem to imply that there must be some offset somewhere to the next location to be fetched for prefetch.

One alternative, which at the moment I think is most likely, is that there are two PC registers inside the thing, one that's watching the prefetch and one that's the programmers' model view. It's possible that the documentation for bus error implies this: it states that on a control flow instruction like jmp or jsr, the PC stacked will be in the region of the faulting instruction, not the target. I'd guess this isn't actually a real hard-coded register, but some address generation temp register being used for some types of prefetch (but not for others).

Time to take a gander at those patents, perhaps...

ijor
Hardware Guru
Posts: 3854
Joined: Sat May 29, 2004 7:52 pm

Postby ijor » Tue Apr 28, 2009 7:33 pm

Dio wrote: What I meant by hidden registers or offsets is that the PC stacked on bus / address error has to come from somewhere. The obvious presumption is that it reflects the actual value the 68000's PC register holds internally at the point of the BE. But then the current tester results seem to imply that there must be some offset somewhere to the next location to be fetched for prefetch.


It does come from the PC register. What doesn't come from the PC, not directly, is the address being fetched. See below.

Dio wrote: One alternative, which at the moment I think is most likely, is that there are two PC registers inside the thing, one that's watching the prefetch and one that's the programmers' model view.


Having two PCs wouldn't make any sense.

What it has, again, is several internal general purpose registers. And the address to be fetched comes from one of those registers. The problem is that the relation between the PC and those registers is not fully constant and consistent. It depends on each specific instruction when and how one is updated from the other.

Let's assume that you wanted to implement something like "MOVE d16(A0),D0" in software. There are multiple ways you could do it:

Code:

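// Case 1: the PC itself is used and updated for every fetch
    fetch(pc);
    pc += 2;

    readOperand();      // pc has already advanced by 2

    fetch(pc);
    pc += 2;

// Case 2: fetch through a temp register, copying back to the PC after each fetch
    unsigned temp;

    temp = pc;
    fetch(temp);
    temp += 2;
    pc = temp;

    readOperand();      // pc has already advanced by 2

    temp = pc;
    fetch(temp);
    temp += 2;
    pc = temp;

// Case 3: fetch through a temp register, updating the PC only at the end
    unsigned temp;

    temp = pc;
    fetch(temp);
    temp += 2;

    readOperand();      // pc still holds its initial value

    fetch(temp);
    temp += 2;
    pc = temp;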


They all (assuming I didn't make a mistake) achieve the same basic goal. But the PC when the operand is read is different.
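
The three cases above can be turned into a small compilable sketch for anyone who wants to see the difference concretely. The names (`Trace`, `fetch_word`, `read_operand`, `run_case1` and so on) are invented for illustration and say nothing about how the real 68000 microcode is organised; the sketch only records what the PC variable holds at the moment of the operand read.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical trace record, for illustration only.
   It is not a model of the real 68000 internals. */
typedef struct {
    unsigned pc;             /* programmer-visible PC */
    unsigned pc_at_operand;  /* PC value sampled when the operand is read */
    int      fetches;        /* number of prefetch bus cycles performed */
} Trace;

/* Stand-in for a prefetch bus cycle: checks alignment and counts. */
void fetch_word(Trace *t, unsigned addr)
{
    assert((addr & 1u) == 0);  /* 68000 word accesses must be even */
    t->fetches++;
}

/* Stand-in for the operand read: record what the PC holds right now. */
void read_operand(Trace *t)
{
    t->pc_at_operand = t->pc;
}

/* Case 1: the PC itself is used and updated for every fetch. */
Trace run_case1(unsigned start)
{
    Trace t = {0};
    t.pc = start;
    fetch_word(&t, t.pc);
    t.pc += 2;
    read_operand(&t);
    fetch_word(&t, t.pc);
    t.pc += 2;
    return t;
}

/* Case 2: fetch through a temp, copying back to the PC after each fetch. */
Trace run_case2(unsigned start)
{
    Trace t = {0};
    unsigned temp;
    t.pc = start;
    temp = t.pc;
    fetch_word(&t, temp);
    temp += 2;
    t.pc = temp;
    read_operand(&t);
    temp = t.pc;
    fetch_word(&t, temp);
    temp += 2;
    t.pc = temp;
    return t;
}

/* Case 3: fetch through a temp, updating the PC only at the very end. */
Trace run_case3(unsigned start)
{
    Trace t = {0};
    unsigned temp;
    t.pc = start;
    temp = t.pc;
    fetch_word(&t, temp);
    temp += 2;
    read_operand(&t);
    fetch_word(&t, temp);
    temp += 2;
    t.pc = temp;
    return t;
}

/* Print one result line for a finished trace. */
void print_trace(const char *name, Trace t)
{
    printf("%s: pc at operand read = %#x, final pc = %#x\n",
           name, t.pc_at_operand, t.pc);
}
```

Calling each function with a start address of 0x1000 ends with the same final `pc` (0x1004) and the same number of fetches, but `pc_at_operand` is 0x1002 for the first two cases and 0x1000 for the third.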

You may say that the last two are weird and not realistic, and that you would only use the first case (or something similar). But this is not software, it is hardware. Hardware doesn't have the flexibility (in this sense) of software. Not every register in hardware has a counter or adder capability. And you can't always transfer freely among registers; you need some kind of path for register transfers. Sometimes there is no direct path between two registers. Sometimes the path is a bus that is shared for multiple purposes, and it must be "free" if you want to use it.

So what happens is that the CPU, depending on the specific instruction, updates the PC from the internal registers before, or after (or in the middle) of a prefetch bus cycle.

Dio wrote: I'd guess this isn't actually a real hard-coded register, but some address generation temp register being used for some types of prefetch (but not for others). Time to take a gander at those patents, perhaps...


As I said, the patents are worth reading. But you don't need to read them for the basics of this. There are several books, papers, notes, and even Wiki articles that cover some of these aspects.

Anyway, you keep insisting that this depends on some "type of prefetch", which I already showed is wrong. Both indirect and pre-decrement have exactly the same "type of prefetch", yet the PC in the exception frame is different.

You developed a very useful tool. Use your tool and use its results. You seem determined to deny the results that you get from your own tool.

Dio
Captain Atari
Posts: 451
Joined: Thu Feb 28, 2008 3:51 pm

Postby Dio » Tue Apr 28, 2009 8:05 pm

That's an excellent simplifying insight.

In the cases where the PC seems 'correct' from the obvious simple implementation (abs W and L) the ALU (and so probably the register write path) is idle during the fetch, while in the cases where it looks odd (indexed and displacement) the ALU is probably in use at that point to do the address generation. Which seems to fit my deductions so far with the 'two types of fetch' - one where the PC is updated afterwards and one where it's updated before, if not my wild speculation about the implementation. My implementation certainly needs a bit of work mind, but I hadn't actually fiddled with anything yet so I just have the 'obvious' behaviour so far.

But anyway, I haven't accepted or denied anything yet. I just haven't had time to do the analysis properly, and I'm not closing my mind to options until I do so. I'll be surprised if there's not something simple and obvious underlying it once it gets worked out. I always seem to look at the complicated ways first (I had a wonderfully mad method worked out for the HBL timing - involving signals being sampled at 10-cycle intervals - before I realised that it is probably down to wait states on the iack).

laurieboshell
Atarian
Posts: 3
Joined: Wed Jan 01, 2014 12:10 pm
Location: Blue Mountains NSW

Postby laurieboshell » Wed Jan 01, 2014 12:57 pm

I'm new to this forum, but I see a reference to a 68K instruction set tester for emulators, similar to ZEXALL for Z80.
I'm working with a 68K emulator in Delphi/Pascal (extracted from the DSP project by Leniad at https://code.google.com/p/dsp-emulator/ ) and I would like to validate the emulated 68k code.
Can anyone help with such an instruction set tester?
Laurie

Dio
Captain Atari
Posts: 451
Joined: Thu Feb 28, 2008 3:51 pm

Postby Dio » Thu Jan 02, 2014 10:14 am

I do have one but I'm not certain what state it's in at the moment (and it's not quite as complete as zexall). It's also very ST-only, but I could give you the source.

Send me a PM please.

laurieboshell
Atarian
Posts: 3
Joined: Wed Jan 01, 2014 12:10 pm
Location: Blue Mountains NSW

Postby laurieboshell » Tue Jan 07, 2014 1:49 am

Thanks Dio but as I'm new to this forum, I get this error when attempting to send a PM:
"We are sorry, but you are not authorised to use this feature. You may have just registered here and may need to participate more to be able to use this feature."

I would appreciate the instruction tester. My 68K exposure goes back a few years, mainly with Sage II and IV machines (later renamed Stride). I mainly used the UCSD-Pascal Assembler for 68K stuff. A neighbour of mine, Eric Lindsay, had a lot to do with Atari ST machines as well as the Australian Applix 1616 machine, and a bit rubbed off onto me!

I presume the instruction tester is written in assembly code. I guess all I would need to do is implement traps for screen output and keyboard input. My toolchain would be Windows-based. This is how I used ZEXALL to test a Z80 emulator without using the CP/M console stuff.
As I cannot send a PM just yet, can you ping me and I'll then contact you. My username is laurieboshell, the domain is gmail ending in dot com.
Regards,
Laurie

Dal
Administrator
Posts: 4178
Joined: Tue Jan 18, 2011 12:31 am
Location: Cheltenham, UK

Postby Dal » Tue Jan 07, 2014 11:08 pm

Laurie: you should now be able to PM.

