Program counter and exception

A place to discuss current and future developments for STeem
danorf
Atari maniac
Atari maniac
Posts: 84
Joined: Tue Feb 12, 2013 1:18 pm
Location: Behind a computer

Re: Program counter and exception

Post by danorf »

I've tried to make a table for the <EA> calculation routines and, for me, we still have a problem: on some nanowords it's hard to know when the exception will be raised by a data or address error.

For example, take the (d16,An) <EA> on a .b or .w instruction: the second read on the bus is done by the pair of nanowords adsw2+adrw2. If we look at the nanocode for these two nanowords:

Code: Select all

au → ab → aob, at
(dbin) → abd → alu
(pc) → db → au
-1 → alu
+4 → au

Code: Select all

edb → dbin
(rxd) → ab* → dcr
We first put the memory address to read on the address bus ("au → ab → aob, at"). Then we compute things, changing the AU value, and then we read the value from the data bus ("edb → dbin").
So the error will be raised somewhere between "au → ab → aob, at" and "edb → dbin", at bus state S4. As far as I understand things, I have no clue when S4 comes. Will the error be raised before AU = PC, before AU = AU+4, or after that?

If we look at M68000UM Figure 5-25 (Bus Error Timing Diagram), it seems that the data are already on the bus when the error is raised. So perhaps the error is raised on the "edb → dbin" nanoinstruction, but I have no real proof of that.
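To make the open question concrete, here is a minimal Python sketch that lists the value AU would hold at each point where the error could conceivably be taken during adsw2+adrw2. It assumes, purely for illustration, that the operations take effect one after another in the listed order, which the nanocode itself does not guarantee. The function name and the labels are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def adsw2_au_candidates(pc, ea):
    """Return (step, AU value) pairs for the adsw2+adrw2 sequence.

    pc: PC when adsw2 starts (PC is never written by these nanowords).
    ea: value held by AU when adsw2 starts, i.e. the effective address.
    Assumption: the operations are applied strictly in the listed order.
    """
    return [
        ("au -> ab -> aob, at", ea & MASK32),          # AU still holds the EA
        ("(pc) -> db -> au", pc & MASK32),            # AU reloaded from PC
        ("+4 -> au", (pc + 4) & MASK32),              # AU = PC + 4
        ("edb -> dbin (adrw2)", (pc + 4) & MASK32),   # read completes, AU untouched
    ]


for step, au in adsw2_au_candidates(pc=0x1000, ea=0x2001):
    print(f"{step:22s} AU = {au:#010x}")
```

An emulator that wants to expose AU after a faulting access has to pick one of these rows. Which one is right is exactly what is unknown here.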

Nevertheless, here is the table I've written:

Code: Select all

iAU = initial value of AU at this stage                                         
nAU = new value of AU at this stage                                             
??? = unknown value                                                             
an 'x' in IRC column indicates when IRC is updated                                                                          
------------------------------------------------------------------------------- 
             | bfore 1st | bfore 2nd I | bfore 3rd I | bfore 4th | end of    I  
    <ea>     | db access | db access R | db access R | db access | microinst R  
             | AU   PC   | AU   PC   C | AU   PC   C | AU   PC   | AU   PC   C  
-------------+-----------+-------------+-------------+-----------+------------- 
.B or .W :   |           |             |             |           |              
  (An)       | nop  nop  |             |             |           |              
  (An)+      |(An)+ nop  |             |             |           | PC+2 nop     
  -(An)      |-(An) iAU  |             |             |           | PC   nop     
  (d16,An)   | d16  nop  | ???  nop  x |             |           | PC+4 nop     
  (d8,An,Xn) | An+d nop  | ???  nop  x |             |           | PC+4 nop     
  (xxx).W    |(xxx) iAU  | nop  nop  x |             |           | PC+2 nop     
  (xxx).L    | nop  nop  |(xxx) AU+2 x | nop  nop    |           | PC+2 nop     
  #<data>    | nop  AU   |             |             |           | +2   nop  x  
.L :         |           |             |             |           |              
  (An)       | nop  nop  | An+2 nop    |             |           | PC+2 nop     
  (An)+      | nop  nop  |(An)+ nop    |             |           | PC+2 nop     
  -(An)      |-(An) nop  | +2   nop    |             |           | PC+2 nop     
  (d16,An)   | d16  nop  | +An  nop  x | +2   nop    |           | PC+4 nop     
  (d8,An,Xn) | An+d nop  | +Xn  nop  x | +2   nop    |           | PC+4 nop     
  (xxx).W    |(xxx) iAU  | nop  nop  x | +2   nop    |           | PC+2 nop     
  (xxx).L    | nop  nop  |(xxx) AU+2   | nop  nop  x | +2   nop  | PC+2 nop     
  #<data>    | nop  nop  | +2   nAU    |             |           | +2   nop  x
Steven Seagal
Fuji Shaped Bastard
Fuji Shaped Bastard
Posts: 2018
Joined: Sun Dec 04, 2005 9:12 am
Location: Undisclosed

Re: Program counter and exception

Post by Steven Seagal »

Well, thanks a lot for those precious data and insights!
Will need some time to digest it all now, can't reply at once. This will certainly help a lot.
Just one thing, for MOVEM, I've finally accepted that it prefetches at the end (=>there's some other problem in Steem).
In the CIA we learned that ST ruled
Steem SSE: http://sourceforge.net/projects/steemsse
Dio
Captain Atari
Captain Atari
Posts: 451
Joined: Thu Feb 28, 2008 3:51 pm

Re: Program counter and exception

Post by Dio »

Yeah, movem is astoundingly tricksy. My best guess at this point is that it sets the final PC back into PC before the moves start, although the actual fetch clearly happens at the end, judging from both prefetch testing on a real machine and danorf's tables. Why it finds it convenient to do it this way is astoundingly unclear.

Re: Program counter and exception

Post by Steven Seagal »

Dio wrote:Yeah, movem is astoundingly tricksy. My best guess at this point is that it sets the final PC back into PC before the moves start, although the actual fetch clearly happens at the end, judging from both prefetch testing on a real machine and danorf's tables. Why it finds it convenient to do it this way is astoundingly unclear.
I think the microcode confirms this (but it doesn't tell why).

Re: Program counter and exception

Post by danorf »

BTW :
Steven Seagal wrote: 8O
Certainly thought they did more, like indexing (D16+An etc.)
You're right on this, I totally forgot this point (as it occurs only a few times in the nanocode).
(d16,An) and (d8,An,Xn) are calculated by the AUs (or at least that's what I understand from the nanocode :mrgreen: ).
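As an illustration of the sum the AU has to build for (d8,An,Xn), here is a minimal Python sketch of the arithmetic only. It follows the documented 68000 brief extension word layout (bit 15 D/A, bits 14-12 register number, bit 11 word/long index, bits 7-0 displacement). The function names and the register-file layout (D0-D7 followed by A0-A7) are assumptions made for the example, and nothing here models nanocode timing.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def ea_d8_an_xn(an, ext_word, regs):
    """Effective address for (d8,An,Xn).

    regs: 16 register values, D0-D7 then A0-A7 (layout assumed for this sketch).
    """
    d8 = sign_extend(ext_word & 0xFF, 8)
    reg_index = (ext_word >> 12) & 0xF      # bit 15 selects A regs, bits 14-12 the number
    xn = regs[reg_index]
    if ext_word & 0x0800:                   # bit 11 set: full 32-bit index
        index = sign_extend(xn, 32)
    else:                                   # bit 11 clear: sign-extended low word
        index = sign_extend(xn, 16)
    return (an + d8 + index) & MASK32


regs = [0] * 16
regs[1] = 0x0001FFFE                        # D1
print(hex(ea_d8_an_xn(0x3000, 0x1004, regs)))  # D1.W index, d8 = 4
print(hex(ea_d8_an_xn(0x3000, 0x1804, regs)))  # D1.L index, d8 = 4
```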

Re: Program counter and exception

Post by danorf »

Steven Seagal wrote:I think the microcode confirms this (but it doesn't tell why).
Nanocode reusability, I presume?

Here is a table for MOVEM:

Code: Select all

iAU = initial value of AU at this stage                                         
nAU = new value of AU at this stage                                             
??? = unknown value                                                             
an 'x' in IRC or IRD column indicates when IRC or IRD are updated      
-----------------------------------------------------------------------------------------------------------------------
             |           |             |             |             |       in loop :       |           |
   MOVEM     | bfore 1st | bfore 2nd I | bfore 3rd I | bfore 1st I | bfre each | bfre each | bfore lst | end of    I I
   M-->R     | prefetch  | prefetch  R | prefetch  R | mem read  R | LSW read  | MSW read  | prefetch  | microinst R R
             | AU   PC   | AU   PC   C | AU   PC   C | AU   PC   C | AU   PC   | AU   PC   | AU   PC   | AU   PC   C D
-------------+-----------+-------------+-------------+-------------+-----------+-----------+-----------+---------------
 .W          |           |             |             |             |           |           |           |
  (An)       | nop  AU   |             |             |(An) iAU+2 x | +2   nop  |           | PC+2 nop  | nop  nop  x x
  (An)+      | nop  AU   |             |             |(An) iAU+2 x | +2   nop  |           | PC+2 nop  | nop  nop  x x
  (d16,An)   | nop  AU   | ???  nop  x |             | An+d +4   x | +2   nop  |           | PC+2 nop  | nop  nop  x x
  (d8,An,Xn) | nop  AU   | ???  nop  x |             | ??? ???   x | +2   nop  |           | PC+2 nop  | nop  nop  x x 
  (xxx).W    | nop  AU   | +2   nop  x |             |(xxx)iAU+2 x | +2   nop  |           | PC+2 nop  | nop  nop  x x
  (xxx).L    | nop  AU   | +2   nop  x | +2   nAU    |(xxx)iAU+2 x | +2   nop  |           | PC+2 nop  | nop  nop  x x         
 .L          |           |             |             |             |           |           |           |                                        
  (An)       | nop  AU   |             |             |(An) iAU+2 x | +2   nop  | +2   nop  | PC+2 nop  | nop  nop  x x
  (An)+      | nop  AU   |             |             |(An) iAU+2 x | +2   nop  | +2   nop  | PC+2 nop  | nop  nop  x x
  (d16,An)   | nop  AU   | ???  nop  x |             | An+d +4   x | +2   nop  | +2   nop  | PC+2 nop  | nop  nop  x x
  (d8,An,Xn) | nop  AU   | ???  nop  x |             | ??? ???   x | +2   nop  | +2   nop  | PC+2 nop  | nop  nop  x x 
  (xxx).W    | nop  AU   | +2   nop  x |             |(xxx)iAU+2 x | +2   nop  | +2   nop  | PC+2 nop  | nop  nop  x x
  (xxx).L    | nop  AU   | +2   nop  x | +2   nAU    |(xxx)iAU+2 x | +2   nop  | +2   nop  | PC+2 nop  | nop  nop  x x         
-------------+-----------+-------------+-------------+-------------+-----------------------+-----------+----------------
             |           |             |             |             |       in loop :       |           |
   MOVEM     | bfore 1st | bfore 2nd I | bfore 3rd I | before    I | aftr each | aftr each | bfore lst | end of    I I
   R-->M     | prefetch  | prefetch  R | prefetch  R | loop      R | MSW write | LSW write | prefetch  | microinst R R
             | AU   PC   | AU   PC   C | AU   PC   C | AU   PC   C | AU   PC   | AU   PC   | AU   PC   | AU   PC   C D
-------------+-----------+-------------+-------------+-------------+-----------+-----------+-----------+---------------
 .W          |           |             |             |             |           |           |           | 
  (An)       | ???   ??? |             |             |(An)  nop  x |           | +2   nop  | PC+2 nop  | nop  nop  x x
  -(An)      | ???   ??? |             |             |-(An) nop  x |           | -2   nop  | PC+2 nop  | nop  nop  x x
  (d16,An)   | nop   AU  | ???  ???  x |             | An+d nop  x |           | +2   nop  | PC+2 nop  | nop  nop  x x
  (d8,An,Xn) | nop   nop | +2  iAU+4 x |             |A+d+X nop  x |           | +2   nop  | PC+2 nop  | nop  nop  x x
  (xxx).W    | nop   AU  | ???  ???  x |             | nop  nop  x |           | +2   nop  | PC+2 nop  | nop  nop  x x
  (xxx).L    | nop   AU  | +2   nop  x | ???  ???    | nop  nop  x |           | +2   nop  | PC+2 nop  | nop  nop  x x
 .L          |           |             |             |             |           |           |           | 
  (An)       | ???   ??? |             |             |(An)  nop  x | +2   nop  | +2   nop  | PC+2 nop  | nop  nop  x x
  -(An)      | ???   ??? |             |             |-(An) nop  x | -2   nop  | -2   nop  | PC+2 nop  | nop  nop  x x
  (d16,An)   | nop   AU  | ???  ???  x |             | An+d nop  x | +2   nop  | +2   nop  | PC+2 nop  | nop  nop  x x
  (d8,An,Xn) | nop   nop | +2  iAU+4 x |             |A+d+X nop  x | +2   nop  | +2   nop  | PC+2 nop  | nop  nop  x x
  (xxx).W    | nop   AU  | ???  ???  x |             | nop  nop  x | +2   nop  | +2   nop  | PC+2 nop  | nop  nop  x x
  (xxx).L    | nop   AU  | +2   nop  x | ???  ???    | nop  nop  x | +2   nop  | +2   nop  | PC+2 nop  | nop  nop  x x
NOTES :                                             
  .switch MSW write and LSW write column header for MOVEM/R-->M/.L/-(An) as LSW 
   is written before MSW.
Note: we repeatedly run into the same problem as with the (d16,An) <EA> calculation: at some point in the nanocode execution, a memory address is put on the address bus ("au → ab → aob, at"), then things are computed (including changes to the AU and PC values) and, finally, data are read from the data bus ("edb → dbin"). In this case, and without more information, it's hard to determine the AU and PC values at the exact moment a bus or address error exception is raised. It all depends on when bus state S4 starts, and there's no clue in the nanocode for that.

Re: Program counter and exception

Post by Steven Seagal »

Yoho, so I examined one case with displacement:

Code: Select all

.W (d16, An): ADSW1 -> ADSW2 -> ADRW2

ADSW1
au->aob                 copy effective fetch pointer
(dbin)->dbo->au         use IRC (d16 -> AU)
adb->dbin,irc           read cycle, prefetch (fill IRC)
(rya)->ab->au           (An) -> AU, where it is added to d16

ADSW2
au->ab->aob,at          AU->address busses
(dbin)->abd->alu        copy IRC value on address bus data section only
                        (just in case?)
(pc)->db->au            AU=PC
-1->alu                 ?
+4->au                  AU=PC+4, effective fetch pointer

ADRW2
adb->dbin               read cycle, read word at EA
(rxd)->abo->dcr         ?
I don't understand everything, but it seems to prove that PC isn't changed during EA for this case, which confirms the table. We see how AU is updated instead.
It also illustrates what I was saying: as soon as the IRC word is used, a new prefetch fills IRC.
My interpretation of '(rya)->ab->au' is that the register value is added to the displacement already in AU, or it wouldn't make sense.
I intend to progressively do the same for other EA routines.
Edit: except if danorf beats me to it!
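For readers who prefer code to nanocode, here is a minimal Python sketch of the net effect of ADSW1 -> ADSW2 -> ADRW2 as read above. It only models the register values after each nanoword, not bus timing, and the function names are hypothetical. The sign extension of d16 is standard 68000 behaviour rather than something visible in the listing.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(word):
    """Sign-extend a 16-bit displacement to a Python int."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def ea_d16_an_word(pc, an, d16):
    """Net register effect of the .W (d16,An) routine, per the reading above."""
    # ADSW1: d16 (still in dbin) and An both go to AU, so AU = An + d16
    au = (an + sign_extend16(d16)) & MASK32
    # ADSW2: AU is sent to the address bus, then AU is rebuilt as PC + 4
    aob = au
    au = (pc + 4) & MASK32
    # ADRW2: the word at aob is read; PC is never written by the routine
    return {"aob": aob, "au": au, "pc": pc}


print(ea_d16_an_word(pc=0x1000, an=0x2000, d16=0xFFFE))
```

With these inputs the access goes to An-2 while PC stays where it was, which is the point made above.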
Last edited by Steven Seagal on Sun Mar 31, 2013 8:17 am, edited 1 time in total.

Re: Program counter and exception

Post by Steven Seagal »

danorf wrote:I've tried to make a table for the <EA> calculation routines and, for me, we still have a problem: on some nanowords it's hard to know when the exception will be raised by a data or address error.

For example, take the (d16,An) <EA> on a .b or .w instruction: the second read on the bus is done by the pair of nanowords adsw2+adrw2. If we look at the nanocode for these two nanowords:

Code: Select all

au → ab → aob, at
(dbin) → abd → alu
(pc) → db → au
-1 → alu
+4 → au

Code: Select all

edb → dbin
(rxd) → ab* → dcr
We first put the memory address to read on the address bus ("au → ab → aob, at"). Then we compute things, changing the AU value, and then we read the value from the data bus ("edb → dbin").
So the error will be raised somewhere between "au → ab → aob, at" and "edb → dbin", at bus state S4. As far as I understand things, I have no clue when S4 comes. Will the error be raised before AU = PC, before AU = AU+4, or after that?

If we look at M68000UM Figure 5-25 (Bus Error Timing Diagram), it seems that the data are already on the bus when the error is raised. So perhaps the error is raised on the "edb → dbin" nanoinstruction, but I have no real proof of that.
My guess is that the effective read and bus error happen at adrw2, before internal registers are updated.
Note that for PC itself it's all same-same.
Nevertheless, here is the table I've written:

Code: Select all

iAU = initial value of AU at this stage                                         
nAU = new value of AU at this stage                                             
??? = unknown value                                                             
an 'x' in IRC column indicates when IRC is updated                                                                          
------------------------------------------------------------------------------- 
             | bfore 1st | bfore 2nd I | bfore 3rd I | bfore 4th | end of    I  
    <ea>     | db access | db access R | db access R | db access | microinst R  
             | AU   PC   | AU   PC   C | AU   PC   C | AU   PC   | AU   PC   C  
-------------+-----------+-------------+-------------+-----------+------------- 
.B or .W :   |           |             |             |           |              
  (An)       | nop  nop  |             |             |           |              
  (An)+      |(An)+ nop  |             |             |           | PC+2 nop     
  -(An)      |-(An) iAU  |             |             |           | PC   nop     
  (d16,An)   | d16  nop  | ???  nop  x |             |           | PC+4 nop     
  (d8,An,Xn) | An+d nop  | ???  nop  x |             |           | PC+4 nop     
  (xxx).W    |(xxx) iAU  | nop  nop  x |             |           | PC+2 nop     
  (xxx).L    | nop  nop  |(xxx) AU+2 x | nop  nop    |           | PC+2 nop     
  #<data>    | nop  AU   |             |             |           | +2   nop  x  
.L :         |           |             |             |           |              
  (An)       | nop  nop  | An+2 nop    |             |           | PC+2 nop     
  (An)+      | nop  nop  |(An)+ nop    |             |           | PC+2 nop     
  -(An)      |-(An) nop  | +2   nop    |             |           | PC+2 nop     
  (d16,An)   | d16  nop  | +An  nop  x | +2   nop    |           | PC+4 nop     
  (d8,An,Xn) | An+d nop  | +Xn  nop  x | +2   nop    |           | PC+4 nop     
  (xxx).W    |(xxx) iAU  | nop  nop  x | +2   nop    |           | PC+2 nop     
  (xxx).L    | nop  nop  |(xxx) AU+2   | nop  nop  x | +2   nop  | PC+2 nop     
  #<data>    | nop  nop  | +2   nAU    |             |           | +2   nop  x
This also seems to confirm the net effect on PC at least for -(An), (d16,An).
Last edited by Steven Seagal on Sun Mar 31, 2013 8:57 am, edited 1 time in total.

Re: Program counter and exception

Post by Steven Seagal »

danorf wrote: Here is how the stack frame is written when a bus or address error occurs:
Start at microword bser1 (bser is for BuS ERror, on the address bus or the data bus) and following nanowords bser1, bser2, bser3, bser4, bser5, bser6, trap3, trap4, trap5, ... (trap is for trap :mrgreen: so address and data bus errors share nanowords with traps). In the following I only report things that concern writing the stack frame, so it's far from the complete processing of the exception.
In the code section that follows:
pcl = pc lower 16 bits
pch = pc higher 16 bits
at = temporary address register
atl = at lower 16 bits
ath = at higher 16 bits
au, at, pc, sp are 32-bit registers
alu, ftu are 16-bit registers
ird = instruction register decoder
psw = program status word
ssw = special word which monitors the status of the current instruction

Code: Select all

         at = address currently on the address bus
bser1    au = pc
         ftu = psw

bser2    alu = au = pcl
         au = sp-2

         write alu at au <--> write pcl at sp-2
bser3    alu = ftu = psw
         au = sp-6

         write alu at au <--> write psw at sp-6
bser4    alu = pch
         ftu = ird
         au = sp-4

         write alu at au <--> write pch at sp-4
bser5    alu = ftu = ird
         au = sp-8

         write alu at au <--> write ird at sp-8
         alu = atl = address on the address bus when error occurs
bser6    pc = at = address on the address bus when error occurs
         ftu = ssw
         au = sp-10

         write alu at au <--> write atl at sp-10
trap3    alu = ftu = ssw
         au = au-4 = sp-14

         write alu at au and sp=au <--> write ssw at sp-14 
trap4    sp = sp-14
         alu = pch = ath
         au = au+2 = old_sp-12 = new_sp+2

trap5    write alu at au <--> write ath at old_sp-12 (new_sp+2)
         ...
         ...

         ...
...      ...
         ...
Thanks again for the schema.
If I get the first 3 steps right, we have:
PC -> ALU
SP -> AU
write ALU at AU address
What counts is that PC and no other register/computed value is pushed on the stack. Which explains much.
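The write sequence quoted above can be turned into a small Python sketch that builds the resulting 14-byte frame. It simply replays the seven word writes in the order given (pcl, psw, pch, ird, atl, ssw, ath) at the offsets listed. The function name and the dictionary used as memory are assumptions made for the example.

```python
MASK32 = 0xFFFFFFFF


def bus_error_frame(old_sp, pc, psw, ird, fault_addr, ssw):
    """Replay the bser1..trap5 stack writes.

    Returns ({address: 16-bit word}, new_sp). Memory is modelled as a dict,
    which is a simplification chosen for this sketch.
    """
    pcl, pch = pc & 0xFFFF, (pc >> 16) & 0xFFFF
    atl, ath = fault_addr & 0xFFFF, (fault_addr >> 16) & 0xFFFF
    writes = [                    # same order as in the quoted sequence
        (old_sp - 2, pcl),        # PC low word goes first
        (old_sp - 6, psw),
        (old_sp - 4, pch),        # PC high word, coming back down the stack
        (old_sp - 8, ird),
        (old_sp - 10, atl),       # faulting address, low word
        (old_sp - 14, ssw),
        (old_sp - 12, ath),       # faulting address, high word
    ]
    frame = {addr & MASK32: word & 0xFFFF for addr, word in writes}
    return frame, (old_sp - 14) & MASK32


frame, sp = bus_error_frame(0x8000, 0x00123456, 0x2700, 0x4E75, 0x00ABCDEF, 0x0015)
for offset in range(0, 14, 2):
    print(f"SP+{offset:2d}: {frame[sp + offset]:04x}")
```

Read from the new SP upwards, the frame comes out as ssw, ath, atl, ird, psw, pch, pcl, so the 32-bit PC ends up contiguous even though it was pushed as two separate 16-bit writes.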

Re: Program counter and exception

Post by danorf »

Steven Seagal wrote:Edit: except if danorf beats me to it!
I won't beat anyone, I promise ! :mrgreen:
Steven Seagal wrote: .W (d16, An): ADSW1 -> ADSW2 -> ADRW2

ADSW1
au → aob              copy effective fetch pointer (in fact it sends the address in AU to the address bus)
(dbin) → dbe → au     use IRC (d16 -> AU) (I don't understand where you get IRC from in this one. It gets d16 from a buffer/latch (dbin) of the data bus, not from IRC)
edb → dbin, irc       read cycle, prefetch (fill IRC) (and the buffer/latch named dbin)
(rya) → ab → au       (An) -> AU, where it is added to d16 (I'm OK with this)

ADSW2
au → ab → aob, at     AU->address busses
(dbin) → abd → alu    copy IRC value on address bus data section only (same remark as before: it's not IRC but the content of the buffer/latch dbin which is copied. In this particular case the value in dbin is the same as the one in IRC)
                      (just in case?) (in fact it depends on the macroword execution which calls ADSW2...)
(pc) → db → au        AU=PC
-1 → alu              ? (as far as I understand things, the ALU is activated for every nanoword, so -1 is here to say that the ALU does nothing for this particular nanoword, but I could be wrong)
+4 → au               AU=PC+4, effective fetch pointer

ADRW2
edb → dbin            read cycle, read word at EA (and store it in dbin, not IRC :mrgreen: )
(rxd) → ab* → dcr     ? (put the data register part of the macro instruction in a decoder for further execution)
Steven Seagal wrote:I don't understand everything, but it seems to prove that PC isn't changed during EA for this case, which confirms the table. We see how AU is updated instead.
I agree.
Steven Seagal wrote:It also illustrates what I was saying: as soon as the IRC word is used, a new prefetch fills IRC.
I partially disagree, as IRC is never fetched in this example (dbin is not IRC), and I won't take this as a general rule at this stage.
Steven Seagal wrote:My interpretation of '(rya)->ab->au' is that the register value is added to the displacement already in AU, or it wouldn't make sense.
I agree.

Re: Program counter and exception

Post by danorf »

Steven Seagal wrote:My guess is that the effective read and bus error happen at adrw2, before internal registers are updated.
I had a little discussion with some old pals on this subject (and others). There's a chance that all the moves done within a nanoword are executed at the same time, without any sequencing, so the fastest will end first and the longest will end last. In this case, we can say that the internal registers should be updated before the error occurs on S4.
Steven Seagal wrote:Note that for PC itself it's all same-same.
Yes, but not for AU, and in some instructions (such as MOVEM) it's a real pain.
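One way an emulator could model "all the moves of a nanoword happen at once" is to compute every destination from the state as it was when the nanoword started, then commit everything together. The Python sketch below does that for ADSW2. It is only a model of the hypothesis, not a claim about the real circuit, and `run_nanoword` and the transfer list are hypothetical names.

```python
MASK32 = 0xFFFFFFFF


def run_nanoword(state, transfers):
    """Apply every transfer of one nanoword simultaneously.

    transfers: list of (destination, function of the OLD state).
    All sources are read from a snapshot taken before any write.
    """
    snapshot = dict(state)
    new_state = dict(state)
    for dest, compute in transfers:
        new_state[dest] = compute(snapshot) & MASK32
    return new_state


# ADSW2 under this model: "au -> ab -> aob, at" plus "(pc) -> db -> au, +4 -> au"
ADSW2 = [
    ("aob", lambda s: s["au"]),        # old AU goes to the address bus
    ("at", lambda s: s["au"]),         # and to the temporary address register
    ("au", lambda s: s["pc"] + 4),     # AU is rebuilt from PC in the same step
]

before = {"au": 0x2000, "pc": 0x1000, "aob": 0, "at": 0}
after = run_nanoword(before, ADSW2)
print({name: hex(value) for name, value in after.items()})
```

Under this model AU already equals PC+4 by the time the read of ADRW2 can fail, which matches the conclusion that the internal registers are updated before the error on S4.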

Re: Program counter and exception

Post by danorf »

Steven Seagal wrote:Thanks again for the schema.
If I get the first 3 steps right, we have:
PC -> ALU
SP -> AU
write ALU at AU address
What counts is that PC and no other register/computed value is pushed on the stack. Which explains much.
Basically, yes.
But as PC and AU are 32-bit "registers" and the ALU is only 16-bit, it only writes the PC LSW in step 3.

Re: Program counter and exception

Post by Steven Seagal »

Steven Seagal wrote: .W (d16, An): ADSW1 -> ADSW2 -> ADRW2

ADSW1
au → aob              copy effective fetch pointer (in fact it sends the address in AU to the address bus)
(dbin) → dbe → au     use IRC (d16 -> AU) (I don't understand where you get IRC from in this one. It gets d16 from a buffer/latch (dbin) of the data bus, not from IRC)
edb → dbin, irc       read cycle, prefetch (fill IRC) (and the buffer/latch named dbin)
(rya) → ab → au       (An) -> AU, where it is added to d16 (I'm OK with this)

ADSW2
au → ab → aob, at     AU->address busses
(dbin) → abd → alu    copy IRC value on address bus data section only (same remark as before: it's not IRC but the content of the buffer/latch dbin which is copied. In this particular case the value in dbin is the same as the one in IRC)
                      (just in case?) (in fact it depends on the macroword execution which calls ADSW2...)
(pc) → db → au        AU=PC
-1 → alu              ? (as far as I understand things, the ALU is activated for every nanoword, so -1 is here to say that the ALU does nothing for this particular nanoword, but I could be wrong)
+4 → au               AU=PC+4, effective fetch pointer

ADRW2
edb → dbin            read cycle, read word at EA (and store it in dbin, not IRC :mrgreen: )
(rxd) → ab* → dcr     ? (put the data register part of the macro instruction in a decoder for further execution)
About your corrections, the problem is that the patent is kind of blurry and hard to read.
About IRC, it's in fact the value that currently is in IRC (d16), still available on dbin. It's a way to say "the currently prefetched word".
Steven Seagal wrote:It also illustrates what I was saying: as soon as the IRC word is used, a new prefetch fills IRC.
I partially disagree, as IRC is never fetched in this example (dbin is not IRC), and I won't take this as a general rule at this stage.
You're certainly right that it's just one case, but prefetched values do go to IRC.

Re: Program counter and exception

Post by Steven Seagal »

danorf wrote:
Steven Seagal wrote:My guess is that the effective read and bus error happen at adrw2, before internal registers are updated.
I had a little discussion with some old pals on this subject (and others). There's a chance that all the moves done within a nanoword are executed at the same time, without any sequencing, so the fastest will end first and the longest will end last. In this case, we can say that the internal registers should be updated before the error occurs on S4.
Yes that's what I meant and actually wrote and later I edited it thinking it was an English mistake (I meant "before that, internal registers are updated" which I read "before that internal registers are updated")... anyway.
Steven Seagal wrote:Note that for PC itself it's all same-same.
Yes, but not for AU, and in some instructions (such as MOVEM) it's a real pain.
I'll keep MOVEM for later, it's a more difficult case.

Re: Program counter and exception

Post by Steven Seagal »

danorf wrote: But as PC and AU are 32-bit "registers" and the ALU is only 16-bit, it only writes the PC LSW in step 3.
Oh yeah, I get it now! It pushes PC in two parts, coming back down the stack after having pushed the status word. That's step 4.

Re: Program counter and exception

Post by danorf »

Steven Seagal wrote:About your corrections, the problem is that the patent is kind of blurry and hard to read.
About IRC, it's in fact the value that currently is in IRC (d16), still available on dbin. It's a way to say "the currently prefetched word".
I know! I really have to share my transcribed version of this patent... a little work remains to be done on it, and then you'll be able to take your glasses off!

Re: Program counter and exception

Post by Steven Seagal »

Today I completed the partial analysis of <EA> routines with the focus only on PC value. It confirms the little table above and explains the behaviour seen in some cases and in Dio's test tables.
The key to understanding this is that the microcode uses AU more than PC (which of course scuttles my first theory!)

Code: Select all

-------------------------------------------------------------------------------
010	R            (An)            

.B,.W (An)  ADRW1->ADRW2

ADRW1
(dbin)->ab*->alu
(rya)->*->aob,at
-1->alu

ADRW2
edb->dbin               read cycle, read word at bus
(rxd)->ab*->dcr

PC: no change


.L (An) ADRL1->ADRL2

ADRL1
(dbin)->ab*->alu
edb->dbin
(rya)->db->aob,au
-1->alu
+2->au

ADRL2
au->ab->aob,at
(dbin)->dbd->alub
edb->dbin
(pc)->db->au
+2->au

PC: no change

-------------------------------------------------------------------------------
011	R            (An)+  

.W (An)+: PINW1->PINW2

PINW1
(dbin)->dbd->alub
(rxd)->ab*->dcr
(rya)->db->aob,at,au
+1,+2->au

PINW2
(alub)->alu
au->ab->rya             An++
edb->dbin               read cycle, read word at bus
(pc)->db->au
-1->alu
+2->au

PC: no change


.L (An)+: PINL1->PINL2->PINL3

PINL1
(dbin)->ab*->alu
edb->dbin
(rya)->db->aob,au
-1->alu
+2->au

PINL2
au->db->aob,at,au
(dbin)->dbd->alub,alue
+2->au

PINL3
au->ab->rya
edb->dbin
(pc)->db->au
+2->au

PC: no change

-------------------------------------------------------------------------------
100	R            -(An)          

.B,.W -(An): PDCW1->PDCW2

PDCW1
au->pc                  as AU=PC+2, this increments PC as very first step
(dbin)->dbd->alub
(rxd)->ab*->dcr
(rya)->db->au           (An) -> AU
-1,-2->au               makes AU--, both for byte and word (?)

PDCW2
(alub)->alu
au->ab->aob,at,rya      AU on address bus, update An
edb->dbin               read cycle, read word at bus
(pc)->db->au            restore in one step
-1->alu
0->au

PC: +2


.L -(An): PDCL1->PDCL2->ADRL2

PDCL1
(dbin)->ab*->alu
(rya)->db->au           (An) -> AU
-1->alu
-4->au                  AU-- (long)

PDCL2
au->db->aob,au,rya      AU on address bus, update An
edb->dbin               read cycle, read high word at bus
+2->au                  position AU on low word

ADRL2
au->ab->aob,at          AU on address bus
(dbin)->dbd->alub       copy word currently on dbin
edb->dbin               read cycle, read low word at bus
(pc)->db->au            restore AU...
+2->au                  in 2 steps

PC: no change
-------------------------------------------------------------------------------
101	R            (d16, An)      
111	010          (d16, PC)    


.B,.W (d16, An): ADSW1->ADSW2->ADRW2

ADSW1
au->aob                 copy effective fetch pointer
(dbin)->dbe->au         use 'IRC' value still in dbin (d16 -> AU)
edb->dbin,irc           read cycle, prefetch (fill IRC)
(rya)->ab->au           (An) -> AU, where it is added to d16

ADSW2
au->ab->aob,at          AU->address busses
(dbin)->abd->alu        copy 'IRC' value 
(pc)->db->au            AU=PC
-1->alu                 
+4->au                  AU=PC+4, effective fetch pointer

ADRW2
edb->dbin               read cycle, read word at EA
(rxd)->ab*->dcr         

PC: no change


.L (d16,An): ADSL1->ADSL2->ADSL3

ADSL1
au->aob
(dbin)->dbe->au
edb->dbin,irc
(rya)->ab->au

ADSL2
au->db->aob,au
(dbin)->ab*->alu
edb->dbin
-1->alu
+2->au

ADSL3
au->ab->aob,at
(dbin)->dbd->alub
edb->dbin
(pc)->db->au
+4->au

PC: no change
   
-------------------------------------------------------------------------------
110	R            (d8, An, Xn)    
111	011          (d8, PC, Xn)   


.B,.W (d8, An, Xn): AIXW0->AIXW1->AIXW2->ADSW2->ADRW2   (irc[11]=0)
                                  AIXW4                 (irc[11]=1)

AIXW0
(dbin)->ab*->alu        copy prefetched displacement D8
0->alu

AIXW1
alu->*e->au
au->aob
(rya)->*->au            add An to displacement

AIXW2
au->*->au
edb->dbin,irc           read cycle, prefetch
(rxl)->*e->au           add index to displacement and (An)
reset pren

or

AIXW4
au->*->au
edb->dbin,irc           read cycle, prefetch
(rx)->*->au             add index to displacement and (An)
reset pren

ADSW2
au->ab->aob,at          AU->address busses
(dbin)->abd->alu        copy 'IRC' value 
(pc)->db->au            AU=PC
-1->alu                 
+4->au                  AU=PC+4, effective fetch pointer

ADRW2
edb->dbin               read cycle, read word at EA
(rxd)->ab*->dcr         

PC: no change


.L (d8, An, Xn): AIXL0->AIXL1->AIXL2->ADSL2  (irc[11]=0)
                               AIXL3         (irc[11]=1)

AIXL0
(dbin)->ab*->alu        D8
0->alu

AIXL1
alu->*e->au
au->aob
(rya)->*->au            D8+An

AIXL2
au->*->au
edb->dbin,irc
(rxl)->*e->au
reset pren

or

AIXL3  
au->*->au
edb->dbin,irc
(rx)->*e->au
reset pren

ADSL2
au->db->aob,au
(dbin)->ab*->alu
edb->dbin
-1->alu
+2->au

PC: no change

-------------------------------------------------------------------------------
111	000          (xxx).W         

.B,.W (XXX).W: ABWW1->ABLW3

ABWW1
au->aob,pc              PC+2
(dbin)->dbe->at,au
edb->dbin,irc           read cycle, prefetch
(rxd)->ab*->dcr
0->au

ABLW3
(at)->ab->aob
(dbin)->abd->alu
edb->dbin               read cycle, read word at EA
(pc)->db->au
-1->alu
+2->au

PC: +2


.L (XXX).W: ABLW1->ABLW2->ABLW3

ABLW1
au->db->aob,au
(dbin)->ab->ath
edb->dbin
+2->au

ABLW2
(ath)->dbh->au
au->aob,pc              PC+4
(dbin)->dbl->atl,au
edb->dbin,irc           read cycle, prefetch
(rxd)->ab*->dcr
0->au

ABLW3
(at)->ab->aob
(dbin)->abd->alu
edb->dbin               read cycle, read word at EA
(pc)->db->au
-1->alu
+2->au

PC: +4

-------------------------------------------------------------------------------
111	001          (xxx).L   


.W (XXX).L: ABWL1->ABLL3

ABWL1
au->aob,pc              PC+2
(dbin)->dbe->at,au
edb->dbin,irc
(rxd)->ab*->dcr
0->au

ABLL3
au->db->aob,au
(dbin)->ab*->alu
edb->dbin               read cycle
-1->alu
+2->au

PC: +2


.L (XXX).L: ABLL1->ABLL2->ABLL3

ABLL1
au->db->aob,au
(dbin)->ab->ath
edb->dbin               read cycle, prefetch without IRC
+2->au

ABLL2
(ath)->dbh->au
au->aob,pc              PC+4
(dbin)->dbl->atl,au
edb->dbin,irc           read cycle, prefetch
(rxd)->ab*->dcr
0->au

ABLL3
au->db->aob,au
(dbin)->ab*->alu
edb->dbin               read cycle
-1->alu
+2->au

PC: +4

-------------------------------------------------------------------------------
111	100          #<data>      


.W #: E#W1 (or O#W1?)

E#W1
au->db->aob,au,pc       PC+2
(dbin)->ab->rydl,ath
edb->dbin,irc           read cycle, prefetch
+2->au

PC: +2


.L #: O#L1->O#W1

O#L1
au->db->aob,au
(dbin)->ab->rxh
edb->dbin               read cycle
+2->au

O#W1
au->db->aob,au,pc       PC+4
(dbin)->ab->rxl
edb->dbin,irc           read cycle, prefetch
+2->au

PC: +4

-------------------------------------------------------------------------------
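The (d16, An) and (d8, An, Xn) sequences above boil down to the following address arithmetic. This is only a minimal C++ sketch of the result, not of the nanocode steps: the `Regs` struct and the function names are hypothetical, and the only thing taken from the listing is that irc[11] selects between the sign-extended low word of the index register (AIXW2/AIXL2) and the full register (AIXW4/AIXL3).

```cpp
#include <cstdint>

using u8  = std::uint8_t;
using u16 = std::uint16_t;
using u32 = std::uint32_t;

// Hypothetical register file, for illustration only (not Steem's layout).
struct Regs {
    u32 d[8];  // D0-D7
    u32 a[8];  // A0-A7
};

// Sign-extend a 16-bit or 8-bit value to 32 bits.
inline u32 sext16(u16 v) {
    return static_cast<u32>(static_cast<std::int32_t>(static_cast<std::int16_t>(v)));
}
inline u32 sext8(u8 v) {
    return static_cast<u32>(static_cast<std::int32_t>(static_cast<std::int8_t>(v)));
}

// (d16,An): An plus the sign-extended displacement word held in IRC.
inline u32 ea_d16_an(const Regs& r, int an, u16 irc) {
    return r.a[an] + sext16(irc);
}

// (d8,An,Xn): IRC holds the brief extension word.
//   bit 15     index register is An (1) or Dn (0)
//   bits 14-12 index register number
//   bit 11     0 = sign-extended low word of Xn, 1 = full 32-bit Xn
//   bits 7-0   signed 8-bit displacement
inline u32 ea_d8_an_xn(const Regs& r, int an, u16 irc) {
    const int xn = (irc >> 12) & 7;
    u32 index = (irc & 0x8000) ? r.a[xn] : r.d[xn];
    if ((irc & 0x0800) == 0) {
        index = sext16(static_cast<u16>(index & 0xFFFF));
    }
    return r.a[an] + sext8(static_cast<u8>(irc & 0xFF)) + index;
}
```

For example, with A0=$1000 and D1=$0001FFFF, the extension word $1004 (D1.W, d8=4) gives $1003, while $1804 (D1.L, d8=4) gives $21003.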
In the CIA we learned that ST ruled
Steem SSE: http://sourceforge.net/projects/steemsse
Steven Seagal
Fuji Shaped Bastard
Fuji Shaped Bastard
Posts: 2018
Joined: Sun Dec 04, 2005 9:12 am
Location: Undisclosed

Re: Program counter and exception

Post by Steven Seagal »

From those <EA> microcodes you can also deduce other things. For example, contrary to expectations, for <EA>=.W (An)+ the register itself is incremented before the read cycle; it works because the old address is still on the data (?) bus.

Post by danorf »

Steven Seagal wrote:From those <EA> microcodes you can also deduce other things. For example, contrary to expectations, for <EA>=.W (An)+ the register itself is incremented before the read cycle; it works because the old address is still on the data (?) bus.
Not exactly (at least in your example). As I already wrote (first in the first post of this page :mrgreen: ), reading a value from memory is a two-step operation: first the address is put on the address bus, then the result is read from the data bus.

So, in PINW1 you have:

Code: Select all

(rya) → db → aob, at, au
So the content of the address register is stored in 'au' and 'at' and put on aob (address output buffer), and is consequently sent on the address bus to the external world.

Then in PINW2:

Code: Select all

au → ab → rya
edb → dbin
The address register is incremented and the CPU reads what the external world has put on the data bus in answer to its previous request (the address sent on the address bus).

In conclusion, for <EA>=.W (An)+ the address is, as expected, put on the address bus before the register is incremented. The register is incremented in the middle or at the end of the read cycle, depending on how the nanoinstructions in a nanoword are really executed: sequentially or all at the same time and, in the latter case, on the duration of each nanoinstruction.
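The two-step read can be modelled roughly as below. This is a sketch under stated assumptions: the `Cpu` struct, the `Memory` map and the function names are made up for illustration, and the +2 on AU is assumed to happen in PINW1 (the nanocode quoted above doesn't show where). Because aob is latched in PINW1, the order of the two nanoinstructions inside PINW2 doesn't change the address that is read.

```cpp
#include <cstdint>
#include <map>

// Hypothetical model of the internal registers involved in .W (An)+.
struct Cpu {
    std::uint32_t rya = 0;   // the address register An
    std::uint32_t au = 0;    // arithmetic unit
    std::uint32_t at = 0;    // address temporary
    std::uint32_t aob = 0;   // address output buffer (drives the address bus)
    std::uint16_t dbin = 0;  // data bus input latch
};

using Memory = std::map<std::uint32_t, std::uint16_t>;

// PINW1: (rya) -> db -> aob, at, au. The bus access starts here.
inline void pinw1(Cpu& c) {
    c.aob = c.at = c.au = c.rya;
    c.au += 2;  // assumption: the word increment is applied to AU at this point
}

// PINW2: au -> ab -> rya, then edb -> dbin.
inline void pinw2(Cpu& c, const Memory& mem) {
    c.rya = c.au;                    // An now holds the incremented address
    const auto it = mem.find(c.aob); // the data answers the address latched in PINW1
    c.dbin = (it != mem.end()) ? it->second : 0;
}
```

With An=$1000, aob is $1000 after PINW1 while An is still $1000; after PINW2, An is $1002 and dbin holds the word that was at $1000.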

Until we have a firm answer to this last question (and to its corollary: when does the bus error / address error exception really occur in terms of nanoinstruction execution), it will be very difficult to guess the values of PC, AU and other internal or external registers from nanocode :wink: .
Last edited by danorf on Fri Apr 05, 2013 12:45 am, edited 5 times in total.

Post by danorf »

Steven Seagal wrote:Today I completed the partial analysis of <EA> routines with the focus only on PC value.
If you want to extend your analysis to instructions, you have to track AU and AT changes in <EA> routines, as the values put in these internal registers at this stage are reused as is later.

I haven't read all of your table yet, but to answer your question about PDCW1:
rya is decremented by 1 or 2 depending on size and register, as decoded from IRD bits 12-13 and 6-8.
--> -1 if .B and not -(sp) else -2
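The rule above fits in a one-line function. A minimal sketch in C++, where the function name and parameters are hypothetical and the decoding of size and register from IRD is left out:

```cpp
// Amount subtracted from An by PDCW1 for byte/word operands.
// A7 (sp) always moves by 2, even for .B, so the stack stays word-aligned.
inline int predecrement_step(bool is_byte, int an) {
    const int kStackPointer = 7;
    return (is_byte && an != kStackPointer) ? 1 : 2;
}
```

So move.b -(a0) steps by 1, while move.b -(sp) and any .W access step by 2.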

Post by Steven Seagal »

danorf wrote: Not exactly (at least in your example). As I already wrote (first in the first post of this page :mrgreen: ), reading a value from memory is a two-step operation: first the address is put on the address bus, then the result is read from the data bus.

So, in PINW1 you have:

Code: Select all

(rya) → db → aob, at, au
So the content of the address register is stored in 'au' and 'at' and put on aob (address output buffer), and is consequently sent on the address bus to the external world.

Then in PINW2:

Code: Select all

au → ab → rya
edb → dbin
The address register is incremented and the CPU reads what the external world has put on the data bus in answer to its previous request (the address sent on the address bus).
Alright, I hadn't got it yet. The bus access attempt starts as soon as you put something on aob.
Until now (and in my 'read cycle' comments), I thought it happened with edb->...
In fact, edb-> means that you read data that has been fetched previously!
I will update my tables, thanks for the clarification.

In conclusion, for <EA>=.W (An)+ the address is, as expected, put on the address bus before the register is incremented. The register is incremented in the middle or at the end of the read cycle, depending on how the nanoinstructions in a nanoword are really executed: sequentially or all at the same time and, in the latter case, on the duration of each nanoinstruction.

Until we have a firm answer to this last question (and to its corollary: when does the bus error / address error exception really occur in terms of nanoinstruction execution), it will be very difficult to guess the values of PC, AU and other internal or external registers from nanocode :wink: .
Yeah forget my last "finding".
It can be tested on a ST though.

Post by Steven Seagal »

danorf wrote:
Steven Seagal wrote:Today I completed the partial analysis of <EA> routines with the focus only on PC value.
If you want to extend your analysis to instructions, you have to track AU and AT changes in <EA> routines, as the values put in these internal registers at this stage are reused as is later.
But just for PC, it's less complicated. I think the goal of this thread will be reached for the most part.
The general rule, besides <EA>, is: before any write cycle, PC already has its definitive value (2 beyond the next instruction).
Steem beta works with this theory; of course there are only a few known cases to cover.
I haven't read all of your table yet, but to answer your question about PDCW1:
rya is decremented by 1 or 2 depending on size and register, as decoded from IRD bits 12-13 and 6-8.
--> -1 if .B and not -(sp) else -2
Thx!

Post by Steven Seagal »

As an update, here's the current version:



Program counter and exception
===============================

Scope
=======

This text doesn't describe the full exception process on the M68000, nor even
how the full exception stack frame is built.
For those aspects you need to check other documentation (Motorola MC68000
User Manual). It is also useful to check documentation on prefetch.
The text focuses only on the exact value of the program counter (PC) that is
pushed on the stack when an address or bus error exception occurs.

Sources
=========

- Motorola M68000 Microprocessors User's Manual (http://www.freescale.com)
- US patent 4,325,121 assigned to Motorola (which contains microcodes)
- Discussions in Atari-Forum (http://www.atari-forum.com, 'AF' below)
- Tables and schemas by 'danorf' (AF)
- Exception PC tables by 'Dio' (AF)
- wikipedia.org and maybe other web sites
- Cases (programs)
- And of course development of Steem SSE: for this reason, Steem specifics are
discussed.

Why?
======

This aspect of ST emulation came to attention when it was realised that some
programs work or crash based on that value being correct, mainly in the context
of code protection.
For example, the value pushed on a (sometimes very) low stack would later be
executed as code, which is a rather intricate procedure.
If the value is wrong, the instruction is garbage and the program crashes.

Cases (some programs sensitive to stacked PC):
[Method to detect them: just push garbage as PC and see what crashes!]

Code: Select all

Aladin                                       move.b (a7)+,(a1)      write
Aladin                                       clr.w (a1)             write!
Aladin                                       move.w (a2),(a1)       write
Aladin                                       move.l d3,(a1)         write
Aladin                                       move.l d0,(a2)         write
Amiga Demo                                   tst.l (a1)+            read
BIG Demo                                     tst.l (a1)+            read
Blood Money                                  movem.l a0-5,(a6)      write
Darkside of the Spoon                        move.b d1,+$2476(a4)   write
European Demos, Phaleon, Transbeauce 2       move.l $0.W,$24.w      read
European Demos, Phaleon, Transbeauce 2       clr.w $70fff           read!
Punish Your Machine                          move.b d1,+$2476(a4)   write
The Teller                                   jmp $201.w             fetch
War Heli                                     move.l d5,(a1)         write
Note that the 24,960 exceptions of the BIG Demo are part of the normal loading
procedure.
New record: 876,079 for The Teller.


User Manual
=============

The M68000 User's Manual states:
"Value saved for the program counter is advanced 2–10 bytes beyond the address
of the first word of the instruction that made the reference causing the bus
error.
If the bus error occurred during the fetch of the next instruction, the saved
program counter has a value in the vicinity of the current instruction, even
if the current instruction is a branch, a jump, or a return instruction."

Motorola doesn't give any more precise information.
Based on that alone, a programmer who actually wanted to use the pushed PC
couldn't compute it: he had to be lucky or test on real hardware.

Tables
========

Dio (AF) wrote a test program that outputs tables for a lot of cases
(also available here: "Exception PC tables").
The only value of interest for us is PC (2nd value after the / on
line 2). Values are relative to the PC of the current instruction, but
normally PC is already 2 bytes beyond it (so 2 in the table means no
increment yet at the time of the crash).

Fetching
==========

The PC (program counter) points to a word (16-bit value) in memory that is
supposed to be fetched. A fetch copies the value pointed to by PC into a
register in the CPU (IRC). Those values are the program, made of
macro-instructions, that is, instructions and their operands.
Both the instructions and the operands are fetched, but only the instructions
are decoded (going from register IRC to register IRD, where decoding is done).
Because fetching often occurs in advance, it is generally called prefetch.

Increment time
================

When a word is fetched, PC is incremented (PC=PC+2, shortened as 'PC+2') to
point to the correct word.
The first question is: do we increment then fetch, or fetch then increment?
It depends on the processor:
"In most processors, PC is incremented after fetching an instruction,
and holds the memory address of (“points to”) the next instruction that
would be executed. (In a processor where the incrementation precedes the
fetch, PC points to the current instruction being executed.) "
(http://en.wikipedia.org/wiki/Program_counter)
On the M68000, when an instruction starts, PC generally points to the next word.
So apparently it would be 'fetch then increment'.
But in fact, it points to the word being currently copied in IRC (prefetch
rule: at the start of any instruction, IRC is loaded with the next instruction
or operand).
It implies that PC isn't incremented right after the word in memory
has been fetched. After fetching, PC points to the last word having been
fetched.

Separation of incrementing and fetching
=========================================

The second question is: does the fetch immediately follow PC increment, or could
there be some delay between both events?
In the first case, the CPU would use one unique routine to fetch program words.
In the second case, at least two routines would be used to accomplish those
tasks.
Both possibilities make sense. On the M68000, the second possibility is true:
increment and fetching are separate.

Microcode and Internal registers
==================================

All M68000 behaviour is commanded by microcode, a sort of mini-instructions
(actually called 'micro' and 'nano' instructions by Motorola) or routines.
This applies to 'PC+2' and fetching as well, and makes the questions above a bit
theoretical: fetching and 'PC+2' are not just separate, they're also split
into different tasks themselves. In fact, 'PC+2' can be done in different ways.
There are also other internal registers in the CPU, invisible to the programmer.
Those registers are involved in fetching (it's not just PC and IRC) for various
reasons (performance for example).
What's more, the address currently stored in the AU (double 16-bit Arithmetic
Unit) is heavily used for fetching.
This has many consequences but the one that interests us here is that the
value of PC will not necessarily point to the word that's in IRC at any time.
The value of PC at any given time (for example when some crash happens) is
perfectly determined by microcodes but doesn't follow a simple logic.

Program Counter or other register?
====================================

It is the PC register and nothing else that is copied on the stack in case
of bus/address error, as established by analysis of microcodes.

Effective Address
===================

On the M68000, generic microcode routines are used to read the effective
address in most instructions.
This implies that prefetch can't happen before or while getting the effective
address.
That there is no prefetch doesn't mean that PC is unchanged. First, of course,
PC may be updated when operands are fetched (but not necessarily so!).
Second, microcode may update PC as part of the <EA> routine, sometimes using
PC as a temporary variable. The latter is what actually happens on the M68000.
The <EA> rule we deduce from microcode is that at the end of <EA>, AU must
contain the address of the next word to fetch/prefetch. The value of PC doesn't
matter.
The <EA> microwords for BYTE and for WORD are the same.

Analysis of microcodes delivers the following table:

Code: Select all

b543	b210         Mode           .B, .W             .L

000	R            Dn              PC                PC
001	R            An              PC                PC
010	R            (An)            PC                PC
011	R            (An)+           PC                PC
100	R            –(An)           PC+2              PC
101	R            (d16, An)       PC                PC
110	R            (d8, An, Xn)    PC                PC
111	000          (xxx).W         PC+2              PC+2
111	001          (xxx).L         PC+4              PC+4
111	010          (d16, PC)       PC                PC
111	011          (d8, PC, Xn)    PC                PC
111	100          #<data>         PC+2              PC+4
This table is in agreement with the table issued by the test program (Dio).
PC here is relative to PC at the start of the instruction, that is, pointing to
the word beyond the instruction (IRC at the start of instruction).
The values may look strange to the M68000 expert, but they are explained
by the use of other registers than PC to do the actual fetches.
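The table above can be turned directly into a lookup. This is a minimal C++ sketch that only restates the table; `OpSize`, `ea_pc_offset` and the -1 return for encodings not listed are illustrative assumptions, not Steem code.

```cpp
enum class OpSize { Byte, Word, Long };

// PC offset at the end of <EA>, relative to PC at the start of the
// instruction (which already points to the word after the opcode).
// 'mode' is bits 5-3 and 'reg' is bits 2-0 of the <EA> field.
// Returns -1 for encodings that are not in the table.
inline int ea_pc_offset(unsigned mode, unsigned reg, OpSize size) {
    const bool is_long = (size == OpSize::Long);
    switch (mode & 7) {
        case 0: case 1: case 2: case 3: return 0;  // Dn, An, (An), (An)+
        case 4: return is_long ? 0 : 2;             // -(An)
        case 5: case 6: return 0;                   // (d16,An), (d8,An,Xn)
        default: break;                             // mode 7: depends on reg
    }
    switch (reg & 7) {
        case 0: return 2;                           // (xxx).W
        case 1: return 4;                           // (xxx).L
        case 2: case 3: return 0;                   // (d16,PC), (d8,PC,Xn)
        case 4: return is_long ? 4 : 2;             // #<data>
        default: return -1;                         // not in the table
    }
}
```

For instance, ea_pc_offset(7, 0, OpSize::Word) gives 2 for (xxx).W, and ea_pc_offset(5, 4, OpSize::Byte) gives 0 for d16(a4).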

Second part of Instruction
============================

It seems that in most instructions, PC is updated at the beginning of execution
of second part, and that it takes the AU value at the end of EA.
This AU value is the right 'fetch address' one. That is, if no fetches for
destination are needed, it would point to the word beyond next instruction
already (a logical value).
If fetches are needed, the value of PC doesn't change! AU is updated instead.
At the end of the instruction, the correct fetch address is in AU and will be
used by the <EA> routine of next instruction. PC will only be updated:
1) in some cases of <EA>, used as temporary variable
2) at the start of second part of instruction.
This makes real emulation easier than at first expected.
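The PC/AU hand-over described in this section can be sketched as follows. It is a toy model in C++ with hypothetical names (`FetchState` and the two functions), covering only the two rules stated above: PC takes the AU value at the start of the second part, and destination fetches move AU but leave PC alone.

```cpp
#include <cstdint>

// Hypothetical snapshot of the two registers involved.
struct FetchState {
    std::uint32_t pc;  // program counter (the value stacked on a crash)
    std::uint32_t au;  // arithmetic unit, holds the next fetch address
};

// Start of the second part of the instruction: PC takes the AU value
// left by the <EA> routine.
inline void begin_second_part(FetchState& s) {
    s.pc = s.au;
}

// Fetch of one extension word for the destination: only AU advances.
inline void fetch_destination_word(FetchState& s) {
    s.au += 2;
}
```

Starting from pc=$1002 and au=$1004, begin_second_part gives pc=$1004; one destination fetch then moves au to $1006 while pc stays at $1004.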

For example, in Darkside of the Spoon's move.b d1,+$2476(a4), the crash happens
after <EA>, before write, after fetch for destination.
Two inaccuracies compensated for each other in Steem before v3.5.1:
PC wasn't changed at the end of <EA>, but PC+2 was (logically but not
accurately!) applied when fetching for the destination.

Another problem
-----------------

If you look at the microcodes for instructions like CLR, you realise those
use the same <EA> microcodes to get the correct address on the bus.
This is a simplification in the CPU design.
That means that a crash would occur during the 'read' part, and not during
the second part of the instruction.
This isn't how Steem worked, so special care has to be taken to avoid
confusion in case of crash ('read' or 'write').
So the 'clr.w $70fff' of Phaleon, which precedes the fatal
'move.l $0.W,$24.w', is an <EA> crash, not a 'write' crash.
On the other hand, 'clr.w (a1)' of Aladin crashes because a1 contains 0.
The ST may read memory there but not write into it. So it crashes after <EA>
and PC points to next instruction.
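The two CLR cases can be told apart with a check like the one below. This is a deliberately reduced C++ sketch: the names are hypothetical, only word accesses are considered, and the only memory rule modelled is the one used in this text (the first 8 bytes can be read but not written). The rest of the ST memory map is ignored.

```cpp
#include <cstdint>

enum class Fault { None, DuringEaRead, DuringWrite };

// Where does a 'read before write' word instruction such as clr.w crash?
inline Fault classify_rbw_word_fault(std::uint32_t addr) {
    if (addr & 1) return Fault::DuringEaRead;  // odd address: the <EA> read already fails
    if (addr < 8) return Fault::DuringWrite;   // readable but not writable: fails on the write
    return Fault::None;
}
```

With this, clr.w $70fff is classified as an <EA> crash and clr.w (a1) with a1=0 as a write crash.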


Branches
==========

Bcc, BRA... use a relative displacement.

JMP and JSR don't call any <EA> microcodes, but use their own microcodes.
Since the microcodes are specific, there's no reason to think that PC would
be incremented prior to being absolutely set, as this would waste CPU
resources.
We verified this for jmp (xxx).w (The Teller) and jmp (xxx).l.


PEA
=====

From a quick look at microcodes, which are specific, PC seems to follow the
<EA> rules.


Steem SSE
===========

In Steem, to be sure to push the correct PC in case of crash:
- Create a new variable 'true PC'.
- This variable follows Steem's pc, but at <EA> time follows the rules (see
table above) instead.
- At 'get destination' stage, it is set at address of next instruction + 2,
except if it's a 'read before write' case.
If it's a 'read before write' case, we follow the <EA> rules instead, except
if it's read-only memory (first 8 bytes of RAM).
- We use a flag to detect 'read before write' cases. This flag is set by
another macro in the body of specific instruction functions (such as CLR).
- For branches, PC is unchanged before it's set.
All in all, there's some more code running in the core CPU emulation but that's
the price we pay for a more satisfying (hopefully) emulation.
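The bookkeeping listed above could look roughly like this. It is a sketch in C++ and not the actual Steem SSE code: `TruePcTracker` and all its members are hypothetical names, `ea_offset` stands for the value from the <EA> table, and the 'first 8 bytes' test is the read-only rule mentioned in the list.

```cpp
#include <cstdint>

struct TruePcTracker {
    std::uint32_t true_pc = 0;
    bool read_before_write = false;  // set by instructions such as CLR

    // At the start of an instruction, follow the emulator's pc.
    void begin_instruction(std::uint32_t pc_at_start) {
        true_pc = pc_at_start;
        read_before_write = false;
    }

    // At <EA> time, follow the <EA> table instead.
    void at_ea(std::uint32_t pc_at_start, int ea_offset) {
        true_pc = pc_at_start + static_cast<std::uint32_t>(ea_offset);
    }

    // At the 'get destination' stage: next instruction + 2, unless this is
    // a 'read before write' case outside the read-only first 8 bytes.
    void at_get_destination(std::uint32_t pc_at_start, int ea_offset,
                            std::uint32_t next_instruction,
                            std::uint32_t dest_addr) {
        const bool read_only = dest_addr < 8;
        if (read_before_write && !read_only) {
            at_ea(pc_at_start, ea_offset);
        } else {
            true_pc = next_instruction + 2;
        }
    }
};
```

On a crash, the emulator would push true_pc instead of its own pc.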

One may use tables based on opcode instead, but that will not be enough for
some cases that also depend on the memory address (Aladin).









Edit: there you'll find the current version of this theory:
http://ataristeven.t15.org/txt/Program% ... eption.txt
Last edited by Steven Seagal on Wed Apr 29, 2015 6:29 pm, edited 3 times in total.

Post by Steven Seagal »

Steven Seagal wrote:Well, thanks a lot for those precious data and insights!
Will need some time to digest it all now, can't reply at once. This will certainly help a lot.
Just one thing, for MOVEM, I've finally accepted that it prefetches at the end (=>there's some other problem in Steem).
As an update, I found the problem. It was in the same instruction: the timing of .L writes wasn't fully counted at the time of the write as it should have been; 4 cycles were counted after the write for no good reason.
Now Steem works with the prefetch at the end of MOVEM as it should.

Code: Select all

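    // One mask bit per register; each selected register is written as a long.
    short mask=1;
    for (int n=0;n<16;n++){
      if (m68k_src_w & mask){
        ad-=4; // step the destination address down by one long
#if defined(SSE_CPU_MOVEM_RM_L_TIMING2)
        // New timing: both write accesses are counted before the write.
        CPU_ABUS_ACCESS_WRITE;
        CPU_ABUS_ACCESS_WRITE;
#else
        INSTRUCTION_TIME(4);
#endif
        m68k_lpoke(ad,r[15-n]);
#if !defined(SSE_CPU_MOVEM_RM_L_TIMING2)
        // Old timing: 4 cycles were still counted after the write.
        INSTRUCTION_TIME(4);
#endif
        if (ioaccess & IOACCESS_FLAG_DO_BLIT) Blitter_Start_Now();
      }
      mask<<=1;
    }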


Post by Steven Seagal »

Updated with this part for The Teller:
Branches
==========

Bcc, BRA... use a relative displacement.

JMP and JSR don't call any <EA> microcodes, but use their own microcodes.
Since the microcodes are specific, there's no reason to think that PC would
be incremented prior to being absolutely set, as this would waste CPU
resources.
We verified this for jmp (xxx).w (The Teller) and jmp (xxx).l.
I don't know why, but I had assumed that JMP used <EA> microcodes too; it doesn't.
So it's still coherent, only more complete.