ao486 core: constrains and timing issues

https://github.com/MiSTer-devel/Main_MiSTer/wiki

ijor
Hardware Guru
Posts: 3148
Joined: Sat May 29, 2004 7:52 pm

Postby ijor » Wed Nov 15, 2017 4:18 am

The ao486 is a big core. Compiling the full version (as it is configured now) takes a lot of time. Most of the time, by far, is spent by the router. The router is the compiler module that wires the logic connections inside the FPGA. In some cases it can even fail to route the design.

Looking at the messages and performing some timing analysis reveals that the router is trying to help meet timing. The compiler finds many paths with (apparent) hold timing violations and is trying to introduce routing delays on purpose to achieve timing. Although it might seem counterproductive to add delays to meet timing, sometimes it is required (I'll elaborate below). To add a significant delay to a connection, the router might need to wire a signal from one corner of the device to the opposite one, back and forth. For a couple of signals that wouldn't be a problem. But if there are many, this produces serious routing congestion. The router will spend a lot of time trying to resolve the congestion, and it will not always succeed.

Hold time violations happen when a signal reaches the target too fast. This might happen for different reasons. In this case it seems they are mostly on transfers between unrelated clocks: signals that go from a register on one clock to a register on a different, unrelated clock. These transfers usually require synchronization or specific CDC (clock domain crossing) techniques. In any case, they usually should not be included in the timing analysis. But the compiler doesn't know that unless it is specifically told, usually with timing constraints.

Cutting those clock transfers from the timing analysis solves the router problem. It compiles much faster and will probably succeed every time. This doesn't necessarily mean that the actual timing problem is solved. It would need a deeper and more detailed analysis of the design to check whether those clock domain transfers are properly handled.

But there is still another problem when trying to meet timing. Constraining the clock transfers is not enough. I will describe the problem, mostly unrelated to the routing issue, in a separate message ...

Postby ijor » Wed Nov 15, 2017 4:38 am

The other problem is that the CPU clock at 90 MHz is too fast. After cutting the clock transfers some timing violations remain. They are mostly setup (not hold) issues on the CPU clock. They are not clock transfers. They are paths within the same clock. Conceivably some of the failing paths are false ones; I am not familiar enough with the design to be 100% sure. But at least many of the failing paths seem genuine, just a result of too much combinatorial logic for the clock frequency.

Reducing the CPU clock by half, to 45 MHz, solves the problem and I could finally meet (internal) timing. It is possible to go a little higher; the limit (without some redesign) seems to be somewhere around 50 MHz. It might be possible to optimize the design logic somehow, but this might require introducing some pipelining. Anyway, it requires being very familiar with the design.
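To put the frequencies above in terms of the time budget the combinatorial logic gets between two registers: the clock period is 1000 / f ns for f in MHz, so roughly 11.1 ns at 90 MHz against 22.2 ns at 45 MHz. A small Tcl sketch of that arithmetic (the helper name `period_ns` is made up for illustration):

```tcl
# Clock period in ns for a frequency given in MHz.
# This is the budget for clock-to-out, logic, routing and setup on a same-clock path.
proc period_ns {freq_mhz} {
    return [expr {1000.0 / $freq_mhz}]
}

# Frequencies mentioned in this thread.
foreach f {90 50 45 30} {
    puts [format "%d MHz -> %.2f ns per cycle" $f [period_ns $f]]
}
```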

Some I/O ports still remain unconstrained. The only important ones seem to be the HDMI signals. I will post the constraints I added, but again, the constraints alone might not solve timing issues. They might even hide some real timing defects. And I'm not familiar enough with the design to tell. As I keep saying, all of this should really be performed by the original developer. Proper constraints also help document the design: how clock transfers are handled and what requirements they have.

Newsdee
Atari God
Posts: 1037
Joined: Fri Sep 19, 2014 8:40 am

Postby Newsdee » Wed Nov 15, 2017 7:43 am

Very interesting read, ijor. This EE newbie appreciates insight into more complex topics like this :) So if I understood correctly, the core would be more stable at 50 MHz with information telling the "compiler" how to use FPGA space. Albeit it might cause some glitches if for some reason the constraints go against the original design intent. Is that the gist of it?

Sorgelig
Atari God
Posts: 1192
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Postby Sorgelig » Wed Nov 15, 2017 9:23 am

The original design is targeted at 30 MHz. So probably all timing issues can be solved at this master clock.
But the big problem of this core is emulated speed. The CPU model requires many more clocks to execute instructions than the original i486 CPU.
To make this core useful I had to raise the clock to 90 MHz. It's still very slow, on the edge of acceptable. But at least it can be used now.
So here is a dilemma:
1) make the core fully constrained and probably get stable results from compilation to compilation. But this core will be very slow and practically useless.
2) make a more or less useful core, but have problems with compilation.

I chose the second option. Maybe it's a good core to practice timing constraints on by reducing the master clock, but I'm afraid the result won't be useful as the core will be unacceptably slow. It seems the i486 CPU has been converted from some high-level language like C or Java. You can even find some scripts in the source code used for the conversion. Probably the way ao486 is designed is a dead end and it cannot be refactored without rewriting from scratch. The original author has no further interest in improving the core. Other devs will have a very hard time studying the core.
But I have to admit, this core is a masterpiece if you consider the features it implements. Except for the FPU, everything is implemented.

Postby ijor » Wed Nov 15, 2017 12:00 pm

Sorgelig wrote: The original design is targeted at 30 MHz. So probably all timing issues can be solved at this master clock.
But the big problem of this core is emulated speed. The CPU model requires many more clocks to execute instructions than the original i486 CPU.


Ah, I didn't know that. What a pity.

Yes, I agree that it might be reasonable to overclock the core to get a faster, more usable version. But it is probably better to document that it is being overclocked, and maybe offer a non-overclocked version as well.

For overclocking you might want to enable the "Performance (Aggressive)" Optimization Mode in Compiler Settings. This will probably increase compilation time, mostly in the fitter, but not nearly as much as the router takes now without the constraints. It will reduce how far the design misses timing without actually lowering the frequency, so the result will be more reliable.
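For those who prefer editing the project file over the GUI: as far as I know, that Compiler Settings choice is stored as a single assignment in the project's .qsf file. A minimal sketch, worth double-checking against what Quartus writes when the option is changed in the GUI:

```tcl
# In the project's .qsf file (QSF uses Tcl syntax).
# Assumed to be equivalent to selecting "Performance (Aggressive)" in Compiler Settings.
set_global_assignment -name OPTIMIZATION_MODE "AGGRESSIVE PERFORMANCE"
```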

Sorgelig wrote: 1) make the core fully constrained and probably get stable results from compilation to compilation. But this core will be very slow and practically useless.
2) make a more or less useful core, but have problems with compilation.

I chose the second option. Maybe it's a good core to practice timing constraints on by reducing the master clock, but I'm afraid the result won't be useful as the core will be unacceptably slow.


Note that the timing constraints and the compilation issues are one thing, and the overclocking that misses timing is another. They are mostly unrelated.

You can (and probably should) constrain the design anyway even when you choose to overclock leaving the cpu clock at 90 MHz. You obviously will get the warning that the design doesn't meet timing. But the constraints will solve the problem of the router taking ages and sometimes failing.

Add the following lines to the file "syn\system\sys_top.sdc". This constraint declares the different clocks asynchronous to each other and prevents the router from being too smart and trying to add all those delays.

Code:

set_clock_groups -asynchronous \
   -group [get_clocks { emu|u0|pll_0|altera_pll_i|general[0].gpll~PLL_OUTPUT_COUNTER|divclk}] \
   -group [get_clocks { emu|u0|pll_0|altera_pll_i|general[1].gpll~PLL_OUTPUT_COUNTER|divclk}] \
   -group [get_clocks { emu|u0|pll_0|altera_pll_i|general[2].gpll~PLL_OUTPUT_COUNTER|divclk}] \
   -group [get_clocks { pll_hdmi|pll_hdmi_inst|altera_pll_i|general[0].gpll~PLL_OUTPUT_COUNTER|divclk}] \
   -group [get_clocks { sysmem|fpga_interfaces|clocks_resets|h2f_user0_clk}] \
   -group [get_clocks { FPGA_CLK1_50 FPGA_CLK2_50 FPGA_CLK3_50}]

Postby ijor » Wed Nov 15, 2017 12:07 pm

Newsdee wrote: So if I understood correctly, the core would be more stable at 50 MHz with information telling the "compiler" how to use FPGA space. Albeit it might cause some glitches if for some reason the constraints go against the original design intent. Is that the gist of it?


Yes, more or less. The timing constraints mostly solve the compilation problem of the router taking so much time. Reducing the master clock frequency is what makes the core more stable. As I mentioned in the previous message, they are mostly separate and unrelated issues.

I doubt that the constraints themselves would introduce problems. But they might hide timing problems that are not correctly handled (by eliminating warnings during compilation).

Postby Sorgelig » Wed Nov 15, 2017 12:24 pm

Thanks for the concrete SDC suggestion. That addition should be universal for all cores.

As for ao486, any help with constraints specific to this core is welcome. Even for a version with a slower clock. I won't offer a slower version of ao486 (although anyone can offer one if they want), but any additional constraints will be helpful.

Postby Sorgelig » Wed Nov 15, 2017 12:28 pm

By the way, for those who don't understand programming: the word "overclocking" here does not refer to overclocking the i486. So it's probably not really the correct word here. The i486 emulated by ao486 is severely under-clocked even when run at 90 MHz. It's something like a 386SX at 10-16 MHz. So there is no room to clock it lower than it's being clocked now.

Postby Newsdee » Wed Nov 15, 2017 3:33 pm

If I'm not mistaken this is based on Bochs source code, so you need to run at 90 MHz to emulate the 386 at 16 MHz?

Postby Sorgelig » Wed Nov 15, 2017 4:06 pm

Newsdee wrote: so you need to run at 90 MHz to emulate the 386 at 16 MHz?

Yes. 386 at 16 MHz is approximate. You can run some real tests to find the speed.

Postby ijor » Wed Nov 15, 2017 8:43 pm

Sorgelig wrote: Thanks for the concrete SDC suggestion. That addition should be universal for all cores.
As for ao486, any help with constraints specific to this core is welcome. Even for a version with a slower clock. I won't offer a slower version of ao486 (although anyone can offer one if they want), but any additional constraints will be helpful.


The constraint I posted is specific to the ao486 core; it is not universal. The concept of declaring async clock groups is generic, but which clocks exist, and which clock groups are async and which are not, depends on each core.

As far as I can see there is no other internal constraint needed for this core. Making the master clock slower or faster doesn't change the constraints. You change the PLL configuration and the Timing Analyzer gets the clock parameters automatically from the PLL configuration. Only some I/O ports remain unconstrained. I will look into constraining the HDMI signals and post the constraints, and those should indeed be universal for all cores.
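To decide which clocks belong in which group for a different core, it helps to first list the clocks the Timing Analyzer actually sees and which clock-to-clock transfers exist. A minimal sketch of a script for that, meant to be run with quartus_sta after a compilation; the project name `sys_top` and the script name are assumptions, not taken from the actual project:

```tcl
# Run with: quartus_sta -t list_clocks.tcl   (script name is arbitrary)
project_open sys_top          ;# hypothetical project name, adjust to the real one
create_timing_netlist
read_sdc                      ;# reads the SDC files listed in the project
update_timing_netlist

# All clocks known to the analyzer, including the ones derived from the PLLs.
report_clocks -stdout

# Every pair of clocks with at least one path between them. Pairs of unrelated
# clocks listed here are candidates for separate asynchronous groups.
report_clock_transfers -stdout

delete_timing_netlist
project_close
```

Whether two clocks really are unrelated still has to be judged from the design itself; the report only shows where transfers exist.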

Note again that constraints do not by themselves solve all the timing issues. If there are synchronization issues that are not handled properly, they must be fixed in the actual code.

Postby Sorgelig » Wed Nov 15, 2017 10:15 pm

I've already made a universal async group definition in recent cores. I didn't try it on large cores. It seems to shorten the routing time on small arcade cores.
Your explanations are good for those who want to understand signal relations and how it's done in an FPGA.
As for me, all you wrote has been known to me for a long time, since my background is EE. What is really needed is specific SDC commands (like the ones you've posted). Universal settings and settings specific to the ao486 core are both welcome.
The main problem is not understanding how FPGA signals work, but concrete commands that reduce the compilation time and improve the signal timing.

Postby Sorgelig » Wed Nov 15, 2017 10:18 pm

I'm not sure what you mean by HDMI timings. The VIP part has its own constraint settings automatically generated by Quartus and I'm sure they are good enough not to touch.

Postby Newsdee » Wed Nov 15, 2017 11:08 pm

Naive question. Are these constraints why it's useful to measure timings in original hardware with an oscilloscope? Because it allows feeding the timing constraints from the original machine?

Postby ijor » Thu Nov 16, 2017 3:15 am

Sorgelig wrote: What is really needed is specific SDC commands (like the ones you've posted). Universal settings and settings specific to the ao486 core are both welcome. The main problem is not understanding how FPGA signals work, but concrete commands that reduce the compilation time and improve the signal timing.


Ok. As for the ao486 it seems to me that with the constraint I posted we are done internally. Only some external I/O ports still need to be constrained ...

Sorgelig wrote: I'm not sure what you mean by HDMI timings. The VIP part has its own constraint settings automatically generated by Quartus and I'm sure they are good enough not to touch.


VIP doesn't constrain the external HDMI signals. It only has internal constraints. If you look at the Timing Report you will see that the HDMI signals show as part of the "Unconstrained Output ports":

Code:


+-------------------------------------------------------------------------------------------------------------+
; Unconstrained Output Ports                                                                                  ;
+---------------------+---------------------------------------------------------------------------------------+
; Output Port         ; Comment                                                                               ;
+---------------------+---------------------------------------------------------------------------------------+
...
; HDMI_I2C_SCL        ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_I2C_SDA        ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_LRCLK          ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_SCLK           ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_CLK         ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_DE          ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[0]        ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[1]        ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[2]        ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[3]        ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[4]        ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[5]        ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[6]        ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[7]        ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[8]        ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[9]        ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[10]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[11]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[12]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[13]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[14]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[15]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[16]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[17]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[18]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[19]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[20]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[21]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[22]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_D[23]       ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_HS          ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
; HDMI_TX_VS          ; No output delay, min/max delays, false-path exceptions, or max skew assignments found ;
+---------------------+---------------------------------------------------------------------------------------+
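For readers wondering what constraining these ports might look like, here is a rough SDC sketch. It is not the set of constraints referred to in this thread. It assumes HDMI_TX_CLK is driven straight from the HDMI PLL output named in the clock groups earlier; the clock name `hdmi_out_clk` and the setup/hold numbers are placeholders, and board trace delay is ignored. The real numbers have to come from the HDMI transmitter datasheet.

```tcl
# Sketch only: the clock name and the timing values are placeholders.

# The clock as seen at the HDMI_TX_CLK pin, derived from the HDMI PLL output.
# Add -invert here if the pin is driven with the inverted clock.
create_generated_clock -name hdmi_out_clk \
   -source [get_pins {pll_hdmi|pll_hdmi_inst|altera_pll_i|general[0].gpll~PLL_OUTPUT_COUNTER|divclk}] \
   [get_ports {HDMI_TX_CLK}]

# Setup and hold requirements of the external transmitter, in ns.
# Hypothetical values: replace with the figures from the datasheet.
set hdmi_tsu 1.0
set hdmi_th  1.0

# Output delay: max = external setup time, min = minus the external hold time.
set_output_delay -clock hdmi_out_clk -max $hdmi_tsu \
   [get_ports {HDMI_TX_D[*] HDMI_TX_DE HDMI_TX_HS HDMI_TX_VS}]
set_output_delay -clock hdmi_out_clk -min [expr {-$hdmi_th}] \
   [get_ports {HDMI_TX_D[*] HDMI_TX_DE HDMI_TX_HS HDMI_TX_VS}]

# Slow control signals such as I2C are commonly just cut from the analysis.
set_false_path -to [get_ports {HDMI_I2C_SCL HDMI_I2C_SDA}]
```

The audio signals (HDMI_LRCLK, HDMI_SCLK) are left out of the sketch; they would need their own treatment depending on how they are clocked.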

Postby ijor » Thu Nov 16, 2017 3:26 am

Newsdee wrote: Naive question. Are these constraints why it's useful to measure timings in original hardware with an oscilloscope? Because it allows feeding the timing constraints from the original machine?


Not too much. Constraints for the external I/O ports are mostly based on the specifications of the external device. For instance, when you constrain the DRAM interface you use the timing specifications on the DRAM chip datasheet.
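As an illustration of how datasheet figures turn into I/O constraints, here is a hedged SDC sketch for an SDRAM data bus. The port name `SDRAM_DQ`, the clock name `sdram_clk_pin` and all the numbers are hypothetical; the real values come from the datasheet of the specific chip, and board delays are ignored here.

```tcl
# Sketch only: port names, clock name and numbers are hypothetical.
# Assumes a clock named sdram_clk_pin already exists on the SDRAM clock output pin.

# Datasheet parameters in ns (placeholder values).
set t_ac 6.0   ;# access time: max clock-to-data-out of the SDRAM
set t_oh 2.5   ;# output hold: min time read data stays valid after the clock
set t_ds 1.5   ;# input setup time required by the SDRAM
set t_dh 0.8   ;# input hold time required by the SDRAM

# Reads: data driven by the SDRAM arrives between t_oh and t_ac after its clock.
set_input_delay -clock sdram_clk_pin -max $t_ac [get_ports {SDRAM_DQ[*]}]
set_input_delay -clock sdram_clk_pin -min $t_oh [get_ports {SDRAM_DQ[*]}]

# Writes: the FPGA must satisfy the SDRAM setup and hold requirements.
set_output_delay -clock sdram_clk_pin -max $t_ds [get_ports {SDRAM_DQ[*]}]
set_output_delay -clock sdram_clk_pin -min [expr {-$t_dh}] [get_ports {SDRAM_DQ[*]}]
```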

Scope measurements are more useful for checking the quality of the signals (issues such as ringing and overshoot) than for timing. If you have high-end equipment and can measure the board trace delay, that might be useful. And of course, scoping can be useful to verify that your timing constraints are correct.

Postby ijor » Thu Nov 16, 2017 7:33 pm

I am constraining the HDMI interface. But I'm not completely sure what you are doing in the code, Sorgelig. Let's see if I get it right.

- The VIP module receives the HDMI clock inverted in relation to the output clock. This is, I assume, to improve timing margin.

- The output of the VIP module is processed by the OSD module. But the OSD module seems to change the data on the positive edge of the clock. That would be the opposite edge to the one used by the VIP module. And then the final output is a combinatorial mix of the OSD and the VIP data.

If I got it right, I'm not sure that makes much sense. Processing the same signal on one edge in one module and on the opposite edge in the other?

Postby Sorgelig » Thu Nov 16, 2017 9:04 pm

I took the VIP/HDMI clocking from one of the Terasic samples. I didn't explore the purpose of inverting the clock for VIP. As long as it works, it's OK.

Postby ijor » Thu Nov 23, 2017 8:18 pm

Sorry for the late reply, was rather busy ...

Inverting the clock is a common technique to improve timing. But mixing an inverted clock with the non-inverted one is not very good. This doesn't meet timing (as expected). Yes, it works, mainly because a timing violation on the video interface is not fatal. In the worst case you will get a few video glitches that might even go unnoticed.

I implemented a small modification to use the inverted clock in the OSD module as well, it was using the positive version of the clock. I constrained the HDMI video interface, and I also added a registered pipeline to the video output because otherwise it doesn't meet timing. I will post the modified files later.

Note that the clock inversion is good enough for not so high frequencies. If you ever decide to use higher video clocks it might be better to use a clock shift as already implemented with the DRAM clock.

Btw, the DRAM interface doesn't meet timing, not even at 128 MHz, let alone at 150 MHz or higher. It is almost there at 128 MHz.

One problem is that the specific pins used for the DRAM board are not the most appropriate. It would have been better to use a different pin for the clock (one that is present in the same GPIO connector) that is dedicated to PLL clock outputs. I realize it is too late to change that. And this would gain you just a few MHz. That wouldn't be enough for 150 MHz or higher.

Note that failing timing closure doesn't mean it won't work and always fail. It just means it is not guaranteed to work in all the conditions.

Postby Sorgelig » Thu Nov 23, 2017 9:00 pm

ijor wrote: One problem is that the specific pins used for the DRAM board are not the most appropriate. It would have been better to use a different pin for the clock (one that is present in the same GPIO connector) that is dedicated to PLL clock outputs. I realize it is too late to change that. And this would gain you just a few MHz. That wouldn't be enough for 150 MHz or higher.

Theory does not always match practice. I know about the dedicated clock pin on the GPIO. One of the first revisions of the SDRAM board used this pin. The max clock was much worse, something like 80 MHz only, because the dedicated clock pin is too far from the chip's clock pin and picks up a lot of noise. The current design is the best I could make. My SDRAM board works at 150 MHz. And even at 167 MHz in the horizontal version.

I have nothing against your exploration of the timing problems. It's an open source project after all. But the only thing that matters is the achieved result. If your changes shorten compilation time or give repeatable results from one compilation to another, then it will be great for this core. Bare warning elimination without any noticeable difference in the result doesn't matter.

Postby ijor » Thu Nov 23, 2017 9:36 pm

Sorgelig wrote: I have nothing against your exploration of the timing problems. It's an open source project after all. But the only thing that matters is the achieved result. If your changes shorten compilation time or give repeatable results from one compilation to another, then it will be great for this core. Bare warning elimination without any noticeable difference in the result doesn't matter.


Well, that is your opinion. Achieving timing closure is considered critical by most people involved in hardware development. It is the difference between guaranteed to work and "it works for me". There is no way that you could test all the possibilities and all the conditions without proper timing analysis. Of course, testing in practice is needed as well, sometimes theory might be wrong for some reason. But analysis and testing complement each other.

Doesn't sound very wise to ignore the analysis just because it works or it seems to work. That's not the way that hardware should be developed. But, oh, well, everybody is free to do it the way he likes it.

Postby ijor » Fri Nov 24, 2017 2:03 am

These are the modified files. If you don't want to use them because you believe timing closure is a waste of time, that's up to you. The cost is not significant anyway. In this case it is just the pipeline registers I added to the video output.

Postby Sorgelig » Fri Nov 24, 2017 2:10 am

Thanks. I will check this out.

Postby ijor » Fri Nov 24, 2017 2:14 am

Sorgelig wrote: Thanks. I will check this out.


Just in case it matters, note that even though the files are mostly generic to the framework, they are based on the mem test project.

