Sorgelig wrote:IPF support - a lot of work with little gain. Most games are already de-protected. It's not a platform where new games are developed. So, there is not much reason to do it. Well, unless you have a lot of free time and just want to add it anyway. Even in the Amiga core there are many other things to improve that would give a better result.
I don't have that much free time. I may have underestimated the task, as I thought it would be an easy one to get started with. I see your point and agree it won't be that useful; I just tend to prefer to run things as they were originally designed to run. But that's not a good enough reason to spend that much time on a task. I think my time can be better used.
paulofduarte wrote:ao486 I want to first do some measurements of how many cycles each instruction takes with the core, compared to the original 80486 specs
I don't think this is worth doing. It is unlikely that the 486 HDL designer tried to match the cycle accuracy of the original 486. The wide range of 486 parts, speed grades and manufacturers meant that game programmers rarely (never?) wrote effects that relied on cycle accuracy (unlike on other, closed systems).
My intention is not to try to achieve cycle accuracy. I just want to identify the worst-case instructions and see if I can improve the HDL to make them require fewer cycles. But as your next quote suggests, the original designer already identified memory as the biggest bottleneck.
alfikpl wrote:I remember that on average the ao486 performs most of the x86 instructions in more than 10 cycles. The execution stages are pipelined, but the memory accesses are not. It simply takes a long time to prepare the memory access operations in the core. Perhaps the instruction fetch is also too slow?
The original designer hints that memory access is the bottleneck and that it should be the part of the design to look at first.
I missed that comment from him. I had guessed there wasn't any pipeline, as he had ported the CPU from Bochs.
OK, so it seems that improving instruction fetch times would bring the most immediate benefit. So I think I'll start by implementing a small Level 1 instruction cache using some memory blocks in the FPGA (if any are available). If I get it right, it should improve performance when fetching sequential instructions, until a branch is hit and the cache needs to be invalidated. But that was the expected behaviour on a 486: no branch prediction.
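
To get a rough feel for what a small instruction cache could buy before touching any HDL, something like the following behavioural model might help. It is only a sketch in Python, not the ao486 design: the direct-mapped organisation, the 64-line by 16-byte geometry, the fixed 4-byte fetch step (real x86 instructions are variable length) and all names (`DirectMappedICache`, `run_sequential`) are my own assumptions for illustration.

```python
class DirectMappedICache:
    """Behavioural model of a direct-mapped instruction cache (not HDL)."""

    def __init__(self, num_lines=64, line_bytes=16):
        # Geometry is an illustrative assumption, not taken from ao486.
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines  # None marks an empty (invalid) line
        self.hits = 0
        self.misses = 0

    def fetch(self, addr):
        """Look up one fetch address; return True on a hit, False on a miss."""
        line_addr = addr // self.line_bytes   # which memory line holds addr
        index = line_addr % self.num_lines    # cache slot that line maps to
        tag = line_addr // self.num_lines     # identifies the line in that slot
        if self.tags[index] == tag:
            self.hits += 1
            return True
        # Miss: model the line fill by simply recording the new tag.
        self.tags[index] = tag
        self.misses += 1
        return False

    def invalidate(self):
        """Drop every line, e.g. to model a full flush."""
        self.tags = [None] * self.num_lines


def run_sequential(cache, start, count, step=4):
    """Fetch `count` addresses, `step` bytes apart, starting at `start`."""
    for i in range(count):
        cache.fetch(start + i * step)


if __name__ == "__main__":
    cache = DirectMappedICache()
    run_sequential(cache, 0x1000, 64)   # first pass over 256 bytes of code
    run_sequential(cache, 0x1000, 64)   # second pass, e.g. a loop body
    print(cache.hits, cache.misses)
```

In this model the first pass only misses once per 16-byte line, and a second pass over the same code hits every time as long as the lines have not been evicted or flushed. How much of that would carry over to the real core depends on the memory latency and the fetch logic, which the model does not try to capture.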