I'm only using my DE1 for research, as a logic analyzer or a signal generator; I never meant to make the project run on it.
From the beginning, the goal has been to run actual cartridges. That requires a large number of I/Os, which most FPGA dev boards lack.
System RAM isn't a problem since it's not that large (as noted).
Storage space for games shouldn't be a problem either, since no game is over 90MB and most of them fit in 32MB (as also noted).

The crucial difference is that Genesis carts have 1 bus while those for the NeoGeo have 5.
I tried to summarize the maximum bandwidth requirements, hoping I didn't make mistakes:
Sprite graphics: 91.55Mbps
Fix graphics: 22.89Mbps
Main CPU program: 45.78Mbps
Sub CPU program: 7.63Mbps
Audio data: 1.13Mbps
Total: 169Mbps
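
As a quick sanity check on that total, here's a minimal Python sketch that adds up the per-bus figures and shows how much of the total each bus takes. The numbers are the ones listed above; the names `BUS_MBPS`, `total_mbps` and `share_percent` are made up for illustration only.

```python
# Peak bandwidth per cartridge bus in Mbps, copied from the list above.
BUS_MBPS = {
    "Sprite graphics": 91.55,
    "Fix graphics": 22.89,
    "Main CPU program": 45.78,
    "Sub CPU program": 7.63,
    "Audio data": 1.13,
}


def total_mbps(buses):
    """Sum of the peak bandwidth of every bus, in Mbps."""
    return sum(buses.values())


def share_percent(buses):
    """Percentage of the combined bandwidth taken by each bus."""
    total = total_mbps(buses)
    return {name: 100.0 * mbps / total for name, mbps in buses.items()}


if __name__ == "__main__":
    total = total_mbps(BUS_MBPS)
    # Divide by 8 to express the same figure in megabytes per second.
    print(f"Total: {total:.2f} Mbps ({total / 8:.2f} MB/s)")
    for name, pct in share_percent(BUS_MBPS).items():
        print(f"{name}: {pct:.1f}%")
```

The sum comes out just under 169Mbps, and sprite graphics alone account for more than half of it. This only covers raw throughput and says nothing about latency.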
The real challenge I see is interleaving all those random accesses while keeping the latency of each one under its own limit.
Honestly, I want to stay focused on the core itself. I don't want to start hacking up the original logic to solve timing problems caused only by the use of SDRAM (at least for now).