Quake 2 on Falcon030


dml
Fuji Shaped Bastard
Posts: 3472
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Postby dml » Tue Feb 16, 2016 8:01 pm

shoggoth wrote:I'm still polling this thread @ 3000Hz freq - but I really got my fix today. Amazing, dml. I didn't expect it to look that good with doubled pixels, but it's still really pretty, and given better FPS I think it looks like a good move.


Cheers!

Progress is slow but have almost caught up with where I left things - should have something new to report soon.

dml
Fuji Shaped Bastard
Posts: 3472
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Postby dml » Wed Feb 17, 2016 1:40 pm

I had a read through the Q2 collision detection code (the low level tracing part - not the fly/walk/slide stuff) and it looks straightforward enough to implement on the Falcon. Also looks like there's plenty of room to optimize it into ASM, and maybe even remove the floating point requirement.

I definitely want to use this as the reference version because it's very well tested and unlikely to glitch. It's therefore worth spending time optimizing it properly once it works (unlike the temporary code, which does glitch sometimes and doesn't take advantage of the BSP quite so well - and which I didn't want to waste time optimizing into ASM).
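
As a rough illustration of what dropping the floating point requirement could involve: the core of a BSP trace is a signed point-to-plane distance, and that can be computed in fixed point. This is only a sketch under assumptions - the 16.16 format and the names `fix16`, `vec3_fix`, `plane_fix` and `plane_distance` are hypothetical and not taken from the Q2 source or from dml's engine.

```c
#include <stdint.h>

typedef int32_t fix16;                 /* 16.16 fixed point (assumed format) */
#define FIX_ONE 65536

typedef struct { fix16 x, y, z; } vec3_fix;
typedef struct { vec3_fix normal; fix16 dist; } plane_fix;

/* Convert a small integer to 16.16. */
static fix16 fix_from_int(int v) { return (fix16)(v * FIX_ONE); }

/* Signed distance from point p to the plane: dot(normal, p) - dist.
   The products are accumulated in 64 bits and shifted back once.
   Note: right-shifting a negative value is assumed to be arithmetic,
   which holds on common compilers but is implementation-defined in C. */
static fix16 plane_distance(const plane_fix *pl, const vec3_fix *p)
{
    int64_t acc = (int64_t)pl->normal.x * p->x
                + (int64_t)pl->normal.y * p->y
                + (int64_t)pl->normal.z * p->z;
    return (fix16)(acc >> 16) - pl->dist;
}
```

With 16.16 the usable coordinate range is only about +/-32767 units, so the format would have to be checked against the map extents before relying on it.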

Mindthreat
Captain Atari
Posts: 190
Joined: Tue Dec 16, 2014 4:39 am

Re: Quake 2 on Falcon030

Postby Mindthreat » Mon Feb 29, 2016 3:59 pm

Awesome! :) Looking forward to it!
"To create the future, you must first embrace the past." - http://cerka.weebly.com

AxisOxy
Atariator
Posts: 23
Joined: Tue Jun 23, 2015 2:00 pm

Re: Quake 2 on Falcon030

Postby AxisOxy » Sat May 07, 2016 10:17 am

I again stumbled upon two things that could be interesting here.

First I discovered that the good old trick on 68000 replacing

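Code:

move (an,dn),dn     ; indexed fetch: address = an + dn

with

Code:

move.l dn,an        ; dn already holds the full address (64k-aligned base in the upper word)
move (an),dn        ; plain indirect fetch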

also saves 2 cycles on 68020/30. This saved 15% on all my old 030 texturemap inner loops. I'm wondering why no one did that in the old days, including me... ;)

I don't think this is practical for texture mapping in big engines like Doom or Quake, because it needs the data to be aligned to 64k boundaries, and you don't want to waste all that memory. Also, 16-bit truecolor mapping is kind of a problem there, because you lose the nice dn*2 feature. But this probably works in other areas like transform, culling, BSP traversal, ...
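
The trick can be modelled in C to show why the 64k alignment is needed. A minimal sketch, assuming a small byte array `mem` as a stand-in for the address space; `fetch_indexed`, `fetch_aligned` and `make_reg` are made-up names, and the unsigned 16-bit offset is a simplification (a real dn.w index is sign-extended).

```c
#include <stdint.h>

#define MEM_SIZE (4u * 65536u)
static uint8_t mem[MEM_SIZE];          /* stand-in for the address space */

/* Indexed form, like move.b (an,dn.w),dn: base plus offset, added per fetch. */
static uint8_t fetch_indexed(uint32_t base, uint16_t offset)
{
    return mem[base + offset];
}

/* Combine a 64k-aligned base with a 16-bit offset into one register value.
   Because the low 16 bits of an aligned base are zero, OR-ing in the offset
   gives the same address as adding it. */
static uint32_t make_reg(uint32_t base64k, uint16_t offset)
{
    return (base64k & 0xFFFF0000u) | offset;
}

/* Aligned form, like move.l dn,an / move.b (an),dn: the data register
   already is the address, so no index calculation is needed. */
static uint8_t fetch_aligned(uint32_t reg)
{
    return mem[reg];
}
```

In a real inner loop the upper word would be set up once per texture and only the lower word updated per pixel. For 16-bit texels the offset would have to be pre-multiplied by 2, which is the lost dn*2 feature mentioned above.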

Another thing on my mind is that in the old days a lot of rumours were flying around that some Falcon democoders used the blitter to transfer data between DSP and memory to achieve a 3.5 MB/s transfer rate.
Not sure if that is an urban legend. I did a quick test of this using Hatari 1.9, since I don't have access to a real Falcon 030.
But the test results didn't look good.
In Hog_bus mode the transfer ran at roughly the expected speed, but didn't work (it was always transferring the same word over and over again, as if the DSP were blocked from the bus). And after some time the DSP even locked up.
Without Hog_bus mode the transfer more or less works, but is very slow (3 times slower than an unrolled CPU loop).
I guess Hatari differs massively from a real Falcon in these edge cases.
It would be interesting to know if such a hack really works on a real Falcon and if Doug's engine would profit from it.

dml
Fuji Shaped Bastard
Posts: 3472
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Postby dml » Sat May 07, 2016 10:54 am

Hi!

All good observations. I probably did miss opportunities in the Doom code - haven't looked at it in a long time and returning to things usually finds optimizations quickly :) IIRC the main problem is just as you describe - the cost of alignment and the loss of *2 for TC texels. But it's always fun to have a 2nd or third look sometime.

I think the alignment issue can be reduced a lot if the textures are allocated in 2^n rings anyway, because then you only need to align the first one, and at least some subset of the blocks which follow will be aligned. But most textures in those game engines are smaller than 64k so it's hard to use effectively on a per-texture basis. OTOH packing small textures onto a series of aligned texture atlases might help, just using coords to find them (and some error padding between them). I guess that should work. You could even pass the initial alignment 'waste' back to the resource allocator so it could find some other use?
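
The atlas idea could look roughly like this, assuming 8-bit texels, one 256x256 atlas per 64k-aligned block and power-of-two sub-textures. `AtlasEntry` and `atlas_offset` are hypothetical names, and the error padding between neighbours is left out.

```c
#include <stdint.h>

/* One small texture placed inside a 256x256 atlas (assumed layout). */
typedef struct {
    uint8_t x0, y0;        /* top-left corner inside the atlas */
    uint8_t wmask, hmask;  /* width-1 and height-1, sizes are powers of two */
} AtlasEntry;

/* Map wrapped texture coords (u,v) to a 16-bit offset inside the atlas.
   The caller must place entries so that x0+wmask and y0+hmask stay <= 255,
   otherwise the x part would spill into the row bits. */
static uint16_t atlas_offset(const AtlasEntry *e, unsigned u, unsigned v)
{
    unsigned x = e->x0 + (u & e->wmask);   /* wrap inside the sub-texture */
    unsigned y = e->y0 + (v & e->hmask);
    return (uint16_t)((y << 8) | x);       /* row * 256 + column */
}
```

The resulting offset is exactly the low word that a 64k-aligned fetch needs, so the atlas base only has to be loaded once per atlas rather than once per texture.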

For the F030 Quake engine there is room to improve on the CPU side using address deltas instead of indirection but I got stuck for cycles on the DSP side so couldn't arrange it for real, with the current code anyway. I started looking at an alternative version on the DSP which spreads the P-correction out over several pixels and might allow the CPU-side changes, but the complexity is much higher and per-span overhead is also higher. Still worth trying to see if that can work without making the most dense scenes slower.

Blitter to/from DSP... well I admit I tried this once on real HW (in one direction at least - don't remember which one but probably reading from DSP) and got a puzzling result. The machine seems to bus-lock and go into a deep sleep. I tried to fiddle the DSP side to be ready in time and re-sync on any missed transfers but it always seemed to lock up regardless, as soon as the blitter is started. I figured there is some kind of incomplete handshaking path between blitter and the DSP host port, or something. I might have screwed it up of course but after a few shots and no joy, gave up. Wrote it off as a mystery. :)

BTW when you say 2-3x slower in non-hog mode, is that with or without spinning on forced-restarts? The default duty cycle on non-hog mode is 50% without intervention. But OTOH this could just be a weird Hatari timing effect as it probably never happens in existing code so far...

AxisOxy
Atariator
Posts: 23
Joined: Tue Jun 23, 2015 2:00 pm

Re: Quake 2 on Falcon030

Postby AxisOxy » Mon May 09, 2016 12:07 pm

Regarding the alignment trick for texture mapping: I think I have to revisit my old Doom engine again. In my case it could work nicely on the floor textures. They are already organised as 256x256 atlases.

dml wrote:BTW when you say 2-3x slower in non-hog mode, is that with or without spinning on forced-restarts?

That forced restart looping helped a bit on the speed side, but it's still slower than the CPU-only transfer. I also found an entry on the DHS forum by TAT/Avena, who wrote that he tried it and failed. So I am with you: "It's most likely a myth".

tat
Retro freak
Posts: 12
Joined: Wed Nov 12, 2014 10:07 am

Re: Quake 2 on Falcon030

Postby tat » Mon May 09, 2016 12:56 pm

AxisOxy wrote:
dml wrote:BTW when you say 2-3x slower in non-hog mode, is that with or without spinning on forced-restarts?

That forced restart looping helped a bit on the speed side, but it's still slower than the CPU-only transfer. I also found an entry on the DHS forum by TAT/Avena, who wrote that he tried it and failed. So I am with you: "It's most likely a myth".


Hi

It's totally, totally possible that I was doing the test wrong. That was back in 199[3-8] and I really didn't know what I was doing.

My very hazy memory is that I only ever got 8 bits of data at a time using Blitter to read the host port, presumably because of the way the host port access worked. The rest came out at either 0 or 0xff.

Steve (tat)

calimero
Fuji Shaped Bastard
Posts: 2064
Joined: Thu Sep 15, 2005 10:01 am
Location: STara Pazova, Serbia

Re: Quake 2 on Falcon030

Postby calimero » Tue May 10, 2016 9:34 am

Hi Tat,
Is there any chance of seeing the Sonolumineszenz sequel finished? O:)
using Atari since 1986. http://wet.atari.org http://milan.kovac.cc/atari/software/ ・ Atari Falcon030/CT63/SV ・ Atari STe ・ Atari Mega4/MegaFile30/SM124 ・ Amiga 1200/PPC ・ Amiga 500 ・ C64 ・ ZX Spectrum ・ RPi ・ MagiC! ・ MiNT 1.18 ・ OS X

dml
Fuji Shaped Bastard
Posts: 3472
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Postby dml » Tue May 10, 2016 10:56 am

Hi Steve!

tat wrote:It's totally, totally possible that I was doing the test wrong. That was back in 199[3-8] and I really didn't know what I was doing.

My very hazy memory is that I only ever got 8 bits of data at a time using Blitter to read the host port, presumably because of the way the host port access worked. The rest came out at either 0 or 0xff.

Steve (tat)


That seems to be more than I got out of it. I don't think I was even getting 8 bits at the time. Probably didn't help that it was inside some other complicated drawing code though. Hmm. Maybe it's worth a separate test outside of the engine stuff to see if it's doing anything consistent or not? Maybe inside the memory benchmark proggy...

I also tried a couple of times to 'dodge' the host port sync on the DSP side - like every 2nd pixel doesn't do the jclr spin - and that failed too. I thought maybe if there is a FIFO present it would allow slightly async sequences and squeeze out another op or two. But it behaved as if there is no FIFO on the DSP side (which seems likely anyway - a FIFO would probably cause more confusion than good overall).
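
A toy model can illustrate why skipping the sync only pays off if something buffers the data. The sketch below assumes a single-word mailbox with no FIFO, a producer that attempts one write per tick and a consumer that reads every second tick; `Slot` and `run` are hypothetical, and the model says nothing about the real host port timing or about the lockups seen on hardware.

```c
#include <stdbool.h>
#include <stddef.h>

/* Single-word mailbox: one value plus a "full" flag, no FIFO behind it. */
typedef struct { int value; bool full; } Slot;

/* Push words 0..n-1 through the slot and store what the consumer sees in
   out[]. If skip_every_second_wait is set, odd-numbered words are written
   without checking the flag first. Returns the number of words received. */
static size_t run(size_t n, bool skip_every_second_wait, int *out)
{
    Slot slot = { 0, false };
    size_t sent = 0, received = 0;

    for (size_t tick = 0; received < n && tick < 8 * n; ++tick) {
        /* Producer: one write attempt per tick. */
        if (sent < n) {
            bool must_wait = !(skip_every_second_wait && (sent & 1));
            if (!slot.full || !must_wait) {
                slot.value = (int)sent;   /* may overwrite an unread word */
                slot.full = true;
                ++sent;
            }
        }
        /* Consumer: slower than the producer, reads on odd ticks only. */
        if ((tick & 1) && slot.full) {
            out[received++] = slot.value;
            slot.full = false;
        }
    }
    return received;
}
```

With the flag checked on every word, all words arrive in order. With every second check skipped, the unread word is overwritten and only half of the data arrives, which matches the observation that there is nothing FIFO-like to absorb the unsynchronised write.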

tat
Retro freak
Posts: 12
Joined: Wed Nov 12, 2014 10:07 am

Re: Quake 2 on Falcon030

Postby tat » Tue May 10, 2016 12:46 pm

I think my test was pretty simple, just pumping out known values with a full handshake on the DSP side. The "8-bit result" seemed to make some sense at the time, since I was always under the impression that the CPU-DSP port read/write was 8-bit, and the blitter was asking for 16 in one go. But I am not a hardware person, and don't know how DMA or MCU gets involved there.

Pulling the data off the DSP (and memory access in general) always seemed to be the primary bottleneck, but I'm sure you know that all too well :)

Calimero: no, I'll never go back to finish Binliner, sorry. I would love to find the time to do something on the Falcon (my machine still boots, although the NVRAM is dead), but it's very unlikely. Particularly as the computer is covered in my son's toy dinosaurs.

dml
Fuji Shaped Bastard
Posts: 3472
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Postby dml » Tue May 10, 2016 1:09 pm

tat wrote:Calimero: no, I'll never go back to finish Binliner, sorry. I would love to find the time to do something on the Falcon (my machine still boots, although the NVRAM is dead), but it's very unlikely. Particularly as the computer is covered in my son's toy dinosaurs.


Last week I shook lego pieces out of one machine... and found some magnets stuck to the PC tower case on another occasion...

tat
Retro freak
Posts: 12
Joined: Wed Nov 12, 2014 10:07 am

Re: Quake 2 on Falcon030

Postby tat » Tue May 31, 2016 8:02 am

So I took the plastic dinosaurs off the Falcon and did an experiment or two.

The TL;DR is that if I read the DSP host port with the blitter, address $ffffa206 reads OK as a byte, but $ffffa207 is bogus. It reads either $ff or $ef, in an unstable pattern. The pattern seems to alter depending on whether I run in RGB or VGA, and the size of the copy.

My initial theory is that the flicker might be related to VIDEL wanting memory, but that's unproven. The access also seems rather slow, although I have not timed this properly.

So conceivably, you could use mask, skew and the halftone RAM to read a 16-entry palette into truecolor, or read 8 bits into a buffer for c2p. Otherwise, it looks like a bit of a blind alley.
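
If only the byte at $ffffa206 is trustworthy, the CPU-side cleanup after such a blit is simple. A sketch, assuming each blitted word holds the $ffffa206 byte in its upper half and the bogus $ffffa207 byte in its lower half; `keep_valid_bytes` is a hypothetical helper, not part of the linked repository.

```c
#include <stddef.h>
#include <stdint.h>

/* Keep the upper byte of each blitted word and drop the unreliable lower
   byte. The words are treated as values, so host endianness does not matter. */
static void keep_valid_bytes(const uint16_t *blit_words, uint8_t *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = (uint8_t)(blit_words[i] >> 8);
}
```

On the Falcon itself the same masking could presumably be left to the blitter's own mask registers, as suggested above, so that no CPU pass is needed at all.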

The code is at https://github.com/tattlemuss/falcon-experiments and needs gcc, vasm and asm56k.

It's very (almost naively) simple. At some point I'll try to get it to run through different tests (hog on/off, vbl lock, different buffer sizes, proper timings).

dml
Fuji Shaped Bastard
Posts: 3472
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Postby dml » Thu Jun 02, 2016 11:01 am

Thanks Tat - it's useful to see that something is happening even if one of the bytes is wrong.

Since the host port is mapped as separate 8-bit ports it's possible the blitter isn't addressing it properly as a single 16-bit field, or maybe there is a magic waitstate which only the CPU & GALs are applying properly. A strange but interesting result anyway...

Yes the halftone thing might be worth a test if the transfer is fast enough generally. 16 colours per blit is enough to do something with.

One difference between my test and your sample is the size of the transfer - I didn't use an infinite loop as the source, so if there was a mismatch in bytes expected vs transferred it would lock. Although it seemed to me that it was locking sooner than this. I should maybe look at that again after this last result and see what was really happening - I had a few too many variables involved :-/

Ragstaff
Atari Super Hero
Posts: 610
Joined: Mon Oct 20, 2003 3:39 am
Location: Melbourne Australia

Re: Quake 2 on Falcon030

Postby Ragstaff » Thu Jun 02, 2016 3:06 pm

Do you think there is any scope to take advantage of overclocked DSPs? I assume you have manually balanced the workloads between the CPU and DSP in their stock form (and rightly so) and a faster DSP can't really be taken advantage of, but just curious. I remember people OC'd their DSPs without any other accelerator board a bit back in the day (e.g. here, up to 64MHz claimed)

dml
Fuji Shaped Bastard
Posts: 3472
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Postby dml » Thu Jun 02, 2016 5:54 pm

Hi,

Ragstaff wrote:Do you think there is any scope to take advantage of overclocked DSPs? I assume you have manually balanced the workloads between the CPU and DSP in their stock form (and rightly so) and a faster DSP can't really be taken advantage of, but just curious. I remember people OC'd their DSPs without any other accelerator board a bit back in the day (e.g. here, up to 64MHz claimed)


It's a fair question - although in this case the answer is probably a bit complicated - it depends.

In a few places it has been balanced exactly for 030@16 vs. 56k@32, and in other places it's just as near as I could get but with some margin on one side or the other. In most places there are optional sync controls which allow the code to run safely with any ratio. In a few (pixel-filling) areas of code there is no sync control at all so things can break there if the ratio is bad. Adding sync control there would make drawing too slow so it's not useful to add it even for accelerated machines.

Generally speaking though it should work providing the ratio of DSP to CPU+BUS favours the DSP side. As the DSP speed goes up vs the bus speed, the probability of things going wrong will drop, even if the CPU is also faster. If the bus is boosted though (e.g. to 20MHz) then the DSP needs an equivalent increase or it will lag behind and the program will freeze. e.g. a 24MHz bus with fast CPU probably needs a DSP at 50MHz minimum for safe pixel drawing (an increase of approx 1.5x).
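
The rule of thumb above amounts to keeping at least the stock DSP-to-bus ratio. A back-of-envelope sketch, assuming simple linear scaling from the stock 16MHz bus / 32MHz DSP pairing; `min_dsp_mhz` is a made-up helper, and the real safe margin would need measuring on hardware.

```c
/* Smallest DSP clock (MHz) that keeps the stock 32:16 DSP-to-bus ratio,
   rounded up to a whole MHz. Linear scaling is an assumption. */
static unsigned min_dsp_mhz(unsigned bus_mhz)
{
    return (bus_mhz * 32u + 15u) / 16u;
}
```

For a 24MHz bus this gives 48MHz, which is in line with the "50MHz minimum" figure once a little margin is added.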

Currently the 030/16 optimized build is probably unsafe on funny speed configurations because sync control is off in various other bits of 3D code and I haven't properly measured margin at all those spots - but is easily rectified with some build options if needed.

jvas
Captain Atari
Posts: 444
Joined: Fri Jan 28, 2005 4:30 pm
Location: Budapest, Hungary

Re: Quake 2 on Falcon030

Postby jvas » Fri Jun 03, 2016 8:00 am

I have an accelerated DSP and everything has worked fine so far (stock CPU/bus speed).

exxos
Hardware Guru
Posts: 4933
Joined: Fri Mar 28, 2003 8:36 pm
Location: England

Re: Quake 2 on Falcon030

Postby exxos » Fri Jun 03, 2016 8:10 am

http://www.exxoshost.co.uk/atari/last/FPU/index.htm

My results for overclocking the FPU on my page.
4MB STFM 1.44 FD- VELOCE+ 020 STE - Falcon 030 CT60 - Atari 2600 - Atari 7800 - Gigafile - SD Floppy Emulator - PeST - various clutter

http://www.exxoshost.co.uk/atari/ All my hardware guides - mods - games - STOS
http://www.exxoshost.co.uk/atari/last/storenew/ - All my hardware mods for sale - Please help support by making a purchase.
http://ataristeven.exxoshost.co.uk/Steem.htm Latest Steem Emulator

Ragstaff
Atari Super Hero
Posts: 610
Joined: Mon Oct 20, 2003 3:39 am
Location: Melbourne Australia

Re: Quake 2 on Falcon030

Postby Ragstaff » Fri Jun 03, 2016 9:30 am

jvas wrote:I have an accelerated DSP and everything has worked fine so far (stock CPU/bus speed).

Great to know, this backs up Doug's answer - the DSP can be much faster and everything will still work. It seems it would not be worth his effort though, to throw more work at the DSP if it's detected running faster so that a fast DSP actually speeds the engine up

exxos wrote:http://www.exxoshost.co.uk/atari/last/FPU/index.htm

My results for overclocking the DSP on my page.

Cool. Apologies if I didn't read it properly but I assume the FPU OC'ing techniques have some applicability to overclocking the DSP?

dml
Fuji Shaped Bastard
Posts: 3472
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Postby dml » Fri Jun 03, 2016 9:39 am

Ragstaff wrote:Great to know, this backs up Doug's answer - the DSP can be much faster and everything will still work. It seems it would not be worth his effort though, to throw more work at the DSP if it's detected running faster so that a fast DSP actually speeds the engine up


I think you'll get some gain as it is because some services are hogging the DSP a bit and the CPU must wait, but only in a few places. I think a 50MHz DSP will cause this waiting to disappear.

The engine also tries to use some of the DSP busy time to update textures (texture popping) so a faster DSP will increase FPS but may also cause some more texture popping. I haven't tested this but it's possible.

Ragstaff wrote:Cool. Apologies if I didn't read it properly but I assume the FPU OC'ing techniques have some applicability to overclocking the DSP?


I don't see any real connection between the two chips here, although some FPU time is spent in the Q2 engine for collision detection and the code for that isn't so great, so a boosted FPU will claw back 5-10% potentially - but probably not more than that. If more than one player or enemy objects were present, this would become a real hotspot and need DSP/FPU optimization.

exxos
Hardware Guru
Posts: 4933
Joined: Fri Mar 28, 2003 8:36 pm
Location: England

Re: Quake 2 on Falcon030

Postby exxos » Fri Jun 03, 2016 9:39 am

Ragstaff wrote:Cool. Apologies if I didn't read it properly but I assume the FPU OC'ing techniques have some applicability to overclocking the DSP?


Oops, I think I was thinking of the Doom thread, where dml was talking a lot about overclocking the FPU, not the DSP. My bad :)

Trixster
Obsessive compulsive Atari behavior
Posts: 145
Joined: Sat Nov 07, 2015 1:15 pm
Location: England

Re: Quake 2 on Falcon030

Postby Trixster » Wed Dec 28, 2016 11:55 am

Anything new to report Doug? Tremendous thread btw!
Atari Falcon + CT60e | Atari 2600 | Atari Jaguar | A1200 + 80mhz B1260 + Indi AGA2 + Ide-fix Express | A3000/060
A4000/060 Cyberstorm Mk2 + Indi AGA + Voodoo3 + Sonnet G3 400Mhz PPC + Deneb | Saturn | PS1 | PS2
Acorn A3020 | A3000 | A420/1 | BBC B | Atom | Master Turbo | A500 | SNES | C64 | 3DO | CPC6128

dml
Fuji Shaped Bastard
Posts: 3472
Joined: Sat Jun 30, 2012 9:33 am

Re: Quake 2 on Falcon030

Postby dml » Wed Dec 28, 2016 2:32 pm

Hi!

My health took a turn for the worse recently so progress on everything has been slow. Have done some more work on AGT to fix a few bugs in the first release and the Atari gcc6 side-project took a bit of time also.

However in the background I have started working on a new 3D engine which should develop things further. Not quake but I think parts of the drawing system for f030 will be based on this work.

I'll continue to maintain the Q2 engine as well to fill the remaining holes and improve the demo but also as a tech sandbox for developing & testing engine parts.

Trixster
Obsessive compulsive Atari behavior
Posts: 145
Joined: Sat Nov 07, 2015 1:15 pm
Location: England

Re: Quake 2 on Falcon030

Postby Trixster » Wed Dec 28, 2016 8:36 pm

Blimey, I hope you're on the mend! Super interested to see the outcome of your next project!

viking272
Captain Atari
Posts: 242
Joined: Mon Oct 13, 2008 12:50 pm
Location: west of London, UK

Re: Quake 2 on Falcon030

Postby viking272 » Wed Dec 28, 2016 8:53 pm

Thanks for the update Doug, interesting developments as always. Hope you're back on your feet soon. Take care.

CiH
Atari God
Posts: 1098
Joined: Wed Feb 11, 2004 4:34 pm
Location: Middle Earth (Npton) UK

Re: Quake 2 on Falcon030

Postby CiH » Wed Dec 28, 2016 9:10 pm

Great to hear from you Doug. Take things easy :-)
"Where teh feck is teh Hash key on this Mac?!"

