Software sprites

Frank B
Atari God
Posts: 1008
Joined: Wed Jan 04, 2006 1:28 am
Location: Boston

Re: Software sprites

Postby Frank B » Thu Oct 09, 2014 10:17 pm

As long as your sprite doesn't have any transparent pixels in the middle.

dml
Fuji Shaped Bastard
Posts: 3474
Joined: Sat Jun 30, 2012 9:33 am

Re: Software sprites

Postby dml » Thu Oct 09, 2014 10:22 pm

Cyprian wrote: Sounds interesting.
Do you have any working examples? I'm curious about some technical details of Endmask usage. For example, how do you mask four bitplanes in one pass with Endmask (Endmask1 and Endmask3 are each applied to only one word per line)? And how do you mask sprites wider than 32 pixels with Endmask (I guess the same mask from Endmask2 will be used for every interior word)? It is also possible that we have a different understanding of "one pass" ☺.


If you go back to the start of this thread, follow Anima's description and sample code (it's pretty much what you guess - a bit limiting for sprites > 32 pixels but good for the 32-pixel case).

You'll also see me failing to beat it with my software sprite experiments :-p (although I never tried very small or large sizes - just the ideal 32-pixel case).
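
To make the Endmask question above concrete, here is a minimal Python sketch of how the three masks are usually described as being applied across one line. It is not Anima's code and not a model of the real hardware timing. It assumes the commonly documented rule that Endmask 1 covers the first word of a line, Endmask 3 the last, and Endmask 2 every word in between, and that a destination bit is replaced only where the mask bit is 1. The function names are made up for this example.

```python
def select_endmask(word_index, x_count, endmask1, endmask2, endmask3):
    """Pick the mask for one word of a line that is x_count words long."""
    if word_index == 0:
        return endmask1          # first word (also the only word when x_count == 1)
    if word_index == x_count - 1:
        return endmask3          # last word of the line
    return endmask2              # every interior word shares this single mask


def write_line(dest, new, endmask1, endmask2, endmask3):
    """Merge 16-bit words of `new` into `dest`, keeping dest bits where the mask is 0."""
    x_count = len(dest)
    out = []
    for i, (old_word, new_word) in enumerate(zip(dest, new)):
        mask = select_endmask(i, x_count, endmask1, endmask2, endmask3)
        out.append(((new_word & mask) | (old_word & ~mask)) & 0xFFFF)
    return out


# A 4-word line: partial masks at both edges, full writes in the middle.
line = write_line([0x0000] * 4, [0xFFFF] * 4, 0x00FF, 0xFFFF, 0xFF00)
print([hex(w) for w in line])
```

Because every interior word gets the same Endmask 2, a different mask per word is only available for the first and last word of a line, which is presumably the limitation mentioned for sprites wider than 32 pixels.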

Zamuel_a
Atari God
Posts: 1234
Joined: Wed Dec 19, 2007 8:36 pm
Location: Sweden

Re: Software sprites

Postby Zamuel_a » Fri Oct 10, 2014 4:46 pm

It seems to work best on a Falcon in 8 bitplane mode. On an STE it doesn't give you much, if any, extra speed. A 64 pixel wide sprite could be better.
ST / STFM / STE / Mega STE / Falcon / TT030 / Portfolio / 2600 / 7800 / Jaguar / 600xl / 130xe

Re: Software sprites

Postby dml » Mon Oct 13, 2014 7:01 pm

Does anyone here know if the blitter incurs any additional overhead if you exchange xcount/ycount iterators?

e.g. setting xcount=1, ycount=n compared with ycount=1, xcount=n

The number of words transferred is the same, but yinc events are more common so if it costs more to perform yinc, it could be slower. I haven't tested this but somebody else might have done so already.
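
The bookkeeping behind this question can be sketched in a few lines of Python. This only counts events. It says nothing about what a y-increment costs in hardware, which is exactly the open question. The helper name is hypothetical.

```python
def blit_counts(x_count, y_count):
    """Return (words transferred, end-of-line events) for one blit."""
    words = x_count * y_count
    line_ends = y_count        # one y-increment step at the end of each line
    return words, line_ends


n = 100
tall = blit_counts(1, n)   # xcount=1, ycount=n
wide = blit_counts(n, 1)   # xcount=n, ycount=1
print(tall, wide)
```

Both layouts move the same number of words, but the first one hits the end-of-line path n times instead of once.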

Cyprian
10 GOTO 10
Posts: 1686
Joined: Fri Oct 04, 2002 11:23 am
Location: Warsaw, Poland

Re: Software sprites

Postby Cyprian » Mon Oct 13, 2014 9:32 pm

That doesn't have any impact. The only overhead is when the blitter takes control of the bus (one bus cycle on the STE) and when it releases it (one bus cycle on the STE). It is especially visible in Blit mode. Unfortunately I'm no longer able to verify the bus mastering on the Falcon.
Lynx II / Jaguar / TT030 / Mega STe / 800 XL / 1040 STe / Falcon030 / 65 XE / 520 STm / SM124 / SC1435
SDrive / PAK68/3 / Lynx Multi Card / LDW Super 2000 / XCA12 / SkunkBoard / CosmosEx / SatanDisk / UltraSatan / USB Floppy Drive Emulator / Eiffel / SIO2PC / Crazy Dots / PAM Net
Hatari / Steem SSE / Aranym / Saint
http://260ste.appspot.com/

Re: Software sprites

Postby dml » Mon Oct 13, 2014 9:47 pm

Cyprian wrote: That doesn't have any impact. The only overhead is when the blitter takes control of the bus (one bus cycle on the STE) and when it releases it (one bus cycle on the STE). It is especially visible in Blit mode. Unfortunately I'm no longer able to verify the bus mastering on the Falcon.


Ok that's useful info. I have sometimes swapped them over to optimize setup where one of the counters is constant, but never thought to check for internal overheads.

mc6809e
Captain Atari
Posts: 159
Joined: Sun Jan 29, 2012 10:22 pm

Re: Software sprites

Postby mc6809e » Mon Oct 13, 2014 11:12 pm

Cyprian wrote: That doesn't have any impact. The only overhead is when the blitter takes control of the bus (one bus cycle on the STE) and when it releases it (one bus cycle on the STE). It is especially visible in Blit mode. Unfortunately I'm no longer able to verify the bus mastering on the Falcon.


Slightly OT, but what's the undocumented feature that gives a 65/128 split between blitter and cpu?

Re: Software sprites

Postby Cyprian » Tue Oct 14, 2014 6:49 am

mc6809e wrote: Slightly OT, but what's the undocumented feature that gives a 65/128 split between blitter and cpu?

Interesting, which feature do you mean?
I know that in Blit mode the official split is 64/64. But this is not correct, because the split is 65 bus cycles for the blitter (66 on the MSTE: 63+3) and 64 bus cycles for the CPU. Of those 65 blitter bus cycles, 63 are used for memory transfers and two for bus mastering (one in, one out).
One undocumented feature I know of is that the blitter counts CPU bus usage, and after the CPU's 64th bus access it takes control of the bus for its 65 bus cycles. A side effect is that long CPU instructions with few memory accesses (like MUL/DIV) can significantly delay the blitter pass.
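
The arithmetic in this post can be written out as a small Python sketch. It uses only the figures given here (64 CPU accesses, 63 blitter transfers, 2 or 3 arbitration cycles) and assumes the CPU uses its 64 accesses back to back, which is the best case for the blitter. The function name is made up for illustration.

```python
CPU_ACCESSES = 64        # CPU bus accesses per round, as described in the post
BLITTER_TRANSFERS = 63   # blitter bus cycles that actually move data


def blitter_share(arbitration_cycles):
    """Return (blitter slot, round length, fraction of the round spent on transfers)."""
    blitter_slot = BLITTER_TRANSFERS + arbitration_cycles
    round_length = CPU_ACCESSES + blitter_slot
    return blitter_slot, round_length, BLITTER_TRANSFERS / round_length


for name, arb in (("STE", 2), ("Mega STE", 3)):
    slot, total, share = blitter_share(arb)
    print(f"{name}: blitter slot {slot}, round {total}, transfer share {share:.3f}")
```

Under that assumption the blitter moves data in slightly less than half of all bus cycles, rather than the nominal 50%.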

Re: Software sprites

Postby mc6809e » Tue Oct 14, 2014 10:11 am

Cyprian wrote:
mc6809e wrote: Slightly OT, but what's the undocumented feature that gives a 65/128 split between blitter and cpu?

Interesting, which feature do you mean?
I know that in Blit mode the official split is 64/64. But this is not correct, because the split is 65 bus cycles for the blitter (66 on the MSTE: 63+3) and 64 bus cycles for the CPU. Of those 65 blitter bus cycles, 63 are used for memory transfers and two for bus mastering (one in, one out).


Ah, right. Bus arbitration overhead.

Cyprian wrote: One undocumented feature I know of is that the blitter counts CPU bus usage, and after the CPU's 64th bus access it takes control of the bus for its 65 bus cycles. A side effect is that long CPU instructions with few memory accesses (like MUL/DIV) can significantly delay the blitter pass.


That's a new one to me.

Do you mean the CPU must actually touch memory 64 times before the blitter starts again? I thought the switch happened after 64 bus cycles whether the CPU used them or not.

Re: Software sprites

Postby Cyprian » Tue Oct 14, 2014 1:40 pm

mc6809e wrote: Do you mean the CPU must actually touch memory 64 times before the blitter starts again?

It seems so.
mc6809e wrote: I thought the switch happened after 64 bus cycles whether the CPU used them or not.

Nope.
Using the STOP (and MUL/DIV) instructions, I delayed every blitter pass (in Blit mode) by thousands of cycles.
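
A rough Python model of why sparse bus usage stretches the wait, assuming the behaviour described in this thread (the blitter resumes only after the CPU has made 64 bus accesses). The cycle figures below are illustrative assumptions, not measurements, and the function name is hypothetical.

```python
def cpu_phase_cycles(cycles_per_access, accesses_needed=64):
    """CPU clock cycles that pass before the blitter gets its next slot."""
    return cycles_per_access * accesses_needed


# Assumed, illustrative figures (not measured on hardware):
dense = cpu_phase_cycles(4)     # one bus access every 4 cycles, e.g. tight move loops
sparse = cpu_phase_cycles(70)   # roughly one access per ~70-cycle multiply-like instruction
print(dense, sparse)
```

With these assumed numbers the blitter waits 256 cycles behind bus-heavy code but 4480 cycles behind a run of slow, register-only instructions. STOP is the extreme case, since the CPU makes no accesses at all until an interrupt arrives.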

Re: Software sprites

Postby dml » Tue Oct 14, 2014 1:42 pm

Cyprian wrote: Nope.
Using the STOP (and MUL/DIV) instructions, I delayed every blitter pass (in Blit mode) by thousands of cycles.


That's interesting. If the Falcon does the same, it will behave very strangely with cached loops.

Re: Software sprites

Postby Cyprian » Tue Oct 14, 2014 2:10 pm

dml wrote: That's interesting. If the Falcon does the same, it will behave very strangely with cached loops.

It seems it does. That would explain the bad figures for Blit mode (marked as BLIT_INT) in my blitter test:

Code:

# Moving 52 MB (800 x 65536 bytes)

# Falcon 030 16MHz
BLIT_HOG        17.9 sec.        2 928 983 bytes/s
BLIT_INT        69.7 sec.          752 207 bytes/s

# 1040 STe
BLIT_HOG        26.5 sec.        1 978 445 bytes/s
BLIT_INT        58.5 sec.          896 219 bytes/s


You can find the full thread there.
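
As a sanity check on the figures above, this short Python sketch divides the total size from the test header (800 x 65536 bytes) by each reported time. The results reproduce the third column of the listing, which suggests that column is in bytes per second, i.e. roughly 2.9 MB/s at best. Only the sizes and timings come from the post. The variable names are made up.

```python
TOTAL_BYTES = 800 * 65536   # 52,428,800 bytes, from the test header

timings = {
    ("Falcon 030", "BLIT_HOG"): 17.9,
    ("Falcon 030", "BLIT_INT"): 69.7,
    ("1040 STe", "BLIT_HOG"): 26.5,
    ("1040 STe", "BLIT_INT"): 58.5,
}

rates = {key: TOTAL_BYTES / secs for key, secs in timings.items()}
for (machine, mode), rate in rates.items():
    print(f"{machine:10s} {mode}  {rate:12,.0f} bytes/s  ({rate / 1e6:.2f} MB/s)")
```

Seen this way, Blit mode is about 3.9 times slower than Hog mode on the Falcon but only about 2.2 times slower on the STe.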

Re: Software sprites

Postby mc6809e » Tue Oct 14, 2014 2:38 pm

Cyprian wrote:
mc6809e wrote: Do you mean the CPU must actually touch memory 64 times before the blitter starts again?

It seems so.
mc6809e wrote: I thought the switch happened after 64 bus cycles whether the CPU used them or not.

Nope.
Using the STOP (and MUL/DIV) instructions, I delayed every blitter pass (in Blit mode) by thousands of cycles.


I guess the designers believed that it is better to guarantee a certain number of memory accesses than to guarantee the CPU a fixed amount of time. And I can see where keeping the blitter paused during a STOP would be nice if there's a need to reduce interrupt latency. Maybe there were compatibility concerns.

Still, it seems to make things more difficult for the programmer. Blitter scheduling becomes harder, since the timing of blitter accesses can depend on the data for some code.

And of course for the Falcon, where the CPU has a cache, the design decision is terrible since it reduces the opportunities for CPU/blitter concurrency. Unless the CPU keeps restarting the blitter, the blitter will spend lots of time sitting idle waiting for many cache misses. Ugh.

Re: Software sprites

Postby Cyprian » Tue Oct 14, 2014 4:05 pm

mc6809e wrote: Unless the CPU keeps restarting the blitter, the blitter will spend lots of time sitting idle waiting for many cache misses. Ugh.

Actually, restarting the blitter in Blit mode is common practice.

Generally I agree that the blitter in the Falcon could be better optimized. It's a similar story with the Amiga 1200, where the blitter is taken directly from the A500. It is 16-bit, has A500 performance, and does not take advantage of the 32-bit bus. I wonder why they didn't tweak it a bit more.

Re: Software sprites

Postby mc6809e » Wed Oct 15, 2014 7:05 pm

Cyprian wrote:
mc6809e wrote: Unless the CPU keeps restarting the blitter, the blitter will spend lots of time sitting idle waiting for many cache misses. Ugh.

Actually, restarting the blitter in Blit mode is common practice.


And how does the 68030 know when to restart the blitter? Think about what the code would look like.

Any code would have to waste cycles periodically polling the blitter to see if it has stopped. Explicit checks would have to be scattered throughout.

Cyprian wrote:
Generally I agree that the blitter in the Falcon could be better optimized. It's a similar story with the Amiga 1200, where the blitter is taken directly from the A500. It is 16-bit, has A500 performance, and does not take advantage of the 32-bit bus. I wonder why they didn't tweak it a bit more.


Very different stories, IMO.

The original blitter on the Amiga has no trouble running concurrently with the processor, 68000, 68020, or otherwise. The blitter will automatically yield to the processor for one DMA cycle if the processor has already been blocked for more than three cycles. It can also access memory at up to twice the speed of the processor, using just two CPU cycles per access. And finally it is a three-source blitter, meaning that masking can happen in the same blit as merging.

That doesn't mean a 32-bit blitter wouldn't have been nice. It would have been an improvement. The problem is that it might have involved a redesign of many other parts of the system. All the advantages listed above are there because the system is so well-integrated, but it's exactly that level of integration that made a redesign difficult.
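
The "masking in the same blit as merging" point can be illustrated with plain word logic. With three sources (mask, sprite, background) the classic masked merge is a single expression per word. The Python sketch below models only that logic. It does not model Amiga registers, DMA or timing, and the names are illustrative.

```python
def masked_merge(mask, sprite, background):
    """Per 16-bit word: take sprite bits where mask is 1, background bits elsewhere."""
    return ((mask & sprite) | (~mask & background)) & 0xFFFF


background = 0b1010101010101010
sprite     = 0b0000111111110000
mask       = 0b0000111111110000   # 1 = opaque sprite pixel
result = masked_merge(mask, sprite, background)
print(f"{result:016b}")   # prints 1010111111111010
```

With only two inputs per pass, the mask (AND) and the merge (OR) have to be arranged some other way, for example through Endmask or a second pass, which is what the earlier part of the thread is about.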

leonard
Moderator
Posts: 645
Joined: Thu May 23, 2002 10:48 pm

Re: Software sprites

Postby leonard » Tue Mar 31, 2015 4:27 pm

Hi!

I read this old thread, very interesting. I always wondered whether the blitter could draw sprites faster than the CPU. My conclusion is that if memory is not a problem (for instance in a sprite demo), then I'm pretty sure the CPU always beats the blitter on an STE.

For instance look at that old demo:

http://www.pouet.net/prod.php?which=29317

I think the blitter can't beat 22 sprites, 3 bitplanes, 32*31, on a standard STE.
Leonard/OXYGENE.

Anima
Atari Super Hero
Posts: 667
Joined: Fri Mar 06, 2009 9:43 am

Re: Software sprites

Postby Anima » Tue Mar 31, 2015 5:52 pm

leonard wrote: I think the blitter can't beat 22 sprites, 3 bitplanes, 32*31, on a standard STE.

I don't think so either, but it would be an interesting challenge. :D

However, demos and games are always a different story when you have to keep flexibility and memory consumption in mind. A game with many sprite animation frames will eat up your RAM very fast, especially when you want to use all four bitplanes and bigger sprites.

So how do you think an optimised software routine would perform compared to an optimised blitter routine using arbitrary 32 x 32 pixel sprites with 16 colours?
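
To put a number on the memory concern, here is a back-of-the-envelope Python sketch. It assumes the fast CPU routines rely on pre-shifted sprite data: one copy per horizontal pixel offset, one extra word per line for the shifted-out pixels, and a separate one-plane mask. The thread does not state which storage scheme is meant, so treat that layout, the function name and the 64-frame figure as assumptions.

```python
def preshifted_frame_bytes(width_px, height, bitplanes, shifts=16, with_mask=True):
    """Bytes for one animation frame stored at every horizontal shift."""
    words_per_line = width_px // 16 + 1      # one extra word for the shifted-out pixels
    planes = bitplanes + (1 if with_mask else 0)
    return shifts * height * words_per_line * planes * 2   # 2 bytes per word


per_frame = preshifted_frame_bytes(32, 32, 4)
print(per_frame, "bytes per frame;", 64 * per_frame // 1024, "KB for 64 frames")
```

Under these assumptions a single 32 x 32, 16-colour frame costs 15360 bytes, so 64 frames would already need 960 KB.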

Re: Software sprites

Postby Cyprian » Wed Apr 01, 2015 9:30 am

leonard wrote: I think the blitter can't beat 22 sprites, 3 bitplanes, 32*31, on a standard STE.

Under what conditions?
sprites 32*31, 3 bitplanes, fully masked?
what about background? fully restored or just cleared?
music? scroll? raster?

Re: Software sprites

Postby leonard » Wed Apr 01, 2015 1:15 pm

Cyprian wrote: Under what conditions?
sprites 32*31, 3 bitplanes, fully masked?
what about background? fully restored or just cleared?
music? scroll? raster?


I was talking about the sprite demo by MCoder. Let's say exactly the same demo: no background, same music, same scroller, same sprites. I guess the blitter can't beat that demo.
Leonard/OXYGENE.

