dml wrote:If you mean: can it read direct from non-interleaved Amiga-formatted plane data, then yes. You can specify the skip size for horizontal and vertical intervals at both source and destination, so you can interleave/deinterleave during blits. You may find though that some jobs which could otherwise be done in a single blit may end up needing 4 / one-per-plane (which isn't such a big deal unless the thing being drawn is tiny and numerous - and some common things need 4 anyway).
Code:
    move.w  #$ffff,$ffff8a28.w       ; endmask 1 (left, or 1st word in line)
    move.w  #$ffff,$ffff8a2a.w       ; endmask 2 (middle words)
    move.w  #$ffff,$ffff8a2c.w       ; endmask 3 (right, or last word in line)
    move.w  #2,$ffff8a20.w           ; source x increment in bytes for 1 plane (Amiga)
    move.w  #srcskip+2,$ffff8a22.w   ; source y increment in bytes (plus any x increment)
    move.w  #8,$ffff8a2e.w           ; dest x increment in bytes for 1 plane (ST)
    move.w  #dstskip+8,$ffff8a30.w   ; dest y increment in bytes (plus any x increment)
    move.b  #0,$ffff8a3d.w           ; skew/scroll
    move.b  #2,$ffff8a3a.w           ; halftone op (2: [S]rc)
    move.b  #3,$ffff8a3b.w           ; logic op (3: [D]st=[S]rc)
    move.w  #4,$ffff8a36.w           ; words to transfer per image line
    move.l  sourcebuf,$ffff8a24.w    ; data source address
    move.l  destbuf,$ffff8a32.w      ; data dest address
    move.w  #16,$ffff8a38.w          ; image lines to transfer
    move.b  #$80,$ffff8a3c.w         ; go! (bus-sharing mode)
.cont:  bset.b  #7,$ffff8a3c.w       ; force-restart until complete
    bne.s   .cont
Code:
    move.w  lines,d7
    subq.w  #1,d7
.yloop:
    move.w  blit_linewords,d6
    subq.w  #1,d6
    bra.s   .xstart
.xloop:
    add.w   src_xinc,src
    add.w   dst_xinc,dst
.xstart:
    move.w  (src),(dst)
    dbra    d6,.xloop
    add.w   src_yinc,src             ; oops! src_xinc wasn't applied after the last word!
    add.w   dst_yinc,dst
    dbra    d7,.yloop
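The loop above can be modelled on a host machine to check increment values before they go into the registers. What follows is only a sketch in Python, not ST code: the names `y_increment` and `word_addresses` are made up for illustration, FXSR/NFSR are assumed to be off, and the Amiga-side plane is assumed to be 4 words wide with no padding.

```python
def y_increment(line_stride, x_inc, words_per_line):
    """Y increment (bytes) that lands on the next line's first word.

    The x increment is only applied between words, so a line advances
    (words_per_line - 1) * x_inc before the y increment is added.
    """
    return line_stride - (words_per_line - 1) * x_inc


def word_addresses(start, x_inc, y_inc, words_per_line, lines):
    """Addresses touched, line by line, mirroring the loop above."""
    addr, result = start, []
    for _ in range(lines):
        line = []
        for w in range(words_per_line):
            line.append(addr)
            if w < words_per_line - 1:
                addr += x_inc      # not applied after the last word
        addr += y_inc
        result.append(line)
    return result


# ST screen: 160 bytes per line, one plane = one word out of every 8 bytes.
dst_yinc = y_increment(160, 8, 4)             # 136, i.e. dstskip = 128 in "dstskip+8"
# Amiga-style plane, assumed 4 words (8 bytes) per line.
src_yinc = y_increment(8, 2, 4)               # 2, i.e. srcskip = 0 in "srcskip+2"
print(word_addresses(0, 8, dst_yinc, 4, 2))   # [[0, 8, 16, 24], [160, 168, 176, 184]]
```

The same formula gives 8 for a 20-word line on a 160-byte stride with an x increment of 8.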
Code:
MounXPositionNotGreaterThen320
    ; new print
    move.w  d0,d6
    and.w   #$f,d6
    eor.w   #$f,d6                   ; skew = 15 - (x & 15)
    and.w   #$fff0,d0
    lsr.w   #1,d0                    ; 16 pixels = 8 bytes in the 4-plane layout
    add.w   6(a6),d0
    add.l   #MountainGraphicAddr,d0
    moveq   #0,d2
    move.w  (a6),d2
    lea     YMulList(pc),a0          ; Ypos For Printing Clouds and Mountain
    add.w   d2,d2
    move.w  (a0,d2.w),d2
    add.l   log_scr(pc),d2
    move.w  #$ffff,$ffff8a28.w       ; endmask 1 (left, or 1st word in line)
    move.w  #$ffff,$ffff8a2a.w       ; endmask 2 (middle words)
    move.w  #$ffff,$ffff8a2c.w       ; endmask 3 (right, or last word in line)
    move.l  #$00080008,$ffff8a20.w   ; source x increment = 8, source y increment = 8
    move.l  #$00080008,$ffff8a2e.w   ; dest x increment = 8, dest y increment = 8
    move.b  d6,$ffff8a3d.w           ; skew / scroll
    move.w  #$0203,$ffff8a3a.w       ; halftone op 2 ([S]rc), logic op 3 ([D]st=[S]rc)
    move.w  #20,$ffff8a36.w          ; words to transfer per image line
    move.w  2(a6),d1
    moveq   #3,d4                    ; 4 bitplanes, one blit each
BlitLoop:
    movem.l d0/d2,-(sp)
    move.l  d0,$ffff8a24.w           ; data source address
    move.l  d2,$ffff8a32.w           ; data dest address
    move.w  d1,$ffff8a38.w           ; image lines to transfer
    move.b  #$80,$ffff8a3c.w         ; go! (bus-sharing mode)
.cont:
    bset.b  #7,$ffff8a3c.w           ; force-restart until complete
    bne.s   .cont
    movem.l (sp)+,d0/d2
    addq.w  #2,d0                    ; next plane in source
    addq.w  #2,d2                    ; next plane in destination
    dbra    d4,BlitLoop
    rts
chicane wrote:I must admit that I'm too lazy to trace through your code to work out if this is the case, but have you enabled the FXSR bit (bit 7 of $FFFF8A3D)?
It looks like the blitter might be writing the first 16 pixels of the mountain without first being initialised with some source content. I think when you set a skew value, you also need in most cases to set FXSR to ensure that the blitter doesn't write garbage for the first 16 pixels.
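To see why a skewed blit without FXSR can put stray pixels into the first word, here is a simplified host-side model in Python. It is a sketch under assumptions, not a cycle-accurate emulation: it covers only a left-to-right copy (logic op 3) with all endmasks at $ffff, and `blit_line` and `stale` are names invented for the example.

```python
def blit_line(src_words, count, skew, fxsr=False, stale=0xFFFF):
    """Return `count` destination words for one line of a source copy.

    `stale` stands in for whatever the source buffer held before the
    line started; on real hardware it is not a value to rely on.
    """
    buf = stale
    pos = 0
    if fxsr:                       # the extra read preloads the buffer
        buf = src_words[pos]
        pos += 1
    out = []
    for _ in range(count):
        # previous word moves to the high half, new word enters the low half
        buf = ((buf << 16) | src_words[pos]) & 0xFFFFFFFF
        pos += 1
        out.append((buf >> skew) & 0xFFFF)
    return out


src = [0x1234, 0x5678, 0x9ABC]
print([hex(w) for w in blit_line(src, 2, 4)])             # ['0xf123', '0x4567']
print([hex(w) for w in blit_line(src, 2, 4, fxsr=True)])  # ['0x4567', '0x89ab']
```

Without FXSR the top `skew` bits of the first word come from the stale buffer (the `f` in `0xf123`). With FXSR every output bit comes from real source data, at the cost of one extra source word read per line.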
Code:
    move.l  #$00080008,$ffff8a20.w   ; source x increment = 8, source y increment = 8
    move.l  #$00080008,$ffff8a2e.w   ; dest x increment = 8, dest y increment = 8
Code:
    move.l  #$00080000,$ffff8a20.w   ; source x increment = 8, source y increment = 0
    move.l  #$00080008,$ffff8a2e.w   ; dest x increment = 8, dest y increment = 8
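The only difference between the two snippets is the source y increment (8 versus 0). A plausible reading, assuming the second version runs with FXSR set, 20 words per line and a 160-byte source line, is that the extra source read adds one more x increment per line, so the y increment has to shrink by the same amount. The Python sketch below just does that arithmetic; `source_line_advance` is a hypothetical helper, not part of any toolchain.

```python
def source_line_advance(x_inc, y_inc, words_per_line, fxsr):
    """Bytes the source address moves from one line start to the next."""
    reads = words_per_line + (1 if fxsr else 0)   # FXSR adds one read per line
    return (reads - 1) * x_inc + y_inc


WORDS = 20          # one bitplane across a 320-pixel line
print(source_line_advance(8, 8, WORDS, fxsr=False))  # 160
print(source_line_advance(8, 8, WORDS, fxsr=True))   # 168
print(source_line_advance(8, 0, WORDS, fxsr=True))   # 160
```

Under these assumptions, keeping the y increment at 8 with FXSR on would make the source drift by 8 bytes on every line.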
chicane wrote:
FedePede04 wrote:and it looks right, many thx for your help
Great! But it sounds like you may have got it working before I mentioned FXSR - did you actually need to use FXSR in the end to get the desired effect?
FedePede04 wrote:it is my first blitter code, and i always thought that the blitter was more complicated, but it was actually quite simple, but i bet a sprite routine ain't as simple
Yes - I also held off adding STE support to Pole Position because I was intimidated by the prospect of writing Blitter code. But when I started reading the documentation and looking at examples, it all came very easily! Same with DMA sound - the biggest effort was working out how to create binary files that were in the right format!
FedePede04 wrote:btw many people say that if the array is too small, then it's as fast to use the cpu instead of the blitter. do you know how big an array has to be before it's better to use the blitter?
I used the Blitter in Pole Position to draw the background mountains and the road. The background mountains were rendered in 4 bitplanes, with one call to the Blitter per line (256 pixels) rendered. I used the Blitter in Hog mode. This was faster than using movem, but not to the degree that you might hope for. Perhaps 10-20% faster. I've not done any extensive testing, but I'd imagine that you'd be getting very slim benefits over the use of movem on any span of less than 128 pixels.
Having said this, I seem to remember reading somewhere that the Blitter really shines when working with fewer than 4 bitplanes. I think it's much, much faster than a CPU routine when dealing with 1 bitplane, with diminishing returns with each additional bitplane used.
The Blitter brought massive benefits to the road drawing in Pole Position. The non-STE routine I'd previously implemented was quite clever (IMHO) but was oriented towards reducing memory consumption rather than being as fast as possible. With the help of the blitter, I was able to use a much simpler algorithm resembling that used by the arcade hardware - just holding each line of the road as a bitmap in memory and plotting it to the road with the use of Skew to position it with single-pixel accuracy. The downside of this, of course, was the increased memory requirement, which (together with the use of DMA sound) resulted in the game needing 1 meg to run in the end.
FedePede04 wrote:no, it did not work properly before you helped me!
but you write that you paste all the mountain's bitplanes in one go. I have not managed that; when i did that, it looked like it scrolled bitplane 1 into bitplane 2, bitplane 2 into bitplane 3, etc.
so i have to make the blitter first copy bitplane 1, then bitplane 2, etc. it works fine and it is still faster.
Code:
word @ 0xffff8a28 (endmask1) = -1
word @ 0xffff8a2a (endmask2) = -1
word @ 0xffff8a2c (endmask3) = -1
word @ 0xffff8a20 (srcxinc) = 8
word @ 0xffff8a22 (srcyinc) = -126
word @ 0xffff8a2e (dstxinc) = 8
word @ 0xffff8a30 (dstyinc) = -118
word @ 0xffff8a36 (xcount) = 16
word @ 0xffff8a38 (ycount) = 4
word @ 0xffff8a3a (hop/op) = 0x0203
longword @ 0xffff8a24 (source) = your source address
longword @ 0xffff8a32 (destination) = your destination address
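The register values above appear to treat each bitplane as one blitter "line": 16 words of one plane (256 pixels), repeated 4 times, with the negative y increment stepping back to the start of the row plus 2 bytes. The Python sketch below checks where each "line" starts. It is an illustration only: `line_starts` is an invented helper, and the idea that the source side relies on one extra read per line (as FXSR would cause) is an inference from the numbers, not something the listing states.

```python
def line_starts(x_inc, y_inc, x_count, y_count, extra_reads=0):
    """Byte offset at which each blitter line begins, relative to the base."""
    starts, addr = [], 0
    for _ in range(y_count):
        starts.append(addr)
        addr += (x_count - 1 + extra_reads) * x_inc + y_inc
    return starts


print(line_starts(8, -118, 16, 4))                 # dest:   [0, 2, 4, 6]
print(line_starts(8, -126, 16, 4, extra_reads=1))  # source: [0, 2, 4, 6]
print(line_starts(8, -126, 16, 4))                 # [0, -6, -12, -18]
```

Offsets 0, 2, 4 and 6 are the four plane words of an interleaved ST 16-pixel group, so one blit covers all four planes of a 256-pixel row. The last line shows that the source increment of -126 only works out if an extra word is read per line.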
FedePede04 wrote:i have also thought of using the blitter for the road, but was thinking that it maybe was not faster, and the road color also changes, so you need to copy all 4 bitplanes. but i will wait until later on.
I will see if i can make a routine that can replace the graphics and reallocate the colors in the graphics data file, and if i can free a color, so that i always have the road color in color 2.
if the road color was always in color 2, it should make the road routine much simpler and speed up the print process.
FedePede04 wrote:one thing that i have been reading about the blitter is that you get the shift for free, so everything that can be shifted should be printed by the blitter; also you don't have to preshift it, saving some memory. but this game shifts all graphics: road, clouds/mountains, and the sprites, so if i can work out the logic of the sprite routine, there should be room for some serious speed optimizations.
FedePede04 wrote:again many thx for the help, and the end result of "Pole Position" is just super....
btw which mame did you use? did you use an external disassembler or did you use mame?
i am thinking of converting an arcade game after this project.
FedePede04 wrote:ok, here is another question.
people say that it does not pay to use the blitter to clear a screen, that it is nearly as fast to use the cpu.
but if you get a gain, maybe a little one, would it not still be a benefit?
FedePede04 wrote:could you do the same trick as on the Amiga: let the blitter fill the top part of the screen (Atari in hog mode)
and at the same time clear the other half of the screen using the cpu?
what is your experience with this situation?
RA_pdx wrote:The fastest way to clear the screen (32 KB) with the CPU on the ST is something like this:
Code:
    movem.l d0-d7/a0-a5,-(a6)
; or
    movem.l d0-d7/a0-a6,xxx.w        ; screen address has to be in low memory
Unrolled, it needs between 68k and 69k cycles.
The Blitter in HOG mode needs around 64k cycles for the same job.
So you gain only 4-5k cycles with the Blitter in this case. However, sometimes every cycle counts.
As soon as you don't want to clear all 4 planes, it's time for the Blitter!
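Those figures can be roughly reproduced with a little arithmetic, sketched here in Python. The assumptions are a 32000-byte screen, the standard 68000 timing of 8 + 8n cycles for `movem.l` of n registers to `-(An)`, and 4 cycles per word for a write-only blit; setup code and any loop overhead are ignored, and both function names are made up for the example.

```python
def movem_clear_cycles(total_bytes=32000, regs=14):
    """Cycles for unrolled movem.l <regs>,-(An) covering total_bytes."""
    full, rest = divmod(total_bytes, 4 * regs)
    cycles = full * (8 + 8 * regs)          # 8 + 8n per movem.l to -(An)
    if rest:
        cycles += 8 + 8 * (rest // 4)       # one shorter movem.l for the tail
    return cycles


def blitter_clear_cycles(total_bytes=32000, cycles_per_word=4):
    """Cycles for a write-only blit (no source or destination reads)."""
    return (total_bytes // 2) * cycles_per_word


cpu = movem_clear_cycles()
blit = blitter_clear_cycles()
print(cpu, blit, cpu - blit)    # 68576 64000 4576
```

Under these assumptions the CPU version lands between 68k and 69k cycles and the blitter at 64k, which matches the 4-5k difference quoted above.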
Code:
; Unroll loop
    move.w  #199,d0
loop:
    movem.l d1-d7/a0-a5,-(a6)        ; 13 registers = 52 bytes per pass
    dbra    d0,loop