Since you are using an STE, why not use the Blitter to do the sprites? The source and destination X and Y increments sort out all of these problems, and you do not need to pre-shift. I think you have to blit one bitplane at a time and use the SKEW register to shift it, but I can't remember exactly. Does anyone know of example source code that does this?
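Not real example source, but here is a rough sketch in C of the arithmetic I think is involved in blitting one bitplane of a sprite. It only works out the numbers you would put in the registers; it does not touch the hardware. Everything in it is my assumption rather than anything checked on a real machine: ST low resolution (160 bytes per line, 4 interleaved planes, so 8 bytes between consecutive words of the same plane), a destination Y increment measured from the last word of one line to the first word of the next, and the made-up names PlaneBlit and plan_plane_blit.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed ST low-res layout: 160 bytes per scanline, 4 interleaved
   bitplanes, so each 16-pixel group is 8 bytes and each plane's word
   sits 2 bytes after the previous plane's word. */
enum { LINE_BYTES = 160, GROUP_BYTES = 8, PLANE_BYTES = 2 };

/* Hypothetical holder for the values one single-plane blit would need. */
typedef struct {
    uint32_t dest_offset;   /* byte offset from the start of the screen */
    uint16_t x_count;       /* destination words written per line       */
    int16_t  dest_x_inc;    /* bytes between words of the same plane    */
    int16_t  dest_y_inc;    /* bytes from last word of a line to the
                               first word of the next line              */
    uint8_t  skew;          /* right shift in pixels, 0..15             */
    uint16_t endmask1;      /* mask for the first word of each line     */
    uint16_t endmask2;      /* mask for the middle words                */
    uint16_t endmask3;      /* mask for the last word of each line      */
} PlaneBlit;

/* Work out the values for drawing a sprite that is width_words wide
   at pixel position (x, y) into bitplane `plane` (0..3). */
PlaneBlit plan_plane_blit(int x, int y, int width_words, int plane)
{
    assert(x >= 0 && y >= 0 && width_words >= 1);
    assert(plane >= 0 && plane < 4);

    PlaneBlit b;
    b.skew = (uint8_t)(x & 15);

    /* A shifted sprite spills into one extra destination word. */
    b.x_count = (uint16_t)(width_words + (b.skew ? 1 : 0));

    b.dest_offset = (uint32_t)(y * LINE_BYTES
                               + (x >> 4) * GROUP_BYTES
                               + plane * PLANE_BYTES);
    b.dest_x_inc = GROUP_BYTES;
    b.dest_y_inc = (int16_t)(LINE_BYTES - (b.x_count - 1) * GROUP_BYTES);

    /* Protect the background pixels left of the sprite in the first
       word and right of it in the spill word. */
    b.endmask1 = (uint16_t)(0xFFFFu >> b.skew);
    b.endmask2 = 0xFFFFu;
    b.endmask3 = b.skew ? (uint16_t)(0xFFFFu << (16 - b.skew)) : 0xFFFFu;
    return b;
}
```

So for a 32-pixel-wide sprite at x = 37 you would call plan_plane_blit(37, y, 2, plane) four times, once for each plane from 0 to 3, and each call gives a skew of 5 and 3 destination words per line. The source side (increments, and whether the extra or final source read bits need setting) is left out because I am not sure enough of the details.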
You should be able to get a number of sprites and pixel-perfect hardware scrolling for next to no CPU time. In fact you do not even need extensions: just POKE the values into the hardware registers if you are feeling brave. And by POKE I mean you can LOKE them 4 bytes at a time and save a load of CPU time (doing it this way is a little more complicated, but it does work).
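To show what I mean by writing 4 bytes at a time, here is a small C sketch that fakes a block of registers in memory and writes two neighbouring 16-bit registers with one 32-bit store. The offsets for the source X and Y increments ($20 and $22 from the Blitter base) are from memory, so treat them as assumptions, and RegBlock, loke, deek and pack_words are names I made up for the illustration. One thing that can complicate it: the 68000 is big-endian, so the first register goes in the high word, and a negative increment has to be cut down to 16 bits before packing or its sign bits will trash the other register.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets from the Blitter base, quoted from memory (assumptions). */
enum { SRC_X_INC = 0x20, SRC_Y_INC = 0x22 };

/* Hypothetical stand-in for the register block, so this runs anywhere. */
typedef struct { uint8_t bytes[0x40]; } RegBlock;

/* Write 4 bytes, most significant byte first, as the 68000 would. */
void loke(RegBlock *r, int off, uint32_t v)
{
    assert(off >= 0 && off + 4 <= (int)sizeof r->bytes);
    r->bytes[off]     = (uint8_t)(v >> 24);
    r->bytes[off + 1] = (uint8_t)(v >> 16);
    r->bytes[off + 2] = (uint8_t)(v >> 8);
    r->bytes[off + 3] = (uint8_t)v;
}

/* Read back one 16-bit register, most significant byte first. */
uint16_t deek(const RegBlock *r, int off)
{
    assert(off >= 0 && off + 2 <= (int)sizeof r->bytes);
    return (uint16_t)((r->bytes[off] << 8) | r->bytes[off + 1]);
}

/* Pack two signed 16-bit values into one long. `first` lands at the
   lower address. Both are truncated to 16 bits so that a negative
   `second` cannot sign-extend into `first`. */
uint32_t pack_words(int16_t first, int16_t second)
{
    return ((uint32_t)(uint16_t)first << 16) | (uint16_t)second;
}
```

With that, loke(&regs, SRC_X_INC, pack_words(8, -6)) sets both increments in one go instead of two separate word writes. The same trick should apply to any other pair of word registers that sit next to each other, provided the pair starts on an even address.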