Scaler

https://github.com/MiSTer-devel/Main_MiSTer/wiki

Moderators: Mug UK, Zorro 2, spiny, Greenious, Sorgelig, Moderator Team

Sorgelig
Fuji Shaped Bastard
Posts: 3172
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: Scaler

Postby Sorgelig » Tue Nov 06, 2018 7:11 pm

Grabulosaure wrote:which other important part is missing ?

It was always on my TODO list, but with this scaler it may become reality: I want to add a screen grabber on the Linux side. Since the buffer is in DDR3, which is shared with Linux, it will be relatively easy.
I see the current implementation uses a floating buffer size depending on the input frame size. I suggest simplifying it with a maximum size defined in a module parameter. Basically, 1920x1080 is the absolute maximum input frame, so it can even be hardcoded. It would also be better for every line to start at a fixed address regardless of the frame size; that will also simplify the scaler calculations and decrease its complexity.
0x800000 (8MB) would be the size of a single buffer, with 24MB for triple buffering. In the last 2 words of the buffer I suggest writing the width and height of the current frame in pixels.
Linux will grab these values from the fixed address and then save the snapshot with an easy layout.
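
A rough sketch of how the Linux side might address such a layout, assuming the 8MB buffers and fixed line start addresses proposed above. The 5-pixels-per-128-bit packing is taken from later in this thread; the 32-bit size of the two metadata words, their order, and all names here are assumptions for illustration only.

```python
BUFFER_SIZE = 0x800000      # 8MB per buffer (from the post)
NUM_BUFFERS = 3             # triple buffering, 24MB in total
MAX_WIDTH, MAX_HEIGHT = 1920, 1080
PIXELS_PER_WORD = 5         # 5 pixels per 128-bit word (mentioned later in the thread)
WORD_BYTES = 16
META_WORD_BYTES = 4         # assumption: width and height are stored as 32-bit values

# Fixed stride: every line starts at the same offset regardless of frame size.
# -(-a // b) is ceiling division.
LINE_STRIDE = -(-MAX_WIDTH // PIXELS_PER_WORD) * WORD_BYTES


def line_offset(buffer_index, y):
    """Byte offset of line y inside the given buffer."""
    if not (0 <= buffer_index < NUM_BUFFERS and 0 <= y < MAX_HEIGHT):
        raise ValueError("buffer index or line out of range")
    return buffer_index * BUFFER_SIZE + y * LINE_STRIDE


def metadata_offsets(buffer_index):
    """Byte offsets of the last two words of a buffer (assumed: width, then height)."""
    end = (buffer_index + 1) * BUFFER_SIZE
    return end - 2 * META_WORD_BYTES, end - META_WORD_BYTES


# A full 1920x1080 frame plus the metadata still fits into one 8MB buffer.
print(LINE_STRIDE, LINE_STRIDE * MAX_HEIGHT, BUFFER_SIZE)
```

With a fixed stride the grabber only needs the two size values to know how much of each line and how many lines to copy.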

cacophony
Atari maniac
Posts: 98
Joined: Sun Jul 22, 2018 11:14 pm

Re: Scaler

Postby cacophony » Tue Nov 06, 2018 7:37 pm

Sorgelig wrote:You can immediately delete option 2. No one will slow down or speed up the systems in favor of a matched frame rate. And it can't always be achieved. So simply forget about this option as a universal one. MiSTer is not a single-core device.


Ok well then you're removing the option that everybody wants! I think at a minimum this should be an option for certain cores.

ijor
Hardware Guru
Posts: 3629
Joined: Sat May 29, 2004 7:52 pm
Contact:

Re: Scaler

Postby ijor » Tue Nov 06, 2018 9:06 pm

Sorgelig wrote:P.S.: even with pixel clock adjustment to tweak the output frame rate, it's impossible to make it match exactly, as the output pixel clock won't be a multiple of the input pixel clock. It means the output will slowly run ahead of or fall behind the input frame. With closely matched clocks and an additional frame in between, the tearing can be eliminated. You will just see one frame dropped or repeated about once per hour, which you won't notice.
It's possible to force waiting for input vsync before starting output sync, but it will introduce one more out-of-HDMI-standard quirk and reduce the number of compatible TVs/monitors even more.


The output pixel clock rate doesn't necessarily have to be a multiple of the input one. In my core I can match the frame period without one clock frequency being a multiple of the other. What I do is: input_period*input_total_pixels == output_period*output_total_pixels. Of course, it is easy when you control both clocks as I do; you can cascade PLLs to produce a suitable integer relation between both clocks. A generic solution is a completely different task.
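
To illustrate the relation above, here is a minimal sketch that solves it for the output pixel clock. Since period = 1/clock, the condition becomes input_total_pixels/input_clock == output_total_pixels/output_clock. The numbers in the example are hypothetical and not taken from any real core.

```python
from fractions import Fraction


def matched_output_clock(input_clock_hz, input_total_pixels, output_total_pixels):
    """Output pixel clock (exact rational, in Hz) that gives the same frame period."""
    return Fraction(input_clock_hz) * output_total_pixels / input_total_pixels


# Hypothetical input: 6 MHz pixel clock, 400x250 total pixels -> 60 Hz frame rate.
# Output: 1650x750 total pixels (1280x720 active).
clock = matched_output_clock(6_000_000, 400 * 250, 1650 * 750)
print(clock, float(clock) / 1e6, "MHz")
```

The result is an exact fraction, so it shows directly which multiplier/divider ratio a PLL chain would have to realize.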
Fx Cast: Atari St cycle accurate fpga core

kitrinx
Atariator
Posts: 28
Joined: Wed Sep 26, 2018 6:03 am

Re: Scaler

Postby kitrinx » Tue Nov 06, 2018 9:12 pm

One option that is very popular with the OSSC is scaling 240p video 5x to 1200p, then evenly cropping the extra lines down to 1080p. Most of what is lost is overscan area, so it works well in most console games, and you still fill the screen while getting vertical integer scaling. If cropping is possible here, such an option would be welcomed by many.
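
The crop arithmetic for this mode can be sketched as follows. This only illustrates the numbers involved; the function name is made up and it is not how the OSSC or the scaler implements it.

```python
def integer_scale_with_crop(src_lines, factor, out_lines):
    """Scale src_lines by an integer factor and crop evenly to out_lines.

    Returns (scaled_lines, cropped_top, cropped_bottom) in output lines.
    """
    scaled = src_lines * factor
    excess = scaled - out_lines
    if excess < 0:
        raise ValueError("scaled image is smaller than the output, nothing to crop")
    top = excess // 2
    bottom = excess - top
    return scaled, top, bottom


scaled, top, bottom = integer_scale_with_crop(240, 5, 1080)
# Cropped output lines and the equivalent number of source lines lost per edge.
print(scaled, top, bottom, top // 5, bottom // 5)
```

For 240 lines at 5x, 60 output lines (12 source lines) are cut from the top and from the bottom.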

Sorgelig
Fuji Shaped Bastard
Posts: 3172
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: Scaler

Postby Sorgelig » Tue Nov 06, 2018 9:13 pm

ijor wrote:input_period*input_total_pixels == output_period*output_total_pixels.

This is what the vsync_adjust option already does. As input_total_pixels and output_total_pixels are both hardcoded, the relation between input_clock and output_clock won't be a pretty number. So there will always be a small difference.
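
As a rough illustration of what a small residual difference means in practice: two free-running frame rates drift apart by one full frame every 1/|f_in - f_out| seconds. The frame rates below are hypothetical.

```python
def seconds_per_frame_slip(input_fps, output_fps):
    """Time until input and output drift apart by one whole frame."""
    diff = abs(input_fps - output_fps)
    if diff == 0:
        return float("inf")  # perfectly matched, never slips
    return 1.0 / diff


# A mismatch of 0.0003 Hz at 60 Hz (5 ppm) gives one dropped or repeated
# frame roughly every 55 minutes.
print(seconds_per_frame_slip(60.0, 60.0003) / 60.0, "minutes")
```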

ijor
Hardware Guru
Posts: 3629
Joined: Sat May 29, 2004 7:52 pm
Contact:

Re: Scaler

Postby ijor » Tue Nov 06, 2018 9:35 pm

Sorgelig wrote:
ijor wrote:input_period*input_total_pixels == output_period*output_total_pixels.

This is what the vsync_adjust option already does. As input_total_pixels and output_total_pixels are both hardcoded, the relation between input_clock and output_clock won't be a pretty number. So there will always be a small difference.


It doesn't have to be "very pretty". All you need is such a relation that you could resolve within the capabilities of the PLLs. Cyclone V PLLs are very powerful in terms of multiplication and division. And if you cascade two of them and use them in fractional mode, it might be doable.

Secondly, as I showed in the video mode test, the total number of output pixels is not really completely fixed. Most monitors can tolerate small variations of the format. Probably not all monitors, but not all monitors accept vsync_adjust either.

Again, I understand perfectly well that a generic solution is not the same as a specific one. Maybe the scaler could be prepared so that in some cases, with specific core support, it could work synclocked to the input without a full frame buffer delay.
Fx Cast: Atari St cycle accurate fpga core

Sorgelig
Fuji Shaped Bastard
Posts: 3172
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: Scaler

Postby Sorgelig » Tue Nov 06, 2018 9:36 pm

Well, I cannot make the edges perfect. Everything in the scaler module is floating: addresses, bursts... At some point I thought I had fixed the left column, but switching from 4:3 to 16:9 aspect ratio revealed wrapping again. So wrapping depends on the hmin/hmax values as well.
Hope it will be fixed by the author...

Sorgelig
Fuji Shaped Bastard
Posts: 3172
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: Scaler

Postby Sorgelig » Tue Nov 06, 2018 9:40 pm

ijor wrote:It doesn't have to be "very pretty". All you need is such a relation that you could resolve within the capabilities of the PLLs. Cyclone V PLLs are very powerful in terms of multiplication and division. And if you cascade two of them and use them in fractional mode, it might be doable.

Sometimes I don't understand why you state obvious things. MiSTer sources are available, so you already know it includes a fractional PLL calculation for HDMI and adjusts the output pixel clock to the tolerance level of the Cyclone V PLL.
Basically you are teaching me how to do what I already did almost half a year ago. 8O

ijor
Hardware Guru
Posts: 3629
Joined: Sat May 29, 2004 7:52 pm
Contact:

Re: Scaler

Postby ijor » Wed Nov 07, 2018 12:36 am

Sorgelig wrote:Sometimes I don't understand why you state obvious things. MiSTer sources are available, so you already know it includes a fractional PLL calculation for HDMI and adjusts the output pixel clock to the tolerance level of the Cyclone V PLL.
Basically you are teaching me how to do what I already did almost half a year ago. 8O


Come on, Sorgelig. Of course I'm not trying to teach you anything.

I only described the arguments, obvious or not, to make my point that your conclusion that "there will always be a small difference" is, IMHO, not accurate. And as I said, it is also possible to do an even finer adjustment by cascading PLLs, which you are not doing. I'm sure you have your reasons for not cascading PLLs, and I realize it is not always possible.

My point is that some cores might be able to synchronize video input and output frame rates. And for this reason it might be worth having the scaler ready to operate with the shortest possible lag.
Fx Cast: Atari St cycle accurate fpga core

Grabulosaure
Atari User
Posts: 37
Joined: Tue Sep 05, 2017 9:35 pm
Contact:

Re: Scaler

Postby Grabulosaure » Wed Nov 07, 2018 1:30 am

I have just posted a new version, with fewer pixels falling off the edges.
There were some issues with divider ratios, off-by-one errors, draining pipes.
The hmax/vmax properties should be set to the last pixel, not the image size.
For a 640x400 image :
HDISP=640 hmax=639
VDISP=400 vmax=399
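
The same rule as a tiny helper, in case it helps whoever wires up the parameters. The function name is made up; only the "last pixel, not image size" rule comes from the post.

```python
def last_pixel_bounds(hdisp, vdisp):
    """Return (hmax, vmax): index of the last visible pixel and line, not the size."""
    if hdisp < 1 or vdisp < 1:
        raise ValueError("image must be at least 1x1")
    return hdisp - 1, vdisp - 1


hmax, vmax = last_pixel_bounds(640, 400)
print(hmax, vmax)
```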

The Genesis/Megadrive doesn't work in HQ2x mode with 4:3 aspect ratio. What is needed is horizontal downsampling...This looks like a bug.


Sorgelig wrote:I see the current implementation uses a floating buffer size depending on the input frame size. I suggest simplifying it with a maximum size defined in a module parameter. Basically, 1920x1080 is the absolute maximum input frame, so it can even be hardcoded. It would also be better for every line to start at a fixed address regardless of the frame size; that will also simplify the scaler calculations and decrease its complexity.
0x800000 (8MB) would be the size of a single buffer, with 24MB for triple buffering. In the last 2 words of the buffer I suggest writing the width and height of the current frame in pixels.
Linux will grab these values from the fixed address and then save the snapshot with an easy layout.


Yes. Currently, a core that forgets to generate syncs could corrupt memory by allocating an impossibly large image. Having a fixed-size buffer is probably better. I would like to try packing pixels in memory (instead of 5 pixels per 128 bits) and supporting several colour depths. For metadata, maybe a full block (currently 256 bytes) at the beginning or end of the buffer. The simplest option would be steganography, using the LSB of the first pixels to encode image properties.
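
A minimal sketch of the steganography idea, to show how little it disturbs the picture. The bit layout is purely an assumption (16 bits of width followed by 16 bits of height, one bit per 8-bit colour channel, MSB first); nothing in the thread defines the actual encoding.

```python
def embed_size(pixels, width, height, bits=16):
    """Overwrite the LSB of the first colour channels with width and height.

    pixels is a list of (r, g, b) tuples; a modified copy is returned.
    """
    nbits = 2 * bits
    flat = [c for px in pixels for c in px]
    if len(flat) < nbits:
        raise ValueError("not enough pixels to hold the metadata")
    payload = (width << bits) | height
    for i in range(nbits):
        bit = (payload >> (nbits - 1 - i)) & 1
        flat[i] = (flat[i] & 0xFE) | bit
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def extract_size(pixels, bits=16):
    """Read width and height back from the LSBs written by embed_size."""
    flat = [c for px in pixels for c in px]
    payload = 0
    for i in range(2 * bits):
        payload = (payload << 1) | (flat[i] & 1)
    return payload >> bits, payload & ((1 << bits) - 1)


line = [(255, 128, 7)] * 16
tagged = embed_size(line, 640, 400)
print(extract_size(tagged))
```

With this layout 32 bits fit into the first 11 pixels, and every affected channel changes by at most one step.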

ijor
Hardware Guru
Posts: 3629
Joined: Sat May 29, 2004 7:52 pm
Contact:

Re: Scaler

Postby ijor » Wed Nov 07, 2018 3:21 am

Grabulosaure wrote:(I also need to look at frequency limits, maybe some parts will need deeper pipelining or replicated hardware. Anyone tried with high resolution outputs ?)


I ran a quick timing analysis and I get a max output pixel frequency of around 120MHz. Not bad at all, but not enough for full HD (148.5MHz).

Note that I can't say I fully understand the code. So there might be some extra false paths or multicycle constraints that could improve the computed max frequency.
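
For reference, a small sketch of where the 148.5MHz figure comes from and how far a 120MHz Fmax falls short of it. It assumes the standard 1080p60 timing of 2200x1125 total pixels; the helper names are made up.

```python
def pixel_clock_hz(h_total, v_total, fps):
    """Pixel clock needed for a video mode, from total (not active) pixel counts."""
    return h_total * v_total * fps


def slack_ns(required_hz, fmax_hz):
    """Required clock period minus the shortest period the design is reported to meet."""
    return 1e9 / required_hz - 1e9 / fmax_hz


required = pixel_clock_hz(2200, 1125, 60)
print(required / 1e6, "MHz required")
print(round(slack_ns(required, 120e6), 3), "ns slack at Fmax = 120 MHz")
```

A negative value means the reported worst-case path is longer than one clock period at that resolution.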
Fx Cast: Atari St cycle accurate fpga core

ghogan42
Atari nerd
Posts: 49
Joined: Wed Oct 17, 2018 7:27 pm

Re: Scaler

Postby ghogan42 » Wed Nov 07, 2018 4:03 am

ijor wrote:
Grabulosaure wrote:(I also need to look at frequency limits, maybe some parts will need deeper pipelining or replicated hardware. Anyone tried with high resolution outputs ?)


I ran a quick timing analysis and I get a max output pixel frequency of around 120MHz. Not bad at all, but not enough for full HD (148.5MHz).

Note that I can't say I fully understand the code. So there might be some extra false paths or multicycle constraints that could improve the computed max frequency.


I don't know much about hardware. But I've been running the Genesis core with this scaler for testing at 1920x1080 for, say, 10-20 minutes at a time while I was testing modifications to add different scaling kernels. Would other people with different de10 boards be seeing something different? Or am I misunderstanding something here? How do these computed max frequencies usually relate to real results?

ijor
Hardware Guru
Posts: 3629
Joined: Sat May 29, 2004 7:52 pm
Contact:

Re: Scaler

Postby ijor » Wed Nov 07, 2018 4:29 am

ghogan42 wrote:I don't know much about hardware. But I've been running the Genesis core with this scaler for testing at 1920x1080 for, say, 10-20 minutes at a time while I was testing modifications to add different scaling kernels. Would other people with different de10 boards be seeing something different? Or am I misunderstanding something here? How do these computed max frequencies usually relate to real results?


Well, timing analysis tells you the maximum frequency that is guaranteed to work flawlessly under the worst conditions. But you hardly ever reach those worst conditions. The worst condition is the combination of worst temperature (highest), worst voltage (lowest) and worst silicon for a given speed grade.

Then, as I said, there are transfers that can be allowed multiple cycles. The timing analysis can't know that. You have to constrain those multicycle paths explicitly. That requires being more familiar with the code than I am. Furthermore, in this case isolated errors from timing violations probably wouldn't be fatal. Maybe just a single wrong pixel in some frame that you might not even notice.

Lastly, "not guaranteed not to fail" doesn't necessarily mean "guaranteed to fail". In other words, not meeting timing doesn't guarantee that the code will fail; it just doesn't guarantee that it won't.
Fx Cast: Atari St cycle accurate fpga core

ghogan42
Atari nerd
Posts: 49
Joined: Wed Oct 17, 2018 7:27 pm

Re: Scaler

Postby ghogan42 » Wed Nov 07, 2018 5:49 am

ijor wrote:
ghogan42 wrote:I don't know much about hardware. But I've been running the Genesis core with this scaler for testing at 1920x1080 for, say, 10-20 minutes at a time while I was testing modifications to add different scaling kernels. Would other people with different de10 boards be seeing something different? Or am I misunderstanding something here? How do these computed max frequencies usually relate to real results?


Well, timing analysis tells you the maximum frequency that is guaranteed to work flawlessly under the worst conditions. But you hardly ever reach those worst conditions. The worst condition is the combination of worst temperature (highest), worst voltage (lowest) and worst silicon for a given speed grade.

Then, as I said, there are transfers that can be allowed multiple cycles. The timing analysis can't know that. You have to constrain those multicycle paths explicitly. That requires being more familiar with the code than I am. Furthermore, in this case isolated errors from timing violations probably wouldn't be fatal. Maybe just a single wrong pixel in some frame that you might not even notice.

Lastly, "not guaranteed not to fail" doesn't necessarily mean "guaranteed to fail". In other words, not meeting timing doesn't guarantee that the code will fail; it just doesn't guarantee that it won't.


Thanks for explaining!

Sorgelig
Fuji Shaped Bastard
Posts: 3172
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: Scaler

Postby Sorgelig » Wed Nov 07, 2018 9:09 am

Grabulosaure wrote:I have just posted a new version, with fewer pixels falling off the edges.

Much better now. Still some problems on the edges:
1) The left column has half width.
2) With the polyphase filter the picture shifts down-right. The bottom line disappears and the right column becomes half width. The left column is drawn normally.
Update on polyphase: the shift happens only when I load external coefficients representing an NN filter. If the taps are 0,128,0,0 then it shifts down-right. If the taps are 0,0,128,0 then it shifts up-left. So there is no way to represent NN through polyphase without shifting the image and cutting off two edges.


Although the upper and bottom lines look equal, they look like half width.

All of the above was tested at 1280x720 resolution. Other resolutions may show different effects.

Something goes completely wrong with scaling when the output resolution is 1920x1080 and the input resolution is 256x224 (240p Suite test pattern, or for example the game Clue). A large portion on the left shows a copy of the middle. Probably some overflow happens inside.
20181107_180831.jpg


Grabulosaure wrote:The hmax/vmax properties should be set to the last pixel, not the image size.

Yes, I've figured it out, so sys_top supplies the correct size.

Grabulosaure wrote:Simplest option would be steganography, using the LSB of the first pixels to encode image properties.

Steganography is fine.

Sorgelig
Fuji Shaped Bastard
Posts: 3172
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: Scaler

Postby Sorgelig » Wed Nov 07, 2018 9:39 am

Grabulosaure wrote:Having a fixed size buffer is probably better

I vote for this for safety.
Grabulosaure wrote:I would like to try pack pixels in memory (instead of 5pixels per 128 bits) and support several colour depths.

I want to ask you not to do this, to keep it possible to implement the snapshot feature. The place where the scaler is connected has only one pixel format, 24-bit RGB, regardless of what the original system generates. With packing you will save 8 bits per 128 bits, with a total saving of around 15%, which is not crucial for DDR3 usage.
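
A quick check of the packing arithmetic, assuming 24-bit pixels stored 5 per 128-bit word as described in the thread. By this calculation the 8 unused bits amount to 6.25% of each word, so the saving from tightly packing 24-bit pixels may be smaller than the rough estimate above; either way it is small compared to the DDR3 budget.

```python
def unused_fraction(pixels_per_word, bits_per_pixel, word_bits):
    """Fraction of each memory word left unused by the current pixel layout."""
    used = pixels_per_word * bits_per_pixel
    if used > word_bits:
        raise ValueError("pixels do not fit into the word")
    return (word_bits - used) / word_bits


print(unused_fraction(5, 24, 128) * 100, "% of each word is padding")
```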

Sorgelig
Fuji Shaped Bastard
Posts: 3172
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: Scaler

Postby Sorgelig » Wed Nov 07, 2018 10:02 am

ijor wrote:your conclusion that "there will always be a small difference" is, IMHO, not accurate.

Dividing 2 by 3 gives an endless 0.666(6). While that one is still possible to do, there are other ratios that can't be divided without a remainder, no matter whether PLLs are cascaded or not. I'm talking in general, not about particular cases where it is possible.
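
To make the point about "pretty" and "ugly" ratios concrete, here is a sketch that finds the closest multiplier/divider pair for a target clock ratio and reports the leftover error. The divider limit of 512 is a made-up placeholder, not the real Cyclone V PLL limit, and real PLL constraints (VCO range, fractional mode) are ignored.

```python
from fractions import Fraction


def approximate_ratio(target, max_divider):
    """Closest fraction to target with denominator <= max_divider, plus error in ppm."""
    target = Fraction(target)
    approx = target.limit_denominator(max_divider)
    error_ppm = float((approx - target) / target) * 1e6
    return approx, error_ppm


# 2/3 is representable exactly despite its endless decimal expansion...
print(approximate_ratio(Fraction(2, 3), 512))
# ...while an awkward ratio leaves a small residual error.
print(approximate_ratio(Fraction(1237500, 100003), 512))
```

Whether a residual of a few ppm matters depends on how the output is synchronized, which is the subject of the rest of this discussion.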

When you do a single core, you can tune the system to some specific requirements. You won't need any framework, as everything will be built around a single core with its specific behavior.

MiSTer is a multicore universal system. It provides a common API for all cores to make them behave the same and share common parameters (which you deliberately ignore in your ST core). If every core were written without the common API, it would be a disaster for development and a complete mess for users, as every core would require different settings, sometimes mutually exclusive.

Strictly speaking, without the framework a scaler is not really required, as every core could output directly to HDMI at some fixed resolution (different cores, different resolutions), as is done in some cores where VGA output is fused into the core. It takes me some time to delete the foreign VGA mess to make a core clean and provide the original resolution, and only then is it ready to integrate into MiSTer.

ijor wrote:And for this reason it might be worth to have the scaler ready to operate with the shortest possible lag.

I didn't contradict it. Where did I say I'm against it? I just wrote that in the general case it doesn't depend on the scaler. The scaler only needs one "wait" signal to hold off vsync start (and make some TVs go crazy). It's a couple of minutes to add. The whole frame locking task and matching the frequencies are up to the core itself. It's not the scaler's business.

Sorgelig
Fuji Shaped Bastard
Posts: 3172
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: Scaler

Postby Sorgelig » Wed Nov 07, 2018 10:50 am

Grabulosaure wrote:There were some issues with divider ratios, off-by-one errors, draining pipes.

There is another way to do scaled steps: instead of calculating a fractional step, you can use integers.
For example, the input size is 320 pixels and the output is 1920 pixels:
with every output pixel step, you add 320 to an accumulator, and when it becomes more than 1920, you switch to the next input pixel and subtract 1920 from the accumulator. And so on until the end. The advantage of this method is that you are guaranteed to have 0 and the max input value on the edges regardless of how complex the relation between input and output sizes is.
Since the fraction is used for the filters, this will be a problem, as the fractions are relative to the output size, not decimal. It can be worked around either in the filters, by including a recalculation according to the current size, or the coefficients can be recalculated during VBlank on each frame; there should be enough time to recalculate the coefficients without rushing.
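
The accumulator method described above can be sketched in software like this. It is only a model of the idea, not the scaler's HDL. The ">=" comparison and the conversion of the accumulator into one of 16 filter phases (matching the 16 rows of the coefficient files) are assumptions.

```python
PHASES = 16  # assumption: 16 filter phases, as in the coefficient files


def scale_steps(in_size, out_size):
    """Map each output pixel to (input pixel index, filter phase) using integers only."""
    acc = 0
    src = 0
    steps = []
    for _ in range(out_size):
        # acc / out_size is the fractional position between src and src + 1.
        steps.append((src, acc * PHASES // out_size))
        acc += in_size
        while acc >= out_size:  # "while" also covers downscaling
            acc -= out_size
            src += 1
    return steps


steps = scale_steps(320, 1920)
print(steps[0], steps[3], steps[6], steps[-1])
```

The first output pixel always lands on input pixel 0 and, when upscaling, the last one on the last input pixel, whatever the ratio between the two sizes.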

ijor
Hardware Guru
Posts: 3629
Joined: Sat May 29, 2004 7:52 pm
Contact:

Re: Scaler

Postby ijor » Wed Nov 07, 2018 11:43 am

Sorgelig wrote:
ijor wrote:And for this reason it might be worth to have the scaler ready to operate with the shortest possible lag.

I didn't contradict it. Where did I say I'm against it? I just wrote that in the general case it doesn't depend on the scaler. The scaler only needs one "wait" signal to hold off vsync start (and make some TVs go crazy). It's a couple of minutes to add. The whole frame locking task and matching the frequencies are up to the core itself. It's not the scaler's business.


Good, although it probably needs slightly more work than just that. Most importantly, it should use a single frame buffer, not even a double buffer...

I was going to say that it might be possible to avoid a frame buffer and the DDR usage altogether and just buffer a couple of scan lines internally. But then we'd lose the new screen capture feature, and that is certainly a great feature to implement.
Fx Cast: Atari St cycle accurate fpga core

Sorgelig
Fuji Shaped Bastard
Posts: 3172
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: Scaler

Postby Sorgelig » Wed Nov 07, 2018 12:32 pm

ijor wrote:I was going to say that it might be possible to avoid a frame buffer and the DDR usage altogether and just buffer a couple of scan lines internally. But then we'd lose the new screen capture feature, and that is certainly a great feature to implement.

These are not mutually exclusive features.
I repeat, nothing needs to be changed in the scaler; everything is there already. Lag depends on the distance between writing to the buffer and reading from it. If the output reading speed equals the input writing speed, then reading can be several lines behind the writing. The full frame buffer can still be there and it won't affect the latency in any way.

ijor
Hardware Guru
Posts: 3629
Joined: Sat May 29, 2004 7:52 pm
Contact:

Re: Scaler

Postby ijor » Wed Nov 07, 2018 12:53 pm

Sorgelig wrote:I repeat, nothing needs to be changed in the scaler; everything is there already. Lag depends on the distance between writing to the buffer and reading from it.


With triple frame buffering, you start reading from a buffer that was already completely written. You never read and write to the same buffer to avoid tearing. That's the whole point of triple buffering. So you will always be, at least, one full frame of lag behind, in the best case. Right?

Then, the scaler code should be modified to optionally use a single buffer. Or, if you prefer, to allow reading and writing from the same buffer.
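
To put numbers on the two cases being discussed, a small sketch comparing a full frame of lag with reading a few lines behind the writer in a shared buffer. The 60Hz rate, 262 total lines and 4-line distance are hypothetical example values.

```python
def frame_lag_ms(fps):
    """Lag when reading only starts after a whole frame has been written."""
    return 1000.0 / fps


def line_lag_ms(lines_behind, fps, total_lines):
    """Lag when the reader follows the writer by a few input scan lines."""
    return 1000.0 * lines_behind / (fps * total_lines)


print(round(frame_lag_ms(60), 2), "ms for one full frame")
print(round(line_lag_ms(4, 60, 262), 3), "ms when 4 lines behind")
```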

Sorgelig wrote:The full frame buffer can still be there and it won't affect the latency in any way.


Of course. I meant that you might save the DDR bandwidth if you don't need it. But again, it's probably not worth losing the screen capture feature.
Fx Cast: Atari St cycle accurate fpga core

Sorgelig
Fuji Shaped Bastard
Posts: 3172
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: Scaler

Postby Sorgelig » Wed Nov 07, 2018 1:05 pm

ijor wrote:With triple frame buffering, you start reading from a buffer that was already completely written. You never read and write to the same buffer to avoid tearing. That's the whole point of triple buffering. So you will always be, at least, one full frame of lag behind, in the best case. Right?

The scaler has an option to work with a single or triple buffer: mode[3] = 0 selects the single buffer.
Grabulosaure could maybe add a wait input signal, so that the video output is held at the beginning (or end, not sure which is better) of VSync while this signal is active.
Then the rest of the sync between input and output would be done in the emulation system.

And even this wait signal for the scaler is not required, as the video module of the emulation system may watch the output sync and hold itself until the right time. If the reading and writing speeds are the same, then this needs to be done only once at core start.

ijor
Hardware Guru
Posts: 3629
Joined: Sat May 29, 2004 7:52 pm
Contact:

Re: Scaler

Postby ijor » Wed Nov 07, 2018 4:59 pm

Sorgelig wrote:And even this wait signal for the scaler is not required, as the video module of the emulation system may watch the output sync and hold itself until the right time. If the reading and writing speeds are the same, then this needs to be done only once at core start.


Yes, it is possible. But it seems to me a rather contorted way to do it. It is especially complicated for those cores where video sync generation is an integral part of the main chipset. You would need to freeze the whole system at the top level. And the system usually already has to wait for and/or sync to several other subsystems. Certainly doable if there is no other choice.

But perhaps more importantly, I think the scaler has more "knowledge" and is in a better position to decide the ideal lag distance. Does the scaler buffer a whole scan line? With which timing? Does the vertical scaler average multiple scan lines? If so, how many scan lines back does it need to read? With that kind of "knowledge", the scaler is in a much better position to implement the shortest possible, yet safe, lag.
Fx Cast: Atari St cycle accurate fpga core

Sorgelig
Fuji Shaped Bastard
Posts: 3172
Joined: Mon Dec 14, 2015 10:51 am
Location: Russia/Taiwan

Re: Scaler

Postby Sorgelig » Wed Nov 07, 2018 7:36 pm

Ha! After more thinking I've found how to do NN on this scaler through coefficients without shifting!

Code:

# range -128..128
# sum of line must not exceed the range!

# Near Neighbor

# horizontal coefficients
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0

# vertical coefficients
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0


Basically it centers at 1.5 instead of 1 or 2.
So for this scaler the center is not on tap 1 as usual, but between taps 1 and 2. I don't know if it will break any traditional filter coefficients.
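
The table above can also be generated rather than typed by hand. A minimal sketch, assuming 16 phases of 4 taps with 128 as unity gain as in the posted file; the function names are made up.

```python
PHASES = 16   # rows per table, as in the file above
TAPS = 4      # coefficients per row
UNITY = 128   # full weight, per the "range -128..128" comment


def nearest_neighbor_rows():
    """First half of the phases picks tap 1, second half picks tap 2."""
    rows = []
    for phase in range(PHASES):
        row = [0] * TAPS
        row[1 if phase < PHASES // 2 else 2] = UNITY
        rows.append(row)
    return rows


def format_rows(rows):
    """Render rows in the same layout as the coefficient file."""
    return "\n".join("   " + ", ".join(str(v) for v in row) for row in rows)


rows = nearest_neighbor_rows()
print("# horizontal coefficients")
print(format_rows(rows))
print()
print("# vertical coefficients")
print(format_rows(rows))
```

Switching taps halfway through the phases is what puts the transition at 1.5, between taps 1 and 2.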

ghogan42
Atari nerd
Posts: 49
Joined: Wed Oct 17, 2018 7:27 pm

Re: Scaler

Postby ghogan42 » Wed Nov 07, 2018 8:05 pm

Sorgelig wrote:Ha! After more thinking I've found how to do NN on this scaler through coefficients without shifting!

Code:

# range -128..128
# sum of line must not exceed the range!

# Near Neighbor

# horizontal coefficients
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0

# vertical coefficients
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 128, 0, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0
   0, 0, 128, 0


Basically it centers at 1.5 instead of 1 or 2.
So for this scaler the center is not on tap 1 as usual, but between taps 1 and 2. I don't know if it will break any traditional filter coefficients.


Yes. According to the VIP docs, phase 0 (the first row) is supposed to be at the center of tap 1. So in the coeffNN.txt that you originally posted on Git (2nd column all 128), you were saying that from the center of tap 1 to the center of tap 2 you only output tap 1. So this was a small mistake in your original coefficients.

Notice that in the sets of working coefficients that I posted in the other thread with the coefficient calculator, I used coefficients like the ones you have now. And filters that are "close to" nearest neighbor have a similar pattern:

Code:

# range -128..128
# sum of line must not exceed the range!

# Even Sharper Bilinear on x-axis and y-axis
# May cause shimmering in motion

# horizontal coefficients
   0, 128,   0,   0
   0, 128,   0,   0
   0, 128,   0,   0
   0, 128,   0,   0
   0, 126,   2,   0
   0, 122,   6,   0
   0, 113,  15,   0
   0,  95,  33,   0
   0,  64,  64,   0
   0,  33,  95,   0
   0,  15, 113,   0
   0,   6, 122,   0
   0,   2, 126,   0
   0,   0, 128,   0
   0,   0, 128,   0
   0,   0, 128,   0

# vertical coefficients
   0, 128,   0,   0
   0, 128,   0,   0
   0, 128,   0,   0
   0, 128,   0,   0
   0, 126,   2,   0
   0, 122,   6,   0
   0, 113,  15,   0
   0,  95,  33,   0
   0,  64,  64,   0
   0,  33,  95,   0
   0,  15, 113,   0
   0,   6, 122,   0
   0,   2, 126,   0
   0,   0, 128,   0
   0,   0, 128,   0
   0,   0, 128,   0


So nothing should break. You are doing it correctly now as far as I can tell. Before, you would have had a shift, which is fixed now.
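
For anyone editing these files by hand, a small checker along these lines may be useful. It only enforces the two rules stated in the file comments (values in -128..128, and the sum of each line within the same range) plus 4 taps per row; the parsing rules and function names are assumptions, not the way MiSTer itself reads the file.

```python
def parse_coefficients(text):
    """Parse a coefficient file into a list of integer rows, skipping comments."""
    rows = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        rows.append([int(value) for value in line.split(",")])
    return rows


def check_rows(rows, taps=4, limit=128):
    """Return a list of human-readable problems; an empty list means the rows pass."""
    problems = []
    for index, row in enumerate(rows):
        if len(row) != taps:
            problems.append(f"row {index}: expected {taps} values, got {len(row)}")
        elif any(abs(value) > limit for value in row):
            problems.append(f"row {index}: value outside -{limit}..{limit}")
        elif abs(sum(row)) > limit:
            problems.append(f"row {index}: sum {sum(row)} exceeds the range")
    return problems


sample = """
# horizontal coefficients
   0, 126,   2,   0
   0,  64,  64,   0
   0, 128,  64,   0
"""
print(check_rows(parse_coefficients(sample)))
```

The last sample row is deliberately wrong (its sum is 192), so the checker reports one problem for it.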

