2019-04-13 15:16:45 +02:00
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<p class="heading">Actually Explaining RemoveGrain</p>
<div class="content">
<div class="subhead">Table of contents</div>
<ul>
<li><a href="#c_intro">Introduction</a></li>
<li><a href="#c_glossary">Glossary</a></li>
<li><a href="#c_m59">Modes 5-9</a></li>
<li><a href="#c_m13">Modes 13-16</a></li>
<li><a href="#c_m17">Mode 17</a></li>
<li><a href="#c_m21">Mode 21 and 22</a></li>
<li><a href="#c_m23">Mode 23 and 24</a></li>
</ul>
<br><br>
<style scoped>
/* This lunacy is sponsored by stux@dot-heaven.moe */
convolution {
display: flex;
flex-direction: column;
font-size: 300pt;
width: 1em;
height: 1em;
}
convolution > * > * {
flex-basis: 33%;
flex-grow: 1;
flex-shrink: 1;
border: 1px #7f7f7f solid;
font-size: 50pt;
display: flex;
align-content: space-around;
}
convolution > * > * > span {
margin: auto;
}
convolution > * {
display: flex;
flex-direction: row;
height: 100%;
}
convolution > * > black {
background-color: #000000;
}
convolution > * > white {
background-color: #ffffff;
}
convolution > * > transparent {
background-color: transparent;
}
convolution > * > lightgrey {
background-color: #cccccc;
}
convolution > * > darkgrey {
background-color: #333333;
}
convolution > * > *[data-accent="1"] {
border-color: #e17800;
}
convolution > * > *[data-accent="2"] {
border-color: #6c3e00;
}
</style>
<div class="subhead" id="c_intro">Introduction</div>
<p>For over a decade, RemoveGrain has been one of the most used filters for all kinds of video processing. It is
used in SMDegrain, xaa, FineDehalo, HQDeringMod, YAHR, QTGMC, Srestore, AnimeIVTC, Stab, SPresso, Temporal
Degrain, MC Spuds, LSFmod, and many, <span class="bold">many</span> more. The extent of this enumeration may
seem ridiculous or even superfluous, but I am trying to make a point here. RemoveGrain (or more recently
RGTools) is everywhere.</p>
<p>
But despite its apparent omnipresence, many encoders – especially those who do not have years of encoding
experience – don't actually know what most of its modes do. You could blindly assume that they're all used
to, well, remove grain, but you would be wrong.</p>
<p>
After thinking about this, I realized that I, too, did not fully understand RemoveGrain. There is barely a
script that doesn't use it, and yet here we are, a generation of new encoders, using the legacy of our
ancestors, not understanding the code that our CPUs have executed literally billions of times.<br>But who
could
blame us? RemoveGrain, like many Avisynth plugins, has ‘suffered’ optimization to the point of complete
obfuscation.<br>You can try to read the <a title="You'd have to be an utter lunatic to click this."
href="https://gist.github.com/chikuzen/3c4d0bef250c6aab212f">code</a>
if you want to, but trust me, you don't.</p>
<p title="I swear to god I wasn't drunk when I wrote this">
Fortunately, in October of 2013, a brave adventurer by the name of tp7 set upon a quest which was thitherto
believed impossible. They reverse engineered the <a
href="http://www.vapoursynth.com/2012/10/open-binary-introducing-a-practical-alternative-to-open-source/">open
binary</a> that RemoveGrain had become and created <a href="https://github.com/tp7/RgTools">RGTools</a>, a
much
more readable rewrite that would henceforth be used in RemoveGrain's stead.
</p>
<p>
Despite this, there is still no complete (and understandable) documentation of the different modes. Neither
the
<a href="http://avisynth.nl/index.php/RgTools/RemoveGrain">Avisynth wiki</a> nor <a
href="https://github.com/tp7/RgTools/wiki/RemoveGrain">tp7's own documentation</a> nor any of <a
href="http://web.archive.org/web/20130615165406/http://doom10.org/index.php?topic=2185.0">the</a> <a
href="http://www.aquilinestudios.org/avsfilters/spatial.html#removegrain">other</a> <a
href="http://videoprocessing.fr.yuku.com/topic/9/RemoveGrain-10-prerelease">guides</a> manage to
accurately
describe all modes. In this article, I will try to explain all modes which I consider to be insufficiently
documented. Some self-explanatory modes will be omitted.<br>
If you feel comfortable reading C++ code, I would recommend reading the code yourself.
It's not very long and quite easy to understand: <a
href="https://github.com/tp7/RgTools/blob/master/RgTools/rg_functions_c.h">tp7's rewrite on Github</a>
</p>
<div class="subhead" id="c_glossary">Glossary</div>
<dl>
<dt>Clipping</dt>
<dd>
A clipping operation takes three arguments: a value, an upper limit, and a lower limit. If the
value is below the lower limit, it is set to that limit; if it exceeds the upper limit, it is set
to that limit; and if it is between the two, it remains unchanged.
</dd>
<dt>Convolution</dt>
<dd>
The weighted average of a pixel and all pixels in its neighbourhood. RemoveGrain exclusively uses 3x3
convolutions, meaning that for each pixel, only the pixel itself and its 8 surrounding pixels are used to
calculate the convolution.<br>
Modes 11, 12, 19, and 20 are just convolutions with different matrices.
</dd>
</dl>
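<p>Since clipping shows up in almost every mode below, here is a minimal sketch of it in Python. The function
name and the argument order are my own choice for illustration and are not taken from RgTools.</p>

```python
def clip(value, lower, upper):
    """Clamp value into the closed range [lower, upper]."""
    return max(lower, min(value, upper))


# Three cases from the glossary entry: above, below, and inside the range.
print(clip(300, 0, 255))  # too high, set to the upper limit
print(clip(-5, 0, 255))   # too low, set to the lower limit
print(clip(128, 0, 255))  # already inside the range, unchanged
```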
To illustrate some of these modes, images of 3x3 pixel neighbourhoods will be used. The borders between the
pixels were added to improve visibility and have no deeper meaning.<br><br>
<convolution class="accent1">
<row>
<white></white>
<white></white>
<white></white>
</row>
<row>
<white></white>
<black></black>
<white></white>
</row>
<row>
<white></white>
<white></white>
<white></white>
</row>
</convolution>
<div class="subhead" id="c_m59">Modes 5-9</div>
<h3>Mode 5</h3>
The documentation describes this mode as follows: “Line-sensitive clipping giving the minimal change.”<br>
This is easier to explain with an example:<br><br>
<table style="width: 80%; margin: 0 auto">
<tr>
<td>
<convolution class="accent1">
<row>
<white></white>
<white></white>
<black></black>
</row>
<row>
<white></white>
<darkgrey></darkgrey>
<white></white>
</row>
<row>
<black></black>
<white></white>
<white></white>
</row>
</convolution>
</td>
<td>
<convolution class="accent1">
<row>
<white></white>
<white></white>
<black></black>
</row>
<row>
<white></white>
<black></black>
<white></white>
</row>
<row>
<black></black>
<white></white>
<white></white>
</row>
</convolution>
</td>
</tr>
</table>
<p style="text-align: center">Left: unprocessed clip. Right: clip after RemoveGrain mode 5</p>
<p> Mode 5 tries to find a line within the 3x3 neighbourhood of each pixel. For each of the four pairs of
opposing pixels, it clips the center pixel to the range spanned by that pair. After computing all four
results, the filter picks the pair whose clipping changes the center pixel the least and applies that
result. In our example, this means clipping the center pixel to the values of the top-right and bottom-left
pixels (both black), since clipping it to any of the other pairs would make it white, which is a much larger
change.</p>
To visualize the aforementioned pairs, they are
labeled with the same number in this image.<br><br>
<convolution class="accent1">
<row>
<white><span>1</span></white>
<white><span>2</span></white>
<white><span>3</span></white>
</row>
<row>
<white><span>4</span></white>
<black></black>
<white><span>4</span></white>
</row>
<row>
<white><span>3</span></white>
<white><span>2</span></white>
<white><span>1</span></white>
</row>
</convolution>
<br> Because only these four directions are checked, a bent line like the following one is not detected, and the center pixel remains unchanged:<br><br>
<convolution class="accent1">
<row>
<white><span>1</span></white>
<black><span>2</span></black>
<white><span>3</span></white>
</row>
<row>
<white><span>4</span></white>
<darkgrey></darkgrey>
<white><span>4</span></white>
</row>
<row>
<black><span>3</span></black>
<white><span>2</span></white>
<white><span>1</span></white>
</row>
</convolution>
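<p>The mode 5 logic described above can be sketched in a few lines of Python. This is a simplified model that
works on a single 3x3 neighbourhood given as a list of rows, not the actual RgTools code; the names
<span style="font-family: monospace">PAIRS</span> and <span style="font-family: monospace">mode5</span> are
made up for this example, and the tie handling simply follows the rule stated at the end of this section.</p>

```python
def clip(value, lower, upper):
    """Clamp value into the closed range [lower, upper]."""
    return max(lower, min(value, upper))


# (row, column) positions of the four opposing pairs, numbered 1-4 as in the figure.
PAIRS = [((0, 0), (2, 2)), ((0, 1), (2, 1)), ((0, 2), (2, 0)), ((1, 0), (1, 2))]


def mode5(n):
    """Return the new center value for a 3x3 neighbourhood given as a list of rows."""
    center = n[1][1]
    candidates = []
    for (r1, c1), (r2, c2) in PAIRS:
        p1, p2 = n[r1][c1], n[r2][c2]
        clipped = clip(center, min(p1, p2), max(p1, p2))
        candidates.append((abs(center - clipped), clipped))
    # min() keeps the first minimum it sees, so walking the list backwards makes
    # the higher-numbered pair win ties (assumed from the article's tie rule).
    change, result = min(reversed(candidates), key=lambda cand: cand[0])
    return result


# The two figures from this section, with 0 = black, 51 = dark grey, 255 = white.
print(mode5([[255, 255, 0], [255, 51, 255], [0, 255, 255]]))  # diagonal line
print(mode5([[255, 0, 255], [255, 51, 255], [0, 255, 255]]))  # bent line
```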
<h3>Mode 6</h3>
<p>
This mode is similar to mode 5 in that it clips the center pixel's value to one of the four pairs of opposing
pixels in its neighbourhood. The difference lies in how that pair is selected. Unlike mode 5, mode 6
considers the range of the clipping operation (i.e. the difference between the two pixels) in addition to
the change applied to the center pixel. The exact math looks like this, where p1 and p2 are the two opposing
pixels, c_orig is the original center pixel, and c_processed is the center pixel after applying the
clipping.</p>
<div class="code">
diff = abs(c_orig - c_processed) * 2 + abs(p1 - p2)
</div>
<p>
This means that a clipping pair is favored if it changes the center pixel only slightly and if its two
pixels differ only slightly from each other. The change applied to the center pixel is prioritized (ratio
2:1) in this mode. The pair with the lowest diff is used.</p>
<h3>Mode 7</h3>
Mode 7 is very similar to mode 6. The only difference lies in the weighting of the values.
<div class="code">
diff = abs(c_orig - c_processed) + abs(p1 - p2)
</div>
Unlike before, the difference between the original and the processed center pixel is not multiplied by two. The
rest of the code is identical.
<h3>Mode 8</h3>
Again, not much of a difference here. This is essentially the opposite of mode 6.
<div class="code">
diff = abs(c_orig - c_processed) + abs(p1 - p2) * 2
</div>
The difference between p1 and p2 is prioritized over the change applied to the center pixel; again with a 2:1
ratio.
<h3>Mode 9</h3>
In this mode, only the difference between p1 and p2 is considered. The center pixel is not part of the equation.<br>
<p> Everything else remains unchanged. This can be useful to fix interrupted lines, as long as the length of the
gap never exceeds one pixel.</p>
<table style="width: 80%; margin: 0 auto">
<tr>
<td>
<convolution class="accent1">
<row>
<darkgrey><span>1</span></darkgrey>
<darkgrey><span>2</span></darkgrey>
<black><span>3</span></black>
</row>
<row>
<darkgrey><span>4</span></darkgrey>
<lightgrey></lightgrey>
<white><span>4</span></white>
</row>
<row>
<black><span>3</span></black>
<white><span>2</span></white>
<white><span>1</span></white>
</row>
</convolution>
</td>
<td>
<convolution class="accent1">
<row>
<darkgrey></darkgrey>
<darkgrey></darkgrey>
<black></black>
</row>
<row>
<darkgrey></darkgrey>
<black></black>
<white></white>
</row>
<row>
<black></black>
<white></white>
<white></white>
</row>
</convolution>
</td>
</tr>
</table>
<br>
The center pixel is clipped to pair 3, which has the lowest range (zero, since both pixels are black).<br>
If the calculated difference is the same for two pairs, the pair with the higher number (the numbers in the
image, not the pixel values) is prioritized. This applies to modes 5-9.
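<p>Since modes 6 to 9 only differ in how the two terms of diff are weighted, they can be modelled with one
function and a small table of weights. Again, this is only a sketch of the selection logic on a single 3x3
neighbourhood; the names <span style="font-family: monospace">WEIGHTS</span> and
<span style="font-family: monospace">line_clip</span> are hypothetical, and details of the real filter such as
integer saturation are ignored.</p>

```python
def clip(value, lower, upper):
    """Clamp value into the closed range [lower, upper]."""
    return max(lower, min(value, upper))


# (row, column) positions of the four opposing pairs, numbered 1-4 as in the figures.
PAIRS = [((0, 0), (2, 2)), ((0, 1), (2, 1)), ((0, 2), (2, 0)), ((1, 0), (1, 2))]

# mode: (weight of the change to the center pixel, weight of the pair's range)
WEIGHTS = {6: (2, 1), 7: (1, 1), 8: (1, 2), 9: (0, 1)}


def line_clip(n, mode):
    """Return the new center value of the 3x3 neighbourhood n for modes 6-9."""
    center = n[1][1]
    w_change, w_range = WEIGHTS[mode]
    candidates = []
    for (r1, c1), (r2, c2) in PAIRS:
        p1, p2 = n[r1][c1], n[r2][c2]
        clipped = clip(center, min(p1, p2), max(p1, p2))
        diff = abs(center - clipped) * w_change + abs(p1 - p2) * w_range
        candidates.append((diff, clipped))
    # Walk backwards so that the higher-numbered pair wins ties, as described above.
    diff, result = min(reversed(candidates), key=lambda cand: cand[0])
    return result


# The mode 9 figure: 0 = black, 51 = dark grey, 204 = light grey, 255 = white.
example = [[51, 51, 0], [51, 204, 255], [0, 255, 255]]
for mode in (6, 8, 9):
    print(mode, line_clip(example, mode))
```

<p>With these values, mode 9 closes the gap in the diagonal line, while mode 6 leaves the center pixel alone
because it weights the change to the center pixel more heavily.</p>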
<div class="subhead">Mode 11 and 12</div>
Mode 11 and 12 are essentially this:
<div class="code">
std.Convolution(matrix=[1, 2, 1, 2, 4, 2, 1, 2, 1])
</div>
There is no difference between mode 11 and mode 12. This becomes evident by looking at the code which was
literally copy-pasted from one function to the other. It should come as no surprise that my tests confirmed
this:
<div class="code vapoursynth">
>>> d = core.std.Expr([clip.rgvs.RemoveGrain(mode=11),clip.rgvs.RemoveGrain(mode=12)], 'x y - abs')<br>
>>> d = d.std.PlaneStats()<br>
>>> d.get_frame(0).props.PlaneStatsAverage<br>
<br>
0.0<br>
</div>
There is, however, a slight difference between using these modes and std.Convolution() with the corresponding
matrix:
<div class="code vapoursynth">
>>> d = core.std.Expr([clip.rgvs.RemoveGrain(mode=12),clip.std.Convolution(matrix=[1, 2, 1, 2, 4, 2, 1, 2,
1])], 'x y - abs')<br>
>>> d = d.std.PlaneStats()<br>
>>> d.get_frame(0).props.PlaneStatsAverage<br>
0.05683494908489186<br>
</div>
<p>
This is explained by the different handling/interpolation of edge pixels, as can be seen in this comparison.<br>
<em><b>Edit</b>: This may also be caused by a bug in the PlaneStats code which was fixed in R36. Since 0.05 is way
too high for such a small difference, the PlaneStats bug is likely the reason.</em><br>
Note that the images were resized. The black dots were 1px in the original image.</p>
<div style="margin: -1em 0; font-size: 70%; text-align: right; color: rgba(130,130,130,0.7)">All previews
generated with <a class="ninjalink" href="https://yuuno.encode.moe/">yuuno</a></div>
<table style="width: 100%">
<tr>
<td style="width: 33%; text-align: center"><img class="img_expandable" src="/media/articles/res_rg/conv_src.png">
</td>
<td style="width: 33%; text-align: center"><img class="img_expandable" src="/media/articles/res_rg/conv_std.png">
</td>
<td style="width: 33%; text-align: center"><img class="img_expandable" src="/media/articles/res_rg/conv_rgvs.png">
</td>
</tr>
<tr>
<td style="width: 33%; text-align: center">The source image</td>
<td style="width: 33%; text-align: center">clip.std.Convolution(matrix=[1, 2, 1, 2, 4, 2, 1, 2, 1])</td>
<td style="width: 33%; text-align: center">clip.rgvs.RemoveGrain(mode=12)</td>
</tr>
</table>
VapourSynth's internal convolution filter handles the borders of an image by mirroring the pixels close to
the edge, so that every pixel has a full 3x3 neighbourhood. RemoveGrain simply leaves the border pixels
unprocessed.
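<p>To make the two kinds of border handling more tangible, here is a small pure-Python model of the mode 11/12
matrix with both behaviours. It is only a sketch: the exact mirroring scheme and the rounding are assumptions
on my part, and neither function name exists in VapourSynth or RgTools.</p>

```python
KERNEL = [1, 2, 1, 2, 4, 2, 1, 2, 1]  # the mode 11/12 matrix, sums to 16


def reflect(i, size):
    """Mirror an out-of-range index back into the image (edge pixel not repeated)."""
    return min(abs(i), 2 * size - 2 - abs(i))


def blur(img, mirror_edges):
    """Blur a 2D list of pixel rows; the image needs at least 2x2 pixels."""
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(height):
        for x in range(width):
            on_border = y in (0, height - 1) or x in (0, width - 1)
            if on_border and not mirror_edges:
                continue  # RemoveGrain-style: border pixels stay untouched
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    weight = KERNEL[(dy + 1) * 3 + (dx + 1)]
                    total += weight * img[reflect(y + dy, height)][reflect(x + dx, width)]
            out[y][x] = (total + 8) // 16  # round to nearest; the rounding is an assumption
    return out


# A single black dot on a white background, as in the comparison images.
dot = [[255, 255, 255], [255, 0, 255], [255, 255, 255]]
print(blur(dot, mirror_edges=False))  # only the center pixel changes
print(blur(dot, mirror_edges=True))   # the border pixels are blurred as well
```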
<div class="subhead" id="c_m13">Modes 13-16</div>
These modes are very fast and very inaccurate field interpolators. They're like EEDI made in China, and there
should never be a reason to use any of them (and since EEDI2 was released in 2005, there <span
style="font-style: italic">was</span> never any reason to
use them, either).
<h3>Mode 13</h3>
<convolution class="accent1">
<row>
<black><span>1</span></black>
<darkgrey><span>2</span></darkgrey>
<lightgrey><span>3</span></lightgrey>
</row>
<row>
<transparent></transparent>
<transparent></transparent>
<transparent></transparent>
</row>
<row>
<white><span>3</span></white>
<white><span>2</span></white>
<white><span>1</span></white>
</row>
</convolution>
<br>
Since this is a field interpolator, the middle row does not yet exist, so it cannot be used for any
calculations.<br>
It uses the pair with the lowest difference and sets the center pixel to the average of that pair. In the
example,
pair 3 would be used, and the resulting center pixel would be a very light grey.
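<p>As a sketch, mode 13 for one missing pixel could look like this in Python. The function takes the three
pixels above and the three pixels below the missing one; the name
<span style="font-family: monospace">mode13</span>, the rounding of the average, and the behaviour on ties
(the first pair wins here) are assumptions made for this example.</p>

```python
def mode13(above, below):
    """Interpolate one missing pixel from the 3 pixels above and the 3 below it."""
    # Pair 1: top-left/bottom-right, pair 2: vertical, pair 3: top-right/bottom-left.
    pairs = [(above[0], below[2]), (above[1], below[1]), (above[2], below[0])]
    # Pick the pair whose two pixels are closest to each other.
    p1, p2 = min(pairs, key=lambda pair: abs(pair[0] - pair[1]))
    return (p1 + p2 + 1) // 2  # rounded average; the rounding is an assumption


# The figure above: 0 = black, 51 = dark grey, 204 = light grey, 255 = white.
print(mode13([0, 51, 204], [255, 255, 255]))
```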
<h3>Mode 14</h3>
Same as mode 13, but instead of interpolating the top field, it interpolates the bottom field.
<h3>Mode 15</h3>
<div class="code">Same as 13 but with a more complicated interpolation formula.</div>
<p style="text-align: right; font-size: 80%; font-style: italic; padding-right: 10em">Avisynth Wiki</p>
<p>
“It's the same but different.” How did people use this plugin during the past decade?<br>
Anyway, here is what it actually does:
</p>
<convolution class="accent1">
<row>
<black><span>1</span></black>
<white><span>2</span></white>
<darkgrey><span>1</span></darkgrey>
</row>
<row>
<transparent><span>0</span></transparent>
<transparent><span>0</span></transparent>
<transparent><span>0</span></transparent>
</row>
<row>
<black><span>1</span></black>
<white><span>2</span></white>
<black><span>1</span></black>
</row>
</convolution>
<p>
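First, a weighted average of the three pixels above and the three pixels below the center pixel is
calculated, using the weights shown in the figure above. Since this is still a field interpolator, the
middle row does not yet exist and therefore has a weight of zero. <br>
Then, this average is clipped to the pair with the lowest difference.<br>
In the example, the average would be a grey with slightly above 50% brightness. There are more dark pixels
than bright ones, but the white pixels are counted double due to their position. This average is then
clipped to the pair with the smallest range. Note that the numbers in this figure are weights, not pair
labels; the pairs are the same three as in mode 13. Here, the top-left/bottom-right pair (both black) and
the vertical pair (both white) each have a range of zero, while the top-right/bottom-left pair (dark grey
and black) has a larger one. One of the two zero-range pairs is therefore chosen, and the resulting pixel
takes the color of that pair.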
</p>
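<p>Mode 15 can be sketched the same way as mode 13, with the weighted average clipped to the best pair
instead of averaging the pair itself. As before, the function name, the rounding, and the tie handling (the
first pair wins) are assumptions, and the input values are made up so that no tie occurs.</p>

```python
def clip(value, lower, upper):
    """Clamp value into the closed range [lower, upper]."""
    return max(lower, min(value, upper))


def mode15(above, below):
    """Interpolate one missing pixel from the 3 pixels above and the 3 below it."""
    # Weights 1-2-1 for both rows, so the total weight is 8.
    weighted_sum = (above[0] + 2 * above[1] + above[2]
                    + below[0] + 2 * below[1] + below[2])
    average = (weighted_sum + 4) // 8  # rounding is an assumption
    # Pair 1: top-left/bottom-right, pair 2: vertical, pair 3: top-right/bottom-left.
    pairs = [(above[0], below[2]), (above[1], below[1]), (above[2], below[0])]
    p1, p2 = min(pairs, key=lambda pair: abs(pair[0] - pair[1]))
    return clip(average, min(p1, p2), max(p1, p2))


# The average is 118, and the closest pair is top-right/bottom-left (100 and 90).
print(mode15([0, 200, 100], [90, 50, 255]))
```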
<h3>Mode 16</h3>
<p>
Same as mode 15 but interpolates bottom field.
</p>
<div class="subhead" id="c_m17">Mode 17</div>
<div ondblclick="$(this).html('This mode is accurately described in the rgtools wiki, so I shouldn\'t even have bothered.<br>Also, why are you doubleclicking random paragraphs?')">
<div class="code">
Clips the pixel with the minimum and maximum of respectively the maximum and minimum of each pair of
opposite neighbour pixels.
</div>
It may sound slightly confusing at first, but that is actually an accurate description of what this mode
does.
It creates an array containing the smaller value (called <span style="font-family: monospace">lower</span>)
of each pair and one containing the bigger value (called <span style="font-family: monospace">upper</span>
).
The center pixel is then clipped to the smallest value in <span style="font-family: monospace">upper</span>
and the biggest value in <span style="font-family: monospace">lower</span>.
</div>
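<p>In code, the description above boils down to very little. The following sketch works on one 3x3
neighbourhood given as a list of rows; the function name is made up. One detail worth noting: the biggest
value in <span style="font-family: monospace">lower</span> can be larger than the smallest value in
<span style="font-family: monospace">upper</span>, so the two limits are sorted before clipping.</p>

```python
def clip(value, lower, upper):
    """Clamp value into the closed range [lower, upper]."""
    return max(lower, min(value, upper))


# (row, column) positions of the four pairs of opposing neighbours.
PAIRS = [((0, 0), (2, 2)), ((0, 1), (2, 1)), ((0, 2), (2, 0)), ((1, 0), (1, 2))]


def mode17(n):
    """Return the new center value for a 3x3 neighbourhood given as a list of rows."""
    lower = [min(n[r1][c1], n[r2][c2]) for (r1, c1), (r2, c2) in PAIRS]
    upper = [max(n[r1][c1], n[r2][c2]) for (r1, c1), (r2, c2) in PAIRS]
    limit_a = max(lower)  # biggest of the smaller values
    limit_b = min(upper)  # smallest of the bigger values
    return clip(n[1][1], min(limit_a, limit_b), max(limit_a, limit_b))


print(mode17([[10, 20, 30], [40, 200, 50], [60, 70, 80]]))  # outlier gets pulled down
print(mode17([[0, 0, 0], [100, 50, 100], [0, 0, 0]]))       # limits in "wrong" order
```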
<div class="subhead" id="c_m21">Mode 21 and 22</div>
<h3>Mode 21</h3>
The value of the center pixel is clipped to the range between the smallest and the biggest of the four pair averages (one average for each pair of opposing pixels).
<h3>Mode 22</h3>
Same as mode 21, but rounding is handled differently. This mode is faster than 21 (4 cycles per pixel).
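<p>Here is a sketch of how the two modes could differ. If I read the rewrite correctly, mode 21 rounds the
averages down for the lower limit and up for the upper limit, while mode 22 uses a single rounded average per
pair. Please treat that rounding behaviour as an assumption rather than a guarantee; the function names are
made up as well.</p>

```python
def clip(value, lower, upper):
    """Clamp value into the closed range [lower, upper]."""
    return max(lower, min(value, upper))


# (row, column) positions of the four pairs of opposing neighbours.
PAIRS = [((0, 0), (2, 2)), ((0, 1), (2, 1)), ((0, 2), (2, 0)), ((1, 0), (1, 2))]


def pair_sums(n):
    """Sum of the two pixels of each opposing pair."""
    return [n[r1][c1] + n[r2][c2] for (r1, c1), (r2, c2) in PAIRS]


def mode21(n):
    """Assumed behaviour: averages rounded down for the lower limit, up for the upper."""
    sums = pair_sums(n)
    lower = min(s // 2 for s in sums)
    upper = max((s + 1) // 2 for s in sums)
    return clip(n[1][1], lower, upper)


def mode22(n):
    """Assumed behaviour: one rounded average per pair, used for both limits."""
    averages = [(s + 1) // 2 for s in pair_sums(n)]
    return clip(n[1][1], min(averages), max(averages))


# The top-left/bottom-right pair sums to 91, so its average is not an integer.
example = [[40, 50, 50], [50, 0, 50], [50, 50, 51]]
print(mode21(example), mode22(example))
```

<p>Under these assumptions, the two modes can only ever differ by one step, and only when the pair that
defines the lower limit has an odd sum.</p>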
<div class="subhead" id="c_m23">Mode 23 and 24</div>
These two are too difficult to explain using words, so I'm not even going to try. Read the <a
href="https://github.com/tp7/RgTools/blob/master/RgTools/rg_functions_c.h#L394">code</a> if you're
interested, but don't expect to find anything special. I can't see these modes actually doing anything useful, so
the documentation was once again quite accurate.
<p class="ninjatext">I feel like I'm missing a proper ending for this, but I can't think of anything</p>
</div>
</div>