initial commit

This commit is contained in:
kageru 2019-04-13 15:16:45 +02:00
commit 033aed5884
Signed by: kageru
GPG Key ID: 8282A2BEA4ADA3D2
14 changed files with 3405 additions and 0 deletions

253
adaptivegrain.html Normal file

@@ -0,0 +1,253 @@
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<p class="heading">Adaptive Graining Methods</p>
<div class="content">
<ul>
<li><a href="#c_abstract">Abstract</a></li>
<li><a href="#c_demo">Demonstration and Examples</a></li>
<li><a href="#c_script">Theory and Explanation</a></li>
<li><a href="#c_performance">Performance</a></li>
<li><a href="#c_usage">Usage</a></li>
<li><a href="#c_outtro">Closing Words and Download</a></li>
</ul>
</div>
<div class="content">
<p class="subhead"><a href="#c_abstract" id="c_abstract">Abstract</a></p>
In order to remedy the effects of lossy compression of digital media files, dither is applied to randomize
quantization errors and thus avoid or remove distinct patterns which are perceived as unwanted artifacts. This
can be used to remove banding artifacts by adding random pixels along banded edges which will create the
impression of a smoother gradient. The resulting image will often be more resilient to lossy compression, as the
added information is less likely to be omitted by the perceptual coding algorithm of the encoding software.
<br>Wikipedia explains it like this:
<div class="code">
High levels of noise are almost always undesirable, but there are cases when a certain amount of noise is
useful, for example to prevent discretization artifacts (color banding or posterization). [. . .] Noise
added for such purposes is called dither; it improves the image perceptually, though it degrades the
signal-to-noise ratio.
</div>
<p>
In video encoding, especially regarding anime, this is utilized by debanding filters to improve their
effectiveness and to “prepare” the image for encoding. While grain may be beneficial under some
circumstances, it is generally perceived as an unwanted artifact, especially in brighter scenes where
banding
artifacts would be less likely to occur even without dithering. Most Blu-rays released today will already
have
grain in most scenes which will mask most or even all visual artifacts, but for reasons described <a
href="article.php?p=grain">here</a>, it may be beneficial to remove this grain.</p>
<p>
As mentioned previously, most debanding filters will add grain to the image, but in some cases this grain
might
be either too weak to mask all artifacts or too faint, causing it to be removed by the encoding software,
which in
turn allows the banding to resurface. In the past, scripts like GrainFactory were written to specifically
target
dark areas to avoid the aforementioned issues without affecting brighter scenes.</p>
<p>
This idea can be further expanded by using a continuous function to determine the grain's strength based on
the
average brightness of the frame as well as the brightness of every individual pixel. This way, the problems
described above can be solved with less grain, especially in brighter areas and bright scenes where the dark
areas are less likely to be the focus of the viewer's attention. This improves the perceived quality of the
image
while simultaneously saving bitrate due to the absence of grain in brighter scenes and areas.
</p>
<p class="subhead"><a href="#c_demo" id="c_demo">Demonstration and Examples</a></p>
Since there are two factors that will affect the strength of the grain, we need to analyze the brightness of any
given frame before applying any grain. This is achieved by using the PlaneStats function in Vapoursynth. The
following clip should illustrate the results. The brightness of the current frame is always displayed in the top
left-hand corner. The surprisingly low values in the beginning are caused by the 21:9 black bars. <span
style="font-size: 70%; color: #aaaaaa;">(Don't mind the stuttering in the middle. That's just me being bad)</span>
<br><br>
<video width="1280" height="720" controls>
<source src="/media/articles/res_adg/adg_luma.mp4" type="video/mp4">
</video>
<br>
<div style="font-size: 80%;text-align: right">You can <a href="/media/articles/res_adg/adg_luma.mp4">download the video</a>
if your browser is not displaying it correctly.
</div>
In the dark frames you can see banding artifacts which were created by x264's lossy compression algorithm.
Adding grain fixes this issue by adding more randomness to the gradients.<br><br>
<video width="1280" height="720" controls>
<source src="/media/articles/res_adg/adg_grained.mp4" type="video/mp4">
</video>
<br>
<div style="font-size: 80%;text-align: right"><a href="/media/articles/res_adg/adg_grained.mp4">Download</a></div>
By using the script described above, we are able to remove most of the banding without lowering the crf,
increasing aq-strength, or graining other surfaces where it would have decreased the image quality.
<p class="subhead"><a href="#c_script" id="c_script">Theory and Explanation</a></p>
The script works by generating a copy of the input clip in advance and graining that copy. For each frame in the
input clip, a mask is generated based on the frame's average luma and the individual pixel's value. This mask is
then used to apply the grained clip with the calculated opacity. The generated mask for the previously used clip
looks like this:<br><br>
<video width="1280" height="720" controls>
<source src="/media/articles/res_adg/adg_mask.mp4" type="video/mp4">
</video>
<br>
<div style="font-size: 80%;text-align: right"><a href="/media/articles/res_adg/adg_mask.mp4">Download</a></div>
The brightness of each pixel is calculated using this polynomial:
<div class="code">
z = (1 - (1.124x - 9.466x^2 + 36.624x^3 - 45.47x^4 + 18.188x^5))^(y^2 * <span
style="color: #e17800">10</span>)
</div>
where x is the luma of the current pixel, y is the current frame's average luma, and z is the resulting pixel's
brightness. The highlighted number (10) is a parameter called luma_scaling which will be explained later.
<p>
The polynomial is applied to every pixel and every frame. All luma values are floats between 0 (black) and 1
(white). For performance reasons the precision of the mask is limited to 8&nbsp;bits, and the frame
brightness is rounded to 1000 discrete levels.
All lookup tables are generated in advance, significantly reducing the number of necessary calculations.</p>
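To make the curve more tangible, here is a small standalone sketch (not part of the actual script shown below) that evaluates the polynomial for a single pixel; the example values are only illustrations:
<pre><code class="python">def mask_value(x: float, y: float, luma_scaling: float = 10) -> float:
    """Evaluate the polynomial above for pixel luma x and frame luma y, both in [0, 1]."""
    poly = 1.124 * x - 9.466 * x**2 + 36.624 * x**3 - 45.47 * x**4 + 18.188 * x**5
    return (1 - poly) ** (y**2 * luma_scaling)

# The same mid-gray pixel gets a far more opaque mask in a dark frame than in a bright one:
mask_value(0.5, 0.2)  # ~0.76
mask_value(0.5, 0.8)  # ~0.01</code></pre>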
<p>
Here are a few examples to better understand the masks generated by the aforementioned polynomial.</p>
<table style="width: 100%">
<tr>
<td style="width: 33%"><img src="/media/articles/res_adg/y0.2.svg"></td>
<td style="width: 33%"><img src="/media/articles/res_adg/y0.5.svg"></td>
<td style="width: 33%"><img src="/media/articles/res_adg/y0.8.svg"></td>
</tr>
</table>
<p>
Generally, the lower a frame's average luma, the more grain is applied even to the brighter areas. This
abuses
the fact that our eyes are instinctively drawn to the brighter part of any image, making the grain less
necessary in images with an overall very high luma.</p>
Plotting the polynomial for all y-values (frame luma) results in the following image (red means more grain and
yellow means less or no grain):<br>
<img style="width: 100%; margin: -5em 0" src="/media/articles/res_adg/preview.svg">
<br>
More detailed versions can be found <a href="/media/articles/res_adg/highres.svg">here</a> (100 points per axis) or <a
href="/media/articles/res_adg/superhighres.7z">here</a> (400 points per axis).<br>
Now that we have covered the math, I will quickly go over the Vapoursynth script.<br>
<div class="spoilerbox_expand_element">Click to expand code<p class="vapoursynth">
<br>import vapoursynth as vs
<br>import numpy as np
<br>import functools
<br>
<br>def adaptive_grain(clip, source=None, strength=0.25, static=True, luma_scaling=10, show_mask=False):
<br>
<br>&nbsp;&nbsp;&nbsp;&nbsp;def fill_lut(y):
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;x = np.arange(0, 1, 1 / (1 << src_bits))
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;z = (1 - (1.124 * x - 9.466 * x ** 2 + 36.624 * x ** 3 -
45.47 * x ** 4 + 18.188 * x ** 5)) ** (
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;(y ** 2) * luma_scaling) * ((1
<< src_bits) - 1)
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;z = np.rint(z).astype(int)
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return z.tolist()
<br>
<br>&nbsp;&nbsp;&nbsp;&nbsp;def generate_mask(n, clip):
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;frameluma =
round(clip.get_frame(n).props.PlaneStatsAverage * 999)
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;table = lut[int(frameluma)]
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return core.std.Lut(clip, lut=table)
<br>
<br>&nbsp;&nbsp;&nbsp;&nbsp;core = vs.get_core(accept_lowercase=True)
<br>&nbsp;&nbsp;&nbsp;&nbsp;if source is None:
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;source = clip
<br>&nbsp;&nbsp;&nbsp;&nbsp;if clip.num_frames != source.num_frames:
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;raise ValueError('The length of the filtered and
unfiltered clips must be equal')
<br>&nbsp;&nbsp;&nbsp;&nbsp;source = core.fmtc.bitdepth(source, bits=8)
<br>&nbsp;&nbsp;&nbsp;&nbsp;src_bits = 8
<br>&nbsp;&nbsp;&nbsp;&nbsp;clip_bits = clip.format.bits_per_sample
<br>
<br>&nbsp;&nbsp;&nbsp;&nbsp;lut = [None] * 1000
<br>&nbsp;&nbsp;&nbsp;&nbsp;for y in np.arange(0, 1, 0.001):
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;lut[int(round(y * 1000))] = fill_lut(y)
<br>
<br>&nbsp;&nbsp;&nbsp;&nbsp;luma = core.std.ShufflePlanes(source, 0, vs.GRAY)
<br>&nbsp;&nbsp;&nbsp;&nbsp;luma = core.std.PlaneStats(luma)
<br>&nbsp;&nbsp;&nbsp;&nbsp;grained = core.grain.Add(clip, var=strength, constant=static)
<br>
<br>&nbsp;&nbsp;&nbsp;&nbsp;mask = core.std.FrameEval(luma, functools.partial(generate_mask, clip=luma))
<br>&nbsp;&nbsp;&nbsp;&nbsp;mask = core.resize.Bilinear(mask, clip.width, clip.height)
<br>
<br>&nbsp;&nbsp;&nbsp;&nbsp;if src_bits != clip_bits:
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;mask = core.fmtc.bitdepth(mask, bits=clip_bits)
<br>
<br>&nbsp;&nbsp;&nbsp;&nbsp;if show_mask:
<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return mask
<br>
<br>&nbsp;&nbsp;&nbsp;&nbsp;return core.std.MaskedMerge(clip, grained, mask)
</p></div><br>
<b>Thanks to Frechdachs for suggesting the use of std.FrameEval.</b><br>
<br>
In order to adjust for things like black bars, the curves can be manipulated by changing the luma_scaling
parameter. Higher values will cause comparatively less grain even in darker scenes, while lower
values will increase the opacity of the grain even in brighter scenes.
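To find a fitting value for your source, the mask can be previewed directly via the show_mask parameter from the signature above; clip here stands for whatever filtered clip you intend to grain:
<pre><code class="python"># Compare the opacity masks for two luma_scaling values before settling on one.
mask_default = adaptive_grain(clip, luma_scaling=10, show_mask=True)
mask_weaker = adaptive_grain(clip, luma_scaling=30, show_mask=True)</code></pre>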
<!--
<p class="subhead"><a id="c_performance" href="#c_performance">Performance</a></p>
<p>Essentially, this script does two things. It analyzes the average brightness of every frame and generates a
mask. Luma analysis is done with one of the core functions and should have no impact on the encoding
performance.<br>
The mask generation should also have virtually no impact on the encode, as the lookup tables
are generated within seconds and then reused for all frames. I cannot time the involved core functions
(core.std.lut and core.std.MaskedMerge)
accurately, but neither of them is particularly expensive, so the performance cost should be negligible.<br>
Don't expect a notable difference in encoding performance.
-->
</p>
<p class="subhead"><a id="c_usage" href="#c_usage">Usage</a></p>
The script has four parameters, three of which are optional.
<table class="paramtable">
<tr>
<td>Parameter</td>
<td>[type, default]</td>
<td class="paramtable_main">Explanation</td>
</tr>
<tr>
<td>clip</td>
<td>[clip]</td>
<td>The filtered clip that the grain will be applied to</td>
</tr>
<!--
<tr>
<td>source</td>
<td>[clip, None]</td>
<td>The source clip which will be used for the luma mask. For performance reasons this should be the
unfiltered source clip. If you changed the length of your filtered clip with Trim or similar
filters, also apply those filters here, but do not denoise, deband, or otherwise filter
this clip. If unspecified, the clip argument will be used.
</td>
</tr>
-->
<tr>
<td>strength</td>
<td>[float, 0.25]</td>
<td>Strength of the grain generated by AddGrain.</td>
</tr>
<tr>
<td>static</td>
<td>[boolean, True]</td>
<td>Whether to generate static or dynamic grain.</td>
</tr>
<tr>
<td>luma_scaling</td>
<td>[float, 10]</td>
<td>This value changes the general grain opacity curve. Lower values will generate more grain, even in
brighter scenes, while higher values will generate less, even in dark scenes.
</td>
</tr>
</table>
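A minimal usage sketch, assuming the function is imported from kagefunc as linked below; the source filter, file name, and debanding step are placeholders:
<pre><code class="python">import vapoursynth as vs
import kagefunc as kgf
core = vs.get_core()

src = core.ffms2.Source('video.mkv')   # placeholder source filter and path
filtered = core.f3kdb.Deband(src)      # any filtered/debanded clip works here
grained = kgf.adaptive_grain(filtered, strength=0.25, luma_scaling=10)
grained.set_output()</code></pre>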
<br>
<p class="subhead"><a href="#c_outtro" id="c_outtro">Closing Words</a></p>
<p>
Grain is a type of visual noise that can be used to mask discretization artifacts if used correctly. Too much grain
will
degrade the perceived quality of a video, while too little grain might be destroyed by the perceptual coding
techniques used in many popular video encoders.</p>
<p>
The script described in this article aims to apply the optimal amount of grain to all scenes to prevent
banding artifacts without having a significant impact on the perceived image quality or the required
bitrate.
It does this by taking the brightness of the frame as a whole and every single pixel into account and
generating an opacity mask based on these values to apply grain to certain areas of each frame. This can be
used to supplement or even replace the dither generated by other debanding scripts. The script has a
noticeable but not significant impact on encoding performance.
</p>
<div class="download_centered"><a href="https://github.com/Irrational-Encoding-Wizardry/kagefunc/blob/master/kagefunc.py">Download</a></div>
<br><br><span style="color: #251b18; font-size: 50%; text-align: left">There's probably a much simpler way to do this, but I like this one. fite me m8</span>
</div>
</div>

82
aoc.html Normal file

@@ -0,0 +1,82 @@
<body>
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div class="wrapper_article">
<p class="heading" style="line-height: 1.1em">Challenges for the Advent of Code or<br>“How to Bite Off More Than You Can Chew, but not too much”</p>
<div class="content">
<p class="subhead"><a href="#c_introduction" id="c_introduction">Introduction</a></p>
<p>
Humans love challenges.<br>
We want to prove something to our friends, our coworkers, our peers, or even just ourselves.<br>
We want to push ourselves to our limits, maybe even beyond them, and grow into more than what we are.<br>
We want to show the world that we can do something, whatever that may be.<br>
We want to get just a tiny bit closer to the optimal version, to the person we could become.<br>
</p>
<p>
That’s one reason to do Advent of Code. The other — and much more common — is probably just boredom.<br>
For those unaware: Advent of Code is like an advent calendar where there’s a coding task hidden behind each door. You receive a randomized input (actually a randomly selected one from a precomputed set because they can’t store and solve a new puzzle for every user), and you have to enter your solution on the website. How you get that solution doesn’t matter. That should give you a basic idea. Onto the actual content of this.
</p>
<p class="subhead"><a href="#c_leaning" id="c_leaning">Learning From Past Experiences</a></p>
<p>
Last year, I failed the Advent of Code. And not just a little bit. I really mean I failed.
I wanted to use it as an opportunity to learn Haskell, so my plan was to implement all 24 tasks in Haskell.
This should, in theory, give me enough practice to understand the basics and become at least reasonably comfortable in the language.
</p>
<p>
Reality was… not quite that.<br>
Not only did I not finish even <em>a single</em> puzzle in Haskell, but it deterred me from even trying to learn any functional language for quite a while (that is, until today and ongoing).
I then switched to Nim because reasons, but I didn’t finish all 24 in that either. I ended up skipping ~70% of all days.
</p>
<p>
What do we learn from that?<br>
Well, for one, we shouldn’t just plan tasks of completely unknown complexity. “It will be alright” is not how you approach software. I’ve done that, and it works more often than it should, but this is really a case where it was just silly.
For two, (I know that’s not how english works; don’t @ me) I needed variety. Just picking one language may work for other people and it’s certainly useful if you want to learn that language, but it wasn’t enough to keep me motivated.
</p>
<p class="subhead"><a href="#c_forward" id="c_forward">Going Forward</a></p>
<p>
So what did I change this year? Simple.
<ul>
<li title="yes, I still can’t into functional programming">I will not use programming paradigms that are completely alien to me</li>
<li>I will use more than just one language to keep things interesting</li>
<li>I will brag as openly as possible about my plans so people can publicly shame me for not following them</li>
</ul>
I know that there are people who try “24 days, 24 languages”, but I’m not quite that insane. For me, 8 languages should be enough. I’m giving myself three tickets for each language. It is up to me which language I will use for any given day, as long as I have at least one ticket left for that language.<br>
I’ve created a git repo for this challenge <a href="https://git.kageru.moe/kageru/advent-of-code-2018">here</a>.<br>
The number of remaining tickets for each language can be tracked in the <code>tickets.md</code> file.<br>
Edit: looks like there are 25 Days, and I misjudged a few things, so I had to change my plans here on day one. Still keeping this section, just because.<br>
The languages I have chosen are (in alphabetical order):
<ol>
<li>C</li>
<li>Go</li>
<li>Java</li>
<li>Javascript</li>
<li>Kotlin</li>
<li>Nim</li>
<li>Python</li>
<li>Rust</li>
</ol>
That puts us at 6 compiled and 2 interpreted languages. Out of these, I would rate myself the most proficient in Python and the least proficient in C. Contrary to popular belief, you can be a software developer <span title="Just for the sake of the stupid pun, I just opened a terminal and ran “touch c”. Kill me pls.">without ever having touched C</span>.<br>
<br>I would like to add that I’m not necessarily a fan of all of these, especially Java. However, since I’m currently working on a ~1,000,000 loc Java project as my fulltime job, not including it here just felt wrong.<br>
To show my remorse and to give me a very early impression of the suffering that this decision will bring with it, I’m typing this paragraph in <a href="https://www.gnu.org/fun/jokes/ed.msg.html">ed, the default editor</a>.
It really is quite the experience. The creativity of the people who wrote it is admirable. You wouldn’t believe how many convenience features you can add to an editor that only displays one line at a time. (Let me also remind you that ed was used before monitors were a thing, so printing lines wasn’t free either.)
This is, unironically, a better editor than most modern default editors (referring mostly to Windows Notepad and whatever OS X uses).<br>
Oh, and ed supports regex. Everything is better if it supports regex. But I digress.
<br>
Apart from the JVM languages, I’ll write all of the solutions in vim, using only Syntastic and the language’s respective command line compilers.
Java is just too verbose to be used without introspection and autocompletion. At least for my taste.
vim can be really cool to use, and I love it for most languages, but Java is just not one of them. Still typing this in ed, btw. :^)
</p>
<p>
If I come across a solution that seems particularly interesting to me, I might share it here, but I doubt that will happen.
Let’s see how far I can get this year. Expect a recap of this year’s inevitable failure here around the end of the year.
</p>
<p>
Edit: after finding out that JS has basically no consistent way of reading files, I removed that from the list. It’s garbage anyway, and I’m not writing node.js. Since all of this doesn’t really work with 25 days either, I’ve also decided to drop two more languages to get me to 5x5.<br>
That left me with C, Go, Kotlin, Python, and Rust. 5 days each.
</p>
<span class="ninjatext">I thought about including Haskell in this list, but I decided not to repeat history… at least for now</span>
</div>
</div>
</body>

153
aoc_postmortem.html Normal file

@@ -0,0 +1,153 @@
<body>
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<p class="heading">Advent of Code: Postmortem</p>
<div class="content">
<p>
Looks like it’s that time of the year again. I’m on a train with nothing to do, and I’ve been procrastinating this for months.
</p>
<p>
Last year, I attempted a challenge for <a href="https://adventofcode.com/">Advent of Code</a>.
If you want a detailed explanation, check out the <a href="https://kageru.moe/blog/article/aoc/">blog post I wrote back then.</a>
tl;dr: I had a set of 5 programming languages for the 25 tasks. I had to use each of them 5 times.
</p>
<p>
Even though I failed, it didn’t turn out that bad.
I finished 14.5 days. 1-14 are fully implemented, and for 16, I did the first half. Then, I stopped.</p><p>
I’m not here to make excuses, but humor me for a paragraph before we get to the actual content of this.
</p>
<p>
I really disliked day 15.
Not because it was too hard, but because it would have been boring to implement while also being time-consuming due to the needless complexity.
I just didn’t feel like doing that, and that apparently set a dangerous precedent that would kill the challenge for me.</p><p>
So there’s that. Now for the interesting part:
</p>
<p class="subhead">The part you’re actually here for</p>
<p>
I tried out languages. Languages that I had never really used before. Here are my experiences:
</p>
<p class="subhead">C<p>
<p>
Days finished: 4</p><p>
Before the challenge, I knew absolutely nothing about C. I had never allocated memory, incremented a pointer, or used section 3 of <code>man</code>. That’s why I wanted to get this out of the way as quickly as possible.</p><p>
C is interesting.
It lets you do all these dirty things, and doing them was a kind of guilty pleasure for me.
Manually setting null bytes or shifting pointers around in arrays.
Nothing about that is special, but other languages just don’t let you.
You know it’s bad, and that only makes it better.</p><p>
Would I use C for other private projects? No. Definitely not.
I just don’t see the point in $currentYear. But was it interesting? You bet.
</p>
<p class="subhead">Go<p>
<p>
Days finished: 3</p><p>
Allegedly, this language was made by some very smart people.
They may have been smart, but they may have created the most boring programming language in existence.
It feels bad to write, and it’s a terrible choice when dealing mostly with numbers (as is the case with most AoC puzzles).
It’s probably great for business logic, and I can say from personal experience that it also works quite well for web development and things like <a href="https://git.kageru.moe/kageru/discord-selphybot">discord bots</a>.
But to me, not having map/reduce/filter/etc. just makes a language really unenjoyable and verbose.</p><p>
Writing Go for AoC sometimes felt like writing a boring version of C (TL note: one that won’t segfault and memory leak your ass off if you don’t pay attention).
</p><p>
People say it’s more readable and all that, and that’s certainly great for huge projects, but for something like this… I wouldn’t ever pick Go voluntarily.</p><p>
And also not for anything else, to be perfectly honest. I mean… just look at this.
(Yes, I wrote this. Scroll down and use the comment section to tell me that I’m just too dumb to understand Go.)
</p>
<pre><code class="go">package main
import "fmt"
func main() {
// This declares a mutable variable.
var regularString = "asdf"
// This also declares a mutable variable.
// I have no idea why there are two ways of doing this.
unicodeString := "aä漢字"
for char := range(unicodeString) {
// You might expect that this prints the characters individually,
fmt.Println(char)
/*
* Instead, it compiles and prints (0, 1, 3, 6) -- the index of the first byte of each character.
* Very readable and very intuitive. Definitely what the user would want here.
*/
}
for _, char := range(unicodeString) {
/*
* Having learned from our past mistakes, we assign the index to _ to discard it.
* Surely this time.
*/
fmt.Println(char)
/*
* Or not because this prints (97, 228, 28450, 23383) -- the unicode indices of the characters.
* Printing a rune (the type Go uses to represent individual characters,
* e.g. during string iteration) actually prints its integer value.
*/
}
for _, char := range(unicodeString) {
/*
* This actually does what you’d expect.
* It also handles unicode beautifully, instead of just iterating over the bytes.
*/
fmt.Printf("%c\n", char)
}
/*
* So go knows what a character is and how many of those are in a string when iterating.
* Intuitively, this would also apply to the built-in len() function.
* However...
*/
fmt.Println(len(regularString)) // prints 4
fmt.Println(len(unicodeString)) // prints 9
}</code></pre>
<p>
Oh, and there are no generics. Moving on.
</p>
<p class="subhead">Kotlin<p>
<p>
Days finished: 3</p><p>
Oh Kotlin. Kotlin is great. A few weeks after AoC, I actually started writing Kotlin at work, and it’s lovely.
It fixes almost all of the issues people had with Java while maintaining perfect interoperability, and it’s also an amazing language just by itself.</p><p>
I like the way Kotlin uses scopes and lambdas for everything, and the elvis operator makes dealing with nullability much easier.
Simple quality of life improvements (like assignments from try/catch blocks), things like let() and use(), proper built-in singletons, and probably more that I’m forgetting make this a very pleasant language. Would recommend.</p><p>
In case you didn’t know: Kotlin even compiles to native binaries if you’re not using any Java libraries (although that can be hard because you sometimes just need the Java stdlib).
</p>
<p class="subhead">Python<p>
<p>
Days finished: 2</p><p>
I don’t think there’s much to say here.
Python was my fallback for difficult days because I just feel very comfortable writing it.
The standard library is the best thing since proper type inference, and it supports all the syntactic sugar that anyone could ask for.
If a language is similar to Python, I’ll probably like it.</p><p>
Yes, I’ve tried nim.
</p>
<p class="subhead">Rust<p>
<p>
Days finished: 2</p><p>
Rust is… I don’t even know.
But first things first: I like Rust.<br>
I like its way of making bad code hard to write.<br>
I like the crate ecosystem.<br>
I like list operations and convenience functions like <code>sort_by_key</code>.<br>
I like immutability by default.<br>
I like generics (suck it, Go).</p><p>
Not that I didn’t have all kinds of issues with it, but Rust made me feel like those issues were my fault, rather than the fault of the language.
I also wouldn’t say I feel even remotely comfortable with the borrow checker -- it sometimes (or more often than I’d like to admit) still felt like <a href="https://i.redd.it/iyuiw062b1s11.png" target="_blank">educated trial and error.</a>
I’m sure this gets better as you grow more accustomed to the language, and so far I haven’t encountered anything that would be a deal breaker for me.</p><p>
Rust might even become my go-to language for performance-sensitive tasks at some point.
It definitely has all of the necessary tools.
Unlike <em>some languages</em> that leave you no room for optimization with intrinsics or similar magic. (Why do I keep going back to insulting Go? Why does Go keep giving me reasons for doing so?)</p><p>
The borrow checker will likely always be a source of issues, but I think that is something that is worth getting used to.
The ideas behind it are good enough to justify the hassle.
</p>
<p class="subhead">See you next year. Maybe.<p>
<p>
I underestimated just how little time and motivation I’d have left after an 8-hour workday that already mostly consists of programming.</p><p>
It was fun, though, and I’ll probably at least try something similar next year.<br><br>
Let’s see what stupid challenge I come up with this time.
</p>
<span class="ninja">Did anyone actually expect me to succeed?</span>
</div>
</div>
</body>

116
blogs.html Normal file

@@ -0,0 +1,116 @@
<body>
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<p class="accent1 heading">
Writing Blogs out of Boredom
</p>
<div class="content">
<p>
Yes, blogs. Not blog posts. We’ll get into that in a second.<br>
This entire thing is also really unpolished, but I probably won’t ever find the motivation to make it better. Keep that in mind, and don’t expect too much.
</p>
<p class="subhead">
Introduction
</p>
<p>
More often than not, we find ourselves bored. The modern world, brimming with excitement, variety, colors, lights, and stimuli, fails to entertain us. We could have anything, but we want nothing.
While this is certainly a serious issue that merits pages upon pages of psychological and even philosophical discussion, it won’t be our topic for today.
Today, we’re looking at the other side of the digital age. Those rare occurrences where you have nothing, but would take anything.<br>
Okay, maybe not quite, but let’s go with that.
</p>
<p>
A few weeks ago, I found myself on a rather lengthy train trip. No books, close to no reception, no one to talk to. Just me, a bag of clothes, and a laptop without internet.
<sup>
[1]
</sup>
<br>
So I did what every sane person would do. I programmed something. Not because I really needed it, but because it kept me busy, and hey, maybe someone someday might get some use out of it.
As has become the norm for me, I wanted to do something that was in some way new or unknown to me. In this case, template rendering and persistent storage in Go.
</p>
<div class="annotation">
[1] Incidentally, I’m on a train right now. No books, no reception, just a laptop with… you get the gist.<br>
<span class="ninja">
Although there is a moderately attractive girl sitting not too far from me this time.<br>
She just seems awfully immersed into her MacBook, and I don’t want to interrupt her.<br>Also &gt;Apple
</span>
</div>
<p>
The idea of the blog is quite simple, and I don’t feel bad admitting that I took inspiration from a friend who had designated channels on his discord server that were used as micro blogs by different users. One channel per user.
Mimicking that, I created a simple database schema that would store a message, an author, and a timestamp.<sup>[2]</sup>
The template would then query the last 50 entries from that database and display them on the website like an IRC log.<br><br>
And that was the frontend done.<br>
<a href="https://i.kageru.moe/kpGbwu.png">This</a> is what it looked like, in case you’re curious.
</p>
<div class="annotation">
[2] Of course, an auto-incrementing ID was also added, but that’s not really relevant here.
</div>
<p class="subhead accent1">
What, Why, How?
</p>
<p>
The reality is that the frontend was the very last part I implemented, but I already had a general idea, and explaining it first makes more sense to an outsider.
Some of you might also be thinking
</p>
<p>
<em>“What do you mean »the frontend was done«? You can’t publish anything on that blog.”</em>
</p>
<p>
You see, almost everything is optional. Sure, you might want to publish something, but do you really need a dedicated page for that? As long as the server knows who you are and what you want to write (and that you have the permissions to do so), it’s fine.
Implementing a login page in the frontend seemed a bit overkill, and requiring you to copy-paste a token into a password field for every line that you want to publish is also inconvenient at best.<br>
And why would we want that anyway?
</p>
<p class="subhead accent1">
The Best UI There Is…
</p>
<p>
There is <a href="https://brandur.org/interfaces">a very good article</a> about terminals and what we can learn from them when designing UIs.
However, in our efforts to make UIs that are, at least in some regards, <em>like</em> a terminal, we shouldn’t forget that some UIs might as well <em>be</em> terminals.<br>
And so I did the only logical thing. I implemented an endpoint that opens a port and listens for POST requests containing a JSON.<br>
That way, publishing can be automated with a simple shell script that uses curl to send the requests.<sup>[3]</sup>
</p>
<pre>
<code class="bash">function publish {
curl your_blog.com/add -d "{\"content\": \"$line\", \"Secret\": \"your_password\", \"author\": \"kageru\"}" -H "Content-Type: application/json"
}
read line
while [[ ! -z "$line" ]]; do
publish "$line"
read line
done
</code>
</pre>
<p>
This simple script will continue to read lines from STDIN (i. e. whatever you type before pressing return) and publish them as individual entries on the blog. It exits if you enter an empty line.
</p>
<p>
Now tell me, did you really need a website with a full HTML form for that?
</p>
<div class="annotation">
[3] Did I mention this only works on *nix? Though you could probably create a very similar script in PowerShell.
</div>
<p class="subhead accent1">
…Sometimes
</p>
<p>
I won’t deny that this is ugly in a few ways.<br>
Having your password stored in plain text in a shell script like this is certainly a huge security flaw, but since no real harm can be done here (all messages are HTML escaped, so no malicious content could be embedded in the website by an attacker), I consider this an acceptable solution for what it is: a proof of concept that was created because I was bored on a train. Just like I am now, as I’m writing this.
The way the backend handles the requests is also anything but beautiful, but it does the job surprisingly well (if, like me, you only want a single user). You don’t even need a separate config file to store your password because the hash is in the source code and compiled into the binary.<br>
Isn’t this a great example of how time constraints and spontaneous solutions can make software terrible?
<em>
This is why, when a developer tells you he will need 3 months, you don’t force him to do it in 3 weeks.
</em>
</p>
<p>
Anyway, I think that’s it. I don’t know if this will be interesting, entertaining, enlightening, or whatever to anyone, but even if it isn’t, it still kept me busy for the past hour. I still have almost two hours ahead of me, but I’ll find a way to keep myself entertained.
</p>
<span class="ninjatext">
The girl I mentioned earlier stashed away her MacBook, but it looks like she’s going to get off the train soon. Tough luck.
</span>
</div>
</div>
</body>

88
dependencies.html Normal file

@@ -0,0 +1,88 @@
<body>
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div class="wrapper_article">
<p class="heading">Vapoursynth: Are We Too Afraid Of Dependencies</p>
<div class="content">
<p class="subhead">Introduction</p>
<p class="content">
Now, if you’re anything like me, you’ll probably react to that title with “wtf, no, we have way too many of them”.
But hear me out. While it is true that most Vapoursynth “funcs” have dozens of dependencies that are sometimes poorly (or not at all) documented,
we might be cutting corners where it matters most. Most as measured by “where it affects most users”.
</p>
<p class="content">
Fundamentally, there are two groups of developers in the Vapoursynth “community” (and I use that term very loosely):
<ul>
<li> People who write plugins</li>
<li> People who write Python functions</li>
</ul>
Myrsloik, the head developer of Vapoursynth, has set everything up in a way that facilitates a structured plugin ecosystem.
Every plugin can reserve a namespace, and all of its functions have to live there.
You can’t have a denoiser and a color shift in the same namespace without making things really confusing, both for you as a dev and for the users.<br>
This is good. It should be like this. But here’s an issue:
</p>
<p class="subhead">Functions are a separate ecosystem</p>
<p class="content">
Funcs (and I’ll use that term to refer to collections of functions such as havsfunc or mvsfunc) are fundamentally different from plugins.
Most importantly for the user, they need to be imported manually. The namespacing is handled by Python. But even Python can’t save you if you don’t let it.
</p>
<p class="content">
Probably the most popular func out there is <a href="https://github.com/HomeOfVapourSynthEvolution/havsfunc/blob/master/havsfunc.py">havsfunc</a>.
At the time of writing, it consists of 32 main functions and 18 utility functions. The other big funcs paint a similar picture.
For some reason, the convention has become to dump everything you write into a single Python file and call it a day.
When I started using Vapoursynth, this trend was already established, so I didn’t really think about it and created my own 500-line monstrosity with no internal cohesion whatsoever.
We don’t care if our func depends on 20 plugins, but God forbid a single encoder release two Python modules that have to be imported separately.
This is what I mean by “we’re afraid of dependencies”. We want all of our code in one place with a single import.<br>
It is worth pointing out that not everyone is doing this. People like dubhater or IFeelBloated exist, but the general consensus in the community (if such a thing even exists) seems to be strongly in favor of monolithic, basically unnamed script collections.
This creates multiple problems:
</p>
<p class="subhead">The Barrier of Entry</p>
<p class="content">
I don’t think anyone can deny that encoding is a very niche hobby with a high barrier of entry.
You won’t find many people who know enough about it to teach you properly, and it’s easy to be overwhelmed by its complexity.<br>
To make matters worse, if you’re just starting out, you won’t even know where to look for things.
Let’s say you’re a new encoder who has a source with a few halos and some aliasing, so you’re looking for a script that can help with that.
Looking at the documentation, your only options are Vine for the halos and vsTAAmbk for the aliasing.
There is no easy way for you to know that there are dehalo and/or AA functions in havsfunc, fvsfunc, muvsfunc, … you get the point.
We have amazing namespacing and order for plugins, but our scripts are a mess that is at best documented by a D9 post or the docstrings.
This is how you lose new users, who might have otherwise become contributors themselves.<br>
But I have a second issue with the current state of affairs:
</p>
<p class="subhead">Code Duplication</p>
<p class="content">
As mentioned previously, each of these gigantic functions comes with its own collection of helpers and utilities.
But is it really necessary that everyone writes their own oneliner for splitting planes, inserting clips, function iteration, and most simple mask operations?
The current state, hyperbolically speaking, is a dozen “open source” repositories with one contributor each and next to no communication between them.
The point of open source isn’t just that you can read other people’s code. It’s that you can actively contribute to it.
</p>
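<p class="content">
To make the idea concrete, here is a minimal sketch of the kind of shared helpers meant above. The names (get_y, split, iterate) are illustrative placeholders, not the API of any existing module:
</p>
<pre><code class="python">import vapoursynth as vs
core = vs.get_core()

def get_y(clip: vs.VideoNode) -> vs.VideoNode:
    """Return the luma plane of a YUV clip as a GRAY clip."""
    return core.std.ShufflePlanes(clip, planes=0, colorfamily=vs.GRAY)

def split(clip: vs.VideoNode) -> list:
    """Return all planes of a clip as separate GRAY clips."""
    return [core.std.ShufflePlanes(clip, planes=i, colorfamily=vs.GRAY)
            for i in range(clip.format.num_planes)]

def iterate(clip: vs.VideoNode, function, count: int) -> vs.VideoNode:
    """Apply a filter (e.g. core.std.Maximum) to a clip `count` times."""
    for _ in range(count):
        clip = function(clip)
    return clip</code></pre>
<p class="content">
Nothing in there is clever; the point is simply that it would be written, reviewed, and fixed once instead of once per func.
</p>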
<p class="subhead">The Proposal</p>
<p class="content">
So what do I want? Do I propose a new system wherein each function gets its own Python module? No. God, please no.
I accept that we won’t be able to clean up the mess that has been created. That, at least in part, <em>I</em> have created.
But maybe we can start fixing it at least a little bit to make it easier for future encoders.
Actually utilizing open source seems like it would benefit everyone. The script authors as well as the users.
Maybe we could start with a general vsutil that contains all the commonly-used helpers and utilities.
That way, if something in Vapoursynth is changed, we only have to change one script instead of 20.
This should particularly help scripts that are no longer maintained, since they won’t break quite as frequently.
The next step would be an attempt to combine functions of a specific type into modules, although this might get messy as well if not done properly.
Generally, splitting by content rather than splitting by author seems to be the way.
</p>
<p class="content">
I realize that I am in no real position to do this, but I at least want to try this for my own <a href="https://github.com/Irrational-Encoding-Wizardry/kagefunc/blob/master/kagefunc.py">kagefunc</a>, and I know at least a few western encoders who would be willing to join.
We’ve been using <a href="https://github.com/Irrational-Encoding-Wizardry">a GitHub organization</a> for this for a while, and I think this is the way to go forward.
It would also allow some sort of quality control and code review, something that has been missing for a long time now.
</p>
<p class="content">
I’ll probably stream my future work on Vapoursynth-related scripts (and maybe also some development in general) <a href="https://www.twitch.tv/kageru_">on my Twitch channel</a>.
Feel free to follow if you’re interested in that or in getting to know the person behind these texts. I’ll also stream games there (probably more games than coding, if I’m being honest), so keep that in mind.
</p>
<p class="content">
Edit: It has been pointed out to me that <a href="http://vsdb.top/">vsdb</a> exists to compensate for some of the issues described here.
I think that while this project is very helpful for newcomers, it doesn’t address the underlying issues and just alleviates the pain for now.
</p>
<span class="ninja">I just had to plug my Twitch there, didn’t I?</span>
</div>
</div>
</body>

379
edgemasks.html Normal file

@@ -0,0 +1,379 @@
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<style scoped>
convolution {
display: flex;
flex-direction: column;
font-size: 250pt;
width: 1em;
height: 1em;
}
convolution > * > * {
flex-grow: 1;
flex-shrink: 1;
border: 1px #7f7f7f solid;
font-size: 30pt;
display: flex;
align-content: space-around;
}
convolution.c3x3 > * > * {
flex-basis: 33%;
}
convolution.c5x5 > * > * {
flex-basis: 20%;
}
convolution.c7x7 > * > * {
flex-basis: 14.28%;
}
convolution > * > * > span {
margin: auto;
}
convolution > * {
display: flex;
flex-direction: row;
height: 100%;
}
convolution > * > transparent {
background-color: transparent;
}
convolution > * > *[data-accent="1"] {
border-color: #e17800;
}
convolution > * > *[data-accent="2"] {
border-color: #6c3e00;
}
</style>
<p class="heading">Edge Masks</p>
<div class="content">
<p class="subhead">Table of contents</p>
<ul>
<li><a href="#c_intro">Abstract</a></li>
<li><a href="#c_theory">Theory, examples, and explanations</a></li>
<li><a href="#c_deband">Using edge masks</a></li>
<li><a href="#c_performance">Performance</a></li>
<li><a href="#c_end">Conclusion</a></li>
</ul>
<a id="c_intro" href="#c_intro"><p class="subhead">Abstract</p></a>
<p>
Digital video can suffer from a plethora of visual artifacts caused by lossy compression, transmission,
quantization, and even errors in the mastering process.
These artifacts include, but are not limited to, banding, aliasing,
loss of detail and sharpness (blur), discoloration, halos and other edge artifacts, and excessive noise.<br>
Since many of these defects are rather common, filters have been created to remove, minimize, or at least
reduce their visual impact on the image. However, the stronger the defects in the source video are, the more
aggressive filtering is needed to remedy them, which may induce new artifacts.
</p>
<p>
In order to avoid this, masks are used to specifically target the affected scenes and areas while leaving
the
rest unprocessed.
These masks can be used to isolate certain colors or, more importantly, certain structural components of an
image.
Many of the aforementioned defects are either limited to the edges of an image (halos, aliasing) or will
never
occur in edge regions (banding). In these cases, the unwanted by-products of the respective filters can be
limited by only applying the filter to the relevant areas. Since edge masking is a fundamental component of
understanding and analyzing the structure of an image, many different implementations were created over the past few
decades, many of which are now available to us.</p>
<p> In this article, I will briefly explain and compare different ways to generate masks that deal with the
aforementioned problems.</p>
<a id="c_theory" href="#c_theory"><p class="subhead">Theory, examples, and explanations</p></a>
<p>
Most popular algorithms try to detect abrupt changes in brightness by using convolutions to analyze the
direct
neighbourhood of the reference pixel. Since the computational complexity of a convolution is
<em>O(n<sup>2</sup>)</em>
(where n is the radius), the radius should be as small as possible while still maintaining a reasonable
level of accuracy. Decreasing the radius of a convolution will make it more susceptible to noise and similar
artifacts.</p>
<p>Most algorithms use 3x3 convolutions, which offer the best balance between speed and accuracy. Examples are
the operators proposed by Prewitt, Sobel, Scharr, and Kirsch. Given a sufficiently clean (noise-free)
source, 2x2 convolutions can also be used<span class="source"><a
href="http://homepages.inf.ed.ac.uk/rbf/HIPR2/roberts.htm">[src]</a></span>, but with modern
hardware being able to calculate 3x3 convolutions
for HD video in real time, the gain in speed is often outweighed by the decreased accuracy.</p>
<p>To better illustrate this, I will use the Sobel operator to compute an example image.<br>
Sobel uses two convolutions to detect edges along the x and y axes. Note that you either need two separate
convolutions per axis or one convolution that returns the absolute values of each pixel, rather than 0 for
negative values.
<table class="full_width_table">
<tr>
<td style="width: 40%;">
<convolution class="c3x3 accent1">
<row>
<transparent><span>-1</span></transparent>
<transparent><span>-2</span></transparent>
<transparent><span>-1</span></transparent>
</row>
<row>
<transparent><span>0</span></transparent>
<transparent><span>0</span></transparent>
<transparent><span>0</span></transparent>
</row>
<row>
<transparent><span>1</span></transparent>
<transparent><span>2</span></transparent>
<transparent><span>1</span></transparent>
</row>
</convolution>
</td>
<td style="width: 40%;">
<convolution class="c3x3 accent1">
<row>
<transparent><span>-1</span></transparent>
<transparent><span>0</span></transparent>
<transparent><span>1</span></transparent>
</row>
<row>
<transparent><span>-2</span></transparent>
<transparent><span>0</span></transparent>
<transparent><span>2</span></transparent>
</row>
<row>
<transparent><span>-1</span></transparent>
<transparent><span>0</span></transparent>
<transparent><span>1</span></transparent>
</row>
</convolution>
</td>
</tr>
</table>
<p>
Every pixel is set to the highest output of any of these convolutions. A simple implementation using the
Convolution function of Vapoursynth would look like this:</p>
<pre><code class="python">def sobel(src):
sx = src.std.Convolution([-1, -2, -1, 0, 0, 0, 1, 2, 1], saturate=False)
sy = src.std.Convolution([-1, 0, 1, -2, 0, 2, -1, 0, 1], saturate=False)
return core.std.Expr([sx, sy], 'x y max')</code></pre>
Fortunately, Vapoursynth has a built-in Sobel function
<code>core.std.Sobel</code>, so we don't even have to write our own code.
<p>Hover over the following image to see the Sobel edge mask. </p>
<img src="/media/articles/res_edge/brickwall.png"
onmouseover="this.setAttribute('src', '/media/articles/res_edge/brickwall_sobel.png')"
onmouseout="this.setAttribute('src', '/media/articles/res_edge/brickwall.png')"><br>
<p>Of course, this example is highly idealized. All lines run parallel to either the x or the y axis, there are
no small details, and the overall complexity of the image is very low.</p>
Using a more complex image with blurrier lines and more diagonals results in a much more inaccurate edge mask.
<img src="/media/articles/res_edge/kuzu.png" onmouseover="this.setAttribute('src','/media/articles/res_edge/kuzu_sobel.png')"
onmouseout="this.setAttribute('src','/media/articles/res_edge/kuzu.png')"><br>
<p>
A simple way to greatly improve the accuracy of the detection is the use of 8-connectivity rather than
4-connectivity. This means utilizing all eight directions of the <a
href="https://en.wikipedia.org/wiki/Moore_neighborhood">Moore neighbourhood</a>, i.e. also using the
diagonals of the 3x3 neighbourhood.<br>
To achieve this, I will use a convolution kernel proposed by Russel A. Kirsch in 1970<span class="source"><a
href="https://ddl.kageru.moe/konOJ.pdf">[src]</a></span>.
</p>
<convolution class="c3x3 accent1">
<row>
<transparent><span>5</span></transparent>
<transparent><span>5</span></transparent>
<transparent><span>5</span></transparent>
</row>
<row>
<transparent><span>-3</span></transparent>
<transparent><span>0</span></transparent>
<transparent><span>-3</span></transparent>
</row>
<row>
<transparent><span>-3</span></transparent>
<transparent><span>-3</span></transparent>
<transparent><span>-3</span></transparent>
</row>
</convolution>
<br>
This kernel is then rotated in increments of 45° until it reaches its original position.<br>
Since Vapoursynth does not have an internal function for the Kirsch operator, I had to build my own; again,
using the internal convolution.
<pre><code class="python">def kirsch(src):
kirsch1 = src.std.Convolution(matrix=[ 5, 5, 5, -3, 0, -3, -3, -3, -3])
kirsch2 = src.std.Convolution(matrix=[-3, 5, 5, 5, 0, -3, -3, -3, -3])
kirsch3 = src.std.Convolution(matrix=[-3, -3, 5, 5, 0, 5, -3, -3, -3])
kirsch4 = src.std.Convolution(matrix=[-3, -3, -3, 5, 0, 5, 5, -3, -3])
kirsch5 = src.std.Convolution(matrix=[-3, -3, -3, -3, 0, 5, 5, 5, -3])
kirsch6 = src.std.Convolution(matrix=[-3, -3, -3, -3, 0, -3, 5, 5, 5])
kirsch7 = src.std.Convolution(matrix=[ 5, -3, -3, -3, 0, -3, -3, 5, 5])
kirsch8 = src.std.Convolution(matrix=[ 5, 5, -3, -3, 0, -3, -3, 5, -3])
return core.std.Expr([kirsch1, kirsch2, kirsch3, kirsch4, kirsch5, kirsch6, kirsch7, kirsch8],
'x y max z max a max b max c max d max e max')</code></pre>
<p>
It should be obvious that the cheap copy-paste approach is not acceptable to solve this problem. Sure, it
works,
but I'm not a mathematician, and mathematicians are the only people who write code like that. Also, yes, you
can pass more than three clips to std.Expr, even though the documentation says otherwise.<br>Or maybe my
limited understanding of math (not being a mathematician, after all) was simply insufficient to properly
decode “Expr evaluates an expression per
pixel for up to <span class="accent1">3</span> input clips.”</p>
Anyway, let's try that again, shall we?
<pre><code class="python">def kirsch(src: vs.VideoNode) -> vs.VideoNode:
w = [5]*3 + [-3]*5
weights = [w[-i:] + w[:-i] for i in range(4)]
c = [core.std.Convolution(src, (w[:4]+[0]+w[4:]), saturate=False) for w in weights]
return core.std.Expr(c, 'x y max z max a max')</code></pre>
<p>Much better already. Who needed readable code, anyway?</p>
If we compare the Sobel edge mask with the Kirsch operator's mask, we can clearly see the improved accuracy.
(Hover=Kirsch)<br>
<img src="/media/articles/res_edge/kuzu_sobel.png" onmouseover="this.setAttribute('src','/media/articles/res_edge/kuzu_kirsch.png')"
onmouseout="this.setAttribute('src','/media/articles/res_edge/kuzu_sobel.png')"><br>
<p> The higher overall sensitivity of the detection also results in more noise being visible in the edge mask.
This can be remedied by denoising the image prior to the analysis.<br>
The increase in accuracy comes at an almost negligible cost in terms of computational complexity. About
175 fps
for 8-bit 1080p content (luma only) compared to 215 fps with the previously shown Sobel
<span title="You know why I'm putting this in quotes">‘implementation’</span>. The internal Sobel filter is
not used for this comparison as it also includes a high- and lowpass function as well as scaling options,
making it slower than the Sobel function above. Note that many of the edges are also detected by the Sobel
operator, however, these are very faint and only visible after an operation like std.Binarize.</p>
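As mentioned, denoising before the analysis keeps that noise out of the mask. A minimal sketch using only built-in filters, where luma is assumed to be a GRAY clip and kirsch the function defined above:
<pre><code class="python"># A simple 3x3 box blur is enough to suppress most grain before the edge detection;
# a proper denoiser would of course do a better job.
blurred = core.std.Convolution(luma, [1] * 9)
edges = kirsch(blurred)</code></pre>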
A more sophisticated way to generate an edge mask is the TCanny algorithm, which uses a similar procedure to find
edges but then reduces them to lines that are a single pixel wide. Ideally, these lines represent the middle
of each edge, and no edge is marked twice. It also applies a Gaussian blur to the image to eliminate noise
and other distortions that might incorrectly be recognized as edges. The following example was created with
TCanny using these settings: <code>core.tcanny.TCanny(op=1,
mode=0)</code>. op=1 uses a modified operator that has been shown to achieve better signal-to-noise ratios<span
class="source"><a href="http://www.jofcis.com/publishedpapers/2011_7_5_1516_1523.pdf">[src]</a></span>.<br>
<img src="/media/articles/res_edge/kuzu_tcanny.png"><br>
Since I've already touched upon bigger convolutions earlier without showing anything specific, here is an
example of the things that are possible with 5x5 convolutions.
<pre><code class="python"
>src.std.Convolution(matrix=[1, 2, 4, 2, 1,
2, -3, -6, -3, 2,
4, -6, 0, -6, 4,
2, -3, -6, -3, 2,
1, 2, 4, 2, 1], saturate=False)</code></pre>
<img src="/media/articles/res_edge/kuzu5x5.png">
This was an attempt to create an edge mask that draws around the edges. With a few modifications, this might
become useful for
halo removal or edge cleaning. (Although something similar (probably better) can be created with a regular edge
mask, std.Maximum, and std.Expr)
<a id="c_deband" href="#c_deband"><p class="subhead">Using edge masks</p></a>
<p>
Now that we've established the basics, let's look at real world applications. Since 8-bit video sources are
still everywhere, barely any encode can be done without debanding. As I've mentioned before, restoration
filters
can often induce new artifacts, and in the case of debanding, these artifacts are loss of detail and, for
stronger debanding, blur. An edge mask could be used to remedy these effects, essentially allowing the
debanding
filter to deband whatever it deems necessary and then restoring the edges and details via
std.MaskedMerge.</p>
<p>
GradFun3 internally generates a mask to do exactly this. f3kdb, the <span
title="I know that “the other filter” is misleading since GF3 is not a filter but a script, but if that's your only concern so far, I must be doing a pretty good job.">other popular debanding filter</span>,
does not have any integrated masking functionality.</p>
Consider this image:<br>
<img src="/media/articles/res_edge/aldnoah.png">
<p>
As you can see, there is quite a lot of banding in this image. Using a regular debanding filter to remove it
would likely also destroy a lot of small details, especially in the darker parts of the image.<br>
Using the Sobel operator to generate an edge mask yields this (admittedly rather disappointing) result:</p>
<img src="/media/articles/res_edge/aldnoah_sobel.png">
<p>
In order to better recognize edges in dark areas, the retinex
algorithm can be used for local contrast enhancement.</p>
<img src="/media/articles/res_edge/aldnoah_retinex.png">
<div style="font-size: 80%; text-align: right">The image after applying the retinex filter, luma only.</div>
<p>
We can now see a lot of information that was previously barely visible due to the low contrast. One might
think
that preserving this information is a vain effort, but with HDR-monitors slowly making their way into the
mainstream and more possible improvements down the line, this extra information might be visible on consumer
grade screens at some point. And since it doesn't waste a noticeable amount of bitrate, I see no harm in
keeping it.</p>
Using this newly gained knowledge, some testing, and a little bit of magic, we can create a surprisingly
accurate edge mask.
<pre><code class="python">def retinex_edgemask(luma, sigma=1):
ret = core.retinex.MSRCP(luma, sigma=[50, 200, 350], upper_thr=0.005)
return core.std.Expr([kirsch(luma), ret.tcanny.TCanny(mode=1, sigma=sigma).std.Minimum(
coordinates=[1, 0, 1, 0, 0, 1, 0, 1])], 'x y +')</code></pre>
<p>Using this code, our generated edge mask looks as follows:</p>
<img src="/media/articles/res_edge/aldnoah_kage.png">
<p>
By using std.Binarize (or a similar lowpass/highpass function) and a few std.Maximum and/or std.Inflate
calls, we can transform this edgemask into a
more usable detail mask for our debanding function or any other function that requires a precise edge mask.
</p>
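As a rough, untested example (assuming a 16-bit clip; the threshold is arbitrary and needs to be tuned per source):
<pre><code class="python">edgemask = retinex_edgemask(luma)
detail_mask = edgemask.std.Binarize(9500).std.Maximum().std.Inflate()</code></pre>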
<a id="c_performance" href="#c_performance"><p class="subhead">Performance</p></a>
Most edge mask algorithms are simple convolutions, allowing them to run at over 100 fps even for HD content. A
complex algorithm like retinex obviously cannot compete with that, as the benchmarks show.
While a simple edge mask with a Sobel kernel ran consistently above 200 fps, the function described above only
produces 25 frames per second. Most of that speed is lost to retinex, which, if executed alone, yields about
36.6 fps. A similar, <span title="and I mean a LOT more inaccurate">albeit less accurate</span>, way to
improve the detection of dark, low-contrast edges would be applying a simple curve to the brightness of the
image.
<pre><code class="python">bright = core.std.Expr(src, 'x 65535 / sqrt 65535 *')</code></pre>
This should (in theory) improve the detection of dark edges in dark images or regions by adjusting their
brightness
as shown in this curve:
<img src="/media/articles/res_edge/sqrtx.svg"
title="yes, I actually used matplotlib to generate my own image for sqrt(x) rather than taking one of the millions available online">
<a id="c_end" href="#c_end"><h2 class="subhead">Conclusion</h2></a>
Edge masks have been a powerful tool for image analysis for decades now. They can be used to reduce an image to
its most essential components and thus significantly facilitate many image analysis processes. They can also be
used to great effect in video processing to minimize unwanted by-products and artifacts of more aggressive
filtering. Using convolutions, one can create fast and accurate edge masks, which can be customized and adapted
to serve any specific purpose by changing the parameters of the kernel. The use of local contrast enhancement
to improve the detection accuracy of the algorithm was shown to be possible, albeit significantly slower.<br><br>
<pre><code class="python"># Quick overview of all scripts described in this article:
################################################################
import vapoursynth as vs
import mvsfunc as mvf  # used below for Depth() and GetPlane()

# Use retinex to greatly improve the accuracy of the edge detection in dark scenes.
# draft=True is a lot faster, albeit less accurate
def retinex_edgemask(src: vs.VideoNode, sigma=1, draft=False) -> vs.VideoNode:
core = vs.get_core()
src = mvf.Depth(src, 16)
luma = mvf.GetPlane(src, 0)
if draft:
ret = core.std.Expr(luma, 'x 65535 / sqrt 65535 *')
else:
ret = core.retinex.MSRCP(luma, sigma=[50, 200, 350], upper_thr=0.005)
mask = core.std.Expr([kirsch(luma), ret.tcanny.TCanny(mode=1, sigma=sigma).std.Minimum(
coordinates=[1, 0, 1, 0, 0, 1, 0, 1])], 'x y +')
return mask
# Kirsch edge detection. This uses 8 directions, so it's slower but better than Sobel (4 directions).
# more information: https://ddl.kageru.moe/konOJ.pdf
def kirsch(src: vs.VideoNode) -> vs.VideoNode:
core = vs.get_core()
w = [5]*3 + [-3]*5
weights = [w[-i:] + w[:-i] for i in range(4)]
c = [core.std.Convolution(src, (w[:4]+[0]+w[4:]), saturate=False) for w in weights]
return core.std.Expr(c, 'x y max z max a max')
# should behave similarly to std.Sobel() but faster since it has no additional high-/lowpass or gain.
# the internal filter is also a little brighter
def fast_sobel(src: vs.VideoNode) -> vs.VideoNode:
core = vs.get_core()
sx = src.std.Convolution([-1, -2, -1, 0, 0, 0, 1, 2, 1], saturate=False)
sy = src.std.Convolution([-1, 0, 1, -2, 0, 2, -1, 0, 1], saturate=False)
return core.std.Expr([sx, sy], 'x y max')
# a weird kind of edgemask that draws around the edges. probably needs more tweaking/testing
# maybe useful for edge cleaning?
def bloated_edgemask(src: vs.VideoNode) -> vs.VideoNode:
return src.std.Convolution(matrix=[1, 2, 4, 2, 1,
2, -3, -6, -3, 2,
4, -6, 0, -6, 4,
2, -3, -6, -3, 2,
1, 2, 4, 2, 1], saturate=False)</code></pre>
<div class="download_centered">
<span class="source">Some of the functions described here have been added to my script collection on Github<br></span><a href="https://gist.github.com/kageru/d71e44d9a83376d6b35a85122d427eb5">Download</a></div>
<br><br><br><br><br><br><span class="ninjatext">Mom, look! I found a way to burn billions of CPU cycles with my new placebo debanding script!</span>
</div>
</div>

62
expediency.html Normal file
View File

@ -0,0 +1,62 @@
<body>
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<p class="heading">Do What Is Interesting, Not What Is Expedient</p>
<p>
If you’re just here for the encoding stuff, this probably isn’t for you. But if you want to read the thoughts of a Linux proselyte (and proselytizer), read on.
</br>I’m sure this also applies to other things, but your mileage may vary.
</p><p></p><p>
Also, I wrote this on a whim, so don’t expect it to be as polished as the usual content.
</p>
<div class="content">
<h2 id="#c_introduction">Introduction</h2>
Once upon a time (TL note: about 4 months ago), there was a CS student whose life was so easy and uncomplicated he just had to change something. Well that’s half-true at best, but the truth is boring. Anyway, I had been thinking about Linux for a while, and one of my lectures finally gave me a good excuse to install it on my laptop. At that point, my only experience with Linux was running a debian server with minimal effort to have a working website, a TeamSpeak server, and a few smaller things, so I knew almost nothing.</p><p>
I decided to “just do it™” and installed <a href="https://manjaro.org/">Manjaro Linux</a> on my Laptop. They offer different pre-configured versions, and I just went with XFCE, one of the two officially supported editions. Someone had told me about Manjaro about a year earlier, so I figured it would be a good choice for beginners (which turned out to be true). I created a bootable flash drive, booted, installed it, rebooted, and that was it; it just worked. No annoying bloatware to remove, automatic driver installation, and sane defaults (apart from VLC being the default video player, but no system is perfect, I guess). I changed a few basic settings here and there and switched to a dark theme—something that Windows still doesn’t support—and the system looked nice and was usable. So nice and usable that I slowly started disliking the Windows 10 on my home PC. (You can already tell where this is going.)</p><p>
I wanted full control over my system, the beauty of configurability, a properly usable terminal (although I will add that there are Linux distros which you can use without touching a terminal even once), the convenience of a package manager—and the smug feeling of being a Linux user. You’ll understand this once you’ve tried; trust me.</p><p>
And so the inevitable happened, and I also installed Manjaro on my home PC (the KDE version this time because “performance doesn’t matter if you have enough CPU cores”—a decision that I would later revisit). I still kept Windows on another hard drive as a fallback, and it remains there to this day, although I only use it about once a week, maybe even less, when I want to play a Windows-only game.
<h2 id="#wabbit">Exploring the Rabbit Hole</h2>
No one, not even Lewis Carroll, can fully understand what a rabbit hole is unless they experienced being a bored university student who just got into Linux. Sure, my mother could probably install Ubuntu (or have me install it for her) and be happy with that. But I was—and still am—a young man full of curiosity. Options are meant to be used, and systems are meant to be broken. The way I would describe it is “Windows is broken until you fix it. Linux works until you break it. Both of these will happen eventually.”</p><p>
So, not being content with the stable systems I had, I wanted to try something new after only a few weeks. The more I looked at the endless possibilities, the more I just wanted to wipe the system and start over; this time with something better. I formatted the laptop and installed Manjaro i3 (I apparently wasn’t ready to switch to another distro entirely yet). My first time in the live system consisted of about 10 minutes of helpless, random keystrokes, before I shut it down again because I couldn’t even open a window (which also meant I couldn’t google). This is why you try things like that in live systems, kids. Back at my main PC, I read the manual of i3wm. How was I supposed to know that <span style="font-family: Hack, monospace">$super + return</span> opens a terminal?</p><p>
Not the best first impression, but I was fascinated by the concept of tiling window managers, so I decided to try again—this time, armed with the documentation and a second PC to google. Sure enough, knowing how to open <span style="font-family: Hack, monospace">dmenu</span> makes the system a lot more usable. I could even start programs now. i3 also made me realize how slow and sluggish KDE was, so I started growing dissatisfied with my home setup once again. It was working fine, but opening a terminal via hotkeys took about 200ms compared to the blazing 50ms on my laptop. Truly first-world problems.</p><p>
It should come as no surprise that I would soon install i3 on my home PC as well, and I’ve been using that ever since. <a href="/media/articles/res_exp/neofetch_miuna.png">Obligatory neofetch</a>.</p><p>
I also had a home server for encoding-related tasks which was still running Windows. While it is theoretically possible to connect to a Windows remote desktop host with a Linux client (and also easy to set up), it just felt wrong, so that had to change. Using Manjaro again would have been the easiest way, but that would have been silly on a <a href="https://en.wikipedia.org/wiki/Headless_computer">headless system</a>, <span title="Btw I use Arch Linux :^)">so I decided to install <a href="https://www.archlinux.org/">Arch Linux</a> instead.</span> It obviously wasn’t as simple as Manjaro (where you literally—and I mean <em>literally</em>—have to click a few times and wait for it to install), but I still managed to do it on my first attempt. And boy, is SSH convenient when you’re used to the bloat of Windows RDC.</p><p></p><p>
I would later make a rule for myself to not install the same distro on two systems, which leads us to the next chapter of my journey.
<h2 id="#descent">The Slow Descent Into Madness</h2>
Installing an operating system is an interesting process. It might take some time, and it can also be frustrating, but you definitely learn a lot (things like partitioning and file systems, various GNU/Utils, and just basic shell usage). Be it something simple like enabling SSH access or a bigger project like setting up an entire graphical environment—you learn something new every time.</p><p>
As briefly mentioned above, I wanted to force myself to try new things, so I simply limited each distro to a single device. This meant that my laptop, which was still running Manjaro, had to be reinstalled. I just loved the convenience of the AUR, so I decided to go with another arch-based distro: <a href="https://antergos.com/">antergos</a>. The installer has a few options to install a desktop environment for you, but it didn’t have i3, so I had to do that manually.</p><p>
With that done, I remembered that I still had a Raspberry Pi that I hadn’t used in years. That obviously had to be changed, especially since it gave me the option to try yet another distro. (And I would find a use case for the Pi eventually, or I at least told myself that I would.)</p><p>
I had read <a href="https://blog.quad.moe/moving-to-void-linux/">this</a> not too long ago, so I decided to give Void Linux a shot. This would be my first distro without systemd (don’t worry if you don’t know what that is).</p><p></p><p>
I could go on, but I think you get the gist of it. I did things because they seemed interesting, and I definitely learned a lot in the process. After the Void Pi, I installed Devuan on a second Pi (mind you, I already had Debian on a normal server, so that was off-limits).</p><p>
The real fun began a few days ago when I decided to build a tablet with a RasPi. That idea is nothing new, and plenty of people have done it before, but I wanted to go a little further. A Raspberry Pi tablet running Gentoo Linux. The entire project is a meme and thus destined to fail, but I’m too stubborn to give up (yet). At the time of writing, the Pi has been compiling various packages for the last 20 hours, and it’s still going.</p><p>
As objectively stupid as this idea is (Gentoo on a Pi without cross-compilation, or maybe just Gentoo in general), it did, once again, teach me a few things about computers. About compilers and USE flags, about dependency management, about the nonexistent performance of the Pi 3… you get the idea. I still don’t know if this will end up as a working system, but either way, it will have been an interesting experience.</p><p>
And that’s really what this is all about. Doing things you enjoy, learning something new, and being entertained.</p>
<p>
Update: It’s alive… more or less.
</p>
<!--span class="code">//TODO: update this when/if the Pi is done and usable</span></p><p-->
<h2 id="conclusion">Conclusion</h2>
<p>
So this was the journey of a former Windows user into the lands of free software.</p><p></p><p>
Was it necessary? No.</p><p>
Was it enjoyable? Mostly.</p><p>
Was it difficult? Only as difficult as I wanted it to be.</p><p>
Does Linux break sometimes? Only if I break it.</p><p>
Do I break it sometimes? Most certainly.</p><p>
Would I do it again? Definitely.</p><p>
Would I go back? Hell, no.</p><p>
Do I miss something about Windows? Probably the way it handles multi-monitor setups with different DPIs. I haven’t found a satisfying solution for UI scaling per monitor on Linux yet.</p><p>
</p><p>
I’m not saying everyone should switch to Linux. There are valid use cases for Windows, but some of the old reasons are simply not valid anymore. Many people probably think that Linux is a system for nerds—that it’s complicated, that you need to spend hours typing commands into a shell, that nothing works (which is still true for some distros, but you only use these if you know what you’re doing and if you have a good reason for it).</p><p>
In reality, Linux isn’t just Linux the way Windows is Windows. Different Linux distros can be nothing alike. The only commonality is the kernel, which you don’t even see as a normal user. Linux can be whatever you want it to be; as easy or as difficult as you like; as configurable or out-of-the-box as you need.</p>
<p>
If you have an old laptop or an unused hard drive, why don’t you just try it? You might learn a few interesting things in the process. Sometimes, being interesting beats being expedient. Sometimes, curiosity beats our desire for familiarity.</p>
</br></br></br>
<span class="ninjatext">Btw, I ended up not attending the lecture that made me embark upon this journey in the first place. FeelsLifeMan</span>
</div>
</div>
</body>

281
grain.html Normal file
View File

@ -0,0 +1,281 @@
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<p class="heading">Grain and Noise</p>
<div class="content">
<ul>
<li><a href="#c_introduction">Introduction</a></li>
<li><a href="#noisetypes">Different Types of noise and grain</a></li>
<li><a href="#remove">Different denoisers</a></li>
<li><a href="#c_encoding">Encoding grain</a></li>
</ul>
</div>
<p class="subhead"><a href="#c_introduction" id="c_introduction">Introduction</a></p>
<div class="content">
There are many types of noise or artifacts and even more denoising algorithms. In the following article, the
terms noise and grain will sometimes be used synonymously. Generally, noise is an unwanted artifact, while grain
is added deliberately to create a certain effect (flashbacks, film grain, etc.) or to prevent banding. Especially
the latter may not always be beneficial to your encode, as it increases entropy, which in turn increases the
bitrate without improving the perceived quality of the encode (apart from the gradients, but we'll get to that).<br>
Grain is not always bad and is sometimes even necessary to remove or prevent banding artifacts, but studios
tend to use dynamic grain, which requires a lot of bitrate. Since you are most likely encoding in 10 bit,
banding isn't as much of an issue, and static grain (e.g. f3kdb's grain) will do the job just as well.<br>
Some people might also like to denoise/degrain an anime because they prefer the cleaner look. Opinions may vary.
</div>
<p class="subhead"><a id="noisetypes" href="#noisetypes">Different types of noise and grain</a></p>
<div class="content">
This section will be dedicated to explaining different types of noise which you will encounter. This list is not
exhaustive, as studios tend to do weird and unpredictable things from time to time.<br>
<div>
<p class="subhead">1. Flashbacks</p>
<img class="img_expandable" src="/media/articles/res_dn\shinsekaiyori.jpg">
<div style="font-size: 80%; text-align: right">Image: Shinsekai Yori episode 12 BD</div>
Flashbacks are a common form of artistic film grain. The grain is used selectively to create a certain
atmosphere and should
not be removed. These scenes tend to require quite high bitrates, but the effect is intended, and even if
you
were
to try, removing the grain would be quite difficult and probably result in a very blurred image.<br>
Since this type of grain is much stronger than the underlying grain of many sources, it should not be
affected by
a normal denoiser, meaning you don't have to trim around these scenes if you're using a denoiser to remove
the general background noise from other scenes.
</div>
<div>
<p class="subhead">2. Constant film grain</p>
<img class="img_expandable" src="/media/articles/res_dn\no_cp.jpg">
<div style="font-size: 80%; text-align: right">Image: Corpse Party episode 1, encoded by Gebbi @Nanaone
</div>
Sometimes all scenes are overlaid with strong film grain similar to the previously explained flashbacks.
This type of source is rare, and the only way to remove it would be a brute-force denoiser like QTGMC. It is
possible to get rid of it; however, I would generally advise against it, as removing this type of grain tends
to change the mood of a given scene. Furthermore, using a denoiser of this calibre can easily destroy any
detail present, so you will have to carefully tweak the values.
</div>
<div>
<br>
<p class="subhead">3. Background grain</p>
This type is present in most modern anime, as it prevents banding around gradients and simulates detail by
adding random information to all surfaces. Some encoders like it. I don't. Luckily, this one can be removed
with relative ease, which will notably decrease the required bitrate. Different denoisers will be described
in a later paragraph.
</div>
<div>
<br>
<p class="subhead">4. TV grain</p>
<img class="img_expandable" src="/media/articles/res_dn\kekkai_crt.jpg"><br>
<div style="font-size: 80%; text-align: right">Image: Kekkai Sensen episode 1, encoded by me</div>
This type is mainly used to create a CRT-monitor or cameraesque look. It is usually accompanied by
scanlines and other distortion and should never be filtered. Once again, you can only throw more bits at the
scene.
</div>
<div>
<p class="subhead">5. Exceptional grain</p>
<img src="/media/articles/res/bladedance_grain_src.jpg" class="img_expandable">
<div style="font-size: 80%; text-align: right">Image: Seirei Tsukai no Blade Dance episode 3, BD</div>
Some time ago, a friend of mine had to encode the Blu-rays of Blade Dance and came across
this scene. It is about three minutes long, and the BD transport stream's bitrate peaks at more than
55 Mbit/s, making the Blu-ray non-compliant with the current Blu-ray standards (this means that some BD
players may simply refuse to play the file. Good job, Media Factory). <br>
As you can see in the image above, the source has
insanely strong grain in all channels (luma and chroma). FFF chose to brute-force through this scene by
simply letting x264 pick whatever bitrate it deemed appropriate, in this case about 150 Mbit/s. Another
(more bitrate-efficient) solution would be to cut the scene directly from the source stream without re-encoding.
Note that you can only cut streams on keyframes and will not be able to filter (or scale) the scene,
since you're not re-encoding it. An easier solution would be to use the --zones parameter to increase the CRF
during the scene in question. If a scene is this grainy, you can usually get away with higher CRF values.
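For reference, a zone is written as <code>start,end,setting=value</code>; a hypothetical call raising the CRF
for such a scene (the frame numbers are made up) might look like this:
<pre><code>x264 --crf 16 --zones 30000,34500,crf=20 -o output.mkv input.y4m</code></pre>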
</div>
</div>
<p class="subhead"><a id="remove" href="#remove">Comparing denoisers</a></p>
<div class="content">
So let's say your source has a constant dynamic grain that is present in all scenes, and you want to save
bitrate in your encode or get rid of the grain because you prefer clean and flat surfaces. Either way, what
you're looking for is a denoiser. A list of denoisers for your preferred frameserver can be found <a
href="http://avisynth.nl/index.php/External_filters#Denoisers">here
(Avisynth)</a> or <a href="http://www.vapoursynth.com/doc/pluginlist.html#denoising">here (Vapoursynth)</a>.
To compare the different filters, I will use two scenes – one with common "background grain" and one with
stronger grain. Here are the unfiltered images:<br>
<img class="native720 img_expandable" src="/media/articles/res_dn/mushishi_src22390.png"><br>
An image with "normal" grain – the type you would remove to save bitrate. Source: Mushishi Zoku Shou OVA
(Hihamukage) Frame 22390. Size: 821KB<br>Note the faint wood texture on the backpack. It's already quite blurry
in the source and can easily be destroyed by denoising or debanding improperly.<br>
<img class="native720 img_expandable" src="/media/articles/res_dn/erased_src5322.png"><br>
A grainy scene. Source: Boku Dake ga Inai Machi, Episode 1, Frame 5322. Size: 727KB<br>I am well aware that
this is what I
classified as "flashback grain" earlier, but for the sake of comparison let's just assume that you want to
degrain this type of scene.<br>
Furthermore, you should note that most denoisers will create banding which the removed grain was masking
(they're technically not creating the banding but merely making it visible). Because of this, you will usually
have to deband after denoising.
<p class="subhead">1. Fourier Transform based (dfttest, FFT3D)</p><br>
Being one of the older filters, dfttest has been in development since 2007. It is a very potent denoiser
with good detail retention, but it will slow your encode down quite a bit, especially when using Avisynth due to
its lack of multithreading.
The Vapoursynth filter is faster and should yield the same results. <br>FFT3DGPU is
hardware accelerated and uses a similar (but not identical) algorithm. It is significantly faster but
less precise in terms of detail retention and may blur some areas. Contra-sharpening can be used to
mitigate the latter. The filter is available for Avisynth and Vapoursynth without major differences.<br>
<img class="native720 img_expandable" src="/media/articles/res_dn/mushishi_dft0.5.png"><br>
sigma = 0.5; 489KB
<img class="native720 img_expandable" src="/media/articles/res_dn/erased_dft4.png"><br>
sigma = 4; 323KB
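<br>For reference, a minimal VapourSynth sketch of such a call — <code>src</code> is assumed to be the source
clip, and only sigma was changed between the two images above:<br>
<pre><code class="python">import vapoursynth as vs
core = vs.get_core()

# sigma controls the denoising strength
light = core.dfttest.DFTTest(src, sigma=0.5)  # settings for the lightly grained scene
strong = core.dfttest.DFTTest(src, sigma=4)   # settings for the grainy scene</code></pre>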
<p class="subhead">2. Non-local means based (KNLMeans, TNLMeans)</p><br>
The non-local means family consists of solid denoisers which are particularly appealing due to their
highly optimized GPU/OpenCL implementations, which allow them to be run without any significant speed
penalty.
Using the GPU also circumvents Avisynth's limitation to one thread, similar to FFT3DGPU.<br>Because of
this, there is no reason to use the "regular" (CPU) version unless your encoding rig does not have a GPU.
KNL can remove a lot of noise while still retaining quite a
lot of detail (although less than dft or BM3D). It might be a good option for older anime, which tend
to have a lot of grain (often added as part of the Blu-ray "remastering" process) but not many fine
details. When a fast (hardware accelerated) and strong denoiser is needed, I'd generally recommend using
KNL rather than FFT3D.<br>One thing to highlight is the Spatio-Temporal mode of this filter. By
default, neither the Avisynth nor the Vapoursynth version uses temporal reference frames for denoising. This can
be
changed in order to improve the quality by setting the d parameter to any value higher than zero.
If your material is in 4:4:4 subsampling, consider using "cmode = true" to enable denoising of the
chroma planes. By default, only luma is processed and the chroma planes are copied to the denoised
clip.<br>Both of these settings will negatively affect the filter's speed, but unless you're using a
really old GPU or multiple GPU-based filters, your encoding speed should be capped by the CPU rather than
the GPU. <a href="https://github.com/Khanattila/KNLMeansCL/blob/master/DOC.md">Benchmarks and documentation
here.</a><br>
<img class="native720 img_expandable" src="/media/articles/res_dn/mushishi_knl0.2cmode1.png"><br>
h = 0.2, a = 2, d = 3, cmode = 1; 551KB<br>
<span class="fakelink"
onclick="fullimage('/media/articles/res_dn/mushishi_knl0.2cmode0.png')">cmode = 0 for comparison; 733KB</span><br>
<img class="native720 img_expandable" src="/media/articles/res_dn/erased_knl0.5.png"><br>
h = 0.5, a = 2, d = 3, cmode = 1; 376KB
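<br>A rough VapourSynth equivalent of the settings used above — a sketch only, reusing the setup from the
previous snippet; note that parameter names differ between plugin versions (older builds use
<code>cmode</code>, newer ones use <code>channels</code>):<br>
<pre><code class="python"># h = strength, a = spatial search radius, d = temporal radius
den = core.knlm.KNLMeansCL(src, h=0.2, a=2, d=3)
# add cmode=1 (or channels='YUV' on newer plugin versions) to also process the chroma planes</code></pre>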
<p class="subhead">BM3D</p><br>
This one is very interesting, very slow, and only available for Vapoursynth. Avisynth would probably die
trying to run it, so don't expect a port anytime soon unless memory usage is optimized significantly.
It would technically work on a GPU, as the algorithm can be parallelized without any issues <a
href="https://books.google.com/books?id=xqfNBQAAQBAJ&pg=PA380&lpg=PA380&dq=bm3d+GPU+parallel&source=bl&ots=MS9-Kzi-8u&sig=fMcblGOrD-wCUrZzijmAdQF2Tj8&hl=en&sa=X&ei=wljeVI-LKcqAywPVyILgDQ&ved=0CDQQ6AEwBA#v=onepage&q=bm3d%20GPU%20parallel&f=false">[src]</a>,
however no such implementation exists for Avisynth or Vapoursynth. (If the book doesn't load for you,
try scrolling up and down a few times and it should fix itself)<br>
BM3D appears to have the best ratio of filesize and blur (and consequently detail loss) at the cost of
being the slowest CPU-based denoiser on this list. It is worth noting that this filter can be combined
with any other denoiser by using the "ref" parameter. From the <a
href="https://github.com/HomeOfVapourSynthEvolution/VapourSynth-BM3D">documentation</a>:
<div class="code">Employ custom denoising filter as basic estimate, refined with V-BM3D final estimate.
May compensate the shortages of both denoising filters: SMDegrain is effective at spatial-temporal
smoothing but can lead to blending and detail loss, V-BM3D preserves details well but is not very
effective for large noise pattern (such as heavy grain).
</div>
<img class="native720 img_expandable" src="/media/articles/res_dn/mushishi_bm3d_1511.png"><br>
radius1 = 1, sigma = [1.5,1,1]; 439KB<br>
<img class="native720 img_expandable" src="/media/articles/res_dn/erased_bm3d.png"><br>
radius1 = 1, sigma = [5,5,5]; 312KB<br>Note: This image does not use the aforementioned "ref" parameter
to improve grain removal, as this comparison aims to provide an overview of the different filters by
themselves, rather than the interactions and synergies between them.<br>
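Still, to illustrate the "ref" idea from the quoted documentation, a combined call could look roughly like
this — a sketch under the same assumptions as the snippets above, with a 16-bit clip <code>src16</code>:<br>
<pre><code class="python">import mvsfunc as mvf

# a faster denoiser provides the basic estimate, which BM3D then refines
ref = core.knlm.KNLMeansCL(src16, h=0.5, a=2, d=3)
den = mvf.BM3D(src16, sigma=[5, 5, 5], radius1=1, ref=ref)</code></pre>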
<p class="subhead">SMDegrain</p><br>
SMDegrain seems to be the go-to-solution for many encoders, as it does not generate much blur and the
effect seems to be weak enough to save some file size without notably altering the image.<br>The
substantially weaker denoising also causes less banding to appear, which is particularly appealing when
trying to preserve details without much consideration for bitrate.<br>
<img class="native720 img_expandable" src="/media/articles/res_dn/mushishi_smdegrain_luma.png"><br>
Even without contra-sharpening, SMDegrain seems to slightly alter/thin some of the edges. 751KB<br>
<img class="native720 img_expandable" src="/media/articles/res_dn/erased_smdg.png"><br>
In this image the "sharpening" is more notable. 649KB<br>
One thing to note is that SMDegrain can have a detrimental effect on the image when processing with
chroma. The Avisynth wiki describes it as follows:
<div class="code">Caution: plane=1-4 [chroma] can sometimes create chroma smearing. In such case I
recommend denoising chroma planes in the spatial domain.
</div>
In practice, this can destroy (or add) edges by blurring multiple frames into a single one. <span
class="fakelink"
onclick="fullimage('/media/articles/res_dn/mushishi_smdegrain.png')">Look at her hands</span><br>
<p><span style="font-weight: 600">Edit: </span>I recently had a discussion with another encoder who had strong
chroma artifacts (much worse than the lines on her hand), and the cause was SMDegrain. The solution can be
found
on <a href="https://exmendic.wordpress.com/encode-tipps/advanced/richtiges-anwenden-von-smdegrain/">his
blog</a>. Just ignore the german text and scroll down to the examples. All you have to do is split the
video in its individual planes and denoise each of them
like you would denoise a single luma plane. SMDegrain is used prior to scaling for the chroma planes, which
improves the performance. You would have to do the same in Vapoursynth do avoid the smearing, but
Vapoursynth has BM3D which does the same job better, so you don't have to worry about SMDegrain and its
bugs.</p>
<p class="subhead">Waifu2x</p><br>
Waifu2x is an image-upscaling and denoising algorithm using Deep Convolutional Neural Networks. Sounds
fancy but uses an awful lot of computing power. You can expect to get ≤1fps when denoising a 720p image
using waifu2x on a modern graphics card. Your options for denoising are noise level 1, 2 ,or 3, with
level 2 and 3 being useless because of their nonexistent detail retention. Noise level 1 can remove grain
fairly well, however the detail retention may vary strongly depending on the source, and due to its
limited options (none, that is) it can not be customized to fit different sources. Either you like the
results or you use another denoiser. It is also worth noting that this is the slowest algorithm one
could possibly use, and generally the results do not justify the processing time. <br>
There are other
proposals for Deep Learning based denoising algorithms, however most of these were never made available
to the public. <a
href="https://www.researchgate.net/publication/300688682_Deep_Gaussian_Conditional_Random_Field_Network_A_Model-based_Deep_Network_for_Discriminative_Denoising">[src]</a>
<br>
<img src="/media/articles/res_dn/mushishi_w2x.png"><br>
The more "anime-esque" parts of the image are denoised without any real issues, but the more realistic
textures (such as the straw) might be recognized as noise and treated accordingly.
<img src="/media/articles/res_dn/erased_w2x.png"><br>
<span class="bold">Edit: </span>Since this section was written there have been a few major updates to the
Waifu2x algorithm. The speed has been further optimized, and more settings for the noise removal feature have
been added. These features make it a viable alternative to some of the other denoisers on this list (at least
for certain sources), however it is still outclassed in terms of speed. The newly added upConv models are
significantly faster for upscaling and promise better results. In their current state, they should not be used
for denoising, as they are slower than the regular models and try to improve the image quality and sharpness
even without upscaling, which may cause aliasing and ringing.
<p class="subhead">Debanding</p><br>
Some of these may be quite impressive in terms of file size/compressibility, but they all share a common
problem: banding. In order to fix that, we will need to deband and apply grain to the gradients. This may seem
counterintuitive, as we have just spent a lot of processing time to remove the grain, but I'll get to that
later.
<br>
<img id="compare1" class="native720" src="/media/articles/res_dn/mushishi_f3k.png"
onmouseover="hover('compare1', '/media/articles/res_dn/mushishi_src22390.png');"
onmouseout="hover('compare1', '/media/articles/res_dn/mushishi_f3k.png');"><br>
The BM3D image after an f3kdb call with a simple <span class="fakelink"
onclick="fullimage('/media/articles/res_dn/f3k_mask.png');">mask</span>
to protect the wooden texture (I won't go into detail here, as debanding is a topic for another day). Hover over
the image to switch to the source frame.<br>
Source size: 821KB. Denoised and debanded frame: 767KB. This does not sound too impressive, as it is only a
decrease of ~7%, which (considering the processing time) really isn't that much; however, our new grain has a
considerable advantage: it is static. I won't delve too deep into inter-frame compression, but most people will
know that less motion = lower bitrate. While the source's grain takes up new bits with every new frame, our
grain only has to be stored once per scene.<br><br>
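As a sketch of what such a call could look like (not the exact script used here; <code>denoised</code> stands
for the output of whichever denoiser you picked, <code>detail_mask</code> for the mask linked above, and all
values are placeholders):<br>
<pre><code class="python"># deband the denoised clip and add static grain
# (dynamic_grain=False is the default, so the grain is identical on every frame)
deb = core.f3kdb.Deband(denoised, range=15, y=48, cb=48, cr=48,
                        grainy=32, grainc=32, dynamic_grain=False, output_depth=16)
final = core.std.MaskedMerge(deb, denoised, detail_mask)  # restore protected detail</code></pre>
<br>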
</div>
<div class="subhead" id="c_encoding"><a href="#c_encoding">Encoding grain</a></div>
<div class="content">
After you've decided what to do with your grain, you will have to encode it in a way that keeps the grain
structure as close as possible to your script's output. In order to achieve this, you may need to adjust a few
settings (a sample command line follows the list below).<br>
<ul>
<li>
aq-strength: This is probably the most important parameter for x264's grain handling. Higher values will
require more bitrate, but lower values may blur the grain which looks ugly and destroys f3k's/gradfun's
dithering, creating banding.<br>Recommended Values: 0.6-1.2. Extreme cases may require higher values.
Consider using the --zones parameter if necessary.
</li>
<li>
psy-rd: Grainy scenes may benefit from higher psy values (such as 0.8:0.25); however, setting this too
high may induce ringing or other artifacts around edges.
</li>
<li>
deblock: Higher values (>0) will blur the image while lower (<0) will sharpen it (simplified). Sane
range is probably
2≥x≥-3, and 0:-1 or -1:-1 should be a good starting point which you can decrease if your grain
needs it.
</li>
<li>
qcomp (quantizer curve compression): Higher values will result in a more constant quantizer while lower
values will force constant bitrate with higher quantizer fluctuations. High values (0.7-0.8) can help
with grain retention. If mbtree is enabled, this controls its strength which changes the effects, though
higher values are still better for grain.
</li>
</ul>
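To tie these settings together, a hypothetical starting point could look like the following (the values are
picked from the ranges above and the zone frame numbers are made up; adjust everything to your source):
<pre><code>vspipe --y4m script.vpy - | x264 --demuxer y4m \
    --crf 16 --aq-strength 0.9 --psy-rd 0.8:0.25 --deblock -1:-1 --qcomp 0.7 \
    --zones 30000,34500,crf=18 \
    -o output.mkv -</code></pre>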
<br><br><br><br><br><br><span style="color: #554840; font-size: 50%; text-align: left">I feel like I forgot something</span>
</div>
</div>

426
matana.html Normal file
View File

@ -0,0 +1,426 @@
<a href="/blog">
<div class="bottom_right_div"><img src="/static/2hu.png"></div>
</a>
<div class="overlay" id="overlay" onclick="removefull()" aria-hidden="true"></div>
<div class="overlay" id="videooverlay" aria-hidden="true"></div>
<div class="wrapper_article">
<p class="heading">Review: Matana-Studio - Yowamushi Pedal: Grande Road Folge 1</p>
<div class="content">
<p class="subhead">Einleitung und Technisches</p>
Der <span class="fakelink"
onclick="fullimage('/media/articles/res/matana/fsi.jpg')">Zufall</span> hat entschieden, dass wir in unserer ersten
Review <a target="_blank"
href="http://matanastudio.eu/?site=projekt&id=7">Matana-Studios Yowamushi Pedal: Grande Road</a>
behandeln
werden. Da wir noch keine tollen Insider haben, um neue Leser gleich wieder abzuschrecken, fangen wir einfach
mit
den Formalien an:</br></br>
Lokalisierung: Anredesuffixe vorhanden</br>
Versionen: 720p und 1080p, jeweils 8-bit mp4 Hardsub, Größe 501 MiB bzw 996 MiB</br>
Kapitel: nicht vorhanden</br>
Website: <a target="_blank" href="http://matanastudio.eu/">http://matanastudio.eu/</a></br>
Downloadmöglichkeiten: Nur One-Click-Hoster und Stream (Uploadet, MEGA, Openload)</br>
</br>Vollständige MediaInfo (ausklappbar):</br>
<div class="spoilerbox_expand_element">MediaInfo (1080p)
<p class="code">
Allgemein</br>
Format : MPEG-4</br>
Format-Profil : Base Media</br>
Codec-ID : isom (isom/avc1)</br>
Dateigrose : 996 MiB</br>
Dauer : 23min</br>
Gesamte Bitrate : 5 832 Kbps</br>
Kodierungs-Datum : UTC 2016-05-25 23:57:19</br>
Tagging-Datum : UTC 2016-05-25 23:57:19</br>
</br>
Video</br>
ID : 1</br>
Format : AVC</br>
Format/Info : Advanced Video Codec</br>
Format-Profil : High@L5.1</br>
Format-Einstellungen fur CABAC : Ja</br>
Format-Einstellungen fur ReFrames : 16 frames</br>
Codec-ID : avc1</br>
Codec-ID/Info : Advanced Video Coding</br>
Dauer : 23min</br>
Bitrate : 5 510 Kbps</br>
maximale Bitrate : 21,8 Mbps</br>
Breite : 1 920 Pixel</br>
Hohe : 1 080 Pixel</br>
Bildseitenverhaltnis : 16:9</br>
Modus der Bildwiederholungsrate : konstant</br>
Bildwiederholungsrate : 23,810 FPS</br>
ColorSpace : YUV</br>
ChromaSubsampling/String : 4:2:0</br>
BitDepth/String : 8 bits</br>
Scantyp : progressiv</br>
Bits/(Pixel*Frame) : 0.112</br>
Stream-Grose : 941 MiB (94%)</br>
verwendete Encoder-Bibliothek : x264 core 148 r2692 64f4e24</br>
Kodierungseinstellungen : cabac=1 / ref=16 / deblock=1:1:1 / analyse=0x3:0x133 / me=tesa / subme=11 /
psy=1
/ psy_rd=0.40:0.00 /</br>
mixed_ref=1 / me_range=24 / chroma_me=1 / trellis=2 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=0 /
chroma_qp_offset=-2 /</br>
threads=12 / lookahead_threads=3 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0
/
constrained_intra=0 /</br>
bframes=16 / b_pyramid=2 / b_adapt=2 / b_bias=0 / direct=3 / weightb=1 / open_gop=0 / weightp=2 /
keyint=238
/ keyint_min=23 /</br>
scenecut=40 / intra_refresh=0 / rc_lookahead=60 / rc=crf / mbtree=1 / crf=16.0 / qcomp=0.60 / qpmin=0 /
qpmax=69 / qpstep=4 /</br>
ip_ratio=1.40 / aq=1:0.60</br>
Kodierungs-Datum : UTC 2016-05-25 23:57:19</br>
Tagging-Datum : UTC 2016-05-26 00:01:03</br>
</br>
Audio</br>
ID : 2</br>
Format : MPEG Audio</br>
Format-Version : Version 1</br>
Format-Profil : Layer 3</br>
Format_Settings_Mode : Joint stereo</br>
Format_Settings_ModeExtension : MS Stereo</br>
Codec-ID : .mp3</br>
Dauer : 23min</br>
Bitraten-Modus : konstant</br>
Bitrate : 320 Kbps</br>
maximale Bitrate : 334 Kbps</br>
Kanale : 2 Kanale</br>
Samplingrate : 44,1 KHz</br>
Stream-Grose : 54,6 MiB (5%)</br>
verwendete Encoder-Bibliothek : LAME3.99r</br>
Kodierungseinstellungen : -m j -V 4 -q 2 -lowpass 20.5</br>
Kodierungs-Datum : UTC 2016-05-26 00:01:02</br>
Tagging-Datum : UTC 2016-05-26 00:01:03</br>
</p></div>
<div class="spoilerbox_expand_element">MediaInfo (720p)
<p class="code">
Allgemein</br>
Format : MPEG-4</br>
Format-Profil : Base Media</br>
Codec-ID : isom (isom/avc1)</br>
Dateigrose : 501 MiB</br>
Dauer : 23min</br>
Gesamte Bitrate : 2 932 Kbps</br>
Kodierungs-Datum : UTC 2016-04-17 16:33:28</br>
Tagging-Datum : UTC 2016-04-17 16:33:28</br>
</br>
Video</br>
ID : 1</br>
Format : AVC</br>
Format/Info : Advanced Video Codec</br>
Format-Profil : High@L4</br>
Format-Einstellungen fur CABAC : Ja</br>
Format-Einstellungen fur ReFrames : 8 frames</br>
Codec-ID : avc1</br>
Codec-ID/Info : Advanced Video Coding</br>
Dauer : 23min</br>
Bitrate : 2 800 Kbps</br>
maximale Bitrate : 10,7 Mbps</br>
Breite : 1 280 Pixel</br>
Hohe : 720 Pixel</br>
Bildseitenverhaltnis : 16:9</br>
Modus der Bildwiederholungsrate : konstant</br>
Bildwiederholungsrate : 23,976 (24000/1001) FPS</br>
ColorSpace : YUV</br>
ChromaSubsampling/String : 4:2:0</br>
BitDepth/String : 8 bits</br>
Scantyp : progressiv</br>
Bits/(Pixel*Frame) : 0.127</br>
Stream-Grose : 478 MiB (96%)</br>
verwendete Encoder-Bibliothek : x264 core 148 r2665 a01e339</br>
Kodierungseinstellungen : cabac=1 / ref=8 / deblock=1:0:0 / analyse=0x3:0x133 / me=umh / subme=9 / psy=1
/ psy_rd=1.00:0.00 /</br>
mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=2 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 /
chroma_qp_offset=-2 /</br>
threads=6 / lookahead_threads=1 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0
/ constrained_intra=0 /</br>
bframes=3 / b_pyramid=2 / b_adapt=2 / b_bias=0 / direct=3 / weightb=1 / open_gop=0 / weightp=2 /
keyint=240 / keyint_min=23 /</br>
scenecut=40 / intra_refresh=0 / rc_lookahead=60 / rc=crf / mbtree=1 / crf=18.0 / qcomp=0.60 / qpmin=0 /
qpmax=69 / qpstep=4 /</br>
ip_ratio=1.40 / aq=1:1.00</br>
Kodierungs-Datum : UTC 2016-04-17 16:33:28</br>
Tagging-Datum : UTC 2016-04-17 16:33:41</br>
</br>
Audio</br>
ID : 2</br>
Format : MPEG Audio</br>
Format-Version : Version 1</br>
Format-Profil : Layer 3</br>
Format_Settings_Mode : Joint stereo</br>
Format_Settings_ModeExtension : MS Stereo</br>
Codec-ID : .mp3</br>
Dauer : 23min</br>
Bitraten-Modus : konstant</br>
Bitrate : 128 Kbps</br>
maximale Bitrate : 134 Kbps</br>
Kanale : 2 Kanale</br>
Samplingrate : 44,1 KHz</br>
Stream-Grose : 21,9 MiB (4%)</br>
verwendete Encoder-Bibliothek : LAME3.99r</br>
Kodierungseinstellungen : -m j -V 4 -q 2 -lowpass 17 -b 128</br>
Kodierungs-Datum : UTC 2016-04-17 16:33:33</br>
Tagging-Datum : UTC 2016-04-17 16:33:41</br>
</p></div>
</br>
Notes on the encode: the frame rate of the 1080p version is wrong, and mp3 is used as the audio codec for both
versions, which results in audibly worse quality, especially during the opening and ending. The bitrate of
320 kbit/s is fairly generous, but a look at the
<span class="fakelink" onclick="fullimage('/media/articles/res/matana/not320k.jpg')">spectral analysis</span>
quickly shows that a lossily encoded source (presumably AAC 128k) was encoded yet again. </br>
Not much effort went into the settings either, as they are merely x264's “placebo” preset combined with the
“animation” tuning. Unfortunately, this tuning is barely usable for modern anime and therefore unsuitable in
this case. The 720p version only uses the “slower” preset without any tuning. From the MediaInfo we can further
tell that the two versions were created by two different people on two different computers.
<p class="subhead">
Encode
</p>
Since the source of the video material is stated neither in the file name nor on Matana's website, it could be
either a Blu-ray or a web source. TV is unlikely, as no TV logo is visible in any scene.</br>
The quality of the video doesn't allow any conclusions either, since the quality on display is achievable with
any source, but given the audio track that has clearly been encoded multiple times, I lean towards a web source
or a re-encoded video borrowed from another group. Our questions also led to <span
        class="fakelink"
        onclick="fullimage('/media/articles/res/matana/src.png')">no definitive answer</span>.
All of the following examples refer to the 1080p version of the video, as the lower resolutions offer no
advantages in this case. Unfortunately, the picture of the Full HD version is anything but outstanding:
<img class="img_expandable rounded" src="/media/articles/res/matana/encode1.jpg">
Blocks as far as the eye can see.
<img title="Ralf S. (35): „Ich bezahle einen Voodoopriester, um dem Encoder möglichst große Schmerzen zu bereiten“"
class="img_expandable rounded" src="/media/articles/res/matana/encode3.jpg">
Scheinbar hat der Encoder sein neuestes Minecraft Let’s Play mit ins Release geschmuggelt.
<img class="img_expandable rounded" src="/media/articles/res/matana/encode4.jpg">
Believe us, neither do we.
<p class="subhead">
Timing
</p>Without having counted, we can say with certainty that it's not just one or two lines that are slightly
off here. An impressive share of all lines either runs past the end of its scene or starts one frame too early.
<img class="img_expandable rounded" src="/media/articles/res/matana/time1.jpg">
Black glowing text on a white background hardly stands out, right?
<img class="img_expandable rounded" src="/media/articles/res/matana/timepls.jpg">
The irony of mistimed QC credits is not without a certain comedy.
<img class="img_expandable rounded" src="/media/articles/res/matana/disclaimer.jpg">
Really? At least it makes for a wonderful segue into the next section:
<p class="subhead">
Typeset
</p>
Unfortunately, there isn't much to write about here. Apart from the logo and the obligatory credits, there isn't
much to see. The signs are left entirely untranslated, and even the road markings didn't make the cut.
<img class="img_expandable rounded" src="/media/articles/res/matana/logo.jpg">
We'll give that color picker another try when we get the chance.
The whole thing in motion:
<table style="width: 100%; margin-top: 1em">
<td style="width: 66%;height: auto">
<video controls style="max-width: 100%">
<source src="/media/articles/res/matana/logo.mp4">
</video>
</td>
<td style="width: auto;height: auto; padding-left: 1em">Hier ist einiges schiefgelaufen:</br>
<ol>
<li>The glow is yellow, while the original's was green</li>
<li>At its brightest point the original logo is white, while the typeset turns grayish yellow</li>
<li>The typeset is a single flat color instead of having a gradient fill like the original</li>
<li>After the glow, it turns orange too quickly</li>
<li>At the end, the logo is faded out instead of merging with the light</li>
<li>The glow effect is missing while the logo flies in</li>
<li>The “Grande Road” is connected to the main logo by its border. If that's not intended,
    the border should be considerably thicker.
</li>
<li>The “Matana Studio” covers the original's spark effects</li>
</ol>
<div class="fakelink" onclick="fullvideo('/media/articles/res/matana/6.mp4')">Hier das Video in Zeitlupe, damit
man alles erkennen und nachvollziehen kann.
</div>
</td>
</table>
<img class="img_expandable halfwidth rounded" src="/media/articles/res/matana/type2.jpg">
<img class="img_expandable halfwidth rounded" src="/media/articles/res/matana/type3.jpg">
Anyone who expected anything will be disappointed here at the latest.</br>
<img class="img_expandable halfwidth rounded" src="/media/articles/res/matana/type6.jpg">
<img class="img_expandable halfwidth rounded" src="/media/articles/res/matana/type4.jpg">
Even with plenty of space available, the typesetter (who is credited by name) doesn't consider it necessary to
provide a translation.</br>
<img class="img_expandable halfwidth rounded" src="/media/articles/res/matana/placement.jpg">
<img class="img_expandable halfwidth rounded" src="/media/articles/res/matana/endcard.jpg">
Line breaks and placement are also part of the typesetter's job. At least in proper groups.</br>
<img class="img_expandable halfwidth rounded" src="/media/articles/res/matana/type8.png">
<img class="img_expandable halfwidth rounded" src="/media/articles/res/matana/type5.jpg"></br>
Alignment is apparently just a word, much like font choice and faithfulness to the original: if the original
uses an angular font, you can't simply go with serifs. At -7 diopters the two might look similar, but if the
typesetter is blind on top of the encoder, one should seriously wonder how they ever expect to see anything
clearly in this scene. Translating “Road” as “Folge” (episode), unfortunately, can't even be blamed on poor
eyesight, but maybe the super-to<a target="_blank" class="ninjalink"
href="/media/articles/res/matana/SUPERZAHERISHIGAKI.JPG">u</a>gh
Ishigaki knows more.</br></br>
As the saying goes: “If you do nothing, you make no mistakes.”
</br>Unfortunately, that doesn't apply here.
<p class="subhead">Dialog- und Karaokegestaltung</p>
Nichts Außergewöhnliches zu berichten (das ist was Gutes). Es gibt zwei verschiedene Styles, einen farbigen für
gesprochene Sprache
und einen mit Blur für Gedanken.</br>
<img class="img_expandable halfwidth rounded" src="/media/articles/res/matana/font_reg.jpg">
<img class="img_expandable halfwidth rounded" src="/media/articles/res/matana/font_alt.jpg"></br></br>
As for the karaoke, a video says more than a thousand words (careful, the video has sound):</br>
<div style="margin: auto; text-align: center">
<video controls style="width: 960px; margin: 1em 0">
<source src="/media/articles/res/matana/kara.mp4">
</video>
</div>
The audio was not re-encoded; it comes straight from the mp4.
<p class="heading">Subtitle Quality</p>
<table class="two_column_table">
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/375.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/8199.jpg"></td>
</tr>
<tr>
<td>This line pleased us only a little.</td>
<td>Capitalization powered by random.org. And since when do you address people by the name of their
    school? It's not the original script's fault, by the way; there he actually says “Hakogaku-san”
</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/680.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/774.jpg"></td>
</tr>
<tr>
<td colspan="2">Ja, diese zwei Sätze gehören wirklich zusammen. Und genau da liegt das Problem.
</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/2792.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/9021.jpg"></td>
</tr>
<tr>
<td>Commas and yet more commas.</td>
<td><a href="http://www.urbandictionary.com/define.php?term=mfw">mfw</a> human launch pad</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/9273.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/9338.jpg"></td>
</tr>
<tr>
<td>The Lives of <a href="http://www.duden.de/rechtschreibung/andere">Others</a> - VR edition. Now
    with smell.
</td>
<td>I secretly believe that something is definitely wrong there.</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/10200.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/10467.jpg"></td>
</tr>
<tr>
<td>Every sentence, needs a comma.</td>
<td>This comma alone, is more than enough.</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/12389.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/13609.jpg"></td>
</tr>
<tr>
<td><a href="https://youtu.be/fWpANSpqtEk?t=1m1s">Werft den Purchen zu Poden</a></td>
<td>Gib alle Is, die du hast.</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/14196.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/14899.jpg"></td>
</tr>
<tr>
<td>It can't be that yet another s is missing here.</td>
<td>Matana showed great perseverance in the use of practically extinct terms. “Ass-Sprinter” is
    just one of the linguistic bad habits, and even the chance to use a triple consonant goes
    unused here.
</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/15862.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/16284.jpg"></td>
</tr>
<tr>
<td><a href="https://de.wikipedia.org/wiki/Adjektiv#Komposita">Wie verwende ich Adjektive?</a></td>
<td>Was ist das wieder? Richtig, falsch. Und zwar richtig falsch.</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/17935.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/18138.jpg"></td>
</tr>
<tr>
<td>Finish-line-senpai, is that you?</td>
<td>The old weakling “capitalization” would also have fallen after this collision. Speaking of
    (grammatical) case, that one is a casualty too.
</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/19851.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/25224.jpg"></td>
</tr>
<tr>
<td>In 1933 someone else did the leading, but at least the linguistic genocide seems to be repeating
    itself here.
</td>
<td>If I allow Matana to keep subbing, it will be the end.</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/25811.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/28227.jpg"></td>
</tr>
<tr>
<td>A normal sentence by Matana standards. Who is leading whom where is of no concern here.</td>
<td>Your sub style, on the other hand, has nothing at all.</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/25951.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/11356.jpg"></td>
</tr>
<tr>
<td>A spell checker could have helped with this one.</td>
<td>Now lose the superfluous space and we have an error-free line.</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/33268.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/res/matana_skript/28439.jpg"></td>
</tr>
<tr>
<td>Are we talking about melons, corpses, or the German language here?</td>
<td>And I hope we are the only ones who ever get to see the script of this sub.</td>
</tr>
</table>
<img class="img_expandable rounded" src="/media/articles/res/matana_skript/28345.jpg">
Words fail me.</br>
<p class="heading">Fazit</p>
Man soll ja mit dem Guten anfangen, also legen wir los: Die Schriftart ist nicht völlig unlesbar. Da wir das nun
geklärt hätten, begeben wir uns zu den weniger erfreulichen Anteilen:</br>
Die Encodequalität ist grauenhaft und mit 8-bit 1080p Hardsubs sind die technischen Mittel genauso
zurückgeblieben, wie die Individuen, die diese Sprachvergewaltigung verbrochen haben. Der Ton wurde mindestens
so oft re-encoded, wie Erdogan sich mit Ziegen vergnügt hat. Der Type wäre wohl ganz gut, wenn es welchen gäbe
und das Timing ist mindestens so verschoben wie die Inbetriebnahme des Flughafens BER. Vielleicht ist der
Untertitel in irgendeiner Sprache fehlerfrei; diese muss jedoch noch erfunden werden. Ein Deutschlehrer hätte
das Skript wohl abgelehnt, weil es sich außerhalb seines Fachbereiches befindet.</br>
Eine Empfehlung erübrigt sich vermutlich und die Bewertung mittels Schulnoten ist mit moderner digitaler
Speichertechnik nicht möglich.</br></br>
<p class="bold"><a class="ninjalink" href="/media/articles/res/matana/matana_blur.png">Wir möchten den geneigten Leser an dieser
Stelle freundlichst darauf hinweisen, dass der Konsum
von Matana-Studio-Subs zu irreparablen und unabsehbaren psychosomatischen Schäden führen kann und wird.</a>
</p></br>
Um es mit den weniger freundlichen Worten eines anonymen österreichischen Landschaftsmalers auszudrücken:
<p style="width: 90%">„Hätten solche Leute diese Filme schon während des zweiten
Weltkriegs untertitelt, so wäre das Großdeutsche Reich
wohl Gefahr gelaufen, Japan als Verbündeten zu verlieren.“</p>
</br>
<a href="index.html"><p id="back">Zurück zum Index</p></a>
</div>
</div>

374
mgsreview.html Normal file
View File

@ -0,0 +1,374 @@
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<p class="heading">Review: Magical Girl Subs</p>
<div class="content">
<p class="subhead">Einleitung und Technisches</p>
<p>
In der zweiten Ausgabe unserer Reihe „Warum muss ich das tun, ich wollte nach dem ersten Artikel wieder
aufhören und die Szene für hoffnungslos verloren erklären“ widmen wir uns Magical Girl Subs und ihrem
Release zu ef - a tale of memories.. Der erste Punkt gehört übrigens zum Titel.</p>
<br>
Lokalisierung: Anredesuffixe vorhanden<br>
Versionen: 720p und 1080p, jeweils 8-bit mp4 Hardsub, Größe 244 MiB bzw. 840 MiB<br>
Kapitel: nicht vorhanden<br>
Website: <a href="https://www.magicalgirlsubs.de/">www.magicalgirlsubs.de</a><br>
Downloadmöglichkeiten: DDL und Torrent<br><br>
<div class="spoilerbox_expand_element">MediaInfo (ausklappbar)
<p class="code">
General<br>
Format : MPEG-4<br>
Format profile : Base Media<br>
Codec ID : isom<br>
File size : 840 MiB<br>
Duration : 23mn 48s<br>
Overall bit rate mode : Variable<br>
Overall bit rate : 4 937 Kbps<br>
Encoded date : UTC 2016-10-09 12:19:25<br>
Tagged date : UTC 2016-10-09 12:19:25<br>
<br>
Video<br>
ID : 1<br>
Format : AVC<br>
Format/Info : Advanced Video Codec<br>
Format profile : High@L4.0<br>
Format settings, CABAC : Yes<br>
Format settings, ReFrames : 4 frames<br>
Codec ID : avc1<br>
Codec ID/Info : Advanced Video Coding<br>
Duration : 23mn 47s<br>
Bit rate : 4 799 Kbps<br>
Maximum bit rate : 42.7 Mbps<br>
Width : 1 920 pixels<br>
Height : 1 080 pixels<br>
Display aspect ratio : 16:9<br>
Frame rate mode : Constant<br>
Frame rate : 23.976 fps<br>
Color space : YUV<br>
Chroma subsampling : 4:2:0<br>
Bit depth : 8 bits<br>
Scan type : Progressive<br>
Bits/(Pixel*Frame) : 0.097<br>
Stream size : 818 MiB (97%)<br>
Writing library : x264 core 148 r2638 7599210<br>
Encoding settings : cabac=1 / ref=3 / deblock=1:1:1 / analyse=0x3:0x133 / me=umh / subme=6 / psy=1 /
psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 /
deadzone=21,11 /
fast_pskip=1 / chroma_qp_offset=-4 / threads=6 / lookahead_threads=1 / sliced_threads=0 / nr=0 /
decimate=1
/ interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=2 / b_bias=0
/
direct=3 / weightb=1 / open_gop=0 / weightp=2 / keyint=240 / keyint_min=23 / scenecut=40 /
intra_refresh=0 /
rc_lookahead=40 / rc=2pass / mbtree=1 / bitrate=4799 / ratetol=1.0 / qcomp=0.60 / qpmin=10 / qpmax=51 /
qpstep=4 / cplxblur=20.0 / qblur=0.5 / ip_ratio=1.40 / aq=1:0.00<br>
Encoded date : UTC 2016-10-09 11:31:42<br>
Tagged date : UTC 2016-10-09 12:19:26<br>
<br>
Audio<br>
ID : 2<br>
Format : AAC<br>
Format/Info : Advanced Audio Codec<br>
Format profile : HE-AAC / LC<br>
Codec ID : 40<br>
Duration : 23mn 48s<br>
Bit rate mode : Variable<br>
Bit rate : 127 Kbps<br>
Maximum bit rate : 164 Kbps<br>
Channel(s) : 2 channels<br>
Channel positions : Front: L R<br>
Sampling rate : 48.0 KHz / 24.0 KHz<br>
Compression mode : Lossy<br>
Stream size : 21.7 MiB (3%)<br>
Encoded date : UTC 2016-08-10 17:45:33<br>
Tagged date : UTC 2016-10-09 12:19:26<br>
</p></div>
<br>
Apart from the bitrate, the 720p version is identical.
<p class="subhead">Encode</p>
The encode is probably best described by our team members' reactions to it:
<ul>
    <li>kageru: “This looks like they encoded without AQ.”</li>
    <li>fredferguson: “Notch would be proud.”</li>
    <li>Attila: “Let me quickly check whether it looks like that in mpv, too.”</li>
</ul>
<p>
    Let's tally that up: kageru was right, Attila was right, and fredferguson … well, we didn't ask Notch
    directly, but we can say with near certainty that he would probably have been right, too.<br>
    To be honest, this encode is one of the worst we have ever seen. kageru will briefly explain what went
    wrong here. The following explanations are fairly theoretical and may be boring for readers who are less
    interested in the technical side:</p>
<span style="color: #e17800">aq=1:0.00</span>: Why anyone would make this mistake is beyond me. The default
value is 1.0, and no normal preset or tuning in x264 simply switches AQ off. Only the metric-oriented tunings,
SSIM and PSNR, do that, and those are useless for normal videos. AQ, or adaptive quantization, is an encoder
feature that assigns a suitable quantizer value to each region of the image. In plain terms: AQ distributes the
bitrate in a way that makes sense, so everything looks as good as it needs to. Disabling it entirely produces
the worst artifacts I have seen in quite a while. Setting qpmin and qpmax to their ancient default values
unfortunately doesn't make it any better, either.<br>
If you hover over an image, you will see ANE's encode. Since I don't have the Melodies Blu-rays at hand to take
screenshots, I had to fall back on someone else's encode.<br>
<img class="img_expandable" id="comp1" style="width: 100%" src="/media/articles/reviews/mgs/mgs1.png"
onmouseover="hover('comp1', '/media/articles/reviews/mgs/ane1.png')"
onmouseout="hover('comp1', '/media/articles/reviews/mgs/mgs1.png')"><br>
<img class="img_expandable" id="comp2" style="width: 100%" src="/media/articles/reviews/mgs/mgs2.png"
onmouseover="hover('comp2', '/media/articles/reviews/mgs/ane2.png')"
onmouseout="hover('comp2', '/media/articles/reviews/mgs/mgs2.png')"><br>
<img class="img_expandable" id="comp3" style="width: 100%" src="/media/articles/reviews/mgs/mgs3.png"
onmouseover="hover('comp3', '/media/articles/reviews/mgs/ane3.png')"
onmouseout="hover('comp3', '/media/articles/reviews/mgs/mgs3.png')"><br>
<p>
    A few additional remarks: both versions are 8-bit hardsubs, which by no means meets modern
    standards. 8-bit hardsubbed video can be offered as an alternative download for hardware compatibility,
    but it should never be the only release. Hardsubs simply don't look as good as softsubs, and 8-bit video
    is considerably more prone to banding and other artifacts than its 10-bit counterpart.<br>
    On top of that, this is a 1080p release of an anime that was only animated in 720p. With that, MGS wastes
    not only the time and processing power of their encoder but also the bandwidth and disk space of their
    leechers. Details <a href="article.php?p=resolutions">here</a>. The quality of the file was dictated via the
    bitrate, which is generally worse for (perceived, not technically measured) video quality than using crf,
    the constant-quality mode. subme=6, trellis=1, and ref=3 save time while
    encoding the video but noticeably hurt compression efficiency. deblock=1:1:1 is rather unsuitable for anime,
    since positive deblock values blur the image. I would recommend 1:0:-1 or 1:-1:-1
    instead.<br>
    Whether dct-decimate and fast-p-skip make sense is debatable, so I won't go into them any
    further.</p>
The audio has problems as well, since HE-AAC was chosen here. HE-AAC is a variant of the AAC codec that is
designed specifically
for efficiency at low bitrates. It uses a number of optimizations to achieve at least acceptable results with
minimal bandwidth. Because of these peculiarities, however, the HE profile is not
recommended for files above 80 kbit/s total or 40 kbit/s per channel, since the
low-complexity (LC) profile works better in that range.<br><br>
I should probably get in touch with MGS' encoder personally to clarify (and explain) some of these things.
Many of these mistakes are easy to fix once you know where the problem lies.
<p class="subhead">Typeset</p>
There is generally not much to typeset in ef, since many of the on-screen texts are already in German, so let's
look at the little that Shaft left for the fansubbers.
<img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0020.jpg"/><br>
Wrong color, completely wrong font, and the timing is off as well. No idea why fansubbers have to
immortalize themselves in their releases, but if it must be done, then please do it properly.<br>
<img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0021.jpg"/><br>
Wrong font again. It probably would have been fine if you had lowered \fscy a little, but apparently even
that was too much to ask. Also: >eMule, lel<br>
<img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0018.jpg"/><br>
<img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0019.jpg"/><br>
Wrong color, mediocre font (and that is generous, because our standards aren't that high either).
No idea why fansubbers have to immortalize themselves in their releases … wait, we have been over that.
<p class="subhead">Styling</p>
The font is a decent choice. It is legible and, thanks to its color, easy to make out on any background.
Since this is a hardsub, however, the subtitles look rather blurry, especially on high-resolution
monitors.<br>
There is a karaoke for the opening, but it is nothing special. The letters overlap because the spacing was
chosen too small, and while the letters fly in, the end of the line is already displayed before the
letters have reached it.<br>
<img class="img_expandable rounded" src="/media/articles/reviews/mgs/kara.png"/><br>
“We won't fall apart” while flying in. <!--The “t” has been dabbling too much in quantum effects and
uncertainty. In the latter almost as much as the regular hardsubs. <== My team doesn't get my humor and
wanted to cut this line. Lesser beings.-->
<p class="subhead">Subtitle Quality</p>
<p>
    Honestly, after the encode we were somewhat biased and didn't really expect much
    anymore.
    Fortunately, MGS' editor understands more about his craft than the encoder in charge. “More than the
    encoder” is, in this case, not much of an achievement. The script is almost free of grammatical and
    spelling errors, but the wording leaves a lot to be desired.</p>
<p>
    The translation was based on the English sub by Mendoi-Conclave (or one of its BD retimes), which
    becomes
    <span class="bold">very</span> apparent in many places in the script.<br>
    Many lines and phrasings still clearly betray their source language, which can be quite
    jarring at times.
</p>
<table class="two_column_table">
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/comp1.PNG"/></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/comp2.PNG"/></td>
</tr>
<tr>
<td>This could hardly have been phrased more awkwardly. We understand that you didn't want to risk
    translation errors (with moderate success, as we will see later), but this is simply too
    literal.
</td>
<td>
    “People who are involved with art” – there used to be a word for that; something
    short that would have made this line much more pleasant. <a
        href="http://www.duden.de/rechtschreibung/Kuenstler">Hm …</a>
</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/comp4.PNG"/></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/comp3.PNG"/></td>
</tr>
<tr>
<td>Very poetic. Unfortunately, the meaning of the line was sacrificed in favor of this flowery wording.
</td>
<td>And here we see that there <span class="bold">should</span> be clear differences between German and
    English sentence structure.<br>
    Even the <a href="https://en.wikipedia.org/wiki/Comma_splice">comma splice</a> was carried over from
    the English. In the original it was a mistake, whereas in German it is permitted. Unfortunately, that
    doesn't make it any better.
</td>
</tr>
<tr>
<td colspan="2"><img class="img_expandable rounded" src="/media/articles/reviews/mgs/googletrans2.png"/></td>
</tr>
<tr>
<td colspan="2"><span class="bold">This</span> certainly doesn't either.</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/comp6.PNG"/></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/comp5.PNG"/></td>
</tr>
<tr>
<td>First of all, it is spelled Fermat<span style="color:red">e</span>, and secondly, that was a pun which
    you not only expertly ignored but also translated completely wrong. Kuze is by no means
    talking to the fermata here; he merely uses the word metaphorically to describe his current
    situation. Not everyone has to be familiar with the term, but the editor could
    at least have looked it up.
</td>
<td>Something something overly literal translations … just fill in the rest yourselves.</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0001.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/googletrans.png"></td>
</tr>
<tr>
<td>
    It [the city] isn't wearing anything. Cities simply don't do that sort of thing.
</td>
<td>Is anyone still surprised?</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0002.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0005.jpg"></td>
</tr>
<tr>
<td>THE MISFORTUNE (thunder rumbling in the background, lightning flashing through the night, something dramatic
    happening).<br>But seriously: “das Unglück”? The one and only misfortune, complete with definite article?
</td>
<td>The ellipsis needs a space in front of it.</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0003.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0004.jpg"></td>
</tr>
<tr>
<td><span style="color: red">E!</span> FermatEEEEE. This mistake runs through the entire episode.</td>
<td><a href="http://www.duden.de/rechtschreibung/Fine">A Fine</a>. Music was not the editor's strong
    suit. Neither, apparently, was looking things up.
</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0006.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0009.jpg"></td>
</tr>
<tr>
<td>This could be much shorter: “Nur weil ein Mädchen neben dir sitzt/du neben einem Mädchen sitzt, […]”
</td>
<td>A sketch cannot be a procedure. Producing one, perhaps, but the sketch itself most
    definitely not.
</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0010.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0011.jpg"></td>
</tr>
<tr>
<td>Do we need a screenshot as proof, or will you believe us that this is the English line copied 1:1?</td>
<td>One can only hope that this line was meant as a joke that nobody here gets.</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/merge1.PNG"></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/merge2.PNG"></td>
</tr>
<tr>
<td>Hard to believe, but these two lines really do follow each other directly. So what is she, then?
    Cute and the complete opposite of … everything? Just the complete opposite, I suppose.
</td>
<td>We … I … don't know where to begin. Nobody would ever say “Niemals im Leben”, and
    irritability
    is a matter of temperament and character, not of attitude.
</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0022.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0023.jpg"></td>
</tr>
<tr>
<td>The breakfast will be delighted, I'm sure.</td>
<td>The Duden recommends writing this as <a
        href="http://www.duden.de/rechtschreibung/gut_aussehend">two words</a></td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0026.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0027.jpg"></td>
</tr>
<tr>
<td>Here, too, the Duden recommends <a href="http://www.duden.de/rechtschreibung/zu_Hause">writing it as two words</a>
</td>
<td>This is the first time that I have seen that anyone uses the “dass” that is spelled with a double s,
    and that one otherwise sees rather rarely, in such numbers that it becomes exhausting to
    read it.
</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0029.jpg"></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/merge3.PNG"></td>
</tr>
<tr>
<td>Sentences like this generally read more pleasantly<br>if you start a new line before the subordinate
    clause begins.
</td>
<td>My thesaurus is not only disappointing, it also disappoints me.</td>
</tr>
<tr>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/merge4.PNG"></td>
<td><img class="img_expandable rounded" src="/media/articles/reviews/mgs/mpv-shot0036.jpg"></td>
</tr>
<tr>
<td>These were presumably two sentences at some point, And none of the QCs seemed to mind.</td>
<td>It is no joy to do anything in a place where your fansubs exist.</td>
</tr>
<tr>
<td colspan="2"><img class="img_expandable rounded" src="/media/articles/reviews/mgs/tlerror.png"></td>
</tr>
<tr>
<td colspan="2">Unfortunately, that is absolutely not what she says. And no, this is not a case of
    correcting a translation error made by the English group. <br>
    むしろ僕は一人で店に入て普通に買い物ができる方が不思議だ literally means:<br>
    “I rather find it peculiar that there is a way to go into a shop alone and shop like a normal
    person.”<br>As I said, literally. You should never phrase a line like that in a fansub,
    but the meaning is clearly recognizable despite the terrible wording.
</td>
</tr>
</table>
<p class="subhead">Conclusion</p>
So let's sum up:<br>
Encode: streaming-site level. Dark scenes practically consist of blocks, and even bright scenes frequently break
into blocks during faster motion. 240 MiB is a fairly economical file size, but that does not justify this
quality.<br>
Typesetting: present, but often only mediocre or worse. Much the same can be said about the
karaoke.<br>
Subtitles: largely free of spelling errors, but in terms of wording only barely above Google Translate. Since
the sentence structure was often copied verbatim, you might as well watch the English sub, because MGS
added no value of their own. Additional points deducted for hardsubs in both versions.<br><br>
Verdict: watchable with suitably low expectations, but not something we can recommend. For now, you are better
off falling back on the by now seven-year-old sub by <a
        href="http://kampfkuchen.de/projects/ef-a-tale-of-melodies">Gruppe Kampfkuchen</a>. They
    used a TV broadcast as their source video back then, but thanks to the improper handling of the video on the
    part of MGS' encoder, you are still better served with it.
<span class="ninjatext">Hopefully this answers the question of why we are “subbing ef even though MGS has already done it”.</span>
</div>
</div>

475
removegrain.html Normal file
View File

@ -0,0 +1,475 @@
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<p class="heading">Actually Explaining RemoveGrain</p>
<div class="content">
<div class="subhead">Table of contents</div>
<ul>
<li><a href="#c_intro">Introduction</a></li>
<li><a href="#c_glossary">Glossary</a></li>
<li><a href="#c_m59">Modes 5-9</a></li>
<li><a href="#c_m13">Modes 13-16</a></li>
<li><a href="#c_m17">Mode 17</a></li>
<li><a href="#c_m21">Mode 21 and 22</a></li>
<li><a href="#c_m23">Mode 23 and 24</a></li>
</ul>
<br><br>
<style scoped>
/* This lunacy is sponsored by stux@dot-heaven.moe */
convolution {
display: flex;
flex-direction: column;
font-size: 300pt;
width: 1em;
height: 1em;
}
convolution > * > * {
flex-basis: 33%;
flex-grow: 1;
flex-shrink: 1;
border: 1px #7f7f7f solid;
font-size: 50pt;
display: flex;
align-content: space-around;
}
convolution > * > * > span {
margin: auto;
}
convolution > * {
display: flex;
flex-direction: row;
height: 100%;
}
convolution > * > black {
background-color: #000000;
}
convolution > * > white {
background-color: #ffffff;
}
convolution > * > transparent {
background-color: transparent;
}
convolution > * > lightgrey {
background-color: #cccccc;
}
convolution > * > darkgrey {
background-color: #333333;
}
convolution > * > *[data-accent="1"] {
border-color: #e17800;
}
convolution > * > *[data-accent="2"] {
border-color: #6c3e00;
}
</style>
<div class="subhead" id="c_intro">Introduction</div>
<p>For over a decade, RemoveGrain has been one of the most used filters for all kinds of video processing. It is
used in SMDegrain, xaa, FineDehalo, HQDeringMod, YAHR, QTGMC, Srestore, AnimeIVTC, Stab, SPresso, Temporal
Degrain, MC Spuds, LSFmod, and many, <span class="bold">many</span> more. The extent of this enumeration may
seem ridiculous or even superfluous, but I am trying to make a point here. RemoveGrain (or more recently
RGTools) is everywhere.</p>
<p>
But despite its apparent omnipresence, many encoders – especially those who do not have years of encoding
experience – don't actually know what most of its modes do. You could blindly assume that they're all used
to, well, remove grain, but you would be wrong.</p>
<p>
After thinking about this, I realized that I, too, did not fully understand RemoveGrain. There is barely a
script that doesn't use it, and yet here we are, a generation of new encoders, using the legacy of our
ancestors, not understanding the code that our CPUs have executed literally billions of times.<br>But who
could
blame us? RemoveGrain, like many Avisynth plugins, has ‘suffered’ optimization to the point of complete
obfuscation.<br>You can try to read the <a title="You'd have to be an utter lunatic to click this."
href="https://gist.github.com/chikuzen/3c4d0bef250c6aab212f">code</a>
if you want to, but trust me, you don't.</p>
<p title="I swear to god I wasn't drunk when I wrote this">
Fortunately, in October of 2013, a brave adventurer by the name of tp7 set out upon a quest which was thitherto
believed impossible. They reverse engineered the <a
href="http://www.vapoursynth.com/2012/10/open-binary-introducing-a-practical-alternative-to-open-source/">open
binary</a> that RemoveGrain had become and created <a href="https://github.com/tp7/RgTools">RGTools</a>, a
much
more readable rewrite that would henceforth be used in RemoveGrain's stead.
</p>
<p>
Despite this, there is still no complete (and understandable) documentation of the different modes. Neither
the
<a href="http://avisynth.nl/index.php/RgTools/RemoveGrain">Avisynth wiki</a> nor <a
href="https://github.com/tp7/RgTools/wiki/RemoveGrain">tp7's own documentation</a> nor any of <a
href="http://web.archive.org/web/20130615165406/http://doom10.org/index.php?topic=2185.0">the</a> <a
href="http://www.aquilinestudios.org/avsfilters/spatial.html#removegrain">other</a> <a
href="http://videoprocessing.fr.yuku.com/topic/9/RemoveGrain-10-prerelease">guides</a> manage to
accurately
describe all modes. In this article, I will try to explain all modes which I consider to be insufficiently
documented. Some self-explanatory modes will be omitted.<br>
If you feel comfortable reading C++ code, I would recommend reading the code yourself.
It's not very long and quite easy to understand: <a
href="https://github.com/tp7/RgTools/blob/master/RgTools/rg_functions_c.h">tp7's rewrite on Github</a>
</p>
<div class="subhead" id="c_glossary">Glossary</div>
<dl>
<dt>Clipping</dt>
<dd>
A clipping operation takes three arguments: a value, an upper limit, and a lower limit. If the
value is below the lower limit, it is set to that limit; if it exceeds the upper limit, it is set
to that limit; and if it is between the two, it remains unchanged.
</dd>
<dt>Convolution</dt>
<dd>
The weighted average of a pixel and all pixels in its neighbourhood. RemoveGrain exclusively uses 3x3
convolutions, meaning for each pixel, the 8 surrounding pixels are used to calculate the
convolution.<br>
Modes 11, 12, 19, and 20 are just convolutions with different matrices.
</dd>
</dl>
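Since almost every mode below builds on this clipping operation, here is a tiny Python sketch of it (the
function name is made up for this article and is not part of any actual plugin API):
<div class="code vapoursynth">
    def clip_value(value, lower, upper):<br>
    &nbsp;&nbsp;&nbsp;&nbsp;# limit value to the range [lower, upper]<br>
    &nbsp;&nbsp;&nbsp;&nbsp;return max(lower, min(value, upper))<br>
</div>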
To illustrate some of these modes, images of 3x3 pixel neighbourhoods will be used. The borders between the
pixels were added to improve visibility and have no deeper meaning.<br><br>
<convolution class="accent1">
<row>
<white></white>
<white></white>
<white></white>
</row>
<row>
<white></white>
<black></black>
<white></white>
</row>
<row>
<white></white>
<white></white>
<white></white>
</row>
</convolution>
<div class="subhead" id="c_m59">Modes 5-9</div>
<h3>Mode 5</h3>
The documentation describes this mode as follows: “Line-sensitive clipping giving the minimal change.”<br>
This is easier to explain with an example:<br><br>
<table style="width: 80%; margin: 0 auto">
<tr>
<td>
<convolution class="accent1">
<row>
<white></white>
<white></white>
<black></black>
</row>
<row>
<white></white>
<darkgrey></darkgrey>
<white></white>
</row>
<row>
<black></black>
<white></white>
<white></white>
</row>
</convolution>
</td>
<td>
<convolution class="accent1">
<row>
<white></white>
<white></white>
<black></black>
</row>
<row>
<white></white>
<black></black>
<white></white>
</row>
<row>
<black></black>
<white></white>
<white></white>
</row>
</convolution>
</td>
</tr>
</table>
<p style="text-align: center">Left: unprocessed clip. Right: clip after RemoveGrain mode 5</p>
<p> Mode 5 tries to find a line within the 3x3 neighbourhood of each pixel by comparing the center pixel with
two
opposing pixels. This process is repeated for all four pairs of opposing pixels, and the center pixel is
clipped to their respective values. After computing all four, the filter finds the pair which resulted in
the smallest change to
the center pixel and applies that pair's clipping. In our example, this would
mean clipping the center pixel to the top-right and bottom-left pixels' values, since clipping it to any of
the other pairs would make it white, significantly changing its value.</p>
To visualize the aforementioned pairs, they are
labeled with the same number in this image.<br><br>
<convolution class="accent1">
<row>
<white><span>1</span></white>
<white><span>2</span></white>
<white><span>3</span></white>
</row>
<row>
<white><span>4</span></white>
<black></black>
<white><span>4</span></white>
</row>
<row>
<white><span>3</span></white>
<white><span>2</span></white>
<white><span>1</span></white>
</row>
</convolution>
<br> Due to this, a line like this could not be found, and the center pixel would remain unchanged:<br><br>
<convolution class="accent1">
<row>
<white><span>1</span></white>
<black><span>2</span></black>
<white><span>3</span></white>
</row>
<row>
<white><span>4</span></white>
<darkgrey></darkgrey>
<white><span>4</span></white>
</row>
<row>
<black><span>3</span></black>
<white><span>2</span></white>
<white><span>1</span></white>
</row>
</convolution>
<h3>Mode 6</h3>
<p>
This mode is similar to mode 5 in that it clips the center pixel's value to opposing pairs of all pixels in
its
neighbourhood. The difference is the selection of the used pair. Unlike with mode 5, mode 6 considers the
range
of the clipping operation (i.e. the difference between the two pixels) as well as the change applied to the
center pixel. The exact math looks like this where p1 is the first of the two opposing pixels, p2 is the
second,
c_orig is the original center pixel, and c_processed is the center pixel after applying the clipping.</p>
<div class="code">
diff = abs(c_orig - c_processed) * 2 + abs(p1 - p2)
</div>
<p>
This means that a clipping pair is favored if it only slightly changes the center pixel and there is only
little difference between the two pixels. The change applied to the center pixel is prioritized (ratio 2:1)
in this mode. The
pair with the lowest diff is used.</p>
<h3>Mode 7</h3>
Mode 7 is very similar to mode 6. The only difference lies in the weighting of the values.
<div class="code">
diff = abs(c_orig - c_processed) + abs(p1 - p2)
</div>
Unlike before, the difference between the original and the processed center pixel is not multiplied by two. The
rest of the code is identical.
<h3>Mode 8</h3>
Again, not much of a difference here. This is essentially the opposite of mode 6.
<div class="code">
diff = abs(c_orig - c_processed) + abs(p1 - p2) * 2
</div>
The difference between p1 and p2 is prioritized over the change applied to the center pixel; again with a 2:1
ratio.
<h3>Mode 9</h3>
In this mode, only the difference between p1 and p2 is considered. The center pixel is not part of the equation.<br>
<p> Everything else remains unchanged. This can be useful to fix interrupted lines, as long as the length of the
gap never exceeds one pixel.</p>
<table style="width: 80%; margin: 0 auto">
<tr>
<td>
<convolution class="accent1">
<row>
<darkgrey><span>1</span></darkgrey>
<darkgrey><span>2</span></darkgrey>
<black><span>3</span></black>
</row>
<row>
<darkgrey><span>4</span></darkgrey>
<lightgrey></lightgrey>
<white><span>4</span></white>
</row>
<row>
<black><span>3</span></black>
<white><span>2</span></white>
<white><span>1</span></white>
</row>
</convolution>
</td>
<td>
<convolution class="accent1">
<row>
<darkgrey></darkgrey>
<darkgrey></darkgrey>
<black></black>
</row>
<row>
<darkgrey></darkgrey>
<black></black>
<white></white>
</row>
<row>
<black></black>
<white></white>
<white></white>
</row>
</convolution>
</td>
</tr>
</table>
<br>
The center pixel is clipped to pair 3 which has the lowest range (zero, both pixels are black).<br>
Should the calculated difference for 2 pairs be the same, the pairs with higher numbers (the numbers in the
image, not their values) are prioritized. This applies to modes 5-9.
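Putting modes 5 through 9 next to each other, the pair selection they share can be sketched in a few lines of
Python. This is only an illustration with invented names (the actual RGTools code is heavily optimized C++ and
also handles frame borders), but the logic follows the descriptions above:
<div class="code vapoursynth">
    def process_pixel(center, pairs, mode):<br>
    &nbsp;&nbsp;&nbsp;&nbsp;# pairs: the four (p1, p2) tuples of opposing neighbours, pair 1 first<br>
    &nbsp;&nbsp;&nbsp;&nbsp;best = None<br>
    &nbsp;&nbsp;&nbsp;&nbsp;for p1, p2 in pairs:<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;clipped = max(min(p1, p2), min(center, max(p1, p2)))<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;change = abs(center - clipped)  # how much the center pixel would move<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;spread = abs(p1 - p2)           # range of the clipping pair<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;diff = {5: change, 6: 2 * change + spread, 7: change + spread,<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;8: change + 2 * spread, 9: spread}[mode]<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if best is None or best[0] >= diff:  # on a tie, the later pair wins<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;best = (diff, clipped)<br>
    &nbsp;&nbsp;&nbsp;&nbsp;return best[1]<br>
</div>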
<div class="subhead">Mode 11 and 12</div>
Mode 11 and 12 are essentially this:
<div class="code">
std.Convolution(matrix=[1, 2, 1, 2, 4, 2, 1, 2, 1])
</div>
There is no difference between mode 11 and mode 12. This becomes evident by looking at the code which was
literally copy-pasted from one function to the other. It should come as no surprise that my tests confirmed
this:
<div class="code vapoursynth">
>>> d = core.std.Expr([clip.rgvs.RemoveGrain(mode=11),clip.rgvs.RemoveGrain(mode=12)], 'x y - abs')<br>
>>> d = d.std.PlaneStats()<br>
>>> d.get_frame(0).props.PlaneStatsAverage<br>
<br>
0.0<br>
</div>
There is, however, a slight difference between using these modes and std.Convolution() with the corresponding
matrix:
<div class="code vapoursynth">
>>> d = core.std.Expr([clip.rgvs.RemoveGrain(mode=12),clip.std.Convolution(matrix=[1, 2, 1, 2, 4, 2, 1, 2,
1])], 'x y - abs')<br>
>>> d = d.std.PlaneStats()<br>
>>> d.get_frame(0).props.PlaneStatsAverage<br>
0.05683494908489186<br>
</div>
<p>
This is explained by the different handling/interpolation of edge pixels, as can be seen in this comparison.<br>
<em><b>Edit</b>: This may also be caused by a bug in the PlaneStats code which was fixed in R36. Since 0.05 is way
too high for such a small difference, the PlaneStats bug is likely the reason.</em><br>
Note that the images were resized. The black dots were 1px in the original image.</p>
<div style="margin: -1em 0; font-size: 70%; text-align: right; color: rgba(130,130,130,0.7)">All previews
generated with <a class="ninjalink" href="https://yuuno.encode.moe/">yuuno</a></div>
<table style="width: 100%">
<tr>
<td style="width: 33%; text-align: center"><img class="img_expandable" src="/media/articles/res_rg/conv_src.png">
</td>
<td style="width: 33%; text-align: center"><img class="img_expandable" src="/media/articles/res_rg/conv_std.png">
</td>
<td style="width: 33%; text-align: center"><img class="img_expandable" src="/media/articles/res_rg/conv_rgvs.png">
</td>
</tr>
<tr>
<td style="width: 33%; text-align: center">The source image</td>
<td style="width: 33%; text-align: center">clip.std.Convolution(matrix=[1, 2, 1, 2, 4, 2, 1, 2, 1])</td>
<td style="width: 33%; text-align: center">clip.rgvs.RemoveGrain(mode=12)</td>
</tr>
</table>
Vapoursynth's internal convolution filter interpolates beyond the edges of an image by mirroring the pixels
close to
the edge. RemoveGrain simply leaves them unprocessed.
<div class="subhead" id="c_m13">Modes 13-16</div>
These modes are very fast and very inaccurate field interpolators. They're like a cheap knockoff of EEDI, and
there
should never be a reason to use any of them (and since EEDI2 was released in 2005, there <span
        style="font-style: italic">was</span> never any reason to
use them, either).
<h3>Mode 13</h3>
<convolution class="accent1">
<row>
<black><span>1</span></black>
<darkgrey><span>2</span></darkgrey>
<lightgrey><span>3</span></lightgrey>
</row>
<row>
<transparent></transparent>
<transparent></transparent>
<transparent></transparent>
</row>
<row>
<white><span>3</span></white>
<white><span>2</span></white>
<white><span>1</span></white>
</row>
</convolution>
<br>
Since this is a field interpolator, the middle row does not yet exist, so it cannot be used for any
calculations.<br>
It uses the pair with the lowest difference and sets the center pixel to the average of that pair. In the
example,
pair 3 would be used, and the resulting center pixel would be a very light grey.
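<br>As a sketch (function and variable names are invented for this article, and the exact rounding of the real
implementation is ignored), the interpolation boils down to:
<div class="code vapoursynth">
    def interpolate_mode13(top, bottom):<br>
    &nbsp;&nbsp;&nbsp;&nbsp;# top, bottom: the three pixels above and below the missing line, left to right<br>
    &nbsp;&nbsp;&nbsp;&nbsp;pairs = [(top[0], bottom[2]), (top[1], bottom[1]), (top[2], bottom[0])]<br>
    &nbsp;&nbsp;&nbsp;&nbsp;p1, p2 = min(pairs, key=lambda p: abs(p[0] - p[1]))<br>
    &nbsp;&nbsp;&nbsp;&nbsp;return (p1 + p2) // 2  # average of the most similar pair<br>
</div>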
<h3>Mode 14</h3>
Same as mode 13, but instead of interpolating the top field, it interpolates the bottom field.
<h3>Mode 15</h3>
<div class="code">Same as 13 but with a more complicated interpolation formula.</div>
<p style="text-align: right; font-size: 80%; font-style: italic; padding-right: 10em">Avisynth Wiki</p>
<p>
“It's the same but different.” How did people use this plugin during the past decade?<br>
Anyway, here is what it actually does:
</p>
<convolution class="accent1">
<row>
<black><span>1</span></black>
<white><span>2</span></white>
<darkgrey><span>1</span></darkgrey>
</row>
<row>
<transparent><span>0</span></transparent>
<transparent><span>0</span></transparent>
<transparent><span>0</span></transparent>
</row>
<row>
<black><span>1</span></black>
<white><span>2</span></white>
<black><span>1</span></black>
</row>
</convolution>
<p>
First, a weighted average of the three pixels above and below the center pixel is calculated as shown in the
convolution above. Since this is still a field interpolator, the middle row does not yet
exist. <br>
Then, this average is clipped to the pair with the lowest difference.<br>
In the example, the average would be a grey with slightly above 50% brightness. There are more dark pixels
than bright ones, but
the white pixels are counted double due to their position. This average would be clipped to the pair with
the smallest range, in this case bottom-left and top-right. The resulting pixel would thus have the color of
the top-right pixel.
</p>
<h3>Mode 16</h3>
<p>
Same as mode 15 but interpolates bottom field.
</p>
<div class="subhead" id="c_m17">Mode 17</div>
<div ondblclick="$(this).html('This mode is accurately described in the rgtools wiki, so I shouldn\'t even have bothered.<br>Also, why are you doubleclicking random paragraphs?')">
<div class="code">
Clips the pixel with the minimum and maximum of respectively the maximum and minimum of each pair of
opposite neighbour pixels.
</div>
It may sound slightly confusing at first, but that is actually an accurate description of what this mode
does.
It creates an array containing the smaller value (called <span style="font-family: monospace">lower</span>)
of each pair and one containing the bigger value (called <span style="font-family: monospace">upper</span>
).
The center pixel is then clipped to the smallest value in <span style="font-family: monospace">upper</span>
and the biggest value in <span style="font-family: monospace">lower</span>.
</div>
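Or, as a small Python sketch of that description (names invented for this article, not the actual RGTools source):
<div class="code vapoursynth">
    def process_mode17(center, pairs):<br>
    &nbsp;&nbsp;&nbsp;&nbsp;lower = max(min(p1, p2) for p1, p2 in pairs)  # biggest value in 'lower'<br>
    &nbsp;&nbsp;&nbsp;&nbsp;upper = min(max(p1, p2) for p1, p2 in pairs)  # smallest value in 'upper'<br>
    &nbsp;&nbsp;&nbsp;&nbsp;lo, hi = min(lower, upper), max(lower, upper)<br>
    &nbsp;&nbsp;&nbsp;&nbsp;return max(lo, min(center, hi))<br>
</div>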
<div class="subhead" id="c_m21">Mode 21 and 22</div>
<h3>Mode 21</h3>
The value of the center pixel is clipped to the smallest and the biggest average of the four surrounding pairs.
<h3>Mode 22</h3>
Same as mode 21, but rounding is handled differently. This mode is faster than 21 (4 cycles per pixel).
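Ignoring that rounding difference, the shared idea of both modes can be sketched like this (again with invented
names; the two modes differ only in how the pair averages are rounded):
<div class="code vapoursynth">
    def process_mode21(center, pairs):<br>
    &nbsp;&nbsp;&nbsp;&nbsp;averages = [(p1 + p2) // 2 for p1, p2 in pairs]  # mode 22 rounds these differently<br>
    &nbsp;&nbsp;&nbsp;&nbsp;return max(min(averages), min(center, max(averages)))<br>
</div>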
<div class="subhead" id="c_m23">Mode 23 and 24</div>
These two are too difficult to explain using words, so I'm not even going to try. Read the <a
href="https://github.com/tp7/RgTools/blob/master/RgTools/rg_functions_c.h#L394">code</a> if you're
interested, but don't expect to find anything special. I can't see these modes actually doing anything useful, so
the documentation was once again quite accurate.
<p class="ninjatext">I feel like I'm missing a proper ending for this, but I can't think of anything</p>
</div>
</div>

343
resolutions.html Normal file
View File

@ -0,0 +1,343 @@
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<script>
function hover(id, path) {
$("#" + id).attr('src', path);
}
</script>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<p class="heading">Native Resolutions and Scaling</p>
<p class="subhead">This is really not that good, and I should probably rewrite it,<br><br>but I'm lazy, so here's an unfinished whatever this is <a href="https://ddl.kageru.moe/kbz12.pdf">https://ddl.kageru.moe/kbz12.pdf</a></p>
<p class="subhead">Table of contents</p>
<div class="content">
<ul>
<li><a href="#c_introduction"> Introduction</a></li>
<li><a href="#c_basics">Avisynth and Vapoursynth basics</a></li>
<li><a href="#c_examples">Bilinear/Debilinear examples</a></li>
<li><a href="#c_native">Native resolutions</a></li>
<li><a href="#c_kernel">Kernels</a></li>
<li><a href="#c_mask">Masks: Dealing with artifacts and 1080p overlays</a></li>
<li><a href="#c_ss">Subsampling</a></li>
<li><a href="#c_import">Importable Vapoursynth script</a></li>
</ul>
</div>
<p class="subhead"><a href="#c_introduction" id="c_introduction">Introduction</a></p>
<div class="content">As some (or many) might be aware, anime is usually produced at resolutions below 1080p.
        However, since virtually all Blu-Ray releases are 1080p, the source material has to be upscaled to this
        resolution. This article will try to cover the basics of scaling Blu-Ray sourced material using
Vapoursynth and Avisynth.<br>
Note: Most of the images were embedded as JPG to save bandwidth. Clicking an image will open the lossless PNG.
</div>
<p class="subhead"><a href="#c_basics" id="c_basics">Avisynth and Vapoursynth basics</a></p>
<div class="content">
In order to make the following easier to understand, I will try to explain the basic scaling methods in Avisynth
and Vapoursynth. More detailed examples to deal with artifacts can be found in the corresponding paragraph.
Blue code blocks contain Vapoursynth code while green blocks contain Avisynth. For Vapoursynth I will be using
<a href="https://github.com/EleonoreMizo/fmtconv/releases">fmtconv</a> to resize.<br><br>
Scaling to 1080p using a bilinear resizer. This can be used for either upscaling or downscaling
<p class="vapoursynth">clip = core.fmtc.Resample(clip, 1920, 1080, kernel = 'bilinear')</p>
<p class="avisynth">clip.BilinearResize(1920, 1080)</p>
Note that fmtc will use Spline36 to resize if no kernel is specified. Spline is generally the better choice, and
we are only using bilinear as an example. To use Spline36 in Avisynth use
<p class="avisynth">clip.Spline36Resize(1920, 1080)</p>
Using a debilinear resizer to reverse to a native resolution of 1280x720: <span style="font-size: 60%;margin-left: 2em;">(Note that you should <b>never</b> use this to
upscale anything)</span>
<p class="vapoursynth">clip = core.fmtc.Resample(clip, 1280, 720, kernel = 'bilinear', invks = True)</p>
<p class="avisynth">clip.Debilinear(1280, 720)</p>
Debilinear for Avisynth can be found <a href="http://avisynth.nl/index.php/Debilinear">in the wiki</a>.
</div>
<p class="subhead"><a href="#c_examples" id="c_examples">Bilinear/Debilinear Examples</a></p>
<div class="content">
<p> Traditional scaling is done by spreading all pixels of an image over a higher resolution (e.g. 960x540 ->
1920x1080), interpolating the missing pixels (in our example every other pixel on each axis), and in some
cases applying additional post-processing to the results.
For a less simplified explanation and comparison of different scaling methods refer to the <a
href="https://en.wikipedia.org/wiki/Image_scaling">Wikipedia article</a>.<br>
It is possible to invert the effects of this by using the corresponding inverse algorithm to downscale the
image.
This is only possible if the <b>exact</b> resolution of the source material is known and the video has not
been altered after scaling it (we will deal with 1080p credits and text later).<br>
A few examples of scaled and inverse scaled images: (click for full resolution PNG)<br>
<img src="/media/articles/res/opm_src.jpg" onclick="fullimage('/media/articles/res/opm_src.png')" title="ONE PUUUUNCH"><br>
1080p source frame from the One Punch man Blu-ray. No processing.<br>
<img src="/media/articles/res/opm_deb.jpg" onclick="fullimage('/media/articles/res/opm_deb.png')"><br>
Source.Debilinear(1280, 720)
<img src="/media/articles/res/opm_bilinear.jpg" onclick="fullimage('/media/articles/res/opm_bilinear.png')"><br>
Source.Debilinear(1280, 720).BilinearResize(1920,1080)<br>
This reverses the scaling and applies our own bilinear upscale.<br>
You may see slight differences caused by the Blu-Ray's compression noise, but without zooming in, and
especially during real-time playback, these images should be
indistinguishable.
</p>
<p>
My second example will be a frame from Makoto Shinkai's Kotonoha no Niwa or "The Garden of Words". The movie
is not only beautifully drawn and animated but also produced at FullHD resolution. We will now upscale the
image to 4k using a bilinear resizer and reverse the scaling afterwards.
<img src="/media/articles/res/kotonohasrc.jpg" onclick="fullimage('/media/articles/res/kotonohasrc.png')"><br>
The untouched source frame
<img src="/media/articles/res/kotonoha4k.jpg" onclick="fullimage('/media/articles/res/kotonoha4k.png')"><br>
Source.BilinearResize(3840,2160)
<img src="/media/articles/res/kotonoha_deb.jpg" onclick="fullimage('/media/articles/res/kotonoha_deb.png')"><br>
Source.BilinearResize(3840,2160).Debilinear(1920,1080)<br>
This time the images are even more similar, because no artifacts were added after upscaling.
As you can see, using inverse kernels to reverse scaling is quite effective and will usually restore the
original image accurately. This is desirable, as it allows the encoder to apply a reverse scaling algorithm
to release in 720p, significantly decreasing the release's filesize. The 720p video will be upscaled by the
leecher's video player, potentially using high quality scaling methods like the ones implemented in MadVR.
Releasing in native resolution will therefore not just save space, but may even improve the image quality on
the consumer's end.
</p>
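<p>
    For reference, the Vapoursynth equivalent of the bilinear round trip from the first example would look
    roughly like this (a sketch assuming fmtconv, as used throughout this article; variable names are arbitrary):
</p>
<p class="vapoursynth">
    deb = core.fmtc.resample(src, 1280, 720, kernel = 'bilinear', invks = True)<br>
    up = core.fmtc.resample(deb, 1920, 1080, kernel = 'bilinear')<br>
</p>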
</div>
<p class="subhead"><a href="#c_native" id="c_native">Native resolutions</a></p>
<div class="content">
Finding the native resolution of your source is the most important step. If you use the wrong resolution or
try to debilinearize native 1080p material you will destroy details and introduce ringing artifacts. To
understand this let's take a look at this frame from Non Non Biyori Repeat. The show's native resolution is
846p.
<img src="/media/articles/res/nnb_src.jpg" onclick="fullimage('/media/articles/res/nnb_src.png')">
Source
<img src="/media/articles/res/nnb_ringing.jpg" onclick="fullimage('/media/articles/res/nnb_ringing.png')">
Source.Debilinear(1280, 720)<br>
Upon taking a closer look you will see that the edges of our 720p image look very jagged or aliased. This is
caused by improper debilinearizing. The effect will get stronger with sharper and more detailed edges. If
you encounter this <b>never</b> try to fix it by anti-aliasing. Try to find the correct resolution or don't
use inverse scaling at all.
<img src="/media/articles/res/nnb_ringing_zoom.jpg" onclick="fullimage('/media/articles/res/nnb_ringing_zoom.png')">
Source.Debilinear(1280, 720).PointResize(3840, 2160) and some cropping. Point resize (also called Nearest
Neighbor) is used to magnify without smoothing.
<img src="/media/articles/res/nnb_source_zoom.jpg" onclick="fullimage('/media/articles/res/nnb_source_zoom.png')">
Source.PointResize(3840, 2160) and cropping. As you can see this version does not have the ringing
artifacts.<br>
<p> Unfortunately, there are only a few ways of determining the native resolution. <br>The main source is
    <a href="http://anibin.blogspot.de/">anibin</a>, a Japanese blog that analyzes anime to find its native
resolution. In order to find an anime, you have to get the original title from <a
href="http://myanimelist.net/">MyAnimeList</a>, <a href="http://www.anisearch.com/">AniSearch</a>,
<a
href="http://anidb.net/">AniDB</a>, or any other source that has Kanji/Kana titles.<br>
Non Non Biyori Repeat's Japanese title is "のんのんびより りぴーと", and if you copy-paste it into the search bar on
anibin, you should be getting <a href="http://anibin.blogspot.de/2015/07/1_65.html">this result.</a>
Even if you don't understand Japanese, the numbers should speak for themselves. In this case the resolution
is 1504x846. This is above 720p but below 1080p, so you have multiple options. In this case I would
recommend encoding in 1080p or using a regular resizer (like Spline36) if you need a 720p version.
In some cases even scaling back to anibin's resolution does not get rid of the ringing, either because the
studio didn't use a bilinear resizer or the analysis was incorrect due to artifacts caused by TV
compression, so I wouldn't bother messing with the native resolution. It's not like you were gonna release
in 846p, right?
<br>
<b>Edit</b>: Apparently there are people out there who genuinely believe releasing an 873p video is a valid
option.
This is not wrong from an objective standpoint, but you should never forget that a majority of the leechers
does not understand encoding and is likely to ignore your release, because "Only an idiot would release in
8xxp".</p>
<p>If you want an easier way to detect ringing and scaling artifacts, read the chapter about artifacts and
masks.
</p> Btw, in case you do need (or want) to inverse scale our example, you would have to use a Debicubic resizer,
which leads me to our next topic.
<br></div>
<p class="subhead"><a href="#c_kernel" id="c_kernel">Kernels</a></p>
<div class="content">
<p>
Sometimes you will encounter ringing and artifacts even if you are certain that you know the native
resolution. This usually means that the studio used another resizer. Our example will be Byousoku 5
Centimeter or 5 Centimeters per Second (<a href="http://anibin.blogspot.de/2008/09/5-blu-ray.html">Anibin's
Blu-Ray analysis</a>)</p> This will be our test frame: <img
src="/media/articles/res/byousoku_src.jpg" onclick="fullimage('/media/articles/res/byousoku_src.png')">
We will be using the masking functions explained in the next paragraph. For now just accept them as a good
way to find artifacts.
If we try to debilinearize our example, the mask will look like this:
<img src="/media/articles/res/byousoku_linear.png" onclick="fullimage('/media/articles/res/byousoku_linear.png')">
Despite using the correct resolution we can see strong artifacts around all edges. This can have multiple,
not mutually exclusive reasons:
<ol>
<li>The studio used sharpening filters after upscaling</li>
<li>The studio used different resolution for different layers or parts of the image</li>
<li>The image was upscaled with a different kernel (not bilinear)</li>
<li>Our resolution is wrong after all</li>
</ol>
The first two reasons are not fixable, as I will illustrate using a flashback scene from Seirei Tsukai no
Blade Dance. The scene features very strong and dynamic grain which was added after the upscale, resulting
in 720p backgrounds and 1080p grain.
<img src="/media/articles/res/bladedance_grain_src.jpg" onclick="fullimage('/media/articles/res/bladedance_grain_src.png')">
And now the mask:
<img src="/media/articles/res/bladedance_grain_mask.png" onclick="fullimage('/media/articles/res/bladedance_grain_mask.png')">
In this case you would have to trim the scene and use a regular resizer. Sometimes all backgrounds were drawn in
a higher or lower resolution than the characters and foreground objects. In this case inverse scaling becomes
very difficult, since you would need to know the resolution of all the different layers and you need a way to mask
and merge them. I'd advise using a regular resizer for these sources or just releasing in 1080p.<br>
After talking about problems we can't fix, let's go back to our example to fix reason 3. Some (especially
more recent) Blu-Rays were upscaled with a bicubic kernel rather than bilinear. Examples are Death Parade,
Monster Musume, Outbreak Company, and of course our image. Applying the mask with debicubic scaling results
in far fewer artifacts, as seen here: (hover over the image to see the bilinear mask)
<img id="byou" src="/media/articles/res/byousoku_cubic.png"
onmouseover="hover('byou', '/media/articles/res/byousoku_linear.png');"
onmouseout="hover('byou', '/media/articles/res/byousoku_cubic.png');">
The remaining artifacts are likely caused by compression artifacts on the Blu-Ray as well as potential
postprocessing in the studio. This brings us back to reason 1, although in this case the artifacts are weak
enough to let the mask handle them and use debicubic for the rest.<br>
Usage (without mask): <br>
To realize this in Avisynth import <a href="http://avisynth.nl/index.php/Debicubic">Debicubic</a> and use it
like this:
<p class="avisynth">src.Debicubic(1280, 720, b=0, c=1)</p>
For Vapoursynth use <a href="https://github.com/EleonoreMizo/fmtconv/releases">fmtconv</a>:
<p class="vapoursynth">out = core.fmtc.resample(src, 1280, 720, kernel = 'bicubic', invks = True, a1 = 0, a2 =
1)</p>
To use a mask for overlays and potential artifacts as well as 4:4:4 output use the Vapoursynth function
linked at the bottom. Example for bicubic upscales:
<p class="vapoursynth">out = deb.debilinearM(src, 1280, 720, kernel = 'bicubic')</p>
If the b and c parameters are not 0 and 1 (which should rarely be the case) you can set them as a1
and a2 like in fmtc.resample(). Bicubic's own default is 1/3 for both values so if bilinear and bicubic 0:1
don't work you could give that a try.<br>
<b>Edit:</b> I did some more testing and consulted another encoder regarding this issue.
Since we're using an inverse kernel in vapoursynth, the results may differ slightly from avisynth's debicubic.
In those cases, adjusting the values of a1 and a2, as well as the number of taps used for scaling can be
beneficial and yield a slightly sharper result.
</div>
<p class="subhead"><a href="#c_mask" id="c_mask">Masks: Dealing with artifacts and 1080p overlays</a></p>
<div class="content">
Sometimes a studio will add native 1080p material (most commonly credits or text) on top of the image.
Inverse scaling may work with the background, but it will produce artifacts around the text as seen in the
example from Mushishi Zoku Shou below:
<img src="/media/articles/res/mushishi_ringing.png" onclick="fullimage('/media/articles/res/mushishi_ringing.png')">
In order to avoid this you will have to mask these parts with conventionally downscaled pixels.
The theory behind inverse scaling is that it can be reversed by using regular scaling, so (in theory) a
source frame from a bilinear upscale would be identical to the output of this script:
<p class="avisynth">source.Debilinear(1280,720).BilinearResize(1920,1080)</p>
This property is used by scripts to mask native 1080p content by finding the differences between the source
and the above script's output. A mask would look like this:<br>
<img src="/media/articles/res/mushishi_mask.png" onclick="fullimage('/media/articles/res/mushishi_mask.png')">
If there are any differences, the areas with artifacts will be covered by a regular downscale like
<p class="avisynth">source.Spline36Resize(1280,720)</p>
In Avisynth you can import <a href="DebilinearM.avsi">DebilinearM</a> which can also be found <a
href="http://avisynth.nl/index.php/Debilinear">in the wiki.</a>
For Vapoursynth <a
href="https://raw.githubusercontent.com/MonoS/VS-MaskDetail/master/MaskDetail.py">MaskDetail</a>
can be used to create the mask and std.MaskedMerge to cover the artifacts. A full importable script is available
at the end.
<p class="vapoursynth">
#MaskDetail has to be imported or copied into the script<br>
#src is the source clip<br>
deb = core.fmtc.resample(src, 1280, 720, kernel = 'bilinear', invks = True)<br>
noalias = core.fmtc.resample(src, 1280, 720, kernel="blackmanminlobe", taps=5)<br>
mask = maskDetail(src, 1280, 720, kernel = 'bilinear')<br>
masked = core.std.MaskedMerge(noalias, src, core.std.Invert(mask, 0))<br>
</p>
Using this function to filter our scene returns this image: (hover to see the unmasked version)
<img src="/media/articles/res/mushishi_masked.png" id="mushishi"
onmouseover="hover('mushishi', '/media/articles/res/mushishi_ringing.png');"
onmouseout="hover('mushishi', '/media/articles/res/mushishi_masked.png');">
The credits stand out less and don't look oversharpened. The effect can be much stronger depending on the
nature and style of the credits.
</div>
<p class="subhead"><a href="#c_ss" id="c_ss">Subsampling</a></p>
<div class="content">
You may have encountered fansubs released in 720p with 4:4:4 subsampling. In case you don't know the
term, subsampled images store luma (brightness) and chroma (color) at different resolutions. A Blu-Ray will
always have 4:2:0 subsampling, meaning the chroma channels have half the resolution of the luma channel.
When downscaling you retain the subsampling of the source, resulting in 720p luma and 360p chroma.
Alternatively, you can split the source video into luma and chroma and then debilinearize the luma
(1080p->720p) while upscaling the chroma planes (540p->720p). Using the same resolution for luma and chroma will
prevent color bleeding, retain more of the chroma present in the source, and prevent desaturation. <br>
A script for Avisynth and the discussion can be found on <a
href="http://forum.doom9.org/showthread.php?t=170832">doom9.</a>
For Vapoursynth I prefer to use the script explained in the next section which allows me to mask credits and
convert to 4:4:4 simultaneously.
<p class="subhead"><a href="#c_import" id="c_import">Importable Vapoursynth script</a></p>
While there may be scripts for literally anything in Avisynth, Vapoursynth is still fairly new and growing.
To make this easier for other Vapoursynth users I have written this simple import script which allows you to
debilinearize with masks and 4:4:4 output. A downloadable version is linked below the explanation.
Essentially, all the script does is split the video into its planes (Y, U, and V) to scale them separately,
using debilinear for luma downscaling and spline for chroma upscaling. The example given is for 720p
bilinear upscaled material:
<p class="vapoursynth">
y = core.std.ShufflePlanes(src, 0, colorfamily=vs.GRAY)<br>
u = core.std.ShufflePlanes(src, 1, colorfamily=vs.GRAY)<br>
v = core.std.ShufflePlanes(src, 2, colorfamily=vs.GRAY)<br>
y = core.fmtc.resample(y, 1280, 720, kernel = 'bilinear', invks = True)<br>
u = core.fmtc.resample(u, 1280, 720, kernel = "spline36", sx = 0.25)<br>
v = core.fmtc.resample(v, 1280, 720, kernel = "spline36", sx = 0.25)<br>
out = core.std.ShufflePlanes(clips=[y, u, v], planes = [0,0,0], colorfamily=vs.YUV)<br>
noalias = core.fmtc.resample(src, 1280, 720, css = '444', kernel="blackmanminlobe", taps=5)<br>
mask = maskDetail(src, 1280, 720, kernel = 'bilinear')<br>
out = core.std.MaskedMerge(noalias, out, core.std.Invert(mask, 0))<br>
out.set_output()<br>
</p>
To call this script easily copy <a href="/media/articles/debilinearm.py">this file</a> into
<span class="path">C:\Users\Your_Name\AppData\Local\Programs\Python\Python35\Lib\site-packages</span>
and use it like this:
<p class="vapoursynth">
import vapoursynth as vs<br>
import debilinearm as deb<br>
core = vs.get_core()<br>
src = core.lsmas.LWLibavSource(r'E:\path\to\source.m2ts') #other source filters will work too<br>
out = deb.debilinearM(src, width, height, kernel)<br>
</p>
Where width and height are your target dimensions and kernel is the kernel that was used for the original upscale. The output will be
in 16-bit and 4:4:4 subsampling.<br>The defaults are
(1280, 720, 'bilinear') meaning in most cases (720p bilinear upscales) you can just call:
<p class="vapoursynth">out = deb.debilinearM(src)</p>
List of parameters and explanation:<br>
<table class="paramtable">
<tr>
<td>parameter</td>
<td>[type, default]</td>
<td class="paramtable_main">explanation</td>
</tr>
<tr>
<td>src</td>
<td>[clip]</td>
<td class="paramtable_main">the source clip</td>
</tr>
<tr>
<td>w</td>
<td>[int, 1280]</td>
<td class="paramtable_main">target width</td>
</tr>
<tr>
<td>h</td>
<td>[int, 720]</td>
<td class="paramtable_main">target height</td>
</tr>
<tr>
<td>kernel</td>
<td>[string, 'bilinear']</td>
<td class="paramtable_main">kernel used for inverse scaling. Has to be in 'quotes'</td>
</tr>
<tr>
<td>taps</td>
<td>[int, 4]</td>
<td class="paramtable_main">number of taps for reverse scaling</td>
</tr>
<tr>
<td>return_mask</td>
<td>[boolean, False]</td>
<td class="paramtable_main">returns artifact mask in grayscale if True</td>
</tr>
<tr>
<td>a1</td>
<td>[int, 0]</td>
<td class="paramtable_main">b parameter of bicubic upscale, ignored if kernel != 'bicubic'</td>
</tr>
<tr>
<td>a2</td>
<td>[int, 1]</td>
<td class="paramtable_main">c parameter of bicubic upscale, ignored if kernel != 'bicubic'</td>
</tr>
</table>
<p>
<b>Edit:</b> The generic functions (core.generic.*) were removed from Vapoursynth in R33, as most of them
are now part of the standard package (core.std.*). I have updated the script below accordingly, meaning it
may not work with R32 or older. This also applies to MonoS' MaskDetail, which (as of now) has not been
updated. You can "fix" it by replacing both occurrences of "core.generic" with "core.std".
</p>
<p class="download_centered">
<span class="source">The most recent version of my scripts can always be found on Github:<br></span><a href="https://gist.github.com/kageru/d71e44d9a83376d6b35a85122d427eb5">Download</a><br>
<a href="https://github.com/EleonoreMizo/fmtconv/releases">Download fmtconv (necessary)</a>
</p>
</div>
</div>

20
template.html Normal file
View File

@ -0,0 +1,20 @@
<body>
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<p class="heading">Native Resolutions and Scaling</p>
<p class="subhead">Table of contents</p>
<p class="content">
<ul>
<li><a href="#c_introduction">Introduction</a></li>
</ul>
</p>
<p class="subhead"><a href="#c_introduction" id="c_introduction">Introduction</a></p>
<div class="content">
</div>
</div>
</body>

353
videocodecs.html Normal file
View File

@ -0,0 +1,353 @@
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<p class="heading">Why and how to use x265</p>
<div class="content">
<ul>
<li><a href="#c_introduction">Introduction</a></li>
<li><a href="#filesize">File size comparision</a></li>
<li><a href="#settings">Useful parameters for encoding with x265</a></li>
</ul>
</div>
<p class="subhead"><a href="#c_introduction" id="c_introduction">Introduction</a></p>
<div class="content">
For many years, x264 has been the standard encoder for video compression and achieved the best results one
could get in terms of compression efficiency. But in 2013, the initial version of x265 was released,
and it yielded far better results than were previously possible with x264. By now, the stable 2.0 version of
x265 has been released, and we are a few CPU and GPU generations further along than we were in 2013.
Additionally, new PCs, notebooks, and even smartphones are shipping with native
hardware support for HEVC decoding, so as of today, more and more people
can view HEVC-encoded videos just as easily as AVC-encoded ones.
The problem is that the encoders in the fansubbing community are only slowly adopting
the new codec, effectively wasting the viewers' bandwidth or offering lower quality than they could achieve
with x265. <br>
In the following section, I will explain why HEVC/x265 is superior to AVC/x264 and why you should use it to encode your
videos.
</div>
<p class="subhead"><a id="filesize" href="#filesize">File size comparison</a></p>
<div class="content">
This section is dedicated to comparing the filesize difference between x264 and x265.
For x265, I used CRF 17 and the veryslow preset, which already yields very good results.
For x264, I used CRF 15, the veryslow preset, and the parameters subme 11, me tesa, merange 32, and bframes 16.<br>
Both encodes also have aq-mode 3 enabled.<br>
Please note: CRF values in x264 and x265 are NOT comparable; the two encoders calculate CRF differently.
I found CRF 15 for x264 and CRF 17 for x265 to have nearly the same quality, but results may vary.
You have been warned. <br>
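For reference, the command lines for these settings would look roughly like the following. This is only a sketch
based on the parameters listed above; the input and output file names are placeholders, and everything else is left
at the respective encoder's defaults:
<p class="code">
x264 --preset veryslow --crf 15 --subme 11 --me tesa --merange 32 --bframes 16 --aq-mode 3 -o x264_crf15.264 input.y4m<br>
x265 --preset veryslow --crf 17 --aq-mode 3 input.y4m -o x265_crf17.hevc<br>
</p>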
<table class="two_column_table">
<tr>
<td><img class="img_expandable rounded" src="/media/articles/res_mr/onepunchmane1f13487x264crf15.png"></td>
<td><img class="img_expandable rounded" src="/media/articles/res_mr/onepunchmane1f13487x265crf17.png"></td>
</tr>
<tr>
<td>One Punch Man episode 1, frame 13487 at CRF 15 in x264</td>
<td>And here in x265 at CRF 17</td>
</tr>
</table>
<br>
<p>
1. Static videos: The test clip consists of the first 1000 frames of Non Non Biyori Repeat episode 1.<br><br>
<a href="/media/articles/res_mr/NNBR_Encodes.zip">Download the encodes</a><br>
</p>
<br>
Logfiles of the encodes (expandable):<br>
<div class="spoilerbox_expand_element">x264 log
<p class="code">
x264 [info]: 1920x1080p 0:0 @ 24000/1001 fps (cfr)<br>
x264 [info]: color matrix: undef<br>
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX<br>
x264 [info]: AVC Encoder x264 core 148 r2699+6+41 29a38aa Yuuki [10-bit@all X86_64][GCC 5.3.0]<br>
x264 [info]: profile: High 10, level: 5.1, subsampling: 4:2:0, bit-depth: 10-bit<br>
x264 [info]: cabac=1 ref=16 deblock=1:0:-1 analyse=0x3:0x133 me=tesa subme=11 psy=1 fade_compensate=0.00 psy_rd=1.00:0.00 mixed_ref=1 me_range=32 chroma_me=1 trellis=2 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=0 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=0 interlaced=0 bluray_compat=0 constrained_intra=0 fgo=0 bframes=16 b_pyramid=2 b_adapt=2 b_bias=0 direct=3 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=23 scenecut=40 intra_refresh=0 rc_lookahead=60 rc=crf mbtree=1 crf=15.0000 qcomp=0.60 qpmin=0:0:0 qpmax=81:81:81 qpstep=4 ip_ratio=1.40 aq=3:0.80<br>
x264 [info]: started at Sun Aug 07 23:46:37 2016<br>
x264 [info]: frame I:10 Avg QP:22.52 size:391771<br>
x264 [info]: frame P:236 Avg QP:26.76 size: 70531<br>
x264 [info]: frame B:754 Avg QP:28.05 size: 7533<br>
x264 [info]: consecutive B-frames: 1.1% 2.0% 45.0% 3.2% 6.0% 12.0% 4.2% 11.2% 8.1% 2.0% 1.1% 2.4% 0.0% 0.0% 0.0% 0.0% 1.7%<br>
x264 [info]: mb I I16..4: 49.8% 28.1% 22.1%<br>
x264 [info]: mb P I16..4: 4.6% 0.7% 0.7% P16..4: 44.1% 28.7% 10.7% 1.6% 0.2% skip: 8.7%<br>
x264 [info]: mb B I16..4: 0.4% 0.0% 0.1% B16..8: 28.4% 4.8% 0.3% direct: 1.5% skip:64.5% L0:42.1% L1:55.0% BI: 2.8%<br>
x264 [info]: 8x8 transform intra:16.7% inter:41.3%<br>
x264 [info]: direct mvs spatial:98.5% temporal:1.5%<br>
x264 [info]: coded y,uvDC,uvAC intra: 89.9% 87.5% 81.4% inter: 11.0% 15.1% 7.0%<br>
x264 [info]: i16 v,h,dc,p: 14% 10% 18% 57%<br>
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 12% 8% 8% 9% 11% 10% 9% 10%<br>
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 19% 8% 30% 6% 7% 7% 7% 7% 9%<br>
x264 [info]: i8c dc,h,v,p: 52% 18% 11% 18%<br>
x264 [info]: Weighted P-Frames: Y:3.0% UV:3.0%<br>
x264 [info]: ref P L0: 58.3% 19.9% 5.7% 4.1% 1.6% 3.0% 1.3% 1.2% 0.6% 1.0% 0.6% 0.9% 0.5% 0.8% 0.5% 0.0%<br>
x264 [info]: ref B L0: 71.8% 11.7% 5.5% 2.5% 1.8% 1.7% 1.3% 0.7% 0.6% 0.5% 0.5% 0.5% 0.4% 0.3% 0.1%<br>
x264 [info]: ref B L1: 97.1% 2.9%<br>
x264 [info]: kb/s:5033.59<br>
x264 [info]: encoded 1000 frames, 0.5356 fps, 5033.74 kb/s, 25.03 MB<br>
x264 [info]: ended at Mon Aug 08 00:17:44 2016<br>
x264 [info]: encoding duration 0:31:07<br>
</p>
</div>
<div class="spoilerbox_expand_element">x265 log
<p class="code">
yuv [info]: 1920x1080 fps 24000/1001 i420p8 unknown frame count<br>
x265 [info]: Using preset veryslow & tune none<br>
x265 [info]: HEVC encoder version 2.0M+9-g457336f+14<br>
x265 [info]: build info [Windows][GCC 5.3.0][64 bit] Yuuki 10bit<br>
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX<br>
x265 [info]: Main 10 profile, Level-4 (Main tier)<br>
x265 [info]: Thread pool created using 4 threads<br>
x265 [info]: frame threads / pool features : 2 / wpp(17 rows)<br>
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8<br>
x265 [info]: Residual QT: max TU size, max depth : 32 / 3 inter / 3 intra<br>
x265 [info]: ME / range / subpel / merge : star / 57 / 4 / 4<br>
x265 [info]: Keyframe min / max / scenecut : 23 / 250 / 40<br>
x265 [info]: Lookahead / bframes / badapt : 40 / 8 / 2<br>
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1<br>
x265 [info]: References / ref-limit cu / depth : 5 / off / on<br>
x265 [info]: AQ: mode / str / qg-size / cu-tree : 3 / 1.0 / 32 / 1<br>
x265 [info]: Rate Control / qCompress : CRF-17.0 / 0.60<br>
x265 [info]: tools: rect amp limit-modes rd=6 psy-rd=2.00 rdoq=2 psy-rdoq=1.00<br>
x265 [info]: tools: rskip signhide tmvp b-intra strong-intra-smoothing deblock<br>
x265 [info]: tools: sao<br>
x265 [info]: frame I: 9, Avg QP:14.24 kb/s: 32406.31<br>
x265 [info]: frame P: 178, Avg QP:14.40 kb/s: 6617.61<br>
x265 [info]: frame B: 813, Avg QP:23.03 kb/s: 326.86<br>
x265 [info]: Weighted P-Frames: Y:12.4% UV:12.4%<br>
x265 [info]: Weighted B-Frames: Y:14.9% UV:14.3%<br>
x265 [info]: consecutive B-frames: 5.9% 1.1% 21.4% 2.7% 9.6% 36.9% 4.8% 10.2% 7.5%<br>
encoded 1000 frames in 1010.58s (0.99 fps), 1735.33 kb/s, Avg QP:21.41<br>
</p>
</div>
<p>
The x264 encode has an average bitrate of 5033.74 kb/s, resulting in a total filesize of 25.03 MB, while
the x265 encode has an average bitrate of 1735.33 kb/s, resulting in a total filesize of 8.64 MB. This is a 66%
reduction, meaning the x265 file is only about a third of the size of the x264 file while having the same visual quality.
</p>
<br>
<p>
2. High-motion videos: The test clip consists of 1000 frames of One Punch Man episode 1, beginning at frame 13000.<br><br>
<a href="/media/articles/res_mr/OPM_Encodes.zip">Download the encodes</a><br><br>
</p>
<br>
Logfiles of the encodes (expandable):<br>
<div class="spoilerbox_expand_element">x264 log
<p class="code">
x264 [info]: 1920x1080p 0:0 @ 24000/1001 fps (cfr)<br>
x264 [info]: color matrix: undef<br>
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX<br>
x264 [info]: AVC Encoder x264 core 148 r2699+6+41 29a38aa Yuuki [10-bit@all X86_64][GCC 5.3.0]<br>
x264 [info]: profile: High 10, level: 5.1, subsampling: 4:2:0, bit-depth: 10-bit<br>
x264 [info]: cabac=1 ref=16 deblock=1:0:-1 analyse=0x3:0x133 me=tesa subme=11 psy=1 fade_compensate=0.00 psy_rd=1.00:0.00 mixed_ref=1 me_range=32 chroma_me=1 trellis=2 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=0 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=0 interlaced=0 bluray_compat=0 constrained_intra=0 fgo=0 bframes=16 b_pyramid=2 b_adapt=2 b_bias=0 direct=3 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=23 scenecut=40 intra_refresh=0 rc_lookahead=60 rc=crf mbtree=1 crf=15.0000 qcomp=0.60 qpmin=0:0:0 qpmax=81:81:81 qpstep=4 ip_ratio=1.40 aq=3:0.80<br>
x264 [info]: started at Mon Aug 08 01:49:11 2016<br>
x264 [info]: frame I:8 Avg QP:25.24 size:284336<br>
x264 [info]: frame P:285 Avg QP:28.03 size: 95640<br>
x264 [info]: frame B:707 Avg QP:28.83 size: 40806<br>
x264 [info]: consecutive B-frames: 2.9% 5.4% 26.7% 44.8% 9.0% 9.0% 1.4% 0.8% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%<br>
x264 [info]: mb I I16..4: 0.5% 95.9% 3.6%<br>
x264 [info]: mb P I16..4: 0.5% 13.6% 1.3% P16..4: 35.8% 34.5% 12.1% 1.2% 0.1% skip: 1.0%<br>
x264 [info]: mb B I16..4: 0.1% 3.9% 0.1% B16..8: 37.8% 17.5% 2.6% direct: 8.1% skip:29.9% L0:53.2% L1:42.1% BI: 4.8%<br>
x264 [info]: 8x8 transform intra:91.7% inter:78.2%<br>
x264 [info]: direct mvs spatial:99.6% temporal:0.4%<br>
x264 [info]: coded y,uvDC,uvAC intra: 90.3% 91.1% 74.3% inter: 34.2% 38.3% 10.4%<br>
x264 [info]: i16 v,h,dc,p: 12% 22% 10% 55%<br>
x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 11% 7% 16% 10% 12% 11% 11% 10% 12%<br>
x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 11% 9% 3% 11% 18% 14% 13% 10% 11%<br>
x264 [info]: i8c dc,h,v,p: 47% 20% 19% 14%<br>
x264 [info]: Weighted P-Frames: Y:9.1% UV:7.0%<br>
x264 [info]: ref P L0: 41.9% 17.7% 9.6% 5.6% 5.2% 4.0% 2.8% 2.1% 1.5% 1.8% 2.1% 2.0% 1.4% 1.1% 0.9% 0.1%<br>
x264 [info]: ref B L0: 58.4% 11.5% 6.4% 4.0% 2.8% 4.0% 2.6% 1.5% 1.4% 1.4% 1.1% 2.0% 1.6% 1.0% 0.3%<br>
x264 [info]: ref B L1: 95.8% 4.2%<br>
x264 [info]: kb/s:11198.17<br>
x264 [info]: encoded 1000 frames, 0.2929 fps, 11198.33 kb/s, 55.68 MB<br>
x264 [info]: ended at Mon Aug 08 02:46:06 2016<br>
x264 [info]: encoding duration 0:56:55<br>
</p>
</div>
<div class="spoilerbox_expand_element">x265 log
<p class="code">
yuv [info]: 1920x1080 fps 24000/1001 i420p8 unknown frame count<br>
x265 [info]: Using preset veryslow & tune none<br>
x265 [info]: HEVC encoder version 2.0M+9-g457336f+14<br>
x265 [info]: build info [Windows][GCC 5.3.0][64 bit] Yuuki 10bit<br>
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX<br>
x265 [info]: Main 10 profile, Level-4 (Main tier)<br>
x265 [info]: Thread pool created using 4 threads<br>
x265 [info]: frame threads / pool features : 2 / wpp(17 rows)<br>
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8<br>
x265 [info]: Residual QT: max TU size, max depth : 32 / 3 inter / 3 intra<br>
x265 [info]: ME / range / subpel / merge : star / 57 / 4 / 4<br>
x265 [info]: Keyframe min / max / scenecut : 23 / 250 / 40<br>
x265 [info]: Lookahead / bframes / badapt : 40 / 8 / 2<br>
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1<br>
x265 [info]: References / ref-limit cu / depth : 5 / off / on<br>
x265 [info]: AQ: mode / str / qg-size / cu-tree : 3 / 1.0 / 32 / 1<br>
x265 [info]: Rate Control / qCompress : CRF-17.0 / 0.60<br>
x265 [info]: tools: rect amp limit-modes rd=6 psy-rd=2.00 rdoq=2 psy-rdoq=1.00<br>
x265 [info]: tools: rskip signhide tmvp b-intra strong-intra-smoothing deblock<br>
x265 [info]: tools: sao<br>
x265 [info]: frame I: 14, Avg QP:14.37 kb/s: 18963.14<br>
x265 [info]: frame P: 253, Avg QP:15.76 kb/s: 12334.19<br>
x265 [info]: frame B: 733, Avg QP:19.84 kb/s: 3709.09<br>
x265 [info]: Weighted P-Frames: Y:9.5% UV:8.3%<br>
x265 [info]: Weighted B-Frames: Y:8.3% UV:6.8%<br>
x265 [info]: consecutive B-frames: 11.6% 9.4% 14.6% 43.1% 7.1% 10.5% 1.5% 1.1% 1.1%<br>
encoded 1000 frames in 2391.07s (0.42 fps), 6104.80 kb/s, Avg QP:18.73<br>
</p>
</div>
<br>
<p>
The x264 encode has an average bitrate of 11198.33 kb/s, resulting in a total filesize of 55.68 MB, while
the x265 encode has an average bitrate of 6104.80 kb/s, resulting in a total filesize of 30.3 MB. This is a
reduction of roughly 46%, meaning the x265 file is only slightly more than half the size of the x264 file.<br><br>
Conclusion: x265 offers the same visual quality at significantly lower bitrates, meaning one can offer an encode with higher
visual quality than x264 at the same filesize, or reduce the filesize of the encoded videos by a large amount while
offering the same visual fidelity as an x264 encode. With no real downsides apart from higher encoding times
and slightly reduced compatibility, there really is no reason not to use it.
</p>
</div>
<p class="subhead"><a id="settings" href="#settings">Useful parameters for encoding with x265</a></p>
<div class="content">
<p>
Just as with x264, x265 has many parameters you can use if you don't want to stick to the presets and
are trying to get the best possible quality. In the following section, I will explain some of these
parameters and how to use them. You can click on each parameter to get more information about it.
A combined example command line is given at the end of this section.<br><br>
</p>
<div class="spoilerbox_expand_element">--preset
<p class="code">
Options: ultrafast, superfast, veryfast, faster, fast, medium, slow, slower, veryslow, placebo<br>
What it does: The further to the right the preset on this list is, the higher the compression efficiency will be at the cost of slowing down your encode.<br>
What to use: Medium or slower is fine, but I would recommend slow or veryslow, depending on how strong your encoding rig is.
Don't use placebo; it will result in greatly increased encoding time with diminishing returns compared to veryslow.
</p>
</div>
<div class="spoilerbox_expand_element">--ref
<p class="code">
Options: An integer from 1 to 16<br>
What it does: Max number of L0 references to be allowed. This number has a linear multiplier effect on the amount of work performed in motion search.<br>
What to use: If --b-pyramid is enabled (which is the default), the HEVC specification only allows
ref 6 as a maximum; without --b-pyramid, the maximum ref allowed by the specification is 7. Generally, you
want to use the highest number possible (within the specification), as it yields the best results.
</p>
</div>
<div class="spoilerbox_expand_element">--rd
<p class="code">
Options: An integer from 1 to 6<br>
What it does: The higher the value, the more exhaustive the RDO analysis is and the more rate distortion optimizations are used.<br>
What to use: The highest option you can afford; in general, the rule "the lower the value, the faster the encode;
the higher the value, the smaller the bitstream" applies. Please note that, in the current version, rd 3 and 4 are identical, as are rd 5 and 6.
</p>
</div>
<div class="spoilerbox_expand_element">--ctu
<p class="code">
Options: 64,32,16<br>
What it does: CTUs and CUs are the logical units in which the HEVC encoder divides a given picture. This
option sets the maximum CU size.<br>
What to use: No reason not to use 64, as it will give you large reductions in bitrate compared to the other two options
with an insignificant increase in computing time.
</p>
</div>
<div class="spoilerbox_expand_element">--min-cu-size
<p class="code">
Options: 64,32,16,8<br>
What it does: CTUs and CUs are the logical units in which the HEVC encoder divides a given picture. This
option sets the minimum CU size.<br>
What to use: Use 8, as it is an easy way to save bitrate without a significant increase in computing time.
</p>
</div>
<div class="spoilerbox_expand_element">--rect, --no-rect
<p class="code">
What it does: Enables the analysis of rectangular motion partitions.<br>
What to use: --rect for better encode results, --no-rect for faster encoding.
</p>
</div>
<div class="spoilerbox_expand_element">--amp, --no-amp
<p class="code">
What it does: Enables the analysis of asymmetric motion partitions.<br>
What to use: --amp for better encode results, --no-amp for faster encoding.
</p>
</div>
<div class="spoilerbox_expand_element">--rskip, --no-rskip
<p class="code">
What it does: This option determines early exit from CU depth recursion.<br>
What to use: Enabling it causes only minimal quality degradation while providing a good performance gain, so
choose according to your priorities.
</p>
</div>
<div class="spoilerbox_expand_element">--rdoq-level
<p class="code">
Options: 0,1,2<br>
What it does: Specifies the amount of rate-distortion analysis to use within quantization.<br>
What to use: The standard is 2, which seems pretty good.
</p>
</div>
<div class="spoilerbox_expand_element">--max-merge
<p class="code">
Options: An integer from 1 to 5<br>
What it does: Maximum number of neighbor candidate blocks that the encoder may consider for merging motion predictions.<br>
What to use: Something from 3 to 5, depending on whether you are aiming for a faster encode or better results.
</p>
</div>
<div class="spoilerbox_expand_element">--me
<p class="code">
Options: dia, hex, umh, star, full<br>
What it does: Motion search method. Diamond search is the simplest. Hexagon search is a little better.
Uneven Multi-Hexagon is an adaptation of the search method used by x264 for slower presets.
Star is a three step search adapted from the HM encoder and full is an exhaustive search.<br>
What to use: Umh for faster encoding, star for better encode results. Dia and hex are not worth the
quality loss and full gives diminishing returns.
</p>
</div>
<div class="spoilerbox_expand_element">--subme
<p class="code">
Options: An integer from 1 to 7<br>
What it does: Amount of subpel refinement to perform. The higher the number, the more subpel iterations and steps are performed.<br>
What to use: Something from 4 to 7, depending on whether you are going for faster encoding or better results.
</p>
</div>
<div class="spoilerbox_expand_element">--merange
<p class="code">
Options: An integer from 0 to 32768<br>
What it does: This is the motion search range.<br>
What to use: The default of 57 seems quite good; you can experiment with higher values if you want, but please
keep in mind that higher values will give you diminishing returns.
</p>
</div>
<div class="spoilerbox_expand_element">--constrained-intra, --no-constrained-intra
<p class="code">
What it does: Constrained intra prediction. The general idea is to block the propagation of reference
errors that may have resulted from lossy signals.<br>
What to use: --no-constrained-intra (which is default) unless you know what you're doing.
</p>
</div>
<div class="spoilerbox_expand_element">--psy-rd
<p class="code">
Options: A float from 0 to 5.0<br>
What it does: Turning on small amounts of psy-rd and psy-rdoq will improve the perceived visual quality,
trading distortion for bitrate. If it is too high, it will introduce visual artifacts.<br>
What to use: A value between 0.5 and 1.0 is a good starting point; you can experiment with higher values if you want, but
don't overdo it unless you like visual artifacts.
</p>
</div>
<div class="spoilerbox_expand_element">--psy-rdoq
<p class="code">
Options: A float from 0 to 50.0<br>
What it does: Turning on small amounts of psy-rd and psy-rdoq will improve the perceived visual quality,
trading distortion for bitrate. High levels of psy-rdoq can double the bitrate, so be careful.<br>
What to use: You should be good to go with a value between 0 and 5.0, but I wouldn't use a value much higher
than 1.0 because I haven't done enough tests yet.
</p>
</div>
<div class="spoilerbox_expand_element">--rc-grain, --no-rc-grain
<p class="code">
What it does: This parameter strictly minimizes QP fluctuations within and across frames and removes pulsing of grain.<br>
What to use: Use this whenever you need to encode grainy scenes, otherwise leave it disabled.
</p>
</div>
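To tie these parameters together, a complete command line for a slow, high-quality encode could look like the sketch
below. This is merely one possible combination of the settings discussed above, not a universal recommendation;
the file names are placeholders, and everything not listed is left to the veryslow preset:
<p class="code">
x265 --preset veryslow --crf 17 --ref 6 --ctu 64 --min-cu-size 8 --rect --amp --rdoq-level 2 --max-merge 5 --me star --subme 7 --merange 57 --psy-rd 1.0 --psy-rdoq 1.0 --aq-mode 3 input.y4m -o output.hevc<br>
</p>
For particularly grainy sources, --rc-grain can be added as described above.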
</div>
</div>