From 033aed5884878ceb671103cca506642cfd426236 Mon Sep 17 00:00:00 2001 From: kageru Date: Sat, 13 Apr 2019 15:16:45 +0200 Subject: [PATCH] initial commit --- adaptivegrain.html | 253 +++++++++++++++++++++++ aoc.html | 82 ++++++++ aoc_postmortem.html | 153 ++++++++++++++ blogs.html | 116 +++++++++++ dependencies.html | 88 ++++++++ edgemasks.html | 379 +++++++++++++++++++++++++++++++++++ expediency.html | 62 ++++++ grain.html | 281 ++++++++++++++++++++++++++ matana.html | 426 +++++++++++++++++++++++++++++++++++++++ mgsreview.html | 374 ++++++++++++++++++++++++++++++++++ removegrain.html | 475 ++++++++++++++++++++++++++++++++++++++++++++ resolutions.html | 343 ++++++++++++++++++++++++++++++++ template.html | 20 ++ videocodecs.html | 353 ++++++++++++++++++++++++++++++++ 14 files changed, 3405 insertions(+) create mode 100644 adaptivegrain.html create mode 100644 aoc.html create mode 100644 aoc_postmortem.html create mode 100644 blogs.html create mode 100644 dependencies.html create mode 100644 edgemasks.html create mode 100644 expediency.html create mode 100644 grain.html create mode 100644 matana.html create mode 100644 mgsreview.html create mode 100644 removegrain.html create mode 100644 resolutions.html create mode 100644 template.html create mode 100644 videocodecs.html diff --git a/adaptivegrain.html b/adaptivegrain.html new file mode 100644 index 0000000..a1a37d5 --- /dev/null +++ b/adaptivegrain.html @@ -0,0 +1,253 @@ + + {% load static %} +
+
+ +
+

Adaptive Graining Methods

+ +
+

Abstract

+ In order to remedy the effects of lossy compression of digital media files, dither is applied to randomize + quantization errors and thus avoid or remove distinct patterns which are perceived as unwanted artifacts. This + can be used to remove banding artifacts by adding random pixels along banded edges which will create the + impression of a smoother gradient. The resulting image will often be more resilient to lossy compression, as the + added information is less likely to be omitted by the perceptual coding algorithm of the encoding software. +
Wikipedia explains it like this: +
+ High levels of noise are almost always undesirable, but there are cases when a certain amount of noise is + useful, for example to prevent discretization artifacts (color banding or posterization). [. . .] Noise + added for such purposes is called dither; it improves the image perceptually, though it degrades the + signal-to-noise ratio. +
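To make the idea concrete, here is a small numpy sketch (purely illustrative, not part of the original article): a smooth gradient is quantized to a handful of levels, once directly (which produces visible steps, i.e. banding) and once with a little random noise added before quantization (which breaks the steps up while the local average still follows the original gradient).

import numpy as np

ramp = np.linspace(0, 1, 1920)                    # smooth horizontal gradient
levels = 16                                       # coarse quantization to force banding

banded = np.round(ramp * (levels - 1)) / (levels - 1)

noise = (np.random.rand(ramp.size) - 0.5) / (levels - 1)
dithered = np.round((ramp + noise) * (levels - 1)) / (levels - 1)
# 'banded' contains 16 hard steps; 'dithered' fluctuates between neighbouring
# steps, which the eye averages back into a smooth gradient.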
+

+ In video encoding, especially regarding anime, this is utilized by debanding filters to improve their + effectiveness and to “prepare” the image for encoding. While grain may be beneficial under some + circumstances, it is generally perceived as an unwanted artifact, especially in brighter scenes where + banding + artifacts would be less likely to occur even without dithering. Most Blu-rays released today will already + have + grain in most scenes which will mask most or even all visual artifacts, but for reasons described here, it may be beneficial to remove this grain.

+

As mentioned previously, most debanding filters will add grain to the image, but in some cases this grain might be either too weak to mask all artifacts or too faint, causing it to be removed by the encoding software, which in turn allows the banding to resurface. In the past, scripts like GrainFactory were written to specifically target dark areas to avoid the aforementioned issues without affecting brighter scenes.

+

This idea can be further expanded by using a continuous function to determine the grain's strength based on the average brightness of the frame as well as the brightness of every individual pixel. This way, the problems described above can be solved with less grain, especially in brighter areas and bright scenes where the dark areas are less likely to be the focus of the viewer's attention. This improves the perceived quality of the image while simultaneously saving bitrate due to the absence of grain in brighter scenes and areas.

+

Demonstration and Examples

+ Since there are two factors that will affect the strength of the grain, we need to analyze the brightness of any + given frame before applying any grain. This is achieved by using the PlaneStats function in Vapoursynth. The + following clip should illustrate the results. The brightness of the current frame is always displayed in the top + left-hand corner. The surprisingly low values in the beginning are caused by the 21:9 black bars. (Don't mind the stuttering in the middle. That's just me being bad) +

+ +
+
You can download the video + if your browser is not displaying it correctly. +
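For reference, the per-frame brightness shown in the overlay can be obtained with std.PlaneStats; the following is only a sketch of how such an overlay could be rendered (the source filter and filename are placeholders, and text.FrameProps is just one way to display the value):

import vapoursynth as vs
core = vs.get_core()

clip = core.ffms2.Source('demo.mkv')              # placeholder source
luma = core.std.ShufflePlanes(clip, 0, vs.GRAY)
stats = core.std.PlaneStats(luma)                 # attaches PlaneStatsAverage (0.0 - 1.0) to every frame
overlay = core.text.FrameProps(stats, props=['PlaneStatsAverage'])
overlay.set_output()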
+ In the dark frames you can see banding artifacts which were created by x264's lossy compression algorithm. + Adding grain fixes this issue by adding more randomness to the gradients.

+ +
+
Download
+ By using the script described above, we are able to remove most of the banding without lowering the crf, + increasing aq-strength, or graining other surfaces where it would have decreased the image quality. +

Theory and Explanation

+ The script works by generating a copy of the input clip in advance and graining that copy. For each frame in the + input clip, a mask is generated based on the frame's average luma and the individual pixel's value. This mask is + then used to apply the grained clip with the calculated opacity. The generated mask for the previously used clip + looks like this:

+ +
+
Download
+ The brightness of each pixel is calculated using this polynomial: +
+ z = (1 - (1.124x - 9.466x^2 + 36.624x^3 - 45.47x^4 + 18.188x^5))^(y^2 * 10) +
where x is the luma of the current pixel, y is the current frame's average luma, and z is the resulting pixel's brightness. The highlighted number (10) is a parameter called luma_scaling which will be explained later.
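To get a feel for the curve, this little sketch (not part of the script, same formula as above) evaluates z for two pixel/frame combinations:

def mask_value(x, y, luma_scaling=10):
    # x: pixel luma, y: average frame luma, both normalized to 0-1
    poly = 1.124*x - 9.466*x**2 + 36.624*x**3 - 45.47*x**4 + 18.188*x**5
    return (1 - poly) ** ((y ** 2) * luma_scaling)

print(mask_value(0.1, 0.2))  # dark pixel in a dark frame: ~0.98, i.e. almost fully grained
print(mask_value(0.8, 0.8))  # bright pixel in a bright frame: ~0.0, i.e. no grain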

+ The polynomial is applied to every pixel and every frame. All luma values are floats between 0 (black) and 1 + (white). For performance reasons the precision of the mask is limited to 8 bits, and the frame + brightness is rounded to 1000 discrete levels. + All lookup tables are generated in advance, significantly reducing the number of necessary calculations.

+

+ Here are a few examples to better understand the masks generated by the aforementioned polynomial.

+ + + + + + +
+

+ Generally, the lower a frame's average luma, the more grain is applied even to the brighter areas. This + abuses + the fact that our eyes are instinctively drawn to the brighter part of any image, making the grain less + necessary in images with an overall very high luma.

+ Plotting the polynomial for all y-values (frame luma) results in the following image (red means more grain and + yellow means less or no grain):
+ +
+ More detailed versions can be found here (100 points per axis) or here (400 points per axis).
+ Now that we have covered the math, I will quickly go over the Vapoursynth script.
+
Click to expand code

+
import vapoursynth as vs +
import numpy as np +
import functools +
+
def adaptive_grain(clip, source=None, strength=0.25, static=True, luma_scaling=10, show_mask=False): +
+
    def fill_lut(y): +
        x = np.arange(0, 1, 1 / (1 << src_bits)) +
        z = (1 - (1.124 * x - 9.466 * x ** 2 + 36.624 * x ** 3 - 45.47 * x ** 4 + 18.188 * x ** 5)) ** (
            (y ** 2) * luma_scaling) * ((1 << src_bits) - 1)
        z = np.rint(z).astype(int) +
        return z.tolist() +
+
    def generate_mask(n, clip): +
        frameluma = round(clip.get_frame(n).props.PlaneStatsAverage * 999)
        table = lut[int(frameluma)] +
        return core.std.Lut(clip, lut=table) +
+
    core = vs.get_core(accept_lowercase=True) +
    if source is None: +
        source = clip +
    if clip.num_frames != source.num_frames: +
        raise ValueError('The length of the filtered and unfiltered clips must be equal')
    source = core.fmtc.bitdepth(source, bits=8) +
    src_bits = 8 +
    clip_bits = clip.format.bits_per_sample +
+
    lut = [None] * 1000 +
    for y in np.arange(0, 1, 0.001): +
        lut[int(round(y * 1000))] = fill_lut(y) +
+
    luma = core.std.ShufflePlanes(source, 0, vs.GRAY) +
    luma = core.std.PlaneStats(luma) +
    grained = core.grain.Add(clip, var=strength, constant=static) +
+
    mask = core.std.FrameEval(luma, functools.partial(generate_mask, clip=luma)) +
    mask = core.resize.Bilinear(mask, clip.width, clip.height) +
+
    if src_bits != clip_bits: +
        mask = core.fmtc.bitdepth(mask, bits=clip_bits) +
+
    if show_mask: +
        return mask +
+
    return core.std.MaskedMerge(clip, grained, mask) +


+ Thanks to Frechdachs for suggesting the use of std.FrameEval.
+
+ In order to adjust for things like black bars, the curves can be manipulated by changing the luma_scaling + parameter. Higher values will cause comparatively less grain even in darker scenes, while lower + values will increase the opacity of the grain even in brighter scenes. + +

+

Usage

The script has four parameters, three of which are optional.

Parameter | [type, default] | Explanation
clip | [clip] | The filtered clip that the grain will be applied to.
strength | [float, 0.25] | Strength of the grain generated by AddGrain.
static | [boolean, True] | Whether to generate static or dynamic grain.
luma_scaling | [float, 10] | Changes the general grain opacity curve. Lower values will generate more grain, even in brighter scenes, while higher values will generate less, even in dark scenes.
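A minimal usage sketch (assuming the function above is available in your script as adaptive_grain; the source filter and debanding settings are placeholders, not recommendations):

import vapoursynth as vs
core = vs.get_core()

src = core.ffms2.Source('episode.mkv')
deb = core.f3kdb.Deband(src, grainy=0, grainc=0)   # deband without f3kdb's own grain
out = adaptive_grain(deb, strength=0.3, luma_scaling=8)
# out = adaptive_grain(deb, show_mask=True)        # or inspect the opacity mask first
out.set_output()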
+
+

Closing Words

+

Grain is a type of visual noise that can be used to mask discretization artifacts if used correctly. Too much grain will degrade the perceived quality of a video, while too little grain might be destroyed by the perceptual coding techniques used in many popular video encoders.

+

+ The script described in this article aims to apply the optimal amount of grain to all scenes to prevent + banding artifacts without having a significant impact on the perceived image quality or the required + bitrate. + It does this by taking the brightness of the frame as a whole and every single pixel into account and + generating an opacity mask based on these values to apply grain to certain areas of each frame. This can be + used to supplement or even replace the dither generated by other debanding scripts. The script has a + noticeable but not significant impact on encoding performance. +

+ +

There's probably a much simpler way to do this, but I like this one. fite me m8 +
+
diff --git a/aoc.html b/aoc.html new file mode 100644 index 0000000..f214f10 --- /dev/null +++ b/aoc.html @@ -0,0 +1,82 @@ + + + {% load static %} +
+
+
+

Challenges for the Advent of Code or
“How to Bite Off More Than You Can Chew, but not too much”

+
+

Introduction

+

+ Humans love challenges.
+ We want to prove something to our friends, our coworkers, our peers, or even just ourselves.
+ We want to push ourselves to our limits, maybe even beyond them, and grow into more than what we are.
+ We want to show the world that we can do something, whatever that may be.
+ We want to get just a tiny bit closer to the optimal version, to the person we could become.
+

+

+ That’s one reason to do Advent of Code. The other — and much more common — is probably just boredom.
+ For those unaware: Advent of Code is like an advent calendar where there’s a coding task hidden behind each door. You receive a randomized input (actually a randomly selected one from a precomputed set because they can’t store and solve a new puzzle for every user), and you have to enter your solution on the website. How you get that solution doesn’t matter. That should give you a basic idea. Onto the actual content of this. +

+

Learning From Past Experiences

+

+ Last year, I failed the Advent of Code. And not just a little bit. I really mean I failed. + I wanted to use it as an opportunity to learn Haskell, so my plan was to implement all 24 tasks in Haskell. + This should, in theory, give me enough practice to understand the basics and become at least reasonably comfortable in the language. +

+

+ Reality was… not quite that.
+ Not only did I not finish even a single puzzle in Haskell, but it deterred me from even trying to learn any functional language for quite a while (that is, until today and ongoing). + I then switched to Nim because reasons, but I didn’t finish all 24 in that either. I ended up skipping ~70% of all days. +

+

+ What do we learn from that?
+ Well, for one, we shouldn’t just plan tasks of completely unknown complexity. “It will be alright” is not how you approach software. I’ve done that, and it works more often than it should, but this is really a case where it was just silly. + For two, (I know that’s not how english works; don’t @ me) I needed variety. Just picking one language may work for other people and it’s certainly useful if you want to learn that language, but it wasn’t enough to keep me motivated. +

+

Going Forward

+

+ So what did I change this year? Simple. +

    +
  • I will not use programming paradigms that are completely alien to me
  • I will use more than just one language to keep things interesting
  • I will brag as openly as possible about my plans so people can publicly shame me for not following them
+ I know that there are people who try “24 days, 24 languages”, but I’m not quite that insane. For me, 8 languages should be enough. I’m giving myself three tickets for each language. It is up to me which language I will use for any given day, as long as I have at least one ticket left for that language.
+ I’ve created a git repo for this challenge here.
+ The number of remaining tickets for each language can be tracked in the tickets.md file.
+ Edit: looks like there are 25 Days, and I misjudged a few things, so I had to change my plans here on day one. Still keeping this section, just because.
+ The languages I have chosen are (in alphabetical order): +
    +
  1. C
  2. Go
  3. Java
  4. Javascript
  5. Kotlin
  6. Nim
  7. Python
  8. Rust
+ That puts us at 6 compiled and 2 interpreted languages. Out of these, I would rate myself the most proficient in Python and the least proficient in C. Contrary to popular belief, you can be a software developer without ever having touched C.
+
I would like to add that I’m not necessarily a fan of all of these, especially Java. However, since I’m currently working on a ~1,000,000 loc Java project as my fulltime job, not including it here just felt wrong.
+ To show my remorse and to give me a very early impression of the suffering that this decision will bring with it, I’m typing this paragraph in ed, the default editor. + It really is quite the experience. The creativity of the people who wrote it is admirable. You wouldn’t believe how many convenience features you can add to an editor that only displays one line at a time. (Let me also remind you that ed was used before monitors were a thing, so printing lines wasn’t free either.) + This is, unironically, a better editor than most modern default editors (referring mostly to Windows Notepad and whatever OS X uses).
+ Oh, and ed supports regex. Everything is better if it supports regex. But I digress. +
+ Apart from the JVM languages, I’ll write all of the solutions in vim, using only Syntastic and the language’s respective command line compilers. + Java is just too verbose to be used without introspection and autocompletion. At least for my taste. + vim can be really cool to use, and I love it for most languages, but Java is just not one of them. Still typing this in ed, btw. :^) +

+

+ If I come across a solution that seems particularly interesting to me, I might share it here, but I doubt that will happen. + Let’s see how far I can get this year. Expect a recap of this year’s inevitable failure here around the end of the year. +

+

+ Edit: after finding out that JS has basically no consistent way of reading files, I removed that from the list. It’s garbage anyway, and I’m not writing node.js. Since all of this doesn’t really work with 25 days either, I’ve also decided to drop two more languages to get me to 5x5.
+ That left me with C, Go, Kotlin, Python, and Rust. 5 days each. +

+ I thought about including Haskell in this list, but I decided not to repeat history… at least for now +
+
+ diff --git a/aoc_postmortem.html b/aoc_postmortem.html new file mode 100644 index 0000000..e245e9c --- /dev/null +++ b/aoc_postmortem.html @@ -0,0 +1,153 @@ + + + {% load static %} +
+
+ +
+

Advent of Code: Postmortem

+
+

+ Looks like it’s that time of the year again. I’m on a train with nothing to do, and I’ve been procrastinating this for months. +

+

+ Last year, I attempted a challenge for Advent of Code. + If you want a detailed explanation, check out the blog post I wrote back then. + tl;dr: I had a set of 5 programming languages for the 25 tasks. I had to use each of them 5 times. +

+

+ Even though I failed, it didn’t turn out that bad. + I finished 14.5 days. 1-14 are fully implemented, and for 16, I did the first half. Then, I stopped.

+ I’m not here to make excuses, but humor me for a paragraph before we get to the actual content of this. +

+

+ I really disliked day 15. + Not because it was too hard, but because it would have been boring to implement while also being time-consuming due to the needless complexity. + I just didn’t feel like doing that, and that apparently set a dangerous precedent that would kill the challenge for me.

+ So there’s that. Now for the interesting part: +

+

The part you’re actually here for

+

+ I tried out languages. Languages that I had never really used before. Here are my experiences: +

+

C

+

+ Days finished: 4

+ Before the challenge, I knew absolutely nothing about C. I had never allocated memory, incremented a pointer, or used section 3 of man. That’s why I wanted to get this out of the way as quickly as possible.

+ C is interesting. + It lets you do all these dirty things, and doing them was a kind of guilty pleasure for me. + Manually setting nullbytes or array pointers around. + Nothing about that is special, but other languages just don’t let you. + You know it’s bad, and that only makes it better.

+ Would I use C for other private projects? No. Definitely not. + I just don’t see the point in $currentYear. But was it interesting? You bet. +

+

Go

+

+ Days finished: 3

+ Allegedly, this language was made by some very smart people. + They may have been smart, but they may have created the most boring programming language in existence. + It feels bad to write, and it’s a terrible choice when dealing mostly with numbers (as is the case with most AoC puzzles). + It’s probably great for business logic, and I can say from personal experience that it also works quite well for web development and things like discord bots. + But to me, not having map/reduce/filter/etc. just makes a language really unenjoyable and verbose.

+ Writing Go for AoC sometimes felt like writing a boring version of C (TL note: one that won’t segfault and memory leak your ass off if you don’t pay attention). +

+ People say it’s more readable and all that, and that’s certainly great for huge projects, but for something like this… I wouldn’t ever pick Go voluntarily.

+ And also not for anything else, to be perfectly honest. I mean… just look at this. + (Yes, I wrote this. Scroll down and use the comment section to tell me that I’m just too dumb to understand Go.) +

+
package main
+
+import "fmt"
+
+func main() {
+    // This declares a mutable variable.
+    var regularString = "asdf"
+    // This also declares a mutable variable.
+    // I have no idea why there are two ways of doing this.
+    unicodeString := "aä漢字"
+    for char := range(unicodeString) {
+        // You might expect that this prints the characters individually,
+        fmt.Println(char)
+        /*
+         * Instead, it compiles and prints (0, 1, 3, 6) -- the index of the first byte of each character.
+         * Very readable and very intuitive. Definitely what the user would want here.
+         */
+    }
+    for _, char := range(unicodeString) {
+         /*
+          * Having learned from our past mistakes, we assign the index to _ to discard it.
+          * Surely this time.
+          */
+         fmt.Println(char)
+         /*
+          * Or not because this prints (97, 228, 28450, 23383) -- the unicode indices of the characters.
+          * Printing a rune (the type Go uses to represent individual characters,
+          * e.g. during string iteration) actually prints its integer value.
+          */
+    }
+    for _, char := range(unicodeString) {
+        /*
+         * This actually does what you’d expect.
+         * It also handles unicode beautifully, instead of just iterating over the bytes.
+         */
+         fmt.Printf("%c\n", char)
+    }
+    /*
+     * So go knows what a character is and how many of those are in a string when iterating.
+     * Intuitively, this would also apply to the built-in len() function.
+     * However...
+     */
+    fmt.Println(len(regularString)) // prints 4
+    fmt.Println(len(unicodeString)) // prints 9
+}
+

+ Oh, and there are no generics. Moving on. +

+

Kotlin

+

+ Days finished: 3

+ Oh Kotlin. Kotlin is great. A few weeks after AoC, I actually started writing Kotlin at work, and it’s lovely. + It fixes almost all of the issues people had with Java while maintaining perfect interoperability, and it’s also an amazing language just by itself.

I like the way Kotlin uses scopes and lambdas for everything, and the elvis operator makes dealing with nullability much easier. Simple quality of life improvements (like assignments from try/catch blocks), things like let() and use(), proper built-in singletons, and probably more that I’m forgetting make this a very pleasant language. Would recommend.

+ In case you didn’t know: Kotlin even compiles to native binaries if you’re not using any Java libraries (although that can be hard because you sometimes just need the Java stdlib). +

+

Python

+

+ Days finished: 2

+ I don’t think there’s much to say here. + Python was my fallback for difficult days because I just feel very comfortable writing it. + The standard library is the best thing since proper type inference, and it supports all the syntactic sugar that anyone could ask for. + If a language is similar to Python, I’ll probably like it.

+ Yes, I’ve tried nim. +

+

Rust

+

+ Days finished: 2

+ Rust is… I don’t even know. + But first things first: I like Rust.
+ I like its way of making bad code hard to write.
+ I like the crate ecosystem.
+ I like list operations and convenience functions like sort_by_key.
+ I like immutability by default.
+ I like generics (suck it, Go).

+ Not that I didn’t have all kinds of issues with it, but Rust made me feel like those issues were my fault, rather than the fault of the language. + I also wouldn’t say I feel even remotely comfortable with the borrow checker -- it sometimes (or more often than I’d like to admit) still felt like educated trial and error. + I’m sure this gets better as you grow more accustomed to the language, and so far I haven’t encountered anything that would be a deal breaker for me.

+ Rust might even become my go-to language for performance-sensitive tasks at some point. + It definitely has all of the necessary tools. + Unlike some languages that leave you no room for optimization with intrinsics or similar magic. (Why do I keep going back to insulting Go? Why does Go keep giving me reasons for doing so?)

+ The borrow checker will likely always be a source of issues, but I think that is something that is worth getting used to. + The ideas behind it are good enough to justify the hassle. +

+

See you next year. Maybe.

+

+ I underestimated just how little time and motivation I’d have left after an 8-hour workday that already mostly consists of programming.

+ It was fun, though, and I’ll probably at least try something similar next year.

+ Let’s see what stupid challenge I come up with this time. +

+ Did anyone actually expect me to succeed? +
+
+ diff --git a/blogs.html b/blogs.html new file mode 100644 index 0000000..314c48f --- /dev/null +++ b/blogs.html @@ -0,0 +1,116 @@ + + + {% load static %} +
+
+ +
+

+ Writing Blogs out of Boredom +

+
+

+ Yes, blogs. Not blog posts. We’ll get into that in a second.
+ This entire thing is also really unpolished, but I probably won’t ever find the motivation to make it better. Keep that in mind, and don’t expect too much. +

+

+Introduction +

+

More often than not, we find ourselves bored. The modern world, brimming with excitement, variety, colors, lights, and stimuli, fails to entertain us. We could have anything, but we want nothing. While this is certainly a serious issue that merits pages upon pages of psychological and even philosophical discussion, it won’t be our topic for today. Today, we’re looking at the other side of the digital age. Those rare occurrences where you have nothing, but would take anything.
+ Okay, maybe not quite, but let’s go with that. +

+

+ A few weeks ago, I found myself on a rather lengthy train trip. No books, close to no reception, no one to talk to. Just me, a bag of clothes, and a laptop without internet. + + [1] + +
+ So I did what every sane person would do. I programmed something. Not because I really needed it, but because it kept me busy, and hey, maybe someone someday might get some use out of it. + As has become the norm for me, I wanted to do something that was in some way new or unknown to me. In this case, template rendering and persistent storage in Go. +

+
+ [1] Incidentally, I’m on a train right now. No books, no reception, just a laptop with… you get the gist.
+ + Although there is a moderately attractive girl sitting not too far from me this time.
+ She just seems awfully immersed into her MacBook, and I don’t want to interrupt her.
Also >Apple +
+
+

+ The idea of the blog is quite simple, and I don’t feel bad admitting that I took inspiration from a friend who had designated channels on his discord server that were used as micro blogs by different users. One channel per user. + Mimicking that, I created a simple database schema that would store a message, an author, and a timestamp.[2] + The template would then query the last 50 entries from that database and display them on the website like an IRC log.

+ And that was the frontend done.
+ This is what it looked like, in case you’re curious. +

+
+ [2] Of course, an auto-incrementing ID was also added, but that’s not really relevant here. +
+

+ What, Why, How? +

+ The reality is that the frontend was the very last part I implemented, but I already had a general idea, and explaining it first makes more sense to an outsider. + Some of you might also be thinking +

+

+ “What do you mean »the frontend was done«? You can’t publish anything on that blog.” +

+

+ You see, almost everything is optional. Sure, you might want to publish something, but do you really need a dedicated page for that? As long as the server knows who you are and what you want to write (and that you have the permissions to do so), it’s fine. + Implementing a login page in the frontend seemed a bit overkill, and requiring you to copy-paste a token into a password field for every line that you want to publish is also inconvenient at best.
+ And why would we want that anyway? +

+

+ The Best UI There Is… +

+

There is a very good article about terminals and what we can learn from them when designing UIs. However, in our efforts to make UIs that are, at least in some regards, like a terminal, we shouldn’t forget that some UIs might as well be terminals.
+ And so I did the only logical thing. I implemented an endpoint that opens a port and listens for POST requests containing a JSON.
+ That way, publishing can be automated with a simple shell script that uses curl to send the requests.[3] +

+
+function publish {
+    curl your_blog.com/add -d "{\"content\": \"$line\", \"Secret\": \"your_password\", \"author\": \"kageru\"}" -H "Content-Type: application/json"
+}
+
+read line
+while [[ ! -z "$line" ]]; do
+    publish "$line"
+    read line
+done
+
+
+

+ This simple script will continue to read lines from STDIN (i. e. whatever you type before pressing return) and publish them as individual entries on the blog. It exits if you enter an empty line. +

+

+ Now tell me, did you really need a website with a full HTML form for that? +

+
+ [3] Did I mention this only works on *nix? Though you could probably create a very similar script in PowerShell. +
+

+ …Sometimes +

+

+ I won’t deny that this is ugly in a few ways.
+ Having your password stored in plain text in a shell script like this is certainly a huge security flaw, but since no real harm can be done here (all messages are HTML escaped, so no malicious content could be embedded in the website by an attacker), I consider this an acceptable solution for what it is _ a proof of concept that was created because I was bored on a train. Just like I am now, as I’m writing this. + The way the backend handles the requests is also anything but beautiful, but it does the job surprisingly well (if, like me, you only want a single user). You don’t even need a separate config file to store your password because the hash is in the source code and compiled into the binary.
+ Isn’t this a great example of how time constraints and spontaneous solutions can make software terrible? + + This is why, when a developer tells you he will need 3 months, you don’t force him to do it in 3 weeks. + +

+

+ Anyway, I think that’s it. I don’t know if this will be interesting, entertaining, enlightening, or whatever to anyone, but even if it doesn’t, it still kept me busy for the past hour. I still have almost two hours ahead of me, but I’ll find a way to keep myself entertained. +

+ + The girl I mentioned earlier stashed away her MacBook, but it looks like she’s going to get off the train soon. Tough luck. + + +
+
+ diff --git a/dependencies.html b/dependencies.html new file mode 100644 index 0000000..a41cb7a --- /dev/null +++ b/dependencies.html @@ -0,0 +1,88 @@ + + + {% load static %} +
+
+
+

Vapoursynth: Are We Too Afraid Of Dependencies?

+
+

Introduction

+

+ Now, if you’re anything like me, you’ll probably react to that title with “wtf, no, we have way too many of them”. + But hear me out. While it is true that most Vapoursynth “funcs” have dozens of dependencies that are sometimes poorly (or not at all) documented, + we might be cutting corners where it matters most. Most as measured by “where it affects most users”. +

+

+ Fundamentally, there are two groups of developers in the Vapoursynth “community” (and I use that term very loosely): +

    +
  • People who write plugins
  • People who write Python functions
+ Myrsloik, the head developer of Vapoursynth, has set everything up in a way that facilitates a structured plugin ecosystem. + Every plugin can reserve a namespace, and all of its functions have to live there. + You can’t have a denoiser and a color shift in the same namespace without making things really confusing, both for you as a dev and for the users.
+ This is good. It should be like this. But here’s an issue: +

+

Functions are a separate ecosystem

+

Funcs (and I’ll use that term to refer to collections of functions such as havsfunc or mvsfunc) are fundamentally different from plugins. Most importantly for the user, they need to be imported manually. The namespacing is handled by Python. But even Python can’t save you if you don’t let it.

+

+ Probably the most popular func out there is havsfunc. + At the time of writing, it consists of 32 main functions and 18 utility functions. The other big funcs paint a similar picture. + For some reason, the convention has become to dump everything you write into a single Python file and call it a day. + When I started using Vapoursynth, this trend was already established, so I didn’t really think about it and created my own 500-line monstrosity with no internal cohesion whatsoever. + We don’t care if our func depends on 20 plugins, but God forbid a single encoder release two Python modules that have to be imported separately. + This is what I mean by “we’re afraid of dependencies”. We want all of our code in one place with a single import.
+ It is worth pointing out that not everyone is doing this. People like dubhater or IFeelBloated exist, but the general consensus in the community (if such a thing even exists) seems to be strongly in favor of monolithic, basically unnamed script collections. + This creates multiple problems: +

+

The Barrier of Entry

+

+ I don’t think anyone can deny that encoding is a very niche hobby with a high barrier of entry. + You won’t find many people who know enough about it to teach you properly, and it’s easy to be overwhelmed by its complexity.
+ To make matters worse, if you’re just starting out, you won’t even know where to look for things. + Let’s say you’re a new encoder who has a source with a few halos and some aliasing, so you’re looking for a script that can help with that. + Looking at the documentation, your only options are Vine for the halos and vsTAAmbk for the aliasing. + There is no easy way for you to know that there are dehalo and/or AA functions in havsfunc, fvsfunc, muvsfunc, … you get the point. + We have amazing namespacing and order for plugins, but our scripts are a mess that is at best documented by a D9 post or the docstrings. + This is how you lose new users, who might have otherwise become contributors themselves.
+ But I have a second issue with the current state of affairs: +

+

Code Duplication

+

+ As mentioned previously, each of these gigantic functions comes with its own collection of helpers and utilities. + But is it really necessary that everyone writes their own oneliner for splitting planes, inserting clips, function iteration, and most simple mask operations? + The current state, hyperbolically speaking, is a dozen “open source” repositories with one contributor each and next to no communication between them. + The point of open source isn’t just that you can read other people’s code. It’s that you can actively contribute to it. +
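To illustrate the kind of one-liners that get copied into every func, here is a sketch of two typical helpers (the names are made up for this example and not taken from any existing module):

import vapoursynth as vs
core = vs.get_core()

def get_luma(clip):
    # extract the luma plane; rewritten in almost every *func out there
    return core.std.ShufflePlanes(clip, planes=0, colorfamily=vs.GRAY)

def iterate(clip, filter_func, count):
    # apply the same filter multiple times, e.g. iterate(mask, core.std.Maximum, 3)
    for _ in range(count):
        clip = filter_func(clip)
    return clip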

+

The Proposal

+

So what do I want? Do I propose a new system wherein each function gets its own Python module? No. God, please no. I accept that we won’t be able to clean up the mess that has been created. That, at least in part, I have created. But maybe we can start fixing it at least a little bit to make it easier for future encoders. Actually utilizing open source seems like it would benefit everyone, the script authors as well as the users. Maybe we could start with a general vsutil that contains all the commonly-used helpers and utilities. That way, if something in Vapoursynth is changed, we only have to change one script instead of 20. This should particularly help then-unmaintained scripts which won’t break quite as frequently. The next step would be an attempt to combine functions of a specific type into modules, although this might get messy as well if not done properly. Generally, splitting by content rather than splitting by author seems to be the way.

+

I realize that I am in no real position to do this, but I at least want to try this for my own kagefunc, and I know at least a few western encoders who would be willing to join. We’ve been using a GitHub organization for this for a while, and I think this is the way to go forward. It would also allow some sort of quality control and code review, something that has been missing for a long time now.

+

+ I’ll probably stream my future work on Vapoursynth-related scripts (and maybe also some development in general) on my Twitch channel. + Feel free to follow if you’re interested in that or in getting to know the person behind these texts. I’ll also stream games there (probably more games than coding, if I’m being honest), so keep that in mind. +

+

+ Edit: It has been pointed out to me that vsdb exists to compensate for some of the issues described here. + I think that while this project is very helpful for newcomers, it doesn’t address the underlying issues and just alleviates the pain for now. +

+ I just had to plug my Twitch there, didn’t I? +
+
+ diff --git a/edgemasks.html b/edgemasks.html new file mode 100644 index 0000000..6f28fe0 --- /dev/null +++ b/edgemasks.html @@ -0,0 +1,379 @@ + + {% load static %} +
+
+ +
+ +

Edge Masks

+
+

Table of contents

+ +

Abstract

+

+ Digital video can suffer from a plethora of visual artifacts caused by lossy compression, transmission, + quantization, and even errors in the mastering process. + These artifacts include, but are not limited to, banding, aliasing, + loss of detail and sharpness (blur), discoloration, halos and other edge artifacts, and excessive noise.
+ Since many of these defects are rather common, filters have been created to remove, minimize, or at least + reduce their visual impact on the image. However, the stronger the defects in the source video are, the more + aggressive filtering is needed to remedy them, which may induce new artifacts. +

+

+ In order to avoid this, masks are used to specifically target the affected scenes and areas while leaving + the + rest unprocessed. + These masks can be used to isolate certain colors or, more importantly, certain structural components of an + image. + Many of the aforementioned defects are either limited to the edges of an image (halos, aliasing) or will + never + occur in edge regions (banding). In these cases, the unwanted by-products of the respective filters can be + limited by only applying the filter to the relevant areas. Since edge masking is a fundamental component of + understanding and analyzing the structure of an image, many different implementations were created over the past few + decades, many of which are now available to us.

+

In this article, I will briefly explain and compare different ways to generate masks that deal with the + aforementioned problems.

+

Theory, examples, and explanations

+

Most popular algorithms try to detect abrupt changes in brightness by using convolutions to analyze the direct neighbourhood of the reference pixel. Since the computational complexity of a convolution is O(n²) (where n is the radius), the radius should be as small as possible while still maintaining a reasonable level of accuracy. Decreasing the radius of a convolution will make it more susceptible to noise and similar artifacts.

+

Most algorithms use 3x3 convolutions, which offer the best balance between speed and accuracy. Examples are + the operators proposed by Prewitt, Sobel, Scharr, and Kirsch. Given a sufficiently clean (noise-free) + source, 2x2 convolutions can also be used[src], but with modern + hardware being able to calculate 3x3 convolutions + for HD video in real time, the gain in speed is often outweighed by the decreased accuracy.

+

To better illustrate this, I will use the Sobel operator to compute an example image.
Sobel uses two convolutions to detect edges along the x and y axes. Note that you either need two separate convolutions per axis or one convolution that returns the absolute values of each pixel, rather than 0 for negative values. The two 3x3 kernels are:

-1 -2 -1        -1  0  1
 0  0  0        -2  0  2
 1  2  1        -1  0  1
+

+ Every pixel is set to the highest output of any of these convolutions. A simple implementation using the + Convolution function of Vapoursynth would look like this:

+
def sobel(src):
+    sx = src.std.Convolution([-1, -2, -1, 0, 0, 0, 1, 2, 1], saturate=False)
+    sy = src.std.Convolution([-1, 0, 1, -2, 0, 2, -1, 0, 1], saturate=False)
+    return core.std.Expr([sx, sy], 'x y max')
Fortunately, Vapoursynth has a built-in Sobel function, core.std.Sobel, so we don't even have to write our own code.

Hover over the following image to see the Sobel edge mask.

+
+

Of course, this example is highly idealized. All lines run parallel to either the x or the y axis, there are + no small details, and the overall complexity of the image is very low.

+ Using a more complex image with blurrier lines and more diagonals results in a much more inaccurate edge mask. +
+

+ A simple way to greatly improve the accuracy of the detection is the use of 8-connectivity rather than + 4-connectivity. This means utilizing all eight directions of the Moore neighbourhood, i.e. also using the + diagonals of the 3x3 neighbourhood.
+ To achieve this, I will use a convolution kernel proposed by Russel A. Kirsch in 1970[src]. +

 5  5  5
-3  0 -3
-3 -3 -3
+ This kernel is then rotated in increments of 45° until it reaches its original position.
+ Since Vapoursynth does not have an internal function for the Kirsch operator, I had to build my own; again, + using the internal convolution. + +
def kirsch(src):
+    kirsch1 = src.std.Convolution(matrix=[ 5,  5,  5, -3,  0, -3, -3, -3, -3])
+    kirsch2 = src.std.Convolution(matrix=[-3,  5,  5,  5,  0, -3, -3, -3, -3])
+    kirsch3 = src.std.Convolution(matrix=[-3, -3,  5,  5,  0,  5, -3, -3, -3])
+    kirsch4 = src.std.Convolution(matrix=[-3, -3, -3,  5,  0,  5,  5, -3, -3])
+    kirsch5 = src.std.Convolution(matrix=[-3, -3, -3, -3,  0,  5,  5,  5, -3])
+    kirsch6 = src.std.Convolution(matrix=[-3, -3, -3, -3,  0, -3,  5,  5,  5])
+    kirsch7 = src.std.Convolution(matrix=[ 5, -3, -3, -3,  0, -3, -3,  5,  5])
+    kirsch8 = src.std.Convolution(matrix=[ 5,  5, -3, -3,  0, -3, -3,  5, -3])
+    return core.std.Expr([kirsch1, kirsch2, kirsch3, kirsch4, kirsch5, kirsch6, kirsch7, kirsch8],
+            'x y max z max a max b max c max d max e max')
+

It should be obvious that the cheap copy-paste approach is not acceptable to solve this problem. Sure, it works, but I'm not a mathematician, and mathematicians are the only people who write code like that. Also, yes, you can pass more than three clips to std.Expr, even though the documentation says otherwise.
Or maybe my + limited understanding of math (not being a mathematician, after all) was simply insufficient to properly + decode “Expr evaluates an expression per + pixel for up to 3 input clips.”

+ Anyway, let's try that again, shall we? +
def kirsch(src: vs.VideoNode) -> vs.VideoNode:
+    w = [5]*3 + [-3]*5
+    weights = [w[-i:] + w[:-i] for i in range(4)]
+    c = [core.std.Convolution(src, (w[:4]+[0]+w[4:]), saturate=False) for w in weights]
+    return core.std.Expr(c, 'x y max z max a max')
+

Much better already. Who needed readable code, anyway?

+ If we compare the Sobel edge mask with the Kirsch operator's mask, we can clearly see the improved accuracy. + (Hover=Kirsch)
+
+

The higher overall sensitivity of the detection also results in more noise being visible in the edge mask. + This can be remedied by denoising the image prior to the analysis.
+ The increase in accuracy comes at an almost negligible cost in terms of computational complexity. About + 175 fps + for 8-bit 1080p content (luma only) compared to 215 fps with the previously shown sobel + ‘implementation’. The internal Sobel filter is + not used for this comparison as it also includes a high- and lowpass function as well as scaling options, + making it slower than the Sobel function above. Note that many of the edges are also detected by the Sobel + operator, however, these are very faint and only visible after an operation like std.Binarize.

+ A more sophisticated way to generate an edge mask is the TCanny algorithm which uses a similar procedure to find + edges but + then reduces these edges to 1 pixel thin lines. Optimally, these lines represent the middle + of each edge, and no edge is marked twice. It also applies a gaussian blur to the image to eliminate noise + and other distortions that might incorrectly be recognized as edges. The following example was created with + TCanny using these settings: core.tcanny.TCanny(op=1, + mode=0). op=1 uses a modified operator that has been shown to achieve better signal-to-noise ratios[src].
+
+ Since I've already touched upon bigger convolutions earlier without showing anything specific, here is an + example of the things that are possible with 5x5 convolutions. +
src.std.Convolution(matrix=[1,  2,  4,  2, 1,
+                            2, -3, -6, -3, 2,
+                            4, -6,  0, -6, 4,
+                            2, -3, -6, -3, 2,
+                            1,  2,  4,  2, 1], saturate=False)
+ + This was an attempt to create an edge mask that draws around the edges. With a few modifications, this might + become useful for + halo removal or edge cleaning. (Although something similar (probably better) can be created with a regular edge + mask, std.Maximum, and std.Expr) +

Using edge masks

+

+ Now that we've established the basics, let's look at real world applications. Since 8-bit video sources are + still everywhere, barely any encode can be done without debanding. As I've mentioned before, restoration + filters + can often induce new artifacts, and in the case of debanding, these artifacts are loss of detail and, for + stronger debanding, blur. An edge mask could be used to remedy these effects, essentially allowing the + debanding + filter to deband whatever it deems necessary and then restoring the edges and details via + std.MaskedMerge.
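As a rough sketch of that workflow (the debanding settings are placeholders, and all three clips have to end up in the same format before the merge):

deband = core.f3kdb.Deband(src, range=16, y=48, cb=48, cr=48, grainy=0, grainc=0)
mask = kirsch(core.std.ShufflePlanes(src, 0, vs.GRAY))   # edge mask from the luma plane
merged = core.std.MaskedMerge(deband, src, mask)         # restore edges/detail from the source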

+

+ GradFun3 internally generates a mask to do exactly this. f3kdb, the other popular debanding filter, + does not have any integrated masking functionality.

+ Consider this image:
+ +

+ As you can see, there is quite a lot of banding in this image. Using a regular debanding filter to remove it + would likely also destroy a lot of small details, especially in the darker parts of the image.
+ Using the Sobel operator to generate an edge mask yields this (admittedly rather disappointing) result:

+ +

+ In order to better recognize edges in dark areas, the retinex + algorithm can be used for local contrast enhancement.

+ +
The image after applying the retinex filter, luma only.
+

+ We can now see a lot of information that was previously barely visible due to the low contrast. One might + think + that preserving this information is a vain effort, but with HDR-monitors slowly making their way into the + mainstream and more possible improvements down the line, this extra information might be visible on consumer + grade screens at some point. And since it doesn't waste a noticeable amount of bitrate, I see no harm in + keeping it.

+ Using this newly gained knowledge, some testing, and a little bit of magic, we can create a surprisingly + accurate edge mask. +
def retinex_edgemask(luma, sigma=1):
+    ret = core.retinex.MSRCP(luma, sigma=[50, 200, 350], upper_thr=0.005)
+    return core.std.Expr([kirsch(luma), ret.tcanny.TCanny(mode=1, sigma=sigma).std.Minimum(
+        coordinates=[1, 0, 1, 0, 0, 1, 0, 1])], 'x y +')
+

Using this code, our generated edge mask looks as follows:

+ +

+ By using std.Binarize (or a similar lowpass/highpass function) and a few std.Maximum and/or std.Inflate + calls, we can transform this edgemask into a + more usable detail mask for our debanding function or any other function that requires a precise edge mask. +
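For example (the threshold is a rough guess for a 16-bit mask and would need tuning per source):

mask = retinex_edgemask(luma)                        # the mask generated above
mask = core.std.Binarize(mask, threshold=10000)      # keep only reasonably strong edges
mask = core.std.Maximum(mask).std.Maximum()          # grow the edges to cover nearby detail
mask = core.std.Inflate(mask)                        # soften the mask's borders
# the result can then be used in std.MaskedMerge as described above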

+

Performance

Most edge mask algorithms are simple convolutions, allowing them to run at over 100 fps even for HD content. A complex algorithm like retinex can obviously not compete with that, as is evident by looking at the benchmarks. While a simple edge mask with a Sobel kernel ran consistently above 200 fps, the function described above only produces 25 frames per second. Most of that speed is lost to retinex, which, if executed alone, yields about 36.6 fps. A similar, albeit more inaccurate, way to improve the detection of dark, low-contrast edges would be applying a simple curve to the brightness of the image.
bright = core.std.Expr(src, 'x 65535 / sqrt 65535 *')
+ This should (in theory) improve the detection of dark edges in dark images or regions by adjusting their + brightness + as shown in this curve: + +

Conclusion

Edge masks have been a powerful tool for image analysis for decades now. They can be used to reduce an image to its most essential components and thus significantly facilitate many image analysis processes. They can also be used to great effect in video processing to minimize unwanted by-products and artifacts of more aggressive filtering. Using convolutions, one can create fast and accurate edge masks, which can be customized and adapted to serve any specific purpose by changing the parameters of the kernel. The use of local contrast enhancement to improve the detection accuracy of the algorithm was shown to be possible, albeit significantly slower.

+
# Quick overview of all scripts described in this article:
+################################################################
+
+# Use retinex to greatly improve the accuracy of the edge detection in dark scenes.
+# draft=True is a lot faster, albeit less accurate
+def retinex_edgemask(src: vs.VideoNode, sigma=1, draft=False) -> vs.VideoNode:
+    core = vs.get_core()
+    src = mvf.Depth(src, 16)
+    luma = mvf.GetPlane(src, 0)
+    if draft:
+        ret = core.std.Expr(luma, 'x 65535 / sqrt 65535 *')
+    else:
+        ret = core.retinex.MSRCP(luma, sigma=[50, 200, 350], upper_thr=0.005)
+    mask = core.std.Expr([kirsch(luma), ret.tcanny.TCanny(mode=1, sigma=sigma).std.Minimum(
+        coordinates=[1, 0, 1, 0, 0, 1, 0, 1])], 'x y +')
+    return mask
+
+
+# Kirsch edge detection. This uses 8 directions, so it's slower but better than Sobel (4 directions).
+# more information: https://ddl.kageru.moe/konOJ.pdf
+def kirsch(src: vs.VideoNode) -> vs.VideoNode:
+    core = vs.get_core()
+    w = [5]*3 + [-3]*5
+    weights = [w[-i:] + w[:-i] for i in range(4)]
+    c = [core.std.Convolution(src, (w[:4]+[0]+w[4:]), saturate=False) for w in weights]
+    return core.std.Expr(c, 'x y max z max a max')
+
+
+# should behave similar to std.Sobel() but faster since it has no additional high-/lowpass or gain.
+# the internal filter is also a little brighter
+def fast_sobel(src: vs.VideoNode) -> vs.VideoNode:
+    core = vs.get_core()
+    sx = src.std.Convolution([-1, -2, -1, 0, 0, 0, 1, 2, 1], saturate=False)
+    sy = src.std.Convolution([-1, 0, 1, -2, 0, 2, -1, 0, 1], saturate=False)
+    return core.std.Expr([sx, sy], 'x y max')
+
+
+# a weird kind of edgemask that draws around the edges. probably needs more tweaking/testing
+# maybe useful for edge cleaning?
+def bloated_edgemask(src: vs.VideoNode) -> vs.VideoNode:
+    return src.std.Convolution(matrix=[1,  2,  4,  2, 1,
+                                       2, -3, -6, -3, 2,
+                                       4, -6,  0, -6, 4,
+                                       2, -3, -6, -3, 2,
+                                       1,  2,  4,  2, 1], saturate=False)
+
+ Some of the functions described here have been added to my script collection on Github
Download
+





Mom, look! I found a way to burn billions of CPU cycles with my new placebo debanding script! +
+
diff --git a/expediency.html b/expediency.html new file mode 100644 index 0000000..00b4ca2 --- /dev/null +++ b/expediency.html @@ -0,0 +1,62 @@ + + + {% load static %} +
+
+ +
+

Do What Is Interesting, Not What Is Expedient

+

+ If you’re just here for the encoding stuff, this probably isn’t for you. But if you want to read the thoughts of a Linux proselyte (and proselytizer), read on. +
I’m sure this also applies to other things, but your mileage may vary. +

+ Also, I wrote this on a whim, so don’t expect it to be as polished as the usual content. +

+
+

Introduction

+ Once upon a time (TL note: about 4 months ago), there was a CS student whose life was so easy and uncomplicated he just had to change something. Well that’s half-true at best, but the truth is boring. Anyway, I had been thinking about Linux for a while, and one of my lectures finally gave me a good excuse to install it on my laptop. At that point, my only experience with Linux was running a debian server with minimal effort to have a working website, a TeamSpeak server, and a few smaller things, so I knew almost nothing.

+ I decided to “just do it™” and installed Manjaro Linux on my Laptop. They offer different pre-configured versions, and I just went with XFCE, one of the two officially supported editions. Someone had told me about Manjaro about a year earlier, so I figured it would be a good choice for beginners (which turned out to be true). I created a bootable flash drive, booted, installed it, rebooted, and that was it; it just worked. No annoying bloatware to remove, automatic driver installation, and sane defaults (apart from VLC being the default video player, but no system is perfect, I guess). I changed a few basic settings here and there and switched to a dark theme—something that Windows still doesn’t support—and the system looked nice and was usable. So nice and usable that I slowly started disliking the Windows 10 on my home PC. (You can already tell where this is going.)

+ I wanted full control over my system, the beauty of configurability, a properly usable terminal (although I will add that there are Linux distros which you can use without touching a terminal even once), the convenience of a package manager—and the smug feeling of being a Linux user. You’ll understand this once you’ve tried; trust me.

+ And so the inevitable happened, and I also installed Manjaro on my home PC (the KDE version this time because “performance doesn’t matter if you have enough CPU cores”—a decision that I would later revisit). I still kept Windows on another hard drive as a fallback, and it remains there to this day, although I only use it about once a week, maybe even less, when I want to play a Windows-only game. +

Exploring the Rabbit Hole

+ No one, not even Lewis Carroll, can fully understand what a rabbit hole is unless they experienced being a bored university student who just got into Linux. Sure, my mother could probably install Ubuntu (or have me install it for her) and be happy with that. But I was—and still am—a young man full of curiosity. Options are meant to be used, and systems are meant to be broken. The way I would describe it is “Windows is broken until you fix it. Linux works until you break it. Both of these will happen eventually.”

+ So, not being content with the stable systems I had, I wanted to try something new after only a few weeks. The more I looked at the endless possibilities, the more I just wanted to wipe the system and start over; this time with something better. I formatted the laptop and installed Manjaro i3 (I apparently wasn’t ready to switch to another distro entirely yet). My first time in the live system consisted of about 10 minutes of helpless, random keystrokes, before I shut it down again because I couldn’t even open a window (which also meant I couldn’t google). This is why you try things like that in live systems, kids. Back at my main PC, I read the manual of i3wm. How was I supposed to know that $super + return opens a terminal?

+ Not the best first impression, but I was fascinated by the concept of tiling window managers, so I decided to try again—this time, armed with the documentation and a second PC to google. Sure enough, knowing how to open dmenu makes the system a lot more usable. I could even start programs now. i3 also made me realize how slow and sluggish KDE was, so I started growing dissatisfied with my home setup once again. It was working fine, but opening a terminal via hotkeys took about 200ms compared to the blazing 50ms on my laptop. Truly first-world problems.

+ It should come as no surprise that I would soon install i3 on my home PC as well, and I’ve been using that ever since. Obligatory neofetch.

+ I also had a home server for encoding-related tasks which was still running Windows. While it is theoretically possible to connect to a Windows remote desktop host with a Linux client (and also easy to set up), it just felt wrong, so that had to change. Using Manjaro again would have been the easiest way, but that would have been silly on a headless system, so I decided to install Arch Linux instead. It obviously wasn’t as simple as Manjaro (where you literally—and I mean literally—have to click a few times and wait for it to install), but I still managed to do it on my first attempt. And boy, is SSH convenient when you’re used to the bloat of Windows RDC.

+ I would later make a rule for myself to not install the same distro on two systems, which leads us to the next chapter of my journey. +

The Slow Descent Into Madness

+ Installing an operating system is an interesting process. It might take some time, and it can also be frustrating, but you definitely learn a lot (things like partitioning and file systems, various GNU/Utils, and just basic shell usage). Be it something simple like enabling SSH access or a bigger project like setting up an entire graphical environment—you learn something new every time.

+ As briefly mentioned above, I wanted to force myself to try new things, so I simply limited each distro to a single device. This meant that my laptop, which was still running Manjaro, had to be reinstalled. I just loved the convenience of the AUR, so I decided to go with another Arch-based distro: Antergos. The installer has a few options to install a desktop environment for you, but it didn’t have i3, so I had to do that manually.

+ With that done, I remembered that I still had a Raspberry Pi that I hadn’t used in years. That obviously had to be changed, especially since it gave me the option to try yet another distro. (And I would find a use case for the Pi eventually, or I at least told myself that I would.)

+ I had read this not too long ago, so I decided to give Void Linux a shot. This would be my first distro without systemd (don’t worry if you don’t know what that is).

+ I could go on, but I think you get the gist of it. I did things because they seemed interesting, and I definitely learned a lot in the process. After the Void Pi, I installed Devuan on a second Pi (mind you, I already had Debian on a normal server, so that was off-limits).

+ The real fun began a few days ago when I decided to build a tablet with a RasPi. That idea is nothing new, and plenty of people have done it before, but I wanted to go a little further. A Raspberry Pi tablet running Gentoo Linux. The entire project is a meme and thus destined to fail, but I’m too stubborn to give up (yet). At the time of writing, the Pi has been compiling various packages for the last 20 hours, and it’s still going.

+ As objectively stupid as this idea is (Gentoo on a Pi without cross-compilation, or maybe just Gentoo in general), it did, once again, teach me a few things about computers. About compilers and USE flags, about dependency management, about the nonexistent performance of the Pi 3… you get the idea. I still don’t know if this will end up as a working system, but either way, it will have been an interesting experience.

+ And that’s really what this is all about. Doing things you enjoy, learning something new, and being entertained.

+

+ Update: It’s alive… more or less. +

+ + +

Conclusion

+

+ So this was the journey of a former Windows user into the lands of free software.

+ Was it necessary? No.

+ Was it enjoyable? Mostly.

+ Was it difficult? Only as difficult as I wanted it to be.

+ Does Linux break sometimes? Only if I break it.

+ Do I break it sometimes? Most certainly.

+ Would I do it again? Definitely.

+ Would I go back? Hell, no.

+ Do I miss something about Windows? Probably the way it handles multi-monitor setups with different DPIs. I haven’t found a satisfying solution for UI scaling per monitor on Linux yet.

+

+ I’m not saying everyone should switch to Linux. There are valid use cases for Windows, but some of the old reasons are simply not valid anymore. Many people probably think that Linux is a system for nerds—that it’s complicated, that you need to spend hours typing commands into a shell, that nothing works (which is still true for some distros, but you only use these if you know what you’re doing and if you have a good reason for it).

+ In reality, Linux isn’t just Linux the way Windows is Windows. Different Linux distros can be nothing alike. The only commonality is the kernel, which you don’t even see as a normal user. Linux can be whatever you want it to be; as easy or as difficult as you like; as configurable or out-of-the-box as you need.

+

+ If you have an old laptop or an unused hard drive, why don’t you just try it? You might learn a few interesting things in the process. Sometimes, being interesting beats being expedient. Sometimes, curiosity beats our desire for familiarity.

+


+ Btw, I ended up not attending the lecture that made me embark upon this journey in the first place. FeelsLifeMan +
+
+ diff --git a/grain.html b/grain.html new file mode 100644 index 0000000..d6d12b3 --- /dev/null +++ b/grain.html @@ -0,0 +1,281 @@ + + {% load static %} +
+
+ +
+

Grain and Noise

+ +

Introduction

+
+ There are many types of noise or artifacts and even more denoising algorithms. In the following article, the terms noise and grain will sometimes be used synonymously. Generally, noise is an unwanted artifact, while grain is added deliberately to create a certain effect (flashbacks, film grain, etc.) or to prevent banding. The latter in particular may not always be beneficial to your encode, as it increases entropy, which in turn increases the bitrate without improving the perceived quality of the encode (apart from the gradients, but we'll get to that).
+ Grain is not always bad and is sometimes even necessary to remove banding artifacts or prevent them from occurring, but studios tend to use dynamic grain, which requires a lot of bitrate. Since you are most likely encoding in 10 bit, banding isn't as much of an issue, and static grain (e.g. f3kdb's grain) will do the job just as well.
+ Some people might also like to denoise/degrain an anime because they prefer the cleaner look. Opinions may vary. +
+

Different types of noise and grain

+
+ This section will be dedicated to explaining different types of noise which you will encounter. This list is not + exhaustive, as studios tend to do weird and unpredictable things from time to time.
+
+

1. Flashbacks

+ +
Image: Shinsekai Yori episode 12 BD
+ Flashbacks are a common form of artistic film grain. The grain is used selectively to create a certain + atmosphere and should + not be removed. These scenes tend to require quite high bitrates, but the effect is intended, and even if + you + were + to try, removing the grain would be quite difficult and probably result in a very blurred image.
+ Since this type of grain is much stronger than the underlying grain of many sources, it should not be + affected by + a normal denoiser, meaning you don't have to trim around these scenes if you're using a denoiser to remove + the general background noise from other scenes. +
+
+

2. Constant film grain

+ +
Image: Corpse Party episode 1, encoded by Gebbi @Nanaone +
Sometimes all scenes are overlaid with strong film grain similar to the previously explained flashbacks. This type of source is rare, and the only way to remove it would be a brute-force denoiser like QTGMC. It is possible to get rid of it; however, I would generally advise against it, as removing this type of grain tends to change the mood of a given scene. Furthermore, using a denoiser of this calibre can easily destroy any detail present, so you will have to tweak the values carefully.
+
+
+

3. Background grain

+ This type is present in most modern anime, as it prevents banding around gradients and simulates detail by + adding random information to all surfaces. Some encoders like it. I don't. Luckily, this one can be removed + with relative ease, which will notably decrease the required bitrate. Different denoisers will be described + in a later paragraph. +
+
+
+

4. TV grain

+
+
Image: Kekkai Sensen episode 1, encoded by me
+ This type is mainly used to create a CRT-monitor or cameraesque look. It is usually accompanied by + scanlines and other distortion and should never be filtered. Once again, you can only throw more bits at the + scene. +
+
+

5. Exceptional grain

+ +
Image: Seirei Tsukai no Blade Dance episode 3, BD
+ Some time ago, a friend of mine had to encode the Blu-rays of Blade Dance and came across this scene. It is about three minutes long, and the BD transport stream's bitrate peaks at more than 55 Mbit/s, making the Blu-ray non-compliant with the current Blu-ray standards (this means that some BD players may just refuse to play the file. Good job, Media Factory).
+ As you can see in the image above, the source has insanely strong grain in all channels (luma and chroma). FFF chose to brute-force through this scene by simply letting x264 pick whatever bitrate it deemed appropriate, in this case about 150 Mbit/s. Another (more bitrate-efficient) solution would be to cut the scene directly from the source stream without re-encoding. Note that you can only cut streams on keyframes and will not be able to filter (or scale) the scene since you're not re-encoding it. An easier solution would be using the --zones parameter to increase the CRF during the scene in question. If a scene is this grainy, you can usually get away with higher CRF values.
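+ For illustration, such a zone could look like the command below. The frame range and values are made up for this example; x264's zones only accept a constant quantizer (q=) or a bitrate multiplier (b=), so "raising the CRF" for a scene is usually approximated with a multiplier below 1.

# frame numbers are hypothetical; b= scales the bitrate allotted to that zone
x264 --crf 16 --zones 34800,39100,b=0.7 -o grainy_episode.264 script.avs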
+
+

Comparing denoisers

+
+ So let's say your source has a constant dynamic grain that is present in all scenes, and you want to save + bitrate in your encode or get rid of the grain because you prefer clean and flat surfaces. Either way, what + you're looking for is a denoiser. A list of denoisers for your preferred frameserver can be found here + (Avisynth) or here (Vapoursynth). + To compare the different filters, I will use two scenes – one with common "background grain" and one with + stronger grain. Here are the unfiltered images:
+
+ An image with "normal" grain – the type you would remove to save bitrate. Source: Mushishi Zoku Shou OVA + (Hihamukage) Frame 22390. Size: 821KB
Note the faint wood texture on the backpack. It's already quite blurry + in the source and can easily be destroyed by denoising or debanding improperly.
+
+ A grainy scene. Source: Boku Dake ga Inai Machi, Episode 1, Frame 5322. Size: 727KB
I am well aware that + this is what I + classified as "flashback grain" earlier, but for the sake of comparison let's just assume that you want to + degrain this type of scene.
+ Furthermore, you should note that most denoisers will create banding which the removed grain was masking + (they're technically not creating the banding but merely making it visible). Because of this, you will usually + have to deband after denoising. +

1. Fourier Transform based (dfttest, FFT3D)


+ Being one of the older filters, dfttest has been in development since 2007. It is a very potent denoiser + with good detail retention, but it will slow your encode down quite a bit, especially when using Avisynth due to + its lack of multithreading. + The Vapoursynth filter is faster and should yield the same results.
FFT3DGPU is hardware accelerated and uses a similar (but not identical) algorithm. It is significantly faster but less precise in terms of detail retention and may blur some areas; contra-sharpening can be used to counteract the latter. The filter is available for Avisynth and Vapoursynth without major differences.
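+ As a rough Vapoursynth sketch of the two approaches, assuming the dfttest and fft3dfilter plugins are installed (the source filter, file name, and sigma values are only examples):

import vapoursynth as vs
core = vs.core

clip = core.ffms2.Source('episode.mkv')              # example source filter/file
dft = core.dfttest.DFTTest(clip, sigma=4.0)           # Fourier-based, slower but precise
fft = core.fft3dfilter.FFT3DFilter(clip, sigma=2.0)   # FFT3D algorithm (the Vapoursynth port runs on the CPU)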
+
+ sigma = 0.5; 489KB +
+ sigma = 4; 323KB +

2. Non-local means based (KNLMeans, TNLMeans)


+ The non-local means family consists of solid denoisers which are particularly appealing due to their + highly optimized GPU/OpenCL implementations, which allow them to be run without any significant speed + penalty. + Using the GPU also circumvents Avisynth's limitation to one thread, similar to FFT3DGPU.
Because of this, there is no reason to use the "regular" (CPU) version unless your encoding rig does not have a GPU. KNL can remove a lot of noise while still retaining quite a lot of detail (although less than dft or BM3D). It might be a good option for older anime, which tend to have a lot of grain (often added as part of the Blu-ray "remastering" process) but not many fine details. When a fast (hardware-accelerated) and strong denoiser is needed, I'd generally recommend using KNL rather than FFT3D.
One thing to highlight is the Spatio-Temporal mode of this filter. By + default, neither the Avisynth nor the Vapoursynth version uses temporal reference frames for denoising. This can + be + changed in order to improve the quality by setting the d parameter to any value higher than zero. + If your material is in 4:4:4 subsampling, consider using "cmode = true" to enable denoising of the + chroma planes. By default, only luma is processed and the chroma planes are copied to the denoised + clip.
Both of these settings will negatively affect the filter's speed, but unless you're using a + really old GPU or multiple GPU-based filters, your encoding speed should be capped by the CPU rather than + the GPU. Benchmarks and documentation + here.
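+ A Vapoursynth call with temporal reference frames and chroma processing enabled might look like the line below. This is only a sketch for the plugin version documented here; newer releases of KNLMeansCL replace cmode with a channels parameter.

# d > 0 enables temporal reference frames, cmode=True also denoises chroma (useful for 4:4:4 material)
denoised = core.knlm.KNLMeansCL(clip, d=3, a=2, h=0.5, cmode=True)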
+
+ h = 0.2, a = 2, d = 3, cmode = 1; 551KB
+ cmode = 0 for comparison; 733KB
+
+ h = 0.5, a = 2, d = 3, cmode = 1; 376KB +

BM3D


+ This one is very interesting, very slow, and only available for Vapoursynth. Avisynth would probably die + trying to run it, so don't expect a port anytime soon unless memory usage is optimized significantly. + It would technically work on a GPU, as the algorithm can be parallelized without any issues [src], + however no such implementation exists for Avisynth or Vapoursynth. (If the book doesn't load for you, + try scrolling up and down a few times and it should fix itself)
+ BM3D appears to have the best ratio of filesize and blur (and consequently detail loss) at the cost of + being the slowest CPU-based denoiser on this list. It is worth noting that this filter can be combined + with any other denoiser by using the "ref" parameter. From the documentation: +
Employ custom denoising filter as basic estimate, refined with V-BM3D final estimate. + May compensate the shortages of both denoising filters: SMDegrain is effective at spatial-temporal + smoothing but can lead to blending and detail loss, V-BM3D preserves details well but is not very + effective for large noise pattern (such as heavy grain). +
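+ In Vapoursynth, BM3D is commonly called through the mvsfunc wrapper. A minimal sketch of a plain call and of refining another denoiser's output via ref (the knl clip stands for a hypothetical KNLMeansCL result):

import mvsfunc as mvf

bm3d = mvf.BM3D(clip, sigma=[1.5, 1, 1], radius1=1)              # radius1 > 0 enables the temporal V-BM3D mode
refined = mvf.BM3D(clip, sigma=[1.5, 1, 1], radius1=1, ref=knl)  # use another denoiser as the basic estimate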
+
+ radius1 = 1, sigma = [1.5,1,1]; 439KB
+
+ radius1 = 1, sigma = [5,5,5]; 312KB
Note: This image does not use the aforementioned "ref" parameter to improve grain removal, as this comparison aims to provide an overview of the different filters by themselves rather than the interactions and synergies between them.
+

SMDegrain


+ SMDegrain seems to be the go-to-solution for many encoders, as it does not generate much blur and the + effect seems to be weak enough to save some file size without notably altering the image.
The + substantially weaker denoising also causes less banding to appear, which is particularly appealing when + trying to preserve details without much consideration for bitrate.
+
+ Even without contra-sharpening, SMDegrain seems to slightly alter/thin some of the edges. 751KB
+
+ In this image the "sharpening" is more notable. 649KB
+ One thing to note is that SMDegrain can have a detrimental effect on the image when processing with + chroma. The Avisynth wiki describes it as follows: +
Caution: plane=1-4 [chroma] can sometimes create chroma smearing. In such case I + recommend denoising chroma planes in the spatial domain. +
+ In practice, this can destroy (or add) edges by blurring multiple frames into a single one. Look at her hands
+

Edit: I recently had a discussion with another encoder who had strong chroma artifacts (much worse than the lines on her hand), and the cause was SMDegrain. The solution can be found on his blog. Just ignore the German text and scroll down to the examples. All you have to do is split the video into its individual planes and denoise each of them like you would denoise a single luma plane. SMDegrain is used on the chroma planes prior to scaling, which improves performance. You would have to do the same in Vapoursynth to avoid the smearing, but Vapoursynth has BM3D, which does the same job better, so you don't have to worry about SMDegrain and its bugs.
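+ A rough Vapoursynth sketch of this per-plane approach (denoise_plane is a placeholder for whatever grayscale denoiser you prefer, not a real function):

import vapoursynth as vs
core = vs.core

# split the clip into its three planes as grayscale clips
planes = [core.std.ShufflePlanes(clip, p, vs.GRAY) for p in range(3)]
# denoise each plane like a luma-only clip (denoise_plane is a placeholder)
planes = [denoise_plane(p) for p in planes]
# reassemble the planes into a YUV clip
merged = core.std.ShufflePlanes(planes, [0, 0, 0], vs.YUV)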

+

Waifu2x


+ Waifu2x is an image-upscaling and denoising algorithm using Deep Convolutional Neural Networks. Sounds fancy but uses an awful lot of computing power. You can expect to get ≤1fps when denoising a 720p image using waifu2x on a modern graphics card. Your options for denoising are noise level 1, 2, or 3, with levels 2 and 3 being useless because of their nonexistent detail retention. Noise level 1 can remove grain fairly well; however, the detail retention may vary strongly depending on the source, and due to its limited options (none, that is) it cannot be customized to fit different sources. Either you like the results or you use another denoiser. It is also worth noting that this is the slowest algorithm one could possibly use, and generally the results do not justify the processing time.
+ There are other + proposals for Deep Learning based denoising algorithms, however most of these were never made available + to the public. [src] +
+
+ The more "anime-esque" parts of the image are denoised without any real issues, but the more realistic + textures (such as the straw) might be recognized as noise and treated accordingly. +
+ Edit: Since this section was written there have been a few major updates to the + Waifu2x algorithm. The speed has been further optimized, and more settings for the noise removal feature have + been added. These features make it a viable alternative to some of the other denoisers on this list (at least + for certain sources), however it is still outclassed in terms of speed. The newly added upConv models are + significantly faster for upscaling and promise better results. In their current state, they should not be used + for denoising, as they are slower than the regular models and try to improve the image quality and sharpness + even without upscaling, which may cause aliasing and ringing. +

Debanding


+ Some of these may be quite impressive in terms of file size/compressibility, but they all share a common problem: banding. In order to fix that, we will need to deband and apply grain to the gradients. This may seem counterintuitive, as we have just spent a lot of processing time to remove the grain, but I'll get to that later.
+
+ The BM3D image after an f3kdb call with a simple mask + to protect the wooden texture (I won't go into detail here, as debanding is a topic for another day). Hover over + the image to switch to the source frame.
+ Source size: 821KB. Denoised and debanded frame: 767KB. This does not sound too impressive, as it is only a decrease of ~7%, which (considering the processing time) really isn't that much; however, our new grain has a considerable advantage: it is static. I won't delve too deep into inter-frame compression, but most people will know that less motion = lower bitrate. While the source's grain takes up new bits with every new frame, our grain only has to be stored once per scene.
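+ A minimal sketch of such a call, assuming the f3kdb plugin and a hypothetical texture_mask clip that marks the areas to protect (all thresholds and grain values are only examples):

deb = core.f3kdb.Deband(denoised, y=60, cb=40, cr=40, grainy=48, grainc=0, dynamic_grain=False)  # static grain
deb = core.std.MaskedMerge(deb, denoised, texture_mask)  # keep the denoised pixels where the mask is white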

+
+ +
+ After you've decided what to do with your grain, you will have to encode it in a way that keeps the grain structure as close as possible to your script's output. In order to achieve this, you may need to adjust a few settings; a sample command line follows the list below.
+
    +
  • aq-strength: This is probably the most important parameter for x264's grain handling. Higher values will require more bitrate, but lower values may blur the grain, which looks ugly and destroys f3kdb's/gradfun's dithering, creating banding.
    Recommended values: 0.6-1.2. Extreme cases may require higher values. Consider using the --zones parameter if necessary.
  • psy-rd: Grainy scenes may benefit from higher psy values (such as 0.8:0.25), but setting this too high may induce ringing or other artifacts around edges.
  • deblock: Higher values (>0) will blur the image while lower values (<0) will sharpen it (simplified). The sane range is probably 2≥x≥-3, and 0:-1 or -1:-1 should be a good starting point, which you can lower if your grain needs it.
  • qcomp (quantizer curve compression): Higher values result in a more constant quantizer, while lower values force a more constant bitrate with higher quantizer fluctuations. High values (0.7-0.8) can help with grain retention. If mbtree is enabled, qcomp controls its strength, which changes the effect, though higher values are still better for grain.
+
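+ Putting these together, a command line for a moderately grainy source might look like the example below; treat the values as starting points rather than a recipe, and note that the input/output names are placeholders.

x264 --crf 16 --preset veryslow --aq-strength 0.9 --psy-rd 0.8:0.25 --deblock -1:-1 --qcomp 0.7 -o output.264 script.avs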





I feel like I forgot something +
+
diff --git a/matana.html b/matana.html new file mode 100644 index 0000000..26bb9ac --- /dev/null +++ b/matana.html @@ -0,0 +1,426 @@ + +
+
+ + +
+

Review: Matana-Studio - Yowamushi Pedal: Grande Road Folge 1

+
+

Einleitung und Technisches

+ Der Zufall hat entschieden, dass wir in unserer ersten + Review Matana-Studios Yowamushi Pedal: Grande Road + behandeln + werden. Da wir noch keine tollen Insider haben, um neue Leser gleich wieder abzuschrecken, fangen wir einfach + mit + den Formalien an:

+ Lokalisierung: Anredesuffixe vorhanden
+ Versionen: 720p und 1080p, jeweils 8-bit mp4 Hardsub, Größe 501 MiB bzw 996 MiB
+ Kapitel: nicht vorhanden
+ Website: http://matanastudio.eu/
+ Downloadmöglichkeiten: Nur One-Click-Hoster und Stream (Uploadet, MEGA, Openload)
+
Vollständige MediaInfo (ausklappbar):
+
MediaInfo (1080p) +

+ Allgemein
+ Format : MPEG-4
+ Format-Profil : Base Media
+ Codec-ID : isom (isom/avc1)
+ Dateigrose : 996 MiB
+ Dauer : 23min
+ Gesamte Bitrate : 5 832 Kbps
+ Kodierungs-Datum : UTC 2016-05-25 23:57:19
+ Tagging-Datum : UTC 2016-05-25 23:57:19
+
+ Video
+ ID : 1
+ Format : AVC
+ Format/Info : Advanced Video Codec
+ Format-Profil : High@L5.1
+ Format-Einstellungen fur CABAC : Ja
+ Format-Einstellungen fur ReFrames : 16 frames
+ Codec-ID : avc1
+ Codec-ID/Info : Advanced Video Coding
+ Dauer : 23min
+ Bitrate : 5 510 Kbps
+ maximale Bitrate : 21,8 Mbps
+ Breite : 1 920 Pixel
+ Hohe : 1 080 Pixel
+ Bildseitenverhaltnis : 16:9
+ Modus der Bildwiederholungsrate : konstant
+ Bildwiederholungsrate : 23,810 FPS
+ ColorSpace : YUV
+ ChromaSubsampling/String : 4:2:0
+ BitDepth/String : 8 bits
+ Scantyp : progressiv
+ Bits/(Pixel*Frame) : 0.112
+ Stream-Grose : 941 MiB (94%)
+ verwendete Encoder-Bibliothek : x264 core 148 r2692 64f4e24
+ Kodierungseinstellungen : cabac=1 / ref=16 / deblock=1:1:1 / analyse=0x3:0x133 / me=tesa / subme=11 / + psy=1 + / psy_rd=0.40:0.00 /
+ mixed_ref=1 / me_range=24 / chroma_me=1 / trellis=2 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=0 / + chroma_qp_offset=-2 /
+ threads=12 / lookahead_threads=3 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 + / + constrained_intra=0 /
+ bframes=16 / b_pyramid=2 / b_adapt=2 / b_bias=0 / direct=3 / weightb=1 / open_gop=0 / weightp=2 / + keyint=238 + / keyint_min=23 /
+ scenecut=40 / intra_refresh=0 / rc_lookahead=60 / rc=crf / mbtree=1 / crf=16.0 / qcomp=0.60 / qpmin=0 / + qpmax=69 / qpstep=4 /
+ ip_ratio=1.40 / aq=1:0.60
+ Kodierungs-Datum : UTC 2016-05-25 23:57:19
+ Tagging-Datum : UTC 2016-05-26 00:01:03
+
+ Audio
+ ID : 2
+ Format : MPEG Audio
+ Format-Version : Version 1
+ Format-Profil : Layer 3
+ Format_Settings_Mode : Joint stereo
+ Format_Settings_ModeExtension : MS Stereo
+ Codec-ID : .mp3
+ Dauer : 23min
+ Bitraten-Modus : konstant
+ Bitrate : 320 Kbps
+ maximale Bitrate : 334 Kbps
+ Kanale : 2 Kanale
+ Samplingrate : 44,1 KHz
+ Stream-Grose : 54,6 MiB (5%)
+ verwendete Encoder-Bibliothek : LAME3.99r
+ Kodierungseinstellungen : -m j -V 4 -q 2 -lowpass 20.5
+ Kodierungs-Datum : UTC 2016-05-26 00:01:02
+ Tagging-Datum : UTC 2016-05-26 00:01:03
+

+
MediaInfo (720p) +

+ Allgemein
+ Format : MPEG-4
+ Format-Profil : Base Media
+ Codec-ID : isom (isom/avc1)
+ Dateigrose : 501 MiB
+ Dauer : 23min
+ Gesamte Bitrate : 2 932 Kbps
+ Kodierungs-Datum : UTC 2016-04-17 16:33:28
+ Tagging-Datum : UTC 2016-04-17 16:33:28
+
+ Video
+ ID : 1
+ Format : AVC
+ Format/Info : Advanced Video Codec
+ Format-Profil : High@L4
+ Format-Einstellungen fur CABAC : Ja
+ Format-Einstellungen fur ReFrames : 8 frames
+ Codec-ID : avc1
+ Codec-ID/Info : Advanced Video Coding
+ Dauer : 23min
+ Bitrate : 2 800 Kbps
+ maximale Bitrate : 10,7 Mbps
+ Breite : 1 280 Pixel
+ Hohe : 720 Pixel
+ Bildseitenverhaltnis : 16:9
+ Modus der Bildwiederholungsrate : konstant
+ Bildwiederholungsrate : 23,976 (24000/1001) FPS
+ ColorSpace : YUV
+ ChromaSubsampling/String : 4:2:0
+ BitDepth/String : 8 bits
+ Scantyp : progressiv
+ Bits/(Pixel*Frame) : 0.127
+ Stream-Grose : 478 MiB (96%)
+ verwendete Encoder-Bibliothek : x264 core 148 r2665 a01e339
+ Kodierungseinstellungen : cabac=1 / ref=8 / deblock=1:0:0 / analyse=0x3:0x133 / me=umh / subme=9 / psy=1 + / psy_rd=1.00:0.00 /
+ mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=2 / 8x8dct=1 / cqm=0 / deadzone=21,11 / fast_pskip=1 / + chroma_qp_offset=-2 /
+ threads=6 / lookahead_threads=1 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 + / constrained_intra=0 /
+ bframes=3 / b_pyramid=2 / b_adapt=2 / b_bias=0 / direct=3 / weightb=1 / open_gop=0 / weightp=2 / + keyint=240 / keyint_min=23 /
+ scenecut=40 / intra_refresh=0 / rc_lookahead=60 / rc=crf / mbtree=1 / crf=18.0 / qcomp=0.60 / qpmin=0 / + qpmax=69 / qpstep=4 /
+ ip_ratio=1.40 / aq=1:1.00
+ Kodierungs-Datum : UTC 2016-04-17 16:33:28
+ Tagging-Datum : UTC 2016-04-17 16:33:41
+
+ Audio
+ ID : 2
+ Format : MPEG Audio
+ Format-Version : Version 1
+ Format-Profil : Layer 3
+ Format_Settings_Mode : Joint stereo
+ Format_Settings_ModeExtension : MS Stereo
+ Codec-ID : .mp3
+ Dauer : 23min
+ Bitraten-Modus : konstant
+ Bitrate : 128 Kbps
+ maximale Bitrate : 134 Kbps
+ Kanale : 2 Kanale
+ Samplingrate : 44,1 KHz
+ Stream-Grose : 21,9 MiB (4%)
+ verwendete Encoder-Bibliothek : LAME3.99r
+ Kodierungseinstellungen : -m j -V 4 -q 2 -lowpass 17 -b 128
+ Kodierungs-Datum : UTC 2016-04-17 16:33:33
+ Tagging-Datum : UTC 2016-04-17 16:33:41
+ +

+
+ Anmerkungen zum Encode: die Framerate der 1080p-Version ist falsch, als Audiocodec wird für beide Versionen mp3 + genutzt, was besonders während des Openings und Endings in hörbar schlechterer Qualität resultiert. Die Bitrate + ist mit 320kbit/s zwar recht großzügig bemessen, aber ein Blick auf die + Spektralanalyse + zeigt schnell, dass hier eine + verlustbehaftet encodete Source (vermutlich AAC 128k) erneut encoded wurde.
+ Viel Mühe steckt übrigens nicht in + den Einstellungen, da es sich lediglich um das „Placebo“-Preset sowie das „Animation“-Tuning von x264 handelt. + Leider ist dieses Tuning für moderne Animes kaum verwendbar und somit in diesem Fall ungeeignet. Die + 720p-Version nutzt nur das „Slower“-Preset ohne irgendein Tuning. Der MediaInfo entnehmen wir weiterhin, dass + die zwei Versionen von zwei verschiedenen Personen und auf verschiedenen Computern erstellt wurden. +

+ Encode +

+ Da die Quelle des Bildmaterials weder im Dateinamen noch auf Matanas Website angegeben ist, könnte es sich + sowohl um + eine Blu-ray als auch eine Websource handeln. TV ist unwahrscheinlich, da in keiner Szene ein TV-Logo erkennbar + ist.
+ Auch die Qualität des Videos lässt keinerlei Schlüsse zu, da die vorhandene Videoqualität mit jeder Quelle + erreichbar ist, aber aufgrund der eindeutig mehrfach encodeten Audiospur tendiere ich eher zu Websource oder + re-encodetem Video, dass von einer anderen Gruppe entliehen wurde. Auch Fragen unsererseits führten zu keinem genauen Ergebnis. + Alle folgenden Beispiele beziehen sich auf die 1080p-Version des Videos, da die niedrigeren Auflösungen in + diesem Fall keine Vorteile mit sich bringen. Leider ist auch das Bild der Full HD-Version alles andere als + herausragend: + + Blöcke soweit das Auge reicht. + + Scheinbar hat der Encoder sein neuestes Minecraft Let’s Play mit ins Release geschmuggelt. + + Glaub uns, wir auch nicht. +

+ Timing +

Ohne mitgezählt zu haben können wir mit Sicherheit sagen, dass hier nicht nur ein oder zwei Zeilen etwas + verschoben sind. Ein beeindruckender Anteil aller Zeilen geht entweder über die Szene hinaus oder beginnt + einen Frame zu früh. + + Ein schwarz leuchtender Text auf weißem Grund fällt ja auch kaum auf, oder? + + Die Ironie falsch eingesetzter QC-Namen entbehrt einer gewissen Komik nicht. + + Wirklich? Immerhin eine wunderbare Überleitung zum nächsten Abschnitt: +

+ Typeset +

+ Viel zu schreiben gibt es hier leider nicht. Außer dem Logo und den obligatorischen Credits ist nicht viel + zu sehen. Die Schilder sind gänzlich unübersetzt und auch für Straßenmarkierungen hat es nicht gereicht. + + Wir probieren das mit dem Colorpicker bei Gelegenheit noch mal. + Das Ganze in Bewegung: + + + +
+ + Hier ist einiges schiefgelaufen:
+
    +
  1. Der Glow ist gelb, während das Original grün war
  2. Am hellsten Punkt ist das Originallogo weiß, während der Type grau-gelb wird
  3. Der Type ist einfarbig und nicht mit Gradientfüllung wie das Original
  4. Nach dem Glühen wird es zu schnell orange
  5. Am Ende wird das Logo ausgefadet, statt mit dem Licht zu verschmelzen
  6. Beim Einfliegen fehlt der Leuchteffekt
  7. Das „Grande Road“ ist durch die Umrandung mit dem Hauptlogo verbunden. Wenn man das nicht will, sollte die Umrandung deutlich dicker sein.
  8. Das „Matana Studio“ überdeckt die Funkeneffekte des Originals
+ +
+ + + Wer irgendetwas erwartet hat, wird spätestens hier enttäuscht.
+ + + Selbst mit ausreichend Platz hält es der Typesetter (der namentlich in den Credits erwähnt wird) nicht für + nötig, eine Übersetzung anzubieten.
+ + + Textumbrüche und Platzierung gehören auch zu den Aufgaben des Types. Zumindest bei richtigen Gruppen.
+ +
+ Alignment ist scheinbar nur ein Wort, genau wie Schriftart und Originaltreue: Wenn im Original eine eckige + Schriftart verwendet wird, könnt ihr nicht einfach Serifen nehmen. Mit -7 Dioptrien mögen sich die zwei zwar + ähnlich sehen, aber wenn nach dem Encoder jetzt auch der Typesetter blind ist, sollte man sich ernsthaft + Gedanken + machen, wie sie je den Durchblick in der Szene haben wollen. Road mit Folge zu übersetzen lässt sich leider + nicht + mal auf mangelnde Sehstärke abschieben, aber vielleicht weiß der superzähe + Ishigaki mehr.

+ Wie sagt man so schön: „Wer nichts macht, macht keine Fehler.“ +
Das trifft hier leider nicht zu. + + +

Dialog- und Karaokegestaltung

+ Nichts Außergewöhnliches zu berichten (das ist was Gutes). Es gibt zwei verschiedene Styles, einen farbigen für + gesprochene Sprache + und einen mit Blur für Gedanken.
+ +

+ Was die Karaoke angeht, sagt ein Video mehr als tausend Worte (Achtung, das Video hat Ton):
+
+ +
+ Das Audio wurde nicht re-encoded, sondern stammt direkt aus der mp4. + +

Untertitelqualität

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Diese Zeile geviel uns nur wenig.Großschreibung mit Unterstützung von random.org. Und seit wann spricht man Leute mit dem Namen ihrer + Schule an? Liegt übrigens nicht am Originalskript, da sagt er nämlich „Hakogaku-san“ +
Ja, diese zwei Sätze gehören wirklich zusammen. Und genau da liegt das Problem. +
Kommata und noch mehr Kommata.mfw menschliche Abschussrampe
Das Leben der Anderen - VR-Edition. Jetzt + auch mit Geruch. + Ich glaube insgeheim, dass da definitiv was falsch ist.
Zu einem Satz, gehört Komma.Dieses Komma allein, ist mehr als genug.
Werft den Purchen zu PodenGib alle Is, die du hast.
Es kann nicht sein, dass hier schon wieder ein S fehlt.Matana bewies großes Durchhaltevermögen bei der Verwendung quasitoter Begriffe. „Ass-Sprinter“ ist + nur eine der sprachlichen Unarten und sogar die Möglichkeit eines Dreifachkonsonanten lässt man hier + ungenutzt. +
Wie verwende ich Adjektive?Was ist das wieder? Richtig, falsch. Und zwar richtig falsch.
Ziellinie-senpai, bist du es?Der alte Schwächling-großschreibung wäre nach diesen Zusammenstoß auch gefallen. Apropos Fall, der + ist auch Fall-sch. +
1933 musste jemand anderes führen, aber zumindest der linguistische Genozid scheint sich hier zu + wiederholen. + Wenn ich zulasse, dass Matana weitersubbt, ist es zu Ende.
Ein normaler Satz nach Matana-Standards. Wer wen wohin führt ist hier nicht von Belang.Euer Substil hingegen hat gar nichts.
Ein Rechtschreibassistent hätte hierbei helfen konnen.Jetzt noch ohne Deppenleerzeichen und wir haben eine fehlerfreie Zeile.
Reden wir hier von Melonen, Leichen oder der deutschen Sprache?Und ich hoffe, dass wir die Einzigen sind, die das Skript dieses Subs zu sehen bekommen.
+ + Mir fehlen die Worte.
+

Fazit

+ Man soll ja mit dem Guten anfangen, also legen wir los: Die Schriftart ist nicht völlig unlesbar. Da wir das nun + geklärt hätten, begeben wir uns zu den weniger erfreulichen Anteilen:
+ Die Encodequalität ist grauenhaft und mit 8-bit 1080p Hardsubs sind die technischen Mittel genauso + zurückgeblieben, wie die Individuen, die diese Sprachvergewaltigung verbrochen haben. Der Ton wurde mindestens + so oft re-encoded, wie Erdogan sich mit Ziegen vergnügt hat. Der Type wäre wohl ganz gut, wenn es welchen gäbe + und das Timing ist mindestens so verschoben wie die Inbetriebnahme des Flughafens BER. Vielleicht ist der + Untertitel in irgendeiner Sprache fehlerfrei; diese muss jedoch noch erfunden werden. Ein Deutschlehrer hätte + das Skript wohl abgelehnt, weil es sich außerhalb seines Fachbereiches befindet.
+ Eine Empfehlung erübrigt sich vermutlich und die Bewertung mittels Schulnoten ist mit moderner digitaler + Speichertechnik nicht möglich.

+

Wir möchten den geneigten Leser an dieser + Stelle freundlichst darauf hinweisen, dass der Konsum + von Matana-Studio-Subs zu irreparablen und unabsehbaren psychosomatischen Schäden führen kann und wird. +


+ Um es mit den weniger freundlichen Worten eines anonymen österreichischen Landschaftsmalers auszudrücken: +

„Hätten solche Leute diese Filme schon während des zweiten + Weltkriegs untertitelt, so wäre das Großdeutsche Reich + wohl Gefahr gelaufen, Japan als Verbündeten zu verlieren.“

+
+

Zurück zum Index

+
+
diff --git a/mgsreview.html b/mgsreview.html new file mode 100644 index 0000000..22d2fd6 --- /dev/null +++ b/mgsreview.html @@ -0,0 +1,374 @@ + + {% load static %} +
+
+ +
+

Review: Magical Girl Subs

+
+

Einleitung und Technisches

+

+ In der zweiten Ausgabe unserer Reihe „Warum muss ich das tun, ich wollte nach dem ersten Artikel wieder + aufhören und die Szene für hoffnungslos verloren erklären“ widmen wir uns Magical Girl Subs und ihrem + Release zu ef - a tale of memories.. Der erste Punkt gehört übrigens zum Titel.

+
+ Lokalisierung: Anredesuffixe vorhanden
+ Versionen: 720p und 1080p, jeweils 8-bit mp4 Hardsub, Größe 244 MiB bzw. 840 MiB
+ Kapitel: nicht vorhanden
+ Website: www.magicalgirlsubs.de
+ Downloadmöglichkeiten: DDL und Torrent

+
MediaInfo (ausklappbar) +

+ General
+ Format : MPEG-4
+ Format profile : Base Media
+ Codec ID : isom
+ File size : 840 MiB
+ Duration : 23mn 48s
+ Overall bit rate mode : Variable
+ Overall bit rate : 4 937 Kbps
+ Encoded date : UTC 2016-10-09 12:19:25
+ Tagged date : UTC 2016-10-09 12:19:25
+
+ Video
+ ID : 1
+ Format : AVC
+ Format/Info : Advanced Video Codec
+ Format profile : High@L4.0
+ Format settings, CABAC : Yes
+ Format settings, ReFrames : 4 frames
+ Codec ID : avc1
+ Codec ID/Info : Advanced Video Coding
+ Duration : 23mn 47s
+ Bit rate : 4 799 Kbps
+ Maximum bit rate : 42.7 Mbps
+ Width : 1 920 pixels
+ Height : 1 080 pixels
+ Display aspect ratio : 16:9
+ Frame rate mode : Constant
+ Frame rate : 23.976 fps
+ Color space : YUV
+ Chroma subsampling : 4:2:0
+ Bit depth : 8 bits
+ Scan type : Progressive
+ Bits/(Pixel*Frame) : 0.097
+ Stream size : 818 MiB (97%)
+ Writing library : x264 core 148 r2638 7599210
+ Encoding settings : cabac=1 / ref=3 / deblock=1:1:1 / analyse=0x3:0x133 / me=umh / subme=6 / psy=1 / + psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=1 / cqm=0 / + deadzone=21,11 / + fast_pskip=1 / chroma_qp_offset=-4 / threads=6 / lookahead_threads=1 / sliced_threads=0 / nr=0 / + decimate=1 + / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=2 / b_bias=0 + / + direct=3 / weightb=1 / open_gop=0 / weightp=2 / keyint=240 / keyint_min=23 / scenecut=40 / + intra_refresh=0 / + rc_lookahead=40 / rc=2pass / mbtree=1 / bitrate=4799 / ratetol=1.0 / qcomp=0.60 / qpmin=10 / qpmax=51 / + qpstep=4 / cplxblur=20.0 / qblur=0.5 / ip_ratio=1.40 / aq=1:0.00
+ Encoded date : UTC 2016-10-09 11:31:42
+ Tagged date : UTC 2016-10-09 12:19:26
+
+ Audio
+ ID : 2
+ Format : AAC
+ Format/Info : Advanced Audio Codec
+ Format profile : HE-AAC / LC
+ Codec ID : 40
+ Duration : 23mn 48s
+ Bit rate mode : Variable
+ Bit rate : 127 Kbps
+ Maximum bit rate : 164 Kbps
+ Channel(s) : 2 channels
+ Channel positions : Front: L R
+ Sampling rate : 48.0 KHz / 24.0 KHz
+ Compression mode : Lossy
+ Stream size : 21.7 MiB (3%)
+ Encoded date : UTC 2016-08-10 17:45:33
+ Tagged date : UTC 2016-10-09 12:19:26
+

+
+ Abgesehen von der Bitrate ist die 720p-Version identisch. + +

Encode

+ Das Encode lässt sich wohl am besten durch die Reaktionen unserer Teammitglieder beschreiben: +
    +
  • kageru: „Das sieht aus, als hätten sie ohne AQ encoded.“
  • fredferguson: „Notch wäre stolz.“
  • Attila: „Ich muss kurz schauen, ob das in mpv genauso aussieht.“
+

+ Werten wir das mal aus: kageru hatte recht, Attila hatte recht und fredferguson … nun ja, wir haben Notch + nicht + direkt gefragt, aber wir können mit an Sicherheit grenzender Wahrscheinlichkeit sagen, dass er wohl recht + hatte.
+ Ehrlich gesagt: Das Encode ist eines der schlechtesten, die wir je gesehen haben. kageru erklärt kurz, was + hier schiefgelaufen ist. Die folgenden Erklärungen + sind recht theoretisch und könnten für technisch weniger Interessierte langweilig sein:

+ aq=1:0.00: Warum irgendjemand diesen Fehler machen würde, ist mir + schleierhaft. Der Standardwert ist 1.0 und es gibt kein normales preset oder tuning in x264, das den AQ einfach + ausschaltet. Lediglich die metrikorientierten tunings, SSIM und PSNR, machen das, aber die sind für normale + Videos unbrauchbar. Der AQ oder Adaptive Quantizer ist ein Feature des Videocodecs, das verschiedenen + Bildbereichen + jeweils passende Quantizer-Werte zuweist. Auf Deutsch heißt das: AQ verteilt die Bitrate so, dass es Sinn macht + und alles so gut aussieht, wie es muss. Ihn komplett zu deaktivieren erzeugt hier die schlimmsten Artefakte, die + ich seit Längerem gesehen habe. qpmin und qpmax auf ihre antiken Standardwerte zu setzen macht es leider auch + nicht besser.
+ Wenn ihr über einem Bild hovert, seht ihr das Encode von ANE. Da ich die Blurays von Melodies nicht zur + Verfügung habe, um Screenshots zu machen, musste ich auf fremde Encodes zurückgreifen.
+
+
+
+

+ Darüber hinaus noch ein paar Anmerkungen: Beide Versionen sind 8-bit Hardsubs, was keinesfalls den modernen + Standards entspricht. 8-bit Video mit HS kann für Hardwarekompatibilität als alternativer Download angeboten + werden, + sollte jedoch niemals das einzige Release sein. Hardsubs sehen einfach nicht so gut aus wie Softsubs und der + 8-bit Videocodec ist deutlich anfälliger für Banding und andere Bildfehler als die 10-bit Version.
+ Außerdem gibt es ein 1080p-Release eines Anime, der nur in 720p animiert wurde. Damit verschwendet MGS nicht nur die Zeit und Rechenleistung ihres Encoders, sondern auch die Bandbreite und den Festplattenspeicher ihrer Leecher. Details hier. Die Qualität der Datei wurde über die Bitrate festgelegt, was allgemein schlechter für die Videoqualität ist, als crf, den Modus für konstante Qualität, zu verwenden (empfundene Qualität, nicht technisch gemessene). subme=6, trellis=1 und ref=3 sparen zwar Zeit beim Encodieren des Videos, verschlechtern jedoch die Kompressionseffizienz deutlich. deblock=1:1:1 ist für Anime eher ungeeignet, da positive Deblock-Werte das Bild unscharf machen. Ich würde eher 1:0:-1 oder 1:-1:-1 empfehlen.
+ Über die Sinnhaftigkeit von dct-decimate und fast-p-skip lässt sich streiten, also gehe ich darauf nicht + weiter ein.

+ Auch beim Audio gibt es Probleme, da hier HE-AAC gewählt wurde. HE-AAC ist eine Version des AAC-Codecs, die + besonders + auf Effizienz bei niedrigen Bitraten ausgelegt ist. Es verwendet einige Optimierungen, um mit + minimaler Bandbreite zumindest akzeptable Ergebnisse zu erzielen. Aufgrund dieser Besonderheiten ist es jedoch + nicht + empfohlen, das HE-Profil für Dateien über 80kbit/s bzw. 40kbit/s pro Kanal zu verwenden, da das + Low-Complexity-Profil hier besser funktioniert.

+ Ich sollte mich wohl persönlich mit MGS’ Encoder in Verbindung setzen, um einige dieser Dinge zu (er-)klären. + Viele dieser Fehler sind leicht zu beheben, wenn man weiß, wo das Problem liegt. + +

Typeset

+ In ef gibt es grundsätzlich nicht viel zu typen, da viele der Texte bereits Deutsch sind, also schauen wir uns + das bisschen an, was Shaft für die Fansubber übrig gelassen hat. +
+ Falsche Farbe, komplett falsche Font und das Timing stimmt auch nicht. Keine Ahnung, warum Fansubber + sich in ihren Releases verewigen müssen, aber wenn es schon sein muss, dann doch bitte ordentlich.
+
+ Wieder falsche Font. Wäre wohl in Ordnung gewesen, wenn ihr \fscy etwas gesenkt hättet, aber scheinbar hat es + nicht mal dafür gereicht. Außerdem: >Emule, lel
+
+
+ Falsche Farbe, mittelmäßige Font (und das ist großzügig, weil unsere Ansprüche auch nicht so hoch sind). + Keine Ahnung, warum Fansubber sich in ihren Releases verewigen … Moment, das hatten wir schon. + +

Gestaltung

+ Die Schriftart ist ganz gut gewählt. Sie ist lesbar und dank ihrer Farbe auf allen Hintergründen problemlos + erkennbar. Da es jedoch ein Hardsub ist, wirken die Untertitel besonders auf hochauflösenden Monitoren recht + unscharf.
+ Kara fürs Opening ist vorhanden, jedoch nichts Besonderes. Die Buchstaben überlappen sich, weil der Abstand zu + gering gewählt wurde und beim Einfliegen der Buchstaben steht das Ende des Satzes bereits da, bevor die + Buchstaben es erreicht haben.
+
+ „We won't fall apart“ beim Einfliegen. + +

Untertitelqualität

+

+ Ehrlich gesagt waren wir nach dem Encode leicht voreingenommen und haben eigentlich nicht mehr viel + erwartet. + Glücklicherweise versteht MGS’ Editor mehr von seinem Werk als der zuständige Encoder. „Mehr als der + Encoder“ ist in dem Fall leider keine große Leistung. Das Skript ist grammatikalisch und orthografisch zwar + beinahe fehlerfrei, der Ausdruck lässt jedoch zu wünschen übrig.

+

+ Als Vorlage für die Übersetzung diente der englische Sub von Mendoi-Conclave (oder eins der BD-Retimes), was + an + vielen Stellen im Skript sehr deutlich wird.
+ So finden sich an vielen Stellen Sätze und Formulierungen, deren Ursprungssprache noch deutlich erkennbar + ist, was mitunter recht irritierend wirkt. +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Hätte man nicht umständlicher ausdrücken können. Wir verstehen ja, dass ihr keine Übersetzungsfehler + riskieren wolltet (mit mäßigem Erfolg, wie wir später sehen werden), aber das ist einfach zu + wörtlich. + + „Menschen, die mit Kunst zu tun haben“ – Dafür gab es doch ein Wort; irgendwas + Kurzes, das diesen Satz deutlich angenehmer gemacht hätte. Hm … +
Sehr poetisch. Leider wurde zugunsten dieser blumigen Wortwahl auf den Sinn des Satzes verzichtet. + Und hier sehen wir, dass es deutliche Unterschiede zwischen der deutschen und der englischen + Satzstruktur geben sollte.
+ Sogar der Comma Splice wurde aus dem + Englischen übernommen. Im Original war es ein Fehler, während es im Deutschen erlaubt ist. Das macht + es leider nicht besser. +
Das erst recht nicht.
Erstens heißt es Fermate und zweitens war das ein Wortspiel, das ihr + nicht nur gekonnt ignoriert, sondern auch komplett falsch übersetzt habt. Kuze spricht hier + keinesfalls mit der Fermate, er verwendet das Wort nur metaphorisch, um seine aktuelle Situation zu + beschreiben. Der Begriff muss ja nicht jeden geläufig sein, aber der Edit hätte + zumindest nachschlagen können. + Irgendwas mit zu wörtlichen Übersetzungen … Denkt euch den Rest einfach.
+ Sie [die Stadt] trägt gar nichts. Städte machen so was nämlich nicht. + Ist irgendjemand noch überrascht?
DAS UNGLÜCK (Donnergrollen im Hintergrund, Blitze zucken durch die Nacht, irgendwas Dramatisches + passiert).
Aber im Ernst: Das Unglück? Das eine Unglück mit dem bestimmten Artikel? +
Leerzeichen vor die Auslassungspunkte
E! FermatEEEEE. Der Fehler zieht sich durch die ganze Folge.Ein Fine. Musik war nicht die Stärke des + Edits. Nachschlagen wohl auch nicht. +
Das geht deutlich kürzer. „Nur weil ein Mädchen neben dir sitzt/du neben einem Mädchen sitzt, […]“ + Eine Skizze kann kein Verfahren sein. Das Anfertigen ebendieser vielleicht, aber die Skizze selbst + definitiv nicht. +
Brauchen wir ein Beweisbild oder glaubt ihr so, dass das 1:1 der englische Satz ist?Man kann nur hoffen, dass diese Zeile ein Witz sein sollte, den hier niemand versteht.
Man mag es kaum glauben, aber diese Sätze folgen wirklich direkt aufeinander. Was ist sie denn? + Niedlich und das komplette Gegenteil von … allem? Das komplette Gegenteil halt. + Wir … ich … weiß nicht, wo ich anfangen soll. „Niemals im Leben“ würde man nie im Leben sagen und + Reizbarkeit + ist eine Frage des Temperaments und des Characters, nicht eine der Einstellung. +
Da wird sich das Frühstück aber freuen.Der Duden empfiehlt an dieser Stelle die Getrenntschreibung
Auch hier empfiehlt der Duden die Getrenntschreibung + Das ist das erste Mal, dass ich sehe, dass irgendjemand das dass, das mit zwei s geschrieben wird + und das man sonst eher selten sieht, in solcher Vielzahl verwendet, dass es anstrengend wird, das zu + lesen. +
Solche Sätze lesen sich allgemein angenehmer,
wenn man vor Beginn des Nebensatzes eine neue Zeile + anfängt. +
Mein Thesaurus ist nicht nur enttäuschend, er enttäuscht mich auch noch.
Vermutlich waren das mal zwei Sätze, Und keinen der QCs hat das gestört.Es ist keine Freude, an einem Ort, an dem eure Fansubs existieren, irgendwas zu tun.
Das ist leider absolut nicht das, was sie sagt. Und nein, an dieser Stelle wurde kein + Übersetzungsfehler der englischen Gruppe korrigiert.
+ むしろ僕は一人で店に入て普通に買い物ができる方が不思議だ bedeutet nämlich wörtlich:
+ „Ich finde es eher sonderbar, dass es einen Weg gibt, alleine in einen Laden zu gehen und normal + einzukaufen.“
Wie gesagt, wörtlich. Man sollte diesen Satz niemals so in einem Fansub verwenden, + aber die Bedeutung ist klar erkennbar, trotz des furchtbaren Ausdrucks. +
+

Fazit

+ Fassen wir also zusammen:
+ Encode: Streamingseitenniveau. Dunkle Szenen bestehen förmlich aus Blöcken, aber auch in hellen Szenen entstehen + nicht selten Blöcke bei schnelleren Bewegungen. 240 MiB sind zwar eine recht sparsame Dateigröße, was diese + Qualität jedoch nicht rechtfertigt.
+ Typeset: Vorhanden, aber oft nur mittelmäßig oder noch darunter. Über die Kara lässt sich wohl dasselbe + sagen.
+ Untertitel: Orthografisch weitestgehend fehlerfrei, vom Ausdruck her aber nur knapp über Google Übersetzer. Da die Satzstruktur oft exakt übernommen wurde, kann man an dieser Stelle auch den englischen Sub schauen, da MGS keinerlei Mehrwert produziert hat. Zusätzlicher Punktabzug für Hardsubs in beiden Versionen.

+ Bewertung: Mit entsprechend niedrigen Ansprüchen schaubar, empfehlenswert ist es aber nicht. Vorerst sollte man hier lieber auf den mittlerweile sieben Jahre alten Sub von Gruppe Kampfkuchen zurückgreifen. Die nutzten damals zwar eine Fernsehversion als Quellvideo, aber dank der unsachgemäßen Handhabung des Videos seitens MGS’ Encoders ist man damit immer noch besser bedient. Hoffentlich beantwortet das die Frage, warum wir „ef subben, obwohl MGS den doch schon gemacht hat“.
+
diff --git a/removegrain.html b/removegrain.html new file mode 100644 index 0000000..a8ca97d --- /dev/null +++ b/removegrain.html @@ -0,0 +1,475 @@ + + {% load static %} +
+
+ +
+

Actually Explaining RemoveGrain

+
+
Table of contents
+ +

+ +
Introduction
+

For over a decade, RemoveGrain has been one of the most used filters for all kinds of video processing. It is + used in SMDegrain, xaa, FineDehalo, HQDeringMod, YAHR, QTGMC, Srestore, AnimeIVTC, Stab, SPresso, Temporal + Degrain, MC Spuds, LSFmod, and many, many more. The extent of this enumeration may + seem ridiculous or even superfluous, but I am trying to make a point here. RemoveGrain (or more recently + RGTools) is everywhere.

+

+ But despite its apparent omnipresence, many encoders – especially those who do not have years of encoding experience – don't actually know what most of its modes do. You could blindly assume that they're all used to, well, remove grain, but you would be wrong.

+

+ After thinking about this, I realized that I, too, did not fully understand RemoveGrain. There is barely a + script that doesn't use it, and yet here we are, a generation of new encoders, using the legacy of our + ancestors, not understanding the code that our CPUs have executed literally billions of times.
But who + could + blame us? RemoveGrain, like many Avisynth plugins, has ‘suffered’ optimization to the point of complete + obfuscation.
You can try to read the code + if you want to, but trust me, you don't.

+

+ Fortunately, in October of 2013, a brave adventurer by the name of tp7 set upon a quest which was thitherto + believed impossible. They reverse engineered the open + binary that RemoveGrain had become and created RGTools, a + much + more readable rewrite that would henceforth be used in RemoveGrain's stead. +

+

+ Despite this, there is still no complete (and understandable) documentation of the different modes. Neither + the + Avisynth wiki nor tp7's own documentation nor any of the other guides manage to + accurately + describe all modes. In this article, I will try to explain all modes which I consider to be insufficiently + documented. Some self-explanatory modes will be omitted.
+ If you feel comfortable reading C++ code, I would recommend reading the code yourself. + It's not very long and quite easy to understand: tp7's rewrite on Github +

+
Glossary
+
+
Clipping
+
+ Clipping: A clipping operation takes three arguments: a value, an upper limit, and a lower limit. If the value is below the lower limit, it is set to that limit; if it exceeds the upper limit, it is set to that limit; and if it is between the two, it remains unchanged.
+
Convolution
+
+ The weighted average of a pixel and all pixels in its neighbourhood. RemoveGrain exclusively uses 3x3 convolutions, meaning for each pixel, the 8 surrounding pixels are used to calculate the convolution.
+ Modes 11, 12, 19, and 20 are just convolutions with different matrices.
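+ To make these two building blocks concrete, here is a small Python sketch of both operations on plain integer pixel values (not tied to any Avisynth or Vapoursynth API):

def clip_value(value, lower, upper):
    # clamp value into the range [lower, upper]
    return max(lower, min(value, upper))

def convolve3x3(neighbourhood, matrix):
    # weighted average of a 3x3 neighbourhood; both arguments are flat lists of 9 values
    weighted_sum = sum(p * w for p, w in zip(neighbourhood, matrix))
    return round(weighted_sum / sum(matrix))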
+
+ To illustrate some of these modes, images of 3x3 pixel neighbourhoods will be used. The borders between the + pixels were added to improve visibility and have no deeper meaning.

+ + + + + + + + + + + + + + + + + +
Modes 5-9
+

Mode 5

+ The documentation describes this mode as follows: “Line-sensitive clipping giving the minimal change.”
+ This is easier to explain with an example:

+ + + + + +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+

Left: unprocessed clip. Right: clip after RemoveGrain mode 5

+

Mode 5 tries to find a line within the 3x3 neighbourhood of each pixel by comparing the center pixel with + two + opposing pixels. This process is repeated for all four pairs of opposing pixels, and the center pixel is + clipped to their respective values. After computing all four, the filter finds the pair which resulted in + the smallest change to + the center pixel and applies that pair's clipping. In our example, this would + mean clipping the center pixel to the top-right and bottom-left pixel's values, since clipping it to any of + the other pairs would make it white, significantly changing its value.

+ To visualize the aforementioned pairs, they are + labeled with the same number in this image.

+ + + 1 + 2 + 3 + + + 4 + + 4 + + + 3 + 2 + 1 + + +
Due to this, a line like this could not be found, and the center pixel would remain unchanged:

+ + + 1 + 2 + 3 + + + 4 + + 4 + + + 3 + 2 + 1 + + +

Mode 6

+

+ This mode is similar to mode 5 in that it clips the center pixel's value to opposing pairs of pixels in its neighbourhood. The difference is the selection of the used pair. Unlike mode 5, mode 6 considers the range of the clipping operation (i.e. the difference between the two pixels) as well as the change applied to the center pixel. The exact math looks like this, where p1 is the first of the two opposing pixels, p2 is the second, c_orig is the original center pixel, and c_processed is the center pixel after applying the clipping.

+
+ diff = abs(c_orig - c_processed) * 2 + abs(p1 - p2) +
+

+ This means that a clipping pair is favored if it only slightly changes the center pixel and there is only + little difference between the two pixels. The change applied to the center pixel is prioritized (ratio 2:1) + in this mode. The + pair with the lowest diff is used.

+

Mode 7

+ Mode 7 is very similar to mode 6. The only difference lies in the weighting of the values. +
+ diff = abs(c_orig - c_processed) + abs(p1 - p2) +
+ Unlike before, the difference between the original and the processed center pixel is not multiplied by two. The + rest of the code is identical. +

Mode 8

+ Again, not much of a difference here. This is essentially the opposite of mode 6. +
+ diff = abs(c_orig - c_processed) + abs(p1 - p2) * 2 +
+ The difference between p1 and p2 is prioritized over the change applied to the center pixel; again with a 2:1 + ratio. +

Mode 9

+ In this mode, only the difference between p1 and p2 is considered. The center pixel is not part of the equation.
+

Everything else remains unchanged. This can be useful to fix interrupted lines, as long as the length of the + gap never exceeds one pixel.

+ + + + + +
+ + + 1 + 2 + 3 + + + 4 + + 4 + + + 3 + 2 + 1 + + + + + + + + + + + + + + + + + + + + +
+
+ The center pixel is clipped to pair 3 which has the lowest range (zero, both pixels are black).
Should the calculated difference be the same for two pairs, the pair with the higher number (the numbers in the image, not their values) is prioritized. This applies to modes 5-9.
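+ The whole family of modes 5 through 9 follows the same pattern and only differs in how a pair is scored. A rough Python sketch, reusing clip_value from the glossary above (the weight constants per mode are taken from the formulas in this article):

# pairs: the four pairs of opposing neighbours, in the order they are labelled in the images above
def process_center(center, pairs, weight_center, weight_range):
    best = None
    for p1, p2 in pairs:  # later (higher-numbered) pairs win ties, as described above
        clipped = clip_value(center, min(p1, p2), max(p1, p2))
        score = weight_center * abs(center - clipped) + weight_range * abs(p1 - p2)
        if best is None or score <= best[0]:
            best = (score, clipped)
    return best[1]

# mode 5: weight_center=1, weight_range=0 (only the change to the center pixel matters)
# mode 6: weight_center=2, weight_range=1
# mode 7: weight_center=1, weight_range=1
# mode 8: weight_center=1, weight_range=2
# mode 9: weight_center=0, weight_range=1 (only the pair's own range matters)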
Mode 11 and 12
+ Mode 11 and 12 are essentially this: +
+ std.Convolution(matrix=[1, 2, 1, 2, 4, 2, 1, 2, 1]) +
+ There is no difference between mode 11 and mode 12. This becomes evident by looking at the code which was + literally copy-pasted from one function to the other. It should come as no surprise that my tests confirmed + this: +
+ >>> d = core.std.Expr([clip.rgvs.RemoveGrain(mode=11),clip.rgvs.RemoveGrain(mode=12)], 'x y - abs')
+ >>> d = d.std.PlaneStats()
+ >>> d.get_frame(0).props.PlaneStatsAverage
+
+ 0.0
+
+ There is, however, a slight difference between using these modes and std.Convolution() with the corresponding + matrix: +
+ >>> d = core.std.Expr([clip.rgvs.RemoveGrain(mode=12),clip.std.Convolution(matrix=[1, 2, 1, 2, 4, 2, 1, 2, + 1])], 'x y - abs')
+ >>> d = d.std.PlaneStats()
+ >>> d.get_frame(0).props.PlaneStatsAverage
+ 0.05683494908489186
+
+

+ This is explained by the different handling/interpolation of edge pixels, as can be seen in this comparison.
+ Edit: This may also be caused by a bug in the PlaneStats code which was fixed in R36. Since 0.05 is way + too high for such a small difference, the PlaneStats bug is likely the reason.
+ Note that the images were resized. The black dots were 1px in the original image.

+
All previews + generated with yuuno
+ + + + + + + + + + + +
+ + +
The source imageclip.std.Convolution(matrix=[1, 2, 1, 2, 4, 2, 1, 2, 1])clip.rgvs.RemoveGrain(mode=12)
+ Vapoursynth's internal convolution filter interpolates beyond the edges of an image by mirroring the pixels + close to + the edge. RemoveGrain simply leaves them unprocessed. +
Modes 13-16
+ These modes are very fast and very inaccurate field interpolators. They're like EEDI made in China, and there + should never be a reason to use any of them (and since EEDI2 was released in 2005, there was never any reason to + use them, either). +

Mode 13

+ + + 1 + 2 + 3 + + + + + + + + 3 + 2 + 1 + + +
+ Since this is a field interpolator, the middle row does not yet exist, so it cannot be used for any + calculations.
+ It uses the pair with the lowest difference and sets the center pixel to the average of that pair. In the + example, + pair 3 would be used, and the resulting center pixel would be a very light grey. +

Mode 14

+ Same as mode 13, but instead of interpolating the top field, it interpolates the bottom field. +

Mode 15

+
Same as 13 but with a more complicated interpolation formula.
+

Avisynth Wiki

+

+ “It's the same but different.” How did people use this plugin during the past decade?
+ Anyway, here is what it actually does: +

The convolution weights (row above, missing middle row, row below):
1 2 1
0 0 0
1 2 1

First, a weighted average of the three pixels above and below the center pixel is calculated as shown in the convolution above. Since this is still a field interpolator, the middle row does not yet exist.
+ Then, this average is clipped to the pair with the lowest difference.
+ In the example, the average would be a grey with slightly above 50% brightness. There are more dark pixels + than bright ones, but + the white pixels are counted double due to their position. This average would be clipped to the pair with + the smallest range, in this case bottom-left and top-right. The resulting pixel would thus have the color of + the top-right pixel. +
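Put into pseudocode (again a reading of the description above, not the plugin's source, and the rounding is a guess):

# mode 15 for one missing pixel of the top field; "top"/"bottom" are the three pixels above/below it
def mode15(top, bottom):
    avg = (top[0] + 2 * top[1] + top[2] + bottom[0] + 2 * bottom[1] + bottom[2] + 4) // 8  # 1-2-1 weighting
    pairs = [(top[0], bottom[2]), (top[1], bottom[1]), (top[2], bottom[0])]  # same pairs as mode 13
    a, b = min(pairs, key=lambda p: abs(p[0] - p[1]))                        # pair with the lowest difference
    return max(min(a, b), min(avg, max(a, b)))                               # clip the average to that pair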

+

Mode 16

+

+ Same as mode 15 but interpolates bottom field. +

+
Mode 17
+
+
+ Clips the pixel with the minimum and maximum of respectively the maximum and minimum of each pair of + opposite neighbour pixels. +
+ It may sound slightly confusing at first, but that is actually an accurate description of what this mode + does. + It creates an array containing the smaller value (called lower) + of each pair and one containing the bigger value (called upper + ). + The center pixel is then clipped to the smallest value in upper + and the biggest value in lower. +
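Or, as a small sketch (pseudocode, not the plugin's source):

# mode 17: "pairs" holds the four pairs of opposite neighbours of the center pixel
def mode17(center, pairs):
    lower = [min(a, b) for a, b in pairs]    # the smaller value of each pair
    upper = [max(a, b) for a, b in pairs]    # the bigger value of each pair
    lo, hi = max(lower), min(upper)          # biggest "lower", smallest "upper"
    lo, hi = min(lo, hi), max(lo, hi)        # keep the bounds ordered, just in case
    return max(lo, min(center, hi))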
+
Mode 21 and 22
+

Mode 21

+ The value of the center pixel is clipped to the smallest and the biggest average of the four surrounding pairs. +

Mode 22

+ Same as mode 21, but rounding is handled differently. This mode is faster than 21 (4 cycles per pixel). +
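In sketch form, both modes boil down to something like this, with mode 22 only differing in how the averages are rounded (pseudocode, not the plugin's source):

def mode21(center, pairs):
    averages = [(a + b + 1) // 2 for a, b in pairs]   # average of each pair of opposite neighbours
    return max(min(averages), min(center, max(averages)))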
Mode 23 and 24
+ These two are too difficult to explain using words, so I'm not even going to try. Read the code if you're + interested, but don't expect to find anything special. I can't see this mode actually doing something useful, so + the documentation was once again quite accurate. +

I feel like I'm missing a proper ending for this, but I can't think of anything

+
+
diff --git a/resolutions.html b/resolutions.html new file mode 100644 index 0000000..8bb6274 --- /dev/null +++ b/resolutions.html @@ -0,0 +1,343 @@ + + {% load static %} +
+
+ + +
+

Native Resolutions and Scaling

+

This is really not that good, and I should probably rewrite it,

but I'm lazy, so here's an unfinished whatever this is https://ddl.kageru.moe/kbz12.pdf

+

Table of contents

+ +

Introduction

+
As some (or many) might be aware, anime is usually produced at resolutions below 1080p. + However, since all Blu-Ray releases are 1080p, the source material has to be upscaled to this + resolution. This article will try to cover the basics of scaling Blu-Ray sourced material using + Vapoursynth and Avisynth.
+ Note: Most of the images were embedded as JPG to save bandwidth. Clicking an image will open the lossless PNG. +
+

Avisynth and Vapoursynth basics

+
+ In order to make the following easier to understand I will try to explain the basic scaling methods in Avisynth + and Vapoursynth. More detailed examples to deal with artifacts can be found in the corresponding paragraph. + Blue code blocks contain Vapoursynth code while green blocks contain Avisynth. For Vapoursynth I will be using + fmtconv to resize.

+ Scaling to 1080p using a bilinear resizer. This can be used for either upscaling or downscaling +

clip = core.fmtc.resample(clip, 1920, 1080, kernel = 'bilinear')

+

clip.BilinearResize(1920, 1080)

+ Note that fmtc will use Spline36 to resize if no kernel is specified. Spline is generally the better choice, and + we are only using bilinear as an example. To use Spline36 in Avisynth use +

clip.Spline36Resize(1920, 1080)

+ Using a debilinear resizer to reverse to a native resolution of 1280x720: (Note that you should never use this to + upscale anything) +

clip = core.fmtc.resample(clip, 1280, 720, kernel = 'bilinear', invks = True)

+

clip.Debilinear(1280, 720)

+ Debilinear for Avisynth can be found in the wiki. +
+

Bilinear/Debilinear Examples

+
+

Traditional scaling is done by spreading all pixels of an image over a higher resolution (e.g. 960x540 -> 1920x1080), interpolating the missing pixels (in our example every other pixel on each axis), and in some cases applying additional post-processing to the results. For a less simplified explanation and comparison of different scaling methods, refer to the Wikipedia article.
It is possible to invert the effects of this by using the corresponding inverse algorithm to downscale the image. This is only possible if the exact resolution of the source material is known and the video has not been altered after scaling it (we will deal with 1080p credits and text later).
+ A few examples of scaled and inverse scaled images: (click for full resolution PNG)
+
+ 1080p source frame from the One Punch man Blu-ray. No processing.
+
+ Source.Debilinear(1280, 720) +
+ Source.Debilinear(1280, 720).BilinearResize(1920,1080)
+ This reverses the scaling and applies our own bilinear upscale.
You may see slight differences caused by the Blu-Ray's compression noise, but without zooming in, and especially when played in real time, these images should be indistinguishable.
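If you want to put a number on that, a minimal VapourSynth sketch (assuming the usual core and src are set up, src is the 8-bit 1080p source, and fmtconv is installed; run it from a Python shell rather than vspipe):

src16 = core.fmtc.bitdepth(src, bits=16)                        # fmtc outputs 16 bit, so match the source first
down = core.fmtc.resample(src16, 1280, 720, kernel='bilinear', invks=True)
up = core.fmtc.resample(down, 1920, 1080, kernel='bilinear')
diff = core.std.Expr([src16, up], 'x y - abs').std.PlaneStats()
print(diff.get_frame(0).props.PlaneStatsAverage)                # close to zero for a clean bilinear upscale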

+

+ My second example will be a frame from Makoto Shinkai's Kotonoha no Niwa or "The Garden of Words". The movie + is not only beautifully drawn and animated but also produced at FullHD resolution. We will now upscale the + image to 4k using a bilinear resizer and reverse the scaling afterwards. +
+ The untouched source frame +
+ Source.BilinearResize(3840,2160) +
+ Source.BilinearResize(3840,2160).Debilinear(1920,1080)
+ This time the images are even more similar, because no artifacts were added after upscaling. + As you can see, using inverse kernels to reverse scaling is quite effective and will usually restore the + original image accurately. This is desirable, as it allows the encoder to apply a reverse scaling algorithm + to release in 720p, significantly decreasing the release's filesize. The 720p video will be upscaled by the + leecher's video player, potentially using high quality scaling methods like the ones implemented in MadVR. + Releasing in native resolution will therefore not just save space, but may even improve the image quality on + the consumer's end. +

+
+

Native resolutions

+
+ Finding the native resolution of your source is the most important step. If you use the wrong resolution or + try to debilinearize native 1080p material you will destroy details and introduce ringing artifacts. To + understand this let's take a look at this frame from Non Non Biyori Repeat. The show's native resolution is + 846p. + + Source + + Source.Debilinear(1280, 720)
+ Upon taking a closer look you will see that the edges of our 720p image look very jagged or aliased. This is + caused by improper debilinearizing. The effect will get stronger with sharper and more detailed edges. If + you encounter this never try to fix it by anti-aliasing. Try to find the correct resolution or don't + use inverse scaling at all. + + Source.Debilinear(1280, 720).PointResize(3840, 2160) and some cropping. Point resize (also called Nearest + Neighbor) is used to magnify without smoothing. + + Source.PointResize(3840, 2160) and cropping. As you can see this version does not have the ringing + artifacts.
+

Unfortunately, there are only a few ways of determining the native resolution.
The main source is anibin, a Japanese blog that analyzes anime to find its native resolution. In order to find an anime, you have to get the original title from MyAnimeList, AniSearch, AniDB, or any other source that has Kanji/Kana titles.
Non Non Biyori Repeat's Japanese title is "のんのんびより りぴーと", and if you copy-paste it into the search bar on anibin, you should get this result. Even if you don't understand Japanese, the numbers should speak for themselves. In this case the resolution is 1504x846. This is above 720p but below 1080p, so you have multiple options; I would recommend encoding in 1080p or using a regular resizer (like Spline36) if you need a 720p version. In some cases, even scaling back to anibin's resolution does not get rid of the ringing, either because the studio didn't use a bilinear resizer or because the analysis was incorrect due to artifacts caused by TV compression, in which case I wouldn't bother messing with the native resolution. It's not like you were gonna release in 846p, right?
Edit: Apparently there are people out there who genuinely believe releasing an 873p video is a valid option. This is not wrong from an objective standpoint, but you should never forget that the majority of leechers do not understand encoding and are likely to ignore your release, because "Only an idiot would release in 8xxp".

+ +

If you want an easier way to detect ringing and scaling artifacts, read the chapter about artifacts and + masks. +
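If you'd rather have a quick numerical hint before bothering with masks, you can brute-force a few candidate heights: inverse scale, upscale again with the same kernel, and compare the result against the source. A rough sketch (assuming fmtconv and the usual core/src; a lower average is a hint, not proof, so still check the image):

src16 = core.fmtc.bitdepth(src, bits=16)
for height in (720, 810, 846, 900):
    width = height * 16 // 9
    down = core.fmtc.resample(src16, width, height, kernel='bilinear', invks=True)
    up = core.fmtc.resample(down, 1920, 1080, kernel='bilinear')
    diff = core.std.Expr([src16, up], 'x y - abs').std.PlaneStats()
    print(height, diff.get_frame(0).props.PlaneStatsAverage)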

Btw, in case you do need (or want) to inverse scale our example, you would have to use a Debicubic resizer + which leads me to our next topic. +
+

Kernels

+
+

+ Sometimes you will encounter ringing and artifacts even if you are certain that you know the native + resolution. This usually means that the studio used another resizer. Our example will be Byousoku 5 + Centimeter or 5 Centimeters per Second (Anibin's + Blu-Ray analysis)

This will be our test frame: + We will be using the masking functions explained in the next paragraph. For now just accept them as a good + way to find artifacts. + If we try to debilinearize our example, the mask will look like this: + + Despite using the correct resolution we can see strong artifacts around all edges. This can have multiple, + not mutually exclusive reasons: +
    +
  1. The studio used sharpening filters after upscaling
  2. The studio used different resolutions for different layers or parts of the image
  3. The image was upscaled with a different kernel (not bilinear)
  4. Our resolution is wrong after all
The first two reasons are not fixable, as I will illustrate using a flashback scene from Seirei Tsukai no Blade Dance. The scene features very strong and dynamic grain which was added after the upscale, resulting in 720p backgrounds and 1080p grain. And now the mask: In this case you would have to trim the scene and use a regular resizer. Sometimes all backgrounds were drawn in a higher or lower resolution than the characters and foreground objects. In this case, inverse scaling becomes very difficult, since you would need to know the resolution of all the different planes as well as a way to mask and merge them. I'd advise using a regular resizer for these sources or just releasing in 1080p.
+ After talking about problems we can't fix, let's go back to our example to fix reason 3. Some (especially + more recent) Blu-Rays were upscaled with a bicubic kernel rather than bilinear. Examples are Death Parade, + Monster Musume, Outbreak Company, and of course our image. Applying the mask with debicubic scaling results + in far fewer artifacts, as seen here: (hover over the image to see the bilinear mask) + + The remaining artifacts are likely caused by compression artifacts on the Blu-Ray as well as potential + postprocessing in the studio. This brings us back to reason 1, although in this case the artifacts are weak + enough to let the mask handle them and use debicubic for the rest.
+ Usage (without mask):
+ To realize this in Avisynth import Debicubic and use it + like this: +

src.Debicubic(1280, 720, b=0, c=1)

+ For Vapoursynth use fmtconv: +

out = core.fmtc.resample(src, 1280, 720, kernel = 'bicubic', invks = True, a1 = 0, a2 = + 1)

+ To use a mask for overlays and potential artifacts as well as 4:4:4 output use the Vapoursynth function + linked at the bottom. Example for bicubic upscales: +

out = deb.debilinearM(src, 1280, 720, kernel = 'bicubic')

+ If the b and c parameters are not 0 and 1 (which should rarely be the case) you can set them as a1 + and a2 like in fmtc.resample(). Bicubic's own default is 1/3 for both values so if bilinear and bicubic 0:1 + don't work you could give that a try.
+ Edit: I did some more testing and consulted another encoder regarding this issue. + Since we're using an inverse kernel in vapoursynth, the results may differ slightly from avisynth's debicubic. + In those cases, adjusting the values of a1 and a2, as well as the number of taps used for scaling can be + beneficial and yield a slightly sharper result. + +
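For example (a hedged starting point, since the ideal values depend on the source; invkstaps is fmtconv's parameter for the number of taps of the inverse kernel):

out = core.fmtc.resample(src, 1280, 720, kernel='bicubic', invks=True, a1=1/3, a2=1/3, invkstaps=4)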
+

Masks: Dealing with artifacts and 1080p overlays

+
+ Sometimes a studio will add native 1080p material (most commonly credits or text) on top of the image. + Inverse scaling may work with the background, but it will produce artifacts around the text as seen in the + example from Mushishi Zoku Shou below: + + In order to avoid this you will have to mask these parts with conventionally downscaled pixels. + The theory behind inverse scaling is that it can be reversed by using regular scaling, so (in theory) a + source frame from a bilinear upscale would be identical to the output of this script: +

source.Debilinear(1280,720).BilinearResize(1920,1080)

+ This property is used by scripts to mask native 1080p content by finding the differences between the source + and the above script's output. A mask would look like this:
+ + If there are any differences, the areas with artifacts will be covered by a regular downscale like +

source.Spline36Resize(1280,720)

+ In Avisynth you can import DebilinearM which can also be found in the wiki. + For Vapoursynth MaskDetail + can be used to create the Mask and MaskedMerge to mask the artifacts. A full importable script is available + at the end. +

+ #MaskDetail has to be imported or copied into the script
+ #src is the source clip
+ deb = core.fmtc.resample(src, 1280, 720, kernel = 'bilinear', invks = True)
+ noalias = core.fmtc.resample(src, 1280, 720, kernel="blackmanminlobe", taps=5)
+ mask = maskDetail(src, 1280, 720, kernel = 'bilinear')
masked = core.std.MaskedMerge(noalias, deb, core.std.Invert(mask, 0)) #use the regular downscale for the masked (native 1080p) areas and the inverse-scaled clip everywhere else
+

+ Using this function to filter our scene returns this image: (hover to see the unmasked version) + + The credits stand out less and don't look oversharpened. The effect can be much stronger depending on the + nature and style of the credits. +
+

Subsampling

+
+ You may have encountered fansubs released in 720p with 4:4:4 subsampling. In case you don't know the + term, subsampled images store luma (brightness) and chroma (color) at different resolutions. A Blu-Ray will + always have 4:2:0 subsampling, meaning the chroma channels have half the resolution of the luma channel. + When downscaling you retain the subsampling of the source, resulting in 720p luma and 360p chroma. + Alternatively you can split the source video in luma and chroma and then debilinearize the luma + (1080p->720p) while upscaling the chroma planes (540p->720p). Using the same resolution for luma and chroma will + prevent colorbleeding, retain more of the chroma present in the source, and prevent desaturation.
+ A script for Avisynth and the discussion can be found on doom9. + For Vapoursynth I prefer to use the script explained in the next section which allows me to mask credits and + convert to 4:4:4 simultaneously. +

Importable Vapoursynth script

+ While there may be scripts for literally anything in Avisynth, Vapoursynth is still fairly new and growing. + To make this easier for other Vapoursynth users I have written this simple import script which allows you to + debilinearize with masks and 4:4:4 output. A downloadable version is linked below the explanation. + + Essentially, all the script does is split the video in its planes (Y, U and V) to scale them separately, + using debilinear for luma downscaling and spline for chroma upscaling. The example given is for 720p + bilinear upscaled material: +

+ y = core.std.ShufflePlanes(src, 0, colorfamily=vs.GRAY)
+ u = core.std.ShufflePlanes(src, 1, colorfamily=vs.GRAY)
+ v = core.std.ShufflePlanes(src, 2, colorfamily=vs.GRAY)
+ y = core.fmtc.resample(y, 1280, 720, kernel = 'bilinear', invks = True)
+ u = core.fmtc.resample(u, 1280, 720, kernel = "spline36", sx = 0.25)
+ v = core.fmtc.resample(v, 1280, 720, kernel = "spline36", sx = 0.25)
+ out = core.std.ShufflePlanes(clips=[y, u, v], planes = [0,0,0], colorfamily=vs.YUV)
+ noalias = core.fmtc.resample(src, 1280, 720, css = '444', kernel="blackmanminlobe", taps=5)
+ mask = maskDetail(src, 1280, 720, kernel = 'bilinear')
+ out = core.std.MaskedMerge(noalias, out, core.std.Invert(mask, 0))
+ out.set_output()
+

+ To call this script easily copy this file into + C:\Users\Your_Name\AppData\Local\Programs\Python\Python35\Lib\site-packages + and use it like this: +

+ import vapoursynth as vs
+ import debilinearm as deb
+ core = vs.get_core()
+ src = core.lsmas.LWLibavSource(r'E:\path\to\source.m2ts') #other source filters will work too
+ out = deb.debilinearM(src, width, height, kernel)
+

+ Where width and height are your target dimension and kernel is the used upscaling method. The output will be + in 16-bit and 4:4:4 subsampling.
The defaults are + (1280, 720, 'bilinear') meaning in most cases (720p bilinear upscales) you can just call: +

out = deb.debilinearM(src)

+ List of parameters and explanation:
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
parameter   | type, default       | explanation
src         | clip                | the source clip
w           | int, 1280           | target width
h           | int, 720            | target height
kernel      | string, 'bilinear'  | kernel used for inverse scaling. Has to be in 'quotes'
taps        | int, 4              | number of taps for reverse scaling
return_mask | boolean, False      | returns artifact mask in grayscale if True
a1          | int, 0              | b parameter of bicubic upscale, ignored if kernel != 'bicubic'
a2          | int, 1              | c parameter of bicubic upscale, ignored if kernel != 'bicubic'
+

Edit: The generic functions (core.generic.*) were removed from Vapoursynth in R33, as most of them are now part of the standard package (core.std.*). I have updated the script below accordingly, meaning it may not work with R32 or older. This also applies to MonoS' MaskDetail which (as of now) has not been updated. You can "fix" it by replacing both occurrences of "core.generic" with "core.std".

+

+ The most recent version of my scripts can always be found on Github:
Download
+ Download fmtconv (necessary) +

+
+
diff --git a/template.html b/template.html new file mode 100644 index 0000000..12feace --- /dev/null +++ b/template.html @@ -0,0 +1,20 @@ + + + {% load static %} +
+
+ +
+

Native Resolutions and Scaling

+

Table of contents

+

+

+

+

Introduction

+
+ +
+
+ diff --git a/videocodecs.html b/videocodecs.html new file mode 100644 index 0000000..35965dc --- /dev/null +++ b/videocodecs.html @@ -0,0 +1,353 @@ + + {% load static %} +
+
+ +
+

Why and how to use x265

+ +

Introduction

+
For many years, x264 has been the standard codec for video encoding and achieved the best results one could get in terms of compression efficiency. But in 2013, when the initial version of x265 was released, it yielded far better results than were previously possible with x264. Now the stable 2.0 version of x265 has been released, and we are a few CPU and GPU generations further along than we were in 2013. Additionally, new PCs, notebooks, and even smartphones are shipping with native hardware support for HEVC decoding, so as of today, more and more people can watch HEVC encoded videos just as easily as AVC encoded ones. The problem is that the encoders in the fansubbing community are only slowly adopting the new codec, effectively wasting the viewers' bandwidth or offering lower quality than they could achieve with x265.
+ In the following section, I will explain why HEVC/x265 is superior to x264/AVC and why you should use it to encode your + videos. +
+

File size comparison

+
+ This section will be dedicated to comparing the difference in filesize between x264 and x265. + For x265, I used CRF 17 and the veryslow preset, which already yields very good results. + For x264, I used CRF 15, the preset veryslow and the parameters subme 11, me tesa, merange 32, and bframes 16.
+ Both encodes also have aq-mode 3 enabled.
Please note: CRF values in x264 and x265 are NOT comparable; the two encoders calculate CRF differently. I found CRF 15 for x264 and CRF 17 for x265 to yield nearly the same quality, but results may vary. You have been warned.
[Image comparison: One Punch Man episode 1, frame 13487 at CRF 15 in x264 | the same frame in x265 at CRF 17]
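For reference, the settings above correspond to command lines along these lines, with the clips fed from VapourSynth via vspipe (exact builds, file names, and muxing are up to you):

vspipe --y4m clip.vpy - | x264 --demuxer y4m --preset veryslow --crf 15 --me tesa --subme 11 --merange 32 --bframes 16 --aq-mode 3 -o x264_test.mkv -
vspipe --y4m clip.vpy - | x265 --y4m --preset veryslow --crf 17 --aq-mode 3 -o x265_test.hevc -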
+
+ +

1. Static videos: The test clip consists of the first 1000 frames of Non Non Biyori Repeat episode 1.

+ + Download the encodes
+

+
+ Logfiles of the encodes (expandable):
+
x264 log +

+ x264 [info]: 1920x1080p 0:0 @ 24000/1001 fps (cfr)
+ x264 [info]: color matrix: undef
+ x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
+ x264 [info]: AVC Encoder x264 core 148 r2699+6+41 29a38aa Yuuki [10-bit@all X86_64][GCC 5.3.0]
+ x264 [info]: profile: High 10, level: 5.1, subsampling: 4:2:0, bit-depth: 10-bit
+ x264 [info]: cabac=1 ref=16 deblock=1:0:-1 analyse=0x3:0x133 me=tesa subme=11 psy=1 fade_compensate=0.00 psy_rd=1.00:0.00 mixed_ref=1 me_range=32 chroma_me=1 trellis=2 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=0 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=0 interlaced=0 bluray_compat=0 constrained_intra=0 fgo=0 bframes=16 b_pyramid=2 b_adapt=2 b_bias=0 direct=3 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=23 scenecut=40 intra_refresh=0 rc_lookahead=60 rc=crf mbtree=1 crf=15.0000 qcomp=0.60 qpmin=0:0:0 qpmax=81:81:81 qpstep=4 ip_ratio=1.40 aq=3:0.80
+ x264 [info]: started at Sun Aug 07 23:46:37 2016
+ x264 [info]: frame I:10 Avg QP:22.52 size:391771
+ x264 [info]: frame P:236 Avg QP:26.76 size: 70531
+ x264 [info]: frame B:754 Avg QP:28.05 size: 7533
+ x264 [info]: consecutive B-frames: 1.1% 2.0% 45.0% 3.2% 6.0% 12.0% 4.2% 11.2% 8.1% 2.0% 1.1% 2.4% 0.0% 0.0% 0.0% 0.0% 1.7%
+ x264 [info]: mb I I16..4: 49.8% 28.1% 22.1%
+ x264 [info]: mb P I16..4: 4.6% 0.7% 0.7% P16..4: 44.1% 28.7% 10.7% 1.6% 0.2% skip: 8.7%
+ x264 [info]: mb B I16..4: 0.4% 0.0% 0.1% B16..8: 28.4% 4.8% 0.3% direct: 1.5% skip:64.5% L0:42.1% L1:55.0% BI: 2.8%
+ x264 [info]: 8x8 transform intra:16.7% inter:41.3%
+ x264 [info]: direct mvs spatial:98.5% temporal:1.5%
+ x264 [info]: coded y,uvDC,uvAC intra: 89.9% 87.5% 81.4% inter: 11.0% 15.1% 7.0%
+ x264 [info]: i16 v,h,dc,p: 14% 10% 18% 57%
+ x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 12% 8% 8% 9% 11% 10% 9% 10%
+ x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 19% 8% 30% 6% 7% 7% 7% 7% 9%
+ x264 [info]: i8c dc,h,v,p: 52% 18% 11% 18%
+ x264 [info]: Weighted P-Frames: Y:3.0% UV:3.0%
+ x264 [info]: ref P L0: 58.3% 19.9% 5.7% 4.1% 1.6% 3.0% 1.3% 1.2% 0.6% 1.0% 0.6% 0.9% 0.5% 0.8% 0.5% 0.0%
+ x264 [info]: ref B L0: 71.8% 11.7% 5.5% 2.5% 1.8% 1.7% 1.3% 0.7% 0.6% 0.5% 0.5% 0.5% 0.4% 0.3% 0.1%
+ x264 [info]: ref B L1: 97.1% 2.9%
+ x264 [info]: kb/s:5033.59
+ x264 [info]: encoded 1000 frames, 0.5356 fps, 5033.74 kb/s, 25.03 MB
+ x264 [info]: ended at Mon Aug 08 00:17:44 2016
+ x264 [info]: encoding duration 0:31:07
+

+
+
x265 log +

+ yuv [info]: 1920x1080 fps 24000/1001 i420p8 unknown frame count
+ x265 [info]: Using preset veryslow & tune none
+ x265 [info]: HEVC encoder version 2.0M+9-g457336f+14
+ x265 [info]: build info [Windows][GCC 5.3.0][64 bit] Yuuki 10bit
+ x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
+ x265 [info]: Main 10 profile, Level-4 (Main tier)
+ x265 [info]: Thread pool created using 4 threads
+ x265 [info]: frame threads / pool features : 2 / wpp(17 rows)
+ x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
+ x265 [info]: Residual QT: max TU size, max depth : 32 / 3 inter / 3 intra
+ x265 [info]: ME / range / subpel / merge : star / 57 / 4 / 4
+ x265 [info]: Keyframe min / max / scenecut : 23 / 250 / 40
+ x265 [info]: Lookahead / bframes / badapt : 40 / 8 / 2
+ x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1
+ x265 [info]: References / ref-limit cu / depth : 5 / off / on
+ x265 [info]: AQ: mode / str / qg-size / cu-tree : 3 / 1.0 / 32 / 1
+ x265 [info]: Rate Control / qCompress : CRF-17.0 / 0.60
+ x265 [info]: tools: rect amp limit-modes rd=6 psy-rd=2.00 rdoq=2 psy-rdoq=1.00
+ x265 [info]: tools: rskip signhide tmvp b-intra strong-intra-smoothing deblock
+ x265 [info]: tools: sao
+ x265 [info]: frame I: 9, Avg QP:14.24 kb/s: 32406.31
+ x265 [info]: frame P: 178, Avg QP:14.40 kb/s: 6617.61
+ x265 [info]: frame B: 813, Avg QP:23.03 kb/s: 326.86
+ x265 [info]: Weighted P-Frames: Y:12.4% UV:12.4%
+ x265 [info]: Weighted B-Frames: Y:14.9% UV:14.3%
+ x265 [info]: consecutive B-frames: 5.9% 1.1% 21.4% 2.7% 9.6% 36.9% 4.8% 10.2% 7.5%
+ + encoded 1000 frames in 1010.58s (0.99 fps), 1735.33 kb/s, Avg QP:21.41
+

+
+

The x264 encode has an average bitrate of 5033.74 kb/s, resulting in a total filesize of 25.03 MB, while the x265 encode has an average bitrate of 1735.33 kb/s, resulting in a total filesize of 8.64 MB. This is a 66% reduction, meaning the x265 file is only about a third of the size of the x264 file while having the same visual quality.

+
+

+ 2. High-motion videos: The test clip consists of 1000 frames of One Punch Man episode 1, beginning at frame 13000.

+ + Download the encodes

+

+
+ Logfiles of the encodes (expandable):
+
x264 log +

+ x264 [info]: 1920x1080p 0:0 @ 24000/1001 fps (cfr)
+ x264 [info]: color matrix: undef
+ x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
+ x264 [info]: AVC Encoder x264 core 148 r2699+6+41 29a38aa Yuuki [10-bit@all X86_64][GCC 5.3.0]
+ x264 [info]: profile: High 10, level: 5.1, subsampling: 4:2:0, bit-depth: 10-bit
+ x264 [info]: cabac=1 ref=16 deblock=1:0:-1 analyse=0x3:0x133 me=tesa subme=11 psy=1 fade_compensate=0.00 psy_rd=1.00:0.00 mixed_ref=1 me_range=32 chroma_me=1 trellis=2 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=0 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=0 interlaced=0 bluray_compat=0 constrained_intra=0 fgo=0 bframes=16 b_pyramid=2 b_adapt=2 b_bias=0 direct=3 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=23 scenecut=40 intra_refresh=0 rc_lookahead=60 rc=crf mbtree=1 crf=15.0000 qcomp=0.60 qpmin=0:0:0 qpmax=81:81:81 qpstep=4 ip_ratio=1.40 aq=3:0.80
+ x264 [info]: started at Mon Aug 08 01:49:11 2016
+ x264 [info]: frame I:8 Avg QP:25.24 size:284336
+ x264 [info]: frame P:285 Avg QP:28.03 size: 95640
+ x264 [info]: frame B:707 Avg QP:28.83 size: 40806
+ x264 [info]: consecutive B-frames: 2.9% 5.4% 26.7% 44.8% 9.0% 9.0% 1.4% 0.8% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
+ x264 [info]: mb I I16..4: 0.5% 95.9% 3.6%
+ x264 [info]: mb P I16..4: 0.5% 13.6% 1.3% P16..4: 35.8% 34.5% 12.1% 1.2% 0.1% skip: 1.0%
+ x264 [info]: mb B I16..4: 0.1% 3.9% 0.1% B16..8: 37.8% 17.5% 2.6% direct: 8.1% skip:29.9% L0:53.2% L1:42.1% BI: 4.8%
+ x264 [info]: 8x8 transform intra:91.7% inter:78.2%
+ x264 [info]: direct mvs spatial:99.6% temporal:0.4%
+ x264 [info]: coded y,uvDC,uvAC intra: 90.3% 91.1% 74.3% inter: 34.2% 38.3% 10.4%
+ x264 [info]: i16 v,h,dc,p: 12% 22% 10% 55%
+ x264 [info]: i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 11% 7% 16% 10% 12% 11% 11% 10% 12%
+ x264 [info]: i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 11% 9% 3% 11% 18% 14% 13% 10% 11%
+ x264 [info]: i8c dc,h,v,p: 47% 20% 19% 14%
+ x264 [info]: Weighted P-Frames: Y:9.1% UV:7.0%
+ x264 [info]: ref P L0: 41.9% 17.7% 9.6% 5.6% 5.2% 4.0% 2.8% 2.1% 1.5% 1.8% 2.1% 2.0% 1.4% 1.1% 0.9% 0.1%
+ x264 [info]: ref B L0: 58.4% 11.5% 6.4% 4.0% 2.8% 4.0% 2.6% 1.5% 1.4% 1.4% 1.1% 2.0% 1.6% 1.0% 0.3%
+ x264 [info]: ref B L1: 95.8% 4.2%
+ x264 [info]: kb/s:11198.17
+ x264 [info]: encoded 1000 frames, 0.2929 fps, 11198.33 kb/s, 55.68 MB
+ x264 [info]: ended at Mon Aug 08 02:46:06 2016
+ x264 [info]: encoding duration 0:56:55
+

+
+
x265 log +

+ yuv [info]: 1920x1080 fps 24000/1001 i420p8 unknown frame count
+ x265 [info]: Using preset veryslow & tune none
+ x265 [info]: HEVC encoder version 2.0M+9-g457336f+14
+ x265 [info]: build info [Windows][GCC 5.3.0][64 bit] Yuuki 10bit
+ x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
+ x265 [info]: Main 10 profile, Level-4 (Main tier)
+ x265 [info]: Thread pool created using 4 threads
+ x265 [info]: frame threads / pool features : 2 / wpp(17 rows)
+ x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
+ x265 [info]: Residual QT: max TU size, max depth : 32 / 3 inter / 3 intra
+ x265 [info]: ME / range / subpel / merge : star / 57 / 4 / 4
+ x265 [info]: Keyframe min / max / scenecut : 23 / 250 / 40
+ x265 [info]: Lookahead / bframes / badapt : 40 / 8 / 2
+ x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1
+ x265 [info]: References / ref-limit cu / depth : 5 / off / on
+ x265 [info]: AQ: mode / str / qg-size / cu-tree : 3 / 1.0 / 32 / 1
+ x265 [info]: Rate Control / qCompress : CRF-17.0 / 0.60
+ x265 [info]: tools: rect amp limit-modes rd=6 psy-rd=2.00 rdoq=2 psy-rdoq=1.00
+ x265 [info]: tools: rskip signhide tmvp b-intra strong-intra-smoothing deblock
+ x265 [info]: tools: sao
+ x265 [info]: frame I: 14, Avg QP:14.37 kb/s: 18963.14
+ x265 [info]: frame P: 253, Avg QP:15.76 kb/s: 12334.19
+ x265 [info]: frame B: 733, Avg QP:19.84 kb/s: 3709.09
+ x265 [info]: Weighted P-Frames: Y:9.5% UV:8.3%
+ x265 [info]: Weighted B-Frames: Y:8.3% UV:6.8%
+ x265 [info]: consecutive B-frames: 11.6% 9.4% 14.6% 43.1% 7.1% 10.5% 1.5% 1.1% 1.1%
+ + encoded 1000 frames in 2391.07s (0.42 fps), 6104.80 kb/s, Avg QP:18.73
+

+
+
+

The x264 encode has an average bitrate of 11198.33 kb/s, resulting in a total filesize of 55.68 MB, while the x265 encode has an average bitrate of 6104.80 kb/s, resulting in a total filesize of 30.3 MB. This is a reduction of roughly 45%, meaning the x265 file is a little over half the size of the x264 file.

Conclusion: x265 offers the same visual quality at significantly lower bitrates, meaning one can either offer an encode with higher visual quality than x264 at the same filesize, or reduce the filesize of the encoded videos by a large amount while offering the same visual fidelity as an x264 encode. With no real downsides apart from longer encoding times and slightly reduced compatibility, there really is no reason not to use it.

+
+

Useful parameters for encoding with x265

+
+

+ Just as with x264, x265 has many parameters you can use if you don't want to stick to the presets and + are trying to get the best possible quality. In the following section, I will explain some of these + parameters and how to use them. You can click on each parameter to get more information about it.

+

+ +
--preset +

+ Options: ultrafast, superfast, veryfast, faster, fast, medium, slow, slower, veryslow, placebo
+ What it does: The further to the right the preset on this list is, the higher the compression efficiency will be at the cost of slowing down your encode.
+ What to use: Medium or slower are fine, but I would recommend slow or veryslow depending on how strong your encoding rig is. + Don't use placebo, it will result in greatly increased encoding time with diminishing returns in comparison to veryslow. +

+
+
--ref +

+ Options: An integer from 1 to 16
+ What it does: Max number of L0 references to be allowed. This number has a linear multiplier effect on the amount of work performed in motion search.
What to use: If --b-pyramid is enabled (which is the default), the HEVC specification only allows ref 6 as a maximum; without --b-pyramid, the maximum ref allowed by the specification is 7. Generally, you want to use the highest number possible (within the specification), as it yields the best results.

+
+
--rd +

+ Options: An integer from 1 to 6
+ What it does: The higher the value, the more exhaustive the RDO analysis is and the more rate distortion optimizations are used.
What to use: The highest option you can afford; in general, the lower the value, the faster the encode, and the higher the value, the smaller the bitstream. Please note that, in the current version, rd levels 3 and 4 are identical, as are 5 and 6.

+
+
--ctu +

+ Options: 64,32,16
+ What it does: CTUs and CUs are the logical units in which the HEVC encoder divides a given picture. This + option sets the maximum CU size.
+ What to use: No reason not to use 64, as it will give you large reductions in bitrate compared to the other two options + with an insignificant increase in computing time. +

+
+
--min-cu-size +

+ Options: 64,32,16,8
+ What it does: CTUs and CUs are the logical units in which the HEVC encoder divides a given picture. This + option sets the minimum CU size.
+ What to use: Use 8, as it is an easy way to save bitrate without a significant increase in computing time. +

+
+
--rect, --no-rect +

+ What it does: Enables the analysis of rectangular motion partitions.
+ What to use: --rect for better encode results, --no-rect for faster encoding. +

+
+
--amp, --no-amp +

+ What it does: Enables the analysis of asymmetric motion partitions.
+ What to use: --amp for better encode results, --no-amp for faster encoding. +

+
+
--rskip, --no-rskip +

+ What it does: This option determines early exit from CU depth recursion.
+ What to use: Provides minimal quality degradation at good performance gains when enabled, so you can + choose what you want. +

+
+
--rdoq-level +

+ Options: 0,1,2
What it does: Specifies the amount of rate-distortion analysis to use within quantization.
+ What to use: The standard is 2, which seems pretty good. +

+
+
--max-merge +

+ Options: An integer from 1 to 5
+ What it does: Maximum number of neighbor candidate blocks that the encoder may consider for merging motion predictions.
What to use: Something from 3 to 5, depending on whether you are aiming for a faster encode or better results.

+
+
--me +

+ Options: dia, hex, umh, star, full
+ What it does: Motion search method. Diamond search is the simplest. Hexagon search is a little better. + Uneven Multi-Hexagon is an adaption of the search method used by x264 for slower presets. + Star is a three step search adapted from the HM encoder and full is an exhaustive search.
+ What to use: Umh for faster encoding, star for better encode results. Dia and hex are not worth the + quality loss and full gives diminishing returns. +

+
+
--subme +

+ Options: An integer from 1 to 7
What it does: The amount of subpel refinement to perform; the higher the number, the more subpel iterations and steps are performed.
+ What to use: Something from 4 to 7, depending on whether you are going for faster encoding or better results. +

+
+
--merange +

+ Options: An integer from 0 to 32768
What it does: This is the motion search range, i.e. how far from the original position the motion search is allowed to look.
+ What to use: The standard of 57 seems quite good, you can experiment with higher values if you want, but please + keep in mind that higher values will give you diminishing returns. +

+
+
--constrained-intra, --no-constrained-intra +

+ What it does: Constrained intra prediction. The general idea is to block the propagation of reference + errors that may have resulted from lossy signals.
+ What to use: --no-constrained-intra (which is default) unless you know what you're doing. +

+
+
--psy-rd +

+ Options: A float from 0 to 5.0
+ What it does: Turning on small amounts of psy-rd and psy-rdoq will improve the perceived visual quality, + trading distortion for bitrate. If it is too high, it will introduce visual artifacts.
+ What to use: A value between 0.5 and 1.0 is a good starting point, you can experiment with higher values if you want, but + don't overdo it unless you like visual artifacts. +

+
+
--psy-rdoq +

+ Options: A float from 0 to 50.0
+ What it does: Turning on small amounts of psy-rd and psy-rdoq will improve the perceived visual quality, + trading distortion for bitrate. High levels of psy-rdoq can double the bitrate, so be careful.
+ What to use: You should be good to go with a value between 0 and 5.0, but I wouldn't take a value much higher + than 1.0 because I haven't done enough tests yet. +

+
+
--rc-grain, --no-rc-grain +

+ What it does: This parameter strictly minimizes QP fluctuations within and across frames and removes pulsing of grain.
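To tie this together, a command line that combines the recommendations above could look roughly like this (a starting point, not gospel; tune CRF and the psy values per source):

vspipe --y4m script.vpy - | x265 --y4m --preset veryslow --crf 17 --aq-mode 3 --ctu 64 --min-cu-size 8 --ref 6 --rd 6 --rdoq-level 2 --me star --subme 5 --merange 57 --max-merge 4 --rect --amp --psy-rd 1.0 --psy-rdoq 1.0 -o output.hevc -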
+ What to use: Use this whenever you need to encode grainy scenes, otherwise leave it disabled. +

+
+
+