blog-content/resolutions.html


2019-04-13 15:16:45 +02:00
<a href="/blog">
{% load static %}
<div class="bottom_right_div"><img src="{% static '2hu.png' %}"></div>
</a>
<script>
function hover(id, path) {
$("#" + id).attr('src', path);
}
</script>
<div id="overlay" aria-hidden="true" onclick="removefull()"></div>
<div class="wrapper_article">
<p class="heading">Native Resolutions and Scaling</p>
<p class="subhead">This is really not that good, and I should probably rewrite it,<br><br>but I'm lazy, so here's an unfinished whatever this is <a href="https://ddl.kageru.moe/kbz12.pdf">https://ddl.kageru.moe/kbz12.pdf</a></p>
<p class="subhead">Table of contents</p>
<div class="content">
<ul>
<li><a href="#c_introduction"> Introduction</a></li>
<li><a href="#c_basics">Avisynth and Vapoursynth basics</a></li>
<li><a href="#c_examples">Bilinear/Debilinear examples</a></li>
<li><a href="#c_native">Native resolutions</a></li>
<li><a href="#c_kernel">Kernels</a></li>
<li><a href="#c_mask">Masks: Dealing with artifacts and 1080p overlays</a></li>
<li><a href="#c_ss">Subsampling</a></li>
<li><a href="#c_import">Importable Vapoursynth script</a></li>
</ul>
</div>
<p class="subhead"><a href="#c_introduction" id="c_introduction">Introduction</a></p>
<div class="content">As some (or many) might be aware, anime is usually produced at resolutions below 1080p.
However, since all Blu-Ray releases are 1080p, the source material has to be upscaled to this
resolution. This article will try to cover the basics of scaling Blu-Ray sourced material using
Vapoursynth and Avisynth.<br>
Note: Most of the images were embedded as JPG to save bandwidth. Clicking an image will open the lossless PNG.
</div>
<p class="subhead"><a href="#c_basics" id="c_basics">Avisynth and Vapoursynth basics</a></p>
<div class="content">
To make the rest of this article easier to follow, I will start with the basic scaling methods in Avisynth
and Vapoursynth. More detailed examples for dealing with artifacts can be found in the section on masks.
Blue code blocks contain Vapoursynth code while green blocks contain Avisynth. For Vapoursynth I will be using
<a href="https://github.com/EleonoreMizo/fmtconv/releases">fmtconv</a> to resize.<br><br>
Scaling to 1080p using a bilinear resizer (this works for both upscaling and downscaling):
<p class="vapoursynth">clip = core.fmtc.resample(clip, 1920, 1080, kernel = 'bilinear')</p>
<p class="avisynth">clip.BilinearResize(1920, 1080)</p>
Note that fmtc will use Spline36 to resize if no kernel is specified. Spline is generally the better choice, and
we are only using bilinear as an example. To use Spline36 in Avisynth use
<p class="avisynth">clip.Spline36Resize(1920, 1080)</p>
Using a debilinear resizer to reverse an upscale back to a native resolution of 1280x720: <span style="font-size: 60%;margin-left: 2em;">(Note that you should <b>never</b> use this to
upscale anything)</span>
<p class="vapoursynth">clip = core.fmtc.resample(clip, 1280, 720, kernel = 'bilinear', invks = True)</p>
<p class="avisynth">clip.Debilinear(1280, 720)</p>
Debilinear for Avisynth can be found <a href="http://avisynth.nl/index.php/Debilinear">in the wiki</a>.
</div>
<p class="subhead"><a href="#c_examples" id="c_examples">Bilinear/Debilinear Examples</a></p>
<div class="content">
<p> Traditional scaling is done by spreading all pixels of an image over a higher resolution (e.g. 960x540 ->
1920x1080), interpolating the missing pixels (in our example every other pixel on each axis), and in some
cases applying additional post-processing to the results.
For a less simplified explanation and comparison of different scaling methods refer to the <a
href="https://en.wikipedia.org/wiki/Image_scaling">Wikipedia article</a>.<br>
It is possible to invert the effects of this by downscaling the image with the corresponding inverse
algorithm.
This is only possible if the <b>exact</b> resolution of the source material is known and the video has not
been altered after scaling it (we will deal with 1080p credits and text later).<br>
A few examples of scaled and inverse scaled images: (click for full resolution PNG)<br>
<img src="/media/articles/res/opm_src.jpg" onclick="fullimage('/media/articles/res/opm_src.png')" title="ONE PUUUUNCH"><br>
1080p source frame from the One Punch Man Blu-ray. No processing.<br>
<img src="/media/articles/res/opm_deb.jpg" onclick="fullimage('/media/articles/res/opm_deb.png')"><br>
Source.Debilinear(1280, 720)
<img src="/media/articles/res/opm_bilinear.jpg" onclick="fullimage('/media/articles/res/opm_bilinear.png')"><br>
Source.Debilinear(1280, 720).BilinearResize(1920,1080)<br>
This reverses the scaling and applies our own bilinear upscale.<br>
You may see slight differences, which are caused by the Blu-Ray
compression noise, but without zooming in and when played in real time these images should be
indistinguishable.
</p>
<p>
My second example will be a frame from Makoto Shinkai's Kotonoha no Niwa or "The Garden of Words". The movie
is not only beautifully drawn and animated but also produced at FullHD resolution. We will now upscale the
image to 4k using a bilinear resizer and reverse the scaling afterwards.
<img src="/media/articles/res/kotonohasrc.jpg" onclick="fullimage('/media/articles/res/kotonohasrc.png')"><br>
The untouched source frame
<img src="/media/articles/res/kotonoha4k.jpg" onclick="fullimage('/media/articles/res/kotonoha4k.png')"><br>
Source.BilinearResize(3840,2160)
<img src="/media/articles/res/kotonoha_deb.jpg" onclick="fullimage('/media/articles/res/kotonoha_deb.png')"><br>
Source.BilinearResize(3840,2160).Debilinear(1920,1080)<br>
This time the images are even more similar, because no artifacts were added after upscaling.
As you can see, using inverse kernels to reverse scaling is quite effective and will usually restore the
original image accurately. This is desirable, as it allows the encoder to apply a reverse scaling algorithm
to release in 720p, significantly decreasing the release's filesize. The 720p video will be upscaled by the
leecher's video player, potentially using high quality scaling methods like the ones implemented in MadVR.
Releasing in native resolution will therefore not just save space, but may even improve the image quality on
the consumer's end.
</p>
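To make the idea of inverting an upscale a bit more concrete, here is a toy sketch in plain Python. It is not what
fmtconv or Debilinear do internally: it assumes a single row of pixels and a simplified 2x upscale that keeps every
original sample and fills each gap with the average of its two neighbours. The function names are made up for this
example.

```python
def upscale_2x(low):
    """Toy 2x linear upscale: keep every sample, average the neighbours in between."""
    high = []
    for left, right in zip(low, low[1:]):
        high.append(left)
        high.append((left + right) / 2)
    high.append(low[-1])
    return high


def inverse_upscale_2x(high):
    """Inverse of upscale_2x: drop the interpolated samples again."""
    return high[::2]


native = [10.0, 20.0, 40.0, 30.0]        # pretend this is the native-resolution row
upscaled = upscale_2x(native)            # what would end up on the Blu-ray
restored = inverse_upscale_2x(upscaled)  # what the inverse scaler gives back

print(upscaled)
print(restored == native)
```

The round trip is only exact because both the upscaling method and the native size are known, which is the same
condition that applies to real sources.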
</div>
<p class="subhead"><a href="#c_native" id="c_native">Native resolutions</a></p>
<div class="content">
Finding the native resolution of your source is the most important step. If you use the wrong resolution or
try to debilinearize native 1080p material you will destroy details and introduce ringing artifacts. To
understand this let's take a look at this frame from Non Non Biyori Repeat. The show's native resolution is
846p.
<img src="/media/articles/res/nnb_src.jpg" onclick="fullimage('/media/articles/res/nnb_src.png')">
Source
<img src="/media/articles/res/nnb_ringing.jpg" onclick="fullimage('/media/articles/res/nnb_ringing.png')">
Source.Debilinear(1280, 720)<br>
Upon taking a closer look you will see that the edges of our 720p image look very jagged or aliased. This is
caused by improper debilinearizing. The effect will get stronger with sharper and more detailed edges. If
you encounter this, <b>never</b> try to fix it by anti-aliasing. Either find the correct resolution or don't
use inverse scaling at all.
<img src="/media/articles/res/nnb_ringing_zoom.jpg" onclick="fullimage('/media/articles/res/nnb_ringing_zoom.png')">
Source.Debilinear(1280, 720).PointResize(3840, 2160) and some cropping. Point resize (also called Nearest
Neighbor) is used to magnify without smoothing.
<img src="/media/articles/res/nnb_source_zoom.jpg" onclick="fullimage('/media/articles/res/nnb_source_zoom.png')">
Source.PointResize(3840, 2160) and cropping. As you can see this version does not have the ringing
artifacts.<br>
<p> Unfortunately there are only a few ways of determining the native resolution. <br>The main source is
<a href="http://anibin.blogspot.de/">anibin</a>, a Japanese blog that analyzes anime to find its native
resolution. In order to find an anime, you have to get the original title from <a
href="http://myanimelist.net/">MyAnimeList</a>, <a href="http://www.anisearch.com/">AniSearch</a>,
<a
href="http://anidb.net/">AniDB</a>, or any other source that has Kanji/Kana titles.<br>
Non Non Biyori Repeat's Japanese title is "のんのんびより りぴーと", and if you copy-paste it into the search bar on
anibin, you should get <a href="http://anibin.blogspot.de/2015/07/1_65.html">this result.</a>
Even if you don't understand Japanese, the numbers should speak for themselves. In this case the resolution
is 1504x846. This is above 720p but below 1080p, so you have multiple options. Here I would
recommend encoding in 1080p or using a regular resizer (like Spline36) if you need a 720p version.
In some cases even scaling back to anibin's resolution does not get rid of the ringing, either because the
studio didn't use a bilinear resizer or because the analysis was incorrect due to artifacts caused by TV
compression, so I wouldn't bother messing with the native resolution. It's not like you were gonna release
in 846p, right?
<br>
<b>Edit</b>: Apparently there are people out there who genuinely believe releasing an 873p video is a valid
option.
This is not wrong from an objective standpoint, but you should never forget that the majority of leechers
do not understand encoding and are likely to ignore your release, because "Only an idiot would release in
8xxp".</p>
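The 1504 in 1504x846 is simply the 16:9 width that belongs to a height of 846, which is handy when you only know
the height of a native resolution. A small sketch in plain Python; the helper name is made up, and the 16:9
assumption will not hold for every source:

```python
from fractions import Fraction


def width_for_height(height, aspect=Fraction(16, 9)):
    """Return the frame width for a given height, rounded to whole pixels."""
    return round(height * aspect)


for height in (720, 846, 1080):
    print(height, width_for_height(height))
```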
<p>If you want an easier way to detect ringing and scaling artifacts, read the section about artifacts and
masks.
</p> Btw, in case you do need (or want) to inverse scale our example, you would have to use a Debicubic resizer,
which leads me to our next topic.
<br></div>
<p class="subhead"><a href="#c_kernel" id="c_kernel">Kernels</a></p>
<div class="content">
<p>
Sometimes you will encounter ringing and artifacts even if you are certain that you know the native
resolution. This usually means that the studio used another resizer. Our example will be Byousoku 5
Centimeter or 5 Centimeters per Second (<a href="http://anibin.blogspot.de/2008/09/5-blu-ray.html">Anibin's
Blu-Ray analysis</a>)</p> This will be our test frame: <img
src="/media/articles/res/byousoku_src.jpg" onclick="fullimage('/media/articles/res/byousoku_src.png')">
We will be using the masking functions explained in the next section. For now just accept them as a good
way to find artifacts.
If we try to debilinearize our example, the mask will look like this:
<img src="/media/articles/res/byousoku_linear.png" onclick="fullimage('/media/articles/res/byousoku_linear.png')">
Despite using the correct resolution we can see strong artifacts around all edges. This can have multiple,
not mutually exclusive reasons:
<ol>
<li>The studio used sharpening filters after upscaling</li>
<li>The studio used different resolutions for different layers or parts of the image</li>
<li>The image was upscaled with a different kernel (not bilinear)</li>
<li>Our resolution is wrong after all</li>
</ol>
The first two reasons are not fixable, as I will illustrate using a flashback scene from Seirei Tsukai no
Blade Dance. The scene features very strong and dynamic grain which was added after the upscale, resulting
in 720p backgrounds and 1080p grain.
<img src="/media/articles/res/bladedance_grain_src.jpg" onclick="fullimage('/media/articles/res/bladedance_grain_src.png')">
And now the mask:
<img src="/media/articles/res/bladedance_grain_mask.png" onclick="fullimage('/media/articles/res/bladedance_grain_mask.png')">
In this case you would have to trim the scene and use a regular resizer. Sometimes all backgrounds were drawn in
a higher or lower resolution than the characters and foreground objects. In this case inverse scaling becomes
very difficult, since you would need to know the resolution of every plane and have a way to mask
and merge them. I'd advise using a regular resizer for these sources or just releasing in 1080p.<br>
After talking about problems we can't fix, let's go back to our example to fix reason 3. Some (especially
more recent) Blu-Rays were upscaled with a bicubic kernel rather than bilinear. Examples are Death Parade,
Monster Musume, Outbreak Company, and of course our image. Applying the mask with debicubic scaling results
in far fewer artifacts, as seen here: (hover over the image to see the bilinear mask)
<img id="byou" src="/media/articles/res/byousoku_cubic.png"
onmouseover="hover('byou', '/media/articles/res/byousoku_linear.png');"
onmouseout="hover('byou', '/media/articles/res/byousoku_cubic.png');">
The remaining artifacts are likely caused by compression artifacts on the Blu-Ray as well as potential
postprocessing in the studio. This brings us back to reason 1, although in this case the artifacts are weak
enough to let the mask handle them and use debicubic for the rest.<br>
Usage (without mask): <br>
To realize this in Avisynth import <a href="http://avisynth.nl/index.php/Debicubic">Debicubic</a> and use it
like this:
<p class="avisynth">src.Debicubic(1280, 720, b=0, c=1)</p>
For Vapoursynth use <a href="https://github.com/EleonoreMizo/fmtconv/releases">fmtconv</a>:
<p class="vapoursynth">out = core.fmtc.resample(src, 1280, 720, kernel = 'bicubic', invks = True, a1 = 0, a2 =
1)</p>
To use a mask for overlays and potential artifacts as well as 4:4:4 output use the Vapoursynth function
linked at the bottom. Example for bicubic upscales:
<p class="vapoursynth">out = deb.debilinearM(src, 1280, 720, kernel = 'bicubic')</p>
If the b and c parameters are not 0 and 1 (which should rarely be the case), you can set them as a1
and a2 like in fmtc.resample(). Bicubic's own default is 1/3 for both values, so if bilinear and bicubic 0:1
don't work you could give that a try.<br>
<b>Edit:</b> I did some more testing and consulted another encoder regarding this issue.
Since we're using an inverse kernel in Vapoursynth, the results may differ slightly from Avisynth's Debicubic.
In those cases, adjusting the values of a1 and a2, as well as the number of taps used for scaling, can be
beneficial and yield a slightly sharper result.
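For reference, b and c are the two parameters of the usual Mitchell-Netravali family of bicubic kernels. The sketch
below is plain Python, not code taken from fmtconv or Debicubic, and the function names are made up. It only shows
how the two values change the weights a bicubic resizer gives to the four nearest samples.

```python
def bicubic_weight(x, b=0.0, c=1.0):
    """Mitchell-Netravali kernel weight for a sample at distance x."""
    x = abs(x)
    if x >= 2:
        return 0.0
    if x >= 1:
        return ((-b - 6 * c) * x**3 + (6 * b + 30 * c) * x**2
                + (-12 * b - 48 * c) * x + (8 * b + 24 * c)) / 6
    return ((12 - 9 * b - 6 * c) * x**3
            + (-18 + 12 * b + 6 * c) * x**2 + (6 - 2 * b)) / 6


def weights_at(offset, b, c):
    """Weights of the four nearest samples for a pixel lying between two samples."""
    return [bicubic_weight(offset - k, b, c) for k in (-1, 0, 1, 2)]


sharp = weights_at(0.5, 0.0, 1.0)     # b=0, c=1 as in the calls above
soft = weights_at(0.5, 1 / 3, 1 / 3)  # b=c=1/3

print(sharp)
print(soft)
```

Both sets of weights sum to 1, but b=0, c=1 puts clearly more negative weight on the outer samples than b=c=1/3,
which is why picking the wrong pair leaves visible artifacts after inverse scaling.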
</div>
<p class="subhead"><a href="#c_mask" id="c_mask">Masks: Dealing with artifacts and 1080p overlays</a></p>
<div class="content">
Sometimes a studio will add native 1080p material (most commonly credits or text) on top of the image.
Inverse scaling may work with the background, but it will produce artifacts around the text as seen in the
example from Mushishi Zoku Shou below:
<img src="/media/articles/res/mushishi_ringing.png" onclick="fullimage('/media/articles/res/mushishi_ringing.png')">
In order to avoid this you will have to mask these parts with conventionally downscaled pixels.
The theory behind inverse scaling is that it can be reversed by using regular scaling, so (in theory) a
source frame from a bilinear upscale would be identical to the output of this script:
<p class="avisynth">source.Debilinear(1280,720).BilinearResize(1920,1080)</p>
This property is used by scripts to mask native 1080p content by finding the differences between the source
and the above script's output. A mask would look like this:<br>
<img src="/media/articles/res/mushishi_mask.png" onclick="fullimage('/media/articles/res/mushishi_mask.png')">
If there are any differences, the areas with artifacts will be covered by a regular downscale like
<p class="avisynth">source.Spline36Resize(1280,720)</p>
In Avisynth you can import <a href="DebilinearM.avsi">DebilinearM</a> which can also be found <a
href="http://avisynth.nl/index.php/Debilinear">in the wiki.</a>
For Vapoursynth <a
href="https://raw.githubusercontent.com/MonoS/VS-MaskDetail/master/MaskDetail.py">MaskDetail</a>
can be used to create the Mask and MaskedMerge to mask the artifacts. A full importable script is available
at the end.
<p class="vapoursynth">
#MaskDetail has to be imported or copied into the script<br>
#src is the source clip<br>
deb = core.fmtc.resample(src, 1280, 720, kernel = 'bilinear', invks = True)<br>
noalias = core.fmtc.resample(src, 1280, 720, kernel="blackmanminlobe", taps=5)<br>
mask = maskDetail(src, 1280, 720, kernel = 'bilinear')<br>
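#use the regular downscale where the mask found artifacts and the inverse scaled clip everywhere else<br>
masked = core.std.MaskedMerge(noalias, deb, core.std.Invert(mask, 0))<br>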
</p>
Using this function to filter our scene returns this image: (hover to see the unmasked version)
<img src="/media/articles/res/mushishi_masked.png" id="mushishi"
onmouseover="hover('mushishi', '/media/articles/res/mushishi_ringing.png');"
onmouseout="hover('mushishi', '/media/articles/res/mushishi_masked.png');">
The credits stand out less and don't look oversharpened. The effect can be much stronger depending on the
nature and style of the credits.
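The principle behind the mask can be shown with a toy example in plain Python. This is not how MaskDetail is
implemented: it assumes a single row of pixels, a simplified 2x upscale that averages neighbouring samples, and a
made-up threshold, and it ignores compression noise entirely. All names are hypothetical.

```python
def upscale_2x(low):
    """Toy 2x linear upscale: keep every sample, average the neighbours in between."""
    high = []
    for left, right in zip(low, low[1:]):
        high += [left, (left + right) / 2]
    high.append(low[-1])
    return high


def detail_mask(source, threshold=1.0):
    """Flag pixels that do not survive an inverse scale followed by an upscale."""
    descaled = source[::2]           # inverse of upscale_2x
    rescaled = upscale_2x(descaled)  # what a pure upscale would look like
    return [abs(s - r) > threshold for s, r in zip(source, rescaled)]


background = upscale_2x([10.0, 20.0, 40.0, 30.0])
with_credits = list(background)
with_credits[3] = 255.0  # made-up bright pixel drawn at full resolution

print(detail_mask(background))    # nothing is flagged
print(detail_mask(with_credits))  # only the overlay pixel is flagged
```

Everything the round trip cannot reproduce ends up in the mask, and those are the pixels that get replaced by the
regular downscale.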
</div>
<p class="subhead"><a href="#c_ss" id="c_ss">Subsampling</a></p>
<div class="content">
You may have encountered fansubs released in 720p with 4:4:4 subsampling. In case you don't know the
term, subsampled images store luma (brightness) and chroma (color) at different resolutions. A Blu-Ray will
always have 4:2:0 subsampling, meaning the chroma channels have half the resolution of the luma channel on each
axis. When downscaling you usually retain the subsampling of the source, resulting in 720p luma and 360p chroma.
Alternatively you can split the source video into luma and chroma and then debilinearize the luma
(1080p->720p) while upscaling the chroma planes (540p->720p). Using the same resolution for luma and chroma will
prevent color bleeding, retain more of the chroma present in the source, and prevent desaturation. <br>
A script for Avisynth and the discussion can be found on <a
href="http://forum.doom9.org/showthread.php?t=170832">doom9.</a>
For Vapoursynth I prefer to use the script explained in the next section which allows me to mask credits and
convert to 4:4:4 simultaneously.
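If the plane sizes are hard to keep track of, this small plain Python sketch spells out the arithmetic. The helper
name is made up and it only covers the three common subsampling schemes.

```python
def chroma_size(width, height, subsampling):
    """Return the chroma plane size for a given luma size and subsampling."""
    # (horizontal divisor, vertical divisor) for each scheme
    divisors = {"4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}
    div_w, div_h = divisors[subsampling]
    return width // div_w, height // div_h


print(chroma_size(1920, 1080, "4:2:0"))  # Blu-ray source
print(chroma_size(1280, 720, "4:2:0"))   # regular 720p downscale
print(chroma_size(1280, 720, "4:4:4"))   # 720p with full resolution chroma
```

A 4:4:4 720p release keeps 1280x720 chroma, which is more than the 960x540 on the disc, so no chroma has to be
thrown away.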
<p class="subhead"><a href="#c_import" id="c_import">Importable Vapoursynth script</a></p>
While there may be scripts for literally anything in Avisynth, Vapoursynth is still fairly new and growing.
To make this easier for other Vapoursynth users I have written this simple import script which allows you to
debilinearize with masks and 4:4:4 output. A downloadable version is linked below the explanation.
Essentially, all the script does is split the video into its planes (Y, U and V) to scale them separately,
using debilinear for luma downscaling and spline for chroma upscaling. The example given is for 720p
bilinear upscaled material:
<p class="vapoursynth">
y = core.std.ShufflePlanes(src, 0, colorfamily=vs.GRAY)<br>
u = core.std.ShufflePlanes(src, 1, colorfamily=vs.GRAY)<br>
v = core.std.ShufflePlanes(src, 2, colorfamily=vs.GRAY)<br>
y = core.fmtc.resample(y, 1280, 720, kernel = 'bilinear', invks = True)<br>
u = core.fmtc.resample(u, 1280, 720, kernel = "spline36", sx = 0.25)<br>
v = core.fmtc.resample(v, 1280, 720, kernel = "spline36", sx = 0.25)<br>
out = core.std.ShufflePlanes(clips=[y, u, v], planes = [0,0,0], colorfamily=vs.YUV)<br>
noalias = core.fmtc.resample(src, 1280, 720, css = '444', kernel="blackmanminlobe", taps=5)<br>
mask = maskDetail(src, 1280, 720, kernel = 'bilinear')<br>
out = core.std.MaskedMerge(noalias, out, core.std.Invert(mask, 0))<br>
out.set_output()<br>
</p>
To call this script easily, copy <a href="/media/articles/debilinearm.py">this file</a> into
<span class="path">C:\Users\Your_Name\AppData\Local\Programs\Python\Python35\Lib\site-packages</span>
and use it like this:
<p class="vapoursynth">
import vapoursynth as vs<br>
import debilinearm as deb<br>
core = vs.get_core()<br>
src = core.lsmas.LWLibavSource(r'E:\path\to\source.m2ts') #other source filters will work too<br>
out = deb.debilinearM(src, width, height, kernel)<br>
</p>
Here, width and height are your target dimensions and kernel is the kernel that was used for the upscale. The
output will be in 16-bit and 4:4:4 subsampling.<br>The defaults are
(1280, 720, 'bilinear'), meaning in most cases (720p bilinear upscales) you can just call:
<p class="vapoursynth">out = deb.debilinearM(src)</p>
List of parameters and explanation:<br>
<table class="paramtable">
<tr>
<td>parameter</td>
<td>[type, default]</td>
<td class="paramtable_main">explanation</td>
</tr>
<tr>
<td>src</td>
<td>[clip]</td>
<td class="paramtable_main">the source clip</td>
</tr>
<tr>
<td>w</td>
<td>[int, 1280]</td>
<td class="paramtable_main">target width</td>
</tr>
<tr>
<td>h</td>
<td>[int, 720]</td>
<td class="paramtable_main">target height</td>
</tr>
<tr>
<td>kernel</td>
<td>[string, 'bilinear']</td>
<td class="paramtable_main">kernel used for inverse scaling. Has to be in 'quotes'</td>
</tr>
<tr>
<td>taps</td>
<td>[int, 4]</td>
<td class="paramtable_main">number of taps for reverse scaling</td>
</tr>
<tr>
<td>return_mask</td>
<td>[boolean, False]</td>
<td class="paramtable_main">returns artifact mask in grayscale if True</td>
</tr>
<tr>
<td>a1</td>
<td>[int, 0]</td>
<td class="paramtable_main">b parameter of bicubic upscale, ignored if kernel != 'bicubic'</td>
</tr>
<tr>
<td>a2</td>
<td>[int, 1]</td>
<td class="paramtable_main">c parameter of bicubic upscale, ignored if kernel != 'bicubic'</td>
</tr>
</table>
<p>
<b>Edit:</b> The generic functions (core.generic.*) were removed from Vapoursynth in R33, as most of them
are now part of the standard package (core.std.*). I have updated the script below accordingly, meaning it
may not work with R32 or older. This also applies to MonoS' MaskDetail which (as of now) has not been
updated. You can "fix" it by replacing both occurrences of "core.generic" with "core.std".
</p>
<p class="download_centered">
<span class="source">The most recent version of my scripts can always be found on Github:<br></span><a href="https://gist.github.com/kageru/d71e44d9a83376d6b35a85122d427eb5">Download</a><br>
<a href="https://github.com/EleonoreMizo/fmtconv/releases">Download fmtconv (necessary)</a>
</p>
</div>
</div>