adaptivegrain

kageru 4831298a9b Optimize LUT generation	2020-05-13 23:26:23 +02:00
lib.rs	Optimize LUT generation	2020-05-13 23:26:23 +02:00
mask.rs	Optimize LUT generation	2020-05-13 23:26:23 +02:00

Directly calling f32::mul_add is actually more accurate and faster here
because rustc seems to be unable to rearrange the original instructions
in a way that can utilize `fma`.
When directly comparing the two implementations, mul_add was about 10%
faster on my machine (Ryzen 1700 with native target in rustflags).
Relevant Godbolt: https://godbolt.org/z/DDRZ4-
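
The difference described in the commit message can be sketched as follows. This is only an illustration, not the code from mask.rs: the coefficients in `COEFFS`, the table size, and all function names are hypothetical. It evaluates the same polynomial once with separate multiply and add operations and once with `f32::mul_add`, which rounds only once per step and can compile to a single `fma` instruction when the target supports it.

```rust
/// Hypothetical polynomial coefficients, highest degree first.
/// These are placeholders, not the values used by adaptivegrain.
const COEFFS: [f32; 4] = [0.5, -1.25, 2.0, 0.75];

/// Horner evaluation with a separate multiply and add per step
/// (the intermediate product is rounded before the addition).
fn poly_unfused(x: f32) -> f32 {
    COEFFS.iter().skip(1).fold(COEFFS[0], |acc, &c| acc * x + c)
}

/// Horner evaluation with fused multiply-add: `acc.mul_add(x, c)`
/// computes `acc * x + c` with a single rounding.
fn poly_fused(x: f32) -> f32 {
    COEFFS.iter().skip(1).fold(COEFFS[0], |acc, &c| acc.mul_add(x, c))
}

/// Build a 256-entry lookup table by sampling `f` on [0, 1].
fn generate_lut(f: fn(f32) -> f32) -> Vec<f32> {
    (0..256).map(|i| f(i as f32 / 255.0)).collect()
}

/// Largest absolute element-wise difference between two tables.
fn max_abs_diff(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).abs())
        .fold(0.0f32, f32::max)
}
```

Comparing `generate_lut(poly_unfused)` with `generate_lut(poly_fused)` via `max_abs_diff` shows how far the two roundings diverge for a given polynomial. Whether the fused version is also faster depends on the target: without a hardware `fma` instruction, `mul_add` may fall back to a slower software routine, which is presumably why the measurement above was taken with the native target enabled.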