release iterators
This commit is contained in:
parent
ba11ce21df
commit
000a8fcdae
@ -21,7 +21,7 @@ I was personally interested in this because, being a Java/Kotlin developer,
|
|||||||
Still, I wanted to know how they compare to imperative code.
|
Still, I wanted to know how they compare to imperative code.
|
||||||
There are some resources on this for Java 8’s Stream API,
|
There are some resources on this for Java 8’s Stream API,
|
||||||
but Kotlin’s Sequences seem to just be accepted as
|
but Kotlin’s Sequences seem to just be accepted as
|
||||||
more convenient Streams[^convenience].
|
more convenient Streams, without much discussion about their performance.[^convenience]
|
||||||
|
|
||||||
[^convenience]: If you’ve ever used them, you’ll know what I mean.
|
[^convenience]: If you’ve ever used them, you’ll know what I mean.
|
||||||
Java’s Streams are built in a way that allows for easy parallelism,
|
Java’s Streams are built in a way that allows for easy parallelism,
|
||||||
@ -34,7 +34,7 @@ It lets you write code as a sequence of instructions to be applied to all elemen
|
|||||||
Let’s use a simple example to demonstrate this.
|
Let’s use a simple example to demonstrate this.
|
||||||
We want to take all numbers from 1 to 100,000,
|
We want to take all numbers from 1 to 100,000,
|
||||||
multiply each of them by 2,
|
multiply each of them by 2,
|
||||||
and then add all of them.[^sum]
|
and then sum all of them.[^sum]
|
||||||
|
|
||||||
[^sum]: You could also just compute the sum and take that \* 2, but we specifically want that intermediate step for the example.
|
[^sum]: You could also just compute the sum and take that \* 2, but we specifically want that intermediate step for the example.
|
||||||
|
|
||||||
@ -56,7 +56,7 @@ return (1..100_000).asSequence()
|
|||||||
.sum()
|
.sum()
|
||||||
```
|
```
|
||||||
|
|
||||||
An iterator not a list, and it doesn’t support indexing,[^index]
|
An iterator is not a list, and it doesn’t support indexing,[^index]
|
||||||
because it doesn’t actually contain any data.
|
because it doesn’t actually contain any data.
|
||||||
It just knows how to get or compute it for you,
|
It just knows how to get or compute it for you,
|
||||||
but you don’t know how it does that.
|
but you don’t know how it does that.
|
||||||
@ -65,7 +65,7 @@ An iterator not a list, and it doesn’t support indexing,[^index]
|
|||||||
meaning it will produce the numbers from 1 to 100,00 before it ends).
|
meaning it will produce the numbers from 1 to 100,00 before it ends).
|
||||||
You can tell an iterator to produce or emit data if you want to use it
|
You can tell an iterator to produce or emit data if you want to use it
|
||||||
(which is often called ‘consuming’
|
(which is often called ‘consuming’
|
||||||
because if you read data from the pipeline,
|
because if you read something from the pipeline,
|
||||||
it’s usually gone),
|
it’s usually gone),
|
||||||
or you can add a new step to it and hand the new pipeline to someone else,
|
or you can add a new step to it and hand the new pipeline to someone else,
|
||||||
who can then consume it or add even more steps.
|
who can then consume it or add even more steps.
|
||||||
@ -78,12 +78,12 @@ You can tell an iterator to produce or emit data if you want to use it
|
|||||||
An important aspect to note is:
|
An important aspect to note is:
|
||||||
adding an operation to the pipeline does nothing
|
adding an operation to the pipeline does nothing
|
||||||
until someone actually starts reading from it,
|
until someone actually starts reading from it,
|
||||||
and even then, only the elements that are consumed are computed.[^inf]
|
and even then, only the elements that are consumed are computed.
|
||||||
This makes it possible to operate on huge data sets while keeping memory usage low,
|
This makes it possible to operate on huge data sets[^inf] while keeping memory usage low,
|
||||||
because only the currently active element has to be held in memory.
|
because only the currently active element has to be held in memory.
|
||||||
|
|
||||||
[^inf]: This is what makes infinite iterators possible.
|
[^inf]: Huge or even infinite.
|
||||||
They can be very useful and are used a lot in functional languages,
|
Infinite iterarors can be very useful and are used a lot in functional languages,
|
||||||
but they’re not today’s topic.
|
but they’re not today’s topic.
|
||||||
|
|
||||||
## Cold, hard numbers
|
## Cold, hard numbers
|
||||||
@ -179,7 +179,8 @@ Kotlin.streamWrappedInSequence avgt 25 3829.209 ± 33.569 ms/op
|
|||||||
Kotlin.withGenerator avgt 25 8374.149 ± 880.647 ms/op
|
Kotlin.withGenerator avgt 25 8374.149 ± 880.647 ms/op
|
||||||
```
|
```
|
||||||
|
|
||||||
(full JMH output)[https://ruru.moe/pSK13p8].
|
([full JMH output](https://ruru.moe/pSK13p8))
|
||||||
|
|
||||||
Unsurprisingly, using Streams from Java and Kotlin is almost identical in terms of performance.
|
Unsurprisingly, using Streams from Java and Kotlin is almost identical in terms of performance.
|
||||||
The same is true for imperative loops,
|
The same is true for imperative loops,
|
||||||
meaning Kotlin ranges introduce no overhead compared to incrementing for loops.
|
meaning Kotlin ranges introduce no overhead compared to incrementing for loops.
|
||||||
@ -187,8 +188,8 @@ The same is true for imperative loops,
|
|||||||
More surprisingly, using Sequences is an order of magnitude slower.
|
More surprisingly, using Sequences is an order of magnitude slower.
|
||||||
That was not at all according to my expectations, so I investigated.
|
That was not at all according to my expectations, so I investigated.
|
||||||
|
|
||||||
As it turns out, Java’s `LongStream` exists because Stream<Long> is *much* slower.
|
As it turns out, Java’s `LongStream` exists because `Stream<Long>` is *much* slower.
|
||||||
The JVM has to use `Long` rather than `long` when the type is used for generics,
|
The JVM has to use `Long` (uppercase) rather than `long` when the type is used for generics,
|
||||||
which involves an additional boxing step and the allocation for the `Long` object.[^primitives]
|
which involves an additional boxing step and the allocation for the `Long` object.[^primitives]
|
||||||
Still, we now know that Streams have about 25% overhead compared to the simple loop for this example,
|
Still, we now know that Streams have about 25% overhead compared to the simple loop for this example,
|
||||||
that generating sequences is a comparatively slow process,
|
that generating sequences is a comparatively slow process,
|
||||||
@ -202,15 +203,19 @@ Still, we now know that Streams have about 25% overhead compared to the simple l
|
|||||||
so a list of longs will always convert the `long` to `Long` before adding it.
|
so a list of longs will always convert the `long` to `Long` before adding it.
|
||||||
|
|
||||||
That last point seemed odd, so I attached a profiler to see where the CPU time is lost.
|
That last point seemed odd, so I attached a profiler to see where the CPU time is lost.
|
||||||
|
|
||||||
![Flamegraph of `streamWrappedInSequence()`](https://i.kageru.moe/knT2Eg.png)
|
![Flamegraph of `streamWrappedInSequence()`](https://i.kageru.moe/knT2Eg.png)
|
||||||
|
|
||||||
We can see that the `LongStream` can produce a `PrimitiveIterator.OfLong` that is used as a source for the Sequence.
|
We can see that the `LongStream` can produce a `PrimitiveIterator.OfLong` that is used as a source for the Sequence.
|
||||||
The operation of boxing a primitive `long` into an object `Long`
|
The operation of boxing a primitive `long` into an object `Long`
|
||||||
(that’s the `Long.valueOf()` step) takes almost as long as advancing the underlying iterator itself.
|
(that’s the `Long.valueOf()` step) takes almost as long as advancing the underlying iterator itself.
|
||||||
7.7% of the CPU time is spent in `Sequence.hasNext()`.
|
7.7% of the CPU time is spent in `Sequence.hasNext()`.
|
||||||
The exact breakdown of that looks as follows:
|
The exact breakdown of that looks as follows:
|
||||||
|
|
||||||
![Checking if a Sequence has more elements](https://i.kageru.moe/k4NHhR.png)
|
![Checking if a Sequence has more elements](https://i.kageru.moe/k4NHhR.png)
|
||||||
|
|
||||||
The Sequence introduces very little overhead here, as it just delegates to `hasNext()` of the underlying iterator.
|
The Sequence introduces very little overhead here, as it just delegates to `hasNext()` of the underlying iterator.
|
||||||
Worth noting is that the iterator calls `accept` as part of `hasNext()`,
|
Worth noting is that the iterator calls `accept()` as part of `hasNext()`,
|
||||||
which will already advance the underlying iterator.
|
which will already advance the underlying iterator.
|
||||||
The value returned by that will be stored temporarily until `nextLong()` is called.
|
The value returned by that will be stored temporarily until `nextLong()` is called.
|
||||||
|
|
||||||
@ -250,7 +255,13 @@ The next snippet uses a simple wrapper class that guarantees that we have no pri
|
|||||||
I’ll use this opportunity to also compare parallel and sequential streams.
|
I’ll use this opportunity to also compare parallel and sequential streams.
|
||||||
|
|
||||||
The steps are simple:
|
The steps are simple:
|
||||||
Take a long -> create a LongWrapper from it, double the contained value (which creates a new LongWrapper), extract the value, compute the sum.
|
|
||||||
|
1. take a long
|
||||||
|
1. create a LongWrapper from it
|
||||||
|
1. double the contained value (which creates a new LongWrapper)
|
||||||
|
1. extract the value
|
||||||
|
1. calculate the sum
|
||||||
|
|
||||||
That may sound overcomplicated,
|
That may sound overcomplicated,
|
||||||
but it’s sadly close to the reality of enterprise code.
|
but it’s sadly close to the reality of enterprise code.
|
||||||
Wrapper types are everywhere.
|
Wrapper types are everywhere.
|
||||||
@ -303,10 +314,10 @@ NonPrimitive.stream avgt 25 44673.318 ± 1325.832 ms/op
|
|||||||
NonPrimitive.parallelStream avgt 25 33856.919 ± 249.911 ms/op
|
NonPrimitive.parallelStream avgt 25 33856.919 ± 249.911 ms/op
|
||||||
```
|
```
|
||||||
|
|
||||||
Full results are in the (JMH log from earlier)[https://ruru.moe/pSK13p8].
|
Full results are in the [JMH log from earlier](https://ruru.moe/pSK13p8).
|
||||||
|
|
||||||
The overhead of Java streams is much higher than that of Kotlin Sequences,
|
The overhead of Java streams is much higher than that of Kotlin Sequences,
|
||||||
and even a parallel Stream is slower than using a Sequence.
|
and even a parallel Stream is slower than using a Sequence,
|
||||||
even though Sequences only use a single thread,
|
even though Sequences only use a single thread,
|
||||||
but both are miles behind the simple for loop.
|
but both are miles behind the simple for loop.
|
||||||
My first assumption was that the compiler optimized away the wrapper type and just added the longs,
|
My first assumption was that the compiler optimized away the wrapper type and just added the longs,
|
||||||
@ -326,12 +337,13 @@ This tells us that not only do Streams/Sequences have a very measurable overhead
|
|||||||
|
|
||||||
## Conclusion
|
## Conclusion
|
||||||
|
|
||||||
Overall, I think that Kotlin’s Sequences are a good addition to the language.
|
Overall, I think that Kotlin’s Sequences are a good addition to the language, despite their flaws.
|
||||||
They fall behind Streams when working with primitives
|
They are significantly slower than Streams when working with primitives
|
||||||
because the Java standard library has subtypes for many generic constructs to more efficiently handle primitive types,
|
because the Java standard library has subtypes for many generic constructs to more efficiently handle primitive types,
|
||||||
but in most real-world JVM applications (that being enterprise-level bloatware),
|
but in most real-world JVM applications (that being enterprise-level bloatware),
|
||||||
primitives are the exception rather than the rule.
|
primitives are the exception rather than the rule.
|
||||||
Still, Kotlin has some types that optimize for this, such as `LongIterator`,
|
Still, Kotlin already has some types that optimize for this,
|
||||||
|
such as `LongIterator`,
|
||||||
but without a `LongSequence` to go with it,
|
but without a `LongSequence` to go with it,
|
||||||
the boxing will still happen eventually,
|
the boxing will still happen eventually,
|
||||||
and all the performance gains are void.
|
and all the performance gains are void.
|
||||||
@ -344,27 +356,31 @@ Apart from the performance, Sequences are also a lot easier to understand and ev
|
|||||||
Implementing your own Sequence requires barely more than an implementation of the underlying iterator,
|
Implementing your own Sequence requires barely more than an implementation of the underlying iterator,
|
||||||
as can be seen in [CoalescingSequence](https://git.kageru.moe/kageru/Sekwences/src/branch/master/src/main/kotlin/moe/kageru/sekwences/CoalescingSequence.kt)
|
as can be seen in [CoalescingSequence](https://git.kageru.moe/kageru/Sekwences/src/branch/master/src/main/kotlin/moe/kageru/sekwences/CoalescingSequence.kt)
|
||||||
which I implemented last year to get a feeling for how all of this works.
|
which I implemented last year to get a feeling for how all of this works.
|
||||||
Streams on the other hand are a lot more complex. They actually extend `Consumer<T>`,
|
Streams on the other hand are a lot more complex. They extend `Consumer<T>`,
|
||||||
so a `Stream<T>` is just a `void consume(T input)` that can be called repeatedly.
|
so a `Stream<T>` is actually just a `void consume(T input)` that can be called repeatedly.
|
||||||
That makes it a lot harder to grasp where data is coming from and how it is requested, at least to me.
|
That makes it a lot harder to grasp where data is coming from and how it is requested, at least to me.
|
||||||
|
|
||||||
Simplicity is often underrated in software, but I consider it a huge plus for Sequences.
|
Simplicity is often underrated in software, but I consider it a huge plus for Sequences.
|
||||||
|
|
||||||
I will continue to use them liberally,
|
I will continue to use them liberally,
|
||||||
unless I find myself in a situation where I need to process a huge number of primitives.
|
unless I find myself in a situation where I need to process a huge number of primitives.
|
||||||
And even then, I now know that Java’s Streams are a good choice.
|
And even then, I now know that Java’s Streams are a good alternative,
|
||||||
|
as long as my code isn’t plain stupid and in dire need of the JIT optimizer.
|
||||||
25% might sound like a lot,
|
25% might sound like a lot,
|
||||||
but it’s more than worth it if it means leaving code that is much easier to understand and modify for the next person.
|
but it’s more than worth it if it means leaving code that is much easier to understand and modify for the next person.
|
||||||
Unless you’re actually in a very performance-critical part of your application,
|
Unless you’re actually in a very performance-critical part of your application,
|
||||||
but if you ever find yourself in that situation,
|
but if you ever find yourself in that situation,
|
||||||
you should switch to a different language.
|
you should switch to a different language.
|
||||||
|
|
||||||
|
Writing simple and correct code should always be more important than writing fast code.
|
||||||
|
\
|
||||||
|
\
|
||||||
On that note: I was originally going to include Rust’s iterators here for comparison,
|
\
|
||||||
|
\
|
||||||
|
On the note of switching languages:
|
||||||
|
I was originally going to include Rust’s iterators here for comparison,
|
||||||
but rustc optimized away all of my benchmarks with [constant time solutions](https://godbolt.org/z/iJaWVP).
|
but rustc optimized away all of my benchmarks with [constant time solutions](https://godbolt.org/z/iJaWVP).
|
||||||
It’s a fascinating topic,
|
That was a fascinating discovery for me,
|
||||||
and I might write a separate blog post
|
and I might write a separate blog post
|
||||||
where I dissect some of the assembly that rustc/LLVM produced,
|
where I dissect some of the assembly that rustc/LLVM produced,
|
||||||
but I need to properly understand it myself first.
|
but I feel like I’ll need to learn a few more things about compilers first.
|
Loading…
Reference in New Issue
Block a user