
[KT-75801]: minor optimization on object array to list conversion functions #5426


Open; wants to merge 3 commits into base: master


alexismanin

Related issue: KT-75801
Related benchmark project

This contains two different cases of optimization:

  • Array<T>.(take|takeLast|takeWhile): replace the element-by-element copy loop with copyOfRange().asList().
  • Array<T>.toList() and Array<Array<T>>.flatten(): copy the array directly instead of delegating to a mutable list.
    Delegating causes two array copies instead of one, at least with the OpenJDK implementation.
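As a rough illustration of the first case, here is a minimal sketch of a take-like function built on copyOfRange().asList(). The name takeViaCopy is hypothetical and the special cases are assumptions made for this example; it is not the code from the PR.

```kotlin
// Hypothetical sketch, not the stdlib implementation.
// Copies the first n elements with a single array copy, then wraps the copy as a List.
fun <T> Array<T>.takeViaCopy(n: Int): List<T> {
    require(n >= 0) { "Requested element count $n is less than zero." }
    return when {
        n == 0 || isEmpty() -> emptyList()
        n >= size -> copyOf().asList()          // whole array: one copy
        else -> copyOfRange(0, n).asList()      // prefix: one copy of n elements
    }
}
```

Because asList() only wraps the freshly copied array, the returned list does not share storage with the receiver.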

I have not modified any primitive array function, because of boxing. Applying the same logic to primitive arrays might make the initial copy faster, but boxing would then be deferred to each List.get() call, adding a performance penalty on the consumer side.
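The trade-off for primitive arrays can be sketched with the existing IntArray.asList() and IntArray.toList() functions. The helper name viewVersusCopy is hypothetical and only serves this illustration.

```kotlin
// Hypothetical helper: contrasts a List view over an IntArray with a copied List.
fun viewVersusCopy(): Pair<Int, Int> {
    val ints = intArrayOf(1, 2, 3)
    val view: List<Int> = ints.asList()  // wraps the array; each get() produces a boxed Int
    val copy: List<Int> = ints.toList()  // copies the elements into a list up front
    ints[0] = 42                         // the change is visible through the view only
    return view[0] to copy[0]
}
```

The view avoids the upfront copy, but every read through the List interface pays the conversion cost, which is the consumer-side penalty mentioned above.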

I tested using ./gradlew coreLibsTest (success). I'm still trying to run a full ./gradlew build, but I've got configuration issues to investigate:

Task with path ':plugins:plugin-sandbox:plugin-annotations:jar' not found in project ':analysis:analysis-test-framework

This is my first contribution here, so if I've missed something or done something wrong (formatting, docs, testing, etc.), any feedback is appreciated.

@fzhinkin fzhinkin self-assigned this Apr 4, 2025
@@ -5295,13 +5285,16 @@ public inline fun CharArray.takeLastWhile(predicate: (Char) -> Boolean): List<Ch
* @sample samples.collections.Collections.Transformations.take
*/
public inline fun <T> Array<out T>.takeWhile(predicate: (T) -> Boolean): List<T> {


I think this would be easier to read, with equivalent performance.

Suggested change
public inline fun <T> Array<out T>.takeWhile(predicate: (T) -> Boolean): List<T> {
public inline fun <T> Array<out T>.takeWhile(predicate: (T) -> Boolean): List<T> {
    var i = 0
    while (i < size && predicate(this[i])) i++
    return if (i == 0) emptyList() else this.copyOfRange(0, i).asList()
}

Author

@alexismanin alexismanin Apr 8, 2025


I've checked the impact of removing the special-case handling, and it is noticeable. In the benchmark results below, your suggestion is the "takeWhileSuggestion" case. For array sizes 0 and 1, it is slightly slower than the proposed optimization:

[image: benchmark results]

Therefore, I think we should keep the when expression as it is now.

P.S.: If you want to double-check this, I've added the suggestions to my benchmark project on the test-pr-suggestions branch.

Author

@alexismanin alexismanin Apr 9, 2025


@hoangchungk53qx1: On second thought, your simplification makes sense.

I realized it is only missing one condition to perform well when the output list has a single element (a broader case than my current implementation, which only handles an input array with a single element):

return if (i == 0) emptyList()
       else if (i == 1) listOf(this[0])
       else copyOfRange(0, i).asList()

I ran a benchmark with your suggestion modified so that the final return also handles the case where i is 1. It greatly improves performance in two benchmark cases:

  1. The input array has one element.
  2. The output list has only one element (in the benchmark, the input array has size 3 and only the first element is taken).

I will soon apply your suggestion with this tweak, and I think we will be good then.
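Put together, the simplified loop plus the single-element case could look like the sketch below. The name takeWhileSketch is hypothetical and the receiver is simplified to Array<T>; the code actually pushed to the PR may differ.

```kotlin
// Hypothetical sketch combining the suggested loop with the i == 1 special case.
inline fun <T> Array<T>.takeWhileSketch(predicate: (T) -> Boolean): List<T> {
    var i = 0
    // Count how many leading elements satisfy the predicate.
    while (i < size && predicate(this[i])) i++
    return when (i) {
        0 -> emptyList()                    // nothing matched: no allocation of a backing array
        1 -> listOf(this[0])                // single result: no array copy needed
        else -> copyOfRange(0, i).asList()  // general case: one array copy
    }
}
```

The i == 1 branch covers both benchmark cases listed above, because it depends on the size of the output rather than the size of the input.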

Here is a glimpse of updated benchmark results:

[image: updated benchmark results]

Author


I've pushed the change to the takeWhile function.

val result = ArrayList<T>(sumOf { it.size })
for (element in this) {
result.addAll(element)
return when (size) {


This can return early:

if (isEmpty()) return emptyList()

return when (size) {
0 -> emptyList()
1 -> this[0].toList()
else -> {


My suggestion:

public fun <T> Array<out Array<out T>>.flatten(): List<T> {
    if (isEmpty()) return emptyList()

    // Sum as Long so that an oversized total is detected instead of wrapping around.
    val totalSizeLong = sumOf { it.size.toLong() }
    if (totalSizeLong == 0L) return emptyList()
    require(totalSizeLong <= Int.MAX_VALUE.toLong()) {
        "Sum of all array sizes exceeds the maximum array capacity (Int.MAX_VALUE)"
    }

    // Copy each inner array once, directly into its slot of the output array.
    val outputArray = arrayOfNulls<Any?>(totalSizeLong.toInt())
    var offset = 0
    for (innerArray in this) {
        innerArray.copyInto(outputArray, offset)
        offset += innerArray.size
    }

    // Safe: the output array only contains elements taken from Array<out T> inputs.
    @Suppress("UNCHECKED_CAST")
    return outputArray.asList() as List<T>
}
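The reason for summing sizes as Long before narrowing can be shown in isolation: Int addition wraps silently, so an Int sum would hide an oversized total. The helper checkedTotalSize below is a hypothetical name used only for this sketch.

```kotlin
// Hypothetical helper: sums sizes as Long and rejects totals that do not fit in an Int.
fun checkedTotalSize(sizes: List<Int>): Int {
    val total = sizes.sumOf { it.toLong() }
    require(total <= Int.MAX_VALUE.toLong()) {
        "Total size $total exceeds Int.MAX_VALUE"
    }
    return total.toInt()
}
```

By contrast, listOf(Int.MAX_VALUE, 1).sum() wraps around to a negative Int, which would then be passed on as an array size.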

Author


Thanks for the suggestion. I've applied it.

1. The previous version used ArrayList.addAll to copy the inner arrays into the destination list, which caused extra copies.
2. Add special cases to return early on an empty or single-element array.
Avoid copying array elements using a loop.
Replace the inner `toMutableList()` call with `copyOf().asList()`.
This change reduces the number of array copies from two to one.
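A minimal sketch of what a `copyOf().asList()` based conversion might look like follows. The name toListSketch is hypothetical, and the empty and single-element branches are assumptions for this example rather than the exact committed code.

```kotlin
// Hypothetical sketch of an array-to-list conversion with a single array copy.
fun <T> Array<T>.toListSketch(): List<T> = when (size) {
    0 -> emptyList()
    1 -> listOf(this[0])
    else -> copyOf().asList()  // one copy, then a List wrapper over that copy
}
```

The result is a snapshot: later writes to the source array are not visible through the returned list, since the list wraps the copy and not the original.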
@alexismanin
Author

@fzhinkin and @hoangchungk53qx1: Is there anything else you need me to do or check before merging?
