Not optimize boolean computations for the usize type #140103
Comments
I clicked on the godbolt link and I see the good assembly. So I'm confused. |
@saethlin I have checked the godbolt link https://www.godbolt.org/z/zY3Kqj8ax. The Editor To show the problem clearly, I provide another godbolt link to show the example with bad assembly the same as it in this issue: Thank you! |
More fool me. I tried to open the link on mobile and got confused by how godbolt renders on mobile 🤦 Alive2 says this optimization is valid: https://alive2.llvm.org/ce/z/LaPXQ2 Also the only diff in the IR we generate is just whether the comparison is signed or unsigned: https://www.godbolt.org/z/GnTrWErcs |
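To illustrate why the signed/unsigned distinction in the IR matters, here is a minimal sketch. The function names are hypothetical and are not taken from the godbolt links; only the `!(x >= (0 + 2))` condition comes from this issue. The same all-ones bit pattern gives opposite answers depending on whether the comparison is signed or unsigned:

```rust
// Hypothetical helpers for illustration; not the code from the issue's godbolt links.

// `usize` is unsigned, so this compiles to an unsigned comparison.
fn below_two_unsigned(x: usize) -> bool {
    !(x >= (0 + 2))
}

// `i32` is signed, so the same source expression compiles to a signed comparison.
fn below_two_signed(x: i32) -> bool {
    !(x >= (0 + 2))
}

fn main() {
    // -1 and usize::MAX are both "all bits set", but they compare differently against 2.
    println!("{}", below_two_signed(-1)); // true
    println!("{}", below_two_unsigned(usize::MAX)); // false
}
```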
@saethlin Thank you for your review. I’ll learn it and keep following this issue. :) |
As far as I can tell the difference here is that the loop gets runtime unrolled in the usize case. Is the resulting code actually slower (for large inputs)? Just because you see less assembly doesn't mean that the code is less optimized. |
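One rough way to check whether the longer assembly is actually slower on large inputs is a small timing harness like the sketch below. The two counting functions are hypothetical stand-ins for the `usize` and `i32` variants (the real code is behind the godbolt links), and the input size and repetition count are arbitrary assumptions. Build with optimizations enabled; for serious measurements a dedicated benchmarking tool would be more reliable.

```rust
use std::hint::black_box;
use std::time::Instant;

// Hypothetical stand-ins for the two variants discussed in this issue.
fn count_below_two_usize(data: &[usize]) -> usize {
    data.iter().filter(|&&x| !(x >= (0 + 2))).count()
}

fn count_below_two_i32(data: &[i32]) -> usize {
    data.iter().filter(|&&x| !(x >= (0 + 2))).count()
}

// Runs `f` `reps` times and returns the total wall-clock time in nanoseconds.
fn time_ns<F: FnMut() -> usize>(reps: u32, mut f: F) -> u128 {
    let start = Instant::now();
    for _ in 0..reps {
        // black_box keeps the optimizer from discarding the result.
        black_box(f());
    }
    start.elapsed().as_nanos()
}

fn main() {
    // Arbitrary sizes chosen for illustration only.
    let n: usize = 1_000_000;
    let a: Vec<usize> = (0..n).map(|i| i % 4).collect();
    let b: Vec<i32> = (0..n as i32).map(|i| i % 4).collect();

    let t_usize = time_ns(100, || count_below_two_usize(black_box(a.as_slice())));
    let t_i32 = time_ns(100, || count_below_two_i32(black_box(b.as_slice())));
    println!("usize: {t_usize} ns, i32: {t_i32} ns");
}
```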
I'll also add that, based on extensive experience in the LLVM project, missed optimization issues that are based around some kind of automated identification approach tend to provide significant negative value to the project. They result in contributors wasting a substantial amount of time implementing, reviewing and maintaining optimizations that do not provide any real-world value. Making such reports actually useful requires significantly more effort on behalf of the reporter -- you need to understand what the root cause of the optimization difference is and then consider whether fixing that root cause can plausibly benefit real-world code. The usually more reliable approach is to instead directly work on a corpus of real-world code, and identify missed optimization opportunities in it. I'd appreciate it if missed optimization reports that are not directly derived from unmodified real-world code were at least tagged as such, to help prioritize them appropriately. |
@nikic Thank you very much for your comment, which has given me a deeper understanding of project maintenance regarding missed optimization opportunities. Indeed, in complex systems, the handling of missed optimization issues tends to be more conservative. Through my research on detecting optimization oversights in various complex systems, I’ve observed that developers often do not prioritize minor optimization opportunities, as the associated costs outweigh the benefits.
In academia, some scholars adopt a relatively straightforward criterion: test cases with more than 100 lines of assembly code differences in differential testing are reported. However, this standard is not entirely accurate, as the size of the program itself also significantly influences the number of assembly lines. For smaller programs, even smaller assembly differences should warrant attention. In the case of this issue, the difference between the good and bad examples is more than twofold in terms of assembly lines, which is why it caught my attention.
More importantly, as you mentioned, the root cause of the missed optimization and its actual performance impact are what truly determine the value of the issue. In this case, the root cause may lie in the missed optimization of boolean operations involving the
Bad example:
Good example:
The CPU cycle counts show a significant difference, indicating that this optimization does have meaningful impact. Finally, does this missed optimization opportunity matter in the real world? I believe it does. Although the odd-number check in this issue may not be the best implementation, Thank you again for your response—it has been very enlightening. I hope my comment can be helpful to |
The following example shows that `rustc` does not optimize boolean computations for the `usize` type.
I tried this code: (opt-level=3)
https://www.godbolt.org/z/zY3Kqj8ax
I expected to see this happen:
Instead, this happened:
However, when I change the type from `usize` to `i32`, the compiler generates the desired assembly code. Additionally, I can also get the optimized result by rewriting `if !(x >= (0 + 2))` as `if x < 2`.
Therefore, I think that something may be going wrong when optimizing boolean computations for the `usize` type. If addressed in the future, this improvement could enhance `rustc`'s code generation efficiency for such cases.
Could you please review the situation? Thank you!
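For reference, the two source forms mentioned above are logically equivalent, so the rewrite is a safe workaround at the source level. The sketch below uses hypothetical function names and only the two conditions quoted in this report; it checks the equivalence on a range of values plus the largest `usize`:

```rust
// Hypothetical names; only the two conditions are taken from this report.

// The form reported to produce the longer assembly for `usize`.
fn original_form(x: usize) -> bool {
    !(x >= (0 + 2))
}

// The manually simplified form reported to produce the expected assembly.
fn rewritten_form(x: usize) -> bool {
    x < 2
}

fn main() {
    // Spot-check that both forms agree, including at the boundary around 2.
    for x in (0..1000usize).chain([usize::MAX]) {
        assert_eq!(original_form(x), rewritten_form(x));
    }
    println!("both forms agree on all tested values");
}
```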
Meta