bjorn3 opened this issue on Apr 22, 2025 · 3 comments
Labels
A-codegen (Area: Code generation)
A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
C-bug (Category: This is a bug.)
I-slow (Issue: Problems and improvements with respect to performance of generated code.)
Using transmute::<_, u32>(read_unaligned(self)) and transmute::<_, u32>(read_unaligned(other)) inside of the PartialEq impl produces the expected codegen.
eq:
    mov eax, dword ptr [rdi]
    cmp eax, dword ptr [rsi]
    sete al
    ret
partial_eq:
    movzx eax, word ptr [rdi]
    cmp ax, word ptr [rsi]
    jne .LBB1_1
    movzx eax, word ptr [rdi+2]
    cmp ax, word ptr [rsi+2]
    sete al
    ret
.LBB1_1:
    xor eax, eax
    ret
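For readers who want to reproduce the transmute-based comparison described above, here is a minimal sketch. The struct `Pair`, its two `u16` fields, and the `#[repr(C)]` attribute are assumptions made for illustration (inferred from the `word ptr` loads at offsets 0 and 2); they are not taken from the issue's own code.

```rust
use std::mem::transmute;
use std::ptr::read_unaligned;

// Hypothetical type (an assumption, not the issue's original code):
// two u16 fields give a 4-byte struct with no padding.
#[derive(Clone, Copy)]
#[repr(C)]
pub struct Pair {
    pub a: u16,
    pub b: u16,
}

impl PartialEq for Pair {
    fn eq(&self, other: &Self) -> bool {
        // SAFETY: `Pair` is exactly 4 bytes, has no padding bytes, and any
        // bit pattern of its fields is a valid `u32`. `read_unaligned` is
        // used because `Pair` is only 2-byte aligned.
        unsafe {
            let lhs = transmute::<Pair, u32>(read_unaligned(self));
            let rhs = transmute::<Pair, u32>(read_unaligned(other));
            lhs == rhs
        }
    }
}
```

This only stays correct while bitwise equality matches field-wise equality, which holds for plain integer fields but not, for example, for floats or types with padding.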
That suggests that codegen for Eq could be improved by using memcmp in some cases (fields must be Copy, no padding, and probably other conditions that I'm missing).
At the time derives are expanded, field types aren’t known at all. The derive just gets tokens and it might not even see the same tokens as later stages of the compiler. So the derive can’t do the necessary checks (notably including: is bitwise equality even correct for all field types). In theory the built-in derives could expand to some magic intrinsic that’s lowered much later when types are known, but that’s a big hammer. I don’t see any fundamental reason why LLVM shouldn’t be able to do this optimization (at least for Rust) and that would help non-derived code as well. But I haven’t checked Alive2.
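To make the point about token-level expansion concrete, here is a rough sketch of the shape a derived `PartialEq` takes when written out by hand. The struct `Fields` is hypothetical, and the real expansion may differ in detail. The impl only names the fields and never inspects their types, so nothing at this stage can establish that a bitwise comparison would be valid.

```rust
// Hypothetical struct used only for illustration.
pub struct Fields {
    pub a: u16,
    pub b: u16,
}

// Roughly what `#[derive(PartialEq)]` produces: a field-by-field,
// short-circuiting comparison built purely from the field names.
impl PartialEq for Fields {
    fn eq(&self, other: &Self) -> bool {
        self.a == other.a && self.b == other.b
    }
}
```

Merging the two 16-bit comparisons into one 32-bit comparison is therefore left to later stages such as LLVM, where the field types and layout are known.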
Urgau added the A-LLVM and A-codegen labels and removed the needs-triage label on Apr 22, 2025.
I tried this code:
I expected to see this happen: a and b are loaded into a single register each and then the registers are compared against each other.
Instead, this happened:
For -Copt-level=2:
For -Copt-level=3:
Meta
rustc --version --verbose: Both 1.86 and nightly.