Bad codegen for comparing struct of two 16bit ints #140167

bjorn3 · 2025-04-22T15:47:18Z

I tried this code:

#[derive(Clone, Copy, PartialEq)]
pub struct Foo {
    pub x: u16,
    pub y: u16,
}

#[no_mangle]
fn eq(a: &Foo, b: &Foo) -> bool {
    a == b
}

I expected to see this happen: a and b are loaded into a single register each and then the registers are compared against each other.

Instead, this happened:

For -Copt-level=2:

eq:
        movzx   eax, word ptr [rdi]
        movzx   ecx, word ptr [rdi + 2]
        xor     ax, word ptr [rsi]
        movzx   edx, word ptr [rsi + 2]
        xor     edx, ecx
        or      dx, ax
        sete    al
        ret

For -Copt-level=3:

eq:
        movd    xmm0, dword ptr [rdi]
        movd    xmm1, dword ptr [rsi]
        pcmpeqw xmm1, xmm0
        pshuflw xmm0, xmm1, 80
        pshufd  xmm0, xmm0, 80
        movmskpd        eax, xmm0
        cmp     eax, 3
        sete    al
        ret

Meta

rustc --version --verbose:
Both 1.86 and nightly.

The text was updated successfully, but these errors were encountered:

bjorn3 · 2025-04-22T15:52:37Z

Using transmute::<_, u32>(read_unaligned(self)) and transmute::<_, u32>(read_unaligned(other)) inside of the PartialEq impl produces the expected codegen.

folkertdev · 2025-04-22T16:14:37Z

Trying this in C shows that it is also unable to optimize (the equivalent of) Foo::eq, but it is able to generate good code when memcmp is used:

https://godbolt.org/z/cqhM9hThK

#include <stdint.h>
#include <stdbool.h>
#include <string.h>

typedef struct {
    uint16_t x;
    uint16_t y;
} Foo;

bool eq(const Foo* a, const Foo* b) {
    return memcmp(a, b, sizeof(Foo)) == 0;
}

bool partial_eq(const Foo* a, const Foo* b) {
    return a->x == b->x && a->y == b->y;
}

eq:
        mov     eax, dword ptr [rdi]
        cmp     eax, dword ptr [rsi]
        sete    al
        ret

partial_eq:
        movzx   eax, word ptr [rdi]
        cmp     ax, word ptr [rsi]
        jne     .LBB1_1
        movzx   eax, word ptr [rdi + 2]
        cmp     ax, word ptr [rsi + 2]
        sete    al
        ret
.LBB1_1:
        xor     eax, eax
        ret

That suggests that codegen for Eq could be improved by using memcmp in some cases. (fields must be copy, no padding , probably other conditions that I'm missing).

hanna-kruppe · 2025-04-22T16:29:31Z

At the time derives are expanded, field types aren’t known at all. The derive just gets tokens and it might not even see the same tokens as later stages of the compiler. So the derive can’t do the necessary checks (notably including: is bitwise equality even correct for all field types). In theory the built-in derives could expand to some magic intrinsic that’s lowered much later when types are known, but that’s a big hammer. I don’t see any fundamental reason why LLVM shouldn’t be able to do this optimization (at least for Rust) and that would help non-derived code as well. But I haven’t checked Alive2.

bjorn3 added C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code. labels Apr 22, 2025

rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 22, 2025

bjorn3 changed the title ~~Bad codegen for comparing struct of 16bit ints~~ Bad codegen for comparing struct of two 16bit ints Apr 22, 2025

Urgau added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-codegen Area: Code generation and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Apr 22, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bad codegen for comparing struct of two 16bit ints #140167

Bad codegen for comparing struct of two 16bit ints #140167

bjorn3 commented Apr 22, 2025 •

edited

Loading

bjorn3 commented Apr 22, 2025 •

edited

Loading

folkertdev commented Apr 22, 2025

hanna-kruppe commented Apr 22, 2025 •

edited

Loading

Bad codegen for comparing struct of two 16bit ints #140167

Bad codegen for comparing struct of two 16bit ints #140167

Comments

bjorn3 commented Apr 22, 2025 • edited Loading

Meta

bjorn3 commented Apr 22, 2025 • edited Loading

folkertdev commented Apr 22, 2025

hanna-kruppe commented Apr 22, 2025 • edited Loading

bjorn3 commented Apr 22, 2025 •

edited

Loading

bjorn3 commented Apr 22, 2025 •

edited

Loading

hanna-kruppe commented Apr 22, 2025 •

edited

Loading