Carry the b_offset inside BackendRepr::ScalarPair#158666
Conversation
|
cc @bjorn3 Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt Some changes occurred to the CTFE machinery Some changes occurred to the CTFE / Miri interpreter cc @rust-lang/miri
This PR changes rustc_public cc @oli-obk, @celinval, @ouz-a, @makai410 Some changes occurred in compiler/rustc_codegen_llvm/src/builder/autodiff.rs cc @ZuseZ4 |
|
|
| ) => { | ||
| l1.primitive() == r1.primitive() | ||
| && l2.primitive() == r2.primitive() | ||
| && l_offset == r_offset |
There was a problem hiding this comment.
note: since this is an eq-like method, I also check that the offsets are the same here. (They always will be right now, but it makes sense to check it regardless IMHO.)
| abi::BackendRepr::ScalarPair { a: out_a, b: out_b, b_offset: out_offset }, | ||
| ) if in_a.size(cx) == out_a.size(cx) | ||
| && in_b.size(cx) == out_b.size(cx) | ||
| && in_offset == out_offset => |
There was a problem hiding this comment.
note: when transmuting we only want to transmute the immediates individually when the second one is at the same offset in the source and destination.
(It'll probably be UB if they're not, but not necessarily since the out_b might allow uninit, so it's not worth trying to do something smart for those cases. They can fall through to the via-alloca implementation.)
There was a problem hiding this comment.
oh cool, so the code is now more correct also! neat.
There was a problem hiding this comment.
I think it was correct before as well because same scalars implied same offset, but it's clearer that it's correct now, and definitely more resiliently correct to subtle changes.
| } | ||
| ( | ||
| Immediate::ScalarPair(a_val, b_val), | ||
| BackendRepr::ScalarPair { a: _, b: _, b_offset }, |
There was a problem hiding this comment.
Note: by not needing to recompute the offset the two scalars actually ended up unused here.
| a1.size(&self.ecx) == a2.size(&self.ecx) | ||
| && b1.size(&self.ecx) == b2.size(&self.ecx) | ||
| // The alignment of the second component determines its offset, so that also needs to match. | ||
| && b1.default_align(&self.ecx) == b2.default_align(&self.ecx) |
There was a problem hiding this comment.
Note: like the other spot this is thinking about transmutes, so now we can check the offset directly instead of indirectly via the default alignment.
This comment has been minimized.
This comment has been minimized.
| "`ScalarPair` second field at bad offset in {inner:#?}", | ||
| ); | ||
| assert_eq!( | ||
| b_offset, field2_offset, |
There was a problem hiding this comment.
Note: for this PR I kept all the existing checks here, since it's not actually changing any layouts. (That's wanting on the MCP FCP.) Just added this new one that the offset stored in the variant matches the expected field offset.
455c80c to
ede96e9
Compare
|
cc @rust-lang/clippy |
This comment has been minimized.
This comment has been minimized.
|
cc @rust-lang/miri |
There was a problem hiding this comment.
added some incantations to the PR since the code is now a bit short on them. :^)
a couple minor wording nitpicks, because you wrote r? workingjubilee and I'd hate to disappoint. I will give it another scan once you're done fighting CI and r+ then, but it seems good! my eyes have indeed confirmed "this sure is a lot of whitespace changes thanks to rustfmt".
oh, the callconv code does need a further scan, I'll get to that about then but I don't expect it to need any changes.
| Immediate::from(if offset.bytes() == 0 { | ||
| a_val | ||
| } else { | ||
| assert_eq!(offset, a.size(cx).align_to(b.default_align(cx).abi)); |
There was a problem hiding this comment.
by the shades of the Seraphim!
| let BackendRepr::ScalarPair { a: _, b: _, b_offset } = dest.layout.backend_repr | ||
| else { | ||
| bug!("store_with_flags: invalid ScalarPair layout: {:#?}", dest.layout); | ||
| }; | ||
| let b_offset = a_scalar.size(bx).align_to(b_scalar.default_align(bx).abi); |
There was a problem hiding this comment.
by the hoary hosts of Hoggoth!
| assert_eq!(field.size, a.size(bx.cx())); | ||
| (Some(a), a_llval) | ||
| } else { | ||
| assert_eq!(offset, a.size(bx.cx()).align_to(b.default_align(bx.cx()).abi)); |
There was a problem hiding this comment.
by the crimson bands of Cyttorak!
| a1.size(&self.ecx) == a2.size(&self.ecx) | ||
| && b1.size(&self.ecx) == b2.size(&self.ecx) | ||
| // The alignment of the second component determines its offset, so that also needs to match. | ||
| && b1.default_align(&self.ecx) == b2.default_align(&self.ecx) |
| BackendRepr::Scalar(scalar) => PassMode::Direct(scalar_attrs(scalar, Size::ZERO)), | ||
| BackendRepr::ScalarPair(a, b) => PassMode::Pair( | ||
| scalar_attrs(a, Size::ZERO), | ||
| scalar_attrs(b, a.size(cx).align_to(b.default_align(cx).abi)), | ||
| ), | ||
| BackendRepr::ScalarPair { a, b, b_offset } => { | ||
| PassMode::Pair(scalar_attrs(a, Size::ZERO), scalar_attrs(b, b_offset)) | ||
| } |
There was a problem hiding this comment.
chintap ...I need to look more closely at the surrounding code here because this is actually something that makes me wonder how correct it is.
Or was, for that matter.
There was a problem hiding this comment.
Rechecked the maze of code this is entangled with. Yes, this version is correct, and will remain correct.
| // We have padding if the sizes don't add up to the total. | ||
| // (No need to check the offset because it's only possible for the scalar sizes | ||
| // to add up to the total size if the b_offset equals the `left_size`, but that's | ||
| // not a sufficient check because there needs to be no padding after right either.) |
There was a problem hiding this comment.
This could be stated more concisely like so, I think?
| // We have padding if the sizes don't add up to the total. | |
| // (No need to check the offset because it's only possible for the scalar sizes | |
| // to add up to the total size if the b_offset equals the `left_size`, but that's | |
| // not a sufficient check because there needs to be no padding after right either.) | |
| // We have padding if the sizes don't add up to the total | |
| // without respect to b_offset, which indicates padding only if the total is higher. |
| abi::BackendRepr::ScalarPair { a: out_a, b: out_b, b_offset: out_offset }, | ||
| ) if in_a.size(cx) == out_a.size(cx) | ||
| && in_b.size(cx) == out_b.size(cx) | ||
| && in_offset == out_offset => |
There was a problem hiding this comment.
oh cool, so the code is now more correct also! neat.
| /// The two scalars are listed in *memory* order, so the first is at offset zero | ||
| /// and the second at a non-zero offset. | ||
| /// The two scalars are listed in *memory* order, so `a` is at offset zero | ||
| /// and `b` is at non-size offset `b_offset`. |
There was a problem hiding this comment.
what does non-size offset mean? especially since b_offset: Size
There was a problem hiding this comment.
yeah I left the same review comment. :^) #158666 (comment) resolving it so there's only one instance of this conversation.
I think this happened because of psychically tripping up over "non-zero offset" and "btw b_offset doesn't have to just be a's Size-rounded-up-to-some-Align-again".
This comment has been minimized.
This comment has been minimized.
|
@bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Carry the `b_offset` inside `BackendRepr::ScalarPair`
This comment has been minimized.
This comment has been minimized.
|
Finished benchmarking commit (e1ccfbf): comparison URL. Overall result: no relevant changes - no action neededBenchmarking means the PR may be perf-sensitive. Consider adding rollup=never if this change is not fit for rolling up. @rustbot label: -S-waiting-on-perf -perf-regression Instruction countThis perf run didn't have relevant results for this metric. Max RSS (memory usage)Results (primary -6.9%, secondary -4.3%)A less reliable metric. May be of interest, but not used to determine the overall result above.
CyclesResults (primary -2.3%, secondary -3.6%)A less reliable metric. May be of interest, but not used to determine the overall result above.
Binary sizeThis perf run didn't have relevant results for this metric. Bootstrap: 490.049s -> 485.092s (-1.01%) |
|
Well I was not expecting perf green, especially not on max-rss since this definitely makes |
22e7592 to
1b85fc3
Compare
|
It is possible that ceasing to recompute the same values over and over might actually remove juuust enough stack usage, in juuust enough places, that it actually matters (esp. not spilling as often), whereas |
|
max-rss is complete noise in my experience. |
View all comments
Inspired by rust-lang/compiler-team#1007 but doesn't actually change any of the layout rules just yet.
This turned out to be a nice change even if we didn't use the extra flexibility, IMHO, because it allowed so many things like
as oh my was that little magic incantation copy-pasted all over the place.
Apologies for the pretty-giant PR. I tried to make it as direct a change as I could: if it was
(..)before it's{ .. }now, if it was(_, _)before it's{ a: _, b: _, b_offset: _ }now. I kept the names the same so the code lines were unchanged even if normally I might have just renamed things, etc. I'll add some inline notes for places of particular interest.r? @workingjubilee