Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify MaybeUninit docs #136689

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Clarify MaybeUninit docs #136689

wants to merge 1 commit into from

Conversation

hkBst
Copy link
Member

@hkBst hkBst commented Feb 7, 2025

Couldn't help myself while fixing #66845.

@rustbot
Copy link
Collaborator

rustbot commented Feb 7, 2025

r? @thomcc

rustbot has assigned @thomcc.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 7, 2025
@rust-log-analyzer

This comment has been minimized.

@thomcc
Copy link
Member

thomcc commented Feb 19, 2025

I don't feel strongly about these changes but they should probably be reviewed by @RalfJung, who I believe wrote these docs in the first place.

r? @RalfJung

@rustbot rustbot assigned RalfJung and unassigned thomcc Feb 19, 2025
@RalfJung
Copy link
Member

@hkBst could you talk a bit about what issue with the docs you are trying to fix here?

Cc @rust-lang/opsem in case someone has an opinion. I don't have a lot of time rn unfortunately.

@hkBst
Copy link
Member Author

hkBst commented Feb 19, 2025

@RalfJung that seems fair.

Issue #66845 talks about misunderstanding MaybeUninit::uninit_array() to return MaybeUninit<[MaybeUninit<T>; LEN]> because of "Create a new array of MaybeUninit items, in an uninitialized state.". I think because "a new array of MaybeUninit items" is already [MaybeUninit<T>; LEN] and then "in an uninitialized state" could apply to either the elements or the array. This is why the proposed version says "Creates a new array of uninitialized MaybeUninit<T> elements." to make it clear what "uninitialized" refers to. Then I've tried to remove "in an uninitialized state" everywhere to prevent the same confusion.

So then we have "uninitialized MaybeUninit<T>" and "initialized MaybeUninit<T>" and so it seemed slightly awkward to me to talk about the MaybeUninit<T> containing a value and about its content, and better to talk about it being a wrapper for a value and its value. The original first line even is "A wrapper type to construct uninitialized instances of T." (And I realize that here I have perhaps sinned against the rule to have a short one-sentence summary at the top.)

Then I tried to write an intro that tries to make the rules around drop seem less magical by putting more emphasis on the fact that a MaybeUninit<T> is just a type-level construct for a T which may or may not be initialized. This would hopefully also make it more intuitive that there is little value in a MaybeUninit<[MaybeUninit<T>; LEN]>.

I believe the rest of the changes are mostly copy-editing based on the above.

This PR is not time sensitive, so there is no rush.

@RalfJung
Copy link
Member

RalfJung commented Feb 21, 2025

The type of uninit_array is given right there in the docs, so I don't quite follow why that would trigger a half-rewrite of the docs in that module.^^ And I don't understand how you want from "the docs of that function are implicitly talking about the state of N elements in an array, rather than the state of the array" to "we should get rid of the terminology 'uninitialized state'". It is pretty standard in English to say "a list of X, having property Y" in a case where Y is a property of the elements X, nor a property of the list. If this needs disambiguation, adding a single 'each' would fix that: "a list of X, each having property Y".

To me this feels like rewording for the sake of rewording, I don't see a clear improvement that would justify the effort of making sure that every single sentence doesn't potentially imply something we don't want it to imply. These docs are hard to get right, they shouldn't be rewrittien gratuitously.

(And I realize that here I have perhaps sinned against the rule to have a short one-sentence summary at the top.)

Indeed, the very first chunk of the diff is already not an improvement IMO.

I appreciate your attempt to improve the docs, but currently I don't quite see which issue in the existing overarching docs this fixes -- an issue with the uninit_array docs should be fixed locally there, making this a lot easier to review. :D And if it doesn't fix anything, just makes it different, then I'm afraid I don't think it is worth it.

@RalfJung
Copy link
Member

Also, apparently the plan is to remove uninit_array anyway: #134585.

Couldn't help myself while fixing rust-lang#66845.
@hkBst
Copy link
Member Author

hkBst commented Feb 21, 2025

(And I realize that here I have perhaps sinned against the rule to have a short one-sentence summary at the top.)

Indeed, the very first chunk of the diff is already not an improvement IMO.

I've restored the summary line.

@RalfJung
Copy link
Member

RalfJung commented Feb 21, 2025

Thanks, that doesn't resolve any of the other concerns I mentioned though. :)

@hkBst
Copy link
Member Author

hkBst commented Feb 21, 2025

I think that "Creates a new array of uninitialized MaybeUninit<T> elements." is just simpler than "Creates a new array of MaybeUninit<T> elements, each in an uninitialized state.", and thus will be more readily understood by more people in less time. But even if you disagree there, then surely it is at least as simple as the latter, no? So at the least you are using fewer words to say the same thing.

In addition to that, I think the following does much work to dispell some magic surrounding this type and drop:
/// A MaybeUninit<T> is like a T, but without the requirement that it is properly initialized as a T.
/// Dropping a MaybeUninit<T> does nothing, even if properly initialized as a T, because
/// the compiler relies on the type system to decide how to drop variables. Thus, if a MaybeUninit<T>
/// should be dropped like a T, it should be converted to a T with assume_init or similar.

@RalfJung
Copy link
Member

I can agree to a high-level paragraph being added before "Initialization invariant". What about the rest of this PR?

I think that "Creates a new array of uninitialized MaybeUninit elements." is just simpler than "Creates a new array of MaybeUninit elements, each in an uninitialized state.", and thus will be more readily understood by more people in less time. But even if you disagree there, then surely it is at least as simple as the latter, no? So at the least you are using fewer words to say the same thing.

Yeah as I said I'm fine improving the docs of that function. Or, well, I'd suggest first waiting for the outcome of #134585 as it seems likely the function will be removed. It was added because we didn't have const blocks yet so there was no other easy way to crate a [MaybeUninit<T>; N] for a non-Copy T.

@saethlin
Copy link
Member

To me this feels like rewording for the sake of rewording, I don't see a clear improvement that would justify the effort of making sure that every single sentence doesn't potentially imply something we don't want it to imply. These docs are hard to get right, they shouldn't be rewrittien gratuitously.

I've read over most of this PR and I agree. I can't find an improvement that justifies the churn and review effort.

@@ -308,7 +312,7 @@ impl<T> MaybeUninit<T> {
MaybeUninit { value: ManuallyDrop::new(val) }
}

/// Creates a new `MaybeUninit<T>` in an uninitialized state.
/// Creates a new uninitialized `MaybeUninit<T>`.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you agree with this change and others like it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see nothing fundamentally wrong with the old wording. 🤷

I'm not saying every line of your diff is wrong, but it's easier for me to rewrite the docs myself than to sieve out the clear-improvement changes in your PR from the ones that make things less clear or that just use different words to say the same thing. If you have some core changes here you think are most important, please reduce the PR to just those so that the diff becomes obvious.

@@ -367,8 +371,7 @@ impl<T> MaybeUninit<T> {
[const { MaybeUninit::uninit() }; N]
}

/// Creates a new `MaybeUninit<T>` in an uninitialized state, with the memory being
/// filled with `0` bytes. It depends on `T` whether that already makes for
/// Creates a new zero-filled `MaybeUninit<T>`. It depends on `T` whether that already makes for
/// proper initialization. For example, `MaybeUninit<usize>::zeroed()` is initialized,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old text is even trying to claim an uninitialized state, even though it depends on T.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, yeah I could see how this could be confusing.

@@ -545,7 +545,7 @@ impl<T> MaybeUninit<T> {
/// ```
///
/// (Notice that the rules around references to uninitialized data are not finalized yet, but
/// until they are, it is advisable to avoid them.)
/// until they are, it is advisable to avoid references to uninitialized data.)
#[stable(feature = "maybe_uninit", since = "1.36.0")]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "them" is ambiguous, as it could refer to "references", or to "unintialized data" or "references to uninitialized data" or even to "the rules"... Probably needs more rewording to avoid the repetition though.

@@ -25,7 +29,7 @@ use crate::{fmt, intrinsics, ptr, slice};
/// This is exploited by the compiler for various optimizations, such as eliding
/// run-time checks and optimizing `enum` layout.
///
/// Similarly, entirely uninitialized memory may have any content, while a `bool` must
/// Similarly, entirely uninitialized memory may have any value, while a `bool` must
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"value" has a specific technical meaning in Rust semantics, and I don't see how this is the right use of the term. What's wrong with "content"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does a bool have to do with uninitialzed memory if a bool has values, but memory has content? I think it makes more sense if they both have the same things, since memory consists of bits, or bytes, or words which all also have values. "content" is perhaps not wrong, but it is suboptimal IMO.

@@ -726,16 +729,16 @@ impl<T> MaybeUninit<T> {
}
}

/// Drops the contained value in place.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is another example of a magic phrase: "drop in place". Does that mean to drop it without moving it first, or is it equivalent to just "drop". When does the difference (if there is one) matter? I honestly don't know the answer, which I why I did not yet propose a change for that part, but it could use some clarification.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that mean to drop it without moving it first

Yes, what else should it mean?
See https://doc.rust-lang.org/nightly/std/ptr/fn.drop_in_place.html

/// (unchanged unless written to) value . Reading the same uninitialized byte
/// multiple times can give different results. This even makes it undefined
/// behavior to have uninitialized data in a variable of integer type,
/// which otherwise could hold any *fixed* bit pattern:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instance, the old paragraph here is entirely fine, whereas the new version is unnecessarily terse while also using the term "value" in questionable ways.

I think you are underestimating how much work it is to review changes to natural-language technical writing that is highly sensitive to its exact wording. Every single change you make needs to be clearly motivated or it's just not worth the effort.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is definitely more opinion-based, and I definitely like to remove extraneous words, as it improves the signal-to-noise ratio. I believe this is useful when you want to slowly read something a few times, when trying to understand it. But I'll happily retract this one.

I'm not quite sure how to respond to how much work reviewing changes is. Yes, it is a lot of work. So is polishing existing text, especially when having to justify each change. I still think there is value in doing it, as there can be a lot of papercuts hiding in documentation.

/// initialized causes undefined behavior.
/// It is up to the caller to ensure that the `MaybeUninit<T>` has been fully initialized,
/// before dropping the `T` value of the `MaybeUninit<T>`. Calling this when the `T` value
/// of the `MaybeUninit<T>` is not yet fully initialized causes immediate undefined behavior.
///
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is what you both mean by churn, but is an attempt at making the phrasing more uniform:

  • initialization: initialized vs fully initialized vs properly initialized
  • UB: causes UB vs causes immediate UB

/// is in an initialized state.
/// It is up to the caller to ensure that the `MaybeUninit<T>` has been fully initialized,
/// before getting a reference to the `T` value of the `MaybeUninit<T>`. Calling this when the `T` value
/// of the `MaybeUninit<T>` is not yet fully initialized causes immediate undefined behavior.
///
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one, for example, says the same things, but in a different way...

/// initialize a `MaybeUninit`.
/// It is up to the caller to ensure that the `MaybeUninit<T>` has been fully initialized,
/// before getting a mutable reference to the `T` value of the `MaybeUninit<T>`. Calling this when the `T` value
/// of the `MaybeUninit<T>` is not yet fully initialized causes immediate undefined behavior.
///
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And a third way...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants