Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use structured types for parsing and formatting language and ABI tags #10525

Merged
merged 1 commit into from
Jan 14, 2025

Conversation

charliermarsh
Copy link
Member

Summary

I need to be able to do non-lexicographic comparisons between tags (e.g., so I can sort cp313 as greater than cp39). It ended up being easiest to just create structured types for all the tags we support, with FromStr and Display implementations.

We don't currently store these in Tags or in WheelFilename. We may want to, since they're really small (and Copy), but I need to benchmark to determine whether parsing these in WheelFilename is prohibitively slow.

@charliermarsh charliermarsh added the internal A refactor or improvement that is not user-facing label Jan 11, 2025
@charliermarsh charliermarsh force-pushed the charlie/tags branch 3 times, most recently from 1d3ab2a to 4b6aaba Compare January 11, 2025 23:00

/// Parse an [`AbiTag`] from a string.
#[allow(clippy::cast_possible_truncation)]
fn from_str(s: &str) -> Result<Self, Self::Err> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parsing code is incredibly verbose... It's fairly rote though, and comes with tests. Open to input on how to trim it down. I could perhaps remove all the error-handling...? And just return None?

@charliermarsh charliermarsh force-pushed the charlie/tags branch 4 times, most recently from 142cc14 to 0365fae Compare January 12, 2025 01:38
@charliermarsh charliermarsh force-pushed the charlie/tags branch 6 times, most recently from 5da08b0 to 9bfe1b7 Compare January 12, 2025 20:46
Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this personally. As I said in the comments, I'm a fan of the granular error reporting, even when it makes the code chonky.

I will say that I find the code pretty easy to follow. It reads nicely to me.

tag: full_tag.to_string(),
})?;
Ok((major, minor))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my comments from above apply here too.

I personally am a fan of very granular error reporting like you have here. I've just been bitten so many times by overly generic and unhelpful error messages that I like this granular style much more. Like, it's more code, but I don't see this changing much or otherwise being hard to maintain personally.

Abi3,
/// Ex) `cp39m`, `cp310t`
CPython {
gil_disabled: bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering, why this naming? We use "freethreaded" elsewhere and that avoids a double-negative (e.g., gil disabled = false is slightly harder to reason about)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dear goodness it comes from Py_GIL_DISABLED and is present elsewhere in our code. I defer to you on what's best here, but have a preference for "freethreaded"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lol

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll just leave it for now and revisit in a separate PR.

}

#[derive(Debug, thiserror::Error, PartialEq, Eq)]
pub enum ParseAbiTagError {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could split the error into the input value and the error kind with context, which should cut down the error handling boilerplate in AbiTag::from_str:

#[error("{kind} ABI tag: {tag}")]
struct ParseAbiTagError {
    tag: String,
    kind: ParseAbiTagErrorKind,
}

enum ParseAbiTagErrorKind {
    #[error("Unknown format for")]
    UnknownFormat,
    #[error("Missing major version in {0}")]
    MissingMajorVersion(&'static str),
    // ...
}

@charliermarsh charliermarsh merged commit e0e8ba5 into main Jan 14, 2025
64 checks passed
@charliermarsh charliermarsh deleted the charlie/tags branch January 14, 2025 00:49
charliermarsh added a commit that referenced this pull request Jan 14, 2025
## Summary

This PR extends the thinking in #10525 to platform tags, and then uses
the structured tag enums everywhere, rather than passing around strings.
I think this is a big improvement! It means we're no longer doing ad hoc
tag parsing all over the place.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal A refactor or improvement that is not user-facing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants