-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tinystr-neo to experimental/ #1508
Conversation
|
||
let mut out = [0; N]; | ||
let mut i = 0; | ||
while i < bytes.len() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: can't be a for loop in const context
} | ||
|
||
pub fn len(&self) -> usize { | ||
self.bytes.iter().position(|x| *x == 0).unwrap_or(N) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: previously we were able to use .trailing_zeroes()
/leading_zeroes()
which are a single instruction. If rustc unrolls this loop for small N we should also be fine here, but I'm not sure if that's happening, we should bench this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah; this is one of the things I was wondering how we would implement efficiently. We should definitely test. I think this would be a good thing for IAI as opposed to Clippy.
} | ||
|
||
#[inline] | ||
fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note: this is likely not as efficient as the deserialization from the tinystr crate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could add TinyStr4/etc as aligned wrappers that serialize to little-endian numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not too worried about impl Deserialize
on TinyAsciiStr itself, because that one is only used when TinyAsciiStr is at the top level of a data struct. I anticipate the much more common case to be when it is in a ZeroVec, where the ULE impl will be used instead of the Deserialize impl.
// (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ). | ||
|
||
#[macro_export] | ||
macro_rules! tinystr { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought: We maybe don't need the macro any more. People can just use the const function constructors directly. Fewer imports and friendlier to the compiler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So as mentioned in #1511 (comment) , I think it's important to give users a convenient way to call this in a const context, given that it's rare and unusual for Rust programmers to const
locals wherever possible. Unlike that PR, there will be no exported consts for people to use this, so I'd prefer to keep the macro around.
MBEs (non proc macros, like this one) are pretty friendly to the compiler; and this is going to be even smaller once I rebase over the rust update. I don't find that argument that compelling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, macro_rules
is definitely an improvement over a proc macro. Okay, I'm convinced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM and please update the PR to Rust 1.58 so we can get rid of the runtime unwrap
Manish - do you plan to measure perf before merging o before switching callsites to it? |
@zbraniecki before switching callsites. The crate is WIP (and experimental/), I just want to land what I have so far so that others (you?? 🥺) can help with docs, benches, API surface, etc. |
fbeea77
to
2cdeab3
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
const TINYSTR_MACRO_CONST: $crate::TinyAsciiStr<$n> = | ||
$crate::TinyAsciiStr::from_str_panicky($s); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion: does it work if you do this?
const TINYSTR_MACRO_CONST: $crate::TinyAsciiStr<$n> = | |
$crate::TinyAsciiStr::from_str_panicky($s); | |
const TINYSTR_MACRO_CONST: $crate::TinyAsciiStr<$n> = | |
$crate::TinyAsciiStr::from_str($s).unwrap(); |
or perhaps
const TINYSTR_MACRO_CONST: $crate::TinyAsciiStr<$n> = | |
$crate::TinyAsciiStr::from_str_panicky($s); | |
const TINYSTR_MACRO_CONST: $crate::TinyAsciiStr<$n> = | |
$crate::TinyAsciiStr::from_str($s).ok().expect("Invalid tinystr"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(with the unwrap
inside the const context)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, unwrap isn't stably const
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not even Option::unwrap
? oh well... how about
const TINYSTR_MACRO_CONST: $crate::TinyAsciiStr<$n> =
match $crate::TinyAsciiStr::from_str($s) {
Ok(s) => s,
Err(_) => panic!("{}", $s) // Invalid tinystr
}
Or, keep it the way it is. It's just that it might be nice to not have a panicky library function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I didn't like that function either, moved
match $crate::TinyAsciiStr::from_bytes($s.as_bytes()) { | ||
Ok(s) => s, | ||
// Cannot format the error since formatting isn't const yet | ||
Err(_) => panic!(concat!("Failed to construct tinystr from ", $s)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Praise: Good idea to use concat!
to format a nicer message
This is a clean attempt at
tinystr
, usingconst fn
and fixed size arrays.This should address zbraniecki/tinystr#45 / #881
The general idea is, instead of encoding strings as digits, we encode them as byte arrays. This has many advantages:
TinyAsciiStr<5>
orTinyAsciiStr<100>
, no restrictions on NA potential downside is that these are less aligned (we can write TinyStr4/etc wrappers that are better aligned), and
.deref()
/deserialize()
may not be as efficient (we might need to benchmark this).This is still kinda WIP, but I'm not sure how much time I'll have to work on this so I'm landing what I have so far so that others can help out. In particular I'd love help with tests, benches, and docs.