-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't panic on empty ranges. #78
base: main
Are you sure you want to change the base?
Conversation
According to the standard library, a range is empty if start >= end. An inclusive range is empty if start > end. While it makes no sense to insert such a range, it should not cause a panic as these ranges are not explicitly disallowed by the standard library.
IIRC my original reasoning for this is that if you're inserting an empty range, that has no meaning, and so I assume it is a program error and therefore suitable to panic. I don't feel very strongly about this, though. As for changing it now, I think that would be a breaking change; being more permissive with types is not breaking, but this would change the observable behaviour of existing programs. It's extremely unlikely that anybody is relying on the current behaviour in practice, but I need to have a think about whether or not this is okay to change. |
It is up to you what you think makes sense. I have only two reasons why I think this might make sense from a user's perspective (from what I have experienced in practise):
So I think the reason I think this might be valuable to change is that people might have semi-good reasons for having empty ranges somewhere in their data pipelines, and if we have three possible behaviours:
Then doing "crash" is perhaps the least good option (I think libraries should ideally never panic). Returning an error might be an option but is quite invasive. Doing nothing in my opinion might be the best option here.
Fair :) it is possible to rely on this by catching panics, however I think that is quite unlikely outside of unit testing. But it would be nice to document a guarantee that "no matter how you use this API, we will never panic". I think that is a requirement for at least some users. |
I feel this sort of thing comes down heavily to a users use-cases, for me I like having empty ranges panic because in my use-case empty ranges should be impossible so it's nice that this is always being asserted. On the other hand if your use-case often has empty ranges then having empty ranges being ignored could be nicer too. Perhaps this sort of thing could be enabled/disabled by a feature? Something like |
I quite like the idea of using a feature here. Although, some consideration would need to be made, as Cargo performs feature unification. It would be dangerous if there was some code that relied on there being assertions, and then those assertions being disabled because another unrelated crate in the same project enables that feature (or the other way around). |
An alternative to using features would be define a trait for every method of that takes a range as input, then define two different types The other option you mentioned of returning a result from each of these methods could work but then you'd have to |
It almost feels like the idea of returning a result gives the most flexibility, because then the caller can decide whether to handle that. Something like: #[derive(thiserror::Error, Debug)]
pub enum InsertError {
#[error("empty range inserted")]
EmptyRange,
}
impl RangeSet<T> {
pub fn insert(range: Range<T>) -> Result<(), InsertError> {
if empty(&range) {
return Err(InsertError::EmptyRange);
}
// ...
}
} And then you can preserve the existing behaviour by calling I generally love overengineering things with the type system, but it feels a bit overkill to achieve this when a result can do the trick. Although, I would still lean a little bit on just doing nothing, just because semantically you can also append an empty vector to a Vec and it works, just does nothing (and an empty range is the same concept, kind of): let mut data = vec![];
// append empty vec, does nothing.
data.append(&mut vec![]); |
Yeah that seems like the cleanest solution. I think I might implement that for |
Although for consistency shouldn't all the other methods that take ranges also now return results? Like |
Ah, I had not seen that other methods also have assertions. In that case, I don't think returning errors everywhere is a good way to go, because it is too broad of a change and maybe not entirely useful to most people. I think the two courses of action then depend on what we define empty ranges as:
The reason for this is that I don't think it is a good idea nor feasible to make all methods return errors. The feature approach could still work, but since it changes the semantics of the range maps and sets, it is not all that easy. I think it would really be simpler to use approach (1). However, if the feature approach is taken one suggestion would be to call it something like |
I think the idea of doing nothing on empty ranges is the correct one, but instead of getting rid of all assertions, why not make them all |
I think the issue with that is that it means there is still an inconsistency. Then in release mode, empty ranges are fine, but in debug mode it is an error. In my opinion, adding an empty range should be like appending an empty list to a list: do nothing. Not panic, even in debug mode. I think it is easier for end-users to add assertions for this case, and for the library to (strongly) guarantee that it does not Another option is to have a separate API, for example I think that |
Fair enough! |
Currently, on insertion there are assertions to check for invalid ranges:
According to the standard library, a range is empty if
start >= end
(orstart > end
for inclusive ranges). Therefore, from the point of view of the standard library, ranges such as0..0
or15..=14
are perfectly valid (but meaningless) and therefore I would argue that this this library should not panic when they are used. I have the opinion that libraries should never panic, if there is an invalid use that needs to be reported then aResult
can be used, because having code that might panic in production systems is quite scary.Consumers of this library that want to catch these insertions and wish to treat these as errors can always do that themselves quite easily, by inserting assertions at the call site:
However, it is not possible to go the opposite way and remove an assertion from a consumer of the library. There are some valid use-cases where having empty ranges might be useful, for example to represent missing data or as the output of some algorithms.
Therefore I think it is best to remove these assertions and simply do nothing (return early) when empty ranges are inserted or removed. This allows for the least amount of surprises for users of the library. Ideally, I feel that
rangemap
should never cause panics for valid input, even if the inputs don't make sense.