Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply random float error #4156

Merged
merged 2 commits into from
Feb 16, 2025

Conversation

LorrensP-2158466
Copy link
Contributor

@LorrensP-2158466 LorrensP-2158466 commented Jan 27, 2025

This partially tries to fix #3555 (does not include the pow operations)

this just does apply_random_float_error on the results of the intrinsics.
Things changed from original:

  • seperated the sqrtf* intrinsics to make the implementation cleaner, since applying the error was not required
  • changes the way we do approximation checks, instead of using the epsilon technique, we use ULP.
  • create the relative errors of 16ULP using ulp_err_scale, which does -(F::PRECISION - 1 - n for a 2^n ULP error.
  • adds a test test_non_determinism which tests a lot of non-deterministic operations with the same input to check if they are almost but not entirely equal.

@LorrensP-2158466 LorrensP-2158466 marked this pull request as draft January 27, 2025 19:51
@RalfJung RalfJung added the S-waiting-on-author Status: Waiting for the PR author to address review comments label Jan 29, 2025
@RalfJung
Copy link
Member

RalfJung commented Jan 29, 2025

Please also add a test (as a new function in tests/pass/float.rs) that checks for non-determinism by running the same operation multiple times and ensuring we get different (but similar) results.

@rustbot

This comment has been minimized.

@rustbot

This comment has been minimized.

@LorrensP-2158466
Copy link
Contributor Author

Almost done I think.

I have 1 TODO left, inside of the non-determinism test, it is about this:

Please also add a test (as a new function in tests/pass/float.rs) that checks for non-determinism by running the same operation multiple times and ensuring we get different (but similar) results.

Should we do an assert_ne! test on the results, because it is possible that the same error is applied to 2 separate operations that we then check. Maybe it is possible to count these exactly equal results and if it is the same as the amount of checks we can safely say it is deterministic and should fail the test?.

@LorrensP-2158466 LorrensP-2158466 force-pushed the apply-random-float-error branch from a64db7e to b82c735 Compare February 5, 2025 18:21
@LorrensP-2158466
Copy link
Contributor Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Feb 5, 2025
@RalfJung
Copy link
Member

RalfJung commented Feb 5, 2025

Should we do an assert_ne! test on the results, because it is possible that the same error is applied to 2 separate operations that we then check. Maybe it is possible to count these exactly equal results and if it is the same as the amount of checks we can safely say it is deterministic and should fail the test?.

I think the easiest way is something like this; you can do that with a closure returning any PartialEq type instead of bool.

@LorrensP-2158466
Copy link
Contributor Author

@rustbot ready

Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, this looks great! All comments are minor.

@rustbot author
Please post @rustbot ready when the PR is ready for the next round of review.

ulp diff: {ulp_difference} > {allowed_ulp}
",
*a,
*b
Copy link
Member

@RalfJung RalfJung Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will print complete nonsense when the exponent is different, won't it?

I'm not convinced going via the hex/binary representation is the best option here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f16 and f128 currently don't implement Display, so using debug was the only thing that worked (that i know of).

Copy link
Member

@RalfJung RalfJung Feb 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand your reply.

If the exponent is different, subtracting the two i128 representations will not be the difference in ULP, it will be complete junk. The entire logic falls apart.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here's an alternative way to approximate the distance in ULP, but it may fail spectacularly due to rounding:

let ulp = (expected.next_up() - expected.next_down()) / 2;
let ulp_diff = ((real - expected) / ulp).round();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to be honest, I don't see why we would use this way of calculating ULP. Isn't it unlikely to get approximately equal values with different exponents in our tests?

Copy link
Member

@RalfJung RalfJung Feb 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a question of probability -- if we happen to write a test where the result is near the boundary between values represented with different exponents, we'll suddenly get completely nonsensical results. That's fragile and hard to debug. We shouldn't have surprising footguns in our code that can go off any day in the future that someone adds a new test.

frem_algebraic, frem_fast, fsub_algebraic, fsub_fast,
};
use std::{f32, f64};
// FIXME(powi_powf): add when supported
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is that about?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mentioned in the zulip chat to not include the pow intrinsics in this pr. I should explain this better inside the comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. The better place for the FIXME would be in the shim implementation.

@rustbot rustbot added S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-review Status: Waiting for a review to complete labels Feb 8, 2025
Comment on lines 15 to 25
/// Another form of checking if 2 floating point numbers are almost equal to eachother
/// Using `a` and `b` as floating point numbers:
///
/// Instead of doing a simple EPSILON check (which we did at first):
/// Absolute difference of `a` and `b` can't be greater than some E (10^-6, ...).
///
/// We look at ULP: `Units in the Last Place` or `Units of Least Precision`
/// It is the absolute difference of the *unsigned integer* representation of `a` and `b`.
/// For example for f32, which in IEEE format looks like this:
///
/// s: sign bit
/// e: exponent bits
/// m: significand bits
/// f32: s | eeee eeee | mmm mmmm mmmm mmmm mmmm mmmm
/// u32: seee eeee emmm mmmm mmmm mmmm mmmm mmmm
///
/// We see that checking `a` and `b` with different signs has no meaning, but we should not forget
/// -0.0 and +0.0.
/// Same with exponents but no zero checking.
///
/// So when Sign and Exponent are the same, we can get a reasonable ULP value
/// by doing the operation explained above. And if this is less than or equal to our chosen upper bound
/// we can say that `a` and `b` are approximately equal.
///
/// Basically ULP can be seen as a distance metric of floating point numbers, but having
/// the same amount of "spacing" between consecutive numbers. So eventhough 2 very large floating point numbers
/// have a large value difference, their ULP can still be 1, so they are still "approximatly equal",
/// but the EPSILON check would have failed.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only learned about this concept last week, but I tried my best. Let me know if i missed something or if something is explained badly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow that's a lot more detailed than I expected, thanks!

@LorrensP-2158466
Copy link
Contributor Author

LorrensP-2158466 commented Feb 8, 2025

Fixed these CI checks in the last commit but CI never ran on that one...
Edit: it did, it still fails, but not locally
@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Feb 8, 2025
@LorrensP-2158466
Copy link
Contributor Author

@rustbot blocked

I don't know whats going on

@rustbot rustbot added S-blocked Status: blocked on something happening somewhere else and removed S-waiting-on-review Status: Waiting for a review to complete labels Feb 8, 2025
@saethlin
Copy link
Member

saethlin commented Feb 8, 2025

CI tests your branch merged with the target branch. The errors in CI look like you have a semantic merge conflict (GitHub can only automatically detect textual merge conflicts).

You should rebase onto the upstream target branch. The most foolproof way to do this is a rebase pull: git pull --rebase git@github.com:rust-lang/miri.

@LorrensP-2158466 LorrensP-2158466 force-pushed the apply-random-float-error branch from 312b028 to 25ecf26 Compare February 8, 2025 16:49
@saethlin
Copy link
Member

saethlin commented Feb 8, 2025

If you still have a clean local build, run ./miri toolchain, which will probably update the toolchain override.

@LorrensP-2158466
Copy link
Contributor Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Waiting for a review to complete and removed S-blocked Status: blocked on something happening somewhere else labels Feb 8, 2025
@LorrensP-2158466
Copy link
Contributor Author

Sorry for the delay, Uni started this week.

I extended the explanation of the Approx Check, let me know if this is correct and not too long?

@rustbot ready

Comment on lines 61 to 86
fn approx_eq_check<F: Float>(
actual: F,
expected: F,
allowed_ulp: F::Int,
) -> Result<(), NotApproxEq<F>>
where
F::Int: PartialOrd,
{
let actual_signum = actual.signum();
let expected_signum = expected.signum();

if actual_signum != expected_signum {
// Floats with different signs must both be 0.
if actual != expected {
return Err(NotApproxEq::SignsDiffer);
}
} else {
let ulp = (expected.next_up() - expected.next_down()).halve();
let ulp_diff = ((actual - expected) / ulp).round().as_int();

if ulp_diff > allowed_ulp {
return Err(NotApproxEq::UlpFail(ulp_diff));
}
}
Ok(())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that this is just regular float arithmetic, why do we still need this trait and all the rest?

Also, most of the explanation is not relevant any more since we are not working on the bit representation any more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trait is still needed for other helper functions I didn't write, they were also used in macros and such so i followed that approach. It's still a lot of code and the macro will thus generate a lot of code (not sure if this matters), and there is no way to make a float generic function other than using the trait.

I'll update the explanation, let me know what you think about the rest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra: Also the weird extra functions (halve, as_int, round) i had to add to the Trait were necessary to make the generic function work. Just putting in as F::int or / 2.0 did not work in any clean way other than this. It's no problem if i have to remove it if needed.

@RalfJung RalfJung force-pushed the apply-random-float-error branch from 10ece47 to e095b40 Compare February 16, 2025 18:13
@RalfJung RalfJung force-pushed the apply-random-float-error branch from 1e977f7 to 5c09010 Compare February 16, 2025 18:19
@RalfJung
Copy link
Member

I found it easier to just do the remaining changes than to try to explain them, so I have updated the PR accordingly. I also rebased over the latest rustup which adds one more function that had to be tested for non-determinism, and squashed the commits. (If you want to see what I changed compared to your version, look at the 2nd commit.)

Turns out a macro-based solution still works just fine. Maybe a method is cleaner, but switching to that should not be done in the same PR as adding new functionality.

Thanks for your work!

@RalfJung RalfJung enabled auto-merge February 16, 2025 18:22
@RalfJung RalfJung force-pushed the apply-random-float-error branch from 5c09010 to 269a7f5 Compare February 16, 2025 18:23
@RalfJung RalfJung force-pushed the apply-random-float-error branch from 269a7f5 to 5c9eaa3 Compare February 16, 2025 18:24
@RalfJung RalfJung added this pull request to the merge queue Feb 16, 2025
Merged via the queue into rust-lang:master with commit 826e72f Feb 16, 2025
7 checks passed
@LorrensP-2158466
Copy link
Contributor Author

Alright that's good. Thanks, learned a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Waiting for a review to complete
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use non-deterministic precision for float operations that do not have guaranteed precision
4 participants