Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miri float ops are less precise than what the standard library expects #4208

Open
RalfJung opened this issue Feb 25, 2025 · 22 comments
Open
Labels
A-shims Area: This affects the external function shims C-bug Category: This is a bug.

Comments

@RalfJung
Copy link
Member

Tons of standard library float tests for ln, exp, and trigonometric functions fail with Miri's float non-determinism since they expect more precise results than what Miri delivers. I have temporarily disabled float non-determinism while we decide what to do.

A bunch of cases actually expect precise answers for certain cases, e.g.:
https://github.com/rust-lang/rust/blob/29166cd61711776e6d43239d6d18a0eafe6515b1/library/std/tests/floats/f64.rs#L468
https://github.com/rust-lang/rust/blob/29166cd61711776e6d43239d6d18a0eafe6515b1/library/std/tests/floats/f64.rs#L482-L483

Furthermore, there are quite a few cases where the standard library expects more precision than what Miri provides. assert_approx_eq! checks for a divergence of up to 1e-6, which is less than 10 ULP on a value of 1.0 represented as f32. Miri currently adds 16 ULP of error (on top of whatever error the host libm we use to implement these function has). Some of these tests chain multiple operations, such as 60.0f64.sinh().asinh(), so the error accumulates; some of these tests use values sufficiently larger than 1 that 16 ULP end up even exceeding 1e-5. Furthermore, some doctests actually check that the precision is within f32::EPSILON, which is 1 ULP on a value of 1.0; that is obviously never going to fly if Miri adds any kind of non-deterministic error.

I am open to reducing the error Miri applies, and happy for suggestions of what would make a good error scale. (I'd prefer it to be a power of 2, that makes the logic a lot simpler in Miri. ;) However, I also think some of these tests should change, specifically all tests that require either full equality or just 1 ULP error.

@tgross35 @Jules-Bertholet thoughts, opinions?

@Jules-Bertholet
Copy link
Contributor

IEEE 754 § 9.2.1 defines several special cases, including 0.0f64.exp(), where the result must be exact.

@RalfJung
Copy link
Member Author

Does it also define 5.0f64.exp2()?

@Jules-Bertholet
Copy link
Contributor

Does it also define 5.0f64.exp2()?

No.

@RalfJung
Copy link
Member Author

So some of the assert_eq! are still making assumptions about unspecified precision, then.

@Jules-Bertholet
Copy link
Contributor

Actually, scratch that. §9.2 says “a conforming operation shall return results correctly rounded for the applicable rounding direction for all operands in its domain”, which implies that technically, to be conforming an implementation can’t have any error ever. Of course, actual implementations fail this, which means they technically aren’t conforming.

I think that, beyond the special values explicitly called out in §9.2.1, some of this stuff (like 5.0f64.exp2() being exact) should be treated as a guarantee also, as no sane implementation is likely to break it.

@RalfJung
Copy link
Member Author

§9.2 covers which operations?

@Jules-Bertholet
Copy link
Contributor

All the optional mathematical operations. (The required math ops have a similar proviso)

@RalfJung
Copy link
Member Author

Ah, lol. Does anyone actually do this right? Even @eduardosm's softfloat implementations had 1 ULP of error.

Well, the goal of this non-determinism is to expose bugs in code that incorrectly relies on precision assumptions. I'll happily take the advice of a domain expert for what is the best strategy for that. :)

@Jules-Bertholet
Copy link
Contributor

The C standard is explicitly more forgiving, so maybe we should rely on it instead?

(Also, I’m not a domain expert, I just read the standard!)

@tgross35
Copy link
Contributor

tgross35 commented Feb 26, 2025

Since we bind the C functions, we wind up following the same rules. Aiui this is extremely forgiving, sin could be a linear interpolation from a four-entry LUT. In practice, most system libm implementations are likely to be within ~2ULP for most ops, with a few exceptions (musl for example https://wiki.musl-libc.org/mathematical-library).

Representing this well in Miri is tricky because, while it's technically allowed, an implementation that can't produce $e^0=1$ would not be a very good one. For the tests in std that are failing, it feels like maybe we don't want the non-determinism? Since it seems like the goal would be to test the platform's real implementation, as opposed to ensuring that our tests pass on worst case platforms. I think the non-determinism makes more sense for downstream users that aren't just testing the routines themselves.

@beetrees may also have some ideas here.

Ah, lol. Does anyone actually do this right? Even @eduardosm's softfloat implementations had 1 ULP of error.

There are some implementations that provide the cr_* (correctly rounded) version of various routines, I am working on porting some of these from CORE-MATH to our libm. This isn't very common though.

@RalfJung
Copy link
Member Author

FWIW results of 0 remain precise with the current scheme, since we add a relative error not an absolute one. (No idea what you meant by e^x = 0 though.)

@RalfJung
Copy link
Member Author

Since it seems like the goal would be to test the platform's real implementation, as opposed to ensuring that our tests pass on worst case platforms.

Note that some of these are doctests and hence can be read as guarantees that the implementation will be within EPSILON of the expected answer.

@tgross35
Copy link
Contributor

FWIW results of 0 remain precise with the current scheme, since we add a relative error not an absolute one. (No idea what you meant by e^x = 0 though.)

Whoops, that should have been $e^0=1$.

Note that some of these are doctests and hence can be read as guarantees that the implementation will be within EPSILON of the expected answer.

I don't really like that we do this either, it builds the wrong intuition that EPSILON is always a reasonable value to compare float operations to. Assuming this comes from blending the definition of machine epsilon with the mathematical usage of $\epsilon$. It is nice that it looks clean, but it would probably be a good idea to change these tests anyway (being 1ULP > EPSILON for |x| > 2.0).

Is it possible to enable the randomness for most things but exclude tests in coretests / std/tests that are intended to validate system float ops?

@RalfJung
Copy link
Member Author

Is it possible to enable the randomness for most things but exclude tests in coretests / std/tests that are intended to validate system float ops?

That's tricky. We could add a flag to disable that and just set that flag for the entire "run std tests in Miri" run, though.

But I want to first experiment with reducing the error Miri adds, e.g. to 4 ULP.

@RalfJung
Copy link
Member Author

The C standard is explicitly more forgiving, so maybe we should rely on it instead?

Oh, that actually has some quite trivial points in it:

"For each single-argument function f in <math.h> whose mathematical counterpart is symmetric
(even), f(-x) is f(x) for all rounding modes and for all x in the (valid) domain of the function. For
each single-argument function f in <math.h> whose mathematical counterpart is antisymmetric
(odd), f(-x) is -f(x) for the ISO/IEC 60559 rounding modes roundTiesToEven, roundTiesToAway,
and roundTowardZero, and for all x in the (valid) domain of the function."

I think LLVM actually violates this, since the result is effectively non-deterministic (if LLVM evaluates libm calls at compile-time), and therefore cannot satisfy that requirement.

But then F.10.1-3 has a nice list of guarantees that are being made. I would add to that that functions with a clearly defined output range (such as [-1, 1] for sin, cos) must produce output in that range. And I'd suggest we reduce the error down to 4 ULP to avoid breaking so many of the std tests. Just having any non-determinism here should already help, I think.

@LorrensP-2158466 do you want to give this a try? Unfortunately I had to disable the code you added in your first PR as it made a bunch of test fail; turns out this is more complicated than I thought. I would say in a first PR, deal just with the ones defined in src/tools/miri/src/intrinsics/mod.rs.

@RalfJung RalfJung added C-bug Category: This is a bug. A-shims Area: This affects the external function shims labels Feb 27, 2025
@LorrensP-2158466
Copy link
Contributor

Yes, I'd like to give it a try, and just to be sure you want me to:

  • reduce the error to 4 ULP and change the approx_eq! macro to account for that
  • clamp output values to their defined ranges according to the standard

Am I missing something in this thread?

@RalfJung
Copy link
Member Author

Yes, please also make sure the guarantees listed in F.10.1-3 of the C standard are met.

@LorrensP-2158466
Copy link
Contributor

Alright, will do!

@tgross35
Copy link
Contributor

tgross35 commented Feb 27, 2025

My read of the standard was that F.10 paragraph 16 walks back the statement in paragraph 3:

Recommended practice

16 ISO/IEC 60559 specifies correct rounding for the operations in the F.3 table of operations recommended by ISO/IEC 60559, and thereby preserves useful mathematical properties such as symmetry, monotonicity, and periodicity. The corresponding functions with (potentially) reserved cr_-prefixed names (7.33.8) do the same. The C functions in the table, however, are not required to be correctly rounded, but implementations should still preserve as many of these useful mathematical properties as possible.

The "should" wording here seems to imply that the implementations are not required to uphold symmetry/monotonicity/periodicy. Likely not a problem, in any case.

@RalfJung
Copy link
Member Author

What the heck, why would you have paragraph 16 directly contradict what paragraph 3 says?!?

Anyway, we'll just ignore both of them then. 🤷

@Jules-Bertholet
Copy link
Contributor

I don’t see a contradiction. Paragraph 3 specifies a subset of IEEE 754 requirements that C also requires, paragraph 16 notes that this subset is a proper subset.

@LorrensP-2158466
Copy link
Contributor

How do you think I should best implement this? I'm currently looking at this piece of code:

  | "sinf32"
  | "cosf32"
  | "expf32"
  | "exp2f32"
  | "logf32"
  | "log10f32"
  | "log2f32"
  => {
      let [f] = check_intrinsic_arg_count(args)?;
      let f = this.read_scalar(f)?.to_f32()?;
      // Using host floats (but it's fine, these operations do not have
      // guaranteed precision).
      let host = f.to_host();
      let res = match intrinsic_name {
          "sinf32" => host.sin(),
          "cosf32" => host.cos(),
          "expf32" => host.exp(),
          "exp2f32" => host.exp2(),
          "logf32" => host.ln(),
          "log10f32" => host.log10(),
          "log2f32" => host.log2(),
          _ => bug!(),
      };
      let res = res.to_soft();
      // Apply a relative error of 16ULP to introduce some non-determinism
      // simulating imprecise implementations and optimizations.
      // FIXME: temporarily disabled as it breaks std tests.
      // let res = apply_random_float_error_ulp(
      //     this,
      //     res,
      //     4, // log2(16)
      // );
      let res = this.adjust_nan(res, &[f]);
      this.write_scalar(res, dest)?;
  }

For example sin and cos:
sin(x) and cos(x) are in [-1, 1] this can be done with .minimum(-1).maximum(1) ( or clamp(-1, 1)) which also propagates any NaN's and is thus correct. But sin(±0) = 0 and cos(±0) = 1, I'm not quite sure what the best way is to account for this, because 0 means 0 and not 4 ULP around 0. And that is not considering the other math functions.

Should I:

  • split these match arms into their categories (sin and cos, exps together, logs together)?
  • Use something that exists in miri and I don't know about?
  • Something else I'm not seeing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-shims Area: This affects the external function shims C-bug Category: This is a bug.
Projects
None yet
Development

No branches or pull requests

4 participants