018616e78b ("Always have math functions but with `weak` linking
attribute if we can") made all math symbols available on platforms that
support weak linkage. This caused some unexpected regressions, however,
because our less accurate and sometimes slow routines were being
selected over the system `libm`, which also tends to be weak [1]. Thus,
0fab77e8d7 ("Don't include `math` for `unix` and `wasi` targets") was
applied to undo these changes on many platforms.
Now that some improvements have been made to `libm`, add back a subset
of these functions:
* cbrt
* ceil
* copysign
* fabs
* fdim
* floor
* fma
* fmax
* fmaximum
* fmin
* fminimum
* fmod
* rint
* round
* roundeven
* sqrt
* trunc
This list includes only functions that produce exact results (verified
with exhaustive / extensive tests, and also required by IEEE in most
cases), and for which benchmarks indicate performance similar to or
better than Musl's soft float math routines [^1]. All except `cbrt` also
have `f16` and `f128` implementations. Once more routines meet these
criteria, we can move them from platform-specific availability to always
available.
Once this change makes it to rust-lang/rust, we will also be able to
move the relevant functions from `std` to `core`.
[^1]: We still rely on the backend to provide optimized assembly
routines when available.
[1]: https://github.com/rust-lang/rust/issues/128386
This requires privately reexporting `libm`'s `support` module at crate
root, where it is expected for macros. Once `libm` is made always
available, the reexport can be simplified.
This delta adds a lot of routines to `f16` and `f128`:
* ceil
* floor
* fma (f128 only)
* fmax
* fmin
* fmod
* ldexp
* rint
* round
* scalbn
* sqrt
Additionally, the following new API was added for all four float types:
* fmaximum
* fmaximum_num
* fminimum
* fminimum_num
* roundeven
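
As a rough illustration of how the two new maximum flavors differ, here
is a minimal sketch for `f64`. It is not the generic implementation added
here; the `_sketch` names are hypothetical. It assumes IEEE 754-2019
semantics: `fmaximum` propagates NaN, `fmaximum_num` prefers a number
over NaN, and both treat -0.0 as less than +0.0.

```rust
/// NaN-propagating maximum (sketch): any NaN input yields NaN.
fn fmaximum_sketch(x: f64, y: f64) -> f64 {
    if x.is_nan() {
        x
    } else if y.is_nan() {
        y
    } else if x > y || (x == y && x.is_sign_positive()) {
        // The sign check orders +0.0 above -0.0.
        x
    } else {
        y
    }
}

/// Number-preferring maximum (sketch): NaN only if both inputs are NaN.
fn fmaximum_num_sketch(x: f64, y: f64) -> f64 {
    if y.is_nan() {
        x
    } else if x.is_nan() {
        y
    } else if x > y || (x == y && x.is_sign_positive()) {
        x
    } else {
        y
    }
}
```

The minimum variants would mirror this with the comparisons reversed.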
There are also some significant performance improvements for `sqrt` and
`sqrtf`, as well as precision improvements for `cbrt` (both `f32` and
`f64` versions of this function are now always correctly rounded).
`compiler-builtins` is not allowed to call anything from `core`;
however, there are a couple of cases where we do so in `libm` for debug
output. Gate relevant locations behind the `compiler-builtins` Cargo
feature.
Replace `public_test_dep!` by placing optionally public items into new
modules, then controlling what is exported with the `public-test-deps`
feature.
This is nicer for automatic formatting and diagnostics.
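
The pattern looks roughly like the sketch below. The module and function
names are hypothetical placeholders, not the items touched by this
change; only the `public-test-deps` feature name comes from the crate.

```rust
// Items that tests may need live in an ordinary private module, so
// rustfmt and diagnostics see plain Rust rather than macro input.
mod int_helpers {
    /// Hypothetical helper used only for illustration.
    pub fn leading_zeros_sketch(x: u32) -> u32 {
        x.leading_zeros()
    }
}

// With the feature enabled, the items become part of the public API.
#[cfg(feature = "public-test-deps")]
pub use int_helpers::*;

// Otherwise they stay visible within the crate only.
#[cfg(not(feature = "public-test-deps"))]
#[allow(unused_imports)]
pub(crate) use int_helpers::*;
```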
This is a reland of 2e2a9255 ("Eliminate the use of
`public_test_dep!`"), which was reverted in 47e50fd2 ('Revert "Eliminate
the use of..."') due to a bug exposed at [1], reapplied in d4abaf4efa
because the issue should have been fixed in [2], then reverted again in
f6eef07f53 because [2] did not actually fix the issue.
[3] has landed in rust-lang/rust since then, which should resolve the
last problem remaining after [2]. So, apply this change for what is
hopefully the final time.
[1]: https://github.com/rust-lang/rust/pull/128691
[2]: https://github.com/rust-lang/rust/pull/135278
[3]: https://github.com/rust-lang/rust/pull/135501
In `compiler-builtins`, `libm` is contained within a `math` module. The
smoke test in this repo has a slightly different layout, so some things
were passing that should not have been.
Change module layouts in `compiler-builtins-smoke-test` to match
`compiler-builtins` and update a few instances of broken paths.
Similar to other recent changes, just put public API in the same file as
its generic implementation. To keep things slightly cleaner, split the
default implementation from the `_wide` implementation.
Also introduces a stub `fmaf16`.
Currently the argument multiplier and large float multiplier are applied
before the iteration count is selected based on the generator. However,
this means that bivariate and trivariate functions don't get scaled at
all (except for the special-cased fma).
Move this scaling to a later point.
When there is a panic in an extensive test, tracing down where it came
from can be difficult since no information is provided (messages are
e.g. "attempted to subtract with overflow"). Resolve this by calling the
functions within `panic::catch_unwind`, printing the input, and
continuing.
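
A minimal sketch of the idea, assuming a hypothetical `flaky_op` routine
and `run_all` driver (the real test runner is structured differently):

```rust
use std::panic;

/// Hypothetical stand-in for a routine under test that can panic.
fn flaky_op(x: u32) -> u32 {
    x.checked_sub(1).expect("attempted to subtract with overflow")
}

/// Run `f` on every input. When a call panics, print the offending
/// input and keep going; return the list of inputs that panicked.
fn run_all(f: fn(u32) -> u32, inputs: &[u32]) -> Vec<u32> {
    let mut failed = Vec::new();
    for &input in inputs {
        // `move` captures the fn pointer and the input by value, which
        // keeps the closure unwind safe.
        if panic::catch_unwind(move || f(input)).is_err() {
            eprintln!("panic with input {input:?}");
            failed.push(input);
        }
    }
    failed
}
```

Calling `run_all(flaky_op, &[3, 0, 5])` reports the input `0` and still
evaluates the remaining inputs.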
Inputs in `case_list` shouldn't hit xfails or increased ULP tolerance.
Ensure that overrides are skipped when testing against MPFR or a
specified value and that NaNs, if any, are checked bitwise.
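
The bitwise NaN check matters because `NaN == NaN` is false under the
usual float comparison. A small sketch of such a comparison, with a
hypothetical helper name:

```rust
/// Compare a result against an expected value (sketch). If either side
/// is NaN, require identical bit patterns; otherwise compare by value.
fn matches_expected(actual: f64, expected: f64) -> bool {
    if actual.is_nan() || expected.is_nan() {
        actual.to_bits() == expected.to_bits()
    } else {
        actual == expected
    }
}
```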
C23 specifies a new set of `roundeven` functions that round to the
nearest integer, with ties to even. They do not raise any floating
point exceptions.
This behavior is similar to two other functions:
1. `rint`, which rounds to the nearest integer respecting rounding mode
and possibly raising exceptions.
2. `nearbyint`, which is identical to `rint` except that it does not
raise the inexact exception.
Technically `rint`, `nearbyint`, and `roundeven` all behave the same in
Rust because we assume the default floating point environment. The
backends are allowed to lower to `roundeven`, however, so we should
provide it in case the fallback is needed.
Add the `roundeven` family here and convert `rint` to a function that
takes a rounding mode. This currently has no effect.
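
For reference, ties-to-even rounding can be sketched for `f64` as below.
This uses `std` float methods for brevity and is not the generic
implementation added here; `roundeven_sketch` is a hypothetical name.

```rust
/// Round to the nearest integer, with ties going to the even neighbor.
fn roundeven_sketch(x: f64) -> f64 {
    // 2^52: at or above this magnitude every f64 is already an integer.
    const TWO_POW_52: f64 = 4503599627370496.0;
    // The negated comparison also passes NaN and infinities through.
    if !(x.abs() < TWO_POW_52) {
        return x;
    }
    // The fractional part is exact in this range, so ties are detectable.
    let is_tie = (x - x.trunc()).abs() == 0.5;
    let rounded = if is_tie {
        // Halving, rounding, then doubling lands on the even neighbor.
        2.0 * (x / 2.0).round()
    } else {
        x.round()
    };
    // Preserve the sign of zero, e.g. -0.5 rounds to -0.0.
    rounded.copysign(x)
}
```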
These don't have much content since they now use the generic
implementation. There will be more similar functions in the near future
(fminimum, fmaximum, fminimum_num, fmaximum_num); start the pattern of
combining similar functions now so we don't have to eventually maintain
similar docs across 24 different files.
Many routines have some form of handling for rounding mode and floating
point exceptions, which are implemented via a combination of stubs and
`force_eval!` use. This is suboptimal, however, because:
1. Rust does not interact with the floating point environment, so most
of this code does nothing.
2. The parts of the code that are not dead are not testable.
3. `force_eval!` blocks optimizations, which is unnecessary because we
do not rely on its side effects.
We cannot ensure correct rounding and exception handling in all cases
without some form of arithmetic operations that are aware of this
behavior. However, the cases where rounding mode is explicitly handled
or exceptions are explicitly raised are testable. Make this possible
here for functions that depend on `math::fenv` by moving the
implementation to a nonpublic function that takes a `Round` and returns
a `Status`.
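
The shape of this split can be sketched as follows. The `Round` and
`Status` types here are simplified, hypothetical stand-ins (the real
types are richer), and `rint_round` is only an illustration of a
nonpublic worker with a thin public wrapper.

```rust
/// Simplified stand-in for a rounding mode.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Round {
    Nearest,
    Negative,
    Positive,
    Zero,
}

/// Simplified stand-in for exception status.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Status {
    Ok,
    Inexact,
}

/// Nonpublic worker: rounds to an integer in the requested mode and
/// reports whether the result differs from the input.
fn rint_round(x: f64, round: Round) -> (f64, Status) {
    let res = match round {
        Round::Negative => x.floor(),
        Round::Positive => x.ceil(),
        Round::Zero => x.trunc(),
        Round::Nearest => {
            let lo = x.floor();
            let frac = x - lo;
            // Ties go to the even neighbor.
            if frac > 0.5 || (frac == 0.5 && lo % 2.0 != 0.0) {
                lo + 1.0
            } else {
                lo
            }
        }
    }
    .copysign(x);
    let status = if res == x || x.is_nan() { Status::Ok } else { Status::Inexact };
    (res, status)
}

/// Public wrapper: default rounding mode, status discarded.
pub fn rint_sketch(x: f64) -> f64 {
    rint_round(x, Round::Nearest).0
}
```

Tests can then call the worker directly and assert on both the value and
the reported status for each rounding mode.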
Link: https://github.com/rust-lang/libm/issues/480
This produces better assembly, e.g. on aarch64:
.globl libm::u128_wmul
.p2align 2
libm::u128_wmul:
Lfunc_begin124:
.cfi_startproc
mul x9, x2, x0
umulh x10, x2, x0
umulh x11, x3, x0
mul x12, x3, x0
umulh x13, x2, x1
mul x14, x2, x1
umulh x15, x3, x1
mul x16, x3, x1
adds x10, x10, x14
cinc x13, x13, hs
adds x13, x13, x16
cinc x14, x15, hs
adds x10, x10, x12
cinc x11, x11, hs
adds x11, x13, x11
stp x9, x10, [x8]
cinc x9, x14, hs
stp x11, x9, [x8, #16]
ret
The original was ~70 instructions so the improvement is significant.
With these changes, the result is reasonably close to what LLVM
generates using `u256` operands [1].
[1]: https://llvm.godbolt.org/z/re1aGdaqY
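
For context, a widening multiply returns the full 256-bit product of two
`u128` values as a low and a high half. The schoolbook sketch below works
on 64-bit halves; it is illustrative only and is not the code from this
change. The name `widen_mul_u128` is hypothetical.

```rust
/// Full 256-bit product of two `u128` values, returned as `(lo, hi)`.
fn widen_mul_u128(a: u128, b: u128) -> (u128, u128) {
    // Split each operand into 64-bit halves held in u128 so the
    // partial products cannot overflow.
    let (a_lo, a_hi) = (a as u64 as u128, a >> 64);
    let (b_lo, b_hi) = (b as u64 as u128, b >> 64);

    let ll = a_lo * b_lo; // contributes at bit 0
    let lh = a_lo * b_hi; // contributes at bit 64
    let hl = a_hi * b_lo; // contributes at bit 64
    let hh = a_hi * b_hi; // contributes at bit 128

    // Sum the two middle terms, remembering a carry out of 128 bits.
    let (mid, carry_mid) = lh.overflowing_add(hl);
    // The low half of `mid` lands in the upper 64 bits of `lo`.
    let (lo, carry_lo) = ll.overflowing_add(mid << 64);
    // The high half of `mid` and both carries land in `hi`.
    let hi = hh + (mid >> 64) + ((carry_mid as u128) << 64) + (carry_lo as u128);
    (lo, hi)
}
```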
For some reason, the upcoming limb changes in [1] seem to ignore the
black boxing when applied to the operator function. Changing to instead
black box the inputs appears to fix this.
[1]: https://github.com/rust-lang/libm/pull/503
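
The difference between the two approaches looks roughly like this, using
a hypothetical `mul_op` in place of the benchmarked operator:

```rust
use std::hint::black_box;

/// Hypothetical stand-in for the operator being benchmarked.
fn mul_op(a: u64, b: u64) -> u64 {
    a.wrapping_mul(b)
}

/// Benchmark body (sketch). The earlier form black boxed the function,
/// i.e. `black_box(mul_op)(a, b)`; this form black boxes the inputs so
/// the optimizer cannot fold the computation for known values.
fn bench_body_sketch(a: u64, b: u64) -> u64 {
    mul_op(black_box(a), black_box(b))
}
```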
With the correctly rounded implementation, we can reduce the ULP
requirement for `cbrt` to zero. There is still an override required for
`i586` because of the imprecise FMA.
We only round using nearest, but some incoming code has more handling of
rounding modes that would be nice to `match` on. Rather than checking
integer values, add an enum representation.
Usually `cargo binstall iai-callgrind-runner` handles apt dependencies.
However, the following has been happening:
Err:11 mirror+file:/etc/apt/apt-mirrors.txt noble-updates/main amd64 libc6-dbg amd64 2.39-0ubuntu8.3
404 Not Found [IP: 40.81.13.82 80]
E: Failed to fetch mirror+file:/etc/apt/apt-mirrors.txt/pool/main/g/glibc/libc6-dbg_2.39-0ubuntu8.3_amd64.deb 404 Not Found [IP: 40.81.13.82 80]
Fetched 19.8 MB in 6s (3138 kB/s)
E: Unable to fetch some archives, maybe run apt-get update or try with --fix-missing?
Installing the dependencies manually seems to resolve the issue.
Introduce a version of generic `fma` that works when there is a larger
hardware-backed float type available to compute the result with more
precision. This is currently used only for `f32`, but with some minor
adjustments it should work for `f16` as well.
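
The core idea can be sketched for `f32` using `f64` as the wider type.
This is a simplified illustration under stated assumptions, not the
implementation added here: it omits the correction needed when the `f64`
sum lands exactly halfway between two `f32` values (double rounding), so
it is not correctly rounded in every case. `fma_wide_sketch` is a
hypothetical name.

```rust
/// Fused multiply-add for `f32`, computed through `f64` (sketch).
fn fma_wide_sketch(x: f32, y: f32, z: f32) -> f32 {
    // The product of two 24-bit significands needs at most 48 bits, so
    // it is exact in f64 (53-bit significand).
    let xy = x as f64 * y as f64;
    // One rounding happens here, in f64 precision.
    let sum = xy + z as f64;
    // A second rounding happens on the way back to f32; a complete
    // implementation must detect and fix halfway cases at this point.
    sum as f32
}
```

For example, with `x = 1.0 + 1.0 / 4096.0`, the fused result of
`x * x - 1.0` keeps the low-order term that a separate `f32` multiply
followed by a subtraction would round away.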