Tale as old as the world - there's an ABI mismatch:
https://github.com/rust-lang/compiler-builtins/pull/527
Fortunately, newest GCCs (from v11, it seems) actually provide most of
those intrinsics (even for f64!), so that's pretty cool.
(the only intrinsics not provided by GCC are `__powisf2` & `__powidf2`,
but our codegen for AVR doesn't emit those anyway.)
Fixes https://github.com/rust-lang/rust/issues/118079.
Same story as always, i.e. ABI mismatch:
- https://github.com/rust-lang/compiler-builtins/pull/462
- https://github.com/rust-lang/compiler-builtins/pull/466
- https://github.com/rust-lang/compiler-builtins/pull/513
I've made sure the changes work by rendering a Mandelbrot fractal:
```rust
#[arduino_hal::entry]
fn main() -> ! {
let dp = arduino_hal::Peripherals::take().unwrap();
let pins = arduino_hal::pins!(dp);
let mut serial = arduino_hal::default_serial!(dp, pins, 57600);
mandelbrot(&mut serial, 60, 40, -2.05, -1.12, 0.47, 1.12, 100);
loop {
//
}
}
fn mandelbrot<T>(
output: &mut T,
viewport_width: i64,
viewport_height: i64,
x1: f32,
y1: f32,
x2: f32,
y2: f32,
max_iterations: i64,
) where
T: uWrite,
{
for viewport_y in 0..viewport_height {
let y0 = y1 + (y2 - y1) * ((viewport_y as f32) / (viewport_height as f32));
for viewport_x in 0..viewport_width {
let x0 = x1 + (x2 - x1) * ((viewport_x as f32) / (viewport_width as f32));
let mut x = 0.0;
let mut y = 0.0;
let mut iterations = max_iterations;
while x * x + y * y <= 4.0 && iterations > 0 {
let xtemp = x * x - y * y + x0;
y = 2.0 * x * y + y0;
x = xtemp;
iterations -= 1;
}
let ch = "#%=-:,. "
.chars()
.nth((8.0 * ((iterations as f32) / (max_iterations as f32))) as _)
.unwrap();
_ = ufmt::uwrite!(output, "{}", ch);
}
_ = ufmt::uwriteln!(output, "");
}
}
```
... where without avr_skips, the code printed an image full of only `#`.
Note that because libgcc doesn't provide implementations for f64, using
those (e.g. swapping f32 to f64 in the code above) will cause linking to
fail:
```
undefined reference to `__divdf3'
undefined reference to `__muldf3'
undefined reference to `__gedf2'
undefined reference to `__fixunsdfsi'
undefined reference to `__gtdf2'
```
Ideally compiler-builtins could jump right in and provide those, but f64
also require a special calling convention which hasn't been yet exposed
through LLVM.
Note that because using 64-bit floats on an 8-bit target is a pretty
niche thing to do, and because f64 floats don't work correctly anyway at
the moment (due to this ABI mismatch), we're not actually breaking
anything by skipping those functions, since any code that currently uses
f64 on AVR works by accident.
Closes https://github.com/rust-lang/rust/issues/108489.
The conversion functions from i128/u128 to f32/f64 have the
`unadjusted_on_win64` attribute, but it is disabled starting with
LLVM14. This seems to be the correct thing to do for Win64, but for some
reason x86_64-unknown-uefi is different, despite generally using the
same ABI as Win64.
As of https://reviews.llvm.org/D110413, these no longer use the
unadjusted ABI (and use normal C ABI instead, passing i128
indirectly and returning it as <2 x i64>).
To support both LLVM 14 and older versions, rustc will expose a
"llvm14-builtins-abi" target feature, based on which
compiler-builtins can chose the appropriate ABI.
This is needed for rust-lang/rust#93577.
This adds the truncdfsf2 intrinsic and a corresponding fuzz test case. The
implementation of trunc is generic to make it easy to add truncdfhs2 and
truncsfhf2 if rust ever gets `f16` support.
This commit tweaks the implementation of the synthetic
`#[use_c_shim_if]` attribute, renaming it to
`#[maybe_use_optimized_c_shim]` in the process. This no longer requires
specifying a `#[cfg]` clause indicating when the optimized intrinsic
should be used, but rather this is inferred and printed from the build
script.
The build script will now print out appropriate `#[cfg]` directives for
rustc to indicate what intrinsics it's compiling. This should remove the
need for us to keep the build script and the source in sync, but rather
the build script can simply take care of everything.
Adds generic conversion from a wider to a narrower IEEE-754
floating-point type.
Implement `__truncdfsf2` and `__truncdfsf2vfp` and associated test-cases.
This was an accidental regression introduced in #252 by removing compilation of
C files without adjusting the `#[use_c_shim_if]` directives. This restores the
compilation of the assembly files and updates the `#[use_c_shim_if]` directives.
Now that `73884ae` is in some nightly release We can add ledf2vfp/leds2vfp
and so these two functions be aliased to aeabi_fcmple/aeabi_dcmple on soft-float targets.
Be sure to do not mix hard-float and soft-float calls.
Disassembled code show exactly this.
So replace add with an explicit call to __addsf3/__adddf3
This seems the root cause of some sporadic failures.
I was able to trigger an issue extending f32::MAX or f32::MIN to a f64.
Issue was not triggered by `testcrate` mainly because f32::MAX/MIN are
not in the list of special values to generate.
This PR fix the issue and improve `testcrate` adding MAX/MIN/MIN_POSITIVE
in the list of special values.
Add `extend` module to implement conversion from a narrower to a wider
floating-point type.
This implementation is only intended to support *widening* operations.
Module to convert a *narrower* floating-point will be added in the future.
Here using `"C"` the compiler will use `"aapcs"` or `"aapcs-vfp"`
depending on target configuration.
Of course this translates in a call to `__aeabi_fdiv` / `__aeabi_fmul`
on non-HF targets.
On `eabi` targets with +vfpv2/vfpv3 LLVM generate:
vmov s0, r1
vmov s2, r0
vdiv.f32 s0, s2, s0
vmov r0, s0
bx lr
On `eabihf` targets with +vfpv3-d16/d32/f32 +fp-only-sp LLVM generate:
vdiv.f32 s0, s0, s1
bx lr
That's exactly what We need for [div/mul][s/d]f3vfp.S
* I believe `__gtdf2` erroneously used `f32` instead of `f64`
* Most of these needed `#[arm_aeabi_alias]` to ensure they're correctly called
through the alias
* Some existing aliases were corrected with the right names
Note that this changes semantics:
pub extern "C" fn __eqsf2(a: f32, b: f32) -> bool {
cmp(a, b).to_le_abi() != 0
}
is not the same as
pub extern "C" fn __eqsf2(a: f32, b: f32) -> i32 {
cmp(a, b).to_le_abi()
}
However, compiler-rt does the latter, so this is actually
an improvement.