Commit Graph

103 Commits

Author SHA1 Message Date
Josh Stone
ed9e6f2ad8 Enable inline stack probes on X86 with LLVM 16 2022-09-29 19:49:23 -07:00
Matthias Krüger
a5c16a5381 Rollup merge of #100556 - Alex-Velez:patch-1, r=scottmcm
Clamp Function for f32 and f64

I thought the clamp function could use a little improvement for readability. The function now returns early in order to skip the extra bound checks.

If there was a reason for binding `self` to `x` or if this code is incorrect, please correct me :)
2022-08-21 16:54:01 +02:00
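
For context, a minimal sketch of the early-return shape the PR describes; illustrative only, not the exact code in library/core/src/num/f32.rs:

    // Free function standing in for f32::clamp (sketch).
    pub fn clamp(x: f32, min: f32, max: f32) -> f32 {
        assert!(min <= max, "min > max, or either was NaN");
        if x < min {
            return min; // early return skips the max-bound check
        }
        if x > max {
            return max;
        }
        x
    }
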
scottmcm
03146471b5 Allow other directives before the ret 2022-08-20 21:08:56 +00:00
Alex
302689fa7b Update src/test/assembly/x86_64-floating-point-clamp.rs
Co-authored-by: scottmcm <scottmcm@users.noreply.github.com>
2022-08-16 20:39:22 -05:00
Alex
0ff8f0b578 Update src/test/assembly/x86_64-floating-point-clamp.rs
Simple Clamp Function

I thought this was more robust and easier to read. I also allowed this function to return early in order to skip the extra bound check (I'm sure the difference is negligible). I'm not sure if there was a reason for binding `self` to `x`; if so, please correct me.

Simple Clamp Function for f64

Floating point clamp test

f32 clamp using mut self

f64 clamp using mut self

Update library/core/src/num/f32.rs

Update f64.rs

Update x86_64-floating-point-clamp.rs

Update src/test/assembly/x86_64-floating-point-clamp.rs

Update x86_64-floating-point-clamp.rs

Co-Authored-By: scottmcm <scottmcm@users.noreply.github.com>
2022-08-16 19:45:44 -04:00
Josh Stone
2970ad8aee Update the minimum external LLVM to 13 2022-08-14 13:46:51 -07:00
Tim Neumann
efa9586427 RISC-V ASM test: relax label name constraint. 2022-08-02 10:10:11 +02:00
Krasimir Georgiev
dcd02ab683 adapt assembly/static-relocation-model test for LLVM change
After f0dd12ec5c, LLVM emits `movzbl` instead. Adapted this test case accordingly.

Discovered in our experimental rust + llvm at head ci:
https://buildkite.com/llvm-project/rust-llvm-integrate-prototype/builds/12104#0182195b-8791-4f88-853c-bb23a1e4b54c
2022-07-20 12:56:42 +00:00
Dylan DPC
a027b01f33 Rollup merge of #98998 - workingjubilee:naked-means-no-clothes-enforcement-technology, r=Amanieu
Remove branch target prologues from `#[naked] fn`

This patch hacks around rust-lang/rust#98768 for now by injecting appropriate attributes into the LLVM IR we emit for naked functions. I intend to pursue this upstream so that these attributes can be removed in general, but it's slow going wading through C++ for me.
2022-07-18 21:14:43 +05:30
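
For reference, a hedged sketch of the kind of function affected (x86_64; `naked_functions` was still feature-gated at the time):

    #![feature(naked_functions)]
    use std::arch::asm;

    // A naked function's body must be a single asm! block; the point of the
    // patch is that codegen must not inject BTI/CET prologue instructions here.
    #[naked]
    pub unsafe extern "C" fn answer() -> u64 {
        asm!("mov rax, 42", "ret", options(noreturn));
    }
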
bors
db41351753 Auto merge of #98866 - nagisa:nagisa/align-offset-wroom, r=Mark-Simulacrum
Add a special case for align_offset w/ stride != 1

This generalizes the previous `stride == 1` special case to apply to any
situation where the requested alignment is divisible by the stride. This
in turn allows the test case from #98809 to produce ideal assembly, along
the lines of:

    leaq 15(%rdi), %rax
    andq $-16, %rax

This also produces pretty high quality code for situations where the
alignment of the input pointer isn’t known:

    pub unsafe fn ptr_u32(slice: *const u32) -> *const u32 {
        slice.offset(slice.align_offset(16) as isize)
    }

    // =>

    movl %edi, %eax
    andl $3, %eax
    leaq 15(%rdi), %rcx
    andq $-16, %rcx
    subq %rdi, %rcx
    shrq $2, %rcx
    negq %rax
    sbbq %rax, %rax
    orq  %rcx, %rax
    leaq (%rdi,%rax,4), %rax

Here LLVM is smart enough to replace the `usize::MAX` special case with
a branch-less bitwise-OR approach, where the mask is constructed using
the neg and sbb instructions. This appears to work across various
architectures I’ve tried.

This change ends up introducing more branches and code in situations
where there is less knowledge of the arguments, for example when the
requested alignment is entirely unknown. This use-case was never really
a focus of this function, so I’m not particularly worried, especially
since llvm-mca is saying that the new code is still appreciably faster,
despite all the new branching.

Fixes #98809.
Sadly, this does not help with #72356.
2022-07-16 23:28:28 +00:00
Simonas Kazlauskas
62a182cf7f Add a special case for align_offset w/ stride != 1
This generalizes the previous `stride == 1` special case to apply to any
situation where the requested alignment is divisible by the stride. This
in turn allows the test case from #98809 to produce ideal assembly, along
the lines of:

    leaq 15(%rdi), %rax
    andq $-16, %rax

This also produces pretty high quality code for situations where the
alignment of the input pointer isn’t known:

    pub unsafe fn ptr_u32(slice: *const u32) -> *const u32 {
        slice.offset(slice.align_offset(16) as isize)
    }

    // =>

    movl %edi, %eax
    andl $3, %eax
    leaq 15(%rdi), %rcx
    andq $-16, %rcx
    subq %rdi, %rcx
    shrq $2, %rcx
    negq %rax
    sbbq %rax, %rax
    orq  %rcx, %rax
    leaq (%rdi,%rax,4), %rax

Here LLVM is smart enough to replace the `usize::MAX` special case with
a branch-less bitwise-OR approach, where the mask is constructed using
the neg and sbb instructions. This appears to work across various
architectures I’ve tried.

This change ends up introducing more branches and code in situations
where there is less knowledge of the arguments, for example when the
requested alignment is entirely unknown. This use-case was never really
a focus of this function, so I’m not particularly worried, especially
since llvm-mca is saying that the new code is still appreciably faster,
despite all the new branching.

Fixes #98809.
Sadly, this does not help with #72356.
2022-07-17 01:27:37 +03:00
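
For intuition, a hedged Rust sketch of the branch-less fallback described above, specialized to `stride == 4` and `align == 16`; names are illustrative, not the real implementation:

    fn align_offset_u32_to_16(addr: usize) -> usize {
        let misalign = addr & 3; // addr % size_of::<u32>()
        let byte_off = (addr.wrapping_add(15) & !15).wrapping_sub(addr);
        let elem_off = byte_off >> 2; // bytes -> u32 elements
        // neg/sbb trick: all-ones mask iff the pointer is misaligned for u32,
        // forcing the result to usize::MAX, as in the assembly above.
        let mask = 0usize.wrapping_sub((misalign != 0) as usize);
        elem_off | mask
    }
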
Patrick Walton
1e0ad0c1d4 Implement support for DWARF version 5.
DWARF version 5 brings a number of improvements over version 4. Quoting from
the announcement [1]:

> Version 5 incorporates improvements in many areas: better data compression,
> separation of debugging data from executable files, improved description of
> macros and source files, faster searching for symbols, improved debugging
> optimized code, as well as numerous improvements in functionality and
> performance.

On platforms where DWARF version 5 is supported (Linux, primarily), this commit
adds support for it behind a new `-Z dwarf-version=5` flag.

[1]: https://dwarfstd.org/Public_Review.php
2022-07-08 11:31:08 -07:00
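
An illustrative invocation (nightly toolchain, on a platform where DWARF version 5 is supported):

    rustc -g -Z dwarf-version=5 main.rs
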
Jubilee Young
530b5da49b Also stop emitting BTI prologues for naked functions
Same idea but for AArch64.
2022-07-06 22:44:58 -07:00
Jubilee Young
92174f988b Stop emitting CET prologues for naked functions
We can apply nocf_check as a hack for now.
2022-07-06 22:44:54 -07:00
Augie Fackler
d92e213e3d hexagon: adapt test for upstream output changes
The output of IR formatting changed slightly in upstream rev
a0bc67e555f404d0e7ddb2e78cb891d96eaf913d
(https://reviews.llvm.org/D123096). I'm not actually sure what any of
that means, as I don't even know what hexagon is in this context, but
this change allows the test to pass on both old and new LLVMs.

r? @nikic
2022-06-07 13:21:34 -04:00
Nikita Popov
ebc8ab1e4e Fix stack protector basic test
This is a >= condition, so we need a maximum size of 7 to not
create a stack protector in basic mode.

The reason this still worked is that the alloca type was converted
into an integer (rather than an array). The way these heuristics
are implemented in LLVM is rather questionable and not resilient
to optimization.
2022-05-25 17:29:37 +02:00
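
A hedged sketch of the boundary: LLVM's basic heuristic guards stack arrays of 8 or more bytes (a `>=` check against the default 8-byte threshold), so a test that must stay unprotected keeps its buffer at 7 bytes:

    // Illustrative: 7 bytes stays below the 8-byte basic-mode threshold,
    // so no canary should be emitted for this function.
    pub fn below_threshold() -> u8 {
        let buf = [0u8; 7];
        buf[0]
    }
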
Tomasz Miąsko
e62663492b Collect function instance used in global_asm! sym operand
The constants used in SymFn operands have FnDef type,
so the type of the constant identifies the function.
2022-05-03 07:12:43 +02:00
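
A hedged sketch of the construct involved (x86_64; `sym` operands required `#![feature(asm_sym)]` until Rust 1.66):

    use std::arch::global_asm;

    extern "C" fn callee() {}

    // The sym operand's constant has FnDef type, which lets the collector
    // identify `callee` as used so its code is not dropped.
    global_asm!(
        ".global trampoline",
        "trampoline:",
        "jmp {target}",
        target = sym callee,
    );
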
Guillaume Gomez
fe49981ea0 Rollup merge of #94703 - kjetilkjeka:nvptx-kernel-args-abi2, r=nagisa
Fix codegen bug in "ptx-kernel" abi related to arg passing

I found a codegen bug in the nvptx abi where args are passed as ptrs ([see comment](https://github.com/rust-lang/rust/issues/38788#issuecomment-1048999928)), which is not as specified in the [ptx-interoperability doc](https://docs.nvidia.com/cuda/ptx-writers-guide-to-interoperability/) nor how C/C++ does it. It will also almost always fail in practice, since device and host use different memory spaces on most hardware.

This PR fixes the bug and adds tests for passing structs to ptx kernels.

I observed that all nvptx assembly tests had been marked as [ignore a long time ago](https://github.com/rust-lang/rust/pull/59752#issuecomment-501713428). I'm not sure if the new one should be marked as ignore; it passed on my computer, but it might fail if ptx-linker is missing on the server. I guess this is outside the scope of this PR and should be looked at in a different issue/PR.

I only fixed the nvptx64-nvidia-cuda target and not the potential code paths for the non-existent 32-bit target. Even though 32-bit nvptx is not a supported target, there is still some code under the hood supporting codegen for 32-bit ptx. I was advised to create an MCP to find out whether this code should be removed or updated.

Perhaps ``@RDambrosio016`` would have interest in taking a quick look at this.
2022-04-26 13:22:27 +02:00
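
A hedged sketch of the case the PR tests: a struct passed by value to a ptx-kernel entry point (nvptx64-nvidia-cuda target, unstable `abi_ptx` feature):

    #![feature(abi_ptx)]
    #![no_std]

    #[repr(C)]
    pub struct Pair {
        a: u32,
        b: u32,
    }

    // With the fix, `p` is passed per the PTX interop convention rather
    // than as a pointer into the wrong memory space.
    #[no_mangle]
    pub unsafe extern "ptx-kernel" fn sum_kernel(p: Pair, out: *mut u32) {
        *out = p.a + p.b;
    }

    #[panic_handler]
    fn panic(_: &core::panic::PanicInfo) -> ! {
        loop {}
    }
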
Kjetil Kjeka
5bf5acc324 Add test for asserting correct generation of ptx-kernel args 2022-04-25 16:35:19 +02:00
Amanieu d'Antras
bdba89733e Update tests for sym support in global_asm! 2022-04-16 06:11:51 +02:00
Scott McMurray
54408f0963 short-circuit the easy cases in is_copy_modulo_regions
This change is somewhat extensive: since this is called to determine Copy vs Move, it affects MIR, so any test that's `no_core` needs to actually have the normal `impl`s it uses.
2022-03-10 01:19:02 -08:00
William D. Jones
19809ed76d Add preliminary support for inline assembly for msp430. 2022-01-22 23:42:46 -05:00
bors
d331cb710f Auto merge of #88354 - Jmc18134:hint-space-pauth-opt, r=nagisa
Add codegen option for branch protection and pointer authentication on AArch64

The branch-protection codegen option enables the use of hint-space pointer
authentication code for AArch64 targets.
2021-12-29 22:35:11 +00:00
Amanieu d'Antras
1c48025685 Address review feedback 2021-12-12 11:26:59 +00:00
Amanieu d'Antras
44a3a66ee8 Stabilize asm! and global_asm!
They are also removed from the prelude as per the decision in
https://github.com/rust-lang/rust/issues/87228.

stdarch and compiler-builtins are updated to work with the new, stable
asm! and global_asm! macros.
2021-12-12 11:20:03 +00:00
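
A minimal example of the now-stable macro (x86_64):

    use std::arch::asm;

    fn add_one(mut x: u64) -> u64 {
        // inout: the value enters and leaves through the same register.
        unsafe { asm!("add {0}, 1", inout(reg) x) };
        x
    }
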
Amanieu d'Antras
908f300dd7 Remove the reg_thumb register class for asm! on ARM
Also restricts r8-r14 from being used on Thumb1 targets as per #90736.
2021-12-07 23:54:09 +00:00
Andrew Dona-Couch
c6e8ae1a6c Implement inline asm! for AVR platform 2021-12-06 01:02:49 -05:00
bors
a2b7b7891e Auto merge of #91003 - psumbera:sparc64-abi, r=nagisa
fix sparc64 ABI for aggregates with floating point members

Fixes #86163
2021-12-02 02:59:44 +00:00
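
A hedged sketch of the kind of aggregate the fix concerns; per the sparc64 ABI, its floating-point members must travel in FP registers when passed by value:

    #[repr(C)]
    pub struct FloatPair {
        a: f32,
        b: f64,
    }

    extern "C" {
        fn consume(v: FloatPair); // hypothetical C callee
    }
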
Jamie Cunliffe
984ca4689d Review comments
- Changed the separator from '+' to ','.
- Moved the branch protection options from -C to -Z.
- Additional test for incorrect branch-protection option.
- Remove LLVM < 12 code.
- Style fixes.

Co-authored-by: James McGregor <james.mcgregor2@arm.com>
2021-12-01 15:56:59 +00:00
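
An illustrative invocation reflecting the review changes (flag under `-Z`, comma separator):

    rustc --target aarch64-unknown-linux-gnu -Z branch-protection=bti,pac-ret main.rs
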
James McGregor
837cc1687f Add codegen option for branch protection and pointer authentication on AArch64
The branch-protection codegen option enables the use of hint-space pointer
authentication code for AArch64 targets.
2021-12-01 12:24:30 +00:00
Petr Sumbera
128ceec92d fix sparc64 ABI for aggregates with floating point members 2021-12-01 10:03:45 +01:00
Benjamin A. Bjørnseth
bb9dee95ed add rustc option for using LLVM stack smash protection
LLVM has built-in heuristics for adding stack canaries to functions. These
heuristics can be selected with LLVM function attributes. This patch adds a
rustc option `-Z stack-protector={none,basic,strong,all}` which controls the use
of these attributes. This gives rustc the same stack smash protection support as
clang offers through options `-fno-stack-protector`, `-fstack-protector`,
`-fstack-protector-strong`, and `-fstack-protector-all`. The protection this can
offer is demonstrated in test/ui/abi/stack-protector.rs. This fills a gap in the
current list of rustc exploit
mitigations (https://doc.rust-lang.org/rustc/exploit-mitigations.html),
originally discussed in #15179.

Stack smash protection adds runtime overhead and is therefore still off by
default, but now users have the option to trade performance for security as they
see fit. An example use case is adding Rust code in an existing C/C++ code base
compiled with stack smash protection. Without the ability to add stack smash
protection to the Rust code, the code base artifacts could be exploitable in
ways not possible if the code base remained pure C/C++.

Stack smash protection support is present in LLVM for almost all the current
tier 1/tier 2 targets: see
test/assembly/stack-protector/stack-protector-target-support.rs. The one
exception is nvptx64-nvidia-cuda. This patch follows clang's example, and adds a
warning message printed if stack smash protection is used with this target (see
test/ui/stack-protector/warn-stack-protector-unsupported.rs). Support for tier 3
targets has not been checked.

Since the heuristics are applied at the LLVM level, the heuristics are expected
to add stack smash protection to a fraction of functions comparable to C/C++.
Some experiments demonstrating how Rust code is affected by the different
heuristics can be found in
test/assembly/stack-protector/stack-protector-heuristics-effect.rs. There is
potential for better heuristics using Rust-specific safety information. For
example it might be reasonable to skip stack smash protection in functions which
transitively only use safe Rust code, or which use only a subset of functions
the user declares safe (such as anything under `std.*`). Such alternative
heuristics could be added at a later point.

LLVM also offers a "safestack" sanitizer as an alternative way to guard against
stack smashing (see #26612). This could possibly also be included as a
stack-protection heuristic. An alternative is to add it as a sanitizer (#39699).
This is what clang does: safestack is exposed with option
`-fsanitize=safe-stack`.

The options are only supported by the LLVM backend, but as with other codegen
options it is visible in the main codegen option help menu. The heuristic names
"basic", "strong", and "all" are hopefully sufficiently generic to be usable in
other backends as well.

Reviewed-by: Nikita Popov <nikic@php.net>

Extra commits during review:

- [address-review] make the stack-protector option unstable

- [address-review] reduce detail level of stack-protector option help text

- [address-review] correct grammar in comment

- [address-review] use compiler flag to avoid merging functions in test

- [address-review] specify min LLVM version in fortanix stack-protector test

  Only for Fortanix test, since this target specifically requests the
  `--x86-experimental-lvi-inline-asm-hardening` flag.

- [address-review] specify required LLVM components in stack-protector tests

- move stack protector option enum closer to other similar option enums

- rustc_interface/tests: sort debug option list in tracking hash test

- add an explicit `none` stack-protector option

Revert "set LLVM requirements for all stack protector support test revisions"

This reverts commit a49b74f92a4e7d701d6f6cf63d207a8aff2e0f68.
2021-11-22 20:06:22 +01:00
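
An illustrative use (nightly): the kind of function the `strong` heuristic typically instruments, compiled with the new flag:

    // Compile with: rustc -Z stack-protector=strong example.rs
    // A local byte array like `buf` is what "strong" typically guards.
    pub fn checksum(data: &[u8]) -> u8 {
        let mut buf = [0u8; 64];
        for (dst, src) in buf.iter_mut().zip(data) {
            *dst = *src;
        }
        buf.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))
    }
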
Amanieu d'Antras
eb32c00216 Add features gates for experimental asm features 2021-11-07 01:23:53 +00:00
Josh Stone
e9f545b9a9 Update the minimum external LLVM to 12 2021-10-22 10:50:07 -07:00
Josh Stone
65150af1b4 Update the minimum external LLVM to 11 2021-10-22 09:22:18 -07:00
Hans Kratz
2a1fbb86eb test fix: aarch64 atomics are only outlined on Linux. 2021-10-15 06:19:08 +02:00
Alessandro Decina
8683d36042 Fix min LLVM version for bpf-types test
Closes #89689
2021-10-09 19:18:37 +11:00
Jubilee
4e9cf04c98 Rollup merge of #83655 - sebpop:arm64-outline-atomics, r=workingjubilee
[aarch64] add target feature outline-atomics

Enable outline-atomics by default, as clang does since the following commit:
https://reviews.llvm.org/rGc5e7e649d537067dec7111f3de1430d0fc8a4d11

Performance improves by several orders of magnitude when using the LSE instructions
instead of the ARMv8.0-compatible load/store exclusive instructions.

Tested on Graviton2 aarch64-linux with
x.py build && x.py install && x.py test
2021-10-04 13:58:06 -07:00
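
For context, a hedged example of an operation this accelerates; with outline-atomics the compiler calls a helper that selects LSE instructions at runtime when the CPU supports them:

    use std::sync::atomic::{AtomicU64, Ordering};

    pub fn bump(counter: &AtomicU64) -> u64 {
        // On aarch64 this can lower to a single LSE atomic (e.g. ldaddal)
        // instead of an ldxr/stxr retry loop, hence the speedup.
        counter.fetch_add(1, Ordering::SeqCst)
    }
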
Manish Goregaokar
6f1e930581 Rollup merge of #88820 - hlopko:add_pie_relocation_model, r=petrochenkov
Add `pie` as another `relocation-model` value

MCP: https://github.com/rust-lang/compiler-team/issues/461
2021-10-01 09:18:16 -07:00
Marcel Hlopko
198d90786b Add pie as another relocation-model value 2021-10-01 08:06:42 +02:00
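
An illustrative invocation:

    rustc -C relocation-model=pie main.rs
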
Sebastian Pop
0f9f241aac [aarch64] add target feature outline-atomics
Enable outline-atomics by default, as clang does since the following commit:
https://reviews.llvm.org/rGc5e7e649d537067dec7111f3de1430d0fc8a4d11

Performance improves by several orders of magnitude when using the LSE instructions
instead of the ARMv8.0-compatible load/store exclusive instructions.

Tested on Graviton2 aarch64-linux with
x.py build && x.py install && x.py test
2021-09-30 23:34:33 +00:00
Augie Fackler
4185b76dc3 rustc_codegen_llvm: make sse4.2 imply crc32 for LLVM 14
This fixes compiling things like the `snap` crate after
https://reviews.llvm.org/D105462. I added a test that verifies the
additional attribute gets specified, and confirmed that I can build
cargo with both LLVM 13 and 14 with this change applied.
2021-09-20 11:31:55 -04:00
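
For context, a hedged example of affected code: this intrinsic is gated on `sse4.2` on the Rust side, but LLVM 14 additionally requires the `crc32` attribute, hence the implication:

    #[cfg(target_arch = "x86_64")]
    #[target_feature(enable = "sse4.2")]
    unsafe fn crc32_step(acc: u32, value: u32) -> u32 {
        // Without sse4.2 implying crc32 at the LLVM level, instruction
        // selection for this fails under LLVM 14.
        core::arch::x86_64::_mm_crc32_u32(acc, value)
    }
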
Andreas Liljeqvist
4d66fbc4b9 enum niche allocation grows toward zero if possible 2021-09-13 21:55:14 +02:00
Mara Bos
494c563f3b Rollup merge of #88350 - programmerjake:add-ppc-cr-xer-clobbers, r=Amanieu
add support for clobbering xer, cr, and cr[0-7] for asm! on OpenPower/PowerPC

Fixes #88315
2021-09-01 09:23:26 +02:00
Jacob Lifshay
5802f60355 add support for clobbering xer, cr, and cr[0-7] for asm! on OpenPower/PowerPC
Fixes #88315
2021-08-25 22:08:27 -07:00
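
A hedged sketch of what this enables: declaring condition-register and XER clobbers in inline asm on PowerPC (still unstable for this architecture):

    #![feature(asm_experimental_arch)] // PowerPC inline asm is not yet stable
    use std::arch::asm;

    #[cfg(target_arch = "powerpc64")]
    pub fn clobber_flags() {
        // Empty asm body; the point is telling the compiler that cr0 and
        // xer may be overwritten by the inline assembly.
        unsafe { asm!("", out("cr0") _, out("xer") _) };
    }
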
linux1
4a9ba65ca9 Feat: added explicit register tests; added prefix to check_reg asm string 2021-08-24 12:41:49 -04:00
linux1
96381d390d Fix: added necessary prefix 2021-08-23 21:53:23 -04:00
linux1
0c9e23c7ce Fix: appeased x.py test tidy --bless 2021-08-22 17:55:03 -04:00
linux1
eeb0b52bf8 Feat: further testing & support for i64 general register use 2021-08-22 17:55:03 -04:00
linux1
66e95b17ec Fix: moved #[no_mangle] 2021-08-22 17:55:03 -04:00