target-feature: enable rust target features implied by target-cpu

Normally LLVM and rustc agree about what features are implied by
target-cpu, but for NVPTX, LLVM considers sm_* and ptx* features to be
exclusive, which makes sense for codegen purposes. But in Rust, we want
to think of them as:

  sm_{sver} means that the target supports the hardware features of sver

  ptx{pver} means the driver supports PTX ISA pver

Intrinsics usually require a minimum sm_{sver} and ptx{pver}.

Prior to this commit, -Ctarget-cpu=sm_70 would activate only sm_70 and
ptx60 (the minimum PTX version that supports sm_70, which maximizes
driver compatibility). With this commit, it also activates all the
implied target features (sm_20, ..., sm_62; ptx32, ..., ptx50).
This commit is contained in:
Jed Brown
2025-05-21 22:08:51 -06:00
parent 6dbac3f09e
commit 35a485ddd8
3 changed files with 44 additions and 8 deletions

View File

@@ -333,15 +333,12 @@ pub(crate) fn to_llvm_features<'a>(sess: &Session, s: &'a str) -> Option<LLVMFea
///
/// We do not have to worry about RUSTC_SPECIFIC_FEATURES here, those are handled outside codegen.
pub(crate) fn target_config(sess: &Session) -> TargetConfig {
// Add base features for the target.
// We do *not* add the -Ctarget-features there, and instead duplicate the logic for that below.
// The reason is that if LLVM considers a feature implied but we do not, we don't want that to
// show up in `cfg`. That way, `cfg` is entirely under our control -- except for the handling of
// the target CPU, that is still expanded to target features (with all their implied features)
// by LLVM.
let target_machine = create_informational_target_machine(sess, true);
let (unstable_target_features, target_features) = cfg_target_feature(sess, |feature| {
// This closure determines whether the target CPU has the feature according to LLVM. We do
// *not* consider the `-Ctarget-feature`s here, as that will be handled later in
// `cfg_target_feature`.
if let Some(feat) = to_llvm_features(sess, feature) {
// All the LLVM features this expands to must be enabled.
for llvm_feature in feat {