-
Notifications
You must be signed in to change notification settings - Fork 13.4k
add nvptx_target_feature #138689
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
add nvptx_target_feature #138689
Conversation
r? @wesleywiser rustbot has assigned @wesleywiser. Use |
This comment has been minimized.
This comment has been minimized.
It looks like the failing build environment is using LLVM-18 (sm_100 onward were added recently llvm/llvm-project#124155). I could remove those entirely (such hardware isn't available yet) or guard them based on LLVM version (how?). |
IIUC LLVM exposes these as target CPUs ( |
I think it'd be good to have more testing (could be punted into an issue and done later):
I think it'd also be good to have more docs:
|
You can address this issue by modifying rust/compiler/rustc_codegen_llvm/src/llvm_util.rs Lines 280 to 285 in c4b38a5
LLVM actually supports Outputs of both --print target-cpus and --print target-features also show them$ rustc --target nvptx64-nvidia-cuda --print target-cpus
Available CPUs for this target:
sm_100
sm_100a
sm_101
sm_101a
sm_120
sm_120a
sm_20
sm_21
sm_30 - This is the default target CPU for the current build target (currently nvptx64-nvidia-cuda).
sm_32
sm_35
sm_37
sm_50
sm_52
sm_53
sm_60
sm_61
sm_62
sm_70
sm_72
sm_75
sm_80
sm_86
sm_87
sm_89
sm_90
sm_90a
$ rustc --target nvptx64-nvidia-cuda --print target-features
Features supported by rustc for this target:
crt-static - Enables C Run-time Libraries to be statically linked.
Code-generation features supported by LLVM for this target:
ptx32 - Use PTX version 32.
ptx40 - Use PTX version 40.
ptx41 - Use PTX version 41.
ptx42 - Use PTX version 42.
ptx43 - Use PTX version 43.
ptx50 - Use PTX version 50.
ptx60 - Use PTX version 60.
ptx61 - Use PTX version 61.
ptx62 - Use PTX version 62.
ptx63 - Use PTX version 63.
ptx64 - Use PTX version 64.
ptx65 - Use PTX version 65.
ptx70 - Use PTX version 70.
ptx71 - Use PTX version 71.
ptx72 - Use PTX version 72.
ptx73 - Use PTX version 73.
ptx74 - Use PTX version 74.
ptx75 - Use PTX version 75.
ptx76 - Use PTX version 76.
ptx77 - Use PTX version 77.
ptx78 - Use PTX version 78.
ptx80 - Use PTX version 80.
ptx81 - Use PTX version 81.
ptx82 - Use PTX version 82.
ptx83 - Use PTX version 83.
ptx84 - Use PTX version 84.
ptx85 - Use PTX version 85.
ptx86 - Use PTX version 86.
ptx87 - Use PTX version 87.
sm_100 - Target SM 100.
sm_100a - Target SM 100a.
sm_101 - Target SM 101.
sm_101a - Target SM 101a.
sm_120 - Target SM 120.
sm_120a - Target SM 120a.
sm_20 - Target SM 20.
sm_21 - Target SM 21.
sm_30 - Target SM 30.
sm_32 - Target SM 32.
sm_35 - Target SM 35.
sm_37 - Target SM 37.
sm_50 - Target SM 50.
sm_52 - Target SM 52.
sm_53 - Target SM 53.
sm_60 - Target SM 60.
sm_61 - Target SM 61.
sm_62 - Target SM 62.
sm_70 - Target SM 70.
sm_72 - Target SM 72.
sm_75 - Target SM 75.
sm_80 - Target SM 80.
sm_86 - Target SM 86.
sm_87 - Target SM 87.
sm_89 - Target SM 89.
sm_90 - Target SM 90.
sm_90a - Target SM 90a.
Use +feature to enable a feature, or -feature to disable it.
For example, rustc -C target-cpu=mycpu -C target-feature=+feature1,-feature2
Code-generation features cannot be used in cfg or #[target_feature],
and may be renamed or removed in a future version of LLVM or rustc. |
Btw, is it intentional that the |
Thanks @taiki-e. I'll update accordingly. I'm seeing PTX 7.8 being written in an otherwise-default configuration with target |
b593748
to
00c6bb7
Compare
I updated to add the One remaining issue (for the tests @gonzalobg requested) is that |
This comment has been minimized.
This comment has been minimized.
☔ The latest upstream changes (presumably #139229) made this pull request unmergeable. Please resolve the merge conflicts. |
00c6bb7
to
f56ad5d
Compare
Some changes occurred in src/doc/rustc/src/platform-support cc @Noratrieb |
This comment has been minimized.
This comment has been minimized.
f56ad5d
to
150e5a7
Compare
This comment has been minimized.
This comment has been minimized.
150e5a7
to
3921290
Compare
This comment has been minimized.
This comment has been minimized.
3921290
to
118f5a9
Compare
@wesleywiser 👋 Checks have passed and this is ready for your review. |
☔ The latest upstream changes (presumably #140887) made this pull request unmergeable. Please resolve the merge conflicts. |
118f5a9
to
461ec3c
Compare
This comment has been minimized.
This comment has been minimized.
461ec3c
to
1da4239
Compare
This comment has been minimized.
This comment has been minimized.
I rebased and have questions about implied LLVM target features for clang and rustc. $ echo '__device__ void f(){}' | ~/src/rust-enzyme/build/host/llvm/bin/clang++ --cuda-gpu-arch=sm_89 --cuda-device-only -S -emit-llvm -x cu - -o - | rg target-features
clang++: warning: CUDA version 12.8 is only partially supported [-Wunknown-cuda-version]
attributes #0 = { convergent mustprogress noinline nounwind optnone "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="sm_89" "target-features"="+ptx87,+sm_89" }
$ rustc +stage1 --print cfg --target=nvptx64-nvidia-cuda -Ctarget-cpu=sm_89 | rg target_feature
target_feature="ptx78"
target_feature="sm_89"
Is this all as intended? I think choosing the oldest ISA and activating the implied features would be preferable, but that seems to be an LLVM choice. |
Which ones are those? |
1da4239
to
462571b
Compare
I talked offline with Gonzalo, but the issue is that, for example, In the current model, either the user would need to manually enable An alternative would be to teach |
I was able to find the logic Clang uses to lift the LLVM defines the logic for NVPTX target features here. In the long run, I would prefer |
I was wondering why this does not already work using your branch. The key function is probably As far as I can tell it does the following:
--> The third step adds target features implied by rustc but only for those target features that are specified with |
Yes $ rustc +stage1 --print cfg --target=nvptx64-nvidia-cuda -Ctarget-cpu=sm_89 | rg target_feature
target_feature="ptx78"
target_feature="sm_89"
$ rustc +stage1 --print cfg --target=nvptx64-nvidia-cuda -Ctarget-cpu=sm_89 -Ctarget-feature=+sm_80,+ptx78 | rg target_feature
target_feature="ptx32"
target_feature="ptx40"
target_feature="ptx41"
target_feature="ptx42"
target_feature="ptx43"
target_feature="ptx50"
target_feature="ptx60"
target_feature="ptx61"
target_feature="ptx62"
target_feature="ptx63"
target_feature="ptx64"
target_feature="ptx65"
target_feature="ptx70"
target_feature="ptx71"
target_feature="ptx72"
target_feature="ptx73"
target_feature="ptx74"
target_feature="ptx75"
target_feature="ptx76"
target_feature="ptx77"
target_feature="ptx78"
target_feature="sm_20"
target_feature="sm_21"
target_feature="sm_30"
target_feature="sm_32"
target_feature="sm_35"
target_feature="sm_37"
target_feature="sm_50"
target_feature="sm_52"
target_feature="sm_53"
target_feature="sm_60"
target_feature="sm_61"
target_feature="sm_62"
target_feature="sm_70"
target_feature="sm_72"
target_feature="sm_75"
target_feature="sm_80"
target_feature="sm_89" This works with |
I implemented the suggested logic to activate implied features, and a test. |
This looks good to me now. @wesleywiser I can't approve from this github account. |
016a516
to
26d1202
Compare
Thanks. I made the tracking issue #141468 and have updated the documentation to warn against use of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc target maintainers @RDambrosio016 and @kjetilkjeka for review as well
26d1202
to
a2dec7e
Compare
☔ The latest upstream changes (presumably #141984) made this pull request unmergeable. Please resolve the merge conflicts. |
Add target features for sm_* and ptx*, both of which form a partial order, but cannot be combined to a single partial order. These mirror the LLVM target features, but we do not provide LLVM target processors (which imply both an sm_* and ptx* feature). Add some documentation for the nvptx target.
Normally LLVM and rustc agree about what features are implied by target-cpu, but for NVPTX, LLVM considers sm_* and ptx* features to be exclusive, which makes sense for codegen purposes. But in Rust, we want to think of them as: sm_{sver} means that the target supports the hardware features of sver ptx{pver} means the driver supports PTX ISA pver Intrinsics usually require a minimum sm_{sver} and ptx{pver}. Prior to this commit, -Ctarget-cpu=sm_70 would activate only sm_70 and ptx60 (the minimum PTX version that supports sm_70, which maximizes driver compatibility). With this commit, it also activates all the implied target features (sm_20, ..., sm_62; ptx32, ..., ptx50).
a2dec7e
to
f25c822
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for taking so long before getting to this.
The changes look good, useful and is definitely a step in the right direction 👍
A small annoyance is that actually using target-feature=+ptx80
together with the implied futures leads to redundant warnings (one per implied feature). I'm not sure it's easy to do something with that as both the actual warning and the implied features are useful on their own and it will anyway go away when they are made stable so I don't think it necessarily makes sense to insist on tackling this here and now.
Thanks, I added a note about the redundant warnings to the tracking issue. I think many users will be content with |
Tracking issue: #141468 (nvptx), which is part of #44839 (catch-all arches)
The feature gate is
#![feature(nvptx_target_feature)]
This exposes the target features
sm_20
throughsm_120a
as defined by LLVM.Cc: @gonzalobg
@rustbot label +O-NVPTX +A-target-feature