Skip to content

[BPF] expand cttz, ctlz for i32, i64 #73668

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 1, 2024
Merged

Conversation

inclyc
Copy link
Member

@inclyc inclyc commented Nov 28, 2023

Fixes: #62252

Depends on: #73667

@inclyc inclyc force-pushed the users/inclyc/bpf-target-addr branch from c41b11a to 1839091 Compare December 10, 2023 06:51
@inclyc inclyc force-pushed the users/inclyc/bpf-cttz-ctlz branch from bfdbfe5 to c9afc89 Compare December 10, 2023 06:51
@inclyc
Copy link
Member Author

inclyc commented Dec 14, 2023

Pinging @yonghong-song

@inclyc inclyc requested a review from eddyz87 January 27, 2024 06:07
@inclyc
Copy link
Member Author

inclyc commented Jan 27, 2024

@eddyz87 Could you please take a look? This has been stalled for a while :)

@eddyz87
Copy link
Contributor

eddyz87 commented Jan 29, 2024

@eddyz87 Could you please take a look? This has been stalled for a while :)

Hello, I tried this with simple C test:

unsigned int test(unsigned int v) {
  return __builtin_ctz(v);
  //return __builtin_clz(v);
}
The clz part compiles fine, but when ctz is used I still get an assertion, however a different one:
$ clang  --target=bpf -S -O2 test-clz.c -o -
...
LLVM ERROR: Cannot select: t15: i64 = ConstantPool<[32 x i8] c"\00\01\1C\02\1D\0E\18\03\1E\16\14\0F\19\11\04\08\1F\1B\0D\17\15\13\10\07\1A\0C\12\06\0B\05\0A\09"> 0
In function: test
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: llc -debug-only=isel --asm-show-inst -mtriple=bpf -mcpu=v3 -filetype=obj -o - test-clz.ll
1.	Running pass 'Function Pass Manager' on module 'test-clz.ll'.
2.	Running pass 'BPF DAG->DAG Pattern Instruction Selection' on function '@test'
 #0 0x000055d4977b1db8 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/eddy/work/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x000055d4977afeb0 llvm::sys::RunSignalHandlers() /home/eddy/work/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x000055d4977b2588 SignalHandler(int) /home/eddy/work/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x00007fdbaa03f190 __restore_rt (/lib64/libc.so.6+0x3f190)
 #4 0x00007fdbaa091dec __pthread_kill_implementation (/lib64/libc.so.6+0x91dec)
 #5 0x00007fdbaa03f0c6 gsignal (/lib64/libc.so.6+0x3f0c6)
 #6 0x00007fdbaa0268d7 abort (/lib64/libc.so.6+0x268d7)
 #7 0x000055d497735145 llvm::report_fatal_error(llvm::Twine const&, bool) /home/eddy/work/llvm-project/llvm/lib/Support/ErrorHandling.cpp:125:5
 #8 0x000055d4975f4b6d llvm::SDNode::getValueType(unsigned int) const /home/eddy/work/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:1007:5
 #9 0x000055d4975f4b6d llvm::SDValue::getValueType() const /home/eddy/work/llvm-project/llvm/include/llvm/CodeGen/SelectionDAGNodes.h:1162:16
#10 0x000055d4975f4b6d llvm::SelectionDAGISel::CannotYetSelect(llvm::SDNode*) /home/eddy/work/llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp:4232:43
Looking at debug info from llc it looks like cttz_zero_undef is expanded using some kind of a lookup table:
$ llc -debug-only=isel --asm-show-inst -mtriple=bpf -mcpu=v3 -filetype=asm -o - test-clz.ll
...
Type-legalized selection DAG: %bb.0 'test:entry'
SelectionDAG has 7 nodes:
  t0: ch,glue = EntryToken
      t2: i32,ch = CopyFromReg t0, Register:i32 %0
    t3: i32 = cttz_zero_undef t2
  t5: ch,glue = CopyToReg t0, Register:i32 $w0, t3
  t6: ch = BPFISD::RET_GLUE t5, Register:i32 $w0, t5:1



Legalized selection DAG: %bb.0 'test:entry'
SelectionDAG has 18 nodes:
  t0: ch,glue = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %0
                t8: i32 = sub Constant:i32<0>, t2
              t9: i32 = and t2, t8
            t11: i32 = mul t9, Constant:i32<125613361>
          t13: i32 = srl t11, Constant:i32<27>
        t14: i64 = sign_extend t13
      t16: i64 = add ConstantPool:i64<[32 x i8] c"\00\01\1C\02\1D\0E\18\03\1E\16\14\0F\19\11\04\08\1F\1B\0D\17\15\13\10\07\1A\0C\12\06\0B\05\0A\09"> 0, t14
    t18: i32,ch = load<(load (s8) from constant-pool), zext from i8> t0, t16, undef:i64
  t5: ch,glue = CopyToReg t0, Register:i32 $w0, t18
  t6: ch = BPFISD::RET_GLUE t5, Register:i32 $w0, t5:1

If there is no way to convince lowering to use some other strategy, and you don't want to spend time on implementing translation for ConstantPool, I think it would be fine to leave ctz as-is, or just adjust error reporting, so that it clearly says that ctz is not supported w/o showing a stack-trace. Wdyt?

@eddyz87
Copy link
Contributor

eddyz87 commented Jan 30, 2024

Nevermind, I missed the "depends" part :(
Will retest.

@eddyz87
Copy link
Contributor

eddyz87 commented Jan 31, 2024

I tried this with the test below using current kernel master, and all works as expected.

diff --git a/tools/testing/selftests/bpf/progs/verifier_and.c b/tools/testing/selftests/bpf/progs/verifier_and.c
index e97e518516b6..8a051bd0c886 100644
--- a/tools/testing/selftests/bpf/progs/verifier_and.c
+++ b/tools/testing/selftests/bpf/progs/verifier_and.c
@@ -104,4 +104,14 @@ l0_%=:     r0 = 0;                                         \
        : __clobber_all);
 }
 
+unsigned A[3] = {1u << 31, 1u << 30, 1u << 29};
+
+SEC("socket") __success __retval(0) int clz1(void *ctx) { return __builtin_clz(A[0]); }
+SEC("socket") __success __retval(1) int clz2(void *ctx) { return __builtin_clz(A[1]); }
+SEC("socket") __success __retval(2) int clz3(void *ctx) { return __builtin_clz(A[2]); }
+
+SEC("socket") __success __retval(31) int ctz1(void *ctx) { return __builtin_ctz(A[0]); }
+SEC("socket") __success __retval(30) int ctz2(void *ctx) { return __builtin_ctz(A[1]); }
+SEC("socket") __success __retval(29) int ctz3(void *ctx) { return __builtin_ctz(A[2]); }
+
 char _license[] SEC("license") = "GPL";

Base automatically changed from users/inclyc/bpf-target-addr to main March 6, 2024 11:47
inclyc added a commit that referenced this pull request Mar 6, 2024
Adds custom lowering for tconstpool.

Please ref: #73668 for test
coverage
@inclyc inclyc merged commit 70deb7b into main Apr 1, 2024
@inclyc inclyc deleted the users/inclyc/bpf-cttz-ctlz branch April 1, 2024 02:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using __builtin_clz and __builtin_ctz in bpf target lead to segfault
2 participants