Commit 958ede1

sharathmsrini authored and vijay-suman committed
RDMA/uverbs: restrack shared PDs
A SRQ inherits its parent PD's resource name in ib_create_srq_user():

    rdma_restrack_new(&srq->res, RDMA_RESTRACK_SRQ);
    rdma_restrack_parent_name(&srq->res, &pd->res);

But user PDs created via ib_uverbs_share_pd() aren't restracked
causing the PD to not have any parent name, causing the following crash
when we run "rdma res show srq" and so this patch adds the shpd to
restrack.

[ 189.099669] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 189.100707] #PF: supervisor read access in kernel mode
[ 189.101504] #PF: error_code(0x0000) - not-present page
[ 189.102357] PGD 0 P4D 0
[ 189.102801] Oops: 0000 [#1] SMP NOPTI
[ 189.103413] CPU: 26 PID: 69041 Comm: rdma Kdump: loaded Not tainted 5.15.0-5.76.3.el8uek.x86_64 #2
[ 189.104758] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-2.module+el8.6.0+20659+3dcf7c70 04/01/2014
[ 189.106359] RIP: 0010:strlen+0x0/0x24
[ 189.106994] Code: 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee 31 d2 89 d1 89 d6 89 d7 41 89 d0 c3 cc cc cc cc 0f 1f 84 00 00 00 00 00 <80> 3f 00 74 16 48 89 f8 48 83 c0 01 80 38 00 75 f7 48 29 f8 31 ff
[ 189.109828] RSP: 0018:ffffa2f2b409b808 EFLAGS: 00010246
[ 189.110684] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
[ 189.111790] RDX: 0000000000000000 RSI: ffff93dca8f46448 RDI: 0000000000000000
[ 189.112943] RBP: ffff93f8091b2500 R08: 0000000000000000 R09: ffff93f8090750b4
[ 189.114102] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 189.115279] R13: ffff93f809075088 R14: ffff93f8067e46a8 R15: 0000000000000000
[ 189.116434] FS: 00007fe7c9707540(0000) GS:ffff9416c2800000(0000) knlGS:0000000000000000
[ 189.117753] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 189.118683] CR2: 0000000000000000 CR3: 000000240eebc004 CR4: 0000000000770ee0
[ 189.119857] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 189.121029] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 189.122198] PKRU: 55555554
[ 189.122676] Call Trace:
[ 189.123114] <TASK>
[ 189.123474] fill_res_name_pid+0x31/0xb0 [ib_core]
[ 189.124217] res_get_common_dumpit+0x38f/0x540 [ib_core]
[ 189.125045] ? fill_res_srq_qps+0x210/0x210 [ib_core]
[ 189.125930] netlink_dump+0x18b/0x307
[ 189.126511] __netlink_dump_start+0x1f2/0x2d9
[ 189.127145] rdma_nl_rcv_msg+0x1d4/0x210 [ib_core]
[ 189.127954] ? res_get_common_dumpit+0x540/0x540 [ib_core]
[ 189.128871] rdma_nl_rcv+0xaa/0x100 [ib_core]
[ 189.129616] netlink_unicast+0x213/0x2ce
[ 189.130284] netlink_sendmsg+0x24f/0x4d9
[ 189.130941] sock_sendmsg+0x65/0x6a
[ 189.131547] __sys_sendto+0x128/0x19b
[ 189.132189] __x64_sys_sendto+0x20/0x35
[ 189.132832] do_syscall_64+0x38/0x8d
[ 189.133451] entry_SYSCALL_64_after_hwframe+0x63/0x0
[ 189.134292] RIP: 0033:0x7fe7c87bc3ab
[ 189.134906] Code: 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 f3 0f 1e fa 48 8d 05 f5 41 29 00 41 89 ca 8b 00 85 c0 75 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 41 57 4d 89 c7 41 56 41 89
[ 189.137790] RSP: 002b:00007fffc9e324a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 189.139019] RAX: ffffffffffffffda RBX: 00007fffc9e32750 RCX: 00007fe7c87bc3ab
[ 189.140153] RDX: 0000000000000018 RSI: 0000558d21de1920 RDI: 0000000000000004
[ 189.141332] RBP: 0000000000000017 R08: 00007fe7c8c5c480 R09: 000000000000000c
[ 189.142470] R10: 0000000000000000 R11: 0000000000000246 R12: 0000558d2120e850
[ 189.143631] R13: 00007fffc9e32770 R14: 0000000000000000 R15: 0000000000000000
[ 189.144785] </TASK>

and so with the fix:

    # rdma res show pd
    ...
    dev mlx5_0 pdn 42 local_dma_lkey 0x0 users 12 ctxn 36 pid 87599 comm ora_ipc0_dbm051
    dev mlx5_0 pdn 43 local_dma_lkey 0x0 users 4 ctxn 36 pid 87599 comm ora_ipc0_dbm051
    ...

we now see correct pdns, process names for the SRQs and no kernel crash:

    # rdma res show srq
    dev mlx5_0 srqn 1 type BASIC lqpn 2448 pdn 42 pid 87599 comm ora_ipc0_dbm051
    dev mlx5_0 srqn 3 type XRC pdn 42 cqn 2081 pid 87599 comm ora_ipc0_dbm051
    dev mlx5_0 srqn 4 type XRC pdn 42 cqn 2081 pid 87599 comm ora_ipc0_dbm051
    dev mlx5_0 srqn 5 type XRC pdn 43 cqn 2083 pid 87599 comm ora_ipc0_dbm051
    dev mlx5_0 srqn 6 type XRC pdn 43 cqn 2083 pid 87599 comm ora_ipc0_dbm051
    ...

Orabug: 34812519

Fixes: b09c4d7 ("RDMA/restrack: Improve readability in task name management")
Fixes: 86133a24cbd8 ("IB/Shared PD support from Oracle")
Signed-off-by: Sharath Srinivasan <[email protected]>
Reviewed-by: Gerd Rausch <[email protected]>
Reviewed-by: Qing Huang <[email protected]>
1 parent 726fd8f commit 958ede1

1 file changed, +4 -0 lines changed

drivers/infiniband/core/uverbs_cmd.c

Lines changed: 4 additions & 0 deletions
@@ -629,6 +629,10 @@ static int ib_uverbs_share_pd(struct uverbs_attr_bundle *attrs)
 	pd->shpd = shpd;
 	atomic_set(&pd->usecnt, 0);
 
+	rdma_restrack_new(&pd->res, RDMA_RESTRACK_PD);
+	rdma_restrack_set_name(&pd->res, NULL);
+	rdma_restrack_add(&pd->res);
+
 	/* initialize uobj and return pd_handle */
 	uobj->object = pd;
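The failure mode described in the commit message can be approximated in userspace. The following is a minimal sketch, not the kernel implementation: `struct res_model`, `track_user()`, `inherit_parent_name()`, `reported_name()` and `reported_name_len()` are hypothetical stand-ins, and the assumption (based on the trace ending in `strlen` with a NULL argument) is that a child inheriting from a never-initialised parent ends up with a NULL name. Where the kernel dereferences NULL, the model returns -1 instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a restrack entry (not the kernel layout). */
struct res_model {
	bool user;             /* true once a user task has been attached */
	const char *kern_name; /* name reported for kernel-owned entries */
	const char *comm;      /* process name reported for user-owned entries */
};

/* Models the effect of tracking a user-owned resource (what the patch adds for the shared PD). */
static void track_user(struct res_model *res, const char *comm)
{
	assert(comm != NULL);
	res->user = true;
	res->kern_name = NULL;
	res->comm = comm;
}

/* Models a child (SRQ) inheriting its name from its parent (PD). */
static void inherit_parent_name(struct res_model *child,
				const struct res_model *parent)
{
	if (!parent->user) {
		/* An untracked parent looks kernel-owned but has no name. */
		child->kern_name = parent->kern_name;
	} else {
		child->user = true;
		child->comm = parent->comm;
	}
}

/* Name that a dump such as "rdma res show srq" would report in this model. */
static const char *reported_name(const struct res_model *res)
{
	return res->user ? res->comm : res->kern_name;
}

/* Returns -1 where an unguarded strlen(NULL) would otherwise occur. */
static long reported_name_len(const struct res_model *res)
{
	const char *name = reported_name(res);

	return name ? (long)strlen(name) : -1;
}
```

In this model, a zero-initialised parent passed to `inherit_parent_name()` leaves the child with a NULL name, which corresponds to the crash path; calling `track_user()` on the parent first lets the child report the owning process name, as in the "rdma res show srq" output above.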
