-
Notifications
You must be signed in to change notification settings - Fork 465
[RFC] Fix the support of larger symbol name #503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The buffered name size should be larger than KSYM_NAME_LEN, otherwise we cannot tell whether the size of a symbol name is too long. Signed-off-by: Boqun Feng <[email protected]>
Yay, hardcoded constants! When I initially added the big symbol support, I grepped for
Could we put it under |
By the way, it was quite clever what you did to be able to use further powers of 2 (minus 1) in the future, although hopefully we will not need to jump another entire factor of 2 (!) and we can just e.g. add 100 more or something. I tried to come with something easier to read assuming we don't need to further increase in powers of 2, e.g. using a close value like 500 and then add the remainder: #define TIMES_10_(x) x##x##x##x##x##x##x##x##x##x
#define TIMES_10(x) TIMES_10_(x)
#define ADD_11_(x) x ## abcde ## abcde ## a
#define ADD_11(x) ADD_11_(x)
#define LONGEST_NAME ADD_11(TIMES_10(TIMES_10(abcde))) But really, why don't we just include the symbol: int
#include "longest_symbol"
; We can keep the file static, or we can even generate it on-the-fly reading |
Right, the current code style is horrible, I'm also considering sending the patch 1 directly to kernel community, so that we fix it there ;-)
|
Good idea! Let me try that. And if I get it work, I probably submit patches directly to LKML, in that case I will copy you guys and close this PR. |
Please do so! Actually, we were told to send these patches (i.e. things that are improvements but "unrelated" to Rust, including the "big symbol support") independently, to start having the pieces needed in mainline already. So if you want to do it, go ahead, otherwise I can do it too (with
That's fine too. I thought an entire Kconfig for that line would be too much, but perhaps it is a good idea since that way we already have it for possible future checks that we may add here and there.
+1 (well, we should apply the patches here anyway, to avoid waiting to solve it -- then it will disappear in the diff when we merge mainline again into here). |
Feel free to do that ;-) Actually I think it's better that you do the upstream code of the kallsyms.c "fix" (i.e. patch 1 in this PR), because then you will know when it gets merged and adopt that in the sync with mainline. (no, not because I'm lazy :-))
Agreed. The remaining question is do we want to merge this PR as it is, or do we want to have a better test on longest name support. If it's the latter, maybe we only take the fix (i.e patch 1) in Rust-for-linux, and create a new issue tracking the upstream process of the longest name test? |
Yeah, I would put here the fix (patch 1) already, no reason not to, as well as the selftest too later on if we finish it. So if you want, please drop the commit 2 for the moment, and we put commit 1 in. |
Done and create #504 |
Running generic/751 on the for-next branch often results in a hang like below. They are both stack by locking an extent. This suggests someone forget to unlock an extent. INFO: task kworker/u128:1:12 blocked for more than 323 seconds. Not tainted 6.13.0-BTRFS-ZNS+ #503 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u128:1 state:D stack:0 pid:12 tgid:12 ppid:2 flags:0x00004000 Workqueue: btrfs-fixup btrfs_work_helper [btrfs] Call Trace: <TASK> __schedule+0x534/0xdd0 schedule+0x39/0x140 __lock_extent+0x31b/0x380 [btrfs] ? __pfx_autoremove_wake_function+0x10/0x10 btrfs_writepage_fixup_worker+0xf1/0x3a0 [btrfs] btrfs_work_helper+0xff/0x480 [btrfs] ? lock_release+0x178/0x2c0 process_one_work+0x1ee/0x570 ? srso_return_thunk+0x5/0x5f worker_thread+0x1d1/0x3b0 ? __pfx_worker_thread+0x10/0x10 kthread+0x10b/0x230 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x30/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> INFO: task kworker/u134:0:184 blocked for more than 323 seconds. Not tainted 6.13.0-BTRFS-ZNS+ #503 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u134:0 state:D stack:0 pid:184 tgid:184 ppid:2 flags:0x00004000 Workqueue: writeback wb_workfn (flush-btrfs-4) Call Trace: <TASK> __schedule+0x534/0xdd0 schedule+0x39/0x140 __lock_extent+0x31b/0x380 [btrfs] ? __pfx_autoremove_wake_function+0x10/0x10 find_lock_delalloc_range+0xdb/0x260 [btrfs] writepage_delalloc+0x12f/0x500 [btrfs] ? srso_return_thunk+0x5/0x5f extent_write_cache_pages+0x232/0x840 [btrfs] btrfs_writepages+0x72/0x130 [btrfs] do_writepages+0xe7/0x260 ? srso_return_thunk+0x5/0x5f ? lock_acquire+0xd2/0x300 ? srso_return_thunk+0x5/0x5f ? find_held_lock+0x2b/0x80 ? wbc_attach_and_unlock_inode.part.0+0x102/0x250 ? wbc_attach_and_unlock_inode.part.0+0x102/0x250 __writeback_single_inode+0x5c/0x4b0 writeback_sb_inodes+0x22d/0x550 __writeback_inodes_wb+0x4c/0xe0 wb_writeback+0x2f6/0x3f0 wb_workfn+0x32a/0x510 process_one_work+0x1ee/0x570 ? srso_return_thunk+0x5/0x5f worker_thread+0x1d1/0x3b0 ? __pfx_worker_thread+0x10/0x10 kthread+0x10b/0x230 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x30/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> This happens because we have another success path for the zoned mode. When there is no active zone available, btrfs_reserve_extent() returns -EAGAIN. In this case, we have two reactions. (1) If the given range is never allocated, we can only wait for someone to finish a zone, so wait on BTRFS_FS_NEED_ZONE_FINISH bit and retry afterward. (2) Or, if some allocations are already done, we must bail out and let the caller to send IOs for the allocation. This is because these IOs may be necessary to finish a zone. The commit 06f3642 ("btrfs: do proper folio cleanup when cow_file_range() failed") moved the unlock code from the inside of the loop to the outside. So, previously, the allocated extents are unlocked just after the allocation and so before returning from the function. However, they are no longer unlocked on the case (2) above. That caused the hang issue. Fix the issue by modifying the 'end' to the end of the allocated range. Then, we can exit the loop and the same unlock code can properly handle the case. Reported-by: Shin'ichiro Kawasaki <[email protected]> Tested-by: Johannes Thumshirn <[email protected]> Fixes: 06f3642 ("btrfs: do proper folio cleanup when cow_file_range() failed") CC: [email protected] Reviewed-by: Qu Wenruo <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Naohiro Aota <[email protected]> Signed-off-by: David Sterba <[email protected]>
In the initial support, we enlarge the max length of symbol name to be 511 (by setting
KSYM_NAME_LEN
), however we still use a 500 bytes buffer to read the symbol name inread_symbol()
, which makes 1) the support for symbol name with length 500~511 is missing and 2) we are unable to tell whether a symbol name is too long .Fix this by enlarging the buffer size. A selftest is also added in this PR, if the fix is dropped, the selftest can trigger a warning that shows the problem. Note that the selftest is still a POC, because the longest symbol is always compiled when KALLSYMS is enabled, which we probably don't want it.