[RFC] Fix the support of larger symbol name #503

fbq · 2021-09-29T05:56:11Z

In the initial support, we enlarge the max length of symbol name to be 511 (by setting KSYM_NAME_LEN), however we still use a 500 bytes buffer to read the symbol name in read_symbol(), which makes 1) the support for symbol name with length 500~511 is missing and 2) we are unable to tell whether a symbol name is too long .

Fix this by enlarging the buffer size. A selftest is also added in this PR, if the fix is dropped, the selftest can trigger a warning that shows the problem. Note that the selftest is still a POC, because the longest symbol is always compiled when KALLSYMS is enabled, which we probably don't want it.

The buffered name size should be larger than KSYM_NAME_LEN, otherwise we cannot tell whether the size of a symbol name is too long. Signed-off-by: Boqun Feng <[email protected]>

ojeda · 2021-09-30T10:25:25Z

Yay, hardcoded constants!

When I initially added the big symbol support, I grepped for 255, 256 etc., but I missed this one using 499 and 500.

Note that the selftest is still a POC, because the longest symbol is always compiled when KALLSYMS is enabled, which we probably don't want it.

Could we put it under DEBUG_KERNEL or perhaps DEBUG_MISC?

ojeda · 2021-09-30T12:32:49Z

By the way, it was quite clever what you did to be able to use further powers of 2 (minus 1) in the future, although hopefully we will not need to jump another entire factor of 2 (!) and we can just e.g. add 100 more or something. I tried to come with something easier to read assuming we don't need to further increase in powers of 2, e.g. using a close value like 500 and then add the remainder:

#define TIMES_10_(x) x##x##x##x##x##x##x##x##x##x
#define TIMES_10(x)  TIMES_10_(x)

#define ADD_11_(x)   x ## abcde ## abcde ## a
#define ADD_11(x)    ADD_11_(x)

#define LONGEST_NAME ADD_11(TIMES_10(TIMES_10(abcde)))

But really, why don't we just include the symbol:

int
#include "longest_symbol"
;

We can keep the file static, or we can even generate it on-the-fly reading KSYM_NAME_LEN in a C file -- and then we don't need the macros nor to change them if the length changes.

fbq · 2021-09-30T13:22:19Z

Yay, hardcoded constants!

When I initially added the big symbol support, I grepped for 255, 256 etc., but I missed this one using 499 and 500.

Right, the current code style is horrible, I'm also considering sending the patch 1 directly to kernel community, so that we fix it there ;-)

Note that the selftest is still a POC, because the longest symbol is always compiled when KALLSYMS is enabled, which we probably don't want it.

Could we put it under DEBUG_KERNEL or perhaps DEBUG_MISC?

DEBUG_KERNEL may be too much, since the symbol is only used to test scripts/kallsyms.c, I think a separate kconfig will be better. Something like KALLSYMS_TEST maybe.

fbq · 2021-09-30T13:25:18Z

But really, why don't we just include the symbol:
int
#include "longest_symbol"
;
We can keep the file static, or we can even generate it on-the-fly reading KSYM_NAME_LEN in a C file -- and then we don't need the macros nor to change them if the length changes.

Good idea! Let me try that. And if I get it work, I probably submit patches directly to LKML, in that case I will copy you guys and close this PR.

ojeda · 2021-09-30T15:23:09Z

Right, the current code style is horrible, I'm also considering sending the patch 1 directly to kernel community, so that we fix it there ;-)

Please do so! Actually, we were told to send these patches (i.e. things that are improvements but "unrelated" to Rust, including the "big symbol support") independently, to start having the pieces needed in mainline already.

So if you want to do it, go ahead, otherwise I can do it too (with Co-developed-by or Suggested-by etc.).

DEBUG_KERNEL may be too much, since the symbol is only used to test scripts/kallsyms.c, I think a separate kconfig will be better. Something like KALLSYMS_TEST maybe.

That's fine too. I thought an entire Kconfig for that line would be too much, but perhaps it is a good idea since that way we already have it for possible future checks that we may add here and there.

Good idea! Let me try that. And if I get it work, I probably submit patches directly to LKML, in that case I will copy you guys and close this PR.

+1

(well, we should apply the patches here anyway, to avoid waiting to solve it -- then it will disappear in the diff when we merge mainline again into here).

fbq · 2021-10-01T01:07:43Z

Right, the current code style is horrible, I'm also considering sending the patch 1 directly to kernel community, so that we fix it there ;-)

Please do so! Actually, we were told to send these patches (i.e. things that are improvements but "unrelated" to Rust, including the "big symbol support") independently, to start having the pieces needed in mainline already.

So if you want to do it, go ahead, otherwise I can do it too (with Co-developed-by or Suggested-by etc.).

Feel free to do that ;-) Actually I think it's better that you do the upstream code of the kallsyms.c "fix" (i.e. patch 1 in this PR), because then you will know when it gets merged and adopt that in the sync with mainline. (no, not because I'm lazy :-))

DEBUG_KERNEL may be too much, since the symbol is only used to test scripts/kallsyms.c, I think a separate kconfig will be better. Something like KALLSYMS_TEST maybe.

That's fine too. I thought an entire Kconfig for that line would be too much, but perhaps it is a good idea since that way we already have it for possible future checks that we may add here and there.

Good idea! Let me try that. And if I get it work, I probably submit patches directly to LKML, in that case I will copy you guys and close this PR.

+1

(well, we should apply the patches here anyway, to avoid waiting to solve it -- then it will disappear in the diff when we merge mainline again into here).

Agreed. The remaining question is do we want to merge this PR as it is, or do we want to have a better test on longest name support. If it's the latter, maybe we only take the fix (i.e patch 1) in Rust-for-linux, and create a new issue tracking the upstream process of the longest name test?

ojeda · 2021-10-01T01:47:07Z

Yeah, I would put here the fix (patch 1) already, no reason not to, as well as the selftest too later on if we finish it.

So if you want, please drop the commit 2 for the moment, and we put commit 1 in.

fbq · 2021-10-01T05:53:07Z

Yeah, I would put here the fix (patch 1) already, no reason not to, as well as the selftest too later on if we finish it.

So if you want, please drop the commit 2 for the moment, and we put commit 1 in.

Done and create #504

Running generic/751 on the for-next branch often results in a hang like below. They are both stack by locking an extent. This suggests someone forget to unlock an extent. INFO: task kworker/u128:1:12 blocked for more than 323 seconds. Not tainted 6.13.0-BTRFS-ZNS+ #503 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u128:1 state:D stack:0 pid:12 tgid:12 ppid:2 flags:0x00004000 Workqueue: btrfs-fixup btrfs_work_helper [btrfs] Call Trace: <TASK> __schedule+0x534/0xdd0 schedule+0x39/0x140 __lock_extent+0x31b/0x380 [btrfs] ? __pfx_autoremove_wake_function+0x10/0x10 btrfs_writepage_fixup_worker+0xf1/0x3a0 [btrfs] btrfs_work_helper+0xff/0x480 [btrfs] ? lock_release+0x178/0x2c0 process_one_work+0x1ee/0x570 ? srso_return_thunk+0x5/0x5f worker_thread+0x1d1/0x3b0 ? __pfx_worker_thread+0x10/0x10 kthread+0x10b/0x230 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x30/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> INFO: task kworker/u134:0:184 blocked for more than 323 seconds. Not tainted 6.13.0-BTRFS-ZNS+ #503 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u134:0 state:D stack:0 pid:184 tgid:184 ppid:2 flags:0x00004000 Workqueue: writeback wb_workfn (flush-btrfs-4) Call Trace: <TASK> __schedule+0x534/0xdd0 schedule+0x39/0x140 __lock_extent+0x31b/0x380 [btrfs] ? __pfx_autoremove_wake_function+0x10/0x10 find_lock_delalloc_range+0xdb/0x260 [btrfs] writepage_delalloc+0x12f/0x500 [btrfs] ? srso_return_thunk+0x5/0x5f extent_write_cache_pages+0x232/0x840 [btrfs] btrfs_writepages+0x72/0x130 [btrfs] do_writepages+0xe7/0x260 ? srso_return_thunk+0x5/0x5f ? lock_acquire+0xd2/0x300 ? srso_return_thunk+0x5/0x5f ? find_held_lock+0x2b/0x80 ? wbc_attach_and_unlock_inode.part.0+0x102/0x250 ? wbc_attach_and_unlock_inode.part.0+0x102/0x250 __writeback_single_inode+0x5c/0x4b0 writeback_sb_inodes+0x22d/0x550 __writeback_inodes_wb+0x4c/0xe0 wb_writeback+0x2f6/0x3f0 wb_workfn+0x32a/0x510 process_one_work+0x1ee/0x570 ? srso_return_thunk+0x5/0x5f worker_thread+0x1d1/0x3b0 ? __pfx_worker_thread+0x10/0x10 kthread+0x10b/0x230 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x30/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> This happens because we have another success path for the zoned mode. When there is no active zone available, btrfs_reserve_extent() returns -EAGAIN. In this case, we have two reactions. (1) If the given range is never allocated, we can only wait for someone to finish a zone, so wait on BTRFS_FS_NEED_ZONE_FINISH bit and retry afterward. (2) Or, if some allocations are already done, we must bail out and let the caller to send IOs for the allocation. This is because these IOs may be necessary to finish a zone. The commit 06f3642 ("btrfs: do proper folio cleanup when cow_file_range() failed") moved the unlock code from the inside of the loop to the outside. So, previously, the allocated extents are unlocked just after the allocation and so before returning from the function. However, they are no longer unlocked on the case (2) above. That caused the hang issue. Fix the issue by modifying the 'end' to the end of the allocated range. Then, we can exit the loop and the same unlock code can properly handle the case. Reported-by: Shin'ichiro Kawasaki <[email protected]> Tested-by: Johannes Thumshirn <[email protected]> Fixes: 06f3642 ("btrfs: do proper folio cleanup when cow_file_range() failed") CC: [email protected] Reviewed-by: Qu Wenruo <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Naohiro Aota <[email protected]> Signed-off-by: David Sterba <[email protected]>

script: kallsyms: Use the correct buffer size for symbols

d895436

The buffered name size should be larger than KSYM_NAME_LEN, otherwise we cannot tell whether the size of a symbol name is too long. Signed-off-by: Boqun Feng <[email protected]>

fbq force-pushed the dev/name-fix branch from 4296f07 to d895436 Compare October 1, 2021 05:22

fbq mentioned this pull request Oct 1, 2021

Selftests for kernel symbol name length limitation #504

Closed

ojeda merged commit 67e981d into Rust-for-Linux:rust Oct 1, 2021

fbq deleted the dev/name-fix branch October 11, 2021 10:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[RFC] Fix the support of larger symbol name #503

[RFC] Fix the support of larger symbol name #503

Uh oh!

fbq commented Sep 29, 2021

Uh oh!

ojeda commented Sep 30, 2021

Uh oh!

ojeda commented Sep 30, 2021

Uh oh!

fbq commented Sep 30, 2021

Uh oh!

fbq commented Sep 30, 2021

Uh oh!

ojeda commented Sep 30, 2021 •

edited

Loading

Uh oh!

fbq commented Oct 1, 2021

Uh oh!

ojeda commented Oct 1, 2021

Uh oh!

fbq commented Oct 1, 2021

Uh oh!

Uh oh!

[RFC] Fix the support of larger symbol name #503

[RFC] Fix the support of larger symbol name #503

Uh oh!

Conversation

fbq commented Sep 29, 2021

Uh oh!

ojeda commented Sep 30, 2021

Uh oh!

ojeda commented Sep 30, 2021

Uh oh!

fbq commented Sep 30, 2021

Uh oh!

fbq commented Sep 30, 2021

Uh oh!

ojeda commented Sep 30, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

fbq commented Oct 1, 2021

Uh oh!

ojeda commented Oct 1, 2021

Uh oh!

fbq commented Oct 1, 2021

Uh oh!

Uh oh!

ojeda commented Sep 30, 2021 •

edited

Loading