[lldb/DWARF] Remove "range lower than function low_pc" check #132395

Merged
merged 3 commits into from
Apr 23, 2025

Conversation

labath
Collaborator

@labath labath commented Mar 21, 2025

The check is not correct for discontinuous functions, as one of the blocks could very well begin before the function entry point. To catch dead-stripped ranges, I check whether the range starts at or after the first known code address. I don't print any error in this case, as that is a common/expected situation.

This avoids many errors like:

error: ld-linux-x86-64.so.2 0x00085f3b: adding range [0x0000000000001ae8-0x0000000000001b07) which has a 
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
 the start of this error message

when debugging binaries on debian trixie because the dynamic linker (ld-linux) contains discontinuous functions.

If a block range is not a subrange of the enclosing block, then this range will currently be added to the outer block as well (i.e., we get the same behavior that's currently possible for non-subrange blocks larger than function_low_pc). However, this code path is buggy and I'd like to change that (#117725).
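
For readers skimming the thread, here is a minimal, self-contained sketch of the new filtering rule. It is not the LLDB code: `AddressRange`, `BlockRange`, and `CollectBlockRanges` are hypothetical stand-ins for `llvm::DWARFAddressRange`, `Block::Range`, and the loop in `ParseBlocksRecursive`, and the signed offset type is an assumption based on the remark later in this thread that blocks now use signed offsets.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for llvm::DWARFAddressRange.
struct AddressRange {
  uint64_t low_pc;
  uint64_t high_pc;
  bool valid() const { return low_pc <= high_pc; }
};

// Hypothetical stand-in for Block::Range: an offset from the function's
// entry point (assumed signed) plus a size.
struct BlockRange {
  int64_t offset;
  uint64_t size;
};

// Keep every valid range that starts at or above the first known code
// address. Ranges below the entry point are kept and simply get a
// negative offset; rejected ranges are dropped without any error.
std::vector<BlockRange>
CollectBlockRanges(const std::vector<AddressRange> &ranges, uint64_t entry_pc,
                   uint64_t first_code_address) {
  std::vector<BlockRange> result;
  for (const AddressRange &range : ranges) {
    if (!range.valid() || range.low_pc < first_code_address)
      continue; // malformed or (probably) dead-stripped
    result.push_back({static_cast<int64_t>(range.low_pc - entry_pc),
                      range.high_pc - range.low_pc});
  }
  return result;
}
```

With the entry point 0x1cfb0 from the error message above and an assumed first code address of 0x1000, the range [0x1ae8, 0x1b07) is kept with offset -0x1b4c8, while a range starting at 0 would be skipped silently.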

@labath labath requested a review from JDevlieghere as a code owner March 21, 2025 13:22
@llvmbot llvmbot added the lldb label Mar 21, 2025
@labath labath requested a review from DavidSpickett March 21, 2025 13:22
@llvmbot
Member

llvmbot commented Mar 21, 2025

@llvm/pr-subscribers-lldb

Author: Pavel Labath (labath)

Full diff: https://github.com/llvm/llvm-project/pull/132395.diff

3 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+1-11)
  • (removed) lldb/test/Shell/SymbolFile/DWARF/range-lower-then-low-pc.s (-317)
  • (modified) lldb/test/Shell/SymbolFile/DWARF/x86/discontinuous-inline-function.s (+19-19)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index d1aaf0bd36de4..58d8969c54e27 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -1346,19 +1346,9 @@ size_t SymbolFileDWARF::ParseBlocksRecursive(CompileUnit &comp_unit,
                                  decl_line, decl_column, call_file, call_line,
                                  call_column, nullptr)) {
       for (const llvm::DWARFAddressRange &range : ranges) {
-        if (!range.valid())
-          continue;
-        if (range.LowPC >= subprogram_low_pc)
+        if (range.valid() && range.LowPC >= m_first_code_address)
           block->AddRange(Block::Range(range.LowPC - subprogram_low_pc,
                                        range.HighPC - range.LowPC));
-        else {
-          GetObjectFile()->GetModule()->ReportError(
-              "{0:x8}: adding range [{1:x16}-{2:x16}) which has a base "
-              "that is less than the function's low PC {3:x16}. Please file "
-              "a bug and attach the file at the "
-              "start of this error message",
-              block->GetID(), range.LowPC, range.HighPC, subprogram_low_pc);
-        }
       }
       block->FinalizeRanges();
 
diff --git a/lldb/test/Shell/SymbolFile/DWARF/range-lower-then-low-pc.s b/lldb/test/Shell/SymbolFile/DWARF/range-lower-then-low-pc.s
deleted file mode 100644
index e3cc84db12652..0000000000000
--- a/lldb/test/Shell/SymbolFile/DWARF/range-lower-then-low-pc.s
+++ /dev/null
@@ -1,317 +0,0 @@
-# REQUIRES: x86
-
-# RUN: llvm-mc -triple=x86_64-pc-linux -filetype=obj %s > %t
-# RUN: lldb-test symbols %t &> %t.txt
-# RUN: cat %t.txt | FileCheck %s
-
-# Tests that error is printed correctly when DW_AT_low_pc value is
-# greater then a range entry.
-
-# CHECK: 0x0000006e: adding range [0x0000000000000000-0x000000000000001f)
-# CHECK-SAME: which has a base that is less than the function's low PC 0x0000000000000021.
-# CHECK-SAME: Please file a bug and attach the file at the start of this error message
-
-
-
-# Test was manually modified to change DW_TAG_lexical_block
-# to use DW_AT_ranges, and value lower then DW_AT_low_pc value
-# in DW_TAG_subprogram
-# static int foo(bool b) {
-#   if (b) {
-#    int food = 1;
-#     return food;
-#   }
-#   return 0;
-# }
-# int main() {
-#   return foo(true);
-# }
-	.text
-	.file	"main.cpp"
-	.section	.text.main,"ax",@progbits
-	.globl	main                            # -- Begin function main
-	.p2align	4, 0x90
-	.type	main,@function
-main:                                   # @main
-.Lfunc_begin0:
-	.file	1 "base-lower-then-range-entry" "main.cpp"
-	.loc	1 8 0                           # main.cpp:8:0
-	.cfi_startproc
-# %bb.0:                                # %entry
-	pushq	%rbp
-	.cfi_def_cfa_offset 16
-	.cfi_offset %rbp, -16
-	movq	%rsp, %rbp
-	.cfi_def_cfa_register %rbp
-	subq	$16, %rsp
-	movl	$0, -4(%rbp)
-.Ltmp0:
-	.loc	1 9 10 prologue_end             # main.cpp:9:10
-	movl	$1, %edi
-	callq	_ZL3foob
-	.loc	1 9 3 epilogue_begin is_stmt 0  # main.cpp:9:3
-	addq	$16, %rsp
-	popq	%rbp
-	.cfi_def_cfa %rsp, 8
-	retq
-.Ltmp1:
-.Lfunc_end0:
-	.size	main, .Lfunc_end0-main
-	.cfi_endproc
-                                        # -- End function
-	.section	.text._ZL3foob,"ax",@progbits
-	.p2align	4, 0x90                         # -- Begin function _ZL3foob
-	.type	_ZL3foob,@function
-_ZL3foob:                               # @_ZL3foob
-.Lfunc_begin1:
-	.loc	1 1 0 is_stmt 1                 # main.cpp:1:0
-	.cfi_startproc
-# %bb.0:                                # %entry
-	pushq	%rbp
-	.cfi_def_cfa_offset 16
-	.cfi_offset %rbp, -16
-	movq	%rsp, %rbp
-	.cfi_def_cfa_register %rbp
-	movb	%dil, %al
-	andb	$1, %al
-	movb	%al, -5(%rbp)
-.Ltmp2:
-	.loc	1 2 7 prologue_end              # main.cpp:2:7
-	testb	$1, -5(%rbp)
-	je	.LBB1_2
-# %bb.1:                                # %if.then
-.Ltmp3:
-	.loc	1 3 8                           # main.cpp:3:8
-	movl	$1, -12(%rbp)
-	.loc	1 4 12                          # main.cpp:4:12
-	movl	-12(%rbp), %eax
-	.loc	1 4 5 is_stmt 0                 # main.cpp:4:5
-	movl	%eax, -4(%rbp)
-	jmp	.LBB1_3
-.Ltmp4:
-.LBB1_2:                                # %if.end
-	.loc	1 6 3 is_stmt 1                 # main.cpp:6:3
-	movl	$0, -4(%rbp)
-.LBB1_3:                                # %return
-	.loc	1 7 1                           # main.cpp:7:1
-	movl	-4(%rbp), %eax
-	.loc	1 7 1 epilogue_begin is_stmt 0  # main.cpp:7:1
-	popq	%rbp
-	.cfi_def_cfa %rsp, 8
-	retq
-.Ltmp5:
-.Lfunc_end1:
-	.size	_ZL3foob, .Lfunc_end1-_ZL3foob
-	.cfi_endproc
-                                        # -- End function
-	.section	.debug_abbrev,"",@progbits
-	.byte	1                               # Abbreviation Code
-	.byte	17                              # DW_TAG_compile_unit
-	.byte	1                               # DW_CHILDREN_yes
-	.byte	37                              # DW_AT_producer
-	.byte	14                              # DW_FORM_strp
-	.byte	19                              # DW_AT_language
-	.byte	5                               # DW_FORM_data2
-	.byte	3                               # DW_AT_name
-	.byte	14                              # DW_FORM_strp
-	.byte	16                              # DW_AT_stmt_list
-	.byte	23                              # DW_FORM_sec_offset
-	.byte	27                              # DW_AT_comp_dir
-	.byte	14                              # DW_FORM_strp
-	.byte	17                              # DW_AT_low_pc
-	.byte	1                               # DW_FORM_addr
-	.byte	85                              # DW_AT_ranges
-	.byte	23                              # DW_FORM_sec_offset
-	.byte	0                               # EOM(1)
-	.byte	0                               # EOM(2)
-	.byte	2                               # Abbreviation Code
-	.byte	46                              # DW_TAG_subprogram
-	.byte	0                               # DW_CHILDREN_no
-	.byte	17                              # DW_AT_low_pc
-	.byte	1                               # DW_FORM_addr
-	.byte	18                              # DW_AT_high_pc
-	.byte	6                               # DW_FORM_data4
-	.byte	64                              # DW_AT_frame_base
-	.byte	24                              # DW_FORM_exprloc
-	.byte	3                               # DW_AT_name
-	.byte	14                              # DW_FORM_strp
-	.byte	58                              # DW_AT_decl_file
-	.byte	11                              # DW_FORM_data1
-	.byte	59                              # DW_AT_decl_line
-	.byte	11                              # DW_FORM_data1
-	.byte	73                              # DW_AT_type
-	.byte	19                              # DW_FORM_ref4
-	.byte	63                              # DW_AT_external
-	.byte	25                              # DW_FORM_flag_present
-	.byte	0                               # EOM(1)
-	.byte	0                               # EOM(2)
-	.byte	3                               # Abbreviation Code
-	.byte	46                              # DW_TAG_subprogram
-	.byte	1                               # DW_CHILDREN_yes
-	.byte	17                              # DW_AT_low_pc
-	.byte	1                               # DW_FORM_addr
-	.byte	18                              # DW_AT_high_pc
-	.byte	6                               # DW_FORM_data4
-	.byte	64                              # DW_AT_frame_base
-	.byte	24                              # DW_FORM_exprloc
-	.byte	110                             # DW_AT_linkage_name
-	.byte	14                              # DW_FORM_strp
-	.byte	3                               # DW_AT_name
-	.byte	14                              # DW_FORM_strp
-	.byte	58                              # DW_AT_decl_file
-	.byte	11                              # DW_FORM_data1
-	.byte	59                              # DW_AT_decl_line
-	.byte	11                              # DW_FORM_data1
-	.byte	73                              # DW_AT_type
-	.byte	19                              # DW_FORM_ref4
-	.byte	0                               # EOM(1)
-	.byte	0                               # EOM(2)
-	.byte	4                               # Abbreviation Code
-	.byte	5                               # DW_TAG_formal_parameter
-	.byte	0                               # DW_CHILDREN_no
-	.byte	2                               # DW_AT_location
-	.byte	24                              # DW_FORM_exprloc
-	.byte	3                               # DW_AT_name
-	.byte	14                              # DW_FORM_strp
-	.byte	58                              # DW_AT_decl_file
-	.byte	11                              # DW_FORM_data1
-	.byte	59                              # DW_AT_decl_line
-	.byte	11                              # DW_FORM_data1
-	.byte	73                              # DW_AT_type
-	.byte	19                              # DW_FORM_ref4
-	.byte	0                               # EOM(1)
-	.byte	0                               # EOM(2)
-	.byte	5                               # Abbreviation Code
-	.byte	11                              # DW_TAG_lexical_block
-	.byte	1                               # DW_CHILDREN_yes
-	.byte	85                              # DW_AT_ranges   <------ Manually modified. Replaced low_pc/high)_pc with rangres.
-	.byte	23                              # DW_FORM_sec_offset
-	.byte	0                               # EOM(1)
-	.byte	0                               # EOM(2)
-	.byte	6                               # Abbreviation Code
-	.byte	52                              # DW_TAG_variable
-	.byte	0                               # DW_CHILDREN_no
-	.byte	2                               # DW_AT_location
-	.byte	24                              # DW_FORM_exprloc
-	.byte	3                               # DW_AT_name
-	.byte	14                              # DW_FORM_strp
-	.byte	58                              # DW_AT_decl_file
-	.byte	11                              # DW_FORM_data1
-	.byte	59                              # DW_AT_decl_line
-	.byte	11                              # DW_FORM_data1
-	.byte	73                              # DW_AT_type
-	.byte	19                              # DW_FORM_ref4
-	.byte	0                               # EOM(1)
-	.byte	0                               # EOM(2)
-	.byte	7                               # Abbreviation Code
-	.byte	36                              # DW_TAG_base_type
-	.byte	0                               # DW_CHILDREN_no
-	.byte	3                               # DW_AT_name
-	.byte	14                              # DW_FORM_strp
-	.byte	62                              # DW_AT_encoding
-	.byte	11                              # DW_FORM_data1
-	.byte	11                              # DW_AT_byte_size
-	.byte	11                              # DW_FORM_data1
-	.byte	0                               # EOM(1)
-	.byte	0                               # EOM(2)
-	.byte	0                               # EOM(3)
-	.section	.debug_info,"",@progbits
-.Lcu_begin0:
-	.long	.Ldebug_info_end0-.Ldebug_info_start0 # Length of Unit
-.Ldebug_info_start0:
-	.short	4                               # DWARF version number
-	.long	.debug_abbrev                   # Offset Into Abbrev. Section
-	.byte	8                               # Address Size (in bytes)
-	.byte	1                               # Abbrev [1] 0xb:0x8f DW_TAG_compile_unit
-	.long	.Linfo_string0                  # DW_AT_producer
-	.short	33                              # DW_AT_language
-	.long	.Linfo_string1                  # DW_AT_name
-	.long	.Lline_table_start0             # DW_AT_stmt_list
-	.long	.Linfo_string2                  # DW_AT_comp_dir
-	.quad	0                               # DW_AT_low_pc
-	.long	.Ldebug_ranges0                 # DW_AT_ranges
-	.byte	2                               # Abbrev [2] 0x2a:0x19 DW_TAG_subprogram
-	.quad	.Lfunc_begin0                   # DW_AT_low_pc
-	.long	.Lfunc_end0-.Lfunc_begin0       # DW_AT_high_pc
-	.byte	1                               # DW_AT_frame_base
-	.byte	86
-	.long	.Linfo_string3                  # DW_AT_name
-	.byte	1                               # DW_AT_decl_file
-	.byte	8                               # DW_AT_decl_line
-	.long	138                             # DW_AT_type
-                                        # DW_AT_external
-	.byte	3                               # Abbrev [3] 0x43:0x48 DW_TAG_subprogram
-	.quad	.Lfunc_begin1 + 1               # DW_AT_low_pc
-	.long	.Lfunc_end1-.Lfunc_begin1       # DW_AT_high_pc
-	.byte	1                               # DW_AT_frame_base
-	.byte	86
-	.long	.Linfo_string5                  # DW_AT_linkage_name
-	.long	.Linfo_string6                  # DW_AT_name
-	.byte	1                               # DW_AT_decl_file
-	.byte	1                               # DW_AT_decl_line
-	.long	138                             # DW_AT_type
-	.byte	4                               # Abbrev [4] 0x60:0xe DW_TAG_formal_parameter
-	.byte	2                               # DW_AT_location
-	.byte	145
-	.byte	123
-	.long	.Linfo_string7                  # DW_AT_name
-	.byte	1                               # DW_AT_decl_file
-	.byte	1                               # DW_AT_decl_line
-	.long	138                             # DW_AT_type
-	.byte	5                               # Abbrev [5] 0x6e:0x1c DW_TAG_lexical_block
-	.long	.Ldebug_ranges0                 # DW_AT_ranges  <-- Manually modified replaced low_pc/high_pc to rangres.
-	.byte	6                               # Abbrev [6] 0x7b:0xe DW_TAG_variable
-	.byte	2                               # DW_AT_location
-	.byte	145
-	.byte	116
-	.long	.Linfo_string9                  # DW_AT_name
-	.byte	1                               # DW_AT_decl_file
-	.byte	3                               # DW_AT_decl_line
-	.long	138                             # DW_AT_type
-	.byte	0                               # End Of Children Mark
-	.byte	0                               # End Of Children Mark
-	.byte	7                               # Abbrev [7] 0x8b:0x7 DW_TAG_base_type
-	.long	.Linfo_string4                  # DW_AT_name
-	.byte	5                               # DW_AT_encoding
-	.byte	4                               # DW_AT_byte_size
-	.byte	7                               # Abbrev [7] 0x92:0x7 DW_TAG_base_type
-	.long	.Linfo_string8                  # DW_AT_name
-	.byte	2                               # DW_AT_encoding
-	.byte	1                               # DW_AT_byte_size
-	.byte	0                               # End Of Children Mark
-.Ldebug_info_end0:
-	.section	.debug_ranges,"",@progbits
-.Ldebug_ranges0:
-	.quad	.Lfunc_begin0
-	.quad	.Lfunc_end0
-	.quad	.Lfunc_begin1
-	.quad	.Lfunc_end1
-	.quad	0
-	.quad	0
-	.section	.debug_str,"MS",@progbits,1
-.Linfo_string0:
-	.asciz	"clang version 17.0.0 (https://github.com/llvm/llvm-project.git 73027ae39b1492e5b6033358a13b86d7d1e781ae)" # string offset=0
-.Linfo_string1:
-	.asciz	"main.cpp"                      # string offset=105
-.Linfo_string2:
-	.asciz	"base-lower-then-range-entry" # string offset=114
-.Linfo_string3:
-	.asciz	"main"                          # string offset=179
-.Linfo_string4:
-	.asciz	"int"                           # string offset=184
-.Linfo_string5:
-	.asciz	"_ZL3foob"                      # string offset=188
-.Linfo_string6:
-	.asciz	"foo"                           # string offset=197
-.Linfo_string7:
-	.asciz	"b"                             # string offset=201
-.Linfo_string8:
-	.asciz	"bool"                          # string offset=203
-.Linfo_string9:
-	.asciz	"food"                          # string offset=208
-	.ident	"clang version 17.0.0 (https://github.com/llvm/llvm-project.git 73027ae39b1492e5b6033358a13b86d7d1e781ae)"
-	.section	".note.GNU-stack","",@progbits
-	.addrsig
-	.addrsig_sym _ZL3foob
-	.section	.debug_line,"",@progbits
-.Lline_table_start0:
diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/discontinuous-inline-function.s b/lldb/test/Shell/SymbolFile/DWARF/x86/discontinuous-inline-function.s
index 399f4e4db5b2f..9afb272b3496f 100644
--- a/lldb/test/Shell/SymbolFile/DWARF/x86/discontinuous-inline-function.s
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/discontinuous-inline-function.s
@@ -6,28 +6,13 @@
 # RUN: %lldb %t -o "image lookup -v -n look_me_up" -o exit | FileCheck %s
 
 # CHECK:      1 match found in {{.*}}
-# CHECK:      Summary: {{.*}}`foo + 6 [inlined] foo_inl + 1
-# CHECK-NEXT:          {{.*}}`foo + 5
-# CHECK:      Blocks: id = {{.*}}, ranges = [0x00000000-0x00000003)[0x00000004-0x00000008)
-# CHECK-NEXT:         id = {{.*}}, ranges = [0x00000001-0x00000002)[0x00000005-0x00000007), name = "foo_inl"
+# CHECK:      Summary: {{.*}}`foo - 3 [inlined] foo_inl + 1
+# CHECK-NEXT:          {{.*}}`foo - 4
+# CHECK:      Blocks: id = {{.*}}, ranges = [0x00000000-0x00000004)[0x00000005-0x00000008)
+# CHECK-NEXT:         id = {{.*}}, ranges = [0x00000001-0x00000003)[0x00000006-0x00000007), name = "foo_inl"
 
         .text
 
-        .type   foo,@function
-foo:
-        nop
-.Lfoo_inl:
-        nop
-.Lfoo_inl_end:
-        nop
-.Lfoo_end:
-        .size   foo, .Lfoo_end-foo
-
-bar:
-        nop
-.Lbar_end:
-        .size   bar, .Lbar_end-bar
-
         .section        .text.__part1,"ax",@progbits
 foo.__part.1:
         nop
@@ -42,6 +27,21 @@ look_me_up:
         .size   foo.__part.1, .Lfoo.__part.1_end-foo.__part.1
 
 
+bar:
+        nop
+.Lbar_end:
+        .size   bar, .Lbar_end-bar
+
+        .type   foo,@function
+foo:
+        nop
+.Lfoo_inl:
+        nop
+.Lfoo_inl_end:
+        nop
+.Lfoo_end:
+        .size   foo, .Lfoo_end-foo
+
         .section        .debug_abbrev,"",@progbits
         .byte   1                               # Abbreviation Code
         .byte   17                              # DW_TAG_compile_unit

@labath
Collaborator Author

labath commented Apr 2, 2025

Ping.

@labath
Collaborator Author

labath commented Apr 22, 2025

ping ping :)

@labath
Collaborator Author

labath commented Apr 22, 2025

The priority of this has increased from "it's a niche feature only used by google" to "it's used by everyone" because (as I was made aware at the dev meeting) gcc has started producing functions like this on a regular basis. In particular, the dynamic loader in the next Debian release (trixie) has functions like this, which means that lldb spews a bunch of warnings when debugging pretty much any binary (if the user has the corresponding debug info package installed):

$ lldb usr/lib64/ld-linux-x86-64.so.2 -O "settings set target.debug-file-search-paths ./usr/lib/debug/"
(lldb) settings set target.debug-file-search-paths ./usr/lib/debug/
(lldb) target create "usr/lib64/ld-linux-x86-64.so.2"
Current executable set to '/tmp/debian/usr/lib64/ld-linux-x86-64.so.2' (x86_64).
(lldb) process launch -s
Process 21646 stopped
* thread #1, name = 'ld-linux-x86-64', stop reason = signal SIGSTOP
    frame #0: 0x00007ffff7fe3440 ld-linux-x86-64.so.2`_start
ld-linux-x86-64.so.2`_start:
->  0x7ffff7fe3440 <+0>: movq   %rsp, %rdi
    0x7ffff7fe3443 <+3>: callq  0x7ffff7fe3fb0 ; _dl_start at rtld.c:519:1

ld-linux-x86-64.so.2`_dl_start_user:
    0x7ffff7fe3448 <+0>: movq   %rax, %r12
    0x7ffff7fe344b <+3>: movq   %rsp, %r13
error: ld-linux-x86-64.so.2 0x00085f3b: adding range [0x0000000000001ae8-0x0000000000001b07) which has a 
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
 the start of this error message
error: ld-linux-x86-64.so.2 0x00085fdb: adding range [0x0000000000001ae8-0x0000000000001b07) which has a 
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
 the start of this error message
error: ld-linux-x86-64.so.2 0x0008601f: adding range [0x0000000000001ae8-0x0000000000001b07) which has a 
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
 the start of this error message
error: ld-linux-x86-64.so.2 0x0008603c: adding range [0x0000000000001ae8-0x0000000000001b07) which has a 
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
 the start of this error message
error: ld-linux-x86-64.so.2 0x000860ad: adding range [0x0000000000001ae8-0x0000000000001b07) which has a 
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
 the start of this error message
error: ld-linux-x86-64.so.2 0x00086120: adding range [0x0000000000001ae8-0x0000000000001b07) which has a 
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
 the start of this error message
error: ld-linux-x86-64.so.2 0x00086159: adding range [0x0000000000001ae8-0x0000000000001b07) which has a 
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
 the start of this error message
error: ld-linux-x86-64.so.2 0x000863ae: adding range [0x0000000000001a0f-0x0000000000001ae8) which has a 
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
 the start of this error message
error: ld-linux-x86-64.so.2 0x000863ef: adding range [0x0000000000001a0f-0x0000000000001ae8) which has a 
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
 the start of this error message
Process 21646 launched: '/tmp/debian/usr/lib64/ld-linux-x86-64.so.2' (x86_64)

@DavidSpickett
Collaborator

I think this relates to:

2.17 Code Addresses, Ranges and Base Addresses
<...>
The base address of the scope for any of the debugging information entries listed
above is given by either the DW_AT_low_pc attribute or the first address in the
first range entry in the list of ranges given by the DW_AT_ranges attribute. If
there is no such attribute, the base address is undefined.

(https://dwarfstd.org/doc/DWARF5.pdf)

I'm just guessing here that the functions you refer to have both. It doesn't say which should win, but we can take "or" to mean "whichever actually makes sense", and in these cases you'd assume that low_pc is maybe the entry point of the function not the actual lowest PC value some block of it sits at.

Is that roughly the logic of this change? Sounds like the check itself was not 100% correct to begin with, perhaps it was for some previous standard version.

@DavidSpickett
Collaborator

> In particular, the dynamic loader in the next Debian release (trixie) has functions like this, which means that lldb spews a bunch of warnings when debugging pretty much any binary

If you add this to the PR description, anyone downstream of us seeing this error will have an easier time finding out that we fixed it.

@labath
Collaborator Author

labath commented Apr 22, 2025

> I think this relates to:
>
> 2.17 Code Addresses, Ranges and Base Addresses
> <...>
> The base address of the scope for any of the debugging information entries listed
> above is given by either the DW_AT_low_pc attribute or the first address in the
> first range entry in the list of ranges given by the DW_AT_ranges attribute. If
> there is no such attribute, the base address is undefined.
>
> (https://dwarfstd.org/doc/DWARF5.pdf)
>
> I'm just guessing here that the functions you refer to have both. It doesn't say which should win, but we can take "or" to mean "whichever actually makes sense", and in these cases you'd assume that low_pc is maybe the entry point of the function not the actual lowest PC value some block of it sits at.

It is related to that, and I think this is the correct reading of that paragraph, but the situation is simpler than that.

> Is that roughly the logic of this change? Sounds like the check itself was not 100% correct to begin with, perhaps it was for some previous standard version.

The function just has a DW_AT_ranges attribute like this:

0x00085f0f:   DW_TAG_subprogram
                DW_AT_name      ("_dl_start")
                DW_AT_decl_file ("./elf/rtld.c")
                DW_AT_decl_line (518)
                DW_AT_decl_column       (1)
                DW_AT_prototyped        (true)
                DW_AT_type      (0x0007d5fb "Elf64_Addr")
                DW_AT_ranges    (0x00004586
                   [0x000000000001cfb0, 0x000000000001d680)
                   [0x0000000000001a0f, 0x0000000000001b07))

which (according to the paragraph you quote) means that the function's entry point is 0x1cfb0, which is not the lowest address in the function.
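
To make the distinction concrete, here is a small sketch (not LLDB code; `AddressRange`, `BaseAddress`, and `LowestAddress` are names made up for illustration) that applies the quoted rule to a function described only by DW_AT_ranges: the base address is the first address of the first range, which need not be the numerically lowest address.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical half-open address range [low_pc, high_pc).
struct AddressRange {
  uint64_t low_pc;
  uint64_t high_pc;
};

// Base address for an entry that only has DW_AT_ranges: the first address
// in the first range entry, in the order the list is written.
uint64_t BaseAddress(const std::vector<AddressRange> &ranges) {
  assert(!ranges.empty() && "base address is undefined without ranges");
  return ranges.front().low_pc;
}

// Numerically lowest start address across all ranges.
uint64_t LowestAddress(const std::vector<AddressRange> &ranges) {
  assert(!ranges.empty());
  return std::min_element(ranges.begin(), ranges.end(),
                          [](const AddressRange &a, const AddressRange &b) {
                            return a.low_pc < b.low_pc;
                          })
      ->low_pc;
}
```

For the `_dl_start` ranges shown above, `BaseAddress` gives 0x1cfb0 and `LowestAddress` gives 0x1a0f.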

Now this part is still fine. The problem starts when we start parsing nested blocks:

0x00085f3b:     DW_TAG_lexical_block
                  DW_AT_ranges  (0x0000462b
                     [0x000000000001d1f8, 0x000000000001d488)
                     [0x000000000001d4e0, 0x000000000001d550)
                     [0x000000000001d608, 0x000000000001d62e)
                     [0x0000000000001ae8, 0x0000000000001b07))

This block contains a range (the last one) which is below the function's entry point (the code and the error message call it the low PC because it can be, and often is, set by DW_AT_low_pc -- I guess we should rename that), even though it's still within the bounds of the function. This code was correct in a world where the functions are always contiguous and start at the first instruction, but that's not the case now (and strictly speaking, it never was). I'm not entirely sure what prompted this check to be added in the first place, but the two candidates I can think of are:

  • bad ranges which would cause the internal block representation to overflow (because the blocks are stored as offsets from the entry point). This is no longer a problem because blocks now use signed numbers for offsets.
  • to catch ranges which have been eliminated by the linker (in this case, their address is usually set to zero). I don't know if the linker can eliminate a part of the function, but if it can, this would now be caught by the m_first_code_address check (which is a relatively new invention).
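
The first candidate can be illustrated with a short sketch. This is an assumption-laden illustration, not LLDB's representation: the function names are hypothetical, and 64-bit integers are used for simplicity since the thread does not state the width of the old offset type.

```cpp
#include <cassert>
#include <cstdint>

// Offset of a range start relative to the entry point, stored unsigned.
// If the range starts below the entry point, the subtraction wraps
// around modulo 2^64 and yields a huge bogus offset.
uint64_t UnsignedOffset(uint64_t range_low_pc, uint64_t entry_pc) {
  return range_low_pc - entry_pc;
}

// The same offset stored as a signed value: a range below the entry
// point simply gets a negative offset.
int64_t SignedOffset(uint64_t range_low_pc, uint64_t entry_pc) {
  return static_cast<int64_t>(range_low_pc - entry_pc);
}

// Recover the absolute address from the entry point and a signed offset.
uint64_t AbsoluteAddress(uint64_t entry_pc, int64_t offset) {
  return entry_pc + static_cast<uint64_t>(offset);
}
```

For the range starting at 0x1ae8 and the entry point 0x1cfb0, the signed offset is -0x1b4c8 and converts back to 0x1ae8, whereas the unsigned offset is a value far larger than any address in the module.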

(BTW, this warning does not show up on older versions of LLDB, from before I started implementing support for these kinds of functions, because LLDB then just picked the lowest address as the function entry point, and so this check would not fire.)

@DavidSpickett
Copy link
Collaborator

> which (according to the paragraph you quote) means that the function's entry point is 0x1cfb0, which is not the lowest address in the function.

I managed to nitpick the text but then go on to assume the ranges would be sorted with start address increasing. Bad idea.

But, if the first range was the [0x0000000000001a0f, 0x0000000000001b07) range, then the lexical block range [0x0000000000001ae8, 0x0000000000001b07) would not produce this error.

So then I think what if we used the minimum base address of the ranges, and I have just reinvented what you have already done with m_first_code_address. I think InitializeFirstCodeAddressRecursive is calculating exactly this.

Removing this error case sounds good to me.
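
The ordering point can be checked with a tiny model of the removed check. This is only a sketch: `AddressRange` and `PassesRemovedCheck` are hypothetical names, and it assumes the function's "low PC" is taken from the first entry of its range list, as described earlier in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical half-open address range [low_pc, high_pc).
struct AddressRange {
  uint64_t low_pc;
  uint64_t high_pc;
};

// Model of the removed check: a block range was accepted only if it
// started at or above the function's "low PC", assumed here to be the
// start of the first range in the function's range list.
bool PassesRemovedCheck(const std::vector<AddressRange> &function_ranges,
                        const AddressRange &block_range) {
  assert(!function_ranges.empty());
  const uint64_t subprogram_low_pc = function_ranges.front().low_pc;
  return block_range.low_pc >= subprogram_low_pc;
}
```

With the `_dl_start` ranges in the order they appear in the debug info, the block range [0x1ae8, 0x1b07) fails this check; with the two function ranges swapped, the same block range passes, even though the function covers exactly the same addresses.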

Collaborator

@DavidSpickett DavidSpickett left a comment

LGTM.

@DavidSpickett
Collaborator

https://github.com/search?q=%22which+has+a+base+that+is+less+than+the+function%27s+low+PC%22&type=issues shows only one open issue related to this, in Swift, and it's been dormant for a while.

It would be a good idea to quote the error message verbatim in the commit message so that this change may be found that way in future.

@labath
Collaborator Author

labath commented Apr 23, 2025

> > which (according to the paragraph you quote) means that the function's entry point is 0x1cfb0, which is not the lowest address in the function.
>
> I managed to nitpick the text but then go on to assume the ranges would be sorted with start address increasing. Bad idea.

Yeah. In principle, it's possible to have a sorted range list and then use DW_AT_low/entry_pc to denote the entry point. However, it's pretty hard to guarantee this in practice, since the compiler doesn't know the final location of the function pieces, and the linker is limited to patching offsets.

> But, if the first range was the [0x0000000000001a0f, 0x0000000000001b07) range, then the lexical block range [0x0000000000001ae8, 0x0000000000001b07) would not produce this error.

👍

> So then I think what if we used the minimum base address of the ranges, and I have just reinvented what you have already done with m_first_code_address. I think InitializeFirstCodeAddressRecursive is calculating exactly this.

They're similar but different. You (I think) had in mind using the minimum address range of the enclosing function, whereas InitializeFirstCodeAddressRecursive does it for the whole module, and it deliberately avoids using the debug info for this purpose (as the goal is to catch addresses/ranges that have been set to zero due to linker GC). So, it checks for blocks that are too low to be valid code, but it doesn't check for blocks that are "outside" of the enclosing function. For that, I have #117725, which I'm also trying to get reviewed (ping ping :P).
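
A minimal sketch of the two kinds of check being contrasted here, under stated assumptions: `Range`, `IsBelowFirstCodeAddress`, and `IsInsideFunction` are hypothetical names chosen for illustration and are not LLDB's API, and the sketch does not show how the module-wide first code address is actually computed.

```cpp
#include <cstdint>
#include <vector>

// Half-open address range [begin, end). Hypothetical helper type, not LLDB's.
struct Range {
  uint64_t begin;
  uint64_t end;
};

// Module-wide check: a block that starts below the first known code address
// of the module cannot be real code (for example, a range zeroed by linker
// GC). This says nothing about which function the block belongs to.
bool IsBelowFirstCodeAddress(const Range &block, uint64_t first_code_address) {
  return block.begin < first_code_address;
}

// Per-function check: is the block fully contained in one of the enclosing
// function's ranges? The function's ranges may be discontinuous and unsorted.
bool IsInsideFunction(const Range &block,
                      const std::vector<Range> &function_ranges) {
  for (const Range &r : function_ranges)
    if (block.begin >= r.begin && block.end <= r.end)
      return true;
  return false;
}
```

A block can pass the first check (it is high enough to be valid code) and still fail the second (it lies outside every range of its enclosing function), which is why the two are not interchangeable.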

> Removing this error case sounds good to me.

> It would be a good idea to quote the error message verbatim in the commit message so that this change may be found that way in future.

👍

@labath labath merged commit 1fd0b41 into llvm:main Apr 23, 2025
10 checks passed
@labath labath deleted the block branch April 23, 2025 13:56
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
…2395)

The check is not correct for discontinuous functions, as one of the
blocks could very well begin before the function entry point. To catch
dead-stripped ranges, I check whether the function is after the first
known code address. I don't print any error in this case as that is a
common/expected situation.

This avoids many errors like:
```
error: ld-linux-x86-64.so.2 0x00085f3b: adding range [0x0000000000001ae8-0x0000000000001b07) which has a 
base that is less than the function's low PC 0x000000000001cfb0. Please file a bug and attach the file at
 the start of this error message
```
when debugging binaries on Debian trixie because the dynamic linker
(ld-linux) contains discontinuous functions.

If the block range is not a subrange of the enclosing block then this
range will currently be added to the outer block as well (i.e., we
get the same behavior that's currently possible for non-subrange blocks
larger than function_low_pc). However, this code path is buggy and I'd
like to change that (llvm#117725).