[LLDB] Improve ObjectFileELF files ability to load from memory. #100900
This is the first part in improving ELF core files in LLDB. This first patch teaches LLDB to get as much data as possible from an in-memory ELF image where we don't have section headers mapped into the process' memory. Core files contain the main executable mapped into memory, so this patch will allow us to read it in, get the UUID and also find the DT_DEBUG entry so the dynamic loader can get a full list of shared libraries from the linked list in memory.

This patch modifies ObjectFileELF to be able to more successfully load ELF files from memory. Highlights include:
- Load the .dynamic section using the PT_DYNAMIC program header if there are no section headers.
- Load ELF notes from the PT_NOTE program header if there are no section headers.
- Modify ObjectFileELF::GetArchitecture() to use the PT_NOTE segment if no section headers are available.
- Modify ObjectFileELF::GetUUID() to use the PT_NOTE segment if no section headers are available. This allows us to find the UUID of a binary in core files without having the executable on disk.
- Modify ObjectFileELF::GetSegmentData() to read data from memory if the program header's data doesn't exist in the first memory region for the object file in memory.
- Modify ObjectFileELF::GetImageInfoAddress() to not rely on section headers. This will allow us in a follow-up patch to not require the executable when loading an ELF core file, as we can load the executable image from memory and find the DT_DEBUG entry.
- Fix an issue where we would not check if we got any data when we tried to parse the section headers from memory, and we would create many SHT_NULL sections. We now check if we have data before we resize our section_headers array.
- Modify ObjectFileELF::ParseDynamicSymbols() to parse and cache all dynamic entries and their names, and change all code that read dynamic entries over to using the cached values. Previously different areas of code would manually parse the dynamic entries; some would get the names and some wouldn't.
- Add dumping of the .dynamic section when running the "target modules dump objfile" command. This is used to test that this feature works in the unit test and verifies we can get the DT_* entries needed for this to work.
- Fix dumping of the program headers to work for 64 bit and have the table aligned to the ASCII column headers.
- Load the .dynstr string table from the DT_STRTAB dynamic entry.
- Load the dynamic symbol table from the DT_SYMTAB dynamic entry.
- Modify lldb_private::Process to use the memory region information to get the size of the memory region that contains the object file header to make sure we read as much as we can and don't run into loading the program headers, .dynstr or .dynsym from program headers
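The GetSegmentData() fallback described in the highlights can be sketched roughly as follows. This is a standalone illustration, not the code from the patch (that hunk is truncated in the diff): `Segment`, `Bytes`, `MemoryReader`, `GetSegmentBytes` and `load_bias` are hypothetical names, and the sketch assumes the cached buffer starts at the ELF header so that file offsets index directly into it.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical stand-in for the few program header fields used here.
struct Segment {
  uint64_t p_offset; // offset of the segment within the file image
  uint64_t p_vaddr;  // unslid virtual address of the segment
  uint64_t p_filesz; // number of bytes backed by the file
};

// Hypothetical callback that reads `size` bytes at `addr` from the process
// (or core file) and returns std::nullopt on failure.
using MemoryReader =
    std::function<std::optional<Bytes>(uint64_t addr, uint64_t size)>;

// Returns the segment contents, preferring the bytes already cached from the
// start of the image and falling back to a memory read otherwise.
Bytes GetSegmentBytes(const Bytes &cached, const Segment &seg,
                      uint64_t load_bias, const MemoryReader &read_memory) {
  // Fast path: the whole segment lies inside the bytes we already hold.
  if (seg.p_offset <= cached.size() &&
      seg.p_filesz <= cached.size() - seg.p_offset) {
    const auto first =
        cached.begin() + static_cast<std::ptrdiff_t>(seg.p_offset);
    return Bytes(first, first + static_cast<std::ptrdiff_t>(seg.p_filesz));
  }
  // Fallback: the segment is outside the cached region, so read it from
  // memory at its loaded address.
  if (std::optional<Bytes> bytes =
          read_memory(load_bias + seg.p_vaddr, seg.p_filesz))
    return std::move(*bytes);
  return {};
}
```

An empty result signals that neither source had the data, which mirrors how the callers in the diff test `GetByteSize()` before using a segment.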
If anyone knows any ELF file experts, please add them as reviewers |
@llvm/pr-subscribers-lldb
Author: Greg Clayton (clayborg)
Patch is 33.95 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/100900.diff 6 Files Affected:
diff --git a/lldb/include/lldb/Target/Process.h b/lldb/include/lldb/Target/Process.h
index c8475db8ae160..17e18261b4752 100644
--- a/lldb/include/lldb/Target/Process.h
+++ b/lldb/include/lldb/Target/Process.h
@@ -1952,7 +1952,7 @@ class Process : public std::enable_shared_from_this<Process>,
lldb::ModuleSP ReadModuleFromMemory(const FileSpec &file_spec,
lldb::addr_t header_addr,
- size_t size_to_read = 512);
+ size_t size_to_read = 0);
/// Attempt to get the attributes for a region of memory in the process.
///
diff --git a/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp b/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
index 890db5c274814..4bd6df0d642b3 100644
--- a/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
+++ b/lldb/source/Plugins/ObjectFile/ELF/ObjectFileELF.cpp
@@ -22,6 +22,7 @@
#include "lldb/Host/LZMA.h"
#include "lldb/Symbol/DWARFCallFrameInfo.h"
#include "lldb/Symbol/SymbolContext.h"
+#include "lldb/Target/Process.h"
#include "lldb/Target/SectionLoadList.h"
#include "lldb/Target/Target.h"
#include "lldb/Utility/ArchSpec.h"
@@ -810,11 +811,29 @@ bool ObjectFileELF::ParseHeader() {
}
UUID ObjectFileELF::GetUUID() {
+ if (m_uuid)
+ return m_uuid;
+
// Need to parse the section list to get the UUIDs, so make sure that's been
// done.
if (!ParseSectionHeaders() && GetType() != ObjectFile::eTypeCoreFile)
return UUID();
+ // Try loading note info from PT_NOTE if section headers didn't find any.
+ // If we load an ELF file from memory, then we should be able to still
+ // read the note data.
+ for (const ELFProgramHeader &H : ProgramHeaders()) {
+ if (H.p_type == llvm::ELF::PT_NOTE) {
+ DataExtractor note_data = GetSegmentData(H);
+ if (note_data.GetByteSize()) {
+ lldb_private::ArchSpec arch_spec;
+ RefineModuleDetailsFromNote(note_data, arch_spec, m_uuid);
+ if (m_uuid)
+ return m_uuid;
+ }
+ }
+ }
+
if (!m_uuid) {
using u32le = llvm::support::ulittle32_t;
if (GetType() == ObjectFile::eTypeCoreFile) {
@@ -873,33 +892,35 @@ Address ObjectFileELF::GetImageInfoAddress(Target *target) {
if (!section_list)
return Address();
- // Find the SHT_DYNAMIC (.dynamic) section.
- SectionSP dynsym_section_sp(
- section_list->FindSectionByType(eSectionTypeELFDynamicLinkInfo, true));
- if (!dynsym_section_sp)
- return Address();
- assert(dynsym_section_sp->GetObjectFile() == this);
-
- user_id_t dynsym_id = dynsym_section_sp->GetID();
- const ELFSectionHeaderInfo *dynsym_hdr = GetSectionHeaderByIndex(dynsym_id);
- if (!dynsym_hdr)
- return Address();
-
for (size_t i = 0; i < m_dynamic_symbols.size(); ++i) {
- ELFDynamic &symbol = m_dynamic_symbols[i];
+ const ELFDynamic &symbol = m_dynamic_symbols[i].symbol;
if (symbol.d_tag == DT_DEBUG) {
// Compute the offset as the number of previous entries plus the size of
// d_tag.
- addr_t offset = i * dynsym_hdr->sh_entsize + GetAddressByteSize();
- return Address(dynsym_section_sp, offset);
+ addr_t offset = (i * 2 + 1) * GetAddressByteSize();
+ addr_t file_addr = m_dynamic_base_addr + offset;
+ SectionList *section_list = GetSectionList();
+ if (section_list) {
+ Address addr;
+ // Resolving the file address works in two cases here:
+ // 1 - We have a SHT_DYNAMIC section.
+ // 2 - We don't have section headers but have a PT_DYNAMIC segment.
+ if (addr.ResolveAddressUsingFileSections(file_addr, section_list))
+ return addr;
+ }
}
// MIPS executables uses DT_MIPS_RLD_MAP_REL to support PIE. DT_MIPS_RLD_MAP
// exists in non-PIE.
else if ((symbol.d_tag == DT_MIPS_RLD_MAP ||
symbol.d_tag == DT_MIPS_RLD_MAP_REL) &&
target) {
- addr_t offset = i * dynsym_hdr->sh_entsize + GetAddressByteSize();
+ SectionSP dynsym_section_sp(section_list->FindSectionByType(
+ eSectionTypeELFDynamicLinkInfo, true));
+ if (!dynsym_section_sp)
+ return Address();
+
+ addr_t offset = (i * 2 + 1) * GetAddressByteSize();
addr_t dyn_base = dynsym_section_sp->GetLoadBaseAddress(target);
if (dyn_base == LLDB_INVALID_ADDRESS)
return Address();
@@ -927,7 +948,6 @@ Address ObjectFileELF::GetImageInfoAddress(Target *target) {
}
}
}
-
return Address();
}
@@ -970,66 +990,27 @@ Address ObjectFileELF::GetBaseAddress() {
return LLDB_INVALID_ADDRESS;
}
-// ParseDependentModules
size_t ObjectFileELF::ParseDependentModules() {
if (m_filespec_up)
return m_filespec_up->GetSize();
m_filespec_up = std::make_unique<FileSpecList>();
- if (!ParseSectionHeaders())
- return 0;
-
- SectionList *section_list = GetSectionList();
- if (!section_list)
- return 0;
-
- // Find the SHT_DYNAMIC section.
- Section *dynsym =
- section_list->FindSectionByType(eSectionTypeELFDynamicLinkInfo, true)
- .get();
- if (!dynsym)
- return 0;
- assert(dynsym->GetObjectFile() == this);
-
- const ELFSectionHeaderInfo *header = GetSectionHeaderByIndex(dynsym->GetID());
- if (!header)
- return 0;
- // sh_link: section header index of string table used by entries in the
- // section.
- Section *dynstr = section_list->FindSectionByID(header->sh_link).get();
- if (!dynstr)
+ if (!ParseDynamicSymbols())
return 0;
- DataExtractor dynsym_data;
- DataExtractor dynstr_data;
- if (ReadSectionData(dynsym, dynsym_data) &&
- ReadSectionData(dynstr, dynstr_data)) {
- ELFDynamic symbol;
- const lldb::offset_t section_size = dynsym_data.GetByteSize();
- lldb::offset_t offset = 0;
-
- // The only type of entries we are concerned with are tagged DT_NEEDED,
- // yielding the name of a required library.
- while (offset < section_size) {
- if (!symbol.Parse(dynsym_data, &offset))
- break;
-
- if (symbol.d_tag != DT_NEEDED)
- continue;
-
- uint32_t str_index = static_cast<uint32_t>(symbol.d_val);
- const char *lib_name = dynstr_data.PeekCStr(str_index);
- FileSpec file_spec(lib_name);
+ for (const auto &entry : m_dynamic_symbols) {
+ if (entry.symbol.d_tag != DT_NEEDED)
+ continue;
+ if (!entry.name.empty()) {
+ FileSpec file_spec(entry.name);
FileSystem::Instance().Resolve(file_spec);
m_filespec_up->Append(file_spec);
}
}
-
return m_filespec_up->GetSize();
}
-// GetProgramHeaderInfo
size_t ObjectFileELF::GetProgramHeaderInfo(ProgramHeaderColl &program_headers,
DataExtractor &object_data,
const ELFHeader &header) {
@@ -1466,16 +1447,16 @@ size_t ObjectFileELF::GetSectionHeaderInfo(SectionHeaderColl &section_headers,
Log *log = GetLog(LLDBLog::Modules);
- section_headers.resize(header.e_shnum);
- if (section_headers.size() != header.e_shnum)
- return 0;
-
const size_t sh_size = header.e_shnum * header.e_shentsize;
const elf_off sh_offset = header.e_shoff;
DataExtractor sh_data;
if (sh_data.SetData(object_data, sh_offset, sh_size) != sh_size)
return 0;
+ section_headers.resize(header.e_shnum);
+ if (section_headers.size() != header.e_shnum)
+ return 0;
+
uint32_t idx;
lldb::offset_t offset;
for (idx = 0, offset = 0; idx < header.e_shnum; ++idx) {
@@ -2472,32 +2453,87 @@ size_t ObjectFileELF::ParseDynamicSymbols() {
if (m_dynamic_symbols.size())
return m_dynamic_symbols.size();
+ DataExtractor dynsym_data;
+ DataExtractor dynstr_data;
+
SectionList *section_list = GetSectionList();
- if (!section_list)
- return 0;
+ if (section_list) {
+ // The dynamic symbols can be found in a .dynamic section, or if we loaded
+ // the ELF file from memory we won't have section headers, so we should also
+ // check the program headers for a PT_DYNAMIC segment.
+
+ // Find the SHT_DYNAMIC section.
+ Section *dynamic =
+ section_list->FindSectionByType(eSectionTypeELFDynamicLinkInfo, true)
+ .get();
+ if (dynamic) {
+ const ELFSectionHeaderInfo *header =
+ GetSectionHeaderByIndex(dynamic->GetID());
+ if (header) {
+ // sh_link: section header index of string table used by entries in the
+ // section.
+ Section *dynstr = section_list->FindSectionByID(header->sh_link).get();
+ if (dynstr)
+ ReadSectionData(dynstr, dynstr_data);
+ }
+ assert(dynamic->GetObjectFile() == this);
+ if (ReadSectionData(dynamic, dynsym_data))
+ m_dynamic_base_addr = dynamic->GetFileAddress();
+ }
+ }
- // Find the SHT_DYNAMIC section.
- Section *dynsym =
- section_list->FindSectionByType(eSectionTypeELFDynamicLinkInfo, true)
- .get();
- if (!dynsym)
- return 0;
- assert(dynsym->GetObjectFile() == this);
+ // Check for a PT_DYNAMIC if we didn't find a SHT_DYNAMIC section.
+ if (dynsym_data.GetByteSize() == 0) {
+ for (const ELFProgramHeader &H : ProgramHeaders()) {
+ if (H.p_type == llvm::ELF::PT_DYNAMIC) {
+ dynsym_data = GetSegmentData(H);
+ if (dynsym_data.GetByteSize() > 0)
+ m_dynamic_base_addr = H.p_vaddr;
+ break;
+ }
+ }
+ }
- ELFDynamic symbol;
- DataExtractor dynsym_data;
- if (ReadSectionData(dynsym, dynsym_data)) {
- const lldb::offset_t section_size = dynsym_data.GetByteSize();
- lldb::offset_t cursor = 0;
+ if (dynsym_data.GetByteSize() == 0)
+ return 0;
- while (cursor < section_size) {
- if (!symbol.Parse(dynsym_data, &cursor))
+ ELFDynamicWithName e;
+ const lldb::offset_t dynsym_data_size = dynsym_data.GetByteSize();
+ lldb::offset_t cursor = 0;
+ while (cursor < dynsym_data_size && e.symbol.Parse(dynsym_data, &cursor)) {
+ m_dynamic_symbols.push_back(e);
+ if (e.symbol.d_tag == DT_NULL)
+ break;
+ }
+ // Now try and read the name of the dynamic entries, but we first need to
+ // check if we have the data we need in dynstr_data. If we don't and we have
+ // an ELF file in memory, we can read it from the DT_STRTAB and DT_STRSZ
+ // .dynamic values we just extracted.
+ if (dynstr_data.GetByteSize() == 0) {
+ if (auto strtab_data = ReadStrtabDataFromMemory())
+ dynstr_data = std::move(*strtab_data);
+ }
+ if (dynstr_data.GetByteSize() > 0) {
+ for (ELFDynamicWithName &entry : m_dynamic_symbols) {
+ switch (entry.symbol.d_tag) {
+ case DT_NEEDED:
+ case DT_SONAME:
+ case DT_RPATH:
+ case DT_RUNPATH:
+ case DT_AUXILIARY:
+ case DT_FILTER: {
+ lldb::offset_t cursor = entry.symbol.d_val;
+ const char *name = dynstr_data.GetCStr(&cursor);
+ if (name)
+ entry.name = name;
break;
-
- m_dynamic_symbols.push_back(symbol);
+ }
+ default:
+ break;
+ }
}
}
-
+ // Return the number of dynamic symbols we have parsed.
return m_dynamic_symbols.size();
}
@@ -2505,13 +2541,9 @@ const ELFDynamic *ObjectFileELF::FindDynamicSymbol(unsigned tag) {
if (!ParseDynamicSymbols())
return nullptr;
- DynamicSymbolCollIter I = m_dynamic_symbols.begin();
- DynamicSymbolCollIter E = m_dynamic_symbols.end();
- for (; I != E; ++I) {
- ELFDynamic *symbol = &*I;
-
- if (symbol->d_tag == tag)
- return symbol;
+ for (const auto &entry : m_dynamic_symbols) {
+ if (entry.symbol.d_tag == tag)
+ return &entry.symbol;
}
return nullptr;
@@ -3025,6 +3057,21 @@ void ObjectFileELF::ParseSymtab(Symtab &lldb_symtab) {
ParseSymbolTable(&lldb_symtab, symbol_id, dynsym);
symbol_id += num_symbols;
m_address_class_map.merge(address_class_map);
+ } else if (IsInMemory()) {
+ // If this is an in memory ELF file, then we won't have the section
+ // headers, but we can still parse the dynamic symbol table from the
+ // .dynamic section and reading it from memory.
+ uint32_t num_symbols = 0;
+ std::optional<DataExtractor> symtab_data =
+ ReadSymtabDataFromMemory(num_symbols);
+ std::optional<DataExtractor> strtab_data = ReadStrtabDataFromMemory();
+ if (symtab_data && strtab_data) {
+ auto [num_symbols_parsed, address_class_map] =
+ ParseSymbols(&lldb_symtab, symbol_id, section_list, num_symbols,
+ symtab_data.value(), strtab_data.value());
+ symbol_id += num_symbols_parsed;
+ m_address_class_map.merge(address_class_map);
+ }
}
}
@@ -3230,8 +3277,10 @@ void ObjectFileELF::Dump(Stream *s) {
ArchSpec header_arch = GetArchitecture();
*s << ", file = '" << m_file
- << "', arch = " << header_arch.GetArchitectureName() << "\n";
-
+ << "', arch = " << header_arch.GetArchitectureName();
+ if (m_memory_addr != LLDB_INVALID_ADDRESS)
+ s->Printf(", addr = %#16.16" PRIx64, m_memory_addr);
+ s->EOL();
DumpELFHeader(s, m_header);
s->EOL();
DumpELFProgramHeaders(s);
@@ -3248,6 +3297,8 @@ void ObjectFileELF::Dump(Stream *s) {
s->EOL();
DumpDependentModules(s);
s->EOL();
+ DumpELFDynamic(s);
+ s->EOL();
}
// DumpELFHeader
@@ -3336,10 +3387,10 @@ void ObjectFileELF::DumpELFHeader_e_ident_EI_DATA(Stream *s,
void ObjectFileELF::DumpELFProgramHeader(Stream *s,
const ELFProgramHeader &ph) {
DumpELFProgramHeader_p_type(s, ph.p_type);
- s->Printf(" %8.8" PRIx64 " %8.8" PRIx64 " %8.8" PRIx64, ph.p_offset,
+ s->Printf(" %16.16" PRIx64 " %16.16" PRIx64 " %16.16" PRIx64, ph.p_offset,
ph.p_vaddr, ph.p_paddr);
- s->Printf(" %8.8" PRIx64 " %8.8" PRIx64 " %8.8x (", ph.p_filesz, ph.p_memsz,
- ph.p_flags);
+ s->Printf(" %16.16" PRIx64 " %16.16" PRIx64 " %8.8x (", ph.p_filesz,
+ ph.p_memsz, ph.p_flags);
DumpELFProgramHeader_p_flags(s, ph.p_flags);
s->Printf(") %8.8" PRIx64, ph.p_align);
@@ -3361,6 +3412,8 @@ void ObjectFileELF::DumpELFProgramHeader_p_type(Stream *s, elf_word p_type) {
CASE_AND_STREAM(s, PT_PHDR, kStrWidth);
CASE_AND_STREAM(s, PT_TLS, kStrWidth);
CASE_AND_STREAM(s, PT_GNU_EH_FRAME, kStrWidth);
+ CASE_AND_STREAM(s, PT_GNU_RELRO, kStrWidth);
+ CASE_AND_STREAM(s, PT_GNU_STACK, kStrWidth);
default:
s->Printf("0x%8.8x%*s", p_type, kStrWidth - 10, "");
break;
@@ -3386,10 +3439,12 @@ void ObjectFileELF::DumpELFProgramHeaders(Stream *s) {
return;
s->PutCString("Program Headers\n");
- s->PutCString("IDX p_type p_offset p_vaddr p_paddr "
- "p_filesz p_memsz p_flags p_align\n");
- s->PutCString("==== --------------- -------- -------- -------- "
- "-------- -------- ------------------------- --------\n");
+ s->PutCString(
+ "IDX p_type p_offset p_vaddr p_paddr "
+ "p_filesz p_memsz p_flags p_align\n");
+ s->PutCString(
+ "==== --------------- ---------------- ---------------- ---------------- "
+ "---------------- ---------------- ------------------------- --------\n");
for (const auto &H : llvm::enumerate(m_program_headers)) {
s->Format("[{0,2}] ", H.index());
@@ -3492,6 +3547,111 @@ void ObjectFileELF::DumpDependentModules(lldb_private::Stream *s) {
}
}
+std::string static getDynamicTagAsString(uint16_t Arch, uint64_t Type) {
+#define DYNAMIC_STRINGIFY_ENUM(tag, value) \
+ case value: \
+ return #tag;
+
+#define DYNAMIC_TAG(n, v)
+ switch (Arch) {
+ case llvm::ELF::EM_AARCH64:
+ switch (Type) {
+#define AARCH64_DYNAMIC_TAG(name, value) DYNAMIC_STRINGIFY_ENUM(name, value)
+#include "llvm/BinaryFormat/DynamicTags.def"
+#undef AARCH64_DYNAMIC_TAG
+ }
+ break;
+
+ case llvm::ELF::EM_HEXAGON:
+ switch (Type) {
+#define HEXAGON_DYNAMIC_TAG(name, value) DYNAMIC_STRINGIFY_ENUM(name, value)
+#include "llvm/BinaryFormat/DynamicTags.def"
+#undef HEXAGON_DYNAMIC_TAG
+ }
+ break;
+
+ case llvm::ELF::EM_MIPS:
+ switch (Type) {
+#define MIPS_DYNAMIC_TAG(name, value) DYNAMIC_STRINGIFY_ENUM(name, value)
+#include "llvm/BinaryFormat/DynamicTags.def"
+#undef MIPS_DYNAMIC_TAG
+ }
+ break;
+
+ case llvm::ELF::EM_PPC:
+ switch (Type) {
+#define PPC_DYNAMIC_TAG(name, value) DYNAMIC_STRINGIFY_ENUM(name, value)
+#include "llvm/BinaryFormat/DynamicTags.def"
+#undef PPC_DYNAMIC_TAG
+ }
+ break;
+
+ case llvm::ELF::EM_PPC64:
+ switch (Type) {
+#define PPC64_DYNAMIC_TAG(name, value) DYNAMIC_STRINGIFY_ENUM(name, value)
+#include "llvm/BinaryFormat/DynamicTags.def"
+#undef PPC64_DYNAMIC_TAG
+ }
+ break;
+
+ case llvm::ELF::EM_RISCV:
+ switch (Type) {
+#define RISCV_DYNAMIC_TAG(name, value) DYNAMIC_STRINGIFY_ENUM(name, value)
+#include "llvm/BinaryFormat/DynamicTags.def"
+#undef RISCV_DYNAMIC_TAG
+ }
+ break;
+ }
+#undef DYNAMIC_TAG
+ switch (Type) {
+// Now handle all dynamic tags except the architecture specific ones
+#define AARCH64_DYNAMIC_TAG(name, value)
+#define MIPS_DYNAMIC_TAG(name, value)
+#define HEXAGON_DYNAMIC_TAG(name, value)
+#define PPC_DYNAMIC_TAG(name, value)
+#define PPC64_DYNAMIC_TAG(name, value)
+#define RISCV_DYNAMIC_TAG(name, value)
+// Also ignore marker tags such as DT_HIOS (maps to DT_VERNEEDNUM), etc.
+#define DYNAMIC_TAG_MARKER(name, value)
+#define DYNAMIC_TAG(name, value) \
+ case value: \
+ return #name;
+#include "llvm/BinaryFormat/DynamicTags.def"
+#undef DYNAMIC_TAG
+#undef AARCH64_DYNAMIC_TAG
+#undef MIPS_DYNAMIC_TAG
+#undef HEXAGON_DYNAMIC_TAG
+#undef PPC_DYNAMIC_TAG
+#undef PPC64_DYNAMIC_TAG
+#undef RISCV_DYNAMIC_TAG
+#undef DYNAMIC_TAG_MARKER
+#undef DYNAMIC_STRINGIFY_ENUM
+ default:
+ return "<unknown:>0x" + llvm::utohexstr(Type, true);
+ }
+}
+
+void ObjectFileELF::DumpELFDynamic(lldb_private::Stream *s) {
+ ParseDynamicSymbols();
+ if (m_dynamic_symbols.empty())
+ return;
+
+ s->PutCString(".dynamic:\n");
+ s->PutCString("IDX d_tag d_val/d_ptr\n");
+ s->PutCString("==== ---------------- ------------------\n");
+ uint32_t idx = 0;
+ for (const auto &entry : m_dynamic_symbols) {
+ s->Printf("[%2u] ", idx++);
+ s->Printf(
+ "%-16s 0x%16.16" PRIx64,
+ getDynamicTagAsString(m_header.e_machine, entry.symbol.d_tag).c_str(),
+ entry.symbol.d_ptr);
+ if (!entry.name.empty())
+ s->Printf(" \"%s\"", entry.name.c_str());
+ s->EOL();
+ }
+}
+
ArchSpec ObjectFileELF::GetArchitecture() {
if (!ParseHeader())
return ArchSpec();
@@ -3501,17 +3661,17 @@ ArchSpec ObjectFileELF::GetArchitecture() {
ParseSectionHeaders();
}
- if (CalculateType() == eTypeCoreFile &&
- !m_arch_spec.TripleOSWasSpecified()) {
- // Core files don't have section headers yet they have PT_NOTE program
- // headers that might shed more light on the architecture
- for (const elf::ELFProgramHeader &H : ProgramHeaders()) {
- if (H.p_type != PT_NOTE || H.p_offset == 0 || H.p_filesz == 0)
- continue;
- DataExtractor data;
- if (data.SetData(m_data, H.p_offset, H.p_filesz) == H.p_filesz) {
- UUID uuid;
- RefineModuleDetailsFromNote(data, m_arch_spec, uuid);
+ // Check the program headers for more details. Core files and ELF files read
+ // from memory only have program headers.
+ if (!m_arch_spec.TripleOSWasSpecified()) {
+ // Try loading note info from PT_NOTE if section headers didn't find any.
+ for (const ELFProgramHeader &H : ProgramHeaders()) {
+ if (H.p_type == llvm::ELF::PT_NOTE) {
+ DataExtractor note_data = GetSegmentData(H);
+ if (note_data.GetByteSize()) {
+ UUID uuid;
+ RefineModuleDetailsFromNote(note_data, m_arch_spec, uuid);
+ }
}
}
}
@@ -3664,7 +3824,27 @@ llvm::ArrayRef<ELFProgramHeader> ObjectFileELF::ProgramHeaders() {
}
DataExtractor ObjectFileELF::GetSegmentData(const ELFProgramHeader &H) {
- return DataExtractor(m_data, H.p_offset, H.p_filesz);
+ // Try and read the program header from our cached m_data which can come from
+ // the file on disk being mmap'ed or from the initial part of the ELF fil...
[truncated]
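The ParseDynamicSymbols() and GetImageInfoAddress() hunks above rely on two simple facts about the .dynamic array: it is a sequence of (d_tag, d_val) pairs terminated by DT_NULL, and the value of entry `i` lives at byte offset `(i * 2 + 1) * address_size`. A minimal standalone sketch, assuming a 64-bit image in host byte order, and using hypothetical names (`DynEntry`, `ParseDynamic`, `ValueOffset`) rather than LLDB's types:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Tag values from the ELF gABI.
constexpr int64_t kDT_NULL = 0;
constexpr int64_t kDT_NEEDED = 1;
constexpr int64_t kDT_DEBUG = 21;

// Hypothetical cached entry: the raw pair plus an optional resolved name.
struct DynEntry {
  int64_t d_tag;
  uint64_t d_val;
  std::string name;
};

// Parses 16-byte dynamic entries until DT_NULL (inclusive) or end of data.
// `strtab` holds the raw .dynstr bytes, including embedded NULs.
std::vector<DynEntry> ParseDynamic(const std::vector<uint8_t> &dynamic,
                                   const std::string &strtab) {
  constexpr size_t kEntrySize = 16;
  std::vector<DynEntry> entries;
  for (size_t off = 0; off + kEntrySize <= dynamic.size();
       off += kEntrySize) {
    DynEntry e{};
    std::memcpy(&e.d_tag, dynamic.data() + off, 8);
    std::memcpy(&e.d_val, dynamic.data() + off + 8, 8);
    // Only resolve names for tags whose value is a string table offset;
    // this sketch handles DT_NEEDED only.
    if (e.d_tag == kDT_NEEDED && e.d_val < strtab.size())
      e.name = strtab.c_str() + e.d_val; // copies up to the next NUL
    entries.push_back(e);
    if (e.d_tag == kDT_NULL)
      break;
  }
  return entries;
}

// Byte offset of the d_val/d_ptr field of entry `index`: skip `index` full
// pairs, then skip the tag of the entry itself.
constexpr uint64_t ValueOffset(uint64_t index, uint64_t address_size) {
  return (index * 2 + 1) * address_size;
}
```

Computing the offset from the address size, instead of from `sh_entsize`, is what lets the same arithmetic work when only a PT_DYNAMIC segment is available.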
|
I'm not familiar with lldb, but I can make some comments as @tschuett invited me:) Parsing PT_NOTE is a great step, as program headers are sufficient for executables, shared objects, and core dumps. Has the code been updated to handle p_align=8 PT_LOAD correctly? https://reviews.llvm.org/D150022 might be a related change from LLVMObject.
|
I heard rumours that you know a bit about ELF :) |
I like this idea a lot, but I have some reservations about the implementation. For one, I think this patch is too big. I once heard someone say "if you have bullet points in your patch description, then the patch is doing too much". While I don't think we should go as far as to create a separate PR for each of your bullet points, I do believe that splitting it up into a couple of pieces would go a long way towards making it easier to review. I also see that some of the functionality is guarded by Finally, I think that structuring some of this code as "fallback" is not ideal, as it can cause some data to be parsed twice (I think it happens at least with ELF notes in this patch). Even if that's innocuous, I don't think it's right because the two mechanisms (program and section headers) are just different ways of finding the same data. I think it'd be cleaner if this was implemented as a two-step process:
I realise this feedback isn't very specific, but that's because I found it very hard to follow everything that's going on in this patch. I'm sure I'll be able to be more specific on the partial patches (and maybe some of my assumptions will turn out to be incorrect). As a first patch in the series, I'd recommend teaching lldb to parse section-header-less object files. Right now, if I run lldb on such a file, it will tell me that it's empty (has zero sections). Making the program headers visible would lay the foundation for other changes, and it would also be the smallest testable piece of functionality (by dumping the section list). @MaskRay can you recommend a good way to create these kinds of files? I was thinking of a combination |
ok, I will break this up to make it easier to review. |
yaml2obj omits the section header table when
However, obj2yaml doesn't create If the test utilizes |
Thanks. This is very helpful. For basic tests, I think we could use yaml2obj, but it would be also nice to have a test which builds and executes a real section-free executable. That one would use a linker, but we should be able to do it by just postprocessing the built executable with llvm-objcopy. So, while we would most likely use this feature if it existed, I think we can get by without it. |
I am going to split this up. New PR with just .dynamic changes is here: |
GNU ld since 2.41 supports this option, which is mildly useful. It omits the section header table and non-ALLOC sections (including .symtab/.strtab (--strip-all)). This option is simple to implement and might be used by LLDB to test program headers parsing without the section header table (#100900). -z sectionheader, which is the default, is also added. Pull Request: #101286