Skip to content

[lldb] Register reflection images with LLDBMemoryReader #3800

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

augusto2112
Copy link

No description provided.

@augusto2112
Copy link
Author

@swift-ci test

Copy link

@adrian-prantl adrian-prantl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really promising! For some reason the patch is quite large, I wonder if we could frontload a few NFC refactors and get the distracting changes out of the way first.

[&](swift::ReflectionSectionKind section_kind)
-> std::pair<swift::remote::RemoteRef<void>, uint64_t> {
auto section_name =
ConstString(obj_file_format->getSectionName(section_kind));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the ConstString necessary here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because there's no ConstString overload that compares a ConstString to a StringRef.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be more efficient to use if (section->GetName().AsStringRef() == section_name) than looking up and potentially allocating a new ConstString.

// The start of the masked address is right after the previous
// registered address,
// so subtract it and 1 to get to the file address.
file_address = address - (pair_iterator - 1)->first - 1;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this check that the address actually falls inside of the image that lower_bound() returned?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, by checking the iterator in the beginning: pair_iterator != m_address_map.end

@augusto2112 augusto2112 force-pushed the enable-filecache-again branch from c08ad1e to 188ed2d Compare January 17, 2022 15:02
@augusto2112
Copy link
Author

@swift-ci test

@augusto2112 augusto2112 force-pushed the enable-filecache-again branch 3 times, most recently from fd71269 to 6b3121c Compare January 26, 2022 16:05
@augusto2112 augusto2112 changed the base branch from stable/20210726 to stable/20211026 January 26, 2022 16:05
@augusto2112
Copy link
Author

Please test with following pull request:
swiftlang/swift#41021

preset=lldb-pull-request
@swift-ci Please test with preset macOS Platform

@augusto2112
Copy link
Author

Please test with following pull request:
swiftlang/swift#41021

preset=lldb-pull-request
@swift-ci Please test with preset Linux Platform

@@ -134,15 +134,21 @@ bool LLDBMemoryReader::readBytes(swift::remote::RemoteAddress address,
LLDB_LOGV(log, "[MemoryReader] asked to read {0} bytes at address {1:x}",
size, address.getAddressData());

Address addr = Address(address.getAddressData());
if (auto resolved_addr = tryToConvertFromMappedAddressIntoRealAddress(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there more specific terms that we could use instead of Mapped and Real? Is Real maybe a FileAddress (which would be a term of art in LLDB that is well understood?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Real would be an instance of Address which Target knows how to read (in this case, we convert it to the section + offset form of an Address). Not sure what the best name would be, I've been using "artificial" on the documentation to describe these fake addresses, so maybe tryToConvertFromArtificialToLLDBAddress?

@augusto2112
Copy link
Author

Please test with following pull request:
swiftlang/swift#41021

@swift-ci Please test


for (auto swift_mask : used_swift_masks) {
assert((next_available_address & swift_mask) != swift_mask &&
"LLDB artificial address clashes with a swift mask!");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

neat!
If we use static_assert, maybe this could even be a constexpr and be checked even if assertions are disabled.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately this can't can't be be a const_expr since next_available_address depends on the size of the images at runtime.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, what we want to check is that the 0x2000000000000000 constant will never appear in a pointer that would appear in the process. For that we just need to check that it's contained in SWIFT_ABI_ARM64_SWIFT_SPARE_BITS_MASK.
So a static_assert(0x2000000000000000 & SWIFT_ABI_ARM64_SWIFT_SPARE_BITS_MASK == 0x2000000000000000) should be sufficient for that purpose. (Repeat with all other platforms).

// bit set.
next_available_address = m_address_map.back().first;
// Add 8 to it to keep it pointer aligned. This assumes the previous address
// was also pointer aligned.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it's written, this comment doesn't really make sense, because this only works if the address was already aligned and then adding 8 doesn't change anything in terms of aligned-ness. SO either the comment is wrong or this should be a call to llvm::alignTo() from MathExtras.h.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should work as the size of the previous image is aligned, I wasn't aware of llvm::alignTo() though, I'll use that instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the bit manipulation further down to use llvm::alignTo(), but decided to keep the plus 8 here, as it's actually clearer to understand, but changed the code and comments to clarify that we add 8 to the end of the previous image's address (which is already pointer aligned).

private:
Process &m_process;
size_t m_max_read_amount;

llvm::Optional<uint64_t> m_local_buffer;
uint64_t m_local_buffer_size = 0;
// A map from "artificial" addresses that we hand out to modules that
// LLDBMemoryReader will read from.
std::vector<std::pair<uint64_t, lldb::ModuleSP>> m_address_map;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a vector, should this be m_address_list, or even better m_section_ranges? Can you document that this is guaranteed to be sorted?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though it's a vector, we use it as a map from ranges -> modules, so what about m_range_module_map?

}

llvm::Optional<Address>
LLDBMemoryReader::tryToConvertFromMappedAddressIntoRealAddress(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like every call site just unwraps the optional and assigns it to the original value. Maybe this function should return the original address if it failed?

Maybe call it resolve..?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could do that, but I personally like to make it explicit that the function may fail...

I'll change the name though.

return {};

// If the address contains our mask, this is an image we registered.
if (address & 0x2000000000000000) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 0x20000 should be a constant declared only once

auto comparison_pair = std::make_pair(address, ModuleSP());
auto pair_iterator = std::lower_bound(m_address_map.begin(),
m_address_map.end(), comparison_pair);
if (pair_iterator != m_address_map.end()) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::lowerbound could return an entry that doesn't contain the searched address, since it returns the insert point for that value. Can you add error handling for that?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't understand, in what situation can that happen?

@augusto2112 augusto2112 force-pushed the enable-filecache-again branch 2 times, most recently from 8d55ee8 to ab7b54a Compare January 26, 2022 21:13
@augusto2112
Copy link
Author

swiftlang/swift#41021

@swift-ci test

Copy link

@adrian-prantl adrian-prantl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is in a shape where we can land this once all outstanding comments are addressed. Thanks!

llvm::Optional<std::pair<uint64_t, uint64_t>> addModuleToAddressMap(lldb::ModuleSP module);

private:
/// Given a possibly artificial address, try to map it back into an Address.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming suggestion: replace artificial with tagged?

// registered address,
// subtract it and 8 (pointer alignment) to get to the virtual file
// address.
file_address = address - (pair_iterator - 1)->first - 8;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

personally I find std::prev(pair_iterator) more readable than pair_iterator-1. Up to you.

@@ -134,15 +136,21 @@ bool LLDBMemoryReader::readBytes(swift::remote::RemoteAddress address,
LLDB_LOGV(log, "[MemoryReader] asked to read {0} bytes at address {1:x}",
size, address.getAddressData());

Address addr = Address(address.getAddressData());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion:
we could change resolveFromArtificialToLLDBAddress to

lldb::Address LLDBMemoryReader::resolveRemoteAddress(uint64_t address) const { }

and have it return lldb::Address(address) in the error path. Then this whole block would become

lldb::Address addr = resolveRemoteAddress(address.getAddressData());

if (pair_iterator == m_range_module_map.begin())
// Since this is the first registered module,
// just unsetting the mask will give the virtual file address.
file_address = address & ~ARTIFICIAL_ADDRESS_MASK;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There needs to be a check:

if (file_address < _pair_iterator.first) {
  LLDB_LOG(log, "[MemoryReader] Address is tagged, but doesn't exist in m_range_module_map");
  return {}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly we could also store the last valid virtual address and have a sanity upper bounds check as well

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the map, we store the end address, not start address, of each image, std::lower_bound: "Returns an iterator pointing to the first element in the range [first, last) that is not less than (i.e. greater or equal to) value, or last if no such element is found."

Let's say that the fist image goes from 0x200...00 to 0x200..10, and we're looking for a tagged address 0x200..05. We'll end in this first case, and because we know the address is tagged, we know for sure it's in range.

For the upper bound, we already check pair_iterator != m_range_module_map.end(), m_range_module_map.end() is what will be return if the address we're looking for is larger than anything in the map.

[&](swift::ReflectionSectionKind section_kind)
-> std::pair<swift::remote::RemoteRef<void>, uint64_t> {
auto section_name =
ConstString(obj_file_format->getSectionName(section_kind));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be more efficient to use if (section->GetName().AsStringRef() == section_name) than looking up and potentially allocating a new ConstString.

// set.
// The start address of the next image is the next available pointer aligned
// address (this works because all the end addresses are also pointer
// aligned).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding +8 does not change the pointer-sized-alignment of an address. If the address was 8-byte aligned before, it's still after, and if it wasn't it won't be 8-byte-aligned afterwards either.
It's also suspicious that the resolve method then subtracts 8 again. Is this necessary at all?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's necessary because what we're adding 8 to is the end of the previous image. Think about it this way, the first image is mapped in the range [0x2...00 - 0x2...16], we want to map the second image the next pointer aligned address, and we know that the end of the previous image is pointer aligned, so we add 8 to it and get 0x2...24.

Similarly, when we resolve this address back to its virtual address value, we'd like to undo this operation, so we subtract the end of the previous address (0x2...16) and 8, to get back what the original virtual address is without the mask.

@@ -7,6 +7,9 @@ let Definition = "target_experimental" in {
def SwiftCreateModuleContextsInParallel : Property<"swift-create-module-contexts-in-parallel", "Boolean">,
DefaultTrue,
Desc<"Create the per-module Swift AST contexts in parallel.">;
def SwiftReadMetadataFromFileCache: Property<"swift-read-metadata-from-file-cache", "Boolean">,
DefaultTrue,
Desc<"Read metadata from the file cache instead of the process when possible">;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Read Swift reflection metadata ...

@augusto2112 augusto2112 force-pushed the enable-filecache-again branch from ab7b54a to c151c78 Compare January 27, 2022 15:20
@augusto2112 augusto2112 marked this pull request as ready for review January 27, 2022 15:20
@augusto2112
Copy link
Author

preset=lldb-pull-request
@swift-ci Please test

@augusto2112 augusto2112 force-pushed the enable-filecache-again branch from c151c78 to a665192 Compare January 27, 2022 19:10
@augusto2112
Copy link
Author

preset=lldb-pull-request
@swift-ci Please test with preset

@augusto2112
Copy link
Author

preset=lldb-pull-request
@swift-ci Please test

address.getAddressData());
return false;
}
auto addr = *maybeAddr;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good!

SWIFT_ABI_X86_64_SWIFT_SPARE_BITS_MASK,
SWIFT_ABI_X86_64_OBJC_RESERVED_BITS_MASK,
SWIFT_ABI_X86_64_SIMULATOR_OBJC_RESERVED_BITS_MASK,
SWIFT_ABI_X86_64_IS_OBJC_BIT,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this SWIFT_ABI_X86_64_IS_OBJC_BIT included in SWIFT_ABI_X86_64_SWIFT_SPARE_BITS_MASK?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, should we remove it?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we only need a static_assert that our tag is inside of SWIFT_ABI_*_SWIFT_SPARE_BITS_MASK, to check that our tag will not collide with real addresses in the process. And then a second check to make sure that our tag does not collide with the OBJC bit to make sure our addresses don't look like OBjC tagged pointers to ReflectionContext.


for (auto swift_mask : used_swift_masks) {
assert((module_start_address & swift_mask) != swift_mask &&
"LLDB tagged address clashes with a swift mask!");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now checking that the start address doesn't use any masked bits. That's good I think we also should check that our own tag bit don't conflict with any runtime bits.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by our own tag bit? If TAGGED_ADDRESS_MASK itself conflicts with runtime bits?

// just unsetting the mask will give the virtual file address.
file_address = address & ~TAGGED_ADDRESS_MASK;
else
// The previous pair's address is the start of the current image.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:
// The end of the previous section is the start of the current one.

@augusto2112 augusto2112 force-pushed the enable-filecache-again branch from a665192 to c3c8472 Compare January 27, 2022 21:45
@augusto2112
Copy link
Author

@swift-ci test

// we can tell what the upper bound of all mapped images is.
// Addresses are always inserted in ascending order, so the data structure is
// guaranteed to be sorted.
std::vector<std::pair<uint64_t, lldb::ModuleSP>> m_range_module_map;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about?

// LLDBMemoryReader prefers to read reflection metadata from the
// binary on disk, which is faster than reading it out of process
// memory, especially when debugging remotely.  To achieve this LLDB
// registers virtual addresses starting at (0x0 &
// LLDB_VIRTUAL_ADDRESS_BIT) with ReflectionContext.  Sorted by
// virtual address, m_lldb_virtual_address_map stores each
// lldb::Module and the first virtual address after the end of that
// module's virtual address space.
std::vector<std::pair<uint64_t, lldb::ModuleSP>> m_lldb_virtual_address_map;

private:
Process &m_process;
size_t m_max_read_amount;

llvm::Optional<uint64_t> m_local_buffer;
uint64_t m_local_buffer_size = 0;

// A map from tagged addresses that we hand out to modules that

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

///

// guaranteed to be sorted.
std::vector<std::pair<uint64_t, lldb::ModuleSP>> m_range_module_map;

// Tagged addresses we hand out will contain this mask to indicate that

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/// The bit used to tag LLDB's virtual addresses as such. See \c m_lldb_virtual_address_map.

@augusto2112 augusto2112 force-pushed the enable-filecache-again branch from c3c8472 to 2239b23 Compare January 28, 2022 18:36
@augusto2112
Copy link
Author

@swift-ci test

m_range_module_map.begin(), m_range_module_map.end(), comparison_pair,
[](auto &a, auto &b) { return a.first < b.first; });

// If the address is larger than anything we have mapped.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a full sentence :-)


// If the address is larger than anything we have mapped.
if (pair_iterator == m_range_module_map.end())
return {};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be logged (with LOGV)?


return resolved;

LLDB_LOGV(log,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is dead code.

uint64_t file_address;
if (pair_iterator == m_range_module_map.begin())
// Since this is the first registered module,
// just unsetting the mask will give the virtual file address.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just unsetting the mask will -> clearing the tag bit

Copy link

@adrian-prantl adrian-prantl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two more tiny inline comments, but otherwise this is good!

@augusto2112 augusto2112 force-pushed the enable-filecache-again branch from 2239b23 to 91f7f83 Compare January 31, 2022 18:57
@augusto2112
Copy link
Author

@swift-ci test

@augusto2112 augusto2112 force-pushed the enable-filecache-again branch from 91f7f83 to b5cf6ba Compare January 31, 2022 20:48
@augusto2112
Copy link
Author

@swift-ci test

@augusto2112 augusto2112 merged commit 1e62148 into swiftlang:stable/20211026 Feb 1, 2022
kastiglione added a commit that referenced this pull request Feb 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants