Skip to content

[lldb] Implement LLDBMemoryReader::resolvePointer #3848

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

kastiglione
Copy link

@kastiglione kastiglione commented Jan 26, 2022

Override MemoryReader::resolvePointer in LLDBMemoryReader.

This function lets a memory reader implementation optionally convert an address into a symbol. This functionality is used in places where MetadataReader can benefit from knowing the symbol for an address. A Swift symbol can provide extra (but optional) data to MetadataReader. In particular, profiling has shown it to be quite costly for MetadataReader to load context descriptors, build a demangle node tree from those descriptors, and finally mangling it to a symbol. This work to compute the symbol can be avoided by implementing resolvePointer.

Copy link

@adrian-prantl adrian-prantl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This awesome! Do you agree that we should cherry-pick this to the swift/release/5.6 branch?

@kastiglione
Copy link
Author

I have been profiling this in combination with #3800. I will gather some profile data for this patch alone, to help us decide about 5.6. In the mean time, I will open a PR and run the tests.

@kastiglione
Copy link
Author

test/Shell/Swift/runtime-initialization.test is failing, by not resolving the dynamic type of a variable from the command target var.

@kastiglione
Copy link
Author

In my current "perf harness", I found that during the first call to GetBitSize, this patch reduces the number of process reads by 55%:

Calls to ReadMemoryFromInferior:
Before: 5166
After: 2346

@kastiglione
Copy link
Author

The test failure shines a light on something I found confusing: the name resolvePointer. Providing a symbol for an address didn't strike me as necessarily "resolving" the pointer. In fact, the type RemoteAbsolutePointer returned by resolvePointer has a method named isResolved() which only true if there's no symbol.

To put it another way, resolvePointer is given an address of a pointer, and this function can optionally return a symbol instead. But if it does return a symbol, the resulting value has isResolved() == false. This seems like "unresolving" the pointer. Does this mean the function should expect only a symbol to be given, and then the symbol should be resolved?

What does this have to do with this test failure? Well some code calls resolvePointer and then immediately checks isResolved(). For example:

llvm::Optional<StoredPointer>
readResolvedPointerValue(StoredPointer address) {
  if (auto pointer = readPointer(address)) {
    if (!pointer->isResolved())
      return None;
    return (StoredPointer)pointer->getResolvedAddress().getAddressData();
  }
  return None;
}

I wonder if there's a simple fix: instead of returning None if the pointer is not "resolved", return the address we already have. I am checking now to find out.

@kastiglione
Copy link
Author

The "simple fix" does work, I'll have to get feedback/review and run tests.

@adrian-prantl
Copy link

I wonder if there's a simple fix: instead of returning None if the pointer is not "resolved", return the address we already have. I am checking now to find out.

In a live process resolvePointer is an identity function, because all pointers are in memory that has been patched by dyld to resolve to the correct address. So if you're patching the if(!resolved) path any breakage would surface either in swift-reflection-dump or lldb.

The function is used only(?) here:

  llvm::Optional<std::pair<const TypeRef *, RemoteAddress>>
  getDynamicTypeAndAddressClassExistential(RemoteAddress ExistentialAddress) {
    auto PointerValue =
        readResolvedPointerValue(ExistentialAddress.getAddressData());

I originally thought an ExistentialAddress would never be a symbol, because it points to the address of a live object in heap or stack. But the failing testcase is a global variable, and that could actually have a symbol which LLDB finds, but which we would not want to return in LLDBMemoryReader. LLDBMemoryReader reads live process memory and reflection metdata and we only want symbols in the latter case.

@adrian-prantl
Copy link

So an alternative fix in LLDB would be to check if the symbol is in a reflection section or in __DATA,const and explicitly not in the rest of __DATA.

@kastiglione
Copy link
Author

Yes, with a bit of thinking through this, I now understand this better. Your comment helps clarify. I have added an isEffectivelyReadOnly function and only return a symbol if that's true.

if (section->GetName() == "__const")
return true;
if (auto segment = section->GetParent())
if (segment->GetName() == "__DATA_CONST")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be hardcoding implementation details of the MachO format here, but I'm sure we can sink this into Target or somewhere more appropriate

Copy link
Author

@kastiglione kastiglione Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks I am on the same page, I put this change together quickly primarily to test asynchronously.

Copy link

@augusto2112 augusto2112 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉🎉🎉

@kastiglione kastiglione requested a review from shafik January 29, 2022 04:14
@kastiglione
Copy link
Author

@shafik I have added a virtual function (bool CanContainSwiftReflectionData(const Section &)). I don't know why the base class implementation is being called instead of the expected overrides. Have I made a bonehead mistake?

@@ -390,6 +390,10 @@ void Section::SetPermissions(uint32_t permissions) {
m_executable = (permissions & ePermissionsExecutable) != 0;
}

bool Section::CanContainSwiftReflectionData() const {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe GetCanContainSwiftReflectionData having this function be the same name feels confusing especially in this PR. e.g. we can see below GetSectionData calls m_obj_file->ReadSectionData.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't suppose you have alternative naming suggestions? I would say that Can is like Is or Has, which are not conventionally combined with Get. Using same name was by design, the idea is to connect them by name. If that's confusing, then I am open to other names, but I don't think GetCan… should be it.

@@ -689,6 +690,8 @@ class ObjectFile : public std::enable_shared_from_this<ObjectFile>,
virtual llvm::StringRef
GetReflectionSectionIdentifier(swift::ReflectionSectionKind section);

virtual bool CanContainSwiftReflectionData(const Section &section);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see we assert false in the implementation why don't we make it pure virtual using =0 instead? I see we use a mix in this class. Detecting we have not implemented an override at runtime seems less ideal.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it a default behavior of return false, instead of making it pure virtual.

@kastiglione
Copy link
Author

nice, passing!

@kastiglione kastiglione force-pushed the lldb-Implement-LLDBMemoryReader-resolvePointer branch from 148d7c7 to 9f230a1 Compare February 1, 2022 15:21
@kastiglione
Copy link
Author

@swift-ci test

@kastiglione kastiglione merged commit bb7505d into stable/20211026 Feb 2, 2022
@kastiglione kastiglione deleted the lldb-Implement-LLDBMemoryReader-resolvePointer branch February 2, 2022 02:04
kastiglione added a commit that referenced this pull request Feb 4, 2022
…esolvePointer

[lldb] Implement LLDBMemoryReader::resolvePointer

(cherry picked from commit bb7505d)
@adrian-prantl
Copy link

Looking at the list of individual commits, this PR would have been a great candidate for squashing into a single commit. I'm all for always writing the smallest possible patch, but only if each patch can stand for itself and has a meaningful commit message.

@kastiglione
Copy link
Author

@JDevlieghere also brought this up. I will squash in the future.

kastiglione added a commit that referenced this pull request Mar 24, 2022
…esolvePointer

[lldb] Implement LLDBMemoryReader::resolvePointer

(cherry picked from commit bb7505d)
(cherry picked from commit 201ddc5)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants