Skip to content

[LLDB] Fix crash in TypeSystemClang::GetIndexofChildMemberWithName. #117808

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Dec 4, 2024

Conversation

cmtice
Copy link
Contributor

@cmtice cmtice commented Nov 26, 2024

LLDB can crash in TypeSystemClang::GetIndexOfChildMemberWithName, at a point where it pushes an index onto the child_indexes vector, tries to call itself recursively, then tries to pop the entry from child_indexes. The problem is that the recursive call can clear child_indexes, so that this code ends up trying to pop an already empty vector. This change saves the old vector before the push, then restores the saved vector rather than trying to pop.

LLDB can crash in TypeSystemClang::GetIndexOfChildMemberWithName, at a
point where it pushes an index onto the child_indexes vector, tries to call
itself recursively, then tries to pop the entry from child_indexes.
The problem is that the recursive call can clear child_indexes, so that
this code ends up trying to pop an already empty vector.  This change saves
the old vector before the push, then restores the saved vector rather than
trying to pop.
@cmtice cmtice requested a review from JDevlieghere as a code owner November 26, 2024 23:16
@cmtice cmtice requested a review from labath November 26, 2024 23:16
@llvmbot llvmbot added the lldb label Nov 26, 2024
@cmtice cmtice requested a review from jimingham November 26, 2024 23:16
@llvmbot
Copy link
Member

llvmbot commented Nov 26, 2024

@llvm/pr-subscribers-lldb

Author: None (cmtice)

Changes

LLDB can crash in TypeSystemClang::GetIndexOfChildMemberWithName, at a point where it pushes an index onto the child_indexes vector, tries to call itself recursively, then tries to pop the entry from child_indexes. The problem is that the recursive call can clear child_indexes, so that this code ends up trying to pop an already empty vector. This change saves the old vector before the push, then restores the saved vector rather than trying to pop.


Full diff: https://github.com/llvm/llvm-project/pull/117808.diff

1 Files Affected:

  • (modified) lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp (+2-2)
diff --git a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
index 1a77c7cf9161a0..16eca7700d9fff 100644
--- a/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
+++ b/lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
@@ -6754,12 +6754,12 @@ size_t TypeSystemClang::GetIndexOfChildMemberWithName(
           llvm::StringRef field_name = field->getName();
           if (field_name.empty()) {
             CompilerType field_type = GetType(field->getType());
+            std::vector<uint32_t> save_indices = child_indexes;
             child_indexes.push_back(child_idx);
             if (field_type.GetIndexOfChildMemberWithName(
                     name, omit_empty_base_classes, child_indexes))
               return child_indexes.size();
-            child_indexes.pop_back();
-
+            child_indexes = save_indices;
           } else if (field_name == name) {
             // We have to add on the number of base classes to this index!
             child_indexes.push_back(

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is definitely due for a cleanup. Do you have a reduced test-case for this by any chance? One way I see the vector would get cleared is if the recursive call ended up failing in GetIndexForRecordBase. This does seem like a plausible crash, but it might also be hiding a different bug. So a test-case would be interesting to see

@cmtice
Copy link
Contributor Author

cmtice commented Nov 27, 2024

Here's the test case I used:
$ cat main.cpp
struct A {
struct {
int x = 1;
};
int y = 2;6
} a;

struct B {
// Anonymous struct inherits another struct.
struct : public A {
int z = 3;
};
int w = 4;
A a;
} b;

return 0; // Set a breakpoint here
}
$ clang++ -g -O0 main.cpp
$ lldb a.out

Break at 'main' and step to the return statement. Then
(lldb) frame var b.x
LLDB crashes, trying to pop the empty vector.

@labath
Copy link
Collaborator

labath commented Nov 28, 2024

Great. Can you turn that into a test?

@Michael137
Copy link
Member

Here's the test case I used: $ cat main.cpp struct A { struct { int x = 1; }; int y = 2;6 } a;

struct B { // Anonymous struct inherits another struct. struct : public A { int z = 3; }; int w = 4; A a; } b;

return 0; // Set a breakpoint here } $ clang++ -g -O0 main.cpp $ lldb a.out

Break at 'main' and step to the return statement. Then (lldb) frame var b.x LLDB crashes, trying to pop the empty vector.

Thanks! Agree with Pavel, lets turn this into a test-case as part of this PR.

@@ -6754,12 +6754,12 @@ size_t TypeSystemClang::GetIndexOfChildMemberWithName(
llvm::StringRef field_name = field->getName();
if (field_name.empty()) {
CompilerType field_type = GetType(field->getType());
std::vector<uint32_t> save_indices = child_indexes;
Copy link
Member

@Michael137 Michael137 Nov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we get away with not ever clearing the vector. And instead it returning an llvm::Expected<uint32_t> from GetIndexOfChildMemberWithName when we fail to GetIndexForRecordBase/GetIndexForRecordChild? That feels easier to reason about. Wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could do this; before I go ahead, I would like confirmation that this is what folks really want me to do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I take that back. I've been playing around with this, and not clearing the vector is not a viable solution, because the code the clears the vector (when an error condition is encountered) is inside a loop that also pushes items into the vector. So there could be several successful iterations, lengthening the vector, but final failure. At the point of failure there could be some number of no-longer-valid indexes that have been added to the vector, so clearing it is the only option (other than saving it before the loop and restoring it when the error condition is encountered...).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And (to clarify a bit more), the reason that that is a problem is because at line 6759 below, we make a recursive call, but if it fails we don't return an error; instead we continue executing in this function (but now with a child_indexes vector that would be incorrect if the recursive call hadn't cleared it or restored it to its original state).

Copy link
Collaborator

@labath labath Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, my preferred solution would be to have this function return a vector<int> (apparently, there's even a TODO in the header about that (*)). That way you can build the result from the bottom up, and only add things to the vector when you've found the thing you're searching for and are certain that you're going to return successfully. That's going to be slightly slower than passing the same vector always (but maybe not slower than making a copy of it preemptively). However, I can live with both your and Michael's solution (which I think could still work if you keep the pop_back statement so that the next iteration starts with a valid state). I'll leave it up to Michael to decide, as he spends most of the time looking at this code.

(*) Technically the TODO is about returning a vector<vector<int>> of all possible matches -- which may be a good idea overall, but I don't think we want to do that now.

Copy link
Member

@Michael137 Michael137 Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there could be several successful iterations, lengthening the vector, but final failure. At the point of failure there could be some number of no-longer-valid indexes that have been added to the vector, so clearing it is the only option (other than saving it before the loop and restoring it when the error condition is encountered...).

And (to clarify a bit more), the reason that that is a problem is because at line 6759 below, we make a recursive call, but if it fails we don't return an error; instead we continue executing in this function (but now with a child_indexes vector that would be incorrect if the recursive call hadn't cleared it or restored it to its original state).

Right, but if we changed GetIndexOfChildMemberWithName to return an Expected, then we'd be able to handle the failure of the recursive call too. My thinking was that the case where Clang finds a path to a base-class but we then fail to compute the index to that element feels like a situation where we should bail out of this function. Although the counterpoint would be, what if Clang found paths to multiple base-classes, and only one of them failed. Then yes, we need the backtracking that you implement here (though that feels like a strange case that we should at the very least log). But looking at the current implementation, wouldn't we return a bogus result if Clang found multiple CXXBasePaths anyway? It would just add all the indices into child_indexes (it doesn't just pick "the first result" like the FIXME that Pavel points to suggests).

Correct me if I'm misunderstanding your points though!

I'm happy to go with @cmtice's approach for now and clean up the function as a follow-up. It looks like there's a few things wrong with it. And the test-case you added here will be useful to drive that refactor.

I think I prefer @labath's suggestion over mine too. That feels like the most robust option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether or not we return an Expected, the only way to make this code work (without major changes) is to save the vector before the recursive call and restore it in the case where the recursive call fails (and, with Expected, returns an llvm:Error). If the recursive call is successful, we immediately return the size of the vector (which was updated to contain the right values). But if the recursive call fails, then the code continues execution, looking for alternate paths, and assumes that the vector is in the same state it was before the recursive call was made. Since the recursive call can update the vector before failing (and returning the error if we're using Expected), unless we restore it after the failure (and also consume the error), the following calculations may return an incorrect result (I saw this happen when I was testing your suggestion).

I would really prefer to NOT try to refactor and fix this entire function myself, so if you will accept my fix (once I fix the test case as requested) I would really appreciate it.

@Michael137
Copy link
Member

Michael137 commented Nov 28, 2024

This is quite an interesting case. Basically the issue arises only because we have an anonymous struct in B with a base class that has an "indirect field" called x:

IndirectFieldDecl 0x104145868 <<invalid sloc>> <invalid sloc> implicit x 'int'  
|-Field 0x104145758 '' 'A::(anonymous struct)'
`-Field 0x104145700 'x' 'int'                                                         

It's important that the nested structure in B is anonymous (not unnamed) because that's what causes us to recurse. Clang creates a FieldDecl with an empty name for that anonymous structure.

The lookup into base class A for name x works fine from Clang's perspective. But then we try to GetIndexForRecordChild to find the index of x in A, but that fails because RecordDecl::field_begin doesn't iterate over IndirectFieldDecls.

So even if we fix the crash, we still wouldn't be able to access x. I think we can address that separately from the crash, but I think we should not be clearing the vector in TypeSystemClang::GetIndexofChildMemberWithName and instead propagate errors properly.

Separately (outside the scope of this PR) we should make GetIndexForRecordChild account for fields of anonymous structures. E.g., we might want to do another lookup for x into the anonymous structure.

Copy link

github-actions bot commented Dec 1, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link

github-actions bot commented Dec 1, 2024

✅ With the latest revision this PR passed the Python code formatter.

Comment on lines 2 to 16
struct A {
struct {
int x = 1;
};
int y = 2;
} a;

struct B {
// Anonymous struct inherits another struct.
struct : public A {
int z = 3;
};
int w = 4;
A a;
} b;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid running the binary by making this a global variable (and then using target variable in the test case)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -6754,12 +6754,12 @@ size_t TypeSystemClang::GetIndexOfChildMemberWithName(
llvm::StringRef field_name = field->getName();
if (field_name.empty()) {
CompilerType field_type = GetType(field->getType());
std::vector<uint32_t> save_indices = child_indexes;
Copy link
Collaborator

@labath labath Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, my preferred solution would be to have this function return a vector<int> (apparently, there's even a TODO in the header about that (*)). That way you can build the result from the bottom up, and only add things to the vector when you've found the thing you're searching for and are certain that you're going to return successfully. That's going to be slightly slower than passing the same vector always (but maybe not slower than making a copy of it preemptively). However, I can live with both your and Michael's solution (which I think could still work if you keep the pop_back statement so that the next iteration starts with a valid state). I'll leave it up to Michael to decide, as he spends most of the time looking at this code.

(*) Technically the TODO is about returning a vector<vector<int>> of all possible matches -- which may be a good idea overall, but I don't think we want to do that now.

Copy link
Member

@Michael137 Michael137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, happy to postpone clean up this function to a separate PR.

Left some comments on the test

struct {
int x = 1;
};
int y = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is y necessary for the reproducer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No; leftover from other tests. Will remove it.

struct : public A {
int z = 3;
};
int w = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are w and a necessary for the reproducer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No; leftovers from previous testing. Will remove.

@cmtice cmtice merged commit 095c3c9 into llvm:main Dec 4, 2024
5 of 6 checks passed
@cmtice cmtice deleted the dil-type-fixes branch December 11, 2024 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants