Skip to content

[LLDB] Fix crash in TypeSystemClang::GetIndexofChildMemberWithName. #117808

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Dec 4, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -6754,12 +6754,12 @@ size_t TypeSystemClang::GetIndexOfChildMemberWithName(
llvm::StringRef field_name = field->getName();
if (field_name.empty()) {
CompilerType field_type = GetType(field->getType());
std::vector<uint32_t> save_indices = child_indexes;
Copy link
Member

@Michael137 Michael137 Nov 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we get away with not ever clearing the vector. And instead it returning an llvm::Expected<uint32_t> from GetIndexOfChildMemberWithName when we fail to GetIndexForRecordBase/GetIndexForRecordChild? That feels easier to reason about. Wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could do this; before I go ahead, I would like confirmation that this is what folks really want me to do?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I take that back. I've been playing around with this, and not clearing the vector is not a viable solution, because the code the clears the vector (when an error condition is encountered) is inside a loop that also pushes items into the vector. So there could be several successful iterations, lengthening the vector, but final failure. At the point of failure there could be some number of no-longer-valid indexes that have been added to the vector, so clearing it is the only option (other than saving it before the loop and restoring it when the error condition is encountered...).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And (to clarify a bit more), the reason that that is a problem is because at line 6759 below, we make a recursive call, but if it fails we don't return an error; instead we continue executing in this function (but now with a child_indexes vector that would be incorrect if the recursive call hadn't cleared it or restored it to its original state).

Copy link
Collaborator

@labath labath Dec 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, my preferred solution would be to have this function return a vector<int> (apparently, there's even a TODO in the header about that (*)). That way you can build the result from the bottom up, and only add things to the vector when you've found the thing you're searching for and are certain that you're going to return successfully. That's going to be slightly slower than passing the same vector always (but maybe not slower than making a copy of it preemptively). However, I can live with both your and Michael's solution (which I think could still work if you keep the pop_back statement so that the next iteration starts with a valid state). I'll leave it up to Michael to decide, as he spends most of the time looking at this code.

(*) Technically the TODO is about returning a vector<vector<int>> of all possible matches -- which may be a good idea overall, but I don't think we want to do that now.

Copy link
Member

@Michael137 Michael137 Dec 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there could be several successful iterations, lengthening the vector, but final failure. At the point of failure there could be some number of no-longer-valid indexes that have been added to the vector, so clearing it is the only option (other than saving it before the loop and restoring it when the error condition is encountered...).

And (to clarify a bit more), the reason that that is a problem is because at line 6759 below, we make a recursive call, but if it fails we don't return an error; instead we continue executing in this function (but now with a child_indexes vector that would be incorrect if the recursive call hadn't cleared it or restored it to its original state).

Right, but if we changed GetIndexOfChildMemberWithName to return an Expected, then we'd be able to handle the failure of the recursive call too. My thinking was that the case where Clang finds a path to a base-class but we then fail to compute the index to that element feels like a situation where we should bail out of this function. Although the counterpoint would be, what if Clang found paths to multiple base-classes, and only one of them failed. Then yes, we need the backtracking that you implement here (though that feels like a strange case that we should at the very least log). But looking at the current implementation, wouldn't we return a bogus result if Clang found multiple CXXBasePaths anyway? It would just add all the indices into child_indexes (it doesn't just pick "the first result" like the FIXME that Pavel points to suggests).

Correct me if I'm misunderstanding your points though!

I'm happy to go with @cmtice's approach for now and clean up the function as a follow-up. It looks like there's a few things wrong with it. And the test-case you added here will be useful to drive that refactor.

I think I prefer @labath's suggestion over mine too. That feels like the most robust option.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether or not we return an Expected, the only way to make this code work (without major changes) is to save the vector before the recursive call and restore it in the case where the recursive call fails (and, with Expected, returns an llvm:Error). If the recursive call is successful, we immediately return the size of the vector (which was updated to contain the right values). But if the recursive call fails, then the code continues execution, looking for alternate paths, and assumes that the vector is in the same state it was before the recursive call was made. Since the recursive call can update the vector before failing (and returning the error if we're using Expected), unless we restore it after the failure (and also consume the error), the following calculations may return an incorrect result (I saw this happen when I was testing your suggestion).

I would really prefer to NOT try to refactor and fix this entire function myself, so if you will accept my fix (once I fix the test case as requested) I would really appreciate it.

child_indexes.push_back(child_idx);
if (field_type.GetIndexOfChildMemberWithName(
name, omit_empty_base_classes, child_indexes))
return child_indexes.size();
child_indexes.pop_back();

child_indexes = std::move(save_indices);
} else if (field_name == name) {
// We have to add on the number of base classes to this index!
child_indexes.push_back(
Expand Down
3 changes: 3 additions & 0 deletions lldb/test/API/commands/target/anon-struct/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
CXX_SOURCES := main.cpp

include Makefile.rules
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
"""
Test handling of Anonymous Structs, especially that they don't crash lldb.
"""


import lldb
import lldbsuite.test.lldbutil as lldbutil
from lldbsuite.test.decorators import *
from lldbsuite.test.lldbtest import *
import os
import shutil
import time


class TestFrameVarAnonStruct(TestBase):
# If your test case doesn't stress debug info, then
# set this to true. That way it won't be run once for
# each debug info format.
NO_DEBUG_INFO_TESTCASE = True

def test_frame_var(self):
self.build()
self.do_test()

def do_test(self):
target = self.createTestTarget()

# Verify that we don't crash in this case.
self.expect(
"target variable 'b.x'",
error=True,
substrs=["can't find global variable 'b.x'"],
)
14 changes: 14 additions & 0 deletions lldb/test/API/commands/target/anon-struct/main.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
struct A {
struct {
int x = 1;
};
} a;

struct B {
// Anonymous struct inherits another struct.
struct : public A {
int z = 3;
};
} b;

int main(int argc, char **argv) { return 0; }
Loading