Skip to content

[lldb] Refactor TypeQuery::ContextMatches #99305

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 0 additions & 5 deletions lldb/include/lldb/Symbol/Type.h
Original file line number Diff line number Diff line change
Expand Up @@ -65,11 +65,6 @@ struct CompilerContext {
llvm::raw_ostream &operator<<(llvm::raw_ostream &os,
const CompilerContext &rhs);

/// Match \p context_chain against \p pattern, which may contain "Any"
/// kinds. The \p context_chain should *not* contain any "Any" kinds.
bool contextMatches(llvm::ArrayRef<CompilerContext> context_chain,
llvm::ArrayRef<CompilerContext> pattern);

FLAGS_ENUM(TypeQueryOptions){
e_none = 0u,
/// If set, TypeQuery::m_context contains an exact context that must match
Expand Down
128 changes: 86 additions & 42 deletions lldb/source/Symbol/Type.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,9 @@
//
//===----------------------------------------------------------------------===//

#include <algorithm>
#include <cstdio>
#include <iterator>
#include <optional>

#include "lldb/Core/Module.h"
Expand All @@ -30,6 +32,7 @@
#include "lldb/Target/Process.h"
#include "lldb/Target/Target.h"
#include "lldb/lldb-enumerations.h"
#include "lldb/lldb-private-enumerations.h"

#include "llvm/ADT/StringRef.h"

Expand All @@ -43,35 +46,6 @@ llvm::raw_ostream &lldb_private::operator<<(llvm::raw_ostream &os,
return os << lldb_stream.GetString();
}

bool lldb_private::contextMatches(llvm::ArrayRef<CompilerContext> context_chain,
llvm::ArrayRef<CompilerContext> pattern) {
auto ctx = context_chain.begin();
auto ctx_end = context_chain.end();
for (const CompilerContext &pat : pattern) {
// Early exit if the pattern is too long.
if (ctx == ctx_end)
return false;
if (*ctx != pat) {
// Skip any number of module matches.
if (pat.kind == CompilerContextKind::AnyModule) {
// Greedily match 0..n modules.
ctx = std::find_if(ctx, ctx_end, [](const CompilerContext &ctx) {
return ctx.kind != CompilerContextKind::Module;
});
continue;
}
// See if there is a kind mismatch; they should have 1 bit in common.
if (((uint16_t)ctx->kind & (uint16_t)pat.kind) == 0)
return false;
// The name is ignored for AnyModule, but not for AnyType.
if (pat.kind != CompilerContextKind::AnyModule && ctx->name != pat.name)
return false;
}
++ctx;
}
return true;
}

static CompilerContextKind ConvertTypeClass(lldb::TypeClass type_class) {
if (type_class == eTypeClassAny)
return CompilerContextKind::AnyType;
Expand Down Expand Up @@ -153,19 +127,89 @@ void TypeQuery::SetLanguages(LanguageSet languages) {

bool TypeQuery::ContextMatches(
llvm::ArrayRef<CompilerContext> context_chain) const {
if (GetExactMatch() || context_chain.size() == m_context.size())
return ::contextMatches(context_chain, m_context);

// We don't have an exact match, we need to bottom m_context.size() items to
// match for a successful lookup.
if (context_chain.size() < m_context.size())
return false; // Not enough items in context_chain to allow for a match.

size_t compare_count = context_chain.size() - m_context.size();
return ::contextMatches(
llvm::ArrayRef<CompilerContext>(context_chain.data() + compare_count,
m_context.size()),
m_context);
auto ctx = context_chain.rbegin(), ctx_end = context_chain.rend();
for (auto pat = m_context.rbegin(), pat_end = m_context.rend();
pat != pat_end;) {

// Handle AnyModule matches. These are tricky as they can match any number
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know why AnyModule allows for this wildcard behaviour? I don't see it used anywhere, only exact Module matches

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a very good question actually. I'm pretty sure we're not using it. I assumed it was being used for some Apple thing with -gmodules or whatever, but now mention that, it doesn't appear to be actually used except for some specialized tests.

So, as much as I loved implementing a glob algorithm, I would love even more to delete it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually just grepped for this in the apple/llvm-project fork and looks like we're using it in the Swift plugin: https://github.com/swiftlang/llvm-project/blob/ee8bc8b8d30eb99807adbcd3c1f044e00af66cdd/lldb/source/Plugins/TypeSystem/Swift/TypeSystemSwiftTypeRef.cpp#L219-L225

@adrian-prantl @augusto2112 @kastiglione could you elaborate on the usage of this lookup-type here and why it's needed? An unfortunate side-effect of only having it be exercised in the Swift plugin is that we need to support this non-trivial matching algorithm while having the tests in a different repo.

That being said, I don't want to block this PR on this question, since the change itself LGTM

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with waiting, as this is a non-trivial piece of code, and if that's the only use case, then we can probably come up with something simpler. For example, some new TypeQuery flag which basically says "exact match except for modules" (that's something we may want to do for anonymous namespaces as well).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

Generally, it seems like following two concepts have been conflated in CompilerDeclContextKind: having wildcards in a TypeQuery versus matching two entities regardless of DeclContextKind (e.g., if a union or enum have the same name, then AnyType would allow a match). And as you say, having the wildcard behaviour live separately would make the CompilerDeclContextKind easier to understand again.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... Ping? Do we still need to wait for opinions from @adrian-prantl @augusto2112 @kastiglione or should I just go ahead and implement the simplified version?

Copy link
Member

@Michael137 Michael137 Jul 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adrian-prantl was OOO for a couple of weeks, but should be back today. Not sure he'll see this today though (I suspect the review queue might be quite high). I can help run the swift plugin tests if you end up refactoring this. Think it would also be fine to merge this and rework it afterwards. Whatever you prefer

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to create simplified version of the patch (probably tomorrow). I'd appreciate if you could give it a spin.

// of modules.
if (pat->kind == CompilerContextKind::AnyModule) {
// Successive wildcards are equivalent to a single wildcard.
while (pat->kind == CompilerContextKind::AnyModule)
++pat;

// [ctx, ctx_module_end) is the range of entries that may be matched by
// our wildcard.
auto ctx_module_end =
std::find_if(ctx, ctx_end, [](const CompilerContext &c) {
return c.kind != CompilerContextKind::Module;
});

// [pat, exact_pat_end) is the range of exact module match patterns. If
// it's not empty, we need to make sure our wildcard does not consume
// entries matched by those.
auto exact_pat_end =
std::find_if(pat, pat_end, [](const CompilerContext &p) {
return (p.kind & CompilerContextKind::AnyModule) !=
CompilerContextKind::Module;
Comment on lines +152 to +154
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this just be:

Suggested change
std::find_if(pat, pat_end, [](const CompilerContext &p) {
return (p.kind & CompilerContextKind::AnyModule) !=
CompilerContextKind::Module;
std::find_if(pat, pat_end, [](const CompilerContext &p) {
return p.kind != CompilerContextKind::Module;

Since we've already skipped all the AnyModule patterns before we get here?

});

if (pat == exact_pat_end) {
// No exact matches, just consume everything.
ctx = ctx_module_end;
continue;
}

// We have a non-empty module sequence after the wildcard. Now we need to
// look at what comes *after* that. If that's another wildcard, we want
// *this* wildcard to match as little as possible to give the other
// wildcard (and whatever comes after it) as much freedom as possible. If
// it's *not* another wildcard (we've reached the end of the pattern, or
// we have some non-module patterns after this), we want to match as much
// as possible, as there will be nothing left to match any remaining
// module entries.
bool greedy_match = exact_pat_end == pat_end ||
exact_pat_end->kind != CompilerContextKind::AnyModule;

auto pred = [](const CompilerContext &c, const CompilerContext &p) {
return c.name == p.name;
};
auto pos =
greedy_match
? std::find_end(ctx, ctx_module_end, pat, exact_pat_end, pred)
: std::search(ctx, ctx_module_end, pat, exact_pat_end, pred);

if (pos == ctx_module_end)
return false; // Matching failed.

// We've successfully matched the wildcard and the exact-module sequence
// that comes after it.
ctx = std::next(pos, std::distance(pat, exact_pat_end));
pat = exact_pat_end;
continue;
}

if (ctx == ctx_end)
return false; // Pattern too long.

// See if there is a kind mismatch; they should have 1 bit in common.
if ((ctx->kind & pat->kind) == CompilerContextKind())
return false;

if (ctx->name != pat->name)
return false;

++ctx;
++pat;
}

// At this point, we have exhausted the pattern and we have a partial match at
// least. If that's all we're looking for, we're done.
if (!GetExactMatch())
return true;

// We have an exact match if we've exhausted the target context as well.
return ctx == ctx_end;
}

bool TypeQuery::LanguageMatches(lldb::LanguageType language) const {
Expand Down
118 changes: 83 additions & 35 deletions lldb/unittests/Symbol/TestType.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7,13 +7,16 @@
//
//===----------------------------------------------------------------------===//

#include "gmock/gmock.h"
#include "gtest/gtest.h"

#include "lldb/Symbol/Type.h"
#include "lldb/lldb-enumerations.h"
#include "lldb/lldb-private-enumerations.h"

using namespace lldb;
using namespace lldb_private;
using testing::Not;

TEST(Type, GetTypeScopeAndBasename) {
EXPECT_EQ(Type::GetTypeScopeAndBasename("int"),
Expand Down Expand Up @@ -47,40 +50,85 @@ TEST(Type, GetTypeScopeAndBasename) {
EXPECT_EQ(Type::GetTypeScopeAndBasename("foo<::bar"), std::nullopt);
}

namespace {
MATCHER_P(Matches, pattern, "") {
TypeQuery query(pattern, TypeQueryOptions::e_none);
return query.ContextMatches(arg);
}
} // namespace

TEST(Type, CompilerContextPattern) {
std::vector<CompilerContext> mmc = {
{CompilerContextKind::Module, ConstString("A")},
{CompilerContextKind::Module, ConstString("B")},
{CompilerContextKind::ClassOrStruct, ConstString("S")}};
std::vector<CompilerContext> mc = {
{CompilerContextKind::Module, ConstString("A")},
{CompilerContextKind::ClassOrStruct, ConstString("S")}};
std::vector<CompilerContext> mac = {
{CompilerContextKind::Module, ConstString("A")},
{CompilerContextKind::AnyModule, ConstString("*")},
{CompilerContextKind::ClassOrStruct, ConstString("S")}};
EXPECT_TRUE(contextMatches(mmc, mac));
EXPECT_TRUE(contextMatches(mc, mac));
EXPECT_FALSE(contextMatches(mac, mc));
std::vector<CompilerContext> mmmc = {
{CompilerContextKind::Module, ConstString("A")},
{CompilerContextKind::Module, ConstString("B")},
{CompilerContextKind::Module, ConstString("C")},
{CompilerContextKind::ClassOrStruct, ConstString("S")}};
EXPECT_TRUE(contextMatches(mmmc, mac));
std::vector<CompilerContext> mme = {
{CompilerContextKind::Module, ConstString("A")},
{CompilerContextKind::Module, ConstString("B")},
{CompilerContextKind::Enum, ConstString("S")}};
std::vector<CompilerContext> mma = {
{CompilerContextKind::Module, ConstString("A")},
{CompilerContextKind::Module, ConstString("B")},
{CompilerContextKind::AnyType, ConstString("S")}};
EXPECT_TRUE(contextMatches(mme, mma));
EXPECT_TRUE(contextMatches(mmc, mma));
std::vector<CompilerContext> mme2 = {
{CompilerContextKind::Module, ConstString("A")},
{CompilerContextKind::Module, ConstString("B")},
{CompilerContextKind::Enum, ConstString("S2")}};
EXPECT_FALSE(contextMatches(mme2, mma));
const CompilerContext any_module(CompilerContextKind::AnyModule,
ConstString("*"));
auto make_module = [](llvm::StringRef name) {
return CompilerContext(CompilerContextKind::Module, ConstString(name));
};
auto make_class = [](llvm::StringRef name) {
return CompilerContext(CompilerContextKind::ClassOrStruct,
ConstString(name));
};
auto make_any_type = [](llvm::StringRef name) {
return CompilerContext(CompilerContextKind::AnyType, ConstString(name));
};
auto make_enum = [](llvm::StringRef name) {
return CompilerContext(CompilerContextKind::Enum, ConstString(name));
};
auto make_namespace = [](llvm::StringRef name) {
return CompilerContext(CompilerContextKind::Namespace, ConstString(name));
};

EXPECT_THAT(
(std::vector{make_module("A"), make_module("B"), make_class("C")}),
Matches(std::vector{make_module("A"), any_module, make_class("C")}));
EXPECT_THAT(
(std::vector{make_module("A"), make_class("C")}),
Matches(std::vector{make_module("A"), any_module, make_class("C")}));
EXPECT_THAT((std::vector{make_module("A"), make_class("C")}),
Matches(std::vector{any_module, make_class("C")}));
EXPECT_THAT((std::vector{make_module("A"), any_module, make_class("C")}),
Not(Matches(std::vector{make_module("A"), make_class("C")})));
EXPECT_THAT(
(std::vector{make_module("A"), make_module("B"), make_module("C"),
make_class("C")}),
Matches(std::vector{make_module("A"), any_module, make_class("C")}));
EXPECT_THAT(
(std::vector{make_module("A"), make_module("B"), make_module("C"),
make_module("A"), make_module("B"), make_module("C")}),
Matches(std::vector{make_module("A"), any_module, make_module("C")}));
EXPECT_THAT(
(std::vector{make_module("A"), make_module("B"), make_module("C"),
make_module("A"), make_module("B"), make_module("C")}),
Matches(std::vector{make_module("A"), any_module, any_module,
make_module("C")}));
EXPECT_THAT(
(std::vector{make_module("A"), make_module("B"), make_module("C"),
make_module("A"), make_module("B"), make_module("C")}),
Matches(std::vector{any_module, make_module("A"), any_module,
make_module("C")}));
EXPECT_THAT(
(std::vector{make_module("A"), make_module("B"), make_module("C"),
make_module("A"), make_module("B"), make_module("C")}),
Matches(std::vector{any_module, make_module("A"), any_module,
make_module("C")}));
EXPECT_THAT(
(std::vector{make_module("A"), make_module("B"), make_module("C"),
make_module("A"), make_module("B"), make_module("C")}),
Not(Matches(std::vector{any_module, make_module("B"), any_module,
make_module("A"), any_module, make_module("A"),
any_module})));
EXPECT_THAT((std::vector{make_module("A"), make_module("B"), make_enum("C")}),
Matches(std::vector{make_module("A"), make_module("B"),
make_any_type("C")}));
EXPECT_THAT(
(std::vector{make_module("A"), make_module("B"), make_class("C")}),
Matches(
std::vector{make_module("A"), make_module("B"), make_any_type("C")}));
EXPECT_THAT(
(std::vector{make_module("A"), make_module("B"), make_enum("C2")}),
Not(Matches(std::vector{make_module("A"), make_module("B"),
make_any_type("C")})));
EXPECT_THAT((std::vector{make_class("C")}),
Matches(std::vector{make_class("C")}));
EXPECT_THAT((std::vector{make_namespace("NS"), make_class("C")}),
Not(Matches(std::vector{make_any_type("C")})));
}
Loading