Skip to content

Dynamic type lookup string compare optimization #42390

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 21, 2022
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 13 additions & 2 deletions stdlib/public/runtime/MetadataLookup.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@
#include <vector>
#include <list>
#include <new>
#include <cstring>

using namespace swift;
using namespace Demangle;
Expand Down Expand Up @@ -472,6 +473,16 @@ static bool sameObjCTypeManglings(Demangle::NodePointer node1,
}
#endif

/// Optimization for the case where we need to compare a StringRef and a null terminated C string
/// Not converting s2 to a StringRef avoids the need to call both strlen and memcmp when non-matching
/// but equal length
static bool stringRefEqualsCString(StringRef s1, const char *s2) {
size_t length = s1.size();
// It may be possible for s1 to contain embedded NULL characters
// so additionally validate that the lengths match
return strncmp(s1.data(), s2, length) == 0 && strlen(s2) == length;
Copy link
Contributor

@jckarter jckarter Apr 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit unfortunate to have to iterate through s2 twice here. What does performance look like if we use an inline loop here instead of strncmp to do the entire comparison in one pass? It'd be interesting to see if strncmp and/or strlen are optimized enough to offset the double iteration through s2 compared to a loop that precisely accounted for s1's fixed length and s2's null termination in one pass.

Copy link
Contributor Author

@ladd ladd Apr 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A naive inline loop seems to be ≈3x slower than platform strncmp in my tests with a 65 byte long sample string. I suspect this is due to the optimized ARM assembly implementations in llvm-project/libc/AOR_v20.02/string/aarch64 + the compiler's optimization of built-ins. That said, my assembly skills are a bit rusty.

I tried something like this with -O3:

inline static bool testcmp(const char *left, const char *right, size_t n) {
	for (int i=0; i < n; i++) {
		char lc = left[i];
		char rc = right[i];
		
		if (lc != rc || rc == '\0') {
			return false;
		}
	}
	return true;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, it looks good as is. Thanks for trying it out!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries -- at this level of the code, I think you could only do better if you already knew the length of s2, or knew that s1 contained no embedded NUL characters.

}

bool
swift::_contextDescriptorMatchesMangling(const ContextDescriptor *context,
Demangle::NodePointer node) {
Expand All @@ -496,7 +507,7 @@ swift::_contextDescriptorMatchesMangling(const ContextDescriptor *context,
// Match to a mangled module name.
if (node->getKind() != Demangle::Node::Kind::Module)
return false;
if (!node->getText().equals(module->Name.get()))
if (!stringRefEqualsCString(node->getText(), module->Name.get()))
return false;

node = nullptr;
Expand Down Expand Up @@ -568,7 +579,7 @@ swift::_contextDescriptorMatchesMangling(const ContextDescriptor *context,
auto nameNode = node->getChild(1);
if (nameNode->getKind() != Demangle::Node::Kind::Identifier)
return false;
if (nameNode->getText() == proto->Name.get()) {
if (stringRefEqualsCString(nameNode->getText(), proto->Name.get())) {
node = node->getChild(0);
break;
}
Expand Down