Skip to content

Dynamic type lookup string compare optimization #42390

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 21, 2022

Conversation

ladd
Copy link
Contributor

@ladd ladd commented Apr 15, 2022

This is a small optimization for a specific path of dynamic type lookup, as profiled below in our iOS application. In a small synthetic benchmark, using strncmp is roughly twice as fast as using strlen + memcmp via StringRef in the case where the two mangled type strings match.

Screen Shot 2022-03-30 at 11 53 25 AM

(missing frames due to inlining?)

Other ideas for optimizing this path welcome!

@ladd ladd changed the title String compare optimization Dynamic type lookup string compare optimization Apr 15, 2022
@ladd ladd force-pushed the stringRefEqualsCString branch from 7d0915e to e24b433 Compare April 15, 2022 17:13
@tbkka tbkka requested review from mikeash and al45tair April 15, 2022 17:25
@mikeash
Copy link
Contributor

mikeash commented Apr 20, 2022

@swift-ci please benchmark

@ladd
Copy link
Contributor Author

ladd commented Apr 20, 2022

Seems to be within the noise, if I'm understanding the report. Does this need a smoke test as well? (I don't have commit access)

@mikeash
Copy link
Contributor

mikeash commented Apr 20, 2022

Yeah, looks like no measurable difference (no surprise there) but still seems worth doing. I'll go ahead and run a full test.

@mikeash
Copy link
Contributor

mikeash commented Apr 20, 2022

@swift-ci please test

@ladd
Copy link
Contributor Author

ladd commented Apr 21, 2022

Thanks for running the tests. Anything else to do here?

@jckarter
Copy link
Contributor

If this is a real win, it might be nice to upstream it to LLVM as an additional overload for StringRef::equals, so that all uses of StringRef::equals with a C string argument in the compiler and runtime benefit.

size_t length = s1.size();
// It may be possible for s1 to contain embedded NULL characters
// so additionally validate that the lengths match
return strncmp(s1.data(), s2, length) == 0 && strlen(s2) == length;
Copy link
Contributor

@jckarter jckarter Apr 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit unfortunate to have to iterate through s2 twice here. What does performance look like if we use an inline loop here instead of strncmp to do the entire comparison in one pass? It'd be interesting to see if strncmp and/or strlen are optimized enough to offset the double iteration through s2 compared to a loop that precisely accounted for s1's fixed length and s2's null termination in one pass.

Copy link
Contributor Author

@ladd ladd Apr 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A naive inline loop seems to be ≈3x slower than platform strncmp in my tests with a 65 byte long sample string. I suspect this is due to the optimized ARM assembly implementations in llvm-project/libc/AOR_v20.02/string/aarch64 + the compiler's optimization of built-ins. That said, my assembly skills are a bit rusty.

I tried something like this with -O3:

inline static bool testcmp(const char *left, const char *right, size_t n) {
	for (int i=0; i < n; i++) {
		char lc = left[i];
		char rc = right[i];
		
		if (lc != rc || rc == '\0') {
			return false;
		}
	}
	return true;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, it looks good as is. Thanks for trying it out!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries -- at this level of the code, I think you could only do better if you already knew the length of s2, or knew that s1 contained no embedded NUL characters.

@mikeash
Copy link
Contributor

mikeash commented Apr 21, 2022

@jckarter Are you good with merging now? Getting it into LLVM sounds nice but as a separate thing.

@jckarter
Copy link
Contributor

@mikeash Yeah, I'm fine merging this now.

@jckarter jckarter merged commit 2afb9b7 into swiftlang:main Apr 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants