Skip to content

Huge speedup of RemoteMirror testing (in debug) #75415

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jul 24, 2024

Conversation

tbkka
Copy link
Contributor

@tbkka tbkka commented Jul 23, 2024

Background:

Each RemoteMirror test runs two processes: A host process runs the RemoteMirror library and queries memory data from a target process. The host and target communicate by passing strings back and forth with requests to read particular information in memory. The target is pure Swift; the host is C/C++.

What this change does:

Without this, the host makes very many small requests to the target. Each of those requests has to be individually parsed by the target using String operations from the debug stdlib. Actually transferring the raw bytes is relatively quick.
With this, the host requests and caches "pages" of about 4k and satisfies most requests from previously-fetched data. This dramatically reduces the total number of operations.

Performance notes:

The following notes only count the time to compile and run the 78 tests in the validation-tests/Reflection directory using lit.py; it does not include time to rebuild the project before running tests.

Performance with debug stdlib:

(That is: build-script -ra --debug-swift-stdlib) Before this change, I got tired of waiting after about 15 minutes and it was about 1/4 done. Some very simplistic profiling showed >99% of the time being spent in stdlib String operations in the target process. Activity Monitor shows that individual tests run for about 6 minutes of CPU time each.

With this change, the tests run in about 70s of wall time. Almost all of the time seems to spent compiling the tests; the tests themselves run too quickly to even show up in the Activity Monitor.

Performance with release stdlib:

(That is: build-script -ra)
The tests run in about 70s of wall time with or without this change.

**Background:**

Each RemoteMirror test runs two processes: A host process runs the
RemoteMirror library and queries memory data from a target process.  The host
and target communicate by passing strings back and forth with requests to read
particular information in memory.  The target is pure Swift; the host is
C/C++.

**What this change does:**

Without this, the host makes very many small requests to the target.
Each of those requests has to be individually parsed by the target
using String operations from the debug stdlib.  Actually transferring
the raw bytes is relatively quick.
With this, the host requests and caches "pages" of about 4k and
satisfies most requests from previously-fetched data.  This dramatically
reduces the total number of operations.

**Performance notes:**

The following notes only count the time to compile and run the 78 tests in the
validation-tests/Reflection directory using `lit.py`; it does not include time
to rebuild the project before running tests.

**Performance with debug stdlib:**

(That is: `build-script -ra --debug-swift-stdlib`) Before this change, I got
tired of waiting after about 15 minutes and it was about 1/4 done.  Some very
simplistic profiling showed >99% of the time being spent in stdlib String
operations in the target process.  Activity Monitor shows that individual tests
run for about 6 minutes of CPU time each.

With this change, the tests run in about 70s of wall time.  Almost all of the
time seems to spent compiling the tests; the tests themselves run too quickly to
even show up in the Activity Monitor.

**Performance with release stdlib:**

(That is: `build-script -ra`)
The tests run in about 70s of wall time with or without this change.
@tbkka tbkka requested review from mikeash and augusto2112 July 23, 2024 16:53
@tbkka tbkka requested a review from slavapestov as a code owner July 23, 2024 16:53
typedef struct PipeMemoryReader {
int to_child[2];
int from_child[2];
PipeMemoryReaderPage *Pages;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a simple linked list of "pages" to the PipeMemoryReader. This works surprisingly well: The RemoteMirror logic tends to make a lot of requests to the same area of memory, and this puts the most recent page at the top of the stack where it's found very quickly.

PipeMemoryReader *Reader = (PipeMemoryReader *)Context;

PipeMemoryReaderPage *Page = Reader->Pages;
while (Page != NULL) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to find an existing page

Page = Page->Next;
}

if (Page == NULL) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there isn't one, round the request to a multiple of aligned pages and request that.

void *Buf = malloc(Size);
PipeMemoryReader_collectBytesFromPipe(Reader, Buf, Size);

memcpy(Buf, Page->Data + Address - Page->BaseAddress, Size);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless, we have a page here, so we can copy data out of it.

@@ -288,6 +331,21 @@ PipeMemoryReader createPipeMemoryReader() {
return Reader;
}

static
void flushPipeMemoryReader(PipeMemoryReader *Reader) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a way to clean up pages.

const PipeMemoryReader Pipe) {
uintptr_t instance = PipeMemoryReader_receiveInstanceAddress(&Pipe);
const PipeMemoryReader *Reader) {
uintptr_t instance = PipeMemoryReader_receiveInstanceAddress(Reader);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed a bunch of functions to accept a pointer to the PipeMemoryReader instead of passing it directly.

return doDumpHeapInstance(BinaryFilename);
PipeMemoryReader Pipe = createPipeMemoryReader();
int ret = doDumpHeapInstance(BinaryFilename, &Pipe);
destroyPipeMemoryReader(&Pipe);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Memory management: Allocate the PipeMemoryReader here on the stack and pass it into doDumpHeapInstance. This was simpler than changing doDumpHeapInstance to clean up its own PipeMemoryReader (because doDumpHeapInstance has a bunch of return statements).

int doDumpHeapInstance(const char *BinaryFilename) {
PipeMemoryReader Pipe = createPipeMemoryReader();

int doDumpHeapInstance(const char *BinaryFilename, PipeMemoryReader *Reader) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doDumpHeapInstance now accepts a pointer to the PipeMemoryReader instead of allocating one locally.

@tbkka
Copy link
Contributor Author

tbkka commented Jul 23, 2024

@swift-ci Please test

#endif

while (1) {
InstanceKind Kind = PipeMemoryReader_receiveInstanceKind(&Pipe);
// Flush the cache between every reflection operation
flushPipeMemoryReader(Reader);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important: We flush the cache between reflection operations. Otherwise, we can miss changes to target memory that occur between operations.

@tbkka tbkka requested a review from btroller July 23, 2024 17:04
Copy link
Contributor

@mikeash mikeash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. I suggest promoting your GitHub comments in PipeMemoryReader_readBytes to comments in the source, they'd be good to have in there.

@tbkka
Copy link
Contributor Author

tbkka commented Jul 23, 2024

@swift-ci Please test

@tbkka tbkka enabled auto-merge July 23, 2024 23:14
@tbkka tbkka merged commit c19c73d into swiftlang:main Jul 24, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants