Skip to content

Reduce number of read calls when not asking for progress reporting. #623

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 23, 2024

Conversation

parkera
Copy link
Contributor

@parkera parkera commented May 20, 2024

Also, when asking for progress reporting, cap the number of updates at approximately 100 instead of fileSize/4k.

Also, when asking for progress reporting, cap the number of updates at approximately 100 instead of fileSize/4k.
@parkera parkera requested review from jmschonfeld and kperryua May 20, 2024 17:58
@parkera
Copy link
Contributor Author

parkera commented May 20, 2024

@swift-ci test

@@ -370,7 +369,7 @@ internal func readBytesFromFile(path inPath: PathOrURL, reportProgress: Bool, ma

localProgress?.becomeCurrent(withPendingUnitCount: Int64(fileSize))
do {
let length = try readBytesFromFileDescriptor(fd, path: inPath, buffer: bytes, length: fileSize, chunkSize: size_t(preferredChunkSize), reportProgress: reportProgress)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The block size is just too small (4k on macOS, e.g.), and results in a very large number of read calls.

@jmschonfeld
Copy link
Contributor

The linux test failure looks related:

Test Case 'DataIOTests.test_largeFile' started at 2024-05-20 18:29:52.910
/build/swift-foundation/Tests/FoundationEssentialsTests/DataIOTests.swift:215: error: DataIOTests.test_largeFile : XCTAssertEqual failed: ("2147549184") is not equal to ("2147479552") - 
Test Case 'DataIOTests.test_largeFile' failed (17.749 seconds)

@parkera
Copy link
Contributor Author

parkera commented May 21, 2024

@swift-ci test

@parkera
Copy link
Contributor Author

parkera commented May 21, 2024

Updated to provide a "read until EOF" option, true by default. On Darwin we were always reading the requested 0x7ffffff, but on Linux we were reading 0x7fff000, then failing to loop again. This mirrors the implementation from swift-corelibs-foundation.

@itingliu
Copy link
Contributor

LGTM! Can we merge this?

@parkera
Copy link
Contributor Author

parkera commented May 22, 2024

Yah, I wanted @kperryua to take a second look if possible.

@kperryua
Copy link
Contributor

Yah, I wanted @kperryua to take a second look if possible.

Took me a bit to grok what the readUntilEOF change is about. I take a little issue with the name, since it presumes that the length parameter is expected to be EOF of a regular file. That's only technically true in this particular case of Data reading, since we're first querying the length of the file on disk and then passing that parameter for length. In theory one could call this function with a length less than the full length of the file, if one has a limited buffer.

@parkera
Copy link
Contributor Author

parkera commented May 23, 2024

Yah, I wanted @kperryua to take a second look if possible.

Took me a bit to grok what the readUntilEOF change is about. I take a little issue with the name, since it presumes that the length parameter is expected to be EOF of a regular file. That's only technically true in this particular case of Data reading, since we're first querying the length of the file on disk and then passing that parameter for length. In theory one could call this function with a length less than the full length of the file, if one has a limited buffer.

Fair enough - it's more like readUntilLength? I lifted the name from the equivalent function in swift-corelibs-foundation, where it is called with both Data and FileHandle reads.

@kperryua
Copy link
Contributor

it's more like readUntilLength

I like that better.

@parkera parkera merged commit 39344d7 into swiftlang:main May 23, 2024
@parkera parkera deleted the parkera/data_io_performance_fix branch May 23, 2024 21:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants