Fix parallel mode on larger projects #421


Closed
keith wants to merge 1 commit from ks/fix-parallel-mode-on-larger-projects

Conversation

@keith keith (Member) commented Oct 12, 2022

Since parallel mode requires the count of the files it's going to process, the application of `openAndPrepareFile` was not lazy; this meant that swift-format immediately opened every file in the project, which, because of macOS's low open file limit, quickly results in "too many open files" errors. This changes the logic to open each file only as it's needed. File discovery shouldn't behave any differently than before, since we were already resolving everything to fetch the count.
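
(The diff itself isn't shown in this thread; a rough sketch of the before/after shape, reusing the PR's `FileIterator`, `openAndPrepareFile`, and `processFile` names with assumed signatures:)

// Before: compactMap materialized every prepared (open) file up front,
// so all file handles existed before any processing started.
// let filesToProcess = FileIterator(urls: urls).compactMap(openAndPrepareFile)

// After: resolve the URL list first (parallel mode needs the count), but
// open each file inside the parallel loop, so only in-flight iterations
// hold open file descriptors.
let fileURLs = Array(FileIterator(urls: urls))
DispatchQueue.concurrentPerform(iterations: fileURLs.count) { index in
  guard let file = openAndPrepareFile(fileURLs[index]) else { return }
  processFile(file)
}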
@keith keith (Member, Author) commented Oct 12, 2022

This is how this logic used to work, but it changed in 6d500ea. @p4checo, based on your description I don't see why this specific piece changed there; can you verify I'm not breaking something here?

To repro what this is fixing:

$ swift build -c release
$ mkdir /tmp/foo
$ cd /tmp/foo
$ for i in $(seq 1 3000); do touch "$i.swift"; done
$ path/to/swift-format . -r -i -p

Before this change you see many errors like:

<unknown>: error: Unable to open 2643.swift: file is not readable or does not exist
<unknown>: error: Unable to open 125.swift: file is not readable or does not exist
<unknown>: error: Unable to open 2292.swift: file is not readable or does not exist
<unknown>: error: Unable to open 77.swift: file is not readable or does not exist
<unknown>: error: Unable to open 608.swift: file is not readable or does not exist
<unknown>: error: Unable to open 32.swift: file is not readable or does not exist
<unknown>: error: Unable to open 2968.swift: file is not readable or does not exist
<unknown>: error: Unable to open 1218.swift: file is not readable or does not exist

@keith keith (Member, Author) commented Oct 12, 2022

Maybe this reintroduces the issue since configurations are now created in parallel? If so we'll have to think more about how to solve both of these issues at once.

@p4checo p4checo (Contributor) commented Oct 12, 2022

Hi @keith 👋🏼

The reasoning for the change can be found in the commit message and PR:

Parallel mode was crashing with bad memory access due to data races
accessing Frontend.configurationLoader, more specifically its cache.

After serializing access (initially with an NSLock), I observed that
despite not crashing anymore, the performance was surprisingly worse
than in single threaded mode.
(...)
The best solution was then to try and remove all concurrency contention
from swift-format. By flattening the FileToProcess array and
making the Rule's nameCache static, close to optimal performance
should be achievable.

So, I think that unfortunately your changes do reintroduce the issue, because `openAndPrepareFile` is invoked inside `DispatchQueue.concurrentPerform`. You're probably not experiencing it because you aren't passing a configuration argument and don't have any local configuration file in that dummy project folder?

To work around the OS's open file limit while still avoiding serializing access to `ConfigurationLoader.cache` (to avoid a performance hit), I think we might be able to use a chunking approach, where we load N files at a time and pass them into `DispatchQueue.concurrentPerform` (which waits for all iterations to complete before returning). Something like:

if parallel {
  // Process files in chunks to avoid hitting the OS's max open file limit.
  let chunkSize = 1000  // TODO: define at runtime, or use a "reasonable" value.
  let fileChunks = Array(FileIterator(urls: urls)).chunks(size: chunkSize)

  fileChunks.forEach { files in
    // Only one chunk's worth of files is open at a time, because
    // concurrentPerform waits for all of its iterations to complete.
    let filesToProcess = files.compactMap(openAndPrepareFile)
    DispatchQueue.concurrentPerform(iterations: filesToProcess.count) { index in
      processFile(filesToProcess[index])
    }
  }
} else {
  FileIterator(urls: urls).lazy.compactMap(openAndPrepareFile).forEach(processFile)
}

private extension Array {
  /// Splits the array into consecutive slices of at most `size` elements.
  func chunks(size: Int) -> [[Element]] {
    stride(from: 0, to: count, by: size).map {
      Array(self[$0..<Swift.min($0 + size, count)])
    }
  }
}

I'm not sure if there's a programmatic way of knowing the current OS's maximum open file limit to determine the optimal N at runtime, but we can be conservative or make it a parameter/config. Perhaps do some benchmarking if we go this route to help decide.
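
(For what it's worth, `getrlimit(RLIMIT_NOFILE)` might be one programmatic option on macOS/Linux; a minimal, untested sketch:)

import Darwin  // on Linux: import Glibc

// rlim_cur is the soft limit that "too many open files" (EMFILE) is
// actually enforced against; rlim_max is the hard ceiling.
var limits = rlimit()
if getrlimit(RLIMIT_NOFILE, &limits) == 0 {
  print("soft:", limits.rlim_cur, "hard:", limits.rlim_max)
}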

From a quick google search I found multiple commands and ran them on my machine (running macOS 12.6):

> sysctl -a | grep maxfiles
kern.maxfiles: 245760
kern.maxfilesperproc: 122880

> ulimit -Sn
256

> ulimit -Hn
unlimited

> launchctl limit maxfiles
	maxfiles    256            unlimited

Interestingly, 256 seems way too low and 122880 way too high compared to the 3000 files your example says are enough to cause the issue. Any thoughts on what the actual value might be?

Hope this helps! 🙏🏼

@keith keith marked this pull request as draft October 12, 2022 17:27
@keith keith (Member, Author) commented Oct 12, 2022

I'm not sure what the exact value is, but I imagine our repro case still has to go a bit above it, because if we open and close files fast enough they won't all be open at once. Also, knowing that number would only help so much, since there could be many competing processes with many open files. Realistically, what we'd semantically want is to open just the files we're currently processing, plus maybe a few we're about to process, but implementing that feels like a lot of overhead for this case.
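
(To sketch that idea anyway, in case someone picks it up: a counting semaphore could gate opens so only a bounded number of files are ever open at once. `maxOpen` and the reuse of `openAndPrepareFile`/`processFile` here are assumptions:)

import Dispatch

let maxOpen = 64  // assumed to leave headroom under the 256 soft limit
let gate = DispatchSemaphore(value: maxOpen)
let fileURLs = Array(FileIterator(urls: urls))

DispatchQueue.concurrentPerform(iterations: fileURLs.count) { index in
  gate.wait()              // block until an "open file" slot frees up
  defer { gate.signal() }  // release the slot, even on early return
  guard let file = openAndPrepareFile(fileURLs[index]) else { return }
  processFile(file)
}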

I'm assuming that, outside of this change, a nice way to deal with the Configuration thread-safety issue would be to make it an actor and handle the fallout of that. I did a rough prototype of that locally and things worked fine, but overall performance still wasn't improved.
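
(Roughly the shape of that prototype; `ConfigurationCache` and its API here are hypothetical, not the real `ConfigurationLoader`:)

import Foundation

// Hypothetical sketch: actor isolation serializes cache access, replacing
// the NSLock, at the cost of making every caller async.
actor ConfigurationCache {
  private var cache: [URL: Configuration] = [:]

  func configuration(for url: URL, loadIfMissing load: (URL) throws -> Configuration) rethrows -> Configuration {
    if let cached = cache[url] { return cached }
    let loaded = try load(url)
    cache[url] = loaded
    return loaded
  }
}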

Given these issues I don't think I'm going to work on this anymore right now, so I'll just leave it broken until someone is motivated to pick this up. Thanks for reviewing!

@keith keith closed this Oct 12, 2022
@keith keith deleted the ks/fix-parallel-mode-on-larger-projects branch October 12, 2022 18:45