Skip to content
This repository was archived by the owner on Nov 30, 2024. It is now read-only.

Commit 2ff60d3

Browse files
committed
Avoid bisect command to get stuck
From Palkan in benoittgt/rspec_repro_bisect_deadlock#1 First, I've tried to play with the number of specs which led to the interesting conclusion: **the process hangs only at 1548+ specs**. ```diff RSpec.describe "a bunch of nothing" do (0...(ENV.fetch('N', 3000).to_i)).each do |t| it { expect(t).to eq t } end end ``` Try to run with `N=1547` and `N=1548`. Seems suspicious, right? Let's add `pry-byebug` to the equation (or Gemfile). In order it to work we need to tweak our runner code a bit: ```diff - $stdout = $stderr = @spec_output + # $stdout = $stderr = @spec_output ``` After a bit of `puts` debugging I localized the problem: [`@channel.send`](/lib/rspec/core/bisect/fork_runner.rb@7b6b9c3#L122). `Channel#send` calls `IO#write` here /lib/rspec/core/bisect/utilities.rb@7b6b9c3#L41: ```ruby def send(message) packet = Marshal.dump(message) @write_io.write("#{packet.bytesize}\n#{packet}") end ``` Do you know, what is the `packet.bytesize` for `N=1548`? It's **65548**. This number is very important: the pipe size is only **65536** on MacOS (see docs for [`IO#write_nonblock`](ruby-doc.org/core-2.6.3/IO.html#method-i-write_nonblock) for more). That makes `@write_io.write` hangs forever, because no one reads the buffer: we call `Channel#receive` only after `Process.waitpid(pid)`, thus waiting for the write operation to complete. ----------- A basic proposal will be to use WNOHANG. From waitpid doc: > If WNOHANG was specified in options and there were no children > in a waitable state, then waitid() returns 0 immediately (...) To validate this proposal on OSX we run just before running bisect: `lsof -n -P -r1 -c ruby | grep -e 'PIP' -e '===' -e 'COMMAND'` This will give us in loop the PIPE sizes of Ruby processes. Without our patch we see that quickly we hit 65536 bytes on two pipes, with the patch we keep pipes at the right size. ``` COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME ruby 40134 benoit 3 PIPE 0xf3b025a6a6cd6005 16384 ->0xf3b025a6a6cd5045 ruby 40134 benoit 4 PIPE 0xf3b025a6a6cd5045 16384 ->0xf3b025a6a6cd6005 ruby 40134 benoit 5 PIPE 0xf3b025a6a6cd7805 16384 ->0xf3b025a6a6cd7145 ruby 40134 benoit 7 PIPE 0xf3b025a6a6cd7145 16384 ->0xf3b025a6a6cd7805 ruby 40134 benoit 10 PIPE 0xf3b025a6a6cd6fc5 16384 ->0xf3b025a6a6cd5a05 ruby 40134 benoit 11 PIPE 0xf3b025a6a6cd5a05 16384 ->0xf3b025a6a6cd6fc5 ruby 40144 benoit 3 PIPE 0xf3b025a6a6cd5d05 16384 ->0xf3b025a6a6cd5c45 ruby 40144 benoit 4 PIPE 0xf3b025a6a6cd5c45 16384 ->0xf3b025a6a6cd5d05 ruby 40144 benoit 5 PIPE 0xf3b025a6a6cd7085 16384 ->0xf3b025a6a6cd6785 ruby 40144 benoit 7 PIPE 0xf3b025a6a6cd6785 16384 ->0xf3b025a6a6cd7085 ruby 40144 benoit 10 PIPE 0xf3b025a6a6cd6fc5 16384 ->0xf3b025a6a6cd5a05 ruby 40144 benoit 11 PIPE 0xf3b025a6a6cd5a05 16384 ->0xf3b025a6a6cd6fc5 ``` But if we look properly from the doc we can even go further. > If status information is immediately available on an appropriate child process, waitpid() returns this information. Otherwise, waitpid() returns immediately with an error code indicating that the information was not available. In other words, WNOHANG checks child processes without causing the caller to be suspended. and as pirj mention: "With this in mind, do we really need to check that information that waitpid returns? We don't seem to use it." Removing "waitpid" produce the same behavior as with `WNOHANG`. Improvements: The bisect command request lot's of ram. The next step should be to reduce that consumption. Related: - fix: #2637 - PR discussion: #2669
1 parent b7067c5 commit 2ff60d3

File tree

3 files changed

+17
-2
lines changed

3 files changed

+17
-2
lines changed

lib/rspec/core/bisect/fork_runner.rb

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -91,8 +91,7 @@ def initialize(runner, channel)
9191
end
9292

9393
def dispatch_specs(run_descriptor)
94-
pid = fork { run_specs(run_descriptor) }
95-
Process.waitpid(pid)
94+
fork { run_specs(run_descriptor) }
9695
end
9796

9897
private

spec/integration/bisect_spec.rb

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -32,5 +32,13 @@ def bisect(cli_args, expected_status=nil)
3232
expect(output).to include("Bisect failed!", "The example ordering is inconsistent")
3333
end
3434
end
35+
36+
context "when the bisect command is long" do
37+
it 'does not hit pipe size limit and does not get stuck' do
38+
output = bisect(%W[spec/rspec/core/resources/blocking_pipe_bisect_spec.rb_], 1)
39+
puts output
40+
expect(output).to include("No failures found.")
41+
end
42+
end
3543
end
3644
end
Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,8 @@
1+
# Deliberately named *.rb_ to avoid being loaded except when specified
2+
3+
RSpec.describe "1000 tests" do
4+
puts "Hit the pipe size limit on with bisect"
5+
(0..1000).each do |t|
6+
it { expect(t).to eq t }
7+
end
8+
end

0 commit comments

Comments
 (0)