Logs of ssh commands larger than pool size get lost #25

Closed
rymndhng opened this issue Feb 25, 2015 · 7 comments

@rymndhng

As mentioned in the docs, ParallelSSHClient.run_command blocks until all the commands have started. I've noticed that this means if I have more than 10 hosts to run the command on, only the last 10 hosts' output is accessible. Is there a way to keep the pool size at 10 while still getting the logs of all the remote hosts?

@pkittenis
Member

Should be able to get logs for all hosts by calling ParallelSSHClient.get_output() multiple times.

@rymndhng
Author

Here's what I gathered from the docs of self.pool.spawn:

The Pool, which is a subclass of Group, provides a way to limit concurrency: its spawn method blocks if the number of greenlets in the pool has already reached the limit, until there is a free slot.

So, let's say I have 50 hosts I want to run pssh on, using the default pool size of 10. Then, in get_output(), self.pool.greenlets only contains the last 10 operations.

Is this understanding correct?
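The pool semantics in question can be illustrated with a small stdlib analogy (this is my own hypothetical BoundedPool using threads, not gevent itself): spawn blocks while the pool is full, and the pool's live-task set, like gevent's pool.greenlets, only ever holds the tasks currently in flight.

```python
import threading
import time

class BoundedPool:
    """Hypothetical stdlib stand-in for gevent's Pool, for illustration only."""
    def __init__(self, size):
        self._slots = threading.Semaphore(size)
        self._lock = threading.Lock()
        self.tasks = set()          # analogous to pool.greenlets

    def spawn(self, fn, *args):
        self._slots.acquire()       # blocks while the pool is full
        def run():
            try:
                fn(*args)
            finally:
                with self._lock:
                    self.tasks.discard(t)   # finished tasks vanish from the set
                self._slots.release()
        t = threading.Thread(target=run)
        with self._lock:
            self.tasks.add(t)
        t.start()
        return t

pool = BoundedPool(10)
threads = [pool.spawn(time.sleep, 0.01) for _ in range(50)]
for t in threads:
    t.join()
print(len(pool.tasks))   # 0 -- nothing survives in the pool's own set
```

So anything that inspects the pool's task set after all 50 spawns have gone through sees at most the last batch of 10, never the earlier 40.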

@pkittenis
Member

Yes, that is correct. Calls to get_output() will return logs from pool_size number of hosts.

Additional calls should return logs from another batch of pool_size number of hosts, until there are no more hosts at which point get_output() will return None.
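The contract described above can be sketched with a stand-in client (StubClient is hypothetical and exists only to show the intended batching behaviour; the real ParallelSSHClient talks to hosts over SSH):

```python
# Hypothetical stub of the intended contract: run_command starts all
# commands and returns the first pool_size hosts' output; each further
# get_output() returns the next batch, and None once hosts are exhausted.
class StubClient:
    def __init__(self, hosts, pool_size=10):
        self.hosts = list(hosts)
        self.pool_size = pool_size
        self._next = 0

    def run_command(self, command):
        return self.get_output()

    def get_output(self):
        batch = self.hosts[self._next:self._next + self.pool_size]
        self._next += len(batch)
        if not batch:
            return None
        return {host: {'stdout': ['fake output']} for host in batch}

client = StubClient(['host%d' % i for i in range(50)], pool_size=10)
seen = []
output = client.run_command('uname -a')
while output is not None:
    seen.extend(output)          # collect this batch's hosts
    output = client.get_output() # fetch the next batch
print(len(seen))   # 50 -- every host drained, 10 at a time
```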

@rymndhng
Author

I don't see the expected behavior; perhaps there's something obvious I'm missing. I'm running against 4 hosts, and the second invocation of get_output() returns an empty dictionary.

    client = ParallelSSHClient(hosts=hosts, user=user, pool_size=1)
    output = client.run_command(command)

    while True:
        # get_output() may return an empty dict (or None) once drained
        if not output:
            break

        for host in output:
            print ">> Performed command \033[93m'{cmd}'\033[0m on \033[92m{name} - {dns}\033[0m".format(
                cmd=command, name=user, dns=host)

            # consume stdout so the remote command can complete
            for line in output[host]['stdout']:
                pass

        output = client.get_output()

Where in the code does "additional calls should return logs from another batch" happen? My understanding is that spawn with a fixed pool size will block until all the operations are complete, in which case the earlier greenlets have already been discarded and access to their logs is lost.
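For what it's worth, one way around the loss (a stdlib illustration, not the library's actual fix) is to keep your own reference to every submitted task rather than reading the pool's live set:

```python
# Stdlib illustration (not gevent/pssh): the executor runs only
# max_workers tasks at a time, but because we hold a future for every
# submission, no result is discarded when a batch completes.
from concurrent.futures import ThreadPoolExecutor

def run_on_host(host):
    # placeholder standing in for the real ssh call
    return host, 'output from %s' % host

hosts = ['host%d' % i for i in range(50)]
with ThreadPoolExecutor(max_workers=10) as pool:   # pool_size=10
    futures = [pool.submit(run_on_host, h) for h in hosts]

results = dict(f.result() for f in futures)
print(len(results))   # 50 -- all hosts' output retained
```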

@pkittenis pkittenis added bug and removed question labels Feb 27, 2015
@pkittenis
Member

Thanks, you are correct - that looks like a bug.

@pkittenis
Member

Have what I believe is a fix for this - will post a bug fix release once I have a test to replicate the failing behaviour.

You can fall back to the - deprecated but still working - ParallelSSHClient.exec_command and ParallelSSHClient.get_stdout if this is a blocker for you.

@pkittenis pkittenis self-assigned this Feb 27, 2015
pkittenis pushed a commit that referenced this issue Feb 27, 2015
@pkittenis
Member

This is fixed as of version 0.70.2.
