-
Notifications
You must be signed in to change notification settings - Fork 551
fix: Reduce idle CPU consumption of websocket server #1040
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: ros2
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
Re: using EventsExecutor, maybe we could create some kind of feature flag (based on environment variables or something), just so it can be tested by users? Separate PR if you want to try this though.
executor = rclpy.executors.SingleThreadedExecutor() | ||
executor.add_node(node) | ||
while rclpy.ok() and not shutdown_event.is_set(): | ||
executor.spin_once(timeout_sec=1.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worried about this timeout being a hard-coded value, although it is just one spin...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need some timeout to make sure the thread wakes up once in a while to check for shutdown event. Alternatively, I can try running executor.spin()
and calling executor.shutdown()
from the main thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that and now one of the tests is timing out. I'm changing this PR to draft until I'll figure this out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests are timing out due to bugs in rclpy. I submitted a PR with the fix in ros2/rclpy#1469.
if isinstance(message, bson.BSON): | ||
binary = True | ||
elif compression in ["cbor", "cbor-raw"]: | ||
binary = True | ||
else: | ||
binary = False | ||
|
||
_io_loop.add_callback(partial(self.prewrite_message, message, binary)) | ||
asyncio.run_coroutine_threadsafe(self.prewrite_message(message, binary), cls.event_loop) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If event_loop
is None
(its default value), will this error? Should we check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It probably will, the same with node_handle
. I'm not happy with passing arguments through class variables but I just tried to adhere to how it is currently done.
We could just add a ROS parameter that enables it. Should be fairly easy, I might create a PR after this. |
Public API Changes
None
Description
This PR removes the use of tornado.ioloop and instead uses the
asyncio
event loop interface, as recommended in tornado documentation since version 6.0.The ROS spinning is now done in a separate thread instead of a periodic async callback.
SIGINT and SIGTERM signal handling is now done manually to ensure graceful shutdown.
In my case, the idle CPU consumption went from 9% to 0%.
With a client connected, the consumption went from 38% to 34%.
This is still quite high but a good step forward. I tried improving it further by using EventsExecutor and the consumption with the client connected went from 34% to 18% which is huge but I'm reluctant to use it as it is experimental.