Skip to content

fix: Reduce idle CPU consumption of websocket server #1040

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 8 commits into
base: ros2
Choose a base branch
from

Conversation

bjsowa
Copy link
Collaborator

@bjsowa bjsowa commented Jun 4, 2025

Public API Changes
None

Description
This PR removes the use of tornado.ioloop and instead uses the asyncio event loop interface, as recommended in tornado documentation since version 6.0.

The ROS spinning is now done in a separate thread instead of a periodic async callback.

SIGINT and SIGTERM signal handling is now done manually to ensure graceful shutdown.

In my case, the idle CPU consumption went from 9% to 0%.
With a client connected, the consumption went from 38% to 34%.

This is still quite high but a good step forward. I tried improving it further by using EventsExecutor and the consumption with the client connected went from 34% to 18% which is huge but I'm reluctant to use it as it is experimental.

@bjsowa bjsowa requested a review from sea-bass June 4, 2025 16:47
Copy link
Contributor

@sea-bass sea-bass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Re: using EventsExecutor, maybe we could create some kind of feature flag (based on environment variables or something), just so it can be tested by users? Separate PR if you want to try this though.

executor = rclpy.executors.SingleThreadedExecutor()
executor.add_node(node)
while rclpy.ok() and not shutdown_event.is_set():
executor.spin_once(timeout_sec=1.0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worried about this timeout being a hard-coded value, although it is just one spin...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need some timeout to make sure the thread wakes up once in a while to check for shutdown event. Alternatively, I can try running executor.spin() and calling executor.shutdown() from the main thread.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that and now one of the tests is timing out. I'm changing this PR to draft until I'll figure this out

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests are timing out due to bugs in rclpy. I submitted a PR with the fix in ros2/rclpy#1469.

if isinstance(message, bson.BSON):
binary = True
elif compression in ["cbor", "cbor-raw"]:
binary = True
else:
binary = False

_io_loop.add_callback(partial(self.prewrite_message, message, binary))
asyncio.run_coroutine_threadsafe(self.prewrite_message(message, binary), cls.event_loop)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If event_loop is None (its default value), will this error? Should we check?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It probably will, the same with node_handle. I'm not happy with passing arguments through class variables but I just tried to adhere to how it is currently done.

@bjsowa
Copy link
Collaborator Author

bjsowa commented Jun 5, 2025

Re: using EventsExecutor, maybe we could create some kind of feature flag (based on environment variables or something), just so it can be tested by users? Separate PR if you want to try this though.

We could just add a ROS parameter that enables it. Should be fairly easy, I might create a PR after this.

@bjsowa bjsowa marked this pull request as draft June 5, 2025 06:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants