[DONOTMERGE] Dummy PR to test github actions. #1

Closed
wants to merge 1 commit into from

dcci
Member

@dcci dcci commented May 2, 2025

Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

@facebook-github-bot facebook-github-bot added the CLA Signed label May 2, 2025
@dcci dcci closed this May 2, 2025
facebook-github-bot pushed a commit that referenced this pull request May 22, 2025
Summary:
All the working subcommands were falling back to Python anyway.

Moved the (currently unimplemented) subcommand stubs, `bounce` and `stop`, to Python.
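For illustration only, unimplemented subcommand stubs in a Python CLI could look roughly like the sketch below. This is not the code from this commit: the `argparse` layout, the `monarch` program name, the `job_handle` argument, and the helper names `build_parser`, `not_implemented`, and `main` are all assumptions made for the example.

```python
import argparse
from typing import Optional, Sequence


def not_implemented(args: argparse.Namespace) -> int:
    # Stub handler: fail loudly until the subcommand gets a real implementation.
    raise NotImplementedError(f"`{args.command}` is not implemented yet")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the program name and arguments are assumptions.
    parser = argparse.ArgumentParser(prog="monarch")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name in ("bounce", "stop"):
        sub = subparsers.add_parser(name, help=f"{name} a job (not implemented yet)")
        sub.add_argument("job_handle", help="hypothetical job identifier")
        sub.set_defaults(func=not_implemented)
    return parser


def main(argv: Optional[Sequence[str]] = None) -> int:
    # Parse the arguments and dispatch to the handler attached to the subcommand.
    args = build_parser().parse_args(argv)
    return args.func(args)
```

Attaching the handler with `set_defaults(func=...)` keeps the dispatch in one place, so replacing a stub later only means swapping the function bound to that subcommand.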

**Note:** a few reasons why a Rust CLI for monarch isn't ideal:

1. It uses TorchX under the hood, and TorchX is a Python library.
2. Because of item 1, we have to run a Python CLI fallback anyway, and the mechanics of this are Meta-specific (won't work for OSS).
3. Reverse pyo3 binding of TorchX (calling Python from Rust) doesn't work internally because of the way we package Python (hermetic PAR).
4. Any material benefits (e.g. performance?) of implementing the CLI in Rust would be negated by the effort to fix or work around items 1-3.

**Next:**
~~[6/n] Have kd_monarch use the default component (the custom mast.py is no longer needed). Update the README with updated instructions.~~
~~[7/n] Remove the Rust CLI in favor of all-Python (we delegate to TorchX for most things anyway)~~
[8/n] Add an E2E unit test using the local_cwd scheduler (actually run a mini-trainer actor)
[9/n] Write an OSS hyperactor mesh-worker entrypoint binary
[10/n] Author a Dockerfile that sets up the environment (much like fbpkgs do it for internal runs)
[11/n] Author a TorchXAllocator

Reviewed By: vidhyav, suo

Differential Revision: D75176535

fbshipit-source-id: 29020f4032bd642af26b393ade74f40b868df973