Fix Checkpoint in Hyperparameter Tuning #2782

Merged
merged 15 commits into pytorch:main from fix_checkpoint_hyperparameter_tuning
Apr 29, 2024

atskae (Contributor) commented Feb 27, 2024

Fixes #2679

Description

Uses the new Checkpoint import path and API in the Ray AI library.
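
For readers unfamiliar with the change, here is a minimal sketch of the directory-based checkpoint pattern, not the tutorial's actual code. It assumes a Ray release in which `Checkpoint` is importable from `ray.train` (older releases exposed it under `ray.air`). The helper names `save_state`, `load_state`, and `as_ray_checkpoint`, and the file name `data.pkl`, are hypothetical and chosen only for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def save_state(checkpoint_dir, state):
    """Pickle `state` into `checkpoint_dir` (hypothetical helper)."""
    data_path = Path(checkpoint_dir) / "data.pkl"  # file name is an assumption
    # The `with` block closes the data file even if pickling fails.
    with open(data_path, "wb") as fp:
        pickle.dump(state, fp)
    return data_path


def load_state(checkpoint_dir):
    """Read back the state written by `save_state` (hypothetical helper)."""
    data_path = Path(checkpoint_dir) / "data.pkl"
    with open(data_path, "rb") as fp:
        return pickle.load(fp)


def as_ray_checkpoint(checkpoint_dir):
    """Wrap a directory in a Ray checkpoint (requires Ray to be installed)."""
    # Imported lazily so the helpers above work without Ray.
    # Assumption: a Ray version that provides `ray.train.Checkpoint`.
    from ray.train import Checkpoint

    return Checkpoint.from_directory(checkpoint_dir)


# Round trip using only the standard library.
with tempfile.TemporaryDirectory() as tmp_dir:
    save_state(tmp_dir, {"epoch": 3, "lr": 0.01})
    restored = load_state(tmp_dir)
```

Inside a training function, the object returned by `as_ray_checkpoint` would typically be passed to `ray.train.report(metrics, checkpoint=...)`, and a later run would read it back through `checkpoint.as_directory()`.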

Checklist

  • The issue that is being fixed is referenced in the description (see above "Fixes #ISSUE_NUMBER")
  • Only one issue is addressed in this pull request
  • Labels from the issue that this PR is fixing are added to this pull request
  • No unnecessary issues are included in this pull request.

cc @subramen @albanD

pytorch-bot bot commented Feb 27, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/tutorials/2782

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 1009803 with merge base 5e772fa:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

atskae (Contributor Author) commented Feb 27, 2024

I was able to run this on macOS Sonoma:
[Three screenshots, taken 2024-02-27 between 10:56 AM and 10:57 AM]

atskae (Contributor Author) commented Feb 27, 2024

Who should be the reviewers for this PR? @krfricke? @svekars?

krfricke previously approved these changes Feb 27, 2024

krfricke (Contributor) left a comment

LGTM other than a few nits

atskae (Contributor Author) commented Feb 27, 2024

Thank you for your comments! I found another place where I forgot to close the data file, and removed the problematic import.

@atskae atskae marked this pull request as ready for review February 27, 2024 17:15

atskae (Contributor Author) commented Feb 27, 2024

@krfricke @svekars Would you know when pytorch_tutorial_build_manager is expected to start?

@svekars svekars added the core (Tutorials of any level of difficulty related to the core pytorch functionality) and intro labels Mar 5, 2024

svekars (Contributor) commented Mar 5, 2024

There seems to be an error. We are also updating ray to 2.9 in this PR, so maybe let's have that one pass first; then you can rebase and finish your edits.

@atskae atskae closed this Mar 5, 2024
@atskae atskae deleted the fix_checkpoint_hyperparameter_tuning branch March 5, 2024 22:02
@atskae atskae restored the fix_checkpoint_hyperparameter_tuning branch March 5, 2024 22:03

atskae (Contributor Author) commented Mar 5, 2024

Oops, I accidentally deleted the branch in my forked repository 😅

@atskae atskae reopened this Mar 5, 2024
@pytorch-bot pytorch-bot bot dismissed krfricke’s stale review March 5, 2024 22:05

This PR was reopened (likely due to being reverted), so your approval was removed. Please request another review.

svekars (Contributor) commented Mar 6, 2024

Alright, we updated ray to 2.7. It looks like there is still an error in this PR. Can you please take a look, @atskae?

atskae (Contributor Author) commented Mar 7, 2024

@svekars Can you try running CI again? I was able to run the notebook with the recent fixes using ray==2.7.2

@atskae atskae requested a review from krfricke March 7, 2024 18:58

svekars (Contributor) commented Mar 7, 2024

@krfricke can we ask for a stamp once again if it looks good?

atskae (Contributor Author) commented Mar 12, 2024

Maybe @subramen or @albanD can also take a look?

albanD (Contributor) commented Mar 14, 2024

I don't think I know Ray well enough to review this; @krfricke is the right person to review it!

krfricke (Contributor) left a comment

LGTM!

atskae (Contributor Author) commented Mar 26, 2024

@svekars Would you be able to run CI again?

@svekars svekars merged commit 33b15df into pytorch:main Apr 29, 2024
@atskae atskae deleted the fix_checkpoint_hyperparameter_tuning branch May 1, 2024 16:00
Labels
cla signed, core (Tutorials of any level of difficulty related to the core pytorch functionality), intro
Development

Successfully merging this pull request may close these issues.

[BUG] - Failed to import Checkpoint for hyperparameter tuning tutorial
5 participants