Enable planner to be used for loading sharded optimizer state dict (pytorch#112520)
Cherry-pick [pytorch#112259](pytorch#112259)
Requested by MosaicML
Comments from users:
> without this, we can't do training resumption because the model gets loaded without the optimizer
---------------------------------------------------------------------------------------------------------------------
This creates a more consistent interface for saving and loading sharded state dicts. A planner can already be specified when saving a sharded optimizer state dict, but loading one currently has no planner support. This change adds that support without affecting the function's default behavior.
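
A minimal, library-free sketch of the pattern described above: an optional `planner` argument that falls back to a default when omitted, so existing callers are unaffected. All names here (`DefaultLoadPlanner`, `SkipKeysPlanner`, `load_optimizer_state`) are hypothetical and chosen for illustration only; this is not the PyTorch implementation or its API.

```python
from typing import Any, Dict, Iterable, List, Optional


class DefaultLoadPlanner:
    """Hypothetical stand-in for a default planner: load every stored key."""

    def plan(self, keys: Iterable[str]) -> List[str]:
        return list(keys)


class SkipKeysPlanner(DefaultLoadPlanner):
    """Hypothetical custom planner: skip keys that start with a prefix."""

    def __init__(self, skip_prefix: str) -> None:
        self.skip_prefix = skip_prefix

    def plan(self, keys: Iterable[str]) -> List[str]:
        return [k for k in keys if not k.startswith(self.skip_prefix)]


def load_optimizer_state(
    stored: Dict[str, Any],
    planner: Optional[DefaultLoadPlanner] = None,
) -> Dict[str, Any]:
    # Assumption: when no planner is passed, fall back to the default one,
    # so callers that never pass a planner see unchanged behavior.
    if planner is None:
        planner = DefaultLoadPlanner()
    return {key: stored[key] for key in planner.plan(stored)}


# Made-up optimizer state, for illustration only.
stored = {
    "state.0.exp_avg": 1.0,
    "state.0.step": 10,
    "param_groups": [{"lr": 0.1}],
}

default_result = load_optimizer_state(stored)
custom_result = load_optimizer_state(stored, planner=SkipKeysPlanner("param_groups"))
```

Calling `load_optimizer_state(stored)` without a planner returns every stored entry, while passing a custom planner changes only which entries are loaded.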
Co-authored-by: Brian <[email protected]>