Releases: Lightning-AI/pytorch-lightning

Standard weekly patch release

[1.3.1] - 2021-05-11

Fixed

  • Fixed DeepSpeed with IterableDatasets (#7362)
  • Fixed Trainer.current_epoch not getting restored after tuning (#7434)
  • Fixed local rank displayed in console log (#7395)

Contributors

@akihironitta @awaelchli @leezu

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Lightning CLI, PyTorch Profiler, Improved Early Stopping

Today we are excited to announce Lightning 1.3, containing highly anticipated new features including a new Lightning CLI, improved TPU support, integrations such as PyTorch profiler, new early stopping strategies, predict and validate trainer routines, and more.

https://medium.com/pytorch/pytorch-lightning-1-3-lightning-cli-pytorch-profiler-improved-early-stopping-6e0ffd8deb29
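
Below is a minimal, hedged sketch of the headline feature, the new LightningCLI, assuming the 1.3 import path pytorch_lightning.utilities.cli; the ToyModel class and its hyperparameters are illustrative placeholders, not part of the release notes.

```python
# Hypothetical minimal training script built around the new LightningCLI (1.3).
import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl
from pytorch_lightning.utilities.cli import LightningCLI


class ToyModel(pl.LightningModule):
    def __init__(self, hidden_dim: int = 16, learning_rate: float = 1e-3):
        super().__init__()
        self.save_hyperparameters()
        self.layer = torch.nn.Linear(32, hidden_dim)

    def training_step(self, batch, batch_idx):
        (x,) = batch
        return self.layer(x).sum()

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.hparams.learning_rate)

    def train_dataloader(self):
        return DataLoader(TensorDataset(torch.randn(64, 32)), batch_size=8)


if __name__ == "__main__":
    # Instantiating LightningCLI parses the command line and runs trainer.fit;
    # the model's and Trainer's constructor arguments become CLI flags,
    # e.g. something like --model.hidden_dim=32 --trainer.max_epochs=5
    LightningCLI(ToyModel)
```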

[1.3.0] - 2021-05-06

Added

  • Added support for the EarlyStopping callback to run at the end of the training epoch (#6944)
  • Added synchronization points before and after setup hooks are run (#7202)
  • Added a teardown hook to ClusterEnvironment (#6942)
  • Added utils for converting metrics to scalars (#7180)
  • Added utils for NaN/Inf detection for gradients and parameters (#6834)
  • Added more explicit exception message when trying to execute trainer.test() or trainer.validate() with fast_dev_run=True (#6667)
  • Added LightningCLI class to provide simple reproducibility with minimum boilerplate training CLI (#4492, #6862, #7156, #7299)
  • Added gradient_clip_algorithm argument to Trainer for gradient clipping by value (#6123).
  • Added a way to print to terminal without breaking up the progress bar (#5470)
  • Added support to checkpoint after training steps in ModelCheckpoint callback (#6146)
  • Added TrainerStatus.{INITIALIZING,RUNNING,FINISHED,INTERRUPTED} (#7173)
  • Added Trainer.validate() method to perform one evaluation epoch over the validation set (#4948)
  • Added LightningEnvironment for Lightning-specific DDP (#5915)
  • Added teardown() hook to LightningDataModule (#4673)
  • Added auto_insert_metric_name parameter to ModelCheckpoint (#6277)
  • Added arg to self.log that enables users to give custom names when dealing with multiple dataloaders (#6274)
  • Added teardown method to BaseProfiler to enable subclasses defining post-profiling steps outside of __del__ (#6370)
  • Added setup method to BaseProfiler to enable subclasses defining pre-profiling steps for every process (#6633)
  • Added no return warning to predict (#6139)
  • Added Trainer.predict config validation (#6543)
  • Added AbstractProfiler interface (#6621)
  • Added support for including module names for forward in the autograd trace of PyTorchProfiler (#6349)
  • Added support for the PyTorch 1.8.1 autograd profiler (#6618)
  • Added outputs parameter to callback's on_validation_epoch_end & on_test_epoch_end hooks (#6120)
  • Added configure_sharded_model hook (#6679)
  • Added support for precision=64, enabling training with double precision (#6595)
  • Added support for DDP communication hooks (#6736)
  • Added artifact_location argument to MLFlowLogger which will be passed to the MlflowClient.create_experiment call (#6677)
  • Added model parameter to precision plugins' clip_gradients signature (#6764, #7231)
  • Added is_last_batch attribute to Trainer (#6825)
  • Added LightningModule.lr_schedulers() for manual optimization (#6567)
  • Added MpModelWrapper in TPU Spawn (#7045)
  • Added max_time Trainer argument to limit training time (#6823)
  • Added on_predict_{batch,epoch}_{start,end} hooks (#7141)
  • Added new EarlyStopping parameters stopping_threshold and divergence_threshold (#6868); see the sketch after this list
  • Added debug flag to TPU Training Plugins (PT_XLA_DEBUG) (#7219)
  • Added new UnrepeatedDistributedSampler and IndexBatchSamplerWrapper for tracking distributed predictions (#7215)
  • Added trainer.predict(return_predictions=None|False|True) (#7215)
  • Added BasePredictionWriter callback to implement prediction saving (#7127)
  • Added trainer.tune(scale_batch_size_kwargs, lr_find_kwargs) arguments to configure the tuning algorithms (#7258)
  • Added tpu_distributed check for TPU Spawn barrier (#7241)
  • Added device updates to TPU Spawn for Pod training (#7243)
  • Added warning when missing Callback and using resume_from_checkpoint (#7254)
  • Added DeepSpeed single file saving (#6900)
  • Added Training type Plugins Registry (#6982, #7063, #7214, #7224)
  • Added ignore param to save_hyperparameters (#6056)
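
As a rough illustration of several additions above (the new EarlyStopping thresholds, gradient_clip_algorithm, max_time, and Trainer.validate()), here is a hedged sketch; model and dm stand for a LightningModule and LightningDataModule defined elsewhere, and all values are made up.

```python
# Hedged sketch of the new 1.3 Trainer/EarlyStopping options; not a full script.
import pytorch_lightning as pl
from pytorch_lightning.callbacks import EarlyStopping

early_stop = EarlyStopping(
    monitor="val_loss",               # placeholder metric name
    stopping_threshold=0.05,          # stop once the metric is at least this good
    divergence_threshold=10.0,        # stop immediately if the metric gets this bad
    check_on_train_epoch_end=False,   # keep the default check after validation
)

trainer = pl.Trainer(
    max_time="00:12:00:00",           # new max_time argument (DD:HH:MM:SS)
    gradient_clip_val=0.5,
    gradient_clip_algorithm="value",  # new: clip by value instead of by norm
    callbacks=[early_stop],
)

# model / dm: your own LightningModule / LightningDataModule (placeholders)
trainer.fit(model, datamodule=dm)
trainer.validate(model, datamodule=dm)  # new: one evaluation epoch over the val set
```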

Changed

  • Changed LightningModule.truncated_bptt_steps to be property (#7323)
  • Changed the EarlyStopping callback to no longer run EarlyStopping.on_validation_end by default when only training is run. Set check_on_train_epoch_end to run the callback at the end of the training epoch instead of at the end of the validation epoch (#7069)
  • Renamed pytorch_lightning.callbacks.swa to pytorch_lightning.callbacks.stochastic_weight_avg (#6259)
  • Refactor RunningStage and TrainerState usage (#4945, #7173)
    • Added RunningStage.SANITY_CHECKING
    • Added TrainerFn.{FITTING,VALIDATING,TESTING,PREDICTING,TUNING}
    • Changed trainer.evaluating to return True if validating or testing
  • Changed setup() and teardown() stage argument to take any of {fit,validate,test,predict} (#6386)
  • Changed profilers to save separate report files per state and rank (#6621)
  • The trainer no longer tries to save a checkpoint on exception or run callback's on_train_end functions (#6864)
  • Changed PyTorchProfiler to use torch.autograd.profiler.record_function to record functions (#6349)
  • Disabled lr_scheduler.step() in manual optimization (#6825)
  • Changed warnings and recommendations for dataloaders in ddp_spawn (#6762)
  • pl.seed_everything will now also set the seed on the DistributedSampler (#7024)
  • Changed default setting for communication of multi-node training using DDPShardedPlugin (#6937)
  • trainer.tune() now returns the tuning result (#7258)
  • LightningDataModule.from_datasets() now accepts IterableDataset instances as training datasets. (#7503)
  • Changed resume_from_checkpoint warning to an error when the checkpoint file does not exist (#7075)
  • Automatically set sync_batchnorm for training_type_plugin (#6536)
  • Allowed training type plugin to delay optimizer creation (#6331)
  • Removed ModelSummary validation from train loop on_trainer_init (#6610)
  • Moved save_function to accelerator (#6689)
  • Updated DeepSpeed ZeRO (#6546, #6752, #6142, #6321)
  • Improved verbose logging for EarlyStopping callback (#6811)
  • Run ddp_spawn dataloader checks on Windows (#6930)
  • Updated mlflow with using resolve_tags (#6746)
  • Moved save_hyperparameters to its own function (#7119)
  • Replaced _DataModuleWrapper with __new__ (#7289)
  • Reset current_fx properties on lightning module in teardown (#7247)
  • Auto-set DataLoader.worker_init_fn with seed_everything (#6960); see the sketch after this list
  • Remove model.trainer call inside of dataloading mixin (#7317)
  • Split profilers module (#6261)
  • Ensure accelerator is valid if running interactively (#5970)
  • Disabled batch transfer in DP mode (#6098)
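
A tiny hedged sketch of the seed_everything behavior referenced above (seeding that now also reaches the DistributedSampler and the dataloader workers); the workers flag shown here is an assumption based on the worker_init_fn item, not something stated in these notes.

```python
import pytorch_lightning as pl

# Seeds python, numpy and torch on every process; per the items above, the seed
# now also reaches the DistributedSampler. The workers=True flag (assumption:
# added alongside the worker_init_fn change) additionally seeds dataloader workers.
pl.seed_everything(42, workers=True)
```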

Deprecated

  • Deprecated outputs in both LightningModule.on_train_epoch_end and Callback.on_train_epoch_end hooks (#7339)
  • Deprecated Trainer.truncated_bptt_steps in favor of LightningModule.truncated_bptt_steps (#7323)
  • Deprecated LightningModule.grad_norm in favor of pytorch_lightning.utilities.grads.grad_norm (#7292)
  • Deprecated the save_function property from the ModelCheckpoint callback (#7201)
  • Deprecated LightningModule.write_predictions and LightningModule.write_predictions_dict (#7066)
  • Deprecated TrainerLoggingMixin in favor of a separate utilities module for metric handling (#7180)
  • Deprecated TrainerTrainingTricksMixin in favor of a separate utilities module for NaN/Inf detection for gradients and parameters (#6834)
  • period has been deprecated in favor of every_n_val_epochs in the ModelCheckpoint callback (#6146)
  • Deprecated trainer.running_sanity_check in favor of trainer.sanity_checking (#4945)
  • Deprecated Profiler(output_filename) in favor of dirpath and filename (#6621)
  • Deprecated PytorchProfiler(profiled_functions) in favor of record_functions (#6349)
  • Deprecated @auto_move_data in favor of trainer.predict (#6993)
  • Deprecated Callback.on_load_checkpoint(checkpoint) in favor of Callback.on_load_checkpoint(trainer, pl_module, checkpoint) (#7253)
  • Deprecated metrics in favor of torchmetrics (#6505, #6530, #6540, #6547, #6515, #6572, #6573, #6584, #6636, #6637, #6649, #6659, #7131); see the sketch after this list
  • Deprecated the LightningModule.datamodule getter and setter methods; access them through Trainer.datamodule instead (#7168)
  • Deprecated the use of Trainer(gpus="i") (string) for selecting the i-th GPU; from v1.5 this will set the number of GPUs instead of the index (#6388)
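
To illustrate the metrics deprecation above, a minimal sketch of the torchmetrics replacement, assuming a torchmetrics version contemporary with this release (0.2/0.3); newer torchmetrics releases additionally require a task argument.

```python
# Hedged sketch: replacing pytorch_lightning.metrics with torchmetrics.
import torch
import torchmetrics

preds = torch.tensor([0, 1, 1, 0])
target = torch.tensor([0, 1, 0, 0])

# Functional API
acc = torchmetrics.functional.accuracy(preds, target)

# Module API: can live as a submodule of a LightningModule and be logged with self.log
metric = torchmetrics.Accuracy()
metric.update(preds, target)
print(acc, metric.compute())
```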

Removed

  • Removed the exp_save_path property from the LightningModule (#7266)
  • Removed training loop explicitly calling EarlyStopping.on_validation_end if no validation is run (#7069)
  • Removed automatic_optimization as a property from the training loop in favor of LightningModule.automatic_optimization (#7130)
  • Removed evaluation loop legacy returns for *_epoch_end hooks (#6973)
  • Removed support for passing a bool value to profiler argument of Trainer (#6164)
  • Removed no return warning from val/test step (#6139)
  • Removed passing a ModelCheckpoint instance to Trainer(checkpoint_callback) (#6166)
  • Removed deprecated Trainer argument enable_pl_optimizer and automatic_optimization (#6163)
  • Removed deprecated metrics (#6161)
    • from pytorch_lightning.metrics.functional.classification removed to_onehot, to_categorical, get_num_classes, roc, multiclass_roc, average_precision, precision_recall_curve, multiclass_precision_recall_curve
    • from pytorch_lightning.metrics.functional.reduction removed reduce, class_reduce
  • Removed deprecated ModelCheckpoint arguments prefix, mode="auto" (#6162)
  • Removed mode='auto' from EarlyStopping (#6167)
  • Removed epoch and step argume...

Quick patch release

Fixes the missing packaging package in the dependencies, which affected installation only onto a very bare (blank) system.

Standard weekly patch release

[1.2.9] - 2021-04-20

Fixed

  • Fixed the call order for world ranks and the root_device property in TPUSpawnPlugin (#7074)
  • Fixed multi-gpu join for Horovod (#6954)
  • Fixed parsing for pre-release package versions (#6999)

Contributors

@irasit @Borda @kaushikb11

Standard weekly patch release

[1.2.8] - 2021-04-14

Added

  • Added TPUSpawn + IterableDataset error message (#6875)

Fixed

  • Fixed process rank not being available right away after Trainer instantiation (#6941)
  • Fixed sync_dist for tpus (#6950)
  • Fixed AttributeError for require_backward_grad_sync when running manual optimization with sharded plugin (#6915)
  • Fixed --gpus default for parser returned by Trainer.add_argparse_args (#6898)
  • Fixed TPU Spawn all gather (#6896)
  • Fixed EarlyStopping logic when min_epochs or min_steps requirement is not met (#6705)
  • Fixed csv extension check (#6436)
  • Fixed checkpoint issue when using Horovod distributed backend (#6958)
  • Fixed tensorboard exception raising (#6901)
  • Fixed setting the eval/train flag correctly on accelerator model (#6983)
  • Fixed DDP_SPAWN compatibility with bug_report_model.py (#6892)
  • Fixed bug where BaseFinetuning.flatten_modules() was duplicating leaf node parameters (#6879)
  • Set better defaults for rank_zero_only.rank when training is launched with SLURM and torchelastic:
    • Support SLURM and torchelastic global rank environment variables (#5715)
    • Remove hardcoding of local rank in accelerator connector (#6878)

Contributors

@ananthsub @awaelchli @ethanwharris @justusschock @kandluis @kaushikb11 @liob @SeanNaren @skmatz

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

[1.2.7] - 2021-04-06

Fixed

  • Fixed a bug with omegaconf and xm.save (#6741)
  • Fixed an issue with IterableDataset when len is not defined (#6828)
  • Sanitize None params during pruning (#6836)
  • Enforce an epoch scheduler interval when using SWA (#6588)
  • Fixed TPU Colab hang issue, post training (#6816)
  • Fixed a bug where TensorBoardLogger would give a warning and not log correctly to a symbolic link save_dir (#6730)

Contributors

@awaelchli, @ethanwharris, @karthikprasad, @kaushikb11, @mibaumgartner, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

[1.2.6] - 2021-03-30

Changed

  • Changed the behavior of on_epoch_start to run at the beginning of validation & test epoch (#6498)

Removed

  • Removed legacy code to include step dictionary returns in callback_metrics. Use self.log_dict instead; see the sketch below. (#6682)
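
A rough sketch of the self.log_dict replacement mentioned above; the module, metric names and values are placeholders.

```python
# Hedged sketch: log several values at once with self.log_dict instead of
# returning a metrics dictionary from training_step.
import torch
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self.layer(x)
        loss = torch.nn.functional.cross_entropy(logits, y)
        acc = (logits.argmax(dim=1) == y).float().mean()
        # replaces the removed step-dictionary return pattern
        self.log_dict({"train_loss": loss, "train_acc": acc}, prog_bar=True)
        return loss

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)
```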

Fixed

  • Fixed DummyLogger.log_hyperparams raising a TypeError when running with fast_dev_run=True (#6398)
  • Fixed error on TPUs when there was no ModelCheckpoint (#6654)
  • Fixed trainer.test freeze on TPUs (#6654)
  • Fixed a bug where gradients were disabled after calling Trainer.predict (#6657)
  • Fixed bug where no TPUs were detected in a TPU pod env (#6719)

Contributors

@awaelchli, @carmocca, @ethanwharris, @kaushikb11, @rohitgr7, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Weekly patch release - torchmetrics compatibility

[1.2.5] - 2021-03-23

Changed

  • Added Autocast in validation, test and predict modes for Native AMP (#6565)
  • Update Gradient Clipping for the TPU Accelerator (#6576)
  • Refactored setup to be typing-friendly (#6590)

Fixed

  • Fixed a bug where all_gather would not work correctly with tpu_cores=8 (#6587)
  • Fixed comparing required versions (#6434)
  • Fixed duplicate logs appearing in console when using the python logging module (#6275)

Contributors

@awaelchli, @Borda, @ethanwharris, @justusschock, @kaushikb11

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

[1.2.4] - 2021-03-16

Changed

  • Changed the default of find_unused_parameters back to True in DDP and DDP Spawn (#6438)

Fixed

  • Expose DeepSpeed loss parameters to allow users to fix loss instability (#6115)
  • Fixed DP reduction with collection (#6324)
  • Fixed an issue where the tuner would not tune the learning rate if also tuning the batch size (#4688)
  • Fixed broadcast to use PyTorch broadcast_object_list and add reduce_decision (#6410)
  • Fixed logger creating directory structure too early in DDP (#6380)
  • Fixed DeepSpeed additional memory use on rank 0 when default device not set early enough (#6460)
  • Fixed DummyLogger.log_hyperparams raising a TypeError when running with fast_dev_run=True (#6398)
  • Fixed an issue with Tuner.scale_batch_size not finding the batch size attribute in the datamodule (#5968)
  • Fixed an exception in the layer summary when the model contains torch.jit scripted submodules (#6511)
  • Fixed the train loop config validation being run during Trainer.predict (#6541)

Contributors

@awaelchli, @kaushikb11, @Palzer, @SeanNaren, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]

Standard weekly patch release

[1.2.3] - 2021-03-09

Fixed

  • Fixed ModelPruning(make_pruning_permanent=True) pruning buffers getting removed when saved during training (#6073)
  • Fixed _stable_1d_sort to work when n >= N (#6177)
  • Fixed AttributeError when logger=None on TPU (#6221)
  • Fixed PyTorch Profiler with emit_nvtx (#6260)
  • Fixed trainer.test from best_path hangs after calling trainer.fit (#6272)
  • Fixed SingleTPU calling all_gather (#6296)
  • Ensure we check deepspeed/sharded in multinode DDP (#6297)
  • Check LightningOptimizer doesn't delete optimizer hooks (#6305)
  • Resolve memory leak for evaluation (#6326)
  • Ensure that clip gradients is only called if the value is greater than 0 (#6330)
  • Fixed Trainer not resetting lightning_optimizers when calling Trainer.fit() multiple times (#6372)

Contributors

@awaelchli, @carmocca, @chizuchizu, @frankier, @SeanNaren, @tchaton

If we forgot someone due to not matching commit email with GitHub account, let us know :]