Releases: Lightning-AI/pytorch-lightning
standard weekly patch release
Overview
Detail changes
Added
- Add a notebook example to reach a quick baseline of ~94% accuracy on CIFAR10 using ResNet in Lightning (#4818)
Changed
Removed
Fixed
- Fixed trainer by default `None` in `DDPAccelerator` (#4915)
- Fixed `LightningOptimizer` to expose optimizer attributes (#5095)
- Do not warn when the `name` key is used in the `lr_scheduler` dict (#5057) (see the sketch after this list)
- Check if optimizer supports closure (#4981)
- Extend `LightningOptimizer` to expose underlying `Optimizer` attributes + update doc (#5095)
- Add deprecated metric utility functions back to functional (#5067, #5068)
- Allow any input in `to_onnx` and `to_torchscript` (#4378)
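As a quick illustration of the `lr_scheduler` dict item above, here is a minimal sketch of naming a scheduler from `configure_optimizers`; the model, optimizer, and scheduler choices are placeholders, not part of this release.

```python
import torch
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)  # placeholder model

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10)
        # The "name" key labels the scheduler (e.g. for LearningRateMonitor)
        # and no longer triggers a warning.
        return [optimizer], [{"scheduler": scheduler, "interval": "epoch", "name": "my_lr"}]
```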
Contributors
@Borda, @carmocca, @hemildesai, @rohitgr7, @s-rog, @tarepan, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
Model Parallelism Training and More Logging Options
Overview
Lightning 1.1 is out! You can now train models with twice the parameters and zero code changes with the new sharded model training! We also have a new plugin for sequential model parallelism, more logging options, and a lot of improvements!
Release highlights: https://bit.ly/3gyLZpP
Learn more about sharded training: https://bit.ly/2W3hgI0
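For context, sharded training in this release is opt-in via the Trainer's plugins argument. The sketch below is a minimal, hedged example: it assumes fairscale is installed, two GPUs are available, and that `"ddp_sharded"` is the plugin name used at the time; the model and datamodule are omitted.

```python
import pytorch_lightning as pl

# Minimal sketch, assuming fairscale is installed and two GPUs are available.
trainer = pl.Trainer(
    gpus=2,
    accelerator="ddp",
    plugins="ddp_sharded",  # shard optimizer state and gradients across GPUs
    precision=16,           # optional: combine with native AMP for extra memory savings
)
# trainer.fit(model, datamodule=dm)  # train as usual -- no model code changes required
```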
Detail changes
Added
- Added "monitor" key to saved `ModelCheckpoint`s (#4383)
- Added `ConfusionMatrix` class interface (#4348)
- Added multiclass AUROC metric (#4236)
- Added global step indexing to the checkpoint name for a better sub-epoch checkpointing experience (#3807)
- Added optimizer hooks in callbacks (#4379)
- Added option to log momentum (#4384)
- Added `current_score` to `ModelCheckpoint.on_save_checkpoint` (#4721)
- Added logging using `self.log` in train and evaluation for epoch end hooks (#4913) (see the sketch after this list)
- Added ability for DDP plugin to modify optimizer state saving (#4675)
- Added casting to python types for NumPy scalars when logging `hparams` (#4647)
- Added `prefix` argument in loggers (#4557)
- Added printing of total num of params, trainable and non-trainable params in ModelSummary (#4521)
- Added `PrecisionRecallCurve`, `ROC`, `AveragePrecision` class metrics (#4549)
- Added custom `Apex` and `NativeAMP` as `Precision plugins` (#4355)
- Added `DALI MNIST` example (#3721)
- Added sharded plugin for DDP for multi-GPU training memory optimizations (#4773)
- Added `experiment_id` to the NeptuneLogger (#3462)
- Added `PyTorch Geometric` integration example with Lightning (#4568)
- Added `all_gather` method to `LightningModule` which allows gradient-based tensor synchronizations for use-cases such as negative sampling (#5012)
- Enabled `self.log` in most functions (#4969)
- Added changeable extension variable for `ModelCheckpoint` (#4977)
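To illustrate the expanded `self.log` support noted above, here is a minimal sketch of logging from a step and from an epoch-end hook; the metric names and the tiny model are placeholders.

```python
import torch
import torch.nn.functional as F
import pytorch_lightning as pl


class LitClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)  # placeholder model

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self.layer(x), y)
        self.log("train_loss", loss, on_step=True, on_epoch=True)  # per-step and epoch-aggregated
        return loss

    def validation_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self.layer(x), y)
        self.log("val_loss", loss, prog_bar=True)
        return loss

    def validation_epoch_end(self, outputs):
        # self.log is now also available in epoch-end hooks
        self.log("val_loss_epoch_mean", torch.stack(outputs).mean())

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)
```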
Changed
- Removed `multiclass_roc` and `multiclass_precision_recall_curve`, use `roc` and `precision_recall_curve` instead (#4549)
- Tuner algorithms will be skipped if `fast_dev_run=True` (#3903)
- WandbLogger does not force wandb `reinit` arg to True anymore and creates a run only when needed (#4648)
- Changed `automatic_optimization` to be a model attribute (#4602)
- Changed `Simple Profiler` report to order by percentage time spent + num calls (#4880)
- Simplified optimization logic (#4984)
- Classification metrics overhaul (#4837)
- Updated `fast_dev_run` to accept an integer representing num_batches (#4629) (see the sketch after this list)
- Refactored optimizer (#4658)
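To illustrate the `fast_dev_run` change above, a minimal sketch of passing an integer to the Trainer; the batch count is arbitrary.

```python
import pytorch_lightning as pl

# fast_dev_run=True still runs a single batch of train/val/test as a smoke test;
# an integer now runs that many batches of each loop instead.
trainer = pl.Trainer(fast_dev_run=5)
# trainer.fit(model, datamodule=dm)
```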
Deprecated
- Deprecated `prefix` argument in `ModelCheckpoint` (#4765)
- Deprecated the old way of assigning hyper-parameters through `self.hparams = ...` (#4813) (see the sketch after this list)
- Deprecated `mode='auto'` from `ModelCheckpoint` and `EarlyStopping` (#4695)
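For the `self.hparams = ...` deprecation above, the replacement is `save_hyperparameters()`; a minimal sketch of the migration, with placeholder argument names.

```python
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self, learning_rate=1e-3, hidden_dim=128):
        super().__init__()
        # Deprecated: self.hparams = {"learning_rate": learning_rate, "hidden_dim": hidden_dim}
        # Preferred: capture the init arguments automatically.
        self.save_hyperparameters()
        # They are then available as self.hparams.learning_rate, self.hparams.hidden_dim
```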
Removed
- Removed `reorder` parameter of the `auc` metric (#5004)
Fixed
- Added feature to move tensors to CPU before saving (#4309)
- Fixed `LoggerConnector` to have logged metrics on root device in DP (#4138)
- Auto convert tensors to contiguous format when `gather_all` (#4907)
- Fixed `PYTHONPATH` for DDP test model (#4528)
- Fixed allowing logger to support indexing (#4595)
- Fixed DDP and manual_optimization (#4976)
Contributors
@ananyahjha93, @awaelchli, @blatr, @Borda, @borisdayma, @carmocca, @ddrevicky, @george-gca, @gianscarpe, @irustandi, @janhenriklambrechts, @jeremyjordan, @justusschock, @lezwon, @rohitgr7, @s-rog, @SeanNaren, @SkafteNicki, @tadejsv, @tchaton, @williamFalcon, @zippeurfou
If we forgot someone due to not matching commit email with GitHub account, let us know :]
standard weekly patch release
Detail changes
Added
- Added casting to python types for NumPy scalars when logging `hparams` (#4647)
- Added warning when progress bar refresh rate is less than 20 on Google Colab to prevent crashing (#4654)
- Added `F1` class metric (#4656)
Changed
- Consistently use `step=trainer.global_step` in `LearningRateMonitor` independently of `logging_interval` (#4376)
- Metric states are no longer added to `state_dict` by default (#4685)
- Renamed class metric `Fbeta` >> `FBeta` (#4656)
- Model summary: add 1 decimal place (#4745)
- Do not override `PYTHONWARNINGS` (#4700)
Fixed
- Fixed checkpoint `hparams` dict casting when `omegaconf` is available (#4770)
- Fixed incomplete progress bars when total batches are not divisible by the refresh rate (#4577)
- Updated SSIM metric (#4566, #4656)
- Fixed `batch_arg_name` bug by adding `batch_arg_name` to all calls to `_adjust_batch_size` (#4812)
- Fixed moving `torchtext` data to GPU (#4785)
- Fixed a crash bug in the MLflow logger (#4716)
Contributors
@awaelchli, @jonashaag, @jungwhank, @M-Salti, @moi90, @pgagarinov, @s-rog, @Samyak2, @SkafteNicki, @teddykoker, @ydcjeff
If we forgot someone due to not matching commit email with GitHub account, let us know :]
standard weekly patch release
Detail changes
Added
- Added lambda closure to `manual_optimizer_step` (#4618)
Changed
- Changed Metrics `persistent` default mode to `False` (#4685)
Fixed
- Prevent crash if `sync_dist=True` on CPU (#4626) (see the sketch after this list)
- Fixed average pbar Metrics (#4534)
- Fixed `setup` callback hook to correctly pass the LightningModule through (#4608)
- Allowed decorating model init with saving `hparams` inside (#4662)
- Fixed `split_idx` set by `LoggerConnector` in `on_trainer_init` to `Trainer` (#4697)
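Regarding the `sync_dist=True` fix above, the flag asks `self.log` to reduce the logged value across processes; a minimal sketch, with a placeholder model and metric name.

```python
import torch
import torch.nn.functional as F
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)  # placeholder model

    def validation_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self.layer(x), y)
        # sync_dist=True averages the value across processes; this release fixes
        # the crash when it is used on CPU-only runs.
        self.log("val_loss", loss, sync_dist=True)
        return loss
```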
Contributors
@ananthsub, @Borda, @SeanNaren, @SkafteNicki, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
standard weekly patch release
Detail changes
Added
- Added metrics aggregation in Horovod and fixed early stopping (#3775)
- Added `manual_optimizer_step` which works with `AMP Native` and `accumulated_grad_batches` (#4485)
- Added `persistent(mode)` method to metrics, to enable and disable metric states being added to `state_dict` (#4482)
- Added congratulations at the end of our notebooks (#4555)
Changed
- Changed `fsspec` to tuner (#4458)
- Unify SLURM/TorchElastic under backend plugin (#4578, #4580, #4581, #4582, #4583)
Fixed
- Fixed feature-lack in `hpc_load` (#4526)
- Fixed metrics states being overridden in DDP mode (#4482)
- Fixed `lightning_getattr`, `lightning_hasattr` not finding the correct attributes in datamodule (#4347)
- Fixed automatic optimization AMP by `manual_optimization_step` (#4485)
- Replace `MisconfigurationException` with warning in `ModelCheckpoint` callback (#4560)
- Fixed logged keys in the MLflow logger (#4412)
- Fixed `is_picklable` by catching `AttributeError` (#4508)
Contributors
@dscarmo, @jtamir, @kazhang, @maxjeblick, @rohitgr7, @SkafteNicki, @tarepan, @tchaton, @tgaddair, @williamFalcon
If we forgot someone due to not matching commit email with GitHub account, let us know :]
standard weekly patch release
Detail changes
Added
- Added PyTorch 1.7 Stable support (#3821)
- Added timeout for `tpu_device_exists` to ensure process does not hang indefinitely (#4340)
Changed
- W&B log in sync with `Trainer` step (#4405)
- Hook `on_after_backward` is called only when `optimizer_step` is being called (#4439)
- Moved `track_and_norm_grad` into training loop and called only when `optimizer_step` is being called (#4439)
- Changed type checker with explicit cast of `ref_model` object (#4457)
Deprecated
- Deprecated passing `ModelCheckpoint` instance to `checkpoint_callback` Trainer argument (#4336)
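For the deprecation above, the `ModelCheckpoint` instance moves to the Trainer's generic `callbacks` list; a minimal sketch, with the monitored metric name as a placeholder.

```python
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint(monitor="val_loss")

# Deprecated: pl.Trainer(checkpoint_callback=checkpoint)
# Preferred: pass the callback through the callbacks list.
trainer = pl.Trainer(callbacks=[checkpoint])
```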
Fixed
- Disable saving checkpoints if not trained (#4372)
- Fixed error using `auto_select_gpus=True` with `gpus=-1` (#4209)
- Disabled training when `limit_train_batches=0` (#4371)
- Fixed that metrics do not store computational graph for all seen data (#4313)
- Fixed AMP unscale for `on_after_backward` (#4439)
- Fixed TorchScript export when module includes Metrics (#4428)
- Fixed CSV logger warning (#4419)
- Fixed skip DDP parameter sync (#4301)
Contributors
@ananthsub, @awaelchli, @borisdayma, @carmocca, @justusschock, @lezwon, @rohitgr7, @SeanNaren, @SkafteNicki, @ssaru, @tchaton, @ydcjeff
If we forgot someone due to not matching commit email with GitHub account, let us know :]
standard weekly patch release
Detail changes
Added
- Added `dirpath` and `filename` parameters in `ModelCheckpoint` (#4213) (see the sketch after this list)
- Added plugins docs and `DDPPlugin` to customize DDP across all accelerators (#4258)
- Added `strict` option to the scheduler dictionary (#3586)
- Added `fsspec` support for profilers (#4162)
- Added autogenerated helptext to `Trainer.add_argparse_args` (#4344)
- Added support for string values in `Trainer`'s `profiler` parameter (#3656)
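To illustrate the new `dirpath`/`filename` parameters and the string `profiler` values listed above, here is a minimal sketch; the directory, filename pattern, and monitored metric are placeholders.

```python
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

# dirpath/filename replace the old combined `filepath` argument.
checkpoint = ModelCheckpoint(
    dirpath="checkpoints/",
    filename="{epoch:02d}-{val_loss:.2f}",  # formatted from logged metrics
    monitor="val_loss",
)

# The profiler can now be selected by name instead of a bool.
trainer = pl.Trainer(callbacks=[checkpoint], profiler="simple")
```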
Changed
- Improved error messages for invalid `configure_optimizers` returns (#3587)
- Allow changing the logged step value in `validation_step` (#4130)
- Allow setting `replace_sampler_ddp=True` with a distributed sampler already added (#4273)
- Fixed sanitized parameters for `WandbLogger.log_hyperparams` (#4320)
Deprecated
- Deprecated `filepath` in `ModelCheckpoint` (#4213)
- Deprecated `reorder` parameter of the `auc` metric (#4237)
- Deprecated bool values in `Trainer`'s `profiler` parameter (#3656)
Fixed
- Fixed setting device ids in DDP (#4297)
- Fixed synchronization of best model path in `ddp_accelerator` (#4323)
- Fixed `WandbLogger` not uploading checkpoint artifacts at the end of training (#4341)
Contributors
@ananthsub, @awaelchli, @carmocca, @ddrevicky, @louis-she, @mauvilsa, @rohitgr7, @SeanNaren, @tchaton
If we forgot someone due to not matching commit email with GitHub account, let us know :]
standard weekly patch release
Detail changes
Added
- Added `persistent` flag to `Metric.add_state` (#4195)
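A minimal sketch of the `persistent` flag above; the accuracy-style metric is only a placeholder to show where the flag goes.

```python
import torch
from pytorch_lightning.metrics import Metric


class SimpleAccuracy(Metric):
    def __init__(self):
        super().__init__()
        # persistent=False keeps these states out of the module's state_dict
        self.add_state("correct", default=torch.tensor(0), dist_reduce_fx="sum", persistent=False)
        self.add_state("total", default=torch.tensor(0), dist_reduce_fx="sum", persistent=False)

    def update(self, preds, target):
        self.correct += (preds == target).sum()
        self.total += target.numel()

    def compute(self):
        return self.correct.float() / self.total
```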
Changed
Fixed
Contributors
If we forgot someone due to not matching commit email with GitHub account, let us know :]
fixes a major logging bug for val in 1.0
Fixes the last major bugs for validation logging.
Also removes duplicate charts for metric / metric_loss.
Doing this minor release because correct validation metrics logging is critical.
Detail changes
Added
- Added trace functionality to the function `to_torchscript` (#4142)
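To illustrate the trace option above, `to_torchscript` can now export via tracing as well as scripting; a minimal sketch, where the model and example input are placeholders.

```python
import torch
import pytorch_lightning as pl


class LitModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)  # placeholder model

    def forward(self, x):
        return self.layer(x)


model = LitModel()
# method="script" (default) uses torch.jit.script; method="trace" uses torch.jit.trace
# and therefore needs an example input.
scripted = model.to_torchscript()
traced = model.to_torchscript(method="trace", example_inputs=torch.randn(1, 32))
```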
Changed
- Called `on_load_checkpoint` before loading `state_dict` (#4057)
Removed
- Removed duplicate metric vs step log for train loop (#4173)
Fixed
- Fixed the `self.log` problem in `validation_step()` (#4169)
- Fixed `hparams` saving: the state is now saved when `save_hyperparameters()` is called in `__init__` (#4163)
- Fixed runtime failure while exporting `hparams` to YAML (#4158)
Contributors
@Borda, @NumesSanguis, @rohitgr7, @williamFalcon
If we forgot someone due to not matching commit email with GitHub account, let us know :]
minor jit fixes
Obligatory post-1.0 minor release. The main fix makes the LightningModule fully compatible with JIT (there were some edge cases we had not covered).