
Commit 9028fe1

[bugfix] TPU + all_gather + SingleTPU shouldn't call xm.all_gather (#6296)
* resolve an issue with TPU
* update
* add changelog
1 parent ef03a03 commit 9028fe1

File tree

2 files changed: +6 -1 lines changed


CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -30,6 +30,8 @@ The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).
 - Fixed `trainer.test` from `best_path` hangs after calling `trainer.fit` ([#6272](https://github.com/PyTorchLightning/pytorch-lightning/pull/6272))
 
 
+- Fixed `SingleTPU` calling `all_gather` ([#6296](https://github.com/PyTorchLightning/pytorch-lightning/pull/6296))
+
 ## [1.2.2] - 2021-03-02
 
 ### Added

pytorch_lightning/accelerators/tpu.py

Lines changed: 4 additions & 1 deletion
@@ -40,4 +40,7 @@ def all_gather(self, tensor: Union[torch.Tensor], group: Optional[Any] = None, s
         Return:
             A tensor of shape (world_size, batch, ...)
         """
-        return xm.all_gather(tensor, group=group, sync_grads=sync_grads)
+        # todo: Add support for backward with all_gather
+        if torch.distributed.is_initialized():
+            return xm.all_gather(tensor, group=group, sync_grads=sync_grads)
+        return tensor
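
For context, the sketch below shows roughly what the patched `TPUAccelerator.all_gather` looks like after this commit. It mirrors the diff above, but it is not a verbatim copy of the file: the imports, the class scaffolding, the docstring summary line, and the `sync_grads: bool = False` default plus return annotation (the hunk header above is truncated) are assumptions. It also assumes `torch_xla` is installed.

```python
# Hypothetical, simplified reconstruction of the all_gather method from
# pytorch_lightning/accelerators/tpu.py after this commit.
from typing import Any, Optional, Union

import torch
import torch_xla.core.xla_model as xm


class TPUAccelerator:

    def all_gather(
        self,
        tensor: Union[torch.Tensor],
        group: Optional[Any] = None,
        sync_grads: bool = False,  # assumed default; the hunk header is truncated
    ) -> torch.Tensor:
        """Gather a tensor from several distributed processes.

        Return:
            A tensor of shape (world_size, batch, ...)
        """
        # todo: Add support for backward with all_gather
        # On a single TPU core no process group is initialized, so skip
        # xm.all_gather and return the input tensor unchanged.
        if torch.distributed.is_initialized():
            return xm.all_gather(tensor, group=group, sync_grads=sync_grads)
        return tensor
```

The added guard makes `all_gather` effectively a no-op on a `SingleTPU` setup, where no process group is initialized, instead of unconditionally calling `xm.all_gather` as before.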
