Commit 633f4ec

Merge branch 'main' of github.com:Project-MONAI/tutorials into update-mil
2 parents: a99fe58 + 8d84b41

File tree: 7 files changed (+503, −61 lines)


README.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -175,7 +175,7 @@ And compares the training speed and memory usage with/without AMP.
 This notebook compares the performance of `Dataset`, `CacheDataset` and `PersistentDataset`. These classes differ in how data is stored (in memory or on disk), and at which moment transforms are applied.
 #### [fast_training_tutorial](./acceleration/fast_training_tutorial.ipynb)
 This tutorial compares the training performance of pure PyTorch program and optimized program in MONAI based on NVIDIA GPU device and latest CUDA library.
-The optimization methods mainly include: `AMP`, `CacheDataset` and `Novograd`.
+The optimization methods mainly include: `AMP`, `CacheDataset`, `GPU transforms`, `ThreadDataLoader`, `DiceCELoss` and `SGD`.
 #### [multi_gpu_test](./acceleration/multi_gpu_test.ipynb)
 This notebook is a quick demo for devices, run the Ignite trainer engine on CPU, GPU and multiple GPUs.
 #### [threadbuffer_performance](./acceleration/threadbuffer_performance.ipynb)
```
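The fast-training entries above repeatedly mention `AMP` (PyTorch automatic mixed precision). As a rough, device-agnostic sketch of what an AMP training step looks like (the linear model and random data here are toy placeholders, not the tutorial's network):

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Linear(16, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
# GradScaler guards against float16 gradient underflow; it is only needed
# (and only enabled here) when training in float16 on CUDA.
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(8, 16, device=device)
y = torch.randint(0, 2, (8,), device=device)

optimizer.zero_grad()
# autocast runs eligible ops in reduced precision:
# float16 on CUDA, bfloat16 on CPU.
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16
with torch.autocast(device_type=device, dtype=amp_dtype):
    loss = loss_fn(model(x), y)
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```

On CUDA this typically speeds up training substantially with nearly identical validation metrics, which is the comparison the `automatic_mixed_precision` notebooks make.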

acceleration/fast_model_training_guide.md

Lines changed: 2 additions & 3 deletions

```diff
@@ -305,14 +305,13 @@ With all the above strategies, in this section, we introduce how to apply them t
 ### 1. Spleen segmentation
 
 - Select the algorithms based on the experiments.
-  1. As a binary segmentation task, we replaced the baseline `Dice` loss with a `DiceCE` loss, it can help improve the convergence. To achieve the target metric (mean Dice = 0.95) it reduces the number of training epochs from 200 to 50.
-  2. We tried several numerical optimizers, and finally replaced the baseline `Adam` optimizer with `Novograd`, which consistently reduce the number of training epochs from 50 to 30.
+  As a binary segmentation task, we replaced the baseline `Dice` loss with a `DiceCE` loss, which helps improve convergence. We analyzed the training curve, tuned the network parameters and tested several numerical optimizers, finally replacing the baseline `Adam` optimizer with `SGD`. To achieve the target metric (`mean Dice = 0.94` on the `foreground` channel only), this reduces the number of training epochs from 280 to 60.
 - Optimize GPU utilization.
   1. With `AMP`, the training speed is significantly improved and can achieve almost the same validation metric as without `AMP`.
   2. The deterministic transform results of all the spleen dataset is around 8 GB, which can be cached in a V100 GPU memory. So, we cached all the data in GPU memory and executed the following transforms in GPU directly.
 - Replace `DataLoader` with `ThreadDataLoader`. As all the data are cached in GPU, the computation of randomized transforms is on GPU and light-weighted, `ThreadDataLoader` help avoid the IPC cost of multi-processing in `DataLoader` and increase the GPU utilization.
 
-In summary, with a V100 GPU, we can achieve the training converges at a target validation mean Dice of `0.95` within one minute (`52s` on a V100 GPU, `41s` on an A100 GPU), it is approximately `200x` faster compared with the native PyTorch implementation when achieving the target metric. And each epoch is `20x` faster than the regular training.
+In summary, with a V100 GPU and a target validation `mean Dice = 0.94` on the `foreground` channel only, the optimized pipeline is more than `100x` faster than the regular PyTorch implementation at reaching the same validation metric, and each epoch is `20x` faster than regular training.
 
 ![spleen fast training](../figures/fast_training.png)
 
 More details are available at [Spleen fast training tutorial](https://github.com/Project-MONAI/tutorials/blob/main/acceleration/fast_training_tutorial.ipynb).
```

acceleration/fast_training_tutorial.ipynb

Lines changed: 68 additions & 47 deletions (large diff not rendered by default)

figures/fast_training.png

Binary file changed: −667 KB

modules/bundle/spleen_segmentation/configs/metadata.json

Lines changed: 11 additions & 10 deletions

```diff
@@ -16,7 +16,7 @@
     "authors": "MONAI team",
     "copyright": "Copyright (c) MONAI Consortium",
     "data_source": "Task09_Spleen.tar from http://medicaldecathlon.com/",
-    "data_type": "dicom",
+    "data_type": "nibabel",
     "image_classes": "single channel data, intensity scaled to [0, 1]",
     "label_classes": "single channel data, 1 is spleen, 0 is everything else",
     "pred_classes": "2 channels OneHot data, channel 1 is spleen, channel 0 is background",
@@ -32,19 +32,20 @@
     "inputs": {
         "image": {
             "type": "image",
-            "format": "magnitude",
+            "format": "hounsfield",
+            "modality": "CT",
             "num_channels": 1,
             "spatial_shape": [
-                160,
-                160,
-                160
+                96,
+                96,
+                96
             ],
             "dtype": "float32",
             "value_range": [
                 0,
                 1
             ],
-            "is_patch_data": false,
+            "is_patch_data": true,
             "channel_def": {
                 "0": "image"
             }
@@ -56,16 +57,16 @@
             "format": "segmentation",
             "num_channels": 2,
             "spatial_shape": [
-                160,
-                160,
-                160
+                96,
+                96,
+                96
             ],
             "dtype": "float32",
             "value_range": [
                 0,
                 1
             ],
-            "is_patch_data": false,
+            "is_patch_data": true,
             "channel_def": {
                 "0": "background",
                 "1": "spleen"
```
