# Overview
This tutorial shows how to extend the features of a workflow in the model-zoo bundles based on the `event-handler` mechanism.
Here we add execution time computation logic to the spleen segmentation bundle.

## Event-handler mechanism
The bundles in the `model-zoo` are built on the MONAI workflow, which enables a quick start for training and evaluation experiments.
The MONAI workflow is compatible with the pytorch-ignite `Engine` and `Event-Handler` mechanism: https://pytorch-ignite.ai/.

So we can easily add new features to the workflow by defining an independent event handler and attaching it to the workflow engine.

### Supported events
Here are all the supported `Event`s in MONAI:
| Class | Event name | Description |
| --- | --- | --- |
| ignite.engine.Events | STARTED | triggered when engine's run is started |
| ignite.engine.Events | EPOCH_STARTED | triggered when the epoch is started |
| ignite.engine.Events | GET_BATCH_STARTED | triggered before next batch is fetched |
| ignite.engine.Events | GET_BATCH_COMPLETED | triggered after the batch is fetched |
| ignite.engine.Events | ITERATION_STARTED | triggered when an iteration is started |
| monai.engines.IterationEvents | FORWARD_COMPLETED | triggered when `network(image, label)` is completed |
| monai.engines.IterationEvents | LOSS_COMPLETED | triggered when `loss(pred, label)` is completed |
| monai.engines.IterationEvents | BACKWARD_COMPLETED | triggered when `loss.backward()` is completed |
| monai.engines.IterationEvents | MODEL_COMPLETED | triggered when all the model related operations completed |
| monai.engines.IterationEvents | INNER_ITERATION_STARTED | triggered when the iteration has an inner loop and the loop is started |
| monai.engines.IterationEvents | INNER_ITERATION_COMPLETED | triggered when the iteration has an inner loop and the loop is completed |
| ignite.engine.Events | ITERATION_COMPLETED | triggered when the iteration is ended |
| ignite.engine.Events | DATALOADER_STOP_ITERATION | triggered when dataloader has no more data to provide |
| ignite.engine.Events | EXCEPTION_RAISED | triggered when an exception is encountered |
| ignite.engine.Events | TERMINATE_SINGLE_EPOCH | triggered when the run is about to end the current epoch |
| ignite.engine.Events | TERMINATE | triggered when the run is about to end completely |
| ignite.engine.Events | INTERRUPT | triggered when the run is interrupted |
| ignite.engine.Events | EPOCH_COMPLETED | triggered when the epoch is ended |
| ignite.engine.Events | COMPLETED | triggered when engine's run is completed |
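
The `monai.engines.IterationEvents` entries above are registered by the MONAI engines themselves, so a handler can attach to them just like the standard ignite events. Below is a minimal sketch; the handler name and the `"loss"` key lookup are illustrative assumptions, not part of any bundle:
```py
from ignite.engine import Engine
from monai.engines import IterationEvents


class LossProbeHandler:
    """Hypothetical handler that reacts to a MONAI-specific iteration event."""

    def attach(self, engine: Engine) -> None:
        # MONAI engines (e.g. SupervisedTrainer) register IterationEvents on themselves,
        # so attaching to them works the same way as for the built-in ignite events
        engine.add_event_handler(IterationEvents.LOSS_COMPLETED, self.loss_completed)

    def loss_completed(self, engine: Engine) -> None:
        # at this point the loss has been computed and stored in the iteration output
        output = engine.state.output
        if isinstance(output, dict) and "loss" in output:
            print(f"iteration {engine.state.iteration} loss value: {output['loss']}")
```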

For more information about the `Event`s of pytorch-ignite, please refer to:
https://pytorch.org/ignite/generated/ignite.engine.events.Events.html.

Users can also register their own customized `Event`s with the workflow engine.

### Develop event handler
A typical handler must contain an `attach()` function and several callback functions to handle the attached events.
For example, here we define a dummy handler that executes some logic when an iteration starts and when it completes, once every 5 iterations:
```py
from ignite.engine import Engine, Events


class DummyHandler:
    def attach(self, engine: Engine) -> None:
        # the `every=5` event filter triggers the callbacks only once every 5 iterations
        engine.add_event_handler(Events.ITERATION_STARTED(every=5), self.iteration_started)
        engine.add_event_handler(Events.ITERATION_COMPLETED(every=5), self.iteration_completed)

    def iteration_started(self, engine: Engine) -> None:
        pass

    def iteration_completed(self, engine: Engine) -> None:
        pass
```
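
The handler can then be attached to any ignite-compatible engine. A minimal, self-contained sketch of wiring it up outside of a bundle (the toy `process_function` and the dummy data are assumptions for illustration only):
```py
from ignite.engine import Engine


def process_function(engine, batch):
    # a toy iteration body, just to have something to run
    return batch


trainer = Engine(process_function)
# attach the DummyHandler defined above
DummyHandler().attach(trainer)
# run 2 epochs over 10 dummy "batches"; the callbacks fire once every 5 iterations
trainer.run(data=list(range(10)), max_epochs=2)
```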

### Get context information of the workflow to extend features or debug
Within the handler callback functions, it's easy to get the property objects of the `engine` to execute more logic,
like `engine.network`, `engine.optimizer`, etc. And all the context information is recorded as properties of `engine.state`:
| Property | Description |
| --- | --- |
| rank | index of current rank in distributed data parallel |
| iteration | index of current iteration |
| epoch | index of current epoch |
| max_epochs | max epoch number to execute |
| epoch_length | iteration number to execute in 1 epoch |
| output | output data of current iteration |
| batch | input data of current iteration |
| metrics | metrics values of current epoch |
| metric_details | details data during metrics computation of current epoch |
| dataloader | dataloader to generate the input data of every iteration |
| device | target device to put the input data |
| key_metric_name | name of the key metric to compare and select the best model |
| best_metric | value of the best metric results |
| best_metric_epoch | epoch index of the best metric value |

Users can also register their own customized properties in `engine.state`.
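For example, a minimal sketch of keeping a customized property on `engine.state`; the property name `seen_batches` is an assumption for illustration:
```py
from ignite.engine import Engine, Events


def register_seen_batches(engine: Engine) -> None:
    """Hypothetical example: keep a customized counter in engine.state."""

    @engine.on(Events.STARTED)
    def _init(engine: Engine) -> None:
        # create the customized property; it lives alongside iteration, epoch, etc.
        engine.state.seen_batches = 0

    @engine.on(Events.ITERATION_COMPLETED)
    def _update(engine: Engine) -> None:
        # the property can be read from any other handler, e.g. for logging
        engine.state.seen_batches += 1
```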

To extend features or debug the workflow, we can leverage this information.
For example, here we print the `learning rate` value and the `current epoch` index within an event callback function:
```py
def epoch_completed(self, engine: Engine) -> None:
    print(f"Current epoch: {engine.state.epoch}")
    print(f"Learning rate: {engine.optimizer.state_dict()['param_groups'][0]['lr']}")
```

And to extract the expected data from `engine.state.output`, we usually define an `output_transform` callable argument in the handler,
like the existing [StatsHandler](https://docs.monai.io/en/stable/handlers.html#monai.handlers.StatsHandler), [TensorBoardStatsHandler](https://docs.monai.io/en/stable/handlers.html#monai.handlers.TensorBoardStatsHandler), etc.
MONAI provides a convenient utility `monai.handlers.from_engine` to cover most of the typical `output_transform` callables.
For more details, please refer to: https://docs.monai.io/en/stable/handlers.html#monai.handlers.utils.from_engine.
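
For example, a minimal sketch of using `from_engine` as an `output_transform`; the tag name and key are typical values, not specific to this bundle:
```py
from monai.handlers import StatsHandler, from_engine

# extract the "loss" value from engine.state.output for logging;
# `first=True` returns the value of the first element when the output is a list of dicts
train_stats_handler = StatsHandler(
    tag_name="train_loss",
    output_transform=from_engine(["loss"], first=True),
)
```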

## Download example MONAI bundle from model-zoo
```
python -m monai.bundle download --name spleen_ct_segmentation --version "0.1.1" --bundle_dir "./"
```
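
Equivalently, the bundle can be downloaded from Python with `monai.bundle.download`; a small sketch:
```py
from monai.bundle import download

# equivalent to the CLI command above
download(name="spleen_ct_segmentation", version="0.1.1", bundle_dir="./")
```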

## Extend the workflow to print the execution time of every iteration, every epoch and the whole run
Here we define a new handler in `spleen_ct_segmentation/scripts/timer.py` to compute and print the time consumption details:
```py
from time import time

from ignite.engine import Engine, Events


class TimerHandler:
    """Print the execution time of every iteration, every epoch and the whole run."""

    def __init__(self) -> None:
        self.start_time = 0
        self.epoch_start_time = 0
        self.iteration_start_time = 0

    def attach(self, engine: Engine) -> None:
        # record the start timestamps, then report durations at the matching "completed" events
        engine.add_event_handler(Events.STARTED, self.started)
        engine.add_event_handler(Events.EPOCH_STARTED, self.epoch_started)
        engine.add_event_handler(Events.ITERATION_STARTED, self.iteration_started)
        engine.add_event_handler(Events.ITERATION_COMPLETED, self.iteration_completed)
        engine.add_event_handler(Events.EPOCH_COMPLETED, self.epoch_completed)
        engine.add_event_handler(Events.COMPLETED, self.completed)

    def started(self, engine: Engine) -> None:
        self.start_time = time()

    def epoch_started(self, engine: Engine) -> None:
        self.epoch_start_time = time()

    def iteration_started(self, engine: Engine) -> None:
        self.iteration_start_time = time()

    def iteration_completed(self, engine: Engine) -> None:
        print(f"iteration {engine.state.iteration} execution time: {time() - self.iteration_start_time}")

    def epoch_completed(self, engine: Engine) -> None:
        print(f"epoch {engine.state.epoch} execution time: {time() - self.epoch_start_time}")

    def completed(self, engine: Engine) -> None:
        print(f"total execution time: {time() - self.start_time}")
```
Then add the handler to the `"train": {"handlers": [...]}` list of the `train.json` config:
```json
{
    "_target_": "scripts.timer.TimerHandler"
}
```

## Command example
To run the workflow with this customized handler, `PYTHONPATH` should be revised to include the bundle directory that contains the `scripts` package, so that `scripts.timer` can be imported:
```
export PYTHONPATH=$PYTHONPATH:"<path to 'spleen_ct_segmentation'>"
```
And please make sure the folder `spleen_ct_segmentation/scripts` is a valid python package (it has an `__init__.py` file in the folder).
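
The bundle directory would then look roughly like this (only the files mentioned in this tutorial are shown):
```
spleen_ct_segmentation
├── configs
│   ├── logging.conf
│   ├── metadata.json
│   └── train.json
└── scripts
    ├── __init__.py
    └── timer.py
```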

Execute training:

```
python -m monai.bundle run training --meta_file configs/metadata.json --config_file configs/train.json --logging_file configs/logging.conf
```