Project-MONAI · wyli · Dec 12, 2022 · Oct 28, 2022
diff --git a/README.md b/README.md
@@ -150,6 +150,8 @@ This is example walks through using a Triton Server and Python client using MONA
 An example of experiment management with [Aim](https://aimstack.io/aim-monai-tutorial/), using 3D spleen segmentation as an example.
 ##### [MLFlow](./experiment_management/spleen_segmentation_mlflow.ipynb)
 An example of experiment management with [MLFlow](https://www.mlflow.org/docs/latest/tracking.html), using 3D spleen segmentation as an example.
+##### [MLFlow with MONAI workflow](./experiment_management/workflow_integrate_mlflow.ipynb)
+An example showing how to enable and customize MLFlow in MONAI workflow-based processes such as MONAI bundles and MONAI engines.
 
 #### <ins>**Federated Learning**</ins>
 ##### [NVFlare](./federated_learning/nvflare)

diff --git a/experiment_management/README.md b/experiment_management/README.md
@@ -0,0 +1,7 @@
+# Overview
+This directory shows how to do experiment management in MONAI.
+
+## Files
+1. spleen_segmentation_aim.ipynb: an example of how to set up Aim experiment management in PyTorch code.
+2. spleen_segmentation_mlflow.ipynb: an example of how to set up MLflow experiment management in PyTorch code.
+3. workflow_integrate_mlflow.ipynb: an example of how to set up MLflow experiment management in a MONAI workflow with one or two lines.
diff --git a/experiment_management/extra_pics/mlflow_config_result.png b/experiment_management/extra_pics/mlflow_config_result.png
diff --git a/experiment_management/extra_pics/mlflow_default_result.png b/experiment_management/extra_pics/mlflow_default_result.png
diff --git a/experiment_management/extra_pics/mlflow_python_result.png b/experiment_management/extra_pics/mlflow_python_result.png
diff --git a/experiment_management/mlflow_example.json b/experiment_management/mlflow_example.json
@@ -0,0 +1,26 @@
+{
+    "handlers_id": {
+        "trainer": {
+            "id": "train#trainer",
+            "handlers": "train#handlers"
+        }
+    },
+    "configs": {
+        "tracking_uri": "$@output_dir + '/mlruns'",
+        "experiment_name": "monai_experiment",
+        "run_name": "test1",
+        "is_not_rank0": "$torch.distributed.is_available() and torch.distributed.is_initialized() and torch.distributed.get_rank() > 0",
+        "trainer": {
+            "_target_": "MLFlowHandler",
+            "_disabled_": "@is_not_rank0",
+            "tracking_uri": "@tracking_uri",
+            "experiment_name": "@experiment_name",
+            "run_name": "@run_name",
+            "iteration_log": true,
+            "epoch_log": true,
+            "tag_name": "train_loss",
+            "output_transform": "$monai.handlers.from_engine(['loss'], first=True)",
+            "experiment_param": "${'backbone':'unet', 'norm':'Batch'}"
+        }
+    }
+}
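
Note on `mlflow_example.json`: the `@name` entries follow the MONAI bundle reference syntax, so each one has to resolve either to another key under `configs` or to a key supplied by the bundle's own config (here `output_dir`, which this file does not define). The sketch below is a stdlib-only check, not MONAI's `ConfigParser`. It lists which references resolve locally and which are expected to come from the bundle. The helper names `find_refs` and `classify_refs` are hypothetical, and the regular expression is an assumed approximation of the reference syntax.

```python
import json
import re

# Copy of the tracking settings from mlflow_example.json.
SETTINGS = json.loads(r"""
{
    "handlers_id": {
        "trainer": {"id": "train#trainer", "handlers": "train#handlers"}
    },
    "configs": {
        "tracking_uri": "$@output_dir + '/mlruns'",
        "experiment_name": "monai_experiment",
        "run_name": "test1",
        "is_not_rank0": "$torch.distributed.is_available() and torch.distributed.is_initialized() and torch.distributed.get_rank() > 0",
        "trainer": {
            "_target_": "MLFlowHandler",
            "_disabled_": "@is_not_rank0",
            "tracking_uri": "@tracking_uri",
            "experiment_name": "@experiment_name",
            "run_name": "@run_name",
            "iteration_log": true,
            "epoch_log": true,
            "tag_name": "train_loss",
            "output_transform": "$monai.handlers.from_engine(['loss'], first=True)",
            "experiment_param": "${'backbone':'unet', 'norm':'Batch'}"
        }
    }
}
""")

# Assumption: a reference is "@" followed by an identifier, optionally with
# "#"-separated sub-keys (as in "train#trainer").
REF_PATTERN = re.compile(r"@([A-Za-z_]\w*(?:#\w+)*)")


def find_refs(value):
    """Yield every "@name" reference found in nested dict/list/str values."""
    if isinstance(value, str):
        yield from REF_PATTERN.findall(value)
    elif isinstance(value, dict):
        for item in value.values():
            yield from find_refs(item)
    elif isinstance(value, list):
        for item in value:
            yield from find_refs(item)


def classify_refs(settings):
    """Split references into those defined under "configs" and those that are not."""
    configs = settings["configs"]
    refs = set(find_refs(configs))
    local = sorted(ref for ref in refs if ref in configs)
    external = sorted(ref for ref in refs if ref not in configs)
    return local, external


local_refs, external_refs = classify_refs(SETTINGS)
print("resolved inside the tracking file:", local_refs)
print("expected from the bundle config:", external_refs)
```

Any name reported as external has to exist in the bundle config that the tracking file is combined with, otherwise the reference cannot be resolved at run time.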
diff --git a/experiment_management/workflow_integrate_mlflow.ipynb b/experiment_management/workflow_integrate_mlflow.ipynb
@@ -0,0 +1,272 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# workflow_integrate_mlflow"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "[MLflow](https://mlflow.org/) is an experiment management tool for logging the details and results of machine learning experiments. MONAI workflows integrate MLflow so that users can record their experiments conveniently. This tutorial shows three ways to enable it in a MONAI bundle workflow:\n",
+    "1. Use MLflow in a MONAI bundle with the default settings.\n",
+    "2. Use MLflow in a MONAI bundle with a config file.\n",
+    "3. Use MLflow in a parsed MONAI bundle with Python code.\n",
+    "\n",
+    "This tutorial takes the [3D spleen segmentation task](https://github.com/Project-MONAI/tutorials/blob/main/3d_segmentation/spleen_segmentation_3d.ipynb) as an example. To verify the MLflow integration quickly, each example runs for only 10 epochs."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Setup Environment"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "MLflow comes as part of the `monai[all]` installation. For the official documentation on MLflow's experiment management functionality, see the [MLflow tracking guide](https://www.mlflow.org/docs/latest/tracking.html)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!python -c \"import monai\" || pip install -q \"monai-weekly[gdown, nibabel, tqdm, ignite, mlflow]\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Setup imports"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import os\n",
+    "import monai\n",
+    "import tempfile\n",
+    "from monai.apps import download_and_extract\n",
+    "from monai.bundle import ConfigParser\n",
+    "from monai.handlers import MLFlowHandler"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Setup data directory\n",
+    "\n",
+    "You can specify a directory with the `MONAI_DATA_DIRECTORY` environment variable. This allows you to save results and reuse downloads. If it is not specified, a temporary directory will be used."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "directory = os.environ.get(\"MONAI_DATA_DIRECTORY\")\n",
+    "root_dir = tempfile.mkdtemp() if directory is None else directory\n",
+    "print(root_dir)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Download spleen dataset\n",
+    "Downloads and extracts the dataset. The dataset comes from http://medicaldecathlon.com/."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "resource = \"https://msd-for-monai.s3-us-west-2.amazonaws.com/Task09_Spleen.tar\"\n",
+    "md5 = \"410d4a301da4e5b2f6f86ec3ddba524e\"\n",
+    "\n",
+    "compressed_file = os.path.join(root_dir, \"Task09_Spleen.tar\")\n",
+    "data_dir = os.path.join(root_dir, \"Task09_Spleen\")\n",
+    "print(data_dir)\n",
+    "if not os.path.exists(data_dir):\n",
+    "    download_and_extract(resource, compressed_file, root_dir, md5)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Use MLflow in MONAI bundle\n",
+    "\n",
+    "In this part, we take the [spleen segmentation bundle](https://github.com/Project-MONAI/model-zoo/tree/dev/models/spleen_ct_segmentation) as an example to show how to enable MLflow in a bundle. There are typically two ways to enable MLflow in a bundle training process. The easiest is to add `--tracking \"mlflow\"` at the end of the command line; extra parameters such as `tracking_uri` and `experiment_name` can be added in the same way. The second is to pass a JSON config file as input, in which users can define their own MLflow settings."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Download spleen segmentation bundle"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "monai.bundle.download(name=\"spleen_ct_segmentation\", version=\"0.3.7\", bundle_dir=\"./\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Run spleen bundle with MLflow parameter"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The command in the next cell is the easiest way to run the spleen segmentation training bundle with MLflow. Please change `--dataset_dir` to the path of your own dataset. The parameter `--tracking \"mlflow\"` at the end of the original command enables MLflow during training. The parameters `--tracking_uri`, `--experiment_name` and `--run_name` can also be added to change the tracking URI, experiment name and run name used by MLflow. Multi-GPU training enables MLflow in the same way as single-GPU training: add `--tracking \"mlflow\"` to the end of the command line.\n",
+    "\n",
+    "An `mlruns` folder will be created in the `spleen_ct_segmentation/eval` folder during the run. Running the command `mlflow ui` in that folder starts a web UI for tracking. By default, the address is `http://127.0.0.1:5000`. If the port or host address conflicts with another service, pass new values through the `--port` and `--host` parameters.\n",
+    "Here is the tracking result.\n",
+    "![image](./extra_pics/mlflow_default_result.png)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!cd spleen_ct_segmentation;python -m monai.bundle run training \\\n",
+    "    --meta_file configs/metadata.json \\\n",
+    "    --config_file configs/train.json \\\n",
+    "    --logging_file configs/logging.conf \\\n",
+    "    --bundle_root ./ \\\n",
+    "    --dataset_dir /workspace/data/medical/Task09_Spleen \\\n",
+    "    --train#trainer#max_epochs 10 \\\n",
+    "    --tracking \"mlflow\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Run spleen bundle with a MLflow config file"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The other way to run MLflow with a bundle is to pass a JSON config file to the `--tracking` parameter. This file should define an `MLFlowHandler`, the handler that brings MLflow into a MONAI bundle, to enable tracking. More parameters and details can be set this way. An example JSON file named `mlflow_example.json` is provided in this folder for reference. When writing the config JSON for a multi-GPU environment, use the `_disabled_` parameter as shown in the example so that MLflow runs only on the first GPU (rank 0).\n",
+    "\n",
+    "The next cell contains a command to run the spleen segmentation training with the given config JSON. When the experiment finishes, it is logged as shown below. It differs from the default run in its `run_name` and `parameters`, since we changed these in the config file.\n",
+    "\n",
+    "![image](./extra_pics/mlflow_config_result.png)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!cd spleen_ct_segmentation;python -m monai.bundle run training \\\n",
+    "    --meta_file configs/metadata.json \\\n",
+    "    --config_file configs/train.json \\\n",
+    "    --logging_file configs/logging.conf \\\n",
+    "    --bundle_root ./ \\\n",
+    "    --dataset_dir /workspace/data/medical/Task09_Spleen \\\n",
+    "    --train#trainer#max_epochs 10 \\\n",
+    "    --tracking ../mlflow_example.json"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Run parsed spleen segmentation bundle with mlflow_handler"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this part, we use the parsed trainer from the spleen bundle to show how to attach an `MLFlowHandler` to a MONAI engine in Python code. Users who write their own workflow in Python from scratch can also refer to this part to add the handler.\n",
+    "The recorded results are shown below:\n",
+    "![image](./extra_pics/mlflow_python_result.png)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "tracking_uri = \"./spleen_ct_segmentation/eval/mlruns\"\n",
+    "ml_tracking = MLFlowHandler(\n",
+    "    tracking_uri=tracking_uri,\n",
+    "    experiment_name=\"ParsedExperiment\",\n",
+    "    run_name=\"Parsed1\",\n",
+    "    tag_name=\"train_loss\",\n",
+    "    iteration_log=True,\n",
+    "    epoch_log=True,\n",
+    "    output_transform=monai.handlers.from_engine([\"loss\"], first=True),\n",
+    ")\n",
+    "parser = ConfigParser()\n",
+    "parser.read_config(f=\"./spleen_ct_segmentation/configs/train.json\")\n",
+    "parser.read_meta(f=\"./spleen_ct_segmentation/configs/metadata.json\")\n",
+    "parser.update({\"train#trainer#max_epochs\": 10, \"dataset_dir\": data_dir})\n",
+    "\n",
+    "trainer = parser.get_parsed_content(\"train#trainer\")\n",
+    "ml_tracking.attach(trainer)\n",
+    "trainer.run()"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.8.13"
+  },
+  "vscode": {
+   "interpreter": {
+    "hash": "d4d1e4263499bec80672ea0156c357c1ee493ec2b1c70f0acce89fc37c4a6abe"
+   }
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
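
Note on the two command cells: both hard-code `--dataset_dir /workspace/data/medical/Task09_Spleen`, although the notebook already computes `data_dir`. One possible way to keep the path and the optional tracking flags in one place is to assemble the argument list in Python. The sketch below only builds and prints the command and does not run it. `build_run_command` is a hypothetical helper, and the flags are limited to the ones that appear in the cells and the markdown text of this notebook.

```python
import shlex


def build_run_command(dataset_dir, tracking="mlflow", max_epochs=10,
                      tracking_uri=None, experiment_name=None, run_name=None):
    """Return the argument list for the bundle training command used in the notebook."""
    cmd = [
        "python", "-m", "monai.bundle", "run", "training",
        "--meta_file", "configs/metadata.json",
        "--config_file", "configs/train.json",
        "--logging_file", "configs/logging.conf",
        "--bundle_root", "./",
        "--dataset_dir", str(dataset_dir),
        "--train#trainer#max_epochs", str(max_epochs),
        # Either the string "mlflow" or the path to a tracking JSON file.
        "--tracking", tracking,
    ]
    # Append the optional MLflow overrides only when a value was given.
    optional = {
        "tracking_uri": tracking_uri,
        "experiment_name": experiment_name,
        "run_name": run_name,
    }
    for flag, value in optional.items():
        if value is not None:
            cmd += [f"--{flag}", str(value)]
    return cmd


# "/data/Task09_Spleen" is a placeholder path; in the notebook this would be data_dir.
cmd = build_run_command("/data/Task09_Spleen", run_name="baseline")
print(shlex.join(cmd))
```

The returned list could then be passed to `subprocess.run(cmd, cwd="spleen_ct_segmentation")`, which mirrors the `cd spleen_ct_segmentation` step in the cells.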