Update autorunner notebook #1044
Merged
merged 2 commits into from
Nov 16, 2022
202 changes: 76 additions & 126 deletions auto3dseg/notebooks/auto_runner.ipynb
@@ -8,24 +8,30 @@
"\n",
"This notebook will introduce `AutoRunner`, the interface to run the Auto3Dseg pipeline with minimal user inputs.\n",
"\n",
"## 1. Set up environment, imports and datasets\n",
"### 1.1 Set up Environment"
"Specifically, it will show the features below:\n",
"1. Use `AutoRunner` with an input config file `input.yaml` example\n",
"2. How to prepare an `input.yaml`\n",
"3. How to configure the input/output folders\n",
"4. How to set the internal parameters of **Auto3DSeg** components\n",
"5. How to apply hyper parameter optimization\n",
"\n",
"## Setup environment"
]
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!python -c \"import monai\" || pip install -q \"monai-weekly[nibabel]\""
"!python -c \"import monai\" || pip install -q \"monai-weekly[nibabel, nni, tqdm, cucim, yaml, optuna]\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1.2 Set up imports"
"## Setup imports"
]
},
{
@@ -44,10 +50,9 @@
],
"source": [
"import os\n",
"import tempfile\n",
"import torch\n",
"\n",
"from pathlib import Path\n",
"\n",
"from monai.bundle.config_parser import ConfigParser\n",
"from monai.apps import download_and_extract\n",
"\n",
@@ -59,55 +64,35 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1.3 Download public datasets"
"## Download dataset"
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Task04_Hippocampus.tar: 27.1MB [00:15, 1.88MB/s] "
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"2022-10-18 08:11:37,235 - INFO - Downloaded: Task04_Hippocampus.tar\n",
"2022-10-18 08:11:37,235 - INFO - Expected md5 is None, skip md5 check for file Task04_Hippocampus.tar.\n",
"2022-10-18 08:11:37,236 - INFO - Writing into directory: ..\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n"
]
}
],
"outputs": [],
"source": [
"root = str(Path(\".\"))\n",
"directory = os.environ.get(\"MONAI_DATA_DIRECTORY\")\n",
"root_dir = tempfile.mkdtemp() if directory is None else directory\n",
"print(root_dir)\n",
"\n",
"msd_task = \"Task04_Hippocampus\"\n",
"resource = \"https://msd-for-monai.s3-us-west-2.amazonaws.com/\" + msd_task + \".tar\"\n",
"compressed_file = os.path.join(root, msd_task + \".tar\")\n",
"if os.path.exists(root):\n",
" download_and_extract(resource, compressed_file, root)\n",
"\n",
"dataroot = os.path.join(root, msd_task)\n",
"compressed_file = os.path.join(root_dir, msd_task + \".tar\")\n",
"dataroot = os.path.join(root_dir, msd_task)\n",
"if not os.path.exists(dataroot):\n",
" download_and_extract(resource, compressed_file, root_dir)\n",
"\n",
"datalist_file = os.path.join(\"..\", \"tasks\", \"msd\", msd_task, \"msd_\" + msd_task.lower() + \"_folds.json\")"
]
},
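The download cell resolves its data directory from the `MONAI_DATA_DIRECTORY` environment variable and falls back to a temporary directory. A minimal, stdlib-only sketch of that pattern follows. The helper names `resolve_root_dir` and `needs_download` are hypothetical (they are not part of MONAI), and the actual `download_and_extract` call is deliberately left out.

```python
import os
import tempfile


def resolve_root_dir(env=None):
    """Return the directory named by MONAI_DATA_DIRECTORY, or a fresh temp dir if unset."""
    env = os.environ if env is None else env
    directory = env.get("MONAI_DATA_DIRECTORY")
    return tempfile.mkdtemp() if directory is None else directory


def needs_download(dataroot):
    """Hypothetical guard: fetch the archive only when the extracted folder is missing."""
    return not os.path.exists(dataroot)


# An empty mapping simulates an environment where the variable is not set.
root_dir = resolve_root_dir({})
dataroot = os.path.join(root_dir, "Task04_Hippocampus")
print(root_dir, needs_download(dataroot))
```

Setting `MONAI_DATA_DIRECTORY` keeps the dataset across runs, whereas the temporary-directory fallback creates a new location each time, so the archive would be fetched again.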
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1.4 Prepare a input YAML configuration"
"## Prepare an input YAML configuration"
]
},
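The data source configuration in this notebook marks `modality`, `datalist`, and `dataroot` as required, while `name` and `task` are optional. As a rough illustration, a pre-flight check for those keys could look like the sketch below. `missing_required_keys` is a hypothetical helper, the paths are placeholders, and `AutoRunner` may perform its own validation differently.

```python
# Keys the notebook comments mark as required.
REQUIRED_KEYS = ("modality", "datalist", "dataroot")


def missing_required_keys(cfg):
    """Return the required keys that are absent from a data source config dict."""
    return [key for key in REQUIRED_KEYS if key not in cfg]


data_src_cfg = {
    "name": "Task04_Hippocampus",  # optional, for your own records
    "task": "segmentation",  # optional, for your own records
    "modality": "MRI",  # required
    "datalist": "msd_task04_hippocampus_folds.json",  # required (placeholder path)
    "dataroot": "./Task04_Hippocampus",  # required (placeholder path)
}

missing = missing_required_keys(data_src_cfg)
if missing:
    raise KeyError(f"data source config is missing required keys: {missing}")
```

Checking the dictionary before it is exported to `input.yaml` surfaces a missing key early, rather than partway through the pipeline.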
{
@@ -117,8 +102,8 @@
"outputs": [],
"source": [
"data_src_cfg = {\n",
" \"name\": msd_task, # optional\n",
" \"task\": \"segmentation\", # optional\n",
" \"name\": msd_task, # optional, it is only for your own record\n",
" \"task\": \"segmentation\", # optional, it is only for your own record\n",
" \"modality\": \"MRI\", # required\n",
" \"datalist\": datalist_file, # required\n",
" \"dataroot\": dataroot, # required\n",
@@ -131,7 +116,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Run the Auto3DSeg pipeline in a few lines of code\n",
"## Run the Auto3DSeg pipeline in a few lines of code\n",
"\n",
"Below is the typical usage of AutoRunner\n",
"```python\n",
@@ -143,26 +128,14 @@
"\n",
"If the user would like to perform a full training in the tutorial, it is recommended to uncomment the `runner.run()` appended at the end of each code block.\n",
"\n",
"### 2.1 Use the default setting"
"## Use the default setting with the input YAML file"
]
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2022-10-18 08:11:37,523 - INFO - ./work_dir does not exists. Creating...\n",
"2022-10-18 08:11:37,524 - INFO - ./work_dir created to save all results\n",
"2022-10-18 08:11:37,524 - INFO - Loading ./input.yaml for AutoRunner and making a copy in /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/input.yaml\n",
"2022-10-18 08:11:37,531 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output will be used to save ensemble predictions\n",
"2022-10-18 08:11:37,533 - INFO - Directory /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output is created to save ensemble predictions\n"
]
}
],
"outputs": [],
"source": [
"runner = AutoRunner(input=input)\n",
"# runner.run()"
@@ -172,23 +145,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2.2 Use the dictionary instead of a YAML file as the input"
"## Use the default setting with the dictionary instead of the YAML file as the input"
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2022-10-18 08:11:37,674 - INFO - Work directory ./work_dir is used to save all results\n",
"2022-10-18 08:11:37,676 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output will be used to save ensemble predictions\n"
]
}
],
"outputs": [],
"source": [
"runner = AutoRunner(input=data_src_cfg)\n",
"# runner.run()"
@@ -198,8 +162,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3 Customize and configure the Auto3Dseg\n",
"### 3.1 Set your working directory"
"## Customize working directory\n",
"`AutoRunner` lets the user save all intermediate and final results in a user-specified location.\n",
"Here we use `./my_workspace` as an example"
]
},
{
@@ -228,9 +193,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.2 Use cached result to save computation time\n",
"## Customize result caching\n",
"\n",
"AutoRunner saves intermediate results by default. The user can choose whether it uses the cached results or restart from scratch.\n",
"AutoRunner saves intermediate results by default to save computation time.\n",
"The user can choose whether to use the cached results or restart from scratch.\n",
"\n",
"If the users want to start from scratch, they can set `not_use_cache` to True"
]
@@ -265,7 +231,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3.3 Output Ensemble Result\n",
"## Customize the output folder to save ensemble result\n",
"\n",
"AutoRunner will perform inference on the testing data specified by the `datalist` in the data source config input. The inference result will be written to the `ensemble_output` folder under the working directory in the form of `nii.gz`. The user can choose the format by adding keyword arguments to the AutoRunner. A list of arguments can be found in the [MONAI transforms documentation](https://docs.monai.io/en/stable/transforms.html#saveimage)."
]
@@ -294,8 +260,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4 Setting Auto3DSeg internal parameters\n",
"### 4.1 Change the number of folds for cross-validation"
"## Setting Auto3DSeg internal parameters\n",
"`Auto3DSeg` has four steps: data analysis, algorithm generation, training, and ensemble. Users can configure the internal parameters of the `AutoRunner` object to customize some steps in the pipeline.\n",
"\n",
"Below, we begin the experiments with a smaller number of cross-validation folds. The default is 5 in the algorithm but we set it to 2 here:"
]
},
{
@@ -323,41 +291,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4.2 Customize traininig parameters by override the default values"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2022-10-18 08:11:38,312 - INFO - Work directory ./work_dir is used to save all results\n",
"2022-10-18 08:11:38,314 - INFO - Loading ./input.yaml for AutoRunner and making a copy in /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/input.yaml\n",
"2022-10-18 08:11:38,320 - INFO - The output_dir is not specified. /workspace/monai/tutorials-in-dev/auto3dseg/notebooks/work_dir/ensemble_output will be used to save ensemble predictions\n"
]
}
],
"source": [
"runner = AutoRunner(input=input)\n",
"# Note: among the provided bundles, most networks takes \"num_iterations\" to control the training iterations except segresnet\n",
"train_param = {\"num_iterations\": 8}\n",
"runner.set_training_params(params=train_param)\n",
"# runner.run()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 4.2.1 A common set of training parameter for all algorithm templates\n",
"## Customize training parameters by overriding the default values\n",
"\n",
"Note: This is for demo purpose. The user doesn't need to specify this training params.\n",
"`set_training_params` in `AutoRunner` provides an interface to change all algorithms' training parameters in one line. \n",
"\n",
"**Auto3DSeg** uses bundle templates to perform training, validation, and inference. The number of epochs/iterations of training is specified by the config files in each template. While we can override them, it is also noted that some bundle templates may use \"num_iterations\" and other may use \"num_epochs\" to iterate. Below is code-block to convert num_epoch to iteration style and override all algorithms with the same training parameters for 1-GPU/2-GPU machine. "
"Note: **Auto3DSeg** uses bundle templates to perform training, validation, and inference. The number of training epochs/iterations is specified by the config files in each template. These values can be overridden, but note that some bundle templates use `num_iterations` while others use `num_epochs` to control the training length.\n",
"\n",
"For demo purposes, the code block below converts the epoch count (`num_epoch`) to an iteration count and overrides all algorithms with the same training parameters for a 1-GPU/2-GPU machine.\n"
]
},
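Because some templates count iterations and others count epochs, a single override dictionary has to carry both. The sketch below shows one way the conversion could be written, assuming that one iteration consumes `images_per_batch` images on each GPU. The function name `epochs_to_iterations`, the example dataset size, and the choice of half the iterations for warm-up are all assumptions made for illustration and are not taken from the notebook.

```python
def epochs_to_iterations(num_epochs, num_train_images, images_per_batch, num_gpus=1):
    """Convert an epoch budget into an iteration count (hypothetical helper).

    Assumes each iteration processes images_per_batch images on every GPU.
    """
    if min(num_epochs, num_train_images, images_per_batch, num_gpus) < 1:
        raise ValueError("all arguments must be positive integers")
    # Integer division avoids float rounding; never return fewer than 1 iteration.
    return max(1, (num_epochs * num_train_images) // (images_per_batch * num_gpus))


num_epoch = 2
n_iter = epochs_to_iterations(num_epoch, num_train_images=200, images_per_batch=2, num_gpus=1)

# One dictionary that serves both iteration-style and epoch-style templates.
train_param = {
    "num_iterations": n_iter,
    "num_epochs": num_epoch,
    "num_warmup_iterations": n_iter // 2,  # assumed ratio, for illustration only
}
print(train_param)
```

With two GPUs the same epoch budget yields half as many iterations, which is why the GPU count enters the formula.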
{
@@ -384,6 +324,7 @@
" \"num_epochs\": num_epoch,\n",
" \"num_warmup_iterations\": n_iter_val,\n",
"}\n",
"runner = AutoRunner(input=input)\n",
"runner.set_training_params(params=train_param)\n",
"# runner.run()\n"
]
@@ -392,7 +333,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4.3 Customize the ensemble method (mean vs. majority voting)"
"## Customize the ensemble method\n",
"\n",
"There are two supported methods: \"AlgoEnsembleBestN\" and \"AlgoEnsembleBestByFold\""
]
},
{
@@ -420,7 +363,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 4.4 Customize the inference parameters by override the default values"
"## Customize the inference parameters by overriding the default values"
]
},
{
@@ -454,12 +397,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5 Train model with HPO (NNI Grid-search)\n",
"### 5.1 Apply HPO to search hyper-parameter in Auto3DSeg\n",
"## Train model with HPO (NNI Grid-search)\n",
"\n",
"Note: Auto3DSeg supports hyper parameter optimization (HPO) via NNI and Optuna backends. Notebook of how to use these modules can be found in this directory.\n",
"AutoRunner supports NNI backend with a grid search method via automatically generating a the NNI config and run `nnictl` commands in subprocess.\n",
"Note: to run the HPO, you need to ensure the development environment has `nni` package. Please refer to the [MONAI Installation Guide](https://docs.monai.io/en/stable/installation.html#installing-the-recommended-dependencies) for how to install the recommended dependencies."
"**Auto3DSeg** supports hyperparameter optimization (HPO) via the `NNI` and `Optuna` backends.\n",
"`AutoRunner` supports the `NNI` backend with a grid search method by automatically generating the `NNI` config and running `nnictl` commands in a subprocess.\n",
"\n",
"Note: to run the HPO, the development environment must have the `nni` package installed.\n",
"Please refer to the [MONAI Installation Guide](https://docs.monai.io/en/stable/installation.html#installing-the-recommended-dependencies) for how to install the recommended dependencies."
]
},
{
@@ -488,9 +432,10 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5.2 Override the templated values\n",
"## Override the templated values\n",
"\n",
"The default `NNI` config that `AutoRunner` uses looks like the one below. Users can override some of the parameters via the `set_hpo_params` interface:\n",
"\n",
"AutoRunner uses the following NNI config in its HPO module\n",
"```python\n",
"default_nni_config = {\n",
" \"trialCodeDirectory\": \".\",\n",
@@ -501,9 +446,7 @@
" \"tuner\": {\"name\": \"GridSearch\"},\n",
" \"trainingService\": {\"platform\": \"local\", \"useActiveGpu\": True},\n",
"}\n",
"```\n",
"\n",
"It can be override by setting the hpo parameters"
"```"
]
},
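To make the idea of overriding templated values concrete, here is a small sketch of a shallow dictionary merge. It uses only the config keys visible in this diff and does not claim to reproduce how `set_hpo_params` merges values internally, which is not shown here. The helper `override_config` and the override values are hypothetical.

```python
# Subset of the default config: only the keys visible in the diff are listed.
default_nni_config = {
    "trialCodeDirectory": ".",
    "tuner": {"name": "GridSearch"},
    "trainingService": {"platform": "local", "useActiveGpu": True},
}


def override_config(defaults, overrides):
    """Return a new dict where top-level keys in overrides replace those in defaults."""
    merged = dict(defaults)  # copy so the defaults stay untouched
    merged.update(overrides)
    return merged


# Illustrative override: replace the whole trainingService entry.
hpo_params = {"trainingService": {"platform": "local", "useActiveGpu": False}}
nni_config = override_config(default_nni_config, hpo_params)
print(nni_config)
```

Note that a shallow merge replaces a nested entry such as `trainingService` as a whole, so an override has to restate every sub-key it wants to keep.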
{
@@ -534,7 +477,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 6 Conclusion\n",
"For more details about the usage of **Auto3DSeg** HPO features, please check the [Auto3DSeg NNI Notebook](./hpo_nni.ipynb) and [Auto3DSeg Optuna Notebook](./hpo_optuna.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conclusion\n",
"\n",
"Here we demonstrate how to use the AutoRunner APIs to customize your **Auto3DSeg** pipeline with minimal inputs. Don't forget you need to execute the `run` command to start the training and make everything take effect.\n",
"\n",
@@ -546,7 +496,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.13 ('base')",
"display_name": "Python 3.8.10 64-bit",
"language": "python",
"name": "python3"
},
@@ -560,12 +510,12 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
"version": "3.8.10"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "d4d1e4263499bec80672ea0156c357c1ee493ec2b1c70f0acce89fc37c4a6abe"
"hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1"
}
}
},