From 52fa355595cc1d08745c3b40dac3b5dc64b95e3f Mon Sep 17 00:00:00 2001
From: Sebastian Ziegler
Date: Mon, 12 Aug 2024 11:36:01 +0200
Subject: [PATCH] fix:pretraining_doc

---
 documentation/pretraining_and_finetuning.md | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/documentation/pretraining_and_finetuning.md b/documentation/pretraining_and_finetuning.md
index 5c3f4d0c2..627eb7bd5 100644
--- a/documentation/pretraining_and_finetuning.md
+++ b/documentation/pretraining_and_finetuning.md
@@ -16,17 +16,17 @@ how the resulting weights can then be used for initialization.
 
 Throughout this README we use the following terminology:
 
-- `pretraining dataset` is the dataset you intend to run the pretraining on (former: source dataset)
-- `target dataset` is the dataset you are interested in; the one you wish to fine tune on
+- `pretraining dataset` is the dataset you intend to run the pretraining on
+- `finetuning dataset` is the dataset you are interested in; the one you wish to fine tune on
 
 ## Training on the pretraining dataset
 
 In order to obtain matching network topologies we need to transfer the plans from one dataset to another. Since we are
-only interested in the target dataset, we first need to run experiment planning (and preprocessing) for it:
+only interested in the finetuning dataset, we first need to run experiment planning (and preprocessing) for it:
 
 ```bash
-nnUNetv2_plan_and_preprocess -d TARGET_DATASET
+nnUNetv2_plan_and_preprocess -d FINETUNING_DATASET
 ```
 
 Then we need to extract the dataset fingerprint of the pretraining dataset, if not yet available:
@@ -35,14 +35,14 @@ Then we need to extract the dataset fingerprint of the pretraining dataset, if n
 nnUNetv2_extract_fingerprint -d PRETRAINING_DATASET
 ```
 
-Now we can take the plans from the target dataset and transfer it to the pretraining dataset:
+Now we can take the plans from the finetuning dataset and transfer it to the pretraining dataset:
 
 ```bash
-nnUNetv2_move_plans_between_datasets -s PRETRAINING_DATASET -t TARGET_DATASET -sp PRETRAINING_PLANS_IDENTIFIER -tp TARGET_PLANS_IDENTIFIER
+nnUNetv2_move_plans_between_datasets -s FINETUNING_DATASET -t PRETRAINING_DATASET -sp FINETUNING_PLANS_IDENTIFIER -tp PRETRAINING_PLANS_IDENTIFIER
 ```
 
-`PRETRAINING_PLANS_IDENTIFIER` is hereby probably nnUNetPlans unless you changed the experiment planner in
-nnUNetv2_plan_and_preprocess. For `TARGET_PLANS_IDENTIFIER` we recommend you set something custom in order to not
+`FINETUNING_PLANS_IDENTIFIER` is hereby probably nnUNetPlans unless you changed the experiment planner in
+nnUNetv2_plan_and_preprocess. For `PRETRAINING_PLANS_IDENTIFIER` we recommend you set something custom in order to not
 overwrite default plans.
 
 Note that EVERYTHING is transferred between the datasets. Not just the network topology, batch size and patch size but
@@ -54,13 +54,13 @@ Note on CT normalization: Yes, also the clip values, mean and std are transferre
 Now you can run the preprocessing on the pretraining dataset:
 
 ```bash
-nnUNetv2_preprocess -d PRETRAINING_DATASET -plans_name TARGET_PLANS_IDENTIFIER
+nnUNetv2_preprocess -d PRETRAINING_DATASET -plans_name PRETRAINING_PLANS_IDENTIFIER
 ```
 
 And run the training as usual:
 
 ```bash
-nnUNetv2_train PRETRAINING_DATASET CONFIG all -p TARGET_PLANS_IDENTIFIER
+nnUNetv2_train PRETRAINING_DATASET CONFIG all -p PRETRAINING_PLANS_IDENTIFIER
 ```
 
 Note how we use the 'all' fold to train on all available data. For pretraining it does not make sense to split the data.
@@ -70,7 +70,7 @@ Note how we use the 'all' fold to train on all available data. For pretraining i
 Once pretraining is completed (or you obtain compatible weights by other means) you can use them to initialize your model:
 
 ```bash
-nnUNetv2_train TARGET_DATASET CONFIG FOLD -pretrained_weights PATH_TO_CHECKPOINT
+nnUNetv2_train FINETUNING_DATASET CONFIG FOLD -pretrained_weights PATH_TO_CHECKPOINT
 ```
 
 Specify the checkpoint in PATH_TO_CHECKPOINT.
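For quick review, the workflow as documented after this patch is sketched below, assembled from the `+` lines above. The capitalized names (`FINETUNING_DATASET`, `PRETRAINING_DATASET`, the plans identifiers, `CONFIG`, `FOLD`, `PATH_TO_CHECKPOINT`) are the documentation's placeholders, not real values; substitute your own dataset IDs, a configuration (e.g. 3d_fullres), a fold number, and the path to the pretrained checkpoint.

```bash
# Plan and preprocess the finetuning dataset (the one you ultimately care about)
nnUNetv2_plan_and_preprocess -d FINETUNING_DATASET

# Extract the fingerprint of the pretraining dataset, if not yet available
nnUNetv2_extract_fingerprint -d PRETRAINING_DATASET

# Transfer the finetuning dataset's plans to the pretraining dataset
nnUNetv2_move_plans_between_datasets -s FINETUNING_DATASET -t PRETRAINING_DATASET \
    -sp FINETUNING_PLANS_IDENTIFIER -tp PRETRAINING_PLANS_IDENTIFIER

# Preprocess and pretrain on the pretraining dataset with the transferred plans ('all' fold)
nnUNetv2_preprocess -d PRETRAINING_DATASET -plans_name PRETRAINING_PLANS_IDENTIFIER
nnUNetv2_train PRETRAINING_DATASET CONFIG all -p PRETRAINING_PLANS_IDENTIFIER

# Finetune on the finetuning dataset, initializing from the pretrained checkpoint
nnUNetv2_train FINETUNING_DATASET CONFIG FOLD -pretrained_weights PATH_TO_CHECKPOINT
```

The substance of the fix is the `nnUNetv2_move_plans_between_datasets` call: the plans are now copied from the finetuning dataset to the pretraining dataset, matching the surrounding prose, and the renamed placeholders are used consistently in the later preprocessing, pretraining, and finetuning commands.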