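Discussion is welcome; you can reach me through the WeChat contact card that the platform provides at the end of this post.
1-Environment Setup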
https://github.com/MIC-DKFZ/nnUNet/blob/master/documentation/installation_instructions.md
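1.1 Basic Requirements
| Category | Requirements and notes |
| --- | --- |
| Operating system | Linux (Ubuntu 18.04/20.04/22.04; CentOS, RHEL), Windows, and macOS are supported. |
| Device support | GPU (recommended), CPU, and Apple M1/M2 are supported (Apple MPS does not currently implement 3D convolutions, so you may need to use the CPU on these devices). |
1.2 Hardware Requirements
Training
| Component | Recommended configuration |
| --- | --- |
| Device | A GPU is recommended; training on a CPU or on MPS (Apple M1/M2) takes extremely long. |
| GPU | At least 10 GB of VRAM (common models: RTX 2080 Ti, RTX 3080/3090, or RTX 4080/4090). |
| CPU | A high-performance CPU with at least 6 cores (12 threads). The requirement depends on data augmentation, the number of input channels, and the number of target structures; the faster the GPU, the stronger the CPU should be. |
Inference
| Component | Recommended configuration |
| --- | --- |
| Device | A GPU is recommended and is far faster than the alternatives, but CPU and MPS also work. |
| GPU | At least 4 GB of available VRAM. |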
1.3 Installing the Environment
Use Python 3.10, PyTorch 2.1.2+cu118, and torchvision 0.16.2+cu118, then install nnUNet and hiddenlayer:
# Each install runs in a subshell so the working directory stays where the zip files are
unzip batchgeneratorsv2-0.3.0.zip && (cd batchgeneratorsv2-0.3.0 && pip install .)
unzip nnUNet-master.zip && (cd nnUNet-master && pip install .)
unzip hiddenlayer-more_plotted_details.zip && (cd hiddenlayer-more_plotted_details && pip install .)
# https://github.com/MIC-DKFZ/batchgeneratorsv2
# pip install --upgrade git+https://github.com/FabianIsensee/hiddenlayer.git@more_plotted_details#egg=hiddenlayer
Note: if installing hiddenlayer online is difficult, download the zip archive and install from it.

1.4 Code Modifications
Modification 1: https://github.com/MIC-DKFZ/nnUNet/issues/2742

Modification 2:
https://github.com/MIC-DKFZ/nnUNet/issues/2735

# vim /root/miniconda3/lib/python3.10/site-packages/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py
from torch.cuda.amp import GradScaler
# Line 163 of https://github.com/MIC-DKFZ/nnUNet/blob/master/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py
self.grad_scaler = GradScaler() if self.device.type == 'cuda' else None
2-Data Conversion
https://github.com/MIC-DKFZ/nnUNet/blob/master/documentation/dataset_format.md
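Data conversion consists of four steps: 1-directory structure, 2-generating dataset.json, 3-setting environment variables, 4-data preprocessing.
2.1 Converting the Data Directory Structure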
https://github.com/MIC-DKFZ/nnUNet/blob/master/documentation/setting_up_paths.md
nnUNet_raw: create one subfolder per dataset, named DatasetXXX_YYY, where XXX is a three-digit identifier (e.g. 001, 002, 043, 999, ...) and YYY is the (unique) dataset name.
Create three folders, nnUNet_raw, nnUNet_preprocessed, and nnUNet_results:
cd /root/autodl-tmp && mkdir nnUNetV2
cd nnUNetV2 && mkdir nnUNet_raw && mkdir nnUNet_preprocessed && mkdir nnUNet_results
Example directory structure, with /root/autodl-tmp/nnUNetV2 as the outermost folder:
nnUNet_raw/Dataset001_NAME1
├── dataset.json
├── imagesTr
│   ├── ...
├── imagesTs
│   ├── ...
└── labelsTr
    ├── ...
nnUNet_raw/Dataset002_NAME2
├── dataset.json
├── imagesTr
│   ├── ...
├── imagesTs
│   ├── ...
└── labelsTr
    ├── ...
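nnUNet_preprocessed: the folder that stores the preprocessed data. During training, data is also read from this folder.
nnUNet_results: specifies where nnU-Net saves model weights. Downloaded pretrained models are saved here as well.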
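The naming rule and folder layout described above can be sketched in a few lines of Python. This is only an illustration: the helper names `dataset_folder_name` and `create_raw_skeleton` are hypothetical (they are not part of nnU-Net), and the 1 to 999 range check is an assumption based on the "three-digit identifier" wording.

```python
import tempfile
from pathlib import Path


def dataset_folder_name(dataset_id: int, name: str) -> str:
    """Build a DatasetXXX_YYY folder name with a zero-padded three-digit ID."""
    # Assumption: IDs are limited to 1..999 because the identifier has three digits.
    if not 1 <= dataset_id <= 999:
        raise ValueError("dataset_id must be between 1 and 999")
    return f"Dataset{dataset_id:03d}_{name}"


def create_raw_skeleton(base_dir: Path, dataset_id: int, name: str) -> Path:
    """Create the three top-level folders and one empty raw dataset folder."""
    for top in ("nnUNet_raw", "nnUNet_preprocessed", "nnUNet_results"):
        (base_dir / top).mkdir(parents=True, exist_ok=True)
    dataset_dir = base_dir / "nnUNet_raw" / dataset_folder_name(dataset_id, name)
    for sub in ("imagesTr", "imagesTs", "labelsTr"):
        (dataset_dir / sub).mkdir(parents=True, exist_ok=True)
    return dataset_dir


# Demo in a temporary directory; in this post the base would be /root/autodl-tmp/nnUNetV2.
with tempfile.TemporaryDirectory() as tmp:
    created = create_raw_skeleton(Path(tmp), 500, "Lung")
    print(created.name)
```

The images and labels still have to be copied into imagesTr, imagesTs, and labelsTr by hand or by a conversion script.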
The training inputs and ground-truth (GT) files inside a dataset folder look like this:
├── dataset.json  # explained in the next step
├── imagesTr
│   ├── la_003_0000.nii.gz
│   ├── la_004_0000.nii.gz
│   ├── ...
├── imagesTs
│   ├── la_001_0000.nii.gz
│   ├── la_002_0000.nii.gz
│   ├── ...
└── labelsTr
    ├── la_003.nii.gz
    ├── la_004.nii.gz
    ├── ...
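- imagesTr: contains the images of the training cases. nnUNet uses this data for pipeline configuration, training with cross-validation, and finding the postprocessing and the best ensemble.
- imagesTs (optional): contains the images of the test cases, used at inference time.
- labelsTr: contains the ground-truth segmentation maps of the training cases.
- dataset.json: contains the metadata of the dataset.
Note that imagesTr may hold several modalities per case, so files such as la_003_0000.nii.gz and la_003_0001.nii.gz can appear, while labelsTr holds only la_003.nii.gz for that case. Likewise, when the JSON is generated later, only la_003.nii.gz needs to be written as the name.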
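The file-naming rule above (a four-digit channel suffix on images, none on labels) can be checked before any nnU-Net command is run. The sketch below is a hypothetical helper, not an nnU-Net function; it works on plain lists of file names and assumes the channel suffix is always `_` followed by exactly four digits.

```python
import re
from collections import defaultdict

# Assumption: image files are named <case>_<4-digit channel><file_ending>.
CHANNEL_SUFFIX = re.compile(r"^(?P<case>.+)_(?P<channel>\d{4})$")


def group_by_case(image_names, label_names, file_ending=".nii.gz"):
    """Map each case to its channel indices and list cases lacking a label or an image."""
    channels = defaultdict(list)
    for name in image_names:
        if not name.endswith(file_ending):
            continue
        match = CHANNEL_SUFFIX.match(name[: -len(file_ending)])
        if match:
            channels[match["case"]].append(int(match["channel"]))
    label_cases = {n[: -len(file_ending)] for n in label_names if n.endswith(file_ending)}
    missing_labels = sorted(set(channels) - label_cases)
    missing_images = sorted(label_cases - set(channels))
    return {c: sorted(v) for c, v in channels.items()}, missing_labels, missing_images


images = ["la_003_0000.nii.gz", "la_003_0001.nii.gz", "la_004_0000.nii.gz"]
labels = ["la_003.nii.gz", "la_004.nii.gz"]
cases, missing_labels, missing_images = group_by_case(images, labels)
print(cases, missing_labels, missing_images)
```

In practice the two lists would come from `os.listdir` on imagesTr and labelsTr; both "missing" lists should be empty for a consistent dataset.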
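2.2 Generating dataset.json
The reference script provided by the authors is https://github.com/MIC-DKFZ/nnUNet/blob/master/nnunetv2/dataset_conversion/generate_dataset_json.py
The slightly modified code is shown below, for data with multiple modalities: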
from typing import Tuple, Union, List

from batchgenerators.utilities.file_and_folder_operations import save_json, join


def generate_dataset_json(output_folder: str,
                          channel_names: dict,
                          labels: dict,
                          num_training_cases: int,
                          file_ending: str,
                          citation: Union[List[str], str] = None,
                          regions_class_order: Tuple[int, ...] = None,
                          dataset_name: str = None,
                          reference: str = None,
                          release: str = None,
                          description: str = None,
                          overwrite_image_reader_writer: str = None,
                          license: str = 'Whoever converted this dataset was lazy and didn\'t look it up!',
                          converted_by: str = "Please enter your name, especially when sharing datasets with others in a common infrastructure!",
                          **kwargs):
    """
    Generates a dataset.json file in the output folder

    channel_names:
        Channel names must map the index to the name of the channel, example:
        {
            0: 'T1',
            1: 'CT'
        }
        Note that the channel names may influence the normalization scheme!! Learn more in the documentation.

    labels:
        This will tell nnU-Net what labels to expect. Important: This will also determine whether you use region-based training or not.
        Example regular labels:
        {
            'background': 0,
            'left atrium': 1,
            'some other label': 2
        }
        Example region-based training:
        {
            'background': 0,
            'whole tumor': (1, 2, 3),
            'tumor core': (2, 3),
            'enhancing tumor': 3
        }

        Remember that nnU-Net expects consecutive values for labels! nnU-Net also expects 0 to be background!

    num_training_cases: is used to double check all cases are there!

    file_ending: needed for finding the files correctly. IMPORTANT! File endings must match between images and
    segmentations!

    dataset_name, reference, release, license, description: self-explanatory and not used by nnU-Net. Just for
    completeness and as a reminder that these would be great!

    overwrite_image_reader_writer: If you need a special IO class for your dataset you can derive it from
    BaseReaderWriter, place it into nnunet.imageio and reference it here by name

    kwargs: whatever you put here will be placed in the dataset.json as well
    """
    has_regions: bool = any([isinstance(i, (tuple, list)) and len(i) > 1 for i in labels.values()])
    if has_regions:
        assert regions_class_order is not None, f"You have defined regions but regions_class_order is not set. " \
                                                f"You need that."
    # channel names need strings as keys
    keys = list(channel_names.keys())
    for k in keys:
        if not isinstance(k, str):
            channel_names[str(k)] = channel_names[k]
            del channel_names[k]

    # labels need ints as values
    for l in labels.keys():
        value = labels[l]
        if isinstance(value, (tuple, list)):
            value = tuple([int(i) for i in value])
            labels[l] = value
        else:
            labels[l] = int(labels[l])

    dataset_json = {
        'channel_names': channel_names,  # previously this was called 'modality'. I didn't like this so this is
        # channel_names now. Live with it.
        'labels': labels,
        'numTraining': num_training_cases,
        'file_ending': file_ending,
        'licence': license,
        'converted_by': converted_by
    }

    if dataset_name is not None:
        dataset_json['name'] = dataset_name
    if reference is not None:
        dataset_json['reference'] = reference
    if release is not None:
        dataset_json['release'] = release
    if citation is not None:
        dataset_json['citation'] = citation
    if description is not None:
        dataset_json['description'] = description
    if overwrite_image_reader_writer is not None:
        dataset_json['overwrite_image_reader_writer'] = overwrite_image_reader_writer
    if regions_class_order is not None:
        dataset_json['regions_class_order'] = regions_class_order

    dataset_json.update(kwargs)

    save_json(dataset_json, join(output_folder, 'dataset.json'), sort_keys=False)


# Dataset-specific settings: adjust the path, channels, labels and case count to your data
output_folder = "/root/autodl-tmp/nnUNetV2/nnUNet_raw/Dataset500_Lung"
channel_names = {"0": "CT"}
labels = {'background': 0, 'cancer': 1}
num_training_cases = 10
file_ending = ".nii.gz"
dataset_name = "Lung"

generate_dataset_json(output_folder,
                      channel_names,
                      labels,
                      num_training_cases,
                      file_ending,
                      dataset_name=dataset_name)
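The docstring above states that nnU-Net expects consecutive label values with 0 as background. A minimal sketch of a pre-flight check for that rule follows; `check_labels` is a hypothetical helper written for this post, not part of nnU-Net, and it deliberately ignores special entries such as an ignore label.

```python
def check_labels(labels: dict) -> None:
    """Raise ValueError unless 'background' is 0 and label values are consecutive."""
    if labels.get("background") != 0:
        raise ValueError("'background' must be present and mapped to 0")
    values = set()
    for value in labels.values():
        # Region-based labels are tuples/lists of ints; regular labels are single ints.
        if isinstance(value, (tuple, list)):
            values.update(int(v) for v in value)
        else:
            values.add(int(value))
    expected = set(range(max(values) + 1))
    if values != expected:
        raise ValueError(f"label values are not consecutive, missing: {sorted(expected - values)}")


check_labels({"background": 0, "cancer": 1})  # passes silently
check_labels({"background": 0, "whole tumor": (1, 2, 3), "tumor core": (2, 3), "enhancing tumor": 3})
```

Calling it right before `generate_dataset_json` catches gaps such as `{'background': 0, 'cancer': 2}` early, before preprocessing starts.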
For data with a single modality, save the script below as generate_json_SMode.py and run it:
python generate_json_SMode.py
import os
from typing import Tuple, Union, List

from batchgenerators.utilities.file_and_folder_operations import save_json, join, subfiles, maybe_mkdir_p


def generate_dataset_json(output_folder: str,
                          channel_names: dict,
                          labels: dict,
                          num_training_cases: int,
                          file_ending: str,
                          citation: Union[List[str], str] = None,
                          regions_class_order: Tuple[int, ...] = None,
                          dataset_name: str = None,
                          reference: str = None,
                          release: str = None,
                          description: str = None,
                          overwrite_image_reader_writer: str = None,
                          license: str = 'Whoever converted this dataset was lazy and didn\'t look it up!',
                          converted_by: str = "Please enter your name, especially when sharing datasets with others in a common infrastructure!",
                          **kwargs):
    """
    (docstring unchanged...)
    """
    # (function body unchanged)
    has_regions: bool = any([isinstance(i, (tuple, list)) and len(i) > 1 for i in labels.values()])
    if has_regions:
        assert regions_class_order is not None, f"You have defined regions but regions_class_order is not set. You need that."
    keys = list(channel_names.keys())
    for k in keys:
        if not isinstance(k, str):
            channel_names[str(k)] = channel_names[k]
            del channel_names[k]
    for l in labels.keys():
        value = labels[l]
        if isinstance(value, (tuple, list)):
            value = tuple([int(i) for i in value])
            labels[l] = value
        else:
            labels[l] = int(labels[l])
    dataset_json = {
        'channel_names': channel_names,
        'labels': labels,
        'numTraining': num_training_cases,
        'file_ending': file_ending,
        'licence': license,
        'converted_by': converted_by
    }
    if dataset_name is not None:
        dataset_json['name'] = dataset_name
    if reference is not None:
        dataset_json['reference'] = reference
    if release is not None:
        dataset_json['release'] = release
    if citation is not None:
        dataset_json['citation'] = citation
    if description is not None:
        dataset_json['description'] = description
    if overwrite_image_reader_writer is not None:
        dataset_json['overwrite_image_reader_writer'] = overwrite_image_reader_writer
    if regions_class_order is not None:
        dataset_json['regions_class_order'] = regions_class_order
    dataset_json.update(kwargs)
    save_json(dataset_json, join(output_folder, 'dataset.json'), sort_keys=False)


if __name__ == '__main__':
    # --- 1. Set your paths and dataset information ---
    # !! Make sure the paths below are correct !!
    output_folder = "/root/autodl-tmp/nnUNetV2/nnUNet_raw/Dataset500_Lung"
    imagesTr_folder = join(output_folder, "imagesTr")
    labelsTr_folder = join(output_folder, "labelsTr")
    channel_names = {"0": "CT"}
    labels = {'background': 0, 'cancer': 1}
    file_ending = ".nii.gz"
    dataset_name = "Lung"

    # --- 2. Scan the files and build the dataset dictionary ---
    dataset_dict = {}
    # Scan all label files in the labelsTr folder
    label_files = subfiles(labelsTr_folder, suffix=file_ending, join=False)
    for label_filename in label_files:
        # Extract the case identifier from the label file name (e.g. "lung_001" from "lung_001.nii.gz")
        identifier = label_filename[:-len(file_ending)]
        # Build the matching image file name; this assumes the image in imagesTr
        # has exactly the same file name as its label (no _0000 channel suffix)
        image_filename = f"{identifier}{file_ending}"
        # Check that the matching image file exists in the imagesTr folder
        if os.path.exists(join(imagesTr_folder, image_filename)):
            # Build the relative paths of the image and the label
            image_path = f"./imagesTr/{image_filename}"
            label_path = f"./labelsTr/{label_filename}"
            # Add this case to the dictionary
            dataset_dict[identifier] = {
                "images": [image_path],  # must be a list even when there is only one image
                "label": label_path
            }
    num_training_cases = len(dataset_dict)

    # --- 3. Call the function to generate the final dataset.json ---
    generate_dataset_json(output_folder,
                          channel_names,
                          labels,
                          num_training_cases,
                          file_ending,
                          dataset_name=dataset_name,
                          dataset=dataset_dict)  # pass in the dictionary built above
    print(f"Generated dataset.json in '{output_folder}'.")
    print(f"Found {num_training_cases} matching training cases.")
2.3 Setting Environment Variables
https://github.com/MIC-DKFZ/nnUNet/blob/master/documentation/set_environment_variables.md
Add the dataset-related environment variables:
export nnUNet_raw="/root/autodl-tmp/nnUNetV2/nnUNet_raw"
export nnUNet_preprocessed="/root/autodl-tmp/nnUNetV2/nnUNet_preprocessed"
export nnUNet_results="/root/autodl-tmp/nnUNetV2/nnUNet_results"
Alternatively, add them to .bashrc so they persist across sessions:
vim ~/.bashrc
# add the three export lines above
source ~/.bashrc
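Before moving on, it can help to confirm that the three variables are actually visible to Python and point at existing folders. The following sketch uses only the standard library; `check_nnunet_env` is a hypothetical helper name, not an nnU-Net API.

```python
import os

REQUIRED_VARS = ("nnUNet_raw", "nnUNet_preprocessed", "nnUNet_results")


def check_nnunet_env(environ=os.environ):
    """Return {variable: problem} for every variable that is unset or not a directory."""
    problems = {}
    for var in REQUIRED_VARS:
        path = environ.get(var)
        if not path:
            problems[var] = "not set"
        elif not os.path.isdir(path):
            problems[var] = f"directory does not exist: {path}"
    return problems


if __name__ == "__main__":
    found = check_nnunet_env()
    for var, problem in found.items():
        print(f"{var}: {problem}")
    if not found:
        print("All nnU-Net environment variables are set.")
```

Run it in the same shell (or after `source ~/.bashrc`) that will later launch the nnU-Net commands, since exported variables are per session.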

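2.4 Data Preprocessing
- Before preprocessing, the directory structure is:
├── nnUNetV2
│   ├── nnUNet_preprocessed
│   │   └── Dataset500_Lung  # laid out as described in 2.1
│   ├── nnUNet_raw
│   │   ├── nnUNet_cropped_data
│   │   └── nnUNet_raw_data
- Run the command below. The number 500 is the numeric ID in the dataset folder name (Dataset500_Lung):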
nnUNetv2_plan_and_preprocess -d 500 --verify_dataset_integrity
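Note: if the command fails with an out-of-memory error, there is not enough RAM. Reduce the number of preprocessing processes to 1 and preprocess each configuration separately: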
nnUNetv2_plan_and_preprocess -d 500 -c 2d -np 1
nnUNetv2_plan_and_preprocess -d 500 -c 3d_fullres -np 1
nnUNetv2_plan_and_preprocess -d 500 -c 3d_lowres -np 1

- Log output after preprocessing
Preprocessing takes about 15 minutes.

- File sizes
The raw data is 1.26 GB; the preprocessed files total 3.2 GB.
3-Model Training
Before training: change the number of epochs from 1000 to 500 [1]
https://github.com/MIC-DKFZ/nnUNet/blob/master/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py

The training/validation split information is stored in nnUNetV2/nnUNet_preprocessed/Dataset500_Lung/splits_final.json
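To see which cases land in which fold, the split file can be inspected directly. The sketch below assumes splits_final.json is a JSON list with one object per fold, each holding a "train" and a "val" list of case names; `summarize_splits` is a hypothetical helper, and the demo writes a small stand-in file rather than reading a real one.

```python
import json
import tempfile
from pathlib import Path


def summarize_splits(splits_file):
    """Return (fold index, number of training cases, validation case names) per fold."""
    # Assumption: the file is a list of {"train": [...], "val": [...]} objects.
    splits = json.loads(Path(splits_file).read_text())
    return [(i, len(s["train"]), sorted(s["val"])) for i, s in enumerate(splits)]


# Demo with a made-up two-fold file; point the function at the real
# splits_final.json path from this post to inspect an actual dataset.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "splits_final.json"
    demo.write_text(json.dumps([
        {"train": ["case_b", "case_c"], "val": ["case_a"]},
        {"train": ["case_a", "case_c"], "val": ["case_b"]},
    ]))
    for fold, n_train, val_cases in summarize_splits(demo):
        print(f"fold {fold}: {n_train} training cases, validation = {val_cases}")
```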
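3.1 3D fullres Training
3.1.1 Training Commands
- Fold = 0 (likewise 1, 2, 3, 4)
500 comes from Dataset500_Lung, and the trailing 0 selects the i-th run of K-fold cross-validation. The default is 5 folds, so the fold index i takes values in [0, 4].
For example, with 10 cases divided into 5 groups, [1,2] [3,4] [5,6] [7,8] [9,10], and K=5:
i=0: train on [1,2,3,4,5,6,7,8], validate on [9,10]
i=1: train on [1,2,3,4,5,6,9,10], validate on [7,8]
i=2: train on [1,2,3,4,7,8,9,10], validate on [5,6]
i=3: train on [1,2,5,6,7,8,9,10], validate on [3,4]
i=4: train on [3,4,5,6,7,8,9,10], validate on [1,2]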
nnUNetv2_train 500 3d_fullres 0 --npz
nnUNetv2_train 500 3d_fullres 1 --npz
nnUNetv2_train 500 3d_fullres 2 --npz
nnUNetv2_train 500 3d_fullres 3 --npz
nnUNetv2_train 500 3d_fullres 4 --npz
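The fold illustration above can be reproduced with a short, schematic function. It is a sketch of the idea only: `contiguous_folds` is a hypothetical helper, it assumes the number of cases is divisible by K, and it uses contiguous groups purely to mirror the illustration. The training log in this post lists lung_001 and lung_014 as the validation cases of fold 0, so the real assignment in splits_final.json is not contiguous.

```python
def contiguous_folds(cases, k=5):
    """Split cases into k contiguous groups; fold i validates on group k-1-i."""
    size, rest = divmod(len(cases), k)
    if rest:
        raise ValueError("this sketch assumes len(cases) is divisible by k")
    groups = [cases[j * size:(j + 1) * size] for j in range(k)]
    folds = []
    for i in range(k):
        val = groups[k - 1 - i]
        train = [c for c in cases if c not in val]
        folds.append((train, val))
    return folds


for i, (train, val) in enumerate(contiguous_folds(list(range(1, 11)))):
    print(f"i={i}: train on {train}, validate on {val}")
```

Every case appears in exactly one validation group, which is why all five folds have to be trained before the cross-validation results cover the whole training set.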

GPU memory usage: about 9 GB
nnUNetv2_train 500 3d_fullres 0 --npz
############################
INFO: You are using the old nnU-Net default plans. We have updated our recommendations. Please consider using those instead! Read more here: https://github.com/MIC-DKFZ/nnUNet/blob/master/documentation/resenc_p
############################
Using device: cuda:0
#######################################################################
Please cite the following paper when using nnU-Net:
Isensee, F., Jaeger, P. F., Kohl, S. A., Petersen, J., & Maier-Hein, K. H. (2021). nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nature methods, 18(2), 203-211.
#######################################################################
2025-09-06 20:08:11.702268: Using torch.compile...
2025-09-06 20:08:13.751839: do_dummy_2d_data_aug: False
2025-09-06 20:08:13.752347: Using splits from existing split file: /root/autodl-tmp/nnUNetV2/nnUNet_preprocessed/Dataset500_Lung/splits_final.json
2025-09-06 20:08:13.752525: The split file contains 5 splits.
2025-09-06 20:08:13.752571: Desired fold for training: 0
2025-09-06 20:08:13.752606: This split has 8 training and 2 validation cases.
using pin_memory on device 0
using pin_memory on device 0
This is the configuration used by this training:
Configuration name: 3d_fullres
{'data_identifier': 'nnUNetPlans_3d_fullres', 'preprocessor_name': 'DefaultPreprocessor', 'batch_size': 2, 'patch_size': [96, 160, 160], 'median_image_size_in_voxels': [277.0, 512.0, 512.0], 'spacing': [1.2449731, 0.7988280057907104, 0.7988280057907104], 'normalization_schemes': ['CTNormalization'], 'use_mask_for_norm': [False], 'resampling_fn_data': 'resample_data_or_seg_to_shape', 'resampling_fn_seg': 'resample_da_to_shape', 'resampling_fn_data_kwargs': {'is_seg': False, 'order': 3, 'order_z': 0, 'force_separate_z': None}, 'resampling_fn_seg_kwargs': {'is_seg': True, 'order': 1, 'order_z': 0, 'force_separate_z': None}, ng_fn_probabilities': 'resample_data_or_seg_to_shape', 'resampling_fn_probabilities_kwargs': {'is_seg': False, 'order': 1, 'order_z': 0, 'force_separate_z': None}, 'architecture': {'network_class_name': 'dynami_architectures.architectures.unet.PlainConvUNet', 'arch_kwargs': {'n_stages': 6, 'features_per_stage': [32, 64, 128, 256, 320, 320], 'conv_op': 'torch.nn.modules.conv.Conv3d', 'kernel_sizes': [[3, 3, 3], [3, 3,3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]], 'strides': [[1, 1, 1], [2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2], [1, 2, 2]], 'n_conv_per_stage': [2, 2, 2, 2, 2, 2], 'n_conv_per_stage_decoder': [2, 2, 2, 2, 2], 'convrue, 'norm_op': 'torch.nn.modules.instancenorm.InstanceNorm3d', 'norm_op_kwargs': {'eps': 1e-05, 'affine': True}, 'dropout_op': None, 'dropout_op_kwargs': None, 'nonlin': 'torch.nn.LeakyReLU', 'nonlin_kwargs': ': True}}, '_kw_requires_import': ['conv_op', 'norm_op', 'dropout_op', 'nonlin']}, 'batch_dice': True}
These are the global plan.json settings:
{'dataset_name': 'Dataset500_Lung', 'plans_name': 'nnUNetPlans', 'original_median_spacing_after_transp': [1.2449942827224731, 0.7988280057907104, 0.7988280057907104], 'original_median_shape_after_transp': [2942], 'image_reader_writer': 'SimpleITKIO', 'transpose_forward': [0, 1, 2], 'transpose_backward': [0, 1, 2], 'experiment_planner_used': 'ExperimentPlanner', 'label_manager': 'LabelManager', 'foreground_intensity_s_per_channel': {'0': {'max': 391.0, 'mean': -122.59545135498047, 'median': -2.0, 'min': -1024.0, 'percentile_00_5': -905.0, 'percentile_99_5': 197.0, 'std': 254.0592803955078}}}
2025-09-06 20:08:16.383001: Unable to plot network architecture: nnUNet_compile is enabled!
2025-09-06 20:08:16.423777:
2025-09-06 20:08:16.424186: Epoch 0
2025-09-06 20:08:16.424435: Current learning rate: 0.01
2025-09-06 20:10:34.441466: train_loss 0.0519
2025-09-06 20:10:34.442359: val_loss -0.0573
2025-09-06 20:10:34.442469: Pseudo dice [0.0]
2025-09-06 20:10:34.442565: Epoch time: 138.02 s
2025-09-06 20:10:34.442641: Yayy! New best EMA pseudo Dice: 0.0
2025-09-06 20:10:37.455573:
2025-09-06 20:10:37.455887: Epoch 1
2025-09-06 20:10:37.456042: Current learning rate: 0.0091
2025-09-06 20:11:32.261170: train_loss -0.1501
2025-09-06 20:11:32.261490: val_loss -0.3403
2025-09-06 20:11:32.261577: Pseudo dice [0.2837]
2025-09-06 20:11:32.261677: Epoch time: 54.81 s
2025-09-06 20:11:32.262234: Yayy! New best EMA pseudo Dice: 0.0284
2025-09-06 20:11:34.496846:
2025-09-06 20:11:34.497135: Epoch 2
2025-09-06 20:11:34.497264: Current learning rate: 0.00818
2025-09-06 20:12:29.591384: train_loss -0.4005
2025-09-06 20:12:29.591805: val_loss -0.587
2025-09-06 20:12:29.591911: Pseudo dice [0.7296]
2025-09-06 20:12:29.592026: Epoch time: 55.1 s
2025-09-06 20:12:29.592102: Yayy! New best EMA pseudo Dice: 0.0985
2025-09-06 20:12:32.079641:
2025-09-06 20:12:32.080288: Epoch 3
2025-09-06 20:12:32.080418: Current learning rate: 0.00725
2025-09-06 20:13:27.075207: train_loss -0.4856
2025-09-06 20:13:27.075572: val_loss -0.5756
2025-09-06 20:13:27.075674: Pseudo dice [0.6417]
2025-09-06 20:13:27.075785: Epoch time: 55.0 s
2025-09-06 20:13:27.075871: Yayy! New best EMA pseudo Dice: 0.1528
2025-09-06 20:13:29.470544:
2025-09-06 20:13:29.471013: Epoch 4
2025-09-06 20:13:29.471315: Current learning rate: 0.00631
2025-09-06 20:14:24.554845: train_loss -0.5572
2025-09-06 20:14:24.555499: val_loss -0.7106
2025-09-06 20:14:24.555609: Pseudo dice [0.7982]
2025-09-06 20:14:24.555726: Epoch time: 55.09 s
2025-09-06 20:14:24.555808: Yayy! New best EMA pseudo Dice: 0.2174
2025-09-06 20:14:27.035247:
2025-09-06 20:14:27.035626: Epoch 5
2025-09-06 20:14:27.035763: Current learning rate: 0.00536
2025-09-06 20:15:29.180974: train_loss -0.6041
2025-09-06 20:15:29.181327: val_loss -0.5621
2025-09-06 20:15:29.181425: Pseudo dice [0.5705]
2025-09-06 20:15:29.181531: Epoch time: 62.15 s
2025-09-06 20:15:29.181607: Yayy! New best EMA pseudo Dice: 0.2527
2025-09-06 20:15:31.479853:
2025-09-06 20:15:31.480425: Epoch 6
2025-09-06 20:15:31.480557: Current learning rate: 0.00438
2025-09-06 20:16:52.389704: train_loss -0.614
2025-09-06 20:16:52.390165: val_loss -0.5788
2025-09-06 20:16:52.390282: Pseudo dice [0.7144]
2025-09-06 20:16:52.390401: Epoch time: 80.91 s
2025-09-06 20:16:52.390483: Yayy! New best EMA pseudo Dice: 0.2989
2025-09-06 20:16:54.800908:
2025-09-06 20:16:54.801170: Epoch 7
2025-09-06 20:16:54.801300: Current learning rate: 0.00338
2025-09-06 20:18:18.236024: train_loss -0.6473
2025-09-06 20:18:18.236401: val_loss -0.636
2025-09-06 20:18:18.236505: Pseudo dice [0.7225]
2025-09-06 20:18:18.236631: Epoch time: 83.44 s
2025-09-06 20:18:18.236715: Yayy! New best EMA pseudo Dice: 0.3412
2025-09-06 20:18:20.692332:
2025-09-06 20:18:20.692897: Epoch 8
2025-09-06 20:18:20.693097: Current learning rate: 0.00235
2025-09-06 20:20:09.441580: train_loss -0.6382
2025-09-06 20:20:09.442209: val_loss -0.6178
2025-09-06 20:20:09.442320: Pseudo dice [0.6854]
2025-09-06 20:20:09.442439: Epoch time: 108.75 s
2025-09-06 20:20:09.442522: Yayy! New best EMA pseudo Dice: 0.3756
2025-09-06 20:20:11.829999:
2025-09-06 20:20:11.830412: Epoch 9
2025-09-06 20:20:11.830566: Current learning rate: 0.00126
2025-09-06 20:22:00.151048: train_loss -0.6891
2025-09-06 20:22:00.151394: val_loss -0.7709
2025-09-06 20:22:00.151526: Pseudo dice [0.9049]
2025-09-06 20:22:00.151630: Epoch time: 108.32 s
2025-09-06 20:22:00.151708: Yayy! New best EMA pseudo Dice: 0.4286
2025-09-06 20:22:03.420717: Training done.
2025-09-06 20:22:03.446447: Using splits from existing split file: /root/autodl-tmp/nnUNetV2/nnUNet_preprocessed/Dataset500_Lung/splits_final.json
2025-09-06 20:22:03.446766: The split file contains 5 splits.
2025-09-06 20:22:03.446835: Desired fold for training: 0
2025-09-06 20:22:03.446889: This split has 8 training and 2 validation cases.
2025-09-06 20:22:03.447048: predicting lung_001
2025-09-06 20:22:04.228852: lung_001, shape torch.Size([1, 244, 444, 444]), rank 0
2025-09-06 20:23:36.718254: predicting lung_014
2025-09-06 20:23:37.247420: lung_014, shape torch.Size([1, 296, 476, 476]), rank 0
2025-09-06 20:26:18.876790: Validation complete
2025-09-06 20:26:18.876952: Mean Validation Dice: 0.4860391693927178
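Two columns of the log above can be re-derived from the logged numbers themselves. The sketch below is an inference from this log, not a statement about nnU-Net internals: it assumes the learning rate follows a polynomial decay `0.01 * (1 - epoch / 10) ** 0.9` (10 epochs, matching the run shown) and that the "EMA pseudo Dice" is an exponential moving average with weight 0.9 on the previous value. The function names are hypothetical.

```python
def poly_lr(epoch, num_epochs, initial_lr=0.01, exponent=0.9):
    """Polynomial learning-rate decay (assumed form, fitted to the log above)."""
    return initial_lr * (1 - epoch / num_epochs) ** exponent


def ema_series(values, momentum=0.9):
    """Exponential moving average; the first entry is taken as-is (assumption)."""
    ema = []
    for v in values:
        ema.append(v if not ema else momentum * ema[-1] + (1 - momentum) * v)
    return ema


# Values copied from the log above, epochs 0..9.
pseudo_dice = [0.0, 0.2837, 0.7296, 0.6417, 0.7982, 0.5705, 0.7144, 0.7225, 0.6854, 0.9049]
logged_ema = [0.0, 0.0284, 0.0985, 0.1528, 0.2174, 0.2527, 0.2989, 0.3412, 0.3756, 0.4286]
logged_lr = [0.01, 0.0091, 0.00818, 0.00725, 0.00631, 0.00536, 0.00438, 0.00338, 0.00235, 0.00126]

for epoch, (lr, ema) in enumerate(zip(logged_lr, ema_series(pseudo_dice))):
    print(f"epoch {epoch}: lr {poly_lr(epoch, 10):.5f} (log {lr}), "
          f"EMA {ema:.4f} (log {logged_ema[epoch]})")
```

The recomputed values agree with the log up to rounding of the printed pseudo Dice. Because the EMA starts near zero and moves only 10% per epoch, it trails far behind the per-epoch pseudo Dice in a short run like this one.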
3.2 2D Training
3.2.1 Training Commands
- Fold = 0 (likewise 1, 2, 3, 4)
nnUNetv2_train 500 2d 0 --npz
3.3 Resuming Training from a Checkpoint
Append --c to the training command:
nnUNetv2_train 500 2d 0 --npz --c
4-Model Inference
cd /root/autodl-tmp/nnUNetV2/nnUNet_raw/Dataset500_Lung
nnUNetv2_predict -i imagesTr -o imagesTr_3d_fullres_output -c 3d_fullres -d 500
nnUNetv2_predict -i imagesTr -o imagesTr_2d_output -c 2d -d 500
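Appendix: FAQ
- How to debug when training appears to "hang"
Edit line 989 of https://github.com/MIC-DKFZ/nnUNet/blob/master/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py and insert two lines, print("The train step start!") and print("The step finished!"), before and after the forward pass and loss computation.
The modified code looks like this: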
with autocast(self.device.type, enabled=True) if self.device.type == 'cuda' else dummy_context():
    print("The train step start!")
    output = self.network(data)
    # del data
    l = self.loss(output, target)
    print("The step finished!")
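A variant of the same debugging idea is to wrap the suspect step in a small context manager that prints flushed, timed markers, so the last line in the log shows where the run stopped. This is a generic standard-library sketch; `step_marker` is a hypothetical helper, not part of nnU-Net, and the `sum(...)` call merely stands in for the forward pass and loss.

```python
import time
from contextlib import contextmanager


@contextmanager
def step_marker(name):
    """Print a flushed start marker and, on exit, the elapsed time of the wrapped step."""
    start = time.perf_counter()
    print(f"[{name}] start", flush=True)
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        print(f"[{name}] finished after {elapsed:.2f} s", flush=True)


# Stand-in workload; in the trainer this would wrap the network call and the loss.
with step_marker("train_step"):
    total = sum(i * i for i in range(100_000))
```

Flushing matters here: when output is buffered (for example when redirected to a file), an unflushed print may never appear if the process hangs right after it.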
Links:
- nnUnet blog