Semantic Segmentation with MMDetection

MMDetection is an open-source deep learning object detection toolbox from SenseTime (open-mmlab), and SIIM-ACR Pneumothorax Segmentation (SIIM below) is a computer vision competition hosted on Kaggle whose goal is to segment the location of pneumothorax. Using the SIIM competition as an example, this post walks through how to do semantic segmentation with MMDetection.

1. Installing MMDetection

The installation procedure may change over time; always follow the official repository:

https://github.com/open-mmlab/mmdetection

Prerequisites

  • OS: Linux
  • Python 3.5+
  • PyTorch 1.0+ 或 PyTorch-nightly
  • CUDA 9.0+
  • NCCL 2+
  • GCC 4.9+

Installation steps

  1. Create and activate a conda virtual environment, then install Cython

    conda create -n open-mmlab python=3.7 -y
    conda activate open-mmlab

    conda install cython
  2. Following the PyTorch website, install the matching stable/nightly PyTorch and torchvision inside the conda environment. Note that you should drop the -c flag from the install command (if there is one), otherwise the download will be very slow.

  3. Install mmcv and cocoapi in the virtual environment

  4. Clone and install MMDetection

    git clone https://github.com/open-mmlab/mmdetection.git
    cd mmdetection
    python setup.py develop
    # or "pip install -v -e ."
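
To confirm the installation succeeded, one quick optional check is to import the package and print its version from inside the open-mmlab environment:

import mmdet

# if this prints a version string, mmdet was installed and is importable
print(mmdet.__version__)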

2. Preparing COCO-format Annotations

COCO (Common Objects in COntext) is a dataset released by the Microsoft team for image recognition tasks. What we need for this competition is the COCO annotation format: MMDetection trains and tests on the data through these annotations.

The COCO dataset currently has five annotation types: Object Detection, Keypoint Detection, Stuff Segmentation, Panoptic Segmentation, and Image Captioning, all stored as JSON files. The SIIM competition uses the Stuff Segmentation (semantic segmentation) type.

Basic JSON structure

{
    "info": info,
    "licenses": [license],
    "images": [image],
    "annotations": [annotation],
}

info{
    "year": int,
    "version": str,
    "description": str,
    "contributor": str,
    "url": str,
    "date_created": datetime,
}
license{
    "id": int,
    "name": str,
    "url": str,
}
image{
    "id": int,
    "width": int,
    "height": int,
    "file_name": str,
    "license": int,
    "flickr_url": str,
    "coco_url": str,
    "date_captured": datetime,
}
  1. info: the following is an example of an info instance:
"info": {
    "description": "This is stable 1.0 version of the 2014 MS COCO dataset.",
    "url": "http:\/\/mscoco.org",
    "version": "1.0",
    "year": 2014,
    "contributor": "Microsoft COCO group",
    "date_created": "2015-01-27 09:11:52.357475"
},
  2. images is an array of image instances; the following is an example of an image instance:
{
    "license": 3,
    "file_name": "COCO_val2014_000000391895.jpg",
    "coco_url": "http:\/\/mscoco.org\/images\/391895",
    "height": 360,
    "width": 640,
    "date_captured": "2013-11-14 11:18:45",
    "flickr_url": "http:\/\/farm9.staticflickr.com\/8186\/8119368305_4e622c8349_z.jpg",
    "id": 391895
},
  3. licenses is an array of license instances; the following is an example of a license instance:
{
    "url": "http:\/\/creativecommons.org\/licenses\/by-nc-sa\/2.0\/",
    "id": 1,
    "name": "Attribution-NonCommercial-ShareAlike License"
},

The info and licenses fields are not required and can be left empty.

Stuff Segmentation annotations

The annotation format for Stuff Segmentation is the same as for Object Detection:

{
    "info": info,
    "licenses": [license],
    "images": [image],
    "annotations": [annotation],
    "categories": [category]
}

The segmentation structure introduces two field types that need further explanation: annotations and categories.

  1. The annotations field is an array of annotation instances; each annotation in turn contains a series of fields:
annotation{
    "id": int,
    "image_id": int,
    "category_id": int,
    "segmentation": RLE or [polygon],
    "area": float,
    "bbox": [x,y,width,height],
    "iscrowd": 0 or 1,
}

In the Stuff Segmentation task, segmentation is RLE-encoded and iscrowd is 0. The area field is the area covered by the mask, and bbox is a length-4 array describing the detection bounding box: x and y are the coordinates of its top-left corner, width and height its width and height. With RLE encoding, segmentation looks like this:

segmentation:
{
    'counts': [272, 2, 4, 4, 4, 4, 2, 9, 1, 2, 16, 43, 143, 24......],
    'size': [240, 320]
}

size is the [height, width] of the image, and counts holds the RLE code. The example above uses uncompressed RLE; for this segmentation task the masks must be encoded as compact RLE. The encode function from pycocotools can generate compact RLE; the following session uses an RLE string from SIIM as an example:

>>> from pycocotools.mask import encode
>>> import numpy as np
>>> from mask_function import rle2mask
>>> rle = '407576 2 1021 7 1015 10 1013 12 1011 14 1008 17 1006 19 1005 20 1003 21 1003 22 1001 23 1001 24 999 25 999 25 999 26 997 27 997 27 996 28 996 28 996 29 994 30 994 30 994 30 993 31 993 32 992 32 992 32 992 32 991 33 991 33 991 33 991 33 991 33 990 34 990 34 990 34 990 34 990 34 989 35 989 36 988 36 988 16 1 19 988 15 3 18 988 15 4 16 989 14 8 13 989 14 8 13 989 13 9 13 989 13 9 13 989 12 10 13 989 12 10 13 989 11 11 13 989 11 11 13 989 11 11 13 989 10 11 14 989 10 11 14 990 9 9 16 990 9 7 18 990 9 6 18 991 9 6 18 991 9 5 19 992 8 4 20 992 7 5 20 993 6 4 21 993 6 4 21 994 4 4 22 995 3 5 20 997 2 5 20 1005 19 1006 17 1008 15 1010 12 1015 7'
>>> mask = rle2mask(rle, 1024, 1024)
>>> mask = mask.T.astype(np.uint8) # encode expects a uint8 array
>>> segmentation = encode(np.asfortranarray(mask))
>>> segmentation
{'size': [1024, 1024], 'counts': b'hP^<2mo05J3N2N2M3N2O1N101N101N10001N100O10001N10000O101O00000O100000000O100000000O101O00\\OUQO3kn0LWQO3in0MXQO1in0N[QOOen01[QOOen00\\QO0dn00\\QO0dn0O]QO1cn0O]QO1cn0N^QO2bn0N^QO2bn0N^QO2bn0M^QO4bn0L^QO4cn0K[QO7en0IYQO9gn0GXQO9in0GWQO9in0GVQO:kn0ETQO<ln0CUQO=ln0BSQO?mn0ASQO?nn0_ORQOb0on0]ORQOa0Po0^OPQOb0Xo0O1N2N2M5KXoYa0'} # compact RLE

rle2mask is the helper function provided by the SIIM organizers for converting the competition's RLE strings into masks:

import numpy as np

def rle2mask(rle, width, height):
    mask = np.zeros(width * height)
    array = np.asarray([int(x) for x in rle.split()])
    starts = array[0::2]
    lengths = array[1::2]

    current_position = 0
    for index, start in enumerate(starts):
        current_position += start
        mask[current_position:current_position+lengths[index]] = 255
        current_position += lengths[index]

    return mask.reshape(width, height)
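
The remaining fields of an annotation can be derived from the compact RLE with pycocotools. A minimal sketch, reusing the segmentation dict produced by encode in the session above:

from pycocotools import mask as maskUtils

# mask area in pixels -- goes into the annotation's "area" field
area = float(maskUtils.area(segmentation))

# tight bounding box around the mask as [x, y, width, height] -- the "bbox" field
bbox = maskUtils.toBbox(segmentation).tolist()

# encode() returns "counts" as bytes; convert it to str before writing JSON
segmentation['counts'] = segmentation['counts'].decode('utf-8')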

Here is a complete annotation taken from the COCO 2017 stuff segmentation annotation file:

{
    "segmentation":
    {
        "counts": "Q[d04_;3L1O1O2M2O10001O0O10O11O00000000N2N2FkD3\\;Nem[6",
        "size": [371, 640]
    },
    "area": 257.0,
    "iscrowd": 0,
    "image_id": 19042,
    "bbox": [56.0, 50.0, 21.0, 16.0],
    "category_id": 127,
    "id": 20001212
},
  2. The categories field is an array of category instances describing the object classes being annotated; the category structure is as follows:

{
    "id": int,
    "name": str,
    "supercategory": str,
}

The SIIM detection target has only one class, pneumothorax (Pneumothorax), so we define a single custom category:

"categories": [
    {
        "supercategory": "Pneumothorax",
        "id": 1,
        "name": "Pneumothorax"
    }
]

That is everything you need to know about the Stuff Segmentation annotation format, so the next step is to write a script that generates the annotation file automatically; a sketch is given below. Note that the annotation file for the test set does not need the annotations field.
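
Below is a minimal sketch of such a script. It assumes the SIIM RLE csv has ImageId and EncodedPixels columns (with '-1' marking images without pneumothorax), that every DICOM has been exported as a 1024x1024 PNG named <ImageId>.png, and it reuses rle2mask from above; adjust the names and paths to your own setup.

import json

import numpy as np
import pandas as pd
from pycocotools import mask as maskUtils

from mask_function import rle2mask  # the SIIM helper shown earlier


def build_coco_annotations(csv_path, out_path, width=1024, height=1024):
    df = pd.read_csv(csv_path)
    coco = {
        'info': {},
        'licenses': [],
        'images': [],
        'annotations': [],
        'categories': [{'supercategory': 'Pneumothorax', 'id': 1, 'name': 'Pneumothorax'}],
    }
    ann_id = 1
    # one image entry per ImageId, one annotation entry per RLE string
    for img_id, (image_id, group) in enumerate(df.groupby('ImageId'), start=1):
        coco['images'].append({
            'id': img_id,
            'file_name': image_id + '.png',
            'width': width,
            'height': height,
        })
        for rle in group['EncodedPixels']:
            if str(rle).strip() == '-1':  # no pneumothorax on this image
                continue
            mask = rle2mask(rle, width, height).T.astype(np.uint8)
            encoded = maskUtils.encode(np.asfortranarray(mask))
            coco['annotations'].append({
                'id': ann_id,
                'image_id': img_id,
                'category_id': 1,
                'segmentation': {'size': encoded['size'],
                                 'counts': encoded['counts'].decode('utf-8')},
                'area': float(maskUtils.area(encoded)),
                'bbox': maskUtils.toBbox(encoded).tolist(),
                'iscrowd': 0,
            })
            ann_id += 1
    with open(out_path, 'w') as f:
        json.dump(coco, f)

For the test set, the same structure with an empty annotations list is enough.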

Arrange the prepared COCO-format dataset as shown below. The official recommendation is to create a data folder and place the dataset under it (symlinks are recommended). The generated annotation files go into the annotations folder.

mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017

To use symlinks, replace $COCO_ROOT with the root directory of your COCO-format dataset:

cd mmdetection
mkdir data
ln -s $COCO_ROOT data

3. Training the Model

Modify the model config file

Go into the configs folder and edit the config file of the model you want to use. Taking cascade_mask_rcnn_x101_64x4d_fpn_1x.py as an example, the most important parameters are explained in the comments below:

# model settings
model = dict(
    type='CascadeRCNN',
    num_stages=3,
    pretrained='open-mmlab://resnext101_64x4d',
    backbone=dict(
        type='ResNeXt',
        depth=101,
        groups=64,
        base_width=4,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_scales=[8],
        anchor_ratios=[0.5, 1.0, 2.0],
        anchor_strides=[4, 8, 16, 32, 64],
        target_means=[.0, .0, .0, .0],
        target_stds=[1.0, 1.0, 1.0, 1.0],
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type='SmoothL1Loss', beta=1.0 / 9.0, loss_weight=1.0)),
    bbox_roi_extractor=dict(
        type='SingleRoIExtractor',
        roi_layer=dict(type='RoIAlign', out_size=7, sample_num=2),
        out_channels=256,
        featmap_strides=[4, 8, 16, 32]),
    bbox_head=[
        dict(
            type='SharedFCBBoxHead',
            num_fcs=2,
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=2,  # number of classes + 1 (the background class).
                            # SIIM has a single class (pneumothorax), so 1 + 1 = 2.
                            # The same applies to every num_classes below.
            target_means=[0., 0., 0., 0.],
            target_stds=[0.1, 0.1, 0.2, 0.2],
            reg_class_agnostic=True,
            loss_cls=dict(
                type='CrossEntropyLoss',
                use_sigmoid=False,
                loss_weight=1.0),
            loss_bbox=dict(
                type='SmoothL1Loss',
                beta=1.0,
                loss_weight=1.0)),
        dict(
            type='SharedFCBBoxHead',
            num_fcs=2,
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=2,
            target_means=[0., 0., 0., 0.],
            target_stds=[0.05, 0.05, 0.1, 0.1],
            reg_class_agnostic=True,
            loss_cls=dict(
                type='CrossEntropyLoss',
                use_sigmoid=False,
                loss_weight=1.0),
            loss_bbox=dict(
                type='SmoothL1Loss',
                beta=1.0,
                loss_weight=1.0)),
        dict(
            type='SharedFCBBoxHead',
            num_fcs=2,
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=2,
            target_means=[0., 0., 0., 0.],
            target_stds=[0.033, 0.033, 0.067, 0.067],
            reg_class_agnostic=True,
            loss_cls=dict(
                type='CrossEntropyLoss',
                use_sigmoid=False,
                loss_weight=1.0),
            loss_bbox=dict(
                type='SmoothL1Loss',
                beta=1.0,
                loss_weight=1.0))
    ],
    mask_roi_extractor=dict(
        type='SingleRoIExtractor',
        roi_layer=dict(type='RoIAlign', out_size=14, sample_num=2),
        out_channels=256,
        featmap_strides=[4, 8, 16, 32]),
    mask_head=dict(
        type='FCNMaskHead',
        num_convs=4,
        in_channels=256,
        conv_out_channels=256,
        num_classes=2,
        loss_mask=dict(
            type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)))
# model training and testing settings
train_cfg = dict(
    rpn=dict(
        assigner=dict(
            type='MaxIoUAssigner',
            pos_iou_thr=0.7,
            neg_iou_thr=0.3,
            min_pos_iou=0.3,
            ignore_iof_thr=-1),
        sampler=dict(
            type='RandomSampler',
            num=256,
            pos_fraction=0.5,
            neg_pos_ub=-1,
            add_gt_as_proposals=False),
        allowed_border=0,
        pos_weight=-1,
        debug=False),
    rpn_proposal=dict(
        nms_across_levels=False,
        nms_pre=2000,
        nms_post=2000,
        max_num=2000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=[
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            mask_size=28,
            pos_weight=-1,
            debug=False),
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.6,
                neg_iou_thr=0.6,
                min_pos_iou=0.6,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            mask_size=28,
            pos_weight=-1,
            debug=False),
        dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.7,
                min_pos_iou=0.7,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            mask_size=28,
            pos_weight=-1,
            debug=False)
    ],
    stage_loss_weights=[1, 0.5, 0.25])
test_cfg = dict(
    rpn=dict(
        nms_across_levels=False,
        nms_pre=1000,
        nms_post=1000,
        max_num=1000,
        nms_thr=0.7,
        min_bbox_size=0),
    rcnn=dict(
        score_thr=0.05,
        nms=dict(type='nms', iou_thr=0.5),
        max_per_img=100,
        mask_thr_binary=0.5),
    keep_all_stages=False)
# dataset settings
dataset_type = 'CocoDataset'  # dataset type
data_root = 'data/coco/'  # dataset root directory
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
data = dict(
    imgs_per_gpu=2,  # images loaded per GPU per iteration
    workers_per_gpu=2,
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_train2017.json',  # annotation file
        img_prefix=data_root + 'train2017/',  # training image directory
        img_scale=(1024, 1024),  # image width and height
        img_norm_cfg=img_norm_cfg,
        size_divisor=32,
        flip_ratio=0,
        with_mask=True,
        with_crowd=True,
        with_label=True),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_val2017.json',
        img_prefix=data_root + 'train2017/',
        img_scale=(1024, 1024),
        img_norm_cfg=img_norm_cfg,
        size_divisor=32,
        flip_ratio=0,
        with_mask=True,
        with_crowd=True,
        with_label=True),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/instances_test2017.json',
        img_prefix=data_root + 'test2017/',
        img_scale=(1024, 1024),
        img_norm_cfg=img_norm_cfg,
        size_divisor=32,
        flip_ratio=0,
        with_mask=True,
        with_label=False,
        test_mode=True))
# optimizer
optimizer = dict(type='SGD', lr=0.0002, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
# learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=300,  # number of warmup iterations
    warmup_ratio=0.00015,
    step=[8, 11])
checkpoint_config = dict(interval=1)
# yapf:disable
log_config = dict(
    interval=50,
    hooks=[
        dict(type='TextLoggerHook'),
        # dict(type='TensorboardLoggerHook')
    ])
# yapf:enable
# runtime settings
total_epochs = 50  # number of training epochs; the SIIM training set is small, so train for more epochs
dist_params = dict(backend='nccl')
log_level = 'INFO'
work_dir = './work_dirs/cascade_mask_rcnn_x101_64x4d_fpn_1x'  # where checkpoints and logs are saved
load_from = None
resume_from = None  # checkpoint to resume from
workflow = [('train', 1)]

Modify the COCO dataset definition

Edit mmdet/datasets/coco.py and modify CLASSES. For example, the SIIM competition has only the single Pneumothorax class, so change it to:

CLASSES = ('Pneumothorax',)

Train the model

Note: the default learning rate in the config files assumes 8 GPUs and 2 img/gpu (batch size = 8 * 2 = 16). According to the linear scaling rule, you need to set the learning rate proportional to the batch size if you use a different number of GPUs or images per GPU, e.g. lr = 0.01 for 4 GPUs * 2 img/gpu and lr = 0.08 for 16 GPUs * 4 img/gpu.
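
As a minimal sketch of that rule, using the example numbers above (i.e. a base lr of 0.02 at batch size 16):

# linear scaling rule: scale the base lr by the ratio of your batch size to the reference one
base_lr, base_batch_size = 0.02, 16      # reference setup: 8 GPUs * 2 img/gpu
gpus, imgs_per_gpu = 4, 2                # your setup
lr = base_lr * (gpus * imgs_per_gpu) / base_batch_size   # -> 0.01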

Single-GPU training

python tools/train.py ${CONFIG_FILE}

Optional arguments:

  • --work_dir ${YOUR_WORK_DIR}: specify the working directory

Multi-GPU training

./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]

Optional arguments:

  • --validate [k]: run evaluation every k epochs during training (default: 1)
  • --work_dir ${YOUR_WORK_DIR}: specify the working directory
  • --resume_from ${CHECKPOINT_FILE}: resume training from the given checkpoint file

With the environment configured above, training can be started with a single command:

./tools/dist_train.sh configs/cascade_mask_rcnn_x101_64x4d_fpn_1x.py 4 --validate

Training log

2019-07-19 11:47:51,271 - INFO - Distributed training: True
2019-07-19 11:47:57,025 - INFO - load model from: open-mmlab://resnext101_64x4d
2019-07-19 11:48:04,712 - WARNING - missing keys in source state_dict: layer3.15.bn1.num_batches_tracked, layer3.4.bn3.num_batches_tracked, layer2.0.bn2.num_batches_tracked, layer3.8.bn3.num_batches_tracked, layer3.12.bn2.num_batches_tracked, layer3.0.downsample.1.num_batches_tracked, layer3.7.bn1.num_batches_tracked, layer1.1.bn2.num_batches_tracked, layer3.2.bn3.num_batches_tracked, layer4.0.downsample.1.num_batches_tracked, layer3.20.bn1.num_batches_tracked, layer3.7.bn3.num_batches_tracked, layer3.15.bn3.num_batches_tracked, layer3.19.bn1.num_batches_tracked, layer3.22.bn3.num_batches_tracked, layer4.2.bn1.num_batches_tracked, layer4.2.bn2.num_batches_tracked, layer4.0.bn3.num_batches_tracked, layer2.1.bn2.num_batches_tracked, layer3.1.bn2.num_batches_tracked, layer3.9.bn2.num_batches_tracked, layer3.3.bn2.num_batches_tracked, layer1.2.bn1.num_batches_tracked, layer1.0.bn2.num_batches_tracked, layer3.11.bn1.num_batches_tracked, layer1.0.bn1.num_batches_tracked, layer2.3.bn1.num_batches_tracked, layer3.16.bn2.num_batches_tracked, layer3.3.bn3.num_batches_tracked, layer3.14.bn1.num_batches_tracked, layer3.12.bn3.num_batches_tracked, layer3.13.bn1.num_batches_tracked, layer3.6.bn2.num_batches_tracked, layer3.18.bn1.num_batches_tracked, layer2.3.bn3.num_batches_tracked, layer3.21.bn2.num_batches_tracked, layer2.2.bn3.num_batches_tracked, layer1.1.bn3.num_batches_tracked, layer3.9.bn1.num_batches_tracked, layer3.20.bn3.num_batches_tracked, layer3.3.bn1.num_batches_tracked, layer3.8.bn1.num_batches_tracked, layer3.0.bn2.num_batches_tracked, layer3.17.bn3.num_batches_tracked, layer3.0.bn3.num_batches_tracked, layer3.18.bn2.num_batches_tracked, layer3.16.bn1.num_batches_tracked, layer3.14.bn2.num_batches_tracked, layer3.16.bn3.num_batches_tracked, layer3.17.bn2.num_batches_tracked, layer4.1.bn2.num_batches_tracked, layer3.22.bn2.num_batches_tracked, layer3.2.bn2.num_batches_tracked, layer3.19.bn3.num_batches_tracked, layer3.0.bn1.num_batches_tracked, layer1.2.bn2.num_batches_tracked, layer4.1.bn3.num_batches_tracked, layer3.12.bn1.num_batches_tracked, layer3.5.bn2.num_batches_tracked, layer2.3.bn2.num_batches_tracked, layer3.11.bn2.num_batches_tracked, layer3.18.bn3.num_batches_tracked, layer3.8.bn2.num_batches_tracked, layer3.17.bn1.num_batches_tracked, layer3.22.bn1.num_batches_tracked, layer3.20.bn2.num_batches_tracked, layer3.11.bn3.num_batches_tracked, layer3.4.bn1.num_batches_tracked, layer2.0.bn1.num_batches_tracked, layer3.1.bn1.num_batches_tracked, layer3.6.bn1.num_batches_tracked, layer2.0.downsample.1.num_batches_tracked, layer4.0.bn2.num_batches_tracked, layer1.2.bn3.num_batches_tracked, layer3.13.bn3.num_batches_tracked, layer3.13.bn2.num_batches_tracked, layer4.1.bn1.num_batches_tracked, bn1.num_batches_tracked, layer3.10.bn3.num_batches_tracked, layer2.1.bn1.num_batches_tracked, layer3.5.bn1.num_batches_tracked, layer3.6.bn3.num_batches_tracked, layer3.19.bn2.num_batches_tracked, layer3.7.bn2.num_batches_tracked, layer2.0.bn3.num_batches_tracked, layer3.15.bn2.num_batches_tracked, layer3.9.bn3.num_batches_tracked, layer3.10.bn2.num_batches_tracked, layer1.1.bn1.num_batches_tracked, layer2.2.bn2.num_batches_tracked, layer2.2.bn1.num_batches_tracked, layer3.5.bn3.num_batches_tracked, layer3.2.bn1.num_batches_tracked, layer3.1.bn3.num_batches_tracked, layer4.2.bn3.num_batches_tracked, layer3.10.bn1.num_batches_tracked, layer3.21.bn1.num_batches_tracked, layer3.21.bn3.num_batches_tracked, layer3.4.bn2.num_batches_tracked, layer2.1.bn3.num_batches_tracked, 
layer1.0.downsample.1.num_batches_tracked, layer1.0.bn3.num_batches_tracked, layer4.0.bn1.num_batches_tracked, layer3.14.bn3.num_batches_tracked

loading annotations into memory...
Done (t=0.04s)
creating index...
index created!
loading annotations into memory...
Done (t=0.03s)
creating index...
index created!
loading annotations into memory...
loading annotations into memory...
Done (t=0.05s)
creating index...
index created!Done (t=0.06s)

creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
2019-07-19 11:48:08,173 - INFO - Start running, host: root@dl-All-Series, work_dir: /home/dl/d/12siim/0715hk/mmdetection/work_dirs/cascade_mask_rcnn_x101_64x4d_fpn_1x
2019-07-19 11:48:08,174 - INFO - workflow: [('train', 1)], max: 50 epochs
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
2019-07-19 11:50:30,248 - INFO - Epoch [1][50/375] lr: 0.00003, eta: 14:45:23, time: 2.841, data_time: 0.199, memory: 10024, loss_rpn_cls: 0.7045, loss_rpn_bbox: 0.0066, s0.loss_cls: 0.5725, s0.acc: 88.9043, s0.loss_bbox: 0.0005, s0.loss_mask: 2.8232, s1.loss_cls: 0.3224, s1.acc: 74.6172, s1.loss_bbox: 0.0002, s1.loss_mask: 1.6205, s2.loss_cls: 0.1525, s2.acc: 90.1211, s2.loss_bbox: 0.0000, s2.loss_mask: 0.5753, loss: 6.7784
2019-07-19 11:52:41,648 - INFO - Epoch [1][100/375] lr: 0.00007, eta: 14:09:53, time: 2.628, data_time: 0.045, memory: 10024, loss_rpn_cls: 0.6970, loss_rpn_bbox: 0.0089, s0.loss_cls: 0.2083, s0.acc: 99.7930, s0.loss_bbox: 0.0003, s0.loss_mask: 0.7883, s1.loss_cls: 0.1744, s1.acc: 99.7988, s1.loss_bbox: 0.0001, s1.loss_mask: 0.4091, s2.loss_cls: 0.1105, s2.acc: 99.7949, s2.loss_bbox: 0.0000, s2.loss_mask: 0.2313, loss: 2.6281
2019-07-19 11:54:53,230 - INFO - Epoch [1][150/375] lr: 0.00010, eta: 13:57:00, time: 2.632, data_time: 0.041, memory: 10024, loss_rpn_cls: 0.6769, loss_rpn_bbox: 0.0073, s0.loss_cls: 0.0545, s0.acc: 99.7344, s0.loss_bbox: 0.0020, s0.loss_mask: 0.6995, s1.loss_cls: 0.0519, s1.acc: 99.7871, s1.loss_bbox: 0.0004, s1.loss_mask: 0.3705, s2.loss_cls: 0.0455, s2.acc: 99.8027, s2.loss_bbox: 0.0000, s2.loss_mask: 0.1762, loss: 2.0847
2019-07-19 11:57:06,704 - INFO - Epoch [1][200/375] lr: 0.00013, eta: 13:52:23, time: 2.669, data_time: 0.042, memory: 10024, loss_rpn_cls: 0.5973, loss_rpn_bbox: 0.0074, s0.loss_cls: 0.0639, s0.acc: 99.4219, s0.loss_bbox: 0.0106, s0.loss_mask: 0.6261, s1.loss_cls: 0.0481, s1.acc: 99.6797, s1.loss_bbox: 0.0028, s1.loss_mask: 0.3498, s2.loss_cls: 0.0360, s2.acc: 99.7715, s2.loss_bbox: 0.0004, s2.loss_mask: 0.1757, loss: 1.9182
2019-07-19 11:59:20,874 - INFO - Epoch [1][250/375] lr: 0.00017, eta: 13:49:37, time: 2.684, data_time: 0.039, memory: 10024, loss_rpn_cls: 0.3471, loss_rpn_bbox: 0.0087, s0.loss_cls: 0.0609, s0.acc: 98.8672, s0.loss_bbox: 0.0256, s0.loss_mask: 0.5709, s1.loss_cls: 0.0274, s1.acc: 99.5234, s1.loss_bbox: 0.0066, s1.loss_mask: 0.3234, s2.loss_cls: 0.0211, s2.acc: 99.7559, s2.loss_bbox: 0.0006, s2.loss_mask: 0.1756, loss: 1.5679
2019-07-19 12:01:35,584 - INFO - Epoch [1][300/375] lr: 0.00020, eta: 13:47:30, time: 2.693, data_time: 0.040, memory: 10024, loss_rpn_cls: 0.1487, loss_rpn_bbox: 0.0081, s0.loss_cls: 0.0710, s0.acc: 98.5605, s0.loss_bbox: 0.0343, s0.loss_mask: 0.5988, s1.loss_cls: 0.0186, s1.acc: 99.4336, s1.loss_bbox: 0.0084, s1.loss_mask: 0.3309, s2.loss_cls: 0.0099, s2.acc: 99.6973, s2.loss_bbox: 0.0013, s2.loss_mask: 0.1713, loss: 1.4013
2019-07-19 12:03:48,274 - INFO - Epoch [1][350/375] lr: 0.00020, eta: 13:43:39, time: 2.654, data_time: 0.044, memory: 10024, loss_rpn_cls: 0.0949, loss_rpn_bbox: 0.0086, s0.loss_cls: 0.0763, s0.acc: 98.4336, s0.loss_bbox: 0.0383, s0.loss_mask: 0.4798, s1.loss_cls: 0.0180, s1.acc: 99.4102, s1.loss_bbox: 0.0088, s1.loss_mask: 0.2716, s2.loss_cls: 0.0072, s2.acc: 99.6953, s2.loss_bbox: 0.0014, s2.loss_mask: 0.1467, loss: 1.1517
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 288/286, 7.3 task/s, elapsed: 39s, ETA: 0s

Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.18s).
Accumulating evaluation results...
DONE (t=0.04s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.002
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.002
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.005
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.005
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.005
Loading and preparing results...
DONE (t=0.01s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *segm*
DONE (t=0.19s).
Accumulating evaluation results...
DONE (t=0.04s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.001
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.001

Test the model

# single-gpu testing
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]

# multi-gpu testing
./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

Optional arguments:

  • RESULT_FILE: filename of the output results in pickle format. If not specified, the results will not be saved to a file.
  • EVAL_METRICS: items to be evaluated on the results. Allowed values: proposal_fast, proposal, bbox, segm, keypoints.
  • --show: if specified, detection results will be plotted on the images and shown in a new window. (Only usable with single-GPU testing.)

For the test set, we ran the following command:

./tools/dist_test.sh configs/cascade_mask_rcnn_x101_64x4d_fpn_1x.py work_dirs/cascade_mask_rcnn_x101_64x4d_fpn_1x/epoch_50.pth 4 --out result.pkl --eval segm

Since --out was set to result.pkl and --eval to segm, the final predictions are written to the result.pkl.segm.json file.
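
To inspect those predictions (for example, as the starting point for a SIIM submission), the segm json can be read back and each mask decoded with pycocotools. A minimal sketch, assuming the COCO-style results format (one dict per prediction with image_id, category_id, segmentation and score):

import json

from pycocotools import mask as maskUtils

with open('result.pkl.segm.json') as f:
    results = json.load(f)

# image_id refers to the ids in the "images" list of the test annotation file
best = max(results, key=lambda r: r['score'])          # highest-scoring prediction
pred_mask = maskUtils.decode(best['segmentation'])     # (height, width) uint8 binary mask
print(best['image_id'], best['score'], int(pred_mask.sum()), 'positive pixels')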
