Object Detection: How Image Size Relates to the Loss Weights in YOLOv5

flyfish

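During training, the image height and width are equal.
During inference, the image height and width may differ.
The batch size must be divisible by the number of GPUs.
If the input image size is not a multiple of 32, the program automatically rounds it up to the next multiple of 32.
The training image size is set with the following argument: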

parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='[train, test] image sizes')
opt = parser.parse_args()

The first 640 is the training image size and the second 640 is the test image size; the two may differ.

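The image size must be a multiple of 32. The following function checks whether the image width and height are divisible by 32 and, if not, rounds the size up: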

def check_img_size(img_size, s=32):
    # Verify img_size is a multiple of stride s
    new_size = make_divisible(img_size, int(s))  # ceil gs-multiple
    if new_size != img_size:
        print('WARNING: --img-size %g must be multiple of max stride %g, updating to %g' % (img_size, s, new_size))
    return new_size
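To make the rounding concrete, here is a minimal, self-contained sketch. The body of `make_divisible` is not shown above, so the version below is an assumption based on the "ceil gs-multiple" comment: it rounds up to the nearest multiple of the divisor. The sample sizes 500 and 1000 are illustrative only.

```python
import math


def make_divisible(x, divisor):
    # Assumed helper: round x up to the nearest multiple of divisor
    return math.ceil(x / divisor) * divisor


def check_img_size(img_size, s=32):
    # Verify img_size is a multiple of stride s; round up if it is not
    new_size = make_divisible(img_size, int(s))
    if new_size != img_size:
        print('WARNING: --img-size %g must be multiple of max stride %g, updating to %g'
              % (img_size, s, new_size))
    return new_size


sizes = [640, 320, 224, 500, 1000]
adjusted = [check_img_size(x) for x in sizes]
print(adjusted)  # 500 and 1000 are rounded up, the others are unchanged
```

Under this assumption, 500 becomes 512 and 1000 becomes 1024, while 640, 320 and 224 are already multiples of 32 and pass through untouched.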

If the --img-size argument is [640, 640]:
opt.img_size: [640, 640]

If the --img-size argument is [640, 320]:

opt.img_size: [640, 320]

If only a single value is given, as in --img-size 1280, it is used for both sizes:
opt.img_size: [1280, 1280]

The two values are assigned to imgsz and imgsz_test respectively:

imgsz: 1280
imgsz_test: 1280
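The following sketch shows one way a single `--img-size` value can end up as a two-element list. With `nargs='+'`, argparse alone returns `[1280]`; the `extend` step that repeats the last value is an assumption about what the training script does before the sizes are used, and `parse_img_size` is a hypothetical helper name.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640],
                    help='[train, test] image sizes')


def parse_img_size(argv):
    # Hypothetical helper: parse argv and pad the list to two entries
    opt = parser.parse_args(argv)
    img_size = list(opt.img_size)  # copy so the default list is not mutated
    # Assumed step: repeat the last value until there are (train, test) sizes
    img_size.extend([img_size[-1]] * (2 - len(img_size)))
    return img_size


for argv in ([], ['--img-size', '640', '320'], ['--img-size', '1280']):
    imgsz, imgsz_test = parse_img_size(argv)
    print(argv, '->', imgsz, imgsz_test)
```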

YOLOv5 has three loss terms: box, obj, and cls.

The code below adjusts the weight of each of the three losses.
train.py

# Model parameters
hyp['box'] *= 3. / nl  # scale to layers
hyp['cls'] *= nc / 80. * 3. / nl  # scale to classes and layers
hyp['obj'] *= (imgsz / 640) ** 2 * 3. / nl  # scale to image size and layers

nl is the number of detection layers, which is 3 here.

nl = model.model[-1].nl  # number of detection layers (used for scaling hyp['obj'])
imgsz, imgsz_test = [check_img_size(x, gs) for x in opt.img_size]  # verify imgsz are gs-multiples

Here are the resulting weights for image sizes of 1280, 640, 320, and 224:

Image sizes 1280

nl: 3
hyp['box']: 0.05
hyp['obj']: 4.0
hyp['cls']: 0.5

Image sizes 640

nl: 3
hyp['box']: 0.05
hyp['obj']: 1.0
hyp['cls']: 0.5

Image sizes 320

nl: 3
hyp['box']: 0.05
hyp['obj']: 0.25
hyp['cls']: 0.5

Image sizes 224

nl: 3
hyp['box']: 0.05
hyp['obj']: 0.12249999999999998
hyp['cls']: 0.5
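The values above can be reproduced with a small standalone sketch of the scaling code. The base hyperparameters (box=0.05, cls=0.5, obj=1.0) and nc=80 are assumptions inferred from the printed results, and `scale_hyp` is a hypothetical helper, not a function from the repository.

```python
def scale_hyp(imgsz, nl=3, nc=80, box=0.05, cls=0.5, obj=1.0):
    # Assumed base values: box=0.05, cls=0.5, obj=1.0, nc=80 classes
    return {
        'box': box * (3. / nl),                      # scale to layers
        'cls': cls * (nc / 80. * 3. / nl),           # scale to classes and layers
        'obj': obj * ((imgsz / 640) ** 2 * 3. / nl), # scale to image size and layers
    }


for size in (1280, 640, 320, 224):
    print('Image sizes', size, scale_hyp(size))
```

Only the obj weight depends on the image size, and it grows with the square of `imgsz / 640`: doubling the side length from 640 to 1280 quadruples the weight, while box and cls stay fixed as long as nl and nc do not change.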

When the loss is computed, each term is multiplied by its own weight.
loss.py

lbox *= self.hyp['box']
lobj *= self.hyp['obj']
lcls *= self.hyp['cls']
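As a rough illustration of the effect, the sketch below applies the three weights to made-up raw loss values and adds the results. Summing the weighted terms into one scalar is an assumption here, `weighted_loss` is a hypothetical helper, and the raw values of 1.0 are purely illustrative.

```python
def weighted_loss(lbox, lobj, lcls, hyp):
    # Multiply each raw loss term by its weight, then sum (summation is assumed)
    lbox *= hyp['box']
    lobj *= hyp['obj']
    lcls *= hyp['cls']
    return lbox + lobj + lcls


hyp_640 = {'box': 0.05, 'obj': 1.0, 'cls': 0.5}   # weights listed for image size 640
hyp_1280 = {'box': 0.05, 'obj': 4.0, 'cls': 0.5}  # weights listed for image size 1280

total_640 = weighted_loss(1.0, 1.0, 1.0, hyp_640)
total_1280 = weighted_loss(1.0, 1.0, 1.0, hyp_1280)
print(total_640, total_1280)
```

With identical raw terms, the objectness loss accounts for a much larger share of the total at 1280 than at 640, because only its weight changes with image size.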

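During training, boxes smaller than 2 pixels are filtered out. More precisely, a label is kept if either its width or its height is at least 2 pixels, and a warning reports how many labels have a side shorter than 3 pixels: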

# Filter
i = (wh0 < 3.0).any(1).sum()
if i:
    print(f'{prefix}WARNING: Extremely small objects found. {i} of {len(wh0)} labels are < 3 pixels in size.')
wh = wh0[(wh0 >= 2.0).any(1)]  # filter > 2 pixels
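The row-wise `any(1)` logic is easy to misread, so here is a plain-Python sketch of the same two conditions on a few made-up (width, height) pairs. It deliberately avoids NumPy; the list comprehensions are meant to mirror the array expressions above, and the sample boxes are illustrative only.

```python
# Made-up label sizes in pixels: (width, height)
wh0 = [(1.0, 1.5), (1.5, 4.0), (2.5, 10.0), (8.0, 6.0)]

# Mirrors (wh0 < 3.0).any(1).sum(): count labels with ANY side under 3 pixels
n_tiny = sum(1 for w, h in wh0 if w < 3.0 or h < 3.0)

# Mirrors wh0[(wh0 >= 2.0).any(1)]: keep labels with ANY side of at least 2 pixels
wh = [(w, h) for w, h in wh0 if w >= 2.0 or h >= 2.0]

if n_tiny:
    print(f'WARNING: Extremely small objects found. '
          f'{n_tiny} of {len(wh0)} labels are < 3 pixels in size.')
print(wh)
```

Note that a box such as (1.5, 4.0) triggers the warning but survives the filter, since only boxes with both sides under 2 pixels are dropped.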