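I don't understand a lot of the common terminology; this isn't my field and I'm still a beginner. Being a noob is the original sin, so I'll learn what I can.
1. Official explanation
See https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data, which contains this sentence: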
For training command outputs and further details please see the training section of Google Colab Notebook.
Open that notebook (reaching it may take some workarounds, you know what I mean).
Here is a summary of what the notebook says about training.
- actual training is much longer, around 300-1000 epochs, depending on your dataset
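- --cfg: selects the model file (models/yolov5s.yaml)
- --data: selects the dataset file (data/coco128.yaml)
- --weights: specifies the initial weights file (for random initialization use --weights '')
- All training results are saved to runs/exp0 for the first experiment, then runs/exp1, runs/exp2 etc. for subsequent experiments. (In my experiments the counter stopped at 10; after that it kept overwriting exp10.)
- TensorBoard is optional (I don't know how to use it yet...)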
- A Mosaic Dataloader is used for training
- View test_batch0_gt.jpg to see test batch 0 ground truth labels.
- View test_batch0_pred.jpg to see test batch 0 predictions.
- Training losses and performance metrics are saved to Tensorboard and also to a runs/exp0/results.txt logfile. results.txt is plotted as results.png after training completes.
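And that's it... Clearly not much help for a deeper understanding; it is barely enough to get by.
2. Reading the source code
All the command-line arguments are defined here.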
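The observation that the run counter sticks at exp10 would be consistent with run directories being ordered as strings rather than as numbers. That is an assumption and has not been checked against the repository. The sketch below uses two hypothetical helpers to show how a plain string sort treats exp9 as the "latest" run once exp10 exists, so the next index is computed as 10 again, while a numeric comparison moves on to 11.

```python
import re

def next_index_string_sort(names):
    """Hypothetical: take the 'last' run by plain string sorting."""
    last = sorted(names)[-1]  # 'exp9' sorts after 'exp10' as a string
    return int(re.search(r'\d+$', last).group()) + 1

def next_index_numeric(names):
    """Take the highest run number by comparing integers."""
    return max(int(re.search(r'\d+$', n).group()) for n in names) + 1

existing = ['exp%d' % i for i in range(11)]  # exp0 .. exp10

stuck = next_index_string_sort(existing)   # would reuse exp10
fixed = next_index_numeric(existing)       # would create exp11
```

If this is indeed the cause, comparing the numeric suffixes instead of the directory names would avoid the overwrite.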
if __name__ == '__main__':
    check_git_status()
    parser = argparse.ArgumentParser()
    parser.add_argument('--cfg', type=str, default='models/yolov5s.yaml', help='model.yaml path')
    parser.add_argument('--data', type=str, default='data/coco128.yaml', help='data.yaml path')
    parser.add_argument('--hyp', type=str, default='', help='hyp.yaml path (optional)')
    parser.add_argument('--epochs', type=int, default=300)
    parser.add_argument('--batch-size', type=int, default=16)
    parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640], help='train,test sizes')
    parser.add_argument('--rect', action='store_true', help='rectangular training')
    parser.add_argument('--resume', nargs='?', const='get_last', default=False,
                        help='resume from given path/to/last.pt, or most recent run if blank.')
    parser.add_argument('--nosave', action='store_true', help='only save final checkpoint')
    parser.add_argument('--notest', action='store_true', help='only test final epoch')
    parser.add_argument('--noautoanchor', action='store_true', help='disable autoanchor check')
    parser.add_argument('--evolve', action='store_true', help='evolve hyperparameters')
    parser.add_argument('--bucket', type=str, default='', help='gsutil bucket')
    parser.add_argument('--cache-images', action='store_true', help='cache images for faster training')
    parser.add_argument('--weights', type=str, default='', help='initial weights path')
    parser.add_argument('--name', default='', help='renames results.txt to results_name.txt if supplied')
    parser.add_argument('--device', default='', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
    parser.add_argument('--multi-scale', action='store_true', help='vary img-size +/- 50%%')
    parser.add_argument('--single-cls', action='store_true', help='train as single-class dataset')
    opt = parser.parse_args()
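To summarize:
- cfg, data, weights: covered above; these are the three arguments you will normally pass (the parser does give each of them a default);
- hyp: path to a file of hyperparameters (learning rate and so on); I don't need it for now;
- epochs: number of training epochs, default 300; set it to suit your dataset;
- batch-size: how many samples are fed in per step; my memory can only handle 16, so I can leave it at the default of 16;
- img-size: image size for the training and test sets (I read this as resolution), default 640,640; nargs='+' means the option accepts one or more values;
- rect: simply adding --rect sets rect to True; I have not checked the effect in detail (per the help text it enables rectangular training);
- resume: per the help text, resumes training from a given path/to/last.pt, or from the most recent run if no path is given (I first read this as "retrain with the epoch count starting over", but the help text describes continuing from a checkpoint);
- notest: only test the final epoch (so the trend during training presumably won't be visible);
- evolve: evolve the hyperparameters (hyp); worth a try;
- cache-images: cache images for faster training; worth a try;
- name: renames results.txt to results_name.txt if supplied;
- device: cuda device, i.e. 0 or 0,1,2,3 or cpu; mine already uses the GTX 1060 by default, so no change is needed;
- single-cls: train as single-class dataset; not needed for now;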
I didn't really understand the following:
- noautoanchor: disable autoanchor check
- nosave: only save final checkpoint
- bucket: gsutil bucket (apparently related to Google Cloud; I probably won't need it)
- multi-scale: vary img-size +/- 50%
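To make the three argparse patterns in the parser concrete, here is a minimal sketch that copies only the --img-size, --rect and --resume definitions from the source above and parses a few sample command lines. The checkpoint path in the last call is purely illustrative.

```python
import argparse

parser = argparse.ArgumentParser()
# nargs='+': one or more integers are collected into a list
parser.add_argument('--img-size', nargs='+', type=int, default=[640, 640])
# store_true: the flag takes no value; its presence sets it to True
parser.add_argument('--rect', action='store_true')
# nargs='?': optional value; const is used when the flag is given without one
parser.add_argument('--resume', nargs='?', const='get_last', default=False)

defaults = parser.parse_args([])
bare = parser.parse_args(['--img-size', '512', '--rect', '--resume'])
full = parser.parse_args(['--img-size', '640', '320',
                          '--resume', 'runs/exp0/weights/last.pt'])  # illustrative path

print(defaults.img_size, defaults.rect, defaults.resume)
print(bare.img_size, bare.rect, bare.resume)
print(full.img_size, full.rect, full.resume)
```

Note that argparse turns the dash in --img-size into an underscore, so the value is read as opt.img_size.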
After reading through this, my command line should become:
python train.py --epoch 53 --data .\data\junk2020.yaml --cfg .\models\yolov5s.yaml --weight runs\exp10\weights\best.pt --evolve --cache-images
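The command spells the options as --epoch and --weight, while the parser defines --epochs and --weights. This still works because argparse accepts any unambiguous prefix of a long option by default. A minimal sketch with just those two options:

```python
import argparse

parser = argparse.ArgumentParser()  # allow_abbrev defaults to True
parser.add_argument('--epochs', type=int, default=300)
parser.add_argument('--weights', type=str, default='')

# '--epoch' and '--weight' are unambiguous prefixes, so they resolve
# to '--epochs' and '--weights' respectively.
opt = parser.parse_args(['--epoch', '53',
                         '--weight', 'runs/exp10/weights/best.pt'])
print(opt.epochs, opt.weights)
```

Writing the full option names is still the safer habit, since a prefix stops working as soon as another option with the same prefix is added.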
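Let's test it.
There probably wasn't enough memory: the computer nearly froze and I had to kill python midway, so leave out --cache-images if your machine can't handle it...
I also don't know where the evolved hyp gets saved... I'll look into it tomorrow...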
python train.py --epoch 80 --data .\data\junk2020.yaml --cfg .\models\yolov5s.yaml --weight runs\exp10\weights\best.pt --evolve
The result: evolve fails with an error.
Traceback (most recent call last):
File "train.py", line 449, in <module>
print_mutation(hyp, results, opt.bucket)
File "D:\ForSpeed\junk_yolov5\yolov5\utils\utils.py", line 823, in print_mutation
b = '%10.3g' * len(hyp) % tuple(hyp.values()) # hyperparam values
TypeError: must be real number, not str
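Since this is not easy to debug, I dropped --evolve for now.
3. Interpreting the result plots
Here is what each panel in results.png shows:
- GIoU: presumably the mean GIoU loss; the smaller it is, the more accurate the boxes;
- Objectness: presumably the mean objectness loss; the smaller it is, the more accurate the object detection;
- Classification: presumably the mean classification loss; the smaller it is, the more accurate the classification;
- Precision: correct detections / all detections;
- Recall: correct detections / all objects that should have been detected;
- mAP@0.5 & mAP@0.5:0.95: in short, AP is the area under the curve drawn with Precision and Recall as the two axes, and m means the mean. The number after @ is the IoU threshold used to decide whether a detection counts as a positive or a negative; @0.5:0.95 means the result is averaged over the thresholds 0.5:0.05:0.95.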
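The definitions above can be sketched in a few lines. The counts and curve points below are made up for illustration, and the area is computed with a plain trapezoid rule, which is a simplification and not necessarily how the repository computes AP.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN)."""
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(recalls, precisions):
    """Area under a precision-recall curve by the trapezoid rule (simplified)."""
    area = 0.0
    for i in range(1, len(recalls)):
        width = recalls[i] - recalls[i - 1]
        area += width * (precisions[i] + precisions[i - 1]) / 2
    return area

# Hypothetical counts: 8 correct detections, 2 false alarms, 8 missed objects
p, r = precision_recall(tp=8, fp=2, fn=8)

# Hypothetical curve points, sorted by increasing recall
ap = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.0])

# The ten IoU thresholds behind mAP@0.5:0.95
iou_thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]
print(p, r, ap, iou_thresholds)
```

mAP@0.5:0.95 would then be the mean of the per-class AP values averaged again over these ten thresholds.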
4. Fixing the evolve error
Traceback (most recent call last):
File "train.py", line 449, in <module>
print_mutation(hyp, results, opt.bucket)
File "D:\ForSpeed\junk_yolov5\yolov5\utils\utils.py", line 823, in print_mutation
b = '%10.3g' * len(hyp) % tuple(hyp.values()) # hyperparam values
TypeError: must be real number, not str
The line to look at is b = '%10.3g' * len(hyp) % tuple(hyp.values()), which collects all the values of the hyp dict into a tuple and formats them in one go with %10.3g.
hyp = {'optimizer': 'SGD', # ['adam', 'SGD', None] if none, default is SGD
'lr0': 0.01, # initial learning rate (SGD=1E-2, Adam=1E-3)
'momentum': 0.937, # SGD momentum/Adam beta1
'weight_decay': 5e-4, # optimizer weight decay
'giou': 0.05, # giou loss gain
'cls': 0.58, # cls loss gain
'cls_pw': 1.0, # cls BCELoss positive_weight
'obj': 1.0, # obj loss gain (*=img_size/320 if img_size != 320)
'obj_pw': 1.0, # obj BCELoss positive_weight
'iou_t': 0.20, # iou training threshold
'anchor_t': 4.0, # anchor-multiple threshold
'fl_gamma': 0.0, # focal loss gamma (efficientDet default is gamma=1.5)
'hsv_h': 0.014, # image HSV-Hue augmentation (fraction)
'hsv_s': 0.68, # image HSV-Saturation augmentation (fraction)
'hsv_v': 0.36, # image HSV-Value augmentation (fraction)
'degrees': 0.0, # image rotation (+/- deg)
'translate': 0.0, # image translation (+/- fraction)
'scale': 0.5, # image scale (+/- gain)
'shear': 0.0} # image shear (+/- deg)
Looking at the values, the first one is the string 'SGD', and %10.3g only accepts numbers, so the formatting fails.
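The failure is easy to reproduce outside train.py. The sketch below uses a shortened hyp with only three of the entries listed above, shows that the original expression raises TypeError when a string is present, and that the same expression works once only numeric values remain.

```python
hyp = {'optimizer': 'SGD', 'lr0': 0.01, 'momentum': 0.937}  # shortened for illustration

try:
    b = '%10.3g' * len(hyp) % tuple(hyp.values())
    error = None
except TypeError as e:
    error = e  # %g refuses the string 'SGD'

# With the string entry filtered out, the original expression works
numeric_only = {k: v for k, v in hyp.items() if not isinstance(v, str)}
ok = '%10.3g' * len(numeric_only) % tuple(numeric_only.values())
print(repr(error), repr(ok))
```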
Change b = '%10.3g' * len(hyp) % tuple(hyp.values())
to
b = '%10s' * 1 % (list(hyp.values())[0],) + '%10.3g' * (len(hyp) - 1) % tuple(list(hyp.values())[1:])
Train for one epoch to test:
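The fix above relies on the string being the first entry of hyp. A slightly more general variant, offered only as a sketch, picks the format per value so that it does not depend on position. The helper name format_hyp_values is hypothetical and the hyp dict is shortened for illustration.

```python
def format_hyp_values(hyp):
    """Format strings with %10s and numbers with %10.3g, in dict order."""
    return ''.join('%10s' % v if isinstance(v, str) else '%10.3g' % v
                   for v in hyp.values())

hyp = {'optimizer': 'SGD', 'lr0': 0.01, 'momentum': 0.937}  # shortened for illustration
b = format_hyp_values(hyp)

# Same result as the position-based fix when the string comes first
b_positional = ('%10s' * 1 % (list(hyp.values())[0],)
                + '%10.3g' * (len(hyp) - 1) % tuple(list(hyp.values())[1:]))
print(repr(b))
```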
Traceback (most recent call last):
File "train.py", line 449, in <module>
print_mutation(hyp, results, opt.bucket)
File "D:\ForSpeed\junk_yolov5\yolov5\utils\utils.py", line 837, in print_mutation
x = np.unique(np.loadtxt('evolve.txt', ndmin=2), axis=0) # load unique rows
File "C:\Users\15518\AppData\Local\Programs\Python\Python37\lib\site-packages\numpy\lib\npyio.py", line 1146, in loadtxt
for x in read_data(_loadtxt_chunksize):
File "C:\Users\15518\AppData\Local\Programs\Python\Python37\lib\site-packages\numpy\lib\npyio.py", line 1074, in read_data
items = [conv(val) for (conv, val) in zip(converters, vals)]
File "C:\Users\15518\AppData\Local\Programs\Python\Python37\lib\site-packages\numpy\lib\npyio.py", line 1074, in <listcomp>
items = [conv(val) for (conv, val) in zip(converters, vals)]
File "C:\Users\15518\AppData\Local\Programs\Python\Python37\lib\site-packages\numpy\lib\npyio.py", line 781, in floatconv
return float(x)
ValueError: could not convert string to float: 'SGD'
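This comes from np.loadtxt('evolve.txt', ndmin=2): the txt file contains a string, and as the traceback shows, loadtxt tries to convert every field with float(), hence the error.
Let's try removing the first item.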
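Before removing anything, it can help to confirm which fields in a row would trip the float conversion. The sketch below uses only the standard library; the helper name and the sample row are hypothetical and do not reflect the real layout of evolve.txt.

```python
def non_numeric_fields(line):
    """Return the whitespace-separated fields that float() cannot parse."""
    bad = []
    for field in line.split():
        try:
            float(field)
        except ValueError:
            bad.append(field)
    return bad

row = '0.62 0.71 SGD 0.01 0.937'  # hypothetical row for illustration
print(non_numeric_fields(row))
```

Any field reported here would make a float-only loader fail in the same way as the traceback above.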
Traceback (most recent call last):
File "train.py", line 437, in <module>
hyp[k] = x[i + 7] * v[i] # mutate
IndexError: index 18 is out of bounds for axis 0 with size 18
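The traceback does not say which array has 18 entries. One possible reading, stated here as an assumption, is that the loop walks all 19 keys of hyp (including 'optimizer') while indexing an array that was sized for the 18 numeric hyperparameters only. The sketch below reproduces that off-by-one with a hypothetical gains list and shows that iterating over the numeric keys alone stays in bounds.

```python
hyp_keys = ['optimizer', 'lr0', 'momentum', 'weight_decay', 'giou', 'cls',
            'cls_pw', 'obj', 'obj_pw', 'iou_t', 'anchor_t', 'fl_gamma',
            'hsv_h', 'hsv_s', 'hsv_v', 'degrees', 'translate', 'scale', 'shear']
gains = [1.0] * 18  # hypothetical: one entry per numeric hyperparameter

failing_index = None
for i, k in enumerate(hyp_keys):
    try:
        gains[i]
    except IndexError:
        failing_index = i  # first index past the end of gains
        break

# Skipping the non-numeric 'optimizer' key keeps the two lengths equal
numeric_keys = [k for k in hyp_keys if k != 'optimizer']
in_bounds = all(i < len(gains) for i, _ in enumerate(numeric_keys))
print(len(hyp_keys), failing_index, in_bounds)
```

If this reading is right, the mutation loop would need to skip 'optimizer' rather than just dropping it from the printed row.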
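To be continued...
Copyright notice: this is an original article by CSDN blogger 「fxxxkming」, released under the CC 4.0 BY-SA license. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/weixin_41990671/article/details/107300314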