A Summary of Problems Encountered When Training with MMDetection

KeyError at label = self.cat2label[name]

  • The error message is as follows:

KeyError: 'Traceback (most recent call last):
  File "/home/dreamtech/.conda/envs/AugFPN2/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/dreamtech/.conda/envs/AugFPN2/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/dreamtech/.conda/envs/AugFPN2/lib/python3.7/site-packages/mmdet/datasets/custom.py", line 159, in __getitem__
    data = self.prepare_train_img(idx)
  File "/home/dreamtech/.conda/envs/AugFPN2/lib/python3.7/site-packages/mmdet/datasets/custom.py", line 187, in prepare_train_img
    ann = self.get_ann_info(idx)
  File "/home/dreamtech/.conda/envs/AugFPN2/lib/python3.7/site-packages/mmdet/datasets/xml_style.py", line 44, in get_ann_info
    label = self.cat2label[name]
KeyError: 'sly'
'

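  • This problem occurs in older versions of MMDetection when training on a custom VOC-format dataset with a modified config file. The cause is not the training-set labels: after editing voc.py, class_names.py, and the config file, the package was not reinstalled with pip install . or python setup.py install. The traceback is consistent with this, since every mmdet frame runs from the copy under site-packages rather than from the edited source tree.
  • This problem does not occur in newer versions of MMDetection.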
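A quick way to check whether Python is picking up a stale installed copy is to look at where the package is imported from. The following is a minimal sketch using only the standard library; the helper names module_origin and is_installed_copy are made up for illustration and are not part of MMDetection.

```python
import importlib.util
import os


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_installed_copy(path):
    """True if the path lies inside a site-packages or dist-packages directory."""
    parts = os.path.normpath(path).split(os.sep)
    return "site-packages" in parts or "dist-packages" in parts


if __name__ == "__main__":
    origin = module_origin("mmdet")
    if origin is None:
        print("mmdet is not importable in this environment")
    elif is_installed_copy(origin):
        # Edits made in the source tree are not visible until the package is reinstalled.
        print("mmdet is loaded from an installed copy:", origin)
    else:
        print("mmdet is loaded from a source tree:", origin)
```

If the script reports an installed copy, changes to voc.py and class_names.py in the cloned repository only take effect after the package is reinstalled.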

RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

  • The error message is as follows:

Traceback (most recent call last):
  File "tools/train.py", line 196, in <module>
    main()
  File "tools/train.py", line 192, in main
    meta=meta)
  File "/clusters/data_1080Ti_0/liudong/mmdetection/mmdet/apis/train.py", line 209, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/liudong/.conda/envs/liudong-mmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/liudong/.conda/envs/liudong-mmlab/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 51, in train
    self.call_hook('after_train_iter')
  File "/home/liudong/.conda/envs/liudong-mmlab/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/liudong/.conda/envs/liudong-mmlab/lib/python3.7/site-packages/mmcv/runner/hooks/optimizer.py", line 56, in after_train_iter
    runner.outputs['loss'].backward()
  File "/home/liudong/.conda/envs/liudong-mmlab/lib/python3.7/site-packages/torch/tensor.py", line 245, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/liudong/.conda/envs/liudong-mmlab/lib/python3.7/site-packages/torch/autograd/__init__.py", line 147, in backward
    allow_unreachable=True, accumulate_grad=True) # allow_unreachable flag
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED

  • The environment at the time was mmdet=2.20.0 and mmcv-full=1.4.4.
  • According to discussions on GitHub, downgrading mmcv-full to 1.4.2 resolves the problem.
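Before downgrading, it helps to confirm which versions are actually installed. The sketch below is one possible check and rests on two assumptions. First, it uses importlib.metadata, which needs Python 3.8 or later, whereas the traceback above comes from a Python 3.7 environment where the importlib_metadata backport would be required. Second, it treats 1.4.2 as the known-good mmcv-full version only because that is what this post reports. The helper names are hypothetical.

```python
from importlib import metadata  # standard library from Python 3.8 onward


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Convert the leading numeric components of a version, e.g. '1.4.4', to a tuple of ints."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric component such as 'rc1'
        parts.append(int(piece))
    return tuple(parts)


def newer_than(current, known_good="1.4.2"):
    """True if the current version is newer than the version reported to work."""
    return version_tuple(current) > version_tuple(known_good)


if __name__ == "__main__":
    for dist in ("mmdet", "mmcv-full"):
        print(dist, installed_version(dist))
    mmcv_version = installed_version("mmcv-full")
    if mmcv_version is not None and newer_than(mmcv_version):
        print("mmcv-full is newer than 1.4.2; the downgrade described above may apply")
```

With the versions from this post, newer_than("1.4.4") is true, so the check would point to the downgrade.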

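Copyright notice: This is an original article by CSDN blogger 「刘啊咚咚锵」, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/liu_dongd/article/details/115890453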
