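I found a Faster R-CNN demo, but as a beginner I could not get it to run on my own dataset even with a working example in hand.
This post records the changes I made to adapt it to my dataset: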
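01 Annotating the dataset
Step 1: Put the annotation files into MyAnnotations
The demo keeps its datasets under the VOCdevkit folder, so for convenience I put mine under the VOC2012 folder as well.
To keep it separate from the PASCAL VOC2012 data, I created two folders, MyAnnotations and MyJPEGImages. The first holds the annotation files for my dataset; the second holds the images in JPEG format.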
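The folder layout described above can be sketched in a few lines. This is only an illustration: the helper name make_dataset_dirs is hypothetical, and the root path is assumed to be ./VOCdevkit/VOC2012 as in the demo.

```python
from pathlib import Path


def make_dataset_dirs(voc_root):
    """Create the two custom folders next to the stock VOC2012 ones.

    `voc_root` is assumed to be the VOC2012 directory of the demo.
    """
    voc_root = Path(voc_root)
    created = []
    for name in ("MyAnnotations", "MyJPEGImages"):
        folder = voc_root / name
        # exist_ok=True makes the call safe to repeat
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


# Example call (creates the folders under the current directory):
# make_dataset_dirs("./VOCdevkit/VOC2012")
```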
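02 Splitting into training and validation sets
Step 2: Run split_data.py (remember to update the paths)
The project root contains split_data.py, which splits the data into a training set and a validation set. By default it writes two files, train.txt and val.txt. (To keep them apart from the original dataset, I named mine MyTrain.txt and Myval.txt.)
When the script finishes, the two generated txt files are saved in the same directory as split_data.py.
(Note that neither file may already exist in that directory when the script runs: it will not overwrite them, so delete the old files by hand first.)
Move the two txt files into ./VOCdevkit/VOC2012/ImageSets/Main.
Two things need to be changed in the script: (1) the path to the annotation files; (2) the location and file names of the txt files it writes.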
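For readers without the demo at hand, here is a minimal sketch of what a split script of this kind might look like. It is not the demo's split_data.py: the function name, the val_rate and seed parameters, and the 50/50 default are assumptions made for illustration. It does mirror the two settings mentioned above (annotation path, output location and names) and the refusal to overwrite existing files.

```python
import random
from pathlib import Path


def split_dataset(annotation_dir, out_dir, val_rate=0.5, seed=0,
                  train_name="MyTrain.txt", val_name="Myval.txt"):
    """Split annotation file stems into a training and a validation list.

    val_rate and seed are illustrative assumptions, not values from the demo.
    """
    annotation_dir, out_dir = Path(annotation_dir), Path(out_dir)
    train_path, val_path = out_dir / train_name, out_dir / val_name

    # Mirror the behaviour described in the post: never overwrite old lists.
    for path in (train_path, val_path):
        if path.exists():
            raise FileExistsError(f"{path} already exists; delete it first")

    # One entry per annotation file, without the .xml extension.
    stems = sorted(p.stem for p in annotation_dir.glob("*.xml"))
    rng = random.Random(seed)
    val_set = set(rng.sample(stems, k=int(len(stems) * val_rate)))

    train = [s for s in stems if s not in val_set]
    val = [s for s in stems if s in val_set]
    train_path.write_text("\n".join(train), encoding="utf-8")
    val_path.write_text("\n".join(val), encoding="utf-8")
    return train, val
```

Writing the lists with an explicit encoding keeps the file contents predictable if the names are later read back on a machine with a different default encoding.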
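03 Updating the dataset settings in the network code
Step 3: Change the file paths in my_dataset.py and edit the class JSON file
(Both files are in the project root.)
Edit my_pascal_voc_classes.json for your own classes. (The file was originally named pascal_voc_classes.json; I renamed it to keep the two apart.)
In my_dataset.py, change (1) the paths to the annotation files and images; (2) the path to the class JSON file.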
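The two changes in this step can be sketched as follows. Several things here are assumptions rather than facts from the demo: that the class JSON maps each class name to an integer index starting at 1, the class name "defect", and both helper names. The folder names follow the layout from step 1, and the image path matches the one that appears in the tracebacks further down.

```python
import json
from pathlib import Path


def write_class_json(class_names, json_path):
    """Write a name -> index mapping; index 0 is left free for the background."""
    class_dict = {name: idx for idx, name in enumerate(class_names, start=1)}
    Path(json_path).write_text(json.dumps(class_dict, indent=4), encoding="utf-8")
    return class_dict


def dataset_paths(voc_root, txt_name):
    """Collect the paths a VOC-style dataset class would need (hypothetical helper)."""
    root = Path(voc_root) / "VOCdevkit" / "VOC2012"
    return {
        "annotations": root / "MyAnnotations",
        "images": root / "MyJPEGImages",
        "split": root / "ImageSets" / "Main" / txt_name,
    }


# Example: a single hypothetical class called "defect"
# write_class_json(["defect"], "my_pascal_voc_classes.json")
# dataset_paths("./", "MyTrain.txt")
```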
04 Changing the image channel count for my dataset
Step 4: Set dim to 1 in transform.py
My images are grayscale, so they have only one channel, and dim has to be changed to 1 in transform.py.
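05 Updating the training script
Step 5: In train_res50_fpn.py, change the txt files used for training (the txt with the image names and the txt with the annotation information)
This change is made in train_res50_fpn.py.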
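06 Updating the prediction script
Step 6: Change the number of classes and the JSON file path in predict.py
In predict.py, set num_classes to 2, that is, the number of classes plus 1. The extra 1 is the background, whose index is 0.
Also in predict.py, update the path to the class JSON file.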
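One way to avoid hard-coding the class count is to derive it from the class JSON file. The sketch below assumes the JSON maps class names to integer indices starting at 1; the function name load_category_index is hypothetical and is not taken from predict.py.

```python
import json


def load_category_index(json_path):
    """Return (index -> name mapping, num_classes including background)."""
    with open(json_path, "r", encoding="utf-8") as f:
        class_dict = json.load(f)

    # Invert name -> index so predicted labels can be turned back into names.
    category_index = {int(idx): name for name, idx in class_dict.items()}

    # +1 for the background class, which occupies index 0.
    num_classes = len(class_dict) + 1
    return category_index, num_classes
```

With a single-class JSON file this yields num_classes equal to 2, which matches the value set by hand above.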
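Error: FileNotFoundError: Caught FileNotFoundError in DataLoader worker
For this error, some people suggest setting num_workers to 0; others say it happens when labels are left out while the data is loaded, so that some samples have no label.
Reference: https://blog.csdn.net/weixin_45093926/article/details/103330105
I did not think I had left out any labels, so I ran the script again.
The second run failed with a different error from the first. Both tracebacks are shown below.
In the end, the cause turned out to be Chinese characters in the image paths, so I renamed the files.
After renaming, one more thing needs attention: the <filename> field in my annotation files still held the old Chinese names, so it has to be updated as well.
For batch renaming in Python, see: https://blog.csdn.net/gyxx1998/article/details/120834007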
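A minimal sketch of such a batch rename, using only the standard library, is given here. It is not the script from the linked post. The naming scheme (replace every run of characters outside letters, digits and hyphens with an underscore, then append a running index to keep names unique), the helper names, and the assumption that each annotation shares its stem with a .jpg image are all illustrative choices. It also rewrites the <filename> and <path> fields when they are present.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path


def ascii_stem(stem, index):
    """Build an ASCII-only name; the index suffix keeps names unique."""
    cleaned = re.sub(r"[^0-9A-Za-z-]+", "_", stem).strip("_")
    return f"{cleaned or 'img'}_{index:05d}"


def rename_dataset(annotation_dir, image_dir, image_ext=".jpg"):
    """Rename non-ASCII image/annotation pairs and fix the XML fields."""
    annotation_dir, image_dir = Path(annotation_dir), Path(image_dir)
    mapping = {}
    for index, xml_path in enumerate(sorted(annotation_dir.glob("*.xml"))):
        old_stem = xml_path.stem
        new_stem = old_stem if old_stem.isascii() else ascii_stem(old_stem, index)
        old_image = image_dir / (old_stem + image_ext)
        new_image = image_dir / (new_stem + image_ext)
        if new_stem != old_stem and old_image.exists():
            old_image.rename(new_image)

        # Update the fields that may still carry the old Chinese name.
        tree = ET.parse(str(xml_path))
        root = tree.getroot()
        for tag, value in (("filename", new_image.name), ("path", str(new_image))):
            node = root.find(tag)
            if node is not None:
                node.text = value

        new_xml = annotation_dir / (new_stem + ".xml")
        tree.write(str(new_xml), encoding="utf-8")
        if new_xml != xml_path:
            xml_path.unlink()
        mapping[old_stem] = new_stem
    return mapping
```

Because the training and validation txt files list the old names, they would need to be regenerated after a rename like this.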
Error 1: FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
D:\application\anaconda\envs\torch16\python.exe D:/study/mytorch/train_res50_fpn.py
Namespace(aspect_ratio_group_factor=3, batch_size=2, data_path='./', device='cuda:0', epochs=15, num_classes=20, output_dir='./save_weights', resume='', start_epoch=0)
Using cuda device training.
Using [0, 0.5, 0.6299605249474366, 0.7937005259840997, 1.0, 1.2599210498948732, 1.5874010519681994, 2.0, inf] as bins for aspect ratio quantization
Count of instances per bin: [348]
Using 2 dataloader workers
Traceback (most recent call last):
File "D:/study/mytorch/train_res50_fpn.py", line 205, in <module>
main(args)
File "D:/study/mytorch/train_res50_fpn.py", line 134, in main
print_freq=50, warmup=True)
File "D:\study\mytorch\train_utils\train_eval_utils.py", line 28, in train_one_epoch
for i, [images, targets] in enumerate(metric_logger.log_every(data_loader, print_freq, header)):
File "D:\study\mytorch\train_utils\distributed_utils.py", line 204, in log_every
for obj in iterable:
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\utils\data\dataloader.py", line 363, in __next__
data = self._next_data()
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\utils\data\dataloader.py", line 989, in _next_data
return self._process_data(data)
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\utils\data\dataloader.py", line 1014, in _process_data
data.reraise()
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\_utils.py", line 395, in reraise
raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\utils\data\_utils\worker.py", line 185, in _worker_loop
data = fetcher.fetch(index)
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "D:\study\mytorch\my_dataset.py", line 51, in __getitem__
image = Image.open(img_path)
File "D:\application\anaconda\envs\torch16\lib\site-packages\PIL\Image.py", line 2968, in open
fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: './VOCdevkit\\VOC2012\\MyJPEGImages\\1-2-00006_original鍒囧壊1.jpg'
Process finished with exit code 1
Error 2: FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 1.
D:\application\anaconda\envs\torch16\python.exe D:/study/mytorch/train_res50_fpn.py
Namespace(aspect_ratio_group_factor=3, batch_size=2, data_path='./', device='cuda:0', epochs=15, num_classes=20, output_dir='./save_weights', resume='', start_epoch=0)
Using cuda device training.
Using [0, 0.5, 0.6299605249474366, 0.7937005259840997, 1.0, 1.2599210498948732, 1.5874010519681994, 2.0, inf] as bins for aspect ratio quantization
Count of instances per bin: [348]
Using 2 dataloader workers
D:\application\anaconda\envs\torch16\lib\site-packages\torchvision\ops\poolers.py:216: UserWarning: This overload of nonzero is deprecated:
nonzero(Tensor input, *, Tensor out)
Consider using one of the following signatures instead:
nonzero(Tensor input, *, bool as_tuple) (Triggered internally at ..\torch\csrc\utils\python_arg_parser.cpp:766.)
idx_in_level = torch.nonzero(levels == level).squeeze(1)
Epoch: [0] [ 0/174] eta: 0:16:45.943889 lr: 0.000034 loss: 3.8109 (3.8109) loss_classifier: 3.0328 (3.0328) loss_box_reg: 0.0135 (0.0135) loss_objectness: 0.7061 (0.7061) loss_rpn_box_reg: 0.0585 (0.0585) time: 5.7813 data: 1.5690 max mem: 676
Traceback (most recent call last):
File "D:/study/mytorch/train_res50_fpn.py", line 205, in <module>
main(args)
File "D:/study/mytorch/train_res50_fpn.py", line 134, in main
print_freq=50, warmup=True)
File "D:\study\mytorch\train_utils\train_eval_utils.py", line 28, in train_one_epoch
for i, [images, targets] in enumerate(metric_logger.log_every(data_loader, print_freq, header)):
File "D:\study\mytorch\train_utils\distributed_utils.py", line 204, in log_every
for obj in iterable:
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\utils\data\dataloader.py", line 363, in __next__
data = self._next_data()
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\utils\data\dataloader.py", line 971, in _next_data
return self._process_data(data)
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\utils\data\dataloader.py", line 1014, in _process_data
data.reraise()
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\_utils.py", line 395, in reraise
raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 1.
Original Traceback (most recent call last):
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\utils\data\_utils\worker.py", line 185, in _worker_loop
data = fetcher.fetch(index)
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "D:\application\anaconda\envs\torch16\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "D:\study\mytorch\my_dataset.py", line 51, in __getitem__
image = Image.open(img_path)
File "D:\application\anaconda\envs\torch16\lib\site-packages\PIL\Image.py", line 2968, in open
fp = builtins.open(filename, "rb")
FileNotFoundError: [Errno 2] No such file or directory: './VOCdevkit\\VOC2012\\MyJPEGImages\\10_鍒囧壊1 (22).jpg'
Process finished with exit code 1
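In the end, the cause appears to be Chinese characters in the file names; renaming the files fixed that.
After renaming, the error persisted because the annotation files had the same problem: their filename and path fields were still in Chinese.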
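A quick check before training can surface this kind of problem without waiting for a DataLoader worker to crash. The sketch below is an assumption-laden illustration: the function name check_split is hypothetical, and it assumes each line of the split txt file is a file stem shared by a .xml annotation and a .jpg image. The garbled characters in the logged paths suggest the names may have been decoded with a different encoding than they were written in, which is one way a Chinese file name can end up pointing at a file that does not exist.

```python
from pathlib import Path


def check_split(txt_path, annotation_dir, image_dir, image_ext=".jpg"):
    """Return a list of (name, problem) pairs for entries that would fail to load."""
    annotation_dir, image_dir = Path(annotation_dir), Path(image_dir)
    problems = []
    lines = Path(txt_path).read_text(encoding="utf-8").splitlines()
    for name in (line.strip() for line in lines):
        if not name:
            continue
        if not name.isascii():
            problems.append((name, "non-ASCII name"))
        if not (annotation_dir / (name + ".xml")).is_file():
            problems.append((name, "missing annotation"))
        if not (image_dir / (name + image_ext)).is_file():
            problems.append((name, "missing image"))
    return problems
```

An empty result means every listed sample has both files and an ASCII-only name.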
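Copyright notice: This is an original article by CSDN blogger 「冰激凌啊」, licensed under CC 4.0 BY-SA. When reposting, please include the original source link and this notice.
Original link: https://blog.csdn.net/gyxx1998/article/details/120822764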