1. Environment
After finally getting everything to work, I found that version mismatches are a real trap.
My setup: Python 3.6 + CUDA 10.2 + PyTorch 1.7.1 + NumPy 1.15.1 + RTX 2060
(Suggestion: supposedly downgrading PyTorch to version 1.2 or below avoids most of the problems. But my CUDA is 10.2, for which no GPU build of PyTorch 1.2 or below exists, and reinstalling was too much trouble, so I instead fixed the issues one by one, as described later.)
2. Downloading the project
- Configure it yourself
This uses the SSD-Pytorch git project. Since it requires access to an overseas network, git clone sometimes fails; in that case download the zip archive directly and extract it locally:
Project page: https://github.com/amdegroot/ssd.pytorch
git clone https://gitcode.net/mirrors/amdegroot/ssd.pytorch.git
My first download failed and the git page would not even load; it only worked after switching to a VPN.
- Download the already-configured project (my fork, after I got everything working)
Page: https://github.com/625135449/SSD-Pytorch, which can be cloned directly:
git clone https://github.com/625135449/SSD-Pytorch
3. Preparing the dataset
My dataset is in darknet's YOLO format, so here I convert it to a VOC dataset (you can convert to a COCO dataset instead if you prefer).
3.1 Data structure
darknet format: an images directory holding the pictures and a labels directory holding one .txt annotation file per image.
VOC format: put all the xml labels in Annotations, all the images in JPEGImages, and put train.txt, trainval.txt, val.txt, and test.txt (each containing only image names, one per line) in ImageSets/Main. A small sketch for creating this skeleton follows.
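A minimal sketch (my addition, not part of the original project) that creates the empty VOC2021 skeleton; the root path matches the ones used in the scripts below:

```python
import os

# Assumed dataset root, matching the paths in the conversion scripts below
voc_root = '/home/ssd.pytorch/data/VOCdevkit/VOC2021'

for sub in ('Annotations', 'JPEGImages', 'ImageSets/Main'):
    os.makedirs(os.path.join(voc_root, sub), exist_ok=True)
```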
3.2 Converting darknet txt files to VOC xml files
Fill in the file paths and class names for your data:
```python
import os
import glob
from PIL import Image
from tqdm import tqdm

voc_annotations = '/home/ssd.pytorch/data/VOCdevkit/VOC2021/Annotations/'  # where the xml files will be written
yolo_txt = '/home/darknet/Helmet/labels/'   # darknet label (txt) directory
img_path = '/home/darknet/Helmet/images/'   # darknet image directory
labels = ['no helmet', 'wear helmet']       # darknet class names

src_img_dir = img_path   # image directory
src_txt_dir = yolo_txt   # txt label directory
src_xml_dir = voc_annotations

img_Lists = glob.glob(src_img_dir + '/*.jpg')
img_basenames = []
for item in img_Lists:
    img_basenames.append(os.path.basename(item))

img_names = []
for item in img_basenames:
    temp1, temp2 = os.path.splitext(item)
    img_names.append(temp1)

for img in tqdm(img_names):
    im = Image.open(src_img_dir + '/' + img + '.jpg')
    width, height = im.size
    # read the txt label file
    gt = open(src_txt_dir + '/' + img + '.txt').read().splitlines()
    if gt:
        # write the header of the xml file
        xml_file = open(src_xml_dir + '/' + img + '.xml', 'w')
        xml_file.write('<annotation>\n')
        xml_file.write('    <folder>VOC2007</folder>\n')
        xml_file.write('    <filename>' + str(img) + '.jpg' + '</filename>\n')
        xml_file.write('    <size>\n')
        xml_file.write('        <width>' + str(width) + '</width>\n')
        xml_file.write('        <height>' + str(height) + '</height>\n')
        xml_file.write('        <depth>3</depth>\n')
        xml_file.write('    </size>\n')
        # write one <object> block per line of the txt file
        for img_each_label in gt:
            spt = img_each_label.split(' ')  # if the txt is comma-separated, use img_each_label.split(',') instead
            xml_file.write('    <object>\n')
            xml_file.write('        <name>' + str(labels[int(spt[0])]) + '</name>\n')
            xml_file.write('        <pose>Unspecified</pose>\n')
            xml_file.write('        <truncated>0</truncated>\n')
            xml_file.write('        <difficult>0</difficult>\n')
            xml_file.write('        <bndbox>\n')
            # YOLO stores normalized (center_x, center_y, w, h); convert to pixel corners
            center_x = round(float(spt[1].strip()) * width)
            center_y = round(float(spt[2].strip()) * height)
            bbox_width = round(float(spt[3].strip()) * width)
            bbox_height = round(float(spt[4].strip()) * height)
            xmin = str(int(center_x - bbox_width / 2))
            ymin = str(int(center_y - bbox_height / 2))
            xmax = str(int(center_x + bbox_width / 2))
            ymax = str(int(center_y + bbox_height / 2))
            xml_file.write('            <xmin>' + xmin + '</xmin>\n')
            xml_file.write('            <ymin>' + ymin + '</ymin>\n')
            xml_file.write('            <xmax>' + xmax + '</xmax>\n')
            xml_file.write('            <ymax>' + ymax + '</ymax>\n')
            xml_file.write('        </bndbox>\n')
            xml_file.write('    </object>\n')
        xml_file.write('</annotation>')
        xml_file.close()  # the original script never closed the file; close it explicitly
```
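To sanity-check the coordinate math above: YOLO stores a box as normalized (center_x, center_y, width, height), and the script converts it to pixel-space corners. A worked example (my addition) with a hypothetical 640x480 image and the label line "0 0.5 0.5 0.25 0.5":

```python
width, height = 640, 480               # hypothetical image size
cx, cy, bw, bh = 0.5, 0.5, 0.25, 0.5   # hypothetical YOLO label "0 0.5 0.5 0.25 0.5"
center_x, center_y = cx * width, cy * height  # 320.0, 240.0
box_w, box_h = bw * width, bh * height        # 160.0, 240.0
print(int(center_x - box_w / 2), int(center_y - box_h / 2))  # 240 120 -> xmin, ymin
print(int(center_x + box_w / 2), int(center_y + box_h / 2))  # 400 360 -> xmax, ymax
```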
3.3 Automatically generating test.txt, train.txt, trainval.txt, and val.txt
Fill in the file paths for your data:
```python
import os
import random

trainval_percent = 0.66  # fraction of all images that goes to trainval
train_percent = 0.5      # fraction of trainval that goes to train

xmlfilepath = '/home/ssd.pytorch/data/VOCdevkit/VOC2021/Annotations'
txtsavepath = '/home/ssd.pytorch/data/VOCdevkit/VOC2021/ImageSets/Main'

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)              # number of xml files
indices = range(num)              # renamed from `list` to avoid shadowing the builtin
tv = int(num * trainval_percent)  # 66% of the total
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
```
4. Working with the ssd.pytorch project
4.1 Creating the dataset
- This walkthrough uses a VOC dataset.
- If you have no dataset of your own, you can fetch the VOC and COCO datasets with the scripts bundled with the code (under ./data/scripts).
- If you have your own dataset:
- Create a VOCdevkit folder under the data folder.
- Copy the VOC2021 dataset converted above into the VOCdevkit folder, with the structure described in section 3.1 (if you use my fork, just run darknet_to_voc.py and split_txt.py under ./data/VOCdevkit/VOC2021).
4.2 Modifying the configuration files
The following uses my data as the example:
- Environment setup:
- Download the pretrained weights vgg16_reducedfc.pth and put them in ssd.pytorch/weights (create the weights folder if it does not exist); the download link is in the upstream project's README.
- Install pillow, opencv-python, and tqdm.
- Install numpy: 1.15.1 is recommended; newer versions raise an error (see section 5).
- Install pytorch: get the Torch build matching your CUDA version from the official site; this blogger maintains a compatibility table: https://blog.csdn.net/llm765800916/article/details/118146146
My install command:
pip install torch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2
- The voc dict in ./data/config.py (a sketch of my modified dict follows this list):
- HOME = os.path.expanduser("~"): set it to the absolute path where the ssd.pytorch project lives (mine became HOME = os.path.expanduser("/home/ssd.pytorch")).
- 'num_classes': number of classes + 1 (background counts as a class); I have 2 classes, so it is 3.
- 'max_iter': the total number of training iterations; I set it to 1000 for a quick test (choose it according to your machine and needs).
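For reference, a sketch of what the edited voc dict might look like; the field names and unchanged values here are the defaults from the upstream config.py, and only num_classes and max_iter are modified for my 2-class test run:

```python
voc = {
    'num_classes': 3,                     # 2 classes + 1 for background
    'lr_steps': (80000, 100000, 120000),  # upstream default; never reached in a 1000-iter test
    'max_iter': 1000,                     # short test run; raise this for real training
    'feature_maps': [38, 19, 10, 5, 3, 1],
    'min_dim': 300,
    'steps': [8, 16, 32, 64, 100, 300],
    'min_sizes': [30, 60, 111, 162, 213, 264],
    'max_sizes': [60, 111, 162, 213, 264, 315],
    'aspect_ratios': [[2], [2, 3], [2, 3], [2, 3], [2], [2]],
    'variance': [0.1, 0.2],
    'clip': True,
    'name': 'VOC',
}
```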
- ./data/coco.py
- Change COCO_ROOT = osp.join(HOME, 'data/coco/') on line 11 to COCO_ROOT = osp.join(HOME, 'data/').
- ./data/voc0712.py (a sketch of these edits follows this list)
- Change VOC_CLASSES on line 20 to your own class names.
- Change image_sets=[('2007', 'trainval'), ('2012', 'trainval')] on line 93 to your own dataset name and split files (my dataset is VOC2021 and I train on train.txt and trainval.txt under ImageSets/Main), so mine became image_sets=[('2021', 'train'), ('2021', 'trainval')].
- Change dataset_name='VOC0712' on line 95 to dataset_name='voc0712'.
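A sketch of the two voc0712.py edits for my helmet dataset; the class names match the labels list used in the conversion script of section 3.2:

```python
# ./data/voc0712.py, line 20: replace the 20 PASCAL VOC class names
VOC_CLASSES = ('no helmet', 'wear helmet')

# line 93, in VOCDetection.__init__'s default arguments:
image_sets = [('2021', 'train'), ('2021', 'trainval')]
```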
- ./train.py (a sketch of both edits follows this list)
- batch_size on line 32 defaults to 32; I suggest lowering it, e.g. to 8. (The batch size is the number of samples used per training step; it affects how well and how fast the model optimizes, and it directly determines GPU memory usage, so if your GPU memory is small, keep this value small.)
- iteration % 5000 == 0 on line 194 controls how many iterations pass between checkpoint saves; pick a value that matches the max_iter you set in config.py.
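A sketch of the two train.py edits, assuming the upstream argparse flag and save condition; note that with max_iter = 1000 the stock interval of 5000 would never save an intermediate checkpoint, so I shrink it as well:

```python
# ./train.py, line 32: smaller batch to fit a modest GPU
parser.add_argument('--batch_size', default=8, type=int,
                    help='Batch size for training')

# line 194: save a checkpoint every 500 iterations instead of every 5000
if iteration != 0 and iteration % 500 == 0:
    print('Saving state, iter:', iteration)
```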
- ./ssd.py
- In self.cfg = (coco, voc)[num_classes == 21] on line 32, change 21 to your own class count (3 in my case).
- In def build_ssd(phase, size=300, num_classes=21) on line 198, change 21 to your own class count.
5. Fixing errors and warnings during training
The line numbers given below may be slightly off; look a few lines above or below the stated location.
- **error**
IndexError: The shape of the mask [8, 8732] at index 0 does not match the shape of the indexed tensor [69856, 1] at index 0
(raised at loss_c[pos] = 0  # filter out pos boxes for now)
**solved**
In ./layers/modules/multibox_loss.py, swap lines 97 and 98 so the tensor is reshaped before the mask is applied:

```python
# before (lines 97-98)
loss_c[pos] = 0  # filter out pos boxes for now
loss_c = loss_c.view(num, -1)

# after: reshape first, then mask
loss_c = loss_c.view(num, -1)
loss_c[pos] = 0  # filter out pos boxes for now
```

Then at line 114, change:

```python
N = num_pos.data.sum()
```

to:

```python
N = num_pos.data.sum().double()
loss_l = loss_l.double()
loss_c = loss_c.double()
```
- **error**
RuntimeError: Expected a 'cuda' device type for generator but found 'cpu'
**solved**
Install the PyTorch build that matches your CUDA version:
pip install torch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2
- **error**
loc_loss += loss_l.data[0]
IndexError: invalid index of a 0-dim tensor. Use tensor.item() in Python or tensor.item<T>() in C++ to convert a 0-dim tensor to a number
**solved**
In ./train.py, change every .data[0] from line 183 onward to .data (or, equivalently, to .item() as the error message suggests).
- **error**
StopIteration
**solved**
At ./train.py line 165, change images, targets = next(batch_iterator) to re-create the iterator once it is exhausted:

```python
try:
    images, targets = next(batch_iterator)
except StopIteration:
    # the DataLoader iterator is exhausted after one pass; restart it
    batch_iterator = iter(data_loader)
    images, targets = next(batch_iterator)
```
- **warning**
VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
**solved**
pip install numpy==1.15.1
- **warning**
UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
init.xavier_uniform(param)
**solved**
At train.py line 218, change init.xavier_uniform to init.xavier_uniform_.
- **warning**
UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
targets = [Variable(ann.cuda(), volatile=True) for ann in targets]
**solved**
At train.py lines 173 and 176, delete 'volatile=True'; for example, change targets = [Variable(ann.cuda(), volatile=True) for ann in targets] to targets = [Variable(ann.cuda()) for ann in targets].
- **training loss becomes nan**
**solved**: at ./train.py line 42, parser.add_argument('--lr', '--learning-rate', default=1e-3, type=float, help='initial learning rate') defaults the learning rate to 1e-3 (0.001); lowering it fixes the nan. A sketch follows.
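A minimal sketch of the change, keeping the upstream flag quoted above; 1e-4 is one reasonable lower value (tune it for your data). Alternatively, pass --lr 1e-4 on the command line without editing the file:

```python
# ./train.py, line 42: lower the default learning rate to avoid nan losses
parser.add_argument('--lr', '--learning-rate', default=1e-4, type=float,
                    help='initial learning rate')
```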
6. Validation after training
6.1 Configuring eval.py
- Change the trained model on line 38 (a successful train.py run saves checkpoints into the weights folder automatically; I picked the one with the lowest loss):
parser.add_argument('--trained_model', default='weights/ssd_VOC_500.pth'...)
- On line 54, change args = parser.parse_args() to args, unknown = parser.parse_known_args() (a sketch follows this list).
- On lines 69, 70, 71, and 73, change annopath, imgpath, imgsetpath, and YEAR (the project author used VOC2007; I built VOC2021, so these need updating).
For example: annopath = os.path.join(args.voc_root, 'VOC2007', 'Annotations', '%s.xml')
becomes: annopath = os.path.join(args.voc_root, 'VOC2021', 'Annotations', '%s.xml')
- On line 429, change dataset = VOCDetection(args.voc_root, [('2007', set_type)]...)
to dataset = VOCDetection(args.voc_root, [('2021', set_type)]...)
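A sketch of the parse_known_args edit; unlike parse_args, it tolerates unrecognized command-line arguments instead of exiting with an error:

```python
# ./eval.py, line 54
# before: args = parser.parse_args()
# after: collect unknown CLI arguments instead of erroring out on them
args, unknown = parser.parse_known_args()
```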
Running eval.py then prints the per-class AP and the overall mAP.
6.2 Configuring test.py
- Change the trained model on line 17.
- On line 87, in testset = VOCDetection(args.voc_root, [('2007', 'test')]...), change 2007 to 2021.
6.3 Detecting images and visualizing the results
Drop these scripts into the project:
./demo/live_img.py draws detection boxes on images: https://github.com/625135449/SSD-Pytorch/blob/main/demo/live_img.py
./demo/live_score.py draws detection boxes with confidence scores: https://github.com/625135449/SSD-Pytorch/blob/main/demo/live_score.py
6.4 Errors and warnings while running eval.py
- **error**
RuntimeError: Legacy autograd function with non-static forward method is deprecated. Please use new-style autograd function with static forward method.
**solved**
Reportedly this does not occur on PyTorch below 1.2, so downgrading is an option; the fix below keeps the current version instead, following another blogger's solution.
- Locate ./ssd.py line 98 (the commented-out lines are the original code; the modified version follows them):
```python
if self.phase == "test":
    # output = self.detect(
    #     loc.view(loc.size(0), -1, 4),             # loc preds
    #     self.softmax(conf.view(conf.size(0), -1,
    #                  self.num_classes)),           # conf preds
    #     self.priors.type(type(x.data))             # default boxes
    # )
    output = self.detect.forward(
        loc.view(loc.size(0), -1, 4),                # loc preds
        self.softmax(conf.view(conf.size(0), -1,
                     self.num_classes)),             # conf preds
        self.priors.type(type(x.data))               # default boxes
    )
```
- Locate def nms(boxes, scores, overlap=0.5, top_k=200) in ./layers/box_utils.py and replace it with the following function:
```python
def nms(boxes, scores, overlap=0.5, top_k=200):
    # args: box coordinates, per-box class scores, nms threshold, keep at most the top 200 boxes
    '''(1) Build the keep tensor: zeros, one slot per candidate box (the candidates are
       this class's boxes whose class confidence exceeded the threshold)'''
    keep = scores.new(scores.size(0)).zero_().long()
    if boxes.numel() == 0:
        return keep
    '''(2) Compute the area of each candidate box'''
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]
    area = torch.mul(x2 - x1, y2 - y1)
    '''(3) Get the indices of the top_k highest-scoring candidate boxes'''
    v, idx = scores.sort(0)  # ascending sort of class confidence; returns the sorted box indices
    # I = I[v >= 0.01]
    idx = idx[-top_k:]  # indices of the top-k largest vals
    xx1 = boxes.new()
    yy1 = boxes.new()
    xx2 = boxes.new()
    yy2 = boxes.new()
    w = boxes.new()
    h = boxes.new()
    '''(4) Write the indices of the boxes that survive nms into keep'''
    count = 0
    while idx.numel() > 0:
        '''#1. Highest-scoring remaining box: write its index into keep'''
        i = idx[-1]  # index of current largest val
        # keep.append(i)
        keep[count] = i
        count += 1
        if idx.size(0) == 1:
            break
        '''#2. Indices of the remaining boxes'''
        idx = idx[:-1]  # remove kept element from view
        '''#3. Compute the IoU between each remaining box and the highest-scoring box'''
        ##################### added code #####################
        # Otherwise: RuntimeError: index_select(): functions with out=... arguments don't
        # support automatic differentiation, but one of the arguments requires grad.
        idx = torch.autograd.Variable(idx, requires_grad=False)
        idx = idx.data
        x1 = torch.autograd.Variable(x1, requires_grad=False)
        x1 = x1.data
        y1 = torch.autograd.Variable(y1, requires_grad=False)
        y1 = y1.data
        x2 = torch.autograd.Variable(x2, requires_grad=False)
        x2 = x2.data
        y2 = torch.autograd.Variable(y2, requires_grad=False)
        y2 = y2.data
        ##################### added code #####################
        torch.index_select(x1, 0, idx, out=xx1)
        torch.index_select(y1, 0, idx, out=yy1)
        torch.index_select(x2, 0, idx, out=xx2)
        torch.index_select(y2, 0, idx, out=yy2)
        # store element-wise max with next highest score
        xx1 = torch.clamp(xx1, min=x1[i])
        yy1 = torch.clamp(yy1, min=y1[i])
        xx2 = torch.clamp(xx2, max=x2[i])
        yy2 = torch.clamp(yy2, max=y2[i])
        w.resize_as_(xx2)
        h.resize_as_(yy2)
        w = xx2 - xx1
        h = yy2 - yy1
        # check sizes of xx1 and xx2.. after each iteration
        w = torch.clamp(w, min=0.0)
        h = torch.clamp(h, min=0.0)
        inter = w * h
        # IoU = i / (area(a) + area(b) - i)
        ##################### added code #####################
        # same index_select/grad workaround as above
        area = torch.autograd.Variable(area, requires_grad=False)
        area = area.data
        idx = torch.autograd.Variable(idx, requires_grad=False)
        idx = idx.data
        ##################### added code #####################
        rem_areas = torch.index_select(area, 0, idx)  # load remaining areas
        union = (rem_areas - inter) + area[i]
        IoU = inter / union  # store result in iou
        '''#4. Keep only the boxes whose IoU with the kept box is at most the nms threshold'''
        idx = idx[IoU.le(overlap)]
    return keep, count
```
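As an aside (my addition, not from the original post): on the PyTorch 1.7.1 / torchvision 0.8.2 stack pinned above, torchvision ships a built-in NMS that avoids these autograd workarounds entirely. Swapping it in would require adapting the callers, since the hand-rolled function returns (keep, count):

```python
import torch
from torchvision.ops import nms  # available in the torchvision 0.8.2 pinned above

boxes = torch.tensor([[0., 0., 10., 10.], [1., 1., 11., 11.], [50., 50., 60., 60.]])
scores = torch.tensor([0.9, 0.8, 0.7])
keep = nms(boxes, scores, iou_threshold=0.5)  # indices of kept boxes, sorted by score
print(keep)  # tensor([0, 2]): the second box overlaps the first above the threshold
```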
- **warning**
UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
self.priors = Variable(self.priorbox.forward(), volatile=True)
**solved**
At ./ssd.py line 34, change self.priors = Variable(self.priorbox.forward(), volatile=True)
to self.priors = Variable(self.priorbox.forward()).
My overall workflow followed this blogger's post: https://blog.csdn.net/weixin_42447868/article/details/105675158#comments_19145022
Copyright notice: this is an original article by CSDN blogger 「卖strawberry的小女孩」, released under the CC 4.0 BY-SA license; include the original link and this notice when reposting.
Original link: https://blog.csdn.net/baidu_41906969/article/details/121835265