
Introduction to the COCO Dataset

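MS COCO stands for Microsoft Common Objects in Context. It originated from the Microsoft COCO dataset, whose annotation Microsoft funded in 2014. COCO is a large, rich dataset for object detection, segmentation, and captioning. It targets scene understanding: the images are taken mainly from complex everyday scenes, and the objects in them are localized through precise segmentation. The dataset contains 91 object categories, 328,000 images, and 2,500,000 labels. It mainly addresses three problems: object detection, the contextual relationships between objects, and precise 2D localization of objects.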

Official website: http://cocodataset.org

COCO Dataset Format

COCO_2017/
├── val2017 # full validation set
├── train2017 # full training set
└── annotations # COCO annotations
    ├── instances_train2017.json # object instances: training-set annotations
    ├── instances_val2017.json # object instances: validation-set annotations
    ├── person_keypoints_train2017.json # object keypoints: training-set annotations for keypoint detection
    ├── person_keypoints_val2017.json # object keypoints: validation-set annotations for keypoint detection
    ├── captions_train2017.json # image captions: training-set annotations for captioning
    └── captions_val2017.json # image captions: validation-set annotations for captioning

Creating a COCO Dataset

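COCO covers five task types: object detection, keypoint detection, semantic segmentation, scene segmentation, and image captioning. COCO annotation files are stored as JSON. There are three kinds of official annotation files: captions_type.json, instances_type.json, and person_keypoints_type.json, where type is train/val/test followed by the year (for example, train2017).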
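
To make the JSON layout concrete, here is a minimal sketch of what an instances-style annotation file contains. All values are made-up placeholders, and annotation_file_name is a hypothetical helper that only illustrates the naming pattern described above; it is not part of any COCO tool.

```python
import json

# Minimal COCO-style "instances" structure; every value is a placeholder.
coco = {
    "images": [
        {"id": 1, "file_name": "000001.jpg", "height": 480, "width": 640},
    ],
    "categories": [
        {"id": 1, "name": "cat", "supercategory": "animal"},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,      # refers to images[].id
            "category_id": 1,   # refers to categories[].id
            # One polygon as a flat list: [x1, y1, x2, y2, ...]
            "segmentation": [[100.0, 120.0, 300.0, 120.0,
                              300.0, 320.0, 100.0, 320.0]],
            # Bounding box as [x, y, width, height]
            "bbox": [100.0, 120.0, 200.0, 200.0],
            "area": 40000.0,
            "iscrowd": 0,
        },
    ],
}


def annotation_file_name(task, split, year):
    """Build a file name such as 'instances_train2017.json' (hypothetical helper)."""
    return f"{task}_{split}{year}.json"


print(annotation_file_name("instances", "train", 2017))
print(json.dumps(coco["annotations"][0]["bbox"]))
```

The three top-level lists are linked by ids: each annotation points to one image and one category.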

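Prepare the folder layout
Create a new folder named COCO.
Inside COCO, create images/ and annotations/.
Annotate the dataset with labelme
Install labelme in your Anaconda environment with the command pip install labelme.
Once it is installed, run labelme to launch it.
Click Open Dir and select the folder you want to annotate.
Click Create Polygons to start annotating.
Save the generated JSON files to a folder of your choice.
Convert the labelme JSON files to COCO format with the script below.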
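
The preparation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: make_coco_skeleton and count_labels are hypothetical helper names, and the only labelme fields used are shapes and label, which also appear in the conversion script in this article.

```python
import json
from pathlib import Path


def make_coco_skeleton(root):
    """Create <root>/images and <root>/annotations and return both paths."""
    root = Path(root)
    images = root / "images"
    annotations = root / "annotations"
    images.mkdir(parents=True, exist_ok=True)
    annotations.mkdir(parents=True, exist_ok=True)
    return images, annotations


def count_labels(json_dir):
    """Count how often each label occurs across the labelme JSON files in json_dir."""
    counts = {}
    for path in sorted(Path(json_dir).glob("*.json")):
        with open(path, "r", encoding="utf-8") as fp:
            data = json.load(fp)
        for shape in data.get("shapes", []):
            counts[shape["label"]] = counts.get(shape["label"], 0) + 1
    return counts
```

Calling make_coco_skeleton("COCO") creates the two folders, and count_labels on the folder holding your labelme files is a quick way to spot misspelled class names before converting.
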

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import glob
import json

import numpy as np
import PIL.Image
import PIL.ImageDraw  # used by polygons_to_mask
from labelme import utils

class MyEncoder(json.JSONEncoder):
    """JSON encoder that converts NumPy scalars and arrays to plain Python types."""

    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return super(MyEncoder, self).default(obj)

class labelme2coco(object):
    def __init__(self, labelme_json=None, save_json_path='./tran.json'):
        '''
        :param labelme_json: list of paths to all labelme JSON files
        :param save_json_path: where to save the resulting COCO JSON file
        '''
        self.labelme_json = labelme_json or []
        self.save_json_path = save_json_path
        self.images = []
        self.categories = []
        self.annotations = []
        self.label = []
        self.annID = 1
        self.height = 0
        self.width = 0

        self.save_json()  # the conversion runs as soon as the object is created

    def data_transfer(self):
        for num, json_file in enumerate(self.labelme_json):
            with open(json_file, 'r') as fp:
                data = json.load(fp)  # load one labelme JSON file
            self.images.append(self.image(data, num))
            for shapes in data['shapes']:
                label = shapes['label']
                if label not in self.label:
                    self.categories.append(self.categorie(label))
                    self.label.append(label)
                points = shapes['points']
                if shapes.get('shape_type') == 'rectangle' and len(points) == 2:
                    # Rectangle annotations store only two opposite corners;
                    # expand them to four corners in drawing order.
                    (x0, y0), (x1, y1) = points
                    points = [[x0, y0], [x1, y0], [x1, y1], [x0, y1]]
                self.annotations.append(self.annotation(points, label, num))
                self.annID += 1

    def image(self, data, num):
        image = {}
        img = utils.img_b64_to_arr(data['imageData'])  # decode the embedded image data
        height, width = img.shape[:2]
        image['height'] = height
        image['width'] = width
        image['id'] = num + 1
        # Keep only the file name, whichever path separator labelme wrote.
        image['file_name'] = data['imagePath'].replace('\\', '/').split('/')[-1]
        self.height = height
        self.width = width
        return image

    def categorie(self, label):
        categorie = {}
        categorie['supercategory'] = 'Cancer'  # change this to suit your own dataset
        categorie['id'] = len(self.label) + 1  # 0 is reserved for background
        categorie['name'] = label
        return categorie

    def annotation(self, points, label, num):
        annotation = {}
        annotation['segmentation'] = [list(np.asarray(points).flatten())]
        annotation['iscrowd'] = 0
        annotation['image_id'] = num + 1
        annotation['bbox'] = list(map(float, self.getbbox(points)))
        # Area is approximated by the bounding-box area (width * height).
        annotation['area'] = annotation['bbox'][2] * annotation['bbox'][3]
        # Note: the source code this was adapted from defaulted to 1 here.
        annotation['category_id'] = self.getcatid(label)
        annotation['id'] = self.annID
        return annotation

    def getcatid(self, label):
        for categorie in self.categories:
            if label == categorie['name']:
                return categorie['id']
        return 1

    def getbbox(self, points):
        mask = self.polygons_to_mask([self.height, self.width], points)
        return self.mask2box(mask)

    def mask2box(self, mask):
        '''Compute the bounding box from a mask.
        mask: [h, w] image made of 0s and 1s, where 1 marks the object.
        The box follows from the row/column indices of the 1 pixels
        (top-left and bottom-right corners).
        '''
        index = np.argwhere(mask == 1)
        rows = index[:, 0]
        cols = index[:, 1]
        # top-left corner
        left_top_r = np.min(rows)  # y
        left_top_c = np.min(cols)  # x

        # bottom-right corner
        right_bottom_r = np.max(rows)
        right_bottom_c = np.max(cols)

        return [left_top_c, left_top_r, right_bottom_c - left_top_c,
                right_bottom_r - left_top_r]  # [x1, y1, w, h], the COCO bbox format

    def polygons_to_mask(self, img_shape, polygons):
        mask = np.zeros(img_shape, dtype=np.uint8)
        mask = PIL.Image.fromarray(mask)
        xy = list(map(tuple, polygons))
        PIL.ImageDraw.Draw(mask).polygon(xy=xy, outline=1, fill=1)
        mask = np.array(mask, dtype=bool)
        return mask

    def data2coco(self):
        data_coco = {}
        data_coco['images'] = self.images
        data_coco['categories'] = self.categories
        data_coco['annotations'] = self.annotations
        return data_coco

    def save_json(self):
        self.data_transfer()
        self.data_coco = self.data2coco()
        # Save the JSON file; indent=4 makes it easier to read.
        with open(self.save_json_path, 'w') as fp:
            json.dump(self.data_coco, fp, indent=4, cls=MyEncoder)

# Collect every labelme JSON file; the ./json output folder must already exist.
labelme_json = glob.glob('./Annotations/*.json')

labelme2coco(labelme_json, './json/test.json')
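
After the conversion, it can be worth checking that the output file is internally consistent. The following is a minimal sketch using only the standard library; check_coco and check_coco_file are hypothetical helper names, and the checks cover only the fields written by the script above.

```python
import json


def check_coco(coco):
    """Return a list of problems found in a COCO-style dict (empty if none)."""
    problems = []
    image_ids = {img["id"] for img in coco.get("images", [])}
    category_ids = {cat["id"] for cat in coco.get("categories", [])}
    seen_ann_ids = set()
    for ann in coco.get("annotations", []):
        ann_id = ann["id"]
        if ann_id in seen_ann_ids:
            problems.append(f"duplicate annotation id {ann_id}")
        seen_ann_ids.add(ann_id)
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann_id}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann_id}: unknown category_id {ann['category_id']}")
        _, _, w, h = ann["bbox"]  # bbox is [x, y, width, height]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann_id}: empty bbox {ann['bbox']}")
    return problems


def check_coco_file(path):
    """Load a COCO JSON file and run check_coco on it."""
    with open(path, "r", encoding="utf-8") as fp:
        return check_coco(json.load(fp))
```

For the output path used above, check_coco_file('./json/test.json') should return an empty list when every annotation refers to an existing image and category.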

Introduction to the Pascal VOC Dataset

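The PASCAL VOC challenge (The PASCAL Visual Object Classes) is a world-class computer vision challenge. PASCAL stands for Pattern Analysis, Statistical Modelling and Computational Learning, a network organization funded by the European Union. Many models were introduced on this dataset, for example YOLO and SSD in object detection.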

VOC Dataset Format

├── Annotations
├── ImageSets
│ ├── Action
│ ├── Layout
│ ├── Main
│ └── Segmentation
├── JPEGImages
├── SegmentationClass
└── SegmentationObject
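
Each file in Annotations is an XML document describing one image. As a rough illustration, the sketch below parses a hand-written sample with the standard library. The sample values are invented, parse_voc_xml is a hypothetical helper, and only the commonly used fields (filename, size, object name, bndbox) are read.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the usual VOC layout; the values are placeholders.
SAMPLE_XML = """<annotation>
  <filename>000001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_xml(xml_text):
    """Extract the file name, image size, and labelled boxes from VOC XML text."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    record = {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": [],
    }
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = [int(float(box.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        record["objects"].append({"name": obj.findtext("name"), "bbox": coords})
    return record


print(parse_voc_xml(SAMPLE_XML))
```

Note that VOC boxes are stored as corner coordinates (xmin, ymin, xmax, ymax), unlike the [x, y, width, height] boxes used by COCO.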

Creating a VOC Dataset

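Create the folders according to the layout above.
Install labelimg with pip.
Annotate the images as in the COCO section above, and put the resulting annotation files into the Annotations folder.
Generate the four txt files with the script below.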

# -*- coding: utf-8 -*-
# @Author : matthew
# @File : make_train_val_test_set.py
# @Software: PyCharm

import os
import random


def _main():
    trainval_percent = 0.9  # share of all samples used for train + val; the rest is test
    train_percent = 0.9  # share of trainval used for train; the rest is val
    xmlfilepath = 'F:/jupyter/process/VOC2007/Annotations/'
    main_dir = 'F:/jupyter/process/VOC2007/ImageSets/Main/'
    total_xml = os.listdir(xmlfilepath)

    num = len(total_xml)
    indices = range(num)
    tv = int(num * trainval_percent)
    tr = int(tv * train_percent)
    trainval_list = random.sample(indices, tv)
    trainval = set(trainval_list)
    train = set(random.sample(trainval_list, tr))

    with open(main_dir + 'trainval.txt', 'w') as ftrainval, \
            open(main_dir + 'test.txt', 'w') as ftest, \
            open(main_dir + 'train.txt', 'w') as ftrain, \
            open(main_dir + 'val.txt', 'w') as fval:
        for i in indices:
            # Strip the .xml extension to get the sample name.
            name = os.path.splitext(total_xml[i])[0] + '\n'
            if i in trainval:
                ftrainval.write(name)
                if i in train:
                    ftrain.write(name)
                else:
                    fval.write(name)
            else:
                ftest.write(name)


if __name__ == '__main__':
    _main()
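
If you need the split to be reproducible, the same idea can be written as a pure function with a fixed random seed. This is a sketch rather than a replacement for the script above: split_names, its ratio defaults, and the seed are illustrative assumptions.

```python
import random


def split_names(names, trainval_ratio=0.9, train_ratio=0.9, seed=0):
    """Split sample names into trainval/train/val/test lists, reproducibly.

    trainval_ratio: fraction of all names that goes to train + val.
    train_ratio: fraction of trainval that goes to train.
    """
    names = sorted(names)            # fixed starting order
    rng = random.Random(seed)        # private generator, fixed seed
    rng.shuffle(names)
    n_trainval = int(len(names) * trainval_ratio)
    n_train = int(n_trainval * train_ratio)
    trainval = names[:n_trainval]
    return {
        "trainval": sorted(trainval),
        "train": sorted(trainval[:n_train]),
        "val": sorted(trainval[n_train:]),
        "test": sorted(names[n_trainval:]),
    }


splits = split_names([f"{i:06d}" for i in range(100)])
print({key: len(value) for key, value in splits.items()})
```

Because train and val are both cut from trainval, the two always partition it, and test never overlaps with either.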

Dataset Conversion

To convert between VOC and COCO datasets you can use the tools built into PaddleX and PaddleDetection. Experienced users can, of course, write their own.
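
For those writing their own converter, the core of it is the change of bounding-box convention. The sketch below is hand-written for illustration and is not the PaddleX or PaddleDetection tool. The function names are hypothetical, and it ignores the one-pixel offset that some converters apply to VOC's 1-based coordinates.

```python
def voc_box_to_coco(xmin, ymin, xmax, ymax):
    """Convert VOC corners (xmin, ymin, xmax, ymax) to COCO [x, y, width, height]."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def coco_box_to_voc(x, y, w, h):
    """Convert COCO [x, y, width, height] back to VOC corners."""
    return [x, y, x + w, y + h]


box = voc_box_to_coco(48, 240, 195, 371)
print(box)
print(coco_box_to_voc(*box))
```

A full conversion would also carry over the image size, the file name, and a mapping from class names to category ids.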

There are many ways to build VOC and COCO datasets. The labelme and labelimg tools used in this article are only two of the many options.

I'm a beginner, so corrections are welcome.

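Copyright notice: This is an original article by CSDN blogger 「大bbo」, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/m0_51898303/article/details/121319078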