Images
2014 Train images [83K/13GB]
2014 Val images [41K/6GB]
2014 Test images [41K/6GB]
2015 Test images [81K/12GB]
2017 Train images [118K/18GB]
2017 Val images [5K/1GB]
2017 Test images [41K/6GB]
2017 Unlabeled images [123K/19GB]
Annotations
2014 Train/Val annotations [241MB]
2014 Testing Image info [1MB]
2015 Testing Image info [2MB]
2017 Train/Val annotations [241MB]
2017 Stuff Train/Val annotations [1.1GB]
2017 Panoptic Train/Val annotations [821MB]
2017 Testing Image info [1MB]
2017 Unlabeled Image info [4MB]
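I downloaded the 2017 dataset; the corresponding annotation files are shown in the figure below:
Since the task here is object detection, only the two files highlighted in yellow in the figure above are needed (the conversion script later in this post reads instances_val2017.json):
Opening one of the highlighted JSON files shows the following data format: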
{
    "info": info,                    # dict
    "licenses": [license],           # list of dicts
    "images": [image],               # list of dicts
    "annotations": [annotation],     # list of dicts
    "categories": [category]         # list of dicts
}
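To get a first feel for this layout, something like the following sketch can summarize the top-level keys. The helper name `summarize_coco` and the tiny in-memory sample are hypothetical stand-ins; in practice you would pass the dict obtained from `json.load` on one of the annotation files.

```python
import json


def summarize_coco(coco):
    """Return {key: (type name, number of entries)} for each top-level key."""
    return {key: (type(value).__name__, len(value)) for key, value in coco.items()}


# Hypothetical miniature sample that only mirrors the structure; not real COCO data.
sample = json.loads("""
{
  "info": {"year": 2017},
  "licenses": [{"id": 1}],
  "images": [{"id": 397133}],
  "annotations": [{"id": 82445, "image_id": 397133}],
  "categories": [{"id": 1, "name": "person"}]
}
""")

for key, (kind, size) in summarize_coco(sample).items():
    print(f"{key}: {kind} with {size} entries")
```

On a real annotation file the same call shows at a glance how many images, annotations, and categories the file holds.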
The fields are as follows:
info
"info": {                                  # general information about the dataset
    "description": "COCO 2017 Dataset",    # dataset description
    "url": "http://cocodataset.org",       # dataset website (download page)
    "version": "1.0",                      # version
    "year": 2017,                          # year
    "contributor": "COCO Consortium",      # contributor
    "date_created": "2017/09/01"           # creation date
},
licenses
"licenses": [
    {
        "url": "http://creativecommons.org/licenses/by-nc-sa/2.0/",
        "id": 1,
        "name": "Attribution-NonCommercial-ShareAlike License"
    },
    ……
    ……
],
images
"images": [
    {
        "license": 4,
        "file_name": "000000397133.jpg",          # image file name
        "coco_url": "http://images.cocodataset.org/val2017/000000397133.jpg",  # URL of the image on the COCO server
        "height": 427,                            # height in pixels
        "width": 640,                             # width in pixels
        "date_captured": "2013-11-14 17:02:52",   # capture date
        "flickr_url": "http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg",  # URL of the image on Flickr
        "id": 397133                              # image id (unique per image)
    },
    ……
    ……
],
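Because annotations refer to images only by `id`, it helps to index the image records by that id first. The sketch below assumes a dict shaped like the one above; the function name `index_images` and the one-image sample are made up for illustration.

```python
def index_images(coco):
    """Map image id -> full image record for constant-time lookup."""
    return {img["id"]: img for img in coco["images"]}


# Sample built from the single image record shown above.
sample = {
    "images": [
        {"id": 397133, "file_name": "000000397133.jpg", "height": 427, "width": 640},
    ]
}

images_by_id = index_images(sample)
record = images_by_id[397133]
print(record["file_name"], record["width"], record["height"])
```

In this example the file name is the id zero-padded to 12 digits, but reading `file_name` from the record is safer than reconstructing it.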
categories
"categories": [                        # category descriptions
    {
        "supercategory": "person",     # supercategory (parent class)
        "id": 1,                       # category id (0 is reserved for background by convention)
        "name": "person"               # category name (subclass)
    },
    {
        "supercategory": "vehicle",
        "id": 2,
        "name": "bicycle"
    },
    {
        "supercategory": "vehicle",
        "id": 3,
        "name": "car"
    },
    ……
    ……
],
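Note: bicycle and car both belong to the supercategory vehicle, but they are still two different categories. By analogy, a supercategory for sheep-like animals would be divided into subcategories such as goat, sheep, and Tibetan antelope.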
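The category ids are not guaranteed to be contiguous (the 2017 instances file has 80 categories with ids running up to 90), so a detector that expects labels 0..N-1 needs a remapping. A minimal sketch, assuming category records shaped like the ones above; the function names are hypothetical, and the fourth sample entry (id 44) is added only to show a gap in the ids.

```python
from collections import defaultdict


def build_category_maps(categories):
    """Return (id -> name, id -> contiguous index starting at 0)."""
    id_to_name = {c["id"]: c["name"] for c in categories}
    id_to_index = {cid: i for i, cid in enumerate(sorted(id_to_name))}
    return id_to_name, id_to_index


def group_by_supercategory(categories):
    """Return supercategory -> list of category names, in file order."""
    groups = defaultdict(list)
    for c in categories:
        groups[c["supercategory"]].append(c["name"])
    return dict(groups)


sample_categories = [
    {"supercategory": "person", "id": 1, "name": "person"},
    {"supercategory": "vehicle", "id": 2, "name": "bicycle"},
    {"supercategory": "vehicle", "id": 3, "name": "car"},
    {"supercategory": "kitchen", "id": 44, "name": "bottle"},  # illustrative entry with a gap before it
]

id_to_name, id_to_index = build_category_maps(sample_categories)
print(id_to_name[44], id_to_index[44])
print(group_by_supercategory(sample_categories))
```

Keeping both maps around makes it easy to go from a model's contiguous label back to the COCO id and name.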
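annotations
"annotations": [
    {
        "segmentation": [                  # boundary points of the object (boundary polygon)
            [
                224.24, 297.18,            # first point: x, y
                228.29, 297.18,            # second point: x, y
                234.91, 298.29,
                ……
                ……
                225.34, 297.55
            ]
        ],
        "area": 1481.3806499999994,        # area of the segmented region
        "iscrowd": 0,                      # 0: a single object (polygons); 1: a group of objects (RLE)
        "image_id": 397133,                # id of the image (matches "id" in images)
        "bbox": [217.62, 240.54, 38.99, 57.75],  # bounding box [x, y, w, h]; x, y is the top-left corner
        "category_id": 44,                 # category id (matches "id" in categories)
        "id": 82445                        # object id; an image usually contains several objects, so each one gets its own unique id
    },
    ……
    ……
]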
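Two details of this record are easy to trip over: `bbox` stores a corner plus width and height rather than two corners, and `area` is the area of the segmented region, which is why the example's area (about 1481.38) is smaller than w × h (about 2251.67). The sketch below shows the corner conversion and a shoelace-formula area for one flat polygon; both function names are hypothetical, and the rectangle passed to `polygon_area` is a made-up shape because the real polygon above is elided.

```python
def xywh_to_xyxy(bbox):
    """Convert a COCO [x, y, w, h] box to [xmin, ymin, xmax, ymax]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def polygon_area(flat_points):
    """Area of one polygon given as [x1, y1, x2, y2, ...] (shoelace formula)."""
    xs = flat_points[0::2]
    ys = flat_points[1::2]
    n = len(xs)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # next vertex, wrapping around to close the polygon
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(twice_area) / 2.0


print(xywh_to_xyxy([217.62, 240.54, 38.99, 57.75]))
print(polygon_area([0, 0, 4, 0, 4, 3, 0, 3]))  # made-up 4 x 3 rectangle
```

For an object described by several polygons, the areas of the individual polygons would be summed.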
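Note that a single object (iscrowd=0) may need several polygons to describe it, for example when it is partly occluded in the image. When iscrowd=1 (the annotation covers a group of objects, such as a crowd of people), segmentation is stored in RLE format instead.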
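To make the two segmentation forms concrete, here is a rough sketch that tells them apart by type and decodes the uncompressed RLE variant. It assumes the RLE is a dict with `size` as [height, width] and `counts` as a list of run lengths that alternate between 0 and 1, start with a run of zeros, and walk the mask column by column; the compressed string form of `counts` is not handled here. The function names and the 2 × 3 sample are made up for illustration.

```python
def segmentation_kind(annotation):
    """Return "rle" when segmentation is a dict, otherwise "polygons"."""
    return "rle" if isinstance(annotation["segmentation"], dict) else "polygons"


def decode_uncompressed_rle(rle):
    """Decode {"size": [h, w], "counts": [...]} into a list of rows of 0/1."""
    height, width = rle["size"]
    flat = []
    value = 0  # runs alternate 0, 1, 0, ... beginning with zeros
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    # The flat sequence is column-major: index = col * height + row.
    return [[flat[col * height + row] for col in range(width)] for row in range(height)]


# Made-up 2 x 3 mask: one zero, two ones, three zeros (column by column).
sample_rle = {"size": [2, 3], "counts": [1, 2, 3]}
for row in decode_uncompressed_rle(sample_rle):
    print(row)
```

Checking the type of `segmentation` rather than trusting `iscrowd` alone keeps the code working even if a file mixes the two forms differently.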
Code for converting the MSCOCO dataset to VOC format:
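import json
import os

jsonPath = r"E:\ai\MSCOCO\annotations\instances_val2017.json"  # path to the COCO annotation JSON file
new_dir = r"E:\ai\MSCOCO\testXml"  # directory where the converted XML files are saved
os.makedirs(new_dir, exist_ok=True)  # create the output directory if it does not exist yet

# Write one VOC-style XML file for a single image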
def bboxes2xml(folder, img_name, width, height, gts, xml_save_to):
    # gts is a list of [label, xmin, ymin, xmax, ymax], one entry per object
    with open(os.path.join(xml_save_to, img_name + '.xml'), 'w') as xml_file:
        xml_file.write('<annotation>\n')
        xml_file.write('    <folder>' + folder + '</folder>\n')
        xml_file.write('    <filename>' + str(img_name) + '.jpg' + '</filename>\n')
        xml_file.write('    <size>\n')
        xml_file.write('        <width>' + str(width) + '</width>\n')
        xml_file.write('        <height>' + str(height) + '</height>\n')
        xml_file.write('        <depth>3</depth>\n')
        xml_file.write('    </size>\n')
        for gt in gts:
            xml_file.write('    <object>\n')
            xml_file.write('        <name>' + str(gt[0]) + '</name>\n')
            xml_file.write('        <pose>Unspecified</pose>\n')
            xml_file.write('        <truncated>0</truncated>\n')
            xml_file.write('        <difficult>0</difficult>\n')
            xml_file.write('        <bndbox>\n')
            xml_file.write('            <xmin>' + str(gt[1]) + '</xmin>\n')
            xml_file.write('            <ymin>' + str(gt[2]) + '</ymin>\n')
            xml_file.write('            <xmax>' + str(gt[3]) + '</xmax>\n')
            xml_file.write('            <ymax>' + str(gt[4]) + '</ymax>\n')
            xml_file.write('        </bndbox>\n')
            xml_file.write('    </object>\n')
        xml_file.write('</annotation>')

image_bbox = {}  # image id -> list of [label, x, y, w, h], one entry per object in that image
with open(jsonPath, "r") as file:
    loads_dict = json.load(file)
# print(loads_dict.keys())

id_image_dict = {}  # image id -> [file_name, height, width]
for images in loads_dict["images"]:
    id_image_dict[images["id"]] = [images["file_name"], images["height"], images["width"]]
# print(id_image_dict)

id_label_dict = {}  # category id -> category name
for categorie in loads_dict["categories"]:
    id_label_dict[categorie["id"]] = categorie["name"]
# print(id_label_dict)

annotations = loads_dict["annotations"]  # all object annotations
for annotation in annotations:
    image_id = annotation["image_id"]  # id of the image this object belongs to
    object_bbox = annotation["bbox"]  # bounding box [x, y, w, h]
    ob_id = annotation["category_id"]  # category id of the object
    if image_id not in image_bbox:
        image_bbox[image_id] = []
    # Build a new list so the loaded annotation itself is not modified
    image_bbox[image_id].append([id_label_dict[ob_id]] + object_bbox)
# print(image_bbox)

# Only images that have at least one annotation get an XML file
for key in image_bbox:
    image_name, image_height, image_width = id_image_dict[key]  # file name, height, width
    gts = []
    for ob_info in image_bbox[key]:
        # Convert [x, y, w, h] to integer corner coordinates
        xmin = round(ob_info[1])
        ymin = round(ob_info[2])
        xmax = xmin + round(ob_info[3])
        ymax = ymin + round(ob_info[4])
        gts.append([ob_info[0], xmin, ymin, xmax, ymax])  # label first, then the box
    folder = "images"
    img_name = os.path.splitext(image_name)[0]  # file name without the extension
    bboxes2xml(folder, img_name, image_width, image_height, gts, new_dir)
print("--------------------------------")
print("done")
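Before opening the results in a GUI, a quick programmatic check can catch boxes that rounding pushed outside the image. The sketch below parses VOC-style XML with the standard-library `xml.etree.ElementTree`; the function names are hypothetical, and the sample XML is hand-written from the example annotation above (box values rounded the same way the script does), not actual output of the script.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_text):
    """Return (width, height, [(name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.find("width").text)
    height = int(size.find("height").text)
    boxes = []
    for obj in root.findall("object"):
        bnd = obj.find("bndbox")
        coords = [int(bnd.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.find("name").text, *coords))
    return width, height, boxes


def boxes_out_of_bounds(width, height, boxes):
    """Return the boxes that are empty or extend past the image borders."""
    return [
        box for box in boxes
        if not (0 <= box[1] < box[3] <= width and 0 <= box[2] < box[4] <= height)
    ]


# Hand-written sample in the layout the conversion script writes.
sample_xml = (
    "<annotation><folder>images</folder><filename>000000397133.jpg</filename>"
    "<size><width>640</width><height>427</height><depth>3</depth></size>"
    "<object><name>bottle</name><bndbox><xmin>218</xmin><ymin>241</ymin>"
    "<xmax>257</xmax><ymax>299</ymax></bndbox></object></annotation>"
)

width, height, boxes = read_voc_boxes(sample_xml)
print(boxes, boxes_out_of_bounds(width, height, boxes))
```

For real files, the text of each XML in the output directory would be read and passed to `read_voc_boxes` in a loop.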
Checking the converted annotations in labelImg:
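Copyright notice: This is an original article by CSDN blogger 「郭庆汝」, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/guoqingru0311/article/details/121855345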