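Preface
I previously wrote a post on running the SlowFast demo on one of my own videos:
[SlowFast reproduction] Reproducing SlowFast Networks for Video Recognition: running the demo on your own video
I then spent another two months getting familiar with the SlowFast codebase, and today I finally got training and the dataset pipeline working.
This post walks through the SlowFast training process. Training is slow, mainly because the dataset is huge. On a single GPU, training on 2 videos took roughly 1 hour, so a conservative estimate for the full set of 299 videos is about 150 hours, or 6.25 days. (I did not run the full training; I only trained on 2 videos, starting from a pretrained model. Training without the pretrained model would take even longer.)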
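The time estimate above is a simple linear extrapolation. A minimal sketch of that arithmetic, assuming training time scales linearly with the number of videos (the function name is made up for illustration):

```python
def estimate_total_hours(videos_total, videos_measured, hours_measured):
    """Linearly extrapolate training time from a small measured run."""
    hours_per_video = hours_measured / videos_measured
    return hours_per_video * videos_total


# Numbers quoted in the preface: 2 videos took about 1 hour, 299 videos in total.
total_hours = estimate_total_hours(299, 2, 1.0)
print(f"{total_hours:.1f} hours, about {total_hours / 24:.2f} days")
```

This gives 149.5 hours, which the post rounds to 150 hours (6.25 days).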
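1. Preparing the AVA files
1.1 Disk space (500 GB)
First, pick a location on your machine with plenty of free space, because the AVA files are very large. Mine take up 404 GB (that is without the videos themselves, only the image frames extracted from them).
1.2 Overall AVA file structure
Below is the directory layout from the official documentation: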
ava
|_ frames
|  |_ [video name 0]
|  |  |_ [video name 0]_000001.jpg
|  |  |_ [video name 0]_000002.jpg
|  |  |_ ...
|  |_ [video name 1]
|     |_ [video name 1]_000001.jpg
|     |_ [video name 1]_000002.jpg
|     |_ ...
|_ frame_lists
|  |_ train.csv
|  |_ val.csv
|_ annotations
   |_ [official AVA annotation files]
   |_ ava_train_predicted_boxes.csv
   |_ ava_val_predicted_boxes.csv
First create a folder named ava, then create three subfolders inside it: frames, frame_lists, and annotations.
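The same folders can be created from a script. A minimal sketch using only the standard library; `create_ava_layout` is a name made up for this example, and the root directory is whatever disk you chose in 1.1:

```python
from pathlib import Path


def create_ava_layout(root):
    """Create ava/frames, ava/frame_lists and ava/annotations under `root`."""
    ava = Path(root) / "ava"
    for name in ("frames", "frame_lists", "annotations"):
        # exist_ok=True makes the call safe to repeat.
        (ava / name).mkdir(parents=True, exist_ok=True)
    return ava


# Example (the config later in this post uses /disk6T as the data disk):
# create_ava_layout("/disk6T")
```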
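1.3 The frames folder
The frames folder holds the image frames extracted from the videos; see the official documentation for how to generate them.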
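As a rough illustration of the extraction step, the sketch below builds an ffmpeg command that writes frames using the `[video name]_000001.jpg` naming from the layout in 1.2. The 30 fps rate and the `-q:v 1` quality setting are assumptions based on my recollection of the official extraction script, so check them against the official documentation. The function names are made up for this example.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_cmd(video_path, frames_root, fps=30):
    """Return the ffmpeg argument list for dumping one video to JPEG frames."""
    video_path = Path(video_path)
    name = video_path.stem
    # Frames go to <frames_root>/<name>/<name>_000001.jpg, <name>_000002.jpg, ...
    pattern = Path(frames_root) / name / f"{name}_%06d.jpg"
    return ["ffmpeg", "-i", str(video_path), "-r", str(fps), "-q:v", "1", str(pattern)]


def extract_frames(video_path, frames_root, fps=30):
    """Create the output folder and run ffmpeg (ffmpeg must be installed)."""
    cmd = build_ffmpeg_cmd(video_path, frames_root, fps)
    Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(cmd, check=True)
```

Keeping the command builder separate from the call that runs it makes it easy to print and inspect the command before processing hundreds of videos.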
1.4 The frame_lists folder
This folder holds only two files: train.csv and val.csv.
Download links: train.csv, val.csv
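If you only want to train on a few videos, one possible approach is to write a smaller frame list. This post does not say how its 2-video subset was chosen, so treat the following as a sketch only. It assumes each frame list has one header line followed by whitespace-separated rows whose first field is the video name; verify that against your downloaded files. `filter_frame_list` is a made-up helper name.

```python
def filter_frame_list(src_csv, dst_csv, keep_videos):
    """Copy the header plus the rows whose first field is in `keep_videos`.

    Returns the number of frame rows written.
    """
    keep = set(keep_videos)
    kept = 0
    with open(src_csv, "r") as src, open(dst_csv, "w") as dst:
        dst.write(src.readline())  # header line is copied unchanged
        for line in src:
            fields = line.split()
            if fields and fields[0] in keep:
                dst.write(line)
                kept += 1
    return kept


# Example with placeholder video names:
# filter_frame_list("train.csv", "train_small.csv", ["video_a", "video_b"])
```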
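1.5 The annotations folder
Do not use the default version from the official documentation here; use the latest version, 2.2. The official documentation provides a download link: https://dl.fbaipublicfiles.com/pyslowfast/annotation/ava/ava_annotations.tar
After downloading, extract the archive (I did this on Ubuntu) and inspect its structure with tree: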
├── ava_annotations
│ ├── ava_action_list_v2.1_for_activitynet_2018.pbtxt
│ ├── ava_action_list_v2.2_for_activitynet_2019.pbtxt
│ ├── ava_action_list_v2.2.pbtxt
│ ├── ava_included_timestamps_v2.2.txt
│ ├── ava_test_excluded_timestamps_v2.1.csv
│ ├── ava_test_excluded_timestamps_v2.2.csv
│ ├── ava_test_v2.2.csv
│ ├── ava_train_excluded_timestamps_v2.1.csv
│ ├── ava_train_excluded_timestamps_v2.2.csv
│ ├── ava_train_v2.1.csv
│ ├── ava_train_v2.2.csv
│ ├── ava_val_excluded_timestamps_v2.1.csv
│ ├── ava_val_excluded_timestamps_v2.2.csv
│ ├── ava_val_v2.1.csv
│ ├── ava_val_v2.2.csv
│ ├── person_box_67091280_iou75
│ │ ├── ava_detection_test_boxes_and_labels.csv
│ │ ├── ava_detection_train_boxes_and_labels_include_negative.csv
│ │ ├── ava_detection_train_boxes_and_labels_include_negative_v2.1.csv
│ │ ├── ava_detection_train_boxes_and_labels_include_negative_v2.2.csv
│ │ ├── ava_detection_val_boxes_and_labels.csv
│ │ ├── ava_detection_val_for_training_boxes_and_labels_include_negative.csv
│ │ └── ava_detection_val_for_training_boxes_and_labels_include_negative_v2.2.csv
│ ├── person_box_67091280_iou90
│ │ ├── ava_action_list_v2.1_for_activitynet_2018.pbtxt
│ │ ├── ava_detection_test_boxes_and_labels.csv
│ │ ├── ava_detection_train_boxes_and_labels_include_negative.csv
│ │ ├── ava_detection_train_boxes_and_labels_include_negative_v2.1.csv
│ │ ├── ava_detection_train_boxes_and_labels_include_negative_v2.2.csv
│ │ ├── ava_detection_val_boxes_and_labels.csv
│ │ ├── ava_detection_val_for_training_boxes_and_labels_include_negative.csv
│ │ ├── ava_detection_val_for_training_boxes_and_labels_include_negative_v2.1.csv
│ │ ├── ava_detection_val_for_training_boxes_and_labels_include_negative_v2.2.csv
│ │ ├── ava_train_predicted_boxes.csv
│ │ ├── ava_train_v2.1.csv
│ │ ├── ava_val_excluded_timestamps_v2.1.csv
│ │ ├── ava_val_predicted_boxes.csv -> ava_detection_val_boxes_and_labels.csv
│ │ ├── ava_val_v2.1.csv
│ │ ├── test.csv
│ │ ├── train.csv
│ │ └── val.csv
│ ├── test.csv
│ ├── train.csv
│ └── val.csv
└── ava_annotations.tar
Copy everything under ava_annotations into /ava/annotations/.
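The extract-and-copy step can also be scripted. A minimal sketch, assuming the archive unpacks into a top-level ava_annotations folder as in the tree listing above, and Python 3.8+ for `dirs_exist_ok`; `install_annotations` is a made-up name:

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def install_annotations(tar_path, annotations_dir):
    """Extract ava_annotations.tar and copy its contents into `annotations_dir`."""
    annotations_dir = Path(annotations_dir)
    with tempfile.TemporaryDirectory() as tmp:
        with tarfile.open(tar_path) as tar:
            tar.extractall(tmp)
        src = Path(tmp) / "ava_annotations"
        # symlinks=True keeps links such as ava_val_predicted_boxes.csv as links.
        shutil.copytree(src, annotations_dir, symlinks=True, dirs_exist_ok=True)
    return sorted(p.name for p in annotations_dir.iterdir())


# Example:
# install_annotations("ava_annotations.tar", "/disk6T/ava/annotations")
```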
2. Pretrained model
3. Configuration file
3.1 Creating a new yaml file
In /SlowFast/configs/AVA/, create a new yaml file named SLOWFAST_32x2_R50_SHORT3.yaml.
Then copy the following configuration into SLOWFAST_32x2_R50_SHORT3.yaml:
TRAIN:
  ENABLE: True
  DATASET: ava
  BATCH_SIZE: 2 #64
  EVAL_PERIOD: 5
  CHECKPOINT_PERIOD: 1
  AUTO_RESUME: True
  CHECKPOINT_FILE_PATH: '/home/lxn/0yangfan/SlowFast/configs/AVA/c2/SLOWFAST_32x2_R101_50_50s.pkl' # path to pretrained model
  CHECKPOINT_TYPE: caffe2
DATA:
  NUM_FRAMES: 32
  SAMPLING_RATE: 2
  TRAIN_JITTER_SCALES: [256, 320]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 224
  INPUT_CHANNEL_NUM: [3, 3]
  PATH_TO_DATA_DIR: '/disk6T/ava'
DETECTION:
  ENABLE: True
  ALIGNED: True
AVA:
  FRAME_DIR: '/disk6T/ava/frames'
  FRAME_LIST_DIR: '/disk6T/ava/frame_lists'
  ANNOTATION_DIR: '/disk6T/ava/annotations'
  #LABEL_MAP_FILE: 'ava_action_list_v2.1_for_activitynet_2018.pbtxt'
  #GROUNDTRUTH_FILE: 'ava_val_v2.1.csv'
  #TRAIN_GT_BOX_LISTS: ['ava_train_v2.1.csv']
  DETECTION_SCORE_THRESH: 0.8
  TRAIN_PREDICT_BOX_LISTS: [
    "ava_train_v2.2.csv",
    "person_box_67091280_iou90/ava_detection_train_boxes_and_labels_include_negative_v2.2.csv",
  ]
  #TRAIN_PREDICT_BOX_LISTS: ["ava_train_predicted_boxes.csv"]
  TEST_PREDICT_BOX_LISTS: ["person_box_67091280_iou90/ava_detection_val_boxes_and_labels.csv"]
  #TEST_PREDICT_BOX_LISTS: ["ava_test_predicted_boxes.csv"]
  #EXCLUSION_FILE: "ava_train_excluded_timestamps_v2.1.csv"
SLOWFAST:
  ALPHA: 4
  BETA_INV: 8
  FUSION_CONV_CHANNEL_RATIO: 2
  FUSION_KERNEL_SZ: 7
RESNET:
  ZERO_INIT_FINAL_BN: True
  WIDTH_PER_GROUP: 64
  NUM_GROUPS: 1
  DEPTH: 50
  TRANS_FUNC: bottleneck_transform
  STRIDE_1X1: False
  NUM_BLOCK_TEMP_KERNEL: [[3, 3], [4, 4], [6, 6], [3, 3]]
  SPATIAL_DILATIONS: [[1, 1], [1, 1], [1, 1], [2, 2]]
  SPATIAL_STRIDES: [[1, 1], [2, 2], [2, 2], [1, 1]]
NONLOCAL:
  LOCATION: [[[], []], [[], []], [[], []], [[], []]]
  GROUP: [[1, 1], [1, 1], [1, 1], [1, 1]]
  INSTANTIATION: dot_product
  POOL: [[[1, 2, 2], [1, 2, 2]], [[1, 2, 2], [1, 2, 2]], [[1, 2, 2], [1, 2, 2]], [[1, 2, 2], [1, 2, 2]]]
BN:
  USE_PRECISE_STATS: False
  NUM_BATCHES_PRECISE: 200
SOLVER:
  BASE_LR: 0.1
  LR_POLICY: steps_with_relative_lrs
  STEPS: [0, 10, 15, 20]
  LRS: [1, 0.1, 0.01, 0.001]
  MAX_EPOCH: 20
  MOMENTUM: 0.9
  WEIGHT_DECAY: 1e-7
  WARMUP_EPOCHS: 5.0
  WARMUP_START_LR: 0.000125
  OPTIMIZING_METHOD: sgd
MODEL:
  NUM_CLASSES: 80
  ARCH: slowfast
  MODEL_NAME: SlowFast
  LOSS_FUNC: bce
  DROPOUT_RATE: 0.5
  HEAD_ACT: sigmoid
TEST:
  ENABLE: False
  DATASET: ava
  BATCH_SIZE: 8
DATA_LOADER:
  NUM_WORKERS: 2
  PIN_MEMORY: True
NUM_GPUS: 1
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: .
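3.2 Explanation of the yaml file
- TRAIN
1.1 ENABLE: True. This turns training on; likewise, TEST.ENABLE is set to False (we only need the training stage).
1.2 BATCH_SIZE: 2 #64. I set the batch size to 2 only because my GPU does not have enough memory; if yours has more, use a larger value.
1.3 CHECKPOINT_FILE_PATH: '/home/lxn/0yangfan/SlowFast/configs/AVA/c2/SLOWFAST_32x2_R101_50_50s.pkl'. This is the location of the pretrained model.
- DATA
2.1 PATH_TO_DATA_DIR: '/disk6T/ava'. This is the location of the ava folder.
The remaining options are fairly self-explanatory.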
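Because a wrong path only surfaces once training starts, it can help to check beforehand that the annotation files named in the AVA section of the config exist. A minimal sketch; `find_missing` is a made-up helper, and the list covers only the entries set explicitly in the yaml above, not options left at their defaults:

```python
from pathlib import Path

# Relative paths copied from the AVA section of the yaml file above.
REFERENCED_FILES = [
    "ava_train_v2.2.csv",
    "person_box_67091280_iou90/ava_detection_train_boxes_and_labels_include_negative_v2.2.csv",
    "person_box_67091280_iou90/ava_detection_val_boxes_and_labels.csv",
]


def find_missing(annotation_dir, relative_paths):
    """Return the entries of `relative_paths` that are not files under `annotation_dir`."""
    base = Path(annotation_dir)
    return [p for p in relative_paths if not (base / p).is_file()]


# Example:
# print(find_missing("/disk6T/ava/annotations", REFERENCED_FILES))
```

An empty list means every referenced file was found.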
4. Training
python tools/run_net.py --cfg configs/AVA/SLOWFAST_32x2_R50_SHORT3.yaml
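To try different settings without editing the yaml file, the command can be assembled from Python. As far as I know, run_net.py accepts trailing KEY VALUE pairs that override config entries, but treat that as an assumption and confirm it against your SlowFast checkout. The function names here are made up for illustration.

```python
import subprocess
import sys


def build_train_cmd(cfg_path, overrides=None):
    """Build the run_net.py command, appending optional KEY VALUE overrides."""
    cmd = [sys.executable, "tools/run_net.py", "--cfg", cfg_path]
    for key, value in (overrides or {}).items():
        cmd += [key, str(value)]
    return cmd


def run_training(cfg_path, overrides=None, repo_root="."):
    """Launch training; `repo_root` should be the SlowFast repository root."""
    subprocess.run(build_train_cmd(cfg_path, overrides), cwd=repo_root, check=True)


# Example:
# run_training("configs/AVA/SLOWFAST_32x2_R50_SHORT3.yaml", {"TRAIN.BATCH_SIZE": 2})
```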
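Copyright notice: This is an original article by CSDN blogger 「计算机视觉-杨帆」, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/WhiffeYF/article/details/115431794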