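Installing SlowFast
SlowFast homepage: https://github.com/facebookresearch/SlowFast
See INSTALL.md in the repository for reference.
Installing into a virtual environment is recommended.
1. Create a virtual environment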
conda create -n slowfast python=3.7
conda activate slowfast
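2. Install a specific PyTorch version. The documentation says PyTorch 1.3, but detectron2, which is installed later, needs PyTorch 1.6 or newer, so install the PyTorch build that matches your CUDA version. For example: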
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
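After the install, a quick version check can catch a mismatch before detectron2 is built. The following is a minimal sketch, not part of the SlowFast instructions: `meets_minimum` is a hypothetical helper, and the (1, 6) threshold is taken from the requirement stated above.

```python
import re


def meets_minimum(version, minimum=(1, 6)):
    """Return True if a version string such as '1.6.0' or '1.10.0+cu113' is >= minimum."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        print("torch", torch.__version__, "ok:", meets_minimum(torch.__version__))
        print("CUDA available:", torch.cuda.is_available())
```

Comparing (major, minor) tuples rather than raw strings avoids ranking "1.10" below "1.6".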
3. Install the remaining libraries as the documentation instructs
pip install 'git+https://github.com/facebookresearch/fvcore'
pip install simplejson
conda install av -c conda-forge
conda install -c iopath iopath
pip install psutil
pip install opencv-python
conda install torchvision -c pytorch
pip install tensorboard
conda install -c conda-forge moviepy
pip install pytorchvideo
4. Install Detectron2
(1) Install cython and pycocotools:
pip install cython
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
(2) Install Detectron2
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
5. Build pyslowfast
git clone https://github.com/facebookresearch/SlowFast
cd SlowFast
python setup.py build develop
Wait for the build to finish.
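To confirm that everything from the steps above landed in the active environment, a short check like the following may help. It is only a sketch: the list of import names is an assumption based on the packages installed above (for example, opencv-python is imported as cv2), and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Import names assumed for the packages installed in the steps above.
REQUIRED = [
    "torch", "torchvision", "fvcore", "simplejson", "av", "iopath", "psutil",
    "cv2", "tensorboard", "moviepy", "pytorchvideo", "pycocotools",
    "detectron2", "slowfast",
]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```

`find_spec` only locates a module without importing it, so the check stays fast and does not initialize CUDA.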
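Training on your own dataset
1. Create the dataset
The raw data are videos. They need to be preprocessed into the Charades dataset format, which produces the frames and frame_lists directories.
The data directory is laid out as follows: extracted frames go in data/charades/frames/<video_name>/, and the frame lists (train.csv, val.csv) go in data/charades/frame_lists/.
(1) Extract frames from the videos
Use ffmpeg to extract frames in batch at 24 frames per second.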
IN_DATA_DIR="/home/dataset/structuring/train5k_split_video"  # directory of source videos
OUT_DATA_DIR="/home/slowfast/data/charades/frames"  # directory for extracted frames

if [[ ! -d "${OUT_DATA_DIR}" ]]; then
  echo "${OUT_DATA_DIR} doesn't exist. Creating it."
  mkdir -p "${OUT_DATA_DIR}"
fi

# Quote the pattern so the shell does not expand it before find sees it.
# Note: this loop assumes that video paths contain no whitespace.
for video in $(find "${IN_DATA_DIR}/" -name "*.mp4")
do
  video_name=${video##*/}        # strip the directory part
  video_name=${video_name%.mp4}  # strip the extension
  out_video_dir="${OUT_DATA_DIR}/${video_name}"
  mkdir -p "${out_video_dir}"
  out_name="${out_video_dir}/${video_name}-%06d.jpg"
  # -r 24 writes 24 frames per second; -q:v 1 asks for the best JPEG quality
  ffmpeg -i "${video}" -r 24 -q:v 1 "${out_name}"
done
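Once extraction finishes, it is worth checking that every video actually produced frames, since a failed ffmpeg run leaves an empty folder behind. A minimal sketch, assuming the one-folder-per-video layout created by the loop above; `count_frames` and `empty_videos` are hypothetical helper names.

```python
import os


def count_frames(frames_root):
    """Map each video sub-directory of frames_root to its number of .jpg files."""
    counts = {}
    for name in sorted(os.listdir(frames_root)):
        video_dir = os.path.join(frames_root, name)
        if os.path.isdir(video_dir):
            counts[name] = sum(1 for f in os.listdir(video_dir) if f.endswith(".jpg"))
    return counts


def empty_videos(counts):
    """Return the video names for which no frame was extracted."""
    return [name for name, n in counts.items() if n == 0]
```

For example, `empty_videos(count_frames("/home/slowfast/data/charades/frames"))` lists the videos that need to be re-extracted.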
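(2) Generate train.csv and val.csv
Convert your dataset's existing annotation files into the required format. My annotation files look like this:
They are shown only to illustrate the format. Other datasets may differ, so adapt the script as needed.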
#train.txt
781218dfedcd54d8ea97a954177004ed 推广页,室外,中景,门口,动态,混剪,现代,喜悦,静态
3f7dd413d181f723b75e942ff3ffaf6f 推广页,家庭伦理,中景,手机电脑录屏,室内,教辅材料
11f2011efde567a96dca9f59e09ff211 多人情景剧,推广页,家,厌恶,家庭伦理,愤怒,拉近,中景
#label_id.txt
场景-其他 0
室内 1
家 2
室外 3
办公室 4
影棚幕布 5
学校 6
汽车内 7
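Before converting, it can help to check that every tag used in the annotation file has an ID in label_id.txt, because a missing tag would otherwise surface as a KeyError halfway through the conversion. The sketch below assumes both files are tab-separated and UTF-8 encoded; `load_label_ids` and `find_unknown_tags` are hypothetical helpers, not part of SlowFast.

```python
def load_label_ids(label_id_path):
    """Read 'tag<TAB>id' lines into a dict mapping tag -> int ID."""
    mapping = {}
    with open(label_id_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                tag, idx = line.split("\t")
                mapping[tag] = int(idx)
    return mapping


def find_unknown_tags(annotation_path, mapping):
    """Return the set of tags in the annotation file that have no ID in mapping."""
    unknown = set()
    with open(annotation_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            _video_id, tags = line.split("\t")
            unknown.update(t for t in tags.split(",") if t not in mapping)
    return unknown
```

An empty result from `find_unknown_tags` means every tag can be mapped to a class ID.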
import os

dataset_path = '/home/tione/notebook/slowfast/data/charades/frames'  # directory of extracted frames
label_path = '/home/tione/notebook/slowfast/data_file/train.txt'  # train.txt or val.txt
tag_id_file = '/home/tione/notebook/VideoStructuring/dataset/label_id.txt'
csv_path = '/home/tione/notebook/SlowFast/data/charades/frame_lists/train.csv'  # use val.csv for val.txt

if __name__ == '__main__':
    # Build the tag -> class ID dictionary (label_id.txt is tab-separated).
    dict_categories = {}
    with open(tag_id_file, 'r', encoding='utf-8') as lnf:
        for line in lnf:
            tag, idx = line.strip().split('\t')
            dict_categories[tag] = int(idx)
    print(dict_categories)

    # Read the annotations: one "video_name<TAB>tag1,tag2,..." line per video.
    folders = []         # video names, which are also the frame folder names
    idx_categories = []  # list of class IDs for each video
    with open(label_path, encoding='utf-8') as f:
        for line in f:
            items = line.rstrip().split('\t')
            folders.append(items[0])
            idx_categories.append([dict_categories[tag] for tag in items[1].split(',')])
    assert len(idx_categories) == len(folders)

    with open(csv_path, 'w') as csv_file:
        # "original_vido_id" is spelled as in the original Charades frame lists.
        csv_file.write("original_vido_id video_id frame_id path labels\n")
        for video_id, (cur_folder, cur_ids) in enumerate(zip(folders, idx_categories)):
            img_dir = os.path.join(dataset_path, cur_folder)
            # Sort frames by the six-digit index in "<name>-000001.jpg".
            frames = sorted(os.listdir(img_dir), key=lambda x: int(x[-10:-4]))
            labels = '"' + ','.join(str(i) for i in cur_ids) + '"'
            for frame_id, frame in enumerate(frames):
                csv_file.write(' '.join([cur_folder, str(video_id), str(frame_id),
                                         os.path.join(cur_folder, frame), labels]) + '\n')
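The resulting CSV follows the Charades frame-list format: a header line, then one row per frame with five space-separated fields: original_vido_id, video_id, frame_id, path (relative to the frames directory), and the label IDs as a quoted, comma-separated list.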
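A small validator can confirm that a generated frame list has this shape before training starts. This is a sketch under the format described above, not SlowFast's own loader; `check_frame_list` is a hypothetical helper, and it assumes paths contain no spaces.

```python
def check_frame_list(csv_path):
    """Validate a frame list and return {video name: number of frame rows}."""
    frames_per_video = {}
    with open(csv_path, encoding="utf-8") as f:
        header = f.readline()
        if not header.startswith("original_vido_id"):
            raise ValueError("unexpected header: " + header.strip())
        for line_no, line in enumerate(f, start=2):
            row = line.split()
            if len(row) != 5:
                raise ValueError(f"line {line_no}: expected 5 fields, got {len(row)}")
            name, _video_id, _frame_id, _path, labels = row
            # Labels look like "1,3"; int() raises ValueError on a malformed ID.
            for label in labels.replace('"', "").split(","):
                if label:
                    int(label)
            frames_per_video[name] = frames_per_video.get(name, 0) + 1
    return frames_per_video
```

The returned counts can also be compared against the number of .jpg files in each frame folder.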
2. Modify the config file
Edit configs/Charades/SLOWFAST_16x8_R50.yaml as follows:
TRAIN:
  ENABLE: True  # must be True to train
  DATASET: charades
  BATCH_SIZE: 16
  EVAL_PERIOD: 6
  CHECKPOINT_PERIOD: 6
  AUTO_RESUME: True
  CHECKPOINT_FILE_PATH: './checkpoints/SLOWFAST_32x2_R101_50_50.pkl'  # please download from the model zoo.
  CHECKPOINT_TYPE: caffe2
DATA:
  NUM_FRAMES: 64
  SAMPLING_RATE: 2
  TRAIN_JITTER_SCALES: [256, 340]
  TRAIN_CROP_SIZE: 224
  TEST_CROP_SIZE: 256
  INPUT_CHANNEL_NUM: [3, 3]
  MULTI_LABEL: True
  INV_UNIFORM_SAMPLE: True
  ENSEMBLE_METHOD: max
  REVERSE_INPUT_CHANNEL: True
  PATH_TO_DATA_DIR: '/home/SlowFast/data/charades/frame_lists'  # directory containing train.csv and val.csv
  PATH_PREFIX: '/home/SlowFast/data/charades/frames'  # directory containing the extracted frames
SLOWFAST:
  ALPHA: 4
  BETA_INV: 8
  FUSION_CONV_CHANNEL_RATIO: 2
  FUSION_KERNEL_SZ: 7
RESNET:
  SPATIAL_STRIDES: [[1, 1], [2, 2], [2, 2], [2, 2]]
  SPATIAL_DILATIONS: [[1, 1], [1, 1], [1, 1], [1, 1]]
  ZERO_INIT_FINAL_BN: True
  WIDTH_PER_GROUP: 64
  NUM_GROUPS: 1
  DEPTH: 50
  TRANS_FUNC: bottleneck_transform
  STRIDE_1X1: False
  NUM_BLOCK_TEMP_KERNEL: [[3, 3], [4, 4], [6, 6], [3, 3]]
NONLOCAL:
  LOCATION: [[[], []], [[], []], [[], []], [[], []]]
  GROUP: [[1, 1], [1, 1], [1, 1], [1, 1]]
  INSTANTIATION: dot_product
BN:
  USE_PRECISE_STATS: True
  NUM_BATCHES_PRECISE: 200
  NORM_TYPE: sync_batchnorm
  NUM_SYNC_DEVICES: 2  # also set this to the number of GPUs
SOLVER:
  BASE_LR: 0.0375
  LR_POLICY: steps_with_relative_lrs
  LRS: [1, 0.1, 0.01, 0.001, 0.0001, 0.00001]
  STEPS: [0, 41, 49]
  MAX_EPOCH: 57
  MOMENTUM: 0.9
  WEIGHT_DECAY: 1e-4
  WARMUP_EPOCHS: 4.0
  WARMUP_START_LR: 0.0001
  OPTIMIZING_METHOD: sgd
MODEL:
  NUM_CLASSES: 82  # set to the number of classes in your dataset
  ARCH: slowfast
  LOSS_FUNC: bce_logit
  HEAD_ACT: sigmoid
  DROPOUT_RATE: 0.5
TEST:
  ENABLE: False  # set to False while training
  DATASET: charades
  BATCH_SIZE: 16
  NUM_ENSEMBLE_VIEWS: 10
  NUM_SPATIAL_CROPS: 3
DATA_LOADER:
  NUM_WORKERS: 4  # 4 works for two GPUs, 8 for four GPUs
  PIN_MEMORY: True
NUM_GPUS: 2  # number of GPUs to use
NUM_SHARDS: 1
RNG_SEED: 0
OUTPUT_DIR: .
LOG_MODEL_INFO: False
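MODEL.NUM_CLASSES has to cover the largest class ID in label_id.txt, because each sample's label list is turned into a 0/1 target vector of that length for the multi-label loss. The sketch below only illustrates that relationship and is not SlowFast's own code; it assumes class IDs start at 0, as in the label_id.txt sample, and both function names are hypothetical.

```python
def num_classes_from_ids(label_ids):
    """Smallest NUM_CLASSES that can index every class ID (IDs start at 0)."""
    return max(label_ids) + 1


def multi_hot(label_ids, num_classes):
    """Build the 0/1 target vector for one sample from its list of class IDs."""
    target = [0] * num_classes
    for idx in label_ids:
        target[idx] = 1
    return target
```

With the eight scene classes listed earlier (IDs 0 to 7), `num_classes_from_ids` returns 8. The value 82 in the config reflects the author's full label set.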
3. Train
python tools/run_net.py --cfg configs/Charades/SLOWFAST_16x8_R50.yaml
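As far as I know, run_net.py also accepts KEY VALUE pairs after the --cfg argument to override config entries, which saves editing the YAML for quick experiments. The helper below is a hypothetical sketch that only assembles such a command line; the override keys are taken from the config above.

```python
import shlex


def build_train_command(cfg_path, overrides=None):
    """Return the argument list for tools/run_net.py with optional KEY VALUE overrides."""
    cmd = ["python", "tools/run_net.py", "--cfg", cfg_path]
    for key, value in (overrides or {}).items():
        cmd += [key, str(value)]
    return cmd


cmd = build_train_command(
    "configs/Charades/SLOWFAST_16x8_R50.yaml",
    {"NUM_GPUS": 2, "TRAIN.BATCH_SIZE": 16},
)
# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(part) for part in cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` from the SlowFast repository root.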
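Copyright notice: This is an original article by CSDN blogger "yxy520ya", licensed under CC 4.0 BY-SA. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/qq_36072670/article/details/117448915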