
Paper reading notes: (2020.06, CVPR workshop) SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation

2025-02-12

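Monocular 3D object detection matters a great deal for autonomous driving. SMOKE is a CVPR 2020 workshop paper: its accuracy ranks among the top methods on KITTI, it runs in real time, and its code is open source. It was also recently integrated into the perception module of Baidu Apollo 7.0, so it is well worth studying!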

paper: https://openaccess.thecvf.com/content_CVPRW_2020/html/w60/Liu_SMOKE_Single-Stage_Monocular_3D_Object_Detection_via_Keypoint_Estimation_CVPRW_2020_paper.html

code: https://github.com/lzccccc/SMOKE

Accuracy comparison (as of 2022.01):

(KITTI Cars Moderate Benchmark (Monocular 3D Object Detection) | Papers With Code)

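Main ideas and contributions:

1. The authors argue that detecting 2D boxes is redundant and introduces noise into 3D detection, so they regress the 3D box directly from keypoints.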
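To make the keypoint idea concrete: in SMOKE the keypoint is the 3D box center projected onto the image, so a keypoint plus a depth value fixes the 3D location through the camera intrinsics. The sketch below shows only that back-projection step under a standard pinhole model. The function name `backproject_keypoint` and the intrinsic values are illustrative assumptions, not taken from the SMOKE code, and the offsets predicted by the network are left out.

```python
import numpy as np

def backproject_keypoint(u, v, depth, K):
    """Lift an image keypoint (u, v) with a known depth to camera coordinates.

    Pinhole model: depth * [u, v, 1]^T = K @ [x, y, z]^T, hence
    [x, y, z]^T = K^{-1} @ (depth * [u, v, 1]^T).
    """
    pixel_h = depth * np.array([u, v, 1.0])
    # Solve K @ xyz = pixel_h instead of forming the inverse explicitly.
    return np.linalg.solve(np.asarray(K, dtype=float), pixel_h)

# Hypothetical intrinsics: focal length 700 px, principal point (600, 180).
K = np.array([[700.0, 0.0, 600.0],
              [0.0, 700.0, 180.0],
              [0.0, 0.0, 1.0]])

center_3d = backproject_keypoint(670.0, 215.0, 20.0, K)
print(center_3d)  # approximately (2, 1, 20) in camera coordinates
```

The point is that one keypoint and one depth already pin down the box center, which is why a separate 2D box is not needed for localization.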

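Implementation:

a. backbone: built on DLA-34, modified with DCN (deformable convolution) and GN (group normalization);

b. head:

Keypoint branch: one heatmap channel per class;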

3D box branch: predicts [formula missing in the source], where: [definitions missing in the source]

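c. loss:

Keypoint branch: penalty-reduced focal loss: [formula missing in the source]

3D box branch:

The predicted quantities are split into three groups (e.g. center point, dimensions, yaw; I still need to check the code for the details);

For each group, the quantities outside that group are taken from the ground truth; the result is converted into the 8 corners of the 3D box and then fed into L_reg;

The purpose is presumably to decouple the predicted quantities and make the prediction easier.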
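The two losses above can be sketched in NumPy as follows. This is a minimal illustration, not the SMOKE implementation: the focal-loss exponents `alpha=2` and `beta=4`, the normalization by the number of positives, the KITTI-style corner layout (origin at the bottom center, yaw about the camera y axis), the mean L1 distance, and the equal weighting of the three groups are all assumptions made here, and every function name is hypothetical.

```python
import numpy as np

def penalty_reduced_focal_loss(pred, target, alpha=2.0, beta=4.0, eps=1e-6):
    """Focal loss on a heatmap whose targets are Gaussian-smoothed peaks.

    Pixels with target == 1 are positives; all others are negatives whose
    penalty is reduced by (1 - target) ** beta near a peak.
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    pos = target == 1.0
    pos_loss = ((1.0 - pred) ** alpha * np.log(pred))[pos].sum()
    neg_loss = ((1.0 - target) ** beta * pred ** alpha * np.log(1.0 - pred))[~pos].sum()
    num_pos = max(int(pos.sum()), 1)  # avoid division by zero on empty maps
    return -(pos_loss + neg_loss) / num_pos

def box_corners(loc, dims, yaw):
    """Return the 8 corners (8 x 3) of a box given location, (h, w, l), yaw."""
    h, w, l = dims
    x = np.array([1, 1, -1, -1, 1, 1, -1, -1]) * l / 2.0
    y = np.array([0, 0, 0, 0, -1, -1, -1, -1]) * h
    z = np.array([1, -1, -1, 1, 1, -1, -1, 1]) * w / 2.0
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return (rot @ np.vstack([x, y, z])).T + np.asarray(loc, dtype=float)

def disentangled_corner_loss(pred, gt):
    """Sum of per-group corner losses; pred and gt have keys loc, dims, yaw.

    For each group, only that group comes from the prediction; the other
    two groups are replaced by ground truth before building the corners.
    """
    gt_corners = box_corners(gt["loc"], gt["dims"], gt["yaw"])
    total = 0.0
    for group in ("loc", "dims", "yaw"):
        mixed = dict(gt)
        mixed[group] = pred[group]
        corners = box_corners(mixed["loc"], mixed["dims"], mixed["yaw"])
        total += np.abs(corners - gt_corners).mean()
    return total

# Usage: a prediction that is wrong only in location (1 m off along x).
gt = {"loc": np.array([2.0, 1.0, 20.0]), "dims": np.array([1.5, 1.6, 4.0]), "yaw": 0.3}
pred = dict(gt, loc=np.array([3.0, 1.0, 20.0]))
print(disentangled_corner_loss(pred, gt))  # only the "loc" group contributes
```

Because each group is evaluated with the other two fixed to ground truth, an error in one quantity (here the location) shows up in exactly one term, which is the decoupling effect described above.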

Some references cited in the paper that I personally find valuable:

(2019 iccv) Disentangling Monocular 3D Object Detection

(2019 cvpr) ROI-10D: Monocular Lifting of 2D Detection to 6D Pose and Metric Shape

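Copyright notice: This is an original article by CSDN blogger "chaoqinyou", licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/chaoqinyou/article/details/122357395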
