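CVPR 2022 Paper Digest for March 3 (22 papers, bundled download): network architecture design, pose estimation, 3D vision, action detection, semantic segmentation, and more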

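Table of Contents

Neural Network Architecture Design
3D Vision
Pose Estimation
Image Inpainting
Model Training
Vision-Language Representation Learning
Contrastive Learning
Depth Estimation
Semantic Segmentation
Action Detection
Face Forgery / Anti-Spoofing
Long-Tailed Recognition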

Neural Network Architecture Design

[1] An Image Patch is a Wave: Quantum Inspired Vision MLP

paper | code | code

[2] A ConvNet for the 2020s

paper | code

Commentary: A ConvNet "renaissance": ConvNets make a comeback and outperform Transformers as FAIR redesigns a pure-convolution architecture

3D Vision

[1] CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding

keywords: Self-Supervised Learning, Contrastive Learning, 3D Point Cloud, Representation Learning, Cross-Modal Learning

paper | code

[2] A Unified Query-based Paradigm for Point Cloud Understanding

paper

[3] X-Trans2Cap: Cross-Modal Knowledge Transfer using Transformer for 3D Dense Captioning

keywords: Image Captioning and Dense Captioning, Knowledge Distillation, Transformer, 3D Vision

paper

[4] CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields

keywords: NeRF, Image Generation and Manipulation, Language-Image Pre-Training (CLIP)

paper | code

Pose Estimation

[1] MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose Estimation in Video

keywords: 3D Human Pose Estimation, Transformer

paper

[2] H4D: Human 4D Modeling by Learning Neural Compositional Representation

keywords: 4D Representation, Human Body Estimation, Fine-grained Human Reconstruction

paper

[3] Learning Local-Global Contextual Adaptation for Multi-Person Pose Estimation

keywords: Top-Down Pose Estimation, Limb-based Grouping, Direct Regression

paper

Image Inpainting

[1] Incremental Transformer Structure Enhanced Image Inpainting with Masking Positional Encoding

keywords: Image Inpainting, Transformer, Image Generation

paper | code

Model Training

[1] DN-DETR: Accelerate DETR Training by Introducing Query DeNoising

keywords: Detection Transformer

paper | code

Vision-Language Representation Learning

[1] HairCLIP: Design Your Hair by Text and Reference Image

keywords: Language-Image Pre-Training (CLIP), Generative Adversarial Networks

paper | project

[2] Vision-Language Pre-Training with Triple Contrastive Learning

keywords: Vision-Language Representation Learning, Contrastive Learning

paper | code

Contrastive Learning

[1] Crafting Better Contrastive Views for Siamese Representation Learning

paper | code

Depth Estimation

[1] OmniFusion: 360 Monocular Depth Estimation via Geometry-Aware Fusion

keywords: Monocular Depth Estimation, Transformer

paper

Semantic Segmentation

[1] Class Re-Activation Maps for Weakly-Supervised Semantic Segmentation

paper | code

Action Detection

[1] Colar: Effective and Efficient Online Action Detection by Consulting Exemplars

keywords: Online Action Detection

paper

Face Forgery / Anti-Spoofing

[1] Protecting Celebrities with Identity Consistency Transformer

paper

Long-Tailed Recognition

[1] Targeted Supervised Contrastive Learning for Long-Tailed Recognition

keywords: Long-Tailed Recognition, Contrastive Learning

paper
